---
base_model:
- intfloat/e5-small-v2
license: cc-by-4.0
pipeline_tag: tabular-regression
---

# Paper title and link

The model was presented in the paper [TabSTAR: A Foundation Tabular Model With Semantically Target-Aware Representations](https://arxiv.org/abs/2505.18125).

# Paper abstract

The paper's abstract reads:

While deep learning has achieved remarkable success across many domains, it
has historically underperformed on tabular learning tasks, which remain
dominated by gradient boosting decision trees (GBDTs). However, recent
advancements are paving the way for Tabular Foundation Models, which can
leverage real-world knowledge and generalize across diverse datasets,
particularly when the data contains free-text. Although incorporating language
model capabilities into tabular tasks has been explored, most existing methods
utilize static, target-agnostic textual representations, limiting their
effectiveness. We introduce TabSTAR: a Foundation Tabular Model with
Semantically Target-Aware Representations. TabSTAR is designed to enable
transfer learning on tabular data with textual features, with an architecture
free of dataset-specific parameters. It unfreezes a pretrained text encoder and
takes as input target tokens, which provide the model with the context needed
to learn task-specific embeddings. TabSTAR achieves state-of-the-art
performance for both medium- and large-sized datasets across known benchmarks
of classification tasks with text features, and its pretraining phase exhibits
scaling laws in the number of datasets, offering a pathway for further
performance improvements.
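
To make the idea of target tokens more concrete, here is a minimal sketch of how a table row and its candidate target values might be turned into text before reaching a text encoder. This is not the TabSTAR implementation: the function names (`verbalize_row`, `verbalize_targets`, `build_model_input`) and the string templates are hypothetical, chosen only to illustrate that the target is given to the model as input alongside the features.

```python
from typing import Dict, List


def verbalize_row(row: Dict[str, object]) -> List[str]:
    """Turn each cell into a short text such as 'age: 42' (assumed template)."""
    return [f"{column}: {value}" for column, value in row.items()]


def verbalize_targets(target_name: str, classes: List[str]) -> List[str]:
    """Build one text per candidate class, e.g. 'target. income: high'."""
    return [f"target. {target_name}: {label}" for label in classes]


def build_model_input(
    row: Dict[str, object], target_name: str, classes: List[str]
) -> List[str]:
    """Place the target texts ahead of the feature texts.

    In this sketch, each string would later be embedded by a text encoder,
    so the feature representations can be learned with the task in view.
    """
    return verbalize_targets(target_name, classes) + verbalize_row(row)


if __name__ == "__main__":
    example_row = {"age": 42, "job": "nurse"}
    for text in build_model_input(example_row, "income", ["low", "high"]):
        print(text)
```

Because nothing in the sketch depends on a particular column set or number of classes, the same code path handles any dataset, which mirrors the abstract's point about an architecture free of dataset-specific parameters.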

We’re working on making **TabSTAR** available to everyone. In the meantime, you can find the research code to pretrain the model here:

[🔗 GitHub Repository: alanarazi7/TabSTAR](https://github.com/alanarazi7/TabSTAR)

[🔗 Project page: eilamshapira.com/TabSTAR](https://eilamshapira.com/TabSTAR/)