Update README.md
README.md
CHANGED
@@ -23,34 +23,64 @@ The two versions are designed for different application scenarios.
Jellyfish-13B is suitable for integration into larger data management systems because its simple, clear responses can be easily transformed into code.
Jellyfish-13B-Interpreter, on the other hand, is more user-oriented: its responses give users in-depth data insights without requiring advanced coding skills or an intricate grasp of statistics.

More details about the model can be found in the [Jellyfish paper](linktobeadded).

## Performance on seen tasks

| Task | Type | Dataset | Non-LLM SoTA<sup>1</sup> | GPT-3.5<sup>2</sup> | GPT-4<sup>2</sup> | Jellyfish-13B-1.1<sup>3</sup> | Jellyfish-13B-Interpreter |
| ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| Entity Matching | Seen | Fodors-Zagats | 100 | 100 | 100 | 100 | 100 |
| Entity Matching | Seen | Beer | 94.37 | 96.30 | 100 | 96.77 | 100 |
| Entity Matching | Seen | iTunes-Amazon | 97.06 | 96.43 | 100 | 98.11 | 96.15 |
| Entity Matching | Seen | DBLP-ACM | 98.99 | 96.99 | 97.44 | 98.98 | 95.74 |
| Entity Matching | Seen | DBLP-GoogleScholar | 95.60 | 76.12 | 91.87 | 98.51 | 89.45 |
| Entity Matching | Seen | Amazon-Google | 75.58 | 66.53 | 74.21 | 81.34 | 56.64 |
| Entity Matching | Unseen | Walmart-Amazon | 86.76 | 86.17 | 90.27 | 89.42 | 85.16 |
| Entity Matching | Unseen | Abt-Buy | 89.33 | -- | 92.77 | 89.58 | -- |
| Data Imputation | Seen | Restaurant | 77.20 | 94.19 | 97.67 | 94.19 | 93.02 |
| Data Imputation | Seen | Buy | 96.50 | 98.46 | 100 | 100 | 100 |
| Data Imputation | Unseen | Flipkart | 68.00 | -- | 89.94 | 81.68 | -- |
| Data Imputation | Unseen | Phone | 86.70 | -- | 90.79 | 87.21 | -- |
| Error Detection | Seen | Hospital | 94.40 | 90.74 | 90.74 | 95.59 | 65.66 |
| Error Detection | Seen | Adult | 99.10 | 92.01 | 92.01 | 99.33 | 90.13 |
| Error Detection | Unseen | Flights | 81.00 | -- | 83.48 | 82.52 | -- |
| Error Detection | Unseen | Rayyan | 79.00 | -- | 81.95 | 90.65 | -- |
| Schema Matching | Seen | Synthea | 38.50 | 57.14 | 66.67 | 36.36 | 30.77 |
| Schema Matching | Seen | MIMIC | 20.00 | -- | 40.00 | 40.00 | -- |
| Schema Matching | Unseen | CMS | 50.00 | -- | 19.35 | 59.29 | -- |

_Few-shot is disabled for Jellyfish-13B on seen datasets and enabled on unseen datasets._
_Accuracy is the metric for data imputation; the F1 score is the metric for the other tasks._
_For GPT-3.5 and GPT-4, we used the few-shot approach, while for Jellyfish and Jellyfish-Interpreter, the zero-shot approach was employed._

1.
[Ditto](https://arxiv.org/abs/2004.00584) for Entity Matching
[SMAT](https://www.researchgate.net/publication/353920530_SMAT_An_Attention-Based_Deep_Learning_Solution_to_the_Automation_of_Schema_Matching) for Schema Matching
[HoloDetect](https://arxiv.org/abs/1904.02285) for Error Detection seen datasets
[RAHA](https://dl.acm.org/doi/10.1145/3299869.3324956) for Error Detection unseen datasets
[HoloClean](https://arxiv.org/abs/1702.00820) for Data Imputation
2.
[Large Language Models as Data Preprocessors](https://arxiv.org/abs/2308.16361)
3. We have updated the main branch with Jellyfish-13B version 1.1.

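
For readers who want to reproduce the kind of scores reported for the seen tasks, the two metrics named in the notes (accuracy for data imputation, F1 for the other tasks) can be sketched as follows. This is only an illustration: the README does not say how F1 is averaged, so the sketch assumes a binary F1 on the positive class, and the function names and the `"Yes"` label are hypothetical choices rather than part of the Jellyfish code base.

```python
from typing import Sequence


def accuracy(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Fraction of predictions that exactly match the gold values."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def f1_score(gold: Sequence[str], pred: Sequence[str], positive: str = "Yes") -> float:
    """Binary F1 on the positive class (assumed; averaging is not stated in the README)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0  # precision or recall is zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = ["Yes", "No", "Yes", "No"]
    pred = ["Yes", "Yes", "No", "No"]
    # The tables report percentages, so scale by 100 when comparing.
    print(f"accuracy: {100 * accuracy(gold, pred):.2f}")
    print(f"F1:       {100 * f1_score(gold, pred):.2f}")
```
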
## Performance on unseen tasks

### Column Type Annotation

| Dataset | RoBERTa (159 shots)<sup>1</sup> | GPT-3.5<sup>1</sup> | GPT-4 | Jellyfish-13B-1.1 |
| ---- | ---- | ---- | ---- | ---- |
| SOTAB | 79.20 | 89.47 | 91.55 | 82.00 |

1. Results from [Column Type Annotation using ChatGPT](https://arxiv.org/abs/2306.00745)

### Attribute Value Extraction

| Dataset | Stable Beluga 2 70B<sup>1</sup> | SOLAR 70B<sup>1</sup> | GPT-3.5<sup>1</sup> | GPT-4<sup>1</sup> | Jellyfish-13B-1.1 |
| ---- | ---- | ---- | ---- | ---- | ---- |
| AE-110k | 52.10 | 49.20 | 61.30 | 55.50 | 58.12 |
| OA-Mine | 50.80 | 55.20 | 62.70 | 68.90 | 55.96 |

1. Results from [Product Attribute Value Extraction using Large Language Models](https://arxiv.org/abs/2310.12537)

- **Developed by:** Haochen Zhang, Yuyang Dong, Chuan Xiao, Masafumi Oyamada
- **Contact: [email protected]**

@@ -136,6 +166,15 @@ Attribute B is [name: {value of name}, description: {value of description}].
Are Attribute A and Attribute B semantically equivalent? Choose your answer from: [Yes, No].
```
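
As a rough illustration of how the closing lines of the schema matching prompt above might be filled in and how the model's Yes/No reply could be turned into a value usable in code, consider the sketch below. It covers only the question portion visible in this excerpt; the rest of the prompt is not reproduced. The `Attribute A` line is assumed by symmetry with the `Attribute B` line, and all function names and sample attributes are hypothetical.

```python
from typing import Tuple

QUESTION = (
    "Are Attribute A and Attribute B semantically equivalent? "
    "Choose your answer from: [Yes, No]."
)


def describe_attribute(label: str, name: str, description: str) -> str:
    """Render one attribute line in the bracketed style used by the prompt."""
    return f"Attribute {label} is [name: {name}, description: {description}]."


def build_schema_matching_question(attr_a: Tuple[str, str], attr_b: Tuple[str, str]) -> str:
    """Assemble the attribute lines and the final question (assumed layout)."""
    lines = [
        describe_attribute("A", *attr_a),  # assumed to mirror the Attribute B line
        describe_attribute("B", *attr_b),
        QUESTION,
    ]
    return "\n".join(lines)


def parse_yes_no(response: str) -> bool:
    """Map a 'Yes'/'No' reply to a boolean; reject anything else."""
    answer = response.strip().strip(".").lower()
    if answer == "yes":
        return True
    if answer == "no":
        return False
    raise ValueError(f"unexpected answer: {response!r}")


if __name__ == "__main__":
    prompt = build_schema_matching_question(
        ("dob", "date of birth of the patient"),
        ("birth_date", "the day the person was born"),
    )
    print(prompt)
    print(parse_yes_no(" Yes."))
```

Raising on anything other than a plain Yes or No keeps malformed replies from being silently counted as a match or a non-match.
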

### For Column Type Annotation

We follow the prompt in [Column Type Annotation using ChatGPT](https://arxiv.org/abs/2306.00745) (text+inst+2-step).

### For Attribute Value Extraction

We follow the prompt in [Product Attribute Value Extraction using Large Language Models](https://arxiv.org/abs/2310.12537) (textual, w/o examples).

### Jellyfish-13B-Interpreter
#### For Entity Matching
```