SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
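
The contrastive step works on pairs of sentences rather than single labeled sentences. The sketch below shows one way such pairs could be built from a handful of labeled examples: two texts with the same label get a target of 1.0, two texts with different labels get 0.0. The function name `build_contrastive_pairs` is hypothetical and this is not the SetFit library's own implementation; the example texts are taken from the Model Labels table.

```python
from itertools import combinations


def build_contrastive_pairs(examples):
    """Turn (text, label) examples into (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration only.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Four labeled examples copied from the Model Labels table.
examples = [
    ("Suggest me some high ankle sneakers", "product discoverability"),
    ("Do you have any sneaker collaborations with artists?", "product discoverability"),
    ("I havent received my order", "order tracking"),
    ("what is mashru silk", "general faq"),
]

pairs = build_contrastive_pairs(examples)
positives = [p for p in pairs if p[2] == 1.0]
print(len(pairs), len(positives))
```

Four examples yield six pairs, only one of which is positive. This imbalance between positive and negative pairs is why a sampling strategy is needed when the pairs are fed to the contrastive loss.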

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-mpnet-base-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 6

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
Out of Scope	'Why is your website so slow?' 'Can I get a shoutout on your social media?' 'I like to listen to classical music'
product faq	'What is the price of the Temple Butidaar Multi Color Border Pure Silk Chiffon Georgette Saree?' 'Do you have the Air Jordan 1 Low Shadow Brown/Brown Kelp- Sail in size 7?' 'Is the lakadong turmeric powder available for purchase?'
order tracking	'What is the expected delivery time for the 10 pack of Cake Boxes to Bhopal?' 'What is the delivery status for my order placed using email address [email protected]?' 'I havent received my order'
product policy	'What is the policy for returning a product that was part of a Cyber Monday sale?' 'Are there any exceptions to the return policy for items that were purchased with a special occasion promotion?' 'Are there any restrictions on returning sneakers with added fur or fur trim?'
product discoverability	'Suggest me some high ankle sneakers' 'Do you have any grocery & gourmet honey available?' 'Do you have any sneaker collaborations with artists?'
general faq	'How many cups of green tea should I drink daily to achieve the recommended therapeutic dosage of ECGC?' 'what is mashru silk' 'What specific compounds in Green Tea contribute to its antioxidant properties?'

Evaluation

Metrics

Label	Accuracy
all	0.8667
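
Accuracy here is the fraction of evaluation examples whose predicted label matches the true label. The following is a minimal sketch of that computation; the labels in `y_true` and `y_pred` are made up for illustration and are not the evaluation data behind the reported 0.8667.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels for illustration only (5 of 6 predictions are correct).
y_true = ["product faq", "order tracking", "general faq",
          "Out of Scope", "product policy", "product faq"]
y_pred = ["product faq", "order tracking", "general faq",
          "Out of Scope", "product policy", "product discoverability"]

score = accuracy(y_true, y_pred)
print(round(score, 4))
```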

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

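# Download the model from the 🤗 Hub; replace "setfit_model_id" with the model's repository ID
model = SetFitModel.from_pretrained("setfit_model_id")
# Run inference on a single query; the result is the predicted label
preds = model("Are there any sarees with Fekwa Weave technique?")
print(preds)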
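
In an application, the predicted label would typically decide what happens to the query next. The sketch below routes a batch of queries by label and sends anything classified as Out of Scope to a fallback. It assumes that the loaded model can be called with a list of strings and returns one string label per input. The function `route_queries`, the handler names, and the stand-in classifier `fake_classify` are all hypothetical; the stand-in exists only so the example runs without downloading the model.

```python
OUT_OF_SCOPE = "Out of Scope"  # label name taken from the Model Labels table


def route_queries(classify, texts, handlers, fallback):
    """Map each text to a handler based on its predicted label.

    classify: any callable taking a list of strings and returning one label
              per string (assumption: a loaded SetFitModel behaves this way).
    handlers: dict from label to handler name (hypothetical names).
    fallback: handler used for Out of Scope or unknown labels.
    """
    labels = list(classify(list(texts)))
    routed = []
    for label in labels:
        label = str(label)
        if label == OUT_OF_SCOPE or label not in handlers:
            routed.append(fallback)
        else:
            routed.append(handlers[label])
    return routed


def fake_classify(texts):
    """Stand-in classifier so the sketch runs offline; not the real model."""
    return ["order tracking" if "order" in t.lower() else OUT_OF_SCOPE for t in texts]


handlers = {"order tracking": "order-status service", "product policy": "returns desk"}
result = route_queries(
    fake_classify,
    ["I havent received my order", "I like to listen to classical music"],
    handlers,
    fallback="human agent",
)
print(result)
```

To use the real classifier, pass the loaded `model` in place of `fake_classify`.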

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	4	11.1737	28

Label	Training Sample Count
Out of Scope	35
general faq	24
order tracking	34
product discoverability	40
product faq	40
product policy	40

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (2, 2)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True
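
The batch size and sampling strategy above, together with the per-label sample counts, are enough to estimate how many contrastive training steps one epoch takes. The sketch below does that arithmetic under two assumptions that are not stated in this card: positive pairs are counted as distinct same-label pairs, and oversampling repeats the smaller pair set until it matches the larger one, so an epoch contains twice the larger count.

```python
from math import ceil, comb

# Per-label training sample counts from the Training Set Metrics table.
counts = {
    "Out of Scope": 35,
    "general faq": 24,
    "order tracking": 34,
    "product discoverability": 40,
    "product faq": 40,
    "product policy": 40,
}
batch_size = 16  # embedding-phase batch size from the hyperparameters

n_total = sum(counts.values())

# Same-label pairs (positives) and different-label pairs (negatives).
pos_pairs = sum(comb(n, 2) for n in counts.values())
neg_pairs = comb(n_total, 2) - pos_pairs

# Assumption: oversampling pads the smaller set up to the size of the larger.
pairs_per_epoch = 2 * max(pos_pairs, neg_pairs)
steps_per_epoch = ceil(pairs_per_epoch / batch_size)

print(n_total, pos_pairs, neg_pairs, pairs_per_epoch, steps_per_epoch)
```

Under these assumptions the estimate is 2351 steps per epoch, which agrees with the Training Results table, where step 4700 falls at epoch 1.9991. Because negative pairs far outnumber positive pairs here, the estimate does not depend on exactly how positives are counted.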

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0004	1	0.256	-
0.0213	50	0.2639	-
0.0425	100	0.2341	-
0.0638	150	0.0407	-
0.0851	200	0.0698	-
0.1063	250	0.014	-
0.1276	300	0.0069	-
0.1489	350	0.0099	-
0.1701	400	0.0014	-
0.1914	450	0.0007	-
0.2127	500	0.0006	-
0.2339	550	0.0005	-
0.2552	600	0.0006	-
0.2765	650	0.0005	-
0.2977	700	0.0002	-
0.3190	750	0.0005	-
0.3403	800	0.0003	-
0.3615	850	0.0003	-
0.3828	900	0.0002	-
0.4041	950	0.0003	-
0.4254	1000	0.0002	-
0.4466	1050	0.0002	-
0.4679	1100	0.0001	-
0.4892	1150	0.0002	-
0.5104	1200	0.0002	-
0.5317	1250	0.0001	-
0.5530	1300	0.0002	-
0.5742	1350	0.0002	-
0.5955	1400	0.0001	-
0.6168	1450	0.0002	-
0.6380	1500	0.0002	-
0.6593	1550	0.0001	-
0.6806	1600	0.0001	-
0.7018	1650	0.0001	-
0.7231	1700	0.0001	-
0.7444	1750	0.0001	-
0.7656	1800	0.0001	-
0.7869	1850	0.0001	-
0.8082	1900	0.0001	-
0.8294	1950	0.0001	-
0.8507	2000	0.0001	-
0.8720	2050	0.0001	-
0.8932	2100	0.0001	-
0.9145	2150	0.0002	-
0.9358	2200	0.0002	-
0.9570	2250	0.0002	-
0.9783	2300	0.0001	-
0.9996	2350	0.0001	-
1.0208	2400	0.0001	-
1.0421	2450	0.0002	-
1.0634	2500	0.0001	-
1.0846	2550	0.0001	-
1.1059	2600	0.0001	-
1.1272	2650	0.0002	-
1.1484	2700	0.0001	-
1.1697	2750	0.0001	-
1.1910	2800	0.0001	-
1.2123	2850	0.0001	-
1.2335	2900	0.0001	-
1.2548	2950	0.0001	-
1.2761	3000	0.0001	-
1.2973	3050	0.0001	-
1.3186	3100	0.0001	-
1.3399	3150	0.0001	-
1.3611	3200	0.0001	-
1.3824	3250	0.0001	-
1.4037	3300	0.0001	-
1.4249	3350	0.0001	-
1.4462	3400	0.0001	-
1.4675	3450	0.0001	-
1.4887	3500	0.0001	-
1.5100	3550	0.0001	-
1.5313	3600	0.0001	-
1.5525	3650	0.0001	-
1.5738	3700	0.0001	-
1.5951	3750	0.0001	-
1.6163	3800	0.0001	-
1.6376	3850	0.0	-
1.6589	3900	0.0001	-
1.6801	3950	0.0001	-
1.7014	4000	0.0001	-
1.7227	4050	0.0001	-
1.7439	4100	0.0001	-
1.7652	4150	0.0001	-
1.7865	4200	0.0001	-
1.8077	4250	0.0001	-
1.8290	4300	0.0001	-
1.8503	4350	0.0001	-
1.8715	4400	0.0	-
1.8928	4450	0.0001	-
1.9141	4500	0.0001	-
1.9353	4550	0.0001	-
1.9566	4600	0.0001	-
1.9779	4650	0.0001	-
1.9991	4700	0.0001	-

Framework Versions

Python: 3.10.16
SetFit: 1.0.3
Sentence Transformers: 2.7.0
Transformers: 4.40.2
PyTorch: 2.2.2
Datasets: 2.19.1
Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Shankhdhar/classifier_woog_base_oos