# SetFit v1.0.0 Migration Guide
To update your code to work with v1.0.0, the following changes must be made:
## General Migration Guide
- `keep_body_frozen` from `SetFitModel.unfreeze` has been deprecated; simply pass `"head"`, `"body"`, or no arguments to unfreeze both.
- `SupConLoss` has been moved from `setfit.modeling` to `setfit.losses`. If you are importing it using `from setfit.modeling import SupConLoss`, import it using `from setfit import SupConLoss` instead.
- `use_auth_token` has been renamed to `token` in `SetFitModel.from_pretrained()`. `use_auth_token` will keep working until the next major version, but with a warning.
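A minimal before/after sketch of these changes (the checkpoint name and token are placeholders):

```python
from setfit import SetFitModel, SupConLoss  # previously: from setfit.modeling import SupConLoss

model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2",  # placeholder checkpoint
    token="hf_xxx",  # previously: use_auth_token="hf_xxx"
)

# previously: model.unfreeze(keep_body_frozen=True)
model.unfreeze("head")  # unfreeze only the head
model.unfreeze()        # unfreeze both body and head
```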
## Training Migration Guide
- Replace all uses of `SetFitTrainer` with `Trainer`, and all uses of `DistillationSetFitTrainer` with `DistillationTrainer`.
- Remove `num_iterations`, `num_epochs`, `learning_rate`, `batch_size`, `seed`, `use_amp`, `warmup_proportion`, `distance_metric`, `margin`, `samples_per_label` and `loss_class` from a `Trainer` initialization, and move them to a `TrainingArguments` initialization instead. This instance should then be passed to the trainer via the `args` argument.
  - `num_iterations` has been deprecated; the number of training steps should now be controlled via `num_epochs`, `max_steps`, or `EarlyStoppingCallback`.
  - `learning_rate` has been split up into `body_learning_rate` and `head_learning_rate`.
  - `loss_class` has been renamed to `loss`.
- Stop providing training arguments like `num_epochs` directly to `Trainer.train()`: pass a `TrainingArguments` instance via the `args` argument instead.
- Refactor multiple `trainer.train()`, `trainer.freeze()` and `trainer.unfreeze()` calls that were previously necessary to train the differentiable head into just one `trainer.train()` call by setting `batch_size` and `num_epochs` on the `TrainingArguments` dataclass with tuples, as shown in the sketch below. The first value in each tuple is for training the embeddings, and the second is for training the classifier.
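A minimal sketch of the updated training loop, assuming a small hand-rolled dataset; the tuple values only apply when using the differentiable head:

```python
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")
train_dataset = Dataset.from_dict({
    "text": ["loved it", "terrible service", "great value", "will not return"],
    "label": [1, 0, 1, 0],
})

# previously: SetFitTrainer(model=model, train_dataset=train_dataset, num_epochs=1, ...)
args = TrainingArguments(
    batch_size=(16, 2),       # (embedding phase, classifier phase)
    num_epochs=(1, 16),       # tuples replace separate train/freeze/unfreeze calls
    body_learning_rate=2e-5,  # previously the single `learning_rate`
    head_learning_rate=1e-2,
)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
```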
## Hard deprecations
`SetFitBaseModel`, `SKLearnWrapper` and `SetFitPipeline` have been removed. These can no longer be used starting from v1.0.0.
## v1.0.0 Changelog
This list contains new functionality that can be used starting from v1.0.0.
- `SetFitModel.from_pretrained()` now accepts new arguments:
  - `device`: Specifies the device on which to load the SetFit model.
  - `labels`: Specify labels corresponding to the training labels; useful if the training labels are integers ranging from `0` to `num_classes - 1`. These are automatically applied when calling `SetFitModel.predict()`.
  - `model_card_data`: Provide a `SetFitModelCardData` instance storing data such as model language, license, dataset name, etc., to be used in the automatically generated model cards.
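For instance (a hedged sketch; the checkpoint name and card metadata are placeholders):

```python
from setfit import SetFitModel, SetFitModelCardData

model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2",  # placeholder checkpoint
    device="cpu",                     # or e.g. "cuda:0"
    labels=["negative", "positive"],  # integer predictions 0/1 map to these strings
    model_card_data=SetFitModelCardData(
        language="en",
        license="apache-2.0",
    ),
)
```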
- Certain SetFit configuration options, such as the new `labels` argument from `SetFitModel.from_pretrained()`, now get saved in `config_setfit.json` files when a model is saved. This allows `labels` to be automatically fetched when a model is loaded.
- `SetFitModel.predict()` now accepts new arguments:
  - `batch_size` (defaults to `32`): The batch size to use in encoding the sentences to embeddings. Higher often means faster processing, but higher memory usage.
  - `use_labels` (defaults to `True`): Whether to use `SetFitModel.labels` to convert integer labels to string labels. Not used if the training labels are already strings.
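A quick sketch of the new `predict` arguments, reusing the `model` loaded in the example above:

```python
preds = model.predict(
    ["best pizza in town", "cold and stale"],
    batch_size=16,    # smaller batches trade speed for lower memory usage
    use_labels=True,  # return "positive"/"negative" rather than 1/0
)
```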
- `SetFitModel.encode()` has been introduced to convert input sentences to embeddings using the `SentenceTransformer` body.
- `SetFitModel.device` has been introduced to determine the device of the model.
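For example, continuing with the same `model`:

```python
embeddings = model.encode(["an example sentence", "another sentence"])
print(embeddings.shape)  # (2, <embedding dimension of the SentenceTransformer body>)
print(model.device)      # the device the model currently lives on, e.g. cpu or cuda:0
```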
- `AbsaTrainer` and `AbsaModel` have been introduced for applying SetFit to Aspect Based Sentiment Analysis.
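A hedged sketch of loading an ABSA model; the checkpoint name is a placeholder, and aspect extraction additionally requires a spaCy model to be installed:

```python
from setfit import AbsaModel

# Builds an aspect extraction model and a polarity model from the same base checkpoint.
model = AbsaModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

# After training, predictions return per-sentence lists of aspect spans with polarities.
preds = model.predict(["The food was great, but the service was slow."])
```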
- `Trainer` now supports a `callbacks` argument for a list of `transformers` `TrainerCallback` instances.
  - By default, all installed callbacks integrated with `transformers` are supported, including `TensorBoardCallback` and `WandbCallback` to log training logs to TensorBoard and W&B, respectively.
  - The `Trainer` will now print `embedding_loss` in the terminal, as well as `eval_embedding_loss` if `evaluation_strategy` is set to `"epoch"` or `"steps"` in `TrainingArguments`.
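A toy sketch of passing a custom callback, reusing `model` and `train_dataset` from the training example above; `LossPrinterCallback` is a hypothetical name:

```python
from transformers import TrainerCallback
from setfit import Trainer, TrainingArguments

class LossPrinterCallback(TrainerCallback):
    """Hypothetical callback that prints every log entry emitted during training."""

    def on_log(self, args, state, control, logs=None, **kwargs):
        print(logs)

trainer = Trainer(
    model=model,
    args=TrainingArguments(),
    train_dataset=train_dataset,
    callbacks=[LossPrinterCallback()],
)
trainer.train()
```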
- `Trainer.evaluate()` now works with string labels.
- An updated contrastive pair sampler increases the variety of training pairs.
- `TrainingArguments` supports various new arguments (a configuration sketch follows this list):
  - `output_dir`: The output directory where the model predictions and checkpoints will be written.
  - `max_steps`: If set to a positive number, the total number of training steps to perform. Overrides `num_epochs`. Training may stop before reaching the set number of steps when all data is exhausted.
  - `sampling_strategy`: The sampling strategy of how to draw pairs in training. Possible values are:
    - `"oversampling"`: Draws an even number of positive/negative sentence pairs until every sentence pair has been drawn.
    - `"undersampling"`: Draws the minimum number of positive/negative sentence pairs until every sentence pair in the minority class has been drawn.
    - `"unique"`: Draws every sentence pair combination (likely resulting in an unbalanced number of positive/negative sentence pairs).

    The default is `"oversampling"`, ensuring all sentence pairs are drawn at least once. Alternatively, setting `num_iterations` will override this argument and determine the number of generated sentence pairs.
  - `report_to`: The list of integrations to report the results and logs to. Supported platforms are `"azure_ml"`, `"comet_ml"`, `"mlflow"`, `"neptune"`, `"tensorboard"`, `"clearml"` and `"wandb"`. Use `"all"` to report to all installed integrations, `"none"` for no integrations.
  - `run_name`: A descriptor for the run. Typically used for wandb and mlflow logging.
  - `logging_strategy`: The logging strategy to adopt during training. Possible values are:
    - `"no"`: No logging is done during training.
    - `"epoch"`: Logging is done at the end of each epoch.
    - `"steps"`: Logging is done every `logging_steps`.
  - `logging_first_step`: Whether to log and evaluate the first `global_step` or not.
  - `logging_steps`: Number of update steps between two logs if `logging_strategy="steps"`.
  - `evaluation_strategy`: The evaluation strategy to adopt during training. Possible values are:
    - `"no"`: No evaluation is done during training.
    - `"steps"`: Evaluation is done (and logged) every `eval_steps`.
    - `"epoch"`: Evaluation is done at the end of each epoch.
  - `eval_steps`: Number of update steps between two evaluations if `evaluation_strategy="steps"`. Will default to the same value as `logging_steps` if not set.
  - `eval_delay`: Number of epochs or steps to wait before the first evaluation can be performed, depending on the `evaluation_strategy`.
  - `eval_max_steps`: If set to a positive number, the total number of evaluation steps to perform. The evaluation may stop before reaching the set number of steps when all data is exhausted.
  - `save_strategy`: The checkpoint save strategy to adopt during training. Possible values are:
    - `"no"`: No saving is done during training.
    - `"epoch"`: Saving is done at the end of each epoch.
    - `"steps"`: Saving is done every `save_steps`.
  - `save_steps`: Number of update steps between two checkpoint saves if `save_strategy="steps"`.
  - `save_total_limit`: If a value is passed, limits the total number of checkpoints. Deletes the older checkpoints in `output_dir`. Note that the best model is always preserved if `evaluation_strategy` is not `"no"`.
  - `load_best_model_at_end`: Whether or not to load the best model found during training at the end of training. When set to `True`, `save_strategy` needs to be the same as `evaluation_strategy`, and if it is `"steps"`, `save_steps` must be a round multiple of `eval_steps`.
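A hedged configuration sketch combining several of these arguments; the directory and run name are placeholders:

```python
from setfit import TrainingArguments

args = TrainingArguments(
    output_dir="checkpoints",          # placeholder directory
    max_steps=500,                     # overrides num_epochs
    sampling_strategy="undersampling",
    logging_strategy="steps",
    logging_steps=50,
    evaluation_strategy="steps",
    eval_steps=50,
    save_strategy="steps",
    save_steps=50,                     # a round multiple of eval_steps
    save_total_limit=2,
    load_best_model_at_end=True,
    report_to="tensorboard",
    run_name="setfit-migration-demo",  # placeholder run name
)
```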
- Pushing SetFit or SetFitABSA models to the Hub with `SetFitModel.push_to_hub()` or `AbsaModel.push_to_hub()` now results in a detailed model card. As an example, see this SetFitModel or this SetFitABSA polarity model.