ONNX Runtime Pipelines
optimum.onnxruntime.pipeline
( task: str | None = None, model: str | ORTModel | None = None, config: str | PretrainedConfig | None = None, tokenizer: str | PreTrainedTokenizer | PreTrainedTokenizerFast | None = None, feature_extractor: str | FeatureExtractionMixin | None = None, image_processor: str | BaseImageProcessor | None = None, processor: str | ProcessorMixin | None = None, revision: str | None = None, use_fast: bool = True, token: str | bool | None = None, device: int | str | torch.device | None = None, trust_remote_code: bool | None = None, model_kwargs: dict[str, Any] | None = None, pipeline_class: Any | None = None, **kwargs: Any ) → Pipeline
Parameters
-  task (str) — The task defining which pipeline will be returned. Currently accepted tasks are:
  - "audio-classification": will return an AudioClassificationPipeline.
  - "automatic-speech-recognition": will return an AutomaticSpeechRecognitionPipeline.
  - "depth-estimation": will return a DepthEstimationPipeline.
  - "document-question-answering": will return a DocumentQuestionAnsweringPipeline.
  - "feature-extraction": will return a FeatureExtractionPipeline.
  - "fill-mask": will return a FillMaskPipeline.
  - "image-classification": will return an ImageClassificationPipeline.
  - "image-feature-extraction": will return an ImageFeatureExtractionPipeline.
  - "image-segmentation": will return an ImageSegmentationPipeline.
  - "image-text-to-text": will return an ImageTextToTextPipeline.
  - "image-to-image": will return an ImageToImagePipeline.
  - "image-to-text": will return an ImageToTextPipeline.
  - "mask-generation": will return a MaskGenerationPipeline.
  - "object-detection": will return an ObjectDetectionPipeline.
  - "question-answering": will return a QuestionAnsweringPipeline.
  - "summarization": will return a SummarizationPipeline.
  - "table-question-answering": will return a TableQuestionAnsweringPipeline.
  - "text2text-generation": will return a Text2TextGenerationPipeline.
  - "text-classification" (alias "sentiment-analysis" available): will return a TextClassificationPipeline.
  - "text-generation": will return a TextGenerationPipeline.
  - "text-to-audio" (alias "text-to-speech" available): will return a TextToAudioPipeline.
  - "token-classification" (alias "ner" available): will return a TokenClassificationPipeline.
  - "translation": will return a TranslationPipeline.
  - "translation_xx_to_yy": will return a TranslationPipeline.
  - "video-classification": will return a VideoClassificationPipeline.
  - "visual-question-answering": will return a VisualQuestionAnsweringPipeline.
  - "zero-shot-classification": will return a ZeroShotClassificationPipeline.
  - "zero-shot-image-classification": will return a ZeroShotImageClassificationPipeline.
  - "zero-shot-audio-classification": will return a ZeroShotAudioClassificationPipeline.
  - "zero-shot-object-detection": will return a ZeroShotObjectDetectionPipeline.
 
-  model (str or ORTModel, optional) — The model that will be used by the pipeline to make predictions. This can be a model identifier or an actual instance of an ONNX Runtime model inheriting from ORTModel. If not provided, the default model for the task will be loaded.
-  config (str or PretrainedConfig, optional) — The configuration that will be used by the pipeline to instantiate the model. This can be a model identifier or an actual pretrained model configuration inheriting from PretrainedConfig. If not provided, the default configuration file for the requested model will be used. That means that if model is given, its default configuration will be used. However, if model is not supplied, this task's default model's config is used instead.
-  tokenizer (str or PreTrainedTokenizer, optional) — The tokenizer that will be used by the pipeline to encode data for the model. This can be a model identifier or an actual pretrained tokenizer inheriting from PreTrainedTokenizer. If not provided, the default tokenizer for the given model will be loaded (if it is a string). If model is not specified or not a string, then the default tokenizer for config is loaded (if it is a string). However, if config is also not given or not a string, then the default tokenizer for the given task will be loaded.
-  feature_extractor (str or PreTrainedFeatureExtractor, optional) — The feature extractor that will be used by the pipeline to encode data for the model. This can be a model identifier or an actual pretrained feature extractor inheriting from PreTrainedFeatureExtractor. Feature extractors are used for non-NLP models, such as Speech or Vision models as well as multi-modal models. Multi-modal models will also require a tokenizer to be passed. If not provided, the default feature extractor for the given model will be loaded (if it is a string). If model is not specified or not a string, then the default feature extractor for config is loaded (if it is a string). However, if config is also not given or not a string, then the default feature extractor for the given task will be loaded.
-  image_processor (str or BaseImageProcessor, optional) — The image processor that will be used by the pipeline to preprocess images for the model. This can be a model identifier or an actual image processor inheriting from BaseImageProcessor. Image processors are used for Vision models and multi-modal models that require image inputs. Multi-modal models will also require a tokenizer to be passed. If not provided, the default image processor for the given model will be loaded (if it is a string). If model is not specified or not a string, then the default image processor for config is loaded (if it is a string).
-  processor (str or ProcessorMixin, optional) — The processor that will be used by the pipeline to preprocess data for the model. This can be a model identifier or an actual processor inheriting from ProcessorMixin. Processors are used for multi-modal models that require multi-modal inputs, for example, a model that requires both text and image inputs. If not provided, the default processor for the given model will be loaded (if it is a string). If model is not specified or not a string, then the default processor for config is loaded (if it is a string).
-  revision (str, optional, defaults to "main") — When passing a task name or a string model identifier: the specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
-  use_fast (bool, optional, defaults to True) — Whether or not to use a Fast tokenizer if possible (a PreTrainedTokenizerFast).
-  token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running hf auth login (stored in ~/.huggingface).
-  device (int or str or torch.device, optional) — Defines the device (e.g., "cpu", "cuda:1", "mps", or a GPU ordinal rank like 1) on which this pipeline will be allocated (a usage sketch follows this parameter list).
-  device_map (str or dict[str, Union[int, str, torch.device]], optional) — Sent directly as model_kwargs (just a simpler shortcut). When the accelerate library is present, set device_map="auto" to compute the most optimized device_map automatically (see the accelerate documentation for more information). Do not use device_map AND device at the same time, as they will conflict.
-  torch_dtype (str or torch.dtype, optional) — Sent directly as model_kwargs (just a simpler shortcut) to use the available precision for this model (torch.float16, torch.bfloat16, … or "auto").
-  trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom code defined on the Hub in their own modeling, configuration, tokenization or even pipeline files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
-  model_kwargs (dict[str, Any], optional) — Additional dictionary of keyword arguments passed along to the model's from_pretrained(..., **model_kwargs) function.
-  kwargs (dict[str, Any], optional) — Additional keyword arguments passed along to the specific pipeline init (see the documentation for the corresponding pipeline class for possible values).
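A minimal sketch combining the device and model_kwargs parameters; the "provider" key below is an assumption about what your installed version of ORTModel.from_pretrained accepts, so verify it against the ORTModel documentation:

>>> from optimum.onnxruntime import pipeline

>>> # Pin the pipeline to CPU and forward extra kwargs to the model's
>>> # from_pretrained; the provider value is hypothetical and depends on
>>> # your ONNX Runtime build.
>>> classifier = pipeline(
...     "text-classification",
...     model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
...     device="cpu",
...     model_kwargs={"provider": "CPUExecutionProvider"},
... )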
Returns
Pipeline
A suitable pipeline for the task.
Utility factory method to build a Pipeline with an ONNX Runtime model, similar to transformers.pipeline.
A pipeline consists of:
- One or more components for pre-processing model inputs, such as a tokenizer, image_processor, feature_extractor, or processor.
- A model that generates predictions from the inputs.
- Optional post-processing steps to refine the model’s output, which can also be handled by processors.
Although `tokenizer`, `feature_extractor`, `image_processor`, and `processor` are all optional arguments, they should not all be specified at once. If these components are not provided, `pipeline` will try to load the required ones automatically. To provide them explicitly, refer to the documentation of the specific pipeline for details on which components it requires.
Examples:
>>> from optimum.onnxruntime import pipeline
>>> # Sentiment analysis pipeline
>>> analyzer = pipeline("sentiment-analysis")
>>> # Question answering pipeline, specifying the checkpoint identifier
>>> oracle = pipeline(
...     "question-answering", model="distilbert/distilbert-base-cased-distilled-squad", tokenizer="google-bert/bert-base-cased"
... )
>>> # Named entity recognition pipeline, passing in a specific model and tokenizer
>>> from optimum.onnxruntime import ORTModelForTokenClassification
>>> from transformers import AutoTokenizer
>>> model = ORTModelForTokenClassification.from_pretrained("dbmdz/bert-large-cased-finetuned-conll03-english")
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-cased")
>>> recognizer = pipeline("ner", model=model, tokenizer=tokenizer)
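A further sketch: exporting a PyTorch checkpoint to ONNX on the fly and handing the result to the pipeline. The export=True flag belongs to ORTModel.from_pretrained in optimum; verify the exact behavior against your installed version.

>>> from optimum.onnxruntime import ORTModelForSequenceClassification

>>> # Export the PyTorch weights to ONNX at load time, then wrap the
>>> # resulting ORTModel in a pipeline like any other model instance.
>>> model = ORTModelForSequenceClassification.from_pretrained(
...     "distilbert/distilbert-base-uncased-finetuned-sst-2-english", export=True
... )
>>> classifier = pipeline("text-classification", model=model)
>>> classifier("ONNX Runtime makes inference fast!")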