---
license: apache-2.0
base_model: microsoft/DialoGPT-medium
tags:
- intent-classification
- text-classification
- onnx
- transformers.js
- nlp
language:
- en
metrics:
- accuracy
- f1
library_name: transformers
pipeline_tag: text-classification
---

# Intent Classifier - MiniLM

A fine-tuned intent classification model based on MiniLM, optimized for fast inference with multiple ONNX quantization variants.

## Model Description

This model is designed for intent classification tasks and has been converted to ONNX format for efficient deployment in various environments, including web browsers using Transformers.js.

## Model Variants

This repository contains multiple ONNX model variants optimized for different use cases:

| Model File | Description | Use Case |
|------------|-------------|----------|
| `model.onnx` | Original ONNX model | Best accuracy, larger size |
| `model_fp16.onnx` | 16-bit floating point | Good balance of accuracy and speed |
| `model_int8.onnx` | 8-bit integer quantized | Faster inference, smaller size |
| `model_q4.onnx` | 4-bit quantized | Very fast, very small |
| `model_q4f16.onnx` | 4-bit with FP16 | Optimized for specific hardware |
| `model_quantized.onnx` | Standard quantized | General-purpose optimization |
| `model_uint8.onnx` | Unsigned 8-bit | Mobile/edge deployment |
| `model_bnb4.onnx` | BitsAndBytes 4-bit | Advanced quantization |

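Each variant is a standalone ONNX file, so a specific one can be fetched on its own rather than downloading the whole repository. A minimal sketch using `huggingface_hub` (the `onnx/` subfolder path follows the ONNX Runtime example below; adjust it to the actual repository layout):

```python
import onnxruntime as ort
from huggingface_hub import hf_hub_download

# Download only the variant you need; swap in any filename from the table above.
model_path = hf_hub_download(
    repo_id="kousik-2310/intent-classifier-minilm",
    filename="onnx/model_int8.onnx",
)

# The downloaded file is a regular ONNX model and loads directly.
session = ort.InferenceSession(model_path)
```
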
## Quick Start

### Using with Transformers.js (Browser)

```javascript
import { pipeline } from '@xenova/transformers';

// Load the model
const classifier = await pipeline('text-classification', 'kousik-2310/intent-classifier-minilm');

// Classify text
const result = await classifier('I want to book a flight to New York');
console.log(result);
```

### Using with Python/Transformers

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("kousik-2310/intent-classifier-minilm")
model = AutoModelForSequenceClassification.from_pretrained("kousik-2310/intent-classifier-minilm")

# Create pipeline
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Classify text
result = classifier("I want to book a flight to New York")
print(result)
```

### Using ONNX Runtime

```python
import onnxruntime as ort
from transformers import AutoTokenizer

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("kousik-2310/intent-classifier-minilm")

# Load ONNX model
session = ort.InferenceSession("onnx/model_int8.onnx")

# Tokenize input
text = "I want to book a flight to New York"
inputs = tokenizer(text, return_tensors="np", padding=True, truncation=True)

# Run inference
outputs = session.run(None, {
    "input_ids": inputs["input_ids"],
    "attention_mask": inputs["attention_mask"],
})

# Logits for each intent class
predictions = outputs[0]
```

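The session returns raw logits of shape `(batch_size, num_labels)`. A minimal post-processing sketch that continues from the block above, assuming the repository's `config.json` carries the standard `id2label` mapping for sequence-classification checkpoints:

```python
import numpy as np
from transformers import AutoConfig

# id2label maps class indices to intent names; verify it exists in this repo.
config = AutoConfig.from_pretrained("kousik-2310/intent-classifier-minilm")

logits = predictions[0]                # first (and only) sequence in the batch
probs = np.exp(logits - logits.max())  # numerically stable softmax
probs /= probs.sum()

best = int(np.argmax(probs))
print(config.id2label[best], float(probs[best]))
```
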
## Model Architecture

- **Base Model**: MiniLM architecture
- **Task**: Text Classification (Intent Recognition)
- **Framework**: PyTorch → ONNX
- **Quantization**: Multiple variants available

## Performance

The model provides different performance characteristics based on the variant used:

- **Accuracy**: Best with `model.onnx`, good with quantized versions
- **Speed**: Fastest with `model_q4.onnx` and `model_int8.onnx`
- **Size**: Smallest with quantized variants (4-bit, 8-bit)

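These trade-offs are best measured on your own hardware and inputs. A rough micro-benchmark sketch with ONNX Runtime (variant filenames and the `onnx/` path as above; absolute numbers are environment-dependent, and this assumes all variants accept the same tokenizer outputs):

```python
import time

import onnxruntime as ort
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("kousik-2310/intent-classifier-minilm")
encoded = tokenizer("I want to book a flight to New York", return_tensors="np")

for variant in ["model.onnx", "model_fp16.onnx", "model_int8.onnx", "model_q4.onnx"]:
    path = hf_hub_download(
        repo_id="kousik-2310/intent-classifier-minilm",
        filename=f"onnx/{variant}",
    )
    session = ort.InferenceSession(path)

    # Feed only the inputs this particular export declares (some exports
    # also expect token_type_ids, which the tokenizer provides).
    feed = {i.name: encoded[i.name] for i in session.get_inputs() if i.name in encoded}

    start = time.perf_counter()
    for _ in range(100):
        session.run(None, feed)
    print(f"{variant}: {(time.perf_counter() - start) / 100 * 1e3:.2f} ms/run")
```
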
## Intended Use

This model is intended for:
- Intent classification in chatbots and virtual assistants
- Text classification tasks
- Real-time inference in web applications
- Edge deployment scenarios

## Training Details

The model has been fine-tuned for intent classification and converted to multiple ONNX formats for deployment flexibility.

## Limitations and Bias

- Model performance depends on how closely your use case matches the training data
- Quantized models may have slightly reduced accuracy compared to the full-precision model
- Performance may vary based on the deployment environment

## How to Cite

```bibtex
@misc{intent-classifier-minilm,
  title={Intent Classifier MiniLM},
  author={kousik-2310},
  year={2024},
  url={https://huggingface.co/kousik-2310/intent-classifier-minilm}
}
```

## License

This model is released under the Apache 2.0 License.