Model Card for Microsoft-phi-4-Instruct-AutoRound-GPTQ-4bit

Model Overview

Model Name: Microsoft-phi-4-Instruct-AutoRound-GPTQ-4bit
Model Type: Instruction-tuned, Quantized GPT-4-based language model
Quantization: GPTQ 4-bit
Author: Satwik11
Hosted on: Hugging Face

Description

This model is a quantized version of the Microsoft phi-4 Instruct model, designed to deliver high performance while maintaining computational efficiency. By leveraging the GPTQ 4-bit quantization method, it enables deployment in environments with limited resources while retaining a high degree of accuracy.

The model is fine-tuned for instruction-following tasks, making it ideal for applications in conversational AI, question answering, and general-purpose text generation.

Key Features

  • Instruction-tuned: Fine-tuned to follow human-like instructions effectively.
  • Quantized for Efficiency: Uses GPTQ 4-bit quantization to reduce memory requirements and inference latency.
  • Pre-trained Base: Built on the Microsoft phi-4 framework, ensuring state-of-the-art performance on NLP tasks.

Use Cases

  • Chatbots and virtual assistants.
  • Summarization and content generation.
  • Research and educational applications.
  • Semantic search and knowledge retrieval.

Model Details

Architecture

  • Base Model: Microsoft phi-4
  • Quantization Technique: GPTQ (4-bit)
  • Language: English
  • Training Objective: Instruction-following fine-tuning
Downloads last month
321
Safetensors
Model size
2.85B params
Tensor type
I32
BF16
FP16
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.
The model cannot be deployed to the HF Inference API: The model has no library tag.

Model tree for Satwik11/Microsoft-phi-4-Instruct-AutoRound-GPTQ-4bit

Base model

microsoft/phi-4
Quantized
(111)
this model
Adapters
1 model