---
language:
- en
license: mit
library_name: llama.cpp
tags:
- nlp
- information-extraction
- event-detection
datasets:
- custom_dataset
metrics:
- accuracy
base_model: deepseek-ai/deepseek-r1-distill-qwen-1.5b
model-index:
- name: DeepSeek R1 Distill Qwen 1.5B - Fine-tuned Version (Q4_K_M.gguf)
  results:
  - task:
      type: text-generation
      name: Actionable Information Extraction
    dataset:
      type: custom_dataset
      name: Custom Dataset for Event & Bullet Extraction
    metrics:
    - type: latency
      value: 33.78
      name: Prompt Eval Time (ms)
      args:
        device: 3090 TI
---

# DeepSeek R1 Distill Qwen 1.5B - Fine-tuned Version (Q4_K_M.gguf)

## Overview

This is a fine-tuned version of **DeepSeek R1 Distill Qwen 1.5B**, optimized for extracting actionable insights and scheduling events from conversations. The model was fine-tuned for **2500 steps (9 epochs)** on **2194 examples**, targeting accurate and efficient structured information extraction.

## Model Details

- **Base Model:** DeepSeek R1 Distill Qwen 1.5B
- **Fine-tuning Steps:** 2500
- **Epochs:** 9
- **Dataset Size:** 2194 examples
- **License:** MIT
- **File Format:** GGUF
- **Released Version:** Q4_K_M.gguf

## Benchmark Results

| Metric | **3090 Ti** | **Raspberry Pi 5** |
|-----------------|-------------------------------|-------------------------------|
| **Prompt Eval Time** | 33.78 ms / 406 tokens (0.08 ms per token, 12017.88 tokens/sec) | 17831.25 ms / 535 tokens (33.33 ms per token, 30.00 tokens/sec) |
| **Eval Time** | 7133.93 ms / 1694 tokens (4.21 ms per token, 237.46 tokens/sec) | 52006.54 ms / 529 tokens (98.31 ms per token, 10.17 tokens/sec) |
| **Total Time** | 7167.72 ms / 2100 tokens | 70881.95 ms / 1064 tokens |
| **Decoding Speed** | N/A | 529 tokens in 70.40 s (7.51 tokens/sec) |
| **Sampling Speed** | N/A | 149.33 ms / 530 runs (0.28 ms per token, 3549.26 tokens/sec) |

### Observations

- The **3090 Ti** is dramatically faster at prompt evaluation, handling **12017.88 tokens/sec** compared to **30 tokens/sec** on the **Pi 5**.
- In token generation, the **3090 Ti** manages **237.46 tokens/sec**, whereas the **Pi 5** achieves just **10.17 tokens/sec**.
- The **Pi 5** needs **70.88 s** to process **1064 tokens**, roughly ten times longer than the **3090 Ti**'s **7.17 s** for **2100 tokens**.

## Usage Instructions

### System Prompt

To use this model effectively, initialize it with the following system prompt:

```
### Instruction:
Purpose: Extract actionable information from the provided dialog and metadata, generating bullet points with importance rankings and identifying relevant calendar events.

### Steps:
1. **Context Analysis:**
   - Use `CurrentDateTime` to interpret relative time references (e.g., "tomorrow").
   - Prioritize key information based on `InformationRankings`:
     - Higher rank values indicate more emphasis on that aspect.
2. **Bullet Points:**
   - Summarize key points concisely.
   - Assign an importance rank (1-100).
   - Format: `"[Summary]"[1-100]`
3. **Event Detection:**
   - Identify and structure events with clear scheduling details.
   - Format: `EventTitle:"[Title]",StartDate:"[YYYY-MM-DD,HH:MM]",EndDate:"[YYYY-MM-DD,HH:MM or N/A]",Recurrence:"[Daily/Weekly/Monthly or N/A]",Details:"[Summary]"`
4. **Filtering:**
   - Exclude vague, non-actionable statements.
   - Only create events for clear, actionable scheduling needs.
5. **Output Consistency:**
   - Follow the exact XML format.
   - Ensure structured, relevant output.

ONLY REPLY WITH THE XML AFTER YOU END THINK.

Dialog: "{conversations}"
CurrentDateTime: "{date_and_time_and_day}"
InformationRankings: "{information_rankings}"
```

## How to Run the Model

### Using llama.cpp

If you are using `llama.cpp`, run the model with the following command, passing the filled-in system prompt via `--prompt`:

```bash
./main -m Q4_K_M.gguf --prompt "" --temp 0.7 --n-gpu-layers 50
```

### Using Text Generation WebUI

1. Download and place the `Q4_K_M.gguf` file in the models folder.
2. Start the WebUI:
   ```bash
   python server.py --model Q4_K_M.gguf
   ```
3. Use the system prompt above for structured output.
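The three quoted fields at the end of the system prompt (`{conversations}`, `{date_and_time_and_day}`, `{information_rankings}`) are placeholders that must be filled in per request before the prompt is sent to the model. A minimal Python sketch of that substitution; the sample dialog, datetime string, and ranking values are illustrative, and `PROMPT_TEMPLATE` is abbreviated to just the placeholder lines:

```python
# Abbreviated template: in practice, use the full system prompt shown above.
PROMPT_TEMPLATE = (
    'Dialog: "{conversations}"\n'
    'CurrentDateTime: "{date_and_time_and_day}"\n'
    'InformationRankings: "{information_rankings}"'
)

def build_prompt(conversations: str, date_and_time_and_day: str,
                 information_rankings: str) -> str:
    """Substitute the three placeholder fields the fine-tuned model expects."""
    return PROMPT_TEMPLATE.format(
        conversations=conversations,
        date_and_time_and_day=date_and_time_and_day,
        information_rankings=information_rankings,
    )

# Illustrative values only -- the exact datetime/ranking formats should match
# whatever was used during fine-tuning.
prompt = build_prompt(
    "Alice: Let's meet tomorrow at 10 to finalize the project.",
    "2025-06-09,14:30,Monday",
    "scheduling:90,smalltalk:10",
)
print(prompt)
```

The resulting string is what you would place inside `--prompt` for `llama.cpp`, or send as the prompt through the WebUI.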
## Expected Output Format

Example response when processing a conversation:

```xml
"Team meeting scheduled for next Monday to finalize project details."85
EventTitle:"Team Meeting",StartDate:"2025-06-10,10:00",EndDate:"2025-06-10,11:00",Recurrence:"N/A",Details:"Finalizing project details with the team."
```

## License

This model is released under the **MIT License**, allowing free use, modification, and distribution.

## Contact & Support

For any inquiries or support, please visit [Hugging Face Discussions](https://huggingface.co/unsloth/DeepSeek-R1-GGUF) or open an issue on the repository.
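Because the bullet and event lines follow the fixed textual format documented above, downstream code can recover structured data from a reply with two regular expressions. A minimal parsing sketch; the function and pattern names are illustrative, and it assumes output that matches the documented format exactly:

```python
import re

# Matches event lines of the documented form:
# EventTitle:"...",StartDate:"...",EndDate:"...",Recurrence:"...",Details:"..."
EVENT_RE = re.compile(
    r'EventTitle:"(?P<title>[^"]*)",'
    r'StartDate:"(?P<start>[^"]*)",'
    r'EndDate:"(?P<end>[^"]*)",'
    r'Recurrence:"(?P<recurrence>[^"]*)",'
    r'Details:"(?P<details>[^"]*)"'
)

# Matches bullet lines of the documented form: "Summary text"85
BULLET_RE = re.compile(r'^"(?P<summary>[^"]*)"(?P<rank>\d{1,3})$')

def parse_output(text: str):
    """Split model output into (summary, rank) bullets and event dicts."""
    bullets, events = [], []
    for line in text.strip().splitlines():
        line = line.strip()
        event = EVENT_RE.search(line)
        if event:
            events.append(event.groupdict())
            continue
        bullet = BULLET_RE.match(line)
        if bullet:
            bullets.append((bullet["summary"], int(bullet["rank"])))
    return bullets, events

sample = (
    '"Team meeting scheduled for next Monday to finalize project details."85\n'
    'EventTitle:"Team Meeting",StartDate:"2025-06-10,10:00",'
    'EndDate:"2025-06-10,11:00",Recurrence:"N/A",'
    'Details:"Finalizing project details with the team."'
)
bullets, events = parse_output(sample)
print(bullets[0][1])       # 85
print(events[0]["title"])  # Team Meeting
```

Lines that fit neither pattern are silently dropped here; a production parser might instead log them, since the model's reasoning ("think") tokens may precede the structured output.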