DeepSeek R1 Distill Qwen 1.5B - Fine-tuned Version (Q4_K_M.gguf)

Overview

This is a fine-tuned version of DeepSeek R1 Distill Qwen 1.5B, optimized for extracting actionable insights and calendar events from conversations. The model was fine-tuned for 2,500 steps (9 epochs) on 2,194 examples to improve accuracy and efficiency in structured information extraction.

Model Details

  • Base Model: DeepSeek R1 Distill Qwen 1.5B
  • Fine-tuning Steps: 2500
  • Epochs: 9
  • Dataset Size: 2194 examples
  • License: MIT
  • File Format: GGUF
  • Released Version: Q4_K_M.gguf
Performance Benchmarks

| Metric | 3090 Ti | Raspberry Pi 5 |
| --- | --- | --- |
| Prompt Eval Time | 33.78 ms / 406 tokens (0.08 ms per token, 12017.88 tokens/sec) | 17831.25 ms / 535 tokens (33.33 ms per token, 30.00 tokens/sec) |
| Eval Time | 7133.93 ms / 1694 tokens (4.21 ms per token, 237.46 tokens/sec) | 52006.54 ms / 529 tokens (98.31 ms per token, 10.17 tokens/sec) |
| Total Time | 7167.72 ms / 2100 tokens | 70881.95 ms / 1064 tokens |
| Decoding Speed | N/A | 529 tokens in 70.40 s (7.51 tokens/sec) |
| Sampling Speed | N/A | 149.33 ms / 530 runs (0.28 ms per token, 3549.26 tokens/sec) |

Observations:

  • The 3090 Ti is significantly faster, handling 12017.88 tokens/sec for prompt evaluation, compared to 30 tokens/sec on the Pi 5.
  • In token evaluation, the 3090 Ti manages 237.46 tokens/sec, whereas the Pi 5 achieves just 10.17 tokens/sec.
  • The Pi 5's total execution time (70.88 s) is roughly ten times the 3090 Ti's (7.17 s), and it processes about half as many tokens in that time.

Usage Instructions

System Prompt

To use this model effectively, initialize it with the following system prompt:

### Instruction:
Purpose:  
Extract actionable information from the provided dialog and metadata, generating bullet points with importance rankings and identifying relevant calendar events.

### Steps:

1. **Context Analysis:**
   - Use `CurrentDateTime` to interpret relative time references (e.g., "tomorrow").
   - Prioritize key information based on `InformationRankings`:
     - Higher rank values indicate more emphasis on that aspect.

2. **Bullet Points:**
   - Summarize key points concisely.
   - Assign an importance rank (1-100).
   - Format: `<Bullet_Point>"[Summary]"</Bullet_Point><Rank>[1-100]</Rank>`

3. **Event Detection:**
   - Identify and structure events with clear scheduling details.
   - Format:  
     `<Calendar_Event>EventTitle:"[Title]",StartDate:"[YYYY-MM-DD,HH:MM]",EndDate:"[YYYY-MM-DD,HH:MM or N/A]",Recurrence:"[Daily/Weekly/Monthly or N/A]",Details:"[Summary]"</Calendar_Event>`

4. **Filtering:**
   - Exclude vague, non-actionable statements.
   - Only create events for clear, actionable scheduling needs.

5. **Output Consistency:**
   - Follow the exact XML format.
   - Ensure structured, relevant output.

ONLY REPLY WITH THE XML AFTER YOU END THINK.

Dialog: "{conversations}"
CurrentDateTime: "{date_and_time_and_day}"
InformationRankings: "{information_rankings}"
<think>
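The `{conversations}`, `{date_and_time_and_day}`, and `{information_rankings}` placeholders must be filled before the prompt is sent to the model. A minimal sketch using Python's `str.format` (the template below is abridged; in practice use the full system prompt above, and note the example dialog and ranking values are illustrative, not from the training data):

```python
# Abridged template -- replace the "..." comment with the full instruction
# text from the system prompt above.
PROMPT_TEMPLATE = (
    "### Instruction:\n"
    "...\n"
    'Dialog: "{conversations}"\n'
    'CurrentDateTime: "{date_and_time_and_day}"\n'
    'InformationRankings: "{information_rankings}"\n'
    "<think>\n"
)

# Fill in the three placeholders with runtime values.
prompt = PROMPT_TEMPLATE.format(
    conversations="Alice: Let's meet tomorrow at 10 to finalize the project.",
    date_and_time_and_day="2025-06-09,09:00,Monday",
    information_rankings="Scheduling:90,SmallTalk:10",
)
```

Keeping the trailing `<think>` tag in the template matters: it cues the model to open its reasoning block before emitting the final XML.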

How to Run the Model

Using llama.cpp

If you are using llama.cpp, run the model with:

./main -m Q4_K_M.gguf --prompt "<your prompt>" --temp 0.7 --n-gpu-layers 50

(Newer llama.cpp builds rename the main binary to llama-cli; the flags are unchanged.)

Using Text Generation WebUI

  1. Download and place the Q4_K_M.gguf file in the models folder.
  2. Start the WebUI:
python server.py --model Q4_K_M.gguf
  3. Use the system prompt above for structured output.

Expected Output Format

Example response when processing a conversation:

<Bullet_Point>"Team meeting scheduled for next Monday to finalize project details."</Bullet_Point><Rank>85</Rank>
<Calendar_Event>EventTitle:"Team Meeting",StartDate:"2025-06-10,10:00",EndDate:"2025-06-10,11:00",Recurrence:"N/A",Details:"Finalizing project details with the team."</Calendar_Event>
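Because the output is flat, XML-like tags with `Key:"Value"` pairs rather than well-formed XML, a couple of regular expressions are enough to parse it. A minimal sketch (the `response` string reuses the example above; `parse_event` is a hypothetical helper, not part of the model):

```python
import re

# Example model response, copied from the expected-output section above.
response = (
    '<Bullet_Point>"Team meeting scheduled for next Monday to finalize project details."'
    '</Bullet_Point><Rank>85</Rank>\n'
    '<Calendar_Event>EventTitle:"Team Meeting",StartDate:"2025-06-10,10:00",'
    'EndDate:"2025-06-10,11:00",Recurrence:"N/A",'
    'Details:"Finalizing project details with the team."</Calendar_Event>'
)

# Each bullet point carries its summary and a 1-100 importance rank.
bullets = re.findall(r'<Bullet_Point>"(.*?)"</Bullet_Point><Rank>(\d+)</Rank>', response)

# Calendar events are comma-separated Key:"Value" pairs inside one tag.
events = re.findall(r"<Calendar_Event>(.*?)</Calendar_Event>", response)

def parse_event(body: str) -> dict:
    """Split Key:"Value" pairs; quoting keeps commas inside values intact."""
    return dict(re.findall(r'(\w+):"(.*?)"', body))

event = parse_event(events[0])
```

Matching on the quoted values rather than splitting on commas is deliberate: `StartDate` values such as `2025-06-10,10:00` contain commas themselves.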

License

This model is released under the MIT License, allowing free usage, modification, and distribution.

Contact & Support

For any inquiries or support, please visit Hugging Face Discussions or open an issue on the repository.

Model Information

  • Format: GGUF
  • Model size: 1.78B params
  • Architecture: qwen2
  • Quantization: 4-bit

