---
language:
- en
license: mit
library_name: llama.cpp
tags:
- nlp
- information-extraction
- event-detection
datasets:
- custom_dataset
metrics:
- accuracy
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
model-index:
- name: DeepSeek R1 Distill Qwen 1.5B - Fine-tuned Version (Q4_K_M.gguf)
  results:
  - task:
      type: text-generation
      name: Actionable Information Extraction
    dataset:
      type: custom_dataset
      name: Custom Dataset for Event & Bullet Extraction
    metrics:
    - type: latency
      value: 33.78
      name: Prompt Eval Time (ms for a 406-token prompt)
      args:
        device: 3090 Ti
---
# DeepSeek R1 Distill Qwen 1.5B - Fine-tuned Version (Q4_K_M.gguf)

## Overview
This is a fine-tuned version of DeepSeek R1 Distill Qwen 1.5B, optimized for extracting actionable insights and scheduling events from conversations. The model was fine-tuned for 2500 steps (9 epochs) on 2194 examples and is intended to produce consistent, structured output for information extraction.
## Model Details
- Base Model: DeepSeek R1 Distill Qwen 1.5B
- Fine-tuning Steps: 2500
- Epochs: 9
- Dataset Size: 2194 examples
- License: MIT
- File Format: GGUF
- Released Version: Q4_K_M.gguf
## Performance Benchmarks

| Metric | 3090 Ti | Raspberry Pi 5 |
|---|---|---|
| Prompt Eval Time | 33.78 ms / 406 tokens (0.08 ms per token, 12017.88 tokens/sec) | 17831.25 ms / 535 tokens (33.33 ms per token, 30.00 tokens/sec) |
| Eval Time | 7133.93 ms / 1694 tokens (4.21 ms per token, 237.46 tokens/sec) | 52006.54 ms / 529 tokens (98.31 ms per token, 10.17 tokens/sec) |
| Total Time | 7167.72 ms / 2100 tokens | 70881.95 ms / 1064 tokens |
| Decoding Speed | N/A | 529 tokens in 70.40 s (7.51 tokens/sec) |
| Sampling Speed | N/A | 149.33 ms / 530 runs (0.28 ms per token, 3549.26 tokens/sec) |
**Observations:**
- The 3090 Ti is significantly faster, handling 12017.88 tokens/sec for prompt evaluation, compared to 30 tokens/sec on the Pi 5.
- In token evaluation, the 3090 Ti manages 237.46 tokens/sec, whereas the Pi 5 achieves just 10.17 tokens/sec.
- The Pi 5's total execution time (70.88 s) is roughly ten times the 3090 Ti's (7.17 s), and it processes only about half as many tokens in that window (1064 vs. 2100).
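The per-token and tokens/sec figures in the table are derived from the raw timings; a quick sanity check in Python, using numbers copied from the 3090 Ti column (small deviations reflect rounding in the reported timings):

```python
# Recompute the derived throughput figures from the raw timings above.
prompt_ms, prompt_tokens = 33.78, 406    # 3090 Ti prompt eval
eval_ms, eval_tokens = 7133.93, 1694     # 3090 Ti token generation

print(prompt_tokens / prompt_ms * 1000)  # ~12018.9 tokens/sec (table: 12017.88)
print(eval_tokens / eval_ms * 1000)      # ~237.46 tokens/sec (table: 237.46)
```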
## Usage Instructions

### System Prompt

To use this model effectively, initialize it with the following system prompt:
```
### Instruction:
Purpose:
Extract actionable information from the provided dialog and metadata, generating bullet points with importance rankings and identifying relevant calendar events.

### Steps:
1. **Context Analysis:**
   - Use `CurrentDateTime` to interpret relative time references (e.g., "tomorrow").
   - Prioritize key information based on `InformationRankings`:
     - Higher rank values indicate more emphasis on that aspect.
2. **Bullet Points:**
   - Summarize key points concisely.
   - Assign an importance rank (1-100).
   - Format: `<Bullet_Point>"[Summary]"</Bullet_Point><Rank>[1-100]</Rank>`
3. **Event Detection:**
   - Identify and structure events with clear scheduling details.
   - Format:
     `<Calendar_Event>EventTitle:"[Title]",StartDate:"[YYYY-MM-DD,HH:MM]",EndDate:"[YYYY-MM-DD,HH:MM or N/A]",Recurrence:"[Daily/Weekly/Monthly or N/A]",Details:"[Summary]"</Calendar_Event>`
4. **Filtering:**
   - Exclude vague, non-actionable statements.
   - Only create events for clear, actionable scheduling needs.
5. **Output Consistency:**
   - Follow the exact XML format.
   - Ensure structured, relevant output.

ONLY REPLY WITH THE XML AFTER YOU END THINK.

Dialog: "{conversations}"
CurrentDateTime: "{date_and_time_and_day}"
InformationRankings: "{information_rankings}"
<think>
```
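The three quoted placeholders at the end of the template are filled per request. A minimal sketch of doing this in Python; the `system_prompt.txt` filename, the example dialog, and the ranking syntax are illustrative assumptions, not part of this card:

```python
# Load the system-prompt template above verbatim, with the {conversations},
# {date_and_time_and_day}, and {information_rankings} placeholders intact.
template = open("system_prompt.txt", encoding="utf-8").read()

prompt = template.format(
    conversations='Alice: "Can we meet Monday at 10 to finalize the project?"',
    date_and_time_and_day="2025-06-05, 14:30, Thursday",
    information_rankings="Scheduling: 90, Summaries: 60",  # assumed ranking syntax
)
```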
## How to Run the Model

### Using llama.cpp

If you are using [llama.cpp](https://github.com/ggerganov/llama.cpp), run the model with:

```bash
./main -m Q4_K_M.gguf --prompt "<your prompt>" --temp 0.7 --n-gpu-layers 50
```
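If you would rather drive the GGUF file from Python, the `llama-cpp-python` bindings also load it; this is a hedged sketch of an alternative not covered by this card, with the context size chosen arbitrarily:

```python
from llama_cpp import Llama

# Load the quantized model; offload layers to the GPU if one is available.
llm = Llama(model_path="Q4_K_M.gguf", n_gpu_layers=50, n_ctx=4096)

prompt = "<your prompt>"  # e.g. the filled system prompt from the sketch above
output = llm(prompt, temperature=0.7, max_tokens=1024)
print(output["choices"][0]["text"])
```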
### Using Text Generation WebUI

- Download and place the `Q4_K_M.gguf` file in the `models` folder.
- Start the WebUI:

  ```bash
  python server.py --model Q4_K_M.gguf
  ```

- Use the system prompt above for structured output.
## Expected Output Format

Example response when processing a conversation:

```xml
<Bullet_Point>"Team meeting scheduled for next Monday to finalize project details."</Bullet_Point><Rank>85</Rank>
<Calendar_Event>EventTitle:"Team Meeting",StartDate:"2025-06-10,10:00",EndDate:"2025-06-10,11:00",Recurrence:"N/A",Details:"Finalizing project details with the team."</Calendar_Event>
```
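The tags are XML-like but the response as a whole is not well-formed XML, so a small regex pass is one way to pull out the fields. A sketch that assumes the model followed the format exactly, using the example response above:

```python
import re

response_text = (
    '<Bullet_Point>"Team meeting scheduled for next Monday to finalize project '
    'details."</Bullet_Point><Rank>85</Rank>\n'
    '<Calendar_Event>EventTitle:"Team Meeting",StartDate:"2025-06-10,10:00",'
    'EndDate:"2025-06-10,11:00",Recurrence:"N/A",Details:"Finalizing project '
    'details with the team."</Calendar_Event>'
)

# Bullet points paired with their importance ranks.
bullets = re.findall(
    r'<Bullet_Point>"(.*?)"</Bullet_Point><Rank>(\d+)</Rank>', response_text
)

# Calendar events: each body is a comma-separated list of Key:"Value" pairs.
events = [
    dict(re.findall(r'(\w+):"(.*?)"', body))
    for body in re.findall(r"<Calendar_Event>(.*?)</Calendar_Event>", response_text)
]

print(bullets)                 # [('Team meeting scheduled for ...', '85')]
print(events[0]["StartDate"])  # 2025-06-10,10:00
```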
## License
This model is released under the MIT License, allowing free usage, modification, and distribution.
## Contact & Support
For any inquiries or support, please visit Hugging Face Discussions or open an issue on the repository.