---
language:
- en
license: mit
library_name: llama.cpp
tags:
- nlp
- information-extraction
- event-detection
datasets:
- custom_dataset
metrics:
- accuracy
base_model: deepseek-ai/deepseek-r1-distill-qwen-1.5b
model-index:
- name: DeepSeek R1 Distill Qwen 1.5B - Fine-tuned Version (Q4_K_M.gguf)
  results:
  - task:
      type: text-generation
      name: Actionable Information Extraction
    dataset:
      type: custom_dataset
      name: Custom Dataset for Event & Bullet Extraction
    metrics:
    - type: latency
      value: 33.78
      name: Prompt Eval Time (ms)
      args:
        device: 3090 TI
---

# DeepSeek R1 Distill Qwen 1.5B - Fine-tuned Version (Q4_K_M.gguf)

## Overview

This is a fine-tuned version of **DeepSeek R1 Distill Qwen 1.5B**, optimized for extracting actionable insights and scheduling events from conversations. The model was fine-tuned for **2500 steps (9 epochs)** on **2194 examples**, targeting accurate and efficient structured information extraction.

## Model Details

- **Base Model:** DeepSeek R1 Distill Qwen 1.5B
- **Fine-tuning Steps:** 2500
- **Epochs:** 9
- **Dataset Size:** 2194 examples
- **License:** MIT
- **File Format:** GGUF
- **Released Version:** Q4_K_M.gguf

## Benchmark Results

| Metric | **3090 Ti** | **Raspberry Pi 5** |
|-----------------|-------------------------------|-------------------------------|
| **Prompt Eval Time** | 33.78 ms / 406 tokens (0.08 ms per token, 12017.88 tokens/sec) | 17831.25 ms / 535 tokens (33.33 ms per token, 30.00 tokens/sec) |
| **Eval Time** | 7133.93 ms / 1694 tokens (4.21 ms per token, 237.46 tokens/sec) | 52006.54 ms / 529 tokens (98.31 ms per token, 10.17 tokens/sec) |
| **Total Time** | 7167.72 ms / 2100 tokens | 70881.95 ms / 1064 tokens |
| **Decoding Speed** | N/A | 529 tokens in 70.40 s (7.51 tokens/sec) |
| **Sampling Speed** | N/A | 149.33 ms / 530 runs (0.28 ms per token, 3549.26 tokens/sec) |

### Observations

- The **3090 Ti** is dramatically faster at prompt evaluation, handling **12017.88 tokens/sec** compared to **30 tokens/sec** on the **Pi 5**.
- In token generation, the **3090 Ti** manages **237.46 tokens/sec**, whereas the **Pi 5** achieves just **10.17 tokens/sec**.
- The **Pi 5** needs **70.88 s** to process **1064 tokens**, roughly ten times longer than the **3090 Ti**'s **7.17 s** for **2100 tokens**.

## Usage Instructions

### System Prompt

To use this model effectively, initialize it with the following system prompt:

```
### Instruction:
Purpose: Extract actionable information from the provided dialog and metadata, generating bullet points with importance rankings and identifying relevant calendar events.

### Steps:
1. **Context Analysis:**
   - Use `CurrentDateTime` to interpret relative time references (e.g., "tomorrow").
   - Prioritize key information based on `InformationRankings`:
     - Higher rank values indicate more emphasis on that aspect.
2. **Bullet Points:**
   - Summarize key points concisely.
   - Assign an importance rank (1-100).
   - Format: `"[Summary]"[1-100]`
3. **Event Detection:**
   - Identify and structure events with clear scheduling details.
   - Format: `EventTitle:"[Title]",StartDate:"[YYYY-MM-DD,HH:MM]",EndDate:"[YYYY-MM-DD,HH:MM or N/A]",Recurrence:"[Daily/Weekly/Monthly or N/A]",Details:"[Summary]"`
4. **Filtering:**
   - Exclude vague, non-actionable statements.
   - Only create events for clear, actionable scheduling needs.
5. **Output Consistency:**
   - Follow the exact XML format.
   - Ensure structured, relevant output.

ONLY REPLY WITH THE XML AFTER YOU END THINK.

Dialog: "{conversations}"
CurrentDateTime: "{date_and_time_and_day}"
InformationRankings: "{information_rankings}"
```

## How to Run the Model

### Using llama.cpp

If you are using `llama.cpp`, run the model with the following command, passing the filled-in system prompt via `--prompt`:

```bash
./main -m Q4_K_M.gguf --prompt "" --temp 0.7 --n-gpu-layers 50
```

### Using Text Generation WebUI

1. Download and place the `Q4_K_M.gguf` file in the models folder.
2. Start the WebUI:
   ```bash
   python server.py --model Q4_K_M.gguf
   ```
3. Use the system prompt above for structured output.
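The three quoted fields at the end of the system prompt (`{conversations}`, `{date_and_time_and_day}`, `{information_rankings}`) are placeholders that must be filled in per request before the prompt is sent to the model. A minimal Python sketch of that substitution; the sample dialog, datetime string, and ranking values are illustrative, and `PROMPT_TEMPLATE` is abbreviated to just the placeholder lines:

```python
# Abbreviated template: in practice, use the full system prompt shown above.
PROMPT_TEMPLATE = (
    'Dialog: "{conversations}"\n'
    'CurrentDateTime: "{date_and_time_and_day}"\n'
    'InformationRankings: "{information_rankings}"'
)

def build_prompt(conversations: str, date_and_time_and_day: str,
                 information_rankings: str) -> str:
    """Substitute the three placeholder fields the fine-tuned model expects."""
    return PROMPT_TEMPLATE.format(
        conversations=conversations,
        date_and_time_and_day=date_and_time_and_day,
        information_rankings=information_rankings,
    )

# Illustrative values only -- the exact datetime/ranking formats should match
# whatever was used during fine-tuning.
prompt = build_prompt(
    "Alice: Let's meet tomorrow at 10 to finalize the project.",
    "2025-06-09,14:30,Monday",
    "scheduling:90,smalltalk:10",
)
print(prompt)
```

The resulting string is what you would place inside `--prompt` for `llama.cpp`, or send as the prompt through the WebUI.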
## Expected Output Format

Example response when processing a conversation:

```xml
"Team meeting scheduled for next Monday to finalize project details."85
EventTitle:"Team Meeting",StartDate:"2025-06-10,10:00",EndDate:"2025-06-10,11:00",Recurrence:"N/A",Details:"Finalizing project details with the team."
```

## License

This model is released under the **MIT License**, allowing free use, modification, and distribution.

## Contact & Support

For any inquiries or support, please visit [Hugging Face Discussions](https://huggingface.co/unsloth/DeepSeek-R1-GGUF) or open an issue on the repository.
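Because the bullet and event lines follow the fixed textual format documented above, downstream code can recover structured data from a reply with two regular expressions. A minimal parsing sketch; the function and pattern names are illustrative, and it assumes output that matches the documented format exactly:

```python
import re

# Matches event lines of the documented form:
# EventTitle:"...",StartDate:"...",EndDate:"...",Recurrence:"...",Details:"..."
EVENT_RE = re.compile(
    r'EventTitle:"(?P<title>[^"]*)",'
    r'StartDate:"(?P<start>[^"]*)",'
    r'EndDate:"(?P<end>[^"]*)",'
    r'Recurrence:"(?P<recurrence>[^"]*)",'
    r'Details:"(?P<details>[^"]*)"'
)

# Matches bullet lines of the documented form: "Summary text"85
BULLET_RE = re.compile(r'^"(?P<summary>[^"]*)"(?P<rank>\d{1,3})$')

def parse_output(text: str):
    """Split model output into (summary, rank) bullets and event dicts."""
    bullets, events = [], []
    for line in text.strip().splitlines():
        line = line.strip()
        event = EVENT_RE.search(line)
        if event:
            events.append(event.groupdict())
            continue
        bullet = BULLET_RE.match(line)
        if bullet:
            bullets.append((bullet["summary"], int(bullet["rank"])))
    return bullets, events

sample = (
    '"Team meeting scheduled for next Monday to finalize project details."85\n'
    'EventTitle:"Team Meeting",StartDate:"2025-06-10,10:00",'
    'EndDate:"2025-06-10,11:00",Recurrence:"N/A",'
    'Details:"Finalizing project details with the team."'
)
bullets, events = parse_output(sample)
print(bullets[0][1])       # 85
print(events[0]["title"])  # Team Meeting
```

Lines that fit neither pattern are silently dropped here; a production parser might instead log them, since the model's reasoning ("think") tokens may precede the structured output.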