---
license: apache-2.0
base_model: meta-llama/Llama-3.2-1B-Instruct
tags:
- unsloth
- trl
- sft
- json
- structured-output
- fine-tuned
- llama
- pydantic
language:
- en
pipeline_tag: text-generation
library_name: transformers
---

# Llama 3.2 1B JSON Extractor

A fine-tuned version of **Llama 3.2 1B Instruct** specialized for generating structured JSON outputs with high accuracy and schema compliance.

## 🎯 Model Description

This model has been fine-tuned to excel at generating valid, well-structured JSON objects based on Pydantic model schemas. It transforms natural language prompts into properly formatted JSON responses with remarkable consistency.
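
As a rough illustration of the schema-to-prompt flow described above, the sketch below pairs a natural-language request with a JSON Schema dict of the kind Pydantic v2 returns from `model_json_schema()`. The `Recipe` fields, the `build_prompt` helper, and the template wording are all assumptions made for this example; the prompt format used during fine-tuning is not documented in this card.

```python
import json

# Hypothetical schema, shaped like the dict Pydantic v2 returns from
# `Model.model_json_schema()`; the field names are illustrative only.
RECIPE_SCHEMA = {
    "title": "Recipe",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "servings": {"type": "integer"},
        "ingredients": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name", "servings", "ingredients"],
}


def build_prompt(request: str, schema: dict) -> str:
    """Combine a natural-language request with a JSON Schema.

    The wording of this template is an assumption, not the format
    used during fine-tuning.
    """
    return (
        f"{request}\n\n"
        "Respond with a single JSON object that matches this schema:\n"
        f"{json.dumps(schema, indent=2)}"
    )


prompt = build_prompt("Create a recipe for a quick weeknight pasta.", RECIPE_SCHEMA)
print(prompt)
```

The resulting string would then be passed to the model as the user message; the generated text can be parsed with `json.loads` and validated against the original Pydantic class.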

## 📊 Performance

**🚀 Dramatic Improvement in JSON Generation:**
- **JSON Validity Rate**: 20% → 92% (a gain of 72 percentage points)
- **Schema Compliance**: Near-perfect adherence to small and medium-sized Pydantic model structures
- **Generalization**: Successfully handles completely new, unseen Pydantic model classes
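
One plausible way to compute a validity rate like the one reported above is to count the share of generations that parse as a JSON object. This is a minimal sketch under that assumption; the card does not specify the evaluation script, and the helper names and sample strings here are hypothetical.

```python
import json


def is_valid_json_object(text: str) -> bool:
    """True if the text parses as JSON and the top-level value is an object."""
    try:
        return isinstance(json.loads(text), dict)
    except json.JSONDecodeError:
        return False


def json_validity_rate(outputs: list[str]) -> float:
    """Fraction of outputs that are valid JSON objects (0.0 for an empty list)."""
    if not outputs:
        return 0.0
    return sum(is_valid_json_object(o) for o in outputs) / len(outputs)


# Hypothetical model outputs, for illustration only.
samples = [
    '{"name": "Pasta", "servings": 2}',  # valid object
    '{"name": "Pasta", "servings": 2',   # missing closing brace
    '{"title": "Dune", "rating": 5}',    # valid object
    'Here is your JSON: {"a": 1}',       # prose wrapped around the JSON
]
rate = json_validity_rate(samples)
print(f"{rate:.0%}")  # prints "50%"
```

Note that this check is strict: any text before or after the JSON object counts as a failure.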

## 🔧 Training Details

- **Base Model**: meta-llama/Llama-3.2-1B-Instruct
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation) with Unsloth
- **Training Data**: Synthetic dataset with 15+ diverse Pydantic model types
- **Training Epochs**: 15
- **Batch Size**: 16 (with gradient accumulation)
- **Learning Rate**: 1e-4
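
The batch size above is an effective value reached through gradient accumulation. The arithmetic can be sketched as follows; the 4 × 4 split, the single-device setup, and the dataset size are assumptions chosen for illustration, since the card reports only the resulting batch size of 16 and 15 epochs.

```python
import math

# Hypothetical split on a single device: the card states only the
# resulting batch size of 16.
per_device_batch_size = 4
gradient_accumulation_steps = 4

effective_batch_size = per_device_batch_size * gradient_accumulation_steps


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Optimizer updates per epoch, counting a final partial batch."""
    return math.ceil(num_examples / batch_size)


# Hypothetical dataset size; the exact number of examples is not stated.
num_examples = 2000
epochs = 15  # from the card
total_steps = epochs * steps_per_epoch(num_examples, effective_batch_size)
print(effective_batch_size, total_steps)
```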

## 🏗️ Supported Model Types

The model can generate JSON for 15+ different object types including:

- **Educational**: Course, Resume, Events
- **Entertainment**: FilmIdea, BookReview, GameIdea
- **Business**: TShirtOrder, Recipe, House
- **Characters & Gaming**: FictionalCharacter, GameArtifact
- **Travel**: Itinerary
- **Science**: SollarSystem, TextSummary
- **And many more...**

## 🎯 Key Features

- **High JSON Validity**: 92% success rate in generating valid JSON
- **Schema Compliance**: Follows Pydantic model structures precisely  
- **Strong Generalization**: Works with new, unseen model classes
- **Consistent Output**: Reliable structured data generation
- **Lightweight**: Only 1B parameters for efficient deployment
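
Since validity is reported at 92% rather than 100%, it is reasonable to check each generation before using it. The sketch below is a simplified, standard-library stand-in for schema validation: it checks required fields and top-level types only. The `schema_errors` helper and the example schema are hypothetical; in a Pydantic v2 project, `Model.model_validate_json(text)` performs the complete check, including nested models and constraints that this sketch ignores.

```python
import json

# Maps JSON Schema type names to the Python types json.loads produces.
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def schema_errors(text: str, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the output passed."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    errors = []
    for field in schema.get("required", []):
        if field not in data:
            errors.append(f"missing required field: {field}")
    for field, spec in schema.get("properties", {}).items():
        expected = JSON_TYPES.get(spec.get("type"))
        if field not in data or expected is None:
            continue  # absent fields are handled above; unknown types are skipped
        value = data[field]
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        is_stray_bool = isinstance(value, bool) and expected is not bool
        if is_stray_bool or not isinstance(value, expected):
            errors.append(f"wrong type for {field}: expected {spec['type']}")
    return errors


# Hypothetical schema and outputs, for illustration only.
schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "servings": {"type": "integer"}},
    "required": ["name", "servings"],
}
print(schema_errors('{"name": "Pasta", "servings": 2}', schema))
print(schema_errors('{"name": "Pasta", "servings": "two"}', schema))
```

A caller might retry generation whenever the returned list is non-empty.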

## 📚 Training Data

The model was fine-tuned on a synthetic dataset containing thousands of examples across diverse domains:
- Character creation and game development
- Business and e-commerce objects
- Educational and professional content
- Entertainment and media descriptions
- Scientific and technical data structures

## 🔗 Links

- **GitHub Repository**: [LLM_FineTuning_4JsonCreation](https://github.com/Dekanenko/Llama_FineTune_JSON_Creation)
- **Base Model**: [meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)

## 📄 License

This model is released under the Apache 2.0 license.

## 🙏 Acknowledgments

- **Meta** for the base Llama 3.2 model
- **Unsloth** for the efficient fine-tuning framework
- **Hugging Face** for model hosting and ecosystem