Better SQL Agent - Llama 3.1 8B

Training Results

  • Training Samples: 19,480 (SQL analytics + technical conversations)
  • Hardware: 4x NVIDIA A10G GPUs (96 GB total VRAM)

Model Description

This model is a fine-tuned version of Meta-Llama-3.1-8B-Instruct, optimized for:

  • SQL query generation and optimization
  • Data analysis and insights
  • Technical assistance and debugging
  • Tool-based workflows

Training Configuration

  • Base Model: meta-llama/Llama-3.1-8B-Instruct
  • Training Method: LoRA (Low-Rank Adaptation)
    • Rank: 16, Alpha: 32, Dropout: 0.05
  • Quantization: 4-bit with BF16 training precision
  • Context Length: 128K tokens (the native Llama 3.1 context window)
  • Optimizer: AdamW with cosine scheduling
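
For reference, here is a minimal sketch of how the configuration above could be expressed with the peft and bitsandbytes libraries. The target modules and any hyperparameters not listed in this card are assumptions, not confirmed details of the actual training run.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit base weights with BF16 compute, matching the card's description
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapter with the rank/alpha/dropout listed above;
# target_modules is an assumption, not stated in this card
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()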

Quick Start

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the fine-tuned model (weights are stored in BF16)
model_name = "abhishekgahlot/better-sql-agent-llama"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build a prompt in the Llama 3.1 chat format
prompt = """<|begin_of_text|><|start_header_id|>user<|end_header_id|>

Create a SQL query to find the top 5 customers by total revenue in 2024.<|eot_id|><|start_header_id|>assistant<|end_header_id|>

"""

# The prompt already contains <|begin_of_text|>, so skip the automatic BOS
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
)
print(response)
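
Alternatively, the tokenizer's built-in chat template builds the same Llama 3.1 prompt without hand-writing the special tokens:

messages = [
    {"role": "user",
     "content": "Create a SQL query to find the top 5 customers by total revenue in 2024."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=256,
                         temperature=0.7, do_sample=True,
                         pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True))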

Performance Metrics

Metric            Value
----------------  ---------
Starting Loss     1.53
Final Loss        0.0508
Loss Reduction    96.7%
Training Time     8.9 hours
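
Loss reduction is relative to the starting loss: (1.53 − 0.0508) / 1.53 ≈ 96.7%.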

Use Cases

  • SQL Generation: Create complex queries from natural language
  • Data Analysis: Generate insights and analytical queries
  • Code Assistance: Debug and optimize SQL code
  • Technical Support: Answer database and analytics questions
  • Learning Aid: Explain SQL concepts and best practices

Training Data

The model was trained on a curated dataset of 19,480 high-quality examples including:

  • SQL query generation tasks
  • Data analysis conversations
  • Technical problem-solving dialogues
  • Tool usage patterns and workflows
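
For illustration only, a hypothetical example of what one SQL-generation sample might look like in chat format; the actual dataset schema and contents are not published with this card.

# Purely illustrative sample; not taken from the real training set
sample = {
    "messages": [
        {"role": "user",
         "content": "Show monthly active users for 2024 by month."},
        {"role": "assistant",
         "content": (
             "SELECT DATE_TRUNC('month', login_at) AS month,\n"
             "       COUNT(DISTINCT user_id)       AS mau\n"
             "FROM logins\n"
             "WHERE login_at >= '2024-01-01' AND login_at < '2025-01-01'\n"
             "GROUP BY 1\n"
             "ORDER BY 1;"
         )},
    ]
}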

Optimization Features

  • 4-bit Quantization: Reduced memory footprint
  • Flash Attention: Optimized attention mechanism
  • Mixed Precision: BF16 training for efficiency
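
If the flash-attn package is installed and a supported GPU is available, the optimized attention path can also be requested at inference time via the standard transformers argument shown below; this is an inference-side illustration, not part of the training recipe.

import torch
from transformers import AutoModelForCausalLM

# Request the FlashAttention-2 kernel (requires the flash-attn package)
model = AutoModelForCausalLM.from_pretrained(
    "abhishekgahlot/better-sql-agent-llama",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    device_map="auto",
)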

License

This model inherits the Llama 3.1 license from the base model. Please review the official license for usage terms.

Acknowledgments

  • Based on Meta's Llama 3.1 8B Instruct model

Model Card Contact

For questions about this model, please open an issue in the repository or contact the model author.

