metadata
base_model: Llama-3.2-3B-Instruct
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
- climate-policy
- query-interpretation
- lora
license: apache-2.0
language:
- en
CLEAR Query Interpreter
This is the official implementation of the query interpretation model from our paper "CLEAR: Climate Policy Retrieval and Summarization Using LLMs" (WWW Companion '25).
Model Description
The model is a LoRA adapter fine-tuned on Llama-3.2-3B to decompose natural language queries about climate policies into structured components for precise information retrieval.
Task
Query interpretation for climate policy retrieval: each natural-language query is decomposed into three components:
- Location (L): Geographic identification
- Topics (T): Climate-related themes
- Intent (I): Specific policy inquiries
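As a rough illustration of the three components, the decomposition can be pictured as a small record type. This is only a sketch: the class and field names (`Location`, `InterpretedQuery`, `intent`) are hypothetical and not part of the released model, and mapping Intent (I) to the `rag_queries` field of the JSON output is an assumption. The values are taken from the example query in the Usage section.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Location:
    """Location (L): hypothetical container mirroring the JSON location fields."""
    suburb: Optional[str] = None
    state: Optional[str] = None
    lga: Optional[str] = None


@dataclass
class InterpretedQuery:
    """One decomposed query: Location (L), Topics (T), Intent (I)."""
    location: Location
    topics: List[str] = field(default_factory=list)
    # Assumption: intent corresponds to the "rag_queries" list in the model output.
    intent: List[str] = field(default_factory=list)


example = InterpretedQuery(
    location=Location(suburb="Burwood", state="VIC"),
    topics=["renewable energy", "solar power"],
    intent=["Are there solar farm initiatives in Burwood Victoria?"],
)
print(example.location.state, example.topics)
```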
Training Details
- Base Model: Llama-3.2-3B
- Training Data: 330 manually annotated queries
- Annotators: Four Australia-based experts with media communication backgrounds
- Hardware: NVIDIA A100 GPU
- Parameters:
  - Batch size: 6
  - Sequence length: 1024
  - Optimizer: AdamW (weight decay 0.05)
  - Learning rate: 5e-5
  - Epochs: 10
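To put these hyperparameters in perspective, the sketch below collects them in a plain dataclass and derives the implied number of optimizer steps. It is not the authors' training script. The name `FineTuneConfig` is hypothetical, and the step counts assume that all 330 queries are used for training (no held-out split), a single device, and no gradient accumulation, none of which is stated above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters listed above, gathered in one hypothetical config object."""
    train_examples: int = 330   # assumption: every annotated query is a training example
    batch_size: int = 6
    max_seq_length: int = 1024
    learning_rate: float = 5e-5
    weight_decay: float = 0.05  # AdamW weight decay
    epochs: int = 10

    @property
    def steps_per_epoch(self) -> int:
        # Assumes one optimizer step per batch (no gradient accumulation).
        return math.ceil(self.train_examples / self.batch_size)

    @property
    def total_steps(self) -> int:
        return self.steps_per_epoch * self.epochs


config = FineTuneConfig()
print(config.steps_per_epoch, config.total_steps)  # 55 550
```

Under these assumptions the run is short (550 optimizer steps in total), which is typical when a LoRA adapter is trained on a few hundred examples.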
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
device = "cuda" if torch.cuda.is_available() else "cpu"
model_name = "oscarwu/Llama-3.2-3B-CLEAR"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16
).to(device)

# Example query
query = "I live in Burwood (Vic) and want details on renewable energy initiatives. Are solar farms planned?"

# Format prompt (doubled braces are literal braces inside the f-string)
prompt = f"""Below is an instruction that describes a task, paired with an input that provides further context. Your response must be a valid JSON object, strictly following the requested format.
### Instruction:
Extract location, topics, and search queries from Australian climate policy questions. Your response must be a valid JSON object with the following structure:
{{
"rag_queries": ["query1", "query2", "query3"], // 1-3 policy search queries
"topics": ["topic1", "topic2", "topic3"], // 1-3 climate/environment topics
"location": {{
"query_suburb": "suburb_name or null",
"query_state": "state_code or null",
"query_lga": "lga_name or null"
}}
}}
### Input:
{query}
### Response (valid JSON only):
"""

# Generate response
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=220)

# Decode only the newly generated tokens so the echoed prompt is not part of the result
prompt_length = inputs["input_ids"].shape[1]
result = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(result)
```
Example output:

```json
{
"rag_queries": [
"What renewable energy projects are planned for Burwood?",
"Are there solar farm initiatives in Burwood Victoria?"
],
"topics": [
"renewable energy",
"solar power"
],
"location": {
"query_suburb": "Burwood",
"query_state": "VIC",
"query_lga": null
}
}
```
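Because the model returns text rather than a Python object, downstream code usually needs to parse and sanity-check the response before using it for retrieval. The following is a minimal sketch using only the standard library. The function name `parse_interpretation` and the constants are hypothetical, and the 1-3 item limits are taken from the comments in the prompt template, not from any guarantee made by the model.

```python
import json

RESPONSE_MARKER = "### Response (valid JSON only):"
LOCATION_KEYS = ("query_suburb", "query_state", "query_lga")


def parse_interpretation(generated_text: str) -> dict:
    """Extract and validate the JSON object from the model's decoded output."""
    # If the prompt was echoed back, keep only the text after the last marker;
    # otherwise rpartition leaves the whole string in `tail`.
    _, _, tail = generated_text.rpartition(RESPONSE_MARKER)
    start = tail.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")

    # raw_decode parses the first complete JSON value and ignores trailing text.
    parsed, _ = json.JSONDecoder().raw_decode(tail[start:])

    for key in ("rag_queries", "topics"):
        items = parsed.get(key)
        if not isinstance(items, list) or not 1 <= len(items) <= 3:
            raise ValueError(f"{key!r} should be a list of 1-3 items")

    location = parsed.get("location")
    if not isinstance(location, dict):
        raise ValueError("'location' should be a JSON object")
    missing = [k for k in LOCATION_KEYS if k not in location]
    if missing:
        raise ValueError(f"location is missing keys: {missing}")
    return parsed


# Usage with a hand-written sample string standing in for real model output
sample = (
    RESPONSE_MARKER + "\n"
    '{"rag_queries": ["Are there solar farm initiatives in Burwood Victoria?"], '
    '"topics": ["renewable energy", "solar power"], '
    '"location": {"query_suburb": "Burwood", "query_state": "VIC", "query_lga": null}}'
    "\nextra trailing text"
)
parsed = parse_interpretation(sample)
print(parsed["location"]["query_state"])  # VIC
```

Malformed JSON raises `json.JSONDecodeError`, which is a subclass of `ValueError`, so a single `except ValueError` around the call covers both parsing and validation failures.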