oscarwu/Llama-3.2-3B-CLEAR

@@ -1,22 +1,110 @@
 ---
-base_model: unsloth/Llama-3.2-3B-Instruct
 tags:
 - text-generation-inference
 - transformers
 - unsloth
 - llama
 - trl
 license: apache-2.0
 language:
 - en
 ---
-# Uploaded  model
-- **Developed by:** oscarwu
-- **License:** apache-2.0
-- **Finetuned from model :** unsloth/Llama-3.2-3B-Instruct
-This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 ---
+base_model: Llama-3.2-3B-Instruct
 tags:
 - text-generation-inference
 - transformers
 - unsloth
 - llama
 - trl
+- climate-policy
+- query-interpretation
+- lora
 license: apache-2.0
 language:
 - en
 ---
+# CLEAR Query Interpreter
+This is the official implementation of the query interpretation model from our paper "CLEAR: Climate Policy Retrieval and Summarization Using LLMs" (WWW Companion '25).
+## Model Description
+The model is a LoRA adapter fine-tuned on Llama-3.2-3B-Instruct. It decomposes natural-language queries about climate policies into structured components for precise information retrieval.
+### Task
+Query interpretation for climate policy retrieval, decomposing natural-language queries into three components:
+- Location (L): Geographic identification
+- Topics (T): Climate-related themes
+- Intent (I): Specific policy inquiries
+### Training Details
+- Base Model: Llama-3.2-3B-Instruct
+- Training Data: 330 manually annotated queries
+- Annotators: Four Australia-based experts with media communication backgrounds
+- Hardware: NVIDIA A100 GPU
+- Parameters:
+  - Batch size: 6
+  - Sequence length: 1024
+  - Optimizer: AdamW (weight decay 0.05)
+  - Learning rate: 5e-5
+  - Epochs: 10
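To make the scale of this training run concrete, the short sketch below works out the number of optimizer steps implied by the listed values. It assumes that all 330 annotated queries are used for training, that training runs on a single GPU, and that no gradient accumulation is applied; the card states none of these, and the function name `optimizer_steps` is hypothetical.

```python
import math


def optimizer_steps(num_examples: int, batch_size: int, epochs: int) -> tuple[int, int]:
    """Return (steps per epoch, total steps) for a plain training loop.

    Assumes one optimizer step per batch (no gradient accumulation) and
    that the last, possibly smaller, batch of each epoch is kept.
    """
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


# Values taken from the Training Details list above.
steps_per_epoch, total_steps = optimizer_steps(num_examples=330, batch_size=6, epochs=10)
print(steps_per_epoch, total_steps)
```

Under these assumptions the run is short (a few hundred optimizer steps), which is in keeping with a LoRA fine-tune on a small annotated dataset.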
+## Usage
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+# Load model and tokenizer
+device = "cuda" if torch.cuda.is_available() else "cpu"
+model_name = "oscarwu/Llama-3.2-3B-CLEAR"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(
+   model_name,
+   torch_dtype=torch.float16
+).to(device)
+# Example query
+query = "I live in Burwood (Vic) and want details on renewable energy initiatives. Are solar farms planned?"
+# Format prompt
+prompt = f"""Below is an instruction that describes a task, paired with an input that provides further context. Your response must be a valid JSON object, strictly following the requested format.
+### Instruction:
+Extract location, topics, and search queries from Australian climate policy questions. Your response must be a valid JSON object with the following structure:
+{{
+ "rag_queries": ["query1", "query2", "query3"],  // 1-3 policy search queries
+ "topics": ["topic1", "topic2", "topic3"],       // 1-3 climate/environment topics
+ "location": {{
+   "query_suburb": "suburb_name or null",
+   "query_state": "state_code or null",
+   "query_lga": "lga_name or null"
+ }}
+}}
+### Input:
+{query}
+### Response (valid JSON only):
+"""
+# Generate response
+inputs = tokenizer(prompt, return_tensors="pt").to(device)
+outputs = model.generate(**inputs, max_new_tokens=220)
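+# Decode only the newly generated tokens so the echoed prompt is not included
+prompt_len = inputs["input_ids"].shape[1]
+result = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
+print(result)
+```
+## Example Output
+```json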
+{
+  "rag_queries": [
+    "What renewable energy projects are planned for Burwood?",
+    "Are there solar farm initiatives in Burwood Victoria?"
+  ],
+  "topics": [
+    "renewable energy",
+    "solar power"
+  ],
+  "location": {
+    "query_suburb": "Burwood",
+    "query_state": "VIC",
+    "query_lga": null
+  }
+}
+```
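Downstream retrieval code needs the output as a Python object rather than a string. The sketch below is one possible way to do that, not part of the released model: it parses the first JSON object that follows the response marker and checks that the three top-level keys from the prompt are present. The helper name `parse_interpretation` is hypothetical, and the sketch assumes the model emits a well-formed JSON object, possibly followed by extra text.

```python
import json

# Marker copied from the prompt template in the Usage section.
RESPONSE_MARKER = "### Response (valid JSON only):"
REQUIRED_KEYS = ("rag_queries", "topics", "location")


def parse_interpretation(decoded: str) -> dict:
    """Parse the model's JSON answer from decoded text (hypothetical helper)."""
    # If the decoded text still contains the echoed prompt, keep only the part
    # after the last marker; otherwise the whole string is used unchanged.
    tail = decoded.rsplit(RESPONSE_MARKER, 1)[-1]
    start = tail.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")
    # raw_decode parses one JSON value starting at `start` and ignores
    # whatever text follows it.
    obj, _ = json.JSONDecoder().raw_decode(tail, start)
    missing = [key for key in REQUIRED_KEYS if key not in obj]
    if missing:
        raise ValueError(f"missing keys in model output: {missing}")
    return obj


# Hand-written sample text standing in for real model output.
sample = (
    RESPONSE_MARKER
    + '\n{"rag_queries": ["Are there solar farm initiatives in Burwood Victoria?"],'
    + ' "topics": ["renewable energy"],'
    + ' "location": {"query_suburb": "Burwood", "query_state": "VIC", "query_lga": null}}'
    + "\ntrailing text"
)
parsed = parse_interpretation(sample)
print(parsed["location"])
```

Malformed JSON raises `json.JSONDecodeError`, which is a subclass of `ValueError`, so callers can catch a single exception type and decide whether to retry generation.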