---
license: apache-2.0
base_model: Qwen/Qwen2.5-3B-Instruct
tags:
- text-generation
- evaluation-agent
- cot-reasoning
- checkpoint
- qwen2.5
- video-assessment
- image-assessment
library_name: transformers
pipeline_tag: text-generation
---

# ea-dev-final

This is checkpoint **final** (step 471) from fine-tuning [Qwen/Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) for evaluation agent tasks.

## Checkpoint Details

- **Checkpoint**: final
- **Global Step**: 471
- **Epoch**: 3.00
- **Training Loss**: 0.8296
- **Learning Rate**: unknown
- **Base Model**: Qwen2.5-3B-Instruct
- **Task**: Multi-modal quality assessment with CoT reasoning

## Model Description

This checkpoint is from training an evaluation agent that can assess:

- **Video Quality**: Temporal consistency, motion smoothness, object consistency (VBench)
- **Image Quality**: Aesthetic quality, semantic alignment, visual fidelity (T2I-CompBench)
- **Open-ended Evaluation**: Custom quality assessment tasks

The model uses Chain-of-Thought (CoT) reasoning to provide detailed explanations for its evaluations.

## Files Included

This checkpoint contains:

- **Model Weights**: `model*.safetensors` - the model parameters
- **Tokenizer**: complete tokenizer configuration and vocabulary
- **Configuration**: model and generation configuration files

**Note**: This checkpoint contains only inference files (no optimizer states).

## Usage

### For Inference

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the checkpoint
model = AutoModelForCausalLM.from_pretrained(
    "ea-dev-final",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("ea-dev-final")

# Example evaluation prompt
prompt = """Please evaluate the quality of this video based on the following criteria:
1. Visual quality and clarity
2. Temporal consistency
3. Motion smoothness

Video description: A person walking through a park with trees swaying in the wind.

Let me think step by step:"""

# Move inputs to the same device as the model
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### Resume Training

Resuming training requires optimizer states, which are **not** included in this checkpoint. If you have a checkpoint that does include them, it can be used with LLaMA-Factory as follows:

```bash
# Use with LLaMA-Factory
llamafactory-cli train \
    --stage sft \
    --model_name_or_path ea-dev-final \
    --resume_from_checkpoint ea-dev-final
```

## Training Progress

This checkpoint captures the training run at:

- **Steps Completed**: 471
- **Epochs**: 3.00
- **Final Loss**: 0.8296

## Related Models

This checkpoint is part of a series. Other checkpoints from the same training run:

- Intermediate checkpoints: repositories matching the pattern `ea-dev-checkpoint-*`
- Final model: `ea-dev-final` (this repository)

## License

This model checkpoint is released under the Apache 2.0 license.

## Citation

If you use this checkpoint, please cite:

```bibtex
@misc{eval-agent-qwen2.5-checkpoint-471,
  title={Evaluation Agent Qwen2.5 Checkpoint 471},
  author={Your Name},
  year={2025},
  howpublished={\url{https://huggingface.co/ea-dev-final}}
}
```
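
## Appendix: Chat-Template Prompting (Sketch)

Since the base model is an instruct-tuned Qwen2.5 variant, evaluation prompts can also be issued through the tokenizer's chat template rather than as raw text. The sketch below assumes the checkpoint is available locally as `ea-dev-final` and uses the standard `apply_chat_template` API from `transformers`; the system prompt and scoring format are illustrative assumptions, not part of the documented training setup.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Assumption: the checkpoint directory "ea-dev-final" is available locally.
model = AutoModelForCausalLM.from_pretrained(
    "ea-dev-final",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("ea-dev-final")

# Illustrative system/user messages; the exact prompt format used during
# fine-tuning is not documented in this card.
messages = [
    {"role": "system", "content": "You are an evaluation agent. Reason step by step, then give a score from 1 to 5."},
    {"role": "user", "content": (
        "Evaluate the temporal consistency of this video.\n"
        "Video description: A person walking through a park with trees swaying in the wind."
    )},
]

# Render the conversation with the model's chat template and generate.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
    )

# Decode only the newly generated tokens (skip the prompt).
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```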