2025-08-18 22:26:17 - INFO - Loading model: Qwen/Qwen2-VL-2B-Instruct-AWQ
2025-08-18 22:26:20 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
2025-08-18 22:26:28 - INFO - Model loaded in 10.99 seconds
2025-08-18 22:26:28 - INFO - GPU Memory Usage after model load: 2.31 GB
2025-08-18 22:26:32 - INFO - [27d85b80-1b2f-42eb-9084-a747364133e1] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
2025-08-18 22:26:32 - INFO - [27d85b80-1b2f-42eb-9084-a747364133e1] Video saved to temporary file: temp_videos/27d85b80-1b2f-42eb-9084-a747364133e1.mp4
2025-08-18 22:26:32 - INFO - [27d85b80-1b2f-42eb-9084-a747364133e1] Extracting frames using method: uniform, rate/threshold: 30
2025-08-18 22:26:36 - INFO - [27d85b80-1b2f-42eb-9084-a747364133e1] Extracted 30 frames successfully. Saving to temporary files...
2025-08-18 22:26:36 - INFO - [27d85b80-1b2f-42eb-9084-a747364133e1] 30 frames saved to temp_videos/27d85b80-1b2f-42eb-9084-a747364133e1
2025-08-18 22:26:37 - INFO - Prompt token length: 2276
2025-08-18 22:26:48 - INFO - Tokens per second: 8.544413217338054, Peak GPU memory MB: 4498.375
2025-08-18 22:26:48 - INFO - [27d85b80-1b2f-42eb-9084-a747364133e1] Cleaned up temporary file: temp_videos/27d85b80-1b2f-42eb-9084-a747364133e1.mp4
2025-08-18 22:26:48 - INFO - [27d85b80-1b2f-42eb-9084-a747364133e1] Cleaned up temporary frame directory: temp_videos/27d85b80-1b2f-42eb-9084-a747364133e1