|
2025-08-18 22:36:03 - INFO - Loading model: Qwen/Qwen2-VL-2B-Instruct-AWQ |
|
2025-08-18 22:36:05 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk). |
|
2025-08-18 22:36:13 - INFO - Model loaded in 10.72 seconds |
|
2025-08-18 22:36:13 - INFO - GPU Memory Usage after model load: 2.31 GB |
|
2025-08-18 22:36:17 - INFO - [6cf28ab6-d63f-482a-849e-5b626233e7dd] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4' |
|
2025-08-18 22:36:17 - INFO - [6cf28ab6-d63f-482a-849e-5b626233e7dd] Video saved to temporary file: temp_videos/6cf28ab6-d63f-482a-849e-5b626233e7dd.mp4 |
|
2025-08-18 22:36:17 - INFO - [6cf28ab6-d63f-482a-849e-5b626233e7dd] Extracting frames using method: uniform, rate/threshold: 30 |
|
2025-08-18 22:36:21 - INFO - [6cf28ab6-d63f-482a-849e-5b626233e7dd] Extracted 30 frames successfully. Saving to temporary files... |
|
2025-08-18 22:36:21 - INFO - [6cf28ab6-d63f-482a-849e-5b626233e7dd] 30 frames saved to temp_videos/6cf28ab6-d63f-482a-849e-5b626233e7dd |
|
2025-08-18 22:36:21 - INFO - Prompt token length: 2276 |
|
2025-08-18 22:36:32 - INFO - Tokens per second: 9.058665203909582, Peak GPU memory MB: 4498.375 |
|
2025-08-18 22:36:32 - INFO - [6cf28ab6-d63f-482a-849e-5b626233e7dd] Inference time: 14.38 seconds, CPU usage: 64.5%, CPU core utilization: [61.5, 64.5, 60.3, 71.7] |
|
2025-08-18 22:36:32 - INFO - [6cf28ab6-d63f-482a-849e-5b626233e7dd] Cleaned up temporary file: temp_videos/6cf28ab6-d63f-482a-849e-5b626233e7dd.mp4 |
|
2025-08-18 22:36:32 - INFO - [6cf28ab6-d63f-482a-849e-5b626233e7dd] Cleaned up temporary frame directory: temp_videos/6cf28ab6-d63f-482a-849e-5b626233e7dd |
|
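The log above reports frame extraction with method `uniform` and a value of 30, followed by exactly 30 extracted frames. A minimal sketch of how such uniform sampling could pick frame indices is shown below. It assumes the value 30 is a target frame count; the function name `uniform_frame_indices` and the 900-frame clip length are hypothetical and do not come from the logged service.

```python
def uniform_frame_indices(total_frames: int, num_frames: int) -> list[int]:
    """Return evenly spaced frame indices (hypothetical helper, not the service's code).

    The video is split into `num_frames` equal-width segments and the index at
    the centre of each segment is returned. Integer arithmetic avoids
    floating-point rounding surprises.
    """
    if total_frames <= 0 or num_frames <= 0:
        return []
    if num_frames >= total_frames:
        # Fewer frames available than requested: keep every frame.
        return list(range(total_frames))
    return [
        ((2 * i + 1) * total_frames) // (2 * num_frames)
        for i in range(num_frames)
    ]


if __name__ == "__main__":
    # Assumed example: a 900-frame clip sampled down to 30 frames.
    indices = uniform_frame_indices(900, 30)
    print(len(indices), indices[:3], indices[-1])
```

Sampling at segment centres rather than segment starts avoids always selecting the first frame, which is often a fade-in or title card. Whether the logged service does this is not visible from the output.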
|