2025-08-21 00:37:40 - INFO - Loading model: Qwen/Qwen2.5-VL-3B-Instruct-AWQ
2025-08-21 00:37:42 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
2025-08-21 00:37:58 - INFO - Model loaded in 17.64 seconds
2025-08-21 00:37:58 - INFO - GPU Memory Usage after model load: 3250.85 MB
2025-08-21 00:39:14 - INFO - [7b3e4c2f-150e-4db3-a2b2-792ef836f5c3] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_001.mp4'
2025-08-21 00:39:14 - INFO - [7b3e4c2f-150e-4db3-a2b2-792ef836f5c3] Video saved to temporary file: temp_videos/7b3e4c2f-150e-4db3-a2b2-792ef836f5c3.mp4
2025-08-21 00:39:14 - INFO - [7b3e4c2f-150e-4db3-a2b2-792ef836f5c3] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 00:39:18 - INFO - [7b3e4c2f-150e-4db3-a2b2-792ef836f5c3] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 00:39:18 - INFO - [7b3e4c2f-150e-4db3-a2b2-792ef836f5c3] 30 frames saved to temp_videos/7b3e4c2f-150e-4db3-a2b2-792ef836f5c3
2025-08-21 00:39:19 - INFO - Prompt token length: 2306