2025-08-21 03:38:46 - INFO - Loading model: openbmb/MiniCPM-V-4
2025-08-21 03:38:46 - INFO - vision_config is None, using default vision config
2025-08-21 03:39:50 - INFO - Model loaded in 64.62 seconds
2025-08-21 03:39:50 - INFO - GPU Memory Usage after model load: 7802.99 MB
2025-08-21 03:39:57 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_001.mp4'
2025-08-21 03:39:57 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Video saved to temporary file: temp_videos/e29d31c5-9a6b-48cd-ac25-7affc04fc186.mp4
2025-08-21 03:39:57 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:40:01 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:40:01 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] 30 frames saved to temp_videos/e29d31c5-9a6b-48cd-ac25-7affc04fc186
2025-08-21 03:40:17 - INFO - vision_config is None, using default vision config
2025-08-21 03:40:35 - INFO - Tokens per second: 8.46238691458392, Peak GPU memory MB: 11824.375
2025-08-21 03:40:35 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Inference time: 37.25 seconds, CPU usage: 28.4%, CPU core utilization: [24.9, 35.0, 21.8, 31.7]
2025-08-21 03:40:35 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Cleaned up temporary frame directory: temp_videos/e29d31c5-9a6b-48cd-ac25-7affc04fc186
2025-08-21 03:40:35 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_001.mp4'
2025-08-21 03:40:35 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Video saved to temporary file: temp_videos/0ed8d6d1-aea6-4701-a3ed-2d877bfc9882.mp4
2025-08-21 03:40:35 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:40:38 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:40:38 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] 30 frames saved to temp_videos/0ed8d6d1-aea6-4701-a3ed-2d877bfc9882
2025-08-21 03:40:51 - INFO - vision_config is None, using default vision config
2025-08-21 03:41:14 - INFO - Tokens per second: 10.024019230028593, Peak GPU memory MB: 11824.375
2025-08-21 03:41:14 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Inference time: 39.58 seconds, CPU usage: 31.8%, CPU core utilization: [16.9, 19.2, 59.5, 31.5]
2025-08-21 03:41:14 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Cleaned up temporary frame directory: temp_videos/0ed8d6d1-aea6-4701-a3ed-2d877bfc9882
2025-08-21 03:41:14 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_002.mp4'
2025-08-21 03:41:14 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Video saved to temporary file: temp_videos/1daa28b5-5708-4bd7-b738-7900bee17284.mp4
2025-08-21 03:41:14 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:41:18 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:41:18 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] 30 frames saved to temp_videos/1daa28b5-5708-4bd7-b738-7900bee17284
2025-08-21 03:41:30 - INFO - vision_config is None, using default vision config
2025-08-21 03:41:42 - INFO - Tokens per second: 6.118521643289556, Peak GPU memory MB: 11824.375
2025-08-21 03:41:42 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Inference time: 28.18 seconds, CPU usage: 33.2%, CPU core utilization: [57.3, 19.5, 10.5, 45.5]
2025-08-21 03:41:42 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Cleaned up temporary frame directory: temp_videos/1daa28b5-5708-4bd7-b738-7900bee17284
2025-08-21 03:41:42 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_003.mp4'
2025-08-21 03:41:42 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Video saved to temporary file: temp_videos/c70dd357-164c-4d57-b24d-8ead295ef24e.mp4
2025-08-21 03:41:42 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:41:46 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:41:46 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] 30 frames saved to temp_videos/c70dd357-164c-4d57-b24d-8ead295ef24e
2025-08-21 03:41:59 - INFO - vision_config is None, using default vision config
2025-08-21 03:42:13 - INFO - Tokens per second: 7.325785835893888, Peak GPU memory MB: 11824.375
2025-08-21 03:42:13 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Inference time: 30.34 seconds, CPU usage: 33.2%, CPU core utilization: [32.8, 46.1, 30.7, 23.2]
2025-08-21 03:42:13 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Cleaned up temporary frame directory: temp_videos/c70dd357-164c-4d57-b24d-8ead295ef24e
2025-08-21 03:42:13 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_004.mp4'
2025-08-21 03:42:13 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Video saved to temporary file: temp_videos/e2eff8d2-37db-4d25-9765-d46404130b2d.mp4
2025-08-21 03:42:13 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:42:16 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:42:16 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] 30 frames saved to temp_videos/e2eff8d2-37db-4d25-9765-d46404130b2d
2025-08-21 03:42:29 - INFO - vision_config is None, using default vision config
2025-08-21 03:42:40 - INFO - Tokens per second: 5.483056762285139, Peak GPU memory MB: 11824.375
2025-08-21 03:42:40 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Inference time: 27.37 seconds, CPU usage: 33.6%, CPU core utilization: [62.6, 13.2, 42.9, 15.6]
2025-08-21 03:42:40 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Cleaned up temporary frame directory: temp_videos/e2eff8d2-37db-4d25-9765-d46404130b2d
2025-08-21 03:42:40 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_005.mp4'
2025-08-21 03:42:40 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Video saved to temporary file: temp_videos/374baf0b-09b4-47f6-bda7-007ed31b73e6.mp4
2025-08-21 03:42:40 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:42:43 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:42:43 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] 30 frames saved to temp_videos/374baf0b-09b4-47f6-bda7-007ed31b73e6
2025-08-21 03:42:56 - INFO - vision_config is None, using default vision config
2025-08-21 03:43:12 - INFO - Tokens per second: 7.8524871607145865, Peak GPU memory MB: 11824.375
2025-08-21 03:43:12 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Inference time: 31.57 seconds, CPU usage: 32.7%, CPU core utilization: [13.2, 42.4, 47.5, 27.5]
2025-08-21 03:43:12 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Cleaned up temporary frame directory: temp_videos/374baf0b-09b4-47f6-bda7-007ed31b73e6
2025-08-21 03:43:12 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_006.mp4'
2025-08-21 03:43:12 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Video saved to temporary file: temp_videos/b9266afc-5115-4696-91ea-9894092513ff.mp4
2025-08-21 03:43:12 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:43:15 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:43:15 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] 30 frames saved to temp_videos/b9266afc-5115-4696-91ea-9894092513ff
2025-08-21 03:43:28 - INFO - vision_config is None, using default vision config
2025-08-21 03:43:40 - INFO - Tokens per second: 5.751292635048318, Peak GPU memory MB: 11824.375
2025-08-21 03:43:40 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Inference time: 27.84 seconds, CPU usage: 33.5%, CPU core utilization: [24.6, 32.6, 13.2, 63.8]
2025-08-21 03:43:40 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Cleaned up temporary frame directory: temp_videos/b9266afc-5115-4696-91ea-9894092513ff
2025-08-21 03:43:40 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_007.mp4'
2025-08-21 03:43:40 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Video saved to temporary file: temp_videos/cf387e92-735c-444b-a102-345d888dc633.mp4
2025-08-21 03:43:40 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:43:43 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:43:43 - INFO - [cf387e92-735c-444b-a102-345d888dc633] 30 frames saved to temp_videos/cf387e92-735c-444b-a102-345d888dc633
2025-08-21 03:43:56 - INFO - vision_config is None, using default vision config
2025-08-21 03:44:08 - INFO - Tokens per second: 6.460640309211369, Peak GPU memory MB: 11824.375
2025-08-21 03:44:08 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Inference time: 28.88 seconds, CPU usage: 33.2%, CPU core utilization: [11.9, 52.0, 12.7, 56.4]
2025-08-21 03:44:08 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Cleaned up temporary frame directory: temp_videos/cf387e92-735c-444b-a102-345d888dc633
2025-08-21 03:44:08 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_008.mp4'
2025-08-21 03:44:08 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Video saved to temporary file: temp_videos/aef21c10-5565-48f5-bcbf-e239a1faa322.mp4
2025-08-21 03:44:08 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:44:12 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:44:12 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] 30 frames saved to temp_videos/aef21c10-5565-48f5-bcbf-e239a1faa322
2025-08-21 03:44:25 - INFO - vision_config is None, using default vision config
2025-08-21 03:44:35 - INFO - Tokens per second: 4.950112497910254, Peak GPU memory MB: 11824.375
2025-08-21 03:44:35 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Inference time: 26.99 seconds, CPU usage: 34.1%, CPU core utilization: [16.3, 12.5, 47.8, 59.8]
2025-08-21 03:44:35 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Cleaned up temporary frame directory: temp_videos/aef21c10-5565-48f5-bcbf-e239a1faa322
2025-08-21 03:44:35 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_009.mp4'
2025-08-21 03:44:35 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Video saved to temporary file: temp_videos/040650dd-914d-453f-a411-b31d1d6897d5.mp4
2025-08-21 03:44:35 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:44:39 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:44:39 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] 30 frames saved to temp_videos/040650dd-914d-453f-a411-b31d1d6897d5
2025-08-21 03:44:52 - INFO - vision_config is None, using default vision config
2025-08-21 03:45:04 - INFO - Tokens per second: 6.046726583056993, Peak GPU memory MB: 11824.375
2025-08-21 03:45:04 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Inference time: 28.31 seconds, CPU usage: 33.8%, CPU core utilization: [45.1, 36.7, 40.1, 13.1]
2025-08-21 03:45:04 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Cleaned up temporary frame directory: temp_videos/040650dd-914d-453f-a411-b31d1d6897d5
2025-08-21 03:45:04 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_010.mp4'
2025-08-21 03:45:04 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Video saved to temporary file: temp_videos/c4922af4-0973-46aa-8ab3-a2904f616ca0.mp4
2025-08-21 03:45:04 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:45:07 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:45:07 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] 30 frames saved to temp_videos/c4922af4-0973-46aa-8ab3-a2904f616ca0
2025-08-21 03:45:20 - INFO - vision_config is None, using default vision config
2025-08-21 03:45:31 - INFO - Tokens per second: 5.012952424490043, Peak GPU memory MB: 11824.375
2025-08-21 03:45:31 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Inference time: 26.92 seconds, CPU usage: 33.9%, CPU core utilization: [45.7, 15.2, 24.5, 49.9]
2025-08-21 03:45:31 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Cleaned up temporary frame directory: temp_videos/c4922af4-0973-46aa-8ab3-a2904f616ca0
2025-08-21 03:45:31 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_011.mp4'
2025-08-21 03:45:31 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Video saved to temporary file: temp_videos/96d2962b-c166-4be7-847e-fe025954af18.mp4
2025-08-21 03:45:31 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:45:34 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:45:34 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] 30 frames saved to temp_videos/96d2962b-c166-4be7-847e-fe025954af18
2025-08-21 03:45:47 - INFO - vision_config is None, using default vision config
2025-08-21 03:45:59 - INFO - Tokens per second: 6.0496181050699604, Peak GPU memory MB: 11824.375
2025-08-21 03:45:59 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Inference time: 28.26 seconds, CPU usage: 33.7%, CPU core utilization: [15.3, 25.6, 46.3, 47.7]
2025-08-21 03:45:59 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Cleaned up temporary frame directory: temp_videos/96d2962b-c166-4be7-847e-fe025954af18
2025-08-21 03:45:59 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_012.mp4'
2025-08-21 03:45:59 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Video saved to temporary file: temp_videos/1861ded3-2706-4381-8e32-07949f940d95.mp4
2025-08-21 03:45:59 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:46:02 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:46:02 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] 30 frames saved to temp_videos/1861ded3-2706-4381-8e32-07949f940d95
2025-08-21 03:46:15 - INFO - vision_config is None, using default vision config
2025-08-21 03:46:27 - INFO - Tokens per second: 6.042554615117601, Peak GPU memory MB: 11824.375
2025-08-21 03:46:27 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Inference time: 28.37 seconds, CPU usage: 33.3%, CPU core utilization: [53.9, 22.6, 36.0, 20.6]
2025-08-21 03:46:27 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Cleaned up temporary frame directory: temp_videos/1861ded3-2706-4381-8e32-07949f940d95
2025-08-21 03:46:27 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_013.mp4'
2025-08-21 03:46:27 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Video saved to temporary file: temp_videos/04268664-f928-4c89-ab42-d07706c93257.mp4
2025-08-21 03:46:27 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:46:31 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:46:31 - INFO - [04268664-f928-4c89-ab42-d07706c93257] 30 frames saved to temp_videos/04268664-f928-4c89-ab42-d07706c93257
2025-08-21 03:46:44 - INFO - vision_config is None, using default vision config
2025-08-21 03:46:55 - INFO - Tokens per second: 5.897914933533588, Peak GPU memory MB: 11824.375
2025-08-21 03:46:55 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Inference time: 28.07 seconds, CPU usage: 33.2%, CPU core utilization: [52.2, 22.0, 22.6, 35.7]
2025-08-21 03:46:55 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Cleaned up temporary frame directory: temp_videos/04268664-f928-4c89-ab42-d07706c93257
2025-08-21 03:46:55 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_014.mp4'
2025-08-21 03:46:55 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Video saved to temporary file: temp_videos/604fd124-b44b-4dfe-b7b4-8d3e1f179d69.mp4
2025-08-21 03:46:55 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:46:59 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:46:59 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] 30 frames saved to temp_videos/604fd124-b44b-4dfe-b7b4-8d3e1f179d69
2025-08-21 03:47:12 - INFO - vision_config is None, using default vision config
2025-08-21 03:47:25 - INFO - Tokens per second: 6.542944804531987, Peak GPU memory MB: 11824.375
2025-08-21 03:47:25 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Inference time: 29.09 seconds, CPU usage: 32.5%, CPU core utilization: [11.8, 19.4, 53.2, 45.3]
2025-08-21 03:47:25 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Cleaned up temporary frame directory: temp_videos/604fd124-b44b-4dfe-b7b4-8d3e1f179d69
2025-08-21 03:47:25 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_015.mp4'
2025-08-21 03:47:25 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Video saved to temporary file: temp_videos/02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d.mp4
2025-08-21 03:47:25 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:47:28 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:47:28 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] 30 frames saved to temp_videos/02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d
2025-08-21 03:47:41 - INFO - vision_config is None, using default vision config
2025-08-21 03:47:52 - INFO - Tokens per second: 5.0694963028257245, Peak GPU memory MB: 11824.375
2025-08-21 03:47:52 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Inference time: 27.06 seconds, CPU usage: 33.0%, CPU core utilization: [18.7, 21.7, 41.3, 50.1]
2025-08-21 03:47:52 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Cleaned up temporary frame directory: temp_videos/02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d
2025-08-21 03:47:52 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_016.mp4'
2025-08-21 03:47:52 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Video saved to temporary file: temp_videos/0ac85bea-34eb-4a31-8263-f7810fb38235.mp4
2025-08-21 03:47:52 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:47:55 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:47:55 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] 30 frames saved to temp_videos/0ac85bea-34eb-4a31-8263-f7810fb38235
2025-08-21 03:48:08 - INFO - vision_config is None, using default vision config
2025-08-21 03:48:19 - INFO - Tokens per second: 5.245383520327553, Peak GPU memory MB: 11824.375
2025-08-21 03:48:19 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Inference time: 27.29 seconds, CPU usage: 33.6%, CPU core utilization: [53.0, 47.3, 12.8, 21.4]
2025-08-21 03:48:19 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Cleaned up temporary frame directory: temp_videos/0ac85bea-34eb-4a31-8263-f7810fb38235