2025-08-21 00:23:49 - INFO - Loading model: openbmb/MiniCPM-V-4
2025-08-21 00:23:50 - INFO - vision_config is None, using default vision config
2025-08-21 00:24:41 - INFO - Model loaded in 52.24 seconds
2025-08-21 00:24:41 - INFO - GPU Memory Usage after model load: 7802.99 MB
2025-08-21 00:24:48 - INFO - [675b6c5c-5524-4cc9-a700-76d2d090a7a4] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_001.mp4'
2025-08-21 00:24:48 - INFO - [675b6c5c-5524-4cc9-a700-76d2d090a7a4] Video saved to temporary file: temp_videos/675b6c5c-5524-4cc9-a700-76d2d090a7a4.mp4
2025-08-21 00:24:48 - INFO - [675b6c5c-5524-4cc9-a700-76d2d090a7a4] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 00:24:52 - INFO - [675b6c5c-5524-4cc9-a700-76d2d090a7a4] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 00:24:53 - INFO - [675b6c5c-5524-4cc9-a700-76d2d090a7a4] 30 frames saved to temp_videos/675b6c5c-5524-4cc9-a700-76d2d090a7a4
2025-08-21 00:25:09 - INFO - vision_config is None, using default vision config
2025-08-21 00:25:21 - INFO - Tokens per second: 5.985183148455991, Peak GPU memory MB: 11824.375
2025-08-21 00:25:21 - INFO - [675b6c5c-5524-4cc9-a700-76d2d090a7a4] Inference time: 33.25 seconds, CPU usage: 37.3%, CPU core utilization: [37.0, 39.5, 36.4, 36.2]
2025-08-21 00:25:21 - INFO - [675b6c5c-5524-4cc9-a700-76d2d090a7a4] Cleaned up temporary frame directory: temp_videos/675b6c5c-5524-4cc9-a700-76d2d090a7a4
2025-08-21 00:25:21 - INFO - [3315fc05-4535-4c30-910d-0b8c1a9c8855] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_002.mp4'
2025-08-21 00:25:21 - INFO - [3315fc05-4535-4c30-910d-0b8c1a9c8855] Video saved to temporary file: temp_videos/3315fc05-4535-4c30-910d-0b8c1a9c8855.mp4
2025-08-21 00:25:21 - INFO - [3315fc05-4535-4c30-910d-0b8c1a9c8855] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 00:25:29 - INFO - [3315fc05-4535-4c30-910d-0b8c1a9c8855] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 00:25:29 - INFO - [3315fc05-4535-4c30-910d-0b8c1a9c8855] 30 frames saved to temp_videos/3315fc05-4535-4c30-910d-0b8c1a9c8855
2025-08-21 00:25:41 - INFO - vision_config is None, using default vision config
2025-08-21 00:25:50 - INFO - Tokens per second: 3.7285647057248625, Peak GPU memory MB: 11824.375
2025-08-21 00:25:50 - INFO - [3315fc05-4535-4c30-910d-0b8c1a9c8855] Inference time: 29.68 seconds, CPU usage: 50.6%, CPU core utilization: [56.4, 62.1, 35.5, 48.3]
2025-08-21 00:25:51 - INFO - [3315fc05-4535-4c30-910d-0b8c1a9c8855] Cleaned up temporary frame directory: temp_videos/3315fc05-4535-4c30-910d-0b8c1a9c8855
2025-08-21 00:25:51 - INFO - [89135b17-c5fd-406e-8ce7-875b26d87444] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_003.mp4'
2025-08-21 00:25:51 - INFO - [89135b17-c5fd-406e-8ce7-875b26d87444] Video saved to temporary file: temp_videos/89135b17-c5fd-406e-8ce7-875b26d87444.mp4
2025-08-21 00:25:51 - INFO - [89135b17-c5fd-406e-8ce7-875b26d87444] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 00:25:56 - INFO - [89135b17-c5fd-406e-8ce7-875b26d87444] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 00:25:56 - INFO - [89135b17-c5fd-406e-8ce7-875b26d87444] 30 frames saved to temp_videos/89135b17-c5fd-406e-8ce7-875b26d87444
2025-08-21 00:26:08 - INFO - vision_config is None, using default vision config
2025-08-21 00:26:22 - INFO - Tokens per second: 6.963268555022767, Peak GPU memory MB: 11824.375
2025-08-21 00:26:22 - INFO - [89135b17-c5fd-406e-8ce7-875b26d87444] Inference time: 31.19 seconds, CPU usage: 38.1%, CPU core utilization: [30.8, 32.9, 49.7, 38.7]
2025-08-21 00:26:22 - INFO - [89135b17-c5fd-406e-8ce7-875b26d87444] Cleaned up temporary frame directory: temp_videos/89135b17-c5fd-406e-8ce7-875b26d87444
2025-08-21 00:26:22 - INFO - [c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_004.mp4'
2025-08-21 00:26:22 - INFO - [c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a] Video saved to temporary file: temp_videos/c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a.mp4
2025-08-21 00:26:22 - INFO - [c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 00:26:27 - INFO - [c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 00:26:27 - INFO - [c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a] 30 frames saved to temp_videos/c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a
2025-08-21 00:26:39 - INFO - vision_config is None, using default vision config
2025-08-21 00:26:53 - INFO - Tokens per second: 7.242406371874894, Peak GPU memory MB: 11824.375
2025-08-21 00:26:53 - INFO - [c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a] Inference time: 31.55 seconds, CPU usage: 36.5%, CPU core utilization: [47.2, 19.7, 61.2, 17.9]
2025-08-21 00:26:53 - INFO - [c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a] Cleaned up temporary frame directory: temp_videos/c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a
2025-08-21 00:26:53 - INFO - [3c7789f8-90dc-45d1-be32-cdc10502bbe2] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_005.mp4'
2025-08-21 00:26:53 - INFO - [3c7789f8-90dc-45d1-be32-cdc10502bbe2] Video saved to temporary file: temp_videos/3c7789f8-90dc-45d1-be32-cdc10502bbe2.mp4
2025-08-21 00:26:53 - INFO - [3c7789f8-90dc-45d1-be32-cdc10502bbe2] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 00:26:58 - INFO - [3c7789f8-90dc-45d1-be32-cdc10502bbe2] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 00:26:58 - INFO - [3c7789f8-90dc-45d1-be32-cdc10502bbe2] 30 frames saved to temp_videos/3c7789f8-90dc-45d1-be32-cdc10502bbe2
2025-08-21 00:27:11 - INFO - vision_config is None, using default vision config
2025-08-21 00:27:22 - INFO - Tokens per second: 5.385284609810389, Peak GPU memory MB: 11824.375
2025-08-21 00:27:22 - INFO - [3c7789f8-90dc-45d1-be32-cdc10502bbe2] Inference time: 28.65 seconds, CPU usage: 37.3%, CPU core utilization: [20.6, 21.4, 90.0, 16.9]
2025-08-21 00:27:22 - INFO - [3c7789f8-90dc-45d1-be32-cdc10502bbe2] Cleaned up temporary frame directory: temp_videos/3c7789f8-90dc-45d1-be32-cdc10502bbe2
2025-08-21 00:27:22 - INFO - [66ffedf3-6d71-4829-adf4-7859b5b21979] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_006.mp4'
2025-08-21 00:27:22 - INFO - [66ffedf3-6d71-4829-adf4-7859b5b21979] Video saved to temporary file: temp_videos/66ffedf3-6d71-4829-adf4-7859b5b21979.mp4
2025-08-21 00:27:22 - INFO - [66ffedf3-6d71-4829-adf4-7859b5b21979] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 00:27:27 - INFO - [66ffedf3-6d71-4829-adf4-7859b5b21979] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 00:27:27 - INFO - [66ffedf3-6d71-4829-adf4-7859b5b21979] 30 frames saved to temp_videos/66ffedf3-6d71-4829-adf4-7859b5b21979
2025-08-21 00:27:40 - INFO - vision_config is None, using default vision config
2025-08-21 00:27:50 - INFO - Tokens per second: 4.504682210102835, Peak GPU memory MB: 11824.375
2025-08-21 00:27:50 - INFO - [66ffedf3-6d71-4829-adf4-7859b5b21979] Inference time: 27.70 seconds, CPU usage: 37.4%, CPU core utilization: [27.3, 19.3, 52.1, 50.8]
2025-08-21 00:27:50 - INFO - [66ffedf3-6d71-4829-adf4-7859b5b21979] Cleaned up temporary frame directory: temp_videos/66ffedf3-6d71-4829-adf4-7859b5b21979
2025-08-21 00:27:50 - INFO - [2167f629-b4f8-4e08-9179-f8eec50d35ab] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_007.mp4'
2025-08-21 00:27:50 - INFO - [2167f629-b4f8-4e08-9179-f8eec50d35ab] Video saved to temporary file: temp_videos/2167f629-b4f8-4e08-9179-f8eec50d35ab.mp4
2025-08-21 00:27:50 - INFO - [2167f629-b4f8-4e08-9179-f8eec50d35ab] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 00:27:54 - INFO - [2167f629-b4f8-4e08-9179-f8eec50d35ab] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 00:27:54 - INFO - [2167f629-b4f8-4e08-9179-f8eec50d35ab] 30 frames saved to temp_videos/2167f629-b4f8-4e08-9179-f8eec50d35ab
2025-08-21 00:28:07 - INFO - vision_config is None, using default vision config
2025-08-21 00:28:27 - INFO - Tokens per second: 9.168312990435263, Peak GPU memory MB: 11824.375
2025-08-21 00:28:27 - INFO - [2167f629-b4f8-4e08-9179-f8eec50d35ab] Inference time: 37.23 seconds, CPU usage: 35.9%, CPU core utilization: [45.1, 19.0, 27.6, 52.0]
2025-08-21 00:28:27 - INFO - [2167f629-b4f8-4e08-9179-f8eec50d35ab] Cleaned up temporary frame directory: temp_videos/2167f629-b4f8-4e08-9179-f8eec50d35ab
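
The per-request figures logged above (inference time, tokens per second) can be aggregated with a short script. The following is a minimal sketch, not part of the service that produced this log: the function name `summarize` and both regular expressions are assumptions derived only from the line formats visible above, and the embedded sample reuses two requests copied from the log.

```python
import re
from statistics import mean

# Both patterns are assumptions inferred from the log lines above,
# not taken from the service's source code.
INFERENCE_RE = re.compile(
    r"\[(?P<request_id>[0-9a-f-]{36})\] Inference time: "
    r"(?P<seconds>\d+(?:\.\d+)?) seconds"
)
TPS_RE = re.compile(r"Tokens per second: (?P<tps>\d+(?:\.\d+)?)")


def summarize(log_text):
    """Return request count, mean inference time, and mean tokens/second."""
    times = [float(m.group("seconds")) for m in INFERENCE_RE.finditer(log_text)]
    rates = [float(m.group("tps")) for m in TPS_RE.finditer(log_text)]
    return {
        "requests": len(times),
        "mean_inference_s": mean(times) if times else None,
        "mean_tokens_per_s": mean(rates) if rates else None,
    }


# Four lines copied verbatim from the log above (first two requests).
SAMPLE = """\
2025-08-21 00:25:21 - INFO - Tokens per second: 5.985183148455991, Peak GPU memory MB: 11824.375
2025-08-21 00:25:21 - INFO - [675b6c5c-5524-4cc9-a700-76d2d090a7a4] Inference time: 33.25 seconds, CPU usage: 37.3%, CPU core utilization: [37.0, 39.5, 36.4, 36.2]
2025-08-21 00:25:50 - INFO - Tokens per second: 3.7285647057248625, Peak GPU memory MB: 11824.375
2025-08-21 00:25:50 - INFO - [3315fc05-4535-4c30-910d-0b8c1a9c8855] Inference time: 29.68 seconds, CPU usage: 50.6%, CPU core utilization: [56.4, 62.1, 35.5, 48.3]
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

To run it against the full log, read the file into a string and pass it to `summarize`. The "Tokens per second" lines carry no request ID, so this sketch averages them independently rather than pairing them with a specific request.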