2025-08-21 03:38:46 - INFO - Loading model: openbmb/MiniCPM-V-4
2025-08-21 03:38:46 - INFO - vision_config is None, using default vision config
2025-08-21 03:39:50 - INFO - Model loaded in 64.62 seconds
2025-08-21 03:39:50 - INFO - GPU Memory Usage after model load: 7802.99 MB
2025-08-21 03:39:57 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_001.mp4'
2025-08-21 03:39:57 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Video saved to temporary file: temp_videos/e29d31c5-9a6b-48cd-ac25-7affc04fc186.mp4
2025-08-21 03:39:57 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:40:01 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:40:01 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] 30 frames saved to temp_videos/e29d31c5-9a6b-48cd-ac25-7affc04fc186
2025-08-21 03:40:17 - INFO - vision_config is None, using default vision config
2025-08-21 03:40:35 - INFO - Tokens per second: 8.46238691458392, Peak GPU memory MB: 11824.375
2025-08-21 03:40:35 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Inference time: 37.25 seconds, CPU usage: 28.4%, CPU core utilization: [24.9, 35.0, 21.8, 31.7]
2025-08-21 03:40:35 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Cleaned up temporary frame directory: temp_videos/e29d31c5-9a6b-48cd-ac25-7affc04fc186
2025-08-21 03:40:35 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_001.mp4'
2025-08-21 03:40:35 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Video saved to temporary file: temp_videos/0ed8d6d1-aea6-4701-a3ed-2d877bfc9882.mp4
2025-08-21 03:40:35 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:40:38 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:40:38 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] 30 frames saved to temp_videos/0ed8d6d1-aea6-4701-a3ed-2d877bfc9882
2025-08-21 03:40:51 - INFO - vision_config is None, using default vision config
2025-08-21 03:41:14 - INFO - Tokens per second: 10.024019230028593, Peak GPU memory MB: 11824.375
2025-08-21 03:41:14 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Inference time: 39.58 seconds, CPU usage: 31.8%, CPU core utilization: [16.9, 19.2, 59.5, 31.5]
2025-08-21 03:41:14 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Cleaned up temporary frame directory: temp_videos/0ed8d6d1-aea6-4701-a3ed-2d877bfc9882
2025-08-21 03:41:14 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_002.mp4'
2025-08-21 03:41:14 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Video saved to temporary file: temp_videos/1daa28b5-5708-4bd7-b738-7900bee17284.mp4
2025-08-21 03:41:14 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:41:18 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:41:18 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] 30 frames saved to temp_videos/1daa28b5-5708-4bd7-b738-7900bee17284
2025-08-21 03:41:30 - INFO - vision_config is None, using default vision config
2025-08-21 03:41:42 - INFO - Tokens per second: 6.118521643289556, Peak GPU memory MB: 11824.375
2025-08-21 03:41:42 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Inference time: 28.18 seconds, CPU usage: 33.2%, CPU core utilization: [57.3, 19.5, 10.5, 45.5]
2025-08-21 03:41:42 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Cleaned up temporary frame directory: temp_videos/1daa28b5-5708-4bd7-b738-7900bee17284
2025-08-21 03:41:42 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_003.mp4'
2025-08-21 03:41:42 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Video saved to temporary file: temp_videos/c70dd357-164c-4d57-b24d-8ead295ef24e.mp4
2025-08-21 03:41:42 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:41:46 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:41:46 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] 30 frames saved to temp_videos/c70dd357-164c-4d57-b24d-8ead295ef24e
2025-08-21 03:41:59 - INFO - vision_config is None, using default vision config
2025-08-21 03:42:13 - INFO - Tokens per second: 7.325785835893888, Peak GPU memory MB: 11824.375
2025-08-21 03:42:13 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Inference time: 30.34 seconds, CPU usage: 33.2%, CPU core utilization: [32.8, 46.1, 30.7, 23.2]
2025-08-21 03:42:13 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Cleaned up temporary frame directory: temp_videos/c70dd357-164c-4d57-b24d-8ead295ef24e
2025-08-21 03:42:13 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_004.mp4'
2025-08-21 03:42:13 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Video saved to temporary file: temp_videos/e2eff8d2-37db-4d25-9765-d46404130b2d.mp4
2025-08-21 03:42:13 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:42:16 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:42:16 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] 30 frames saved to temp_videos/e2eff8d2-37db-4d25-9765-d46404130b2d
2025-08-21 03:42:29 - INFO - vision_config is None, using default vision config
2025-08-21 03:42:40 - INFO - Tokens per second: 5.483056762285139, Peak GPU memory MB: 11824.375
2025-08-21 03:42:40 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Inference time: 27.37 seconds, CPU usage: 33.6%, CPU core utilization: [62.6, 13.2, 42.9, 15.6]
2025-08-21 03:42:40 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Cleaned up temporary frame directory: temp_videos/e2eff8d2-37db-4d25-9765-d46404130b2d
2025-08-21 03:42:40 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_005.mp4'
2025-08-21 03:42:40 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Video saved to temporary file: temp_videos/374baf0b-09b4-47f6-bda7-007ed31b73e6.mp4
2025-08-21 03:42:40 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:42:43 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:42:43 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] 30 frames saved to temp_videos/374baf0b-09b4-47f6-bda7-007ed31b73e6
2025-08-21 03:42:56 - INFO - vision_config is None, using default vision config
2025-08-21 03:43:12 - INFO - Tokens per second: 7.8524871607145865, Peak GPU memory MB: 11824.375
2025-08-21 03:43:12 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Inference time: 31.57 seconds, CPU usage: 32.7%, CPU core utilization: [13.2, 42.4, 47.5, 27.5]
2025-08-21 03:43:12 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Cleaned up temporary frame directory: temp_videos/374baf0b-09b4-47f6-bda7-007ed31b73e6
2025-08-21 03:43:12 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_006.mp4'
2025-08-21 03:43:12 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Video saved to temporary file: temp_videos/b9266afc-5115-4696-91ea-9894092513ff.mp4
2025-08-21 03:43:12 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:43:15 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:43:15 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] 30 frames saved to temp_videos/b9266afc-5115-4696-91ea-9894092513ff
2025-08-21 03:43:28 - INFO - vision_config is None, using default vision config
2025-08-21 03:43:40 - INFO - Tokens per second: 5.751292635048318, Peak GPU memory MB: 11824.375
2025-08-21 03:43:40 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Inference time: 27.84 seconds, CPU usage: 33.5%, CPU core utilization: [24.6, 32.6, 13.2, 63.8]
2025-08-21 03:43:40 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Cleaned up temporary frame directory: temp_videos/b9266afc-5115-4696-91ea-9894092513ff
2025-08-21 03:43:40 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_007.mp4'
2025-08-21 03:43:40 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Video saved to temporary file: temp_videos/cf387e92-735c-444b-a102-345d888dc633.mp4
2025-08-21 03:43:40 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:43:43 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:43:43 - INFO - [cf387e92-735c-444b-a102-345d888dc633] 30 frames saved to temp_videos/cf387e92-735c-444b-a102-345d888dc633
2025-08-21 03:43:56 - INFO - vision_config is None, using default vision config
2025-08-21 03:44:08 - INFO - Tokens per second: 6.460640309211369, Peak GPU memory MB: 11824.375
2025-08-21 03:44:08 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Inference time: 28.88 seconds, CPU usage: 33.2%, CPU core utilization: [11.9, 52.0, 12.7, 56.4]
2025-08-21 03:44:08 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Cleaned up temporary frame directory: temp_videos/cf387e92-735c-444b-a102-345d888dc633
2025-08-21 03:44:08 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_008.mp4'
2025-08-21 03:44:08 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Video saved to temporary file: temp_videos/aef21c10-5565-48f5-bcbf-e239a1faa322.mp4
2025-08-21 03:44:08 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:44:12 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:44:12 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] 30 frames saved to temp_videos/aef21c10-5565-48f5-bcbf-e239a1faa322
2025-08-21 03:44:25 - INFO - vision_config is None, using default vision config
2025-08-21 03:44:35 - INFO - Tokens per second: 4.950112497910254, Peak GPU memory MB: 11824.375
2025-08-21 03:44:35 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Inference time: 26.99 seconds, CPU usage: 34.1%, CPU core utilization: [16.3, 12.5, 47.8, 59.8]
2025-08-21 03:44:35 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Cleaned up temporary frame directory: temp_videos/aef21c10-5565-48f5-bcbf-e239a1faa322
2025-08-21 03:44:35 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_009.mp4'
2025-08-21 03:44:35 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Video saved to temporary file: temp_videos/040650dd-914d-453f-a411-b31d1d6897d5.mp4
2025-08-21 03:44:35 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:44:39 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:44:39 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] 30 frames saved to temp_videos/040650dd-914d-453f-a411-b31d1d6897d5
2025-08-21 03:44:52 - INFO - vision_config is None, using default vision config
2025-08-21 03:45:04 - INFO - Tokens per second: 6.046726583056993, Peak GPU memory MB: 11824.375
2025-08-21 03:45:04 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Inference time: 28.31 seconds, CPU usage: 33.8%, CPU core utilization: [45.1, 36.7, 40.1, 13.1]
2025-08-21 03:45:04 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Cleaned up temporary frame directory: temp_videos/040650dd-914d-453f-a411-b31d1d6897d5
2025-08-21 03:45:04 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_010.mp4'
2025-08-21 03:45:04 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Video saved to temporary file: temp_videos/c4922af4-0973-46aa-8ab3-a2904f616ca0.mp4
2025-08-21 03:45:04 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:45:07 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:45:07 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] 30 frames saved to temp_videos/c4922af4-0973-46aa-8ab3-a2904f616ca0
2025-08-21 03:45:20 - INFO - vision_config is None, using default vision config
2025-08-21 03:45:31 - INFO - Tokens per second: 5.012952424490043, Peak GPU memory MB: 11824.375
2025-08-21 03:45:31 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Inference time: 26.92 seconds, CPU usage: 33.9%, CPU core utilization: [45.7, 15.2, 24.5, 49.9]
2025-08-21 03:45:31 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Cleaned up temporary frame directory: temp_videos/c4922af4-0973-46aa-8ab3-a2904f616ca0
2025-08-21 03:45:31 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_011.mp4'
2025-08-21 03:45:31 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Video saved to temporary file: temp_videos/96d2962b-c166-4be7-847e-fe025954af18.mp4
2025-08-21 03:45:31 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:45:34 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:45:34 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] 30 frames saved to temp_videos/96d2962b-c166-4be7-847e-fe025954af18
2025-08-21 03:45:47 - INFO - vision_config is None, using default vision config
2025-08-21 03:45:59 - INFO - Tokens per second: 6.0496181050699604, Peak GPU memory MB: 11824.375
2025-08-21 03:45:59 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Inference time: 28.26 seconds, CPU usage: 33.7%, CPU core utilization: [15.3, 25.6, 46.3, 47.7]
2025-08-21 03:45:59 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Cleaned up temporary frame directory: temp_videos/96d2962b-c166-4be7-847e-fe025954af18
2025-08-21 03:45:59 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_012.mp4'
2025-08-21 03:45:59 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Video saved to temporary file: temp_videos/1861ded3-2706-4381-8e32-07949f940d95.mp4
2025-08-21 03:45:59 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:46:02 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:46:02 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] 30 frames saved to temp_videos/1861ded3-2706-4381-8e32-07949f940d95
2025-08-21 03:46:15 - INFO - vision_config is None, using default vision config
2025-08-21 03:46:27 - INFO - Tokens per second: 6.042554615117601, Peak GPU memory MB: 11824.375
2025-08-21 03:46:27 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Inference time: 28.37 seconds, CPU usage: 33.3%, CPU core utilization: [53.9, 22.6, 36.0, 20.6]
2025-08-21 03:46:27 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Cleaned up temporary frame directory: temp_videos/1861ded3-2706-4381-8e32-07949f940d95
2025-08-21 03:46:27 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_013.mp4'
2025-08-21 03:46:27 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Video saved to temporary file: temp_videos/04268664-f928-4c89-ab42-d07706c93257.mp4
2025-08-21 03:46:27 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:46:31 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:46:31 - INFO - [04268664-f928-4c89-ab42-d07706c93257] 30 frames saved to temp_videos/04268664-f928-4c89-ab42-d07706c93257
2025-08-21 03:46:44 - INFO - vision_config is None, using default vision config
2025-08-21 03:46:55 - INFO - Tokens per second: 5.897914933533588, Peak GPU memory MB: 11824.375
2025-08-21 03:46:55 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Inference time: 28.07 seconds, CPU usage: 33.2%, CPU core utilization: [52.2, 22.0, 22.6, 35.7]
2025-08-21 03:46:55 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Cleaned up temporary frame directory: temp_videos/04268664-f928-4c89-ab42-d07706c93257
2025-08-21 03:46:55 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_014.mp4'
2025-08-21 03:46:55 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Video saved to temporary file: temp_videos/604fd124-b44b-4dfe-b7b4-8d3e1f179d69.mp4
2025-08-21 03:46:55 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:46:59 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:46:59 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] 30 frames saved to temp_videos/604fd124-b44b-4dfe-b7b4-8d3e1f179d69
2025-08-21 03:47:12 - INFO - vision_config is None, using default vision config
2025-08-21 03:47:25 - INFO - Tokens per second: 6.542944804531987, Peak GPU memory MB: 11824.375
2025-08-21 03:47:25 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Inference time: 29.09 seconds, CPU usage: 32.5%, CPU core utilization: [11.8, 19.4, 53.2, 45.3]
2025-08-21 03:47:25 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Cleaned up temporary frame directory: temp_videos/604fd124-b44b-4dfe-b7b4-8d3e1f179d69
2025-08-21 03:47:25 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_015.mp4'
2025-08-21 03:47:25 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Video saved to temporary file: temp_videos/02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d.mp4
2025-08-21 03:47:25 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:47:28 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:47:28 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] 30 frames saved to temp_videos/02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d
2025-08-21 03:47:41 - INFO - vision_config is None, using default vision config
2025-08-21 03:47:52 - INFO - Tokens per second: 5.0694963028257245, Peak GPU memory MB: 11824.375
2025-08-21 03:47:52 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Inference time: 27.06 seconds, CPU usage: 33.0%, CPU core utilization: [18.7, 21.7, 41.3, 50.1]
2025-08-21 03:47:52 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Cleaned up temporary frame directory: temp_videos/02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d
2025-08-21 03:47:52 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_016.mp4'
2025-08-21 03:47:52 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Video saved to temporary file: temp_videos/0ac85bea-34eb-4a31-8263-f7810fb38235.mp4
2025-08-21 03:47:52 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Extracting frames using method: uniform, rate/threshold: 30
2025-08-21 03:47:55 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Extracted 30 frames successfully. Saving to temporary files...
2025-08-21 03:47:55 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] 30 frames saved to temp_videos/0ac85bea-34eb-4a31-8263-f7810fb38235
2025-08-21 03:48:08 - INFO - vision_config is None, using default vision config
2025-08-21 03:48:19 - INFO - Tokens per second: 5.245383520327553, Peak GPU memory MB: 11824.375
2025-08-21 03:48:19 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Inference time: 27.29 seconds, CPU usage: 33.6%, CPU core utilization: [53.0, 47.3, 12.8, 21.4]
2025-08-21 03:48:19 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Cleaned up temporary frame directory: temp_videos/0ac85bea-34eb-4a31-8263-f7810fb38235