Wangtwohappy committed
Commit f8ba0eb · verified · 1 Parent(s): 3af9d63

Upload folder using huggingface_hub
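
The commit message above appears to match the default message of `HfApi.upload_folder` in `huggingface_hub`. A minimal sketch of such an upload follows, assuming one wanted to leave out bytecode caches, run logs, and temporary frame dumps like those listed in this commit. The `IGNORE_PATTERNS` list, the `is_ignored` preview helper, and the `upload` wrapper are hypothetical names introduced for illustration; the repository id and folder path are not given in the source and must be supplied by the caller. The preview helper uses `fnmatch`, which approximates how the library filters paths but is not the library's own code.

```python
import fnmatch

# Hypothetical exclusion list (an assumption, not taken from the commit).
# With fnmatch-style matching, "*" also matches "/" in a path.
IGNORE_PATTERNS = [
    "*.pyc",
    "*/__pycache__/*",
    "*/logs/*",
    "*/temp_videos/*",
]


def is_ignored(path, patterns=IGNORE_PATTERNS):
    """Local preview: return True if `path` matches any ignore pattern."""
    return any(fnmatch.fnmatch(path, pattern) for pattern in patterns)


def upload(folder_path, repo_id, repo_type="model"):
    """Upload `folder_path` to `repo_id`, skipping files matched by IGNORE_PATTERNS.

    The import is local so the preview helper works without the library installed.
    """
    from huggingface_hub import HfApi

    return HfApi().upload_folder(
        folder_path=folder_path,
        repo_id=repo_id,
        repo_type=repo_type,
        commit_message="Upload folder using huggingface_hub",
        ignore_patterns=IGNORE_PATTERNS,
    )


if __name__ == "__main__":
    # Preview the filter against a few paths that appear in this commit.
    for p in [
        "API_Transformers/infer.py",
        "API_Transformers/__pycache__/video_processor.cpython-311.pyc",
        "API_Transformers/logs/LFM2-VL-1.6B/20250818_232556.log",
    ]:
        print(p, "->", "skip" if is_ignored(p) else "upload")
```

Running the preview before calling `upload` makes it easy to confirm which files would be skipped without touching the network.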

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .gitattributes +390 -0
  2. API_Transformers/__pycache__/video_processor.cpython-311.pyc +0 -0
  3. API_Transformers/cal.py +18 -0
  4. API_Transformers/delete.py +10 -0
  5. API_Transformers/infer.py +131 -0
  6. API_Transformers/logs/LFM2-VL-1.6B/20250818_232556.log +14 -0
  7. API_Transformers/logs/LFM2-VL-1.6B/20250818_233101.log +28 -0
  8. API_Transformers/logs/LFM2-VL-1.6B/20250818_233342.log +28 -0
  9. API_Transformers/logs/LFM2-VL-1.6B/20250818_233635.log +10 -0
  10. API_Transformers/logs/LFM2-VL-1.6B/20250818_234120.log +34 -0
  11. API_Transformers/logs/LFM2-VL-1.6B/20250818_234837.log +14 -0
  12. API_Transformers/logs/LFM2-VL-1.6B/20250818_234946.log +0 -0
  13. API_Transformers/logs/LFM2-VL-1.6B/20250820_215936.log +44 -0
  14. API_Transformers/logs/LFM2-VL-1.6B/20250820_220950.log +54 -0
  15. API_Transformers/logs/LFM2-VL-1.6B/20250820_221918.log +0 -0
  16. API_Transformers/logs/LFM2-VL-1.6B/20250820_231154.log +4 -0
  17. API_Transformers/logs/LFM2-VL-1.6B/20250820_231714.log +67 -0
  18. API_Transformers/logs/LFM2-VL-1.6B/20250820_232316.log +24 -0
  19. API_Transformers/logs/LFM2-VL-1.6B/20250820_232542.log +130 -0
  20. API_Transformers/logs/MiniCPM-V-4/20250819_004631.log +14 -0
  21. API_Transformers/logs/MiniCPM-V-4/20250819_013451.log +454 -0
  22. API_Transformers/logs/MiniCPM-V-4/20250820_233455.log +0 -0
  23. API_Transformers/logs/MiniCPM-V-4/20250821_002349.log +67 -0
  24. API_Transformers/logs/MiniCPM-V-4/20250821_005748.log +472 -0
  25. API_Transformers/logs/MiniCPM-V-4/20250821_033846.log +157 -0
  26. API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_212712.log +1 -0
  27. API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_212744.log +20 -0
  28. API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_213116.log +9 -0
  29. API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_214203.log +20 -0
  30. API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_215326.log +44 -0
  31. API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_221356.log +22 -0
  32. API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_221804.log +19 -0
  33. API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_222505.log +18 -0
  34. API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_222617.log +13 -0
  35. API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_223141.log +14 -0
  36. API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_223603.log +14 -0
  37. API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_224148.log +14 -0
  38. API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_224556.log +0 -0
  39. API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250819_010913.log +0 -0
  40. API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250821_002944.log +94 -0
  41. API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250821_013207.log +148 -0
  42. API_Transformers/logs/Qwen2.5-VL-3B-Instruct-AWQ/20250821_003308.log +2 -0
  43. API_Transformers/logs/Qwen2.5-VL-3B-Instruct-AWQ/20250821_003548.log +2 -0
  44. API_Transformers/logs/Qwen2.5-VL-3B-Instruct-AWQ/20250821_003740.log +10 -0
  45. API_Transformers/logs/Qwen2.5-VL-3B-Instruct-AWQ/20250821_004253.log +103 -0
  46. API_Transformers/logs/Qwen2.5-VL-3B-Instruct-AWQ/20250821_004907.log +145 -0
  47. API_Transformers/logs/Qwen2.5-VL-3B-Instruct-AWQ/20250821_014204.log +148 -0
  48. API_Transformers/logs/gemma-3-4b-it/20250819_005014.log +28 -0
  49. API_Transformers/logs/gemma-3-4b-it/20250819_005535.log +10 -0
  50. API_Transformers/logs/gemma-3-4b-it/20250819_010310.log +10 -0
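
The first entry in the list above, `.gitattributes +390 -0`, consists of Git LFS rules, and in the portion of the diff shown each rule names a single file. As a sketch only, per-file rules of this kind could be collapsed into one glob rule per directory and extension. The function name `collapse_lfs_rules` and the `min_group` threshold are hypothetical, and the code assumes paths contain no spaces. Note that a collapsed rule is broader than the originals: it also covers files with the same extension added to that directory later.

```python
from collections import defaultdict
from pathlib import PurePosixPath

LFS_ATTRS = "filter=lfs diff=lfs merge=lfs -text"


def collapse_lfs_rules(lines, min_group=2):
    """Collapse per-file LFS rules into `<dir>/*<ext>` rules.

    Rules that already use wildcards, have other attributes, sit at the
    repository root, or have no extension are passed through unchanged.
    """
    groups = defaultdict(list)
    out = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        pattern, _, attrs = line.partition(" ")
        path = PurePosixPath(pattern)
        is_plain_file = not any(ch in pattern for ch in "*?[")
        if (
            attrs.strip() == LFS_ATTRS
            and is_plain_file
            and path.suffix
            and str(path.parent) != "."
        ):
            groups[(str(path.parent), path.suffix)].append(line)
        else:
            out.append(line)
    for (parent, suffix), members in groups.items():
        if len(members) >= min_group:
            # One glob rule replaces every per-file rule in this group.
            out.append(f"{parent}/*{suffix} {LFS_ATTRS}")
        else:
            # A lone file keeps its exact rule so coverage is not widened.
            out.extend(members)
    return out


if __name__ == "__main__":
    sample = [
        "*.zip filter=lfs diff=lfs merge=lfs -text",
        "API_Transformers/messi/Clips_30s/messi_part_001.mp4 filter=lfs diff=lfs merge=lfs -text",
        "API_Transformers/messi/Clips_30s/messi_part_002.mp4 filter=lfs diff=lfs merge=lfs -text",
        "xywang/demo.jpeg filter=lfs diff=lfs merge=lfs -text",
    ]
    for rule in collapse_lfs_rules(sample):
        print(rule)
```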
.gitattributes CHANGED
@@ -33,3 +33,393 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ API_Transformers/messi/Clips_30s/messi_part_001.mp4 filter=lfs diff=lfs merge=lfs -text
37
+ API_Transformers/messi/Clips_30s/messi_part_002.mp4 filter=lfs diff=lfs merge=lfs -text
38
+ API_Transformers/messi/Clips_30s/messi_part_003.mp4 filter=lfs diff=lfs merge=lfs -text
39
+ API_Transformers/messi/Clips_30s/messi_part_004.mp4 filter=lfs diff=lfs merge=lfs -text
40
+ API_Transformers/messi/Clips_30s/messi_part_005.mp4 filter=lfs diff=lfs merge=lfs -text
41
+ API_Transformers/messi/Clips_30s/messi_part_006.mp4 filter=lfs diff=lfs merge=lfs -text
42
+ API_Transformers/messi/Clips_30s/messi_part_007.mp4 filter=lfs diff=lfs merge=lfs -text
43
+ API_Transformers/messi/Clips_30s/messi_part_008.mp4 filter=lfs diff=lfs merge=lfs -text
44
+ API_Transformers/messi/Clips_30s/messi_part_009.mp4 filter=lfs diff=lfs merge=lfs -text
45
+ API_Transformers/messi/Clips_30s/messi_part_010.mp4 filter=lfs diff=lfs merge=lfs -text
46
+ API_Transformers/messi/Clips_30s/messi_part_011.mp4 filter=lfs diff=lfs merge=lfs -text
47
+ API_Transformers/messi/Clips_30s/messi_part_012.mp4 filter=lfs diff=lfs merge=lfs -text
48
+ API_Transformers/messi/Clips_30s/messi_part_013.mp4 filter=lfs diff=lfs merge=lfs -text
49
+ API_Transformers/messi/Clips_30s/messi_part_014.mp4 filter=lfs diff=lfs merge=lfs -text
50
+ API_Transformers/messi/Clips_30s/messi_part_015.mp4 filter=lfs diff=lfs merge=lfs -text
51
+ API_Transformers/messi/Clips_30s/messi_part_016.mp4 filter=lfs diff=lfs merge=lfs -text
52
+ API_Transformers/messi/Clips_30s/messi_part_017.mp4 filter=lfs diff=lfs merge=lfs -text
53
+ API_Transformers/messi/Clips_30s/messi_part_018.mp4 filter=lfs diff=lfs merge=lfs -text
54
+ API_Transformers/messi/Clips_30s/messi_part_019.mp4 filter=lfs diff=lfs merge=lfs -text
55
+ API_Transformers/messi/Clips_30s/messi_part_020.mp4 filter=lfs diff=lfs merge=lfs -text
56
+ API_Transformers/messi/Clips_30s/messi_part_021.mp4 filter=lfs diff=lfs merge=lfs -text
57
+ API_Transformers/messi/Clips_30s/messi_part_022.mp4 filter=lfs diff=lfs merge=lfs -text
58
+ API_Transformers/messi/Clips_30s/messi_part_023.mp4 filter=lfs diff=lfs merge=lfs -text
59
+ API_Transformers/messi/Clips_30s/messi_part_024.mp4 filter=lfs diff=lfs merge=lfs -text
60
+ API_Transformers/messi/Clips_30s/messi_part_025.mp4 filter=lfs diff=lfs merge=lfs -text
61
+ API_Transformers/messi/Clips_30s/messi_part_026.mp4 filter=lfs diff=lfs merge=lfs -text
62
+ API_Transformers/messi/Clips_30s/messi_part_027.mp4 filter=lfs diff=lfs merge=lfs -text
63
+ API_Transformers/messi/Clips_30s/messi_part_028.mp4 filter=lfs diff=lfs merge=lfs -text
64
+ API_Transformers/messi/Clips_30s/messi_part_029.mp4 filter=lfs diff=lfs merge=lfs -text
65
+ API_Transformers/messi/Clips_30s/messi_part_030.mp4 filter=lfs diff=lfs merge=lfs -text
66
+ API_Transformers/messi/Clips_30s/messi_part_031.mp4 filter=lfs diff=lfs merge=lfs -text
67
+ API_Transformers/messi/Clips_30s/messi_part_032.mp4 filter=lfs diff=lfs merge=lfs -text
68
+ API_Transformers/messi/Clips_30s/messi_part_033.mp4 filter=lfs diff=lfs merge=lfs -text
69
+ API_Transformers/messi/Clips_30s/messi_part_034.mp4 filter=lfs diff=lfs merge=lfs -text
70
+ API_Transformers/messi/Clips_30s/messi_part_035.mp4 filter=lfs diff=lfs merge=lfs -text
71
+ API_Transformers/messi/Clips_30s/messi_part_036.mp4 filter=lfs diff=lfs merge=lfs -text
72
+ API_Transformers/messi/Clips_30s/messi_part_037.mp4 filter=lfs diff=lfs merge=lfs -text
73
+ API_Transformers/messi/Clips_30s/messi_part_038.mp4 filter=lfs diff=lfs merge=lfs -text
74
+ API_Transformers/messi/Clips_30s/messi_part_039.mp4 filter=lfs diff=lfs merge=lfs -text
75
+ API_Transformers/messi/Clips_30s/messi_part_040.mp4 filter=lfs diff=lfs merge=lfs -text
76
+ API_Transformers/messi/Clips_30s/messi_part_041.mp4 filter=lfs diff=lfs merge=lfs -text
77
+ API_Transformers/messi/Clips_30s/messi_part_042.mp4 filter=lfs diff=lfs merge=lfs -text
78
+ API_Transformers/messi/Clips_30s/messi_part_043.mp4 filter=lfs diff=lfs merge=lfs -text
79
+ API_Transformers/messi/Clips_30s/messi_part_044.mp4 filter=lfs diff=lfs merge=lfs -text
80
+ API_Transformers/messi/Clips_30s/messi_part_045.mp4 filter=lfs diff=lfs merge=lfs -text
81
+ API_Transformers/messi/Clips_30s/messi_part_046.mp4 filter=lfs diff=lfs merge=lfs -text
82
+ API_Transformers/messi/Clips_30s/messi_part_047.mp4 filter=lfs diff=lfs merge=lfs -text
83
+ API_Transformers/messi/Clips_30s/messi_part_048.mp4 filter=lfs diff=lfs merge=lfs -text
84
+ API_Transformers/messi/Clips_30s/messi_part_049.mp4 filter=lfs diff=lfs merge=lfs -text
85
+ API_Transformers/messi/Clips_30s/messi_part_050.mp4 filter=lfs diff=lfs merge=lfs -text
86
+ API_Transformers/messi/Clips_30s/messi_part_051.mp4 filter=lfs diff=lfs merge=lfs -text
87
+ API_Transformers/messi/Clips_30s/messi_part_052.mp4 filter=lfs diff=lfs merge=lfs -text
88
+ API_Transformers/messi/Clips_30s/messi_part_053.mp4 filter=lfs diff=lfs merge=lfs -text
89
+ API_Transformers/messi/Clips_30s/messi_part_054.mp4 filter=lfs diff=lfs merge=lfs -text
90
+ API_Transformers/messi/Clips_30s/messi_part_055.mp4 filter=lfs diff=lfs merge=lfs -text
91
+ API_Transformers/messi/Clips_30s/messi_part_056.mp4 filter=lfs diff=lfs merge=lfs -text
92
+ API_Transformers/messi/Clips_30s/messi_part_057.mp4 filter=lfs diff=lfs merge=lfs -text
93
+ API_Transformers/messi/Clips_30s/messi_part_058.mp4 filter=lfs diff=lfs merge=lfs -text
94
+ API_Transformers/messi/Clips_30s/messi_part_059.mp4 filter=lfs diff=lfs merge=lfs -text
95
+ API_Transformers/messi/Clips_30s/messi_part_060.mp4 filter=lfs diff=lfs merge=lfs -text
96
+ API_Transformers/messi/Clips_30s/messi_part_061.mp4 filter=lfs diff=lfs merge=lfs -text
97
+ API_Transformers/messi/Clips_30s/messi_part_062.mp4 filter=lfs diff=lfs merge=lfs -text
98
+ API_Transformers/messi/Clips_30s/messi_part_063.mp4 filter=lfs diff=lfs merge=lfs -text
99
+ API_Transformers/messi/Clips_30s/messi_part_064.mp4 filter=lfs diff=lfs merge=lfs -text
100
+ API_Transformers/messi/Clips_30s/messi_part_065.mp4 filter=lfs diff=lfs merge=lfs -text
101
+ API_Transformers/messi/Clips_30s/messi_part_066.mp4 filter=lfs diff=lfs merge=lfs -text
102
+ API_Transformers/messi/Clips_30s/messi_part_067.mp4 filter=lfs diff=lfs merge=lfs -text
103
+ API_Transformers/messi/Clips_30s/messi_part_068.mp4 filter=lfs diff=lfs merge=lfs -text
104
+ API_Transformers/messi/Clips_30s/messi_part_069.mp4 filter=lfs diff=lfs merge=lfs -text
105
+ API_Transformers/messi/Clips_30s/messi_part_070.mp4 filter=lfs diff=lfs merge=lfs -text
106
+ API_Transformers/messi/Clips_30s/messi_part_071.mp4 filter=lfs diff=lfs merge=lfs -text
107
+ API_Transformers/messi/Clips_30s/messi_part_072.mp4 filter=lfs diff=lfs merge=lfs -text
108
+ API_Transformers/messi/Clips_30s/messi_part_073.mp4 filter=lfs diff=lfs merge=lfs -text
109
+ API_Transformers/messi/Clips_30s/messi_part_074.mp4 filter=lfs diff=lfs merge=lfs -text
110
+ API_Transformers/messi/Clips_30s/messi_part_075.mp4 filter=lfs diff=lfs merge=lfs -text
111
+ API_Transformers/messi/Clips_30s/messi_part_076.mp4 filter=lfs diff=lfs merge=lfs -text
112
+ API_Transformers/messi/Clips_30s/messi_part_077.mp4 filter=lfs diff=lfs merge=lfs -text
113
+ API_Transformers/messi/Clips_30s/messi_part_078.mp4 filter=lfs diff=lfs merge=lfs -text
114
+ API_Transformers/messi/Clips_30s/messi_part_079.mp4 filter=lfs diff=lfs merge=lfs -text
115
+ API_Transformers/messi/Clips_30s/messi_part_080.mp4 filter=lfs diff=lfs merge=lfs -text
116
+ API_Transformers/messi/Clips_30s/messi_part_081.mp4 filter=lfs diff=lfs merge=lfs -text
117
+ API_Transformers/messi/Clips_30s/messi_part_082.mp4 filter=lfs diff=lfs merge=lfs -text
118
+ API_Transformers/messi/Clips_30s/messi_part_083.mp4 filter=lfs diff=lfs merge=lfs -text
119
+ API_Transformers/messi/Clips_30s/messi_part_084.mp4 filter=lfs diff=lfs merge=lfs -text
120
+ API_Transformers/messi/Clips_30s/messi_part_085.mp4 filter=lfs diff=lfs merge=lfs -text
121
+ API_Transformers/messi/Clips_30s/messi_part_086.mp4 filter=lfs diff=lfs merge=lfs -text
122
+ API_Transformers/messi/Clips_30s/messi_part_087.mp4 filter=lfs diff=lfs merge=lfs -text
123
+ API_Transformers/messi/Clips_30s/messi_part_088.mp4 filter=lfs diff=lfs merge=lfs -text
124
+ API_Transformers/messi/Clips_30s/messi_part_089.mp4 filter=lfs diff=lfs merge=lfs -text
125
+ API_Transformers/messi/Clips_30s/messi_part_090.mp4 filter=lfs diff=lfs merge=lfs -text
126
+ API_Transformers/messi/Clips_30s/messi_part_091.mp4 filter=lfs diff=lfs merge=lfs -text
127
+ API_Transformers/messi/Clips_30s/messi_part_092.mp4 filter=lfs diff=lfs merge=lfs -text
128
+ API_Transformers/messi/Clips_30s/messi_part_093.mp4 filter=lfs diff=lfs merge=lfs -text
129
+ API_Transformers/messi/Clips_30s/messi_part_094.mp4 filter=lfs diff=lfs merge=lfs -text
130
+ API_Transformers/messi/Clips_60s/messi_part_001.mp4 filter=lfs diff=lfs merge=lfs -text
131
+ API_Transformers/messi/Clips_60s/messi_part_002.mp4 filter=lfs diff=lfs merge=lfs -text
132
+ API_Transformers/messi/Clips_60s/messi_part_003.mp4 filter=lfs diff=lfs merge=lfs -text
133
+ API_Transformers/messi/Clips_60s/messi_part_004.mp4 filter=lfs diff=lfs merge=lfs -text
134
+ API_Transformers/messi/Clips_60s/messi_part_005.mp4 filter=lfs diff=lfs merge=lfs -text
135
+ API_Transformers/messi/Clips_60s/messi_part_006.mp4 filter=lfs diff=lfs merge=lfs -text
136
+ API_Transformers/messi/Clips_60s/messi_part_007.mp4 filter=lfs diff=lfs merge=lfs -text
137
+ API_Transformers/messi/Clips_60s/messi_part_008.mp4 filter=lfs diff=lfs merge=lfs -text
138
+ API_Transformers/messi/Clips_60s/messi_part_009.mp4 filter=lfs diff=lfs merge=lfs -text
139
+ API_Transformers/messi/Clips_60s/messi_part_010.mp4 filter=lfs diff=lfs merge=lfs -text
140
+ API_Transformers/messi/Clips_60s/messi_part_011.mp4 filter=lfs diff=lfs merge=lfs -text
141
+ API_Transformers/messi/Clips_60s/messi_part_012.mp4 filter=lfs diff=lfs merge=lfs -text
142
+ API_Transformers/messi/Clips_60s/messi_part_013.mp4 filter=lfs diff=lfs merge=lfs -text
143
+ API_Transformers/messi/Clips_60s/messi_part_014.mp4 filter=lfs diff=lfs merge=lfs -text
144
+ API_Transformers/messi/Clips_60s/messi_part_015.mp4 filter=lfs diff=lfs merge=lfs -text
145
+ API_Transformers/messi/Clips_60s/messi_part_016.mp4 filter=lfs diff=lfs merge=lfs -text
146
+ API_Transformers/messi/Clips_60s/messi_part_017.mp4 filter=lfs diff=lfs merge=lfs -text
147
+ API_Transformers/messi/Clips_60s/messi_part_018.mp4 filter=lfs diff=lfs merge=lfs -text
148
+ API_Transformers/messi/Clips_60s/messi_part_019.mp4 filter=lfs diff=lfs merge=lfs -text
149
+ API_Transformers/messi/Clips_60s/messi_part_020.mp4 filter=lfs diff=lfs merge=lfs -text
150
+ API_Transformers/messi/Clips_60s/messi_part_021.mp4 filter=lfs diff=lfs merge=lfs -text
151
+ API_Transformers/messi/Clips_60s/messi_part_022.mp4 filter=lfs diff=lfs merge=lfs -text
152
+ API_Transformers/messi/Clips_60s/messi_part_023.mp4 filter=lfs diff=lfs merge=lfs -text
153
+ API_Transformers/messi/Clips_60s/messi_part_024.mp4 filter=lfs diff=lfs merge=lfs -text
154
+ API_Transformers/messi/Clips_60s/messi_part_025.mp4 filter=lfs diff=lfs merge=lfs -text
155
+ API_Transformers/messi/Clips_60s/messi_part_026.mp4 filter=lfs diff=lfs merge=lfs -text
156
+ API_Transformers/messi/Clips_60s/messi_part_027.mp4 filter=lfs diff=lfs merge=lfs -text
157
+ API_Transformers/messi/Clips_60s/messi_part_028.mp4 filter=lfs diff=lfs merge=lfs -text
158
+ API_Transformers/messi/Clips_60s/messi_part_029.mp4 filter=lfs diff=lfs merge=lfs -text
159
+ API_Transformers/messi/Clips_60s/messi_part_030.mp4 filter=lfs diff=lfs merge=lfs -text
160
+ API_Transformers/messi/Clips_60s/messi_part_031.mp4 filter=lfs diff=lfs merge=lfs -text
161
+ API_Transformers/messi/Clips_60s/messi_part_032.mp4 filter=lfs diff=lfs merge=lfs -text
162
+ API_Transformers/messi/Clips_60s/messi_part_033.mp4 filter=lfs diff=lfs merge=lfs -text
163
+ API_Transformers/messi/Clips_60s/messi_part_034.mp4 filter=lfs diff=lfs merge=lfs -text
164
+ API_Transformers/messi/Clips_60s/messi_part_035.mp4 filter=lfs diff=lfs merge=lfs -text
165
+ API_Transformers/messi/Clips_60s/messi_part_036.mp4 filter=lfs diff=lfs merge=lfs -text
166
+ API_Transformers/messi/Clips_60s/messi_part_037.mp4 filter=lfs diff=lfs merge=lfs -text
167
+ API_Transformers/messi/Clips_60s/messi_part_038.mp4 filter=lfs diff=lfs merge=lfs -text
168
+ API_Transformers/messi/Clips_60s/messi_part_039.mp4 filter=lfs diff=lfs merge=lfs -text
169
+ API_Transformers/messi/Clips_60s/messi_part_040.mp4 filter=lfs diff=lfs merge=lfs -text
170
+ API_Transformers/messi/Clips_60s/messi_part_041.mp4 filter=lfs diff=lfs merge=lfs -text
171
+ API_Transformers/messi/Clips_60s/messi_part_042.mp4 filter=lfs diff=lfs merge=lfs -text
172
+ API_Transformers/messi/Clips_60s/messi_part_043.mp4 filter=lfs diff=lfs merge=lfs -text
173
+ API_Transformers/messi/Clips_60s/messi_part_044.mp4 filter=lfs diff=lfs merge=lfs -text
174
+ API_Transformers/messi/Clips_60s/messi_part_045.mp4 filter=lfs diff=lfs merge=lfs -text
175
+ API_Transformers/messi/Clips_60s/messi_part_046.mp4 filter=lfs diff=lfs merge=lfs -text
176
+ API_Transformers/messi/Clips_60s/messi_part_047.mp4 filter=lfs diff=lfs merge=lfs -text
177
+ Direct_Transformers/videos/sample1_raw.mp4 filter=lfs diff=lfs merge=lfs -text
178
+ Direct_Transformers/videos/sample1_rotated_180.mp4 filter=lfs diff=lfs merge=lfs -text
179
+ Direct_Transformers/videos/sample1_rotated_270.mp4 filter=lfs diff=lfs merge=lfs -text
180
+ Direct_Transformers/videos/sample1_rotated_90.mp4 filter=lfs diff=lfs merge=lfs -text
181
+ Direct_Transformers/videos/sample2.mp4 filter=lfs diff=lfs merge=lfs -text
182
+ vllm-deploy/MiniCPM-V-4-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
183
+ xywang/demo.jpeg filter=lfs diff=lfs merge=lfs -text
184
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0000.jpg filter=lfs diff=lfs merge=lfs -text
185
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0001.jpg filter=lfs diff=lfs merge=lfs -text
186
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0002.jpg filter=lfs diff=lfs merge=lfs -text
187
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0003.jpg filter=lfs diff=lfs merge=lfs -text
188
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0004.jpg filter=lfs diff=lfs merge=lfs -text
189
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0005.jpg filter=lfs diff=lfs merge=lfs -text
190
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0006.jpg filter=lfs diff=lfs merge=lfs -text
191
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0007.jpg filter=lfs diff=lfs merge=lfs -text
192
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0008.jpg filter=lfs diff=lfs merge=lfs -text
193
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0009.jpg filter=lfs diff=lfs merge=lfs -text
194
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0010.jpg filter=lfs diff=lfs merge=lfs -text
195
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0011.jpg filter=lfs diff=lfs merge=lfs -text
196
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0012.jpg filter=lfs diff=lfs merge=lfs -text
197
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0013.jpg filter=lfs diff=lfs merge=lfs -text
198
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0014.jpg filter=lfs diff=lfs merge=lfs -text
199
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0015.jpg filter=lfs diff=lfs merge=lfs -text
200
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0016.jpg filter=lfs diff=lfs merge=lfs -text
201
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0017.jpg filter=lfs diff=lfs merge=lfs -text
202
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0018.jpg filter=lfs diff=lfs merge=lfs -text
203
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0019.jpg filter=lfs diff=lfs merge=lfs -text
204
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0020.jpg filter=lfs diff=lfs merge=lfs -text
205
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0021.jpg filter=lfs diff=lfs merge=lfs -text
206
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0022.jpg filter=lfs diff=lfs merge=lfs -text
207
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0023.jpg filter=lfs diff=lfs merge=lfs -text
208
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0024.jpg filter=lfs diff=lfs merge=lfs -text
209
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0025.jpg filter=lfs diff=lfs merge=lfs -text
210
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0026.jpg filter=lfs diff=lfs merge=lfs -text
211
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0027.jpg filter=lfs diff=lfs merge=lfs -text
212
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0028.jpg filter=lfs diff=lfs merge=lfs -text
213
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0029.jpg filter=lfs diff=lfs merge=lfs -text
214
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0030.jpg filter=lfs diff=lfs merge=lfs -text
215
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0031.jpg filter=lfs diff=lfs merge=lfs -text
216
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0032.jpg filter=lfs diff=lfs merge=lfs -text
217
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0033.jpg filter=lfs diff=lfs merge=lfs -text
218
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0034.jpg filter=lfs diff=lfs merge=lfs -text
219
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0035.jpg filter=lfs diff=lfs merge=lfs -text
220
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0036.jpg filter=lfs diff=lfs merge=lfs -text
221
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0037.jpg filter=lfs diff=lfs merge=lfs -text
222
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0038.jpg filter=lfs diff=lfs merge=lfs -text
223
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0039.jpg filter=lfs diff=lfs merge=lfs -text
224
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0040.jpg filter=lfs diff=lfs merge=lfs -text
225
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0041.jpg filter=lfs diff=lfs merge=lfs -text
226
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0042.jpg filter=lfs diff=lfs merge=lfs -text
227
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0043.jpg filter=lfs diff=lfs merge=lfs -text
228
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0044.jpg filter=lfs diff=lfs merge=lfs -text
229
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0045.jpg filter=lfs diff=lfs merge=lfs -text
230
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0046.jpg filter=lfs diff=lfs merge=lfs -text
231
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0047.jpg filter=lfs diff=lfs merge=lfs -text
232
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0048.jpg filter=lfs diff=lfs merge=lfs -text
233
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0049.jpg filter=lfs diff=lfs merge=lfs -text
234
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0050.jpg filter=lfs diff=lfs merge=lfs -text
235
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0051.jpg filter=lfs diff=lfs merge=lfs -text
236
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0052.jpg filter=lfs diff=lfs merge=lfs -text
237
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0053.jpg filter=lfs diff=lfs merge=lfs -text
238
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0054.jpg filter=lfs diff=lfs merge=lfs -text
239
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0055.jpg filter=lfs diff=lfs merge=lfs -text
240
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0056.jpg filter=lfs diff=lfs merge=lfs -text
241
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0057.jpg filter=lfs diff=lfs merge=lfs -text
242
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0058.jpg filter=lfs diff=lfs merge=lfs -text
243
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0059.jpg filter=lfs diff=lfs merge=lfs -text
244
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0060.jpg filter=lfs diff=lfs merge=lfs -text
245
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0061.jpg filter=lfs diff=lfs merge=lfs -text
246
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0062.jpg filter=lfs diff=lfs merge=lfs -text
247
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0063.jpg filter=lfs diff=lfs merge=lfs -text
248
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0064.jpg filter=lfs diff=lfs merge=lfs -text
249
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0065.jpg filter=lfs diff=lfs merge=lfs -text
250
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0066.jpg filter=lfs diff=lfs merge=lfs -text
251
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0067.jpg filter=lfs diff=lfs merge=lfs -text
252
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0068.jpg filter=lfs diff=lfs merge=lfs -text
253
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0069.jpg filter=lfs diff=lfs merge=lfs -text
254
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0070.jpg filter=lfs diff=lfs merge=lfs -text
255
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0071.jpg filter=lfs diff=lfs merge=lfs -text
256
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0072.jpg filter=lfs diff=lfs merge=lfs -text
257
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0073.jpg filter=lfs diff=lfs merge=lfs -text
258
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0074.jpg filter=lfs diff=lfs merge=lfs -text
259
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0075.jpg filter=lfs diff=lfs merge=lfs -text
260
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0076.jpg filter=lfs diff=lfs merge=lfs -text
261
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0077.jpg filter=lfs diff=lfs merge=lfs -text
262
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0078.jpg filter=lfs diff=lfs merge=lfs -text
263
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0079.jpg filter=lfs diff=lfs merge=lfs -text
264
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0080.jpg filter=lfs diff=lfs merge=lfs -text
265
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0081.jpg filter=lfs diff=lfs merge=lfs -text
266
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0082.jpg filter=lfs diff=lfs merge=lfs -text
267
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0083.jpg filter=lfs diff=lfs merge=lfs -text
268
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0084.jpg filter=lfs diff=lfs merge=lfs -text
269
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0085.jpg filter=lfs diff=lfs merge=lfs -text
270
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0086.jpg filter=lfs diff=lfs merge=lfs -text
271
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0087.jpg filter=lfs diff=lfs merge=lfs -text
272
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0088.jpg filter=lfs diff=lfs merge=lfs -text
273
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0089.jpg filter=lfs diff=lfs merge=lfs -text
274
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0090.jpg filter=lfs diff=lfs merge=lfs -text
275
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0091.jpg filter=lfs diff=lfs merge=lfs -text
276
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0092.jpg filter=lfs diff=lfs merge=lfs -text
277
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0093.jpg filter=lfs diff=lfs merge=lfs -text
278
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0094.jpg filter=lfs diff=lfs merge=lfs -text
279
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0095.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0096.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0097.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0098.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0099.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0100.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0101.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0102.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0103.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0104.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0105.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0106.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0107.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0108.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0109.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0110.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0111.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0112.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0113.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0114.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0115.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7/frame_0116.jpg filter=lfs diff=lfs merge=lfs -text
+ xywang/infer/temp_videos/98f6980c-b652-4f5f-afef-18c971d62ac7.mp4 filter=lfs diff=lfs merge=lfs -text
302
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0000.jpg filter=lfs diff=lfs merge=lfs -text
303
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0001.jpg filter=lfs diff=lfs merge=lfs -text
304
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0002.jpg filter=lfs diff=lfs merge=lfs -text
305
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0003.jpg filter=lfs diff=lfs merge=lfs -text
306
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0004.jpg filter=lfs diff=lfs merge=lfs -text
307
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0005.jpg filter=lfs diff=lfs merge=lfs -text
308
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0006.jpg filter=lfs diff=lfs merge=lfs -text
309
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0007.jpg filter=lfs diff=lfs merge=lfs -text
310
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0008.jpg filter=lfs diff=lfs merge=lfs -text
311
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0009.jpg filter=lfs diff=lfs merge=lfs -text
312
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0010.jpg filter=lfs diff=lfs merge=lfs -text
313
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0011.jpg filter=lfs diff=lfs merge=lfs -text
314
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0012.jpg filter=lfs diff=lfs merge=lfs -text
315
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0013.jpg filter=lfs diff=lfs merge=lfs -text
316
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0014.jpg filter=lfs diff=lfs merge=lfs -text
317
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0015.jpg filter=lfs diff=lfs merge=lfs -text
318
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0016.jpg filter=lfs diff=lfs merge=lfs -text
319
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0017.jpg filter=lfs diff=lfs merge=lfs -text
320
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0018.jpg filter=lfs diff=lfs merge=lfs -text
321
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0019.jpg filter=lfs diff=lfs merge=lfs -text
322
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0020.jpg filter=lfs diff=lfs merge=lfs -text
323
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0021.jpg filter=lfs diff=lfs merge=lfs -text
324
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0022.jpg filter=lfs diff=lfs merge=lfs -text
325
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0023.jpg filter=lfs diff=lfs merge=lfs -text
326
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0024.jpg filter=lfs diff=lfs merge=lfs -text
327
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0025.jpg filter=lfs diff=lfs merge=lfs -text
328
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0026.jpg filter=lfs diff=lfs merge=lfs -text
329
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0027.jpg filter=lfs diff=lfs merge=lfs -text
330
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0028.jpg filter=lfs diff=lfs merge=lfs -text
331
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0029.jpg filter=lfs diff=lfs merge=lfs -text
332
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0030.jpg filter=lfs diff=lfs merge=lfs -text
333
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0031.jpg filter=lfs diff=lfs merge=lfs -text
334
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0032.jpg filter=lfs diff=lfs merge=lfs -text
335
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0033.jpg filter=lfs diff=lfs merge=lfs -text
336
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0034.jpg filter=lfs diff=lfs merge=lfs -text
337
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0035.jpg filter=lfs diff=lfs merge=lfs -text
338
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0036.jpg filter=lfs diff=lfs merge=lfs -text
339
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0037.jpg filter=lfs diff=lfs merge=lfs -text
340
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0038.jpg filter=lfs diff=lfs merge=lfs -text
341
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0039.jpg filter=lfs diff=lfs merge=lfs -text
342
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0040.jpg filter=lfs diff=lfs merge=lfs -text
343
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0041.jpg filter=lfs diff=lfs merge=lfs -text
344
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0042.jpg filter=lfs diff=lfs merge=lfs -text
345
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0043.jpg filter=lfs diff=lfs merge=lfs -text
346
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0044.jpg filter=lfs diff=lfs merge=lfs -text
347
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0045.jpg filter=lfs diff=lfs merge=lfs -text
348
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0046.jpg filter=lfs diff=lfs merge=lfs -text
349
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0047.jpg filter=lfs diff=lfs merge=lfs -text
350
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0048.jpg filter=lfs diff=lfs merge=lfs -text
351
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0049.jpg filter=lfs diff=lfs merge=lfs -text
352
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0050.jpg filter=lfs diff=lfs merge=lfs -text
353
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0051.jpg filter=lfs diff=lfs merge=lfs -text
354
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0052.jpg filter=lfs diff=lfs merge=lfs -text
355
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0053.jpg filter=lfs diff=lfs merge=lfs -text
356
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0054.jpg filter=lfs diff=lfs merge=lfs -text
357
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0055.jpg filter=lfs diff=lfs merge=lfs -text
358
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0056.jpg filter=lfs diff=lfs merge=lfs -text
359
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0057.jpg filter=lfs diff=lfs merge=lfs -text
360
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0058.jpg filter=lfs diff=lfs merge=lfs -text
361
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0059.jpg filter=lfs diff=lfs merge=lfs -text
362
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0060.jpg filter=lfs diff=lfs merge=lfs -text
363
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0061.jpg filter=lfs diff=lfs merge=lfs -text
364
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0062.jpg filter=lfs diff=lfs merge=lfs -text
365
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0063.jpg filter=lfs diff=lfs merge=lfs -text
366
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0064.jpg filter=lfs diff=lfs merge=lfs -text
367
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0065.jpg filter=lfs diff=lfs merge=lfs -text
368
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0066.jpg filter=lfs diff=lfs merge=lfs -text
369
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0067.jpg filter=lfs diff=lfs merge=lfs -text
370
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0068.jpg filter=lfs diff=lfs merge=lfs -text
371
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0069.jpg filter=lfs diff=lfs merge=lfs -text
372
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0070.jpg filter=lfs diff=lfs merge=lfs -text
373
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0071.jpg filter=lfs diff=lfs merge=lfs -text
374
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0072.jpg filter=lfs diff=lfs merge=lfs -text
375
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0073.jpg filter=lfs diff=lfs merge=lfs -text
376
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0074.jpg filter=lfs diff=lfs merge=lfs -text
377
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0075.jpg filter=lfs diff=lfs merge=lfs -text
378
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0076.jpg filter=lfs diff=lfs merge=lfs -text
379
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0077.jpg filter=lfs diff=lfs merge=lfs -text
380
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0078.jpg filter=lfs diff=lfs merge=lfs -text
381
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0079.jpg filter=lfs diff=lfs merge=lfs -text
382
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0080.jpg filter=lfs diff=lfs merge=lfs -text
383
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0081.jpg filter=lfs diff=lfs merge=lfs -text
384
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0082.jpg filter=lfs diff=lfs merge=lfs -text
385
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0083.jpg filter=lfs diff=lfs merge=lfs -text
386
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0084.jpg filter=lfs diff=lfs merge=lfs -text
387
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0085.jpg filter=lfs diff=lfs merge=lfs -text
388
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0086.jpg filter=lfs diff=lfs merge=lfs -text
389
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0087.jpg filter=lfs diff=lfs merge=lfs -text
390
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0088.jpg filter=lfs diff=lfs merge=lfs -text
391
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0089.jpg filter=lfs diff=lfs merge=lfs -text
392
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0090.jpg filter=lfs diff=lfs merge=lfs -text
393
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0091.jpg filter=lfs diff=lfs merge=lfs -text
394
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0092.jpg filter=lfs diff=lfs merge=lfs -text
395
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0093.jpg filter=lfs diff=lfs merge=lfs -text
396
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0094.jpg filter=lfs diff=lfs merge=lfs -text
397
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0095.jpg filter=lfs diff=lfs merge=lfs -text
398
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0096.jpg filter=lfs diff=lfs merge=lfs -text
399
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0097.jpg filter=lfs diff=lfs merge=lfs -text
400
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0098.jpg filter=lfs diff=lfs merge=lfs -text
401
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0099.jpg filter=lfs diff=lfs merge=lfs -text
402
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0100.jpg filter=lfs diff=lfs merge=lfs -text
403
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0101.jpg filter=lfs diff=lfs merge=lfs -text
404
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0102.jpg filter=lfs diff=lfs merge=lfs -text
405
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0103.jpg filter=lfs diff=lfs merge=lfs -text
406
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0104.jpg filter=lfs diff=lfs merge=lfs -text
407
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0105.jpg filter=lfs diff=lfs merge=lfs -text
408
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0106.jpg filter=lfs diff=lfs merge=lfs -text
409
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0107.jpg filter=lfs diff=lfs merge=lfs -text
410
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0108.jpg filter=lfs diff=lfs merge=lfs -text
411
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0109.jpg filter=lfs diff=lfs merge=lfs -text
412
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0110.jpg filter=lfs diff=lfs merge=lfs -text
413
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0111.jpg filter=lfs diff=lfs merge=lfs -text
414
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0112.jpg filter=lfs diff=lfs merge=lfs -text
415
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0113.jpg filter=lfs diff=lfs merge=lfs -text
416
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0114.jpg filter=lfs diff=lfs merge=lfs -text
417
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0115.jpg filter=lfs diff=lfs merge=lfs -text
418
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811/frame_0116.jpg filter=lfs diff=lfs merge=lfs -text
419
+ xywang/infer/temp_videos/fd1d9099-d6d6-4b0d-b9d1-4bddd67d9811.mp4 filter=lfs diff=lfs merge=lfs -text
+ xywang/test_videos/cosplay.mp4 filter=lfs diff=lfs merge=lfs -text
+ xywang/test_videos/duoduo.mp4 filter=lfs diff=lfs merge=lfs -text
+ xywang/test_videos/fireworks.mp4 filter=lfs diff=lfs merge=lfs -text
+ xywang/test_videos/interview.mp4 filter=lfs diff=lfs merge=lfs -text
+ xywang/test_videos/moon.mp4 filter=lfs diff=lfs merge=lfs -text
+ xywang/test_videos/park.mp4 filter=lfs diff=lfs merge=lfs -text
API_Transformers/__pycache__/video_processor.cpython-311.pyc ADDED
Binary file (5.69 kB).
API_Transformers/cal.py ADDED
@@ -0,0 +1,18 @@
+ import json
+
+ metric = {
+     "tokens_per_second": [],
+     "peak_gpu_memory_mb": [],
+     "num_generated_tokens": [],
+     "inference_time": [],
+     "cpu_usage": [],
+ }
+ with open("/mnt/data/xiuying/Code/local_deploy/outputs/mini/mini_60s.json") as f:
+     results = json.load(f)  # one entry of metrics per processed video
+ for value in results.values():
+     for name in metric:
+         metric[name].append(value[name])
+
+ # Print the mean of each metric across all entries.
+ for key, value in metric.items():
+     print(key, sum(value) / len(value))
API_Transformers/delete.py ADDED
@@ -0,0 +1,10 @@
+ import os
+
+ files = ["/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_017.mp4",
+          "/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_018.mp4",
+          "/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_019.mp4",
+          "/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_020.mp4"]
+
+ for file in files:
+     os.remove(file)
+     print(f"Deleted {file}")
API_Transformers/infer.py ADDED
@@ -0,0 +1,131 @@
+ import os
+ import uuid
+ import time
+ import psutil
+ import uvicorn
+ import torch
+ import cv2
+ import shutil
+ from fastapi import FastAPI, File, UploadFile, Form, HTTPException
+ from fastapi.responses import JSONResponse
+ from models.qwen import Qwen2VL
+ from models.gemma import Gemma
+ from models.minicpm import MiniCPM
+ from models.lfm import LFM2
+ from video_processor import extract_frames, FrameSamplingMethod
+ import argparse
+ import json
+ import logging
+
+ parser = argparse.ArgumentParser()
+ parser.add_argument("--model_path", type=str, default="Qwen/Qwen2.5-VL-3B-Instruct-AWQ")
+ args = parser.parse_args()
+
+ # --- Logging and temporary directory configuration ---
+ LOG_DIR = f"logs/{args.model_path.split('/')[-1]}"
+ OUTPUT_DIR = f"outputs/{args.model_path.split('/')[-1]}"
+ TEMP_VIDEO_DIR = "temp_videos"
+ os.makedirs(LOG_DIR, exist_ok=True)
+ os.makedirs(OUTPUT_DIR, exist_ok=True)
+ os.makedirs(TEMP_VIDEO_DIR, exist_ok=True)
+ start_time = time.strftime('%Y%m%d_%H%M%S')
+ log_filename = f"{LOG_DIR}/{start_time}.log"
+ logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s', datefmt='%Y-%m-%d %H:%M:%S', filename=log_filename, filemode='a')
+
+ # --- FastAPI application initialization ---
+ app = FastAPI(title=f"{args.model_path} Video Inference Service")
+ total_output = {}
+ # --- Load the model and processor ---
+ logging.info(f"Loading model: {args.model_path}")
+ model_load_start = time.time()
+ if "qwen" in args.model_path.lower():
+     model = Qwen2VL(args.model_path)
+ elif "gemma" in args.model_path.lower():
+     model = Gemma(args.model_path)
+ elif "minicpm" in args.model_path.lower():
+     model = MiniCPM(args.model_path)
+ elif "lfm" in args.model_path.lower():
+     model = LFM2(args.model_path)
+ else:
+     raise ValueError(f"Unsupported model path: {args.model_path}")
+ model_load_end = time.time()
+ GPU_MEMORY_USAGE = f"{torch.cuda.memory_allocated(0)/1024**2:.2f} MB" if torch.cuda.is_available() else "N/A"
+ logging.info(f"Model loaded in {model_load_end - model_load_start:.2f} seconds")
+ logging.info(f"GPU Memory Usage after model load: {GPU_MEMORY_USAGE}")
+
+ @app.post("/video-inference/")
+ async def video_inference(
+     prompt: str = Form(...),
+     video_file: str = Form(...),
+     sampling_method: FrameSamplingMethod = Form(FrameSamplingMethod.CONTENT_AWARE),
+     sampling_rate: int = Form(5),
+ ):
+     """
+     Receive a video and a text prompt, run inference, and return the result.
+     """
+     request_start_time = time.time()
+     request_id = str(uuid.uuid4())
+     logging.info(f"[{request_id}] Received new video inference request. Prompt: '{prompt}', Video: '{video_file}'")
+
+     if not video_file.endswith(".mp4"):
+         logging.error(f"[{request_id}] Uploaded file '{video_file}' is not a video.")
+         raise HTTPException(status_code=400, detail="Uploaded file is not a video.")
+
+     file_extension = os.path.splitext(video_file)[1]
+     temp_video_path = os.path.join(TEMP_VIDEO_DIR, f"{request_id}{file_extension}")
+     temp_frame_dir = os.path.join(TEMP_VIDEO_DIR, request_id)
+     os.makedirs(temp_frame_dir, exist_ok=True)
+
+     try:
+         # Frames are extracted directly from the path given in `video_file`.
+         logging.info(f"[{request_id}] Video saved to temporary file: {temp_video_path}")
+         logging.info(f"[{request_id}] Extracting frames using method: {sampling_method.value}, rate/threshold: {sampling_rate}")
+
+         frames = extract_frames(video_file, sampling_method, sampling_rate)
+         if not frames:
+             logging.error(f"[{request_id}] Could not extract any frames from the video: {temp_video_path}")
+             raise HTTPException(status_code=400, detail="Could not extract any frames from the video.")
+
+         logging.info(f"[{request_id}] Extracted {len(frames)} frames successfully. Saving to temporary files...")
+
+         # Save the frames to temporary files and collect their paths
+         frame_paths = []
+         for i, frame in enumerate(frames):
+             frame_path = os.path.join(temp_frame_dir, f"frame_{i:04d}.jpg")
+             cv2.imwrite(frame_path, frame)
+             abs_frame_path = os.path.abspath(frame_path)
+             frame_paths.append(abs_frame_path)
+
+         logging.info(f"[{request_id}] {len(frame_paths)} frames saved to {temp_frame_dir}")
+
+         output = model.generate(frame_paths, prompt)
+
+         logging.info(f"Tokens per second: {output['tokens_per_second']}, Peak GPU memory MB: {output['peak_gpu_memory_mb']}")
+
+         inference_end_time = time.time()
+         cpu_usage = psutil.cpu_percent(interval=None)
+         cpu_core_utilization = psutil.cpu_percent(interval=None, percpu=True)
+         logging.info(f"[{request_id}] Inference time: {inference_end_time - request_start_time:.2f} seconds, CPU usage: {cpu_usage}%, CPU core utilization: {cpu_core_utilization}")
+         output["inference_time"] = inference_end_time - request_start_time
+         output["cpu_usage"] = cpu_usage
+         output["cpu_core_utilization"] = cpu_core_utilization
+
+         return JSONResponse(content=output)
+
+     except HTTPException:
+         # Let deliberate HTTP errors (such as the 400 above) pass through unchanged.
+         raise
+     except Exception as e:
+         logging.error(f"[{request_id}] An error occurred during processing: {str(e)}", exc_info=True)
+         raise HTTPException(status_code=500, detail=f"An error occurred during processing: {str(e)}")
+     finally:
+         if os.path.exists(temp_video_path):
+             os.remove(temp_video_path)
+             logging.info(f"[{request_id}] Cleaned up temporary file: {temp_video_path}")
+         if os.path.exists(temp_frame_dir):
+             shutil.rmtree(temp_frame_dir)
+             logging.info(f"[{request_id}] Cleaned up temporary frame directory: {temp_frame_dir}")
+
+
+ if __name__ == "__main__":
+     uvicorn.run(app, host="0.0.0.0", port=8010)
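A client for the endpoint defined in infer.py could look roughly like the sketch below. It relies only on what the file shows: the `/video-inference/` route, port 8010, and the four form fields. The helper names `build_inference_request` and `run_inference`, the local base URL, the timeout, and the defaults `"uniform"` and `30` (values that appear in the logs) are assumptions for illustration. Note that `video_file` is a plain string form field, so the client sends a path that the server can read rather than uploading the video bytes.

```python
import json
import urllib.parse
import urllib.request


def build_inference_request(prompt, video_file, sampling_method="uniform",
                            sampling_rate=30, base_url="http://127.0.0.1:8010"):
    """Build a form-encoded POST for the /video-inference/ endpoint."""
    fields = {
        "prompt": prompt,
        "video_file": video_file,  # a path readable by the server, not an upload
        "sampling_method": sampling_method,
        "sampling_rate": str(sampling_rate),
    }
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/video-inference/",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def run_inference(request, timeout=600):
    """Send the request and decode the JSON body returned by the service."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_inference_request("Please describe the video.", "messi_part_001.mp4")
    print(req.full_url)
    # run_inference(req) only works while the service from infer.py is running.
```

Building the request separately from sending it keeps the payload easy to inspect without a running server.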
API_Transformers/logs/LFM2-VL-1.6B/20250818_232556.log ADDED
@@ -0,0 +1,14 @@
+ 2025-08-18 23:25:56 - INFO - Loading model: LiquidAI/LFM2-VL-1.6B
+ 2025-08-18 23:25:58 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
+ 2025-08-18 23:26:26 - INFO - Model loaded in 29.54 seconds
+ 2025-08-18 23:26:26 - INFO - GPU Memory Usage after model load: 3023.64 MB
+ 2025-08-18 23:28:45 - INFO - [2d0a4e6b-87a3-4f80-9d2e-24b74787acdb] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
+ 2025-08-18 23:28:45 - INFO - [2d0a4e6b-87a3-4f80-9d2e-24b74787acdb] Video saved to temporary file: temp_videos/2d0a4e6b-87a3-4f80-9d2e-24b74787acdb.mp4
+ 2025-08-18 23:28:45 - INFO - [2d0a4e6b-87a3-4f80-9d2e-24b74787acdb] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-18 23:28:48 - INFO - [2d0a4e6b-87a3-4f80-9d2e-24b74787acdb] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-18 23:28:48 - INFO - [2d0a4e6b-87a3-4f80-9d2e-24b74787acdb] 30 frames saved to temp_videos/2d0a4e6b-87a3-4f80-9d2e-24b74787acdb
+ 2025-08-18 23:28:48 - INFO - Prompt token length: 783
+ 2025-08-18 23:28:54 - INFO - Tokens per second: 28.289322629768442, Peak GPU memory MB: 4206.375
+ 2025-08-18 23:28:54 - INFO - [2d0a4e6b-87a3-4f80-9d2e-24b74787acdb] Inference time: 8.48 seconds, CPU usage: 22.5%, CPU core utilization: [21.5, 23.9, 21.6, 23.0]
+ 2025-08-18 23:28:54 - INFO - [2d0a4e6b-87a3-4f80-9d2e-24b74787acdb] Cleaned up temporary file: temp_videos/2d0a4e6b-87a3-4f80-9d2e-24b74787acdb.mp4
+ 2025-08-18 23:28:54 - INFO - [2d0a4e6b-87a3-4f80-9d2e-24b74787acdb] Cleaned up temporary frame directory: temp_videos/2d0a4e6b-87a3-4f80-9d2e-24b74787acdb
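The throughput and timing figures in a log like this one can be pulled out with a short script, sketched below. The two message formats are taken from the log lines above; the function name `parse_metrics` and the record layout are hypothetical. Because the "Tokens per second" line carries no request id, the sketch pairs it with the next "Inference time" line by order, which is only safe when requests are handled one at a time.

```python
import re

# Patterns follow the two message formats written by infer.py.
TPS_RE = re.compile(r"Tokens per second: ([\d.]+), Peak GPU memory MB: ([\d.]+)")
TIME_RE = re.compile(r"Inference time: ([\d.]+) seconds, CPU usage: ([\d.]+)%")


def parse_metrics(lines):
    """Return one dict of metrics per completed request found in the log lines."""
    records = []
    current = {}
    for line in lines:
        match = TPS_RE.search(line)
        if match:
            current = {
                "tokens_per_second": float(match.group(1)),
                "peak_gpu_memory_mb": float(match.group(2)),
            }
            continue
        match = TIME_RE.search(line)
        if match:
            current["inference_time"] = float(match.group(1))
            current["cpu_usage"] = float(match.group(2))
            records.append(current)
            current = {}
    return records


if __name__ == "__main__":
    sample = [
        "2025-08-18 23:28:54 - INFO - Tokens per second: 28.289322629768442, Peak GPU memory MB: 4206.375",
        "2025-08-18 23:28:54 - INFO - [2d0a4e6b-87a3-4f80-9d2e-24b74787acdb] Inference time: 8.48 seconds, "
        "CPU usage: 22.5%, CPU core utilization: [21.5, 23.9, 21.6, 23.0]",
    ]
    for record in parse_metrics(sample):
        print(record)
```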
API_Transformers/logs/LFM2-VL-1.6B/20250818_233101.log ADDED
@@ -0,0 +1,28 @@
+ 2025-08-18 23:31:01 - INFO - Loading model: LiquidAI/LFM2-VL-1.6B
+ 2025-08-18 23:31:02 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
+ 2025-08-18 23:31:09 - INFO - Model loaded in 7.34 seconds
+ 2025-08-18 23:31:09 - INFO - GPU Memory Usage after model load: 3023.64 MB
+ 2025-08-18 23:31:56 - INFO - [653718a5-3216-4691-b041-10603df59fa5] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
+ 2025-08-18 23:31:56 - INFO - [653718a5-3216-4691-b041-10603df59fa5] Video saved to temporary file: temp_videos/653718a5-3216-4691-b041-10603df59fa5.mp4
+ 2025-08-18 23:31:56 - INFO - [653718a5-3216-4691-b041-10603df59fa5] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-18 23:32:01 - INFO - [653718a5-3216-4691-b041-10603df59fa5] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-18 23:32:01 - INFO - [653718a5-3216-4691-b041-10603df59fa5] 30 frames saved to temp_videos/653718a5-3216-4691-b041-10603df59fa5
+ 2025-08-18 23:32:01 - ERROR - [653718a5-3216-4691-b041-10603df59fa5] An error occurred during processing: Incorrect format used for image. Should be an url linking to an image, a base64 string, a local path, or a PIL image.
+ Traceback (most recent call last):
+   File "/mnt/data/xiuying/Code/local_deploy/infer.py", line 107, in video_inference
+     output = model.generate(frame_paths, prompt)
+              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+   File "/mnt/data/xiuying/Code/local_deploy/models/lfm.py", line 52, in generate
+     inputs = self.processor.apply_chat_template(
+              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+   File "/home/xiuying/miniconda3/envs/gptq/lib/python3.11/site-packages/transformers/utils/deprecation.py", line 172, in wrapped_func
+     return func(*args, **kwargs)
+            ^^^^^^^^^^^^^^^^^^^^^
+   File "/home/xiuying/miniconda3/envs/gptq/lib/python3.11/site-packages/transformers/processing_utils.py", line 1552, in apply_chat_template
+     images.append(load_image(fname))
+                   ^^^^^^^^^^^^^^^^^
+   File "/home/xiuying/miniconda3/envs/gptq/lib/python3.11/site-packages/transformers/image_utils.py", line 493, in load_image
+     raise TypeError(
+ TypeError: Incorrect format used for image. Should be an url linking to an image, a base64 string, a local path, or a PIL image.
+ 2025-08-18 23:32:01 - INFO - [653718a5-3216-4691-b041-10603df59fa5] Cleaned up temporary file: temp_videos/653718a5-3216-4691-b041-10603df59fa5.mp4
+ 2025-08-18 23:32:01 - INFO - [653718a5-3216-4691-b041-10603df59fa5] Cleaned up temporary frame directory: temp_videos/653718a5-3216-4691-b041-10603df59fa5
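The TypeError in this log is raised while the processor loads the images, and its message lists the accepted forms: a URL, a base64 string, a local path, or a PIL image. The log does not show which value triggered it. One plausible guard, sketched below, is a pre-flight check that fails earlier and names the offending item. The helper `check_frame_paths` is hypothetical, and it assumes that the caller intends to pass only string paths, as infer.py does.

```python
def check_frame_paths(frame_paths):
    """Raise a descriptive TypeError if any item is not a string path."""
    # Collect (index, type name) for every item that is not a str.
    bad = [
        (index, type(item).__name__)
        for index, item in enumerate(frame_paths)
        if not isinstance(item, str)
    ]
    if bad:
        raise TypeError(f"expected str paths, got (index, type): {bad}")
    return list(frame_paths)


if __name__ == "__main__":
    print(check_frame_paths(["/tmp/frame_0000.jpg", "/tmp/frame_0001.jpg"]))
    try:
        check_frame_paths(["/tmp/frame_0000.jpg", {"type": "image"}])
    except TypeError as error:
        print(f"rejected: {error}")
```

A check like this reports the position and type of the bad item, which the library's generic message does not.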
API_Transformers/logs/LFM2-VL-1.6B/20250818_233342.log ADDED
@@ -0,0 +1,28 @@
+ 2025-08-18 23:33:42 - INFO - Loading model: LiquidAI/LFM2-VL-1.6B
+ 2025-08-18 23:33:43 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
+ 2025-08-18 23:33:49 - INFO - Model loaded in 7.34 seconds
+ 2025-08-18 23:33:49 - INFO - GPU Memory Usage after model load: 3023.64 MB
+ 2025-08-18 23:33:54 - INFO - [c448d9a9-b14c-4a05-a6b6-67879d899cd1] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
+ 2025-08-18 23:33:54 - INFO - [c448d9a9-b14c-4a05-a6b6-67879d899cd1] Video saved to temporary file: temp_videos/c448d9a9-b14c-4a05-a6b6-67879d899cd1.mp4
+ 2025-08-18 23:33:54 - INFO - [c448d9a9-b14c-4a05-a6b6-67879d899cd1] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-18 23:33:57 - INFO - [c448d9a9-b14c-4a05-a6b6-67879d899cd1] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-18 23:33:57 - INFO - [c448d9a9-b14c-4a05-a6b6-67879d899cd1] 30 frames saved to temp_videos/c448d9a9-b14c-4a05-a6b6-67879d899cd1
+ 2025-08-18 23:33:57 - ERROR - [c448d9a9-b14c-4a05-a6b6-67879d899cd1] An error occurred during processing: Incorrect format used for image. Should be an url linking to an image, a base64 string, a local path, or a PIL image.
+ Traceback (most recent call last):
+   File "/mnt/data/xiuying/Code/local_deploy/infer.py", line 107, in video_inference
+     output = model.generate(frame_paths, prompt)
+              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+   File "/mnt/data/xiuying/Code/local_deploy/models/lfm.py", line 52, in generate
+     inputs = self.processor.apply_chat_template(
+              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+   File "/home/xiuying/miniconda3/envs/gptq/lib/python3.11/site-packages/transformers/utils/deprecation.py", line 172, in wrapped_func
+     return func(*args, **kwargs)
+            ^^^^^^^^^^^^^^^^^^^^^
+   File "/home/xiuying/miniconda3/envs/gptq/lib/python3.11/site-packages/transformers/processing_utils.py", line 1552, in apply_chat_template
+     images.append(load_image(fname))
+                   ^^^^^^^^^^^^^^^^^
+   File "/home/xiuying/miniconda3/envs/gptq/lib/python3.11/site-packages/transformers/image_utils.py", line 493, in load_image
+     raise TypeError(
+ TypeError: Incorrect format used for image. Should be an url linking to an image, a base64 string, a local path, or a PIL image.
+ 2025-08-18 23:33:57 - INFO - [c448d9a9-b14c-4a05-a6b6-67879d899cd1] Cleaned up temporary file: temp_videos/c448d9a9-b14c-4a05-a6b6-67879d899cd1.mp4
+ 2025-08-18 23:33:57 - INFO - [c448d9a9-b14c-4a05-a6b6-67879d899cd1] Cleaned up temporary frame directory: temp_videos/c448d9a9-b14c-4a05-a6b6-67879d899cd1
API_Transformers/logs/LFM2-VL-1.6B/20250818_233635.log ADDED
@@ -0,0 +1,10 @@
+ 2025-08-18 23:36:35 - INFO - Loading model: LiquidAI/LFM2-VL-1.6B
+ 2025-08-18 23:36:36 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
+ 2025-08-18 23:36:43 - INFO - Model loaded in 7.59 seconds
+ 2025-08-18 23:36:43 - INFO - GPU Memory Usage after model load: 3023.64 MB
+ 2025-08-18 23:36:48 - INFO - [d90238c8-2478-4c9d-bb69-a2535b9d8011] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
+ 2025-08-18 23:36:48 - INFO - [d90238c8-2478-4c9d-bb69-a2535b9d8011] Video saved to temporary file: temp_videos/d90238c8-2478-4c9d-bb69-a2535b9d8011.mp4
+ 2025-08-18 23:36:48 - INFO - [d90238c8-2478-4c9d-bb69-a2535b9d8011] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-18 23:36:53 - INFO - [d90238c8-2478-4c9d-bb69-a2535b9d8011] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-18 23:36:53 - INFO - [d90238c8-2478-4c9d-bb69-a2535b9d8011] 30 frames saved to temp_videos/d90238c8-2478-4c9d-bb69-a2535b9d8011
+ 2025-08-18 23:36:56 - INFO - Prompt token length: 23084
API_Transformers/logs/LFM2-VL-1.6B/20250818_234120.log ADDED
@@ -0,0 +1,34 @@
+ 2025-08-18 23:41:20 - INFO - Loading model: LiquidAI/LFM2-VL-1.6B
+ 2025-08-18 23:41:21 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
+ 2025-08-18 23:41:27 - INFO - Model loaded in 7.18 seconds
+ 2025-08-18 23:41:27 - INFO - GPU Memory Usage after model load: 3023.64 MB
+ 2025-08-18 23:41:34 - INFO - [5e9460f4-1fa1-4b3e-94b6-34725b6dc288] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
+ 2025-08-18 23:41:34 - INFO - [5e9460f4-1fa1-4b3e-94b6-34725b6dc288] Video saved to temporary file: temp_videos/5e9460f4-1fa1-4b3e-94b6-34725b6dc288.mp4
+ 2025-08-18 23:41:34 - INFO - [5e9460f4-1fa1-4b3e-94b6-34725b6dc288] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-18 23:41:38 - INFO - [5e9460f4-1fa1-4b3e-94b6-34725b6dc288] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-18 23:41:38 - INFO - [5e9460f4-1fa1-4b3e-94b6-34725b6dc288] 30 frames saved to temp_videos/5e9460f4-1fa1-4b3e-94b6-34725b6dc288
+ 2025-08-18 23:41:39 - INFO - Prompt token length: 3584
+ 2025-08-18 23:41:55 - INFO - Tokens per second: 4.9186034292897505, Peak GPU memory MB: 9376.375
+ 2025-08-18 23:41:55 - INFO - [5e9460f4-1fa1-4b3e-94b6-34725b6dc288] Inference time: 21.78 seconds, CPU usage: 65.6%, CPU core utilization: [62.8, 62.3, 67.5, 69.8]
+ 2025-08-18 23:41:55 - INFO - [5e9460f4-1fa1-4b3e-94b6-34725b6dc288] Cleaned up temporary file: temp_videos/5e9460f4-1fa1-4b3e-94b6-34725b6dc288.mp4
+ 2025-08-18 23:41:55 - INFO - [5e9460f4-1fa1-4b3e-94b6-34725b6dc288] Cleaned up temporary frame directory: temp_videos/5e9460f4-1fa1-4b3e-94b6-34725b6dc288
+ 2025-08-18 23:44:19 - INFO - [02d3e7ea-134b-413e-a2e6-d99b6e2130b1] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
+ 2025-08-18 23:44:19 - INFO - [02d3e7ea-134b-413e-a2e6-d99b6e2130b1] Video saved to temporary file: temp_videos/02d3e7ea-134b-413e-a2e6-d99b6e2130b1.mp4
+ 2025-08-18 23:44:19 - INFO - [02d3e7ea-134b-413e-a2e6-d99b6e2130b1] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-18 23:44:22 - INFO - [02d3e7ea-134b-413e-a2e6-d99b6e2130b1] Extracted 30 frames successfully. Saving to temporary files...
19
+ 2025-08-18 23:44:22 - INFO - [02d3e7ea-134b-413e-a2e6-d99b6e2130b1] 30 frames saved to temp_videos/02d3e7ea-134b-413e-a2e6-d99b6e2130b1
20
+ 2025-08-18 23:44:22 - INFO - Prompt token length: 3584
21
+ 2025-08-18 23:44:39 - INFO - Tokens per second: 4.978868742700231, Peak GPU memory MB: 9376.375
22
+ 2025-08-18 23:44:39 - INFO - [02d3e7ea-134b-413e-a2e6-d99b6e2130b1] Inference time: 20.31 seconds, CPU usage: 53.2%, CPU core utilization: [52.9, 53.5, 55.2, 51.1]
23
+ 2025-08-18 23:44:39 - INFO - [02d3e7ea-134b-413e-a2e6-d99b6e2130b1] Cleaned up temporary file: temp_videos/02d3e7ea-134b-413e-a2e6-d99b6e2130b1.mp4
24
+ 2025-08-18 23:44:39 - INFO - [02d3e7ea-134b-413e-a2e6-d99b6e2130b1] Cleaned up temporary frame directory: temp_videos/02d3e7ea-134b-413e-a2e6-d99b6e2130b1
25
+ 2025-08-18 23:45:04 - INFO - [7f8e9ce9-aad6-4a0e-8e5c-2a5a8931110b] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
26
+ 2025-08-18 23:45:04 - INFO - [7f8e9ce9-aad6-4a0e-8e5c-2a5a8931110b] Video saved to temporary file: temp_videos/7f8e9ce9-aad6-4a0e-8e5c-2a5a8931110b.mp4
27
+ 2025-08-18 23:45:04 - INFO - [7f8e9ce9-aad6-4a0e-8e5c-2a5a8931110b] Extracting frames using method: uniform, rate/threshold: 30
28
+ 2025-08-18 23:45:09 - INFO - [7f8e9ce9-aad6-4a0e-8e5c-2a5a8931110b] Extracted 30 frames successfully. Saving to temporary files...
29
+ 2025-08-18 23:45:09 - INFO - [7f8e9ce9-aad6-4a0e-8e5c-2a5a8931110b] 30 frames saved to temp_videos/7f8e9ce9-aad6-4a0e-8e5c-2a5a8931110b
30
+ 2025-08-18 23:45:09 - INFO - Prompt token length: 3584
31
+ 2025-08-18 23:45:26 - INFO - Tokens per second: 4.974079275567591, Peak GPU memory MB: 9376.375
32
+ 2025-08-18 23:45:26 - INFO - [7f8e9ce9-aad6-4a0e-8e5c-2a5a8931110b] Inference time: 21.92 seconds, CPU usage: 60.2%, CPU core utilization: [60.4, 56.6, 56.5, 67.3]
33
+ 2025-08-18 23:45:26 - INFO - [7f8e9ce9-aad6-4a0e-8e5c-2a5a8931110b] Cleaned up temporary file: temp_videos/7f8e9ce9-aad6-4a0e-8e5c-2a5a8931110b.mp4
34
+ 2025-08-18 23:45:26 - INFO - [7f8e9ce9-aad6-4a0e-8e5c-2a5a8931110b] Cleaned up temporary frame directory: temp_videos/7f8e9ce9-aad6-4a0e-8e5c-2a5a8931110b
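The per-request metrics in a log like the one above can be summarized with a short script. This is a minimal sketch, assuming the exact line wording seen in these logs (`Tokens per second: ..., Peak GPU memory MB: ...` and `[request-id] Inference time: ... seconds`). The function name `summarize_log` is hypothetical and is not part of the repository.

```python
import re
from statistics import mean

# Patterns assume the log wording shown in these files (an assumption,
# not a documented format). A leading "+ " from the diff view is harmless
# because re.search scans the whole line.
TPS_RE = re.compile(r"Tokens per second: ([\d.]+), Peak GPU memory MB: ([\d.]+)")
TIME_RE = re.compile(r"\[([0-9a-f-]{36})\] Inference time: ([\d.]+) seconds")


def summarize_log(lines):
    """Return request count, mean tokens/s, peak GPU memory and mean latency."""
    tps, peak, times = [], [], []
    for line in lines:
        m = TPS_RE.search(line)
        if m:
            tps.append(float(m.group(1)))
            peak.append(float(m.group(2)))
            continue
        m = TIME_RE.search(line)
        if m:
            times.append(float(m.group(2)))
    if not tps or not times:
        return None  # no completed request found in this log
    return {
        "requests": len(times),
        "mean_tokens_per_second": mean(tps),
        "peak_gpu_memory_mb": max(peak),
        "mean_inference_seconds": mean(times),
    }


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python summarize.py path/to/run.log
    with open(sys.argv[1], encoding="utf-8") as fh:
        print(summarize_log(fh))
```

Logs that stop before any `Inference time` line, such as runs that end right after `Prompt token length`, yield `None` rather than a partial summary.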
API_Transformers/logs/LFM2-VL-1.6B/20250818_234837.log ADDED
@@ -0,0 +1,14 @@
1
+ 2025-08-18 23:48:37 - INFO - Loading model: LiquidAI/LFM2-VL-1.6B
2
+ 2025-08-18 23:48:38 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-18 23:48:44 - INFO - Model loaded in 7.31 seconds
4
+ 2025-08-18 23:48:44 - INFO - GPU Memory Usage after model load: 3023.64 MB
5
+ 2025-08-18 23:49:06 - INFO - [20d53a50-ffe8-4d54-94e1-cd4a287c9be8] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
6
+ 2025-08-18 23:49:06 - INFO - [20d53a50-ffe8-4d54-94e1-cd4a287c9be8] Video saved to temporary file: temp_videos/20d53a50-ffe8-4d54-94e1-cd4a287c9be8.mp4
7
+ 2025-08-18 23:49:06 - INFO - [20d53a50-ffe8-4d54-94e1-cd4a287c9be8] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-18 23:49:09 - INFO - [20d53a50-ffe8-4d54-94e1-cd4a287c9be8] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-18 23:49:09 - INFO - [20d53a50-ffe8-4d54-94e1-cd4a287c9be8] 30 frames saved to temp_videos/20d53a50-ffe8-4d54-94e1-cd4a287c9be8
10
+ 2025-08-18 23:49:10 - INFO - Prompt token length: 3584
11
+ 2025-08-18 23:49:27 - INFO - Tokens per second: 34.94049134256706, Peak GPU memory MB: 9376.375
12
+ 2025-08-18 23:49:27 - INFO - [20d53a50-ffe8-4d54-94e1-cd4a287c9be8] Inference time: 20.83 seconds, CPU usage: 63.8%, CPU core utilization: [60.4, 62.1, 69.4, 63.2]
13
+ 2025-08-18 23:49:27 - INFO - [20d53a50-ffe8-4d54-94e1-cd4a287c9be8] Cleaned up temporary file: temp_videos/20d53a50-ffe8-4d54-94e1-cd4a287c9be8.mp4
14
+ 2025-08-18 23:49:27 - INFO - [20d53a50-ffe8-4d54-94e1-cd4a287c9be8] Cleaned up temporary frame directory: temp_videos/20d53a50-ffe8-4d54-94e1-cd4a287c9be8
API_Transformers/logs/LFM2-VL-1.6B/20250818_234946.log ADDED
The diff for this file is too large to render. See raw diff
 
API_Transformers/logs/LFM2-VL-1.6B/20250820_215936.log ADDED
@@ -0,0 +1,44 @@
1
+ 2025-08-20 21:59:36 - INFO - Loading model: LiquidAI/LFM2-VL-1.6B
2
+ 2025-08-20 21:59:37 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-20 22:00:07 - INFO - Model loaded in 31.10 seconds
4
+ 2025-08-20 22:00:07 - INFO - GPU Memory Usage after model load: 3023.64 MB
5
+ 2025-08-20 22:07:37 - INFO - [5d6ff8cd-5dd3-4c3f-bb14-6bb5a2d49aa0] Received new video inference request. Prompt: '视频里发生了什么?', Video: 'sample_part_001.mp4'
6
+ 2025-08-20 22:07:37 - INFO - [5d6ff8cd-5dd3-4c3f-bb14-6bb5a2d49aa0] Video saved to temporary file: temp_videos/5d6ff8cd-5dd3-4c3f-bb14-6bb5a2d49aa0.mp4
7
+ 2025-08-20 22:07:37 - INFO - [5d6ff8cd-5dd3-4c3f-bb14-6bb5a2d49aa0] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-20 22:07:43 - INFO - [5d6ff8cd-5dd3-4c3f-bb14-6bb5a2d49aa0] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-20 22:07:43 - INFO - [5d6ff8cd-5dd3-4c3f-bb14-6bb5a2d49aa0] 30 frames saved to temp_videos/5d6ff8cd-5dd3-4c3f-bb14-6bb5a2d49aa0
10
+ 2025-08-20 22:07:44 - INFO - Prompt token length: 3585
11
+ 2025-08-20 22:08:04 - INFO - Tokens per second: 40.26488449689874, Peak GPU memory MB: 9378.375
12
+ 2025-08-20 22:08:04 - INFO - [5d6ff8cd-5dd3-4c3f-bb14-6bb5a2d49aa0] Inference time: 27.25 seconds, CPU usage: 31.0%, CPU core utilization: [30.5, 30.3, 31.0, 32.3]
13
+ 2025-08-20 22:08:04 - INFO - [5d6ff8cd-5dd3-4c3f-bb14-6bb5a2d49aa0] Cleaned up temporary file: temp_videos/5d6ff8cd-5dd3-4c3f-bb14-6bb5a2d49aa0.mp4
14
+ 2025-08-20 22:08:04 - INFO - [5d6ff8cd-5dd3-4c3f-bb14-6bb5a2d49aa0] Cleaned up temporary frame directory: temp_videos/5d6ff8cd-5dd3-4c3f-bb14-6bb5a2d49aa0
15
+ 2025-08-20 22:08:04 - INFO - [7d334c4a-5ec5-47d1-9af6-5134d7017e7c] Received new video inference request. Prompt: '视频里发生了什么?', Video: 'sample_part_001.mp4'
16
+ 2025-08-20 22:08:04 - INFO - [7d334c4a-5ec5-47d1-9af6-5134d7017e7c] Video saved to temporary file: temp_videos/7d334c4a-5ec5-47d1-9af6-5134d7017e7c.mp4
17
+ 2025-08-20 22:08:04 - INFO - [7d334c4a-5ec5-47d1-9af6-5134d7017e7c] Extracting frames using method: uniform, rate/threshold: 30
18
+ 2025-08-20 22:08:12 - INFO - [7d334c4a-5ec5-47d1-9af6-5134d7017e7c] Extracted 30 frames successfully. Saving to temporary files...
19
+ 2025-08-20 22:08:12 - INFO - [7d334c4a-5ec5-47d1-9af6-5134d7017e7c] 30 frames saved to temp_videos/7d334c4a-5ec5-47d1-9af6-5134d7017e7c
20
+ 2025-08-20 22:08:13 - INFO - Prompt token length: 3585
21
+ 2025-08-20 22:08:32 - INFO - Tokens per second: 41.8782758123788, Peak GPU memory MB: 9378.375
22
+ 2025-08-20 22:08:32 - INFO - [7d334c4a-5ec5-47d1-9af6-5134d7017e7c] Inference time: 27.77 seconds, CPU usage: 76.0%, CPU core utilization: [65.2, 79.8, 85.9, 73.3]
23
+ 2025-08-20 22:08:32 - INFO - [7d334c4a-5ec5-47d1-9af6-5134d7017e7c] Cleaned up temporary file: temp_videos/7d334c4a-5ec5-47d1-9af6-5134d7017e7c.mp4
24
+ 2025-08-20 22:08:32 - INFO - [7d334c4a-5ec5-47d1-9af6-5134d7017e7c] Cleaned up temporary frame directory: temp_videos/7d334c4a-5ec5-47d1-9af6-5134d7017e7c
25
+ 2025-08-20 22:08:32 - INFO - [47246552-af71-4d3c-b034-bc4f74bcfaee] Received new video inference request. Prompt: '视频里发生了什么?', Video: 'sample_part_002.mp4'
26
+ 2025-08-20 22:08:32 - INFO - [47246552-af71-4d3c-b034-bc4f74bcfaee] Video saved to temporary file: temp_videos/47246552-af71-4d3c-b034-bc4f74bcfaee.mp4
27
+ 2025-08-20 22:08:32 - INFO - [47246552-af71-4d3c-b034-bc4f74bcfaee] Extracting frames using method: uniform, rate/threshold: 30
28
+ 2025-08-20 22:08:41 - INFO - [47246552-af71-4d3c-b034-bc4f74bcfaee] Extracted 30 frames successfully. Saving to temporary files...
29
+ 2025-08-20 22:08:41 - INFO - [47246552-af71-4d3c-b034-bc4f74bcfaee] 30 frames saved to temp_videos/47246552-af71-4d3c-b034-bc4f74bcfaee
30
+ 2025-08-20 22:08:41 - INFO - Prompt token length: 3585
31
+ 2025-08-20 22:08:58 - INFO - Tokens per second: 37.12922143875923, Peak GPU memory MB: 9378.375
32
+ 2025-08-20 22:08:58 - INFO - [47246552-af71-4d3c-b034-bc4f74bcfaee] Inference time: 26.39 seconds, CPU usage: 78.7%, CPU core utilization: [77.4, 73.7, 79.6, 83.9]
33
+ 2025-08-20 22:08:58 - INFO - [47246552-af71-4d3c-b034-bc4f74bcfaee] Cleaned up temporary file: temp_videos/47246552-af71-4d3c-b034-bc4f74bcfaee.mp4
34
+ 2025-08-20 22:08:58 - INFO - [47246552-af71-4d3c-b034-bc4f74bcfaee] Cleaned up temporary frame directory: temp_videos/47246552-af71-4d3c-b034-bc4f74bcfaee
35
+ 2025-08-20 22:08:59 - INFO - [cc126bf6-5c43-48b6-aa29-d38a216bc50a] Received new video inference request. Prompt: '视频里发生了什么?', Video: 'sample_part_002.mp4'
36
+ 2025-08-20 22:08:59 - INFO - [cc126bf6-5c43-48b6-aa29-d38a216bc50a] Video saved to temporary file: temp_videos/cc126bf6-5c43-48b6-aa29-d38a216bc50a.mp4
37
+ 2025-08-20 22:08:59 - INFO - [cc126bf6-5c43-48b6-aa29-d38a216bc50a] Extracting frames using method: uniform, rate/threshold: 30
38
+ 2025-08-20 22:09:05 - INFO - [cc126bf6-5c43-48b6-aa29-d38a216bc50a] Extracted 30 frames successfully. Saving to temporary files...
39
+ 2025-08-20 22:09:05 - INFO - [cc126bf6-5c43-48b6-aa29-d38a216bc50a] 30 frames saved to temp_videos/cc126bf6-5c43-48b6-aa29-d38a216bc50a
40
+ 2025-08-20 22:09:06 - INFO - Prompt token length: 3585
41
+ 2025-08-20 22:09:23 - INFO - Tokens per second: 42.08466552693894, Peak GPU memory MB: 9378.375
42
+ 2025-08-20 22:09:23 - INFO - [cc126bf6-5c43-48b6-aa29-d38a216bc50a] Inference time: 24.01 seconds, CPU usage: 79.0%, CPU core utilization: [77.0, 80.9, 81.3, 76.9]
43
+ 2025-08-20 22:09:23 - INFO - [cc126bf6-5c43-48b6-aa29-d38a216bc50a] Cleaned up temporary file: temp_videos/cc126bf6-5c43-48b6-aa29-d38a216bc50a.mp4
44
+ 2025-08-20 22:09:23 - INFO - [cc126bf6-5c43-48b6-aa29-d38a216bc50a] Cleaned up temporary frame directory: temp_videos/cc126bf6-5c43-48b6-aa29-d38a216bc50a
API_Transformers/logs/LFM2-VL-1.6B/20250820_220950.log ADDED
@@ -0,0 +1,54 @@
1
+ 2025-08-20 22:09:50 - INFO - Loading model: LiquidAI/LFM2-VL-1.6B
2
+ 2025-08-20 22:09:52 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-20 22:09:58 - INFO - Model loaded in 7.53 seconds
4
+ 2025-08-20 22:09:58 - INFO - GPU Memory Usage after model load: 3023.64 MB
5
+ 2025-08-20 22:10:07 - INFO - [afa49686-78ae-46eb-b2a4-8f75a70553bd] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: 'sample_part_001.mp4'
6
+ 2025-08-20 22:10:07 - INFO - [afa49686-78ae-46eb-b2a4-8f75a70553bd] Video saved to temporary file: temp_videos/afa49686-78ae-46eb-b2a4-8f75a70553bd.mp4
7
+ 2025-08-20 22:10:07 - INFO - [afa49686-78ae-46eb-b2a4-8f75a70553bd] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-20 22:10:15 - INFO - [afa49686-78ae-46eb-b2a4-8f75a70553bd] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-20 22:10:15 - INFO - [afa49686-78ae-46eb-b2a4-8f75a70553bd] 30 frames saved to temp_videos/afa49686-78ae-46eb-b2a4-8f75a70553bd
10
+ 2025-08-20 22:10:15 - INFO - Prompt token length: 3604
11
+ 2025-08-20 22:10:35 - INFO - Tokens per second: 41.32216588841384, Peak GPU memory MB: 9378.375
12
+ 2025-08-20 22:10:35 - INFO - [afa49686-78ae-46eb-b2a4-8f75a70553bd] Inference time: 28.69 seconds, CPU usage: 70.0%, CPU core utilization: [68.6, 67.0, 70.7, 73.7]
13
+ 2025-08-20 22:10:35 - INFO - [afa49686-78ae-46eb-b2a4-8f75a70553bd] Cleaned up temporary file: temp_videos/afa49686-78ae-46eb-b2a4-8f75a70553bd.mp4
14
+ 2025-08-20 22:10:35 - INFO - [afa49686-78ae-46eb-b2a4-8f75a70553bd] Cleaned up temporary frame directory: temp_videos/afa49686-78ae-46eb-b2a4-8f75a70553bd
15
+ 2025-08-20 22:10:35 - INFO - [3afaa144-dd25-428a-99bf-293426382d0b] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: 'sample_part_001.mp4'
16
+ 2025-08-20 22:10:36 - INFO - [3afaa144-dd25-428a-99bf-293426382d0b] Video saved to temporary file: temp_videos/3afaa144-dd25-428a-99bf-293426382d0b.mp4
17
+ 2025-08-20 22:10:36 - INFO - [3afaa144-dd25-428a-99bf-293426382d0b] Extracting frames using method: uniform, rate/threshold: 30
18
+ 2025-08-20 22:10:44 - INFO - [3afaa144-dd25-428a-99bf-293426382d0b] Extracted 30 frames successfully. Saving to temporary files...
19
+ 2025-08-20 22:10:44 - INFO - [3afaa144-dd25-428a-99bf-293426382d0b] 30 frames saved to temp_videos/3afaa144-dd25-428a-99bf-293426382d0b
20
+ 2025-08-20 22:10:44 - INFO - Prompt token length: 3604
21
+ 2025-08-20 22:11:04 - INFO - Tokens per second: 40.60897346605822, Peak GPU memory MB: 9378.375
22
+ 2025-08-20 22:11:04 - INFO - [3afaa144-dd25-428a-99bf-293426382d0b] Inference time: 28.89 seconds, CPU usage: 75.7%, CPU core utilization: [75.0, 65.5, 73.7, 88.6]
23
+ 2025-08-20 22:11:04 - INFO - [3afaa144-dd25-428a-99bf-293426382d0b] Cleaned up temporary file: temp_videos/3afaa144-dd25-428a-99bf-293426382d0b.mp4
24
+ 2025-08-20 22:11:04 - INFO - [3afaa144-dd25-428a-99bf-293426382d0b] Cleaned up temporary frame directory: temp_videos/3afaa144-dd25-428a-99bf-293426382d0b
25
+ 2025-08-20 22:11:05 - INFO - [5a1c46da-fc10-4080-a178-c2c775a7381d] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: 'sample_part_002.mp4'
26
+ 2025-08-20 22:11:05 - INFO - [5a1c46da-fc10-4080-a178-c2c775a7381d] Video saved to temporary file: temp_videos/5a1c46da-fc10-4080-a178-c2c775a7381d.mp4
27
+ 2025-08-20 22:11:05 - INFO - [5a1c46da-fc10-4080-a178-c2c775a7381d] Extracting frames using method: uniform, rate/threshold: 30
28
+ 2025-08-20 22:11:12 - INFO - [5a1c46da-fc10-4080-a178-c2c775a7381d] Extracted 30 frames successfully. Saving to temporary files...
29
+ 2025-08-20 22:11:12 - INFO - [5a1c46da-fc10-4080-a178-c2c775a7381d] 30 frames saved to temp_videos/5a1c46da-fc10-4080-a178-c2c775a7381d
30
+ 2025-08-20 22:11:13 - INFO - Prompt token length: 3604
31
+ 2025-08-20 22:11:35 - INFO - Tokens per second: 39.89175853584641, Peak GPU memory MB: 9378.375
32
+ 2025-08-20 22:11:35 - INFO - [5a1c46da-fc10-4080-a178-c2c775a7381d] Inference time: 30.34 seconds, CPU usage: 77.2%, CPU core utilization: [74.2, 76.3, 80.4, 77.9]
33
+ 2025-08-20 22:11:35 - INFO - [5a1c46da-fc10-4080-a178-c2c775a7381d] Cleaned up temporary file: temp_videos/5a1c46da-fc10-4080-a178-c2c775a7381d.mp4
34
+ 2025-08-20 22:11:35 - INFO - [5a1c46da-fc10-4080-a178-c2c775a7381d] Cleaned up temporary frame directory: temp_videos/5a1c46da-fc10-4080-a178-c2c775a7381d
35
+ 2025-08-20 22:11:35 - INFO - [edf5e6ee-7477-451f-8335-8dd2e4335d67] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: 'sample_part_002.mp4'
36
+ 2025-08-20 22:11:35 - INFO - [edf5e6ee-7477-451f-8335-8dd2e4335d67] Video saved to temporary file: temp_videos/edf5e6ee-7477-451f-8335-8dd2e4335d67.mp4
37
+ 2025-08-20 22:11:35 - INFO - [edf5e6ee-7477-451f-8335-8dd2e4335d67] Extracting frames using method: uniform, rate/threshold: 30
38
+ 2025-08-20 22:11:42 - INFO - [edf5e6ee-7477-451f-8335-8dd2e4335d67] Extracted 30 frames successfully. Saving to temporary files...
39
+ 2025-08-20 22:11:42 - INFO - [edf5e6ee-7477-451f-8335-8dd2e4335d67] 30 frames saved to temp_videos/edf5e6ee-7477-451f-8335-8dd2e4335d67
40
+ 2025-08-20 22:11:43 - INFO - Prompt token length: 3604
41
+ 2025-08-20 22:12:05 - INFO - Tokens per second: 39.00521947487598, Peak GPU memory MB: 9378.375
42
+ 2025-08-20 22:12:05 - INFO - [edf5e6ee-7477-451f-8335-8dd2e4335d67] Inference time: 30.32 seconds, CPU usage: 79.7%, CPU core utilization: [79.9, 67.1, 78.2, 93.3]
43
+ 2025-08-20 22:12:05 - INFO - [edf5e6ee-7477-451f-8335-8dd2e4335d67] Cleaned up temporary file: temp_videos/edf5e6ee-7477-451f-8335-8dd2e4335d67.mp4
44
+ 2025-08-20 22:12:05 - INFO - [edf5e6ee-7477-451f-8335-8dd2e4335d67] Cleaned up temporary frame directory: temp_videos/edf5e6ee-7477-451f-8335-8dd2e4335d67
45
+ 2025-08-20 22:12:06 - INFO - [0fddc63e-dbc6-4c80-8242-088580de8927] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: 'sample_part_003.mp4'
46
+ 2025-08-20 22:12:06 - INFO - [0fddc63e-dbc6-4c80-8242-088580de8927] Video saved to temporary file: temp_videos/0fddc63e-dbc6-4c80-8242-088580de8927.mp4
47
+ 2025-08-20 22:12:06 - INFO - [0fddc63e-dbc6-4c80-8242-088580de8927] Extracting frames using method: uniform, rate/threshold: 30
48
+ 2025-08-20 22:12:13 - INFO - [0fddc63e-dbc6-4c80-8242-088580de8927] Extracted 30 frames successfully. Saving to temporary files...
49
+ 2025-08-20 22:12:13 - INFO - [0fddc63e-dbc6-4c80-8242-088580de8927] 30 frames saved to temp_videos/0fddc63e-dbc6-4c80-8242-088580de8927
50
+ 2025-08-20 22:12:13 - INFO - Prompt token length: 3604
51
+ 2025-08-20 22:12:30 - INFO - Tokens per second: 41.222647353168156, Peak GPU memory MB: 9378.375
52
+ 2025-08-20 22:12:30 - INFO - [0fddc63e-dbc6-4c80-8242-088580de8927] Inference time: 24.71 seconds, CPU usage: 79.2%, CPU core utilization: [77.9, 74.8, 80.1, 83.8]
53
+ 2025-08-20 22:12:30 - INFO - [0fddc63e-dbc6-4c80-8242-088580de8927] Cleaned up temporary file: temp_videos/0fddc63e-dbc6-4c80-8242-088580de8927.mp4
54
+ 2025-08-20 22:12:30 - INFO - [0fddc63e-dbc6-4c80-8242-088580de8927] Cleaned up temporary frame directory: temp_videos/0fddc63e-dbc6-4c80-8242-088580de8927
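Each request above logs `Extracting frames using method: uniform, rate/threshold: 30` followed by `Extracted 30 frames`. The extraction code itself is not shown here, so the following is only a sketch of one common way to pick evenly spaced frames. It assumes the value 30 is the target number of frames; the helper name `uniform_frame_indices` is hypothetical.

```python
def uniform_frame_indices(total_frames: int, num_frames: int) -> list[int]:
    """Pick num_frames indices spread evenly across a clip of total_frames.

    Assumption: "uniform" means one frame from the middle of each of
    num_frames equal-length segments. The real implementation may differ.
    """
    if total_frames <= 0 or num_frames <= 0:
        return []
    if num_frames >= total_frames:
        # Clip is shorter than the request: return every frame.
        return list(range(total_frames))
    step = total_frames / num_frames
    return [int(step * i + step / 2) for i in range(num_frames)]
```

Under this assumption the number of sampled frames, and therefore the image part of the prompt, does not grow with clip length. That would be consistent with the constant `Prompt token length` logged for a given prompt across different clips.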
API_Transformers/logs/LFM2-VL-1.6B/20250820_221918.log ADDED
The diff for this file is too large to render. See raw diff
 
API_Transformers/logs/LFM2-VL-1.6B/20250820_231154.log ADDED
@@ -0,0 +1,4 @@
1
+ 2025-08-20 23:11:54 - INFO - Loading model: LiquidAI/LFM2-VL-1.6B
2
+ 2025-08-20 23:11:55 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-20 23:12:02 - INFO - Model loaded in 7.77 seconds
4
+ 2025-08-20 23:12:02 - INFO - GPU Memory Usage after model load: 3023.64 MB
API_Transformers/logs/LFM2-VL-1.6B/20250820_231714.log ADDED
@@ -0,0 +1,67 @@
1
+ 2025-08-20 23:17:14 - INFO - Loading model: LiquidAI/LFM2-VL-1.6B
2
+ 2025-08-20 23:17:16 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-20 23:17:22 - INFO - Model loaded in 7.52 seconds
4
+ 2025-08-20 23:17:22 - INFO - GPU Memory Usage after model load: 3023.64 MB
5
+ 2025-08-20 23:18:11 - INFO - [7395856c-f721-471f-85da-b8268191bd53] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_001.mp4'
6
+ 2025-08-20 23:18:11 - INFO - [7395856c-f721-471f-85da-b8268191bd53] Video saved to temporary file: temp_videos/7395856c-f721-471f-85da-b8268191bd53.mp4
7
+ 2025-08-20 23:18:11 - INFO - [7395856c-f721-471f-85da-b8268191bd53] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-20 23:18:16 - INFO - [7395856c-f721-471f-85da-b8268191bd53] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-20 23:18:16 - INFO - [7395856c-f721-471f-85da-b8268191bd53] 30 frames saved to temp_videos/7395856c-f721-471f-85da-b8268191bd53
10
+ 2025-08-20 23:18:16 - INFO - Prompt token length: 3604
11
+ 2025-08-20 23:18:36 - INFO - Tokens per second: 43.071821438037745, Peak GPU memory MB: 9378.375
12
+ 2025-08-20 23:18:36 - INFO - [7395856c-f721-471f-85da-b8268191bd53] Inference time: 24.96 seconds, CPU usage: 27.5%, CPU core utilization: [19.8, 26.4, 21.3, 42.5]
13
+ 2025-08-20 23:18:36 - INFO - [7395856c-f721-471f-85da-b8268191bd53] Cleaned up temporary frame directory: temp_videos/7395856c-f721-471f-85da-b8268191bd53
14
+ 2025-08-20 23:18:36 - INFO - [d8cb0a25-8bb0-4b63-9b5a-30f5dc54fd7a] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_001.mp4'
15
+ 2025-08-20 23:18:36 - INFO - [d8cb0a25-8bb0-4b63-9b5a-30f5dc54fd7a] Video saved to temporary file: temp_videos/d8cb0a25-8bb0-4b63-9b5a-30f5dc54fd7a.mp4
16
+ 2025-08-20 23:18:36 - INFO - [d8cb0a25-8bb0-4b63-9b5a-30f5dc54fd7a] Extracting frames using method: uniform, rate/threshold: 30
17
+ 2025-08-20 23:18:41 - INFO - [d8cb0a25-8bb0-4b63-9b5a-30f5dc54fd7a] Extracted 30 frames successfully. Saving to temporary files...
18
+ 2025-08-20 23:18:41 - INFO - [d8cb0a25-8bb0-4b63-9b5a-30f5dc54fd7a] 30 frames saved to temp_videos/d8cb0a25-8bb0-4b63-9b5a-30f5dc54fd7a
19
+ 2025-08-20 23:18:41 - INFO - Prompt token length: 3604
20
+ 2025-08-20 23:19:01 - INFO - Tokens per second: 43.41840127709154, Peak GPU memory MB: 9378.375
21
+ 2025-08-20 23:19:01 - INFO - [d8cb0a25-8bb0-4b63-9b5a-30f5dc54fd7a] Inference time: 24.87 seconds, CPU usage: 38.8%, CPU core utilization: [32.1, 40.6, 19.5, 63.0]
22
+ 2025-08-20 23:19:01 - INFO - [d8cb0a25-8bb0-4b63-9b5a-30f5dc54fd7a] Cleaned up temporary frame directory: temp_videos/d8cb0a25-8bb0-4b63-9b5a-30f5dc54fd7a
23
+ 2025-08-20 23:20:04 - INFO - [9679be14-bd46-4299-ad82-b4bab2ae763c] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_001.mp4'
24
+ 2025-08-20 23:20:04 - INFO - [9679be14-bd46-4299-ad82-b4bab2ae763c] Video saved to temporary file: temp_videos/9679be14-bd46-4299-ad82-b4bab2ae763c.mp4
25
+ 2025-08-20 23:20:04 - INFO - [9679be14-bd46-4299-ad82-b4bab2ae763c] Extracting frames using method: uniform, rate/threshold: 30
26
+ 2025-08-20 23:20:08 - INFO - [9679be14-bd46-4299-ad82-b4bab2ae763c] Extracted 30 frames successfully. Saving to temporary files...
27
+ 2025-08-20 23:20:08 - INFO - [9679be14-bd46-4299-ad82-b4bab2ae763c] 30 frames saved to temp_videos/9679be14-bd46-4299-ad82-b4bab2ae763c
28
+ 2025-08-20 23:20:09 - INFO - Prompt token length: 3604
29
+ 2025-08-20 23:20:28 - INFO - Tokens per second: 43.04488796653849, Peak GPU memory MB: 9378.375
30
+ 2025-08-20 23:20:28 - INFO - [9679be14-bd46-4299-ad82-b4bab2ae763c] Inference time: 24.78 seconds, CPU usage: 14.3%, CPU core utilization: [8.6, 10.9, 9.3, 28.5]
31
+ 2025-08-20 23:20:28 - INFO - [9679be14-bd46-4299-ad82-b4bab2ae763c] Cleaned up temporary frame directory: temp_videos/9679be14-bd46-4299-ad82-b4bab2ae763c
32
+ 2025-08-20 23:20:28 - INFO - [3d9610ba-18fc-4512-876c-62775d568781] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_001.mp4'
33
+ 2025-08-20 23:20:28 - INFO - [3d9610ba-18fc-4512-876c-62775d568781] Video saved to temporary file: temp_videos/3d9610ba-18fc-4512-876c-62775d568781.mp4
34
+ 2025-08-20 23:20:28 - INFO - [3d9610ba-18fc-4512-876c-62775d568781] Extracting frames using method: uniform, rate/threshold: 30
35
+ 2025-08-20 23:20:33 - INFO - [3d9610ba-18fc-4512-876c-62775d568781] Extracted 30 frames successfully. Saving to temporary files...
36
+ 2025-08-20 23:20:33 - INFO - [3d9610ba-18fc-4512-876c-62775d568781] 30 frames saved to temp_videos/3d9610ba-18fc-4512-876c-62775d568781
37
+ 2025-08-20 23:20:34 - INFO - Prompt token length: 3604
38
+ 2025-08-20 23:20:53 - INFO - Tokens per second: 43.20754976197641, Peak GPU memory MB: 9378.375
39
+ 2025-08-20 23:20:53 - INFO - [3d9610ba-18fc-4512-876c-62775d568781] Inference time: 24.83 seconds, CPU usage: 38.8%, CPU core utilization: [20.3, 39.8, 19.5, 75.4]
40
+ 2025-08-20 23:20:53 - INFO - [3d9610ba-18fc-4512-876c-62775d568781] Cleaned up temporary frame directory: temp_videos/3d9610ba-18fc-4512-876c-62775d568781
41
+ 2025-08-20 23:20:53 - INFO - [91f8f3d5-15de-406b-834c-7aa4204cb5de] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_002.mp4'
42
+ 2025-08-20 23:20:53 - INFO - [91f8f3d5-15de-406b-834c-7aa4204cb5de] Video saved to temporary file: temp_videos/91f8f3d5-15de-406b-834c-7aa4204cb5de.mp4
43
+ 2025-08-20 23:20:53 - INFO - [91f8f3d5-15de-406b-834c-7aa4204cb5de] Extracting frames using method: uniform, rate/threshold: 30
44
+ 2025-08-20 23:20:58 - INFO - [91f8f3d5-15de-406b-834c-7aa4204cb5de] Extracted 30 frames successfully. Saving to temporary files...
45
+ 2025-08-20 23:20:58 - INFO - [91f8f3d5-15de-406b-834c-7aa4204cb5de] 30 frames saved to temp_videos/91f8f3d5-15de-406b-834c-7aa4204cb5de
46
+ 2025-08-20 23:20:58 - INFO - Prompt token length: 3604
47
+ 2025-08-20 23:21:21 - INFO - Tokens per second: 38.828257610417374, Peak GPU memory MB: 9378.375
48
+ 2025-08-20 23:21:21 - INFO - [91f8f3d5-15de-406b-834c-7aa4204cb5de] Inference time: 27.69 seconds, CPU usage: 54.6%, CPU core utilization: [46.3, 65.5, 40.6, 66.1]
49
+ 2025-08-20 23:21:21 - INFO - [91f8f3d5-15de-406b-834c-7aa4204cb5de] Cleaned up temporary frame directory: temp_videos/91f8f3d5-15de-406b-834c-7aa4204cb5de
50
+ 2025-08-20 23:21:21 - INFO - [9f1dd041-e69e-4183-8efa-f61fd08559c2] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_002.mp4'
51
+ 2025-08-20 23:21:21 - INFO - [9f1dd041-e69e-4183-8efa-f61fd08559c2] Video saved to temporary file: temp_videos/9f1dd041-e69e-4183-8efa-f61fd08559c2.mp4
52
+ 2025-08-20 23:21:21 - INFO - [9f1dd041-e69e-4183-8efa-f61fd08559c2] Extracting frames using method: uniform, rate/threshold: 30
53
+ 2025-08-20 23:21:29 - INFO - [9f1dd041-e69e-4183-8efa-f61fd08559c2] Extracted 30 frames successfully. Saving to temporary files...
54
+ 2025-08-20 23:21:29 - INFO - [9f1dd041-e69e-4183-8efa-f61fd08559c2] 30 frames saved to temp_videos/9f1dd041-e69e-4183-8efa-f61fd08559c2
55
+ 2025-08-20 23:21:29 - INFO - Prompt token length: 3604
56
+ 2025-08-20 23:21:51 - INFO - Tokens per second: 43.03845068270847, Peak GPU memory MB: 9378.375
57
+ 2025-08-20 23:21:51 - INFO - [9f1dd041-e69e-4183-8efa-f61fd08559c2] Inference time: 29.71 seconds, CPU usage: 53.7%, CPU core utilization: [39.4, 55.5, 81.5, 38.5]
58
+ 2025-08-20 23:21:51 - INFO - [9f1dd041-e69e-4183-8efa-f61fd08559c2] Cleaned up temporary frame directory: temp_videos/9f1dd041-e69e-4183-8efa-f61fd08559c2
59
+ 2025-08-20 23:21:51 - INFO - [827dbb61-d685-495c-883d-067090aa9e49] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_003.mp4'
60
+ 2025-08-20 23:21:51 - INFO - [827dbb61-d685-495c-883d-067090aa9e49] Video saved to temporary file: temp_videos/827dbb61-d685-495c-883d-067090aa9e49.mp4
61
+ 2025-08-20 23:21:51 - INFO - [827dbb61-d685-495c-883d-067090aa9e49] Extracting frames using method: uniform, rate/threshold: 30
62
+ 2025-08-20 23:21:56 - INFO - [827dbb61-d685-495c-883d-067090aa9e49] Extracted 30 frames successfully. Saving to temporary files...
63
+ 2025-08-20 23:21:56 - INFO - [827dbb61-d685-495c-883d-067090aa9e49] 30 frames saved to temp_videos/827dbb61-d685-495c-883d-067090aa9e49
64
+ 2025-08-20 23:21:56 - INFO - Prompt token length: 3604
65
+ 2025-08-20 23:22:13 - INFO - Tokens per second: 43.19212817813234, Peak GPU memory MB: 9378.375
66
+ 2025-08-20 23:22:13 - INFO - [827dbb61-d685-495c-883d-067090aa9e49] Inference time: 22.10 seconds, CPU usage: 40.3%, CPU core utilization: [22.0, 88.1, 21.6, 29.3]
67
+ 2025-08-20 23:22:13 - INFO - [827dbb61-d685-495c-883d-067090aa9e49] Cleaned up temporary frame directory: temp_videos/827dbb61-d685-495c-883d-067090aa9e49
API_Transformers/logs/LFM2-VL-1.6B/20250820_232316.log ADDED
@@ -0,0 +1,24 @@
1
+ 2025-08-20 23:23:16 - INFO - Loading model: LiquidAI/LFM2-VL-1.6B
2
+ 2025-08-20 23:23:18 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
+ 2025-08-20 23:23:24 - INFO - Model loaded in 7.69 seconds
+ 2025-08-20 23:23:24 - INFO - GPU Memory Usage after model load: 3023.64 MB
+ 2025-08-20 23:23:36 - INFO - [dbbd5e6a-91d4-4c60-91ae-6b6d212302e1] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: 'sample_part_001.mp4'
+ 2025-08-20 23:23:36 - INFO - [dbbd5e6a-91d4-4c60-91ae-6b6d212302e1] Video saved to temporary file: temp_videos/dbbd5e6a-91d4-4c60-91ae-6b6d212302e1.mp4
+ 2025-08-20 23:23:36 - INFO - [dbbd5e6a-91d4-4c60-91ae-6b6d212302e1] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-20 23:23:44 - INFO - [dbbd5e6a-91d4-4c60-91ae-6b6d212302e1] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-20 23:23:44 - INFO - [dbbd5e6a-91d4-4c60-91ae-6b6d212302e1] 30 frames saved to temp_videos/dbbd5e6a-91d4-4c60-91ae-6b6d212302e1
+ 2025-08-20 23:23:44 - INFO - Prompt token length: 3604
+ 2025-08-20 23:24:04 - INFO - Tokens per second: 42.609188518901775, Peak GPU memory MB: 9378.375
+ 2025-08-20 23:24:04 - INFO - [dbbd5e6a-91d4-4c60-91ae-6b6d212302e1] Inference time: 28.08 seconds, CPU usage: 61.6%, CPU core utilization: [56.9, 67.6, 54.1, 67.9]
+ 2025-08-20 23:24:04 - INFO - [dbbd5e6a-91d4-4c60-91ae-6b6d212302e1] Cleaned up temporary file: temp_videos/dbbd5e6a-91d4-4c60-91ae-6b6d212302e1.mp4
+ 2025-08-20 23:24:04 - INFO - [dbbd5e6a-91d4-4c60-91ae-6b6d212302e1] Cleaned up temporary frame directory: temp_videos/dbbd5e6a-91d4-4c60-91ae-6b6d212302e1
+ 2025-08-20 23:24:04 - INFO - [2750fe67-230a-46a2-87b9-928b60a2a3d4] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: 'sample_part_002.mp4'
+ 2025-08-20 23:24:04 - INFO - [2750fe67-230a-46a2-87b9-928b60a2a3d4] Video saved to temporary file: temp_videos/2750fe67-230a-46a2-87b9-928b60a2a3d4.mp4
+ 2025-08-20 23:24:04 - INFO - [2750fe67-230a-46a2-87b9-928b60a2a3d4] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-20 23:24:09 - INFO - [2750fe67-230a-46a2-87b9-928b60a2a3d4] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-20 23:24:09 - INFO - [2750fe67-230a-46a2-87b9-928b60a2a3d4] 30 frames saved to temp_videos/2750fe67-230a-46a2-87b9-928b60a2a3d4
+ 2025-08-20 23:24:10 - INFO - Prompt token length: 3604
+ 2025-08-20 23:24:32 - INFO - Tokens per second: 38.75598021320056, Peak GPU memory MB: 9378.375
+ 2025-08-20 23:24:32 - INFO - [2750fe67-230a-46a2-87b9-928b60a2a3d4] Inference time: 27.76 seconds, CPU usage: 76.2%, CPU core utilization: [78.9, 78.2, 72.8, 75.0]
+ 2025-08-20 23:24:32 - INFO - [2750fe67-230a-46a2-87b9-928b60a2a3d4] Cleaned up temporary file: temp_videos/2750fe67-230a-46a2-87b9-928b60a2a3d4.mp4
+ 2025-08-20 23:24:32 - INFO - [2750fe67-230a-46a2-87b9-928b60a2a3d4] Cleaned up temporary frame directory: temp_videos/2750fe67-230a-46a2-87b9-928b60a2a3d4
API_Transformers/logs/LFM2-VL-1.6B/20250820_232542.log ADDED
@@ -0,0 +1,130 @@
+ 2025-08-20 23:25:42 - INFO - Loading model: LiquidAI/LFM2-VL-1.6B
+ 2025-08-20 23:25:44 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
+ 2025-08-20 23:25:50 - INFO - Model loaded in 7.25 seconds
+ 2025-08-20 23:25:50 - INFO - GPU Memory Usage after model load: 3023.64 MB
+ 2025-08-20 23:26:55 - INFO - [e83cd3f1-609c-4419-b86e-463266ac54ce] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_001.mp4'
+ 2025-08-20 23:26:55 - INFO - [e83cd3f1-609c-4419-b86e-463266ac54ce] Video saved to temporary file: temp_videos/e83cd3f1-609c-4419-b86e-463266ac54ce.mp4
+ 2025-08-20 23:26:55 - INFO - [e83cd3f1-609c-4419-b86e-463266ac54ce] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-20 23:27:00 - INFO - [e83cd3f1-609c-4419-b86e-463266ac54ce] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-20 23:27:00 - INFO - [e83cd3f1-609c-4419-b86e-463266ac54ce] 30 frames saved to temp_videos/e83cd3f1-609c-4419-b86e-463266ac54ce
+ 2025-08-20 23:27:00 - INFO - Prompt token length: 3604
+ 2025-08-20 23:27:20 - INFO - Tokens per second: 43.03910315479847, Peak GPU memory MB: 9378.375
+ 2025-08-20 23:27:20 - INFO - [e83cd3f1-609c-4419-b86e-463266ac54ce] Inference time: 25.01 seconds, CPU usage: 40.1%, CPU core utilization: [36.1, 41.1, 37.2, 46.1]
+ 2025-08-20 23:27:20 - INFO - [e83cd3f1-609c-4419-b86e-463266ac54ce] Cleaned up temporary frame directory: temp_videos/e83cd3f1-609c-4419-b86e-463266ac54ce
+ 2025-08-20 23:27:20 - INFO - [09fa6c2e-50b5-4c0a-ab72-c399a68e3b19] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_002.mp4'
+ 2025-08-20 23:27:20 - INFO - [09fa6c2e-50b5-4c0a-ab72-c399a68e3b19] Video saved to temporary file: temp_videos/09fa6c2e-50b5-4c0a-ab72-c399a68e3b19.mp4
+ 2025-08-20 23:27:20 - INFO - [09fa6c2e-50b5-4c0a-ab72-c399a68e3b19] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-20 23:27:25 - INFO - [09fa6c2e-50b5-4c0a-ab72-c399a68e3b19] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-20 23:27:25 - INFO - [09fa6c2e-50b5-4c0a-ab72-c399a68e3b19] 30 frames saved to temp_videos/09fa6c2e-50b5-4c0a-ab72-c399a68e3b19
+ 2025-08-20 23:27:25 - INFO - Prompt token length: 3604
+ 2025-08-20 23:27:47 - INFO - Tokens per second: 42.95401647014546, Peak GPU memory MB: 9378.375
+ 2025-08-20 23:27:47 - INFO - [09fa6c2e-50b5-4c0a-ab72-c399a68e3b19] Inference time: 27.03 seconds, CPU usage: 39.6%, CPU core utilization: [61.4, 39.9, 35.3, 21.8]
+ 2025-08-20 23:27:47 - INFO - [09fa6c2e-50b5-4c0a-ab72-c399a68e3b19] Cleaned up temporary frame directory: temp_videos/09fa6c2e-50b5-4c0a-ab72-c399a68e3b19
+ 2025-08-20 23:27:47 - INFO - [d2c64140-0f5b-4e4a-83b7-feabd7c4323d] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_003.mp4'
+ 2025-08-20 23:27:47 - INFO - [d2c64140-0f5b-4e4a-83b7-feabd7c4323d] Video saved to temporary file: temp_videos/d2c64140-0f5b-4e4a-83b7-feabd7c4323d.mp4
+ 2025-08-20 23:27:47 - INFO - [d2c64140-0f5b-4e4a-83b7-feabd7c4323d] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-20 23:27:52 - INFO - [d2c64140-0f5b-4e4a-83b7-feabd7c4323d] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-20 23:27:52 - INFO - [d2c64140-0f5b-4e4a-83b7-feabd7c4323d] 30 frames saved to temp_videos/d2c64140-0f5b-4e4a-83b7-feabd7c4323d
+ 2025-08-20 23:27:52 - INFO - Prompt token length: 3604
+ 2025-08-20 23:28:09 - INFO - Tokens per second: 43.51874005006489, Peak GPU memory MB: 9378.375
+ 2025-08-20 23:28:09 - INFO - [d2c64140-0f5b-4e4a-83b7-feabd7c4323d] Inference time: 22.02 seconds, CPU usage: 39.8%, CPU core utilization: [81.6, 22.0, 31.3, 24.0]
+ 2025-08-20 23:28:09 - INFO - [d2c64140-0f5b-4e4a-83b7-feabd7c4323d] Cleaned up temporary frame directory: temp_videos/d2c64140-0f5b-4e4a-83b7-feabd7c4323d
+ 2025-08-20 23:28:09 - INFO - [d27c71e9-88ae-44b1-834e-64d54e1645f9] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_004.mp4'
+ 2025-08-20 23:28:09 - INFO - [d27c71e9-88ae-44b1-834e-64d54e1645f9] Video saved to temporary file: temp_videos/d27c71e9-88ae-44b1-834e-64d54e1645f9.mp4
+ 2025-08-20 23:28:09 - INFO - [d27c71e9-88ae-44b1-834e-64d54e1645f9] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-20 23:28:14 - INFO - [d27c71e9-88ae-44b1-834e-64d54e1645f9] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-20 23:28:14 - INFO - [d27c71e9-88ae-44b1-834e-64d54e1645f9] 30 frames saved to temp_videos/d27c71e9-88ae-44b1-834e-64d54e1645f9
+ 2025-08-20 23:28:14 - INFO - Prompt token length: 3604
+ 2025-08-20 23:28:32 - INFO - Tokens per second: 42.80125306128213, Peak GPU memory MB: 9378.375
+ 2025-08-20 23:28:32 - INFO - [d27c71e9-88ae-44b1-834e-64d54e1645f9] Inference time: 22.93 seconds, CPU usage: 39.8%, CPU core utilization: [19.9, 21.9, 21.1, 96.1]
+ 2025-08-20 23:28:32 - INFO - [d27c71e9-88ae-44b1-834e-64d54e1645f9] Cleaned up temporary frame directory: temp_videos/d27c71e9-88ae-44b1-834e-64d54e1645f9
+ 2025-08-20 23:28:32 - INFO - [16164179-73f7-4aa7-a42b-e6453e0f48af] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_005.mp4'
+ 2025-08-20 23:28:32 - INFO - [16164179-73f7-4aa7-a42b-e6453e0f48af] Video saved to temporary file: temp_videos/16164179-73f7-4aa7-a42b-e6453e0f48af.mp4
+ 2025-08-20 23:28:32 - INFO - [16164179-73f7-4aa7-a42b-e6453e0f48af] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-20 23:28:37 - INFO - [16164179-73f7-4aa7-a42b-e6453e0f48af] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-20 23:28:37 - INFO - [16164179-73f7-4aa7-a42b-e6453e0f48af] 30 frames saved to temp_videos/16164179-73f7-4aa7-a42b-e6453e0f48af
+ 2025-08-20 23:28:37 - INFO - Prompt token length: 3604
+ 2025-08-20 23:28:57 - INFO - Tokens per second: 42.835329663650555, Peak GPU memory MB: 9378.375
+ 2025-08-20 23:28:57 - INFO - [16164179-73f7-4aa7-a42b-e6453e0f48af] Inference time: 24.90 seconds, CPU usage: 38.4%, CPU core utilization: [21.2, 37.7, 18.6, 76.1]
+ 2025-08-20 23:28:57 - INFO - [16164179-73f7-4aa7-a42b-e6453e0f48af] Cleaned up temporary frame directory: temp_videos/16164179-73f7-4aa7-a42b-e6453e0f48af
+ 2025-08-20 23:28:57 - INFO - [371792c3-0d46-4934-8417-8fb6f5b7e4c2] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_006.mp4'
+ 2025-08-20 23:28:57 - INFO - [371792c3-0d46-4934-8417-8fb6f5b7e4c2] Video saved to temporary file: temp_videos/371792c3-0d46-4934-8417-8fb6f5b7e4c2.mp4
+ 2025-08-20 23:28:57 - INFO - [371792c3-0d46-4934-8417-8fb6f5b7e4c2] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-20 23:29:02 - INFO - [371792c3-0d46-4934-8417-8fb6f5b7e4c2] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-20 23:29:02 - INFO - [371792c3-0d46-4934-8417-8fb6f5b7e4c2] 30 frames saved to temp_videos/371792c3-0d46-4934-8417-8fb6f5b7e4c2
+ 2025-08-20 23:29:02 - INFO - Prompt token length: 3604
+ 2025-08-20 23:29:21 - INFO - Tokens per second: 43.349566710658124, Peak GPU memory MB: 9378.375
+ 2025-08-20 23:29:21 - INFO - [371792c3-0d46-4934-8417-8fb6f5b7e4c2] Inference time: 24.44 seconds, CPU usage: 39.0%, CPU core utilization: [45.2, 19.7, 70.7, 20.1]
+ 2025-08-20 23:29:21 - INFO - [371792c3-0d46-4934-8417-8fb6f5b7e4c2] Cleaned up temporary frame directory: temp_videos/371792c3-0d46-4934-8417-8fb6f5b7e4c2
+ 2025-08-20 23:29:21 - INFO - [fe4d9541-064c-4c8d-bb08-c1d347420c33] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_007.mp4'
+ 2025-08-20 23:29:21 - INFO - [fe4d9541-064c-4c8d-bb08-c1d347420c33] Video saved to temporary file: temp_videos/fe4d9541-064c-4c8d-bb08-c1d347420c33.mp4
+ 2025-08-20 23:29:21 - INFO - [fe4d9541-064c-4c8d-bb08-c1d347420c33] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-20 23:29:26 - INFO - [fe4d9541-064c-4c8d-bb08-c1d347420c33] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-20 23:29:26 - INFO - [fe4d9541-064c-4c8d-bb08-c1d347420c33] 30 frames saved to temp_videos/fe4d9541-064c-4c8d-bb08-c1d347420c33
+ 2025-08-20 23:29:27 - INFO - Prompt token length: 3604
+ 2025-08-20 23:29:43 - INFO - Tokens per second: 43.652183023369325, Peak GPU memory MB: 9378.375
+ 2025-08-20 23:29:43 - INFO - [fe4d9541-064c-4c8d-bb08-c1d347420c33] Inference time: 21.99 seconds, CPU usage: 39.8%, CPU core utilization: [80.7, 22.9, 34.6, 21.1]
+ 2025-08-20 23:29:43 - INFO - [fe4d9541-064c-4c8d-bb08-c1d347420c33] Cleaned up temporary frame directory: temp_videos/fe4d9541-064c-4c8d-bb08-c1d347420c33
+ 2025-08-20 23:29:43 - INFO - [e5947bb6-739e-4bc1-bbe4-9ab58dc731d6] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_008.mp4'
+ 2025-08-20 23:29:43 - INFO - [e5947bb6-739e-4bc1-bbe4-9ab58dc731d6] Video saved to temporary file: temp_videos/e5947bb6-739e-4bc1-bbe4-9ab58dc731d6.mp4
+ 2025-08-20 23:29:43 - INFO - [e5947bb6-739e-4bc1-bbe4-9ab58dc731d6] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-20 23:29:48 - INFO - [e5947bb6-739e-4bc1-bbe4-9ab58dc731d6] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-20 23:29:48 - INFO - [e5947bb6-739e-4bc1-bbe4-9ab58dc731d6] 30 frames saved to temp_videos/e5947bb6-739e-4bc1-bbe4-9ab58dc731d6
+ 2025-08-20 23:29:49 - INFO - Prompt token length: 3604
+ 2025-08-20 23:30:08 - INFO - Tokens per second: 42.780439620939916, Peak GPU memory MB: 9378.375
+ 2025-08-20 23:30:08 - INFO - [e5947bb6-739e-4bc1-bbe4-9ab58dc731d6] Inference time: 24.21 seconds, CPU usage: 41.1%, CPU core utilization: [62.1, 38.6, 41.8, 21.9]
+ 2025-08-20 23:30:08 - INFO - [e5947bb6-739e-4bc1-bbe4-9ab58dc731d6] Cleaned up temporary frame directory: temp_videos/e5947bb6-739e-4bc1-bbe4-9ab58dc731d6
+ 2025-08-20 23:30:08 - INFO - [b8e6aa49-574c-447b-b3ef-fd02c306e746] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_009.mp4'
+ 2025-08-20 23:30:08 - INFO - [b8e6aa49-574c-447b-b3ef-fd02c306e746] Video saved to temporary file: temp_videos/b8e6aa49-574c-447b-b3ef-fd02c306e746.mp4
+ 2025-08-20 23:30:08 - INFO - [b8e6aa49-574c-447b-b3ef-fd02c306e746] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-20 23:30:13 - INFO - [b8e6aa49-574c-447b-b3ef-fd02c306e746] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-20 23:30:13 - INFO - [b8e6aa49-574c-447b-b3ef-fd02c306e746] 30 frames saved to temp_videos/b8e6aa49-574c-447b-b3ef-fd02c306e746
+ 2025-08-20 23:30:13 - INFO - Prompt token length: 3604
+ 2025-08-20 23:30:33 - INFO - Tokens per second: 43.12406437638801, Peak GPU memory MB: 9378.375
+ 2025-08-20 23:30:33 - INFO - [b8e6aa49-574c-447b-b3ef-fd02c306e746] Inference time: 25.30 seconds, CPU usage: 38.6%, CPU core utilization: [18.8, 47.6, 19.3, 68.7]
+ 2025-08-20 23:30:33 - INFO - [b8e6aa49-574c-447b-b3ef-fd02c306e746] Cleaned up temporary frame directory: temp_videos/b8e6aa49-574c-447b-b3ef-fd02c306e746
+ 2025-08-20 23:32:24 - INFO - [f04edbef-149d-425b-8805-113e2ea54029] Received new video inference request. Prompt: 'Summarize the key events in this convenience store video. Focus only on the actions and interactions of the people. Avoid repetitive descriptions of the store's layout or shelves.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_001.mp4'
+ 2025-08-20 23:32:24 - INFO - [f04edbef-149d-425b-8805-113e2ea54029] Video saved to temporary file: temp_videos/f04edbef-149d-425b-8805-113e2ea54029.mp4
+ 2025-08-20 23:32:24 - INFO - [f04edbef-149d-425b-8805-113e2ea54029] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-20 23:32:29 - INFO - [f04edbef-149d-425b-8805-113e2ea54029] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-20 23:32:29 - INFO - [f04edbef-149d-425b-8805-113e2ea54029] 30 frames saved to temp_videos/f04edbef-149d-425b-8805-113e2ea54029
+ 2025-08-20 23:32:30 - INFO - Prompt token length: 3613
+ 2025-08-20 23:32:46 - INFO - Tokens per second: 43.735840716197266, Peak GPU memory MB: 9378.375
+ 2025-08-20 23:32:46 - INFO - [f04edbef-149d-425b-8805-113e2ea54029] Inference time: 22.10 seconds, CPU usage: 8.4%, CPU core utilization: [9.6, 5.9, 10.7, 7.5]
+ 2025-08-20 23:32:46 - INFO - [f04edbef-149d-425b-8805-113e2ea54029] Cleaned up temporary frame directory: temp_videos/f04edbef-149d-425b-8805-113e2ea54029
+ 2025-08-20 23:32:46 - INFO - [a0f9ff7d-76c8-42a2-ac7f-8064d56ae6f2] Received new video inference request. Prompt: 'Summarize the key events in this convenience store video. Focus only on the actions and interactions of the people. Avoid repetitive descriptions of the store's layout or shelves.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_002.mp4'
+ 2025-08-20 23:32:46 - INFO - [a0f9ff7d-76c8-42a2-ac7f-8064d56ae6f2] Video saved to temporary file: temp_videos/a0f9ff7d-76c8-42a2-ac7f-8064d56ae6f2.mp4
+ 2025-08-20 23:32:46 - INFO - [a0f9ff7d-76c8-42a2-ac7f-8064d56ae6f2] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-20 23:32:51 - INFO - [a0f9ff7d-76c8-42a2-ac7f-8064d56ae6f2] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-20 23:32:51 - INFO - [a0f9ff7d-76c8-42a2-ac7f-8064d56ae6f2] 30 frames saved to temp_videos/a0f9ff7d-76c8-42a2-ac7f-8064d56ae6f2
+ 2025-08-20 23:32:52 - INFO - Prompt token length: 3613
+ 2025-08-20 23:33:09 - INFO - Tokens per second: 43.61585189482321, Peak GPU memory MB: 9378.375
+ 2025-08-20 23:33:09 - INFO - [a0f9ff7d-76c8-42a2-ac7f-8064d56ae6f2] Inference time: 22.20 seconds, CPU usage: 41.2%, CPU core utilization: [40.5, 22.9, 75.8, 25.4]
+ 2025-08-20 23:33:09 - INFO - [a0f9ff7d-76c8-42a2-ac7f-8064d56ae6f2] Cleaned up temporary frame directory: temp_videos/a0f9ff7d-76c8-42a2-ac7f-8064d56ae6f2
+ 2025-08-20 23:33:09 - INFO - [2f221dc2-a54d-4d32-8900-2b5c046cebaf] Received new video inference request. Prompt: 'Summarize the key events in this convenience store video. Focus only on the actions and interactions of the people. Avoid repetitive descriptions of the store's layout or shelves.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_003.mp4'
+ 2025-08-20 23:33:09 - INFO - [2f221dc2-a54d-4d32-8900-2b5c046cebaf] Video saved to temporary file: temp_videos/2f221dc2-a54d-4d32-8900-2b5c046cebaf.mp4
+ 2025-08-20 23:33:09 - INFO - [2f221dc2-a54d-4d32-8900-2b5c046cebaf] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-20 23:33:14 - INFO - [2f221dc2-a54d-4d32-8900-2b5c046cebaf] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-20 23:33:14 - INFO - [2f221dc2-a54d-4d32-8900-2b5c046cebaf] 30 frames saved to temp_videos/2f221dc2-a54d-4d32-8900-2b5c046cebaf
+ 2025-08-20 23:33:14 - INFO - Prompt token length: 3613
+ 2025-08-20 23:33:31 - INFO - Tokens per second: 43.7656086803256, Peak GPU memory MB: 9378.375
+ 2025-08-20 23:33:31 - INFO - [2f221dc2-a54d-4d32-8900-2b5c046cebaf] Inference time: 21.94 seconds, CPU usage: 40.5%, CPU core utilization: [37.9, 46.5, 22.3, 55.2]
+ 2025-08-20 23:33:31 - INFO - [2f221dc2-a54d-4d32-8900-2b5c046cebaf] Cleaned up temporary frame directory: temp_videos/2f221dc2-a54d-4d32-8900-2b5c046cebaf
+ 2025-08-20 23:33:31 - INFO - [a0b144b5-c0eb-4c16-8826-7047eed0dbed] Received new video inference request. Prompt: 'Summarize the key events in this convenience store video. Focus only on the actions and interactions of the people. Avoid repetitive descriptions of the store's layout or shelves.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_004.mp4'
+ 2025-08-20 23:33:31 - INFO - [a0b144b5-c0eb-4c16-8826-7047eed0dbed] Video saved to temporary file: temp_videos/a0b144b5-c0eb-4c16-8826-7047eed0dbed.mp4
+ 2025-08-20 23:33:31 - INFO - [a0b144b5-c0eb-4c16-8826-7047eed0dbed] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-20 23:33:35 - INFO - [a0b144b5-c0eb-4c16-8826-7047eed0dbed] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-20 23:33:35 - INFO - [a0b144b5-c0eb-4c16-8826-7047eed0dbed] 30 frames saved to temp_videos/a0b144b5-c0eb-4c16-8826-7047eed0dbed
+ 2025-08-20 23:33:36 - INFO - Prompt token length: 3613
+ 2025-08-20 23:33:52 - INFO - Tokens per second: 43.67951928934803, Peak GPU memory MB: 9378.375
+ 2025-08-20 23:33:52 - INFO - [a0b144b5-c0eb-4c16-8826-7047eed0dbed] Inference time: 21.80 seconds, CPU usage: 40.2%, CPU core utilization: [67.0, 20.9, 50.2, 22.5]
+ 2025-08-20 23:33:52 - INFO - [a0b144b5-c0eb-4c16-8826-7047eed0dbed] Cleaned up temporary frame directory: temp_videos/a0b144b5-c0eb-4c16-8826-7047eed0dbed
+ 2025-08-20 23:33:52 - INFO - [526b7643-3bfd-4e06-91c3-b91651e42819] Received new video inference request. Prompt: 'Summarize the key events in this convenience store video. Focus only on the actions and interactions of the people. Avoid repetitive descriptions of the store's layout or shelves.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_005.mp4'
+ 2025-08-20 23:33:52 - INFO - [526b7643-3bfd-4e06-91c3-b91651e42819] Video saved to temporary file: temp_videos/526b7643-3bfd-4e06-91c3-b91651e42819.mp4
+ 2025-08-20 23:33:52 - INFO - [526b7643-3bfd-4e06-91c3-b91651e42819] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-20 23:33:57 - INFO - [526b7643-3bfd-4e06-91c3-b91651e42819] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-20 23:33:57 - INFO - [526b7643-3bfd-4e06-91c3-b91651e42819] 30 frames saved to temp_videos/526b7643-3bfd-4e06-91c3-b91651e42819
+ 2025-08-20 23:33:58 - INFO - Prompt token length: 3613
+ 2025-08-20 23:34:15 - INFO - Tokens per second: 43.287042978207154, Peak GPU memory MB: 9378.375
+ 2025-08-20 23:34:15 - INFO - [526b7643-3bfd-4e06-91c3-b91651e42819] Inference time: 22.61 seconds, CPU usage: 40.3%, CPU core utilization: [21.9, 27.2, 21.6, 90.4]
+ 2025-08-20 23:34:15 - INFO - [526b7643-3bfd-4e06-91c3-b91651e42819] Cleaned up temporary frame directory: temp_videos/526b7643-3bfd-4e06-91c3-b91651e42819
API_Transformers/logs/MiniCPM-V-4/20250819_004631.log ADDED
@@ -0,0 +1,14 @@
+ 2025-08-19 00:46:31 - INFO - Loading model: openbmb/MiniCPM-V-4
+ 2025-08-19 00:46:31 - INFO - vision_config is None, using default vision config
+ 2025-08-19 00:47:35 - INFO - Model loaded in 64.26 seconds
+ 2025-08-19 00:47:35 - INFO - GPU Memory Usage after model load: 7802.99 MB
+ 2025-08-19 00:48:00 - INFO - [be95cc0f-0dca-41c0-a89b-822971620a94] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
+ 2025-08-19 00:48:00 - INFO - [be95cc0f-0dca-41c0-a89b-822971620a94] Video saved to temporary file: temp_videos/be95cc0f-0dca-41c0-a89b-822971620a94.mp4
+ 2025-08-19 00:48:00 - INFO - [be95cc0f-0dca-41c0-a89b-822971620a94] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-19 00:48:02 - INFO - [be95cc0f-0dca-41c0-a89b-822971620a94] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-19 00:48:02 - INFO - [be95cc0f-0dca-41c0-a89b-822971620a94] 30 frames saved to temp_videos/be95cc0f-0dca-41c0-a89b-822971620a94
+ 2025-08-19 00:48:20 - INFO - vision_config is None, using default vision config
+ 2025-08-19 00:48:40 - INFO - Tokens per second: 9.330762006151874, Peak GPU memory MB: 11824.375
+ 2025-08-19 00:48:40 - INFO - [be95cc0f-0dca-41c0-a89b-822971620a94] Inference time: 40.34 seconds, CPU usage: 32.3%, CPU core utilization: [30.8, 32.9, 36.8, 28.6]
+ 2025-08-19 00:48:40 - INFO - [be95cc0f-0dca-41c0-a89b-822971620a94] Cleaned up temporary file: temp_videos/be95cc0f-0dca-41c0-a89b-822971620a94.mp4
+ 2025-08-19 00:48:40 - INFO - [be95cc0f-0dca-41c0-a89b-822971620a94] Cleaned up temporary frame directory: temp_videos/be95cc0f-0dca-41c0-a89b-822971620a94
API_Transformers/logs/MiniCPM-V-4/20250819_013451.log ADDED
@@ -0,0 +1,454 @@
+ 2025-08-19 01:34:51 - INFO - Loading model: openbmb/MiniCPM-V-4
+ 2025-08-19 01:34:51 - INFO - vision_config is None, using default vision config
+ 2025-08-19 01:35:55 - INFO - Model loaded in 64.47 seconds
+ 2025-08-19 01:35:55 - INFO - GPU Memory Usage after model load: 7802.99 MB
+ 2025-08-19 01:36:23 - INFO - [5f513c59-eb71-4e8a-82d4-e0872d45ebdd] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
+ 2025-08-19 01:36:23 - INFO - [5f513c59-eb71-4e8a-82d4-e0872d45ebdd] Video saved to temporary file: temp_videos/5f513c59-eb71-4e8a-82d4-e0872d45ebdd.mp4
+ 2025-08-19 01:36:23 - INFO - [5f513c59-eb71-4e8a-82d4-e0872d45ebdd] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-19 01:36:26 - INFO - [5f513c59-eb71-4e8a-82d4-e0872d45ebdd] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-19 01:36:26 - INFO - [5f513c59-eb71-4e8a-82d4-e0872d45ebdd] 30 frames saved to temp_videos/5f513c59-eb71-4e8a-82d4-e0872d45ebdd
+ 2025-08-19 01:36:43 - INFO - vision_config is None, using default vision config
+ 2025-08-19 01:37:11 - INFO - Tokens per second: 10.464318161039833, Peak GPU memory MB: 11824.375
+ 2025-08-19 01:37:11 - INFO - [5f513c59-eb71-4e8a-82d4-e0872d45ebdd] Inference time: 47.79 seconds, CPU usage: 24.3%, CPU core utilization: [22.4, 30.4, 23.4, 21.0]
+ 2025-08-19 01:37:11 - INFO - [5f513c59-eb71-4e8a-82d4-e0872d45ebdd] Cleaned up temporary file: temp_videos/5f513c59-eb71-4e8a-82d4-e0872d45ebdd.mp4
+ 2025-08-19 01:37:11 - INFO - [5f513c59-eb71-4e8a-82d4-e0872d45ebdd] Cleaned up temporary frame directory: temp_videos/5f513c59-eb71-4e8a-82d4-e0872d45ebdd
+ 2025-08-19 01:37:11 - INFO - [fa0ae957-fdb5-40cc-95dd-32ca84f8be61] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_002.mp4'
+ 2025-08-19 01:37:11 - INFO - [fa0ae957-fdb5-40cc-95dd-32ca84f8be61] Video saved to temporary file: temp_videos/fa0ae957-fdb5-40cc-95dd-32ca84f8be61.mp4
+ 2025-08-19 01:37:11 - INFO - [fa0ae957-fdb5-40cc-95dd-32ca84f8be61] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-19 01:37:17 - INFO - [fa0ae957-fdb5-40cc-95dd-32ca84f8be61] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-19 01:37:17 - INFO - [fa0ae957-fdb5-40cc-95dd-32ca84f8be61] 30 frames saved to temp_videos/fa0ae957-fdb5-40cc-95dd-32ca84f8be61
+ 2025-08-19 01:37:30 - INFO - vision_config is None, using default vision config
+ 2025-08-19 01:38:05 - INFO - Tokens per second: 11.217051883159877, Peak GPU memory MB: 11824.375
+ 2025-08-19 01:38:05 - INFO - [fa0ae957-fdb5-40cc-95dd-32ca84f8be61] Inference time: 53.80 seconds, CPU usage: 37.5%, CPU core utilization: [37.0, 34.8, 51.4, 26.6]
+ 2025-08-19 01:38:05 - INFO - [fa0ae957-fdb5-40cc-95dd-32ca84f8be61] Cleaned up temporary file: temp_videos/fa0ae957-fdb5-40cc-95dd-32ca84f8be61.mp4
+ 2025-08-19 01:38:05 - INFO - [fa0ae957-fdb5-40cc-95dd-32ca84f8be61] Cleaned up temporary frame directory: temp_videos/fa0ae957-fdb5-40cc-95dd-32ca84f8be61
+ 2025-08-19 01:38:05 - INFO - [b55d97c9-7bdb-4950-b049-3d72db40e001] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_003.mp4'
+ 2025-08-19 01:38:05 - INFO - [b55d97c9-7bdb-4950-b049-3d72db40e001] Video saved to temporary file: temp_videos/b55d97c9-7bdb-4950-b049-3d72db40e001.mp4
+ 2025-08-19 01:38:05 - INFO - [b55d97c9-7bdb-4950-b049-3d72db40e001] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-19 01:38:10 - INFO - [b55d97c9-7bdb-4950-b049-3d72db40e001] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-19 01:38:10 - INFO - [b55d97c9-7bdb-4950-b049-3d72db40e001] 30 frames saved to temp_videos/b55d97c9-7bdb-4950-b049-3d72db40e001
+ 2025-08-19 01:38:23 - INFO - vision_config is None, using default vision config
+ 2025-08-19 01:38:47 - INFO - Tokens per second: 10.186586825003255, Peak GPU memory MB: 11824.375
+ 2025-08-19 01:38:47 - INFO - [b55d97c9-7bdb-4950-b049-3d72db40e001] Inference time: 42.50 seconds, CPU usage: 38.7%, CPU core utilization: [19.7, 39.0, 22.6, 73.4]
+ 2025-08-19 01:38:47 - INFO - [b55d97c9-7bdb-4950-b049-3d72db40e001] Cleaned up temporary file: temp_videos/b55d97c9-7bdb-4950-b049-3d72db40e001.mp4
+ 2025-08-19 01:38:47 - INFO - [b55d97c9-7bdb-4950-b049-3d72db40e001] Cleaned up temporary frame directory: temp_videos/b55d97c9-7bdb-4950-b049-3d72db40e001
+ 2025-08-19 01:38:48 - INFO - [67e694e2-b806-4fed-a8da-5238bdc3deba] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_004.mp4'
+ 2025-08-19 01:38:48 - INFO - [67e694e2-b806-4fed-a8da-5238bdc3deba] Video saved to temporary file: temp_videos/67e694e2-b806-4fed-a8da-5238bdc3deba.mp4
+ 2025-08-19 01:38:48 - INFO - [67e694e2-b806-4fed-a8da-5238bdc3deba] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-19 01:38:53 - INFO - [67e694e2-b806-4fed-a8da-5238bdc3deba] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-19 01:38:53 - INFO - [67e694e2-b806-4fed-a8da-5238bdc3deba] 30 frames saved to temp_videos/67e694e2-b806-4fed-a8da-5238bdc3deba
+ 2025-08-19 01:39:06 - INFO - vision_config is None, using default vision config
+ 2025-08-19 01:39:32 - INFO - Tokens per second: 10.441963736015857, Peak GPU memory MB: 11824.375
+ 2025-08-19 01:39:32 - INFO - [67e694e2-b806-4fed-a8da-5238bdc3deba] Inference time: 44.20 seconds, CPU usage: 37.8%, CPU core utilization: [45.7, 33.2, 52.3, 20.1]
43
+ 2025-08-19 01:39:32 - INFO - [67e694e2-b806-4fed-a8da-5238bdc3deba] Cleaned up temporary file: temp_videos/67e694e2-b806-4fed-a8da-5238bdc3deba.mp4
44
+ 2025-08-19 01:39:32 - INFO - [67e694e2-b806-4fed-a8da-5238bdc3deba] Cleaned up temporary frame directory: temp_videos/67e694e2-b806-4fed-a8da-5238bdc3deba
45
+ 2025-08-19 01:39:32 - INFO - [e48d00ae-e1ad-4f14-82d3-b63013965879] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_005.mp4'
46
+ 2025-08-19 01:39:32 - INFO - [e48d00ae-e1ad-4f14-82d3-b63013965879] Video saved to temporary file: temp_videos/e48d00ae-e1ad-4f14-82d3-b63013965879.mp4
47
+ 2025-08-19 01:39:32 - INFO - [e48d00ae-e1ad-4f14-82d3-b63013965879] Extracting frames using method: uniform, rate/threshold: 30
48
+ 2025-08-19 01:39:37 - INFO - [e48d00ae-e1ad-4f14-82d3-b63013965879] Extracted 30 frames successfully. Saving to temporary files...
49
+ 2025-08-19 01:39:37 - INFO - [e48d00ae-e1ad-4f14-82d3-b63013965879] 30 frames saved to temp_videos/e48d00ae-e1ad-4f14-82d3-b63013965879
50
+ 2025-08-19 01:39:50 - INFO - vision_config is None, using default vision config
51
+ 2025-08-19 01:40:24 - INFO - Tokens per second: 11.117427002797763, Peak GPU memory MB: 11824.375
52
+ 2025-08-19 01:40:24 - INFO - [e48d00ae-e1ad-4f14-82d3-b63013965879] Inference time: 52.02 seconds, CPU usage: 37.7%, CPU core utilization: [22.2, 47.8, 42.5, 38.0]
53
+ 2025-08-19 01:40:24 - INFO - [e48d00ae-e1ad-4f14-82d3-b63013965879] Cleaned up temporary file: temp_videos/e48d00ae-e1ad-4f14-82d3-b63013965879.mp4
54
+ 2025-08-19 01:40:24 - INFO - [e48d00ae-e1ad-4f14-82d3-b63013965879] Cleaned up temporary frame directory: temp_videos/e48d00ae-e1ad-4f14-82d3-b63013965879
55
+ 2025-08-19 01:40:24 - INFO - [a1477422-c3f9-4646-aa38-ab9853d12940] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_006.mp4'
56
+ 2025-08-19 01:40:24 - INFO - [a1477422-c3f9-4646-aa38-ab9853d12940] Video saved to temporary file: temp_videos/a1477422-c3f9-4646-aa38-ab9853d12940.mp4
57
+ 2025-08-19 01:40:24 - INFO - [a1477422-c3f9-4646-aa38-ab9853d12940] Extracting frames using method: uniform, rate/threshold: 30
58
+ 2025-08-19 01:40:29 - INFO - [a1477422-c3f9-4646-aa38-ab9853d12940] Extracted 30 frames successfully. Saving to temporary files...
59
+ 2025-08-19 01:40:29 - INFO - [a1477422-c3f9-4646-aa38-ab9853d12940] 30 frames saved to temp_videos/a1477422-c3f9-4646-aa38-ab9853d12940
60
+ 2025-08-19 01:40:42 - INFO - vision_config is None, using default vision config
61
+ 2025-08-19 01:41:12 - INFO - Tokens per second: 10.739343508452546, Peak GPU memory MB: 11824.375
62
+ 2025-08-19 01:41:12 - INFO - [a1477422-c3f9-4646-aa38-ab9853d12940] Inference time: 47.88 seconds, CPU usage: 49.1%, CPU core utilization: [40.6, 44.5, 64.1, 47.1]
63
+ 2025-08-19 01:41:12 - INFO - [a1477422-c3f9-4646-aa38-ab9853d12940] Cleaned up temporary file: temp_videos/a1477422-c3f9-4646-aa38-ab9853d12940.mp4
64
+ 2025-08-19 01:41:12 - INFO - [a1477422-c3f9-4646-aa38-ab9853d12940] Cleaned up temporary frame directory: temp_videos/a1477422-c3f9-4646-aa38-ab9853d12940
65
+ 2025-08-19 01:41:12 - INFO - [f193d31f-d522-4a86-b522-2a418a14e805] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_007.mp4'
66
+ 2025-08-19 01:41:12 - INFO - [f193d31f-d522-4a86-b522-2a418a14e805] Video saved to temporary file: temp_videos/f193d31f-d522-4a86-b522-2a418a14e805.mp4
67
+ 2025-08-19 01:41:12 - INFO - [f193d31f-d522-4a86-b522-2a418a14e805] Extracting frames using method: uniform, rate/threshold: 30
68
+ 2025-08-19 01:41:20 - INFO - [f193d31f-d522-4a86-b522-2a418a14e805] Extracted 30 frames successfully. Saving to temporary files...
69
+ 2025-08-19 01:41:20 - INFO - [f193d31f-d522-4a86-b522-2a418a14e805] 30 frames saved to temp_videos/f193d31f-d522-4a86-b522-2a418a14e805
70
+ 2025-08-19 01:41:33 - INFO - vision_config is None, using default vision config
71
+ 2025-08-19 01:41:59 - INFO - Tokens per second: 10.35898493058187, Peak GPU memory MB: 11824.375
72
+ 2025-08-19 01:41:59 - INFO - [f193d31f-d522-4a86-b522-2a418a14e805] Inference time: 47.03 seconds, CPU usage: 45.1%, CPU core utilization: [40.8, 41.1, 35.0, 63.3]
73
+ 2025-08-19 01:41:59 - INFO - [f193d31f-d522-4a86-b522-2a418a14e805] Cleaned up temporary file: temp_videos/f193d31f-d522-4a86-b522-2a418a14e805.mp4
74
+ 2025-08-19 01:41:59 - INFO - [f193d31f-d522-4a86-b522-2a418a14e805] Cleaned up temporary frame directory: temp_videos/f193d31f-d522-4a86-b522-2a418a14e805
75
+ 2025-08-19 01:41:59 - INFO - [91650f5d-a4b0-4697-a296-99c3924d4e4e] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_008.mp4'
76
+ 2025-08-19 01:41:59 - INFO - [91650f5d-a4b0-4697-a296-99c3924d4e4e] Video saved to temporary file: temp_videos/91650f5d-a4b0-4697-a296-99c3924d4e4e.mp4
77
+ 2025-08-19 01:41:59 - INFO - [91650f5d-a4b0-4697-a296-99c3924d4e4e] Extracting frames using method: uniform, rate/threshold: 30
78
+ 2025-08-19 01:42:04 - INFO - [91650f5d-a4b0-4697-a296-99c3924d4e4e] Extracted 30 frames successfully. Saving to temporary files...
79
+ 2025-08-19 01:42:04 - INFO - [91650f5d-a4b0-4697-a296-99c3924d4e4e] 30 frames saved to temp_videos/91650f5d-a4b0-4697-a296-99c3924d4e4e
80
+ 2025-08-19 01:42:17 - INFO - vision_config is None, using default vision config
81
+ 2025-08-19 01:42:42 - INFO - Tokens per second: 10.245622255156498, Peak GPU memory MB: 11824.375
82
+ 2025-08-19 01:42:42 - INFO - [91650f5d-a4b0-4697-a296-99c3924d4e4e] Inference time: 42.95 seconds, CPU usage: 37.7%, CPU core utilization: [19.0, 25.2, 47.2, 59.2]
83
+ 2025-08-19 01:42:42 - INFO - [91650f5d-a4b0-4697-a296-99c3924d4e4e] Cleaned up temporary file: temp_videos/91650f5d-a4b0-4697-a296-99c3924d4e4e.mp4
84
+ 2025-08-19 01:42:42 - INFO - [91650f5d-a4b0-4697-a296-99c3924d4e4e] Cleaned up temporary frame directory: temp_videos/91650f5d-a4b0-4697-a296-99c3924d4e4e
85
+ 2025-08-19 01:42:42 - INFO - [62d9d4de-0e76-48c3-aadc-bd1d6ff2a7b6] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_009.mp4'
86
+ 2025-08-19 01:42:42 - INFO - [62d9d4de-0e76-48c3-aadc-bd1d6ff2a7b6] Video saved to temporary file: temp_videos/62d9d4de-0e76-48c3-aadc-bd1d6ff2a7b6.mp4
87
+ 2025-08-19 01:42:42 - INFO - [62d9d4de-0e76-48c3-aadc-bd1d6ff2a7b6] Extracting frames using method: uniform, rate/threshold: 30
88
+ 2025-08-19 01:42:46 - INFO - [62d9d4de-0e76-48c3-aadc-bd1d6ff2a7b6] Extracted 30 frames successfully. Saving to temporary files...
89
+ 2025-08-19 01:42:46 - INFO - [62d9d4de-0e76-48c3-aadc-bd1d6ff2a7b6] 30 frames saved to temp_videos/62d9d4de-0e76-48c3-aadc-bd1d6ff2a7b6
90
+ 2025-08-19 01:42:59 - INFO - vision_config is None, using default vision config
91
+ 2025-08-19 01:43:20 - INFO - Tokens per second: 9.668944653220052, Peak GPU memory MB: 11824.375
92
+ 2025-08-19 01:43:20 - INFO - [62d9d4de-0e76-48c3-aadc-bd1d6ff2a7b6] Inference time: 38.62 seconds, CPU usage: 37.4%, CPU core utilization: [29.2, 59.9, 33.9, 26.4]
93
+ 2025-08-19 01:43:20 - INFO - [62d9d4de-0e76-48c3-aadc-bd1d6ff2a7b6] Cleaned up temporary file: temp_videos/62d9d4de-0e76-48c3-aadc-bd1d6ff2a7b6.mp4
94
+ 2025-08-19 01:43:20 - INFO - [62d9d4de-0e76-48c3-aadc-bd1d6ff2a7b6] Cleaned up temporary frame directory: temp_videos/62d9d4de-0e76-48c3-aadc-bd1d6ff2a7b6
95
+ 2025-08-19 01:43:20 - INFO - [28b04049-5753-4381-96f8-648268642404] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_010.mp4'
96
+ 2025-08-19 01:43:20 - INFO - [28b04049-5753-4381-96f8-648268642404] Video saved to temporary file: temp_videos/28b04049-5753-4381-96f8-648268642404.mp4
97
+ 2025-08-19 01:43:20 - INFO - [28b04049-5753-4381-96f8-648268642404] Extracting frames using method: uniform, rate/threshold: 30
98
+ 2025-08-19 01:43:26 - INFO - [28b04049-5753-4381-96f8-648268642404] Extracted 30 frames successfully. Saving to temporary files...
99
+ 2025-08-19 01:43:26 - INFO - [28b04049-5753-4381-96f8-648268642404] 30 frames saved to temp_videos/28b04049-5753-4381-96f8-648268642404
100
+ 2025-08-19 01:43:39 - INFO - vision_config is None, using default vision config
101
+ 2025-08-19 01:44:07 - INFO - Tokens per second: 10.571052792003025, Peak GPU memory MB: 11824.375
102
+ 2025-08-19 01:44:07 - INFO - [28b04049-5753-4381-96f8-648268642404] Inference time: 46.56 seconds, CPU usage: 58.2%, CPU core utilization: [47.8, 64.3, 48.6, 72.1]
103
+ 2025-08-19 01:44:07 - INFO - [28b04049-5753-4381-96f8-648268642404] Cleaned up temporary file: temp_videos/28b04049-5753-4381-96f8-648268642404.mp4
104
+ 2025-08-19 01:44:07 - INFO - [28b04049-5753-4381-96f8-648268642404] Cleaned up temporary frame directory: temp_videos/28b04049-5753-4381-96f8-648268642404
105
+ 2025-08-19 01:44:07 - INFO - [b6c7b63d-5909-4e4d-a822-f476e3891ec5] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_011.mp4'
106
+ 2025-08-19 01:44:07 - INFO - [b6c7b63d-5909-4e4d-a822-f476e3891ec5] Video saved to temporary file: temp_videos/b6c7b63d-5909-4e4d-a822-f476e3891ec5.mp4
107
+ 2025-08-19 01:44:07 - INFO - [b6c7b63d-5909-4e4d-a822-f476e3891ec5] Extracting frames using method: uniform, rate/threshold: 30
108
+ 2025-08-19 01:44:13 - INFO - [b6c7b63d-5909-4e4d-a822-f476e3891ec5] Extracted 30 frames successfully. Saving to temporary files...
109
+ 2025-08-19 01:44:13 - INFO - [b6c7b63d-5909-4e4d-a822-f476e3891ec5] 30 frames saved to temp_videos/b6c7b63d-5909-4e4d-a822-f476e3891ec5
110
+ 2025-08-19 01:44:26 - INFO - vision_config is None, using default vision config
111
+ 2025-08-19 01:44:53 - INFO - Tokens per second: 10.543281182614821, Peak GPU memory MB: 11824.375
112
+ 2025-08-19 01:44:53 - INFO - [b6c7b63d-5909-4e4d-a822-f476e3891ec5] Inference time: 45.94 seconds, CPU usage: 37.9%, CPU core utilization: [35.8, 33.7, 38.4, 43.8]
113
+ 2025-08-19 01:44:53 - INFO - [b6c7b63d-5909-4e4d-a822-f476e3891ec5] Cleaned up temporary file: temp_videos/b6c7b63d-5909-4e4d-a822-f476e3891ec5.mp4
114
+ 2025-08-19 01:44:53 - INFO - [b6c7b63d-5909-4e4d-a822-f476e3891ec5] Cleaned up temporary frame directory: temp_videos/b6c7b63d-5909-4e4d-a822-f476e3891ec5
115
+ 2025-08-19 01:44:53 - INFO - [c21259e8-cf6f-4def-8187-054b3e96dad1] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_012.mp4'
116
+ 2025-08-19 01:44:53 - INFO - [c21259e8-cf6f-4def-8187-054b3e96dad1] Video saved to temporary file: temp_videos/c21259e8-cf6f-4def-8187-054b3e96dad1.mp4
117
+ 2025-08-19 01:44:53 - INFO - [c21259e8-cf6f-4def-8187-054b3e96dad1] Extracting frames using method: uniform, rate/threshold: 30
118
+ 2025-08-19 01:44:59 - INFO - [c21259e8-cf6f-4def-8187-054b3e96dad1] Extracted 30 frames successfully. Saving to temporary files...
119
+ 2025-08-19 01:44:59 - INFO - [c21259e8-cf6f-4def-8187-054b3e96dad1] 30 frames saved to temp_videos/c21259e8-cf6f-4def-8187-054b3e96dad1
120
+ 2025-08-19 01:45:12 - INFO - vision_config is None, using default vision config
121
+ 2025-08-19 01:45:42 - INFO - Tokens per second: 10.81279707721416, Peak GPU memory MB: 11824.375
122
+ 2025-08-19 01:45:42 - INFO - [c21259e8-cf6f-4def-8187-054b3e96dad1] Inference time: 48.60 seconds, CPU usage: 35.6%, CPU core utilization: [25.0, 52.5, 36.8, 28.1]
123
+ 2025-08-19 01:45:42 - INFO - [c21259e8-cf6f-4def-8187-054b3e96dad1] Cleaned up temporary file: temp_videos/c21259e8-cf6f-4def-8187-054b3e96dad1.mp4
124
+ 2025-08-19 01:45:42 - INFO - [c21259e8-cf6f-4def-8187-054b3e96dad1] Cleaned up temporary frame directory: temp_videos/c21259e8-cf6f-4def-8187-054b3e96dad1
125
+ 2025-08-19 01:45:42 - INFO - [2893b5f9-ff91-49b9-a805-5cfb913513dc] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_013.mp4'
126
+ 2025-08-19 01:45:42 - INFO - [2893b5f9-ff91-49b9-a805-5cfb913513dc] Video saved to temporary file: temp_videos/2893b5f9-ff91-49b9-a805-5cfb913513dc.mp4
127
+ 2025-08-19 01:45:42 - INFO - [2893b5f9-ff91-49b9-a805-5cfb913513dc] Extracting frames using method: uniform, rate/threshold: 30
128
+ 2025-08-19 01:45:47 - INFO - [2893b5f9-ff91-49b9-a805-5cfb913513dc] Extracted 30 frames successfully. Saving to temporary files...
129
+ 2025-08-19 01:45:47 - INFO - [2893b5f9-ff91-49b9-a805-5cfb913513dc] 30 frames saved to temp_videos/2893b5f9-ff91-49b9-a805-5cfb913513dc
130
+ 2025-08-19 01:46:00 - INFO - vision_config is None, using default vision config
131
+ 2025-08-19 01:46:29 - INFO - Tokens per second: 10.750196309237642, Peak GPU memory MB: 11824.375
132
+ 2025-08-19 01:46:29 - INFO - [2893b5f9-ff91-49b9-a805-5cfb913513dc] Inference time: 47.04 seconds, CPU usage: 33.7%, CPU core utilization: [22.7, 33.2, 16.0, 63.0]
133
+ 2025-08-19 01:46:29 - INFO - [2893b5f9-ff91-49b9-a805-5cfb913513dc] Cleaned up temporary file: temp_videos/2893b5f9-ff91-49b9-a805-5cfb913513dc.mp4
134
+ 2025-08-19 01:46:29 - INFO - [2893b5f9-ff91-49b9-a805-5cfb913513dc] Cleaned up temporary frame directory: temp_videos/2893b5f9-ff91-49b9-a805-5cfb913513dc
135
+ 2025-08-19 01:46:29 - INFO - [6a1d46d1-8d12-4277-b881-67852e9ec9fc] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_014.mp4'
136
+ 2025-08-19 01:46:29 - INFO - [6a1d46d1-8d12-4277-b881-67852e9ec9fc] Video saved to temporary file: temp_videos/6a1d46d1-8d12-4277-b881-67852e9ec9fc.mp4
137
+ 2025-08-19 01:46:29 - INFO - [6a1d46d1-8d12-4277-b881-67852e9ec9fc] Extracting frames using method: uniform, rate/threshold: 30
138
+ 2025-08-19 01:46:33 - INFO - [6a1d46d1-8d12-4277-b881-67852e9ec9fc] Extracted 30 frames successfully. Saving to temporary files...
139
+ 2025-08-19 01:46:33 - INFO - [6a1d46d1-8d12-4277-b881-67852e9ec9fc] 30 frames saved to temp_videos/6a1d46d1-8d12-4277-b881-67852e9ec9fc
140
+ 2025-08-19 01:46:46 - INFO - vision_config is None, using default vision config
141
+ 2025-08-19 01:47:09 - INFO - Tokens per second: 9.95437319443051, Peak GPU memory MB: 11824.375
142
+ 2025-08-19 01:47:09 - INFO - [6a1d46d1-8d12-4277-b881-67852e9ec9fc] Inference time: 40.29 seconds, CPU usage: 33.9%, CPU core utilization: [14.9, 36.2, 35.4, 49.2]
143
+ 2025-08-19 01:47:09 - INFO - [6a1d46d1-8d12-4277-b881-67852e9ec9fc] Cleaned up temporary file: temp_videos/6a1d46d1-8d12-4277-b881-67852e9ec9fc.mp4
144
+ 2025-08-19 01:47:09 - INFO - [6a1d46d1-8d12-4277-b881-67852e9ec9fc] Cleaned up temporary frame directory: temp_videos/6a1d46d1-8d12-4277-b881-67852e9ec9fc
145
+ 2025-08-19 01:47:09 - INFO - [b2144dd4-543c-49c4-b47a-ac4a270fbb05] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_015.mp4'
146
+ 2025-08-19 01:47:09 - INFO - [b2144dd4-543c-49c4-b47a-ac4a270fbb05] Video saved to temporary file: temp_videos/b2144dd4-543c-49c4-b47a-ac4a270fbb05.mp4
147
+ 2025-08-19 01:47:09 - INFO - [b2144dd4-543c-49c4-b47a-ac4a270fbb05] Extracting frames using method: uniform, rate/threshold: 30
148
+ 2025-08-19 01:47:14 - INFO - [b2144dd4-543c-49c4-b47a-ac4a270fbb05] Extracted 30 frames successfully. Saving to temporary files...
149
+ 2025-08-19 01:47:15 - INFO - [b2144dd4-543c-49c4-b47a-ac4a270fbb05] 30 frames saved to temp_videos/b2144dd4-543c-49c4-b47a-ac4a270fbb05
150
+ 2025-08-19 01:47:27 - INFO - vision_config is None, using default vision config
151
+ 2025-08-19 01:47:56 - INFO - Tokens per second: 10.678644069294082, Peak GPU memory MB: 11824.375
152
+ 2025-08-19 01:47:56 - INFO - [b2144dd4-543c-49c4-b47a-ac4a270fbb05] Inference time: 46.50 seconds, CPU usage: 34.2%, CPU core utilization: [30.7, 18.2, 50.7, 37.2]
153
+ 2025-08-19 01:47:56 - INFO - [b2144dd4-543c-49c4-b47a-ac4a270fbb05] Cleaned up temporary file: temp_videos/b2144dd4-543c-49c4-b47a-ac4a270fbb05.mp4
154
+ 2025-08-19 01:47:56 - INFO - [b2144dd4-543c-49c4-b47a-ac4a270fbb05] Cleaned up temporary frame directory: temp_videos/b2144dd4-543c-49c4-b47a-ac4a270fbb05
155
+ 2025-08-19 01:47:56 - INFO - [b968ad5e-f870-4d22-9878-aa9b5b0a119d] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_016.mp4'
156
+ 2025-08-19 01:47:56 - INFO - [b968ad5e-f870-4d22-9878-aa9b5b0a119d] Video saved to temporary file: temp_videos/b968ad5e-f870-4d22-9878-aa9b5b0a119d.mp4
157
+ 2025-08-19 01:47:56 - INFO - [b968ad5e-f870-4d22-9878-aa9b5b0a119d] Extracting frames using method: uniform, rate/threshold: 30
158
+ 2025-08-19 01:48:00 - INFO - [b968ad5e-f870-4d22-9878-aa9b5b0a119d] Extracted 30 frames successfully. Saving to temporary files...
159
+ 2025-08-19 01:48:00 - INFO - [b968ad5e-f870-4d22-9878-aa9b5b0a119d] 30 frames saved to temp_videos/b968ad5e-f870-4d22-9878-aa9b5b0a119d
160
+ 2025-08-19 01:48:13 - INFO - vision_config is None, using default vision config
161
+ 2025-08-19 01:48:53 - INFO - Tokens per second: 11.436392353304559, Peak GPU memory MB: 11824.375
162
+ 2025-08-19 01:48:53 - INFO - [b968ad5e-f870-4d22-9878-aa9b5b0a119d] Inference time: 57.42 seconds, CPU usage: 31.0%, CPU core utilization: [32.8, 51.8, 16.3, 23.3]
163
+ 2025-08-19 01:48:53 - INFO - [b968ad5e-f870-4d22-9878-aa9b5b0a119d] Cleaned up temporary file: temp_videos/b968ad5e-f870-4d22-9878-aa9b5b0a119d.mp4
164
+ 2025-08-19 01:48:53 - INFO - [b968ad5e-f870-4d22-9878-aa9b5b0a119d] Cleaned up temporary frame directory: temp_videos/b968ad5e-f870-4d22-9878-aa9b5b0a119d
165
+ 2025-08-19 01:48:53 - INFO - [3ddab2f0-f085-4d1f-8968-7a2815622372] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_017.mp4'
166
+ 2025-08-19 01:48:53 - INFO - [3ddab2f0-f085-4d1f-8968-7a2815622372] Video saved to temporary file: temp_videos/3ddab2f0-f085-4d1f-8968-7a2815622372.mp4
167
+ 2025-08-19 01:48:53 - INFO - [3ddab2f0-f085-4d1f-8968-7a2815622372] Extracting frames using method: uniform, rate/threshold: 30
168
+ 2025-08-19 01:48:59 - INFO - [3ddab2f0-f085-4d1f-8968-7a2815622372] Extracted 30 frames successfully. Saving to temporary files...
169
+ 2025-08-19 01:48:59 - INFO - [3ddab2f0-f085-4d1f-8968-7a2815622372] 30 frames saved to temp_videos/3ddab2f0-f085-4d1f-8968-7a2815622372
170
+ 2025-08-19 01:49:12 - INFO - vision_config is None, using default vision config
171
+ 2025-08-19 01:49:43 - INFO - Tokens per second: 10.977650622627742, Peak GPU memory MB: 11824.375
172
+ 2025-08-19 01:49:43 - INFO - [3ddab2f0-f085-4d1f-8968-7a2815622372] Inference time: 50.17 seconds, CPU usage: 33.8%, CPU core utilization: [29.7, 20.5, 31.8, 53.2]
173
+ 2025-08-19 01:49:43 - INFO - [3ddab2f0-f085-4d1f-8968-7a2815622372] Cleaned up temporary file: temp_videos/3ddab2f0-f085-4d1f-8968-7a2815622372.mp4
174
+ 2025-08-19 01:49:43 - INFO - [3ddab2f0-f085-4d1f-8968-7a2815622372] Cleaned up temporary frame directory: temp_videos/3ddab2f0-f085-4d1f-8968-7a2815622372
175
+ 2025-08-19 01:49:43 - INFO - [64082720-8509-4a38-a4e8-21caaaa28d68] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_018.mp4'
176
+ 2025-08-19 01:49:43 - INFO - [64082720-8509-4a38-a4e8-21caaaa28d68] Video saved to temporary file: temp_videos/64082720-8509-4a38-a4e8-21caaaa28d68.mp4
177
+ 2025-08-19 01:49:43 - INFO - [64082720-8509-4a38-a4e8-21caaaa28d68] Extracting frames using method: uniform, rate/threshold: 30
178
+ 2025-08-19 01:49:48 - INFO - [64082720-8509-4a38-a4e8-21caaaa28d68] Extracted 30 frames successfully. Saving to temporary files...
179
+ 2025-08-19 01:49:48 - INFO - [64082720-8509-4a38-a4e8-21caaaa28d68] 30 frames saved to temp_videos/64082720-8509-4a38-a4e8-21caaaa28d68
180
+ 2025-08-19 01:50:01 - INFO - vision_config is None, using default vision config
181
+ 2025-08-19 01:50:30 - INFO - Tokens per second: 10.787881404092886, Peak GPU memory MB: 11824.375
182
+ 2025-08-19 01:50:30 - INFO - [64082720-8509-4a38-a4e8-21caaaa28d68] Inference time: 46.92 seconds, CPU usage: 33.3%, CPU core utilization: [15.0, 23.0, 64.7, 30.6]
183
+ 2025-08-19 01:50:30 - INFO - [64082720-8509-4a38-a4e8-21caaaa28d68] Cleaned up temporary file: temp_videos/64082720-8509-4a38-a4e8-21caaaa28d68.mp4
184
+ 2025-08-19 01:50:30 - INFO - [64082720-8509-4a38-a4e8-21caaaa28d68] Cleaned up temporary frame directory: temp_videos/64082720-8509-4a38-a4e8-21caaaa28d68
185
+ 2025-08-19 01:50:30 - INFO - [287d15c6-2c14-4a24-8d66-ea1a1c087723] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_019.mp4'
186
+ 2025-08-19 01:50:30 - INFO - [287d15c6-2c14-4a24-8d66-ea1a1c087723] Video saved to temporary file: temp_videos/287d15c6-2c14-4a24-8d66-ea1a1c087723.mp4
187
+ 2025-08-19 01:50:30 - INFO - [287d15c6-2c14-4a24-8d66-ea1a1c087723] Extracting frames using method: uniform, rate/threshold: 30
188
+ 2025-08-19 01:50:36 - INFO - [287d15c6-2c14-4a24-8d66-ea1a1c087723] Extracted 30 frames successfully. Saving to temporary files...
189
+ 2025-08-19 01:50:36 - INFO - [287d15c6-2c14-4a24-8d66-ea1a1c087723] 30 frames saved to temp_videos/287d15c6-2c14-4a24-8d66-ea1a1c087723
190
+ 2025-08-19 01:50:49 - INFO - vision_config is None, using default vision config
191
+ 2025-08-19 01:51:16 - INFO - Tokens per second: 10.528072417044749, Peak GPU memory MB: 11824.375
192
+ 2025-08-19 01:51:16 - INFO - [287d15c6-2c14-4a24-8d66-ea1a1c087723] Inference time: 45.31 seconds, CPU usage: 34.1%, CPU core utilization: [66.5, 26.9, 29.2, 13.9]
193
+ 2025-08-19 01:51:16 - INFO - [287d15c6-2c14-4a24-8d66-ea1a1c087723] Cleaned up temporary file: temp_videos/287d15c6-2c14-4a24-8d66-ea1a1c087723.mp4
194
+ 2025-08-19 01:51:16 - INFO - [287d15c6-2c14-4a24-8d66-ea1a1c087723] Cleaned up temporary frame directory: temp_videos/287d15c6-2c14-4a24-8d66-ea1a1c087723
195
+ 2025-08-19 01:51:16 - INFO - [baccf9ea-be8e-4c3e-bd4d-44d91751d8a3] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_020.mp4'
196
+ 2025-08-19 01:51:16 - INFO - [baccf9ea-be8e-4c3e-bd4d-44d91751d8a3] Video saved to temporary file: temp_videos/baccf9ea-be8e-4c3e-bd4d-44d91751d8a3.mp4
197
+ 2025-08-19 01:51:16 - INFO - [baccf9ea-be8e-4c3e-bd4d-44d91751d8a3] Extracting frames using method: uniform, rate/threshold: 30
198
+ 2025-08-19 01:51:21 - INFO - [baccf9ea-be8e-4c3e-bd4d-44d91751d8a3] Extracted 30 frames successfully. Saving to temporary files...
199
+ 2025-08-19 01:51:21 - INFO - [baccf9ea-be8e-4c3e-bd4d-44d91751d8a3] 30 frames saved to temp_videos/baccf9ea-be8e-4c3e-bd4d-44d91751d8a3
200
+ 2025-08-19 01:51:34 - INFO - vision_config is None, using default vision config
201
+ 2025-08-19 01:51:56 - INFO - Tokens per second: 10.013143127227265, Peak GPU memory MB: 11824.375
202
+ 2025-08-19 01:51:56 - INFO - [baccf9ea-be8e-4c3e-bd4d-44d91751d8a3] Inference time: 40.79 seconds, CPU usage: 50.2%, CPU core utilization: [55.2, 35.5, 70.4, 39.7]
203
+ 2025-08-19 01:51:56 - INFO - [baccf9ea-be8e-4c3e-bd4d-44d91751d8a3] Cleaned up temporary file: temp_videos/baccf9ea-be8e-4c3e-bd4d-44d91751d8a3.mp4
204
+ 2025-08-19 01:51:56 - INFO - [baccf9ea-be8e-4c3e-bd4d-44d91751d8a3] Cleaned up temporary frame directory: temp_videos/baccf9ea-be8e-4c3e-bd4d-44d91751d8a3
205
+ 2025-08-19 01:51:57 - INFO - [74d5f1c8-04e1-4be4-b8c6-ff57447ffe57] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_021.mp4'
206
+ 2025-08-19 01:51:57 - INFO - [74d5f1c8-04e1-4be4-b8c6-ff57447ffe57] Video saved to temporary file: temp_videos/74d5f1c8-04e1-4be4-b8c6-ff57447ffe57.mp4
207
+ 2025-08-19 01:51:57 - INFO - [74d5f1c8-04e1-4be4-b8c6-ff57447ffe57] Extracting frames using method: uniform, rate/threshold: 30
208
+ 2025-08-19 01:52:02 - INFO - [74d5f1c8-04e1-4be4-b8c6-ff57447ffe57] Extracted 30 frames successfully. Saving to temporary files...
209
+ 2025-08-19 01:52:02 - INFO - [74d5f1c8-04e1-4be4-b8c6-ff57447ffe57] 30 frames saved to temp_videos/74d5f1c8-04e1-4be4-b8c6-ff57447ffe57
210
+ 2025-08-19 01:52:15 - INFO - vision_config is None, using default vision config
211
+ 2025-08-19 01:52:44 - INFO - Tokens per second: 10.80753428151764, Peak GPU memory MB: 11824.375
212
+ 2025-08-19 01:52:44 - INFO - [74d5f1c8-04e1-4be4-b8c6-ff57447ffe57] Inference time: 47.44 seconds, CPU usage: 33.8%, CPU core utilization: [21.7, 32.7, 49.9, 30.7]
213
+ 2025-08-19 01:52:44 - INFO - [74d5f1c8-04e1-4be4-b8c6-ff57447ffe57] Cleaned up temporary file: temp_videos/74d5f1c8-04e1-4be4-b8c6-ff57447ffe57.mp4
214
+ 2025-08-19 01:52:44 - INFO - [74d5f1c8-04e1-4be4-b8c6-ff57447ffe57] Cleaned up temporary frame directory: temp_videos/74d5f1c8-04e1-4be4-b8c6-ff57447ffe57
215
+ 2025-08-19 01:52:44 - INFO - [3fa3173d-a3e7-4543-b51b-0740cf6590fe] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_022.mp4'
216
+ 2025-08-19 01:52:44 - INFO - [3fa3173d-a3e7-4543-b51b-0740cf6590fe] Video saved to temporary file: temp_videos/3fa3173d-a3e7-4543-b51b-0740cf6590fe.mp4
217
+ 2025-08-19 01:52:44 - INFO - [3fa3173d-a3e7-4543-b51b-0740cf6590fe] Extracting frames using method: uniform, rate/threshold: 30
218
+ 2025-08-19 01:52:49 - INFO - [3fa3173d-a3e7-4543-b51b-0740cf6590fe] Extracted 30 frames successfully. Saving to temporary files...
219
+ 2025-08-19 01:52:49 - INFO - [3fa3173d-a3e7-4543-b51b-0740cf6590fe] 30 frames saved to temp_videos/3fa3173d-a3e7-4543-b51b-0740cf6590fe
220
+ 2025-08-19 01:53:02 - INFO - vision_config is None, using default vision config
221
+ 2025-08-19 01:53:38 - INFO - Tokens per second: 11.271056271043205, Peak GPU memory MB: 11824.375
222
+ 2025-08-19 01:53:38 - INFO - [3fa3173d-a3e7-4543-b51b-0740cf6590fe] Inference time: 54.02 seconds, CPU usage: 32.1%, CPU core utilization: [14.2, 64.7, 38.0, 11.4]
223
+ 2025-08-19 01:53:38 - INFO - [3fa3173d-a3e7-4543-b51b-0740cf6590fe] Cleaned up temporary file: temp_videos/3fa3173d-a3e7-4543-b51b-0740cf6590fe.mp4
224
+ 2025-08-19 01:53:38 - INFO - [3fa3173d-a3e7-4543-b51b-0740cf6590fe] Cleaned up temporary frame directory: temp_videos/3fa3173d-a3e7-4543-b51b-0740cf6590fe
225
+ 2025-08-19 01:53:38 - INFO - [f2c7c4ac-514a-4ba2-acaa-363c9f2e16c1] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_023.mp4'
226
+ 2025-08-19 01:53:38 - INFO - [f2c7c4ac-514a-4ba2-acaa-363c9f2e16c1] Video saved to temporary file: temp_videos/f2c7c4ac-514a-4ba2-acaa-363c9f2e16c1.mp4
227
+ 2025-08-19 01:53:38 - INFO - [f2c7c4ac-514a-4ba2-acaa-363c9f2e16c1] Extracting frames using method: uniform, rate/threshold: 30
228
+ 2025-08-19 01:53:43 - INFO - [f2c7c4ac-514a-4ba2-acaa-363c9f2e16c1] Extracted 30 frames successfully. Saving to temporary files...
229
+ 2025-08-19 01:53:43 - INFO - [f2c7c4ac-514a-4ba2-acaa-363c9f2e16c1] 30 frames saved to temp_videos/f2c7c4ac-514a-4ba2-acaa-363c9f2e16c1
230
+ 2025-08-19 01:53:56 - INFO - vision_config is None, using default vision config
231
+ 2025-08-19 01:54:23 - INFO - Tokens per second: 10.584057777083409, Peak GPU memory MB: 11824.375
232
+ 2025-08-19 01:54:23 - INFO - [f2c7c4ac-514a-4ba2-acaa-363c9f2e16c1] Inference time: 45.30 seconds, CPU usage: 33.8%, CPU core utilization: [27.8, 11.4, 55.2, 40.8]
233
+ 2025-08-19 01:54:23 - INFO - [f2c7c4ac-514a-4ba2-acaa-363c9f2e16c1] Cleaned up temporary file: temp_videos/f2c7c4ac-514a-4ba2-acaa-363c9f2e16c1.mp4
234
+ 2025-08-19 01:54:23 - INFO - [f2c7c4ac-514a-4ba2-acaa-363c9f2e16c1] Cleaned up temporary frame directory: temp_videos/f2c7c4ac-514a-4ba2-acaa-363c9f2e16c1
235
+ 2025-08-19 01:54:23 - INFO - [49e141c3-89cb-4d64-8e6c-fe5f5be26dc0] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_024.mp4'
236
+ 2025-08-19 01:54:23 - INFO - [49e141c3-89cb-4d64-8e6c-fe5f5be26dc0] Video saved to temporary file: temp_videos/49e141c3-89cb-4d64-8e6c-fe5f5be26dc0.mp4
237
+ 2025-08-19 01:54:23 - INFO - [49e141c3-89cb-4d64-8e6c-fe5f5be26dc0] Extracting frames using method: uniform, rate/threshold: 30
238
+ 2025-08-19 01:54:28 - INFO - [49e141c3-89cb-4d64-8e6c-fe5f5be26dc0] Extracted 30 frames successfully. Saving to temporary files...
239
+ 2025-08-19 01:54:28 - INFO - [49e141c3-89cb-4d64-8e6c-fe5f5be26dc0] 30 frames saved to temp_videos/49e141c3-89cb-4d64-8e6c-fe5f5be26dc0
240
+ 2025-08-19 01:54:41 - INFO - vision_config is None, using default vision config
241
+ 2025-08-19 01:55:07 - INFO - Tokens per second: 10.459799750467855, Peak GPU memory MB: 11824.375
242
+ 2025-08-19 01:55:07 - INFO - [49e141c3-89cb-4d64-8e6c-fe5f5be26dc0] Inference time: 43.99 seconds, CPU usage: 33.8%, CPU core utilization: [22.6, 28.1, 58.6, 25.9]
243
+ 2025-08-19 01:55:07 - INFO - [49e141c3-89cb-4d64-8e6c-fe5f5be26dc0] Cleaned up temporary file: temp_videos/49e141c3-89cb-4d64-8e6c-fe5f5be26dc0.mp4
244
+ 2025-08-19 01:55:07 - INFO - [49e141c3-89cb-4d64-8e6c-fe5f5be26dc0] Cleaned up temporary frame directory: temp_videos/49e141c3-89cb-4d64-8e6c-fe5f5be26dc0
245
+ 2025-08-19 01:55:49 - INFO - [8582051e-6ca3-4710-b3fa-332e5371ab3a] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
246
+ 2025-08-19 01:55:49 - INFO - [8582051e-6ca3-4710-b3fa-332e5371ab3a] Video saved to temporary file: temp_videos/8582051e-6ca3-4710-b3fa-332e5371ab3a.mp4
247
+ 2025-08-19 01:55:49 - INFO - [8582051e-6ca3-4710-b3fa-332e5371ab3a] Extracting frames using method: uniform, rate/threshold: 30
248
+ 2025-08-19 01:55:52 - INFO - [8582051e-6ca3-4710-b3fa-332e5371ab3a] Extracted 30 frames successfully. Saving to temporary files...
249
+ 2025-08-19 01:55:52 - INFO - [8582051e-6ca3-4710-b3fa-332e5371ab3a] 30 frames saved to temp_videos/8582051e-6ca3-4710-b3fa-332e5371ab3a
250
+ 2025-08-19 01:56:05 - INFO - vision_config is None, using default vision config
251
+ 2025-08-19 01:56:31 - INFO - Tokens per second: 10.459373498424311, Peak GPU memory MB: 11824.375
252
+ 2025-08-19 01:56:31 - INFO - [8582051e-6ca3-4710-b3fa-332e5371ab3a] Inference time: 41.89 seconds, CPU usage: 17.0%, CPU core utilization: [15.1, 18.5, 27.0, 7.5]
253
+ 2025-08-19 01:56:31 - INFO - [8582051e-6ca3-4710-b3fa-332e5371ab3a] Cleaned up temporary file: temp_videos/8582051e-6ca3-4710-b3fa-332e5371ab3a.mp4
254
+ 2025-08-19 01:56:31 - INFO - [8582051e-6ca3-4710-b3fa-332e5371ab3a] Cleaned up temporary frame directory: temp_videos/8582051e-6ca3-4710-b3fa-332e5371ab3a
255
+ 2025-08-19 01:56:31 - INFO - [1ea63879-2b96-4512-843c-4c2fe0b32d56] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_002.mp4'
256
+ 2025-08-19 01:56:31 - INFO - [1ea63879-2b96-4512-843c-4c2fe0b32d56] Video saved to temporary file: temp_videos/1ea63879-2b96-4512-843c-4c2fe0b32d56.mp4
257
+ 2025-08-19 01:56:31 - INFO - [1ea63879-2b96-4512-843c-4c2fe0b32d56] Extracting frames using method: uniform, rate/threshold: 30
258
+ 2025-08-19 01:56:37 - INFO - [1ea63879-2b96-4512-843c-4c2fe0b32d56] Extracted 30 frames successfully. Saving to temporary files...
259
+ 2025-08-19 01:56:37 - INFO - [1ea63879-2b96-4512-843c-4c2fe0b32d56] 30 frames saved to temp_videos/1ea63879-2b96-4512-843c-4c2fe0b32d56
260
+ 2025-08-19 01:56:49 - INFO - vision_config is None, using default vision config
261
+ 2025-08-19 01:57:16 - INFO - Tokens per second: 10.521444099900755, Peak GPU memory MB: 11824.375
262
+ 2025-08-19 01:57:16 - INFO - [1ea63879-2b96-4512-843c-4c2fe0b32d56] Inference time: 45.18 seconds, CPU usage: 34.8%, CPU core utilization: [27.2, 30.1, 57.7, 24.0]
263
+ 2025-08-19 01:57:16 - INFO - [1ea63879-2b96-4512-843c-4c2fe0b32d56] Cleaned up temporary file: temp_videos/1ea63879-2b96-4512-843c-4c2fe0b32d56.mp4
264
+ 2025-08-19 01:57:16 - INFO - [1ea63879-2b96-4512-843c-4c2fe0b32d56] Cleaned up temporary frame directory: temp_videos/1ea63879-2b96-4512-843c-4c2fe0b32d56
265
+ 2025-08-19 01:57:16 - INFO - [f5145952-2d58-4e30-8ed0-e883cb9c33d2] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_003.mp4'
266
+ 2025-08-19 01:57:16 - INFO - [f5145952-2d58-4e30-8ed0-e883cb9c33d2] Video saved to temporary file: temp_videos/f5145952-2d58-4e30-8ed0-e883cb9c33d2.mp4
267
+ 2025-08-19 01:57:16 - INFO - [f5145952-2d58-4e30-8ed0-e883cb9c33d2] Extracting frames using method: uniform, rate/threshold: 30
268
+ 2025-08-19 01:57:21 - INFO - [f5145952-2d58-4e30-8ed0-e883cb9c33d2] Extracted 30 frames successfully. Saving to temporary files...
269
+ 2025-08-19 01:57:21 - INFO - [f5145952-2d58-4e30-8ed0-e883cb9c33d2] 30 frames saved to temp_videos/f5145952-2d58-4e30-8ed0-e883cb9c33d2
270
+ 2025-08-19 01:57:34 - INFO - vision_config is None, using default vision config
271
+ 2025-08-19 01:57:57 - INFO - Tokens per second: 10.026139095405306, Peak GPU memory MB: 11824.375
272
+ 2025-08-19 01:57:57 - INFO - [f5145952-2d58-4e30-8ed0-e883cb9c33d2] Inference time: 41.03 seconds, CPU usage: 34.5%, CPU core utilization: [49.0, 20.1, 32.4, 36.6]
273
+ 2025-08-19 01:57:57 - INFO - [f5145952-2d58-4e30-8ed0-e883cb9c33d2] Cleaned up temporary file: temp_videos/f5145952-2d58-4e30-8ed0-e883cb9c33d2.mp4
274
+ 2025-08-19 01:57:57 - INFO - [f5145952-2d58-4e30-8ed0-e883cb9c33d2] Cleaned up temporary frame directory: temp_videos/f5145952-2d58-4e30-8ed0-e883cb9c33d2
275
+ 2025-08-19 01:57:57 - INFO - [e3cf41f4-2ad6-4236-ae33-3a5e359e3e12] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_004.mp4'
276
+ 2025-08-19 01:57:57 - INFO - [e3cf41f4-2ad6-4236-ae33-3a5e359e3e12] Video saved to temporary file: temp_videos/e3cf41f4-2ad6-4236-ae33-3a5e359e3e12.mp4
277
+ 2025-08-19 01:57:57 - INFO - [e3cf41f4-2ad6-4236-ae33-3a5e359e3e12] Extracting frames using method: uniform, rate/threshold: 30
278
+ 2025-08-19 01:58:02 - INFO - [e3cf41f4-2ad6-4236-ae33-3a5e359e3e12] Extracted 30 frames successfully. Saving to temporary files...
279
+ 2025-08-19 01:58:02 - INFO - [e3cf41f4-2ad6-4236-ae33-3a5e359e3e12] 30 frames saved to temp_videos/e3cf41f4-2ad6-4236-ae33-3a5e359e3e12
280
+ 2025-08-19 01:58:15 - INFO - vision_config is None, using default vision config
281
+ 2025-08-19 01:58:47 - INFO - Tokens per second: 11.01241993759931, Peak GPU memory MB: 11824.375
282
+ 2025-08-19 01:58:47 - INFO - [e3cf41f4-2ad6-4236-ae33-3a5e359e3e12] Inference time: 49.77 seconds, CPU usage: 32.8%, CPU core utilization: [26.8, 13.1, 79.1, 12.0]
283
+ 2025-08-19 01:58:47 - INFO - [e3cf41f4-2ad6-4236-ae33-3a5e359e3e12] Cleaned up temporary file: temp_videos/e3cf41f4-2ad6-4236-ae33-3a5e359e3e12.mp4
284
+ 2025-08-19 01:58:47 - INFO - [e3cf41f4-2ad6-4236-ae33-3a5e359e3e12] Cleaned up temporary frame directory: temp_videos/e3cf41f4-2ad6-4236-ae33-3a5e359e3e12
285
+ 2025-08-19 01:58:47 - INFO - [503dd6a1-31f7-4858-8147-e63031a49a4d] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_005.mp4'
286
+ 2025-08-19 01:58:47 - INFO - [503dd6a1-31f7-4858-8147-e63031a49a4d] Video saved to temporary file: temp_videos/503dd6a1-31f7-4858-8147-e63031a49a4d.mp4
287
+ 2025-08-19 01:58:47 - INFO - [503dd6a1-31f7-4858-8147-e63031a49a4d] Extracting frames using method: uniform, rate/threshold: 30
288
+ 2025-08-19 01:58:52 - INFO - [503dd6a1-31f7-4858-8147-e63031a49a4d] Extracted 30 frames successfully. Saving to temporary files...
289
+ 2025-08-19 01:58:52 - INFO - [503dd6a1-31f7-4858-8147-e63031a49a4d] 30 frames saved to temp_videos/503dd6a1-31f7-4858-8147-e63031a49a4d
290
+ 2025-08-19 01:59:05 - INFO - vision_config is None, using default vision config
291
+ 2025-08-19 01:59:32 - INFO - Tokens per second: 10.591706857753069, Peak GPU memory MB: 11824.375
292
+ 2025-08-19 01:59:32 - INFO - [503dd6a1-31f7-4858-8147-e63031a49a4d] Inference time: 45.01 seconds, CPU usage: 34.2%, CPU core utilization: [40.7, 15.3, 26.2, 54.4]
293
+ 2025-08-19 01:59:32 - INFO - [503dd6a1-31f7-4858-8147-e63031a49a4d] Cleaned up temporary file: temp_videos/503dd6a1-31f7-4858-8147-e63031a49a4d.mp4
294
+ 2025-08-19 01:59:32 - INFO - [503dd6a1-31f7-4858-8147-e63031a49a4d] Cleaned up temporary frame directory: temp_videos/503dd6a1-31f7-4858-8147-e63031a49a4d
295
+ 2025-08-19 01:59:32 - INFO - [1d4c6c16-0d5b-4a6d-99ef-c03f51be1bda] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_006.mp4'
296
+ 2025-08-19 01:59:32 - INFO - [1d4c6c16-0d5b-4a6d-99ef-c03f51be1bda] Video saved to temporary file: temp_videos/1d4c6c16-0d5b-4a6d-99ef-c03f51be1bda.mp4
297
+ 2025-08-19 01:59:32 - INFO - [1d4c6c16-0d5b-4a6d-99ef-c03f51be1bda] Extracting frames using method: uniform, rate/threshold: 30
298
+ 2025-08-19 01:59:38 - INFO - [1d4c6c16-0d5b-4a6d-99ef-c03f51be1bda] Extracted 30 frames successfully. Saving to temporary files...
299
+ 2025-08-19 01:59:38 - INFO - [1d4c6c16-0d5b-4a6d-99ef-c03f51be1bda] 30 frames saved to temp_videos/1d4c6c16-0d5b-4a6d-99ef-c03f51be1bda
300
+ 2025-08-19 01:59:51 - INFO - vision_config is None, using default vision config
301
+ 2025-08-19 02:00:22 - INFO - Tokens per second: 10.896315328259965, Peak GPU memory MB: 11824.375
302
+ 2025-08-19 02:00:22 - INFO - [1d4c6c16-0d5b-4a6d-99ef-c03f51be1bda] Inference time: 49.56 seconds, CPU usage: 34.5%, CPU core utilization: [17.4, 61.3, 13.1, 46.3]
303
+ 2025-08-19 02:00:22 - INFO - [1d4c6c16-0d5b-4a6d-99ef-c03f51be1bda] Cleaned up temporary file: temp_videos/1d4c6c16-0d5b-4a6d-99ef-c03f51be1bda.mp4
304
+ 2025-08-19 02:00:22 - INFO - [1d4c6c16-0d5b-4a6d-99ef-c03f51be1bda] Cleaned up temporary frame directory: temp_videos/1d4c6c16-0d5b-4a6d-99ef-c03f51be1bda
305
+ 2025-08-19 02:00:22 - INFO - [66b06d03-2766-4af8-ad8f-a6fbabe537b1] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_007.mp4'
306
+ 2025-08-19 02:00:22 - INFO - [66b06d03-2766-4af8-ad8f-a6fbabe537b1] Video saved to temporary file: temp_videos/66b06d03-2766-4af8-ad8f-a6fbabe537b1.mp4
307
+ 2025-08-19 02:00:22 - INFO - [66b06d03-2766-4af8-ad8f-a6fbabe537b1] Extracting frames using method: uniform, rate/threshold: 30
308
+ 2025-08-19 02:00:27 - INFO - [66b06d03-2766-4af8-ad8f-a6fbabe537b1] Extracted 30 frames successfully. Saving to temporary files...
309
+ 2025-08-19 02:00:27 - INFO - [66b06d03-2766-4af8-ad8f-a6fbabe537b1] 30 frames saved to temp_videos/66b06d03-2766-4af8-ad8f-a6fbabe537b1
310
+ 2025-08-19 02:00:40 - INFO - vision_config is None, using default vision config
311
+ 2025-08-19 02:01:01 - INFO - Tokens per second: 9.590173753322563, Peak GPU memory MB: 11824.375
312
+ 2025-08-19 02:01:01 - INFO - [66b06d03-2766-4af8-ad8f-a6fbabe537b1] Inference time: 38.85 seconds, CPU usage: 35.6%, CPU core utilization: [35.1, 15.9, 76.1, 14.9]
313
+ 2025-08-19 02:01:01 - INFO - [66b06d03-2766-4af8-ad8f-a6fbabe537b1] Cleaned up temporary file: temp_videos/66b06d03-2766-4af8-ad8f-a6fbabe537b1.mp4
314
+ 2025-08-19 02:01:01 - INFO - [66b06d03-2766-4af8-ad8f-a6fbabe537b1] Cleaned up temporary frame directory: temp_videos/66b06d03-2766-4af8-ad8f-a6fbabe537b1
315
+ 2025-08-19 02:01:01 - INFO - [952b41e2-c5c5-42f2-a5bb-1145e3e1fb34] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_008.mp4'
316
+ 2025-08-19 02:01:01 - INFO - [952b41e2-c5c5-42f2-a5bb-1145e3e1fb34] Video saved to temporary file: temp_videos/952b41e2-c5c5-42f2-a5bb-1145e3e1fb34.mp4
317
+ 2025-08-19 02:01:01 - INFO - [952b41e2-c5c5-42f2-a5bb-1145e3e1fb34] Extracting frames using method: uniform, rate/threshold: 30
318
+ 2025-08-19 02:01:06 - INFO - [952b41e2-c5c5-42f2-a5bb-1145e3e1fb34] Extracted 30 frames successfully. Saving to temporary files...
319
+ 2025-08-19 02:01:06 - INFO - [952b41e2-c5c5-42f2-a5bb-1145e3e1fb34] 30 frames saved to temp_videos/952b41e2-c5c5-42f2-a5bb-1145e3e1fb34
320
+ 2025-08-19 02:01:19 - INFO - vision_config is None, using default vision config
321
+ 2025-08-19 02:01:46 - INFO - Tokens per second: 10.608430309528273, Peak GPU memory MB: 11824.375
322
+ 2025-08-19 02:01:46 - INFO - [952b41e2-c5c5-42f2-a5bb-1145e3e1fb34] Inference time: 45.49 seconds, CPU usage: 34.3%, CPU core utilization: [49.6, 30.9, 40.9, 15.7]
323
+ 2025-08-19 02:01:46 - INFO - [952b41e2-c5c5-42f2-a5bb-1145e3e1fb34] Cleaned up temporary file: temp_videos/952b41e2-c5c5-42f2-a5bb-1145e3e1fb34.mp4
324
+ 2025-08-19 02:01:46 - INFO - [952b41e2-c5c5-42f2-a5bb-1145e3e1fb34] Cleaned up temporary frame directory: temp_videos/952b41e2-c5c5-42f2-a5bb-1145e3e1fb34
325
+ 2025-08-19 02:01:46 - INFO - [51cad348-55db-43de-9338-3051e9131844] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_009.mp4'
326
+ 2025-08-19 02:01:46 - INFO - [51cad348-55db-43de-9338-3051e9131844] Video saved to temporary file: temp_videos/51cad348-55db-43de-9338-3051e9131844.mp4
327
+ 2025-08-19 02:01:46 - INFO - [51cad348-55db-43de-9338-3051e9131844] Extracting frames using method: uniform, rate/threshold: 30
328
+ 2025-08-19 02:01:51 - INFO - [51cad348-55db-43de-9338-3051e9131844] Extracted 30 frames successfully. Saving to temporary files...
329
+ 2025-08-19 02:01:51 - INFO - [51cad348-55db-43de-9338-3051e9131844] 30 frames saved to temp_videos/51cad348-55db-43de-9338-3051e9131844
330
+ 2025-08-19 02:02:04 - INFO - vision_config is None, using default vision config
331
+ 2025-08-19 02:02:37 - INFO - Tokens per second: 11.082226675537616, Peak GPU memory MB: 11824.375
332
+ 2025-08-19 02:02:37 - INFO - [51cad348-55db-43de-9338-3051e9131844] Inference time: 50.99 seconds, CPU usage: 33.1%, CPU core utilization: [33.1, 16.3, 62.2, 20.7]
333
+ 2025-08-19 02:02:37 - INFO - [51cad348-55db-43de-9338-3051e9131844] Cleaned up temporary file: temp_videos/51cad348-55db-43de-9338-3051e9131844.mp4
334
+ 2025-08-19 02:02:37 - INFO - [51cad348-55db-43de-9338-3051e9131844] Cleaned up temporary frame directory: temp_videos/51cad348-55db-43de-9338-3051e9131844
335
+ 2025-08-19 02:02:37 - INFO - [41b3e8c9-9970-45ff-abb7-6a1f7e8964fb] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_010.mp4'
336
+ 2025-08-19 02:02:37 - INFO - [41b3e8c9-9970-45ff-abb7-6a1f7e8964fb] Video saved to temporary file: temp_videos/41b3e8c9-9970-45ff-abb7-6a1f7e8964fb.mp4
337
+ 2025-08-19 02:02:37 - INFO - [41b3e8c9-9970-45ff-abb7-6a1f7e8964fb] Extracting frames using method: uniform, rate/threshold: 30
338
+ 2025-08-19 02:02:42 - INFO - [41b3e8c9-9970-45ff-abb7-6a1f7e8964fb] Extracted 30 frames successfully. Saving to temporary files...
339
+ 2025-08-19 02:02:42 - INFO - [41b3e8c9-9970-45ff-abb7-6a1f7e8964fb] 30 frames saved to temp_videos/41b3e8c9-9970-45ff-abb7-6a1f7e8964fb
340
+ 2025-08-19 02:02:55 - INFO - vision_config is None, using default vision config
341
+ 2025-08-19 02:03:22 - INFO - Tokens per second: 10.551310576211458, Peak GPU memory MB: 11824.375
342
+ 2025-08-19 02:03:22 - INFO - [41b3e8c9-9970-45ff-abb7-6a1f7e8964fb] Inference time: 44.85 seconds, CPU usage: 33.3%, CPU core utilization: [21.1, 35.7, 30.0, 46.5]
343
+ 2025-08-19 02:03:22 - INFO - [41b3e8c9-9970-45ff-abb7-6a1f7e8964fb] Cleaned up temporary file: temp_videos/41b3e8c9-9970-45ff-abb7-6a1f7e8964fb.mp4
344
+ 2025-08-19 02:03:22 - INFO - [41b3e8c9-9970-45ff-abb7-6a1f7e8964fb] Cleaned up temporary frame directory: temp_videos/41b3e8c9-9970-45ff-abb7-6a1f7e8964fb
345
+ 2025-08-19 02:03:22 - INFO - [4393c650-c393-4346-bc0b-a9ef5e9ff838] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_011.mp4'
346
+ 2025-08-19 02:03:22 - INFO - [4393c650-c393-4346-bc0b-a9ef5e9ff838] Video saved to temporary file: temp_videos/4393c650-c393-4346-bc0b-a9ef5e9ff838.mp4
347
+ 2025-08-19 02:03:22 - INFO - [4393c650-c393-4346-bc0b-a9ef5e9ff838] Extracting frames using method: uniform, rate/threshold: 30
348
+ 2025-08-19 02:03:27 - INFO - [4393c650-c393-4346-bc0b-a9ef5e9ff838] Extracted 30 frames successfully. Saving to temporary files...
349
+ 2025-08-19 02:03:27 - INFO - [4393c650-c393-4346-bc0b-a9ef5e9ff838] 30 frames saved to temp_videos/4393c650-c393-4346-bc0b-a9ef5e9ff838
350
+ 2025-08-19 02:03:40 - INFO - vision_config is None, using default vision config
351
+ 2025-08-19 02:04:10 - INFO - Tokens per second: 10.880078766885976, Peak GPU memory MB: 11824.375
352
+ 2025-08-19 02:04:10 - INFO - [4393c650-c393-4346-bc0b-a9ef5e9ff838] Inference time: 48.24 seconds, CPU usage: 33.2%, CPU core utilization: [44.6, 26.5, 48.9, 12.8]
353
+ 2025-08-19 02:04:10 - INFO - [4393c650-c393-4346-bc0b-a9ef5e9ff838] Cleaned up temporary file: temp_videos/4393c650-c393-4346-bc0b-a9ef5e9ff838.mp4
354
+ 2025-08-19 02:04:10 - INFO - [4393c650-c393-4346-bc0b-a9ef5e9ff838] Cleaned up temporary frame directory: temp_videos/4393c650-c393-4346-bc0b-a9ef5e9ff838
355
+ 2025-08-19 02:04:10 - INFO - [63ee585b-d4ad-4f77-9c46-b1d0ee9660d5] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_012.mp4'
356
+ 2025-08-19 02:04:10 - INFO - [63ee585b-d4ad-4f77-9c46-b1d0ee9660d5] Video saved to temporary file: temp_videos/63ee585b-d4ad-4f77-9c46-b1d0ee9660d5.mp4
357
+ 2025-08-19 02:04:10 - INFO - [63ee585b-d4ad-4f77-9c46-b1d0ee9660d5] Extracting frames using method: uniform, rate/threshold: 30
358
+ 2025-08-19 02:04:16 - INFO - [63ee585b-d4ad-4f77-9c46-b1d0ee9660d5] Extracted 30 frames successfully. Saving to temporary files...
359
+ 2025-08-19 02:04:16 - INFO - [63ee585b-d4ad-4f77-9c46-b1d0ee9660d5] 30 frames saved to temp_videos/63ee585b-d4ad-4f77-9c46-b1d0ee9660d5
360
+ 2025-08-19 02:04:29 - INFO - vision_config is None, using default vision config
361
+ 2025-08-19 02:05:06 - INFO - Tokens per second: 11.316073008449894, Peak GPU memory MB: 11824.375
362
+ 2025-08-19 02:05:06 - INFO - [63ee585b-d4ad-4f77-9c46-b1d0ee9660d5] Inference time: 55.46 seconds, CPU usage: 32.8%, CPU core utilization: [30.7, 22.7, 65.4, 12.2]
363
+ 2025-08-19 02:05:06 - INFO - [63ee585b-d4ad-4f77-9c46-b1d0ee9660d5] Cleaned up temporary file: temp_videos/63ee585b-d4ad-4f77-9c46-b1d0ee9660d5.mp4
364
+ 2025-08-19 02:05:06 - INFO - [63ee585b-d4ad-4f77-9c46-b1d0ee9660d5] Cleaned up temporary frame directory: temp_videos/63ee585b-d4ad-4f77-9c46-b1d0ee9660d5
365
+ 2025-08-19 02:05:06 - INFO - [cd415598-0457-4f7c-9c0e-c48a9a385529] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_013.mp4'
366
+ 2025-08-19 02:05:06 - INFO - [cd415598-0457-4f7c-9c0e-c48a9a385529] Video saved to temporary file: temp_videos/cd415598-0457-4f7c-9c0e-c48a9a385529.mp4
367
+ 2025-08-19 02:05:06 - INFO - [cd415598-0457-4f7c-9c0e-c48a9a385529] Extracting frames using method: uniform, rate/threshold: 30
368
+ 2025-08-19 02:05:11 - INFO - [cd415598-0457-4f7c-9c0e-c48a9a385529] Extracted 30 frames successfully. Saving to temporary files...
369
+ 2025-08-19 02:05:11 - INFO - [cd415598-0457-4f7c-9c0e-c48a9a385529] 30 frames saved to temp_videos/cd415598-0457-4f7c-9c0e-c48a9a385529
370
+ 2025-08-19 02:05:24 - INFO - vision_config is None, using default vision config
371
+ 2025-08-19 02:05:48 - INFO - Tokens per second: 10.139085446565758, Peak GPU memory MB: 11824.375
372
+ 2025-08-19 02:05:48 - INFO - [cd415598-0457-4f7c-9c0e-c48a9a385529] Inference time: 41.62 seconds, CPU usage: 34.4%, CPU core utilization: [43.3, 14.6, 67.4, 12.2]
373
+ 2025-08-19 02:05:48 - INFO - [cd415598-0457-4f7c-9c0e-c48a9a385529] Cleaned up temporary file: temp_videos/cd415598-0457-4f7c-9c0e-c48a9a385529.mp4
374
+ 2025-08-19 02:05:48 - INFO - [cd415598-0457-4f7c-9c0e-c48a9a385529] Cleaned up temporary frame directory: temp_videos/cd415598-0457-4f7c-9c0e-c48a9a385529
375
+ 2025-08-19 02:05:48 - INFO - [d767da5c-73c4-4e2f-b5bd-53f01d5e199e] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_014.mp4'
376
+ 2025-08-19 02:05:48 - INFO - [d767da5c-73c4-4e2f-b5bd-53f01d5e199e] Video saved to temporary file: temp_videos/d767da5c-73c4-4e2f-b5bd-53f01d5e199e.mp4
377
+ 2025-08-19 02:05:48 - INFO - [d767da5c-73c4-4e2f-b5bd-53f01d5e199e] Extracting frames using method: uniform, rate/threshold: 30
378
+ 2025-08-19 02:05:52 - INFO - [d767da5c-73c4-4e2f-b5bd-53f01d5e199e] Extracted 30 frames successfully. Saving to temporary files...
379
+ 2025-08-19 02:05:52 - INFO - [d767da5c-73c4-4e2f-b5bd-53f01d5e199e] 30 frames saved to temp_videos/d767da5c-73c4-4e2f-b5bd-53f01d5e199e
380
+ 2025-08-19 02:06:05 - INFO - vision_config is None, using default vision config
381
+ 2025-08-19 02:06:29 - INFO - Tokens per second: 10.099402253183086, Peak GPU memory MB: 11824.375
382
+ 2025-08-19 02:06:29 - INFO - [d767da5c-73c4-4e2f-b5bd-53f01d5e199e] Inference time: 41.11 seconds, CPU usage: 33.9%, CPU core utilization: [40.9, 18.6, 62.6, 13.5]
383
+ 2025-08-19 02:06:29 - INFO - [d767da5c-73c4-4e2f-b5bd-53f01d5e199e] Cleaned up temporary file: temp_videos/d767da5c-73c4-4e2f-b5bd-53f01d5e199e.mp4
384
+ 2025-08-19 02:06:29 - INFO - [d767da5c-73c4-4e2f-b5bd-53f01d5e199e] Cleaned up temporary frame directory: temp_videos/d767da5c-73c4-4e2f-b5bd-53f01d5e199e
385
+ 2025-08-19 02:06:29 - INFO - [e58b05d0-3bc8-4979-b14b-942d95c2e3ef] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_015.mp4'
386
+ 2025-08-19 02:06:29 - INFO - [e58b05d0-3bc8-4979-b14b-942d95c2e3ef] Video saved to temporary file: temp_videos/e58b05d0-3bc8-4979-b14b-942d95c2e3ef.mp4
387
+ 2025-08-19 02:06:29 - INFO - [e58b05d0-3bc8-4979-b14b-942d95c2e3ef] Extracting frames using method: uniform, rate/threshold: 30
388
+ 2025-08-19 02:06:34 - INFO - [e58b05d0-3bc8-4979-b14b-942d95c2e3ef] Extracted 30 frames successfully. Saving to temporary files...
389
+ 2025-08-19 02:06:34 - INFO - [e58b05d0-3bc8-4979-b14b-942d95c2e3ef] 30 frames saved to temp_videos/e58b05d0-3bc8-4979-b14b-942d95c2e3ef
390
+ 2025-08-19 02:06:47 - INFO - vision_config is None, using default vision config
391
+ 2025-08-19 02:07:19 - INFO - Tokens per second: 11.047986460564802, Peak GPU memory MB: 11824.375
392
+ 2025-08-19 02:07:19 - INFO - [e58b05d0-3bc8-4979-b14b-942d95c2e3ef] Inference time: 50.63 seconds, CPU usage: 33.6%, CPU core utilization: [34.3, 14.5, 72.1, 13.3]
393
+ 2025-08-19 02:07:20 - INFO - [e58b05d0-3bc8-4979-b14b-942d95c2e3ef] Cleaned up temporary file: temp_videos/e58b05d0-3bc8-4979-b14b-942d95c2e3ef.mp4
394
+ 2025-08-19 02:07:20 - INFO - [e58b05d0-3bc8-4979-b14b-942d95c2e3ef] Cleaned up temporary frame directory: temp_videos/e58b05d0-3bc8-4979-b14b-942d95c2e3ef
395
+ 2025-08-19 02:07:20 - INFO - [7f388f20-df20-4494-81cf-2c4dbbb55c24] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_016.mp4'
396
+ 2025-08-19 02:07:20 - INFO - [7f388f20-df20-4494-81cf-2c4dbbb55c24] Video saved to temporary file: temp_videos/7f388f20-df20-4494-81cf-2c4dbbb55c24.mp4
397
+ 2025-08-19 02:07:20 - INFO - [7f388f20-df20-4494-81cf-2c4dbbb55c24] Extracting frames using method: uniform, rate/threshold: 30
398
+ 2025-08-19 02:07:25 - INFO - [7f388f20-df20-4494-81cf-2c4dbbb55c24] Extracted 30 frames successfully. Saving to temporary files...
399
+ 2025-08-19 02:07:25 - INFO - [7f388f20-df20-4494-81cf-2c4dbbb55c24] 30 frames saved to temp_videos/7f388f20-df20-4494-81cf-2c4dbbb55c24
400
+ 2025-08-19 02:07:38 - INFO - vision_config is None, using default vision config
401
+ 2025-08-19 02:08:03 - INFO - Tokens per second: 10.276083296878193, Peak GPU memory MB: 11824.375
402
+ 2025-08-19 02:08:03 - INFO - [7f388f20-df20-4494-81cf-2c4dbbb55c24] Inference time: 42.98 seconds, CPU usage: 34.5%, CPU core utilization: [48.9, 17.9, 49.7, 21.6]
403
+ 2025-08-19 02:08:03 - INFO - [7f388f20-df20-4494-81cf-2c4dbbb55c24] Cleaned up temporary file: temp_videos/7f388f20-df20-4494-81cf-2c4dbbb55c24.mp4
404
+ 2025-08-19 02:08:03 - INFO - [7f388f20-df20-4494-81cf-2c4dbbb55c24] Cleaned up temporary frame directory: temp_videos/7f388f20-df20-4494-81cf-2c4dbbb55c24
405
+ 2025-08-19 02:08:03 - INFO - [065e7051-f3d7-44d7-9df4-3ec85f833683] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_017.mp4'
406
+ 2025-08-19 02:08:03 - INFO - [065e7051-f3d7-44d7-9df4-3ec85f833683] Video saved to temporary file: temp_videos/065e7051-f3d7-44d7-9df4-3ec85f833683.mp4
407
+ 2025-08-19 02:08:03 - INFO - [065e7051-f3d7-44d7-9df4-3ec85f833683] Extracting frames using method: uniform, rate/threshold: 30
408
+ 2025-08-19 02:08:08 - INFO - [065e7051-f3d7-44d7-9df4-3ec85f833683] Extracted 30 frames successfully. Saving to temporary files...
409
+ 2025-08-19 02:08:08 - INFO - [065e7051-f3d7-44d7-9df4-3ec85f833683] 30 frames saved to temp_videos/065e7051-f3d7-44d7-9df4-3ec85f833683
410
+ 2025-08-19 02:08:21 - INFO - vision_config is None, using default vision config
411
+ 2025-08-19 02:08:43 - INFO - Tokens per second: 9.871157401870102, Peak GPU memory MB: 11824.375
412
+ 2025-08-19 02:08:43 - INFO - [065e7051-f3d7-44d7-9df4-3ec85f833683] Inference time: 40.32 seconds, CPU usage: 34.5%, CPU core utilization: [28.7, 13.0, 82.0, 14.2]
413
+ 2025-08-19 02:08:43 - INFO - [065e7051-f3d7-44d7-9df4-3ec85f833683] Cleaned up temporary file: temp_videos/065e7051-f3d7-44d7-9df4-3ec85f833683.mp4
414
+ 2025-08-19 02:08:43 - INFO - [065e7051-f3d7-44d7-9df4-3ec85f833683] Cleaned up temporary frame directory: temp_videos/065e7051-f3d7-44d7-9df4-3ec85f833683
415
+ 2025-08-19 02:08:43 - INFO - [3a448a91-9d31-4350-a00a-5e0d9a3e65c5] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_018.mp4'
416
+ 2025-08-19 02:08:43 - INFO - [3a448a91-9d31-4350-a00a-5e0d9a3e65c5] Video saved to temporary file: temp_videos/3a448a91-9d31-4350-a00a-5e0d9a3e65c5.mp4
417
+ 2025-08-19 02:08:43 - INFO - [3a448a91-9d31-4350-a00a-5e0d9a3e65c5] Extracting frames using method: uniform, rate/threshold: 30
418
+ 2025-08-19 02:08:48 - INFO - [3a448a91-9d31-4350-a00a-5e0d9a3e65c5] Extracted 30 frames successfully. Saving to temporary files...
419
+ 2025-08-19 02:08:48 - INFO - [3a448a91-9d31-4350-a00a-5e0d9a3e65c5] 30 frames saved to temp_videos/3a448a91-9d31-4350-a00a-5e0d9a3e65c5
420
+ 2025-08-19 02:09:01 - INFO - vision_config is None, using default vision config
421
+ 2025-08-19 02:09:27 - INFO - Tokens per second: 10.527802025402586, Peak GPU memory MB: 11824.375
422
+ 2025-08-19 02:09:27 - INFO - [3a448a91-9d31-4350-a00a-5e0d9a3e65c5] Inference time: 44.41 seconds, CPU usage: 33.7%, CPU core utilization: [38.7, 49.8, 27.9, 18.3]
423
+ 2025-08-19 02:09:27 - INFO - [3a448a91-9d31-4350-a00a-5e0d9a3e65c5] Cleaned up temporary file: temp_videos/3a448a91-9d31-4350-a00a-5e0d9a3e65c5.mp4
424
+ 2025-08-19 02:09:27 - INFO - [3a448a91-9d31-4350-a00a-5e0d9a3e65c5] Cleaned up temporary frame directory: temp_videos/3a448a91-9d31-4350-a00a-5e0d9a3e65c5
425
+ 2025-08-19 02:09:28 - INFO - [f86185e1-3df4-4b24-87c3-25b7006b3847] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_019.mp4'
426
+ 2025-08-19 02:09:28 - INFO - [f86185e1-3df4-4b24-87c3-25b7006b3847] Video saved to temporary file: temp_videos/f86185e1-3df4-4b24-87c3-25b7006b3847.mp4
427
+ 2025-08-19 02:09:28 - INFO - [f86185e1-3df4-4b24-87c3-25b7006b3847] Extracting frames using method: uniform, rate/threshold: 30
428
+ 2025-08-19 02:09:33 - INFO - [f86185e1-3df4-4b24-87c3-25b7006b3847] Extracted 30 frames successfully. Saving to temporary files...
429
+ 2025-08-19 02:09:33 - INFO - [f86185e1-3df4-4b24-87c3-25b7006b3847] 30 frames saved to temp_videos/f86185e1-3df4-4b24-87c3-25b7006b3847
430
+ 2025-08-19 02:09:46 - INFO - vision_config is None, using default vision config
431
+ 2025-08-19 02:10:09 - INFO - Tokens per second: 9.991467008425984, Peak GPU memory MB: 11824.375
432
+ 2025-08-19 02:10:09 - INFO - [f86185e1-3df4-4b24-87c3-25b7006b3847] Inference time: 41.05 seconds, CPU usage: 34.8%, CPU core utilization: [27.7, 55.4, 40.1, 16.1]
433
+ 2025-08-19 02:10:09 - INFO - [f86185e1-3df4-4b24-87c3-25b7006b3847] Cleaned up temporary file: temp_videos/f86185e1-3df4-4b24-87c3-25b7006b3847.mp4
434
+ 2025-08-19 02:10:09 - INFO - [f86185e1-3df4-4b24-87c3-25b7006b3847] Cleaned up temporary frame directory: temp_videos/f86185e1-3df4-4b24-87c3-25b7006b3847
435
+ 2025-08-19 02:10:09 - INFO - [86d1ce1c-a659-4370-8d5e-2a04d68fb02d] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_020.mp4'
436
+ 2025-08-19 02:10:09 - INFO - [86d1ce1c-a659-4370-8d5e-2a04d68fb02d] Video saved to temporary file: temp_videos/86d1ce1c-a659-4370-8d5e-2a04d68fb02d.mp4
437
+ 2025-08-19 02:10:09 - INFO - [86d1ce1c-a659-4370-8d5e-2a04d68fb02d] Extracting frames using method: uniform, rate/threshold: 30
438
+ 2025-08-19 02:10:14 - INFO - [86d1ce1c-a659-4370-8d5e-2a04d68fb02d] Extracted 30 frames successfully. Saving to temporary files...
439
+ 2025-08-19 02:10:14 - INFO - [86d1ce1c-a659-4370-8d5e-2a04d68fb02d] 30 frames saved to temp_videos/86d1ce1c-a659-4370-8d5e-2a04d68fb02d
440
+ 2025-08-19 02:10:27 - INFO - vision_config is None, using default vision config
441
+ 2025-08-19 02:10:56 - INFO - Tokens per second: 10.789004911780786, Peak GPU memory MB: 11824.375
442
+ 2025-08-19 02:10:56 - INFO - [86d1ce1c-a659-4370-8d5e-2a04d68fb02d] Inference time: 47.12 seconds, CPU usage: 33.5%, CPU core utilization: [26.3, 11.4, 57.7, 38.4]
443
+ 2025-08-19 02:10:56 - INFO - [86d1ce1c-a659-4370-8d5e-2a04d68fb02d] Cleaned up temporary file: temp_videos/86d1ce1c-a659-4370-8d5e-2a04d68fb02d.mp4
444
+ 2025-08-19 02:10:56 - INFO - [86d1ce1c-a659-4370-8d5e-2a04d68fb02d] Cleaned up temporary frame directory: temp_videos/86d1ce1c-a659-4370-8d5e-2a04d68fb02d
445
+ 2025-08-19 02:10:56 - INFO - [bbf0d618-6d38-4015-a288-e1ed0f93febb] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_021.mp4'
446
+ 2025-08-19 02:10:56 - INFO - [bbf0d618-6d38-4015-a288-e1ed0f93febb] Video saved to temporary file: temp_videos/bbf0d618-6d38-4015-a288-e1ed0f93febb.mp4
447
+ 2025-08-19 02:10:56 - INFO - [bbf0d618-6d38-4015-a288-e1ed0f93febb] Extracting frames using method: uniform, rate/threshold: 30
448
+ 2025-08-19 02:11:01 - INFO - [bbf0d618-6d38-4015-a288-e1ed0f93febb] Extracted 30 frames successfully. Saving to temporary files...
449
+ 2025-08-19 02:11:01 - INFO - [bbf0d618-6d38-4015-a288-e1ed0f93febb] 30 frames saved to temp_videos/bbf0d618-6d38-4015-a288-e1ed0f93febb
450
+ 2025-08-19 02:11:14 - INFO - vision_config is None, using default vision config
451
+ 2025-08-19 02:11:42 - INFO - Tokens per second: 10.710302542840308, Peak GPU memory MB: 11824.375
452
+ 2025-08-19 02:11:42 - INFO - [bbf0d618-6d38-4015-a288-e1ed0f93febb] Inference time: 46.57 seconds, CPU usage: 33.3%, CPU core utilization: [35.4, 27.4, 14.8, 55.5]
453
+ 2025-08-19 02:11:42 - INFO - [bbf0d618-6d38-4015-a288-e1ed0f93febb] Cleaned up temporary file: temp_videos/bbf0d618-6d38-4015-a288-e1ed0f93febb.mp4
454
+ 2025-08-19 02:11:42 - INFO - [bbf0d618-6d38-4015-a288-e1ed0f93febb] Cleaned up temporary frame directory: temp_videos/bbf0d618-6d38-4015-a288-e1ed0f93febb
API_Transformers/logs/MiniCPM-V-4/20250820_233455.log ADDED
The diff for this file is too large to render. See raw diff
 
API_Transformers/logs/MiniCPM-V-4/20250821_002349.log ADDED
@@ -0,0 +1,67 @@
1
+ 2025-08-21 00:23:49 - INFO - Loading model: openbmb/MiniCPM-V-4
2
+ 2025-08-21 00:23:50 - INFO - vision_config is None, using default vision config
3
+ 2025-08-21 00:24:41 - INFO - Model loaded in 52.24 seconds
4
+ 2025-08-21 00:24:41 - INFO - GPU Memory Usage after model load: 7802.99 MB
5
+ 2025-08-21 00:24:48 - INFO - [675b6c5c-5524-4cc9-a700-76d2d090a7a4] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_001.mp4'
6
+ 2025-08-21 00:24:48 - INFO - [675b6c5c-5524-4cc9-a700-76d2d090a7a4] Video saved to temporary file: temp_videos/675b6c5c-5524-4cc9-a700-76d2d090a7a4.mp4
7
+ 2025-08-21 00:24:48 - INFO - [675b6c5c-5524-4cc9-a700-76d2d090a7a4] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-21 00:24:52 - INFO - [675b6c5c-5524-4cc9-a700-76d2d090a7a4] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-21 00:24:53 - INFO - [675b6c5c-5524-4cc9-a700-76d2d090a7a4] 30 frames saved to temp_videos/675b6c5c-5524-4cc9-a700-76d2d090a7a4
10
+ 2025-08-21 00:25:09 - INFO - vision_config is None, using default vision config
11
+ 2025-08-21 00:25:21 - INFO - Tokens per second: 5.985183148455991, Peak GPU memory MB: 11824.375
12
+ 2025-08-21 00:25:21 - INFO - [675b6c5c-5524-4cc9-a700-76d2d090a7a4] Inference time: 33.25 seconds, CPU usage: 37.3%, CPU core utilization: [37.0, 39.5, 36.4, 36.2]
13
+ 2025-08-21 00:25:21 - INFO - [675b6c5c-5524-4cc9-a700-76d2d090a7a4] Cleaned up temporary frame directory: temp_videos/675b6c5c-5524-4cc9-a700-76d2d090a7a4
14
+ 2025-08-21 00:25:21 - INFO - [3315fc05-4535-4c30-910d-0b8c1a9c8855] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_002.mp4'
15
+ 2025-08-21 00:25:21 - INFO - [3315fc05-4535-4c30-910d-0b8c1a9c8855] Video saved to temporary file: temp_videos/3315fc05-4535-4c30-910d-0b8c1a9c8855.mp4
16
+ 2025-08-21 00:25:21 - INFO - [3315fc05-4535-4c30-910d-0b8c1a9c8855] Extracting frames using method: uniform, rate/threshold: 30
17
+ 2025-08-21 00:25:29 - INFO - [3315fc05-4535-4c30-910d-0b8c1a9c8855] Extracted 30 frames successfully. Saving to temporary files...
18
+ 2025-08-21 00:25:29 - INFO - [3315fc05-4535-4c30-910d-0b8c1a9c8855] 30 frames saved to temp_videos/3315fc05-4535-4c30-910d-0b8c1a9c8855
19
+ 2025-08-21 00:25:41 - INFO - vision_config is None, using default vision config
20
+ 2025-08-21 00:25:50 - INFO - Tokens per second: 3.7285647057248625, Peak GPU memory MB: 11824.375
21
+ 2025-08-21 00:25:50 - INFO - [3315fc05-4535-4c30-910d-0b8c1a9c8855] Inference time: 29.68 seconds, CPU usage: 50.6%, CPU core utilization: [56.4, 62.1, 35.5, 48.3]
22
+ 2025-08-21 00:25:51 - INFO - [3315fc05-4535-4c30-910d-0b8c1a9c8855] Cleaned up temporary frame directory: temp_videos/3315fc05-4535-4c30-910d-0b8c1a9c8855
23
+ 2025-08-21 00:25:51 - INFO - [89135b17-c5fd-406e-8ce7-875b26d87444] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_003.mp4'
24
+ 2025-08-21 00:25:51 - INFO - [89135b17-c5fd-406e-8ce7-875b26d87444] Video saved to temporary file: temp_videos/89135b17-c5fd-406e-8ce7-875b26d87444.mp4
25
+ 2025-08-21 00:25:51 - INFO - [89135b17-c5fd-406e-8ce7-875b26d87444] Extracting frames using method: uniform, rate/threshold: 30
26
+ 2025-08-21 00:25:56 - INFO - [89135b17-c5fd-406e-8ce7-875b26d87444] Extracted 30 frames successfully. Saving to temporary files...
27
+ 2025-08-21 00:25:56 - INFO - [89135b17-c5fd-406e-8ce7-875b26d87444] 30 frames saved to temp_videos/89135b17-c5fd-406e-8ce7-875b26d87444
28
+ 2025-08-21 00:26:08 - INFO - vision_config is None, using default vision config
29
+ 2025-08-21 00:26:22 - INFO - Tokens per second: 6.963268555022767, Peak GPU memory MB: 11824.375
30
+ 2025-08-21 00:26:22 - INFO - [89135b17-c5fd-406e-8ce7-875b26d87444] Inference time: 31.19 seconds, CPU usage: 38.1%, CPU core utilization: [30.8, 32.9, 49.7, 38.7]
31
+ 2025-08-21 00:26:22 - INFO - [89135b17-c5fd-406e-8ce7-875b26d87444] Cleaned up temporary frame directory: temp_videos/89135b17-c5fd-406e-8ce7-875b26d87444
32
+ 2025-08-21 00:26:22 - INFO - [c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_004.mp4'
+ 2025-08-21 00:26:22 - INFO - [c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a] Video saved to temporary file: temp_videos/c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a.mp4
+ 2025-08-21 00:26:22 - INFO - [c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 00:26:27 - INFO - [c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 00:26:27 - INFO - [c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a] 30 frames saved to temp_videos/c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a
+ 2025-08-21 00:26:39 - INFO - vision_config is None, using default vision config
+ 2025-08-21 00:26:53 - INFO - Tokens per second: 7.242406371874894, Peak GPU memory MB: 11824.375
+ 2025-08-21 00:26:53 - INFO - [c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a] Inference time: 31.55 seconds, CPU usage: 36.5%, CPU core utilization: [47.2, 19.7, 61.2, 17.9]
+ 2025-08-21 00:26:53 - INFO - [c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a] Cleaned up temporary frame directory: temp_videos/c196a0cc-7f44-4d02-8ddb-af01ad8e7a9a
+ 2025-08-21 00:26:53 - INFO - [3c7789f8-90dc-45d1-be32-cdc10502bbe2] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_005.mp4'
+ 2025-08-21 00:26:53 - INFO - [3c7789f8-90dc-45d1-be32-cdc10502bbe2] Video saved to temporary file: temp_videos/3c7789f8-90dc-45d1-be32-cdc10502bbe2.mp4
+ 2025-08-21 00:26:53 - INFO - [3c7789f8-90dc-45d1-be32-cdc10502bbe2] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 00:26:58 - INFO - [3c7789f8-90dc-45d1-be32-cdc10502bbe2] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 00:26:58 - INFO - [3c7789f8-90dc-45d1-be32-cdc10502bbe2] 30 frames saved to temp_videos/3c7789f8-90dc-45d1-be32-cdc10502bbe2
+ 2025-08-21 00:27:11 - INFO - vision_config is None, using default vision config
+ 2025-08-21 00:27:22 - INFO - Tokens per second: 5.385284609810389, Peak GPU memory MB: 11824.375
+ 2025-08-21 00:27:22 - INFO - [3c7789f8-90dc-45d1-be32-cdc10502bbe2] Inference time: 28.65 seconds, CPU usage: 37.3%, CPU core utilization: [20.6, 21.4, 90.0, 16.9]
+ 2025-08-21 00:27:22 - INFO - [3c7789f8-90dc-45d1-be32-cdc10502bbe2] Cleaned up temporary frame directory: temp_videos/3c7789f8-90dc-45d1-be32-cdc10502bbe2
+ 2025-08-21 00:27:22 - INFO - [66ffedf3-6d71-4829-adf4-7859b5b21979] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_006.mp4'
+ 2025-08-21 00:27:22 - INFO - [66ffedf3-6d71-4829-adf4-7859b5b21979] Video saved to temporary file: temp_videos/66ffedf3-6d71-4829-adf4-7859b5b21979.mp4
+ 2025-08-21 00:27:22 - INFO - [66ffedf3-6d71-4829-adf4-7859b5b21979] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 00:27:27 - INFO - [66ffedf3-6d71-4829-adf4-7859b5b21979] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 00:27:27 - INFO - [66ffedf3-6d71-4829-adf4-7859b5b21979] 30 frames saved to temp_videos/66ffedf3-6d71-4829-adf4-7859b5b21979
+ 2025-08-21 00:27:40 - INFO - vision_config is None, using default vision config
+ 2025-08-21 00:27:50 - INFO - Tokens per second: 4.504682210102835, Peak GPU memory MB: 11824.375
+ 2025-08-21 00:27:50 - INFO - [66ffedf3-6d71-4829-adf4-7859b5b21979] Inference time: 27.70 seconds, CPU usage: 37.4%, CPU core utilization: [27.3, 19.3, 52.1, 50.8]
+ 2025-08-21 00:27:50 - INFO - [66ffedf3-6d71-4829-adf4-7859b5b21979] Cleaned up temporary frame directory: temp_videos/66ffedf3-6d71-4829-adf4-7859b5b21979
+ 2025-08-21 00:27:50 - INFO - [2167f629-b4f8-4e08-9179-f8eec50d35ab] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_007.mp4'
+ 2025-08-21 00:27:50 - INFO - [2167f629-b4f8-4e08-9179-f8eec50d35ab] Video saved to temporary file: temp_videos/2167f629-b4f8-4e08-9179-f8eec50d35ab.mp4
+ 2025-08-21 00:27:50 - INFO - [2167f629-b4f8-4e08-9179-f8eec50d35ab] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 00:27:54 - INFO - [2167f629-b4f8-4e08-9179-f8eec50d35ab] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 00:27:54 - INFO - [2167f629-b4f8-4e08-9179-f8eec50d35ab] 30 frames saved to temp_videos/2167f629-b4f8-4e08-9179-f8eec50d35ab
+ 2025-08-21 00:28:07 - INFO - vision_config is None, using default vision config
+ 2025-08-21 00:28:27 - INFO - Tokens per second: 9.168312990435263, Peak GPU memory MB: 11824.375
+ 2025-08-21 00:28:27 - INFO - [2167f629-b4f8-4e08-9179-f8eec50d35ab] Inference time: 37.23 seconds, CPU usage: 35.9%, CPU core utilization: [45.1, 19.0, 27.6, 52.0]
+ 2025-08-21 00:28:27 - INFO - [2167f629-b4f8-4e08-9179-f8eec50d35ab] Cleaned up temporary frame directory: temp_videos/2167f629-b4f8-4e08-9179-f8eec50d35ab
API_Transformers/logs/MiniCPM-V-4/20250821_005748.log ADDED
@@ -0,0 +1,472 @@
+ 2025-08-21 00:57:48 - INFO - Loading model: openbmb/MiniCPM-V-4
+ 2025-08-21 00:57:49 - INFO - vision_config is None, using default vision config
+ 2025-08-21 00:58:53 - INFO - Model loaded in 64.86 seconds
+ 2025-08-21 00:58:53 - INFO - GPU Memory Usage after model load: 7802.99 MB
+ 2025-08-21 01:00:40 - INFO - [f1b33371-19eb-4445-b227-d46a97ed0050] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_001.mp4'
+ 2025-08-21 01:00:40 - INFO - [f1b33371-19eb-4445-b227-d46a97ed0050] Video saved to temporary file: temp_videos/f1b33371-19eb-4445-b227-d46a97ed0050.mp4
+ 2025-08-21 01:00:40 - INFO - [f1b33371-19eb-4445-b227-d46a97ed0050] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:00:45 - INFO - [f1b33371-19eb-4445-b227-d46a97ed0050] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:00:45 - INFO - [f1b33371-19eb-4445-b227-d46a97ed0050] 30 frames saved to temp_videos/f1b33371-19eb-4445-b227-d46a97ed0050
+ 2025-08-21 01:01:03 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:01:14 - INFO - Tokens per second: 5.106639419779758, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:01:14 - INFO - [f1b33371-19eb-4445-b227-d46a97ed0050] Inference time: 33.66 seconds, CPU usage: 16.3%, CPU core utilization: [16.0, 13.4, 20.9, 14.9]
+ 2025-08-21 01:01:14 - INFO - [f1b33371-19eb-4445-b227-d46a97ed0050] Cleaned up temporary frame directory: temp_videos/f1b33371-19eb-4445-b227-d46a97ed0050
+ 2025-08-21 01:01:14 - INFO - [a95d5cef-7c7e-42ee-8944-0bfe41e9beed] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_002.mp4'
+ 2025-08-21 01:01:14 - INFO - [a95d5cef-7c7e-42ee-8944-0bfe41e9beed] Video saved to temporary file: temp_videos/a95d5cef-7c7e-42ee-8944-0bfe41e9beed.mp4
+ 2025-08-21 01:01:14 - INFO - [a95d5cef-7c7e-42ee-8944-0bfe41e9beed] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:01:19 - INFO - [a95d5cef-7c7e-42ee-8944-0bfe41e9beed] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:01:19 - INFO - [a95d5cef-7c7e-42ee-8944-0bfe41e9beed] 30 frames saved to temp_videos/a95d5cef-7c7e-42ee-8944-0bfe41e9beed
+ 2025-08-21 01:01:32 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:01:43 - INFO - Tokens per second: 5.976602351659928, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:01:43 - INFO - [a95d5cef-7c7e-42ee-8944-0bfe41e9beed] Inference time: 29.21 seconds, CPU usage: 36.9%, CPU core utilization: [52.9, 47.3, 16.7, 30.9]
+ 2025-08-21 01:01:43 - INFO - [a95d5cef-7c7e-42ee-8944-0bfe41e9beed] Cleaned up temporary frame directory: temp_videos/a95d5cef-7c7e-42ee-8944-0bfe41e9beed
+ 2025-08-21 01:01:43 - INFO - [3074ddc0-8709-448c-b863-d209d175a408] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_003.mp4'
+ 2025-08-21 01:01:43 - INFO - [3074ddc0-8709-448c-b863-d209d175a408] Video saved to temporary file: temp_videos/3074ddc0-8709-448c-b863-d209d175a408.mp4
+ 2025-08-21 01:01:43 - INFO - [3074ddc0-8709-448c-b863-d209d175a408] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:01:48 - INFO - [3074ddc0-8709-448c-b863-d209d175a408] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:01:48 - INFO - [3074ddc0-8709-448c-b863-d209d175a408] 30 frames saved to temp_videos/3074ddc0-8709-448c-b863-d209d175a408
+ 2025-08-21 01:02:01 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:02:11 - INFO - Tokens per second: 5.11842779774044, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:02:11 - INFO - [3074ddc0-8709-448c-b863-d209d175a408] Inference time: 28.20 seconds, CPU usage: 37.6%, CPU core utilization: [51.0, 23.2, 57.8, 18.2]
+ 2025-08-21 01:02:11 - INFO - [3074ddc0-8709-448c-b863-d209d175a408] Cleaned up temporary frame directory: temp_videos/3074ddc0-8709-448c-b863-d209d175a408
+ 2025-08-21 01:02:11 - INFO - [07ccf239-d23c-4776-8404-e885a43e8515] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_004.mp4'
+ 2025-08-21 01:02:11 - INFO - [07ccf239-d23c-4776-8404-e885a43e8515] Video saved to temporary file: temp_videos/07ccf239-d23c-4776-8404-e885a43e8515.mp4
+ 2025-08-21 01:02:11 - INFO - [07ccf239-d23c-4776-8404-e885a43e8515] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:02:16 - INFO - [07ccf239-d23c-4776-8404-e885a43e8515] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:02:16 - INFO - [07ccf239-d23c-4776-8404-e885a43e8515] 30 frames saved to temp_videos/07ccf239-d23c-4776-8404-e885a43e8515
+ 2025-08-21 01:02:29 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:02:43 - INFO - Tokens per second: 7.094428465980155, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:02:43 - INFO - [07ccf239-d23c-4776-8404-e885a43e8515] Inference time: 31.26 seconds, CPU usage: 36.1%, CPU core utilization: [16.6, 46.3, 15.5, 66.2]
+ 2025-08-21 01:02:43 - INFO - [07ccf239-d23c-4776-8404-e885a43e8515] Cleaned up temporary frame directory: temp_videos/07ccf239-d23c-4776-8404-e885a43e8515
+ 2025-08-21 01:02:43 - INFO - [2bbdb8ec-7655-434d-834f-cb58caa1d778] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_005.mp4'
+ 2025-08-21 01:02:43 - INFO - [2bbdb8ec-7655-434d-834f-cb58caa1d778] Video saved to temporary file: temp_videos/2bbdb8ec-7655-434d-834f-cb58caa1d778.mp4
+ 2025-08-21 01:02:43 - INFO - [2bbdb8ec-7655-434d-834f-cb58caa1d778] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:02:48 - INFO - [2bbdb8ec-7655-434d-834f-cb58caa1d778] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:02:48 - INFO - [2bbdb8ec-7655-434d-834f-cb58caa1d778] 30 frames saved to temp_videos/2bbdb8ec-7655-434d-834f-cb58caa1d778
+ 2025-08-21 01:03:00 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:03:11 - INFO - Tokens per second: 5.149733378245225, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:03:11 - INFO - [2bbdb8ec-7655-434d-834f-cb58caa1d778] Inference time: 28.50 seconds, CPU usage: 37.4%, CPU core utilization: [19.1, 93.3, 16.5, 20.5]
+ 2025-08-21 01:03:11 - INFO - [2bbdb8ec-7655-434d-834f-cb58caa1d778] Cleaned up temporary frame directory: temp_videos/2bbdb8ec-7655-434d-834f-cb58caa1d778
+ 2025-08-21 01:03:11 - INFO - [2e2162d1-d7a0-4b02-940b-60b27a47d77e] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_006.mp4'
+ 2025-08-21 01:03:11 - INFO - [2e2162d1-d7a0-4b02-940b-60b27a47d77e] Video saved to temporary file: temp_videos/2e2162d1-d7a0-4b02-940b-60b27a47d77e.mp4
+ 2025-08-21 01:03:11 - INFO - [2e2162d1-d7a0-4b02-940b-60b27a47d77e] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:03:16 - INFO - [2e2162d1-d7a0-4b02-940b-60b27a47d77e] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:03:16 - INFO - [2e2162d1-d7a0-4b02-940b-60b27a47d77e] 30 frames saved to temp_videos/2e2162d1-d7a0-4b02-940b-60b27a47d77e
+ 2025-08-21 01:03:29 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:03:37 - INFO - Tokens per second: 1.8942455900034438, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:03:37 - INFO - [2e2162d1-d7a0-4b02-940b-60b27a47d77e] Inference time: 25.68 seconds, CPU usage: 37.9%, CPU core utilization: [27.3, 31.8, 57.5, 35.1]
+ 2025-08-21 01:03:37 - INFO - [2e2162d1-d7a0-4b02-940b-60b27a47d77e] Cleaned up temporary frame directory: temp_videos/2e2162d1-d7a0-4b02-940b-60b27a47d77e
+ 2025-08-21 01:03:37 - INFO - [f08503e2-3694-4dd9-b73f-a2cebaac51af] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_007.mp4'
+ 2025-08-21 01:03:37 - INFO - [f08503e2-3694-4dd9-b73f-a2cebaac51af] Video saved to temporary file: temp_videos/f08503e2-3694-4dd9-b73f-a2cebaac51af.mp4
+ 2025-08-21 01:03:37 - INFO - [f08503e2-3694-4dd9-b73f-a2cebaac51af] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:03:42 - INFO - [f08503e2-3694-4dd9-b73f-a2cebaac51af] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:03:42 - INFO - [f08503e2-3694-4dd9-b73f-a2cebaac51af] 30 frames saved to temp_videos/f08503e2-3694-4dd9-b73f-a2cebaac51af
+ 2025-08-21 01:03:55 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:04:03 - INFO - Tokens per second: 2.3168022945047397, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:04:03 - INFO - [f08503e2-3694-4dd9-b73f-a2cebaac51af] Inference time: 25.99 seconds, CPU usage: 37.8%, CPU core utilization: [22.5, 61.1, 17.4, 50.2]
+ 2025-08-21 01:04:03 - INFO - [f08503e2-3694-4dd9-b73f-a2cebaac51af] Cleaned up temporary frame directory: temp_videos/f08503e2-3694-4dd9-b73f-a2cebaac51af
+ 2025-08-21 01:04:03 - INFO - [63cd6cc8-794f-4d25-9168-10dbba530d1d] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_008.mp4'
+ 2025-08-21 01:04:03 - INFO - [63cd6cc8-794f-4d25-9168-10dbba530d1d] Video saved to temporary file: temp_videos/63cd6cc8-794f-4d25-9168-10dbba530d1d.mp4
+ 2025-08-21 01:04:03 - INFO - [63cd6cc8-794f-4d25-9168-10dbba530d1d] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:04:08 - INFO - [63cd6cc8-794f-4d25-9168-10dbba530d1d] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:04:08 - INFO - [63cd6cc8-794f-4d25-9168-10dbba530d1d] 30 frames saved to temp_videos/63cd6cc8-794f-4d25-9168-10dbba530d1d
+ 2025-08-21 01:04:21 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:04:30 - INFO - Tokens per second: 4.072920119323832, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:04:30 - INFO - [63cd6cc8-794f-4d25-9168-10dbba530d1d] Inference time: 27.38 seconds, CPU usage: 37.7%, CPU core utilization: [26.2, 41.3, 45.1, 38.4]
+ 2025-08-21 01:04:30 - INFO - [63cd6cc8-794f-4d25-9168-10dbba530d1d] Cleaned up temporary frame directory: temp_videos/63cd6cc8-794f-4d25-9168-10dbba530d1d
+ 2025-08-21 01:04:30 - INFO - [9c39529e-632b-4ee3-8076-2ecd9890f71e] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_009.mp4'
+ 2025-08-21 01:04:30 - INFO - [9c39529e-632b-4ee3-8076-2ecd9890f71e] Video saved to temporary file: temp_videos/9c39529e-632b-4ee3-8076-2ecd9890f71e.mp4
+ 2025-08-21 01:04:30 - INFO - [9c39529e-632b-4ee3-8076-2ecd9890f71e] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:04:35 - INFO - [9c39529e-632b-4ee3-8076-2ecd9890f71e] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:04:35 - INFO - [9c39529e-632b-4ee3-8076-2ecd9890f71e] 30 frames saved to temp_videos/9c39529e-632b-4ee3-8076-2ecd9890f71e
+ 2025-08-21 01:04:48 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:04:56 - INFO - Tokens per second: 2.3167189804012382, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:04:56 - INFO - [9c39529e-632b-4ee3-8076-2ecd9890f71e] Inference time: 26.01 seconds, CPU usage: 38.1%, CPU core utilization: [21.9, 49.8, 55.8, 25.0]
+ 2025-08-21 01:04:56 - INFO - [9c39529e-632b-4ee3-8076-2ecd9890f71e] Cleaned up temporary frame directory: temp_videos/9c39529e-632b-4ee3-8076-2ecd9890f71e
+ 2025-08-21 01:04:56 - INFO - [81880b7c-ee68-4a30-849f-1d4ec94e298d] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_010.mp4'
+ 2025-08-21 01:04:56 - INFO - [81880b7c-ee68-4a30-849f-1d4ec94e298d] Video saved to temporary file: temp_videos/81880b7c-ee68-4a30-849f-1d4ec94e298d.mp4
+ 2025-08-21 01:04:56 - INFO - [81880b7c-ee68-4a30-849f-1d4ec94e298d] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:05:01 - INFO - [81880b7c-ee68-4a30-849f-1d4ec94e298d] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:05:01 - INFO - [81880b7c-ee68-4a30-849f-1d4ec94e298d] 30 frames saved to temp_videos/81880b7c-ee68-4a30-849f-1d4ec94e298d
+ 2025-08-21 01:05:14 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:05:24 - INFO - Tokens per second: 3.995493400475914, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:05:24 - INFO - [81880b7c-ee68-4a30-849f-1d4ec94e298d] Inference time: 27.32 seconds, CPU usage: 37.8%, CPU core utilization: [24.0, 30.3, 46.3, 50.6]
+ 2025-08-21 01:05:24 - INFO - [81880b7c-ee68-4a30-849f-1d4ec94e298d] Cleaned up temporary frame directory: temp_videos/81880b7c-ee68-4a30-849f-1d4ec94e298d
+ 2025-08-21 01:05:24 - INFO - [17dbdf76-a781-446b-92e4-20f0c09ea7fb] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_011.mp4'
+ 2025-08-21 01:05:24 - INFO - [17dbdf76-a781-446b-92e4-20f0c09ea7fb] Video saved to temporary file: temp_videos/17dbdf76-a781-446b-92e4-20f0c09ea7fb.mp4
+ 2025-08-21 01:05:24 - INFO - [17dbdf76-a781-446b-92e4-20f0c09ea7fb] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:05:28 - INFO - [17dbdf76-a781-446b-92e4-20f0c09ea7fb] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:05:28 - INFO - [17dbdf76-a781-446b-92e4-20f0c09ea7fb] 30 frames saved to temp_videos/17dbdf76-a781-446b-92e4-20f0c09ea7fb
+ 2025-08-21 01:05:41 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:05:53 - INFO - Tokens per second: 6.202907021110835, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:05:53 - INFO - [17dbdf76-a781-446b-92e4-20f0c09ea7fb] Inference time: 29.86 seconds, CPU usage: 36.8%, CPU core utilization: [55.8, 29.8, 44.5, 17.1]
+ 2025-08-21 01:05:53 - INFO - [17dbdf76-a781-446b-92e4-20f0c09ea7fb] Cleaned up temporary frame directory: temp_videos/17dbdf76-a781-446b-92e4-20f0c09ea7fb
+ 2025-08-21 01:05:53 - INFO - [cb51ff1b-2681-489a-944e-810fb0878d91] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_012.mp4'
+ 2025-08-21 01:05:53 - INFO - [cb51ff1b-2681-489a-944e-810fb0878d91] Video saved to temporary file: temp_videos/cb51ff1b-2681-489a-944e-810fb0878d91.mp4
+ 2025-08-21 01:05:53 - INFO - [cb51ff1b-2681-489a-944e-810fb0878d91] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:05:58 - INFO - [cb51ff1b-2681-489a-944e-810fb0878d91] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:05:58 - INFO - [cb51ff1b-2681-489a-944e-810fb0878d91] 30 frames saved to temp_videos/cb51ff1b-2681-489a-944e-810fb0878d91
+ 2025-08-21 01:06:11 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:06:24 - INFO - Tokens per second: 6.5597487765640565, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:06:24 - INFO - [cb51ff1b-2681-489a-944e-810fb0878d91] Inference time: 30.37 seconds, CPU usage: 36.6%, CPU core utilization: [25.1, 23.9, 51.7, 45.5]
+ 2025-08-21 01:06:24 - INFO - [cb51ff1b-2681-489a-944e-810fb0878d91] Cleaned up temporary frame directory: temp_videos/cb51ff1b-2681-489a-944e-810fb0878d91
+ 2025-08-21 01:06:24 - INFO - [05904247-99c1-419b-974f-352384eb4d6f] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_013.mp4'
+ 2025-08-21 01:06:24 - INFO - [05904247-99c1-419b-974f-352384eb4d6f] Video saved to temporary file: temp_videos/05904247-99c1-419b-974f-352384eb4d6f.mp4
+ 2025-08-21 01:06:24 - INFO - [05904247-99c1-419b-974f-352384eb4d6f] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:06:29 - INFO - [05904247-99c1-419b-974f-352384eb4d6f] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:06:29 - INFO - [05904247-99c1-419b-974f-352384eb4d6f] 30 frames saved to temp_videos/05904247-99c1-419b-974f-352384eb4d6f
+ 2025-08-21 01:06:42 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:06:54 - INFO - Tokens per second: 6.599854347514273, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:06:54 - INFO - [05904247-99c1-419b-974f-352384eb4d6f] Inference time: 30.52 seconds, CPU usage: 36.9%, CPU core utilization: [68.2, 18.6, 42.7, 18.1]
+ 2025-08-21 01:06:54 - INFO - [05904247-99c1-419b-974f-352384eb4d6f] Cleaned up temporary frame directory: temp_videos/05904247-99c1-419b-974f-352384eb4d6f
+ 2025-08-21 01:06:54 - INFO - [1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_014.mp4'
+ 2025-08-21 01:06:54 - INFO - [1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c] Video saved to temporary file: temp_videos/1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c.mp4
+ 2025-08-21 01:06:54 - INFO - [1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:06:59 - INFO - [1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:06:59 - INFO - [1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c] 30 frames saved to temp_videos/1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c
+ 2025-08-21 01:07:12 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:07:24 - INFO - Tokens per second: 6.004141023516783, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:07:24 - INFO - [1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c] Inference time: 29.61 seconds, CPU usage: 36.9%, CPU core utilization: [54.0, 34.3, 33.5, 25.6]
+ 2025-08-21 01:07:24 - INFO - [1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c] Cleaned up temporary frame directory: temp_videos/1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c
+ 2025-08-21 01:07:24 - INFO - [4180adcc-5e38-40df-9153-9dc175d55b7d] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_015.mp4'
+ 2025-08-21 01:07:24 - INFO - [4180adcc-5e38-40df-9153-9dc175d55b7d] Video saved to temporary file: temp_videos/4180adcc-5e38-40df-9153-9dc175d55b7d.mp4
+ 2025-08-21 01:07:24 - INFO - [4180adcc-5e38-40df-9153-9dc175d55b7d] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:07:29 - INFO - [4180adcc-5e38-40df-9153-9dc175d55b7d] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:07:29 - INFO - [4180adcc-5e38-40df-9153-9dc175d55b7d] 30 frames saved to temp_videos/4180adcc-5e38-40df-9153-9dc175d55b7d
+ 2025-08-21 01:07:42 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:07:57 - INFO - Tokens per second: 7.644809074843396, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:07:57 - INFO - [4180adcc-5e38-40df-9153-9dc175d55b7d] Inference time: 32.59 seconds, CPU usage: 35.7%, CPU core utilization: [26.1, 37.3, 48.7, 30.7]
+ 2025-08-21 01:07:57 - INFO - [4180adcc-5e38-40df-9153-9dc175d55b7d] Cleaned up temporary frame directory: temp_videos/4180adcc-5e38-40df-9153-9dc175d55b7d
+ 2025-08-21 01:07:57 - INFO - [31151bfe-e639-4354-8d86-1446365b7a6d] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_016.mp4'
+ 2025-08-21 01:07:57 - INFO - [31151bfe-e639-4354-8d86-1446365b7a6d] Video saved to temporary file: temp_videos/31151bfe-e639-4354-8d86-1446365b7a6d.mp4
+ 2025-08-21 01:07:57 - INFO - [31151bfe-e639-4354-8d86-1446365b7a6d] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:08:01 - INFO - [31151bfe-e639-4354-8d86-1446365b7a6d] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:08:02 - INFO - [31151bfe-e639-4354-8d86-1446365b7a6d] 30 frames saved to temp_videos/31151bfe-e639-4354-8d86-1446365b7a6d
+ 2025-08-21 01:08:14 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:08:24 - INFO - Tokens per second: 3.917112300612553, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:08:24 - INFO - [31151bfe-e639-4354-8d86-1446365b7a6d] Inference time: 27.22 seconds, CPU usage: 37.9%, CPU core utilization: [26.2, 35.0, 54.6, 35.9]
+ 2025-08-21 01:08:24 - INFO - [31151bfe-e639-4354-8d86-1446365b7a6d] Cleaned up temporary frame directory: temp_videos/31151bfe-e639-4354-8d86-1446365b7a6d
+ 2025-08-21 01:08:24 - INFO - [bea72135-e169-4ffa-8a4d-711a63d29de7] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_017.mp4'
+ 2025-08-21 01:08:24 - INFO - [bea72135-e169-4ffa-8a4d-711a63d29de7] Video saved to temporary file: temp_videos/bea72135-e169-4ffa-8a4d-711a63d29de7.mp4
+ 2025-08-21 01:08:24 - INFO - [bea72135-e169-4ffa-8a4d-711a63d29de7] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:08:29 - INFO - [bea72135-e169-4ffa-8a4d-711a63d29de7] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:08:29 - INFO - [bea72135-e169-4ffa-8a4d-711a63d29de7] 30 frames saved to temp_videos/bea72135-e169-4ffa-8a4d-711a63d29de7
+ 2025-08-21 01:08:42 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:08:52 - INFO - Tokens per second: 5.140461789786328, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:08:52 - INFO - [bea72135-e169-4ffa-8a4d-711a63d29de7] Inference time: 28.39 seconds, CPU usage: 36.8%, CPU core utilization: [30.2, 35.9, 49.2, 32.1]
+ 2025-08-21 01:08:52 - INFO - [bea72135-e169-4ffa-8a4d-711a63d29de7] Cleaned up temporary frame directory: temp_videos/bea72135-e169-4ffa-8a4d-711a63d29de7
+ 2025-08-21 01:08:52 - INFO - [0225d6ec-246f-4a57-8cd4-937fc512a6da] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_018.mp4'
+ 2025-08-21 01:08:52 - INFO - [0225d6ec-246f-4a57-8cd4-937fc512a6da] Video saved to temporary file: temp_videos/0225d6ec-246f-4a57-8cd4-937fc512a6da.mp4
+ 2025-08-21 01:08:52 - INFO - [0225d6ec-246f-4a57-8cd4-937fc512a6da] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:08:57 - INFO - [0225d6ec-246f-4a57-8cd4-937fc512a6da] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:08:57 - INFO - [0225d6ec-246f-4a57-8cd4-937fc512a6da] 30 frames saved to temp_videos/0225d6ec-246f-4a57-8cd4-937fc512a6da
+ 2025-08-21 01:09:10 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:09:25 - INFO - Tokens per second: 7.796685733470381, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:09:25 - INFO - [0225d6ec-246f-4a57-8cd4-937fc512a6da] Inference time: 32.85 seconds, CPU usage: 36.0%, CPU core utilization: [47.5, 17.9, 27.3, 51.2]
+ 2025-08-21 01:09:25 - INFO - [0225d6ec-246f-4a57-8cd4-937fc512a6da] Cleaned up temporary frame directory: temp_videos/0225d6ec-246f-4a57-8cd4-937fc512a6da
+ 2025-08-21 01:09:25 - INFO - [a4d435b1-e6a4-4c53-883e-4df02b912d3a] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_019.mp4'
+ 2025-08-21 01:09:25 - INFO - [a4d435b1-e6a4-4c53-883e-4df02b912d3a] Video saved to temporary file: temp_videos/a4d435b1-e6a4-4c53-883e-4df02b912d3a.mp4
+ 2025-08-21 01:09:25 - INFO - [a4d435b1-e6a4-4c53-883e-4df02b912d3a] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:09:30 - INFO - [a4d435b1-e6a4-4c53-883e-4df02b912d3a] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:09:30 - INFO - [a4d435b1-e6a4-4c53-883e-4df02b912d3a] 30 frames saved to temp_videos/a4d435b1-e6a4-4c53-883e-4df02b912d3a
+ 2025-08-21 01:09:43 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:09:59 - INFO - Tokens per second: 8.04906132671884, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:09:59 - INFO - [a4d435b1-e6a4-4c53-883e-4df02b912d3a] Inference time: 33.57 seconds, CPU usage: 35.7%, CPU core utilization: [28.8, 35.7, 39.0, 39.3]
+ 2025-08-21 01:09:59 - INFO - [a4d435b1-e6a4-4c53-883e-4df02b912d3a] Cleaned up temporary frame directory: temp_videos/a4d435b1-e6a4-4c53-883e-4df02b912d3a
+ 2025-08-21 01:09:59 - INFO - [cb7f092b-c376-4f47-80ca-804b08b972a4] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_020.mp4'
+ 2025-08-21 01:09:59 - INFO - [cb7f092b-c376-4f47-80ca-804b08b972a4] Video saved to temporary file: temp_videos/cb7f092b-c376-4f47-80ca-804b08b972a4.mp4
+ 2025-08-21 01:09:59 - INFO - [cb7f092b-c376-4f47-80ca-804b08b972a4] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:10:04 - INFO - [cb7f092b-c376-4f47-80ca-804b08b972a4] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:10:04 - INFO - [cb7f092b-c376-4f47-80ca-804b08b972a4] 30 frames saved to temp_videos/cb7f092b-c376-4f47-80ca-804b08b972a4
+ 2025-08-21 01:10:16 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:10:29 - INFO - Tokens per second: 6.551982232700224, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:10:29 - INFO - [cb7f092b-c376-4f47-80ca-804b08b972a4] Inference time: 30.41 seconds, CPU usage: 36.7%, CPU core utilization: [35.2, 27.6, 62.1, 22.0]
+ 2025-08-21 01:10:29 - INFO - [cb7f092b-c376-4f47-80ca-804b08b972a4] Cleaned up temporary frame directory: temp_videos/cb7f092b-c376-4f47-80ca-804b08b972a4
+ 2025-08-21 01:10:29 - INFO - [2e42703b-a52c-49f8-a396-ae02062d1c39] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_021.mp4'
+ 2025-08-21 01:10:29 - INFO - [2e42703b-a52c-49f8-a396-ae02062d1c39] Video saved to temporary file: temp_videos/2e42703b-a52c-49f8-a396-ae02062d1c39.mp4
+ 2025-08-21 01:10:29 - INFO - [2e42703b-a52c-49f8-a396-ae02062d1c39] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:10:34 - INFO - [2e42703b-a52c-49f8-a396-ae02062d1c39] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:10:34 - INFO - [2e42703b-a52c-49f8-a396-ae02062d1c39] 30 frames saved to temp_videos/2e42703b-a52c-49f8-a396-ae02062d1c39
+ 2025-08-21 01:10:47 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:10:58 - INFO - Tokens per second: 5.068557863754886, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:10:58 - INFO - [2e42703b-a52c-49f8-a396-ae02062d1c39] Inference time: 28.44 seconds, CPU usage: 37.4%, CPU core utilization: [24.8, 47.0, 16.6, 61.0]
+ 2025-08-21 01:10:58 - INFO - [2e42703b-a52c-49f8-a396-ae02062d1c39] Cleaned up temporary frame directory: temp_videos/2e42703b-a52c-49f8-a396-ae02062d1c39
+ 2025-08-21 01:10:58 - INFO - [b5420571-277f-43c0-ba2e-141c5b252721] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_022.mp4'
+ 2025-08-21 01:10:58 - INFO - [b5420571-277f-43c0-ba2e-141c5b252721] Video saved to temporary file: temp_videos/b5420571-277f-43c0-ba2e-141c5b252721.mp4
+ 2025-08-21 01:10:58 - INFO - [b5420571-277f-43c0-ba2e-141c5b252721] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:11:02 - INFO - [b5420571-277f-43c0-ba2e-141c5b252721] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:11:02 - INFO - [b5420571-277f-43c0-ba2e-141c5b252721] 30 frames saved to temp_videos/b5420571-277f-43c0-ba2e-141c5b252721
+ 2025-08-21 01:11:15 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:11:26 - INFO - Tokens per second: 4.765763711580632, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:11:26 - INFO - [b5420571-277f-43c0-ba2e-141c5b252721] Inference time: 27.98 seconds, CPU usage: 37.6%, CPU core utilization: [43.1, 34.8, 55.9, 16.7]
+ 2025-08-21 01:11:26 - INFO - [b5420571-277f-43c0-ba2e-141c5b252721] Cleaned up temporary frame directory: temp_videos/b5420571-277f-43c0-ba2e-141c5b252721
+ 2025-08-21 01:11:26 - INFO - [0df499a8-8a08-4b6b-a8bd-63fd84cde688] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_023.mp4'
+ 2025-08-21 01:11:26 - INFO - [0df499a8-8a08-4b6b-a8bd-63fd84cde688] Video saved to temporary file: temp_videos/0df499a8-8a08-4b6b-a8bd-63fd84cde688.mp4
+ 2025-08-21 01:11:26 - INFO - [0df499a8-8a08-4b6b-a8bd-63fd84cde688] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:11:30 - INFO - [0df499a8-8a08-4b6b-a8bd-63fd84cde688] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:11:31 - INFO - [0df499a8-8a08-4b6b-a8bd-63fd84cde688] 30 frames saved to temp_videos/0df499a8-8a08-4b6b-a8bd-63fd84cde688
+ 2025-08-21 01:11:43 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:11:58 - INFO - Tokens per second: 7.483257212457778, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:11:58 - INFO - [0df499a8-8a08-4b6b-a8bd-63fd84cde688] Inference time: 32.23 seconds, CPU usage: 36.1%, CPU core utilization: [26.1, 27.6, 47.6, 43.3]
+ 2025-08-21 01:11:58 - INFO - [0df499a8-8a08-4b6b-a8bd-63fd84cde688] Cleaned up temporary frame directory: temp_videos/0df499a8-8a08-4b6b-a8bd-63fd84cde688
+ 2025-08-21 01:11:58 - INFO - [1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_024.mp4'
+ 2025-08-21 01:11:58 - INFO - [1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a] Video saved to temporary file: temp_videos/1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a.mp4
+ 2025-08-21 01:11:58 - INFO - [1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:12:03 - INFO - [1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:12:03 - INFO - [1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a] 30 frames saved to temp_videos/1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a
+ 2025-08-21 01:12:16 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:12:25 - INFO - Tokens per second: 3.676425320411625, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:12:25 - INFO - [1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a] Inference time: 27.02 seconds, CPU usage: 38.5%, CPU core utilization: [40.1, 28.5, 57.3, 28.2]
+ 2025-08-21 01:12:25 - INFO - [1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a] Cleaned up temporary frame directory: temp_videos/1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a
+ 2025-08-21 01:12:25 - INFO - [5794ceef-3e1b-4291-b263-2b236146168a] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_025.mp4'
+ 2025-08-21 01:12:25 - INFO - [5794ceef-3e1b-4291-b263-2b236146168a] Video saved to temporary file: temp_videos/5794ceef-3e1b-4291-b263-2b236146168a.mp4
+ 2025-08-21 01:12:25 - INFO - [5794ceef-3e1b-4291-b263-2b236146168a] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:12:30 - INFO - [5794ceef-3e1b-4291-b263-2b236146168a] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:12:30 - INFO - [5794ceef-3e1b-4291-b263-2b236146168a] 30 frames saved to temp_videos/5794ceef-3e1b-4291-b263-2b236146168a
+ 2025-08-21 01:12:43 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:12:53 - INFO - Tokens per second: 5.01668206513031, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:12:53 - INFO - [5794ceef-3e1b-4291-b263-2b236146168a] Inference time: 28.36 seconds, CPU usage: 37.5%, CPU core utilization: [29.8, 41.1, 41.8, 37.3]
+ 2025-08-21 01:12:53 - INFO - [5794ceef-3e1b-4291-b263-2b236146168a] Cleaned up temporary frame directory: temp_videos/5794ceef-3e1b-4291-b263-2b236146168a
+ 2025-08-21 01:12:53 - INFO - [686109f9-7315-4a90-8563-98c12607d0a8] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_026.mp4'
+ 2025-08-21 01:12:53 - INFO - [686109f9-7315-4a90-8563-98c12607d0a8] Video saved to temporary file: temp_videos/686109f9-7315-4a90-8563-98c12607d0a8.mp4
+ 2025-08-21 01:12:53 - INFO - [686109f9-7315-4a90-8563-98c12607d0a8] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:12:58 - INFO - [686109f9-7315-4a90-8563-98c12607d0a8] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:12:58 - INFO - [686109f9-7315-4a90-8563-98c12607d0a8] 30 frames saved to temp_videos/686109f9-7315-4a90-8563-98c12607d0a8
+ 2025-08-21 01:13:11 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:13:23 - INFO - Tokens per second: 6.198040985676184, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:13:23 - INFO - [686109f9-7315-4a90-8563-98c12607d0a8] Inference time: 29.83 seconds, CPU usage: 37.0%, CPU core utilization: [26.8, 32.5, 30.5, 58.1]
+ 2025-08-21 01:13:23 - INFO - [686109f9-7315-4a90-8563-98c12607d0a8] Cleaned up temporary frame directory: temp_videos/686109f9-7315-4a90-8563-98c12607d0a8
+ 2025-08-21 01:13:23 - INFO - [3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_027.mp4'
+ 2025-08-21 01:13:23 - INFO - [3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c] Video saved to temporary file: temp_videos/3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c.mp4
+ 2025-08-21 01:13:23 - INFO - [3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:13:28 - INFO - [3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:13:28 - INFO - [3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c] 30 frames saved to temp_videos/3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c
+ 2025-08-21 01:13:41 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:13:54 - INFO - Tokens per second: 6.7633887422454855, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:13:54 - INFO - [3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c] Inference time: 30.77 seconds, CPU usage: 36.5%, CPU core utilization: [64.3, 25.4, 39.5, 16.6]
+ 2025-08-21 01:13:54 - INFO - [3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c] Cleaned up temporary frame directory: temp_videos/3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c
+ 2025-08-21 01:13:54 - INFO - [c0db1c86-c4f5-4f7a-92ec-0b0631bde80c] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_028.mp4'
+ 2025-08-21 01:13:54 - INFO - [c0db1c86-c4f5-4f7a-92ec-0b0631bde80c] Video saved to temporary file: temp_videos/c0db1c86-c4f5-4f7a-92ec-0b0631bde80c.mp4
+ 2025-08-21 01:13:54 - INFO - [c0db1c86-c4f5-4f7a-92ec-0b0631bde80c] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:13:59 - INFO - [c0db1c86-c4f5-4f7a-92ec-0b0631bde80c] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:13:59 - INFO - [c0db1c86-c4f5-4f7a-92ec-0b0631bde80c] 30 frames saved to temp_videos/c0db1c86-c4f5-4f7a-92ec-0b0631bde80c
+ 2025-08-21 01:14:12 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:14:24 - INFO - Tokens per second: 6.241116142632914, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:14:24 - INFO - [c0db1c86-c4f5-4f7a-92ec-0b0631bde80c] Inference time: 29.90 seconds, CPU usage: 36.9%, CPU core utilization: [52.1, 24.6, 27.8, 43.1]
+ 2025-08-21 01:14:24 - INFO - [c0db1c86-c4f5-4f7a-92ec-0b0631bde80c] Cleaned up temporary frame directory: temp_videos/c0db1c86-c4f5-4f7a-92ec-0b0631bde80c
+ 2025-08-21 01:14:24 - INFO - [bb36dcfa-9c43-4bf7-8c8e-becac106dbbb] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_029.mp4'
+ 2025-08-21 01:14:24 - INFO - [bb36dcfa-9c43-4bf7-8c8e-becac106dbbb] Video saved to temporary file: temp_videos/bb36dcfa-9c43-4bf7-8c8e-becac106dbbb.mp4
+ 2025-08-21 01:14:24 - INFO - [bb36dcfa-9c43-4bf7-8c8e-becac106dbbb] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:14:29 - INFO - [bb36dcfa-9c43-4bf7-8c8e-becac106dbbb] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:14:29 - INFO - [bb36dcfa-9c43-4bf7-8c8e-becac106dbbb] 30 frames saved to temp_videos/bb36dcfa-9c43-4bf7-8c8e-becac106dbbb
+ 2025-08-21 01:14:42 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:14:53 - INFO - Tokens per second: 5.427872993526753, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:14:53 - INFO - [bb36dcfa-9c43-4bf7-8c8e-becac106dbbb] Inference time: 28.90 seconds, CPU usage: 36.8%, CPU core utilization: [58.1, 25.7, 27.4, 36.1]
+ 2025-08-21 01:14:53 - INFO - [bb36dcfa-9c43-4bf7-8c8e-becac106dbbb] Cleaned up temporary frame directory: temp_videos/bb36dcfa-9c43-4bf7-8c8e-becac106dbbb
+ 2025-08-21 01:14:53 - INFO - [78a73f26-9f35-4eff-bd02-6c440580ce76] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_030.mp4'
+ 2025-08-21 01:14:53 - INFO - [78a73f26-9f35-4eff-bd02-6c440580ce76] Video saved to temporary file: temp_videos/78a73f26-9f35-4eff-bd02-6c440580ce76.mp4
+ 2025-08-21 01:14:53 - INFO - [78a73f26-9f35-4eff-bd02-6c440580ce76] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:14:58 - INFO - [78a73f26-9f35-4eff-bd02-6c440580ce76] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:14:58 - INFO - [78a73f26-9f35-4eff-bd02-6c440580ce76] 30 frames saved to temp_videos/78a73f26-9f35-4eff-bd02-6c440580ce76
+ 2025-08-21 01:15:11 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:15:24 - INFO - Tokens per second: 6.759735939024082, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:15:24 - INFO - [78a73f26-9f35-4eff-bd02-6c440580ce76] Inference time: 30.85 seconds, CPU usage: 36.6%, CPU core utilization: [41.3, 20.6, 31.9, 52.4]
+ 2025-08-21 01:15:24 - INFO - [78a73f26-9f35-4eff-bd02-6c440580ce76] Cleaned up temporary frame directory: temp_videos/78a73f26-9f35-4eff-bd02-6c440580ce76
+ 2025-08-21 01:15:24 - INFO - [59928551-1922-42e9-b7ef-b8f27f8d44a7] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_031.mp4'
+ 2025-08-21 01:15:24 - INFO - [59928551-1922-42e9-b7ef-b8f27f8d44a7] Video saved to temporary file: temp_videos/59928551-1922-42e9-b7ef-b8f27f8d44a7.mp4
+ 2025-08-21 01:15:24 - INFO - [59928551-1922-42e9-b7ef-b8f27f8d44a7] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:15:28 - INFO - [59928551-1922-42e9-b7ef-b8f27f8d44a7] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:15:28 - INFO - [59928551-1922-42e9-b7ef-b8f27f8d44a7] 30 frames saved to temp_videos/59928551-1922-42e9-b7ef-b8f27f8d44a7
+ 2025-08-21 01:15:41 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:15:52 - INFO - Tokens per second: 5.539399356957513, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:15:52 - INFO - [59928551-1922-42e9-b7ef-b8f27f8d44a7] Inference time: 28.92 seconds, CPU usage: 37.0%, CPU core utilization: [36.2, 29.1, 66.1, 16.6]
+ 2025-08-21 01:15:53 - INFO - [59928551-1922-42e9-b7ef-b8f27f8d44a7] Cleaned up temporary frame directory: temp_videos/59928551-1922-42e9-b7ef-b8f27f8d44a7
+ 2025-08-21 01:15:53 - INFO - [824a7ab4-629c-45f5-9a3d-4a4db67f847f] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_032.mp4'
+ 2025-08-21 01:15:53 - INFO - [824a7ab4-629c-45f5-9a3d-4a4db67f847f] Video saved to temporary file: temp_videos/824a7ab4-629c-45f5-9a3d-4a4db67f847f.mp4
+ 2025-08-21 01:15:53 - INFO - [824a7ab4-629c-45f5-9a3d-4a4db67f847f] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:15:57 - INFO - [824a7ab4-629c-45f5-9a3d-4a4db67f847f] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:15:57 - INFO - [824a7ab4-629c-45f5-9a3d-4a4db67f847f] 30 frames saved to temp_videos/824a7ab4-629c-45f5-9a3d-4a4db67f847f
+ 2025-08-21 01:16:10 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:16:19 - INFO - Tokens per second: 3.2576537369296608, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:16:19 - INFO - [824a7ab4-629c-45f5-9a3d-4a4db67f847f] Inference time: 26.58 seconds, CPU usage: 38.1%, CPU core utilization: [48.7, 36.2, 31.6, 36.0]
+ 2025-08-21 01:16:19 - INFO - [824a7ab4-629c-45f5-9a3d-4a4db67f847f] Cleaned up temporary frame directory: temp_videos/824a7ab4-629c-45f5-9a3d-4a4db67f847f
+ 2025-08-21 01:16:19 - INFO - [9483b45f-591e-4e30-b51b-94a81dec9839] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_033.mp4'
+ 2025-08-21 01:16:19 - INFO - [9483b45f-591e-4e30-b51b-94a81dec9839] Video saved to temporary file: temp_videos/9483b45f-591e-4e30-b51b-94a81dec9839.mp4
+ 2025-08-21 01:16:19 - INFO - [9483b45f-591e-4e30-b51b-94a81dec9839] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:16:24 - INFO - [9483b45f-591e-4e30-b51b-94a81dec9839] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:16:24 - INFO - [9483b45f-591e-4e30-b51b-94a81dec9839] 30 frames saved to temp_videos/9483b45f-591e-4e30-b51b-94a81dec9839
+ 2025-08-21 01:16:37 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:16:49 - INFO - Tokens per second: 6.511908523166135, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:16:49 - INFO - [9483b45f-591e-4e30-b51b-94a81dec9839] Inference time: 30.29 seconds, CPU usage: 36.5%, CPU core utilization: [55.4, 25.4, 48.9, 16.2]
+ 2025-08-21 01:16:49 - INFO - [9483b45f-591e-4e30-b51b-94a81dec9839] Cleaned up temporary frame directory: temp_videos/9483b45f-591e-4e30-b51b-94a81dec9839
+ 2025-08-21 01:16:49 - INFO - [8229188c-0cc1-4717-8e19-43ece6e413b5] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_034.mp4'
+ 2025-08-21 01:16:49 - INFO - [8229188c-0cc1-4717-8e19-43ece6e413b5] Video saved to temporary file: temp_videos/8229188c-0cc1-4717-8e19-43ece6e413b5.mp4
+ 2025-08-21 01:16:49 - INFO - [8229188c-0cc1-4717-8e19-43ece6e413b5] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:16:54 - INFO - [8229188c-0cc1-4717-8e19-43ece6e413b5] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:16:54 - INFO - [8229188c-0cc1-4717-8e19-43ece6e413b5] 30 frames saved to temp_videos/8229188c-0cc1-4717-8e19-43ece6e413b5
+ 2025-08-21 01:17:07 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:17:18 - INFO - Tokens per second: 5.322139736625115, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:17:18 - INFO - [8229188c-0cc1-4717-8e19-43ece6e413b5] Inference time: 28.59 seconds, CPU usage: 37.4%, CPU core utilization: [41.1, 43.3, 45.5, 19.4]
+ 2025-08-21 01:17:18 - INFO - [8229188c-0cc1-4717-8e19-43ece6e413b5] Cleaned up temporary frame directory: temp_videos/8229188c-0cc1-4717-8e19-43ece6e413b5
+ 2025-08-21 01:17:18 - INFO - [00df263f-0a55-4646-a5f1-7392cbd3a66e] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_035.mp4'
+ 2025-08-21 01:17:18 - INFO - [00df263f-0a55-4646-a5f1-7392cbd3a66e] Video saved to temporary file: temp_videos/00df263f-0a55-4646-a5f1-7392cbd3a66e.mp4
+ 2025-08-21 01:17:18 - INFO - [00df263f-0a55-4646-a5f1-7392cbd3a66e] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:17:23 - INFO - [00df263f-0a55-4646-a5f1-7392cbd3a66e] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:17:23 - INFO - [00df263f-0a55-4646-a5f1-7392cbd3a66e] 30 frames saved to temp_videos/00df263f-0a55-4646-a5f1-7392cbd3a66e
+ 2025-08-21 01:17:36 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:17:44 - INFO - Tokens per second: 2.001110574040108, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:17:44 - INFO - [00df263f-0a55-4646-a5f1-7392cbd3a66e] Inference time: 25.77 seconds, CPU usage: 38.4%, CPU core utilization: [35.6, 36.8, 32.8, 48.4]
+ 2025-08-21 01:17:44 - INFO - [00df263f-0a55-4646-a5f1-7392cbd3a66e] Cleaned up temporary frame directory: temp_videos/00df263f-0a55-4646-a5f1-7392cbd3a66e
+ 2025-08-21 01:17:44 - INFO - [56354734-256f-4dd4-a7ca-33f9f37b588a] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_036.mp4'
+ 2025-08-21 01:17:44 - INFO - [56354734-256f-4dd4-a7ca-33f9f37b588a] Video saved to temporary file: temp_videos/56354734-256f-4dd4-a7ca-33f9f37b588a.mp4
+ 2025-08-21 01:17:44 - INFO - [56354734-256f-4dd4-a7ca-33f9f37b588a] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:17:49 - INFO - [56354734-256f-4dd4-a7ca-33f9f37b588a] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:17:49 - INFO - [56354734-256f-4dd4-a7ca-33f9f37b588a] 30 frames saved to temp_videos/56354734-256f-4dd4-a7ca-33f9f37b588a
+ 2025-08-21 01:18:02 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:18:11 - INFO - Tokens per second: 4.287825767821965, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:18:11 - INFO - [56354734-256f-4dd4-a7ca-33f9f37b588a] Inference time: 27.47 seconds, CPU usage: 37.6%, CPU core utilization: [18.0, 46.2, 68.9, 17.6]
+ 2025-08-21 01:18:11 - INFO - [56354734-256f-4dd4-a7ca-33f9f37b588a] Cleaned up temporary frame directory: temp_videos/56354734-256f-4dd4-a7ca-33f9f37b588a
+ 2025-08-21 01:18:11 - INFO - [43fc9b52-3741-493c-b317-62cd85256985] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_037.mp4'
+ 2025-08-21 01:18:11 - INFO - [43fc9b52-3741-493c-b317-62cd85256985] Video saved to temporary file: temp_videos/43fc9b52-3741-493c-b317-62cd85256985.mp4
+ 2025-08-21 01:18:11 - INFO - [43fc9b52-3741-493c-b317-62cd85256985] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:18:16 - INFO - [43fc9b52-3741-493c-b317-62cd85256985] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:18:16 - INFO - [43fc9b52-3741-493c-b317-62cd85256985] 30 frames saved to temp_videos/43fc9b52-3741-493c-b317-62cd85256985
+ 2025-08-21 01:18:29 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:18:40 - INFO - Tokens per second: 5.201636019010958, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:18:40 - INFO - [43fc9b52-3741-493c-b317-62cd85256985] Inference time: 28.50 seconds, CPU usage: 37.4%, CPU core utilization: [36.0, 34.1, 41.3, 38.1]
+ 2025-08-21 01:18:40 - INFO - [43fc9b52-3741-493c-b317-62cd85256985] Cleaned up temporary frame directory: temp_videos/43fc9b52-3741-493c-b317-62cd85256985
+ 2025-08-21 01:18:40 - INFO - [bb21c0a9-84dd-4b05-8543-a9d4b52958ba] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_038.mp4'
+ 2025-08-21 01:18:40 - INFO - [bb21c0a9-84dd-4b05-8543-a9d4b52958ba] Video saved to temporary file: temp_videos/bb21c0a9-84dd-4b05-8543-a9d4b52958ba.mp4
+ 2025-08-21 01:18:40 - INFO - [bb21c0a9-84dd-4b05-8543-a9d4b52958ba] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:18:45 - INFO - [bb21c0a9-84dd-4b05-8543-a9d4b52958ba] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:18:45 - INFO - [bb21c0a9-84dd-4b05-8543-a9d4b52958ba] 30 frames saved to temp_videos/bb21c0a9-84dd-4b05-8543-a9d4b52958ba
+ 2025-08-21 01:18:58 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:19:09 - INFO - Tokens per second: 5.959308976482591, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:19:09 - INFO - [bb21c0a9-84dd-4b05-8543-a9d4b52958ba] Inference time: 29.47 seconds, CPU usage: 36.6%, CPU core utilization: [47.9, 26.7, 42.6, 29.1]
+ 2025-08-21 01:19:09 - INFO - [bb21c0a9-84dd-4b05-8543-a9d4b52958ba] Cleaned up temporary frame directory: temp_videos/bb21c0a9-84dd-4b05-8543-a9d4b52958ba
+ 2025-08-21 01:19:09 - INFO - [c1c756cf-8d88-40f1-99d6-35014bca5417] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_039.mp4'
+ 2025-08-21 01:19:09 - INFO - [c1c756cf-8d88-40f1-99d6-35014bca5417] Video saved to temporary file: temp_videos/c1c756cf-8d88-40f1-99d6-35014bca5417.mp4
+ 2025-08-21 01:19:09 - INFO - [c1c756cf-8d88-40f1-99d6-35014bca5417] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:19:14 - INFO - [c1c756cf-8d88-40f1-99d6-35014bca5417] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:19:14 - INFO - [c1c756cf-8d88-40f1-99d6-35014bca5417] 30 frames saved to temp_videos/c1c756cf-8d88-40f1-99d6-35014bca5417
+ 2025-08-21 01:19:27 - INFO - vision_config is None, using default vision config
+ 2025-08-21 01:19:36 - INFO - Tokens per second: 3.3460227628996955, Peak GPU memory MB: 11824.375
+ 2025-08-21 01:19:36 - INFO - [c1c756cf-8d88-40f1-99d6-35014bca5417] Inference time: 26.83 seconds, CPU usage: 38.0%, CPU core utilization: [50.8, 48.6, 31.4, 21.4]
355
+ 2025-08-21 01:19:36 - INFO - [c1c756cf-8d88-40f1-99d6-35014bca5417] Cleaned up temporary frame directory: temp_videos/c1c756cf-8d88-40f1-99d6-35014bca5417
356
+ 2025-08-21 01:19:36 - INFO - [e408569f-7a5d-4851-961e-1b0408acf6fd] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_040.mp4'
357
+ 2025-08-21 01:19:36 - INFO - [e408569f-7a5d-4851-961e-1b0408acf6fd] Video saved to temporary file: temp_videos/e408569f-7a5d-4851-961e-1b0408acf6fd.mp4
358
+ 2025-08-21 01:19:36 - INFO - [e408569f-7a5d-4851-961e-1b0408acf6fd] Extracting frames using method: uniform, rate/threshold: 30
359
+ 2025-08-21 01:19:41 - INFO - [e408569f-7a5d-4851-961e-1b0408acf6fd] Extracted 30 frames successfully. Saving to temporary files...
360
+ 2025-08-21 01:19:41 - INFO - [e408569f-7a5d-4851-961e-1b0408acf6fd] 30 frames saved to temp_videos/e408569f-7a5d-4851-961e-1b0408acf6fd
361
+ 2025-08-21 01:19:54 - INFO - vision_config is None, using default vision config
362
+ 2025-08-21 01:20:03 - INFO - Tokens per second: 3.758188071762749, Peak GPU memory MB: 11824.375
363
+ 2025-08-21 01:20:03 - INFO - [e408569f-7a5d-4851-961e-1b0408acf6fd] Inference time: 27.11 seconds, CPU usage: 37.6%, CPU core utilization: [32.3, 62.0, 32.1, 23.9]
364
+ 2025-08-21 01:20:03 - INFO - [e408569f-7a5d-4851-961e-1b0408acf6fd] Cleaned up temporary frame directory: temp_videos/e408569f-7a5d-4851-961e-1b0408acf6fd
365
+ 2025-08-21 01:20:03 - INFO - [5dac0a79-3673-4ab1-b2ca-cefc83712b60] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_041.mp4'
366
+ 2025-08-21 01:20:03 - INFO - [5dac0a79-3673-4ab1-b2ca-cefc83712b60] Video saved to temporary file: temp_videos/5dac0a79-3673-4ab1-b2ca-cefc83712b60.mp4
367
+ 2025-08-21 01:20:03 - INFO - [5dac0a79-3673-4ab1-b2ca-cefc83712b60] Extracting frames using method: uniform, rate/threshold: 30
368
+ 2025-08-21 01:20:08 - INFO - [5dac0a79-3673-4ab1-b2ca-cefc83712b60] Extracted 30 frames successfully. Saving to temporary files...
369
+ 2025-08-21 01:20:08 - INFO - [5dac0a79-3673-4ab1-b2ca-cefc83712b60] 30 frames saved to temp_videos/5dac0a79-3673-4ab1-b2ca-cefc83712b60
370
+ 2025-08-21 01:20:21 - INFO - vision_config is None, using default vision config
371
+ 2025-08-21 01:20:30 - INFO - Tokens per second: 3.839752750295925, Peak GPU memory MB: 11824.375
372
+ 2025-08-21 01:20:30 - INFO - [5dac0a79-3673-4ab1-b2ca-cefc83712b60] Inference time: 27.12 seconds, CPU usage: 37.8%, CPU core utilization: [28.4, 42.0, 34.0, 46.9]
373
+ 2025-08-21 01:20:30 - INFO - [5dac0a79-3673-4ab1-b2ca-cefc83712b60] Cleaned up temporary frame directory: temp_videos/5dac0a79-3673-4ab1-b2ca-cefc83712b60
374
+ 2025-08-21 01:20:30 - INFO - [92c0e821-54b8-4c96-801a-be04166c4502] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_042.mp4'
375
+ 2025-08-21 01:20:30 - INFO - [92c0e821-54b8-4c96-801a-be04166c4502] Video saved to temporary file: temp_videos/92c0e821-54b8-4c96-801a-be04166c4502.mp4
376
+ 2025-08-21 01:20:30 - INFO - [92c0e821-54b8-4c96-801a-be04166c4502] Extracting frames using method: uniform, rate/threshold: 30
377
+ 2025-08-21 01:20:35 - INFO - [92c0e821-54b8-4c96-801a-be04166c4502] Extracted 30 frames successfully. Saving to temporary files...
378
+ 2025-08-21 01:20:35 - INFO - [92c0e821-54b8-4c96-801a-be04166c4502] 30 frames saved to temp_videos/92c0e821-54b8-4c96-801a-be04166c4502
379
+ 2025-08-21 01:20:48 - INFO - vision_config is None, using default vision config
380
+ 2025-08-21 01:21:01 - INFO - Tokens per second: 6.5598252890181605, Peak GPU memory MB: 11824.375
381
+ 2025-08-21 01:21:01 - INFO - [92c0e821-54b8-4c96-801a-be04166c4502] Inference time: 30.37 seconds, CPU usage: 36.1%, CPU core utilization: [20.5, 62.7, 16.5, 44.7]
382
+ 2025-08-21 01:21:01 - INFO - [92c0e821-54b8-4c96-801a-be04166c4502] Cleaned up temporary frame directory: temp_videos/92c0e821-54b8-4c96-801a-be04166c4502
383
+ 2025-08-21 01:21:01 - INFO - [f542ced9-0803-492e-a5c3-1a8cf04f1129] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_043.mp4'
384
+ 2025-08-21 01:21:01 - INFO - [f542ced9-0803-492e-a5c3-1a8cf04f1129] Video saved to temporary file: temp_videos/f542ced9-0803-492e-a5c3-1a8cf04f1129.mp4
385
+ 2025-08-21 01:21:01 - INFO - [f542ced9-0803-492e-a5c3-1a8cf04f1129] Extracting frames using method: uniform, rate/threshold: 30
386
+ 2025-08-21 01:21:06 - INFO - [f542ced9-0803-492e-a5c3-1a8cf04f1129] Extracted 30 frames successfully. Saving to temporary files...
387
+ 2025-08-21 01:21:06 - INFO - [f542ced9-0803-492e-a5c3-1a8cf04f1129] 30 frames saved to temp_videos/f542ced9-0803-492e-a5c3-1a8cf04f1129
388
+ 2025-08-21 01:21:19 - INFO - vision_config is None, using default vision config
389
+ 2025-08-21 01:21:31 - INFO - Tokens per second: 6.379927791197094, Peak GPU memory MB: 11824.375
390
+ 2025-08-21 01:21:31 - INFO - [f542ced9-0803-492e-a5c3-1a8cf04f1129] Inference time: 30.11 seconds, CPU usage: 36.8%, CPU core utilization: [27.4, 48.1, 37.8, 33.9]
391
+ 2025-08-21 01:21:31 - INFO - [f542ced9-0803-492e-a5c3-1a8cf04f1129] Cleaned up temporary frame directory: temp_videos/f542ced9-0803-492e-a5c3-1a8cf04f1129
392
+ 2025-08-21 01:21:31 - INFO - [01cce6ea-c917-49ae-b644-91a34b7204c5] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_044.mp4'
393
+ 2025-08-21 01:21:31 - INFO - [01cce6ea-c917-49ae-b644-91a34b7204c5] Video saved to temporary file: temp_videos/01cce6ea-c917-49ae-b644-91a34b7204c5.mp4
394
+ 2025-08-21 01:21:31 - INFO - [01cce6ea-c917-49ae-b644-91a34b7204c5] Extracting frames using method: uniform, rate/threshold: 30
395
+ 2025-08-21 01:21:36 - INFO - [01cce6ea-c917-49ae-b644-91a34b7204c5] Extracted 30 frames successfully. Saving to temporary files...
396
+ 2025-08-21 01:21:36 - INFO - [01cce6ea-c917-49ae-b644-91a34b7204c5] 30 frames saved to temp_videos/01cce6ea-c917-49ae-b644-91a34b7204c5
397
+ 2025-08-21 01:21:49 - INFO - vision_config is None, using default vision config
398
+ 2025-08-21 01:21:58 - INFO - Tokens per second: 3.7502463323747084, Peak GPU memory MB: 11824.375
399
+ 2025-08-21 01:21:58 - INFO - [01cce6ea-c917-49ae-b644-91a34b7204c5] Inference time: 27.08 seconds, CPU usage: 48.6%, CPU core utilization: [82.7, 37.0, 40.9, 33.6]
400
+ 2025-08-21 01:21:58 - INFO - [01cce6ea-c917-49ae-b644-91a34b7204c5] Cleaned up temporary frame directory: temp_videos/01cce6ea-c917-49ae-b644-91a34b7204c5
401
+ 2025-08-21 01:21:58 - INFO - [9738fd4f-d14a-4967-bdb7-b1c9156add2a] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_045.mp4'
402
+ 2025-08-21 01:21:58 - INFO - [9738fd4f-d14a-4967-bdb7-b1c9156add2a] Video saved to temporary file: temp_videos/9738fd4f-d14a-4967-bdb7-b1c9156add2a.mp4
403
+ 2025-08-21 01:21:58 - INFO - [9738fd4f-d14a-4967-bdb7-b1c9156add2a] Extracting frames using method: uniform, rate/threshold: 30
404
+ 2025-08-21 01:22:06 - INFO - [9738fd4f-d14a-4967-bdb7-b1c9156add2a] Extracted 30 frames successfully. Saving to temporary files...
405
+ 2025-08-21 01:22:06 - INFO - [9738fd4f-d14a-4967-bdb7-b1c9156add2a] 30 frames saved to temp_videos/9738fd4f-d14a-4967-bdb7-b1c9156add2a
406
+ 2025-08-21 01:22:19 - INFO - vision_config is None, using default vision config
407
+ 2025-08-21 01:22:26 - INFO - Tokens per second: 1.8954757210352398, Peak GPU memory MB: 11824.375
408
+ 2025-08-21 01:22:26 - INFO - [9738fd4f-d14a-4967-bdb7-b1c9156add2a] Inference time: 28.46 seconds, CPU usage: 50.5%, CPU core utilization: [45.6, 51.9, 34.0, 70.4]
409
+ 2025-08-21 01:22:26 - INFO - [9738fd4f-d14a-4967-bdb7-b1c9156add2a] Cleaned up temporary frame directory: temp_videos/9738fd4f-d14a-4967-bdb7-b1c9156add2a
410
+ 2025-08-21 01:22:26 - INFO - [e5c0b243-2f04-4afd-a044-8e574b65e7fe] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_046.mp4'
411
+ 2025-08-21 01:22:26 - INFO - [e5c0b243-2f04-4afd-a044-8e574b65e7fe] Video saved to temporary file: temp_videos/e5c0b243-2f04-4afd-a044-8e574b65e7fe.mp4
412
+ 2025-08-21 01:22:26 - INFO - [e5c0b243-2f04-4afd-a044-8e574b65e7fe] Extracting frames using method: uniform, rate/threshold: 30
413
+ 2025-08-21 01:22:31 - INFO - [e5c0b243-2f04-4afd-a044-8e574b65e7fe] Extracted 30 frames successfully. Saving to temporary files...
414
+ 2025-08-21 01:22:31 - INFO - [e5c0b243-2f04-4afd-a044-8e574b65e7fe] 30 frames saved to temp_videos/e5c0b243-2f04-4afd-a044-8e574b65e7fe
415
+ 2025-08-21 01:22:44 - INFO - vision_config is None, using default vision config
416
+ 2025-08-21 01:22:56 - INFO - Tokens per second: 5.655760454979962, Peak GPU memory MB: 11824.375
417
+ 2025-08-21 01:22:56 - INFO - [e5c0b243-2f04-4afd-a044-8e574b65e7fe] Inference time: 29.07 seconds, CPU usage: 36.7%, CPU core utilization: [19.4, 41.5, 59.0, 27.0]
418
+ 2025-08-21 01:22:56 - INFO - [e5c0b243-2f04-4afd-a044-8e574b65e7fe] Cleaned up temporary frame directory: temp_videos/e5c0b243-2f04-4afd-a044-8e574b65e7fe
419
+ 2025-08-21 01:22:56 - INFO - [2c19cb12-ab5a-476b-8773-34b5991b8716] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_047.mp4'
420
+ 2025-08-21 01:22:56 - INFO - [2c19cb12-ab5a-476b-8773-34b5991b8716] Video saved to temporary file: temp_videos/2c19cb12-ab5a-476b-8773-34b5991b8716.mp4
421
+ 2025-08-21 01:22:56 - INFO - [2c19cb12-ab5a-476b-8773-34b5991b8716] Extracting frames using method: uniform, rate/threshold: 30
422
+ 2025-08-21 01:23:00 - INFO - [2c19cb12-ab5a-476b-8773-34b5991b8716] Extracted 30 frames successfully. Saving to temporary files...
423
+ 2025-08-21 01:23:00 - INFO - [2c19cb12-ab5a-476b-8773-34b5991b8716] 30 frames saved to temp_videos/2c19cb12-ab5a-476b-8773-34b5991b8716
424
+ 2025-08-21 01:23:13 - INFO - vision_config is None, using default vision config
425
+ 2025-08-21 01:23:23 - INFO - Tokens per second: 3.9947178064700855, Peak GPU memory MB: 11824.375
426
+ 2025-08-21 01:23:23 - INFO - [2c19cb12-ab5a-476b-8773-34b5991b8716] Inference time: 27.28 seconds, CPU usage: 37.9%, CPU core utilization: [27.5, 21.8, 42.6, 59.5]
427
+ 2025-08-21 01:23:23 - INFO - [2c19cb12-ab5a-476b-8773-34b5991b8716] Cleaned up temporary frame directory: temp_videos/2c19cb12-ab5a-476b-8773-34b5991b8716
428
+ 2025-08-21 01:23:23 - INFO - [1f521869-b6a6-4ffc-b7f9-3ba6424abdc9] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_048.mp4'
429
+ 2025-08-21 01:23:23 - INFO - [1f521869-b6a6-4ffc-b7f9-3ba6424abdc9] Video saved to temporary file: temp_videos/1f521869-b6a6-4ffc-b7f9-3ba6424abdc9.mp4
430
+ 2025-08-21 01:23:23 - INFO - [1f521869-b6a6-4ffc-b7f9-3ba6424abdc9] Extracting frames using method: uniform, rate/threshold: 30
431
+ 2025-08-21 01:23:28 - INFO - [1f521869-b6a6-4ffc-b7f9-3ba6424abdc9] Extracted 30 frames successfully. Saving to temporary files...
432
+ 2025-08-21 01:23:28 - INFO - [1f521869-b6a6-4ffc-b7f9-3ba6424abdc9] 30 frames saved to temp_videos/1f521869-b6a6-4ffc-b7f9-3ba6424abdc9
433
+ 2025-08-21 01:23:41 - INFO - vision_config is None, using default vision config
434
+ 2025-08-21 01:23:52 - INFO - Tokens per second: 5.431023554612727, Peak GPU memory MB: 11824.375
435
+ 2025-08-21 01:23:52 - INFO - [1f521869-b6a6-4ffc-b7f9-3ba6424abdc9] Inference time: 28.81 seconds, CPU usage: 36.9%, CPU core utilization: [35.2, 33.7, 57.7, 21.0]
436
+ 2025-08-21 01:23:52 - INFO - [1f521869-b6a6-4ffc-b7f9-3ba6424abdc9] Cleaned up temporary frame directory: temp_videos/1f521869-b6a6-4ffc-b7f9-3ba6424abdc9
437
+ 2025-08-21 01:23:52 - INFO - [bb6768fb-6e99-4507-b665-27c2c1ae0b50] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_049.mp4'
438
+ 2025-08-21 01:23:52 - INFO - [bb6768fb-6e99-4507-b665-27c2c1ae0b50] Video saved to temporary file: temp_videos/bb6768fb-6e99-4507-b665-27c2c1ae0b50.mp4
439
+ 2025-08-21 01:23:52 - INFO - [bb6768fb-6e99-4507-b665-27c2c1ae0b50] Extracting frames using method: uniform, rate/threshold: 30
440
+ 2025-08-21 01:23:57 - INFO - [bb6768fb-6e99-4507-b665-27c2c1ae0b50] Extracted 30 frames successfully. Saving to temporary files...
441
+ 2025-08-21 01:23:57 - INFO - [bb6768fb-6e99-4507-b665-27c2c1ae0b50] 30 frames saved to temp_videos/bb6768fb-6e99-4507-b665-27c2c1ae0b50
442
+ 2025-08-21 01:24:09 - INFO - vision_config is None, using default vision config
443
+ 2025-08-21 01:24:20 - INFO - Tokens per second: 5.315589822361752, Peak GPU memory MB: 11824.375
444
+ 2025-08-21 01:24:20 - INFO - [bb6768fb-6e99-4507-b665-27c2c1ae0b50] Inference time: 28.63 seconds, CPU usage: 44.5%, CPU core utilization: [27.1, 55.7, 35.4, 59.8]
445
+ 2025-08-21 01:24:20 - INFO - [bb6768fb-6e99-4507-b665-27c2c1ae0b50] Cleaned up temporary frame directory: temp_videos/bb6768fb-6e99-4507-b665-27c2c1ae0b50
446
+ 2025-08-21 01:24:20 - INFO - [7c0a1e36-be63-4b18-9096-cf0faa8f5ca7] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_050.mp4'
447
+ 2025-08-21 01:24:20 - INFO - [7c0a1e36-be63-4b18-9096-cf0faa8f5ca7] Video saved to temporary file: temp_videos/7c0a1e36-be63-4b18-9096-cf0faa8f5ca7.mp4
448
+ 2025-08-21 01:24:20 - INFO - [7c0a1e36-be63-4b18-9096-cf0faa8f5ca7] Extracting frames using method: uniform, rate/threshold: 30
449
+ 2025-08-21 01:24:26 - INFO - [7c0a1e36-be63-4b18-9096-cf0faa8f5ca7] Extracted 30 frames successfully. Saving to temporary files...
450
+ 2025-08-21 01:24:26 - INFO - [7c0a1e36-be63-4b18-9096-cf0faa8f5ca7] 30 frames saved to temp_videos/7c0a1e36-be63-4b18-9096-cf0faa8f5ca7
451
+ 2025-08-21 01:24:39 - INFO - vision_config is None, using default vision config
452
+ 2025-08-21 01:24:48 - INFO - Tokens per second: 4.2201806034921, Peak GPU memory MB: 11824.375
453
+ 2025-08-21 01:24:48 - INFO - [7c0a1e36-be63-4b18-9096-cf0faa8f5ca7] Inference time: 28.00 seconds, CPU usage: 37.8%, CPU core utilization: [45.9, 36.6, 32.4, 36.4]
454
+ 2025-08-21 01:24:48 - INFO - [7c0a1e36-be63-4b18-9096-cf0faa8f5ca7] Cleaned up temporary frame directory: temp_videos/7c0a1e36-be63-4b18-9096-cf0faa8f5ca7
455
+ 2025-08-21 01:24:48 - INFO - [3a2ca98e-47b8-45aa-8834-9dc0e398936a] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_051.mp4'
456
+ 2025-08-21 01:24:48 - INFO - [3a2ca98e-47b8-45aa-8834-9dc0e398936a] Video saved to temporary file: temp_videos/3a2ca98e-47b8-45aa-8834-9dc0e398936a.mp4
457
+ 2025-08-21 01:24:48 - INFO - [3a2ca98e-47b8-45aa-8834-9dc0e398936a] Extracting frames using method: uniform, rate/threshold: 30
458
+ 2025-08-21 01:24:53 - INFO - [3a2ca98e-47b8-45aa-8834-9dc0e398936a] Extracted 30 frames successfully. Saving to temporary files...
459
+ 2025-08-21 01:24:53 - INFO - [3a2ca98e-47b8-45aa-8834-9dc0e398936a] 30 frames saved to temp_videos/3a2ca98e-47b8-45aa-8834-9dc0e398936a
460
+ 2025-08-21 01:25:06 - INFO - vision_config is None, using default vision config
461
+ 2025-08-21 01:25:16 - INFO - Tokens per second: 4.498009394398241, Peak GPU memory MB: 11824.375
462
+ 2025-08-21 01:25:16 - INFO - [3a2ca98e-47b8-45aa-8834-9dc0e398936a] Inference time: 27.74 seconds, CPU usage: 37.8%, CPU core utilization: [34.1, 28.6, 47.5, 40.9]
463
+ 2025-08-21 01:25:16 - INFO - [3a2ca98e-47b8-45aa-8834-9dc0e398936a] Cleaned up temporary frame directory: temp_videos/3a2ca98e-47b8-45aa-8834-9dc0e398936a
464
+ 2025-08-21 01:25:16 - INFO - [acb71d01-022e-4d9d-a98c-903f36965977] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_052.mp4'
465
+ 2025-08-21 01:25:16 - INFO - [acb71d01-022e-4d9d-a98c-903f36965977] Video saved to temporary file: temp_videos/acb71d01-022e-4d9d-a98c-903f36965977.mp4
466
+ 2025-08-21 01:25:16 - INFO - [acb71d01-022e-4d9d-a98c-903f36965977] Extracting frames using method: uniform, rate/threshold: 30
467
+ 2025-08-21 01:25:22 - INFO - [acb71d01-022e-4d9d-a98c-903f36965977] Extracted 30 frames successfully. Saving to temporary files...
468
+ 2025-08-21 01:25:22 - INFO - [acb71d01-022e-4d9d-a98c-903f36965977] 30 frames saved to temp_videos/acb71d01-022e-4d9d-a98c-903f36965977
469
+ 2025-08-21 01:25:35 - INFO - vision_config is None, using default vision config
470
+ 2025-08-21 01:25:47 - INFO - Tokens per second: 6.6824774787667796, Peak GPU memory MB: 11824.375
471
+ 2025-08-21 01:25:47 - INFO - [acb71d01-022e-4d9d-a98c-903f36965977] Inference time: 31.31 seconds, CPU usage: 44.4%, CPU core utilization: [35.8, 44.5, 56.2, 41.3]
472
+ 2025-08-21 01:25:47 - INFO - [acb71d01-022e-4d9d-a98c-903f36965977] Cleaned up temporary frame directory: temp_videos/acb71d01-022e-4d9d-a98c-903f36965977
API_Transformers/logs/MiniCPM-V-4/20250821_033846.log ADDED
@@ -0,0 +1,157 @@
1
+ 2025-08-21 03:38:46 - INFO - Loading model: openbmb/MiniCPM-V-4
2
+ 2025-08-21 03:38:46 - INFO - vision_config is None, using default vision config
3
+ 2025-08-21 03:39:50 - INFO - Model loaded in 64.62 seconds
4
+ 2025-08-21 03:39:50 - INFO - GPU Memory Usage after model load: 7802.99 MB
5
+ 2025-08-21 03:39:57 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_001.mp4'
6
+ 2025-08-21 03:39:57 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Video saved to temporary file: temp_videos/e29d31c5-9a6b-48cd-ac25-7affc04fc186.mp4
7
+ 2025-08-21 03:39:57 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-21 03:40:01 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-21 03:40:01 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] 30 frames saved to temp_videos/e29d31c5-9a6b-48cd-ac25-7affc04fc186
10
+ 2025-08-21 03:40:17 - INFO - vision_config is None, using default vision config
11
+ 2025-08-21 03:40:35 - INFO - Tokens per second: 8.46238691458392, Peak GPU memory MB: 11824.375
12
+ 2025-08-21 03:40:35 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Inference time: 37.25 seconds, CPU usage: 28.4%, CPU core utilization: [24.9, 35.0, 21.8, 31.7]
13
+ 2025-08-21 03:40:35 - INFO - [e29d31c5-9a6b-48cd-ac25-7affc04fc186] Cleaned up temporary frame directory: temp_videos/e29d31c5-9a6b-48cd-ac25-7affc04fc186
14
+ 2025-08-21 03:40:35 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_001.mp4'
15
+ 2025-08-21 03:40:35 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Video saved to temporary file: temp_videos/0ed8d6d1-aea6-4701-a3ed-2d877bfc9882.mp4
16
+ 2025-08-21 03:40:35 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Extracting frames using method: uniform, rate/threshold: 30
17
+ 2025-08-21 03:40:38 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Extracted 30 frames successfully. Saving to temporary files...
18
+ 2025-08-21 03:40:38 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] 30 frames saved to temp_videos/0ed8d6d1-aea6-4701-a3ed-2d877bfc9882
19
+ 2025-08-21 03:40:51 - INFO - vision_config is None, using default vision config
20
+ 2025-08-21 03:41:14 - INFO - Tokens per second: 10.024019230028593, Peak GPU memory MB: 11824.375
21
+ 2025-08-21 03:41:14 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Inference time: 39.58 seconds, CPU usage: 31.8%, CPU core utilization: [16.9, 19.2, 59.5, 31.5]
22
+ 2025-08-21 03:41:14 - INFO - [0ed8d6d1-aea6-4701-a3ed-2d877bfc9882] Cleaned up temporary frame directory: temp_videos/0ed8d6d1-aea6-4701-a3ed-2d877bfc9882
23
+ 2025-08-21 03:41:14 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_002.mp4'
24
+ 2025-08-21 03:41:14 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Video saved to temporary file: temp_videos/1daa28b5-5708-4bd7-b738-7900bee17284.mp4
25
+ 2025-08-21 03:41:14 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Extracting frames using method: uniform, rate/threshold: 30
26
+ 2025-08-21 03:41:18 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Extracted 30 frames successfully. Saving to temporary files...
27
+ 2025-08-21 03:41:18 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] 30 frames saved to temp_videos/1daa28b5-5708-4bd7-b738-7900bee17284
28
+ 2025-08-21 03:41:30 - INFO - vision_config is None, using default vision config
29
+ 2025-08-21 03:41:42 - INFO - Tokens per second: 6.118521643289556, Peak GPU memory MB: 11824.375
30
+ 2025-08-21 03:41:42 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Inference time: 28.18 seconds, CPU usage: 33.2%, CPU core utilization: [57.3, 19.5, 10.5, 45.5]
31
+ 2025-08-21 03:41:42 - INFO - [1daa28b5-5708-4bd7-b738-7900bee17284] Cleaned up temporary frame directory: temp_videos/1daa28b5-5708-4bd7-b738-7900bee17284
32
+ 2025-08-21 03:41:42 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_003.mp4'
33
+ 2025-08-21 03:41:42 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Video saved to temporary file: temp_videos/c70dd357-164c-4d57-b24d-8ead295ef24e.mp4
34
+ 2025-08-21 03:41:42 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Extracting frames using method: uniform, rate/threshold: 30
35
+ 2025-08-21 03:41:46 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Extracted 30 frames successfully. Saving to temporary files...
36
+ 2025-08-21 03:41:46 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] 30 frames saved to temp_videos/c70dd357-164c-4d57-b24d-8ead295ef24e
37
+ 2025-08-21 03:41:59 - INFO - vision_config is None, using default vision config
38
+ 2025-08-21 03:42:13 - INFO - Tokens per second: 7.325785835893888, Peak GPU memory MB: 11824.375
39
+ 2025-08-21 03:42:13 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Inference time: 30.34 seconds, CPU usage: 33.2%, CPU core utilization: [32.8, 46.1, 30.7, 23.2]
40
+ 2025-08-21 03:42:13 - INFO - [c70dd357-164c-4d57-b24d-8ead295ef24e] Cleaned up temporary frame directory: temp_videos/c70dd357-164c-4d57-b24d-8ead295ef24e
41
+ 2025-08-21 03:42:13 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_004.mp4'
42
+ 2025-08-21 03:42:13 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Video saved to temporary file: temp_videos/e2eff8d2-37db-4d25-9765-d46404130b2d.mp4
43
+ 2025-08-21 03:42:13 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Extracting frames using method: uniform, rate/threshold: 30
44
+ 2025-08-21 03:42:16 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Extracted 30 frames successfully. Saving to temporary files...
45
+ 2025-08-21 03:42:16 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] 30 frames saved to temp_videos/e2eff8d2-37db-4d25-9765-d46404130b2d
46
+ 2025-08-21 03:42:29 - INFO - vision_config is None, using default vision config
47
+ 2025-08-21 03:42:40 - INFO - Tokens per second: 5.483056762285139, Peak GPU memory MB: 11824.375
48
+ 2025-08-21 03:42:40 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Inference time: 27.37 seconds, CPU usage: 33.6%, CPU core utilization: [62.6, 13.2, 42.9, 15.6]
49
+ 2025-08-21 03:42:40 - INFO - [e2eff8d2-37db-4d25-9765-d46404130b2d] Cleaned up temporary frame directory: temp_videos/e2eff8d2-37db-4d25-9765-d46404130b2d
50
+ 2025-08-21 03:42:40 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_005.mp4'
51
+ 2025-08-21 03:42:40 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Video saved to temporary file: temp_videos/374baf0b-09b4-47f6-bda7-007ed31b73e6.mp4
52
+ 2025-08-21 03:42:40 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Extracting frames using method: uniform, rate/threshold: 30
53
+ 2025-08-21 03:42:43 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Extracted 30 frames successfully. Saving to temporary files...
54
+ 2025-08-21 03:42:43 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] 30 frames saved to temp_videos/374baf0b-09b4-47f6-bda7-007ed31b73e6
55
+ 2025-08-21 03:42:56 - INFO - vision_config is None, using default vision config
56
+ 2025-08-21 03:43:12 - INFO - Tokens per second: 7.8524871607145865, Peak GPU memory MB: 11824.375
57
+ 2025-08-21 03:43:12 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Inference time: 31.57 seconds, CPU usage: 32.7%, CPU core utilization: [13.2, 42.4, 47.5, 27.5]
58
+ 2025-08-21 03:43:12 - INFO - [374baf0b-09b4-47f6-bda7-007ed31b73e6] Cleaned up temporary frame directory: temp_videos/374baf0b-09b4-47f6-bda7-007ed31b73e6
59
+ 2025-08-21 03:43:12 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_006.mp4'
60
+ 2025-08-21 03:43:12 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Video saved to temporary file: temp_videos/b9266afc-5115-4696-91ea-9894092513ff.mp4
61
+ 2025-08-21 03:43:12 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Extracting frames using method: uniform, rate/threshold: 30
62
+ 2025-08-21 03:43:15 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Extracted 30 frames successfully. Saving to temporary files...
63
+ 2025-08-21 03:43:15 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] 30 frames saved to temp_videos/b9266afc-5115-4696-91ea-9894092513ff
64
+ 2025-08-21 03:43:28 - INFO - vision_config is None, using default vision config
65
+ 2025-08-21 03:43:40 - INFO - Tokens per second: 5.751292635048318, Peak GPU memory MB: 11824.375
66
+ 2025-08-21 03:43:40 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Inference time: 27.84 seconds, CPU usage: 33.5%, CPU core utilization: [24.6, 32.6, 13.2, 63.8]
67
+ 2025-08-21 03:43:40 - INFO - [b9266afc-5115-4696-91ea-9894092513ff] Cleaned up temporary frame directory: temp_videos/b9266afc-5115-4696-91ea-9894092513ff
68
+ 2025-08-21 03:43:40 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_007.mp4'
69
+ 2025-08-21 03:43:40 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Video saved to temporary file: temp_videos/cf387e92-735c-444b-a102-345d888dc633.mp4
70
+ 2025-08-21 03:43:40 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Extracting frames using method: uniform, rate/threshold: 30
71
+ 2025-08-21 03:43:43 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Extracted 30 frames successfully. Saving to temporary files...
72
+ 2025-08-21 03:43:43 - INFO - [cf387e92-735c-444b-a102-345d888dc633] 30 frames saved to temp_videos/cf387e92-735c-444b-a102-345d888dc633
73
+ 2025-08-21 03:43:56 - INFO - vision_config is None, using default vision config
74
+ 2025-08-21 03:44:08 - INFO - Tokens per second: 6.460640309211369, Peak GPU memory MB: 11824.375
75
+ 2025-08-21 03:44:08 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Inference time: 28.88 seconds, CPU usage: 33.2%, CPU core utilization: [11.9, 52.0, 12.7, 56.4]
76
+ 2025-08-21 03:44:08 - INFO - [cf387e92-735c-444b-a102-345d888dc633] Cleaned up temporary frame directory: temp_videos/cf387e92-735c-444b-a102-345d888dc633
77
+ 2025-08-21 03:44:08 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_008.mp4'
78
+ 2025-08-21 03:44:08 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Video saved to temporary file: temp_videos/aef21c10-5565-48f5-bcbf-e239a1faa322.mp4
79
+ 2025-08-21 03:44:08 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Extracting frames using method: uniform, rate/threshold: 30
80
+ 2025-08-21 03:44:12 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Extracted 30 frames successfully. Saving to temporary files...
81
+ 2025-08-21 03:44:12 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] 30 frames saved to temp_videos/aef21c10-5565-48f5-bcbf-e239a1faa322
82
+ 2025-08-21 03:44:25 - INFO - vision_config is None, using default vision config
83
+ 2025-08-21 03:44:35 - INFO - Tokens per second: 4.950112497910254, Peak GPU memory MB: 11824.375
84
+ 2025-08-21 03:44:35 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Inference time: 26.99 seconds, CPU usage: 34.1%, CPU core utilization: [16.3, 12.5, 47.8, 59.8]
85
+ 2025-08-21 03:44:35 - INFO - [aef21c10-5565-48f5-bcbf-e239a1faa322] Cleaned up temporary frame directory: temp_videos/aef21c10-5565-48f5-bcbf-e239a1faa322
86
+ 2025-08-21 03:44:35 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_009.mp4'
87
+ 2025-08-21 03:44:35 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Video saved to temporary file: temp_videos/040650dd-914d-453f-a411-b31d1d6897d5.mp4
88
+ 2025-08-21 03:44:35 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Extracting frames using method: uniform, rate/threshold: 30
89
+ 2025-08-21 03:44:39 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Extracted 30 frames successfully. Saving to temporary files...
90
+ 2025-08-21 03:44:39 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] 30 frames saved to temp_videos/040650dd-914d-453f-a411-b31d1d6897d5
91
+ 2025-08-21 03:44:52 - INFO - vision_config is None, using default vision config
92
+ 2025-08-21 03:45:04 - INFO - Tokens per second: 6.046726583056993, Peak GPU memory MB: 11824.375
93
+ 2025-08-21 03:45:04 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Inference time: 28.31 seconds, CPU usage: 33.8%, CPU core utilization: [45.1, 36.7, 40.1, 13.1]
94
+ 2025-08-21 03:45:04 - INFO - [040650dd-914d-453f-a411-b31d1d6897d5] Cleaned up temporary frame directory: temp_videos/040650dd-914d-453f-a411-b31d1d6897d5
95
+ 2025-08-21 03:45:04 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_010.mp4'
96
+ 2025-08-21 03:45:04 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Video saved to temporary file: temp_videos/c4922af4-0973-46aa-8ab3-a2904f616ca0.mp4
97
+ 2025-08-21 03:45:04 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Extracting frames using method: uniform, rate/threshold: 30
98
+ 2025-08-21 03:45:07 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Extracted 30 frames successfully. Saving to temporary files...
99
+ 2025-08-21 03:45:07 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] 30 frames saved to temp_videos/c4922af4-0973-46aa-8ab3-a2904f616ca0
100
+ 2025-08-21 03:45:20 - INFO - vision_config is None, using default vision config
101
+ 2025-08-21 03:45:31 - INFO - Tokens per second: 5.012952424490043, Peak GPU memory MB: 11824.375
102
+ 2025-08-21 03:45:31 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Inference time: 26.92 seconds, CPU usage: 33.9%, CPU core utilization: [45.7, 15.2, 24.5, 49.9]
103
+ 2025-08-21 03:45:31 - INFO - [c4922af4-0973-46aa-8ab3-a2904f616ca0] Cleaned up temporary frame directory: temp_videos/c4922af4-0973-46aa-8ab3-a2904f616ca0
104
+ 2025-08-21 03:45:31 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_011.mp4'
105
+ 2025-08-21 03:45:31 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Video saved to temporary file: temp_videos/96d2962b-c166-4be7-847e-fe025954af18.mp4
106
+ 2025-08-21 03:45:31 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Extracting frames using method: uniform, rate/threshold: 30
107
+ 2025-08-21 03:45:34 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Extracted 30 frames successfully. Saving to temporary files...
108
+ 2025-08-21 03:45:34 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] 30 frames saved to temp_videos/96d2962b-c166-4be7-847e-fe025954af18
109
+ 2025-08-21 03:45:47 - INFO - vision_config is None, using default vision config
110
+ 2025-08-21 03:45:59 - INFO - Tokens per second: 6.0496181050699604, Peak GPU memory MB: 11824.375
111
+ 2025-08-21 03:45:59 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Inference time: 28.26 seconds, CPU usage: 33.7%, CPU core utilization: [15.3, 25.6, 46.3, 47.7]
112
+ 2025-08-21 03:45:59 - INFO - [96d2962b-c166-4be7-847e-fe025954af18] Cleaned up temporary frame directory: temp_videos/96d2962b-c166-4be7-847e-fe025954af18
113
+ 2025-08-21 03:45:59 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_012.mp4'
114
+ 2025-08-21 03:45:59 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Video saved to temporary file: temp_videos/1861ded3-2706-4381-8e32-07949f940d95.mp4
115
+ 2025-08-21 03:45:59 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Extracting frames using method: uniform, rate/threshold: 30
116
+ 2025-08-21 03:46:02 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Extracted 30 frames successfully. Saving to temporary files...
117
+ 2025-08-21 03:46:02 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] 30 frames saved to temp_videos/1861ded3-2706-4381-8e32-07949f940d95
118
+ 2025-08-21 03:46:15 - INFO - vision_config is None, using default vision config
119
+ 2025-08-21 03:46:27 - INFO - Tokens per second: 6.042554615117601, Peak GPU memory MB: 11824.375
120
+ 2025-08-21 03:46:27 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Inference time: 28.37 seconds, CPU usage: 33.3%, CPU core utilization: [53.9, 22.6, 36.0, 20.6]
121
+ 2025-08-21 03:46:27 - INFO - [1861ded3-2706-4381-8e32-07949f940d95] Cleaned up temporary frame directory: temp_videos/1861ded3-2706-4381-8e32-07949f940d95
122
+ 2025-08-21 03:46:27 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_013.mp4'
123
+ 2025-08-21 03:46:27 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Video saved to temporary file: temp_videos/04268664-f928-4c89-ab42-d07706c93257.mp4
124
+ 2025-08-21 03:46:27 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Extracting frames using method: uniform, rate/threshold: 30
125
+ 2025-08-21 03:46:31 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Extracted 30 frames successfully. Saving to temporary files...
126
+ 2025-08-21 03:46:31 - INFO - [04268664-f928-4c89-ab42-d07706c93257] 30 frames saved to temp_videos/04268664-f928-4c89-ab42-d07706c93257
127
+ 2025-08-21 03:46:44 - INFO - vision_config is None, using default vision config
128
+ 2025-08-21 03:46:55 - INFO - Tokens per second: 5.897914933533588, Peak GPU memory MB: 11824.375
129
+ 2025-08-21 03:46:55 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Inference time: 28.07 seconds, CPU usage: 33.2%, CPU core utilization: [52.2, 22.0, 22.6, 35.7]
130
+ 2025-08-21 03:46:55 - INFO - [04268664-f928-4c89-ab42-d07706c93257] Cleaned up temporary frame directory: temp_videos/04268664-f928-4c89-ab42-d07706c93257
131
+ 2025-08-21 03:46:55 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_014.mp4'
132
+ 2025-08-21 03:46:55 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Video saved to temporary file: temp_videos/604fd124-b44b-4dfe-b7b4-8d3e1f179d69.mp4
133
+ 2025-08-21 03:46:55 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Extracting frames using method: uniform, rate/threshold: 30
134
+ 2025-08-21 03:46:59 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Extracted 30 frames successfully. Saving to temporary files...
135
+ 2025-08-21 03:46:59 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] 30 frames saved to temp_videos/604fd124-b44b-4dfe-b7b4-8d3e1f179d69
136
+ 2025-08-21 03:47:12 - INFO - vision_config is None, using default vision config
137
+ 2025-08-21 03:47:25 - INFO - Tokens per second: 6.542944804531987, Peak GPU memory MB: 11824.375
138
+ 2025-08-21 03:47:25 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Inference time: 29.09 seconds, CPU usage: 32.5%, CPU core utilization: [11.8, 19.4, 53.2, 45.3]
139
+ 2025-08-21 03:47:25 - INFO - [604fd124-b44b-4dfe-b7b4-8d3e1f179d69] Cleaned up temporary frame directory: temp_videos/604fd124-b44b-4dfe-b7b4-8d3e1f179d69
140
+ 2025-08-21 03:47:25 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_015.mp4'
141
+ 2025-08-21 03:47:25 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Video saved to temporary file: temp_videos/02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d.mp4
142
+ 2025-08-21 03:47:25 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Extracting frames using method: uniform, rate/threshold: 30
143
+ 2025-08-21 03:47:28 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Extracted 30 frames successfully. Saving to temporary files...
144
+ 2025-08-21 03:47:28 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] 30 frames saved to temp_videos/02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d
145
+ 2025-08-21 03:47:41 - INFO - vision_config is None, using default vision config
146
+ 2025-08-21 03:47:52 - INFO - Tokens per second: 5.0694963028257245, Peak GPU memory MB: 11824.375
147
+ 2025-08-21 03:47:52 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Inference time: 27.06 seconds, CPU usage: 33.0%, CPU core utilization: [18.7, 21.7, 41.3, 50.1]
148
+ 2025-08-21 03:47:52 - INFO - [02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d] Cleaned up temporary frame directory: temp_videos/02b38a1d-8ab7-43a6-a5e6-8a531ff2ce6d
149
+ 2025-08-21 03:47:52 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_016.mp4'
150
+ 2025-08-21 03:47:52 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Video saved to temporary file: temp_videos/0ac85bea-34eb-4a31-8263-f7810fb38235.mp4
151
+ 2025-08-21 03:47:52 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Extracting frames using method: uniform, rate/threshold: 30
152
+ 2025-08-21 03:47:55 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Extracted 30 frames successfully. Saving to temporary files...
153
+ 2025-08-21 03:47:55 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] 30 frames saved to temp_videos/0ac85bea-34eb-4a31-8263-f7810fb38235
154
+ 2025-08-21 03:48:08 - INFO - vision_config is None, using default vision config
155
+ 2025-08-21 03:48:19 - INFO - Tokens per second: 5.245383520327553, Peak GPU memory MB: 11824.375
156
+ 2025-08-21 03:48:19 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Inference time: 27.29 seconds, CPU usage: 33.6%, CPU core utilization: [53.0, 47.3, 12.8, 21.4]
157
+ 2025-08-21 03:48:19 - INFO - [0ac85bea-34eb-4a31-8263-f7810fb38235] Cleaned up temporary frame directory: temp_videos/0ac85bea-34eb-4a31-8263-f7810fb38235
API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_212712.log ADDED
@@ -0,0 +1 @@
1
+ 2025-08-18 21:27:12 - INFO - Loading model: Qwen2-VL-2B-Instruct-AWQ
API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_212744.log ADDED
@@ -0,0 +1,20 @@
1
+ 2025-08-18 21:27:44 - INFO - Loading model: Qwen/Qwen2-VL-2B-Instruct-AWQ
2
+ 2025-08-18 21:27:46 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-18 21:27:54 - INFO - Model loaded in 9.42 seconds
4
+ 2025-08-18 21:27:54 - INFO - GPU Memory Usage after model load: 2.31 GB
5
+ 2025-08-18 21:29:38 - INFO - [f4055b77-a7b2-4f36-8134-d0a30d7f57b0] Received new video inference request. Prompt: '视频里发生了什么?', Video: 'messi_part_022.mp4'
6
+ 2025-08-18 21:29:38 - INFO - [f4055b77-a7b2-4f36-8134-d0a30d7f57b0] Video saved to temporary file: temp_videos/f4055b77-a7b2-4f36-8134-d0a30d7f57b0.mp4
7
+ 2025-08-18 21:29:38 - INFO - [f4055b77-a7b2-4f36-8134-d0a30d7f57b0] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-18 21:29:47 - INFO - [f4055b77-a7b2-4f36-8134-d0a30d7f57b0] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-18 21:29:47 - INFO - [f4055b77-a7b2-4f36-8134-d0a30d7f57b0] 30 frames saved to temp_videos/f4055b77-a7b2-4f36-8134-d0a30d7f57b0
10
+ 2025-08-18 21:29:49 - ERROR - [f4055b77-a7b2-4f36-8134-d0a30d7f57b0] An error occurred during processing: name 'processor' is not defined
11
+ Traceback (most recent call last):
12
+ File "/mnt/data/xiuying/Code/local_deploy/infer.py", line 105, in video_inference
13
+ output_text = model.generate(frame_paths, prompt)
14
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15
+ File "/mnt/data/xiuying/Code/local_deploy/models/qwen.py", line 44, in generate
16
+ streamer = TextIteratorStreamer(processor, skip_prompt=True, skip_special_tokens=True)
17
+ ^^^^^^^^^
18
+ NameError: name 'processor' is not defined
19
+ 2025-08-18 21:29:49 - INFO - [f4055b77-a7b2-4f36-8134-d0a30d7f57b0] Cleaned up temporary file: temp_videos/f4055b77-a7b2-4f36-8134-d0a30d7f57b0.mp4
20
+ 2025-08-18 21:29:49 - INFO - [f4055b77-a7b2-4f36-8134-d0a30d7f57b0] Cleaned up temporary frame directory: temp_videos/f4055b77-a7b2-4f36-8134-d0a30d7f57b0
API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_213116.log ADDED
@@ -0,0 +1,9 @@
1
+ 2025-08-18 21:31:16 - INFO - Loading model: Qwen/Qwen2-VL-2B-Instruct-AWQ
2
+ 2025-08-18 21:31:19 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-18 21:31:26 - INFO - Model loaded in 10.07 seconds
4
+ 2025-08-18 21:31:26 - INFO - GPU Memory Usage after model load: 2.31 GB
5
+ 2025-08-18 21:31:32 - INFO - [4b186b65-1c0b-488e-b92f-aa90f72b6b89] Received new video inference request. Prompt: '视频里发生了什么?', Video: 'messi_part_022.mp4'
6
+ 2025-08-18 21:31:32 - INFO - [4b186b65-1c0b-488e-b92f-aa90f72b6b89] Video saved to temporary file: temp_videos/4b186b65-1c0b-488e-b92f-aa90f72b6b89.mp4
7
+ 2025-08-18 21:31:32 - INFO - [4b186b65-1c0b-488e-b92f-aa90f72b6b89] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-18 21:31:38 - INFO - [4b186b65-1c0b-488e-b92f-aa90f72b6b89] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-18 21:31:38 - INFO - [4b186b65-1c0b-488e-b92f-aa90f72b6b89] 30 frames saved to temp_videos/4b186b65-1c0b-488e-b92f-aa90f72b6b89
API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_214203.log ADDED
@@ -0,0 +1,20 @@
1
+ 2025-08-18 21:42:03 - INFO - Loading model: Qwen/Qwen2-VL-2B-Instruct-AWQ
2
+ 2025-08-18 21:42:05 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-18 21:42:12 - INFO - Model loaded in 8.75 seconds
4
+ 2025-08-18 21:42:12 - INFO - GPU Memory Usage after model load: 2.31 GB
5
+ 2025-08-18 21:42:48 - INFO - [a8230cea-9aea-49ae-9a22-ef739e460b3a] Received new video inference request. Prompt: '视频里发生了什么?', Video: 'messi_part_022.mp4'
6
+ 2025-08-18 21:42:48 - INFO - [a8230cea-9aea-49ae-9a22-ef739e460b3a] Video saved to temporary file: temp_videos/a8230cea-9aea-49ae-9a22-ef739e460b3a.mp4
7
+ 2025-08-18 21:42:48 - INFO - [a8230cea-9aea-49ae-9a22-ef739e460b3a] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-18 21:42:54 - INFO - [a8230cea-9aea-49ae-9a22-ef739e460b3a] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-18 21:42:54 - INFO - [a8230cea-9aea-49ae-9a22-ef739e460b3a] 30 frames saved to temp_videos/a8230cea-9aea-49ae-9a22-ef739e460b3a
10
+ 2025-08-18 21:42:54 - INFO - Prompt token length: 2276
11
+ 2025-08-18 21:43:14 - INFO - [a8230cea-9aea-49ae-9a22-ef739e460b3a] Cleaned up temporary file: temp_videos/a8230cea-9aea-49ae-9a22-ef739e460b3a.mp4
12
+ 2025-08-18 21:43:14 - INFO - [a8230cea-9aea-49ae-9a22-ef739e460b3a] Cleaned up temporary frame directory: temp_videos/a8230cea-9aea-49ae-9a22-ef739e460b3a
13
+ 2025-08-18 21:43:43 - INFO - [24e5c5d0-b5ee-4bb0-b431-ebf1b58533ac] Received new video inference request. Prompt: 'Please describe the video in detail.', Video: 'messi_part_022.mp4'
14
+ 2025-08-18 21:43:43 - INFO - [24e5c5d0-b5ee-4bb0-b431-ebf1b58533ac] Video saved to temporary file: temp_videos/24e5c5d0-b5ee-4bb0-b431-ebf1b58533ac.mp4
15
+ 2025-08-18 21:43:43 - INFO - [24e5c5d0-b5ee-4bb0-b431-ebf1b58533ac] Extracting frames using method: uniform, rate/threshold: 30
16
+ 2025-08-18 21:43:48 - INFO - [24e5c5d0-b5ee-4bb0-b431-ebf1b58533ac] Extracted 30 frames successfully. Saving to temporary files...
17
+ 2025-08-18 21:43:48 - INFO - [24e5c5d0-b5ee-4bb0-b431-ebf1b58533ac] 30 frames saved to temp_videos/24e5c5d0-b5ee-4bb0-b431-ebf1b58533ac
18
+ 2025-08-18 21:43:49 - INFO - Prompt token length: 2278
19
+ 2025-08-18 21:43:59 - INFO - [24e5c5d0-b5ee-4bb0-b431-ebf1b58533ac] Cleaned up temporary file: temp_videos/24e5c5d0-b5ee-4bb0-b431-ebf1b58533ac.mp4
20
+ 2025-08-18 21:43:59 - INFO - [24e5c5d0-b5ee-4bb0-b431-ebf1b58533ac] Cleaned up temporary frame directory: temp_videos/24e5c5d0-b5ee-4bb0-b431-ebf1b58533ac
API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_215326.log ADDED
@@ -0,0 +1,44 @@
1
+ 2025-08-18 21:53:26 - INFO - Loading model: Qwen/Qwen2-VL-2B-Instruct-AWQ
2
+ 2025-08-18 21:53:28 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-18 21:53:35 - INFO - Model loaded in 8.77 seconds
4
+ 2025-08-18 21:53:35 - INFO - GPU Memory Usage after model load: 2.31 GB
5
+ 2025-08-18 21:53:50 - INFO - [c0e9cdfb-ee82-4829-855a-2e1b42058c09] Received new video inference request. Prompt: 'Please describe the video in detail.', Video: 'messi_part_001.mp4'
6
+ 2025-08-18 21:53:50 - INFO - [c0e9cdfb-ee82-4829-855a-2e1b42058c09] Video saved to temporary file: temp_videos/c0e9cdfb-ee82-4829-855a-2e1b42058c09.mp4
7
+ 2025-08-18 21:53:50 - INFO - [c0e9cdfb-ee82-4829-855a-2e1b42058c09] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-18 21:53:52 - INFO - [c0e9cdfb-ee82-4829-855a-2e1b42058c09] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-18 21:53:52 - INFO - [c0e9cdfb-ee82-4829-855a-2e1b42058c09] 30 frames saved to temp_videos/c0e9cdfb-ee82-4829-855a-2e1b42058c09
10
+ 2025-08-18 21:53:53 - INFO - Prompt token length: 2278
11
+ 2025-08-18 21:54:13 - INFO - [c0e9cdfb-ee82-4829-855a-2e1b42058c09] Cleaned up temporary file: temp_videos/c0e9cdfb-ee82-4829-855a-2e1b42058c09.mp4
12
+ 2025-08-18 21:54:13 - INFO - [c0e9cdfb-ee82-4829-855a-2e1b42058c09] Cleaned up temporary frame directory: temp_videos/c0e9cdfb-ee82-4829-855a-2e1b42058c09
13
+ 2025-08-18 21:55:01 - INFO - [62a196af-36df-460b-8259-6c55c7dd812b] Received new video inference request. Prompt: 'Please describe the video in detail.', Video: 'messi_part_001.mp4'
14
+ 2025-08-18 21:55:01 - INFO - [62a196af-36df-460b-8259-6c55c7dd812b] Video saved to temporary file: temp_videos/62a196af-36df-460b-8259-6c55c7dd812b.mp4
15
+ 2025-08-18 21:55:01 - INFO - [62a196af-36df-460b-8259-6c55c7dd812b] Extracting frames using method: uniform, rate/threshold: 30
16
+ 2025-08-18 21:55:03 - INFO - [62a196af-36df-460b-8259-6c55c7dd812b] Extracted 30 frames successfully. Saving to temporary files...
17
+ 2025-08-18 21:55:03 - INFO - [62a196af-36df-460b-8259-6c55c7dd812b] 30 frames saved to temp_videos/62a196af-36df-460b-8259-6c55c7dd812b
18
+ 2025-08-18 21:55:04 - INFO - Prompt token length: 2278
19
+ 2025-08-18 21:55:23 - INFO - [62a196af-36df-460b-8259-6c55c7dd812b] Cleaned up temporary file: temp_videos/62a196af-36df-460b-8259-6c55c7dd812b.mp4
20
+ 2025-08-18 21:55:23 - INFO - [62a196af-36df-460b-8259-6c55c7dd812b] Cleaned up temporary frame directory: temp_videos/62a196af-36df-460b-8259-6c55c7dd812b
21
+ 2025-08-18 21:58:51 - INFO - [91718f5c-a793-425c-80ca-f228057def8f] Received new video inference request. Prompt: 'Please describe the video in detail.', Video: 'messi_part_001.mp4'
22
+ 2025-08-18 21:58:51 - INFO - [91718f5c-a793-425c-80ca-f228057def8f] Video saved to temporary file: temp_videos/91718f5c-a793-425c-80ca-f228057def8f.mp4
23
+ 2025-08-18 21:58:51 - INFO - [91718f5c-a793-425c-80ca-f228057def8f] Extracting frames using method: uniform, rate/threshold: 30
24
+ 2025-08-18 21:58:53 - INFO - [91718f5c-a793-425c-80ca-f228057def8f] Extracted 30 frames successfully. Saving to temporary files...
25
+ 2025-08-18 21:58:53 - INFO - [91718f5c-a793-425c-80ca-f228057def8f] 30 frames saved to temp_videos/91718f5c-a793-425c-80ca-f228057def8f
26
+ 2025-08-18 21:58:54 - INFO - Prompt token length: 2278
27
+ 2025-08-18 21:59:13 - INFO - [91718f5c-a793-425c-80ca-f228057def8f] Cleaned up temporary file: temp_videos/91718f5c-a793-425c-80ca-f228057def8f.mp4
28
+ 2025-08-18 21:59:13 - INFO - [91718f5c-a793-425c-80ca-f228057def8f] Cleaned up temporary frame directory: temp_videos/91718f5c-a793-425c-80ca-f228057def8f
29
+ 2025-08-18 22:00:03 - INFO - [b642c617-2577-4079-8f47-9a0f03b8f46a] Received new video inference request. Prompt: 'Please describe the video in detail.', Video: 'messi_part_001.mp4'
30
+ 2025-08-18 22:00:03 - INFO - [b642c617-2577-4079-8f47-9a0f03b8f46a] Video saved to temporary file: temp_videos/b642c617-2577-4079-8f47-9a0f03b8f46a.mp4
31
+ 2025-08-18 22:00:03 - INFO - [b642c617-2577-4079-8f47-9a0f03b8f46a] Extracting frames using method: uniform, rate/threshold: 30
32
+ 2025-08-18 22:00:06 - INFO - [b642c617-2577-4079-8f47-9a0f03b8f46a] Extracted 30 frames successfully. Saving to temporary files...
33
+ 2025-08-18 22:00:06 - INFO - [b642c617-2577-4079-8f47-9a0f03b8f46a] 30 frames saved to temp_videos/b642c617-2577-4079-8f47-9a0f03b8f46a
34
+ 2025-08-18 22:00:06 - INFO - Prompt token length: 2278
35
+ 2025-08-18 22:00:25 - INFO - [b642c617-2577-4079-8f47-9a0f03b8f46a] Cleaned up temporary file: temp_videos/b642c617-2577-4079-8f47-9a0f03b8f46a.mp4
36
+ 2025-08-18 22:00:25 - INFO - [b642c617-2577-4079-8f47-9a0f03b8f46a] Cleaned up temporary frame directory: temp_videos/b642c617-2577-4079-8f47-9a0f03b8f46a
37
+ 2025-08-18 22:01:37 - INFO - [f91a5b3b-cd53-4ca6-9c5d-dd0b9432dba8] Received new video inference request. Prompt: 'Please describe the video in detail.', Video: 'messi_part_001.mp4'
38
+ 2025-08-18 22:01:37 - INFO - [f91a5b3b-cd53-4ca6-9c5d-dd0b9432dba8] Video saved to temporary file: temp_videos/f91a5b3b-cd53-4ca6-9c5d-dd0b9432dba8.mp4
39
+ 2025-08-18 22:01:37 - INFO - [f91a5b3b-cd53-4ca6-9c5d-dd0b9432dba8] Extracting frames using method: uniform, rate/threshold: 30
40
+ 2025-08-18 22:01:41 - INFO - [f91a5b3b-cd53-4ca6-9c5d-dd0b9432dba8] Extracted 30 frames successfully. Saving to temporary files...
41
+ 2025-08-18 22:01:41 - INFO - [f91a5b3b-cd53-4ca6-9c5d-dd0b9432dba8] 30 frames saved to temp_videos/f91a5b3b-cd53-4ca6-9c5d-dd0b9432dba8
42
+ 2025-08-18 22:01:42 - INFO - Prompt token length: 2278
43
+ 2025-08-18 22:01:51 - INFO - [f91a5b3b-cd53-4ca6-9c5d-dd0b9432dba8] Cleaned up temporary file: temp_videos/f91a5b3b-cd53-4ca6-9c5d-dd0b9432dba8.mp4
44
+ 2025-08-18 22:01:51 - INFO - [f91a5b3b-cd53-4ca6-9c5d-dd0b9432dba8] Cleaned up temporary frame directory: temp_videos/f91a5b3b-cd53-4ca6-9c5d-dd0b9432dba8
API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_221356.log ADDED
@@ -0,0 +1,22 @@
1
+ 2025-08-18 22:13:56 - INFO - Loading model: Qwen/Qwen2-VL-2B-Instruct-AWQ
2
+ 2025-08-18 22:13:58 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-18 22:14:05 - INFO - Model loaded in 8.85 seconds
4
+ 2025-08-18 22:14:05 - INFO - GPU Memory Usage after model load: 2.31 GB
5
+ 2025-08-18 22:14:11 - INFO - [723f6887-db9b-43c9-846f-fc1b2cb67237] Received new video inference request. Prompt: 'Please describe the video in detail.', Video: 'messi_part_001.mp4'
6
+ 2025-08-18 22:14:11 - INFO - [723f6887-db9b-43c9-846f-fc1b2cb67237] Video saved to temporary file: temp_videos/723f6887-db9b-43c9-846f-fc1b2cb67237.mp4
7
+ 2025-08-18 22:14:11 - INFO - [723f6887-db9b-43c9-846f-fc1b2cb67237] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-18 22:14:13 - INFO - [723f6887-db9b-43c9-846f-fc1b2cb67237] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-18 22:14:13 - INFO - [723f6887-db9b-43c9-846f-fc1b2cb67237] 30 frames saved to temp_videos/723f6887-db9b-43c9-846f-fc1b2cb67237
10
+ 2025-08-18 22:14:14 - INFO - Prompt token length: 2278
11
+ 2025-08-18 22:14:34 - INFO - Tokens per second: 12.449120956672985, Avg GPU memory MB: 3164.236328125
12
+ 2025-08-18 22:14:34 - INFO - [723f6887-db9b-43c9-846f-fc1b2cb67237] Cleaned up temporary file: temp_videos/723f6887-db9b-43c9-846f-fc1b2cb67237.mp4
13
+ 2025-08-18 22:14:34 - INFO - [723f6887-db9b-43c9-846f-fc1b2cb67237] Cleaned up temporary frame directory: temp_videos/723f6887-db9b-43c9-846f-fc1b2cb67237
14
+ 2025-08-18 22:15:33 - INFO - [3a6ec3f7-fcb1-4c31-ab9f-9aed3009645e] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
15
+ 2025-08-18 22:15:33 - INFO - [3a6ec3f7-fcb1-4c31-ab9f-9aed3009645e] Video saved to temporary file: temp_videos/3a6ec3f7-fcb1-4c31-ab9f-9aed3009645e.mp4
16
+ 2025-08-18 22:15:33 - INFO - [3a6ec3f7-fcb1-4c31-ab9f-9aed3009645e] Extracting frames using method: uniform, rate/threshold: 30
17
+ 2025-08-18 22:15:35 - INFO - [3a6ec3f7-fcb1-4c31-ab9f-9aed3009645e] Extracted 30 frames successfully. Saving to temporary files...
18
+ 2025-08-18 22:15:35 - INFO - [3a6ec3f7-fcb1-4c31-ab9f-9aed3009645e] 30 frames saved to temp_videos/3a6ec3f7-fcb1-4c31-ab9f-9aed3009645e
19
+ 2025-08-18 22:15:36 - INFO - Prompt token length: 2276
20
+ 2025-08-18 22:15:44 - INFO - Tokens per second: 11.375589334843664, Avg GPU memory MB: 3163.87451171875
21
+ 2025-08-18 22:15:44 - INFO - [3a6ec3f7-fcb1-4c31-ab9f-9aed3009645e] Cleaned up temporary file: temp_videos/3a6ec3f7-fcb1-4c31-ab9f-9aed3009645e.mp4
22
+ 2025-08-18 22:15:44 - INFO - [3a6ec3f7-fcb1-4c31-ab9f-9aed3009645e] Cleaned up temporary frame directory: temp_videos/3a6ec3f7-fcb1-4c31-ab9f-9aed3009645e
API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_221804.log ADDED
@@ -0,0 +1,19 @@
1
+ 2025-08-18 22:18:04 - INFO - Loading model: Qwen/Qwen2-VL-2B-Instruct-AWQ
2
+ 2025-08-18 22:18:07 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-18 22:18:15 - INFO - Model loaded in 10.56 seconds
4
+ 2025-08-18 22:18:15 - INFO - GPU Memory Usage after model load: 2.31 GB
5
+ 2025-08-18 22:18:24 - INFO - [f7d858d3-2aa9-419b-8876-9fa6c707b362] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
6
+ 2025-08-18 22:18:24 - INFO - [f7d858d3-2aa9-419b-8876-9fa6c707b362] Video saved to temporary file: temp_videos/f7d858d3-2aa9-419b-8876-9fa6c707b362.mp4
7
+ 2025-08-18 22:18:24 - INFO - [f7d858d3-2aa9-419b-8876-9fa6c707b362] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-18 22:18:28 - INFO - [f7d858d3-2aa9-419b-8876-9fa6c707b362] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-18 22:18:28 - INFO - [f7d858d3-2aa9-419b-8876-9fa6c707b362] 30 frames saved to temp_videos/f7d858d3-2aa9-419b-8876-9fa6c707b362
10
+ 2025-08-18 22:18:29 - INFO - Prompt token length: 2276
11
+ 2025-08-18 22:18:39 - INFO - Tokens per second: 9.237412977708425, Avg GPU memory MB: 3164.08056640625
12
+ 2025-08-18 22:18:39 - ERROR - [f7d858d3-2aa9-419b-8876-9fa6c707b362] An error occurred during processing: [Errno 2] No such file or directory: 'outputs/Qwen2-VL-2B-Instruct-AWQ/20250818_221804.json'
13
+ Traceback (most recent call last):
14
+ File "/mnt/data/xiuying/Code/local_deploy/infer.py", line 110, in video_inference
15
+ with open(os.path.join(OUTPUT_DIR, f"{start_time}.json"), "w") as f:
16
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17
+ FileNotFoundError: [Errno 2] No such file or directory: 'outputs/Qwen2-VL-2B-Instruct-AWQ/20250818_221804.json'
18
+ 2025-08-18 22:18:39 - INFO - [f7d858d3-2aa9-419b-8876-9fa6c707b362] Cleaned up temporary file: temp_videos/f7d858d3-2aa9-419b-8876-9fa6c707b362.mp4
19
+ 2025-08-18 22:18:39 - INFO - [f7d858d3-2aa9-419b-8876-9fa6c707b362] Cleaned up temporary frame directory: temp_videos/f7d858d3-2aa9-419b-8876-9fa6c707b362
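Review note on the FileNotFoundError above: the traceback shows `open()` failing because the `outputs/Qwen2-VL-2B-Instruct-AWQ/` directory did not exist when the result was written. A minimal sketch of one way to avoid this, assuming the result is a JSON-serializable dict; `save_result` and its parameters are hypothetical names, not taken from `infer.py`:

```python
import json
import os


def save_result(output_dir: str, run_id: str, result: dict) -> str:
    """Write `result` to <output_dir>/<run_id>.json, creating the directory first."""
    # exist_ok=True makes this safe to call on every request,
    # whether or not the directory already exists.
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, f"{run_id}.json")
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-ASCII prompts readable in the file.
        json.dump(result, f, ensure_ascii=False, indent=2)
    return path
```

Creating the directory at write time, rather than once at startup, also covers the case where the output folder is deleted while the server is running.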
API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_222505.log ADDED
@@ -0,0 +1,18 @@
1
+ 2025-08-18 22:25:05 - INFO - Loading model: Qwen/Qwen2-VL-2B-Instruct-AWQ
2
+ 2025-08-18 22:25:08 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-18 22:25:15 - INFO - Model loaded in 10.72 seconds
4
+ 2025-08-18 22:25:15 - INFO - GPU Memory Usage after model load: 2.31 GB
5
+ 2025-08-18 22:25:32 - INFO - [a4aa5634-1f05-4a10-a409-f6f99576382b] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
6
+ 2025-08-18 22:25:32 - INFO - [a4aa5634-1f05-4a10-a409-f6f99576382b] Video saved to temporary file: temp_videos/a4aa5634-1f05-4a10-a409-f6f99576382b.mp4
7
+ 2025-08-18 22:25:32 - INFO - [a4aa5634-1f05-4a10-a409-f6f99576382b] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-18 22:25:36 - INFO - [a4aa5634-1f05-4a10-a409-f6f99576382b] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-18 22:25:36 - INFO - [a4aa5634-1f05-4a10-a409-f6f99576382b] 30 frames saved to temp_videos/a4aa5634-1f05-4a10-a409-f6f99576382b
+ 2025-08-18 22:25:37 - INFO - Prompt token length: 2276
+ 2025-08-18 22:25:47 - ERROR - [a4aa5634-1f05-4a10-a409-f6f99576382b] An error occurred during processing: 'avg_gpu_memory_mb'
+ Traceback (most recent call last):
+ File "/mnt/data/xiuying/Code/local_deploy/infer.py", line 109, in video_inference
+ logging.info(f"Tokens per second: {output['tokens_per_second']}, Avg GPU memory MB: {output['avg_gpu_memory_mb']}")
+ ~~~~~~^^^^^^^^^^^^^^^^^^^^^
+ KeyError: 'avg_gpu_memory_mb'
+ 2025-08-18 22:25:47 - INFO - [a4aa5634-1f05-4a10-a409-f6f99576382b] Cleaned up temporary file: temp_videos/a4aa5634-1f05-4a10-a409-f6f99576382b.mp4
+ 2025-08-18 22:25:47 - INFO - [a4aa5634-1f05-4a10-a409-f6f99576382b] Cleaned up temporary frame directory: temp_videos/a4aa5634-1f05-4a10-a409-f6f99576382b
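The `KeyError: 'avg_gpu_memory_mb'` above comes from indexing the result dict with a key it no longer contains; the following run logs "Peak GPU memory MB" instead, which suggests the metric was renamed. A minimal sketch of a log-line builder that tolerates either name, assuming the new key is called `peak_gpu_memory_mb` (that key name and the function name `format_metrics` are assumptions, not taken from the original code):

```python
def format_metrics(output: dict) -> str:
    """Build the throughput log line without raising on a missing key."""
    tps = output.get("tokens_per_second", "n/a")
    # Assumed key names: prefer the peak metric, fall back to the average.
    if "peak_gpu_memory_mb" in output:
        label, mem = "Peak GPU memory MB", output["peak_gpu_memory_mb"]
    else:
        label, mem = "Avg GPU memory MB", output.get("avg_gpu_memory_mb", "n/a")
    return f"Tokens per second: {tps}, {label}: {mem}"
```

Using `dict.get` with a default keeps a missing metric from aborting the request after inference has already finished.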
API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_222617.log ADDED
@@ -0,0 +1,13 @@
+ 2025-08-18 22:26:17 - INFO - Loading model: Qwen/Qwen2-VL-2B-Instruct-AWQ
+ 2025-08-18 22:26:20 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
+ 2025-08-18 22:26:28 - INFO - Model loaded in 10.99 seconds
+ 2025-08-18 22:26:28 - INFO - GPU Memory Usage after model load: 2.31 GB
+ 2025-08-18 22:26:32 - INFO - [27d85b80-1b2f-42eb-9084-a747364133e1] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
+ 2025-08-18 22:26:32 - INFO - [27d85b80-1b2f-42eb-9084-a747364133e1] Video saved to temporary file: temp_videos/27d85b80-1b2f-42eb-9084-a747364133e1.mp4
+ 2025-08-18 22:26:32 - INFO - [27d85b80-1b2f-42eb-9084-a747364133e1] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-18 22:26:36 - INFO - [27d85b80-1b2f-42eb-9084-a747364133e1] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-18 22:26:36 - INFO - [27d85b80-1b2f-42eb-9084-a747364133e1] 30 frames saved to temp_videos/27d85b80-1b2f-42eb-9084-a747364133e1
+ 2025-08-18 22:26:37 - INFO - Prompt token length: 2276
+ 2025-08-18 22:26:48 - INFO - Tokens per second: 8.544413217338054, Peak GPU memory MB: 4498.375
+ 2025-08-18 22:26:48 - INFO - [27d85b80-1b2f-42eb-9084-a747364133e1] Cleaned up temporary file: temp_videos/27d85b80-1b2f-42eb-9084-a747364133e1.mp4
+ 2025-08-18 22:26:48 - INFO - [27d85b80-1b2f-42eb-9084-a747364133e1] Cleaned up temporary frame directory: temp_videos/27d85b80-1b2f-42eb-9084-a747364133e1
API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_223141.log ADDED
@@ -0,0 +1,14 @@
+ 2025-08-18 22:31:41 - INFO - Loading model: Qwen/Qwen2-VL-2B-Instruct-AWQ
+ 2025-08-18 22:31:44 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
+ 2025-08-18 22:31:53 - INFO - Model loaded in 12.48 seconds
+ 2025-08-18 22:31:53 - INFO - GPU Memory Usage after model load: 2.31 GB
+ 2025-08-18 22:32:49 - INFO - [a0e31fc7-179a-419d-b6eb-a6f05bc2a73f] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
+ 2025-08-18 22:32:49 - INFO - [a0e31fc7-179a-419d-b6eb-a6f05bc2a73f] Video saved to temporary file: temp_videos/a0e31fc7-179a-419d-b6eb-a6f05bc2a73f.mp4
+ 2025-08-18 22:32:49 - INFO - [a0e31fc7-179a-419d-b6eb-a6f05bc2a73f] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-18 22:32:53 - INFO - [a0e31fc7-179a-419d-b6eb-a6f05bc2a73f] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-18 22:32:53 - INFO - [a0e31fc7-179a-419d-b6eb-a6f05bc2a73f] 30 frames saved to temp_videos/a0e31fc7-179a-419d-b6eb-a6f05bc2a73f
+ 2025-08-18 22:32:54 - INFO - Prompt token length: 2276
+ 2025-08-18 22:33:04 - INFO - Tokens per second: 9.100198479728341, Peak GPU memory MB: 4498.375
+ 2025-08-18 22:33:04 - INFO - [a0e31fc7-179a-419d-b6eb-a6f05bc2a73f] Inference time: 14.89 seconds, CPU usage: 0.0%, CPU core utilization: [0.0, 0.0, 0.0, 0.0]
+ 2025-08-18 22:33:04 - INFO - [a0e31fc7-179a-419d-b6eb-a6f05bc2a73f] Cleaned up temporary file: temp_videos/a0e31fc7-179a-419d-b6eb-a6f05bc2a73f.mp4
+ 2025-08-18 22:33:04 - INFO - [a0e31fc7-179a-419d-b6eb-a6f05bc2a73f] Cleaned up temporary frame directory: temp_videos/a0e31fc7-179a-419d-b6eb-a6f05bc2a73f
API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_223603.log ADDED
@@ -0,0 +1,14 @@
+ 2025-08-18 22:36:03 - INFO - Loading model: Qwen/Qwen2-VL-2B-Instruct-AWQ
+ 2025-08-18 22:36:05 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
+ 2025-08-18 22:36:13 - INFO - Model loaded in 10.72 seconds
+ 2025-08-18 22:36:13 - INFO - GPU Memory Usage after model load: 2.31 GB
+ 2025-08-18 22:36:17 - INFO - [6cf28ab6-d63f-482a-849e-5b626233e7dd] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
+ 2025-08-18 22:36:17 - INFO - [6cf28ab6-d63f-482a-849e-5b626233e7dd] Video saved to temporary file: temp_videos/6cf28ab6-d63f-482a-849e-5b626233e7dd.mp4
+ 2025-08-18 22:36:17 - INFO - [6cf28ab6-d63f-482a-849e-5b626233e7dd] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-18 22:36:21 - INFO - [6cf28ab6-d63f-482a-849e-5b626233e7dd] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-18 22:36:21 - INFO - [6cf28ab6-d63f-482a-849e-5b626233e7dd] 30 frames saved to temp_videos/6cf28ab6-d63f-482a-849e-5b626233e7dd
+ 2025-08-18 22:36:21 - INFO - Prompt token length: 2276
+ 2025-08-18 22:36:32 - INFO - Tokens per second: 9.058665203909582, Peak GPU memory MB: 4498.375
+ 2025-08-18 22:36:32 - INFO - [6cf28ab6-d63f-482a-849e-5b626233e7dd] Inference time: 14.38 seconds, CPU usage: 64.5%, CPU core utilization: [61.5, 64.5, 60.3, 71.7]
+ 2025-08-18 22:36:32 - INFO - [6cf28ab6-d63f-482a-849e-5b626233e7dd] Cleaned up temporary file: temp_videos/6cf28ab6-d63f-482a-849e-5b626233e7dd.mp4
+ 2025-08-18 22:36:32 - INFO - [6cf28ab6-d63f-482a-849e-5b626233e7dd] Cleaned up temporary frame directory: temp_videos/6cf28ab6-d63f-482a-849e-5b626233e7dd
API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_224148.log ADDED
@@ -0,0 +1,14 @@
+ 2025-08-18 22:41:48 - INFO - Loading model: Qwen/Qwen2-VL-2B-Instruct-AWQ
+ 2025-08-18 22:41:51 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
+ 2025-08-18 22:41:59 - INFO - Model loaded in 10.83 seconds
+ 2025-08-18 22:41:59 - INFO - GPU Memory Usage after model load: 2.31 GB
+ 2025-08-18 22:42:01 - INFO - [7d67f1c8-a6a2-41b3-88da-3f3ddf85842f] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
+ 2025-08-18 22:42:01 - INFO - [7d67f1c8-a6a2-41b3-88da-3f3ddf85842f] Video saved to temporary file: temp_videos/7d67f1c8-a6a2-41b3-88da-3f3ddf85842f.mp4
+ 2025-08-18 22:42:01 - INFO - [7d67f1c8-a6a2-41b3-88da-3f3ddf85842f] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-18 22:42:05 - INFO - [7d67f1c8-a6a2-41b3-88da-3f3ddf85842f] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-18 22:42:06 - INFO - [7d67f1c8-a6a2-41b3-88da-3f3ddf85842f] 30 frames saved to temp_videos/7d67f1c8-a6a2-41b3-88da-3f3ddf85842f
+ 2025-08-18 22:42:06 - INFO - Prompt token length: 2276
+ 2025-08-18 22:42:17 - INFO - Tokens per second: 8.892949630488515, Peak GPU memory MB: 4498.375
+ 2025-08-18 22:42:17 - INFO - [7d67f1c8-a6a2-41b3-88da-3f3ddf85842f] Inference time: 15.41 seconds, CPU usage: 79.7%, CPU core utilization: [80.4, 78.0, 77.4, 82.9]
+ 2025-08-18 22:42:17 - INFO - [7d67f1c8-a6a2-41b3-88da-3f3ddf85842f] Cleaned up temporary file: temp_videos/7d67f1c8-a6a2-41b3-88da-3f3ddf85842f.mp4
+ 2025-08-18 22:42:17 - INFO - [7d67f1c8-a6a2-41b3-88da-3f3ddf85842f] Cleaned up temporary frame directory: temp_videos/7d67f1c8-a6a2-41b3-88da-3f3ddf85842f
API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250818_224556.log ADDED
The diff for this file is too large to render. See raw diff
 
API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250819_010913.log ADDED
The diff for this file is too large to render. See raw diff
 
API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250821_002944.log ADDED
@@ -0,0 +1,94 @@
+ 2025-08-21 00:29:44 - INFO - Loading model: Qwen/Qwen2-VL-2B-Instruct-AWQ
+ 2025-08-21 00:29:48 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
+ 2025-08-21 00:30:16 - INFO - Model loaded in 32.31 seconds
+ 2025-08-21 00:30:16 - INFO - GPU Memory Usage after model load: 2369.47 MB
+ 2025-08-21 00:30:22 - INFO - [473c7e67-59f4-4a5d-8868-f714f9787e83] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_001.mp4'
+ 2025-08-21 00:30:22 - INFO - [473c7e67-59f4-4a5d-8868-f714f9787e83] Video saved to temporary file: temp_videos/473c7e67-59f4-4a5d-8868-f714f9787e83.mp4
+ 2025-08-21 00:30:22 - INFO - [473c7e67-59f4-4a5d-8868-f714f9787e83] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 00:30:27 - INFO - [473c7e67-59f4-4a5d-8868-f714f9787e83] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 00:30:27 - INFO - [473c7e67-59f4-4a5d-8868-f714f9787e83] 30 frames saved to temp_videos/473c7e67-59f4-4a5d-8868-f714f9787e83
+ 2025-08-21 00:30:28 - INFO - Prompt token length: 2306
+ 2025-08-21 00:30:42 - INFO - Tokens per second: 15.126569543378265, Peak GPU memory MB: 4514.375
+ 2025-08-21 00:30:42 - INFO - [473c7e67-59f4-4a5d-8868-f714f9787e83] Inference time: 19.59 seconds, CPU usage: 28.8%, CPU core utilization: [28.0, 25.1, 38.9, 23.2]
+ 2025-08-21 00:30:42 - INFO - [473c7e67-59f4-4a5d-8868-f714f9787e83] Cleaned up temporary frame directory: temp_videos/473c7e67-59f4-4a5d-8868-f714f9787e83
+ 2025-08-21 00:30:42 - INFO - [8695ced7-6ed8-4f84-8fd3-a5645e83398c] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_002.mp4'
+ 2025-08-21 00:30:42 - INFO - [8695ced7-6ed8-4f84-8fd3-a5645e83398c] Video saved to temporary file: temp_videos/8695ced7-6ed8-4f84-8fd3-a5645e83398c.mp4
+ 2025-08-21 00:30:42 - INFO - [8695ced7-6ed8-4f84-8fd3-a5645e83398c] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 00:30:47 - INFO - [8695ced7-6ed8-4f84-8fd3-a5645e83398c] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 00:30:47 - INFO - [8695ced7-6ed8-4f84-8fd3-a5645e83398c] 30 frames saved to temp_videos/8695ced7-6ed8-4f84-8fd3-a5645e83398c
+ 2025-08-21 00:30:47 - INFO - Prompt token length: 2306
+ 2025-08-21 00:30:58 - INFO - Tokens per second: 15.30312036179949, Peak GPU memory MB: 4514.375
+ 2025-08-21 00:30:58 - INFO - [8695ced7-6ed8-4f84-8fd3-a5645e83398c] Inference time: 15.90 seconds, CPU usage: 45.5%, CPU core utilization: [81.9, 29.9, 42.4, 27.9]
+ 2025-08-21 00:30:58 - INFO - [8695ced7-6ed8-4f84-8fd3-a5645e83398c] Cleaned up temporary frame directory: temp_videos/8695ced7-6ed8-4f84-8fd3-a5645e83398c
+ 2025-08-21 00:30:58 - INFO - [a8ea642a-4c80-4dfc-a0b6-6e9f4cf8be02] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_003.mp4'
+ 2025-08-21 00:30:58 - INFO - [a8ea642a-4c80-4dfc-a0b6-6e9f4cf8be02] Video saved to temporary file: temp_videos/a8ea642a-4c80-4dfc-a0b6-6e9f4cf8be02.mp4
+ 2025-08-21 00:30:58 - INFO - [a8ea642a-4c80-4dfc-a0b6-6e9f4cf8be02] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 00:31:03 - INFO - [a8ea642a-4c80-4dfc-a0b6-6e9f4cf8be02] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 00:31:03 - INFO - [a8ea642a-4c80-4dfc-a0b6-6e9f4cf8be02] 30 frames saved to temp_videos/a8ea642a-4c80-4dfc-a0b6-6e9f4cf8be02
+ 2025-08-21 00:31:03 - INFO - Prompt token length: 2306
+ 2025-08-21 00:31:14 - INFO - Tokens per second: 15.081962195610863, Peak GPU memory MB: 4514.375
+ 2025-08-21 00:31:14 - INFO - [a8ea642a-4c80-4dfc-a0b6-6e9f4cf8be02] Inference time: 15.82 seconds, CPU usage: 46.3%, CPU core utilization: [66.8, 30.9, 58.6, 29.0]
+ 2025-08-21 00:31:14 - INFO - [a8ea642a-4c80-4dfc-a0b6-6e9f4cf8be02] Cleaned up temporary frame directory: temp_videos/a8ea642a-4c80-4dfc-a0b6-6e9f4cf8be02
+ 2025-08-21 00:31:14 - INFO - [127051a2-002e-4513-af2a-b168f47a679c] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_004.mp4'
+ 2025-08-21 00:31:14 - INFO - [127051a2-002e-4513-af2a-b168f47a679c] Video saved to temporary file: temp_videos/127051a2-002e-4513-af2a-b168f47a679c.mp4
+ 2025-08-21 00:31:14 - INFO - [127051a2-002e-4513-af2a-b168f47a679c] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 00:31:19 - INFO - [127051a2-002e-4513-af2a-b168f47a679c] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 00:31:19 - INFO - [127051a2-002e-4513-af2a-b168f47a679c] 30 frames saved to temp_videos/127051a2-002e-4513-af2a-b168f47a679c
+ 2025-08-21 00:31:19 - INFO - Prompt token length: 2306
+ 2025-08-21 00:31:30 - INFO - Tokens per second: 15.1012932923201, Peak GPU memory MB: 4514.375
+ 2025-08-21 00:31:30 - INFO - [127051a2-002e-4513-af2a-b168f47a679c] Inference time: 15.92 seconds, CPU usage: 44.7%, CPU core utilization: [29.3, 28.1, 27.0, 94.3]
+ 2025-08-21 00:31:30 - INFO - [127051a2-002e-4513-af2a-b168f47a679c] Cleaned up temporary frame directory: temp_videos/127051a2-002e-4513-af2a-b168f47a679c
+ 2025-08-21 00:31:30 - INFO - [a3389ddf-4af5-4921-a824-0c0c8b4ff137] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_005.mp4'
+ 2025-08-21 00:31:30 - INFO - [a3389ddf-4af5-4921-a824-0c0c8b4ff137] Video saved to temporary file: temp_videos/a3389ddf-4af5-4921-a824-0c0c8b4ff137.mp4
+ 2025-08-21 00:31:30 - INFO - [a3389ddf-4af5-4921-a824-0c0c8b4ff137] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 00:31:35 - INFO - [a3389ddf-4af5-4921-a824-0c0c8b4ff137] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 00:31:35 - INFO - [a3389ddf-4af5-4921-a824-0c0c8b4ff137] 30 frames saved to temp_videos/a3389ddf-4af5-4921-a824-0c0c8b4ff137
+ 2025-08-21 00:31:35 - INFO - Prompt token length: 2306
+ 2025-08-21 00:31:42 - INFO - Tokens per second: 14.916020163544944, Peak GPU memory MB: 4514.375
+ 2025-08-21 00:31:42 - INFO - [a3389ddf-4af5-4921-a824-0c0c8b4ff137] Inference time: 12.32 seconds, CPU usage: 50.3%, CPU core utilization: [37.0, 57.7, 34.8, 71.6]
+ 2025-08-21 00:31:42 - INFO - [a3389ddf-4af5-4921-a824-0c0c8b4ff137] Cleaned up temporary frame directory: temp_videos/a3389ddf-4af5-4921-a824-0c0c8b4ff137
+ 2025-08-21 00:31:42 - INFO - [3a1249cb-f47c-4dab-916a-8e74dfe771cd] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_006.mp4'
+ 2025-08-21 00:31:42 - INFO - [3a1249cb-f47c-4dab-916a-8e74dfe771cd] Video saved to temporary file: temp_videos/3a1249cb-f47c-4dab-916a-8e74dfe771cd.mp4
+ 2025-08-21 00:31:42 - INFO - [3a1249cb-f47c-4dab-916a-8e74dfe771cd] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 00:31:47 - INFO - [3a1249cb-f47c-4dab-916a-8e74dfe771cd] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 00:31:47 - INFO - [3a1249cb-f47c-4dab-916a-8e74dfe771cd] 30 frames saved to temp_videos/3a1249cb-f47c-4dab-916a-8e74dfe771cd
+ 2025-08-21 00:31:47 - INFO - Prompt token length: 2306
+ 2025-08-21 00:31:58 - INFO - Tokens per second: 15.137056054618776, Peak GPU memory MB: 4514.375
+ 2025-08-21 00:31:58 - INFO - [3a1249cb-f47c-4dab-916a-8e74dfe771cd] Inference time: 16.42 seconds, CPU usage: 44.8%, CPU core utilization: [31.9, 78.8, 27.5, 41.0]
+ 2025-08-21 00:31:58 - INFO - [3a1249cb-f47c-4dab-916a-8e74dfe771cd] Cleaned up temporary frame directory: temp_videos/3a1249cb-f47c-4dab-916a-8e74dfe771cd
+ 2025-08-21 00:31:58 - INFO - [aeccbc49-b47d-4bb4-a28f-ce86d255d26e] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_007.mp4'
+ 2025-08-21 00:31:58 - INFO - [aeccbc49-b47d-4bb4-a28f-ce86d255d26e] Video saved to temporary file: temp_videos/aeccbc49-b47d-4bb4-a28f-ce86d255d26e.mp4
+ 2025-08-21 00:31:58 - INFO - [aeccbc49-b47d-4bb4-a28f-ce86d255d26e] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 00:32:03 - INFO - [aeccbc49-b47d-4bb4-a28f-ce86d255d26e] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 00:32:03 - INFO - [aeccbc49-b47d-4bb4-a28f-ce86d255d26e] 30 frames saved to temp_videos/aeccbc49-b47d-4bb4-a28f-ce86d255d26e
+ 2025-08-21 00:32:03 - INFO - Prompt token length: 2306
+ 2025-08-21 00:32:11 - INFO - Tokens per second: 15.146565978743613, Peak GPU memory MB: 4514.375
+ 2025-08-21 00:32:11 - INFO - [aeccbc49-b47d-4bb4-a28f-ce86d255d26e] Inference time: 12.69 seconds, CPU usage: 50.4%, CPU core utilization: [35.0, 35.3, 94.0, 37.2]
+ 2025-08-21 00:32:11 - INFO - [aeccbc49-b47d-4bb4-a28f-ce86d255d26e] Cleaned up temporary frame directory: temp_videos/aeccbc49-b47d-4bb4-a28f-ce86d255d26e
+ 2025-08-21 00:32:11 - INFO - [56ab2c6c-91a0-4a2b-8e6e-30b409eabd5e] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_008.mp4'
+ 2025-08-21 00:32:11 - INFO - [56ab2c6c-91a0-4a2b-8e6e-30b409eabd5e] Video saved to temporary file: temp_videos/56ab2c6c-91a0-4a2b-8e6e-30b409eabd5e.mp4
+ 2025-08-21 00:32:11 - INFO - [56ab2c6c-91a0-4a2b-8e6e-30b409eabd5e] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 00:32:16 - INFO - [56ab2c6c-91a0-4a2b-8e6e-30b409eabd5e] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 00:32:16 - INFO - [56ab2c6c-91a0-4a2b-8e6e-30b409eabd5e] 30 frames saved to temp_videos/56ab2c6c-91a0-4a2b-8e6e-30b409eabd5e
+ 2025-08-21 00:32:16 - INFO - Prompt token length: 2306
+ 2025-08-21 00:32:21 - INFO - Tokens per second: 15.105367104028009, Peak GPU memory MB: 4514.375
+ 2025-08-21 00:32:21 - INFO - [56ab2c6c-91a0-4a2b-8e6e-30b409eabd5e] Inference time: 9.88 seconds, CPU usage: 56.6%, CPU core utilization: [44.1, 58.8, 44.6, 78.6]
+ 2025-08-21 00:32:21 - INFO - [56ab2c6c-91a0-4a2b-8e6e-30b409eabd5e] Cleaned up temporary frame directory: temp_videos/56ab2c6c-91a0-4a2b-8e6e-30b409eabd5e
+ 2025-08-21 00:32:21 - INFO - [2b2cfccd-ea6e-47e1-a1be-cffa74ba8d17] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_009.mp4'
+ 2025-08-21 00:32:21 - INFO - [2b2cfccd-ea6e-47e1-a1be-cffa74ba8d17] Video saved to temporary file: temp_videos/2b2cfccd-ea6e-47e1-a1be-cffa74ba8d17.mp4
+ 2025-08-21 00:32:21 - INFO - [2b2cfccd-ea6e-47e1-a1be-cffa74ba8d17] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 00:32:26 - INFO - [2b2cfccd-ea6e-47e1-a1be-cffa74ba8d17] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 00:32:26 - INFO - [2b2cfccd-ea6e-47e1-a1be-cffa74ba8d17] 30 frames saved to temp_videos/2b2cfccd-ea6e-47e1-a1be-cffa74ba8d17
+ 2025-08-21 00:32:26 - INFO - Prompt token length: 2306
+ 2025-08-21 00:32:31 - INFO - Tokens per second: 15.121278187118696, Peak GPU memory MB: 4514.375
+ 2025-08-21 00:32:31 - INFO - [2b2cfccd-ea6e-47e1-a1be-cffa74ba8d17] Inference time: 9.51 seconds, CPU usage: 57.4%, CPU core utilization: [80.4, 45.2, 47.0, 57.0]
+ 2025-08-21 00:32:31 - INFO - [2b2cfccd-ea6e-47e1-a1be-cffa74ba8d17] Cleaned up temporary frame directory: temp_videos/2b2cfccd-ea6e-47e1-a1be-cffa74ba8d17
+ 2025-08-21 00:32:31 - INFO - [ccb3ba8f-6c3e-4301-884e-8a28f8f4cac3] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_010.mp4'
+ 2025-08-21 00:32:31 - INFO - [ccb3ba8f-6c3e-4301-884e-8a28f8f4cac3] Video saved to temporary file: temp_videos/ccb3ba8f-6c3e-4301-884e-8a28f8f4cac3.mp4
+ 2025-08-21 00:32:31 - INFO - [ccb3ba8f-6c3e-4301-884e-8a28f8f4cac3] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 00:32:35 - INFO - [ccb3ba8f-6c3e-4301-884e-8a28f8f4cac3] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 00:32:35 - INFO - [ccb3ba8f-6c3e-4301-884e-8a28f8f4cac3] 30 frames saved to temp_videos/ccb3ba8f-6c3e-4301-884e-8a28f8f4cac3
+ 2025-08-21 00:32:36 - INFO - Prompt token length: 2306
+ 2025-08-21 00:32:54 - INFO - Tokens per second: 15.20172275516238, Peak GPU memory MB: 4514.375
+ 2025-08-21 00:32:54 - INFO - [ccb3ba8f-6c3e-4301-884e-8a28f8f4cac3] Inference time: 23.66 seconds, CPU usage: 39.3%, CPU core utilization: [43.6, 46.9, 30.6, 36.0]
+ 2025-08-21 00:32:54 - INFO - [ccb3ba8f-6c3e-4301-884e-8a28f8f4cac3] Cleaned up temporary frame directory: temp_videos/ccb3ba8f-6c3e-4301-884e-8a28f8f4cac3
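Batch logs like the one above repeat the same "Tokens per second" and "Inference time" lines for every clip, so per-run averages can be pulled out with two regular expressions. A minimal sketch, assuming the log line wording shown above stays unchanged; the function name `summarize_log` is hypothetical:

```python
import re
from statistics import mean

# Patterns match the metric lines exactly as they appear in these logs.
TPS_RE = re.compile(r"Tokens per second: (\d+(?:\.\d+)?)")
TIME_RE = re.compile(r"Inference time: (\d+(?:\.\d+)?) seconds")


def summarize_log(text: str) -> dict:
    """Return request count and mean throughput/latency for one log file."""
    tps = [float(v) for v in TPS_RE.findall(text)]
    times = [float(v) for v in TIME_RE.findall(text)]
    return {
        "requests": len(times),
        "mean_tokens_per_second": mean(tps) if tps else None,
        "mean_inference_seconds": mean(times) if times else None,
    }
```

Reading a log with `open(path).read()` and passing the text to `summarize_log` gives one summary dict per file, which makes runs across models easier to compare.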
API_Transformers/logs/Qwen2-VL-2B-Instruct-AWQ/20250821_013207.log ADDED
@@ -0,0 +1,148 @@
+ 2025-08-21 01:32:07 - INFO - Loading model: Qwen/Qwen2-VL-2B-Instruct-AWQ
+ 2025-08-21 01:32:11 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
+ 2025-08-21 01:32:38 - INFO - Model loaded in 31.45 seconds
+ 2025-08-21 01:32:38 - INFO - GPU Memory Usage after model load: 2369.47 MB
+ 2025-08-21 01:32:48 - INFO - [6806d96b-50d0-41d5-8703-320d06e1bb84] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_001.mp4'
+ 2025-08-21 01:32:48 - INFO - [6806d96b-50d0-41d5-8703-320d06e1bb84] Video saved to temporary file: temp_videos/6806d96b-50d0-41d5-8703-320d06e1bb84.mp4
+ 2025-08-21 01:32:48 - INFO - [6806d96b-50d0-41d5-8703-320d06e1bb84] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:32:52 - INFO - [6806d96b-50d0-41d5-8703-320d06e1bb84] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:32:52 - INFO - [6806d96b-50d0-41d5-8703-320d06e1bb84] 30 frames saved to temp_videos/6806d96b-50d0-41d5-8703-320d06e1bb84
+ 2025-08-21 01:32:52 - INFO - Prompt token length: 2306
+ 2025-08-21 01:33:04 - INFO - Tokens per second: 14.857358494588418, Peak GPU memory MB: 4514.375
+ 2025-08-21 01:33:04 - INFO - [6806d96b-50d0-41d5-8703-320d06e1bb84] Inference time: 15.50 seconds, CPU usage: 32.9%, CPU core utilization: [32.8, 28.6, 29.2, 41.1]
+ 2025-08-21 01:33:04 - INFO - [6806d96b-50d0-41d5-8703-320d06e1bb84] Cleaned up temporary frame directory: temp_videos/6806d96b-50d0-41d5-8703-320d06e1bb84
+ 2025-08-21 01:33:04 - INFO - [5cb2e558-a2e2-4495-b40b-5f785967226f] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_002.mp4'
+ 2025-08-21 01:33:04 - INFO - [5cb2e558-a2e2-4495-b40b-5f785967226f] Video saved to temporary file: temp_videos/5cb2e558-a2e2-4495-b40b-5f785967226f.mp4
+ 2025-08-21 01:33:04 - INFO - [5cb2e558-a2e2-4495-b40b-5f785967226f] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:33:07 - INFO - [5cb2e558-a2e2-4495-b40b-5f785967226f] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:33:07 - INFO - [5cb2e558-a2e2-4495-b40b-5f785967226f] 30 frames saved to temp_videos/5cb2e558-a2e2-4495-b40b-5f785967226f
+ 2025-08-21 01:33:08 - INFO - Prompt token length: 2306
+ 2025-08-21 01:33:17 - INFO - Tokens per second: 14.913712045394723, Peak GPU memory MB: 4514.375
+ 2025-08-21 01:33:17 - INFO - [5cb2e558-a2e2-4495-b40b-5f785967226f] Inference time: 12.93 seconds, CPU usage: 42.1%, CPU core utilization: [24.8, 42.8, 25.7, 75.0]
+ 2025-08-21 01:33:17 - INFO - [5cb2e558-a2e2-4495-b40b-5f785967226f] Cleaned up temporary frame directory: temp_videos/5cb2e558-a2e2-4495-b40b-5f785967226f
+ 2025-08-21 01:33:17 - INFO - [515f2d40-6c02-40e7-b489-254d66061d58] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_003.mp4'
+ 2025-08-21 01:33:17 - INFO - [515f2d40-6c02-40e7-b489-254d66061d58] Video saved to temporary file: temp_videos/515f2d40-6c02-40e7-b489-254d66061d58.mp4
+ 2025-08-21 01:33:17 - INFO - [515f2d40-6c02-40e7-b489-254d66061d58] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:33:20 - INFO - [515f2d40-6c02-40e7-b489-254d66061d58] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:33:20 - INFO - [515f2d40-6c02-40e7-b489-254d66061d58] 30 frames saved to temp_videos/515f2d40-6c02-40e7-b489-254d66061d58
+ 2025-08-21 01:33:20 - INFO - Prompt token length: 2306
+ 2025-08-21 01:33:39 - INFO - Tokens per second: 15.212866049846783, Peak GPU memory MB: 4514.375
+ 2025-08-21 01:33:39 - INFO - [515f2d40-6c02-40e7-b489-254d66061d58] Inference time: 22.18 seconds, CPU usage: 35.0%, CPU core utilization: [28.2, 14.0, 81.7, 16.0]
+ 2025-08-21 01:33:39 - INFO - [515f2d40-6c02-40e7-b489-254d66061d58] Cleaned up temporary frame directory: temp_videos/515f2d40-6c02-40e7-b489-254d66061d58
+ 2025-08-21 01:33:39 - INFO - [7702f18f-4562-4928-bacd-861b024219c1] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_004.mp4'
+ 2025-08-21 01:33:39 - INFO - [7702f18f-4562-4928-bacd-861b024219c1] Video saved to temporary file: temp_videos/7702f18f-4562-4928-bacd-861b024219c1.mp4
+ 2025-08-21 01:33:39 - INFO - [7702f18f-4562-4928-bacd-861b024219c1] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:33:43 - INFO - [7702f18f-4562-4928-bacd-861b024219c1] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:33:43 - INFO - [7702f18f-4562-4928-bacd-861b024219c1] 30 frames saved to temp_videos/7702f18f-4562-4928-bacd-861b024219c1
+ 2025-08-21 01:33:43 - INFO - Prompt token length: 2306
+ 2025-08-21 01:33:51 - INFO - Tokens per second: 14.729804011346738, Peak GPU memory MB: 4514.375
+ 2025-08-21 01:33:51 - INFO - [7702f18f-4562-4928-bacd-861b024219c1] Inference time: 11.51 seconds, CPU usage: 44.6%, CPU core utilization: [43.6, 28.0, 27.7, 78.9]
+ 2025-08-21 01:33:51 - INFO - [7702f18f-4562-4928-bacd-861b024219c1] Cleaned up temporary frame directory: temp_videos/7702f18f-4562-4928-bacd-861b024219c1
+ 2025-08-21 01:33:51 - INFO - [14252659-b5fb-4fa7-8d3e-f62a3c69679b] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_005.mp4'
+ 2025-08-21 01:33:51 - INFO - [14252659-b5fb-4fa7-8d3e-f62a3c69679b] Video saved to temporary file: temp_videos/14252659-b5fb-4fa7-8d3e-f62a3c69679b.mp4
+ 2025-08-21 01:33:51 - INFO - [14252659-b5fb-4fa7-8d3e-f62a3c69679b] Extracting frames using method: uniform, rate/threshold: 30
+ 2025-08-21 01:33:54 - INFO - [14252659-b5fb-4fa7-8d3e-f62a3c69679b] Extracted 30 frames successfully. Saving to temporary files...
+ 2025-08-21 01:33:54 - INFO - [14252659-b5fb-4fa7-8d3e-f62a3c69679b] 30 frames saved to temp_videos/14252659-b5fb-4fa7-8d3e-f62a3c69679b
+ 2025-08-21 01:33:54 - INFO - Prompt token length: 2306
+ 2025-08-21 01:34:03 - INFO - Tokens per second: 15.087484052694805, Peak GPU memory MB: 4514.375
+ 2025-08-21 01:34:03 - INFO - [14252659-b5fb-4fa7-8d3e-f62a3c69679b] Inference time: 12.10 seconds, CPU usage: 41.3%, CPU core utilization: [37.5, 35.4, 68.2, 24.2]
49
+ 2025-08-21 01:34:03 - INFO - [14252659-b5fb-4fa7-8d3e-f62a3c69679b] Cleaned up temporary frame directory: temp_videos/14252659-b5fb-4fa7-8d3e-f62a3c69679b
50
+ 2025-08-21 01:34:03 - INFO - [f08308a4-d0e5-4d3f-ade5-4a3517c11659] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_006.mp4'
51
+ 2025-08-21 01:34:03 - INFO - [f08308a4-d0e5-4d3f-ade5-4a3517c11659] Video saved to temporary file: temp_videos/f08308a4-d0e5-4d3f-ade5-4a3517c11659.mp4
52
+ 2025-08-21 01:34:03 - INFO - [f08308a4-d0e5-4d3f-ade5-4a3517c11659] Extracting frames using method: uniform, rate/threshold: 30
53
+ 2025-08-21 01:34:06 - INFO - [f08308a4-d0e5-4d3f-ade5-4a3517c11659] Extracted 30 frames successfully. Saving to temporary files...
54
+ 2025-08-21 01:34:06 - INFO - [f08308a4-d0e5-4d3f-ade5-4a3517c11659] 30 frames saved to temp_videos/f08308a4-d0e5-4d3f-ade5-4a3517c11659
55
+ 2025-08-21 01:34:06 - INFO - Prompt token length: 2306
56
+ 2025-08-21 01:34:17 - INFO - Tokens per second: 14.952252033094313, Peak GPU memory MB: 4514.375
57
+ 2025-08-21 01:34:17 - INFO - [f08308a4-d0e5-4d3f-ade5-4a3517c11659] Inference time: 14.60 seconds, CPU usage: 40.3%, CPU core utilization: [21.8, 22.2, 22.3, 94.7]
58
+ 2025-08-21 01:34:17 - INFO - [f08308a4-d0e5-4d3f-ade5-4a3517c11659] Cleaned up temporary frame directory: temp_videos/f08308a4-d0e5-4d3f-ade5-4a3517c11659
59
+ 2025-08-21 01:34:17 - INFO - [5671a395-356d-43ae-9464-5fc071986b0e] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_007.mp4'
60
+ 2025-08-21 01:34:17 - INFO - [5671a395-356d-43ae-9464-5fc071986b0e] Video saved to temporary file: temp_videos/5671a395-356d-43ae-9464-5fc071986b0e.mp4
61
+ 2025-08-21 01:34:17 - INFO - [5671a395-356d-43ae-9464-5fc071986b0e] Extracting frames using method: uniform, rate/threshold: 30
62
+ 2025-08-21 01:34:21 - INFO - [5671a395-356d-43ae-9464-5fc071986b0e] Extracted 30 frames successfully. Saving to temporary files...
63
+ 2025-08-21 01:34:21 - INFO - [5671a395-356d-43ae-9464-5fc071986b0e] 30 frames saved to temp_videos/5671a395-356d-43ae-9464-5fc071986b0e
64
+ 2025-08-21 01:34:21 - INFO - Prompt token length: 2306
65
+ 2025-08-21 01:34:29 - INFO - Tokens per second: 14.96244510342586, Peak GPU memory MB: 4514.375
66
+ 2025-08-21 01:34:29 - INFO - [5671a395-356d-43ae-9464-5fc071986b0e] Inference time: 11.52 seconds, CPU usage: 42.8%, CPU core utilization: [33.2, 59.1, 33.1, 46.1]
67
+ 2025-08-21 01:34:29 - INFO - [5671a395-356d-43ae-9464-5fc071986b0e] Cleaned up temporary frame directory: temp_videos/5671a395-356d-43ae-9464-5fc071986b0e
68
+ 2025-08-21 01:34:29 - INFO - [09f8c5e5-7851-4ff6-85c5-fd9a0ad9d11f] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_008.mp4'
69
+ 2025-08-21 01:34:29 - INFO - [09f8c5e5-7851-4ff6-85c5-fd9a0ad9d11f] Video saved to temporary file: temp_videos/09f8c5e5-7851-4ff6-85c5-fd9a0ad9d11f.mp4
70
+ 2025-08-21 01:34:29 - INFO - [09f8c5e5-7851-4ff6-85c5-fd9a0ad9d11f] Extracting frames using method: uniform, rate/threshold: 30
71
+ 2025-08-21 01:34:32 - INFO - [09f8c5e5-7851-4ff6-85c5-fd9a0ad9d11f] Extracted 30 frames successfully. Saving to temporary files...
72
+ 2025-08-21 01:34:32 - INFO - [09f8c5e5-7851-4ff6-85c5-fd9a0ad9d11f] 30 frames saved to temp_videos/09f8c5e5-7851-4ff6-85c5-fd9a0ad9d11f
73
+ 2025-08-21 01:34:33 - INFO - Prompt token length: 2306
74
+ 2025-08-21 01:34:43 - INFO - Tokens per second: 15.042997648777325, Peak GPU memory MB: 4514.375
75
+ 2025-08-21 01:34:43 - INFO - [09f8c5e5-7851-4ff6-85c5-fd9a0ad9d11f] Inference time: 13.92 seconds, CPU usage: 40.3%, CPU core utilization: [35.6, 54.7, 22.9, 47.8]
76
+ 2025-08-21 01:34:43 - INFO - [09f8c5e5-7851-4ff6-85c5-fd9a0ad9d11f] Cleaned up temporary frame directory: temp_videos/09f8c5e5-7851-4ff6-85c5-fd9a0ad9d11f
77
+ 2025-08-21 01:34:43 - INFO - [9110b0b9-0870-40fe-bfa9-fde4a5519eeb] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_009.mp4'
78
+ 2025-08-21 01:34:43 - INFO - [9110b0b9-0870-40fe-bfa9-fde4a5519eeb] Video saved to temporary file: temp_videos/9110b0b9-0870-40fe-bfa9-fde4a5519eeb.mp4
79
+ 2025-08-21 01:34:43 - INFO - [9110b0b9-0870-40fe-bfa9-fde4a5519eeb] Extracting frames using method: uniform, rate/threshold: 30
80
+ 2025-08-21 01:34:46 - INFO - [9110b0b9-0870-40fe-bfa9-fde4a5519eeb] Extracted 30 frames successfully. Saving to temporary files...
81
+ 2025-08-21 01:34:46 - INFO - [9110b0b9-0870-40fe-bfa9-fde4a5519eeb] 30 frames saved to temp_videos/9110b0b9-0870-40fe-bfa9-fde4a5519eeb
82
+ 2025-08-21 01:34:46 - INFO - Prompt token length: 2306
83
+ 2025-08-21 01:35:05 - INFO - Tokens per second: 15.022627137863786, Peak GPU memory MB: 4514.375
84
+ 2025-08-21 01:35:05 - INFO - [9110b0b9-0870-40fe-bfa9-fde4a5519eeb] Inference time: 22.48 seconds, CPU usage: 35.1%, CPU core utilization: [14.0, 57.8, 15.6, 53.2]
85
+ 2025-08-21 01:35:05 - INFO - [9110b0b9-0870-40fe-bfa9-fde4a5519eeb] Cleaned up temporary frame directory: temp_videos/9110b0b9-0870-40fe-bfa9-fde4a5519eeb
86
+ 2025-08-21 01:35:05 - INFO - [a8af5915-754a-4c20-8eed-e7dc0e54633d] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_010.mp4'
87
+ 2025-08-21 01:35:05 - INFO - [a8af5915-754a-4c20-8eed-e7dc0e54633d] Video saved to temporary file: temp_videos/a8af5915-754a-4c20-8eed-e7dc0e54633d.mp4
88
+ 2025-08-21 01:35:05 - INFO - [a8af5915-754a-4c20-8eed-e7dc0e54633d] Extracting frames using method: uniform, rate/threshold: 30
89
+ 2025-08-21 01:35:09 - INFO - [a8af5915-754a-4c20-8eed-e7dc0e54633d] Extracted 30 frames successfully. Saving to temporary files...
90
+ 2025-08-21 01:35:09 - INFO - [a8af5915-754a-4c20-8eed-e7dc0e54633d] 30 frames saved to temp_videos/a8af5915-754a-4c20-8eed-e7dc0e54633d
91
+ 2025-08-21 01:35:09 - INFO - Prompt token length: 2306
92
+ 2025-08-21 01:35:16 - INFO - Tokens per second: 14.923263484191663, Peak GPU memory MB: 4514.375
93
+ 2025-08-21 01:35:16 - INFO - [a8af5915-754a-4c20-8eed-e7dc0e54633d] Inference time: 10.32 seconds, CPU usage: 45.6%, CPU core utilization: [35.4, 79.2, 32.6, 35.0]
94
+ 2025-08-21 01:35:16 - INFO - [a8af5915-754a-4c20-8eed-e7dc0e54633d] Cleaned up temporary frame directory: temp_videos/a8af5915-754a-4c20-8eed-e7dc0e54633d
95
+ 2025-08-21 01:35:16 - INFO - [a45614c0-df7a-4c35-a1ea-1efa6a29a8d8] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_011.mp4'
96
+ 2025-08-21 01:35:16 - INFO - [a45614c0-df7a-4c35-a1ea-1efa6a29a8d8] Video saved to temporary file: temp_videos/a45614c0-df7a-4c35-a1ea-1efa6a29a8d8.mp4
97
+ 2025-08-21 01:35:16 - INFO - [a45614c0-df7a-4c35-a1ea-1efa6a29a8d8] Extracting frames using method: uniform, rate/threshold: 30
98
+ 2025-08-21 01:35:19 - INFO - [a45614c0-df7a-4c35-a1ea-1efa6a29a8d8] Extracted 30 frames successfully. Saving to temporary files...
99
+ 2025-08-21 01:35:19 - INFO - [a45614c0-df7a-4c35-a1ea-1efa6a29a8d8] 30 frames saved to temp_videos/a45614c0-df7a-4c35-a1ea-1efa6a29a8d8
100
+ 2025-08-21 01:35:19 - INFO - Prompt token length: 2306
101
+ 2025-08-21 01:35:27 - INFO - Tokens per second: 15.164410207215045, Peak GPU memory MB: 4514.375
102
+ 2025-08-21 01:35:27 - INFO - [a45614c0-df7a-4c35-a1ea-1efa6a29a8d8] Inference time: 10.98 seconds, CPU usage: 44.3%, CPU core utilization: [88.7, 28.9, 33.2, 26.5]
103
+ 2025-08-21 01:35:27 - INFO - [a45614c0-df7a-4c35-a1ea-1efa6a29a8d8] Cleaned up temporary frame directory: temp_videos/a45614c0-df7a-4c35-a1ea-1efa6a29a8d8
104
+ 2025-08-21 01:35:27 - INFO - [8156334c-d671-4483-b468-863d84a26687] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_012.mp4'
105
+ 2025-08-21 01:35:27 - INFO - [8156334c-d671-4483-b468-863d84a26687] Video saved to temporary file: temp_videos/8156334c-d671-4483-b468-863d84a26687.mp4
106
+ 2025-08-21 01:35:27 - INFO - [8156334c-d671-4483-b468-863d84a26687] Extracting frames using method: uniform, rate/threshold: 30
107
+ 2025-08-21 01:35:30 - INFO - [8156334c-d671-4483-b468-863d84a26687] Extracted 30 frames successfully. Saving to temporary files...
108
+ 2025-08-21 01:35:30 - INFO - [8156334c-d671-4483-b468-863d84a26687] 30 frames saved to temp_videos/8156334c-d671-4483-b468-863d84a26687
109
+ 2025-08-21 01:35:30 - INFO - Prompt token length: 2306
110
+ 2025-08-21 01:35:49 - INFO - Tokens per second: 15.117784722711388, Peak GPU memory MB: 4514.375
111
+ 2025-08-21 01:35:49 - INFO - [8156334c-d671-4483-b468-863d84a26687] Inference time: 22.36 seconds, CPU usage: 34.6%, CPU core utilization: [24.1, 60.2, 15.1, 38.9]
112
+ 2025-08-21 01:35:49 - INFO - [8156334c-d671-4483-b468-863d84a26687] Cleaned up temporary frame directory: temp_videos/8156334c-d671-4483-b468-863d84a26687
113
+ 2025-08-21 01:35:49 - INFO - [58827a98-0a85-4ee4-8240-b10420154270] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_013.mp4'
114
+ 2025-08-21 01:35:49 - INFO - [58827a98-0a85-4ee4-8240-b10420154270] Video saved to temporary file: temp_videos/58827a98-0a85-4ee4-8240-b10420154270.mp4
115
+ 2025-08-21 01:35:49 - INFO - [58827a98-0a85-4ee4-8240-b10420154270] Extracting frames using method: uniform, rate/threshold: 30
116
+ 2025-08-21 01:35:52 - INFO - [58827a98-0a85-4ee4-8240-b10420154270] Extracted 30 frames successfully. Saving to temporary files...
117
+ 2025-08-21 01:35:52 - INFO - [58827a98-0a85-4ee4-8240-b10420154270] 30 frames saved to temp_videos/58827a98-0a85-4ee4-8240-b10420154270
118
+ 2025-08-21 01:35:53 - INFO - Prompt token length: 2306
119
+ 2025-08-21 01:36:00 - INFO - Tokens per second: 14.985813897711381, Peak GPU memory MB: 4514.375
120
+ 2025-08-21 01:36:00 - INFO - [58827a98-0a85-4ee4-8240-b10420154270] Inference time: 11.17 seconds, CPU usage: 41.0%, CPU core utilization: [25.1, 70.6, 23.8, 44.4]
121
+ 2025-08-21 01:36:00 - INFO - [58827a98-0a85-4ee4-8240-b10420154270] Cleaned up temporary frame directory: temp_videos/58827a98-0a85-4ee4-8240-b10420154270
122
+ 2025-08-21 01:36:00 - INFO - [cc8da7cb-4ffc-4400-a354-fe30fac0dc25] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_014.mp4'
123
+ 2025-08-21 01:36:00 - INFO - [cc8da7cb-4ffc-4400-a354-fe30fac0dc25] Video saved to temporary file: temp_videos/cc8da7cb-4ffc-4400-a354-fe30fac0dc25.mp4
124
+ 2025-08-21 01:36:00 - INFO - [cc8da7cb-4ffc-4400-a354-fe30fac0dc25] Extracting frames using method: uniform, rate/threshold: 30
125
+ 2025-08-21 01:36:04 - INFO - [cc8da7cb-4ffc-4400-a354-fe30fac0dc25] Extracted 30 frames successfully. Saving to temporary files...
126
+ 2025-08-21 01:36:04 - INFO - [cc8da7cb-4ffc-4400-a354-fe30fac0dc25] 30 frames saved to temp_videos/cc8da7cb-4ffc-4400-a354-fe30fac0dc25
127
+ 2025-08-21 01:36:04 - INFO - Prompt token length: 2306
128
+ 2025-08-21 01:36:12 - INFO - Tokens per second: 15.019531662577604, Peak GPU memory MB: 4514.375
129
+ 2025-08-21 01:36:12 - INFO - [cc8da7cb-4ffc-4400-a354-fe30fac0dc25] Inference time: 12.21 seconds, CPU usage: 40.9%, CPU core utilization: [58.5, 47.3, 33.7, 24.0]
130
+ 2025-08-21 01:36:12 - INFO - [cc8da7cb-4ffc-4400-a354-fe30fac0dc25] Cleaned up temporary frame directory: temp_videos/cc8da7cb-4ffc-4400-a354-fe30fac0dc25
131
+ 2025-08-21 01:36:12 - INFO - [95265fda-7544-4393-a928-5411d89f8f51] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_015.mp4'
132
+ 2025-08-21 01:36:12 - INFO - [95265fda-7544-4393-a928-5411d89f8f51] Video saved to temporary file: temp_videos/95265fda-7544-4393-a928-5411d89f8f51.mp4
133
+ 2025-08-21 01:36:12 - INFO - [95265fda-7544-4393-a928-5411d89f8f51] Extracting frames using method: uniform, rate/threshold: 30
134
+ 2025-08-21 01:36:16 - INFO - [95265fda-7544-4393-a928-5411d89f8f51] Extracted 30 frames successfully. Saving to temporary files...
135
+ 2025-08-21 01:36:16 - INFO - [95265fda-7544-4393-a928-5411d89f8f51] 30 frames saved to temp_videos/95265fda-7544-4393-a928-5411d89f8f51
136
+ 2025-08-21 01:36:16 - INFO - Prompt token length: 2306
137
+ 2025-08-21 01:36:23 - INFO - Tokens per second: 14.90843712868291, Peak GPU memory MB: 4514.375
138
+ 2025-08-21 01:36:23 - INFO - [95265fda-7544-4393-a928-5411d89f8f51] Inference time: 10.51 seconds, CPU usage: 42.4%, CPU core utilization: [33.2, 26.3, 25.8, 84.0]
139
+ 2025-08-21 01:36:23 - INFO - [95265fda-7544-4393-a928-5411d89f8f51] Cleaned up temporary frame directory: temp_videos/95265fda-7544-4393-a928-5411d89f8f51
140
+ 2025-08-21 01:36:23 - INFO - [5425fe3f-264e-4b86-b655-903ec4f4ef2e] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_016.mp4'
141
+ 2025-08-21 01:36:23 - INFO - [5425fe3f-264e-4b86-b655-903ec4f4ef2e] Video saved to temporary file: temp_videos/5425fe3f-264e-4b86-b655-903ec4f4ef2e.mp4
142
+ 2025-08-21 01:36:23 - INFO - [5425fe3f-264e-4b86-b655-903ec4f4ef2e] Extracting frames using method: uniform, rate/threshold: 30
143
+ 2025-08-21 01:36:26 - INFO - [5425fe3f-264e-4b86-b655-903ec4f4ef2e] Extracted 30 frames successfully. Saving to temporary files...
144
+ 2025-08-21 01:36:26 - INFO - [5425fe3f-264e-4b86-b655-903ec4f4ef2e] 30 frames saved to temp_videos/5425fe3f-264e-4b86-b655-903ec4f4ef2e
145
+ 2025-08-21 01:36:27 - INFO - Prompt token length: 2306
146
+ 2025-08-21 01:36:34 - INFO - Tokens per second: 15.01252738973886, Peak GPU memory MB: 4514.375
147
+ 2025-08-21 01:36:34 - INFO - [5425fe3f-264e-4b86-b655-903ec4f4ef2e] Inference time: 11.52 seconds, CPU usage: 42.0%, CPU core utilization: [24.5, 25.9, 25.9, 91.7]
148
+ 2025-08-21 01:36:34 - INFO - [5425fe3f-264e-4b86-b655-903ec4f4ef2e] Cleaned up temporary frame directory: temp_videos/5425fe3f-264e-4b86-b655-903ec4f4ef2e
API_Transformers/logs/Qwen2.5-VL-3B-Instruct-AWQ/20250821_003308.log ADDED
@@ -0,0 +1,2 @@
1
+ 2025-08-21 00:33:08 - INFO - Loading model: Qwen/Qwen2.5-VL-3B-Instruct-AWQ
2
+ 2025-08-21 00:33:56 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
API_Transformers/logs/Qwen2.5-VL-3B-Instruct-AWQ/20250821_003548.log ADDED
@@ -0,0 +1,2 @@
1
+ 2025-08-21 00:35:48 - INFO - Loading model: Qwen/Qwen2.5-VL-3B-Instruct-AWQ
2
+ 2025-08-21 00:35:51 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
API_Transformers/logs/Qwen2.5-VL-3B-Instruct-AWQ/20250821_003740.log ADDED
@@ -0,0 +1,10 @@
1
+ 2025-08-21 00:37:40 - INFO - Loading model: Qwen/Qwen2.5-VL-3B-Instruct-AWQ
2
+ 2025-08-21 00:37:42 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-21 00:37:58 - INFO - Model loaded in 17.64 seconds
4
+ 2025-08-21 00:37:58 - INFO - GPU Memory Usage after model load: 3250.85 MB
5
+ 2025-08-21 00:39:14 - INFO - [7b3e4c2f-150e-4db3-a2b2-792ef836f5c3] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_001.mp4'
6
+ 2025-08-21 00:39:14 - INFO - [7b3e4c2f-150e-4db3-a2b2-792ef836f5c3] Video saved to temporary file: temp_videos/7b3e4c2f-150e-4db3-a2b2-792ef836f5c3.mp4
7
+ 2025-08-21 00:39:14 - INFO - [7b3e4c2f-150e-4db3-a2b2-792ef836f5c3] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-21 00:39:18 - INFO - [7b3e4c2f-150e-4db3-a2b2-792ef836f5c3] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-21 00:39:18 - INFO - [7b3e4c2f-150e-4db3-a2b2-792ef836f5c3] 30 frames saved to temp_videos/7b3e4c2f-150e-4db3-a2b2-792ef836f5c3
10
+ 2025-08-21 00:39:19 - INFO - Prompt token length: 2306
API_Transformers/logs/Qwen2.5-VL-3B-Instruct-AWQ/20250821_004253.log ADDED
@@ -0,0 +1,103 @@
1
+ 2025-08-21 00:42:53 - INFO - Loading model: Qwen/Qwen2.5-VL-3B-Instruct-AWQ
2
+ 2025-08-21 00:42:56 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-21 00:43:05 - INFO - Model loaded in 11.91 seconds
4
+ 2025-08-21 00:43:05 - INFO - GPU Memory Usage after model load: 3250.55 MB
5
+ 2025-08-21 00:44:34 - INFO - [85d08818-6d68-43fa-a772-626d83ea5d11] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_001.mp4'
6
+ 2025-08-21 00:44:34 - INFO - [85d08818-6d68-43fa-a772-626d83ea5d11] Video saved to temporary file: temp_videos/85d08818-6d68-43fa-a772-626d83ea5d11.mp4
7
+ 2025-08-21 00:44:34 - INFO - [85d08818-6d68-43fa-a772-626d83ea5d11] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-21 00:44:41 - INFO - [85d08818-6d68-43fa-a772-626d83ea5d11] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-21 00:44:41 - INFO - [85d08818-6d68-43fa-a772-626d83ea5d11] 30 frames saved to temp_videos/85d08818-6d68-43fa-a772-626d83ea5d11
10
+ 2025-08-21 00:44:41 - INFO - Prompt token length: 2306
11
+ 2025-08-21 00:44:47 - INFO - Tokens per second: 11.896679103804114, Peak GPU memory MB: 5348.375
12
+ 2025-08-21 00:44:47 - INFO - [85d08818-6d68-43fa-a772-626d83ea5d11] Inference time: 12.73 seconds, CPU usage: 20.1%, CPU core utilization: [17.7, 19.0, 21.6, 22.0]
13
+ 2025-08-21 00:44:47 - INFO - [85d08818-6d68-43fa-a772-626d83ea5d11] Cleaned up temporary frame directory: temp_videos/85d08818-6d68-43fa-a772-626d83ea5d11
14
+ 2025-08-21 00:44:47 - INFO - [6f9278db-56d7-44a9-b7f0-7200571a0979] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_002.mp4'
15
+ 2025-08-21 00:44:47 - INFO - [6f9278db-56d7-44a9-b7f0-7200571a0979] Video saved to temporary file: temp_videos/6f9278db-56d7-44a9-b7f0-7200571a0979.mp4
16
+ 2025-08-21 00:44:47 - INFO - [6f9278db-56d7-44a9-b7f0-7200571a0979] Extracting frames using method: uniform, rate/threshold: 30
17
+ 2025-08-21 00:44:52 - INFO - [6f9278db-56d7-44a9-b7f0-7200571a0979] Extracted 30 frames successfully. Saving to temporary files...
18
+ 2025-08-21 00:44:52 - INFO - [6f9278db-56d7-44a9-b7f0-7200571a0979] 30 frames saved to temp_videos/6f9278db-56d7-44a9-b7f0-7200571a0979
19
+ 2025-08-21 00:44:52 - INFO - Prompt token length: 2306
20
+ 2025-08-21 00:44:57 - INFO - Tokens per second: 12.02869428489415, Peak GPU memory MB: 5348.375
21
+ 2025-08-21 00:44:57 - INFO - [6f9278db-56d7-44a9-b7f0-7200571a0979] Inference time: 10.58 seconds, CPU usage: 55.1%, CPU core utilization: [42.1, 41.3, 93.8, 43.2]
22
+ 2025-08-21 00:44:57 - INFO - [6f9278db-56d7-44a9-b7f0-7200571a0979] Cleaned up temporary frame directory: temp_videos/6f9278db-56d7-44a9-b7f0-7200571a0979
23
+ 2025-08-21 00:44:57 - INFO - [2835a505-ec18-45e1-9b43-393c4eb0c79a] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_003.mp4'
24
+ 2025-08-21 00:44:57 - INFO - [2835a505-ec18-45e1-9b43-393c4eb0c79a] Video saved to temporary file: temp_videos/2835a505-ec18-45e1-9b43-393c4eb0c79a.mp4
25
+ 2025-08-21 00:44:57 - INFO - [2835a505-ec18-45e1-9b43-393c4eb0c79a] Extracting frames using method: uniform, rate/threshold: 30
26
+ 2025-08-21 00:45:02 - INFO - [2835a505-ec18-45e1-9b43-393c4eb0c79a] Extracted 30 frames successfully. Saving to temporary files...
27
+ 2025-08-21 00:45:02 - INFO - [2835a505-ec18-45e1-9b43-393c4eb0c79a] 30 frames saved to temp_videos/2835a505-ec18-45e1-9b43-393c4eb0c79a
28
+ 2025-08-21 00:45:02 - INFO - Prompt token length: 2306
29
+ 2025-08-21 00:45:08 - INFO - Tokens per second: 11.82593643667435, Peak GPU memory MB: 5348.375
30
+ 2025-08-21 00:45:08 - INFO - [2835a505-ec18-45e1-9b43-393c4eb0c79a] Inference time: 10.16 seconds, CPU usage: 56.2%, CPU core utilization: [90.9, 44.3, 46.7, 42.8]
31
+ 2025-08-21 00:45:08 - INFO - [2835a505-ec18-45e1-9b43-393c4eb0c79a] Cleaned up temporary frame directory: temp_videos/2835a505-ec18-45e1-9b43-393c4eb0c79a
32
+ 2025-08-21 00:45:08 - INFO - [9ad1595c-b1c3-409e-99bb-050a41cf9e9e] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_004.mp4'
33
+ 2025-08-21 00:45:08 - INFO - [9ad1595c-b1c3-409e-99bb-050a41cf9e9e] Video saved to temporary file: temp_videos/9ad1595c-b1c3-409e-99bb-050a41cf9e9e.mp4
34
+ 2025-08-21 00:45:08 - INFO - [9ad1595c-b1c3-409e-99bb-050a41cf9e9e] Extracting frames using method: uniform, rate/threshold: 30
35
+ 2025-08-21 00:45:13 - INFO - [9ad1595c-b1c3-409e-99bb-050a41cf9e9e] Extracted 30 frames successfully. Saving to temporary files...
36
+ 2025-08-21 00:45:13 - INFO - [9ad1595c-b1c3-409e-99bb-050a41cf9e9e] 30 frames saved to temp_videos/9ad1595c-b1c3-409e-99bb-050a41cf9e9e
37
+ 2025-08-21 00:45:13 - INFO - Prompt token length: 2306
38
+ 2025-08-21 00:45:19 - INFO - Tokens per second: 11.785621023429538, Peak GPU memory MB: 5348.375
39
+ 2025-08-21 00:45:19 - INFO - [9ad1595c-b1c3-409e-99bb-050a41cf9e9e] Inference time: 11.90 seconds, CPU usage: 53.0%, CPU core utilization: [38.8, 90.1, 41.1, 42.3]
40
+ 2025-08-21 00:45:19 - INFO - [9ad1595c-b1c3-409e-99bb-050a41cf9e9e] Cleaned up temporary frame directory: temp_videos/9ad1595c-b1c3-409e-99bb-050a41cf9e9e
41
+ 2025-08-21 00:45:19 - INFO - [83ee3b32-7870-4d00-b3f0-d1ec1167d45e] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_005.mp4'
42
+ 2025-08-21 00:45:19 - INFO - [83ee3b32-7870-4d00-b3f0-d1ec1167d45e] Video saved to temporary file: temp_videos/83ee3b32-7870-4d00-b3f0-d1ec1167d45e.mp4
43
+ 2025-08-21 00:45:19 - INFO - [83ee3b32-7870-4d00-b3f0-d1ec1167d45e] Extracting frames using method: uniform, rate/threshold: 30
44
+ 2025-08-21 00:45:24 - INFO - [83ee3b32-7870-4d00-b3f0-d1ec1167d45e] Extracted 30 frames successfully. Saving to temporary files...
45
+ 2025-08-21 00:45:24 - INFO - [83ee3b32-7870-4d00-b3f0-d1ec1167d45e] 30 frames saved to temp_videos/83ee3b32-7870-4d00-b3f0-d1ec1167d45e
46
+ 2025-08-21 00:45:25 - INFO - Prompt token length: 2306
47
+ 2025-08-21 00:45:32 - INFO - Tokens per second: 9.017638706034026, Peak GPU memory MB: 5348.375
48
+ 2025-08-21 00:45:32 - INFO - [83ee3b32-7870-4d00-b3f0-d1ec1167d45e] Inference time: 12.17 seconds, CPU usage: 75.1%, CPU core utilization: [69.4, 92.0, 68.0, 70.7]
49
+ 2025-08-21 00:45:32 - INFO - [83ee3b32-7870-4d00-b3f0-d1ec1167d45e] Cleaned up temporary frame directory: temp_videos/83ee3b32-7870-4d00-b3f0-d1ec1167d45e
50
+ 2025-08-21 00:45:50 - INFO - [91458b58-07b8-4a0e-bbec-63fde300aebc] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_001.mp4'
51
+ 2025-08-21 00:45:50 - INFO - [91458b58-07b8-4a0e-bbec-63fde300aebc] Video saved to temporary file: temp_videos/91458b58-07b8-4a0e-bbec-63fde300aebc.mp4
52
+ 2025-08-21 00:45:50 - INFO - [91458b58-07b8-4a0e-bbec-63fde300aebc] Extracting frames using method: uniform, rate/threshold: 30
53
+ 2025-08-21 00:45:57 - INFO - [91458b58-07b8-4a0e-bbec-63fde300aebc] Extracted 30 frames successfully. Saving to temporary files...
54
+ 2025-08-21 00:45:57 - INFO - [91458b58-07b8-4a0e-bbec-63fde300aebc] 30 frames saved to temp_videos/91458b58-07b8-4a0e-bbec-63fde300aebc
55
+ 2025-08-21 00:45:57 - INFO - Prompt token length: 2296
56
+ 2025-08-21 00:46:18 - INFO - Tokens per second: 11.854063880552362, Peak GPU memory MB: 5348.375
57
+ 2025-08-21 00:46:18 - INFO - [91458b58-07b8-4a0e-bbec-63fde300aebc] Inference time: 28.38 seconds, CPU usage: 43.3%, CPU core utilization: [34.4, 74.4, 32.5, 31.9]
58
+ 2025-08-21 00:46:18 - INFO - [91458b58-07b8-4a0e-bbec-63fde300aebc] Cleaned up temporary frame directory: temp_videos/91458b58-07b8-4a0e-bbec-63fde300aebc
59
+ 2025-08-21 00:46:18 - INFO - [65b42141-20bf-4cf1-92b1-f29d846146ab] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_002.mp4'
60
+ 2025-08-21 00:46:18 - INFO - [65b42141-20bf-4cf1-92b1-f29d846146ab] Video saved to temporary file: temp_videos/65b42141-20bf-4cf1-92b1-f29d846146ab.mp4
61
+ 2025-08-21 00:46:18 - INFO - [65b42141-20bf-4cf1-92b1-f29d846146ab] Extracting frames using method: uniform, rate/threshold: 30
62
+ 2025-08-21 00:46:23 - INFO - [65b42141-20bf-4cf1-92b1-f29d846146ab] Extracted 30 frames successfully. Saving to temporary files...
63
+ 2025-08-21 00:46:23 - INFO - [65b42141-20bf-4cf1-92b1-f29d846146ab] 30 frames saved to temp_videos/65b42141-20bf-4cf1-92b1-f29d846146ab
64
+ 2025-08-21 00:46:23 - INFO - Prompt token length: 2296
65
+ 2025-08-21 00:46:47 - INFO - Tokens per second: 11.997021386458192, Peak GPU memory MB: 5348.375
66
+ 2025-08-21 00:46:47 - INFO - [65b42141-20bf-4cf1-92b1-f29d846146ab] Inference time: 29.08 seconds, CPU usage: 37.0%, CPU core utilization: [33.9, 16.1, 80.3, 17.7]
67
+ 2025-08-21 00:46:47 - INFO - [65b42141-20bf-4cf1-92b1-f29d846146ab] Cleaned up temporary frame directory: temp_videos/65b42141-20bf-4cf1-92b1-f29d846146ab
68
+ 2025-08-21 00:46:47 - INFO - [2ff4de72-4fa0-4759-9211-626a4f60c683] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_003.mp4'
69
+ 2025-08-21 00:46:47 - INFO - [2ff4de72-4fa0-4759-9211-626a4f60c683] Video saved to temporary file: temp_videos/2ff4de72-4fa0-4759-9211-626a4f60c683.mp4
70
+ 2025-08-21 00:46:47 - INFO - [2ff4de72-4fa0-4759-9211-626a4f60c683] Extracting frames using method: uniform, rate/threshold: 30
71
+ 2025-08-21 00:46:52 - INFO - [2ff4de72-4fa0-4759-9211-626a4f60c683] Extracted 30 frames successfully. Saving to temporary files...
72
+ 2025-08-21 00:46:52 - INFO - [2ff4de72-4fa0-4759-9211-626a4f60c683] 30 frames saved to temp_videos/2ff4de72-4fa0-4759-9211-626a4f60c683
73
+ 2025-08-21 00:46:52 - INFO - Prompt token length: 2296
74
+ 2025-08-21 00:47:16 - INFO - Tokens per second: 12.037390307990146, Peak GPU memory MB: 5348.375
75
+ 2025-08-21 00:47:16 - INFO - [2ff4de72-4fa0-4759-9211-626a4f60c683] Inference time: 29.02 seconds, CPU usage: 37.2%, CPU core utilization: [48.4, 16.9, 65.1, 18.1]
76
+ 2025-08-21 00:47:16 - INFO - [2ff4de72-4fa0-4759-9211-626a4f60c683] Cleaned up temporary frame directory: temp_videos/2ff4de72-4fa0-4759-9211-626a4f60c683
77
+ 2025-08-21 00:47:16 - INFO - [68a0b698-fcf0-4e8b-b0cb-e03797f97561] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_004.mp4'
78
+ 2025-08-21 00:47:16 - INFO - [68a0b698-fcf0-4e8b-b0cb-e03797f97561] Video saved to temporary file: temp_videos/68a0b698-fcf0-4e8b-b0cb-e03797f97561.mp4
79
+ 2025-08-21 00:47:16 - INFO - [68a0b698-fcf0-4e8b-b0cb-e03797f97561] Extracting frames using method: uniform, rate/threshold: 30
80
+ 2025-08-21 00:47:21 - INFO - [68a0b698-fcf0-4e8b-b0cb-e03797f97561] Extracted 30 frames successfully. Saving to temporary files...
81
+ 2025-08-21 00:47:21 - INFO - [68a0b698-fcf0-4e8b-b0cb-e03797f97561] 30 frames saved to temp_videos/68a0b698-fcf0-4e8b-b0cb-e03797f97561
82
+ 2025-08-21 00:47:21 - INFO - Prompt token length: 2296
83
+ 2025-08-21 00:47:45 - INFO - Tokens per second: 12.027123899562989, Peak GPU memory MB: 5348.375
84
+ 2025-08-21 00:47:45 - INFO - [68a0b698-fcf0-4e8b-b0cb-e03797f97561] Inference time: 29.08 seconds, CPU usage: 36.9%, CPU core utilization: [74.2, 17.8, 15.5, 40.0]
85
+ 2025-08-21 00:47:45 - INFO - [68a0b698-fcf0-4e8b-b0cb-e03797f97561] Cleaned up temporary frame directory: temp_videos/68a0b698-fcf0-4e8b-b0cb-e03797f97561
86
+ 2025-08-21 00:47:45 - INFO - [6b5a1c52-b835-40af-b34e-b1b24b36ca95] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_005.mp4'
87
+ 2025-08-21 00:47:45 - INFO - [6b5a1c52-b835-40af-b34e-b1b24b36ca95] Video saved to temporary file: temp_videos/6b5a1c52-b835-40af-b34e-b1b24b36ca95.mp4
88
+ 2025-08-21 00:47:45 - INFO - [6b5a1c52-b835-40af-b34e-b1b24b36ca95] Extracting frames using method: uniform, rate/threshold: 30
89
+ 2025-08-21 00:47:50 - INFO - [6b5a1c52-b835-40af-b34e-b1b24b36ca95] Extracted 30 frames successfully. Saving to temporary files...
90
+ 2025-08-21 00:47:50 - INFO - [6b5a1c52-b835-40af-b34e-b1b24b36ca95] 30 frames saved to temp_videos/6b5a1c52-b835-40af-b34e-b1b24b36ca95
91
+ 2025-08-21 00:47:50 - INFO - Prompt token length: 2296
92
+ 2025-08-21 00:48:07 - INFO - Tokens per second: 11.998806395924422, Peak GPU memory MB: 5348.375
93
+ 2025-08-21 00:48:07 - INFO - [6b5a1c52-b835-40af-b34e-b1b24b36ca95] Inference time: 21.52 seconds, CPU usage: 40.1%, CPU core utilization: [93.9, 22.7, 21.5, 22.1]
94
+ 2025-08-21 00:48:07 - INFO - [6b5a1c52-b835-40af-b34e-b1b24b36ca95] Cleaned up temporary frame directory: temp_videos/6b5a1c52-b835-40af-b34e-b1b24b36ca95
95
+ 2025-08-21 00:48:07 - INFO - [f6c17199-243f-477c-8b92-175e7d81c801] Received new video inference request. Prompt: 'Please describe the video in detail, only focus on customer and staff behavior and activities and do not overly describe the static scene.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_006.mp4'
96
+ 2025-08-21 00:48:07 - INFO - [f6c17199-243f-477c-8b92-175e7d81c801] Video saved to temporary file: temp_videos/f6c17199-243f-477c-8b92-175e7d81c801.mp4
97
+ 2025-08-21 00:48:07 - INFO - [f6c17199-243f-477c-8b92-175e7d81c801] Extracting frames using method: uniform, rate/threshold: 30
98
+ 2025-08-21 00:48:12 - INFO - [f6c17199-243f-477c-8b92-175e7d81c801] Extracted 30 frames successfully. Saving to temporary files...
99
+ 2025-08-21 00:48:12 - INFO - [f6c17199-243f-477c-8b92-175e7d81c801] 30 frames saved to temp_videos/f6c17199-243f-477c-8b92-175e7d81c801
100
+ 2025-08-21 00:48:12 - INFO - Prompt token length: 2296
101
+ 2025-08-21 00:48:36 - INFO - Tokens per second: 12.045229786817497, Peak GPU memory MB: 5348.375
102
+ 2025-08-21 00:48:36 - INFO - [f6c17199-243f-477c-8b92-175e7d81c801] Inference time: 29.18 seconds, CPU usage: 37.2%, CPU core utilization: [44.5, 28.8, 58.7, 16.6]
103
+ 2025-08-21 00:48:36 - INFO - [f6c17199-243f-477c-8b92-175e7d81c801] Cleaned up temporary frame directory: temp_videos/f6c17199-243f-477c-8b92-175e7d81c801
API_Transformers/logs/Qwen2.5-VL-3B-Instruct-AWQ/20250821_004907.log ADDED
@@ -0,0 +1,145 @@
1
+ 2025-08-21 00:49:07 - INFO - Loading model: Qwen/Qwen2.5-VL-3B-Instruct-AWQ
2
+ 2025-08-21 00:49:11 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-21 00:49:19 - INFO - Model loaded in 11.79 seconds
4
+ 2025-08-21 00:49:19 - INFO - GPU Memory Usage after model load: 3250.55 MB
5
+ 2025-08-21 00:50:53 - INFO - [30fe7962-43c7-418e-9663-3cf53776c810] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_001.mp4'
6
+ 2025-08-21 00:50:53 - INFO - [30fe7962-43c7-418e-9663-3cf53776c810] Video saved to temporary file: temp_videos/30fe7962-43c7-418e-9663-3cf53776c810.mp4
7
+ 2025-08-21 00:50:53 - INFO - [30fe7962-43c7-418e-9663-3cf53776c810] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-21 00:50:58 - INFO - [30fe7962-43c7-418e-9663-3cf53776c810] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-21 00:50:58 - INFO - [30fe7962-43c7-418e-9663-3cf53776c810] 30 frames saved to temp_videos/30fe7962-43c7-418e-9663-3cf53776c810
10
+ 2025-08-21 00:50:58 - INFO - Prompt token length: 2306
11
+ 2025-08-21 00:51:05 - INFO - Tokens per second: 11.82581726573877, Peak GPU memory MB: 5348.375
12
+ 2025-08-21 00:51:05 - INFO - [30fe7962-43c7-418e-9663-3cf53776c810] Inference time: 11.48 seconds, CPU usage: 19.9%, CPU core utilization: [17.8, 19.9, 21.8, 20.2]
13
+ 2025-08-21 00:51:05 - INFO - [30fe7962-43c7-418e-9663-3cf53776c810] Cleaned up temporary frame directory: temp_videos/30fe7962-43c7-418e-9663-3cf53776c810
14
+ 2025-08-21 00:51:05 - INFO - [a3af8f29-02fa-49b6-bcb7-c671f274c93a] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_002.mp4'
15
+ 2025-08-21 00:51:05 - INFO - [a3af8f29-02fa-49b6-bcb7-c671f274c93a] Video saved to temporary file: temp_videos/a3af8f29-02fa-49b6-bcb7-c671f274c93a.mp4
16
+ 2025-08-21 00:51:05 - INFO - [a3af8f29-02fa-49b6-bcb7-c671f274c93a] Extracting frames using method: uniform, rate/threshold: 30
17
+ 2025-08-21 00:51:09 - INFO - [a3af8f29-02fa-49b6-bcb7-c671f274c93a] Extracted 30 frames successfully. Saving to temporary files...
18
+ 2025-08-21 00:51:09 - INFO - [a3af8f29-02fa-49b6-bcb7-c671f274c93a] 30 frames saved to temp_videos/a3af8f29-02fa-49b6-bcb7-c671f274c93a
19
+ 2025-08-21 00:51:10 - INFO - Prompt token length: 2306
20
+ 2025-08-21 00:51:15 - INFO - Tokens per second: 11.74072921403584, Peak GPU memory MB: 5348.375
21
+ 2025-08-21 00:51:15 - INFO - [a3af8f29-02fa-49b6-bcb7-c671f274c93a] Inference time: 10.66 seconds, CPU usage: 56.2%, CPU core utilization: [43.8, 43.4, 43.8, 93.4]
22
+ 2025-08-21 00:51:15 - INFO - [a3af8f29-02fa-49b6-bcb7-c671f274c93a] Cleaned up temporary frame directory: temp_videos/a3af8f29-02fa-49b6-bcb7-c671f274c93a
23
+ 2025-08-21 00:51:15 - INFO - [be2cf942-7b83-46a1-80f4-3341fc34fdda] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_003.mp4'
24
+ 2025-08-21 00:51:15 - INFO - [be2cf942-7b83-46a1-80f4-3341fc34fdda] Video saved to temporary file: temp_videos/be2cf942-7b83-46a1-80f4-3341fc34fdda.mp4
25
+ 2025-08-21 00:51:15 - INFO - [be2cf942-7b83-46a1-80f4-3341fc34fdda] Extracting frames using method: uniform, rate/threshold: 30
26
+ 2025-08-21 00:51:20 - INFO - [be2cf942-7b83-46a1-80f4-3341fc34fdda] Extracted 30 frames successfully. Saving to temporary files...
27
+ 2025-08-21 00:51:20 - INFO - [be2cf942-7b83-46a1-80f4-3341fc34fdda] 30 frames saved to temp_videos/be2cf942-7b83-46a1-80f4-3341fc34fdda
28
+ 2025-08-21 00:51:20 - INFO - Prompt token length: 2306
29
+ 2025-08-21 00:51:27 - INFO - Tokens per second: 11.73304837389127, Peak GPU memory MB: 5348.375
30
+ 2025-08-21 00:51:27 - INFO - [be2cf942-7b83-46a1-80f4-3341fc34fdda] Inference time: 11.55 seconds, CPU usage: 52.1%, CPU core utilization: [38.3, 93.8, 38.3, 38.0]
31
+ 2025-08-21 00:51:27 - INFO - [be2cf942-7b83-46a1-80f4-3341fc34fdda] Cleaned up temporary frame directory: temp_videos/be2cf942-7b83-46a1-80f4-3341fc34fdda
32
+ 2025-08-21 00:51:27 - INFO - [1d4fb530-0fc9-438f-a51a-cabce02b6cb7] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_004.mp4'
33
+ 2025-08-21 00:51:27 - INFO - [1d4fb530-0fc9-438f-a51a-cabce02b6cb7] Video saved to temporary file: temp_videos/1d4fb530-0fc9-438f-a51a-cabce02b6cb7.mp4
34
+ 2025-08-21 00:51:27 - INFO - [1d4fb530-0fc9-438f-a51a-cabce02b6cb7] Extracting frames using method: uniform, rate/threshold: 30
35
+ 2025-08-21 00:51:32 - INFO - [1d4fb530-0fc9-438f-a51a-cabce02b6cb7] Extracted 30 frames successfully. Saving to temporary files...
36
+ 2025-08-21 00:51:32 - INFO - [1d4fb530-0fc9-438f-a51a-cabce02b6cb7] 30 frames saved to temp_videos/1d4fb530-0fc9-438f-a51a-cabce02b6cb7
37
+ 2025-08-21 00:51:32 - INFO - Prompt token length: 2306
38
+ 2025-08-21 00:51:38 - INFO - Tokens per second: 11.929480284506932, Peak GPU memory MB: 5348.375
39
+ 2025-08-21 00:51:38 - INFO - [1d4fb530-0fc9-438f-a51a-cabce02b6cb7] Inference time: 11.57 seconds, CPU usage: 52.3%, CPU core utilization: [89.2, 38.6, 42.9, 38.3]
40
+ 2025-08-21 00:51:38 - INFO - [1d4fb530-0fc9-438f-a51a-cabce02b6cb7] Cleaned up temporary frame directory: temp_videos/1d4fb530-0fc9-438f-a51a-cabce02b6cb7
41
+ 2025-08-21 00:51:38 - INFO - [ce11e297-0569-49ac-85bc-050e43e84448] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_005.mp4'
42
+ 2025-08-21 00:51:38 - INFO - [ce11e297-0569-49ac-85bc-050e43e84448] Video saved to temporary file: temp_videos/ce11e297-0569-49ac-85bc-050e43e84448.mp4
43
+ 2025-08-21 00:51:38 - INFO - [ce11e297-0569-49ac-85bc-050e43e84448] Extracting frames using method: uniform, rate/threshold: 30
44
+ 2025-08-21 00:51:43 - INFO - [ce11e297-0569-49ac-85bc-050e43e84448] Extracted 30 frames successfully. Saving to temporary files...
45
+ 2025-08-21 00:51:43 - INFO - [ce11e297-0569-49ac-85bc-050e43e84448] 30 frames saved to temp_videos/ce11e297-0569-49ac-85bc-050e43e84448
46
+ 2025-08-21 00:51:43 - INFO - Prompt token length: 2306
47
+ 2025-08-21 00:51:50 - INFO - Tokens per second: 11.89941941740383, Peak GPU memory MB: 5348.375
48
+ 2025-08-21 00:51:50 - INFO - [ce11e297-0569-49ac-85bc-050e43e84448] Inference time: 11.12 seconds, CPU usage: 53.2%, CPU core utilization: [37.8, 40.6, 93.6, 40.6]
49
+ 2025-08-21 00:51:50 - INFO - [ce11e297-0569-49ac-85bc-050e43e84448] Cleaned up temporary frame directory: temp_videos/ce11e297-0569-49ac-85bc-050e43e84448
50
+ 2025-08-21 00:51:50 - INFO - [5cec6dd5-3430-473f-aa6a-0d81b6475f34] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_006.mp4'
51
+ 2025-08-21 00:51:50 - INFO - [5cec6dd5-3430-473f-aa6a-0d81b6475f34] Video saved to temporary file: temp_videos/5cec6dd5-3430-473f-aa6a-0d81b6475f34.mp4
52
+ 2025-08-21 00:51:50 - INFO - [5cec6dd5-3430-473f-aa6a-0d81b6475f34] Extracting frames using method: uniform, rate/threshold: 30
53
+ 2025-08-21 00:51:54 - INFO - [5cec6dd5-3430-473f-aa6a-0d81b6475f34] Extracted 30 frames successfully. Saving to temporary files...
54
+ 2025-08-21 00:51:54 - INFO - [5cec6dd5-3430-473f-aa6a-0d81b6475f34] 30 frames saved to temp_videos/5cec6dd5-3430-473f-aa6a-0d81b6475f34
55
+ 2025-08-21 00:51:55 - INFO - Prompt token length: 2306
56
+ 2025-08-21 00:52:00 - INFO - Tokens per second: 11.881699260124632, Peak GPU memory MB: 5348.375
57
+ 2025-08-21 00:52:00 - INFO - [5cec6dd5-3430-473f-aa6a-0d81b6475f34] Inference time: 10.77 seconds, CPU usage: 53.6%, CPU core utilization: [59.2, 66.8, 47.9, 40.4]
58
+ 2025-08-21 00:52:00 - INFO - [5cec6dd5-3430-473f-aa6a-0d81b6475f34] Cleaned up temporary frame directory: temp_videos/5cec6dd5-3430-473f-aa6a-0d81b6475f34
59
+ 2025-08-21 00:52:21 - INFO - [70289040-01c3-4ed8-83de-5a2d9996ed2d] Received new video inference request. Prompt: 'Summarize the key events in this convenience store video. Focus only on the actions and interactions of the people. Avoid repetitive descriptions of the store's layout or shelves.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_001.mp4'
60
+ 2025-08-21 00:52:21 - INFO - [70289040-01c3-4ed8-83de-5a2d9996ed2d] Video saved to temporary file: temp_videos/70289040-01c3-4ed8-83de-5a2d9996ed2d.mp4
61
+ 2025-08-21 00:52:21 - INFO - [70289040-01c3-4ed8-83de-5a2d9996ed2d] Extracting frames using method: uniform, rate/threshold: 30
62
+ 2025-08-21 00:52:26 - INFO - [70289040-01c3-4ed8-83de-5a2d9996ed2d] Extracted 30 frames successfully. Saving to temporary files...
63
+ 2025-08-21 00:52:26 - INFO - [70289040-01c3-4ed8-83de-5a2d9996ed2d] 30 frames saved to temp_videos/70289040-01c3-4ed8-83de-5a2d9996ed2d
64
+ 2025-08-21 00:52:26 - INFO - Prompt token length: 2305
65
+ 2025-08-21 00:52:32 - INFO - Tokens per second: 11.87194334609885, Peak GPU memory MB: 5348.375
66
+ 2025-08-21 00:52:32 - INFO - [70289040-01c3-4ed8-83de-5a2d9996ed2d] Inference time: 10.96 seconds, CPU usage: 20.3%, CPU core utilization: [15.5, 15.3, 33.8, 16.7]
67
+ 2025-08-21 00:52:32 - INFO - [70289040-01c3-4ed8-83de-5a2d9996ed2d] Cleaned up temporary frame directory: temp_videos/70289040-01c3-4ed8-83de-5a2d9996ed2d
68
+ 2025-08-21 00:52:32 - INFO - [a8bb150b-138f-4300-adf1-fa15dbace647] Received new video inference request. Prompt: 'Summarize the key events in this convenience store video. Focus only on the actions and interactions of the people. Avoid repetitive descriptions of the store's layout or shelves.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_002.mp4'
69
+ 2025-08-21 00:52:32 - INFO - [a8bb150b-138f-4300-adf1-fa15dbace647] Video saved to temporary file: temp_videos/a8bb150b-138f-4300-adf1-fa15dbace647.mp4
70
+ 2025-08-21 00:52:32 - INFO - [a8bb150b-138f-4300-adf1-fa15dbace647] Extracting frames using method: uniform, rate/threshold: 30
71
+ 2025-08-21 00:52:37 - INFO - [a8bb150b-138f-4300-adf1-fa15dbace647] Extracted 30 frames successfully. Saving to temporary files...
72
+ 2025-08-21 00:52:37 - INFO - [a8bb150b-138f-4300-adf1-fa15dbace647] 30 frames saved to temp_videos/a8bb150b-138f-4300-adf1-fa15dbace647
73
+ 2025-08-21 00:52:37 - INFO - Prompt token length: 2305
74
+ 2025-08-21 00:52:44 - INFO - Tokens per second: 11.83286096302887, Peak GPU memory MB: 5348.375
75
+ 2025-08-21 00:52:44 - INFO - [a8bb150b-138f-4300-adf1-fa15dbace647] Inference time: 11.96 seconds, CPU usage: 52.1%, CPU core utilization: [39.5, 38.2, 37.0, 93.5]
76
+ 2025-08-21 00:52:44 - INFO - [a8bb150b-138f-4300-adf1-fa15dbace647] Cleaned up temporary frame directory: temp_videos/a8bb150b-138f-4300-adf1-fa15dbace647
77
+ 2025-08-21 00:52:44 - INFO - [6f5c9723-cae6-47ab-8cc7-6942f1ad38d4] Received new video inference request. Prompt: 'Summarize the key events in this convenience store video. Focus only on the actions and interactions of the people. Avoid repetitive descriptions of the store's layout or shelves.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_003.mp4'
78
+ 2025-08-21 00:52:44 - INFO - [6f5c9723-cae6-47ab-8cc7-6942f1ad38d4] Video saved to temporary file: temp_videos/6f5c9723-cae6-47ab-8cc7-6942f1ad38d4.mp4
79
+ 2025-08-21 00:52:44 - INFO - [6f5c9723-cae6-47ab-8cc7-6942f1ad38d4] Extracting frames using method: uniform, rate/threshold: 30
80
+ 2025-08-21 00:52:49 - INFO - [6f5c9723-cae6-47ab-8cc7-6942f1ad38d4] Extracted 30 frames successfully. Saving to temporary files...
81
+ 2025-08-21 00:52:49 - INFO - [6f5c9723-cae6-47ab-8cc7-6942f1ad38d4] 30 frames saved to temp_videos/6f5c9723-cae6-47ab-8cc7-6942f1ad38d4
82
+ 2025-08-21 00:52:49 - INFO - Prompt token length: 2305
83
+ 2025-08-21 00:52:56 - INFO - Tokens per second: 11.760994323642526, Peak GPU memory MB: 5348.375
84
+ 2025-08-21 00:52:56 - INFO - [6f5c9723-cae6-47ab-8cc7-6942f1ad38d4] Inference time: 11.29 seconds, CPU usage: 53.3%, CPU core utilization: [40.3, 75.1, 39.1, 58.8]
85
+ 2025-08-21 00:52:56 - INFO - [6f5c9723-cae6-47ab-8cc7-6942f1ad38d4] Cleaned up temporary frame directory: temp_videos/6f5c9723-cae6-47ab-8cc7-6942f1ad38d4
86
+ 2025-08-21 00:52:56 - INFO - [0d1e6fde-165d-4649-8878-c71f32a33f71] Received new video inference request. Prompt: 'Summarize the key events in this convenience store video. Focus only on the actions and interactions of the people. Avoid repetitive descriptions of the store's layout or shelves.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_004.mp4'
87
+ 2025-08-21 00:52:56 - INFO - [0d1e6fde-165d-4649-8878-c71f32a33f71] Video saved to temporary file: temp_videos/0d1e6fde-165d-4649-8878-c71f32a33f71.mp4
88
+ 2025-08-21 00:52:56 - INFO - [0d1e6fde-165d-4649-8878-c71f32a33f71] Extracting frames using method: uniform, rate/threshold: 30
89
+ 2025-08-21 00:53:00 - INFO - [0d1e6fde-165d-4649-8878-c71f32a33f71] Extracted 30 frames successfully. Saving to temporary files...
90
+ 2025-08-21 00:53:01 - INFO - [0d1e6fde-165d-4649-8878-c71f32a33f71] 30 frames saved to temp_videos/0d1e6fde-165d-4649-8878-c71f32a33f71
91
+ 2025-08-21 00:53:01 - INFO - Prompt token length: 2305
92
+ 2025-08-21 00:53:07 - INFO - Tokens per second: 11.888364051217732, Peak GPU memory MB: 5348.375
93
+ 2025-08-21 00:53:07 - INFO - [0d1e6fde-165d-4649-8878-c71f32a33f71] Inference time: 11.70 seconds, CPU usage: 51.6%, CPU core utilization: [59.1, 67.7, 41.2, 38.5]
94
+ 2025-08-21 00:53:07 - INFO - [0d1e6fde-165d-4649-8878-c71f32a33f71] Cleaned up temporary frame directory: temp_videos/0d1e6fde-165d-4649-8878-c71f32a33f71
95
+ 2025-08-21 00:53:07 - INFO - [884171ad-eeda-4dd5-9f1b-d43868fa7804] Received new video inference request. Prompt: 'Summarize the key events in this convenience store video. Focus only on the actions and interactions of the people. Avoid repetitive descriptions of the store's layout or shelves.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_005.mp4'
96
+ 2025-08-21 00:53:07 - INFO - [884171ad-eeda-4dd5-9f1b-d43868fa7804] Video saved to temporary file: temp_videos/884171ad-eeda-4dd5-9f1b-d43868fa7804.mp4
97
+ 2025-08-21 00:53:07 - INFO - [884171ad-eeda-4dd5-9f1b-d43868fa7804] Extracting frames using method: uniform, rate/threshold: 30
98
+ 2025-08-21 00:53:12 - INFO - [884171ad-eeda-4dd5-9f1b-d43868fa7804] Extracted 30 frames successfully. Saving to temporary files...
99
+ 2025-08-21 00:53:12 - INFO - [884171ad-eeda-4dd5-9f1b-d43868fa7804] 30 frames saved to temp_videos/884171ad-eeda-4dd5-9f1b-d43868fa7804
100
+ 2025-08-21 00:53:12 - INFO - Prompt token length: 2305
101
+ 2025-08-21 00:53:18 - INFO - Tokens per second: 11.790592742669173, Peak GPU memory MB: 5348.375
102
+ 2025-08-21 00:53:18 - INFO - [884171ad-eeda-4dd5-9f1b-d43868fa7804] Inference time: 10.74 seconds, CPU usage: 55.3%, CPU core utilization: [53.6, 43.0, 81.8, 42.7]
103
+ 2025-08-21 00:53:18 - INFO - [884171ad-eeda-4dd5-9f1b-d43868fa7804] Cleaned up temporary frame directory: temp_videos/884171ad-eeda-4dd5-9f1b-d43868fa7804
104
+ 2025-08-21 00:53:18 - INFO - [ce71041b-d894-486d-9e37-8b9a86705705] Received new video inference request. Prompt: 'Summarize the key events in this convenience store video. Focus only on the actions and interactions of the people. Avoid repetitive descriptions of the store's layout or shelves.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_006.mp4'
105
+ 2025-08-21 00:53:18 - INFO - [ce71041b-d894-486d-9e37-8b9a86705705] Video saved to temporary file: temp_videos/ce71041b-d894-486d-9e37-8b9a86705705.mp4
106
+ 2025-08-21 00:53:18 - INFO - [ce71041b-d894-486d-9e37-8b9a86705705] Extracting frames using method: uniform, rate/threshold: 30
107
+ 2025-08-21 00:53:23 - INFO - [ce71041b-d894-486d-9e37-8b9a86705705] Extracted 30 frames successfully. Saving to temporary files...
108
+ 2025-08-21 00:53:23 - INFO - [ce71041b-d894-486d-9e37-8b9a86705705] 30 frames saved to temp_videos/ce71041b-d894-486d-9e37-8b9a86705705
109
+ 2025-08-21 00:53:23 - INFO - Prompt token length: 2305
110
+ 2025-08-21 00:53:28 - INFO - Tokens per second: 11.854764118772119, Peak GPU memory MB: 5348.375
111
+ 2025-08-21 00:53:28 - INFO - [ce71041b-d894-486d-9e37-8b9a86705705] Inference time: 10.06 seconds, CPU usage: 55.9%, CPU core utilization: [48.9, 44.4, 87.7, 42.4]
112
+ 2025-08-21 00:53:28 - INFO - [ce71041b-d894-486d-9e37-8b9a86705705] Cleaned up temporary frame directory: temp_videos/ce71041b-d894-486d-9e37-8b9a86705705
113
+ 2025-08-21 00:53:28 - INFO - [a135f93a-1ac9-4578-a1c4-2b1aeb89afda] Received new video inference request. Prompt: 'Summarize the key events in this convenience store video. Focus only on the actions and interactions of the people. Avoid repetitive descriptions of the store's layout or shelves.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_007.mp4'
114
+ 2025-08-21 00:53:28 - INFO - [a135f93a-1ac9-4578-a1c4-2b1aeb89afda] Video saved to temporary file: temp_videos/a135f93a-1ac9-4578-a1c4-2b1aeb89afda.mp4
115
+ 2025-08-21 00:53:28 - INFO - [a135f93a-1ac9-4578-a1c4-2b1aeb89afda] Extracting frames using method: uniform, rate/threshold: 30
116
+ 2025-08-21 00:53:33 - INFO - [a135f93a-1ac9-4578-a1c4-2b1aeb89afda] Extracted 30 frames successfully. Saving to temporary files...
117
+ 2025-08-21 00:53:33 - INFO - [a135f93a-1ac9-4578-a1c4-2b1aeb89afda] 30 frames saved to temp_videos/a135f93a-1ac9-4578-a1c4-2b1aeb89afda
118
+ 2025-08-21 00:53:33 - INFO - Prompt token length: 2305
119
+ 2025-08-21 00:53:40 - INFO - Tokens per second: 11.806017274209756, Peak GPU memory MB: 5348.375
120
+ 2025-08-21 00:53:40 - INFO - [a135f93a-1ac9-4578-a1c4-2b1aeb89afda] Inference time: 12.00 seconds, CPU usage: 51.4%, CPU core utilization: [49.4, 37.7, 81.8, 36.8]
121
+ 2025-08-21 00:53:40 - INFO - [a135f93a-1ac9-4578-a1c4-2b1aeb89afda] Cleaned up temporary frame directory: temp_videos/a135f93a-1ac9-4578-a1c4-2b1aeb89afda
122
+ 2025-08-21 00:53:40 - INFO - [6873e58b-3473-4224-909d-3159c03588e5] Received new video inference request. Prompt: 'Summarize the key events in this convenience store video. Focus only on the actions and interactions of the people. Avoid repetitive descriptions of the store's layout or shelves.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_008.mp4'
123
+ 2025-08-21 00:53:40 - INFO - [6873e58b-3473-4224-909d-3159c03588e5] Video saved to temporary file: temp_videos/6873e58b-3473-4224-909d-3159c03588e5.mp4
124
+ 2025-08-21 00:53:40 - INFO - [6873e58b-3473-4224-909d-3159c03588e5] Extracting frames using method: uniform, rate/threshold: 30
125
+ 2025-08-21 00:53:45 - INFO - [6873e58b-3473-4224-909d-3159c03588e5] Extracted 30 frames successfully. Saving to temporary files...
126
+ 2025-08-21 00:53:45 - INFO - [6873e58b-3473-4224-909d-3159c03588e5] 30 frames saved to temp_videos/6873e58b-3473-4224-909d-3159c03588e5
127
+ 2025-08-21 00:53:45 - INFO - Prompt token length: 2305
128
+ 2025-08-21 00:53:51 - INFO - Tokens per second: 11.878890265234213, Peak GPU memory MB: 5348.375
129
+ 2025-08-21 00:53:51 - INFO - [6873e58b-3473-4224-909d-3159c03588e5] Inference time: 10.61 seconds, CPU usage: 55.0%, CPU core utilization: [90.1, 42.2, 45.8, 41.8]
130
+ 2025-08-21 00:53:51 - INFO - [6873e58b-3473-4224-909d-3159c03588e5] Cleaned up temporary frame directory: temp_videos/6873e58b-3473-4224-909d-3159c03588e5
131
+ 2025-08-21 00:53:51 - INFO - [6356c394-1484-4391-b145-81215ba47ee8] Received new video inference request. Prompt: 'Summarize the key events in this convenience store video. Focus only on the actions and interactions of the people. Avoid repetitive descriptions of the store's layout or shelves.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_009.mp4'
132
+ 2025-08-21 00:53:51 - INFO - [6356c394-1484-4391-b145-81215ba47ee8] Video saved to temporary file: temp_videos/6356c394-1484-4391-b145-81215ba47ee8.mp4
133
+ 2025-08-21 00:53:51 - INFO - [6356c394-1484-4391-b145-81215ba47ee8] Extracting frames using method: uniform, rate/threshold: 30
134
+ 2025-08-21 00:53:56 - INFO - [6356c394-1484-4391-b145-81215ba47ee8] Extracted 30 frames successfully. Saving to temporary files...
135
+ 2025-08-21 00:53:56 - INFO - [6356c394-1484-4391-b145-81215ba47ee8] 30 frames saved to temp_videos/6356c394-1484-4391-b145-81215ba47ee8
136
+ 2025-08-21 00:53:56 - INFO - Prompt token length: 2305
137
+ 2025-08-21 00:54:02 - INFO - Tokens per second: 11.82995179235076, Peak GPU memory MB: 5348.375
138
+ 2025-08-21 00:54:02 - INFO - [6356c394-1484-4391-b145-81215ba47ee8] Inference time: 10.80 seconds, CPU usage: 53.8%, CPU core utilization: [49.9, 41.3, 78.9, 45.0]
139
+ 2025-08-21 00:54:02 - INFO - [6356c394-1484-4391-b145-81215ba47ee8] Cleaned up temporary frame directory: temp_videos/6356c394-1484-4391-b145-81215ba47ee8
140
+ 2025-08-21 00:54:02 - INFO - [2999105b-10bc-497e-8931-352c7d9d65e6] Received new video inference request. Prompt: 'Summarize the key events in this convenience store video. Focus only on the actions and interactions of the people. Avoid repetitive descriptions of the store's layout or shelves.', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_010.mp4'
141
+ 2025-08-21 00:54:02 - INFO - [2999105b-10bc-497e-8931-352c7d9d65e6] Video saved to temporary file: temp_videos/2999105b-10bc-497e-8931-352c7d9d65e6.mp4
142
+ 2025-08-21 00:54:02 - INFO - [2999105b-10bc-497e-8931-352c7d9d65e6] Extracting frames using method: uniform, rate/threshold: 30
143
+ 2025-08-21 00:54:07 - INFO - [2999105b-10bc-497e-8931-352c7d9d65e6] Extracted 30 frames successfully. Saving to temporary files...
144
+ 2025-08-21 00:54:07 - INFO - [2999105b-10bc-497e-8931-352c7d9d65e6] 30 frames saved to temp_videos/2999105b-10bc-497e-8931-352c7d9d65e6
145
+ 2025-08-21 00:54:07 - INFO - Prompt token length: 2305
API_Transformers/logs/Qwen2.5-VL-3B-Instruct-AWQ/20250821_014204.log ADDED
@@ -0,0 +1,148 @@
1
+ 2025-08-21 01:42:04 - INFO - Loading model: Qwen/Qwen2.5-VL-3B-Instruct-AWQ
2
+ 2025-08-21 01:42:09 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-21 01:42:40 - INFO - Model loaded in 35.77 seconds
4
+ 2025-08-21 01:42:40 - INFO - GPU Memory Usage after model load: 3250.55 MB
5
+ 2025-08-21 02:54:09 - INFO - [c40f2273-a9f5-4d96-82d4-990269ab9708] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_001.mp4'
6
+ 2025-08-21 02:54:09 - INFO - [c40f2273-a9f5-4d96-82d4-990269ab9708] Video saved to temporary file: temp_videos/c40f2273-a9f5-4d96-82d4-990269ab9708.mp4
7
+ 2025-08-21 02:54:09 - INFO - [c40f2273-a9f5-4d96-82d4-990269ab9708] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-21 02:54:13 - INFO - [c40f2273-a9f5-4d96-82d4-990269ab9708] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-21 02:54:13 - INFO - [c40f2273-a9f5-4d96-82d4-990269ab9708] 30 frames saved to temp_videos/c40f2273-a9f5-4d96-82d4-990269ab9708
10
+ 2025-08-21 02:54:13 - INFO - Prompt token length: 2306
11
+ 2025-08-21 02:54:23 - INFO - Tokens per second: 11.859020159952623, Peak GPU memory MB: 5350.375
12
+ 2025-08-21 02:54:23 - INFO - [c40f2273-a9f5-4d96-82d4-990269ab9708] Inference time: 14.12 seconds, CPU usage: 2.0%, CPU core utilization: [2.0, 2.0, 1.9, 1.9]
13
+ 2025-08-21 02:54:23 - INFO - [c40f2273-a9f5-4d96-82d4-990269ab9708] Cleaned up temporary frame directory: temp_videos/c40f2273-a9f5-4d96-82d4-990269ab9708
14
+ 2025-08-21 02:54:23 - INFO - [1bbf302e-4b0b-4363-bddd-3fb826552587] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_002.mp4'
15
+ 2025-08-21 02:54:23 - INFO - [1bbf302e-4b0b-4363-bddd-3fb826552587] Video saved to temporary file: temp_videos/1bbf302e-4b0b-4363-bddd-3fb826552587.mp4
16
+ 2025-08-21 02:54:23 - INFO - [1bbf302e-4b0b-4363-bddd-3fb826552587] Extracting frames using method: uniform, rate/threshold: 30
17
+ 2025-08-21 02:54:27 - INFO - [1bbf302e-4b0b-4363-bddd-3fb826552587] Extracted 30 frames successfully. Saving to temporary files...
18
+ 2025-08-21 02:54:27 - INFO - [1bbf302e-4b0b-4363-bddd-3fb826552587] 30 frames saved to temp_videos/1bbf302e-4b0b-4363-bddd-3fb826552587
19
+ 2025-08-21 02:54:27 - INFO - Prompt token length: 2306
20
+ 2025-08-21 02:54:34 - INFO - Tokens per second: 12.033912631174916, Peak GPU memory MB: 5350.375
21
+ 2025-08-21 02:54:34 - INFO - [1bbf302e-4b0b-4363-bddd-3fb826552587] Inference time: 10.49 seconds, CPU usage: 44.1%, CPU core utilization: [80.0, 27.5, 40.1, 29.0]
22
+ 2025-08-21 02:54:34 - INFO - [1bbf302e-4b0b-4363-bddd-3fb826552587] Cleaned up temporary frame directory: temp_videos/1bbf302e-4b0b-4363-bddd-3fb826552587
23
+ 2025-08-21 02:54:34 - INFO - [48b38709-fb9f-4c1d-9db6-279fea58e01f] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_003.mp4'
24
+ 2025-08-21 02:54:34 - INFO - [48b38709-fb9f-4c1d-9db6-279fea58e01f] Video saved to temporary file: temp_videos/48b38709-fb9f-4c1d-9db6-279fea58e01f.mp4
25
+ 2025-08-21 02:54:34 - INFO - [48b38709-fb9f-4c1d-9db6-279fea58e01f] Extracting frames using method: uniform, rate/threshold: 30
26
+ 2025-08-21 02:54:37 - INFO - [48b38709-fb9f-4c1d-9db6-279fea58e01f] Extracted 30 frames successfully. Saving to temporary files...
27
+ 2025-08-21 02:54:37 - INFO - [48b38709-fb9f-4c1d-9db6-279fea58e01f] 30 frames saved to temp_videos/48b38709-fb9f-4c1d-9db6-279fea58e01f
28
+ 2025-08-21 02:54:37 - INFO - Prompt token length: 2306
29
+ 2025-08-21 02:54:45 - INFO - Tokens per second: 11.980873759204092, Peak GPU memory MB: 5350.375
30
+ 2025-08-21 02:54:45 - INFO - [48b38709-fb9f-4c1d-9db6-279fea58e01f] Inference time: 10.84 seconds, CPU usage: 43.7%, CPU core utilization: [49.1, 32.4, 65.3, 27.8]
31
+ 2025-08-21 02:54:45 - INFO - [48b38709-fb9f-4c1d-9db6-279fea58e01f] Cleaned up temporary frame directory: temp_videos/48b38709-fb9f-4c1d-9db6-279fea58e01f
32
+ 2025-08-21 02:54:45 - INFO - [218b6cb4-0c13-4223-b6be-fbc881774b17] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_004.mp4'
33
+ 2025-08-21 02:54:45 - INFO - [218b6cb4-0c13-4223-b6be-fbc881774b17] Video saved to temporary file: temp_videos/218b6cb4-0c13-4223-b6be-fbc881774b17.mp4
34
+ 2025-08-21 02:54:45 - INFO - [218b6cb4-0c13-4223-b6be-fbc881774b17] Extracting frames using method: uniform, rate/threshold: 30
35
+ 2025-08-21 02:54:48 - INFO - [218b6cb4-0c13-4223-b6be-fbc881774b17] Extracted 30 frames successfully. Saving to temporary files...
36
+ 2025-08-21 02:54:48 - INFO - [218b6cb4-0c13-4223-b6be-fbc881774b17] 30 frames saved to temp_videos/218b6cb4-0c13-4223-b6be-fbc881774b17
37
+ 2025-08-21 02:54:48 - INFO - Prompt token length: 2306
38
+ 2025-08-21 02:55:13 - INFO - Tokens per second: 11.894932301505968, Peak GPU memory MB: 5350.375
39
+ 2025-08-21 02:55:13 - INFO - [218b6cb4-0c13-4223-b6be-fbc881774b17] Inference time: 27.98 seconds, CPU usage: 33.8%, CPU core utilization: [13.9, 45.9, 13.3, 61.9]
40
+ 2025-08-21 02:55:13 - INFO - [218b6cb4-0c13-4223-b6be-fbc881774b17] Cleaned up temporary frame directory: temp_videos/218b6cb4-0c13-4223-b6be-fbc881774b17
41
+ 2025-08-21 02:55:13 - INFO - [6550b43c-430e-4dee-8467-1a05b4c082cd] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_005.mp4'
42
+ 2025-08-21 02:55:13 - INFO - [6550b43c-430e-4dee-8467-1a05b4c082cd] Video saved to temporary file: temp_videos/6550b43c-430e-4dee-8467-1a05b4c082cd.mp4
43
+ 2025-08-21 02:55:13 - INFO - [6550b43c-430e-4dee-8467-1a05b4c082cd] Extracting frames using method: uniform, rate/threshold: 30
44
+ 2025-08-21 02:55:16 - INFO - [6550b43c-430e-4dee-8467-1a05b4c082cd] Extracted 30 frames successfully. Saving to temporary files...
45
+ 2025-08-21 02:55:16 - INFO - [6550b43c-430e-4dee-8467-1a05b4c082cd] 30 frames saved to temp_videos/6550b43c-430e-4dee-8467-1a05b4c082cd
46
+ 2025-08-21 02:55:16 - INFO - Prompt token length: 2306
47
+ 2025-08-21 02:55:25 - INFO - Tokens per second: 11.99842860278374, Peak GPU memory MB: 5350.375
48
+ 2025-08-21 02:55:25 - INFO - [6550b43c-430e-4dee-8467-1a05b4c082cd] Inference time: 12.41 seconds, CPU usage: 40.7%, CPU core utilization: [34.0, 38.6, 64.3, 25.9]
49
+ 2025-08-21 02:55:25 - INFO - [6550b43c-430e-4dee-8467-1a05b4c082cd] Cleaned up temporary frame directory: temp_videos/6550b43c-430e-4dee-8467-1a05b4c082cd
50
+ 2025-08-21 02:55:25 - INFO - [172a602d-213b-41d6-b892-e7ca06e535bc] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_006.mp4'
51
+ 2025-08-21 02:55:25 - INFO - [172a602d-213b-41d6-b892-e7ca06e535bc] Video saved to temporary file: temp_videos/172a602d-213b-41d6-b892-e7ca06e535bc.mp4
52
+ 2025-08-21 02:55:25 - INFO - [172a602d-213b-41d6-b892-e7ca06e535bc] Extracting frames using method: uniform, rate/threshold: 30
53
+ 2025-08-21 02:55:28 - INFO - [172a602d-213b-41d6-b892-e7ca06e535bc] Extracted 30 frames successfully. Saving to temporary files...
54
+ 2025-08-21 02:55:28 - INFO - [172a602d-213b-41d6-b892-e7ca06e535bc] 30 frames saved to temp_videos/172a602d-213b-41d6-b892-e7ca06e535bc
55
+ 2025-08-21 02:55:29 - INFO - Prompt token length: 2306
56
+ 2025-08-21 02:55:40 - INFO - Tokens per second: 11.862422969421846, Peak GPU memory MB: 5350.375
57
+ 2025-08-21 02:55:40 - INFO - [172a602d-213b-41d6-b892-e7ca06e535bc] Inference time: 15.04 seconds, CPU usage: 39.3%, CPU core utilization: [21.5, 43.6, 21.6, 70.6]
58
+ 2025-08-21 02:55:40 - INFO - [172a602d-213b-41d6-b892-e7ca06e535bc] Cleaned up temporary frame directory: temp_videos/172a602d-213b-41d6-b892-e7ca06e535bc
59
+ 2025-08-21 02:55:40 - INFO - [082b484d-e219-4cde-ac8e-8af5b8f380cd] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_007.mp4'
60
+ 2025-08-21 02:55:40 - INFO - [082b484d-e219-4cde-ac8e-8af5b8f380cd] Video saved to temporary file: temp_videos/082b484d-e219-4cde-ac8e-8af5b8f380cd.mp4
61
+ 2025-08-21 02:55:40 - INFO - [082b484d-e219-4cde-ac8e-8af5b8f380cd] Extracting frames using method: uniform, rate/threshold: 30
62
+ 2025-08-21 02:55:43 - INFO - [082b484d-e219-4cde-ac8e-8af5b8f380cd] Extracted 30 frames successfully. Saving to temporary files...
63
+ 2025-08-21 02:55:43 - INFO - [082b484d-e219-4cde-ac8e-8af5b8f380cd] 30 frames saved to temp_videos/082b484d-e219-4cde-ac8e-8af5b8f380cd
64
+ 2025-08-21 02:55:44 - INFO - Prompt token length: 2306
65
+ 2025-08-21 02:55:52 - INFO - Tokens per second: 12.007495276914103, Peak GPU memory MB: 5350.375
66
+ 2025-08-21 02:55:52 - INFO - [082b484d-e219-4cde-ac8e-8af5b8f380cd] Inference time: 11.83 seconds, CPU usage: 42.8%, CPU core utilization: [60.5, 34.4, 49.7, 26.5]
67
+ 2025-08-21 02:55:52 - INFO - [082b484d-e219-4cde-ac8e-8af5b8f380cd] Cleaned up temporary frame directory: temp_videos/082b484d-e219-4cde-ac8e-8af5b8f380cd
68
+ 2025-08-21 02:55:52 - INFO - [d4aec199-0b7e-4058-b8ba-bdfbb7806fca] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_008.mp4'
69
+ 2025-08-21 02:55:52 - INFO - [d4aec199-0b7e-4058-b8ba-bdfbb7806fca] Video saved to temporary file: temp_videos/d4aec199-0b7e-4058-b8ba-bdfbb7806fca.mp4
70
+ 2025-08-21 02:55:52 - INFO - [d4aec199-0b7e-4058-b8ba-bdfbb7806fca] Extracting frames using method: uniform, rate/threshold: 30
71
+ 2025-08-21 02:55:55 - INFO - [d4aec199-0b7e-4058-b8ba-bdfbb7806fca] Extracted 30 frames successfully. Saving to temporary files...
72
+ 2025-08-21 02:55:55 - INFO - [d4aec199-0b7e-4058-b8ba-bdfbb7806fca] 30 frames saved to temp_videos/d4aec199-0b7e-4058-b8ba-bdfbb7806fca
73
+ 2025-08-21 02:55:56 - INFO - Prompt token length: 2306
74
+ 2025-08-21 02:56:04 - INFO - Tokens per second: 11.871294681994929, Peak GPU memory MB: 5350.375
75
+ 2025-08-21 02:56:04 - INFO - [d4aec199-0b7e-4058-b8ba-bdfbb7806fca] Inference time: 12.13 seconds, CPU usage: 43.3%, CPU core utilization: [35.9, 32.5, 78.1, 26.8]
76
+ 2025-08-21 02:56:04 - INFO - [d4aec199-0b7e-4058-b8ba-bdfbb7806fca] Cleaned up temporary frame directory: temp_videos/d4aec199-0b7e-4058-b8ba-bdfbb7806fca
77
+ 2025-08-21 02:56:04 - INFO - [20eacc2f-2a33-4211-b488-f449c4bbc64d] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_009.mp4'
78
+ 2025-08-21 02:56:04 - INFO - [20eacc2f-2a33-4211-b488-f449c4bbc64d] Video saved to temporary file: temp_videos/20eacc2f-2a33-4211-b488-f449c4bbc64d.mp4
79
+ 2025-08-21 02:56:04 - INFO - [20eacc2f-2a33-4211-b488-f449c4bbc64d] Extracting frames using method: uniform, rate/threshold: 30
80
+ 2025-08-21 02:56:07 - INFO - [20eacc2f-2a33-4211-b488-f449c4bbc64d] Extracted 30 frames successfully. Saving to temporary files...
81
+ 2025-08-21 02:56:07 - INFO - [20eacc2f-2a33-4211-b488-f449c4bbc64d] 30 frames saved to temp_videos/20eacc2f-2a33-4211-b488-f449c4bbc64d
82
+ 2025-08-21 02:56:08 - INFO - Prompt token length: 2306
83
+ 2025-08-21 02:56:15 - INFO - Tokens per second: 11.63501242448262, Peak GPU memory MB: 5350.375
84
+ 2025-08-21 02:56:15 - INFO - [20eacc2f-2a33-4211-b488-f449c4bbc64d] Inference time: 10.73 seconds, CPU usage: 46.3%, CPU core utilization: [38.3, 58.2, 31.7, 56.7]
85
+ 2025-08-21 02:56:15 - INFO - [20eacc2f-2a33-4211-b488-f449c4bbc64d] Cleaned up temporary frame directory: temp_videos/20eacc2f-2a33-4211-b488-f449c4bbc64d
86
+ 2025-08-21 02:56:15 - INFO - [7bd61912-f2d8-49f3-a1d2-d25a5bb09ff5] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_010.mp4'
87
+ 2025-08-21 02:56:15 - INFO - [7bd61912-f2d8-49f3-a1d2-d25a5bb09ff5] Video saved to temporary file: temp_videos/7bd61912-f2d8-49f3-a1d2-d25a5bb09ff5.mp4
88
+ 2025-08-21 02:56:15 - INFO - [7bd61912-f2d8-49f3-a1d2-d25a5bb09ff5] Extracting frames using method: uniform, rate/threshold: 30
89
+ 2025-08-21 02:56:18 - INFO - [7bd61912-f2d8-49f3-a1d2-d25a5bb09ff5] Extracted 30 frames successfully. Saving to temporary files...
90
+ 2025-08-21 02:56:18 - INFO - [7bd61912-f2d8-49f3-a1d2-d25a5bb09ff5] 30 frames saved to temp_videos/7bd61912-f2d8-49f3-a1d2-d25a5bb09ff5
91
+ 2025-08-21 02:56:18 - INFO - Prompt token length: 2306
92
+ 2025-08-21 02:56:31 - INFO - Tokens per second: 11.874488678953208, Peak GPU memory MB: 5350.375
93
+ 2025-08-21 02:56:31 - INFO - [7bd61912-f2d8-49f3-a1d2-d25a5bb09ff5] Inference time: 16.08 seconds, CPU usage: 37.8%, CPU core utilization: [19.6, 68.7, 18.4, 44.3]
94
+ 2025-08-21 02:56:31 - INFO - [7bd61912-f2d8-49f3-a1d2-d25a5bb09ff5] Cleaned up temporary frame directory: temp_videos/7bd61912-f2d8-49f3-a1d2-d25a5bb09ff5
95
+ 2025-08-21 02:56:31 - INFO - [305ccf60-14df-466d-8565-f04265430ba1] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_011.mp4'
96
+ 2025-08-21 02:56:31 - INFO - [305ccf60-14df-466d-8565-f04265430ba1] Video saved to temporary file: temp_videos/305ccf60-14df-466d-8565-f04265430ba1.mp4
97
+ 2025-08-21 02:56:31 - INFO - [305ccf60-14df-466d-8565-f04265430ba1] Extracting frames using method: uniform, rate/threshold: 30
98
+ 2025-08-21 02:56:34 - INFO - [305ccf60-14df-466d-8565-f04265430ba1] Extracted 30 frames successfully. Saving to temporary files...
99
+ 2025-08-21 02:56:34 - INFO - [305ccf60-14df-466d-8565-f04265430ba1] 30 frames saved to temp_videos/305ccf60-14df-466d-8565-f04265430ba1
100
+ 2025-08-21 02:56:35 - INFO - Prompt token length: 2306
101
+ 2025-08-21 02:56:44 - INFO - Tokens per second: 11.829041430743297, Peak GPU memory MB: 5350.375
102
+ 2025-08-21 02:56:44 - INFO - [305ccf60-14df-466d-8565-f04265430ba1] Inference time: 12.93 seconds, CPU usage: 42.0%, CPU core utilization: [28.8, 42.5, 25.9, 70.9]
103
+ 2025-08-21 02:56:44 - INFO - [305ccf60-14df-466d-8565-f04265430ba1] Cleaned up temporary frame directory: temp_videos/305ccf60-14df-466d-8565-f04265430ba1
104
+ 2025-08-21 02:56:44 - INFO - [659dc8e0-c40a-432f-887e-c9cdeefc17a4] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_012.mp4'
105
+ 2025-08-21 02:56:44 - INFO - [659dc8e0-c40a-432f-887e-c9cdeefc17a4] Video saved to temporary file: temp_videos/659dc8e0-c40a-432f-887e-c9cdeefc17a4.mp4
106
+ 2025-08-21 02:56:44 - INFO - [659dc8e0-c40a-432f-887e-c9cdeefc17a4] Extracting frames using method: uniform, rate/threshold: 30
107
+ 2025-08-21 02:56:47 - INFO - [659dc8e0-c40a-432f-887e-c9cdeefc17a4] Extracted 30 frames successfully. Saving to temporary files...
108
+ 2025-08-21 02:56:47 - INFO - [659dc8e0-c40a-432f-887e-c9cdeefc17a4] 30 frames saved to temp_videos/659dc8e0-c40a-432f-887e-c9cdeefc17a4
109
+ 2025-08-21 02:56:48 - INFO - Prompt token length: 2306
110
+ 2025-08-21 02:56:58 - INFO - Tokens per second: 11.928726359703456, Peak GPU memory MB: 5350.375
111
+ 2025-08-21 02:56:58 - INFO - [659dc8e0-c40a-432f-887e-c9cdeefc17a4] Inference time: 13.75 seconds, CPU usage: 39.8%, CPU core utilization: [31.6, 62.3, 41.3, 23.8]
112
+ 2025-08-21 02:56:58 - INFO - [659dc8e0-c40a-432f-887e-c9cdeefc17a4] Cleaned up temporary frame directory: temp_videos/659dc8e0-c40a-432f-887e-c9cdeefc17a4
113
+ 2025-08-21 02:56:58 - INFO - [05a4c1b9-d6d6-4e4e-a0f2-f58a0664c989] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_013.mp4'
114
+ 2025-08-21 02:56:58 - INFO - [05a4c1b9-d6d6-4e4e-a0f2-f58a0664c989] Video saved to temporary file: temp_videos/05a4c1b9-d6d6-4e4e-a0f2-f58a0664c989.mp4
115
+ 2025-08-21 02:56:58 - INFO - [05a4c1b9-d6d6-4e4e-a0f2-f58a0664c989] Extracting frames using method: uniform, rate/threshold: 30
116
+ 2025-08-21 02:57:01 - INFO - [05a4c1b9-d6d6-4e4e-a0f2-f58a0664c989] Extracted 30 frames successfully. Saving to temporary files...
117
+ 2025-08-21 02:57:01 - INFO - [05a4c1b9-d6d6-4e4e-a0f2-f58a0664c989] 30 frames saved to temp_videos/05a4c1b9-d6d6-4e4e-a0f2-f58a0664c989
118
+ 2025-08-21 02:57:01 - INFO - Prompt token length: 2306
119
+ 2025-08-21 02:57:07 - INFO - Tokens per second: 12.014726651428436, Peak GPU memory MB: 5350.375
120
+ 2025-08-21 02:57:07 - INFO - [05a4c1b9-d6d6-4e4e-a0f2-f58a0664c989] Inference time: 9.37 seconds, CPU usage: 43.8%, CPU core utilization: [29.4, 29.6, 88.1, 27.9]
121
+ 2025-08-21 02:57:07 - INFO - [05a4c1b9-d6d6-4e4e-a0f2-f58a0664c989] Cleaned up temporary frame directory: temp_videos/05a4c1b9-d6d6-4e4e-a0f2-f58a0664c989
122
+ 2025-08-21 02:57:07 - INFO - [0f5076d3-96af-4d28-be73-0db23c76eaf4] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_014.mp4'
123
+ 2025-08-21 02:57:07 - INFO - [0f5076d3-96af-4d28-be73-0db23c76eaf4] Video saved to temporary file: temp_videos/0f5076d3-96af-4d28-be73-0db23c76eaf4.mp4
124
+ 2025-08-21 02:57:07 - INFO - [0f5076d3-96af-4d28-be73-0db23c76eaf4] Extracting frames using method: uniform, rate/threshold: 30
125
+ 2025-08-21 02:57:10 - INFO - [0f5076d3-96af-4d28-be73-0db23c76eaf4] Extracted 30 frames successfully. Saving to temporary files...
126
+ 2025-08-21 02:57:10 - INFO - [0f5076d3-96af-4d28-be73-0db23c76eaf4] 30 frames saved to temp_videos/0f5076d3-96af-4d28-be73-0db23c76eaf4
127
+ 2025-08-21 02:57:11 - INFO - Prompt token length: 2306
128
+ 2025-08-21 02:57:19 - INFO - Tokens per second: 11.861972979079045, Peak GPU memory MB: 5350.375
129
+ 2025-08-21 02:57:19 - INFO - [0f5076d3-96af-4d28-be73-0db23c76eaf4] Inference time: 11.61 seconds, CPU usage: 41.6%, CPU core utilization: [42.9, 26.4, 27.0, 69.9]
130
+ 2025-08-21 02:57:19 - INFO - [0f5076d3-96af-4d28-be73-0db23c76eaf4] Cleaned up temporary frame directory: temp_videos/0f5076d3-96af-4d28-be73-0db23c76eaf4
131
+ 2025-08-21 02:57:19 - INFO - [a60e4adc-5a10-496e-8dba-e95fa8204801] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_015.mp4'
132
+ 2025-08-21 02:57:19 - INFO - [a60e4adc-5a10-496e-8dba-e95fa8204801] Video saved to temporary file: temp_videos/a60e4adc-5a10-496e-8dba-e95fa8204801.mp4
133
+ 2025-08-21 02:57:19 - INFO - [a60e4adc-5a10-496e-8dba-e95fa8204801] Extracting frames using method: uniform, rate/threshold: 30
134
+ 2025-08-21 02:57:22 - INFO - [a60e4adc-5a10-496e-8dba-e95fa8204801] Extracted 30 frames successfully. Saving to temporary files...
135
+ 2025-08-21 02:57:22 - INFO - [a60e4adc-5a10-496e-8dba-e95fa8204801] 30 frames saved to temp_videos/a60e4adc-5a10-496e-8dba-e95fa8204801
136
+ 2025-08-21 02:57:22 - INFO - Prompt token length: 2306
137
+ 2025-08-21 02:57:31 - INFO - Tokens per second: 12.034885208983422, Peak GPU memory MB: 5350.375
138
+ 2025-08-21 02:57:31 - INFO - [a60e4adc-5a10-496e-8dba-e95fa8204801] Inference time: 12.68 seconds, CPU usage: 39.5%, CPU core utilization: [59.8, 22.7, 53.7, 22.1]
139
+ 2025-08-21 02:57:31 - INFO - [a60e4adc-5a10-496e-8dba-e95fa8204801] Cleaned up temporary frame directory: temp_videos/a60e4adc-5a10-496e-8dba-e95fa8204801
140
+ 2025-08-21 02:57:31 - INFO - [262c15ae-e353-4d00-b508-c4d77d75300a] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/new/Clips_60s/video_part_016.mp4'
141
+ 2025-08-21 02:57:31 - INFO - [262c15ae-e353-4d00-b508-c4d77d75300a] Video saved to temporary file: temp_videos/262c15ae-e353-4d00-b508-c4d77d75300a.mp4
142
+ 2025-08-21 02:57:31 - INFO - [262c15ae-e353-4d00-b508-c4d77d75300a] Extracting frames using method: uniform, rate/threshold: 30
143
+ 2025-08-21 02:57:35 - INFO - [262c15ae-e353-4d00-b508-c4d77d75300a] Extracted 30 frames successfully. Saving to temporary files...
144
+ 2025-08-21 02:57:35 - INFO - [262c15ae-e353-4d00-b508-c4d77d75300a] 30 frames saved to temp_videos/262c15ae-e353-4d00-b508-c4d77d75300a
145
+ 2025-08-21 02:57:35 - INFO - Prompt token length: 2306
146
+ 2025-08-21 02:57:59 - INFO - Tokens per second: 12.052444962168167, Peak GPU memory MB: 5350.375
147
+ 2025-08-21 02:57:59 - INFO - [262c15ae-e353-4d00-b508-c4d77d75300a] Inference time: 27.71 seconds, CPU usage: 33.2%, CPU core utilization: [31.4, 17.1, 70.7, 13.4]
148
+ 2025-08-21 02:57:59 - INFO - [262c15ae-e353-4d00-b508-c4d77d75300a] Cleaned up temporary frame directory: temp_videos/262c15ae-e353-4d00-b508-c4d77d75300a
API_Transformers/logs/gemma-3-4b-it/20250819_005014.log ADDED
@@ -0,0 +1,28 @@
1
+ 2025-08-19 00:50:14 - INFO - Loading model: google/gemma-3-4b-it
2
+ 2025-08-19 00:50:16 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-19 00:51:28 - INFO - Model loaded in 73.81 seconds
4
+ 2025-08-19 00:51:28 - INFO - GPU Memory Usage after model load: 8201.85 MB
5
+ 2025-08-19 00:51:34 - INFO - [cd4de5c8-a57c-41ff-8d88-71dda9ce333f] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
6
+ 2025-08-19 00:51:34 - INFO - [cd4de5c8-a57c-41ff-8d88-71dda9ce333f] Video saved to temporary file: temp_videos/cd4de5c8-a57c-41ff-8d88-71dda9ce333f.mp4
7
+ 2025-08-19 00:51:34 - INFO - [cd4de5c8-a57c-41ff-8d88-71dda9ce333f] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-19 00:51:37 - INFO - [cd4de5c8-a57c-41ff-8d88-71dda9ce333f] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-19 00:51:37 - INFO - [cd4de5c8-a57c-41ff-8d88-71dda9ce333f] 30 frames saved to temp_videos/cd4de5c8-a57c-41ff-8d88-71dda9ce333f
10
+ 2025-08-19 00:51:37 - ERROR - [cd4de5c8-a57c-41ff-8d88-71dda9ce333f] An error occurred during processing: Incorrect format used for image. Should be an url linking to an image, a base64 string, a local path, or a PIL image.
11
+ Traceback (most recent call last):
12
+ File "/mnt/data/xiuying/Code/local_deploy/infer.py", line 107, in video_inference
13
+ output = model.generate(frame_paths, prompt)
14
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15
+ File "/mnt/data/xiuying/Code/local_deploy/models/gemma.py", line 56, in generate
16
+ inputs = self.processor.apply_chat_template(
17
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18
+ File "/home/xiuying/miniconda3/envs/gptq/lib/python3.11/site-packages/transformers/utils/deprecation.py", line 172, in wrapped_func
19
+ return func(*args, **kwargs)
20
+ ^^^^^^^^^^^^^^^^^^^^^
21
+ File "/home/xiuying/miniconda3/envs/gptq/lib/python3.11/site-packages/transformers/processing_utils.py", line 1552, in apply_chat_template
22
+ images.append(load_image(fname))
23
+ ^^^^^^^^^^^^^^^^^
24
+ File "/home/xiuying/miniconda3/envs/gptq/lib/python3.11/site-packages/transformers/image_utils.py", line 493, in load_image
25
+ raise TypeError(
26
+ TypeError: Incorrect format used for image. Should be an url linking to an image, a base64 string, a local path, or a PIL image.
27
+ 2025-08-19 00:51:37 - INFO - [cd4de5c8-a57c-41ff-8d88-71dda9ce333f] Cleaned up temporary file: temp_videos/cd4de5c8-a57c-41ff-8d88-71dda9ce333f.mp4
28
+ 2025-08-19 00:51:37 - INFO - [cd4de5c8-a57c-41ff-8d88-71dda9ce333f] Cleaned up temporary frame directory: temp_videos/cd4de5c8-a57c-41ff-8d88-71dda9ce333f
API_Transformers/logs/gemma-3-4b-it/20250819_005535.log ADDED
@@ -0,0 +1,10 @@
1
+ 2025-08-19 00:55:35 - INFO - Loading model: google/gemma-3-4b-it
2
+ 2025-08-19 00:55:37 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-19 00:55:50 - INFO - Model loaded in 14.81 seconds
4
+ 2025-08-19 00:55:50 - INFO - GPU Memory Usage after model load: 8201.85 MB
5
+ 2025-08-19 00:55:58 - INFO - [0cfe1e16-f6d4-4f20-9091-9719eee547e3] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
6
+ 2025-08-19 00:55:58 - INFO - [0cfe1e16-f6d4-4f20-9091-9719eee547e3] Video saved to temporary file: temp_videos/0cfe1e16-f6d4-4f20-9091-9719eee547e3.mp4
7
+ 2025-08-19 00:55:58 - INFO - [0cfe1e16-f6d4-4f20-9091-9719eee547e3] Extracting frames using method: uniform, rate/threshold: 30
8
+ 2025-08-19 00:56:04 - INFO - [0cfe1e16-f6d4-4f20-9091-9719eee547e3] Extracted 30 frames successfully. Saving to temporary files...
9
+ 2025-08-19 00:56:04 - INFO - [0cfe1e16-f6d4-4f20-9091-9719eee547e3] 30 frames saved to temp_videos/0cfe1e16-f6d4-4f20-9091-9719eee547e3
10
+ 2025-08-19 00:56:05 - INFO - Prompt token length: 7961
API_Transformers/logs/gemma-3-4b-it/20250819_010310.log ADDED
@@ -0,0 +1,10 @@
1
+ 2025-08-19 01:03:10 - INFO - Loading model: google/gemma-3-4b-it
2
+ 2025-08-19 01:03:11 - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
3
+ 2025-08-19 01:03:37 - INFO - Model loaded in 26.97 seconds
4
+ 2025-08-19 01:03:37 - INFO - GPU Memory Usage after model load: 8201.85 MB
5
+ 2025-08-19 01:03:58 - INFO - [ddbb264c-a911-43d4-aee3-8aebd82a1e83] Received new video inference request. Prompt: 'Please describe the video.', Video: 'messi_part_001.mp4'
6
+ 2025-08-19 01:03:58 - INFO - [ddbb264c-a911-43d4-aee3-8aebd82a1e83] Video saved to temporary file: temp_videos/ddbb264c-a911-43d4-aee3-8aebd82a1e83.mp4
7
+ 2025-08-19 01:03:58 - INFO - [ddbb264c-a911-43d4-aee3-8aebd82a1e83] Extracting frames using method: uniform, rate/threshold: 5
8
+ 2025-08-19 01:03:58 - INFO - [ddbb264c-a911-43d4-aee3-8aebd82a1e83] Extracted 5 frames successfully. Saving to temporary files...
9
+ 2025-08-19 01:03:58 - INFO - [ddbb264c-a911-43d4-aee3-8aebd82a1e83] 5 frames saved to temp_videos/ddbb264c-a911-43d4-aee3-8aebd82a1e83
10
+ 2025-08-19 01:03:58 - INFO - Prompt token length: 1317