ikaganacar committed on
Commit 62dcb9c · verified · 1 Parent(s): af2601f

Upload folder using huggingface_hub
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,324 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ language:
+ - tr
+ - en
+ library_name: transformers
+ tags:
+ - kubernetes
+ - devops
+ - quantized
+ - 4bit
+ - gemma3
+ - bitsandbytes
+ base_model: aciklab/kubernetes-ai
+ model_type: gemma3
+ quantized_by: aciklab
+ ---
+
+ # Kubernetes AI - 4bit Safetensors
+
+ Fine-tuned Gemma 3 12B model specialized for answering Kubernetes questions in Turkish, quantized to 4bit for efficient inference with a reduced memory footprint.
+
+ ## Model Description
+
+ This repository contains a 4bit quantized version of the Kubernetes AI model, optimized for consumer hardware with reduced VRAM/RAM requirements. The model uses BitsAndBytes quantization in safetensors format for fast loading and efficient inference.
+
+ **Primary Purpose:** Answer Kubernetes-related questions in Turkish with minimal hardware requirements.
+
+ ## Model Specifications
+
+ | Specification | Details |
+ |---------------|---------|
+ | **Format** | Safetensors (4bit quantized) |
+ | **Base Model** | unsloth/gemma-3-12b-it-qat-bnb-4bit |
+ | **Quantization** | 4bit (BitsAndBytes) |
+ | **Model Size** | ~7.2 GB |
+ | **Memory Usage** | ~8-10 GB VRAM/RAM |
+ | **Precision** | 4bit (NF4) weights, BF16 compute |
+
+ ## Quick Start
+
+ ### Installation
+
+ ```bash
+ # Install required packages
+ pip install torch transformers accelerate bitsandbytes safetensors
+ ```
+
+ ### Basic Usage
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ # Load model and tokenizer
+ model_name = "aciklab/kubernetes-ai-4bit"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     device_map="auto",
+     trust_remote_code=True
+ )
+
+ # Prepare input
+ prompt = "Kubernetes'te 3 replikaya sahip bir deployment nasıl oluştururum?"
+
+ # Format with chat template
+ messages = [
+     {"role": "system", "content": "Sen Kubernetes konusunda uzmanlaşmış bir yapay zeka asistanısın. Kubernetes ile ilgili soruları Türkçe olarak yanıtlıyorsun."},
+     {"role": "user", "content": prompt}
+ ]
+
+ input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
+
+ # Generate response
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=512,
+     temperature=1.0,
+     top_p=0.95,
+     top_k=64,
+     repetition_penalty=1.05,
+     do_sample=True
+ )
+
+ response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+ print(response)
+ ```
+
+ ### Advanced Usage with Pipeline
+
+ ```python
+ from transformers import pipeline
+
+ # Create text generation pipeline
+ pipe = pipeline(
+     "text-generation",
+     model="aciklab/kubernetes-ai-4bit",
+     device_map="auto",
+     trust_remote_code=True
+ )
+
+ # Generate response
+ messages = [
+     {"role": "system", "content": "Sen Kubernetes konusunda uzmanlaşmış bir yapay zeka asistanısın."},
+     {"role": "user", "content": "Pod ve Deployment arasındaki fark nedir?"}
+ ]
+
+ response = pipe(
+     messages,
+     max_new_tokens=512,
+     temperature=1.0,
+     top_p=0.95,
+     do_sample=True
+ )
+
+ print(response[0]["generated_text"][-1]["content"])
+ ```
+
+ ### Streaming Responses
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer
+ from threading import Thread
+
+ model_name = "aciklab/kubernetes-ai-4bit"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     device_map="auto",
+     trust_remote_code=True
+ )
+
+ # Prepare input
+ prompt = "Kubernetes Service türlerini açıkla"
+ messages = [
+     {"role": "system", "content": "Sen Kubernetes konusunda uzmanlaşmış bir yapay zeka asistanısın."},
+     {"role": "user", "content": prompt}
+ ]
+
+ input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
+
+ # Setup streamer (skip_prompt avoids re-printing the formatted prompt)
+ streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
+ generation_kwargs = dict(
+     **inputs,
+     max_new_tokens=512,
+     temperature=1.0,
+     streamer=streamer
+ )
+
+ # Generate in a separate thread
+ thread = Thread(target=model.generate, kwargs=generation_kwargs)
+ thread.start()
+
+ # Stream output as it is generated
+ for text in streamer:
+     print(text, end="", flush=True)
+
+ thread.join()
+ ```
+
+ ## Training Details
+
+ This model is based on the [aciklab/kubernetes-ai](https://huggingface.co/aciklab/kubernetes-ai) LoRA adapters (a configuration sketch follows the list):
+
+ - **Base Model:** unsloth/gemma-3-12b-it-qat-bnb-4bit
+ - **Training Method:** LoRA (Low-Rank Adaptation)
+ - **LoRA Rank:** 8
+ - **Target Modules:** q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
+ - **Training Dataset:** ~157,210 examples from Kubernetes docs, Stack Overflow, and DevOps datasets
+ - **Training Time:** 28 hours on an NVIDIA RTX 5070 12GB
+ - **Max Sequence Length:** 1024 tokens
+
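+ The exact training script is not included in this repository; the following is a minimal `peft` sketch of a LoRA configuration consistent with the settings listed above (`lora_alpha` and dropout are illustrative assumptions, not released values):
+
+ ```python
+ from peft import LoraConfig
+
+ # Hedged sketch: LoRA settings consistent with the list above.
+ lora_config = LoraConfig(
+     r=8,                                   # LoRA rank listed above
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
+                     "gate_proj", "up_proj", "down_proj"],
+     lora_alpha=16,                         # assumption: not published
+     lora_dropout=0.0,                      # assumption: not published
+     task_type="CAUSAL_LM",
+ )
+ ```
+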
+ ### Training Dataset Summary
+
+ | Dataset Category | Count | Description |
+ |-----------------|-------|-------------|
+ | **Kubernetes Official Docs** | 8,910 | Concepts, kubectl, setup, tasks, tutorials |
+ | **Stack Overflow** | 52,000 | Kubernetes Q&A from the community |
+ | **DevOps Datasets** | 62,500 | General DevOps and Kubernetes content |
+ | **Configurations & CLI** | 36,800 | Kubernetes configs, kubectl examples, operators |
+ | **Total** | **~157,210** | Comprehensive Kubernetes knowledge base |
+
+ ## Quantization Details
+
+ This model uses 4bit BitsAndBytes quantization for memory efficiency (a quick verification sketch follows the list):
+
+ - **Source:** LoRA adapters merged into the base model
+ - **Quantization Method:** BitsAndBytes 4bit (NF4)
+ - **Compute Precision:** BF16 (see `bnb_4bit_compute_dtype` in `config.json`)
+ - **Format:** Safetensors (fast loading)
+ - **Memory Footprint:** ~7.2 GB on disk, ~8-10 GB in memory
+
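+ These settings ship in the checkpoint's `config.json` and can be inspected directly:
+
+ ```python
+ from transformers import AutoConfig
+
+ # Print the quantization settings stored with this checkpoint.
+ config = AutoConfig.from_pretrained("aciklab/kubernetes-ai-4bit")
+ print(config.quantization_config)  # expect quant_method=bitsandbytes, nf4, double quant, bfloat16 compute
+ ```
+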
+ ### Advantages of 4bit Format
+
+ - **Efficient Memory Usage:** Runs on GPUs with 8GB+ VRAM
+ - **Fast Loading:** Safetensors format loads quickly
+ - **Good Quality:** Minimal accuracy loss compared to full precision
+ - **Framework Support:** Compatible with Transformers, vLLM, and Text Generation Inference
+ - **Flexible Deployment:** Can run on CPU with acceptable speed
+
+ ## Hardware Requirements
+
+ ### Minimum (GPU)
+ - **GPU:** 8GB VRAM (e.g., RTX 3060, RTX 4060)
+ - **RAM:** 8GB system memory
+ - **Storage:** 10GB free space
+ - **Recommended:** CUDA-capable NVIDIA GPU
+
+ ### Minimum (CPU Only)
+ - **CPU:** 8+ cores
+ - **RAM:** 16GB system memory
+ - **Storage:** 10GB free space
+ - **Note:** CPU inference will be slower than GPU
+
+ ### Recommended
+ - **GPU:** 12GB+ VRAM (e.g., RTX 3080, RTX 4070, RTX 5070)
+ - **RAM:** 16GB system memory
+ - **Storage:** 15GB free space
+ - **CUDA:** 11.8 or higher (a quick availability check is sketched below)
+
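+ A small sketch to check which of the tiers above your machine falls into before loading the model:
+
+ ```python
+ import torch
+
+ # Report the available CUDA GPU and its VRAM, or note the CPU fallback.
+ if torch.cuda.is_available():
+     props = torch.cuda.get_device_properties(0)
+     print(f"GPU: {props.name}, {props.total_memory / 1024**3:.1f} GB VRAM")
+ else:
+     print("No CUDA GPU detected; expect CPU-only inference (16GB RAM recommended).")
+ ```
+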
+ ## Performance Benchmarks
+
+ | Hardware | Tokens/Second | Latency (512 tokens) |
+ |----------|---------------|----------------------|
+ | RTX 5070 12GB | ~45-55 | ~10-12 seconds |
+ | RTX 4060 8GB | ~35-45 | ~12-15 seconds |
+ | CPU (16 cores) | ~5-10 | ~60-100 seconds |
+
+ *Benchmarks are approximate and may vary with system configuration; a simple way to measure throughput on your own hardware is sketched below.*
+
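+ A minimal timing sketch, reusing `model` and `tokenizer` from the Basic Usage example (greedy decoding is used here only to make the measurement repeatable):
+
+ ```python
+ import time
+ import torch
+
+ # Time a single generation and report tokens/second.
+ inputs = tokenizer("Kubernetes nedir?", return_tensors="pt").to(model.device)
+ if torch.cuda.is_available():
+     torch.cuda.synchronize()
+ start = time.perf_counter()
+ outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
+ if torch.cuda.is_available():
+     torch.cuda.synchronize()
+ elapsed = time.perf_counter() - start
+ new_tokens = outputs.shape[-1] - inputs["input_ids"].shape[-1]
+ print(f"{new_tokens / elapsed:.1f} tokens/second")
+ ```
+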
+ ## Inference Optimization Tips
+
+ ### For Maximum Speed
+ ```python
+ # Use Flash Attention 2 (if available; requires the flash-attn package)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     device_map="auto",
+     trust_remote_code=True,
+     attn_implementation="flash_attention_2"
+ )
+ ```
+
+ ### For Lower Memory Usage
+ ```python
+ # Pass an explicit 4bit config with double quantization to reduce the memory footprint
+ import torch
+ from transformers import BitsAndBytesConfig
+
+ quantization_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_compute_dtype=torch.bfloat16,
+     bnb_4bit_use_double_quant=True,
+     bnb_4bit_quant_type="nf4"
+ )
+
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     quantization_config=quantization_config,
+     device_map="auto"
+ )
+ ```
+
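+ If the model still does not fit in VRAM, `device_map="auto"` can spill layers to system RAM. The split below builds on the block above; the `max_memory` limits are illustrative assumptions, tune them for your card:
+
+ ```python
+ # Hedged sketch: cap GPU usage and let accelerate offload the remainder to CPU RAM.
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     quantization_config=quantization_config,
+     device_map="auto",
+     max_memory={0: "7GiB", "cpu": "24GiB"},  # illustrative limits, adjust to your hardware
+ )
+ ```
+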
+ ## Example Queries
+
+ ```python
+ # Example 1: Creating a Deployment
+ "Kubernetes'te 3 replikaya sahip bir nginx deployment nasıl oluştururum?"
+
+ # Example 2: Service Explanation
+ "ClusterIP, NodePort ve LoadBalancer service türleri arasındaki farklar nelerdir?"
+
+ # Example 3: Troubleshooting
+ "Pod'um CrashLoopBackOff durumunda, nasıl debug edebilirim?"
+
+ # Example 4: Configuration
+ "ConfigMap ve Secret arasındaki fark nedir ve ne zaman hangisini kullanmalıyım?"
+
+ # Example 5: Best Practices
+ "Production ortamında Kubernetes deployment için en iyi pratikler nelerdir?"
+ ```
+
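+ A short loop that feeds a couple of these queries to the `pipe` object from the Advanced Usage section above:
+
+ ```python
+ # Reuses `pipe` from the "Advanced Usage with Pipeline" example.
+ queries = [
+     "Kubernetes'te 3 replikaya sahip bir nginx deployment nasıl oluştururum?",
+     "Pod'um CrashLoopBackOff durumunda, nasıl debug edebilirim?",
+ ]
+ for query in queries:
+     messages = [
+         {"role": "system", "content": "Sen Kubernetes konusunda uzmanlaşmış bir yapay zeka asistanısın."},
+         {"role": "user", "content": query},
+     ]
+     result = pipe(messages, max_new_tokens=512, do_sample=True)
+     print(result[0]["generated_text"][-1]["content"], "\n")
+ ```
+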
+ ## Limitations
+
+ - **Language:** Optimized primarily for Turkish; English queries may work, but with reduced quality
+ - **Context Window:** Fine-tuned with a maximum sequence length of 1024 tokens; longer prompts may see degraded quality (see the length-check sketch below)
+ - **Domain:** Specialized for Kubernetes; may not perform well on general topics
+ - **Quantization:** 4bit quantization may occasionally affect response quality on complex queries
+
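+ Because of the 1024-token fine-tuning length, it can be worth checking prompt length before generation; a minimal sketch reusing `tokenizer` and `input_text` from Basic Usage:
+
+ ```python
+ # Warn when a formatted prompt exceeds the fine-tuning sequence length.
+ token_count = len(tokenizer(input_text)["input_ids"])
+ if token_count > 1024:
+     print(f"Prompt is {token_count} tokens; responses may degrade beyond 1024 tokens.")
+ ```
+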
+ ## License
+
+ This model is released under the **MIT License** and is free to use in commercial and open-source projects.
+
+ ## Citation
+
+ If you use this model in your research or applications, please cite:
+
+ ```bibtex
+ @misc{kubernetes-ai-4bit,
+   author = {HAVELSAN/Açıklab},
+   title = {Kubernetes AI - 4bit Safetensors},
+   year = {2025},
+   publisher = {HuggingFace},
+   howpublished = {\url{https://huggingface.co/aciklab/kubernetes-ai-4bit}}
+ }
+ ```
+
+ ## Contact
+
+ **Produced by:** HAVELSAN/Açıklab
+
+ For questions, feedback, or issues, please open an issue on the model repository or contact us through HuggingFace.
+
+ ## Related Models
+
+ - [aciklab/kubernetes-ai](https://huggingface.co/aciklab/kubernetes-ai) - Original LoRA adapters
+ - [aciklab/kubernetes-ai-GGUF](https://huggingface.co/aciklab/kubernetes-ai-GGUF) - GGUF quantized versions for llama.cpp
+
+ ---
+
+ **Note:** This is a 4bit quantized model ready for immediate use with the Transformers library. No additional merging or quantization is required.
added_tokens.json ADDED
@@ -0,0 +1,3 @@
+ {
+ "<image_soft_token>": 262144
+ }
chat_template.jinja ADDED
@@ -0,0 +1,47 @@
+ {{ bos_token }}
+ {%- if messages[0]['role'] == 'system' -%}
+ {%- if messages[0]['content'] is string -%}
+ {%- set first_user_prefix = messages[0]['content'] + '
+
+ ' -%}
+ {%- else -%}
+ {%- set first_user_prefix = messages[0]['content'][0]['text'] + '
+
+ ' -%}
+ {%- endif -%}
+ {%- set loop_messages = messages[1:] -%}
+ {%- else -%}
+ {%- set first_user_prefix = "" -%}
+ {%- set loop_messages = messages -%}
+ {%- endif -%}
+ {%- for message in loop_messages -%}
+ {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}
+ {{ raise_exception("Conversation roles must alternate user/assistant/user/assistant/...") }}
+ {%- endif -%}
+ {%- if (message['role'] == 'assistant') -%}
+ {%- set role = "model" -%}
+ {%- else -%}
+ {%- set role = message['role'] -%}
+ {%- endif -%}
+ {{ '<start_of_turn>' + role + '
+ ' + (first_user_prefix if loop.first else "") }}
+ {%- if message['content'] is string -%}
+ {{ message['content'] | trim }}
+ {%- elif message['content'] is iterable -%}
+ {%- for item in message['content'] -%}
+ {%- if item['type'] == 'image' -%}
+ {{ '<start_of_image>' }}
+ {%- elif item['type'] == 'text' -%}
+ {{ item['text'] | trim }}
+ {%- endif -%}
+ {%- endfor -%}
+ {%- else -%}
+ {{ raise_exception("Invalid content type") }}
+ {%- endif -%}
+ {{ '<end_of_turn>
+ ' }}
+ {%- endfor -%}
+ {%- if add_generation_prompt -%}
+ {{'<start_of_turn>model
+ '}}
+ {%- endif -%}
config.json ADDED
@@ -0,0 +1,132 @@
+ {
+ "architectures": [
+ "Gemma3ForConditionalGeneration"
+ ],
+ "boi_token_index": 255999,
+ "bos_token_id": 2,
+ "eoi_token_index": 256000,
+ "eos_token_id": 106,
+ "image_token_index": 262144,
+ "initializer_range": 0.02,
+ "mm_tokens_per_image": 256,
+ "model_type": "gemma3",
+ "pad_token_id": 0,
+ "quantization_config": {
+ "_load_in_4bit": true,
+ "_load_in_8bit": false,
+ "bnb_4bit_compute_dtype": "bfloat16",
+ "bnb_4bit_quant_storage": "uint8",
+ "bnb_4bit_quant_type": "nf4",
+ "bnb_4bit_use_double_quant": true,
+ "llm_int8_enable_fp32_cpu_offload": false,
+ "llm_int8_has_fp16_weight": false,
+ "llm_int8_skip_modules": [
+ "lm_head",
+ "multi_modal_projector",
+ "merger",
+ "modality_projection"
+ ],
+ "llm_int8_threshold": 6.0,
+ "load_in_4bit": true,
+ "load_in_8bit": false,
+ "quant_method": "bitsandbytes"
+ },
+ "text_config": {
+ "_sliding_window_pattern": 6,
+ "attention_bias": false,
+ "attention_dropout": 0.0,
+ "attn_logit_softcapping": null,
+ "cache_implementation": "hybrid",
+ "final_logit_softcapping": null,
+ "head_dim": 256,
+ "hidden_activation": "gelu_pytorch_tanh",
+ "hidden_size": 3840,
+ "initializer_range": 0.02,
+ "intermediate_size": 15360,
+ "layer_types": [
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "full_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "full_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "full_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "full_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "full_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "full_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "full_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "sliding_attention",
+ "full_attention"
+ ],
+ "max_position_embeddings": 131072,
+ "model_type": "gemma3_text",
+ "num_attention_heads": 16,
+ "num_hidden_layers": 48,
+ "num_key_value_heads": 8,
+ "query_pre_attn_scalar": 256,
+ "rms_norm_eps": 1e-06,
+ "rope_local_base_freq": 10000,
+ "rope_scaling": {
+ "factor": 8.0,
+ "rope_type": "linear"
+ },
+ "rope_theta": 1000000,
+ "sliding_window": 1024,
+ "torch_dtype": "bfloat16",
+ "use_cache": true,
+ "vocab_size": 262208
+ },
+ "torch_dtype": "bfloat16",
+ "transformers_version": "4.55.4",
+ "unsloth_fixed": true,
+ "vision_config": {
+ "attention_dropout": 0.0,
+ "hidden_act": "gelu_pytorch_tanh",
+ "hidden_size": 1152,
+ "image_size": 896,
+ "intermediate_size": 4304,
+ "layer_norm_eps": 1e-06,
+ "model_type": "siglip_vision_model",
+ "num_attention_heads": 16,
+ "num_channels": 3,
+ "num_hidden_layers": 27,
+ "patch_size": 14,
+ "torch_dtype": "bfloat16",
+ "vision_use_head": false
+ }
+ }
generation_config.json ADDED
@@ -0,0 +1,13 @@
+ {
+ "bos_token_id": 2,
+ "cache_implementation": "hybrid",
+ "do_sample": true,
+ "eos_token_id": [
+ 1,
+ 106
+ ],
+ "pad_token_id": 0,
+ "top_k": 64,
+ "top_p": 0.95,
+ "transformers_version": "4.55.4"
+ }
model-00001-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e1c4dfdbd9ed238c8963e6c48673889cb9c5a65a044ed782229f7fb87ecb0657
+ size 4992268790
model-00002-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1ff251b4e29bc079e6c802a3f5529dd0543b8c5f66469352f974fa36b2dc7e39
+ size 2806556012
model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
special_tokens_map.json ADDED
@@ -0,0 +1,33 @@
+ {
+ "boi_token": "<start_of_image>",
+ "bos_token": {
+ "content": "<bos>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "eoi_token": "<end_of_image>",
+ "eos_token": {
+ "content": "<end_of_turn>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "image_token": "<image_soft_token>",
+ "pad_token": {
+ "content": "<pad>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "unk_token": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4667f2089529e8e7657cfb6d1c19910ae71ff5f28aa7ab2ff2763330affad795
+ size 33384568
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1299c11d7cf632ef3b4e11937501358ada021bbdf7c47638d13c0ee982f2e79c
+ size 4689074
tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff