DavidAU committed on
Commit 5c98690 · verified · 1 Parent(s): df15e27

Update README.md

Files changed (1):
  1. README.md +59 -1
README.md CHANGED
@@ -105,7 +105,7 @@ https://huggingface.co/DavidAU/AI_Autocorrect__Auto-Creative-Enhancement__Auto-L

---

- Template Considerations:
+ <B>Template Considerations:</b>

For most reasoning/thinking models your template CHOICE is critical, as well as your System Prompt/Role setting(s) - below.

@@ -124,6 +124,7 @@ A "Jinja" template is usually in the model's "source code" / "full precision" ve
Here is a Qwen 2.5 version example (DO NOT USE: I have added spacing/breaks for readability):

<pre>
+ <small>
"chat_template": "{% if not add_generation_prompt is defined %}
{% set add_generation_prompt = false %}
{% endif %}
@@ -176,11 +177,13 @@ Here is a Qwen 2.5 version example (DO NOT USE: I have added spacing/breaks for
{% if add_generation_prompt and not ns.is_tool %}
{{'<|Assistant|>'}}
{% endif %}"
+ </small>
</pre>

In some cases you may need to set a "tokenizer" too - depending on the LLM/AI app - to work with specific reasoning/thinking models. Usually
this is NOT an issue as this is auto-detected/set, but if you are getting strange results then this might be the cause.
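
One way to check which template/tokenizer a model actually ships with - and what the rendered prompt looks like - is a short script. A minimal sketch, assuming the Hugging Face "transformers" Python library; the Qwen model name below is only an example placeholder:

<pre>
from transformers import AutoTokenizer

# Example placeholder model - substitute the model you are actually using.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

# The raw Jinja chat template string (None if the model does not ship one):
print(tokenizer.chat_template)

# Render a conversation through the template rather than hand-building special tokens:
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain what a chat template does."},
]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,              # return the formatted string
    add_generation_prompt=True,  # append the assistant turn marker
)
print(prompt)
</pre>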

+ Additional Section "General Notes" is at the end of this document.

TEMP/SETTINGS:

@@ -317,3 +320,58 @@ Response Guidelines:
4. Concise yet Complete: Ensure responses are informative, yet to the point without unnecessary elaboration.
5. Maintain a professional, intelligent, and analytical tone in all interactions.
</PRE>
+
+ ---
+
+ <B>General Notes:</b>
+
+ These are general notes collected from my various repos and from experience with both specific models and models in general.
+
+ These notes may also assist you in operating other models.
+
+ ---
+
+ From:
+
+ https://huggingface.co/DavidAU/L3.1-MOE-2X8B-Deepseek-DeepHermes-e32-uncensored-abliterated-13.7B-gguf
+
+ Due to how this model is configured, I suggest 2-4 generations depending on your use case(s), as each will vary widely in terms of context, thinking/reasoning and response.
+
+ Likewise, depending on how your prompt is worded, it may take 1-4 regens for "thinking" to engage; however, sometimes the model will generate a response, then think/reason, improve on this response and continue again. This comes in part from the "Deepseek" parts of the model.
+
+ If you raise temp over .9, you may want to consider 4+ generations.
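+
+ A minimal sketch of this multi-generation workflow, assuming a local OpenAI-compatible endpoint (for example llama.cpp's llama-server; the URL and prompt below are placeholders):
+
+ <pre>
+ import requests
+
+ URL = "http://localhost:8080/v1/chat/completions"  # assumed local endpoint
+ prompt = "Write one plot for a heist story set in 1920s Paris."
+
+ # Collect several generations of the same prompt so the best can be kept:
+ # 2-4 suggested above, 4+ if temp is over .9.
+ candidates = []
+ for _ in range(4):
+     r = requests.post(URL, json={
+         "messages": [{"role": "user", "content": prompt}],
+         "temperature": 0.9,
+     })
+     candidates.append(r.json()["choices"][0]["message"]["content"])
+
+ for i, text in enumerate(candidates, 1):
+     print(f"--- Generation {i} ---\n{text}\n")
+ </pre>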
+
+ Note on "reasoning/thinking": this will activate depending on the wording of your prompt(s) and the temp selected.
+
+ There can also be variations between generations because of how the models interact.
+
+ Also, as a general note:
+
+ If you are getting "long winded" generation/thinking/reasoning, you may want to break down the "problem(s)" to solve into one or more prompts, as sketched below. This allows the model to focus more strongly and, in some cases, give far better answers.
+
+ IE:
+
+ Asking it to generate one plot with specific requirements, VS 6 general plots for a story in a single prompt, may get you better results.
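+
+ A sketch of this "one focused prompt at a time" approach, assuming the llama-cpp-python package (the GGUF filename is a placeholder):
+
+ <pre>
+ from llama_cpp import Llama
+
+ llm = Llama(model_path="model-Q4_K_M.gguf", n_ctx=8192)  # placeholder path
+
+ # Six focused prompts, one plot each, instead of one "give me 6 plots" prompt.
+ plots = []
+ for i in range(6):
+     out = llm.create_chat_completion(
+         messages=[{
+             "role": "user",
+             "content": f"Generate ONE plot (#{i + 1} of 6) for a sci-fi story. "
+                        "Keep it under 150 words and make it distinct.",
+         }],
+         temperature=0.8,
+     )
+     plots.append(out["choices"][0]["message"]["content"])
+ </pre>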
+
+ ---
+
+ From:
+
+ https://huggingface.co/DavidAU/Qwen2.5-MOE-6x1.5B-DeepSeek-Reasoning-e32-gguf
+
+ Temp of .4 to .8 is suggested; however, it will still operate at much higher temps like 1.8, 2.6 etc.
+
+ Depending on your prompt, change temp SLOWLY: IE: .41, .42, .43 ... etc.
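+
+ For example, a small sweep (same assumed local endpoint as the sketch above; the prompt is a placeholder):
+
+ <pre>
+ import requests
+
+ URL = "http://localhost:8080/v1/chat/completions"  # assumed local endpoint
+ prompt = "Summarize the plot of Hamlet in two sentences."
+
+ # Step temp in small increments and compare the outputs side by side.
+ for temp in (0.41, 0.42, 0.43):
+     r = requests.post(URL, json={
+         "messages": [{"role": "user", "content": prompt}],
+         "temperature": temp,
+     })
+     print(f"--- temp {temp} ---")
+     print(r.json()["choices"][0]["message"]["content"])
+ </pre>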
+
+ Likewise, because these are small models, they may do a tonne of "thinking"/"reasoning" and then "forget" to finish the task(s). In this case, prompt the model to "Complete the task XYZ with the 'reasoning plan' above".
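+
+ In chat terms, that nudge is just a follow-up user turn appended to the running conversation. A sketch, with the same assumed local endpoint as above:
+
+ <pre>
+ import requests
+
+ URL = "http://localhost:8080/v1/chat/completions"  # assumed local endpoint
+
+ messages = [{"role": "user", "content": "Plan, then write, a 200 word flash fiction piece."}]
+ r = requests.post(URL, json={"messages": messages, "temperature": 0.6})
+ draft = r.json()["choices"][0]["message"]["content"]
+
+ # If the model produced a reasoning plan but stopped before the piece itself,
+ # append its output and nudge it to finish.
+ messages.append({"role": "assistant", "content": draft})
+ messages.append({"role": "user",
+                  "content": "Complete the task with the 'reasoning plan' above."})
+ r = requests.post(URL, json={"messages": messages, "temperature": 0.6})
+ print(r.json()["choices"][0]["message"]["content"])
+ </pre>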
+
+ Likewise, it may function better if you break down the reasoning/thinking task(s) into smaller pieces:
+
+ "IE: Instead of asking for 6 plots FOR theme XYZ, ASK IT for ONE plot for theme XYZ at a time."
+
+ Also set the context limit to 4K minimum; 8K+ is suggested.
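+
+ With llama-cpp-python, for example, the context limit is set at load time (placeholder filename):
+
+ <pre>
+ from llama_cpp import Llama
+
+ # n_ctx is the context window in tokens: 4096 minimum, 8192+ suggested.
+ llm = Llama(model_path="model-Q4_K_M.gguf", n_ctx=8192)
+ </pre>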
+
+ ---
+