Tags: Safetensors, GGUF, English, chain-of-thought, cot-reasoning, step-by-step-reasoning, systematic-research-planning, academic-assistant, academic-planning, thesis-planning, dissertation-planning, research-question-formulation, literature-review-planning, methodology-design, experimental-design, qualitative-research-planning, quantitative-research-planning, mixed-methods-planning, student-research-assistant, phd-support, postgraduate-tool, early-career-researcher, grant-writing-assistant, research-proposal-helper, cross-disciplinary-research, interdisciplinary-methodology, academic-mentorship-tool, research-evaluation-assistant, independent-researcher-tool, r-and-d-assistant, reasoning-model, structured-output, systematic-analysis, problem-decomposition, research-breakdown, actionable-planning, scientific-research, social-science-research, humanities-research, medical-research-planning, engineering-research, business-research, mistral-based, mistral-fine-tune, lora-adaptation, foundation-model, instruction-tuned, 7b-parameters, ai-research-assistant, research-automation, sota-research-planning, hypothesis-generation, experiment-design-assistant, literature-analysis, paper-outline-generator, structured-output-generation, systematic-reasoning, detailed-planning, zero-shot-planning, research-summarization, biomedical-research-assistant, clinical-trial-planning, tech-r-and-d, materials-science, computational-research, data-science-assistant, literature-synthesis, meta-analysis-helper, best-research-assistant-model, top-research-planning-model, research-ai-assistant, ai-research-mentor, academic-planning-ai, research-workflow-automation, quantum-computing-research, ai-ml-research-planning, cybersecurity-research, neuroscience-research-planning, genomics-research, robotics-research-planning, climate-science-research, behavioral-economics-research, educational-technology-research, research-plan-generator, methodology-recommendation, data-collection-planning, analysis-strategy-development, implementation-planning, evaluation-framework-design, challenge-identification, resource-requirement-analysis, technical-limitation-assessment, research-gap-analysis, knowledge-synthesis, practical-research-tools, affordable-research-assistant, systematic-planning-tool, comprehensive-research-framework, research-project-management, researcher-productivity-tool, text-to-research-plan, dual-output-model, think-answer-format, evidence-based-research-planning, research-mentoring, science-domains-expert, engineering-domains-expert, social-science-domains-expert, multidisciplinary-research, structured-research-planning, hierarchical-plan-generator, convergent-thinking, divergent-thinking, research-ideation, experimental-protocol-design, mistral-research-assistant, focused-research-scope, quantitative-analysis-planning, portable-research-assistant, education-research-tool, Research-Reasoner-7B-v0.3, Research-Reasoner-7B, Research-Reasoner, conversational
Training/Training_Logs.txt (commit 0abd808, verified, by Raymond-dev-546730)
Loading tokenizer...
Loading dataset from ./Dataset.jsonl
Loaded 5750 samples
Training on 5462 samples, validating on 288 samples
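The split above (5462 train / 288 validation out of 5750) corresponds to holding out about 5% for validation. A minimal sketch of such a split; the helper function, seed, and exact shuffling are illustrative assumptions, not the actual training script:

```python
import random

def split_dataset(samples, val_fraction=0.05, seed=42):
    """Shuffle and hold out a validation fraction (illustrative helper)."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    n_val = round(len(samples) * val_fraction)
    return samples[n_val:], samples[:n_val]

train, val = split_dataset(range(5750))
print(len(train), len(val))  # 5462 288, matching the logged split
```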
Loading model...
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:01<00:00, 1.64it/s]
Trainable parameters: 85,065,728 (1.16% of 7,333,089,280)
No label_names provided for model class `PeftModelForCausalLM`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.
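The "Trainable parameters" line above is the usual LoRA picture: only the adapter (and, per the later warnings, embedding) weights receive gradients. A quick check that the logged percentage is self-consistent, using figures copied from the log; the exact print format is an assumption about how the script reports it:

```python
trainable, total = 85_065_728, 7_333_089_280  # figures from the log above
print(f"Trainable parameters: {trainable:,} ({trainable / total:.2%} of {total:,})")
# → Trainable parameters: 85,065,728 (1.16% of 7,333,089,280)
```

The `label_names` warning is benign for causal-LM fine-tuning; it can typically be silenced by passing `label_names=["labels"]` to `TrainingArguments`.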
Starting training...
  0%|          | 0/822 [00:00<?, ?it/s]
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
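The `use_cache` warning above is expected: the KV cache only speeds up autoregressive generation and is recomputed anyway under gradient checkpointing, so Transformers disables it. A configuration fragment (illustrative, not the actual training script) that avoids the warning by disabling the cache up front:

```python
# Illustrative fragment; `model` is any Hugging Face causal LM already loaded.
model.gradient_checkpointing_enable()  # trade extra compute for activation memory
model.config.use_cache = False         # KV cache is useless during training
```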
{'loss': 1.3121, 'grad_norm': 2.309774398803711, 'learning_rate': 9.987699876998771e-05, 'epoch': 0.04}
{'loss': 0.5661, 'grad_norm': 0.8865324258804321, 'learning_rate': 9.86469864698647e-05, 'epoch': 0.07}
{'loss': 0.4548, 'grad_norm': 0.7190561294555664, 'learning_rate': 9.74169741697417e-05, 'epoch': 0.11}
{'loss': 0.4142, 'grad_norm': 0.5569139719009399, 'learning_rate': 9.61869618696187e-05, 'epoch': 0.15}
{'loss': 0.3967, 'grad_norm': 0.5526906251907349, 'learning_rate': 9.49569495694957e-05, 'epoch': 0.18}
{'loss': 0.3791, 'grad_norm': 0.5905580520629883, 'learning_rate': 9.37269372693727e-05, 'epoch': 0.22}
{'loss': 0.3705, 'grad_norm': 0.5934385061264038, 'learning_rate': 9.24969249692497e-05, 'epoch': 0.26}
{'loss': 0.343, 'grad_norm': 0.5346804857254028, 'learning_rate': 9.126691266912669e-05, 'epoch': 0.29}
{'loss': 0.3571, 'grad_norm': 0.5717061161994934, 'learning_rate': 9.00369003690037e-05, 'epoch': 0.33}
{'loss': 0.3377, 'grad_norm': 0.572606086730957, 'learning_rate': 8.880688806888068e-05, 'epoch': 0.36}
{'loss': 0.3351, 'grad_norm': 0.5523596405982971, 'learning_rate': 8.757687576875769e-05, 'epoch': 0.4}
{'loss': 0.3358, 'grad_norm': 0.5368799567222595, 'learning_rate': 8.634686346863469e-05, 'epoch': 0.44}
{'loss': 0.3295, 'grad_norm': 0.5397845506668091, 'learning_rate': 8.511685116851169e-05, 'epoch': 0.47}
{'loss': 0.3176, 'grad_norm': 0.533035933971405, 'learning_rate': 8.388683886838868e-05, 'epoch': 0.51}
{'loss': 0.3121, 'grad_norm': 0.5297876000404358, 'learning_rate': 8.265682656826569e-05, 'epoch': 0.55}
{'loss': 0.3106, 'grad_norm': 0.5207675099372864, 'learning_rate': 8.142681426814267e-05, 'epoch': 0.58}
{'loss': 0.3076, 'grad_norm': 0.530646800994873, 'learning_rate': 8.019680196801969e-05, 'epoch': 0.62}
{'loss': 0.3138, 'grad_norm': 0.5448920130729675, 'learning_rate': 7.896678966789668e-05, 'epoch': 0.66}
{'loss': 0.2993, 'grad_norm': 0.5139408707618713, 'learning_rate': 7.773677736777368e-05, 'epoch': 0.69}
{'loss': 0.287, 'grad_norm': 0.510353147983551, 'learning_rate': 7.650676506765067e-05, 'epoch': 0.73}
{'loss': 0.2872, 'grad_norm': 0.4917829930782318, 'learning_rate': 7.527675276752768e-05, 'epoch': 0.77}
{'loss': 0.2878, 'grad_norm': 0.5074344873428345, 'learning_rate': 7.404674046740468e-05, 'epoch': 0.8}
{'loss': 0.283, 'grad_norm': 0.5248982310295105, 'learning_rate': 7.281672816728168e-05, 'epoch': 0.84}
{'loss': 0.2819, 'grad_norm': 0.50633305311203, 'learning_rate': 7.158671586715867e-05, 'epoch': 0.88}
{'loss': 0.2762, 'grad_norm': 0.49859219789505005, 'learning_rate': 7.035670356703567e-05, 'epoch': 0.91}
{'loss': 0.274, 'grad_norm': 0.5172735452651978, 'learning_rate': 6.912669126691266e-05, 'epoch': 0.95}
{'loss': 0.258, 'grad_norm': 0.49355506896972656, 'learning_rate': 6.789667896678967e-05, 'epoch': 0.99}
{'eval_loss': 0.26155468821525574, 'eval_runtime': 71.2989, 'eval_samples_per_second': 4.039, 'eval_steps_per_second': 0.505, 'epoch': 1.0}
 33%|█████████████████████████████████████████▎ | 274/822 [1:17:25<1:51:49, 12.24s/it]
/venv/main/lib/python3.10/site-packages/peft/utils/save_and_load.py:220: UserWarning: Setting `save_embedding_layers` to `True` as embedding layers found in `target_modules`.
warnings.warn("Setting `save_embedding_layers` to `True` as embedding layers found in `target_modules`.")
{'loss': 0.2408, 'grad_norm': 0.5144125819206238, 'learning_rate': 6.666666666666667e-05, 'epoch': 1.02}
{'loss': 0.2214, 'grad_norm': 0.5052995085716248, 'learning_rate': 6.543665436654367e-05, 'epoch': 1.06}
{'loss': 0.2211, 'grad_norm': 0.5044167041778564, 'learning_rate': 6.420664206642066e-05, 'epoch': 1.09}
{'loss': 0.2177, 'grad_norm': 0.5073069334030151, 'learning_rate': 6.297662976629767e-05, 'epoch': 1.13}
{'loss': 0.2175, 'grad_norm': 0.5095102190971375, 'learning_rate': 6.174661746617465e-05, 'epoch': 1.17}
{'loss': 0.2146, 'grad_norm': 0.525816023349762, 'learning_rate': 6.0516605166051664e-05, 'epoch': 1.2}
{'loss': 0.2224, 'grad_norm': 0.5224237442016602, 'learning_rate': 5.928659286592866e-05, 'epoch': 1.24}
{'loss': 0.214, 'grad_norm': 0.5144808888435364, 'learning_rate': 5.8056580565805663e-05, 'epoch': 1.28}
{'loss': 0.2102, 'grad_norm': 0.5273745656013489, 'learning_rate': 5.682656826568265e-05, 'epoch': 1.31}
{'loss': 0.2132, 'grad_norm': 0.5015597343444824, 'learning_rate': 5.559655596555966e-05, 'epoch': 1.35}
{'loss': 0.2129, 'grad_norm': 0.5065250992774963, 'learning_rate': 5.436654366543665e-05, 'epoch': 1.39}
{'loss': 0.2155, 'grad_norm': 0.5387675166130066, 'learning_rate': 5.3136531365313655e-05, 'epoch': 1.42}
{'loss': 0.2067, 'grad_norm': 0.5043182373046875, 'learning_rate': 5.190651906519065e-05, 'epoch': 1.46}
{'loss': 0.2123, 'grad_norm': 0.5399544835090637, 'learning_rate': 5.0676506765067654e-05, 'epoch': 1.5}
{'loss': 0.2136, 'grad_norm': 0.5281521081924438, 'learning_rate': 4.944649446494466e-05, 'epoch': 1.53}
{'loss': 0.2016, 'grad_norm': 0.5173853039741516, 'learning_rate': 4.821648216482165e-05, 'epoch': 1.57}
{'loss': 0.2174, 'grad_norm': 0.5505253672599792, 'learning_rate': 4.698646986469865e-05, 'epoch': 1.61}
{'loss': 0.207, 'grad_norm': 0.5227971076965332, 'learning_rate': 4.575645756457565e-05, 'epoch': 1.64}
{'loss': 0.2105, 'grad_norm': 0.5264511108398438, 'learning_rate': 4.452644526445265e-05, 'epoch': 1.68}
{'loss': 0.2025, 'grad_norm': 0.48573943972587585, 'learning_rate': 4.3296432964329645e-05, 'epoch': 1.72}
{'loss': 0.2033, 'grad_norm': 0.5173289179801941, 'learning_rate': 4.206642066420665e-05, 'epoch': 1.75}
{'loss': 0.2164, 'grad_norm': 0.5112122893333435, 'learning_rate': 4.0836408364083644e-05, 'epoch': 1.79}
{'loss': 0.2061, 'grad_norm': 0.5157698392868042, 'learning_rate': 3.960639606396064e-05, 'epoch': 1.82}
{'loss': 0.1998, 'grad_norm': 0.5127679705619812, 'learning_rate': 3.837638376383764e-05, 'epoch': 1.86}
{'loss': 0.2052, 'grad_norm': 0.5318583846092224, 'learning_rate': 3.714637146371464e-05, 'epoch': 1.9}
{'loss': 0.1935, 'grad_norm': 0.5409183502197266, 'learning_rate': 3.591635916359164e-05, 'epoch': 1.93}
{'loss': 0.2112, 'grad_norm': 0.5394887328147888, 'learning_rate': 3.468634686346864e-05, 'epoch': 1.97}
{'eval_loss': 0.23323042690753937, 'eval_runtime': 71.2959, 'eval_samples_per_second': 4.04, 'eval_steps_per_second': 0.505, 'epoch': 2.0}
 67%|████████████████████████████████████████████████████████████████████████████████████ | 548/822 [2:34:50<55:55, 12.25s/it]
/venv/main/lib/python3.10/site-packages/peft/utils/save_and_load.py:220: UserWarning: Setting `save_embedding_layers` to `True` as embedding layers found in `target_modules`.
warnings.warn("Setting `save_embedding_layers` to `True` as embedding layers found in `target_modules`.")
{'loss': 0.1896, 'grad_norm': 0.4955272674560547, 'learning_rate': 3.3456334563345635e-05, 'epoch': 2.01}
{'loss': 0.1458, 'grad_norm': 0.5467974543571472, 'learning_rate': 3.222632226322264e-05, 'epoch': 2.04}
{'loss': 0.1541, 'grad_norm': 0.5276175141334534, 'learning_rate': 3.0996309963099634e-05, 'epoch': 2.08}
{'loss': 0.1533, 'grad_norm': 0.5403090119361877, 'learning_rate': 2.9766297662976633e-05, 'epoch': 2.12}
{'loss': 0.1482, 'grad_norm': 0.5518564581871033, 'learning_rate': 2.8536285362853633e-05, 'epoch': 2.15}
{'loss': 0.1491, 'grad_norm': 0.5449461340904236, 'learning_rate': 2.730627306273063e-05, 'epoch': 2.19}
{'loss': 0.1498, 'grad_norm': 0.5545180439949036, 'learning_rate': 2.607626076260763e-05, 'epoch': 2.23}
{'loss': 0.1493, 'grad_norm': 0.5655773878097534, 'learning_rate': 2.4846248462484625e-05, 'epoch': 2.26}
{'loss': 0.1493, 'grad_norm': 0.5422763824462891, 'learning_rate': 2.3616236162361624e-05, 'epoch': 2.3}
{'loss': 0.1467, 'grad_norm': 0.575341522693634, 'learning_rate': 2.2386223862238624e-05, 'epoch': 2.34}
{'loss': 0.1496, 'grad_norm': 0.5568922758102417, 'learning_rate': 2.115621156211562e-05, 'epoch': 2.37}
{'loss': 0.1464, 'grad_norm': 0.5487733483314514, 'learning_rate': 1.992619926199262e-05, 'epoch': 2.41}
{'loss': 0.1412, 'grad_norm': 0.5830573439598083, 'learning_rate': 1.869618696186962e-05, 'epoch': 2.45}
{'loss': 0.1443, 'grad_norm': 0.5557405948638916, 'learning_rate': 1.746617466174662e-05, 'epoch': 2.48}
{'loss': 0.1473, 'grad_norm': 0.5754356384277344, 'learning_rate': 1.6236162361623615e-05, 'epoch': 2.52}
{'loss': 0.1515, 'grad_norm': 0.5622814297676086, 'learning_rate': 1.5006150061500615e-05, 'epoch': 2.55}
{'loss': 0.1492, 'grad_norm': 0.5576657056808472, 'learning_rate': 1.3776137761377614e-05, 'epoch': 2.59}
{'loss': 0.1464, 'grad_norm': 0.5700699687004089, 'learning_rate': 1.2546125461254612e-05, 'epoch': 2.63}
{'loss': 0.1467, 'grad_norm': 0.5535569787025452, 'learning_rate': 1.1316113161131612e-05, 'epoch': 2.66}
{'loss': 0.1452, 'grad_norm': 0.5665120482444763, 'learning_rate': 1.008610086100861e-05, 'epoch': 2.7}
{'loss': 0.1445, 'grad_norm': 0.5577532052993774, 'learning_rate': 8.85608856088561e-06, 'epoch': 2.74}
{'loss': 0.1354, 'grad_norm': 0.5575653314590454, 'learning_rate': 7.626076260762607e-06, 'epoch': 2.77}
{'loss': 0.1424, 'grad_norm': 0.5807399153709412, 'learning_rate': 6.396063960639606e-06, 'epoch': 2.81}
{'loss': 0.1361, 'grad_norm': 0.5388136506080627, 'learning_rate': 5.166051660516605e-06, 'epoch': 2.85}
{'loss': 0.1443, 'grad_norm': 0.5740516185760498, 'learning_rate': 3.936039360393604e-06, 'epoch': 2.88}
{'loss': 0.1456, 'grad_norm': 0.5718849897384644, 'learning_rate': 2.706027060270603e-06, 'epoch': 2.92}
{'loss': 0.1432, 'grad_norm': 0.5548787117004395, 'learning_rate': 1.4760147601476015e-06, 'epoch': 2.96}
{'loss': 0.1373, 'grad_norm': 0.5674080848693848, 'learning_rate': 2.4600246002460025e-07, 'epoch': 2.99}
{'eval_loss': 0.2300749570131302, 'eval_runtime': 71.291, 'eval_samples_per_second': 4.04, 'eval_steps_per_second': 0.505, 'epoch': 3.0}
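Since `eval_loss` for a causal LM is typically the mean per-token cross-entropy, validation perplexity follows directly as exp(loss). Computed from the three eval_loss values logged above:

```python
import math

# eval_loss values copied from the log (epochs 1, 2, 3).
for epoch, eval_loss in [(1, 0.26155), (2, 0.23323), (3, 0.23007)]:
    print(f"epoch {epoch}: validation perplexity ≈ {math.exp(eval_loss):.3f}")
```

Perplexity drops from roughly 1.30 to 1.26, mirroring the small eval_loss improvement between epochs 2 and 3.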
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 822/822 [3:52:15<00:00, 12.25s/it]
/venv/main/lib/python3.10/site-packages/peft/utils/save_and_load.py:220: UserWarning: Setting `save_embedding_layers` to `True` as embedding layers found in `target_modules`.
warnings.warn("Setting `save_embedding_layers` to `True` as embedding layers found in `target_modules`.")
{'train_runtime': 13935.9037, 'train_samples_per_second': 1.176, 'train_steps_per_second': 0.059, 'train_loss': 0.24226934355830915, 'epoch': 3.0}
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 822/822 [3:52:15<00:00, 16.95s/it]
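The summary line is internally consistent with the rest of the log; a quick cross-check using figures copied from above:

```python
runtime, steps = 13935.9037, 822  # train_runtime (s) and total steps from the log
samples, epochs = 5462, 3         # training-set size and epoch count

print(f"{samples * epochs / runtime:.3f} samples/s")  # ≈ 1.176, as reported
print(f"{runtime / steps:.2f} s/step")                # ≈ 16.95, as in the final bar
```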
Saving model...
/venv/main/lib/python3.10/site-packages/peft/utils/save_and_load.py:220: UserWarning: Setting `save_embedding_layers` to `True` as embedding layers found in `target_modules`.
warnings.warn("Setting `save_embedding_layers` to `True` as embedding layers found in `target_modules`.")
Model saved to ./Research-Reasoner-7B-v0.3_LoRA_adapter
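A hedged sketch of reloading the saved adapter for inference. `PeftModel.from_pretrained` and `merge_and_unload` are standard PEFT APIs; the base-model checkpoint name is an assumption, since the log does not state which Mistral checkpoint was fine-tuned:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Base checkpoint name is assumed; substitute the actual base model used for training.
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
model = PeftModel.from_pretrained(base, "./Research-Reasoner-7B-v0.3_LoRA_adapter")
model = model.merge_and_unload()  # optionally fold the LoRA weights into the base model
```

Merging is optional; keeping the adapter separate preserves the small-footprint deployment that the ~1.16% trainable-parameter count implies.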