ZeroXClem/Llama3.1-BestMix-Chem-Einstein-8B

Llama3.1-BestMix-Chem-Einstein-8B is an innovative, meticulously blended model designed to excel in instruction-following, chemistry-focused tasks, and long-form conversational generation. This model fuses the best qualities of multiple Llama3-based architectures, making it highly versatile for both general and specialized tasks. 💻🧠✨

🌟 Family Tree

This model is the result of merging the following:

🧬 Model Lineage

A: bunnycore/Best-Mix-Llama-3.1-8B

  • A masterful blend of several Llama3-based models, including Aurora_faustus, TitanFusion, and OpenMath2.
  • Provides a balanced performance in a variety of tasks such as reasoning, math, and instruction-following.
  • Key contributor to the overall versatility of the merged model.

B: USTC-KnowledgeComputingLab/Llama3-KALE-LM-Chem-1.5-8B

  • Specializes in chemistry and scientific knowledge, outperforming many larger models in chemistry benchmarks.
  • Adds scientific rigor and domain-specific expertise to the merged model, making it perfect for scientific and academic tasks.

C: Weyaxi/Einstein-v6.1-Llama3-8B

  • Fine-tuned on a wide range of instructive and conversational datasets like WizardLM, Alpaca, and ShareGPT.
  • Optimized for long-form text generation and enhanced with xformers attention and flash attention techniques for better performance.
  • Key player in dialogue-based tasks and long conversation generation.

πŸ› οΈ Merge Details

This model was merged using the TIES merge method, ensuring a smooth integration of the key strengths from each contributing model. Here's the configuration used:

```yaml
models:
  - model: bunnycore/Best-Mix-Llama-3.1-8B
    parameters:
      density: [1, 0.7, 0.5]
      weight: 1.0

  - model: USTC-KnowledgeComputingLab/Llama3-KALE-LM-Chem-1.5-8B
    parameters:
      density: 0.6
      weight: [0.3, 0.7, 1.0]

  - model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 0.4
      weight:
        - filter: mlp
          value: 0.5
        - filter: self_attn
          value: 0.7
        - value: 0.5

merge_method: ties
base_model: bunnycore/Best-Mix-Llama-3.1-8B
parameters:
  normalize: true
  int8_mask: true
dtype: float16
```
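To reproduce a merge like this, the configuration above can be run with the mergekit CLI. A minimal sketch, assuming mergekit is installed and the YAML is saved locally as `config.yaml` (the file name and output path are illustrative):

```shell
# Illustrative: install mergekit, then run the merge config.
# Merging three 8B models requires substantial RAM/VRAM and disk space.
pip install mergekit
mergekit-yaml config.yaml ./Llama3.1-BestMix-Chem-Einstein-8B --cuda
```

The `--cuda` flag offloads tensor arithmetic to the GPU; omit it to merge on CPU.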

🎯 Key Features & Capabilities

1. Instruction Following & General Reasoning:

With the foundation of Best-Mix, this model excels in general-purpose reasoning, instruction-following, and tasks that require high adaptability.

2. Scientific & Chemistry Expertise:

Thanks to the contribution from KALE-LM-Chem, this model shines in scientific research, particularly chemistry-focused tasks, making it ideal for academic and research purposes.

3. Long-Form & Conversational Mastery:

With Einstein-v6.1, the model handles long-form generation effortlessly, excelling in extended conversations and structured dialogue applications.


🚀 Performance Benchmarks

While still in its early stages, Llama3.1-BestMix-Chem-Einstein-8B is expected to perform well across a variety of benchmarks, including:

  • Chemistry-focused benchmarks (KALE-LM-Chem)
  • Instruction-following tasks (Best-Mix)
  • Conversational AI and long-form text generation (Einstein-v6.1)

Further testing and evaluation will continue to refine this model's capabilities.
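For quick experimentation, the merged model should load with the standard 🤗 Transformers API. A minimal usage sketch, assuming a GPU with enough memory for an 8B model in FP16 (the prompt is illustrative):

```python
# Usage sketch: load the merged model and run a chemistry-flavored prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ZeroXClem/Llama3.1-BestMix-Chem-Einstein-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

messages = [
    {"role": "user", "content": "Explain Le Chatelier's principle in one paragraph."}
]
# Apply the Llama 3.1 chat template inherited from the merged checkpoints.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Sampling parameters (`temperature`, `max_new_tokens`) are starting points, not tuned recommendations.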


📜 License

This model is open-sourced under the Apache-2.0 License, allowing free use and modification with proper attribution.


💡 Tags

  • merge
  • TIES
  • BestMix
  • Chemistry
  • Einstein
  • instruction-following
  • long-form-generation
  • conversational
