stockmark/stockmark-100b-instruct-v0.1
Stockmark-100b-instruct-v0.1 is an instruction-tuned version of stockmark-100b, a 100-billion-parameter LLM developed by Stockmark Inc.
How to use
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM
prompt_template = """### ๆ็คบ:
{instruction}
### ๅฟ็ญ:
"""
tokenizer = AutoTokenizer.from_pretrained("stockmark/stockmark-100b-instruct-v0.1")
model = AutoPeftModelForCausalLM.from_pretrained("stockmark/stockmark-100b-instruct-v0.1", device_map="auto", torch_dtype=torch.bfloat16)
instruction = "็ๆAIใจใฏ๏ผ"
prompt = prompt_template.format(instruction=instruction)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
with torch.inference_mode():
tokens = model.generate(
input_ids,
max_new_tokens = 256,
do_sample = True,
temperature = 0.7,
top_p = 0.95,
repetition_penalty = 1.08
)
output = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(output)
Dataset (fine-tuning)
Performance
Stockmark Business Questions
Dataset: https://huggingface.co/datasets/stockmark/business-questions
model | accuracy |
---|---|
stockmark-100b-instruct | 0.90 |
stockmark-13b-instruct | 0.80 |
GPT-3.5-turbo^1 | 0.42 |
Japanese Vicuna QA Benchmark
We excluded the categories that require calculation and coding, and used the remaining 60 questions for evaluation.
GitHub: https://github.com/ku-nlp/ja-vicuna-qa-benchmark
model | average score |
---|---|
stockmark-100b-instruct | 5.97 |
tokyotech-llm/Swallow-70b-instruct-hf | 5.59 |
GPT-3.5 (text-davinci-003) | 5.08 |
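The filtering and averaging described above can be sketched as follows. This is an illustration only: the record layout (`category`, `score`) and the category labels in `EXCLUDED_CATEGORIES` are assumptions, since the text says only that calculation and coding categories were dropped, and the benchmark's own scripts may work differently.

```python
from statistics import mean

# Assumption: labels for the dropped categories. The source only states that
# categories requiring calculation and coding were excluded.
EXCLUDED_CATEGORIES = frozenset({"math", "fermi", "coding"})


def average_score(records, excluded=EXCLUDED_CATEGORIES):
    """Average the judge scores of all records outside the excluded categories.

    Each record is a dict with hypothetical keys "category" and "score".
    """
    kept = [r["score"] for r in records if r["category"] not in excluded]
    if not kept:
        raise ValueError("no records left after filtering")
    return mean(kept)


if __name__ == "__main__":
    # Made-up sample records, not benchmark data.
    sample = [
        {"category": "generic", "score": 7},
        {"category": "coding", "score": 2},
        {"category": "writing", "score": 5},
    ]
    print(f"{average_score(sample):.2f}")
```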
Inference speed
model | time [s] for generating 100 characters in Japanese |
---|---|
stockmark-100b-instruct | 1.86 |
gpt-3.5-turbo | 2.15 |
gpt-4-turbo | 5.48 |
tokyotech-llm/Swallow-70b-instruct-hf | 2.22 |
For local LLMs, we measured the inference time using AWS Inferentia2.
License
Developed by
Stockmark Inc.