---
library_name: transformers
license: cc-by-sa-4.0
language:
- en
- ja
base_model:
- EQUES/JPharmatron-7B-base
tags:
- pharmacy
- biology
- chemistry
- medical
---
# JPharmatron-7B
<!-- Provide a quick summary of what the model is/does. -->
JPharmatron-7B is a 7B large language model designed for pharmaceutical applications and research.
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
JPharmatron-7B is continually pre-trained on 8.8B tokens from Japanese and English datasets, starting from Qwen2.5-7B. Compared to the JPharmatron-7B-base model, JPharmatron-7B has enhanced chat capabilities, obtained by adding the chat vector extracted from Qwen2.5-7B-Instruct (a sketch of this merge follows the model details below).
- **Developed by:** EQUES Inc.
- **Funded by:** [GENIAC Project](https://www.meti.go.jp/policy/mono_info_service/geniac/index.html)
- **Model type:** Causal decoder-only
- **Language(s) (NLP):** Japanese, English
- **License:** CC-BY-SA-4.0
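
A minimal sketch of the chat-vector merge mentioned above, using standard 🤗 Transformers / PyTorch APIs. This illustrates the technique only; it is not the authors' exact merging script, and model ids other than JPharmatron-7B-base are the upstream Qwen releases.

```python
# Illustrative chat-vector merge (not the authors' exact procedure):
# chat vector = Qwen2.5-7B-Instruct weights - Qwen2.5-7B weights,
# added onto the continually pre-trained JPharmatron-7B-base.
import torch
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B", torch_dtype=torch.bfloat16)
inst = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct", torch_dtype=torch.bfloat16)
domain = AutoModelForCausalLM.from_pretrained("EQUES/JPharmatron-7B-base", torch_dtype=torch.bfloat16)

with torch.no_grad():
    for p_domain, p_base, p_inst in zip(
        domain.parameters(), base.parameters(), inst.parameters()
    ):
        # All three models share the Qwen2.5 architecture and tokenizer,
        # so parameter tensors line up one-to-one.
        p_domain.add_(p_inst - p_base)

domain.save_pretrained("jpharmatron-7b-merged")  # output path chosen for illustration
```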
### Model Sources
<!-- Provide the basic links for the model. -->
- **Repository:** https://github.com/EQUES-Inc/pharma-LLM-eval
- **Paper:** [A Japanese Language Model and Three New Evaluation Benchmarks for Pharmaceutical NLP](https://arxiv.org/abs/2505.16661)
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
This model is intended for applications in pharmaceutical paperwork and research. It is not validated for medical use or any other risk-sensitive use.
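
A minimal usage sketch with 🤗 Transformers, assuming the tokenizer ships the standard Qwen2.5 chat template; the prompt and generation settings are illustrative.

```python
# Illustrative chat-style inference with JPharmatron-7B.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EQUES/JPharmatron-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [
    # "Tell me the common side effects of loxoprofen." (example prompt)
    {"role": "user", "content": "ロキソプロフェンの一般的な副作用を教えてください。"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```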
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
We evaluated our model, JPharmatron-7B, against other general-purpose and domain-specific models of similar size.
### Testing Data
<!-- This should link to a Dataset Card if possible. -->
[JPharmaBench](https://huggingface.co/collections/EQUES/jpharmabench-680a34acfe96870e41d050d8) and two existing benchmarks (JMMLU (pharma) and IgakuQA) were used.
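
A hypothetical sketch for loading one JPharmaBench task with the 🤗 Datasets library; the exact dataset repository ids and split names are assumptions here and should be taken from the JPharmaBench collection linked above.

```python
# Load a JPharmaBench task for inspection (dataset id assumed for illustration).
from datasets import load_dataset

yakugaku = load_dataset("EQUES/YakugakuQA", split="test")
print(yakugaku[0])
```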
### Results
Compared to Meditron3-Qwen2.5-7B and Llama3.1-Swallow-8B-Instruct-v0.3, JPharmatron-7B achieved the highest scores on all five benchmarks.

## Citation
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
**BibTeX:**
```bibtex
@misc{sukeda_japanese_2025,
  title     = {A Japanese Language Model and Three New Evaluation Benchmarks for Pharmaceutical NLP},
  author    = {Sukeda, Issey and Fujii, Takuro and Buma, Kosei and Sasaki, Shunsuke and Ono, Shinnosuke},
  year      = {2025},
  month     = may,
  publisher = {arXiv},
  url       = {http://arxiv.org/abs/2505.16661},
  doi       = {10.48550/arXiv.2505.16661},
  urldate   = {2025-05-30},
  note      = {arXiv:2505.16661 [cs]},
  annote    = {Comment: 15 pages, 9 tables, 5 figures},
  abstract  = {We present a Japanese domain-specific language model for the pharmaceutical field, developed through continual pretraining on 2 billion Japanese pharmaceutical tokens and 8 billion English biomedical tokens. To enable rigorous evaluation, we introduce three new benchmarks: YakugakuQA, based on national pharmacist licensing exams; NayoseQA, which tests cross-lingual synonym and terminology normalization; and SogoCheck, a novel task designed to assess consistency reasoning between paired statements. We evaluate our model against both open-source medical LLMs and commercial models, including GPT-4o. Results show that our domain-specific model outperforms existing open models and achieves competitive performance with commercial ones, particularly on terminology-heavy and knowledge-based tasks. Interestingly, even GPT-4o performs poorly on SogoCheck, suggesting that cross-sentence consistency reasoning remains an open challenge. Our benchmark suite offers a broader diagnostic lens for pharmaceutical NLP, covering factual recall, lexical variation, and logical consistency. This work demonstrates the feasibility of building practical, secure, and cost-effective language models for Japanese domain-specific applications, and provides reusable evaluation resources for future research in pharmaceutical and healthcare NLP. Our model, codes, and datasets are released at https://github.com/EQUES-Inc/pharma-LLM-eval.}
}
```
## More Information
See our preprint: [A Japanese Language Model and Three New Evaluation Benchmarks for Pharmaceutical NLP](https://arxiv.org/abs/2505.16661).
## Model Card Authors
[@shinnosukeono](https://shinnosukeono.github.io/)