Acquiring Bidirectionality via Large and Small Language Models

This is a language model for obtaining token-level representations, as proposed in our COLING 2025 paper "Acquiring Bidirectionality via Large and Small Language Models." Using token representations from bidirectional language models (LMs) such as BERT is still a widely used approach for token-classification tasks. Even though much larger unidirectional LMs such as Llama-2 exist, they are rarely used in place of the token representations of bidirectional LMs. We propose training a new small backward LM and concatenating its representations with those of an existing forward LM for downstream tasks.

This model is the "small backward LM" and is meant to be combined with a forward LM such as Llama-2. Please refer to our official repository for instructions on using this model. This particular model was trained with the meta-llama/Llama-2-7b-hf vocabulary.
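The sketch below illustrates the general idea of concatenating forward and backward token representations. It is not the official usage: the backward-LM model ID is a placeholder, and the assumption that the backward LM consumes the token sequence in reversed order (with its outputs flipped back afterwards) is ours; see the official repository for the exact procedure.

```python
# Minimal sketch (not the official recipe): concatenate token representations
# from a forward LM with those of this backward LM.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
forward_lm = AutoModel.from_pretrained("meta-llama/Llama-2-7b-hf")
backward_lm = AutoModel.from_pretrained("path/to/backward-lm")  # placeholder ID

text = "Token classification needs good token representations."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Forward LM: ordinary left-to-right hidden states, shape (1, seq, d_fwd).
    fwd_hidden = forward_lm(**inputs).last_hidden_state

    # Backward LM: feed the tokens in reversed order, then flip the outputs
    # back so each position aligns with the original token (assumption).
    rev_ids = torch.flip(inputs["input_ids"], dims=[1])
    bwd_hidden = backward_lm(input_ids=rev_ids).last_hidden_state
    bwd_hidden = torch.flip(bwd_hidden, dims=[1])  # shape (1, seq, d_bwd)

# Concatenate along the feature dimension to obtain pseudo-bidirectional
# token representations for a downstream token classifier.
token_reprs = torch.cat([fwd_hidden, bwd_hidden], dim=-1)
print(token_reprs.shape)  # (1, seq, d_fwd + d_bwd)
```

The concatenated vectors can then be fed to a token-classification head (e.g., for NER) in place of BERT-style bidirectional representations.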

Model details: ~110M parameters, F32 tensors, safetensors format.