
Model Weights Coming Soon!

Using HDT

To use the UL2-pre-trained model, use the following snippet:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# See the HDT collection page on the Hub for the list of available models.
model_name = 'howey/HDT-ED'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
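
As a quick smoke test, the loaded model can be run through the standard Hugging Face seq2seq generate interface. This is a minimal sketch with an illustrative input; the proper hierarchical preprocessing of long documents (section and sentence boundaries) is described in the HDT repository.

# Minimal sketch using the generic seq2seq interface; the input text and
# generation settings below are illustrative, not taken from the HDT paper.
text = "The quick brown fox jumps over the lazy dog."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))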

For more details, please see our GitHub repository: HDT

Model Details

The model has a context length of 8192 and is similar in size to BERT, with approximately 110M parameters. It was trained on the standard UL2 task with a Transformer-based architecture using our proposed hierarchical attention. Training took 72 hours on the ArXiv + Wikipedia + HUPD corpus and processed a total of 2.6 billion tokens.
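
As a rough sanity check after loading, the parameter count and configuration can be inspected directly. This is a generic transformers snippet, not specific to HDT, and assumes `model` was loaded as in the snippet above.

# Sanity check on the loaded model: roughly 110M parameters, 8192-token context.
num_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {num_params / 1e6:.1f}M")
print(model.config)  # inspect the configured context length and attention settings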

For more details, please see our paper: HDT: Hierarchical Document Transformer.

Citation

Please cite our work using the BibTeX entry below:

BibTeX:

@inproceedings{He2024COLM,
      title={HDT: Hierarchical Document Transformer},
      author={Haoyu He and Markus Flicke and Jan Buchmann and Iryna Gurevych and Andreas Geiger},
      year={2024},
      booktitle={Conference on Language Modeling}
}

Model Card Contact

Haoyu ([email protected])
