---
license: mit
language:
- he
- en
base_model:
- FacebookAI/xlm-roberta-large
pipeline_tag: translation
tags:
- quality_estimation
---

This model was pretrained on synthetic data generated by applying the DCSQE framework to the WMT2023 parallel corpus. It is implemented with the Fairseq framework. For a detailed description of the DCSQE framework, please refer to the paper:
[Alleviating Distribution Shift in Synthetic Data for Machine Translation Quality Estimation](https://huggingface.co/papers/2502.19941)