❗The results of the 2022 edition are available.
❗Please also register for the QE Google group here in order to receive updates and announcements immediately and to ask us questions.
This shared task focuses on automatic methods for estimating the quality of neural machine translation output at run-time, without relying on reference translations. It covers estimation at the sentence and word levels. This year we introduce the following new elements:
Have questions or suggestions? Feel free to Contact Us!
❗The test data for all tasks and subtasks can now be downloaded here 🔗. Please read the updated formatting instructions as well.
❗The .tags files for the DA-QE training data have been updated to fix a bug in the annotation of BAD <EOS> tokens. Please make sure to download them again here 🔗.
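Since the .tags files were regenerated, it is worth sanity-checking that each tags line still aligns with the corresponding MT output line. The sketch below assumes the common word-level QE layout of one space-separated OK/BAD tag per MT token per line; depending on the edition, tags may also cover gap positions, so adjust the expected count accordingly. All file contents and names here are illustrative, not the official data.

```python
# Sketch: check that a word-level .tags file lines up with the MT output.
# Assumes one OK/BAD tag per MT token per line (adjust if the edition
# also tags gap positions).

def check_tags(mt_lines, tag_lines):
    """Return (line_no, n_tokens, n_tags) for every misaligned or invalid line."""
    mismatches = []
    for i, (mt, tags) in enumerate(zip(mt_lines, tag_lines), start=1):
        toks = mt.split()
        tgs = tags.split()
        # flag unknown labels or a token/tag count mismatch
        if any(t not in ("OK", "BAD") for t in tgs) or len(tgs) != len(toks):
            mismatches.append((i, len(toks), len(tgs)))
    return mismatches

# toy example with made-up MT output and tags
mt = ["das ist gut", "ein Test"]
tags = ["OK BAD OK", "OK OK"]
print(check_tags(mt, tags))  # -> [] when everything lines up
```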
❗The training data for all tasks and subtasks can now be downloaded here 🔗. We provide an overview of the datasets, annotations and LPs below. For more information check also the individual task tabs and the additional data.
| Milestone | Date |
| --- | --- |
| Release of training data | 30th May, 2022 |
| Release of dev data | TBA |
| Release of test data | 21st July, 2022 |
| Submission of test predictions deadline | |
| Announcement of results | 5th September, 2022 |
| System description submission deadline | |
| Paper submission deadline to WMT | 7th September, 2022 |
| WMT notification of acceptance | 9th October, 2022 |
| WMT camera-ready deadline | 16th October, 2022 |
| WMT conference | 7th - 8th December, 2022 |
In addition to generally advancing the state of the art in quality estimation, our specific goals are:
For all tasks, the datasets and NMT models that generated the translations will be made publicly available.
Participants are also allowed to explore any additional data and resources deemed relevant. Below are the three QE tasks addressing these goals.
The shared task competition will take place on CODALAB.
Please register with one account per team.
It is not necessary to participate in all tasks/subtasks/phases. Participating teams can choose which tasks and language pairs they want to make submissions for.
Here is some open-source software for QE that may be useful to participants:
Each participating team may submit at most 10 systems for each language pair of each subtask. Submissions should be made on the CODALAB page for the corresponding subtask. Please check that your system output on the dev data is correctly read by the official evaluation scripts.
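One quick self-check before submitting is to score your dev predictions against the released gold labels. Pearson's r has been the primary sentence-level metric in recent QE editions; the file format assumed here (one score per line, same order as the gold file) is an illustration, so follow the official formatting instructions for the actual submission.

```python
# Sketch: sanity-check sentence-level dev predictions with Pearson's r.
# The one-score-per-line format is an assumption for illustration only.
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sqrt(sum((a - mx) ** 2 for a in x))
    vy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (vx * vy)

# toy gold/predicted quality scores (made up, not official data)
gold = [0.1, 0.5, 0.9, 0.3]
pred = [0.2, 0.4, 0.8, 0.35]
assert len(gold) == len(pred), "one prediction per dev segment"
print(round(pearson(gold, pred), 3))  # -> 0.981
```

If your predictions load cleanly and produce a sensible correlation on the dev set, the official evaluation scripts should read the same format without surprises.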