This repository contains binary patch data for reconstructing dataset for quality estimation of GEC.
Please pull thse codes on your local space.
$ git clone https://github.com/tmu-nlp/ProQE.git
The header contains the following columns:
原文: source sentence.訂正: GEC system output sentence.個別評価: annotation scores by 3 annotators.評価: average of個別評価.
If you use our dataset, please cite our LREC paper:
- Daisuke Suzuki, Yujin Takahashi, Ikumi Yamashita, Taichi Aida, Tosho Hirasawa, Michitaka Nakatsuji, Masato Mita, Mamoru Komachi. Construction of a Quality Estimation Dataset for Automatic Evaluation of Japanese Grammatical Error Correction. Proceedings of the 13th Edition of Language Resources and Evaluation Conference (LREC 2022), pp.5565-5572. June, 2022.