Kurdish-English Parallel Translation Corpus
Published: May 8, 2023
Size: 450 MB
License: CC BY-SA 4.0
Domain: Translation
Languages:
Kurdish (Sorani)
Kurdish (Kurmanji)
English
Formats:
TSV
JSON
A high-quality parallel corpus of 500,000 Kurdish-English sentence pairs that covers multiple domains and gives balanced representation to the Sorani and Kurmanji dialects.
The corpus includes tab-separated files with aligned sentences, metadata with domain tags and quality scores, and source attribution files.
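As a rough illustration of how the tab-separated files might be consumed, the sketch below reads aligned pairs and filters them by quality score. The column order (Kurdish sentence, English sentence, domain tag, quality score), the score range, and the names `SentencePair` and `read_pairs` are assumptions made for this example; the actual layout should be checked against the files shipped with the corpus.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class SentencePair(NamedTuple):
    """One aligned row. The four-column layout is an assumption, not documented."""
    kurdish: str
    english: str
    domain: str
    quality: float


def read_pairs(handle: TextIO, min_quality: float = 0.0) -> Iterator[SentencePair]:
    """Yield sentence pairs from a tab-separated stream, dropping low-scoring rows."""
    # QUOTE_NONE keeps quotation marks inside sentences as literal text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 4:
            continue  # skip rows that do not match the assumed layout
        kurdish, english, domain, quality = row
        try:
            score = float(quality)
        except ValueError:
            continue  # skip a header row or a non-numeric score
        if score >= min_quality:
            yield SentencePair(kurdish, english, domain, score)


# Hypothetical sample rows, invented for illustration only.
SAMPLE = "Silav\tHello\tgeneral\t0.95\nSpas\tThank you\tgeneral\t0.40\n"
pairs = list(read_pairs(io.StringIO(SAMPLE), min_quality=0.5))
```

For a real file, the same function can be given a handle opened with `open(path, encoding="utf-8", newline="")`, which matters because Sorani is typically written in Arabic script and Kurmanji in Latin script.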
Dr. Karim Mohammad, Dr. Zainab Hussein (2023)
We present a transformer-based neural machine translation system specifically designed for Kurdish-English translation, incorporating morphological awareness and handling dialectal variations across …
Salim, N., & Rashid, L. (2023). Kurdish-English Parallel Translation Corpus. KaiLab Research Data Repository. https://doi.org/10.5281/kurd-en-parallel.v1