Large-Scale Kurdish Text Corpus Creation and Analysis
Authors: Z. Omar, K. Hassan, A. Ali
Language Resources and Evaluation
Vol. 57(3)
pp. 891-920
August 12, 2023
A comprehensive methodology for creating and analyzing a 50-million-word Kurdish corpus that covers multiple domains and dialects, with automated quality assessment and linguistic annotation.
A large-scale corpus of modern Kurdish texts containing 50 million words from diverse sources including news articles, literature, academic papers, and web content. Includes linguistic annotations and …
Omar, Z., Hassan, K., & Ali, A. (2023). Large-Scale Kurdish Text Corpus Creation and Analysis. Language Resources and Evaluation, 57(3), 891-920.