Kurdish Morphological Analysis Dataset

Published: April 20, 2023
Size: 85 MB
License: CC BY 4.0
Domain: Morphological Analysis
Languages:
Kurdish (Sorani), Kurdish (Kurmanji)
Formats:
CSV, XML, JSON
Contributing Organizations:

Dataset Description

This dataset contains 100,000 Kurdish words with detailed morphological breakdowns, part-of-speech (POS) tags, and inflectional information for both the Sorani and Kurmanji dialects.

Data Structure

The dataset includes CSV files with word forms and analyses, an XML annotation schema, and documentation for the morphological features.
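
As a rough illustration of how the CSV files might be read, the sketch below parses word forms and their analyses into simple records. The column names (`word`, `dialect`, `pos`, `segmentation`, `features`), the `+` morpheme separator, the `Key=Value|Key=Value` feature notation, and the placeholder rows are all assumptions made for this example; none of them are documented on this page, so check the dataset's own documentation for the actual layout.

```python
from __future__ import annotations

import csv
import io
from collections import Counter
from dataclasses import dataclass

# Hypothetical layout: these headers and rows are placeholders, not real
# entries from the dataset.
SAMPLE_CSV = """word,dialect,pos,segmentation,features
FORM_A,sorani,NOUN,STEM_A+SUF_1,Number=Plur
FORM_B,kurmanji,VERB,PRE_1+STEM_B+SUF_2,Tense=Past|Person=3
FORM_C,sorani,NOUN,STEM_C,Number=Sing
"""


@dataclass
class Analysis:
    word: str
    dialect: str
    pos: str
    morphemes: list[str]
    features: dict[str, str]


def parse_features(raw: str) -> dict[str, str]:
    """Split 'Key=Value|Key=Value' into a dict; an empty string gives {}."""
    if not raw:
        return {}
    return dict(pair.split("=", 1) for pair in raw.split("|"))


def load_analyses(handle) -> list[Analysis]:
    """Read one Analysis per CSV row from an open text handle."""
    reader = csv.DictReader(handle)
    return [
        Analysis(
            word=row["word"],
            dialect=row["dialect"],
            pos=row["pos"],
            morphemes=row["segmentation"].split("+"),  # assumed '+' separator
            features=parse_features(row["features"]),
        )
        for row in reader
    ]


analyses = load_analyses(io.StringIO(SAMPLE_CSV))

# Count POS tags per dialect, e.g. to compare Sorani and Kurmanji coverage.
pos_by_dialect = Counter((a.dialect, a.pos) for a in analyses)
```

To run this against a downloaded file, pass a handle opened with `open(path, newline="", encoding="utf-8")` to `load_analyses` in place of the in-memory sample, and adjust the column names to match the real headers.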

Primary Publication

Neural Machine Translation for Kurdish-English Language Pairs

Dr. Karim Mohammad, Dr. Zainab Hussein (2023)

We present a transformer-based neural machine translation system specifically designed for Kurdish-English translation, incorporating morphological awareness and handling dialectal variations across …

Related Publications

Kurdish Spell Checking Using Morphological Analysis

Dr. Karim Mohammad, Dr. John Doe (2023)

A novel approach to Kurdish spell checking that leverages morphological decomposition and statistical language modeling to handle the rich …

How to Cite

Salim, N., Rashid, L., Dilshad, S., & Tahir, N. (2023). Kurdish Morphological Analysis Dataset. KaiLab Research Data Repository. https://doi.org/10.5281/kurd-morphology.v1

Dataset Information

Total Size: 85 MB
Languages: 2
Formats: 3
Related Papers: 2

Data Access

Licensed under CC BY 4.0