Egyptian Arabic is among the most widely spoken varieties of Arabic. That said, it’s not always easy to find Egyptian language datasets to train your models.
That’s why we’ve done the hard bit for you. We’ve searched high and low here at Twine to find the best Egyptian Language datasets.
Are you ready?
Let’s dive in.
Here are our top picks for Egyptian Language datasets:
Egyptian Arabic Segmentation Dataset
This dataset contains 350 tweets with more than 8,000 words (including 3,000 unique words) written in the Egyptian dialect. The tweets are rich in dialectal content and cover most Egyptian phonological, morphological, and syntactic phenomena. They also include Twitter-specific elements such as #hashtags, @mentions, emoticons, and URLs.
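If you plan to work with tweets like these, it often helps to separate the Twitter-specific tokens from the ordinary words before training. Here is a minimal sketch of how that might look in Python. The regular expressions and the function names `tag_tokens` and `unique_words` are our own illustrative assumptions; they are not part of the dataset or its annotation scheme.

```python
import re

# Illustrative patterns only (an assumption, not the dataset's own rules).
# Order matters: URLs, mentions, and hashtags are tried before plain words.
TOKEN_PATTERN = re.compile(
    r"(?P<url>https?://\S+)"
    r"|(?P<mention>@\w+)"
    r"|(?P<hashtag>#\w+)"
    r"|(?P<emoticon>[:;=][-']?[)(DPp])"
    r"|(?P<word>\w+)"
)


def tag_tokens(tweet):
    """Return (token, kind) pairs for every recognised token in a tweet."""
    return [(m.group(), m.lastgroup) for m in TOKEN_PATTERN.finditer(tweet)]


def unique_words(tweets):
    """Collect the distinct plain words found across a list of tweets."""
    return {
        token
        for tweet in tweets
        for token, kind in tag_tokens(tweet)
        if kind == "word"
    }


if __name__ == "__main__":
    sample = "@user ازيك #مصر :) https://t.co/x"
    for token, kind in tag_tokens(sample):
        print(kind, token)
```

Python's `\w` matches Arabic letters in Unicode strings, so the same pattern works for dialectal text. It does not match combining diacritics, so vowelled text would need a broader word pattern.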
Egyptian Hieroglyphics Datasets
This dataset is intended for detecting and translating hieroglyphs using a real-time SSD (Single Shot Detector) object detection algorithm, with the aim of helping tourists unveil the mysteries of Ancient Egypt. Contains text files.
Egyptian Arabic Conversational Speech Corpus
This open-source dataset consists of 5.5 hours of transcribed Egyptian Arabic conversational speech on certain topics, comprising nine conversations between two pairs of speakers.
BOLT Egyptian Arabic-English Word Alignment Dataset
The BOLT Egyptian Arabic-English Word Alignment Dataset was developed by the Linguistic Data Consortium (LDC) and consists of 349,414 words of Egyptian Arabic and English parallel text enhanced with linguistic tags to indicate word relations. Contains text files.
Wrapping up
To conclude, here are our top picks for the best Egyptian language datasets for your projects:
- Egyptian Arabic Segmentation Dataset
- Egyptian Hieroglyphics Datasets
- Egyptian Arabic Conversational Speech Corpus
- BOLT Egyptian Arabic-English Word Alignment Dataset
We hope that this list has either helped you find a dataset for your project or helped you realize the myriad of options available.
Please let us know if there are any datasets you would like us to add to the list.
If you would like to learn more about how we could help build a custom dataset for your project, don’t hesitate to contact us!
Let us help you do the math – check our AI dataset project calculator.