King Saud University Corpus of Classical Arabic (KSUCCA) is a pioneering 50-million-token annotated corpus of Classical Arabic texts dating from the pre-Islamic era to the fourth Hijri century (equivalent to the period from the seventh to the early eleventh century CE), the period of pure Classical Arabic. The corpus is primarily intended for studying the distributional lexical semantics of the words of the Quran. However, it can also be used for other research purposes, such as:
• Arabic linguistics, including lexical, morphological, syntactic, semantic, and pragmatic research.
• Arabic computational linguistics, including lexical, morphological, syntactic, semantic, and pragmatic research and their various applications.
• Arabic language teaching for both Arabs and non-Arabs.
• Artificial intelligence.
• Natural language processing.
• Information retrieval.
• Question answering.
• Machine translation.
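
As an illustration of what a distributional study of word meaning starts from, the following is a minimal sketch that counts which words occur near each other. It assumes the text has already been read into a list of whitespace-separated tokens; the function name `cooccurrence_counts`, the `window` parameter, and the sample sentence are illustrative choices and are not part of KSUCCA or its tooling.

```python
from collections import Counter, defaultdict

def cooccurrence_counts(tokens, window=2):
    """Count how often each word occurs within `window` tokens of another word."""
    counts = defaultdict(Counter)
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                counts[target][tokens[j]] += 1
    return counts

# Made-up sample sentence, split on whitespace (not taken from the corpus).
sample = "قال الرجل قولا حسنا ثم قال الرجل خيرا".split()
counts = cooccurrence_counts(sample, window=1)

# Print the immediate neighbours of one word with their counts.
for neighbour, n in counts["قال"].most_common():
    print(neighbour, n)
```

Words that share similar neighbour counts across a large corpus are candidates for having related meanings, which is the basic idea behind distributional lexical semantics.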

Features

  • An electronic corpus: allowing faster and more accurate investigation of written Arabic.
  • A synchronic corpus: including Arabic texts from the pre-Islamic era to the fourth Hijri century (equivalent to the period from the seventh to the early eleventh century CE), the period of pure Classical Arabic.
  • A general corpus: covering a wide range of genres making it suitable for various research subjects.
  • A representative corpus: it can be used as the basis for generalizations concerning Classical Arabic.
  • A balanced corpus: the number of text samples taken from each genre is proportional to that genre's size.
  • A monolingual corpus: containing only written Classical Arabic text.
  • An unvowelized corpus: only the words of the holy Quran are vowelized.
  • A raw corpus: containing no tagging, lemmatization, or any other type of annotation, just plain text.
  • An automatically annotated version of the corpus with lemma, stem, POS tag, gender and number annotations is also available.
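
Because only the Quran words are vowelized, comparing them with the rest of the plain text usually means removing the vowel marks first. The following is a minimal sketch using Python's standard `unicodedata` module; the function name `strip_diacritics` is an illustrative choice, and treating every nonspacing combining mark (Unicode category `Mn`) as a diacritic is an assumption that may not suit every study.

```python
import unicodedata

def strip_diacritics(text):
    """Remove nonspacing combining marks (Unicode category Mn), which
    include the Arabic short-vowel signs, sukun, and shadda."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")

# A vowelized word written with explicit code points:
# ba + kasra, sin + sukun, mim + kasra
vowelized = "\u0628\u0650\u0633\u0652\u0645\u0650"
plain = strip_diacritics(vowelized)
print(plain)
```

Letters that carry a hamza as part of a single precomposed code point (for example U+0623) are left unchanged, since they are classified as letters rather than combining marks.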

License

Creative Commons Attribution Non-Commercial License V2.0

Additional Project Details

Operating Systems

Android, Apple iPhone, Linux, Mac, Windows

Registered

2019-11-20