Introduction to Arabic Q/A

Research in the field of Q/A has made significant progress for languages such as English, Spanish, French and Italian. For the Arabic language, by contrast, there have been few attempts to build Q/A systems. This may be due to the particularities of the language (short vowels, absence of capital letters, complex morphology, etc.). The best-known Arabic Q/A systems are:

  • QARAB is a system that takes natural language questions expressed in the Arabic language and attempts to provide short answers. The system’s primary source of knowledge is a collection of Arabic newspaper text extracted from Al-Raya, a newspaper published in Qatar. QARAB uses shallow language understanding to process questions and it does not attempt to understand the content of the question at a deep, semantic level.
  • AQAS is knowledge-based and therefore extracts answers only from structured data, not from raw text (unstructured text written in natural language).
  • ArabiQA is an Arabic Q/A prototype based on the Java Information Retrieval System (JIRS) Passage Retrieval (PR) system and a Named Entity Recognition (NER) module. It embeds an Answer Extraction module dedicated to factoid questions. To implement this module, the authors developed an Arabic NER system and a set of manually built patterns for each type of question.
  • QASAL is a recent attempt to build an Arabic Q/A system that processes factoid questions (e.g. questions whose answers are named entities). Experiments on a test set of 50 questions showed that the system reached a precision of 67.65%, a recall of 91% and an F-measure of 72.85%.
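
For readers unfamiliar with the evaluation measures quoted above, the following is a minimal sketch of how precision, recall and F-measure are commonly computed for a Q/A system. The function name `precision_recall_f` and the counts in the usage lines are hypothetical and chosen only for illustration; they are not QASAL's data, and the sketch does not claim to reproduce how the QASAL figures were aggregated.

```python
def precision_recall_f(correct, returned, expected, beta=1.0):
    """Compute precision, recall and F-measure from raw counts.

    correct  -- number of returned answers that are correct
    returned -- total number of answers the system returned
    expected -- total number of correct answers that exist in the test set
    beta     -- weight of recall relative to precision (1.0 gives the usual F1)
    """
    precision = correct / returned if returned else 0.0
    recall = correct / expected if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    b2 = beta * beta
    # Weighted harmonic mean of precision and recall
    f_measure = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_measure


# Hypothetical counts, for illustration only (not taken from any cited system)
p, r, f = precision_recall_f(correct=30, returned=40, expected=50)
print(f"precision={p:.2%} recall={r:.2%} F-measure={f:.2%}")
```

When these measures are computed per question and then averaged, the averaged F-measure need not equal the harmonic mean of the averaged precision and recall, so the three reported figures should be read as separate summary statistics.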