MENAJOBS.ai - The Middle East's Elite AI Talent Matrix
Pantheion AI

Senior Arabic NLP Research Scientist

Abu Dhabi, UAE | Full-Time | On-site | Posted Apr 6, 2026
Arabic NLP | Python | PyTorch | HuggingFace Transformers | Arabic linguistic structure

Role overview

Pantheion AI is training Pantheion-1 — the world's first sovereign Arabic/English frontier large language model — on classified GCC data, with dialect awareness and Constitutional AI governance designed for the MENA cultural and regulatory context. As a Senior Arabic NLP Research Scientist, you will be at the core of this effort: researching, designing, and implementing the language model capabilities that make Pantheion-1 genuinely frontier-grade in Arabic. This is a rare opportunity to do meaningful research on an under-served language with real-world sovereign impact.

What you will do

  • Lead Arabic language capability research for Pantheion-1 — developing novel approaches to dialect-aware language modelling across Gulf, Levantine, Egyptian, and Modern Standard Arabic
  • Design and execute pre-training data curation pipelines for Arabic: sourcing, filtering, cleaning, and annotating sovereign GCC datasets at billion-token scale
  • Research and implement Arabic-specific tokenization, morphological processing, and vocabulary design that improves model performance across Arabic dialects
  • Develop Arabic language benchmarks for model evaluation — both adapting existing benchmarks and creating novel evaluations for GCC-specific language tasks
  • Contribute to the Constitutional AI adaptation for Arabic cultural, religious, and GCC regulatory contexts — ensuring model outputs align with Islamic ethical guidelines and regional compliance requirements
  • Collaborate with the infrastructure team to optimize Arabic language training efficiency on Pantheion's sovereign compute stack
  • Publish research findings at top NLP venues (ACL, EMNLP, NAACL) where appropriate, building Pantheion AI's international research profile
  • Engage with the Arabic developer community and academic partners to build Pantheion AI's research ecosystem

Skills profile

Required skills

Arabic NLP | Python | PyTorch | HuggingFace Transformers | Arabic linguistic structure

Required qualifications

Domain knowledge

  • PhD in computational linguistics, NLP, machine learning, or a closely related field (or equivalent research experience)
  • 3+ years of applied research experience in Arabic NLP — demonstrated by publications, open-source contributions, or production systems
  • Deep expertise in Arabic linguistic structure — morphology, dialect variation, code-switching, and Arabic-specific NLP challenges
  • Hands-on experience with large language model training, fine-tuning, and evaluation at meaningful scale
  • Strong programming skills in Python; proficiency with PyTorch, HuggingFace Transformers, and standard NLP tooling
  • Native or near-native Arabic proficiency, including deep familiarity with Gulf Arabic dialects

Preferred qualifications

Bonus domain experience

  • Publication record at top NLP venues (ACL, EMNLP, NAACL, EACL) in Arabic NLP or multilingual AI
  • Experience with Constitutional AI, RLHF, or AI alignment methodology applied to non-English language models
  • Familiarity with low-resource language modelling, cross-lingual transfer, or code-switching research
  • Prior work experience in MENA academic or government AI research institutions
  • Contributions to Arabic NLP open-source projects or datasets