Egyptian open-source LLM Horus punches above its weight
TokenAI’s Horus 1.0 outperforms Llama 3.1 8B despite having half the parameters
#Egypt #LLMs – Egyptian AI startup TokenAI, founded by developer Assem Sabry, has released Horus 1.0-4B, a fully open-source large language model developed in Egypt. Although the country graduates some 60,000 technology students per year and employs half a million people in its ICT sector, new Egyptian-built AI models are a rare sight. Released this month, the 4 billion parameter model is multilingual, Arabic-optimised, and available in seven variants to suit different hardware and deployment environments. On the MMLU LLM benchmark, Horus 1.0-4B scores 88 percent, outperforming Qwen 3.5-4B at 73 percent, Llama 3.1-8B at 69 percent, and Gemma-2-9B at 71 percent, the latter two despite their significantly larger parameter counts.
SO WHAT? – Egypt has long been a key source of computer science graduates for the Arab world and is a historic centre for Arabic software development. In recent years we've seen increasing numbers of Egyptian AI startups achieve success, including Intella, Synapse Analytics, Trendak and WideBot. However, we have yet to see a significant increase in open-source AI development. In February the government announced its first national model, Karnak, which was open-sourced as a 41 billion parameter model. That makes Horus only the second Egyptian open-source model released this year, and perhaps only the second in the past three years. Unlike the 41B Karnak, Horus 4B's small footprint makes it easy for developers and researchers to use.
KEY POINTS:
Horus 1.0-4B was released this month by Cairo-based AI startup TokenAI under an MIT licence; the company describes it as the first fully open-source LLM ‘built from scratch’ in Egypt. It has been trained on trillions of tokens and designed for multilingual use, with particular optimisation for the Arabic language and cultural contexts.
The model scores 88% on the MMLU benchmark (which tests multitask knowledge across 57 academic subjects) compared with 73% for Qwen 3.5-4B, 69% for Llama 3.1-8B, and 71% for Gemma-2-9B. The Llama and Gemma models are substantially larger than Horus.
On Arabic-specific benchmarks, Horus performs competitively. It scores 67% on ArabicBench, ahead of Qwen 3.5-4B at 65%, Llama 3.1-8B at 40%, and Gemma-2-9B at 60%. On ERQA, an Arabic entity-rich question answering benchmark, Horus scores 67% against Qwen’s 60%.
Arabic mathematical reasoning remains a relative weakness. On AraMath, Horus scores 33% against Qwen's 40% and Gemma's 35%. On GSM8K grade school maths, it scores 67% compared with 88% for Gemma-2-9B and 84% for Llama 3.1-8B (an area the developer has flagged for future improvement).
Horus is currently available in seven variants, ranging from a full 16-bit version at approximately 8GB to a Q4_K_M (4-bit GGUF quantisation) compressed variant at 2.3GB. This range allows deployment across GPU servers, personal computers, and edge devices — making it accessible to researchers and developers with limited compute budgets.
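The size figures above follow directly from the parameter count and the bits stored per weight. A minimal back-of-the-envelope sketch (the ~4.6 effective bits per weight for Q4_K_M is our assumption, inferred from the quoted 2.3GB file size, since Q4_K_M mixes 4-bit and 6-bit quantised blocks):

```python
# Rough model-file size estimates for a 4-billion-parameter model.
# Assumption: Q4_K_M averages roughly 4.6 bits per weight (it mixes
# 4-bit and 6-bit blocks), inferred from the quoted 2.3GB figure.

PARAMS = 4_000_000_000


def size_gb(bits_per_param: float) -> float:
    """Approximate weights-only file size in GB at a given precision."""
    return PARAMS * bits_per_param / 8 / 1e9


fp16_gb = size_gb(16)    # full 16-bit variant -> ~8.0 GB
q4km_gb = size_gb(4.6)   # Q4_K_M quantised variant -> ~2.3 GB

print(f"FP16:   ~{fp16_gb:.1f} GB")
print(f"Q4_K_M: ~{q4km_gb:.1f} GB")
```

The same arithmetic explains why 4-bit quantisation roughly quarters the footprint versus 16-bit, which is what makes laptop and edge deployment practical.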
The model supports chain-of-thought reasoning, instruction following, and tool use, and performs strongly on terminal and command-line benchmarks, scoring 84% on Terminal Bench. It is available via Hugging Face and through TokenAI’s neuralnode Python framework.
A text-to-speech model called Replica is planned for imminent release, integrated within the neuralnode framework. It will offer 20 voices across 10 languages including Arabic, extending TokenAI’s ambition beyond text generation into voice-enabled AI applications.
TokenAI’s vision is for Horus to become the foundation of an Egyptian open-source AI infrastructure, with future models in the Horus family planned to expand capabilities, while maintaining cultural alignment and Arabic language focus.
ZOOM OUT – Horus arrives at a moment when Egypt is beginning to build serious AI momentum, led by a growing number of successful AI startups. In February 2026, Egypt's IT industry development authority ITIDA unveiled Karnak, the country's first national LLM, positioned as the highest-ranking Arabic model in the 30 to 80 billion parameter range. Karnak already powers several applications, including SIA, an Arabic-language AI tutor, a legal assistant for citizens and SMEs, and healthcare AI tools for early disease detection. Long before the arrival of Karnak and Horus, Abu Dhabi's Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) released Nile-Chat, a pair of open-source models built specifically for the Egyptian Arabic dialect. Now, with Horus, Karnak and Nile-Chat available as open source, more Egyptian developers may be inspired to build AI models of their own.
[Written and edited with the assistance of AI]
LINKS
Horus 1.0 (Token AI)
Horus 1.0 4B model data (Hugging Face)
Horus 1.0 4B model code (GitHub)
GGUF quantized versions of Horus-1.0-4B (Hugging Face)
Read more about Egyptian AI startups:
Egypt gets homegrown robotics manufacturer (Middle East AI News)
Arabic speech pioneer secures $12.5m Series A funding (Middle East AI News)
Cairo-based Beltone launches AI subsidiary Robin (Middle East AI News)
Jumia Egypt launches Arabic voice ecommerce (Middle East AI News)


