Falcon 3 LLM series gets first Arabic model
TII releases first Arabic Falcon AI model and new hybrid architecture model
#UAE #LLMs - Technology Innovation Institute (TII), the applied research arm of Abu Dhabi's Advanced Technology Research Council (ATRC), has launched two major additions to its Falcon artificial intelligence model family: Falcon Arabic and Falcon-H1. Announced at Make it in the Emirates this week, Falcon Arabic is presented by TII as the region's best-performing Arabic language AI model, with Falcon Arabic 7B setting a new benchmark for Arabic NLP (natural language processing). Meanwhile, Falcon-H1 is built on a novel hybrid architecture that outperforms comparable offerings from Meta's LLaMA and Alibaba's Qwen in the 30-70 billion parameter range. The new models expand the capabilities of the Falcon model ecosystem, whose models have been downloaded more than 55 million times globally.
SO WHAT? - The now highly acclaimed Technology Innovation Institute in Abu Dhabi released its first Falcon large language model in 2023, surprising a world that then assumed all cutting-edge AI models would be developed in North America or Europe. Since then, TII has developed and expanded the Falcon series, releasing a second full series of models in spring 2024 and the first Falcon 3 models in December 2024. However, until this week there were no Arabic language Falcon models. The release of Falcon Arabic 3 marks the first Arabic language addition to the Falcon AI model family.
Abu Dhabi-based Technology Innovation Institute (TII) has launched two major additions to its Falcon 3 large language model series: Falcon Arabic and Falcon-H1.
Falcon Arabic is built on the 7-billion-parameter Falcon 3-7B architecture and trained on high-quality native (non-translated) Arabic datasets spanning Modern Standard Arabic and regional dialects to capture the full linguistic diversity of the Arab world. The model was also trained on 600 gigatokens of Arabic, multilingual, and technical data.
According to Open Arabic LLM Leaderboard benchmark tests conducted by TII, Falcon Arabic outperforms all other regionally available Arabic language models, matching the performance of models up to 10 times its size.
TII also announced another significant new model this week. The new Falcon-H1 model introduces a hybrid architecture combining Transformers and Mamba (State Space Model) technologies to enable faster inference speeds and lower memory consumption while maintaining high performance across benchmarks.
Falcon-H1 comes in multiple sizes (34B, 7B, 3B, 1.5B, 1.5B-deep, and 500M parameters), offering developers flexibility to choose appropriate models for different deployment scenarios from edge computing to complex enterprise applications.
Each model in the Falcon-H1 family reportedly surpasses other models up to twice its size, setting new standards for performance-to-efficiency ratios in mathematics, reasoning, coding, long-context understanding, and multilingual tasks.
The Falcon-H1 model family supports 18 languages as standard, including Arabic, but can be scaled to more than 100 languages thanks to its multilingual tokenizer, which was trained across diverse language datasets.
The institute announced its first State Space Language Model (SSLM) last year, releasing Falcon Mamba 7B under an open-source licence.
Falcon-H1 models are available now for download via Hugging Face and FalconLLM.TII.ae, released under the TII Falcon License, which encourages responsible and ethical AI development.
Falcon Arabic is not currently available for download and may be released under a special licence (details to be confirmed).
Benchmark tests for Falcon Arabic 3 7B Instruct
PERFORMANCE* - Falcon Arabic is built on the 7-billion-parameter Falcon 3-7B architecture and trained on high-quality native (non-translated) Arabic datasets spanning Modern Standard Arabic and regional dialects to capture the full linguistic diversity of the Arab world. Despite its compact 7B parameter size, Falcon Arabic outperforms all existing Arabic LLMs in its class and even surpasses models up to four times larger, setting new benchmarks in Arabic MMLU, Exams, MadinahQA, and Aratrust evaluations. The Instruct-aligned version demonstrates exceptional capabilities in both following instructions and engaging in open-ended dialogue. Falcon Arabic raises the bar for Arabic conversational AI, combining fluency in Modern Standard Arabic with strong understanding of regional dialects.
* Note: all benchmark test data for Falcon Arabic provided by TII, since Falcon Arabic is not currently listed on the Open-Arabic-LLM-Leaderboard.
LINKS
Falcon Arabic 3 (Falcon website)
Falcon-H1 (Falcon website)
Falcon Arabic chat preview (Falcon website)
Falcon Arabic technical blog (Hugging Face) (Also GitHub)
Falcon Arabic models space (Hugging Face)
Falcon-H1 models space (Hugging Face)
Falcon-H1 chat preview (Falcon website)
Read more about Falcon large language models:
TII releases Falcon-Edge 1.58bit language models (Middle East AI News)
TII launches most powerful SLMs under 13B parameters (Middle East AI News)
TII launches Falcon's first SSLM (Middle East AI News)
TII debuts multimodal Falcon 2 Series (Middle East AI News)
Could Falcon become the Linux of AI? (Middle East AI News)
Can UAE-built Falcon rival global AI models? (Middle East AI News)