MBZUAI releases Arabic 'inclusive' multimodal AI model
New Arabic-English AI model excels in language and vision tasks
#UAE #AI - Abu Dhabi’s Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) has released AIN, the first comprehensive bilingual Arabic-English inclusive large multimodal model (LMM). The 7-billion-parameter model is designed to excel at visual and contextual understanding across diverse domains. It has outperformed significantly larger AI systems, such as OpenAI’s GPT-4o and Google’s Gemini 1.5 Pro, on key benchmarks, achieving 3.4% higher accuracy across CAMEL-Bench’s 38 sub-domains. AIN’s capabilities extend to OCR, remote sensing, cultural insights, and medical imaging, marking a major milestone in AI accessibility for Arabic speakers. The model has also been integrated into WhatsApp and Telegram, expanding its real-world usability.
SO WHAT? - AIN follows the release of CAMEL-Bench, a comprehensive benchmark for Arabic large multimodal models (LMMs) that evaluates performance across diverse tasks such as OCR, medical imaging, and remote sensing. CAMEL-Bench comprises over 29k questions spanning 8 domains and 38 sub-domains, and has revealed significant performance gaps in existing AI models. No surprise, then, that the new AIN model performs well on CAMEL-Bench! According to MBZUAI, AIN sets a new benchmark for language and vision tasks, ensuring that Arabic users can benefit from cutting-edge generative AI via an Arabic-centric model.
Some key points about the AIN LMM announcement:
Abu Dhabi’s Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) has unveiled AIN, the first comprehensive bilingual Arabic-centric large multimodal model (LMM).
The 7-billion-parameter model was trained on 3.6 million high-quality Arabic-English data samples, 35% of which is authentic Arabic content.
The AIN LMM was developed by a team of researchers from MBZUAI, Aalto University (Finland), Amazon, the Australian National University, and Linköping University (Sweden).
AIN achieves competitive performance against much larger models such as GPT-4o and Gemini 1.5 Pro, surpassing them on key benchmarks by 3.4%.
The new LMM excels in visual and contextual understanding, including OCR, cultural insights, agricultural tasks, medical imaging, and remote sensing.
AIN demonstrates superior bilingual performance, ranking among the best in Arabic language processing while maintaining strong English capabilities.
The chat model is integrated into WhatsApp and Telegram, enhancing accessibility and usability for Arabic-speaking users.
The LMM has also been validated by human evaluation, with 76% of users preferring AIN’s responses over those of other leading AI models.
Additional model weights will soon be released as open source (see the usage sketch below).
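For developers curious to try AIN programmatically, the snippet below is a minimal sketch of how such a bilingual vision-language model could be queried with the Hugging Face transformers library. It assumes a Qwen2-VL-style chat interface (the architecture the research paper reports AIN builds on) and a hypothetical repo ID of "MBZUAI-Oryx/AIN"; check the official links below for the confirmed model ID and usage instructions.

```python
# Minimal sketch: querying a bilingual vision-language model such as AIN
# via Hugging Face transformers. The repo ID below is an ASSUMPTION;
# confirm it on the AIN landing page before use.
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model_id = "MBZUAI-Oryx/AIN"  # hypothetical Hugging Face repo ID

processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2VLForConditionalGeneration.from_pretrained(model_id, device_map="auto")

# One visual question: an image plus an Arabic prompt.
image = Image.open("sign.jpg")
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        # "What is written on this sign?"
        {"type": "text", "text": "ما المكتوب على هذه اللافتة؟"},
    ],
}]

# Build the chat prompt, bundle it with the image, and generate a reply.
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)

# Strip the prompt tokens and decode only the newly generated answer.
answer = processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(answer)
```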
ZOOM OUT - The AIN large multimodal model has been released via MBZUAI’s Oryx Library, an open-source multimodal model library launched by the university in 2023. The initiative focuses on projects and demos for large vision-language models developed at MBZUAI, with the goal of advancing LMM research for multimodal and domain-specific dialogues. The library currently hosts 20 models, including CAMEL-Bench, the recently announced comprehensive benchmark for Arabic large multimodal models.
LINKS
AIN LMM landing page (MBZUAI)
AIN web chat (Hugging Face)
AIN WhatsApp chat (WhatsApp)
AIN Telegram chat (Telegram)
AIN LMM research paper (arXiv)
AIN LMM code (GitHub)
Read more about MBZUAI’s LLM research:
New benchmark challenges inclusivity of AI models (Middle East AI News)
Inception & MBZUAI launch new Arabic LLM leaderboard (Middle East AI News)
MBZUAI open-sources NANDA LLM (Middle East AI News)
Comprehensive multimodal Arabic AI benchmark (Middle East AI News)
Atlas-Chat 9B demo goes live (Middle East AI News)
Powerful open-source K2-65B LLM costs 35% less to train (Middle East AI News)
New framework for open-source LLMs (Middle East AI News)