First Libyan national large language model launched
LibiGPT targets Arabic and local dialect support, with public access chat app
#Libya #LLMs - Libya-based technology company Smart Co for Technology Projects and Artificial Intelligence has launched LibiGPT, the country’s first national large language model (LLM), developed by the company’s CEO and founder Dr Ali Othman Al-Baji. The developer has announced a publicly available online chat app, plus three proprietary AI models: LibiGPT-Base with 7 billion parameters, LibiGPT-Instruct with 13 billion parameters, and LibiGPT-Enterprise with 34 billion parameters. Trained on a multi-hundred-billion-token corpus with a significantly larger proportion of Arabic text than typical global models, LibiGPT understands Libyan colloquial Arabic (dārija), Modern Standard Arabic (MSA), English and French.
SO WHAT? - The launch addresses a critical gap in AI platforms available in Libya, with existing global models like OpenAI’s ChatGPT and Google Gemini lacking deep understanding of Libyan culture, dialect variations and local terminologies that characterise North African Arabic usage. Trained on a relatively large Arabic corpus including Modern Standard Arabic and selected North African dialects, the model is designed to converse in local Arabic. For Libyan enterprises, government institutions and educational organisations, the platform offers customisation capabilities aligned with national requirements, history and culture.
Here are some key points about the new Libyan AI model:
Libya’s first national large language model, LibiGPT, was launched at a ceremony in Tripoli in late October, attended by Libyan Minister of Economy and Trade Mohamed Al-Hwej, Undersecretary of the Ministry of Education Masouda Al-Aswad, and Central Tripoli Mayor Fadel Bouargoub.
Dr Ali Othman Al-Baji, founder and CEO of Libyan technology company Smart Co for Technology Projects and Artificial Intelligence, developed three versions of the LibiGPT model:
LibiGPT-Base 7B - for lightweight applications
LibiGPT-Instruct 13B - for powerful instruction-following
LibiGPT-Enterprise 34B - for enterprise or sovereign-cloud deployments
LibiGPT was trained on a multi-hundred-billion-token corpus with a significantly larger proportion of Arabic text than standard multilingual large language models. The Arabic text included Modern Standard Arabic (MSA) and selected dialects from the Gulf Cooperation Council countries, the Levant and North Africa.
Training data was compiled from public and open-licensed Arabic datasets, MSA corpora, academic and journalistic text, Arabic Wikipedia, and high-quality Creative Commons-licensed resources.
The training data also includes curated domain-specific corpora covering technical, legal, financial, regulatory and governmental documents alongside regional editorial sources and industry-specific content from the Middle East and North Africa region.
Development employed a custom Arabic optimisation pipeline including orthographic normalisation, diacritic handling, dialect filtering, improved Arabic tokenisation, deduplication and quality scoring. High-signal filtering was also applied using perplexity-based ranking, classifier-based filtering and multi-stage toxicity and bias filtering.
The developer also created high-quality synthetic Arabic data to improve dialect robustness, long-instruction handling, complex reasoning in Arabic, English-to-Arabic translation, and code-switching.
LibiGPT provides translation between Arabic, English and French through interactive AI with high linguistic accuracy that takes local cultural context into account. All data is stored locally and protected by security standards intended to address data sovereignty concerns.
The developer’s roadmap includes larger context windows exceeding 200,000 tokens, Arabic-domain expert models for the legal, finance, healthcare and government sectors, enhanced dialect understanding across the region, a formal technical report covering dataset construction and training methodology, and enterprise-grade retrieval-augmented generation systems optimised for Arabic.
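For readers unfamiliar with the Arabic optimisation steps listed above (orthographic normalisation, diacritic handling and deduplication), the Python sketch below shows what such steps can look like in their simplest form. It is illustrative only: the developer has not published its pipeline, and the function names, the choice of characters to normalise and the use of exact-match hashing are all assumptions made for this example.

```python
import hashlib
import re
import unicodedata

# Assumption: treat the common Arabic short-vowel marks (tashkeel,
# U+064B to U+0652) and the dagger alef (U+0670) as removable diacritics.
DIACRITICS = re.compile("[\u064B-\u0652\u0670]")
TATWEEL = "\u0640"  # elongation character used for justification

# Assumption: fold alef with hamza above/below and alef with madda
# into a bare alef (U+0627).
ALEF_VARIANTS = str.maketrans({
    "\u0623": "\u0627",
    "\u0625": "\u0627",
    "\u0622": "\u0627",
})


def normalise_arabic(text: str) -> str:
    """Hypothetical normaliser: a small subset of orthographic clean-up."""
    text = unicodedata.normalize("NFKC", text)  # fold presentation forms
    text = DIACRITICS.sub("", text)             # diacritic handling
    text = text.replace(TATWEEL, "")            # drop elongation marks
    text = text.translate(ALEF_VARIANTS)        # unify alef spellings
    return " ".join(text.split())               # collapse whitespace


def deduplicate(docs: list[str]) -> list[str]:
    """Keep the first document for each distinct normalised form."""
    seen: set[str] = set()
    unique: list[str] = []
    for doc in docs:
        key = hashlib.sha256(normalise_arabic(doc).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique
```

Exact-match hashing only catches documents that become identical after normalisation. Production pipelines typically add near-duplicate detection and the model-based quality, perplexity and toxicity filters mentioned above, none of which are shown here.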
ZOOM OUT - The past two years have seen an increase in the number of national AI models developed for the Maghreb. A lack of language models capable of handling local languages not only limits AI usage by local populations, but also limits the extent to which governments, national service providers and large organisations can use AI to provide better public services. Now, national models conversant with North African dialects and cultural references are being developed by the government, commercial and academic sectors.
[Written and edited with the assistance of AI]
LINKS
LibiGPT chat (website)
Read more about Arabic language LLM development for the Maghreb:
AtlasIA releases smarter, faster Moroccan darija AI models (Middle East AI News)
Atlas-Chat 9B demo goes live (Middle East AI News)
MBZUAI-led research team builds Moroccan AI models (Middle East AI News)
Algerian AI researchers crowdsource local language data (Middle East AI News)


