Tarjama& launches Pronoia LLM for business Arabic translation
Almost 8 years in the making, Tarjama& delivers Arabic AI model for enterprise
#UAE #LLMs - Language technology and services provider Tarjama& has announced Pronoia, a family of small language models (SLMs) for enterprise Arabic language translation tasks. Trained on data curated or created by Tarjama&, two proprietary fine-tuned bilingual Arabic-English SLMs have been developed: Pronoia 7B and Pronoia 14B. With a focus on business, legal and medical Arabic language content, the artificial intelligence models provide translation capabilities with fluency and context, making them ideal for use by large business and government organisations.
SO WHAT? - More and more Arabic language AI models are being developed in the Arab world, but many of them seem to try to be all things to all people. A common complaint among users of LLMs in the Arab world is that they often fail to produce content that is both readily usable for business and correct in the Arabic language. However, building and training Arabic models that are versatile enough for large enterprises to rely on requires significant volumes of quality data that cannot be found on the Internet alone. Tarjama&'s focus on building a business-level Arabic AI model, together with its wealth of Arabic language data, could make all the difference to the new Pronoia models and their value to large organisations.
Here are the key points of the Pronoia Arabic AI model launch:
Tarjama& announced Pronoia, a family of small language models (SLMs) for enterprise Arabic language translation needs. The announcement was made by Tarjama& CEO Nour Al Hassan during GITEX GLOBAL in Dubai.
Tarjama& is a UAE and Saudi Arabia-based language technology and services provider that designs intelligent tools, products, and services for global businesses. The company released its first machine translation model in 2017 and launched its first AI product online in 2022: Tarjama Translate.
The company has announced two new proprietary fine-tuned small language models: Pronoia 7B and Pronoia 14B. The small size of the models makes them cost-efficient to maintain and run, giving enterprise customers the option to run them in the cloud via Tarjama&’s APIs, or on-premises.
Built on open-source architecture with Tarjama&’s own RAG (Retrieval-Augmented Generation), the two new models are the culmination of the company’s nearly 8 years of work on Arabic language AI models and datasets.
Pronoia 7B and Pronoia 14B have been trained on a combination of Tarjama&’s proprietary data, curated data sets developed by the company and a variety of open-source data. Fine-tuning was focused on business, legal and medical data.
According to Tarjama&, the two models are ideal for business translation and summarisation tasks, such as reviewing, translating and summarising contracts.
The two Pronoia models will soon become commercially available to both customers and product developers.
IMO - Many developers of AI language models, and of the apps that use them, find it hard to stabilise the content their products generate, and there is often a gap between getting the feedback they need from users and refining the product. Here, Tarjama& has a huge advantage, because it is its own beta tester. With content and Arabic-to-English and English-to-Arabic translation at the core of its business, Tarjama& has a network of over 35,000 linguists and dedicated AI teams, with a large team of senior editors, language experts and freelance annotators working with its AI language models daily. That shortens the feedback loop considerably, lends itself to continual improvements in quality, and provides data for future versions.

Updated on 20-Dec-24 to add Hugging Face Open Arabic Leaderboard benchmark results.