Abu Dhabi-based LibrAI creates LLM leaderboard for AI safety
MBZUAI-supported AI Lab aims to pioneer AI safety and governance
#UAE #safety - Abu Dhabi-based AI lab LibrAI has released the Libra-Leaderboard, a new platform that provides comprehensive safety evaluation for large language models (LLMs), developed in collaboration with Mohamed bin Zayed University of Artificial Intelligence (MBZUAI). The leaderboard, designed to balance LLM performance with safety, benchmarks 26 mainstream models (such as Claude, ChatGPT, Gemini and Qwen) using 57 datasets, including 10 custom adversarial datasets, backed by the open-source Libra-Eval framework. The initiative aims to address critical safety gaps in AI development and support responsible innovation.
SO WHAT? - Most current LLM leaderboards and evaluation frameworks focus squarely on dimensions of model performance, such as knowledge, reasoning, mathematics and language capabilities. Far fewer researchers are developing frameworks for evaluating AI safety. Large language models are trained on huge datasets, which increases the potential for critical safety issues: LLMs can inadvertently propagate biases, amplify misinformation, or mishandle sensitive topics, with real-world consequences. The Libra-Leaderboard's developers describe it as the first systematic approach to integrating safety and performance in LLM evaluations.
Here are some more key points about the Libra-Leaderboard:
The Libra-Leaderboard evaluates 26 large language models (LLMs) from 14 organisations, revealing safety challenges even in leading models. The evaluation framework and leaderboard were developed by Abu Dhabi-based AI lab LibrAI with the support of Mohamed bin Zayed University of Artificial Intelligence (MBZUAI).
The framework includes 57 datasets, more than 40 of which have been released since 2023, including 10 custom adversarial sets.
LibrAI's Interactive Safety Arena provides a space for adversarial prompt testing, tutoring, and feedback collection via a user-friendly chat interface. User feedback directly informs model scores.
Libra-Eval is a comprehensive Python library available via GitHub, supporting reproducible and scalable safety assessments. The repository hosts the datasets and evaluation code for the Libra-Leaderboard.
The Libra-Leaderboard's balance-encouraging scoring system incentivises the joint optimisation of capability and safety. Unlike traditional approaches that simply average performance and safety metrics, the leaderboard uses a distance-to-optimal-score method to calculate overall rankings (see the sketch after this list).
The project provides a fully reproducible evaluation strategy, with regular updates and dynamic datasets to ensure evaluations stay relevant and contamination-free.
Benchmark categories include bias, misinformation, and over-sensitivity to benign prompts.
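To illustrate the idea behind a distance-to-optimal score, here is a minimal Python sketch. It assumes capability and safety scores are both normalised to [0, 1]; the function name and the normalisation are illustrative assumptions, and the precise formula used by the Libra-Leaderboard is defined in the research paper.

```python
import math

def distance_to_optimal_score(capability: float, safety: float) -> float:
    """Combine a capability score and a safety score (both in [0, 1])
    by measuring how far the model sits from the ideal point (1, 1).

    Illustrative sketch only: the exact formula used by the
    Libra-Leaderboard is defined in the research paper and may differ.
    """
    # Euclidean distance from the model's (capability, safety) point
    # to the optimal point (1, 1); the largest possible distance is sqrt(2).
    distance = math.sqrt((1 - capability) ** 2 + (1 - safety) ** 2)
    # Normalise so a perfect model scores 1.0 and the worst scores 0.0.
    return 1 - distance / math.sqrt(2)

# A simple mean would rank these two models equally (both average 0.65);
# the distance-based score penalises the lopsided one.
print(distance_to_optimal_score(0.95, 0.35))  # capable but unsafe -> ~0.54
print(distance_to_optimal_score(0.65, 0.65))  # balanced -> 0.65
```

The effect is that a model cannot climb the rankings on capability alone: the further it sits from the ideal point on either axis, the lower its overall score.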
ZOOM OUT - Already a hub for the development of large language models, small language models (SLMs) and domain-specific models, the UAE has also produced a variety of AI evaluation frameworks over the past year, with UAE-based researchers leading a growing number of LLM evaluation projects. In 2024, UAE-based labs led projects to develop specialised evaluation frameworks and leaderboards for Arabic language models, clinical models and telecom models.
LINKS
Libra-Leaderboard (website)
Libra-Leaderboard research paper (arXiv)
LibrA-Eval Library (GitHub)
Safety Arena (website)
Read more about LLM leaderboard and evaluation projects:
Inception & MBZUAI launch new Arabic LLM leaderboard (Middle East AI News)
Comprehensive multimodal Arabic AI benchmark (Middle East AI News)
New telecom LLMs leaderboard project (Middle East AI News)
M42 delivers framework for evaluating clinical LLMs (Middle East AI News)
Arabic LLM index launched at GAIN (Middle East AI News)
New Hugging Face Open Arabic LLM Leaderboard (Middle East AI News)