M42 releases new versions of clinical LLM
M42 releases new Med42 clinical large language models for review and testing
#UAE #LLMs - M42, the tech-driven health care group of Abu Dhabi-based AI holding and investment group G42, has released new open-access versions of Med42, its clinically-aligned large language model (LLM). Nine months after the debut of the Med42 LLM, M42 has released Med42 v2 8B and Med42 v2 70B via the data science platform and open-source AI community Hugging Face. Built on Meta's LLaMA-3, the new models are instruction- and preference-tuned, both to expand access to medical knowledge and to recognise the human preferences of clinicians and other medical stakeholders.
SO WHAT? - Generative AI is expected to reshape the future of healthcare and is forecast to grow into a $21.7 billion global market by 2032, according to Precedence Research [1]. One of the most obvious, yet potentially most transformational, use cases is the creation of AI assistants for clinicians, healthcare administrators, insurers and patients. However, the healthcare industry has a very low tolerance for errors in data, medical information and, of course, advice. This makes building, training and fine-tuning models for use by the sector a challenging and resource-intensive process. The new Med42 LLMs from M42 demonstrate advances in performance, access to knowledge and ability to respond to users. However, M42 is also attempting to solve another problem with this release: how such models should be evaluated for the healthcare industry.
Here are some key details of the new Med42 release:
Two new versions of M42’s Med42 large language model have been released this week on Hugging Face: Med42 v2 8B and Med42 v2 70B. The models have been trained to provide high-quality answers to medical questions. The 70B model was first previewed by M42 during Abu Dhabi Global Healthcare Week in May.
The two models are built on Meta’s LLaMA-3, which is available as pretrained 8B and 70B models, and are the successors to the Med42 70B model released via Hugging Face in October last year. The original Med42 70B model was downloaded via the platform more than 8,200 times.
The models have been both instruction- and preference-tuned by M42 to expand access to medical knowledge and to meet the high standards of the healthcare industry.
Potential use cases for Med42 include medical question answering, patient record summarisation, aiding medical diagnosis and general health Q&A.
The new models have been released as open access to encourage developers, researchers, healthcare institutions and other stakeholders to download them for review, testing and experimentation.
M42 also plans to release data sets, a code repository and a research paper detailing M42’s new evaluation framework for clinical LLMs and its application to Med42-v2.
PERFORMANCE
In preliminary results based on Elo ratings for open-ended healthcare QA using Prometheus-2-as-a-judge [2], Med42 v2 70B outperforms Meta LLaMA-3 70B Instruct and GPT-4o.
According to M42, the Med42 v2 70B has a zero-shot accuracy score of 85.1 per cent and a maximum accuracy score of 87.3 per cent using specialised prompting on the USMLE (United States Medical Licensing Examination).
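For readers unfamiliar with Elo ratings, the sketch below shows in general terms how pairwise verdicts from a judge model can be turned into a ranking. It is a minimal illustration of the standard Elo update, not M42's evaluation code: the K-factor of 32, the starting rating of 1000, the function names and the placeholder model names are all assumptions made for this example.

```python
from typing import Dict, Iterable, Tuple

# One comparison: (model_a, model_b, score_a), where score_a is
# 1.0 if the judge preferred model_a, 0.0 if it preferred model_b,
# and 0.5 for a tie.
Comparison = Tuple[str, str, float]


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(
    ratings: Dict[str, float],
    comparisons: Iterable[Comparison],
    k: float = 32.0,         # assumed K-factor, not taken from the article
    initial: float = 1000.0,  # assumed starting rating for unseen models
) -> Dict[str, float]:
    """Return new ratings after applying each comparison in order."""
    updated = dict(ratings)  # leave the caller's dict untouched
    for model_a, model_b, score_a in comparisons:
        ra = updated.setdefault(model_a, initial)
        rb = updated.setdefault(model_b, initial)
        ea = expected_score(ra, rb)
        # The winner gains exactly what the loser gives up.
        updated[model_a] = ra + k * (score_a - ea)
        updated[model_b] = rb - k * (score_a - ea)
    return updated


if __name__ == "__main__":
    # Hypothetical judge verdicts for two placeholder models.
    verdicts = [
        ("model_x", "model_y", 1.0),
        ("model_x", "model_y", 1.0),
        ("model_x", "model_y", 0.0),
    ]
    for name, rating in sorted(update_elo({}, verdicts).items()):
        print(f"{name}: {rating:.1f}")
```

Because each update depends on the ratings at that moment, the order of comparisons affects the final numbers; published leaderboards often average over shuffled orderings or fit all comparisons jointly.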
🎧 Listen to the podcast of Middle East AI News LIVE on 19th October 2023, when Middle East AI News talked to Dr. Shadab Khan, Director of AI and Applied Science at M42, about the first version of the Med42 70 billion parameter model.
LINKS
Med42 8B LLM (Hugging Face)
Med42 70B LLM (Hugging Face)
Technical paper: Med42-V2: A suite of clinical LLMs (PDF via arXiv)
[1] Generative AI in the health care market (July 2023)
[2] Prometheus 2 is an open-source language model specialised in evaluating other language models (see Prometheus 2 on Hugging Face)