MBZUAI to present breakthrough satellite VLM at CVPR 2025
New EarthDial vision model processes multi-spectral earth observation data
#UAE #computervision - Abu Dhabi’s Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), in collaboration with IBM Research and other researchers, has developed the first vision language model capable of processing multi-spectral, multi-resolution satellite imagery across different time periods. The model, EarthDial, outperforms existing models by up to 32.5% on classification tasks, a significant advance in geospatial data analysis. The technology enables natural language queries of Earth observation data for applications including disaster response, climate monitoring and agricultural assessment. The research team will present its findings at the Computer Vision and Pattern Recognition Conference (CVPR) in Nashville, Tennessee this week.
SO WHAT? - The new vision language model addresses a critical gap in AI capabilities for Earth observation, as existing AI models struggle with the diverse data formats used in remote sensing. EarthDial's ability to process infrared, radar and optical imagery simultaneously whilst tracking environmental changes over time provides researchers and governments with better analytical capabilities for environmental monitoring and disaster management.
Here are some key facts about the EarthDial VLM:
Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), in collaboration with IBM Research and other researchers, has developed EarthDial: a first-of-its-kind vision language model that processes multi-spectral, multi-resolution satellite imagery across different time periods in order to observe environmental changes
MBZUAI researchers trained EarthDial on the largest geospatial instruction dataset ever created, containing over 11 million question-answer pairs from existing remote sensing databases
The system achieved nearly 20 percentage points better accuracy than OpenAI’s GPT-4o on the BigEarthNet dataset classification task, demonstrating superior performance over general-purpose AI models
EarthDial can simultaneously process multi-spectral imagery, high-resolution optical data and synthetic aperture radar information through a unified architecture rather than separate encoders
The model supports applications including methane plume detection, urban heat island analysis, tree species classification and real-time disaster assessment
Researchers employed a three-stage training approach to ensure computational efficiency whilst maintaining high accuracy across diverse geospatial analysis tasks
The team has released the complete EarthDial codebase to encourage community development and enhancement of the system's capabilities
Testing across 40 different tasks including object detection, change detection and image captioning showed consistent outperformance compared to existing specialised models
The model’s dynamic resolution strategy automatically optimises image processing by selecting appropriate aspect ratios and creating thumbnail overviews for scene understanding
The research team was: Sagar Soni, Akshay Dudhane, Hiyam Debary, Mustansar Fiaz, Muhammad Akhtar Munir, Muhammad Sohail Danish, Paolo Fraccaro, Campbell D Watson, Levente J Klein, Fahad Shahbaz Khan, Salman Khan.
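The dynamic resolution strategy listed among the key facts above can be illustrated with a minimal sketch. This is not EarthDial's actual code: the 448-pixel tile size, the limit of six tiles and the function names are all hypothetical, chosen only to show the general idea of picking a tile grid that matches an image's aspect ratio and adding one thumbnail for a whole-scene overview.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def pick_grid(width: int, height: int, max_tiles: int = 6) -> Tuple[int, int]:
    """Return the (cols, rows) grid whose aspect ratio best matches the image.

    max_tiles is an assumed budget, not a value taken from EarthDial.
    """
    target = width / height
    candidates = [
        (cols, rows)
        for cols in range(1, max_tiles + 1)
        for rows in range(1, max_tiles + 1)
        if cols * rows <= max_tiles
    ]
    # Closest aspect ratio wins; on a tie, prefer the grid with more tiles.
    return min(
        candidates,
        key=lambda g: (abs(target - g[0] / g[1]), -(g[0] * g[1])),
    )


def tile_boxes(cols: int, rows: int, tile: int = 448) -> List[Box]:
    """List crop boxes, row by row, for an image resized to the grid size."""
    return [
        (c * tile, r * tile, (c + 1) * tile, (r + 1) * tile)
        for r in range(rows)
        for c in range(cols)
    ]


def plan_image(width: int, height: int, max_tiles: int = 6, tile: int = 448) -> Dict:
    """Describe how one image would be resized, tiled and thumbnailed."""
    cols, rows = pick_grid(width, height, max_tiles)
    return {
        "grid": (cols, rows),
        "resized_to": (cols * tile, rows * tile),
        "tiles": tile_boxes(cols, rows, tile),
        # One extra low-resolution view of the whole scene.
        "thumbnail": (tile, tile),
    }


if __name__ == "__main__":
    # A wide 1000x500 scene maps to a 2x1 grid of tiles plus a thumbnail.
    print(plan_image(1000, 500))
```

In this sketch a wide scene is split into side-by-side tiles so detail is preserved, while the thumbnail gives the model a single view of the full scene. A real pipeline would also resize and crop the pixel data, which is omitted here.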
ZOOM OUT - The UAE's computer vision AI ambitions recently received validation when Dubai secured hosting rights for the 2029 International Conference on Computer Vision (ICCV29), one of the world's premier computer vision conferences. The December 2-6, 2029 event is expected to attract 15,000 AI industry leaders and researchers. The choice of Dubai for the conference, which is organised by the Computer Vision Foundation and sponsored by IEEE, underscores the Emirates’ emergence as a global AI research hub capable of convening the international computational science community.
[Written and edited with the assistance of AI]
LINKS
EarthDial model code (GitHub)
EarthDial dataset (Hugging Face)
EarthDial research paper (arXiv)
MBZUAI blog post (MBZUAI)