TII launches Falcon's first SSLM
Falcon Mamba 7B is the top-performing open-source SSLM, according to Hugging Face
#UAE #LLMS - Technology Innovation Institute (TII), the applied research arm of Abu Dhabi’s Advanced Technology Research Council (ATRC), has open-sourced the Falcon Mamba 7B State Space Language Model (SSLM). The latest AI model in the Falcon series moves beyond traditional transformer architectures, marking a significant advancement in AI research. Independently verified by Hugging Face, it is the world’s top-performing open-source SSLM. The model has been released as a pretrained model and an instruction/chat model, each also available in a 4-bit version.
SO WHAT? - Falcon Mamba 7B is the first AI model announced in the region to be developed using the Mamba State Space Model (SSM) architecture. State Space Language Models are more economical in their use of memory (DRAM and SRAM), which improves their performance in certain areas compared with transformer architectures. While transformer-based models are very efficient at remembering and using information they have processed earlier in a sequence, SSLMs excel at tasks such as estimation, forecasting, and control. The upshot is that TII's Falcon Mamba 7B outperforms Meta’s Llama 3.1 8B and Llama 3 8B, as well as Mistral 7B, on new Hugging Face benchmarks (soon to be featured on Hugging Face’s website).
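To make the memory point concrete, here is a toy Python sketch of a linear state-space recurrence. It is purely illustrative (the dimensions, matrices, and function name are invented for this example, and Mamba itself uses a more sophisticated selective-scan mechanism), but it shows why an SSLM's memory footprint stays constant as the sequence grows, whereas a transformer's key-value cache grows with every token.

```python
import numpy as np

# Toy linear state-space recurrence: h_t = A @ h_{t-1} + B @ x_t, y_t = C @ h_t.
# The key property: the hidden state h has a FIXED size regardless of sequence
# length, while a transformer's KV cache grows linearly with tokens processed.

d_model, d_state = 16, 32          # illustrative sizes, not Falcon Mamba's
rng = np.random.default_rng(0)
A = rng.normal(scale=0.1, size=(d_state, d_state))
B = rng.normal(scale=0.1, size=(d_state, d_model))
C = rng.normal(scale=0.1, size=(d_model, d_state))

def ssm_scan(xs):
    """Process a sequence token by token with a constant-size hidden state."""
    h = np.zeros(d_state)
    ys = []
    for x in xs:                   # per-step memory is O(1) in sequence length
        h = A @ h + B @ x
        ys.append(C @ h)
    return np.stack(ys)

xs = rng.normal(size=(8192, d_model))   # a long input sequence
ys = ssm_scan(xs)
print(ys.shape)                    # (8192, 16); the state never grew past d_state
```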
Some key details about this announcement:
Technology Innovation Institute (TII) has released Falcon Mamba 7B, the institute’s first State Space Language Model (SSLM), under an open-source licence. It is built using the Mamba SSM architecture.
Released in four different variations, Falcon Mamba 7B was trained on ~6,000 GT (gigatokens) of data, leveraging multi-stage training to increase the context length from 2,048 to 8,192 tokens. The data was tokenized with the Falcon-7B/11B tokenizer. The model took less than two months to build and train.
Falcon Mamba 7B is ranked the top-performing SSLM globally, according to Hugging Face, surpassing both existing SSLMs and traditional transformer-based models such as Meta’s Llama 3.1 8B and Mistral 7B.
The model has been released as a pretrained model (Falcon Mamba 7B), an instruction/chat model (Falcon Mamba 7B Instruct), a 4-bit pretrained model (Falcon Mamba 7B 4-bit) and a 4-bit instruction/chat model (Falcon Mamba 7B 4-bit Instruct); see the loading sketch after this list.
The model’s architecture allows it to process large blocks of text without requiring additional memory (i.e. SRAM, DRAM) as the input grows, setting it apart from transformer models, which demand significant additional memory for longer sequences.
TII’s new model not only outperforms rivals on existing benchmarks but is also set to lead Hugging Face’s tougher new benchmark leaderboard when it goes live.
SSLMs like Falcon Mamba 7B are particularly effective at tasks such as estimation, forecasting, and control, alongside traditional NLP applications such as machine translation and text summarization.
Falcon Mamba 7B is released under the TII Falcon License 2.0, a permissive Apache 2.0-based license promoting the responsible use of AI.
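For readers who want to try the model, the sketch below shows one plausible way to load the pretrained variant with the Hugging Face transformers library. It assumes the tiiuae/falcon-mamba-7b repository id and a recent transformers release with Falcon Mamba support; the instruct and 4-bit variants would be loaded the same way with their respective repo ids.

```python
# Minimal sketch: load Falcon Mamba 7B from the Hugging Face Hub and generate text.
# Assumes the "tiiuae/falcon-mamba-7b" repo id, a transformers version with
# Falcon Mamba support, and the accelerate package (for device_map="auto").
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-mamba-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("State Space Language Models are", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```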
Read more about Falcon LLMs:
TII debuts multimodal Falcon 2 Series (Middle East AI News)
Could Falcon become the Linux of AI? (Middle East AI News)
TII announces Falcon 180B LLM (Middle East AI News)
Can UAE-built Falcon rival global AI models? (Middle East AI News)
LINKS
Falcon Mamba 7B (Hugging Face)
Falcon Mamba 7B (website)