TII releases Falcon-Edge 1.58-bit language models
Falcon-Edge series, based on BitNet, brings language models to edge computing
#UAE #edgecomputing - Technology Innovation Institute (TII), the global applied research centre of the Advanced Technology Research Council (ATRC), has announced Falcon-Edge – a groundbreaking series of extremely compressed, powerful and fine-tunable small language models (SLMs) that operate using just 1.58 bits per weight. Available in 1 billion and 3 billion parameter versions, these models leverage Microsoft’s open-source BitNet 1.58-bit architecture to dramatically reduce computational requirements while maintaining competitive performance. TII has simultaneously released 'onebitllms', an open-source toolkit enabling developers to easily fine-tune Falcon-Edge for specific applications.
SO WHAT? - The open-source BitNet architecture is paving the way for a new era of 1-bit language models. Developed by Microsoft, BitNet promises to dramatically reduce energy consumption and bring powerful AI capabilities to edge devices. Built on the BitNet 1.58-bit framework, Falcon-Edge is a significant advance in making AI more accessible and efficient. By operating with ternary weights (-1, 0, 1) during model training, instead of binary weights (-1, 1), BitNet models can run up to five times faster on CPUs, potentially democratizing access to powerful AI capabilities on edge devices without requiring specialised hardware.
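One intuition for the CPU speed-up: when every weight is -1, 0 or +1, the multiplications in a matrix-vector product collapse into additions and subtractions. The toy PyTorch snippet below illustrates that equivalence; it is a conceptual sketch only, not Falcon-Edge or BitNet implementation code.

```python
import torch

# Toy example: with ternary weights, a matrix-vector product needs no multiplications.
w = torch.tensor([[1., 0., -1.],
                  [0., 1., 1.]])        # ternary weights in {-1, 0, +1}
x = torch.tensor([0.5, -2.0, 3.0])      # input activations

y_matmul = w @ x                        # standard multiply-accumulate

# Equivalent result using only additions and subtractions of selected activations
y_addsub = torch.stack([x[row == 1].sum() - x[row == -1].sum() for row in w])

print(torch.allclose(y_matmul, y_addsub))  # True
```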
Here are some details about the new Falcon-Edge models:
Technology Innovation Institute (TII) has released the new Falcon-Edge models, which use a novel ternary (1.58-bit) architecture based on Microsoft's BitNet research, enabling them to operate with dramatically reduced computational requirements compared with standard LLMs that use floating-point precision.
The Falcon-Edge series is available in two sizes – 1 billion and 3 billion parameters – with each size offered in both base and instruction-tuned variants, providing options for different use cases and resource constraints.
TII's pre-training approach yields both non-quantized (bfloat16) and native BitNet model variants from a single training run, giving developers flexibility in deployment options.
Performance evaluations show Falcon-Edge models are competitive with similarly-sized conventional models on the former Hugging Face leaderboard v2 benchmark, demonstrating that BitNet's extreme compression doesn't significantly compromise capabilities.
TII has also released 'onebitllms', a Python package installable via pip that provides developers with tools to convert pre-quantised model checkpoints into BitNet training format and to quantize trained checkpoints into BitNet format (see the first sketch after this list).
Unlike previous BitNet releases that focused solely on inference, Falcon-Edge models include pre-quantised weights, enabling users to perform fine-tuning or continuous pre-training – addressing a significant limitation of earlier BitNet implementations.
The researchers implemented optimized Triton kernels for the activation_quant and weight_quant functions, significantly reducing the pre-training cost of these models while making the optimizations available to the community (see the second sketch after this list).
TII's approach removed certain Layer Normalization layers within the BitNet architecture without compromising performance, ensuring compatibility with the broader Llama ecosystem with minimal adjustments.
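As a rough illustration of the fine-tuning workflow described above, the first sketch loads a pre-quantised Falcon-Edge checkpoint with Hugging Face transformers and prepares it for BitNet-style training with onebitllms. The model ID, the 'prequantized' revision and the helper name replace_linear_with_bitnet_linear are assumptions drawn from the project's description; consult the onebitllms repository and the Falcon-Edge model cards for the exact API.

```python
# pip install transformers onebitllms
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from onebitllms import replace_linear_with_bitnet_linear  # assumed helper name

# Assumed identifiers - check the Hugging Face collection for the exact values
model_id = "tiiuae/Falcon-E-1B-Base"
revision = "prequantized"

tokenizer = AutoTokenizer.from_pretrained(model_id, revision=revision)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    revision=revision,
)

# Swap ordinary linear layers for BitNet-style layers so the model can be
# fine-tuned with standard Hugging Face training tooling.
model = replace_linear_with_bitnet_linear(model)
```

After training, the toolkit's quantisation utilities convert the fine-tuned checkpoint back into the compact BitNet format for deployment.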
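The second sketch shows the plain PyTorch formulation of the two quantisation steps that the Triton kernels accelerate: per-token 8-bit absmax quantisation of activations and absmean ternary quantisation of weights. This is a generic sketch of the published BitNet 1.58-bit recipe, not TII's kernel code.

```python
import torch

def activation_quant(x: torch.Tensor) -> torch.Tensor:
    # Per-token absmax quantisation of activations to 8-bit levels, then dequantised
    scale = 127.0 / x.abs().max(dim=-1, keepdim=True).values.clamp(min=1e-5)
    return (x * scale).round().clamp(-128, 127) / scale

def weight_quant(w: torch.Tensor) -> torch.Tensor:
    # Absmean quantisation of weights to the ternary set {-1, 0, +1}, then dequantised
    scale = 1.0 / w.abs().mean().clamp(min=1e-5)
    return (w * scale).round().clamp(-1, 1) / scale
```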
ZOOM OUT - BitNet models like Falcon-Edge could fundamentally reshape AI deployment at the edge, addressing critical bottlenecks in computational requirements and energy consumption. As industries from manufacturing to healthcare increasingly demand on-device AI without reliance on cloud connectivity, traditional models that require high-precision floating-point operations become impractical. The 1.58-bit architecture pioneered by BitNet and implemented in Falcon-Edge could also drive a new hardware paradigm, in which specialised chips optimised for ternary operations deliver dramatic efficiency gains beyond standard CPUs and GPUs. This architecture aligns well with the growing need for energy-efficient AI processing in battery-powered devices, remote installations and IoT ecosystems.
[Written and edited with the assistance of AI]
LINKS
Falcon Edge Series (Hugging Face)
Technical blog post (Github)
onebitllms toolkit (Github)
Demo (Hugging Face)
Read more about Falcon models:
TII launches world’s most powerful SLMs under 13B (Middle East AI News)
TII launches Falcon's first SSLM (Middle East AI News)
TII debuts multimodal Falcon 2 Series (Middle East AI News)
Could Falcon become the Linux of AI? (Middle East AI News)
TII announces Falcon 180B LLM (Middle East AI News)