Meta AI has announced the open-source release of MobileLLM, a set of language models optimized for mobile devices, with model checkpoints and code now available on Hugging Face. However, the release is available only under a Creative Commons Attribution Non-Commercial (CC BY-NC) 4.0 license, meaning enterprises can’t use it in commercial products.
Originally described in a research paper published in July 2024 and covered by VentureBeat, MobileLLM is now fully available with open weights, marking a significant milestone for efficient, on-device AI.
The release of these open weights makes MobileLLM a more direct, if unconventional, competitor to Apple Intelligence, Apple’s hybrid on-device/private-cloud AI solution built from multiple models, which ships to iOS 18 users in the U.S. and outside the EU this week. However, because MobileLLM is restricted to research use and must be downloaded and installed from Hugging Face, it is likely to remain limited to computer science and academic audiences for now.
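For researchers who do want to try it, the checkpoints can be pulled down with the widely used Hugging Face transformers library. Here is a minimal sketch, assuming the repo id facebook/MobileLLM-125M matches Meta’s naming on the hub (verify the exact id and license terms there before use):

```python
# Hypothetical usage sketch: load a MobileLLM checkpoint from Hugging Face.
# The repo id below is an assumption based on Meta's hub naming; check the
# actual model card before relying on it.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/MobileLLM-125M"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("On-device language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```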
More efficiency for mobile devices
MobileLLM aims to tackle the challenges of deploying AI models on smartphones and other resource-constrained devices.
With parameter counts ranging from 125 million to 1 billion, these models are designed to operate within the limited memory and energy capacities typical of mobile hardware.
By emphasizing architecture over sheer size, Meta’s research suggests that well-designed compact models can deliver robust AI performance directly on devices.
Resolving scaling issues
The design philosophy behind MobileLLM deviates from traditional AI scaling laws that emphasize width and large parameter counts.
Meta AI’s research instead focuses on deep, thin architectures to maximize performance, improving the model’s ability to capture abstract concepts.
Yann LeCun, Meta’s Chief AI Scientist, highlighted the importance of these depth-focused strategies in enabling advanced AI on everyday hardware.
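To see why depth versus width is a genuine design choice rather than a free win, note that a transformer block costs roughly 12 × d_model² parameters, so the same budget can buy either a deep, thin stack or a shallow, wide one. The back-of-the-envelope Python below uses illustrative numbers, not Meta’s actual configurations, that both land near the 125M scale:

```python
# Back-of-the-envelope comparison (illustrative, not Meta's real configs).
# Each transformer block costs roughly 12 * d_model^2 parameters:
# ~4 * d^2 for the attention projections plus ~8 * d^2 for a 4x-wide MLP,
# ignoring embeddings and layer norms.
def approx_block_params(d_model: int) -> int:
    return 12 * d_model ** 2

deep_thin = 30 * approx_block_params(576)     # 30 layers, width 576
wide_shallow = 12 * approx_block_params(912)  # 12 layers, width 912

print(f"deep-thin:    {deep_thin / 1e6:.0f}M parameters")    # ~119M
print(f"wide-shallow: {wide_shallow / 1e6:.0f}M parameters")  # ~120M
# Both stacks spend a nearly identical budget; Meta's results suggest
# the deeper, thinner one performs better at this small scale.
```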
MobileLLM incorporates several innovations aimed at making smaller models more effective:
• Depth Over Width: The models employ deep architectures, shown to outperform wider but shallower ones in small-scale scenarios.
• Embedding Sharing Techniques: These maximize weight efficiency, crucial for maintaining compact model architecture.
• Grouped Query Attention: Inspired by work from Ainslie et al. (2023), this method optimizes attention by letting groups of query heads share key and value projections (see the sketch after this list).
• Immediate Block-wise Weight Sharing: Weights are shared between adjacent transformer blocks, reducing latency by cutting memory movement without adding parameters.
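To make the grouped query attention idea concrete, here is a minimal, self-contained PyTorch sketch of the general technique. It is not Meta’s implementation; all dimensions and names are invented for the example.

```python
# Minimal grouped query attention (GQA) sketch, in the spirit of
# Ainslie et al. (2023). Dimensions and names are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupedQueryAttention(nn.Module):
    def __init__(self, dim: int, n_heads: int, n_kv_heads: int):
        super().__init__()
        assert n_heads % n_kv_heads == 0
        self.n_heads = n_heads
        self.n_kv_heads = n_kv_heads
        self.head_dim = dim // n_heads
        # Queries keep one projection per head; keys and values are shared
        # across groups of heads, shrinking the KV projections and KV cache.
        self.q_proj = nn.Linear(dim, n_heads * self.head_dim, bias=False)
        self.k_proj = nn.Linear(dim, n_kv_heads * self.head_dim, bias=False)
        self.v_proj = nn.Linear(dim, n_kv_heads * self.head_dim, bias=False)
        self.o_proj = nn.Linear(n_heads * self.head_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.n_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        # Repeat each KV head so every query head in a group attends to the
        # same shared keys and values.
        groups = self.n_heads // self.n_kv_heads
        k = k.repeat_interleave(groups, dim=1)
        v = v.repeat_interleave(groups, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1))

attn = GroupedQueryAttention(dim=512, n_heads=8, n_kv_heads=2)
print(attn(torch.randn(1, 16, 512)).shape)  # torch.Size([1, 16, 512])
```

The savings come from the smaller key and value projection matrices and, at inference time, a proportionally smaller KV cache, both of which matter on memory-constrained phones.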