At NYSE with John Furrier, our CEO Ramin Hasani shared how the next chapter of AI will be a hybrid, combining the scale of the cloud with the adaptability of on-device intelligence. Large-scale training happens in the cloud, but privacy-sensitive, latency-sensitive, and efficient applications must stay on-device. These systems need to perform instantly, securely, and without any risk of interruption. This is where highly specialized, compact models thrive. Liquid’s technology was built for this kind of intelligence. From our research at MIT, we developed liquid neural networks, models inspired by biology and designed for adaptability. These systems stay flexible even after learning, able to hold and compress more knowledge within smaller footprints, without sacrificing performance. By drawing from nature, we have built models that deliver powerful intelligence in compact, efficient form, creating the foundation for AI that can run reliably and privately wherever needed. 🎥 Watch Ramin’s full conversation on SiliconANGLE & theCUBE: https://lnkd.in/eBh3qJwU
Liquid AI
Information Services
Cambridge, Massachusetts 24,395 followers
We build efficient general-purpose AI at every scale.
About us
- Website: http://liquid.ai
- Industry: Information Services
- Company size: 51-200 employees
- Headquarters: Cambridge, Massachusetts
- Type: Privately Held
- Founded: 2023
Locations
- Primary: 314 Main St, Cambridge, Massachusetts 02142, US
Employees at Liquid AI
- Dave Blundin: Co-Founder, DataSage, EverQuote & Vestmark. Godfather of Quantization. GP at Link Ventures.
- Raymond Liao: Venture Capitalist
- Hermes Ruiz: Keynote Speaker | AI Strategist | Ethical Tech Evangelist | EB-1 Talent in the U.S. | Chief Humanity Hacker
- George Bandarian: Driving AI Innovation as General Partner, Untapped Ventures | AI Keynote Speaker | Agentic Podcast Host | Proud Husband & Father of 3 Boys
Updates
-
Liquid AI, AMD, and Robotec.ai have deployed compact foundation models for autonomous agentic robotics, showcasing a fine-tuned version of our recently released 3-billion-parameter Liquid vision-language model (LFM2-VL-3B) running efficiently on AMD Ryzen™ to enable real-time multimodal perception and decision-making at the edge. Using Robotec.ai's RAI framework, a flexible AI agent platform designed for developing and deploying Embodied AI features, the team validated autonomous inspection capabilities through hardware-in-the-loop simulation across hundreds of warehouse scenarios. This approach enabled rigorous testing of the complete system before physical deployment, significantly reducing development time and risk while ensuring robust real-world performance. Check out the demo at the AMD booth at #ROSCon2025 next week, with more to follow.
Robotec.ai simulation solutions unlock the full potential of the first fully autonomous warehouse robot powered exclusively by AMD Ryzen™ AI processors ➡️ https://lnkd.in/dwxScKTD
We have collaborated with AMD and Liquid AI on a reasoning robot that intelligently responds to the changing environment around it. The robot takes the word “autonomy” to a whole new level: it can interpret commands, detect safety hazards, and autonomously execute corrective actions.
Our extensive simulation-based testing accelerated the R&D process by:
⚫ Supporting validation of embedded AI on real hardware
⚫ Reducing the time, costs, and risks related to physical testing
⚫ Enabling rapid OEM prototyping
Visit the AMD stand (Booth 17/18) at #ROSCon2025 in Singapore to see the demo of agentic AI in robotics that we created in collaboration with AMD and Liquid AI. Our team will be there to answer all of your questions.
✅ Learn more on the blog: https://lnkd.in/dwxScKTD
-
Introducing our new tiny vision model: LFM2-VL-3B 👀
Built for flexibility and performance:
> 51.8% on MM-IFEval (instruction following)
> 71.4% on RealWorldQA (real-world understanding)
> Excels in single- and multi-image understanding and English OCR
> Low object hallucination rate (POPE benchmark)
> Expanded multilingual visual understanding: English, Japanese, French, Spanish, German, Italian, Portuguese, Arabic, Chinese, Korean
This model expands our multimodal portfolio and demonstrates the universal applicability of our hybrid LFM2 backbones.
Blog: https://lnkd.in/edFKrAJk
HF: https://lnkd.in/eKrJizeA
LEAP: https://lnkd.in/ePKp8CRx
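For readers who want to try the model locally, here is a minimal inference sketch using Hugging Face transformers. It assumes the LiquidAI/LFM2-VL-3B checkpoint follows the standard image-text-to-text processor and chat-template interface; the model card linked above has the officially supported snippet.

```python
# Hypothetical quick-start for LFM2-VL-3B with Hugging Face transformers.
# The repo id and chat-template usage are assumptions -- consult the model card.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "LiquidAI/LFM2-VL-3B"  # assumed repo id
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

image = Image.open("warehouse_shelf.jpg")  # any local test image
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "List the objects on the shelf."},
        ],
    }
]

# Build the multimodal prompt and run a short generation.
inputs = processor.apply_chat_template(
    conversation, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt"
).to(model.device)

output = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(output, skip_special_tokens=True)[0])
```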
-
Liquid AI reposted this
🔥 LFM2-1.2B just got a major speed boost — 52 tokens/sec on Snapdragon X Elite. We’ve optimized Liquid AI’s hybrid LFM2-1.2B with our NexaML Turbo Engine, achieving real-time inference fully on the Qualcomm Hexagon NPU. LFM2’s new multiplicative-gate + convolution architecture isn’t trivial to run — it demanded hardware-aware graph optimization. NexaML Turbo squeezes every bit of NPU performance for faster, smoother on-device AI. This update shows what happens when great model design meets a purpose-built inference engine. Thrilled to be partnering with Liquid AI — and even more excited for what’s next. Ramin Hasani Mathias Lechner Alexander Amini Daniela Rus Jeffrey Li Manoj Khilnani Chun-Po Chang
-
Liquid AI reposted this
Had a great time today chatting about what we do at Liquid AI at the New York Stock Exchange with John Furrier in the NYSE Wired "AI Factories - Data Centers of The Future" series. Video coming soon! Special thanks to Brian J. Baumann, David Vellante, and theCUBE SiliconANGLE Media for the kind invitation, and to Gemma Allen for the great support! #NYSE Wired
-
Enjoy LFM2-1.2B on Qualcomm NPUs in NexaSDK! Special thanks to the Nexa AI team for integrating a series of Liquid models.
LFM2-1.2B models from Liquid AI are now running on the Qualcomm Hexagon NPU in NexaSDK, powered by the NexaML engine.
Four new edge-ready variants:
- LFM2-1.2B — general chat and reasoning
- LFM2-1.2B-RAG — retrieval-augmented local chat
- LFM2-1.2B-Tool — structured tool calling and agent workflows
- LFM2-1.2B-Extract — ultra-fast document parsing
LFM2 is a brand-new hybrid model architecture combining transformer and SSM components. Most inference frameworks can’t even run it yet. NexaML can. That means these models now run fully accelerated on Qualcomm Hexagon NPUs, hitting real-time speeds with tiny memory footprints for popular edge intelligence tasks — perfect for phones, wearables, and edge devices.
We’re already working with customers like Brilliant Labs on what this unlocks next in AR/VR glasses. Model link in comments. And if you want to follow the new model drops, star NexaSDK — it helps us deliver faster!
Manoj Khilnani, Chun-Po Chang, Dr. Vinesh Sukumar, Srinivasa Deevi, Devang Aggarwal, Madhura Chatterjee, Neeraj Pramod, Bobak Tavangar, Heeseon Lim, Justin Lee
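NexaSDK and the NexaML engine are the NPU path described above. As a rough CPU-only stand-in for local experimentation, the sketch below runs an LFM2-1.2B GGUF build through llama-cpp-python; the repo id and filename are assumptions, not confirmed artifacts.

```python
# CPU-only stand-in sketch (not the NexaSDK / Hexagon NPU path): run an assumed
# LFM2-1.2B GGUF build with llama-cpp-python. Substitute the GGUF you actually have.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="LiquidAI/LFM2-1.2B-GGUF",  # assumed Hugging Face GGUF repo
    filename="*Q4_K_M.gguf",            # pick whichever quantization is available
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a concise on-device assistant."},
        {"role": "user", "content": "Why does NPU offload matter for edge AI?"},
    ],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```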
-
Good morning! Introducing LFM2-350M-PII-Extract-JP, the newest addition to the Liquid Nanos family. It extracts personally identifiable information (PII) from Japanese text and outputs structured JSON that can be used to mask sensitive data on-device. Because the data is processed locally, it delivers accuracy and speed on par with cloud models while preserving privacy. Hugging Face model: https://lnkd.in/eD3Fi8tR Deploy with LEAP: https://lnkd.in/eum9_3bm We share the latest Liquid updates on our Discord channel. We look forward to seeing you there: https://lnkd.in/eShCjakY
-
We have a new nano LFM that is on par with GPT-5 on data extraction, at just 350M parameters. Introducing LFM2-350M-PII-Extract-JP 🇯🇵
It extracts personally identifiable information (PII) from Japanese text → returns structured JSON for on-device masking of sensitive data. It delivers the accuracy and speed of giant cloud-based models while keeping data where it belongs: fully private, on-device.
Download on HF: https://lnkd.in/eD3Fi8tR
Deploy with LEAP: https://lnkd.in/eum9_3bm
Join our Discord channel for live updates on the latest from Liquid: https://lnkd.in/eShCjakY
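To make the extract-then-mask flow concrete, here is an illustrative sketch with Hugging Face transformers. The prompt format and the JSON schema it assumes (a flat object mapping PII categories to surface strings) are guesses for illustration only; the model card documents the actual input/output contract.

```python
# Illustrative PII extract-then-mask flow. The chat formatting and assumed JSON
# output shape are not the model's documented contract -- see the HF model card.
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2-350M-PII-Extract-JP"  # repo id as announced above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

text = "山田太郎さんの電話番号は090-1234-5678です。"  # toy example sentence
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": text}],
    add_generation_prompt=True, tokenize=False,
)
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=256)
raw = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)

# Assumed output shape: JSON mapping PII categories to the extracted strings.
pii = json.loads(raw)
masked = text
for values in pii.values():
    for value in (values if isinstance(values, list) else [values]):
        masked = masked.replace(value, "[MASKED]")
print(masked)
```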
-
Meet LFM2-8B-A1B, our first on-device Mixture-of-Experts (MoE)! 🐘
> LFM2-8B-A1B is the best on-device MoE in terms of both quality and speed.
> Performance in the 3B-4B model class, with an up to 5x faster inference profile on CPUs and GPUs.
> Quantized variants fit comfortably on high-end phones, tablets, and laptops.
This enables fast, private, low-latency applications across modern phones, tablets, laptops, and embedded systems.

Quality: LFM2-8B-A1B has greater knowledge capacity than competitive models and is trained to provide quality inference across a variety of capabilities, including:
> Knowledge
> Instruction following
> Mathematics
> Language translation

Architecture: LFM2-8B-A1B is one of the first models to challenge the common belief that the MoE architecture is not effective at smaller parameter sizes. LFM2-8B-A1B keeps the fast LFM2 backbone and introduces sparse MoE feed-forward networks to add representational capacity without significantly increasing the active compute path.
> LFM2 backbone: 18 gated short convolution blocks and 6 GQA blocks.
> Size: 8.3B total parameters, 1.5B active parameters.
> MoE placement: all layers except the first two include an MoE block; the first two layers remain dense for stability.
> Expert granularity: 32 experts per MoE block, with the top-4 experts active per token. This configuration provides a strong quality boost over lower-granularity configs while maintaining fast routing and portable kernels.
> Router: normalized sigmoid gating with adaptive routing biases for better load balancing and training dynamics.

CPU performance: Across devices, LFM2-8B-A1B on CPU is considerably faster than the fastest variants of Qwen3-1.7B, IBM Granite 4.0, and others.

GPU performance: In addition to integrating LFM2-8B-A1B into llama.cpp and ExecuTorch to validate inference efficiency on CPU-only devices, we have also integrated the model into vLLM for GPU deployment in both single-request and online batched settings. Our 8B LFM2 MoE model not only outperforms comparable-size models on CPU but also excels against those same models on GPU (1x H100), with full CUDA-graph compilation during decode and piecewise CUDA graphs during prefill.

Full blog: https://lnkd.in/ecyCmKHM
HF: https://lnkd.in/eKGCJEDk
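To make the routing description concrete, below is a toy PyTorch sketch of a sparse MoE feed-forward block with 32 experts, top-4 routing, and normalized sigmoid gating. It illustrates the pattern only; it is not Liquid AI's implementation, the dimensions are arbitrary, and the adaptive routing biases and optimized kernels are omitted.

```python
# Toy sparse-MoE feed-forward block: 32 experts, top-4 routing, normalized
# sigmoid gating. A concept sketch, not Liquid AI's kernels or training setup.
import torch
import torch.nn as nn

class SparseMoEFFN(nn.Module):
    def __init__(self, d_model=1024, d_ff=2048, n_experts=32, top_k=4):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                        # x: (tokens, d_model)
        gate = torch.sigmoid(self.router(x))     # one sigmoid score per expert
        weights, idx = gate.topk(self.top_k, dim=-1)
        weights = weights / weights.sum(dim=-1, keepdim=True)  # normalize over active experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):           # naive dispatch; real kernels batch by expert
            for e in idx[:, slot].unique():
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[int(e)](x[mask])
        return out

tokens = torch.randn(8, 1024)
print(SparseMoEFFN()(tokens).shape)  # torch.Size([8, 1024])
```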
-
Today, we expand our LFM2 family to audio. 👂👄
LFM2-Audio is an end-to-end audio-text omni foundation model that delivers responsive, real-time conversation on-device at just 1.5B parameters. One model. Seamless multimodal support. No chains.
> Speech-to-speech
> Speech-to-text
> Text-to-speech
> Audio classification
> Open weights
10x faster inference vs. peers, with quality rivaling systems 10x larger.

Efficiency: Efficiency is key for interactive, real-time audio scenarios. LFM2-Audio-1.5B has an average end-to-end latency of under 100 ms, faster even than models with far fewer than 1.5B parameters.

Quality: LFM2-Audio-1.5B is best-in-class by a large margin on conversational speech-to-speech chat and competitive with larger open models such as Qwen2.5-Omni-3B (5B), Lyra-Base (9B), and GLM-4-Voice (9B).

Model: LFM2-Audio is a novel omni-modal architecture that supports both text AND audio as first-class modalities, for both input and output. On the input side, the model tokenizes both text and audio into the same latent space. On the output side, it autoregressively and flexibly generates tokens of either modality, depending on the task.

Full blog: https://lnkd.in/eHdbsAHg
HF: https://lnkd.in/eJutwina
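As a concept sketch of "one vocabulary, either modality out," the toy decoder below embeds text and audio token ranges in a shared space and constrains generation to whichever range the task calls for. Every size and the tiny GRU backbone are made up; this is not LFM2-Audio's architecture, just an illustration of the idea.

```python
# Toy illustration: text tokens and audio codec tokens share one embedding space,
# and the decoder autoregressively emits from the requested modality's range.
import torch
import torch.nn as nn

TEXT_VOCAB, AUDIO_VOCAB = 32_000, 8_192            # contiguous ranges in one vocabulary
VOCAB = TEXT_VOCAB + AUDIO_VOCAB

class TinyOmniDecoder(nn.Module):
    def __init__(self, d_model=256):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, d_model)  # shared latent space for both modalities
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)
        self.lm_head = nn.Linear(d_model, VOCAB)

    @torch.no_grad()
    def generate(self, prompt_ids, n_new, modality="audio"):
        ids = prompt_ids.clone()
        for _ in range(n_new):
            logits = self.lm_head(self.rnn(self.embed(ids))[0][:, -1])
            # Constrain sampling to the requested output modality's token range.
            if modality == "text":
                logits[:, TEXT_VOCAB:] = float("-inf")
            else:
                logits[:, :TEXT_VOCAB] = float("-inf")
            next_id = torch.multinomial(logits.softmax(-1), 1)
            ids = torch.cat([ids, next_id], dim=1)
        return ids

model = TinyOmniDecoder()
prompt = torch.randint(0, TEXT_VOCAB, (1, 5))       # pretend "text" prompt tokens
audio_out = model.generate(prompt, n_new=10, modality="audio")
print(audio_out.shape)  # torch.Size([1, 15])
```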