This Week at Nexa 🚀: new model support, new platforms, and new ways for builders to get involved.
1) NexaML supports the latest models on the Qualcomm NPU across platforms: for example, Liquid AI’s LFM2-1.2B now runs fully on the Qualcomm Hexagon NPU across IoT (Dragonwing IQ-9075), Automotive (SA8295), Mobile (Samsung S25), and Compute (Snapdragon X Elite) devices.
2) IBM Granite 4.0 Nano joins the lineup on NPU: powered by NexaSDK, Granite 4.0 Nano runs locally on Snapdragon X Elite at 60 tokens/sec at full precision. We built an AI agent demo showing it fetching live info and organizing files through pure on-device function calls.
3) Builder Bounty Program is live: developers can now earn up to $1,500, get the “Nexa Builder” Discord badge, and be featured in our SDK repo and launch docs, just by shipping an on-device AI project. Learn more: https://lnkd.in/gHRzCyMX
4) Nexa Wishlist launches: vote for the next models we bring on-device, from GGUF to MLX to NexaML (Qualcomm and Apple NPU). Vote today: sdk.nexa.ai/wishlist
Nexa AI
Software Development
Cupertino, California 5,611 followers
On Device AI Deployment and Research | NexaSDK: github.com/NexaAI/nexa-sdk | Hyperlink App: https://hyperlink.nexa.ai/
About us
Nexa AI is an on-device AI deployment and research company. We craft optimized foundation models and an on-device inference framework that runs any model on any device, across any backend, within minutes. Our mission is to make on-device AI friction-free and production-ready.
- Website: https://nexa.ai/
- Industry: Software Development
- Company size: 11-50 employees
- Headquarters: Cupertino, California
- Type: Privately Held
- Founded: 2023
Locations
Primary: Cupertino, California 95014, US
Updates
-
Our Hyperlink website just got a brand-new look. As local AI becomes the new interface for your computer, it deserves a design that feels just as elegant: calm, clear, and beautifully simple. It starts with the landing page, a quiet, enjoyable expression of what Hyperlink is. And that’s not all: the floating UI feature is now ready for public testing. It stays with you wherever you go. Go to Settings → Shortcut to turn it on. Start exploring the new Hyperlink experience and see how it fits your flow. Tell us what you think of the new Hyperlink look. 👇 Link below. Kudos to the product and design team at Nexa AI.
-
The best AI roadmap is shaped by real builders. For weeks, one request kept showing up in our inbox: “Can Nexa support Qwen3-8B on Qualcomm NPU?” So we did. Qwen3-8B now runs fully on-device on the Qualcomm Hexagon NPU through NexaML. And today, we’re making this loop official: introducing Nexa Wishlist, where developers can request and vote for the next models we bring on-device. Whether it’s GGUF or MLX for CPU/GPU, or the Nexa format for Qualcomm and Apple NPU, just:
1. Submit the Hugging Face repo ID
2. Select the backends you want supported
3. Watch as popular models get supported
The community leads. We build fast. Vote today at https://lnkd.in/griaEP5R and drop your requested model in the comments.
-
Fully local RAG on the Qualcomm Hexagon NPU, built with Python. Using the NexaSDK Python library (pip install nexaai), this demo indexes your local docs and answers questions entirely on-device. Demo below:
- First minute: the RAG demo in action on Snapdragon
- Next 40 seconds: quickstart in a Jupyter Notebook, setting up and running models on the Qualcomm Hexagon NPU with NexaML
If you know Python, you can build advanced on-device AI: private, fast, and hardware-accelerated. GitHub code in comments. Manoj Khilnani, Chun-Po Chang, Srinivasa Deevi, Madhura Chatterjee
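The post links the full demo on GitHub, but the retrieval loop it describes can be sketched in plain Python. This is not NexaSDK code: the bag-of-words `embed` below is a stand-in for the embedding model that NexaSDK would run on the NPU, kept dependency-free so the retrieval logic itself is visible.

```python
import math
import re
from collections import Counter

def embed(text):
    # Stand-in for an on-device embedding model: a bag-of-words count vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    # Score every indexed doc against the query and return the top-k matches.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "NexaSDK runs models on the Qualcomm Hexagon NPU.",
    "The cafeteria menu changes every Tuesday.",
]
print(retrieve("Which NPU does NexaSDK target?", docs))
# → ['NexaSDK runs models on the Qualcomm Hexagon NPU.']
```

In the real demo, both the embedding step and the answer generation run on the Hexagon NPU; only the rank-and-select step stays this simple.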
-
💰 Nexa Builder Bounty Program is live. Learn, build, and get paid for shipping on-device AI. Build with NexaSDK, the unified on-device engine with NPU acceleration, full multimodal support (text / vision / audio), and cross-platform coverage.
- Earn up to $1,500
- Get the “Nexa Builder” Discord badge
- Be featured in our SDK repo & launch materials
Run models others can’t even touch locally (like Qwen3-VL). Fast to start, simple to build, and the best way to learn the modern on-device AI stack while earning for your work. Participation details here: https://lnkd.in/gHRzCyMX Got an idea? Drop it below or tag a dev who should join.
-
Nexa AI reposted this
Introducing Granite 4.0 Nano — compact, open-source models built for AI at the edge: https://ibm.co/6041Bzpsx Available in 350M and 1B for building AI on laptops and mobile devices. Now available on: ✅ Docker, Inc ✅ Hugging Face ✅ Nexa AI ✅ Ollama ✅ Qualcomm ✅ Unsloth AI
-
IBM Granite 4.0 Nano is out and runs Day-0 on the Qualcomm Hexagon NPU with NexaSDK, powered by our NexaML engine, currently the only framework that brings Granite models to full NPU execution, unlocking real on-device AI agents. We built a demo where Granite 4.0 Nano fetches live info and organizes files through function calls: fully local inference, blazingly fast at 60 tokens/sec on Snapdragon X Elite at full precision. AI agents need to run on-device for instant response, privacy, and always-on awareness, and NPUs are built exactly for that. Small models like Granite 4.0 Nano make this possible, and NexaML makes it practical, turning every phone, PC, car, and IoT device into an AI agent platform. Demo below. Star NexaSDK for more Day-0 support! Links in comments. Manoj Khilnani, Chun-Po Chang, Eda Kavlakoğlu, Saleem Hussain, Gabe Goodhart, Neel Kishan, Rodrigo A., Srinivasa Deevi, Devang Aggarwal, Madhura Chatterjee
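The demo's actual prompts and tools are in the linked video, not published as code, but the host-side dispatch step of on-device function calling can be sketched generically. The tool names below (`fetch_live_info`, `organize_files`) are hypothetical placeholders matching the demo's description, and the JSON call format is an assumption, not Granite's exact output schema:

```python
import json

# Hypothetical local tools; the demo's real implementations are not published.
def fetch_live_info(topic: str) -> str:
    return f"latest headlines about {topic}"

def organize_files(folder: str) -> str:
    return f"organized {folder} by file type"

TOOLS = {"fetch_live_info": fetch_live_info, "organize_files": organize_files}

def dispatch(model_output: str) -> str:
    # The model emits a JSON function call; the host app parses it and
    # executes the matching local tool. Nothing leaves the device.
    call = json.loads(model_output)
    return TOOLS[call["name"]](**call["arguments"])

# A call string shaped like what a function-calling model might emit.
print(dispatch('{"name": "organize_files", "arguments": {"folder": "~/Downloads"}}'))
# → organized ~/Downloads by file type
```

The point of running this loop on the NPU is latency: at 60 tokens/sec, the model can emit a complete call like the one above in well under a second, keeping the agent responsive without a network round trip.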
-
NexaML supports the latest models across all Qualcomm platforms, from compute to mobile, automotive, and IoT, fully on NPU, with real-time speed and rapid turnaround. For example, Liquid AI’s LFM2-1.2B, a new hybrid model combining multiplicative-gate and convolution layers, now runs 100% on-device on Qualcomm Hexagon NPUs across all platforms:
- Dragonwing IQ-9075 (IoT): 45 tokens/sec
- SA8295 (Automotive): 37 tokens/sec
- Samsung S25 (Mobile): 89 tokens/sec
- Snapdragon X Elite (Compute): 52 tokens/sec
This marks the first time a state-of-the-art small language model runs across Qualcomm’s full ecosystem under one unified inference engine: NexaML. Check out the demo below or reach out to explore model integration for Qualcomm platforms. See blog in comments. Manoj Khilnani, Chun-Po Chang, Srinivasa Deevi, Devang Aggarwal, Madhura Chatterjee, Damanjit Singh
-
This week at Nexa 🚀: SOTA model support, Python library, community showcase
1) Liquid AI's LFM2-1.2B on the Qualcomm Hexagon NPU at 52 tok/s on Snapdragon X Elite, powered by NexaSDK.
2) Qwen's Qwen3-VL now runs locally on Qualcomm Oryon CPU, Adreno GPU, and Hexagon NPU with NexaSDK, powered by NexaML (first and only on Snapdragon).
3) Python bindings are live: pip install nexaai to run LLMs, VLMs, ASR, and embedding models on NPU, GPU, and CPU from Python.
4) NexaML Profiling Tool (preview): shows end-to-end latency and per-op breakdown to tune inference performance on NPU.
5) Community: Nexa AI was featured at the Qualcomm and AMD booths at the PyTorch conference this week.