🧠"Less is More: Recursive Reasoning with Tiny Networks" — a new take on AI reasoning from Samsung SAIL Montreal. This paper challenges the classic idea that bigger is always better in LLMs. Instead, it shows how smart recursion with tiny networks outperforms huge models (like Deepseek, Gemini, Claude) on hard tasks — Sudoku, Maze, and ARC-AGI — using a fraction of the data and parameters: • Core Problem: LLMs often stumble on tough reasoning puzzles. Hierarchical Reasoning Model (HRM) was an inspired solution using two small networks, biological analogy, and deep supervision. But — is complexity required? • New Approach: Tiny Recursive Model (TRM) simplifies the design radically — using just one tiny (2-layer, 7M parameter) network and pure recursion. No biological hierarchy, no fixed-point math — just step-by-step improvement. • Results: TRM beats HRM at generalization, pushes state-of-the-art on tough benchmarks (Sudoku-Extreme test accuracy from 55%→87%, ARC-AGI-1 from 40%→45%, ARC-AGI-2 from 5%→8%). Models like Gemini 2.5 Pro and o3-mini fall short at higher cost. • Implications: Recursion, not scale, unlocks smarter reasoning on sparse data. Deep supervision and minimal architectures may hold the key for future LLM and AGI advances. • Real-world Design: TRM removes unnecessary complexity — no extra networks, math, or biological analogies. Results fuel new ways to build efficient, reliable AI for reasoning-intensive tasks. What's next for recursive models? Can smaller, adaptive architectures redefine how we build LLMs for real-world intelligence? 📄 Attached is the original research PDF for full details. Curious how this approach might influence future model architectures?
"Tiny Networks Outperform Large LLMs in AI Reasoning"
More Relevant Posts
-
Less is More: How Tiny Recursive Models Redefine AI Reasoning Efficiency
The paper introduces the Tiny Recursive Model (TRM), a simplified yet powerful architecture that surpasses large-scale LLMs on complex reasoning tasks like Sudoku, Maze solving, and ARC-AGI, all with just 7M parameters. While large models rely on massive data and computational resources, TRM shows that intelligent design and recursive reasoning can deliver state-of-the-art performance with minimal parameters.
Key Highlights:
Efficient Reasoning: TRM uses a single tiny 2-layer network to iteratively refine its answers; no complex biological analogies or fixed-point mathematics required.
Superior Generalization: Achieves 45% test accuracy on ARC-AGI-1 and 8% on ARC-AGI-2, outperforming models like DeepSeek R1, Gemini 2.5 Pro, and o3-mini despite having less than 0.01% of their parameters.
Simplicity Over Complexity: Unlike the Hierarchical Reasoning Model (HRM), TRM eliminates dual networks and unnecessary biological justification, focusing on transparent, data-driven recursion.
Performance Gains (HRM → TRM):
Sudoku-Extreme: 55% → 87%
Maze-Hard: 75% → 85%
ARC-AGI-1: 40% → 45%
ARC-AGI-2: 5% → 8%
Attention-Free Design: For small, fixed-length tasks (like Sudoku), TRM replaces self-attention with an efficient MLP layer, improving generalization and speed.
Optimized Training: Incorporates deep supervision and an exponential moving average (EMA) of weights for stable convergence (a sketch of the EMA update follows below), minimizing overfitting on small datasets.
Read the full research paper: https://lnkd.in/gPQ6CHBu
Github Link: https://lnkd.in/g-GnTt3j
👉 Join the Telegram group for handpicked resources, learning materials and updates! Link: https://t.me/aibulletin56
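A minimal sketch of the EMA weight averaging mentioned above, in PyTorch. The decay value and the pattern of evaluating with a "shadow" copy are standard practice and my assumptions, not the paper's exact settings.

```python
import copy
import torch

@torch.no_grad()
def ema_update(ema_model, model, decay=0.999):
    """Blend current weights into a slow-moving shadow copy of the model."""
    for e, p in zip(ema_model.parameters(), model.parameters()):
        e.mul_(decay).add_(p, alpha=1 - decay)

# Usage: keep a frozen copy, refresh it after every optimizer step,
# and run evaluation with the smoother shadow weights.
model = torch.nn.Linear(16, 16)
ema_model = copy.deepcopy(model)
# ... optimizer.step() ...
ema_update(ema_model, model)
```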
-
Tiny Models, Infinite Loops: How Recursive Reasoning Beats Giant LLMs
The new paper "Less is More: Recursive Reasoning with Tiny Networks" by Alexia Jolicoeur-Martineau (Samsung SAIL Montreal) delivers a stunning result: 10,000× fewer parameters, yet better reasoning. While today's Large Language Models (LLMs) rely on billions of parameters to mimic reasoning, this work flips the script: what if reasoning doesn't need scale, but recursion?
🧠 The Core Idea
Instead of memorizing patterns, the proposed Tiny Recursive Model (TRM) "thinks in loops." A single small network repeatedly refines its answers, updating internal states and revisiting its own output, like a human double-checking logic steps. Each recursion acts as a thought iteration, improving on the previous one. This happens without growing the model, only deepening its reasoning path.
⚙️ Architecture in Brief (a code sketch follows below)
Single lightweight network (~7M parameters)
Two key steps:
1. Update latent reasoning state: z ← net(x, y, z)
2. Update answer: y ← net(y, z)
Looped T times, with only the final pass backpropagated
Uses EMA (Exponential Moving Average) for stability
No large context windows, and for small fixed-grid tasks no attention layers; just recursion and feedback
The design eliminates the heavy fixed-point and dual-network assumptions of the older Hierarchical Reasoning Model (HRM). Training becomes faster, simpler, and less memory-intensive.
📊 Results That Redefine Efficiency
On structured reasoning benchmarks like Sudoku-Extreme, Maze-Hard, ARC-AGI-1, and ARC-AGI-2, TRM outperforms both HRM and several LLMs (Gemini 2.5 Pro, o3-mini, DeepSeek R1) despite using less than 0.01% of their parameters.
Task | TRM Accuracy | LLM Baselines
Sudoku-Extreme | ~87% | near 0 to a few %
ARC-AGI-1 | ~45% | Gemini 2.5 Pro ≈ 37%
These results show that reasoning ≠ scaling: iterative refinement can outperform raw size.
🚀 Why This Matters
Recursive reasoning could shift AI from "memorizing data" to "thinking in steps." It opens the door to:
Edge AI and on-device reasoning (phones, IoT)
Greener compute with drastically lower energy use
Better generalization on small, logic-heavy datasets
It's a powerful reminder that intelligence may not come from bigger networks, but from smarter loops.
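A minimal, hedged sketch of the two-step recursion above, in PyTorch. The MLP stand-in, dimensions, and loop counts are illustrative assumptions, not the paper's exact code (which operates on token grids, not flat vectors).

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Stand-in for the single small network; TRM itself is a tiny 2-layer block."""
    def __init__(self, dim=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3 * dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, a, b, c):
        return self.mlp(torch.cat([a, b, c], dim=-1))

def trm_cycle(net, x, y, z, n=6, T=3):
    for _ in range(T):
        for _ in range(n):
            z = net(x, y, z)                # refine the latent reasoning state
        y = net(torch.zeros_like(x), y, z)  # refine the answer (x slot unused here)
    return y, z

net = TinyNet()
x = torch.randn(1, 128); y = torch.zeros(1, 128); z = torch.zeros(1, 128)
y, z = trm_cycle(net, x, y, z)
```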
-
The Tiny Recursive Model (TRM) paper upends conventional wisdom in AI by showing that clever architecture beats brute-force scale. Rather than relying on massive models, TRM uses a simple two-layer, 7M-parameter network that recursively refines its answers by iteratively updating a latent reasoning "scratchpad." The process mirrors deep thinking: TRM drafts an initial answer, repeatedly self-critiques its internal logic, and revises until convergence.
Key aspects:
≫ Radical Simplicity: TRM delivers higher generalization than the previous HRM approach, using less than a quarter of the parameters and half the layer count, with no need for biological analogies or esoteric math.
≫ Efficient Recursion: The model's core cycle (draft, reason intensely, revise) is run multiple times per sample (up to 16), combining recursion and deep supervision to incrementally polish answers (see the training-loop sketch after this list).
≫ Empirical Wins: TRM outperforms massive LLMs and past SOTA on the hardest AI reasoning benchmarks (over 87% on Sudoku-Extreme, 85% on Maze-Hard, 45%/8% on ARC-AGI-1/2), all with a tiny computational footprint.
≫ Generalization, Not Overfitting: Adding complexity or size hurts TRM's generalization. Its design shows that "small and deep" recursion is more robust than "big and shallow" networks, especially under limited data.
≫ Blueprint for the Future: TRM's success validates old ideas about separating "reasoning" from "acting" and offers a compelling template for efficient, high-performing, neurosymbolic AI that runs on modest hardware.
This paper sets a new direction for AI reasoning, one where parameter-light, recursive architectures can outperform giant models. TRM could mark a shift toward smarter, deliberate computation where model design, not size, is decisive.
#recursive #efficientAI #architecture #neurosymbolic
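A hedged sketch of that up-to-16-step improvement loop with deep supervision, in PyTorch. The refine callable, output head, and loss choice are my assumptions; the key idea shown is supervising the answer after every cycle and detaching carried state so each cycle trains on its own recursion.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def deep_supervision_loss(refine, head, x, y, z, target, n_sup=16):
    """refine(x, y, z) -> (y, z): one draft -> critique -> revise cycle.
    Supervise after every cycle; detach so gradients stay local to each cycle."""
    total = 0.0
    for _ in range(n_sup):
        y, z = refine(x, y, z)
        total = total + F.cross_entropy(head(y), target)
        y, z = y.detach(), z.detach()
    return total / n_sup

# Toy usage with a stand-in refine step (not the real TRM cycle):
dim, n_classes = 64, 10
head = nn.Linear(dim, n_classes)
refine = lambda x, y, z: (y + 0.1 * z, z + 0.1 * x)
x = torch.randn(4, dim); y = torch.zeros(4, dim); z = torch.zeros(4, dim)
target = torch.randint(0, n_classes, (4,))
deep_supervision_loss(refine, head, x, y, z, target).backward()
```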
-
Tiny Recursion Model (TRM), a new paper from Samsung SAIL Montreal, introduces a minimalist alternative to the Hierarchical Reasoning Model (HRM). By replacing two separate recurrent modules with a single 2-layer network that iteratively refines a latent reasoning state z and an answer y, TRM achieves superior generalization across reasoning-intensive benchmarks such as Sudoku-Extreme, Maze-Hard, and ARC-AGI. Notably, TRM dispenses with HRM's biological justifications and fixed-point gradient approximations, backpropagating through full recursions instead (a sketch of the difference follows below). With fewer than 7 million parameters, TRM surpasses HRM's accuracy and even outperforms large-scale LLMs (e.g., Gemini 2.5 Pro, DeepSeek R1) at less than 0.01% of their size.
TRM's design embodies the principle that iteration can replace scale: recursive refinement steps substitute for billions of parameters. Its findings point toward a new class of compact reasoning engines that can complement or embed within large models: small, deterministic solvers that refine intermediate outputs instead of relying on stochastic chain-of-thought sampling. This reflects a shift in AI engineering from parameter scaling to compute recycling through recursion, suggesting that future LLM architectures may integrate lightweight recursive cores for structured reasoning.
As physicist Murray Gell-Mann once said, "You don't need something more to get something more." TRM illustrates this truth in machine intelligence: beauty and efficiency often arise from disciplined simplicity.
While TRM's results are impressive, its heavy reliance on data augmentation (up to 1000 variants per item) and small problem domains limits generalizability. Future work should test the model's robustness under reduced augmentation, explore scaling laws across deeper recursions, and extend the deterministic framework toward generative or stochastic reasoning. Integrating TRM into hybrid LLM systems, where large models handle abstraction and tiny recursive networks handle verification, could offer a compelling research frontier that unites theoretical elegance with engineering practicality.
Read more in Alexia's paper: https://lnkd.in/gDqKubch
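A minimal sketch of that gradient distinction, assuming a generic refinement function f; this is not the authors' code. The contrast: a fixed-point-style approximation backpropagates only through the final step, while full backpropagation flows through every step of the recursion.

```python
import torch

def recurse_full_grad(f, z, n_steps=6):
    """TRM-style: gradients flow through all n_steps refinements."""
    for _ in range(n_steps):
        z = f(z)
    return z

def recurse_one_step_grad(f, z, n_steps=6):
    """HRM-style approximation: run n_steps - 1 refinements without
    gradient, then backprop only through the final step."""
    with torch.no_grad():
        for _ in range(n_steps - 1):
            z = f(z)
    return f(z)

f = torch.nn.Linear(8, 8)
z0 = torch.randn(2, 8)
loss_full = recurse_full_grad(f, z0).sum()     # deep gradient path
loss_one = recurse_one_step_grad(f, z0).sum()  # shallow, cheaper gradient path
```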
-
🚀 **Traditional LLM function calls are killing your throughput.**
Check out my AI development services: https://lnkd.in/dh8aYj2H
Most developers don't realize that synchronous function calling in MCP (Model Context Protocol) contexts creates a massive bottleneck. Your model sits idle, waiting for tool execution to complete before continuing inference.
**Here's what's happening under the hood:**
• Model generates tool call → **BLOCKS** → waits for execution → continues reasoning
• Result: wasted compute cycles and frustrated users
**The async revolution changes everything:**
✅ Model emits tool calls and **keeps reasoning**
✅ Multiple tools execute in parallel
✅ Inference pipeline stays hot
✅ 40-60% improvement in overall throughput
**Real-world impact I've measured:**
- API response times: 2.3s → 0.9s
- Concurrent request handling: 3x improvement
- Resource utilization: 85% vs 45%
The key is implementing proper async queuing with callback handlers. Your model doesn't need to wait: it can start processing the next reasoning step while tools execute in the background (a minimal sketch follows below).
**Pro tip:** Use event-driven architectures with message queues to decouple tool execution from inference completely.
This isn't just about speed; it's about building LLM applications that scale.
What's your biggest bottleneck when working with tool-enabled LLMs? 🤔
#AsyncProgramming #MCP #LLM #AI #MachineLearning #SoftwareDevelopment #Performance #TechOptimization
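A minimal sketch of the parallel-tool pattern described above, using Python's asyncio. The tool names and the sleep-based stub are hypothetical placeholders, not the MCP API; real code would dispatch these calls over an MCP client transport.

```python
import asyncio

async def call_tool(name: str, args: dict) -> dict:
    """Stub for a tool invocation; swap in a real MCP client call here."""
    await asyncio.sleep(0.5)  # simulate tool/network latency
    return {"tool": name, "result": f"ok({args})"}

async def handle_turn(tool_calls: list[tuple[str, dict]]) -> list[dict]:
    # Launch every tool call at once instead of awaiting them serially;
    # inference for the next reasoning step can proceed in the meantime.
    tasks = [asyncio.create_task(call_tool(name, args)) for name, args in tool_calls]
    # ... continue model inference here while tools run in the background ...
    return await asyncio.gather(*tasks)  # collect results only when needed

if __name__ == "__main__":
    calls = [("web_search", {"q": "vector databases"}), ("calculator", {"expr": "2+2"})]
    print(asyncio.run(handle_turn(calls)))
```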
-
A tiny ~7M-parameter model reportedly outperformed giant reasoning models (DeepSeek-R1, Gemini 2.5 Pro, o3-mini) on the ARC-AGI-1 and ARC-AGI-2 benchmarks. The trick isn't more weights; it's a different procedure.
TL;DR: TRM trades static parameter scale for iterative, algorithmic reasoning at inference: generate → introspect → refine → repeat.
How it works (5 steps):
1. Draft: produce a complete candidate solution (one-shot draft), not token-by-token assembly.
2. Scratchpad: create an internal latent workspace to hold intermediate reasoning.
3. Self-critique: run a focused inner loop (e.g., ~6 passes) that checks the draft against the problem and updates the scratchpad.
4. Revise: use the improved scratchpad to produce a better draft.
5. Repeat: iterate the cycle many times (reported up to ~16) until convergence or budget.
Why a tiny model can be "smarter":
• Compute vs. parameters: iterative refinement reallocates capacity into compute at inference, emulating deeper reasoning without more weights.
• Effective algorithmic depth: the inner loop unrolls an internal algorithm, giving selective, task-specific depth.
• Self-verification: repeated critique filters contradictions and reduces brittle outputs.
• Amortized heuristics: the model learns reusable reasoning routines and calls them repeatedly, amplifying a small parameter set.
• Better cost/latency tradeoffs for constrained deployments: more compute per query, far fewer stored parameters.
Implications:
• Business: an algorithmic advantage, cheaper inference with high reasoning quality.
• Research: evidence that architecture plus procedure can rival raw scale; supports neuro-symbolic and iterative paradigms.
• Practice: makes high-quality reasoners feasible on edge/enterprise hardware.
Would you run ablations: (a) inner loop off, (b) cycles varied, (c) compute × accuracy curves? That's the experiment that shows whether the recursion or the learned prior is doing the heavy lifting (a tiny sweep harness is sketched below).
#AI #MachineLearning #EfficientAI #Reasoning #TinyModels #NeuroSymbolic
https://lnkd.in/gTRZh53u
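A hedged sketch of ablation (b), a recursion-budget sweep; run_model and the examples are hypothetical stand-ins, not anything from the paper. Setting n_cycles=0 covers ablation (a), isolating the learned prior from the recursion.

```python
def sweep_cycles(run_model, examples, cycle_grid=(0, 1, 2, 4, 8, 16)):
    """run_model(x, n_cycles) -> predicted answer.
    Returns accuracy at each recursion budget."""
    results = {}
    for n in cycle_grid:
        correct = sum(run_model(x, n_cycles=n) == y for x, y in examples)
        results[n] = correct / len(examples)
    return results

# Toy usage: a fake "model" that solves more items as its cycle budget grows.
examples = [(i, i * 2) for i in range(10)]
fake_model = lambda x, n_cycles: x * 2 if x < n_cycles else -1
print(sweep_cycles(fake_model, examples))  # accuracy rises with the budget
```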
-
Researchers demonstrate breakthrough efficiency with a 7-million-parameter Tiny Recursive Model outperforming major reasoning systems, including DeepSeek-R1, Gemini 2.5 Pro, and o3-mini, on the ARC-AGI benchmarks. This achievement challenges the scaling paradigm, suggesting that sophisticated reasoning can emerge from architectural innovation rather than parameter count. The development signals fundamental shifts in AI economics, enabling powerful reasoning capabilities on edge devices while dramatically reducing computational costs and energy requirements for enterprise deployments. #LowerAlabamaAI #MachineLearning #AIInfrastructure
-
Vector Databases Guide: RAG Applications 2025
Vector Databases: The Essential Guide to Powering Your RAG Applications in 2025
What Are Vector Databases and Why They Matter
Understanding Vector Embeddings and Semantic Search
Vector databases store and retrieve high-dimensional numerical representations of data, known as embeddings. When you process text, images, or other content through a machine learning model, it outputs an array of floating-point numbers, typically 384 to 1536 dimensions for modern embedding models. These vectors capture semantic meaning in a mathematical space where similar concepts cluster together.
Unlike traditional keyword matching, semantic search uses vector similarity to find contextually relevant results. A search for "feline companion" can return documents about "cats" because their embeddings are mathematically close, even without shared words. This capability has become essential for building AI applications that understand intent rather than just matching strings (a toy similarity-search sketch follows below).
Traditional https://lnkd.in/gUKUbrVn
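A toy sketch of the semantic-search idea above, using cosine similarity over NumPy arrays. The 4-dimensional vectors and their labels are made up for illustration; real embeddings have 384 to 1536 dimensions, as noted.

```python
import numpy as np

def cosine_top_k(query: np.ndarray, corpus: np.ndarray, k: int = 2) -> list[int]:
    """Indices of the k corpus embeddings most similar to the query."""
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    return np.argsort(-(c @ q))[:k].tolist()

# Made-up embeddings: "cats" and "feline companion" land close together.
corpus = np.array([
    [0.90, 0.10, 0.00, 0.20],   # "cats"
    [0.10, 0.80, 0.30, 0.00],   # "stock markets"
    [0.85, 0.20, 0.05, 0.10],   # "feline companion"
])
query = np.array([0.88, 0.15, 0.02, 0.15])  # query embedding near the cat rows
print(cosine_top_k(query, corpus))          # -> the two cat-related indices
```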
-
The Tiny Recursive Model (TRM) was detailed in a paper dated October 7, 2025, titled "Less is More: Recursive Reasoning with Tiny Networks." TRM uses only 7 million parameters, yet reportedly beats large language models on reasoning benchmarks, surpassing DeepSeek-R1 and Gemini 2.5 Pro. TRM scores 45% on ARC-AGI-1 and 8% on ARC-AGI-2; Gemini 2.5 Pro achieved only about 4.9% on ARC-AGI-2.
The model works through iterative self-refinement, and its key narrative favors architecture over brute-force scale. It first generates an initial solution draft, then uses a "scratchpad" for internal reasoning, recursively critiquing its own logic and refining the answer over up to 16 supervised improvement steps, a technique called deep supervision. TRM's algorithm is efficient, requiring only one forward pass per optimization step, and the approach avoids complex fixed-point theorems.
These results support neuro-symbolic AI ideas and could lead to edge-deployable "reasoners." The benchmark results are factually supported by the paper. However, claims of general superiority over large models are exaggerated; the ARC scores are still far below human level.
Source: https://lnkd.in/gEYWCE7A https://lnkd.in/gxsqc7ZF
-
Just read a nice article on arXiv, "Less is More: Recursive Reasoning with Tiny Networks" by Alexia Jolicoeur-Martineau. It's a good read with a cup of morning coffee.
For so long, we've all been conditioned to think about sheer scale in AI, right? We assume that winning the parameter race is how magic happens. But guess what? A tiny model just beat the giants at their own game! (allegedly)
The new Tiny Recursive Model has only 7 million parameters. That's peanuts compared to the massive LLMs we usually talk about. And yet, this little engine is actually outperforming models like Gemini 2.5 Pro on complex reasoning challenges like Sudoku-Extreme and ARC-AGI. I mean, we're talking about a model with less than 0.01% of the parameters delivering better results on a pure reasoning task!
So, what's its secret sauce? It's all about the process, not the size. Instead of trying to crank out the perfect answer in one go (which, let's be honest, can be brittle), TRM uses a recursive approach. It literally makes a guess, then refines that answer step-by-step, over and over. It's like the model is taking notes, thinking things through, and doing self-correction as it goes.
This really makes you realize that for our super-hard, structured problems, the answer might be this kind of efficient, specialized reasoning engine.
But here's the kicker: as cool as this is, we haven't seen a large-scale enterprise fully adopt and productionize this model structure yet. It's fantastic in the lab, but can this tiny, recurrent engine handle the scale and robustness demands of real-time commerce and data? That's the billion-dollar question. Once we know better cost, latency, and throughput metrics, this will be an interesting option to pursue.
It's an incredible lesson in efficiency and smart architecture. Do we really need a massive, expensive net for everything, or can we start doing more with less?
-