Sliding window attention (SWA) powers frontier hybrid models for efficiency. Is there something better? Introducing Phalanx, a faster, higher-quality drop-in replacement for SWA. Phalanx is a new family of hardware- and numerics-aware windowed layers designed around data locality and jagged, block-aligned windows that map directly to GPUs. In training, Phalanx delivers 10–40% higher end-to-end throughput at 4K–32K context lengths over optimized SWA-hybrids and Transformers by reducing costly inter-warp communication. Today, we are releasing the technical report, a blog post, and the Phalanx kernels in spear, our research kernel library. We are hiring.
Blog: https://lnkd.in/gSC-H4FF
Code: https://lnkd.in/grkQ2FUb
Report: https://lnkd.in/guEwMsMP
About us
- Website: https://radicalnumerics.ai/
- Industry: Research Services
- Company size: 2-10 employees
- Type: Privately Held
Updates
- Introducing RND1, the most powerful base diffusion language model (DLM) to date. RND1 (Radical Numerics Diffusion) is an experimental DLM with 30B parameters (3B active) and a sparse MoE architecture. We are making it open source, releasing weights, training details, and code to catalyze further research on DLM inference and post-training. We are researchers and engineers (DeepMind, Meta, Liquid, Stanford) building the engine for recursive self-improvement (RSI) and using it to accelerate our own work. Our goal is to let AI design AI. We’re hiring. More on RND1 models and how we are training them:
  - Blog: https://lnkd.in/gBWEPC5p
  - Code: https://lnkd.in/gemEQCvd
  - Report: https://lnkd.in/gbxE8FrG
  - Hugging Face weights: https://lnkd.in/g4gXG5Q2