Stars
DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle
The Station, an open-world multi-agent environment that models a miniature scientific ecosystem.
Station Viewer, a tool for viewing data from completed Station instances.
Code and Data for Paper "AutoTIR: Autonomous Tools Integrated Reasoning via Reinforcement Learning"
程序员延寿指南 | A programmer's guide to living longer
Simple retrieval from LLMs at various context lengths to measure accuracy
Official PyTorch implementation of One-Minute Video Generation with Test-Time Training
🔥 A minimal training framework for scaling FLA models
Code release for paper "Test-Time Training Done Right"
[NeurIPS 2025 spotlight] QFFT, Question-Free Fine-Tuning for Adaptive Reasoning
Towards Fine-grained Audio Captioning with Multimodal Contextual Cues
[ACL 2025] Exploring Compositional Generalization of Multimodal LLMs for Medical Imaging
Reasoning-Table: Exploring Reinforcement Learning for Table Reasoning
[EMNLP 2025] RAG-Instruct: Boosting LLMs with Diverse Retrieval-Augmented Instructions
[NeurIPS 2025] Structural Entropy Guided Agent for Detecting and Repairing Knowledge Deficiencies in LLMs
Code and Data for IJCAI 2025 Paper "SetKE: Knowledge Editing for Knowledge Elements Overlap"
