Gabriel Brostow’s Post

It's still surprisingly hard to make models that can "discuss" scene layout. This is a modest first step. Come check it out and we'll tell you all about it! Also, a shoutout to yesterday's excellent Workshop on Vision Foundation Models for Accessibility (https://lnkd.in/exc8AjZB): Yuheng W., Jon Froehlich, Yapeng Tian, Chu Li, Yuhang Zhao. Among many cool problems, last-20m navigation is my favorite motivator for this scene-understanding research (PlaceIt3D and beyond).

Guillermo Garcia-Hernando

Researcher at Niantic Spatial

We’ll be in Honolulu 🌺 at #ICCV2025 this Tuesday afternoon for our poster (Session 2) — come see us present:

🧩 𝐏𝐥𝐚𝐜𝐞𝐈𝐭3𝐃: Language-Guided Object Placement in Real 3D Scenes 🧩

What feels natural to us — “put the lamp next to the sofa” or “move this character where it can’t be seen from the doorway” — remains one of the hardest challenges in AI. Translating language into grounded 3D actions requires reasoning about free space, objects, and intent.

In this paper, we introduce:
• A benchmark for evaluating language-guided 3D object placement
• A dataset with 97k+ examples (scalable to millions)
• A baseline method (PlaceWizard) that shows what’s possible today

👉 Explore the full paper, code and dataset here: https://lnkd.in/dMiJ8HeX

Work by Ahmed Abdelreheem, Filippo Aleotti, Jamie Watson, Zawar Qureshi, Abdelrahman Eldesokey, PhD, Peter Wonka, Gabriel Brostow, Sara Vicente, and Guillermo Garcia-Hernando.

Research from Niantic Spatial, Inc.

Yuheng W.

CS PhD Student@UW-Madison | HCI, Accessibility, AR/VR, AI-powered AR system

Thank you Gabriel Brostow for coming to the workshop! We had a very insightful discussion. And this paper looks very interesting! It surprises me how hard it is for LLMs to reason about 3D layout.
