Plan-Recognition-Driven Attention Modeling for Visual Recognition

Zha, Yantian; Li, Yikang; Yu, Tianshu; Kambhampati, Subbarao; Li, Baoxin

arXiv:1812.00301 (cs)

[Submitted on 2 Dec 2018 (v1), last revised 14 Oct 2021 (this version, v2)]

Abstract: Human visual recognition of activities or external agents involves an interplay between high-level plan recognition and low-level perception. A natural question follows: can low-level perception be improved by high-level plan recognition? We formulate the problem of leveraging recognized plans to generate better top-down attention maps \cite{gazzaniga2009,baluch2011} that improve perception performance. We call these maps plan-recognition-driven attention maps. To address this problem, we introduce the Pixel Dynamics Network. The Pixel Dynamics Network serves as an observation model: given an observation of pixels and a pixel-level action feature, it predicts the next states of object points at each pixel location. This amounts to internally learning a pixel-level dynamics model. The Pixel Dynamics Network is a Convolutional Neural Network (ConvNet) with a specially designed architecture, so it can take advantage of the parallel computation of ConvNets while learning the pixel-level dynamics model. We further prove the equivalence between the Pixel Dynamics Network as an observation model and the belief update in the partially observable Markov decision process (POMDP) framework. We evaluate the Pixel Dynamics Network on event recognition tasks. We build an event recognition system, ER-PRN, which uses the Pixel Dynamics Network as a subroutine to recognize events based on observations augmented by plan-recognition-driven attention.
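
For readers unfamiliar with the belief update referred to in the abstract, the following is a minimal sketch of the textbook discrete POMDP belief update, b'(s') ∝ O(o | s', a) Σ_s T(s' | s, a) b(s). It is not the Pixel Dynamics Network and is not taken from the paper; the function name, the nested-list table layout, and the two-state toy numbers are all assumptions made for illustration.

```python
def belief_update(belief, transition, observation, action, obs):
    """One step of the textbook discrete POMDP belief update.

    All argument layouts are illustrative assumptions:
      belief[s]                   prior probability of state s
      transition[action][s][s2]   P(s2 | s, action)
      observation[action][s2][o]  P(o | s2, action)
    """
    n = len(belief)
    # Prediction step: push the prior through the transition model.
    predicted = [
        sum(transition[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    # Correction step: weight each predicted state by the observation likelihood.
    unnormalized = [
        observation[action][s2][obs] * predicted[s2] for s2 in range(n)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    # Normalize so the posterior sums to one.
    return [u / total for u in unnormalized]


# Hypothetical two-state, one-action toy problem (numbers are illustrative only).
T = [[[0.9, 0.1], [0.2, 0.8]]]      # T[a][s][s2] = P(s2 | s, a)
O = [[[0.75, 0.25], [0.25, 0.75]]]  # O[a][s2][o] = P(o | s2, a)
posterior = belief_update([0.5, 0.5], T, O, action=0, obs=0)
```

In this toy case, observing `obs=0` shifts the belief toward state 0, because that observation is more likely in state 0 and the transition model also favors staying there.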

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:1812.00301 [cs.CV]
	(or arXiv:1812.00301v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1812.00301

Submission history

From: Yantian Zha
[v1] Sun, 2 Dec 2018 02:07:06 UTC (659 KB)
[v2] Thu, 14 Oct 2021 05:03:23 UTC (2,359 KB)
