Large-scale Semantic Visual Search
NGUYEN ANH TUAN
tuannguyen.research@gmail.com
2016/07/17
About me
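• Graduate School of Information Science and Technology, The University of Tokyo
Second-year master's student
• Research topics: object retrieval, information retrieval, etc.
• Hobbies: swimming, Go
• Blog: https://imsmarxen68.tumblr.com/
NGUYEN ANH TUAN, The University of Tokyo, Information Science and Technology, second-year master's student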
A picture is worth a thousand words
Outline
• Semantic Visual Search
• A visual search framework
Image credits: http://ai.stanford.edu/~jkrause/cars/car_dataset.html
[Figure: search pipeline. Feature extraction → Feature aggregation → Feature matching → Re-ranking; feature matching yields the preliminary results, re-ranking yields the final results]
Visual search
Image credits: http://ai.stanford.edu/~jkrause/cars/car_dataset.html
Image credits: http://google.com
What’s the problem?
• Semantic difficulties: fine-grained differences
Image credits: http://ai.stanford.edu/~jkrause/cars/car_dataset.html
But what about the search problem?
Image credits: http://ai.stanford.edu/~jkrause/cars/car_dataset.html
[Figure: a query image compared against database images]
But what about the search problem?
Image credits: http://ai.stanford.edu/~jkrause/cars/car_dataset.html
[Figure: a query image compared against database images, with scores 0.1, 0.5, and 0.2]
• A ranking problem across a range of fine-grained variations
But what about the search problem?
Image credits: http://ai.stanford.edu/~jkrause/cars/car_dataset.html
[Figure: a query image compared against database images, with scores 0.1, 0.5, and 0.2]
• Goal: find visual representations that capture the fine-grained local information in images
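The ranking view of search can be sketched in a few lines. This is a minimal illustration, assuming every image has already been turned into a feature vector and that cosine similarity is the score. The toy vectors, the image names, and the choice of cosine similarity are assumptions made for the example and do not come from the slides.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_database(query, database):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical toy feature vectors, not real image descriptors.
query = [1.0, 0.0, 1.0]
database = {
    "car_a": [0.0, 1.0, 0.0],
    "car_b": [1.0, 0.1, 0.9],
    "car_c": [0.5, 0.5, 0.0],
}
ranking = rank_database(query, database)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

With fine-grained classes the scores of different models can be very close, which is why the quality of the representation matters more than the ranking step itself.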
Large-scale Visual Search
Robust feature extraction
• Robust to
– Scale changes
– Rotation and affine changes
– Blur, sharpening, …
Image credits: http://ai.stanford.edu/~jkrause/cars/car_dataset.html
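One common way to obtain rotation robustness is to describe a patch by a histogram of gradient orientations and then align it to its dominant orientation, the idea used by SIFT-style descriptors. The sketch below is a heavily simplified illustration of that single step; the slides do not say which detector or descriptor is used, and the function names, the 8-bin histogram, and the sample angles are assumptions.

```python
def orientation_histogram(angles_deg, bins=8):
    """Count gradient orientations (in degrees) into equal-width bins."""
    hist = [0] * bins
    width = 360.0 / bins
    for angle in angles_deg:
        hist[int((angle % 360.0) // width) % bins] += 1
    return hist

def rotation_normalized(hist):
    """Circularly shift the histogram so the dominant bin comes first."""
    start = hist.index(max(hist))
    return hist[start:] + hist[:start]

# Hypothetical gradient orientations of one image patch.
patch_angles = [10, 12, 15, 100, 190, 200, 14, 280]
# The same patch after rotating the image by 90 degrees.
rotated_angles = [(a + 90) % 360 for a in patch_angles]

descriptor = rotation_normalized(orientation_histogram(patch_angles))
rotated_descriptor = rotation_normalized(orientation_histogram(rotated_angles))
print(descriptor == rotated_descriptor)
```

The match is exact here because 90 degrees is a multiple of the bin width; for arbitrary rotations the binning makes the descriptor only approximately invariant.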
Statistical kernels
• Bag-of-Features (BoF)
• Fisher kernel (GMM) [1]
• VLAD (K-means) [2]
Image credits: http://www.mathworks.com/matlabcentral/
[1] F. Perronnin, C. Dance, “Fisher Kernels on Visual Vocabularies for Image Categorization,” in Proc. CVPR, IEEE, 2007.
[2] H. Jégou, F. Perronnin, M. Douze, J. Sánchez, P. Pérez, C. Schmid, “Aggregating Local Image Descriptors into Compact Codes,” IEEE Trans. Pattern Anal. Mach. Intell. 34 (2012) 1704–1716.
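Two of the aggregation schemes listed on this slide can be sketched as follows: Bag-of-Features counts how many local descriptors fall into each visual word, while VLAD accumulates the residuals between descriptors and their nearest centroid. This is a minimal sketch that assumes the centroids were already learned with k-means; the 2-D descriptors and centroids are toy values, and the Fisher kernel (which needs a GMM) is not shown.

```python
import math

def nearest_centroid(descriptor, centroids):
    """Index of the centroid closest to the descriptor (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centroids)), key=lambda k: sq_dist(descriptor, centroids[k]))

def bag_of_features(descriptors, centroids):
    """Histogram of visual-word assignments."""
    hist = [0] * len(centroids)
    for d in descriptors:
        hist[nearest_centroid(d, centroids)] += 1
    return hist

def vlad(descriptors, centroids):
    """Sum of residuals per centroid, flattened and L2-normalized."""
    dim = len(centroids[0])
    residuals = [[0.0] * dim for _ in centroids]
    for d in descriptors:
        k = nearest_centroid(d, centroids)
        for i in range(dim):
            residuals[k][i] += d[i] - centroids[k][i]
    flat = [v for row in residuals for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

# Hypothetical centroids (normally learned by k-means) and local descriptors.
centroids = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0], [11.0, 12.0]]

bof_vector = bag_of_features(descriptors, centroids)
vlad_vector = vlad(descriptors, centroids)
print(bof_vector, vlad_vector)
```

Either way, an image with any number of local descriptors becomes a single fixed-length vector, which is what makes the later nearest neighbor search practical.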
Image matching = Feature matching
• Feature matching → nearest neighbor search
– Fast lookup with inverted indices
– Compressed descriptors for lower memory usage [3]
[Figure: data compression]
[3] H. Jégou, M. Douze, C. Schmid, “Product Quantization for Nearest Neighbor Search,” IEEE Trans. Pattern Anal. Mach. Intell. 33 (2011) 117–128.
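The compression idea behind product quantization [3] can be sketched as follows: each vector is split into sub-vectors, each sub-vector is replaced by the index of its nearest sub-codebook centroid, and query distances are then read from small lookup tables. This is a minimal sketch under stated assumptions: the codebooks are hand-written toy values rather than learned by k-means, the image names are hypothetical, and the inverted index is not shown.

```python
def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split(vector, m):
    """Split a vector into m equal-length sub-vectors."""
    size = len(vector) // m
    return [vector[i * size:(i + 1) * size] for i in range(m)]

def pq_encode(vector, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    subs = split(vector, len(codebooks))
    return [min(range(len(book)), key=lambda k: sq_dist(sub, book[k]))
            for sub, book in zip(subs, codebooks)]

def pq_search(query, codes, codebooks):
    """Rank encoded vectors by approximate distance to an uncompressed query."""
    subs = split(query, len(codebooks))
    # One lookup table per sub-space: distance from the query part to each centroid.
    tables = [[sq_dist(sub, c) for c in book] for sub, book in zip(subs, codebooks)]
    scored = [(name, sum(tables[j][c] for j, c in enumerate(code)))
              for name, code in codes.items()]
    return sorted(scored, key=lambda item: item[1])

# Hypothetical codebooks: 2 sub-spaces with 2 centroids each.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [1.0, 1.0]],
]
database = {
    "img_a": [0.1, 0.0, 0.9, 1.0],
    "img_b": [1.0, 0.9, 1.0, 1.1],
    "img_c": [0.0, 0.1, 0.1, 0.0],
}
codes = {name: pq_encode(vec, codebooks) for name, vec in database.items()}
results = pq_search([0.0, 0.0, 1.0, 1.0], codes, codebooks)
print(codes, results)
```

Each database vector is stored as a short list of small integers instead of floats, which is where the memory saving comes from.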
Verification
• Geometric verification
– RANSAC methods [4]
– Discard geometrically inconsistent matches and keep only the good inliers
Image credits: http://ai.stanford.edu/~jkrause/cars/car_dataset.html
[4] M.A. Fischler, R.C. Bolles, “Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography,” Commun. ACM 24 (1981) 381–395.
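The RANSAC loop [4] used for geometric verification can be sketched as follows. To keep the example short it fits only a 2-D translation between matched keypoints, which is a deliberate simplification; practical systems usually fit an affine transform or a homography. The match coordinates, the threshold, and the function name are assumptions for illustration.

```python
import random

def ransac_translation(matches, threshold=2.0, iterations=50, seed=0):
    """Return the largest set of matches that agree on one translation.

    Each match is ((query_x, query_y), (database_x, database_y)).
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # A translation is fully determined by a single sampled match.
        (qx, qy), (dx, dy) = rng.choice(matches)
        tx, ty = dx - qx, dy - qy
        inliers = [
            m for m in matches
            if abs(m[1][0] - m[0][0] - tx) <= threshold
            and abs(m[1][1] - m[0][1] - ty) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers

# Hypothetical keypoint matches: four consistent ones and two outliers.
matches = [
    ((0, 0), (10, 5)),
    ((1, 2), (11, 7)),
    ((3, 1), (13, 6)),
    ((5, 5), (15, 11)),
    ((2, 2), (40, -3)),
    ((4, 0), (0, 30)),
]
inliers = ransac_translation(matches)
print(len(inliers))
```

The inlier count can then serve as the re-ranking score: candidates whose matches agree on a single geometric transform move up, the rest move down.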
Thank you for listening

Getting Started Today with Artificial Intelligence × Machine Learning Meetup: Lightning Talk 1
