1st Place Solution to Google Landmark Retrieval 2020

SeungKee Jeon (Samsung Electronics)*


Abstract
This paper presents the 1st place solution to the Google Landmark Retrieval 2020 Competition on Kaggle. The solution is based on metric learning to classify numerous landmark classes, and uses transfer learning with two train datasets, fine-tuning on bigger images, adjusting loss weights for cleaner samples, and ensembling to enhance the model's performance further. Finally, it scored 0.38677 mAP@100 on the private leaderboard.
1.Introduction
Google Landmark Retrieval 2020 Competition[1] is the third landmark retrieval competition on Kaggle. The task of image retrieval is to rank images in an index set by their relevance to a query image. In past landmark retrieval competitions, the developed models were expected to retrieve database images containing the same landmark as the query.
To put the emphasis on representation learning, this competition requires you to create a model that extracts feature embeddings from the images. The scoring system then uses the model to 1) extract embeddings for the private test and index sets, 2) create a kNN (k=100) lookup for each test sample, using the Euclidean distance between test and index embeddings, and 3) score the quality of the lookups using the competition metric. The public test and index image sets are subsets of Google Landmarks Dataset v2 (GLD2)[2], while the private test and index image sets are completely new datasets.
GLD2 is the biggest landmark dataset, which contains images annotated with labels representing human-made and natural landmarks. It contains approximately 5 million images, split into 3 sets of images: train, index and test. There are 4132914 images in the train set, 761757 images in the index set, and 117577 images in the test set.
GLD2 is constructed by mining web landmark images, so it is very noisy. There is a cleaned version of GLD2 (CGLD2)[3], which was made by team smlyaka using an automatic data cleaning system for the Google Landmark Retrieval 2019 Competition[4]. The train set of GLD2 contains 4132914 images and 203094 classes, while the train set of CGLD2 contains 1580470 images and 81313 classes. Both the GLD2 train set and the CGLD2 train set are used for training in this solution.
In the rest of the paper, I describe the basic configuration in Section 2, training strategies in Section 3, the ensemble method in Section 4, and a summary in Section 5.
2.Basic Configuration
The model structure is depicted in Figure 1. EfficientNet[5] and global average pooling extract features from images, and a deep neural network follows to squeeze the features into a smaller dimension for compact representation and reduced model size. After that, cosine softmax[6] is used to classify the landmark classes. ImageNet pretrained EfficientNet[7] was used for the backbone CNNs at the initial stage, and an embedding feature size of 512 was used for all models. For the cosine softmax parameters, the scale value was determined automatically by fixed AdaCos[8]. The margin value was set to 0, because the train datasets are noisy, so forcing samples of the same class to cluster more tightly could make training more difficult. To deal with imbalanced classes, weighted cross entropy was used, with the weight of each class proportional to 1/log(class count). For image augmentation, only left-right flip was used, since there was a low possibility of overfitting due to the large number of samples, and in order not to disturb the image distributions. A stochastic gradient descent optimizer was used for training, with learning rate, momentum, and weight decay set to 1e-3, 0.9, and 1e-5. No learning rate scheduling was used for training.
For the validation set, one sample was taken from each CGLD2 class that has at least 4 samples, which gave 72322 validation samples. Validation loss was calculated using non-weighted cross entropy. It was chosen this way because I wanted to have as many classes as possible in the validation set, while giving them the same importance. The validation set is quite small compared to the large training data, but validation loss correlated well with the leaderboard score. Google Colab[9], which provides TPUv2-8, was used for all experiments.
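The class weighting described in Section 2 can be illustrated with a small sketch. It only shows the 1/log(class count) rule; the function name `class_weights` is hypothetical, and the paper does not say how a class with a single sample (where log(1) = 0) is handled, so clamping the count to 2 is an assumption made here to keep the weight finite.

```python
import math
from collections import Counter


def class_weights(labels, min_count=2):
    """Return a weight proportional to 1 / log(class count) for each class.

    min_count is an assumption, not from the paper: it avoids dividing by
    log(1) = 0 for classes that have only one sample.
    """
    counts = Counter(labels)
    return {
        cls: 1.0 / math.log(max(count, min_count))
        for cls, count in counts.items()
    }


# Toy labels: a frequent class, a rarer class, and a class with one sample.
labels = ["a"] * 100 + ["b"] * 10 + ["c"]
weights = class_weights(labels)
for cls in sorted(weights):
    print(cls, round(weights[cls], 4))
```

Rare classes receive larger weights than frequent ones, but because of the logarithm the gap grows slowly with the class count.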
*This work was done while taking a year off from work.
Figure 1. Basic model structure
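The classification head at the end of this structure can be sketched in plain Python. This is a minimal illustration under assumptions, not the training code of the solution: the fixed scale uses the formula s = sqrt(2) * log(C - 1) as I understand it from the AdaCos paper[8], the margin defaults to 0 as stated in Section 2, and the names `fixed_adacos_scale` and `cosine_softmax_loss` as well as the toy class centers are hypothetical.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def fixed_adacos_scale(num_classes):
    """Fixed scale sqrt(2) * log(C - 1); needs C >= 3 to be positive."""
    return math.sqrt(2.0) * math.log(num_classes - 1)


def cosine_softmax_loss(embedding, class_centers, label, margin=0.0):
    """Cross entropy over scaled cosine similarities to the class centers."""
    scale = fixed_adacos_scale(len(class_centers))
    unit_embedding = l2_normalize(embedding)
    logits = []
    for index, center in enumerate(class_centers):
        cosine = sum(a * b for a, b in zip(unit_embedding, l2_normalize(center)))
        if index == label:
            cosine -= margin  # CosFace-style margin on the target class
        logits.append(scale * cosine)
    # Numerically stable log-sum-exp.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[label]


# Toy example: three 2-dimensional class centers instead of 512-dimensional ones.
centers = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
print(cosine_softmax_loss([2.0, 0.0], centers, label=0))
print(cosine_softmax_loss([2.0, 0.0], centers, label=1))
```

Under this formula, the 81313 classes of CGLD2 give a scale of roughly 16. Because both the embedding and the class centers are normalized, only the angle between them affects the loss.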

3.Training Strategy
In the first step, CGLD2 was used to train the model to classify 81313 landmark classes. A single efficientnet7 backbone model with 512x512 image inputs and batch size 64 took 35 epochs, or 149 hours, for the validation loss to converge. It scored 0.30264 on the private LB.
Second, GLD2 was used to train a new model to classify 203094 classes, with the efficientnet backbone taken from step 1 for transfer learning. With batch size 64, it took 13 epochs, or 150 hours, for the validation loss to converge. It scored 0.33749, which is a large improvement over step 1. This process was the main driving force behind the score increase. It showed that even though GLD2 is noisy, it helps make the image feature embeddings more representative. Given that training on GLD2 was successful, I tried to train the model from scratch using GLD2 for a long time, but the validation loss decreased slowly and the leaderboard score was much lower. This experiment showed that CGLD2 is actually cleaner than GLD2 and that it helps the model learn important CNN filters.
Third, the whole model from step 2 was used as is and was given increasingly bigger images. 640x640 image inputs were given to the model with batch size 64, and it took 5 epochs, or 80 hours, for the validation loss to converge. Next, 736x736 image inputs were given with batch size 32, and it took 3 epochs, or 84 hours, for the loss to converge. It was found that the bigger the images, the better the scores. This step scored 0.35389 on the private LB for the model with 640x640 inputs, and 0.36364 for the model with 736x736 inputs. Since only the input images changed and the whole model was reused, training converged quite fast. Some data augmentation effect was also expected, because different image sizes could give the CNN filters a new challenge to optimize.
Fourth, the whole model from step 3 was taken and the loss weight for CGLD2 samples was doubled. This was derived from the experience in step 2 that training too long with GLD2 could worsen the model performance. By changing the loss weights, I wanted the model to focus more on classifying CGLD2 samples. For the model with 640x640 inputs from step 3, it took 4 epochs, or 64 hours, for the validation loss to converge with batch size 64. For the model with 736x736 inputs from step 3, it took 3 epochs, or 84 hours, for the validation loss to converge with batch size 32. It scored 0.35932 for the model with 640x640 image inputs, and 0.36569 for the model with 736x736 image inputs, which showed that this step was effective.

                  public score   private score
step1, 512x512    0.33907        0.30264
step2, 512x512    0.36576        0.33749
step3, 640x640    0.39121        0.35389
step3, 736x736    0.40174        0.36364
step4, 640x640    0.39881        0.35932
step4, 736x736    0.40215        0.36569
Table 1. Leaderboard scores for each step. It represents the scores of a single efficientnet7 backbone based model.

4.Ensemble
An ensemble method was used to raise the leaderboard score further. Feature embeddings from several models were concatenated with weights to make the final feature embeddings. The models used are one efficientnet7, one efficientnet6, and two efficientnet5 backbone models. Weights were given based on the performance of each model. Weights of 1.0 for efficientnet7, 0.8 for efficientnet6, and 0.5 for efficientnet5 were finally chosen by inspecting a few submission results. With all models taken from step 3, the ensemble scored 0.38366 on the private leaderboard. By applying step 4 to the efficientnet7 backbone models, it scored 0.38677 on the private leaderboard, which is the best score I got.
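The weighted concatenation used for the ensemble can be sketched as follows. This is an illustration under assumptions rather than the submitted code: the function name `ensemble_embedding` and the toy 3-dimensional vectors are hypothetical, and L2-normalizing each model's embedding before weighting is an assumption, since the paper only states that embeddings were concatenated with weights 1.0, 0.8, and 0.5.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def ensemble_embedding(weighted_embeddings):
    """Concatenate per-model embeddings, each scaled by its model weight.

    weighted_embeddings is a list of (weight, embedding) pairs, one per model.
    Normalizing each embedding first is an assumption, not from the paper.
    """
    combined = []
    for weight, embedding in weighted_embeddings:
        combined.extend(weight * x for x in l2_normalize(embedding))
    return combined


# Toy 3-dimensional embeddings standing in for the 512-dimensional ones.
final = ensemble_embedding([
    (1.0, [3.0, 4.0, 0.0]),  # efficientnet7
    (0.8, [1.0, 0.0, 0.0]),  # efficientnet6
    (0.5, [0.0, 2.0, 0.0]),  # first efficientnet5
    (0.5, [0.0, 0.0, 5.0]),  # second efficientnet5
])
print(len(final))
```

With the Euclidean distance used by the scoring system, the squared distance between two concatenated vectors is the sum of the per-model squared distances, each multiplied by the square of its weight. A model with weight 0.5 therefore contributes a quarter as much as the model with weight 1.0.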
5.Summary
This paper presented the 1st place solution to Google Landmark Retrieval 2020 in detail. The solution used metric learning to classify numerous landmark classes, and gradually increased the leaderboard score by adopting transfer learning with two train datasets, fine-tuning on bigger images, adjusting loss weights for cleaner train samples, and finally ensembling.
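The loss weight adjustment for cleaner train samples can be sketched as a per-sample weight in the batch loss. This is a minimal illustration under assumptions: the names `sample_weights` and `weighted_mean_loss` are hypothetical, the factor of 2 for CGLD2 samples follows Section 3, and dividing by the sum of the weights is an assumption, since the paper does not describe how the weighted loss was normalized.

```python
def sample_weights(in_cgld2, clean_factor=2.0):
    """Give samples that also belong to CGLD2 a larger loss weight."""
    return [clean_factor if is_clean else 1.0 for is_clean in in_cgld2]


def weighted_mean_loss(losses, weights):
    """Weighted average of per-sample losses (normalization is an assumption)."""
    total = sum(loss * weight for loss, weight in zip(losses, weights))
    return total / sum(weights)


# Toy batch: the first sample is in CGLD2, the second is only in GLD2.
per_sample_losses = [1.0, 3.0]
weights = sample_weights([True, False])
print(weighted_mean_loss(per_sample_losses, weights))
```

In the toy batch, the CGLD2 sample counts twice, so the batch loss is pulled toward its value.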

References
[1] https://www.kaggle.com/c/landmark-retrieval-2020
[2] T. Weyand, A. Araujo, B. Cao, J. Sim. Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level Recognition and Retrieval. CVPR, 2020
[3] Kohei Ozaki, Shuhei Yokoo. Large-scale Landmark Retrieval/Recognition under a Noisy and Diverse Dataset. 2019
[4] https://www.kaggle.com/c/landmark-retrieval-2019
[5] Mingxing Tan, Quoc V. Le. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. ICML, 2019
[6] Hao Wang, Yitong Wang, Zheng Zhou, Xing Ji, Dihong Gong, Jingchao Zhou, Zhifeng Li, and Wei Liu. CosFace: Large Margin Cosine Loss for Deep Face Recognition. CVPR, 2018
[7] https://github.com/qubvel/efficientnet
[8] AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations. CVPR, 2019
[9] https://colab.research.google.com
