One Person, One Model, One World: Learning Continual User Representation Without Forgetting

1) The document proposes Conure, a method for building lifelong user representation models that continually learn new tasks without forgetting previous ones. 2) Conure exploits the over-parameterization of deep models, iteratively pruning and freezing task-specific parameters to prevent catastrophic forgetting when tasks arrive sequentially. 3) Experiments on recommendation datasets show that Conure outperforms baselines and other lifelong learning methods across multiple sequential tasks without forgetting.


One Person, One Model, One World:
Learning Continual User Representation without Forgetting

SIGIR 2021
Data & Code: https://github.com/fajieyuan/SIGIR2021_Conure

Fajie Yuan (Westlake University, Tencent), Guoxiao Zhang (Tencent),
Alexandros Karatzoglou (Google Research), Joemon Jose (University of Glasgow),
Beibei Kong (Tencent), Yudong Li (Tencent)
Outline

• Motivation
• Related Work
• Conure
• Experiments
Our Motivation

A person plays many different roles in life, but these roles share commonalities such as personal tastes, habits, and preferences.

Our Focus:
Can we build a user representation model that keeps learning across all sequential tasks without forgetting?
One Person, One Model, One World.
[Figure: one person's roles across services (news, video, music and car recommendation, map, search engine, browser, and social apps). The same user may have rich clicking logs in one service (e.g., TikTok, a warm user) but no interactions in others (e.g., cold users on Amazon, new users in Ads).]
Using lifelong learning techniques to solve recommendation tasks

Key points
• Necessity and possibility: why lifelong learning for user representation (UR) learning?
• A lifelong learning paradigm that spans all tasks.
• Performance gains for tasks that have certain correlations.
Outline

• Motivation
• Related Work
• Conure
• Experiments
• Classical UR models (work well but are specific to a single task):

GRU4Rec (Hidasi et al., ICLR 2016), NextItNet (Yuan et al., WSDM 2019),
SASRec (Kang et al., ICDM 2018), DSSM (Huang et al., CIKM 2013), GRec (Yuan et al., WWW 2020)


• PeterRec (two-stage transfer learning):
• PeterRec (fine-tuning):
• Transfer learning paradigm comparisons:

(a) Standard TL  (b) PeterRec  (c) Conure  (d) MTL

Lifelong learning without parameter preservation
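To make the comparison concrete, here is a minimal sketch roughly in the spirit of PeterRec's model-patch idea: the pretrained backbone is frozen and only small inserted patch modules plus a new task head are trained, in contrast to standard fine-tuning, which updates everything. The toy backbone, bottleneck size, and task head are illustrative assumptions, not the original implementation.

```python
import torch
import torch.nn as nn

class ModelPatch(nn.Module):
    """A small residual adapter ("model patch"); the only trainable part per block."""
    def __init__(self, dim, bottleneck=8):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))  # residual connection

class PatchedBlock(nn.Module):
    """Wraps a frozen pretrained block and appends a trainable patch."""
    def __init__(self, block, dim):
        super().__init__()
        self.block = block
        self.patch = ModelPatch(dim)

    def forward(self, x):
        return self.patch(self.block(x))

dim = 64
# Toy "pretrained" backbone, standing in for a sequential encoder such as NextItNet.
backbone = nn.Sequential(*[nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(4)])

# Standard fine-tuning would update (and overwrite) every backbone parameter.
full_finetune_params = sum(p.numel() for p in backbone.parameters())

# Patch-style transfer: freeze the backbone, train only the patches and a new task head.
for p in backbone.parameters():
    p.requires_grad = False
patched = nn.Sequential(*[PatchedBlock(block, dim) for block in backbone])
task_head = nn.Linear(dim, 10)  # hypothetical new-task output layer
patch_params = sum(p.numel() for m in (patched, task_head)
                   for p in m.parameters() if p.requires_grad)
print(f"full fine-tuning: {full_finetune_params} trainable params; "
      f"patch-style: {patch_params}")
```

Freezing the backbone keeps the pretrained user representation intact for the source task, which is the property Conure later extends to an arbitrary number of tasks.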


Outline

• Motivation
• Related Work
• Conure
• Experiments
• Catastrophic forgetting:

[Figure: after naive fine-tuning on a new task, both the network parameters and the last hidden vectors produced for the previous task change substantially.]
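The two panels can be read as simple drift metrics: how far the weights and the last hidden representations of task-1 inputs move when a model is naively fine-tuned on a second task. A minimal sketch of such measurements, using a toy model and random data purely for illustration:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))
x_task1 = torch.randn(128, 16)                      # stand-in for task-1 inputs

# Snapshot the task-1 parameters and the hidden vectors produced for task-1 inputs.
theta_old = torch.cat([p.detach().flatten().clone() for p in model.parameters()])
with torch.no_grad():
    h_old = model(x_task1)

# Naive fine-tuning on a "new task" (random targets here, just to move the weights).
opt = torch.optim.SGD(model.parameters(), lr=0.1)
for _ in range(200):
    x_new, y_new = torch.randn(64, 16), torch.randn(64, 16)
    loss = (model(x_new) - y_new).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Drift metrics: relative parameter change and hidden-vector change on old inputs.
theta_new = torch.cat([p.detach().flatten() for p in model.parameters()])
with torch.no_grad():
    h_new = model(x_task1)
print("relative parameter change    :", ((theta_new - theta_old).norm() / theta_old.norm()).item())
print("relative hidden-vector change:", ((h_new - h_old).norm() / h_old.norm()).item())
```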
• Over-parameterization:

—(1) The more parameters are pruned, the worse the model performs.
—(2) Retraining the pruned network (i.e., "pr70+retrain") quickly regains the original accuracy.
—(3) Smaller models (i.e., (b)) are also highly over-parameterized.
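These observations can be reproduced in spirit with magnitude pruning followed by masked retraining. Below is a minimal sketch, assuming a 70% prune ratio (the "pr70" setting above) and a hypothetical `data()` callable that yields (inputs, targets) batches; it is not the paper's exact procedure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def magnitude_prune(model, ratio=0.7):
    """Zero out the smallest-magnitude weights; return binary keep-masks per layer."""
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() < 2:                      # skip biases for simplicity
            continue
        k = max(1, int(p.numel() * ratio))   # number of weights to prune
        threshold = p.detach().abs().flatten().kthvalue(k).values
        masks[name] = (p.detach().abs() > threshold).float()
        p.data.mul_(masks[name])             # prune in place
    return masks

def retrain_pruned(model, masks, data, steps=100, lr=0.05):
    """Retrain only the surviving weights by re-applying the masks after each update."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        x, y = data()                        # hypothetical batch provider
        loss = F.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
        for name, p in model.named_parameters():
            if name in masks:
                p.data.mul_(masks[name])     # keep pruned weights at zero
    return model
```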
• Conure architecture and learning process:

Conure is conceptually very simple, easy to implement, and applicable to various sequential encoder networks.
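A schematic sketch of the per-task learning loop, under my own simplifying assumptions (a single shared weight matrix and magnitude-based importance): for each task, train the currently free weights while earlier tasks' weights stay frozen, then claim the most important free weights for the task via a binary mask and prune the rest for future tasks. Serving task t activates only the weights claimed by tasks 1..t. Class and method names here are hypothetical, not the released implementation.

```python
import torch
import torch.nn as nn

class ConureLikeLayer(nn.Module):
    """One over-parameterized weight matrix shared by all tasks via binary masks."""
    def __init__(self, dim):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(dim, dim) * 0.01)
        self.frozen = torch.zeros(dim, dim)   # union of the masks of finished tasks
        self.task_masks = []                  # one keep-mask per finished task

    def forward(self, x, task_id=None):
        if task_id is None:
            # Training the current task: frozen and free weights are both used in the
            # forward pass, but only free weights will be updated (see grad_mask).
            w = self.weight
        else:
            # Serving task `task_id`: activate only weights claimed by tasks <= task_id.
            keep = sum(self.task_masks[:task_id + 1])
            w = self.weight * (keep > 0).float()
        return x @ w

    def grad_mask(self):
        """Zero gradients of weights already claimed by earlier tasks (they stay frozen)."""
        if self.weight.grad is not None:
            self.weight.grad.mul_(1.0 - self.frozen)

    def finish_task(self, keep_ratio=0.3):
        """Claim the top free weights (by magnitude) for the current task, prune the rest."""
        free = 1.0 - self.frozen
        scores = (self.weight.detach().abs() * free).flatten()
        k = int(free.sum().item() * keep_ratio)
        mask = torch.zeros_like(scores)
        mask[scores.topk(k).indices] = 1.0
        mask = mask.view_as(self.weight)
        self.weight.data.mul_(self.frozen + mask)  # unclaimed free weights go back to zero
        self.task_masks.append(mask)
        self.frozen = self.frozen + mask
```

During training of task t one would call `layer.grad_mask()` after `loss.backward()` and before `optimizer.step()`, and `layer.finish_task()` once the task converges; serving task t then calls `layer(x, task_id=t)`.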
Outline

• Motivation
• Related Work
• Conure
• Experiments
• Datasets:
TTL: https://drive.google.com/file/d/1imhHUsivh6oMEtEW-RwVc4OsDqn-xOaP/view
ML: https://drive.google.com/file/d/1-_KmnZFaOdH11keLYVcgkf-kW_BaM266/view
• Results:

—(1) Conure largely outperforms other models on T3 because of the positive transfer from T1 and T2.
—(2) Conure, PeterRec and FineAll largely outperform SinMo because of the positive transfer from T1.
—(3) SinMoAll performs much worse on most tasks (except the last one) because of catastrophic forgetting.
• Ablation study (effect of T2 on T3):

—(1) Without training on T2, Conure shows worse results, e.g., -6.5% on TTL20%.


• Ablation study (task order):

—(1) Conure is not sensitive to the task order.


• Ablation study:

—(1) Pruning also works for the embedding layer.
—(2) Conure is not restricted to a specific sequential encoder.
—(3) Conure with a Transformer backbone works slightly better than with NextItNet.
Contributions:
—(1) Providing the first lifelong learning paradigm for user representations.
—(2) Providing insights into forgetting and redundancy issues in user representation models.
—(3) Designing Conure, the first lifelong user representation learning algorithm, which is simple and easy to implement.
—(4) Instantiating Conure with NextItNet and Transformer backbones.
—(5) Extensive experiments showing SOTA performance, with many new discoveries and insights.
Case study:
