Fast ALS-based matrix factorization for explicit and implicit feedback datasets. István Pilászy, Dávid Zibriczky, Domonkos Tikk. Gravity R&D Ltd. www.gravityrd.com. 28 September 2010
Collaborative filtering
Problem setting [figure: a sparse user-item rating matrix; a few known ratings such as 5, 4, 3, 4, 4, 2, 4, 1, most entries missing]
Ridge Regression: given examples X and targets y, find the weight vector w minimizing ||Xw − y||² + λ||w||²
Ridge Regression. Optimal solution: w = (XᵀX + λI)⁻¹ Xᵀy
Ridge Regression. Computing the optimal solution: the matrix inversion is costly, O(K³) for K features. Sum of squared errors of the optimal solution: 0.055
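For concreteness, a minimal numpy sketch of the closed-form ridge regression solution (illustrative names and toy data, not the talk's code; the O(K³) solve is the costly step the slide refers to):

import numpy as np

def ridge_regression(X, y, lam=0.1):
    """Closed-form RR: w = (X^T X + lam*I)^(-1) X^T y."""
    K = X.shape[1]
    A = X.T @ X + lam * np.eye(K)          # K x K Gram matrix, O(n*K^2) to build
    return np.linalg.solve(A, X.T @ y)     # O(K^3): the expensive step

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=100)
w = ridge_regression(X, y)
print("SSE:", np.sum((X @ w - y) ** 2))    # analogous to the 0.055 on the slide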
RR1: RR with coordinate descent. Idea: optimize only one variable of w at a time. Start with zero. Sum of squared errors: 24.6
RR1: RR with coordinate descent. Start with zero, then optimize w₁. Sum of squared errors: 7.5
RR1: RR with coordinate descent. … then optimize w₂. Sum of squared errors: 6.2
RR1: RR with coordinate descent. … then w₃. Sum of squared errors: 5.7
RR1: RR with coordinate descent. … w₄. Sum of squared errors: 5.4
RR1: RR with coordinate descent. … w₅. Sum of squared errors: 5.0
RR1: RR with coordinate descent. … w₁ again. Sum of squared errors: 3.4
RR1: RR with coordinate descent. … w₂ again. Sum of squared errors: 2.9
RR1: RR with coordinate descent. … w₃ again. Sum of squared errors: 2.7
RR1: RR with coordinate descent. … after a while: sum of squared errors: 0.055, no remarkable difference from the exact RR solution. Cost: O(e·n·K) for n examples and e epochs
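A minimal numpy sketch of RR1 as described above: coordinate descent on the ridge objective, keeping the residual up to date so each coordinate step costs O(n) and one epoch costs O(n·K). Function name, defaults, and warm-start argument are illustrative assumptions:

import numpy as np

def rr1(X, y, lam=0.1, epochs=10, w0=None):
    """Coordinate descent for ridge regression: one weight at a time."""
    n, K = X.shape
    w = np.zeros(K) if w0 is None else w0.copy()   # start from zero (or warm start)
    r = y - X @ w                                  # current residual
    for _ in range(epochs):
        for k in range(K):
            xk = X[:, k]
            # 1-D ridge optimum for w_k with all other weights fixed
            wk = xk @ (r + xk * w[k]) / (xk @ xk + lam)
            r += xk * (w[k] - wk)                  # O(n) residual update
            w[k] = wk
    return w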
Matrix factorization. The rating matrix R (M x N) is approximated as the product of two lower-rank matrices, R ≈ P Qᵀ, where P is the user feature matrix (M x K), Q is the item (movie) feature matrix (N x K), and K is the number of features.
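As a shape check, a tiny numpy sketch of the factorization (toy sizes and random factors, purely illustrative):

import numpy as np

M, N, K = 4, 6, 2                 # users, items, features (toy sizes)
rng = np.random.default_rng(0)
P = rng.random((M, K))            # user feature matrix
Q = rng.random((N, K))            # item (movie) feature matrix
R_hat = P @ Q.T                   # M x N: entry [u, i] is the dot product p_u . q_i
print(R_hat.shape)                # (4, 6)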
Matrix factorization for explicit feedback [figure: a toy R with a few known ratings (5, 5, 4, 3, 1, 2, 4) and its dense approximation P·Qᵀ]
Finding P and Q: initialize Q randomly, then find p₁ [figure: R with user 1's known ratings, a random Q, and the unknown p₁ marked "? ?"]
Finding p₁ with RR. Optimal solution: p₁ = (Q₁ᵀQ₁ + λI)⁻¹ Q₁ᵀr₁, where the rows of Q₁ are the feature vectors of the items rated by user 1 and r₁ holds their ratings.
Finding p₁ with RR [figure: the computed p₁ ≈ (2.3, 3.2) written back into P]
Alternating Least Squares (ALS):
Initialize Q randomly
Repeat:
  Recompute P: compute p₁ with RR, compute p₂ with RR, … (for each user)
  Recompute Q: compute q₁ with RR, … (for each item)
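A compact sketch of this ALS loop over a sparse rating dictionary, re-solving each user and item vector with closed-form RR (toy data structures and names, not the talk's implementation):

import numpy as np

def als(ratings, M, N, K=2, lam=0.1, iters=10):
    """ratings: dict mapping (user, item) -> rating."""
    rng = np.random.default_rng(0)
    P = np.zeros((M, K))
    Q = rng.random((N, K))                 # init Q randomly
    by_user = {u: [] for u in range(M)}
    by_item = {i: [] for i in range(N)}
    for (u, i), r in ratings.items():
        by_user[u].append((i, r))
        by_item[i].append((u, r))

    def rr(X, y):                          # closed-form ridge regression
        return np.linalg.solve(X.T @ X + lam * np.eye(K), X.T @ y)

    for _ in range(iters):
        for u, rated in by_user.items():   # recompute P, one user at a time
            if rated:
                P[u] = rr(Q[[i for i, _ in rated]], np.array([r for _, r in rated]))
        for i, rated in by_item.items():   # recompute Q, one item at a time
            if rated:
                Q[i] = rr(P[[u for u, _ in rated]], np.array([r for _, r in rated]))
    return P, Q

P, Q = als({(0, 0): 5, (0, 2): 3, (1, 1): 4, (2, 3): 1, (2, 0): 2}, M=3, N=4)
print(np.round(P @ Q.T, 1))                # reconstructed rating matrix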
ALS1: ALS with RR1. ALS relies on RR: it recomputes each vector from scratch; when recomputing p₁, its previously computed value is ignored. ALS1 relies on RR1: it optimizes the previously computed p₁ one scalar at a time, so the previous value is not lost, and runs RR1 for only one epoch. ALS is just an approximation method; likewise ALS1.
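A sketch of the ALS1 per-user update under the assumptions above: warm-start from the previous p_u and run a single RR1 epoch (illustrative signature and names):

import numpy as np

def update_user_als1(p_u, Q_rated, y, lam=0.1):
    """One RR1 epoch for one user, warm-started from the previous p_u."""
    p = p_u.copy()
    r = y - Q_rated @ p                    # residual under the old factors
    for k in range(p.size):
        qk = Q_rated[:, k]
        # 1-D ridge optimum for coordinate k, others fixed
        pk = qk @ (r + qk * p[k]) / (qk @ qk + lam)
        r += qk * (p[k] - pk)              # keep the residual consistent
        p[k] = pk
    return p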
Implicit feedback [figure: a binary R of 0s and 1s (watched / not watched) and its factorization into P and Q]
Implicit feedback: IALS. The matrix is fully specified: every user "rated" every item (watched or not). Zeros are less important than ones, but still matter. Many 0s, few 1s. Recall that the RR solution only needs XᵀX and Xᵀy. Idea (Hu, Koren, Volinsky): consider a user who watched nothing (the null user); compute XᵀX and Xᵀy for her once; when recomputing p₁, compare the user to the null user: start from the cached XᵀX and Xᵀy and update them according to the differences, i.e. the items actually watched. In this way only the number of 1s affects performance, not the number of 0s. IALS: alternating least squares with this trick.
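A minimal sketch of this caching trick in the spirit of Hu, Koren, and Volinsky: A0 = QᵀQ is computed once per sweep and corrected per user using only the watched items. The confidence weight alpha and all names here are illustrative assumptions:

import numpy as np

def ials_user_update(Q, A0, watched, lam=0.1, alpha=40.0):
    """Recompute one user's factors from the cached null-user Gram matrix A0."""
    K = Q.shape[1]
    A = A0.copy()                          # start from the null user's X^T X
    b = np.zeros(K)                        # the null user's X^T y is all zeros
    for i in watched:                      # touch only the 1s, never the 0s
        qi = Q[i]
        A += alpha * np.outer(qi, qi)      # raise confidence on watched items
        b += (1.0 + alpha) * qi            # preference 1 with confidence 1 + alpha
    return np.linalg.solve(A + lam * np.eye(K), b)

Q = np.random.rand(1000, 8)                # N = 1000 items, K = 8 features
A0 = Q.T @ Q                               # computed once, shared by all users
p_u = ials_user_update(Q, A0, watched=[3, 7, 42])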
Implicit feedback: IALS1. The RR1 trick cannot be applied here.
Implicit feedback: IALS1. The RR1 trick cannot be applied here… but, wait!
Implicit feedback: IALS1. XᵀX is just a matrix: no matter how many items we have, its dimension stays K x K. If we are lucky, we can find K items which generate this matrix. What if we are unlucky? We can still create K synthetic items and assume that the null user did not watch them. XᵀX and Xᵀy are the same if the synthetic items are created appropriately.
Implicit feedback: IALS1. Can we find a matrix Z such that Z is small (K x K) and ZᵀZ = XᵀX? We can, by eigenvalue decomposition.
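A small numpy sketch of this synthetic-item construction, assuming the standard eigendecomposition route: factor G = QᵀQ as VΛVᵀ and take Z = Λ^(1/2)Vᵀ, so that ZᵀZ = G (toy sizes, illustrative only):

import numpy as np

Q = np.random.rand(1000, 8)                   # N = 1000 real items, K = 8
G = Q.T @ Q                                   # the null user's K x K Gram matrix
vals, vecs = np.linalg.eigh(G)                # G is symmetric positive semidefinite
Z = np.sqrt(np.clip(vals, 0, None))[:, None] * vecs.T   # K synthetic items (K x K)
assert np.allclose(Z.T @ Z, G)                # same X^T X as the 1000 real items
# giving these K synthetic items a target of 0 leaves X^T y unchanged as well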
Implicit feedback: IALS1. If a user watched N items, we can run RR1 with N+K examples. To recompute p_u we need O((N+K)·K) steps (assuming one epoch). Is it better in practice than the O(N·K² + K³) of IALS? For example, with K = 100 and N = 50 that is roughly 1.5·10⁴ versus 1.5·10⁶ operations, though constant factors differ in practice.
Evaluation of ALS vs. ALS1: Probe10 RMSE on the Netflix Prize dataset, after 25 epochs.
Evaluation of ALS vs. ALS1: time-accuracy tradeoff.
Evaluation of IALS vs. IALS1: Average Relative Position on the test subset of a proprietary implicit feedback dataset, after 20 epochs; lower is better.
Evaluation of IALS vs. IALS1: time-accuracy tradeoff.
Conclusions. We learned two tricks: ALS1: RR1 can be used instead of RR in ALS. IALS1: a few synthetic examples can replace the non-watching of many examples. ALS and IALS are approximation algorithms anyway, so why not make them even more approximate? ALS1 and IALS1 offer better time-accuracy tradeoffs, especially when K is large: they can be up to 10x faster (or even 100x, for unrealistically large K values). Future work: precision, recall, other datasets.
Thank you for your attention. Questions?
