Material presented at Tokyo Web Mining Meetup, March 26, 2016.
The source code is here:
https://github.com/hamukazu/tokyo.webmining.2016-03-26
These are the presentation slides from Tokyo Web Mining (March 26, 2016). All content is in English.
Netflix provides personalized recommendations at scale to over 37 million members across 40 countries. They take a multi-layered approach using offline, nearline, and online computation. In the offline layer, large datasets are processed to train machine learning models. The nearline layer incrementally refines recommendations based on member events. In the online layer, recommendations are generated and presented to members in real-time based on signals from live services and precomputed results. Netflix recommendations are powered by a massive dataset of over 30 million daily plays and sophisticated algorithms running across distributed cloud computing infrastructure.
Slides from the 'Essentials of Product Management' workshop at General Assembly in London, June 2013
ABOUT THIS WORKSHOP
The first step in making an idea reality is to understand product management. There is a huge amount of work between the idea stage and the coding stage, and this Saturday workshop will help you understand what needs to be accomplished.
We will start the day off by learning what the product management role encompasses and what the managing process is like. We'll also cover a product's feasibility and the various stages of—and ways to approach—the product development process. Through group work and hands-on practice, we'll look at the MVP (Minimum Viable Product) philosophy to test and validate your plans, and move on to identify the other more technical tools needed to start and evaluate the building process.
TAKEAWAYS
Part 1: The Product Manager role & the Product Management Process
Part 2: The Customer and MVP
- Learn to break an idea into its primary parts to assess product feasibility
- Explain the purpose and process of building an MVP
- Identify various ways to build and learn from an MVP
- Evolve an MVP to reach product/market fit
- Determine if product/market fit has been achieved for a product
Some slide content courtesy of Simon Cast, John Eikenberry, and General Assembly
The document discusses the economic potential of generative AI. Some key points:
- Generative AI could add $2.6-$4.4 trillion annually to the global economy by automating tasks across various industries and business functions. This would increase AI's total economic impact by 15-40%.
- About 75% of the value from generative AI would come from use cases in customer operations, marketing/sales, software engineering, and research & development.
- All industry sectors would be significantly impacted, including banking, high tech, and life sciences. Banking alone could see $200-$340 billion in additional annual value from generative AI use cases.
Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018, by Massimo Quadrana
Slides of the Tutorial on Sequence Aware Recommenders held at ACM RecSys 2018 in Vancouver.
Link to the website: https://sites.google.com/view/seq-recsys-tutorial
Link to the hands-on: https://github.com/mquad/sars_tutorial
Feature engineering--the underdog of machine learning. This deck provides an overview of feature generation methods for text, image, audio, feature cleaning and transformation methods, how well they work and why.
Where to find better ideas? +10 categories to explore with examples, by Board of Innovation
This document provides tips for finding creative ideas as a team. It suggests drawing inspiration from the problems users face, observing how people work around frustrations, exploring your company's unused assets, tracking trends, researching history and old ideas, observing extreme users, and browsing sources at random for Eureka moments. The overall message is that being open to diverse sources of information can trigger novel ideas.
This document discusses key aspects of product management including defining the role of a product manager, common frameworks used in product definition and design such as Facebook's three questions, jobs to be done framework, product canvas, and design thinking. It also covers prioritization frameworks like MoSCoW and RICE, different types of product metrics like north star metric, behavioral metric and success metric, and the AARRR pirate metrics framework. The document provides an overview of processes, methodologies and metrics used in planning, developing and measuring success of products.
This deck was presented on 28th January 2017 at Chiang Mai Startup Events. It covers questions such as "What is the JTBD framework?" and "How does JTBD help businesses understand the WHY rather than the WHAT?" It is based on Tony Ulwick's presentation.
In this presentation I introduce a tool for strategic planning; Impact Mapping (http://impactmapping.org).
This is one of the best tools I've used to help us produce great, well communicated and easily understood strategic plans, by involving everyone needed to execute the plan.
This presentation is a continuation of my presentations about Mission, Vision and Strategic plans, but this time it's much more hands-on and practical.
A product manager is responsible for the overall success of a product by understanding customer needs and ensuring the product delivers value. Key responsibilities include defining product requirements and strategy, building business cases, conducting user research, creating roadmaps, and tracking metrics. The role requires balancing internal needs while representing customers externally throughout the product development process.
If you've heard of the agile process, you've probably heard about its value in developing quality software. Here are the steps for planning a sprint in Agile.
Context-aware Recommendation: A Quick View, by YONG ZHENG
Context-aware recommendation systems take into account contextual information beyond the user and item, such as time, location, and companion. The approaches covered include contextual prefiltering, which splits items or users based on context, and contextual modeling, which integrates context directly into models such as matrix factorization. The deck also introduces CARSKit, an open source Java library for building context-aware recommender systems.
Making Netflix Machine Learning Algorithms Reliable, by Justin Basilico
This document discusses making Netflix machine learning algorithms reliable. It describes how Netflix uses machine learning for tasks like personalized ranking and recommendation. The goals are to maximize member satisfaction and retention. The models and algorithms used include regression, matrix factorization, neural networks, and bandits. The key aspects of making the models reliable discussed are: automated retraining of models, testing training pipelines, checking models and inputs online for anomalies, responding gracefully to failures, and training models to be resilient to different conditions and failures.
Design thinking is a 5-stage process used to solve complex problems in an innovative way. The 5 stages are: empathize to understand user needs, define the problem from their perspective, ideate potential solutions, prototype the top ideas, and test them with users. It provides a human-centered approach to problem solving by gaining empathy for users and iterating on solutions.
A compilation of the absolute basics for those who want to know about Agile Methodology, with some insights on Scrum. The idea is to give enough to fuel the curiosity to learn more. It might not interest someone who is already an Agile guru, but may I ask for your review / comments / suggestions? I'd love to hear from you all...
User Story Maps: Secrets for Better Backlogs and Planning, by Aaron Sanders
User story mapping is an intuitive way to build and organize a product backlog. During this session you’ll get hands-on experience building a user story map. You’ll learn:
How story mapping drives productive conversations with users and stakeholders.
How to plan incremental releases of your product using minimal holistic slices that deliver value at each product release.
Secrets to effective prioritization for both planning releases, and figuring out what to build next.
Tactical management of your backlog as you grow your working software to releasability.
The backlog building and managing strategies in this session will take you well beyond the agile basics.
Quick guide to the Design sprint.
The sprint is a five-day process for answering critical business questions through design, prototyping, and testing ideas with customers. Developed at Google Ventures, it’s a “greatest hits” of business strategy, innovation, behavior science, design thinking, and more — packaged into a battle-tested process that any team can use.
To use the links within the deck - download the presentation and open it in the browser.
This document provides an overview of Scala data pipelines at Spotify. It discusses:
- The speaker's background and Spotify's scale with over 75 million active users.
- Spotify's music recommendation systems including Discover Weekly and personalized radio.
- How Scala and frameworks like Scalding, Spark, and Crunch are used to build data pipelines for tasks like joins, aggregations, and machine learning algorithms.
- Techniques for optimizing pipelines including distributed caching, bloom filters, and Parquet for efficient storage and querying of large datasets.
- The speaker's success in migrating over 300 jobs from Python to Scala and growing the team of engineers building Scala pipelines at Spotify.
Music Recommendations at Scale with Spark, by Chris Johnson
Spotify uses a range of Machine Learning models to power its music recommendation features, including the Discover page, Radio, and Related Artists. Because these models are iterative, they suffer from the IO overhead incurred by Hadoop and are a natural fit for the Spark computation paradigm. In this talk, I review the ALS algorithm for Matrix Factorization with implicit feedback data and how we’ve scaled it up to handle 100s of Billions of data points using Scala, Breeze, and Spark.
From the NYC Machine Learning meetup on Jan 17, 2013: http://www.meetup.com/NYC-Machine-Learning/events/97871782/
Video is available here: http://vimeo.com/57900625
This document provides guidance on creating a product vision. It discusses why a product vision is useful, including to get buy-in, compare initiatives, and serve as a decision-making standard. It provides a template for the product vision board with categories for the user, their needs, key features, and business goals. These elements should align and deliver on the overall vision statement. The document also covers how to develop a product vision, including preparing for a workshop, facilitating the session, and next steps after the vision is created. It discusses how to manage multiple visions using a Lean Value Tree to focus on value outcomes and connect initiatives to organizational goals and strategies. Finally, it addresses using OKRs and PIRATE metrics together to measure
Interactive Recommender Systems with Netflix and Spotify, by Chris Johnson
Interactive recommender systems enable the user to steer the received recommendations in the desired direction through explicit interaction with the system. In the larger ecosystem of recommender systems used on a website, it is positioned between a lean-back recommendation experience and an active search for a specific piece of content. Besides this aspect, we will discuss several parts that are especially important for interactive recommender systems, including the following: design of the user interface and its tight integration with the algorithm in the back-end; computational efficiency of the recommender algorithm; as well as choosing the right balance between exploiting the feedback from the user as to provide relevant recommendations, and enabling the user to explore the catalog and steer the recommendations in the desired direction.
In particular, we will explore the field of interactive video and music recommendations and their application at Netflix and Spotify. We outline some of the user-experiences built, and discuss the approaches followed to tackle the various aspects of interactive recommendations. We present our insights from user studies and A/B tests.
The tutorial targets researchers and practitioners in the field of recommender systems, and will give the participants a unique opportunity to learn about the various aspects of interactive recommender systems in the video and music domain. The tutorial assumes familiarity with the common methods of recommender systems.
Many powerful Machine Learning algorithms are based on graphs, e.g., Page Rank (Pregel), Recommendation Engines (collaborative filtering), text summarization, and other NLP tasks. Also, the recent developments with Graph Neural Networks connect the worlds of Graphs and Machine Learning even further.
Data pre-processing and feature engineering, both vital tasks in Machine Learning pipelines, extend this relationship across the entire ecosystem. In this session, we will investigate the entire range of Graphs and Machine Learning with many practical exercises.
Conf 2023 TLD - ChatGPT impact dans le Design, by TanguyLeDuff1
Talk by Tanguy Le Duff (Lead UX Designer at Mega International; UX Design instructor) on the topic "ChatGPT: what impact on Design?"
Recommendation System -- Theory and Practice, by Kimikazu Kato
This document provides an overview of recommendation systems and collaborative filtering techniques. It discusses using matrix factorization to predict user ratings by representing users and items as vectors in a latent factor space. Optimization techniques like stochastic gradient descent can be used to learn the factorization from existing ratings. The document also notes challenges of sparsity and scale for practical systems and describes approaches like elastic net regularization and sparsification to address these.
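The idea of learning a latent-factor model from observed ratings can be sketched as follows. This is a minimal illustration, not the code from the deck: the function names, the toy ratings, and the hyperparameters are all assumptions, and it uses plain L2 regularization rather than the elastic net variant mentioned above.

```python
import math
import random

def factorize(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02, epochs=500, seed=0):
    """Learn user factors P and item factors Q by stochastic gradient descent.

    ratings is a list of (user, item, rating) triples; only observed
    entries are visited, which is how the sketch copes with sparsity.
    """
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    data = list(ratings)
    for _ in range(epochs):
        rng.shuffle(data)
        for u, i, r in data:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with an L2 penalty.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating is the inner product of the two latent vectors."""
    return sum(p * q for p, q in zip(P[u], Q[i]))

def rmse(ratings, P, Q):
    """Root mean squared error over the given (user, item, rating) triples."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(total / len(ratings))

# Hypothetical toy data: 4 users, 4 items, 9 observed ratings.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 1),
           (2, 2, 5), (3, 0, 1), (3, 2, 4), (3, 3, 4)]
P, Q = factorize(ratings, n_users=4, n_items=4)
print("training RMSE:", round(rmse(ratings, P, Q), 3))
print("predicted rating for user 0, item 2:", round(predict(P, Q, 0, 2), 2))
```

The last line scores a pair that never appears in the training data, which is the actual use of the model: filling in unobserved cells of the rating matrix.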
This document discusses optimization techniques and provides examples to illustrate key concepts in optimization problems. It defines optimization as finding extreme states such as a minimum or maximum and discusses how it is applied in various fields. It then covers basic definitions such as design variables, objective functions, constraints, convexity, and local vs. global optima. Examples show unconstrained vs. constrained problems and illustrate active, inactive, and violated constraints. Optimization techniques largely depend on calculus concepts such as derivatives and the Hessian matrix.
This document provides an overview of machine learning concepts. It discusses big data and the need for machine learning to extract structure from data. It explains that machine learning involves programming computers to optimize performance using examples or past experience. Learning is useful when human expertise is limited or changes over time. The document also summarizes applications of machine learning like classification, regression, clustering, and reinforcement learning. It provides examples of each type of learning and discusses concepts like bias-variance tradeoff, overfitting, underfitting and more.
Facebook Talk at Netflix ML Platform meetup Sep 2019, by Faisal Siddiqi
In this talk at the Netflix Machine Learning Platform Meetup on 12 Sep 2019, Sam Daulton from Facebook discusses "Practical Solutions to real-world exploration problems".
Stochastic optimization from mirror descent to recent algorithms, by Seonho Park
The document discusses stochastic optimization algorithms. It begins with an introduction to stochastic optimization and online optimization settings. Then it covers Mirror Descent and its extension Composite Objective Mirror Descent (COMID). Recent algorithms for deep learning like Momentum, ADADELTA, and ADAM are also discussed. The document provides convergence analysis and empirical studies of these algorithms.
This document discusses matrix factorization techniques for recommender systems. It begins by describing common approaches like content-based, collaborative filtering, and hybrid recommender systems. It then focuses on collaborative filtering, discussing memory and cold start issues with user-based and item-based approaches. The document introduces latent factor models like matrix factorization that address these issues by representing users and items as vectors of factors. It covers optimization techniques like alternating least squares for explicit and implicit feedback datasets. Finally, it discusses evaluation metrics like MAP and NDCG that are more appropriate than RMSE for recommender systems.
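As a rough illustration of a ranking metric of the kind mentioned above, here is a small NDCG sketch. It assumes the linear-gain form of DCG (relevance divided by log2 of rank + 1); other formulations, such as the exponential-gain one, exist, and the function names are hypothetical.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items, in ranked order."""
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Relevance of each recommended item, listed in the order it was shown.
print(ndcg_at_k([3, 2, 1], k=3))  # already in ideal order
print(ndcg_at_k([0, 0, 1], k=3))  # the only relevant item is ranked last
```

Unlike RMSE, the score depends only on where the relevant items land in the ranked list, so placing a relevant item lower is penalized even if its predicted rating is numerically close.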
Determination of Optimal Product Mix for Profit Maximization using Linear Pro..., by IJERA Editor
This document demonstrates using linear programming to determine the optimal product mix for a manufacturing firm to maximize profit. The firm produces n products using m raw materials. The problem is formulated as a linear program to maximize total profit subject to raw material constraints. The optimal solution is found using the simplex method and provides the quantities of each product (v1, v2, etc.) that maximize total profit (z0). The solution may show some product quantities as zero, indicating those products should not be produced to maximize profit under the given constraints.
This paper demonstrates the use of linear programming methods to determine the optimal product mix for profit maximization. Several papers have been written to demonstrate the use of linear programming in finding the optimal product mix in various organizations. This paper aims to show the generic approach to be taken to find the optimal product mix.
Linear, Machine Learning or Probabilistic Predictive Models: What's Best for ..., by Bohdan Pavlyshenko
Linear, Machine Learning and Probabilistic models are often used in predictive analytics. Each has its pros and cons for different industrial and business problems. Linear models make it possible to extrapolate forecasts and study the impact of external factors, but they do not capture complicated nonlinear patterns in the data. Machine learning models can find complicated patterns, but only in stationary data, and they require a lot of historical data for training to reach sufficient accuracy. Probabilistic models based on Bayesian inference can take expert opinion into account via prior distributions for parameters and can be used for different kinds of risk assessment. In this talk, I consider the use of these models and their combinations in different use cases. One type of use case is numeric regression for time series forecasting; another is logistic regression in manufacturing failure detection problems. I also consider multilevel predictive ensembles of models based on the bagging and stacking approaches.
This document discusses an upcoming lecture on linear regression and gradient descent. The lecture will cover gradient descent for linear regression, implementing gradient descent in code, and interpreting models from multiple linear regression. It will review cost functions and the intuition behind gradient descent, then demonstrate gradient descent for linear regression.
The document discusses machine learning optimization problems and linear/logistic regression algorithms. It notes that machine learning can be viewed as an optimization problem with constraints, a function to optimize, and an optimization algorithm. Linear regression aims to minimize prediction error by finding the best fitting linear model, while logistic regression predicts class probabilities using a sigmoid function. Both use gradient descent to optimize their error functions and learn model parameters from data.
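Gradient descent for linear regression, as described in the two summaries above, can be sketched in a few lines. This is an illustrative example only: the data, learning rate, and step count are assumptions chosen so the loop converges, and `fit_line` is a hypothetical name.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y = w*x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current model on every training point.
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of (1/n) * sum(residual^2) w.r.t. w and b.
        grad_w = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        grad_b = 2.0 / n * sum(residuals)
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data generated from y = 2x + 1, with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 4), round(b, 4))
```

The learning rate matters: if it is too large for the scale of the inputs the updates diverge, which is why features are often standardized before running gradient descent.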
Fractional factorial designs (FFDs) are used to efficiently study many factors using fewer experimental runs than a full factorial design. FFDs exploit redundancy in estimating interactions to select a subset of runs. Regular FFDs have desirable properties like balance and orthogonality. Resolution indicates how interactions are aliased, with higher resolutions preferred. FFDs are useful in screening experiments to identify important factors efficiently before further optimization. Software helps select appropriate FFDs based on desired resolution and aliasing.
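To make the idea concrete, the sketch below builds a two-level half-fraction design by taking a full factorial on the base factors and setting the last factor to their product (for three factors, the generator C = AB). The function name is hypothetical and the example only covers this one simple generator.

```python
from itertools import product

def half_fraction(n_base):
    """Two-level design with n_base + 1 factors in 2**n_base runs.

    The first n_base columns form a full factorial; the final column is
    generated as the product of the others, so it is aliased with their
    highest-order interaction.
    """
    runs = []
    for levels in product((-1, 1), repeat=n_base):
        generated = 1
        for level in levels:
            generated *= level
        runs.append(levels + (generated,))
    return runs

# Three factors (A, B, C) studied in 4 runs instead of the full 8.
design = half_fraction(2)
for run in design:
    print(run)
```

Each column contains as many +1 as -1 levels (balance) and every pair of columns has a zero dot product (orthogonality), which are the properties the summary attributes to regular designs.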
Simulators play a major role in analyzing multi-modal transportation networks. As their complexity increases, optimization becomes an increasingly challenging task. Current calibration procedures often rely on heuristics, rules of thumb and sometimes on brute-force search. Alternatively, we provide a statistical method which combines a distributed, Gaussian Process Bayesian optimization method with dimensionality reduction techniques and structural improvement. We then demonstrate our framework on the problem of calibrating a multi-modal transportation network of city of Bloomington, Illinois. Our framework is sample efficient and supported by theoretical analysis and an empirical study. We demonstrate on the problem of calibrating a multi-modal transportation network of city of Bloomington, Illinois. Finally, we discuss directions for further research.
1. Optimization methods are used widely in business, industry, government and engineering to solve problems involving optimal allocation of limited resources. Many optimization techniques originated during World War II to improve war efforts.
2. A linear programming problem aims to maximize or minimize a linear objective function subject to linear constraints. It has various applications including production scheduling, transportation routing, and cutting stock problems.
3. The document provides an example of using a linear programming model to maximize profits for a pottery company by determining the optimal product mix given constraints on available labor hours and clay materials. Decision variables, objective function, and constraints are defined to formulate the mathematical model.
This document discusses time series forecasting techniques for multivariate and hierarchical time series data. It presents several cases involving energy consumption forecasting, sales forecasting, and freight transportation forecasting. For each case, it describes the time series data and components, discusses feature generation methods like nonparametric transformations and the Haar wavelet transform to extract features, and evaluates different forecasting models and their ability to generate consistent forecasts while respecting any hierarchical relationships in the data. The focus is on generating accurate forecasts while maintaining properties like consistency, minimizing errors, and handling complex time series structures.
This document discusses various classification algorithms including logistic regression, Naive Bayes, support vector machines, k-nearest neighbors, decision trees, and random forests. It provides examples of using logistic regression and support vector machines for classification tasks. For logistic regression, it demonstrates building a model to classify handwritten digits from the MNIST dataset. For support vector machines, it uses a banknote authentication dataset to classify currency notes as authentic or fraudulent. The document discusses evaluating model performance using metrics like confusion matrix, accuracy, precision, recall, and F1 score.
The document presents a modification to the Jaya optimization algorithm. The standard Jaya algorithm seeks guidance from only the best and worst solutions in each iteration. The modification proposes that Jaya should also seek guidance from the top and bottom 10% of solutions, in addition to the best and worst. This allows information to flow more continuously from the extremities.
The proposed algorithm is tested on the sphere function optimization problem. Initial candidate solutions are generated and ranked. The top and bottom 10% solutions near the best and worst are identified. Each candidate is then modified based on these neighboring solutions, moving toward the top 10% and away from the bottom 10%. Finally, candidates are refined using the standard Jaya equations seeking guidance from the
The document discusses Python programming and data science tools like NumPy, Scikit-learn, and Cython. It provides examples of using NumPy to quickly sum a large array and speed up a prime number calculation with Cython. It also briefly mentions past Python conference talks and techniques like spectral clustering and activation functions.
Fast and Probvably Seedings for k-MeansKimikazu Kato
The document proposes a new MCMC-based algorithm for initializing centroids in k-means clustering that does not assume a specific distribution of the input data, unlike previous work. It uses rejection sampling to emulate the distribution and select initial centroids that are widely scattered. The algorithm is proven mathematically to converge. Experimental results on synthetic and real-world datasets show it performs well with a good trade-off of accuracy and speed compared to existing techniques.
This document discusses Python and machine learning libraries like scikit-learn. It provides code examples for loading data, fitting models, and making predictions using scikit-learn algorithms. It also covers working with NumPy arrays and loading data from files like CSVs.
Effective Numerical Computation in NumPy and SciPyKimikazu Kato
This document provides an overview of effective numerical computation in NumPy and SciPy. It discusses how Python can be used for numerical computation tasks like differential equations, simulations, and machine learning. While Python is initially slower than languages like C, libraries like NumPy and SciPy allow Python code to achieve sufficient speed through techniques like broadcasting, indexing, and using sparse matrix representations. The document provides examples of how to efficiently perform tasks like applying functions element-wise to sparse matrices and calculating norms. It also presents a case study for efficiently computing a formula that appears in a machine learning paper using different sparse matrix representations in SciPy.
Kimikazu Kato is the Chief Scientist at Silver Egg Technology, which provides recommender system and online advertising services. He has a PhD in computer science and experience in areas like computer graphics and parallel computing. Silver Egg uses a real-time recommender platform called Aigent Suite to consistently target users from initial visits to retention. The system analyzes user behavior data to determine personalized recommendations and ad targeting. While collaborative filtering and matrix factorization are common recommendation algorithms, approaches need adjustments for sales recommendations versus movie ratings. Consulting is also important for tuning algorithm parameters to specific business needs.
Exploring the advantages of on-premises Dell PowerEdge servers with AMD EPYC processors vs. the cloud for small to medium businesses’ AI workloads
AI initiatives can bring tremendous value to your business, but you need to support your new AI workloads effectively. That means choosing the best possible infrastructure for your needs—and many companies are finding that the cloud isn’t right for them. According to a recent Rackspace survey of IT executives, 69 percent of companies have moved some of their applications on-premises from the cloud, with half of those citing security and compliance as the reason and 44 percent citing cost.
On-premises solutions provide a number of advantages. With full control over your security infrastructure, you can be certain that all compliance requirements remain firmly in the hands of your IT team. Opting for on-premises also gives you the ability to design your infrastructure to the precise needs of that team and your new AI workloads. Depending on the workload, you may also see performance benefits, along with more predictable costs. As you start to build your next AI initiative, consider an on-premises solution utilizing AMD EPYC processor-powered Dell PowerEdge servers.
Introduction and Background:
Study Overview and Methodology: The study analyzes the IT market in Israel, covering over 160 markets and 760 companies/products/services. It includes vendor rankings, IT budgets, and trends from 2025-2029. Vendors participate in detailed briefings and surveys.
Vendor Listings: The presentation lists numerous vendors across various pages, detailing their names and services. These vendors are ranked based on their participation and market presence.
Market Insights and Trends: Key insights include IT market forecasts, economic factors affecting IT budgets, and the impact of AI on enterprise IT. The study highlights the importance of AI integration and the concept of creative destruction.
Agentic AI and Future Predictions: Agentic AI is expected to transform human-agent collaboration, with AI systems understanding context and orchestrating complex processes. Future predictions include AI's role in shopping and enterprise IT.
Jira Administration Training – Day 1 : IntroductionRavi Teja
This presentation covers the basics of Jira for beginners. Learn how Jira works, its key features, project types, issue types, and user roles. Perfect for anyone new to Jira or preparing for Jira Admin roles.
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....Jasper Oosterveld
Sensitivity labels, powered by Microsoft Purview Information Protection, serve as the foundation for classifying and protecting your sensitive data within Microsoft 365. Their importance extends beyond classification and play a crucial role in enforcing governance policies across your Microsoft 365 environment. Join me, a Data Security Consultant and Microsoft MVP, as I share practical tips and tricks to get the full potential of sensitivity labels. I discuss sensitive information types, automatic labeling, and seamless integration with Data Loss Prevention, Teams Premium, and Microsoft 365 Copilot.
Nix(OS) for Python Developers - PyCon 25 (Bologna, Italia)Peter Bittner
How do you onboard new colleagues in 2025? How long does it take? Would you love a standardized setup under version control that everyone can customize for themselves? A stable desktop setup, reinstalled in just minutes. It can be done.
This talk was given in Italian, 29 May 2025, at PyCon 25, Bologna, Italy. All slides are provided in English.
Original slides at https://slides.com/bittner/pycon25-nixos-for-python-developers
Jeremy Millul - A Talented Software DeveloperJeremy Millul
Jeremy Millul is a talented software developer based in NYC, known for leading impactful projects such as a Community Engagement Platform and a Hiking Trail Finder. Using React, MongoDB, and geolocation tools, Jeremy delivers intuitive applications that foster engagement and usability. A graduate of NYU’s Computer Science program, he brings creativity and technical expertise to every project, ensuring seamless user experiences and meaningful results in software development.
European Accessibility Act & Integrated Accessibility TestingJulia Undeutsch
Emma Dawson will guide you through two important topics in this session.
Firstly, she will prepare you for the European Accessibility Act (EAA), which comes into effect on 28 June 2025, and show you how development teams can prepare for it.
In the second part of the webinar, Emma Dawson will explore with you various integrated testing methods and tools that will help you improve accessibility during the development cycle, such as Linters, Storybook, Playwright, just to name a few.
Focus: European Accessibility Act, Integrated Testing tools and methods (e.g. Linters, Storybook, Playwright)
Target audience: Everyone, Developers, Testers
Introducing the OSA 3200 SP and OSA 3250 ePRCAdtran
Adtran's latest Oscilloquartz solutions make optical pumping cesium timing more accessible than ever. Discover how the new OSA 3200 SP and OSA 3250 ePRC deliver superior stability, simplified deployment and lower total cost of ownership. Built on a shared platform and engineered for scalable, future-ready networks, these models are ideal for telecom, defense, metrology and more.
nnual (33 years) study of the Israeli Enterprise / public IT market. Covering sections on Israeli Economy, IT trends 2026-28, several surveys (AI, CDOs, OCIO, CTO, staffing cyber, operations and infra) plus rankings of 760 vendors on 160 markets (market sizes and trends) and comparison of products according to support and market penetration.
Protecting Your Sensitive Data with Microsoft Purview - IRMS 2025Nikki Chapple
Session | Protecting Your Sensitive Data with Microsoft Purview: Practical Information Protection and DLP Strategies
Presenter | Nikki Chapple (MVP| Principal Cloud Architect CloudWay) & Ryan John Murphy (Microsoft)
Event | IRMS Conference 2025
Format | Birmingham UK
Date | 18-20 May 2025
In this closing keynote session from the IRMS Conference 2025, Nikki Chapple and Ryan John Murphy deliver a compelling and practical guide to data protection, compliance, and information governance using Microsoft Purview. As organizations generate over 2 billion pieces of content daily in Microsoft 365, the need for robust data classification, sensitivity labeling, and Data Loss Prevention (DLP) has never been more urgent.
This session addresses the growing challenge of managing unstructured data, with 73% of sensitive content remaining undiscovered and unclassified. Using a mountaineering metaphor, the speakers introduce the “Secure by Default” blueprint—a four-phase maturity model designed to help organizations scale their data security journey with confidence, clarity, and control.
🔐 Key Topics and Microsoft 365 Security Features Covered:
Microsoft Purview Information Protection and DLP
Sensitivity labels, auto-labeling, and adaptive protection
Data discovery, classification, and content labeling
DLP for both labeled and unlabeled content
SharePoint Advanced Management for workspace governance
Microsoft 365 compliance center best practices
Real-world case study: reducing 42 sensitivity labels to 4 parent labels
Empowering users through training, change management, and adoption strategies
🧭 The Secure by Default Path – Microsoft Purview Maturity Model:
Foundational – Apply default sensitivity labels at content creation; train users to manage exceptions; implement DLP for labeled content.
Managed – Focus on crown jewel data; use client-side auto-labeling; apply DLP to unlabeled content; enable adaptive protection.
Optimized – Auto-label historical content; simulate and test policies; use advanced classifiers to identify sensitive data at scale.
Strategic – Conduct operational reviews; identify new labeling scenarios; implement workspace governance using SharePoint Advanced Management.
🎒 Top Takeaways for Information Management Professionals:
Start secure. Stay protected. Expand with purpose.
Simplify your sensitivity label taxonomy for better adoption.
Train your users—they are your first line of defense.
Don’t wait for perfection—start small and iterate fast.
Align your data protection strategy with business goals and regulatory requirements.
💡 Who Should Watch This Presentation?
This session is ideal for compliance officers, IT administrators, records managers, data protection officers (DPOs), security architects, and Microsoft 365 governance leads. Whether you're in the public sector, financial services, healthcare, or education.
🔗 Read the blog: https://nikkichapple.com/irms-conference-2025/
Introducing FME Realize: A New Era of Spatial Computing and ARSafe Software
A new era for the FME Platform has arrived – and it’s taking data into the real world.
Meet FME Realize: marking a new chapter in how organizations connect digital information with the physical environment around them. With the addition of FME Realize, FME has evolved into an All-data, Any-AI Spatial Computing Platform.
FME Realize brings spatial computing, augmented reality (AR), and the full power of FME to mobile teams: making it easy to visualize, interact with, and update data right in the field. From infrastructure management to asset inspections, you can put any data into real-world context, instantly.
Join us to discover how spatial computing, powered by FME, enables digital twins, AI-driven insights, and real-time field interactions: all through an intuitive no-code experience.
In this one-hour webinar, you’ll:
-Explore what FME Realize includes and how it fits into the FME Platform
-Learn how to deliver real-time AR experiences, fast
-See how FME enables live, contextual interactions with enterprise data across systems
-See demos, including ones you can try yourself
-Get tutorials and downloadable resources to help you start right away
Whether you’re exploring spatial computing for the first time or looking to scale AR across your organization, this session will give you the tools and insights to get started with confidence.
Offshore IT Support: Balancing In-House and Offshore Help Desk Techniciansjohn823664
In today's always-on digital environment, businesses must deliver seamless IT support across time zones, devices, and departments. This SlideShare explores how companies can strategically combine in-house expertise with offshore talent to build a high-performing, cost-efficient help desk operation.
From the benefits and challenges of offshore support to practical models for integrating global teams, this presentation offers insights, real-world examples, and key metrics for success. Whether you're scaling a startup or optimizing enterprise support, discover how to balance cost, quality, and responsiveness with a hybrid IT support strategy.
Perfect for IT managers, operations leads, and business owners considering global help desk solutions.
New Ways to Reduce Database Costs with ScyllaDBScyllaDB
How ScyllaDB’s latest capabilities can reduce your infrastructure costs
ScyllaDB has been obsessed with price-performance from day 1. Our core database is architected with low-level engineering optimizations that squeeze every ounce of power from the underlying infrastructure. And we just completed a multi-year effort to introduce a set of new capabilities for additional savings.
Join this webinar to learn about these new capabilities: the underlying challenges we wanted to address, the workloads that will benefit most from each, and how to get started. We’ll cover ways to:
- Avoid overprovisioning with “just-in-time” scaling
- Safely operate at up to ~90% storage utilization
- Cut network costs with new compression strategies and file-based streaming
We’ll also highlight a “hidden gem” capability that lets you safely balance multiple workloads in a single cluster. To conclude, we will share the efficiency-focused capabilities on our short-term and long-term roadmaps.
Securiport is a border security systems provider with a progressive team approach to its task. The company acknowledges the importance of specialized skills in creating the latest in innovative security tech. The company has offices throughout the world to serve clients, and its employees speak more than twenty languages at the Washington D.C. headquarters alone.
Introduction to behavior based recommendation system
1. Introduction to Algorithms for Behavior Based Recommendation
Tokyo Web Mining Meetup
March 26, 2016
Kimikazu Kato
Silver Egg Technology Co., Ltd.
1 / 36
2. About myself
加藤公一 Kimikazu Kato
Twitter: @hamukazu
LinkedIn: http://linkedin.com/in/kimikazukato
Chief Scientist at Silver Egg Technology
Ph.D. in computer science, Master's degree in mathematics
Experience in numerical computation and mathematical algorithms, especially:
Geometric computation, computer graphics
Partial differential equations, parallel computation, GPGPU
Mathematical programming
Now specializing in:
Machine learning, especially recommendation systems
3. About our company
Silver Egg Technology
Established: 1998
CEO: Tom Foley
Main Service: Recommendation System, Online Advertisement
Major Clients: QVC, Senshukai (Bellemaison), Tsutaya
We provide a recommendation system to Japan's leading web sites.
6. Recommendation System
Recommender systems or recommendation systems (sometimes
replacing "system" with a synonym such as platform or engine) are a
subclass of information filtering system that seek to predict the
'rating' or 'preference' that user would give to an item. — Wikipedia
In this talk, we focus on collaborative filtering methods, which utilize only
users' behavior, activity, and preferences.
Other methods include:
Content-based methods
Methods using demographic data
Hybrid methods
7. Rating Prediction Problem
user\movie W X Y Z
A 5 4 1 4
B 4
C 2 3
D 1 4 ?
Given rating information for some user/movie pairs,
we want to predict the rating for an unknown user/movie pair.
8. Item Prediction Problem
user\item W X Y Z
A 1 1 1 1
B 1
C 1
D 1 ? 1 ?
Given "who bought what" information (user/item pairs),
we want to predict which item is likely to be bought by a user.
9. Input/Output of the systems
Rating Prediction
Input: set of ratings for user/item pairs
Output: map from user/item pair to predicted rating
Item Prediction
Input: set of user/item pairs as shopping data, integer $k$
Output: top $k$ items for each user that are most likely to be bought by
that user
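The two input/output contracts above can be sketched as plain Python functions. This is only an illustration under assumptions: the names `fit_global_mean` and `top_k_popular`, the dict/set data layout, and the two trivial baselines (global mean rating, most popular items not yet bought) are hypothetical and not from the talk.

```python
from collections import Counter, defaultdict


def fit_global_mean(ratings):
    """Rating prediction: {(user, item): rating} -> predictor(user, item)."""
    mean = sum(ratings.values()) / len(ratings)

    def predict(user, item):
        # Hypothetical baseline: every pair is predicted as the global mean.
        return mean

    return predict


def top_k_popular(purchases, k):
    """Item prediction: set of (user, item) pairs and k -> {user: top-k items}."""
    popularity = Counter(item for _, item in purchases)
    bought = defaultdict(set)
    for user, item in purchases:
        bought[user].add(item)
    # Most popular first; ties broken by item name to keep the output stable.
    ranked = sorted(popularity, key=lambda item: (-popularity[item], item))
    return {
        user: [item for item in ranked if item not in items][:k]
        for user, items in bought.items()
    }
```

Any real algorithm discussed later in the talk would replace the bodies of these functions while keeping the same shape of input and output.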
10. Evaluation Metrics for Recommendation Systems
Rating prediction
Root Mean Squared Error (RMSE)
The square root of the mean of squared errors
Item prediction
Precision
(# of Recommended and Purchased)/(# of Recommended)
Recall
(# of Recommended and Purchased)/(# of Purchased)
11. RMSE of Rating Prediction
Some user/item pairs are randomly chosen to be hidden.
user\movie W X Y Z
A 5 4 1 4
B 4
C 2 3
D 1 4 ?
If a hidden rating is predicted as 3.1 but the actual value is 4, then the squared error is
$|3.1 - 4|^2 = 0.9^2$.
Take the mean of the squared errors over all the hidden pairs, and then take the
square root of it:
\[ \mathrm{RMSE} = \sqrt{\frac{1}{|\mathrm{hidden}|} \sum_{(u,i) \in \mathrm{hidden}} \left(\mathrm{predicted}_{ui} - \mathrm{actual}_{ui}\right)^2} \]
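The evaluation described above can be written as a few lines of Python. This is a minimal sketch; the function name `rmse`, the dict-based layout, and the key `("D", "Z")` for the hidden pair are assumptions made for illustration.

```python
import math


def rmse(predicted, actual):
    """predicted, actual: dicts keyed by the hidden (user, item) pairs."""
    squared_errors = [(predicted[pair] - actual[pair]) ** 2 for pair in actual]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# A single hidden pair as in the slide: predicted 3.1, actual 4.
print(rmse({("D", "Z"): 3.1}, {("D", "Z"): 4.0}))
```

With only one hidden pair, the result is simply the absolute error of that pair.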
12. Precision/Recall of Item Prediction
If three items are recommended:
2 out of 3 recommended items are actually bought: the precision is 2/3.
2 out of 4 bought items are recommended: the recall is 2/4.
These are denoted by prec@3 and recall@3.
Ex. prec@5 = 3/5, recall@5 = 3/4
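A short Python sketch of these two metrics follows. The function name `precision_recall_at_k` and the item names are hypothetical; the numbers reproduce the three-recommendation example above (2 hits, 3 recommended, 4 bought).

```python
def precision_recall_at_k(recommended, purchased, k):
    """recommended: ranked list of items; purchased: set of items actually bought."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in purchased)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(purchased) if purchased else 0.0
    return precision, recall


# Three recommendations, four purchases, two hits (item names are made up).
prec, rec = precision_recall_at_k(["W", "X", "Y"], {"W", "X", "P", "Q"}, k=3)
```

Precision divides the hits by the number of recommended items, while recall divides the same hits by the number of purchased items, so the two denominators differ whenever those counts differ.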
13. ROC and AUC
# of recom.   1  2  3  4  5  6  7  8  9  10
# of whites   1  1  1  2  2  3  4  5  5  6
# of blacks   0  1  2  2  3  3  3  3  4  4
Divide the "# of whites" and "# of blacks" rows by the total numbers of whites and blacks
respectively, and plot the resulting pairs of values in the xy-plane.
14. This curve is called the "ROC curve." The area under this curve is called the "AUC."
A higher AUC is better (max = 1).
The AUC is often used in academia, but for practical purposes...
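The construction of the curve and its area can be sketched in Python as follows. The ranked list of labels is read off the cumulative counts in the table. Two things are assumptions here, since the original figure is not available: that "whites" are the positives, and that they go on the vertical axis. The names `roc_points` and `auc` are hypothetical.

```python
def roc_points(ranked_labels):
    """ranked_labels: booleans in recommendation order (True = positive)."""
    n_pos = sum(ranked_labels)
    n_neg = len(ranked_labels) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for is_positive in ranked_labels:
        if is_positive:
            tp += 1
        else:
            fp += 1
        # x: fraction of negatives seen so far, y: fraction of positives seen so far.
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the piecewise-linear curve, by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Assumption: white = True, black = False, in the order implied by the table.
labels = [True, False, False, True, False, True, True, True, False, True]
curve = roc_points(labels)
area = auc(curve)
```

Swapping which colour counts as positive mirrors the curve and turns the area into one minus its previous value.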
15. Netflix Prize
The Netflix Prize was an open competition for the best collaborative
filtering algorithm to predict user ratings for films, based on previous
ratings without any other information about the users or films, i.e.
without the users or the films being identified except by numbers
assigned for the contest. — Wikipedia
In short, an open competition for preference prediction.
Closed in 2009.
16. Outline of Winner's Algorithm
Refer to the blog by E.Chen.
http://blog.echen.me/2011/10/24/winning-the-netflix-prize-a-summary/
Digest of the methods:
Neighborhood Method
Matrix Factorization
Restricted Boltzmann Machines
Regression
Regularization
Ensemble Methods
17. Notations
Number of users: $n$
Set of users: $U = \{1, 2, \ldots, n\}$
Number of items (movies): $m$
Set of items (movies): $I = \{1, 2, \ldots, m\}$
Input matrix: $A$ ($n \times m$ matrix)
18. Matrix Factorization
Based on the assumption that each item is described by a small number of
latent factors
Each rating is expressed as a linear combination of the latent factors
Achieved good performance in the Netflix Prize
Find matrices $X \in \mathrm{Mat}(f, n)$ and $Y \in \mathrm{Mat}(f, m)$ with $f \ll n, m$ such that
\[ A \approx X^T Y \]
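19. Find $X$ and $Y$ that maximize $p(X, Y \mid A, \sigma)$, where
\[ p(A \mid X, Y, \sigma) = \prod_{A_{ui} \neq 0} \mathcal{N}(A_{ui} \mid X_u^T Y_i, \sigma) \]
\[ p(X \mid \sigma_X) = \prod_{u} \mathcal{N}(X_u \mid 0, \sigma_X I) \]
\[ p(Y \mid \sigma_Y) = \prod_{i} \mathcal{N}(Y_i \mid 0, \sigma_Y I) \]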
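20. According to Bayes' Theorem,
\[ p(X, Y \mid A, \sigma) = p(A \mid X, Y, \sigma) \, p(X \mid \sigma_X) \, p(Y \mid \sigma_Y) \times \mathrm{const.} \]
Thus, the negative log-posterior $-\log p(X, Y \mid A, \sigma, \sigma_X, \sigma_Y)$ is, up to a positive scale factor and an additive constant,
\[ \sum_{A_{ui} \neq 0} (A_{ui} - X_u^T Y_i)^2 + \lambda_X \|X\|_{\mathrm{Fro}}^2 + \lambda_Y \|Y\|_{\mathrm{Fro}}^2 \]
where $\| \cdot \|_{\mathrm{Fro}}$ means the Frobenius norm. Maximizing the posterior therefore amounts to minimizing this regularized squared error.
How can this be computed? Use MCMC. See [Salakhutdinov et al., 2008].
Once $X$ and $Y$ are determined, let $\tilde{A} := X^T Y$; the prediction for $A_{ui}$ is
estimated by $\tilde{A}_{ui}$.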
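To make the regularized squared error concrete, here is a minimal pure-Python sketch that minimizes it with plain stochastic gradient descent. This swaps in SGD for the MCMC approach of [Salakhutdinov et al., 2008] purely for brevity, and it is not the author's implementation. The function names, the hyperparameters, the single `lam` used for both regularizers, and the positions of the ratings in the toy data are all assumptions.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train_mf(ratings, n_users, n_items, f=2, lam=0.05, lr=0.01, epochs=300, seed=0):
    """ratings: {(u, i): A_ui} for the observed entries. Returns X, Y as lists of
    f-dimensional vectors, so X[u] plays the role of the column X_u."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 0.1) for _ in range(f)] for _ in range(n_users)]
    Y = [[rng.gauss(0.0, 0.1) for _ in range(f)] for _ in range(n_items)]
    observed = list(ratings.items())
    for _ in range(epochs):
        rng.shuffle(observed)
        for (u, i), a in observed:
            err = a - dot(X[u], Y[i])
            for k in range(f):
                xu, yi = X[u][k], Y[i][k]
                # Descent step on (A_ui - X_u^T Y_i)^2 + lam * (|X_u|^2 + |Y_i|^2);
                # regularizing on every visit is a common SGD simplification.
                X[u][k] += lr * 2.0 * (err * yi - lam * xu)
                Y[i][k] += lr * 2.0 * (err * xu - lam * yi)
    return X, Y


def squared_error(ratings, X, Y):
    return sum((a - dot(X[u], Y[i])) ** 2 for (u, i), a in ratings.items())


# Hypothetical 4 x 4 rating data (users 0-3, items 0-3); positions are made up.
ratings = {(0, 0): 5, (0, 1): 4, (0, 2): 1, (0, 3): 4, (1, 1): 4,
           (2, 0): 2, (2, 2): 3, (3, 0): 1, (3, 1): 4}
X, Y = train_mf(ratings, n_users=4, n_items=4)
prediction = dot(X[3], Y[3])  # estimate for an unobserved pair
```

The prediction for an unobserved pair is just the inner product of the two learned vectors, which is the entry of $X^T Y$ at that position.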
21. Difference between Rating and Shopping
Rating
user\movie W X Y Z
A 5 4 1 4
B 4
C 2 3
D 1 4 ?
Includes negative feedback
"1" means "boring"
Zero means "unknown"
Shopping (Browsing)
user\item W X Y Z
A 1 1 1 1
B 1
C 1
D 1 ? 1 ?
Includes no negative feedback
Zero means "unknown" or "negative"
More degrees of freedom
Consequently, an algorithm that is effective for the rating matrix is not necessarily
effective for the shopping matrix.
23. Adding a Constraint
The problem has too many degrees of freedom
A desirable characteristic is that many elements of the product should be
zero.
Assume that a certain ratio of zero elements of the input matrix remains
zero after the optimization [Sindhwani et al., 2010]
Experimentally outperforms the "zero-as-negative" method
24. One-class Matrix Completion
[Sindhwani et al., 2010]
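Introduced variables $p_{ui}$ to relax the problem.
Minimize
\[
\begin{aligned}
& \sum_{A_{ui} \neq 0} (A_{ui} - X_u^T Y_i)^2 + \lambda_X \|X\|_{\mathrm{Fro}}^2 + \lambda_Y \|Y\|_{\mathrm{Fro}}^2 \\
& + \sum_{A_{ui} = 0} \left[ p_{ui} (0 - X_u^T Y_i)^2 + (1 - p_{ui})(1 - X_u^T Y_i)^2 \right] \\
& + T \sum_{A_{ui} = 0} \left[ -p_{ui} \log p_{ui} - (1 - p_{ui}) \log (1 - p_{ui}) \right]
\end{aligned}
\]
subject to
\[ \frac{1}{|\{A_{ui} \mid A_{ui} = 0\}|} \sum_{A_{ui} = 0} p_{ui} = r \]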
25. Intuitive explanation:
\[
\begin{aligned}
& \sum_{A_{ui} \neq 0} (A_{ui} - X_u^T Y_i)^2 + \lambda_X \|X\|_{\mathrm{Fro}}^2 + \lambda_Y \|Y\|_{\mathrm{Fro}}^2 \\
& + \sum_{A_{ui} = 0} \left[ p_{ui} (0 - X_u^T Y_i)^2 + (1 - p_{ui})(1 - X_u^T Y_i)^2 \right] \\
& + T \sum_{A_{ui} = 0} \left[ -p_{ui} \log p_{ui} - (1 - p_{ui}) \log (1 - p_{ui}) \right]
\end{aligned}
\]
$p_{ui}$ means how likely the $(u, i)$-element is zero.
The second line is the error of estimation weighted by the $p_{ui}$'s.
The third line is the entropy of the distribution.
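To see how the three parts of this objective fit together, the following pure-Python sketch evaluates it for given factors and given $p_{ui}$. It only computes the value as written on the slide (with the first term taken as a squared error) and does not implement the optimization procedure of [Sindhwani et al., 2010]. The function names and the list-of-lists layout are assumptions.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def one_class_objective(A, X, Y, P, lam_x, lam_y, T):
    """A: n x m 0/1 matrix (list of rows); X[u], Y[i]: factor vectors;
    P[u][i]: p_ui, only read where A[u][i] == 0."""
    fit = relaxed = entropy = 0.0
    for u, row in enumerate(A):
        for i, a in enumerate(row):
            pred = dot(X[u], Y[i])
            if a != 0:
                fit += (a - pred) ** 2
            else:
                p = P[u][i]
                # Target 0 with weight p, target 1 with weight 1 - p.
                relaxed += p * (0.0 - pred) ** 2 + (1.0 - p) * (1.0 - pred) ** 2
                for q in (p, 1.0 - p):
                    if q > 0.0:  # treat 0 * log 0 as 0
                        entropy -= q * math.log(q)
    reg = lam_x * sum(dot(x, x) for x in X) + lam_y * sum(dot(y, y) for y in Y)
    return fit + reg + relaxed + T * entropy


def mean_p_on_zeros(A, P):
    """Left-hand side of the constraint: the average p_ui over zero entries."""
    zeros = [P[u][i] for u, row in enumerate(A) for i, a in enumerate(row) if a == 0]
    return sum(zeros) / len(zeros)
```

A feasible choice of the $p_{ui}$ is one for which `mean_p_on_zeros` equals the prescribed ratio $r$.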
26. Implicit Sparseness constraint: SLIM (Elastic Net)
In the regression model, adding an L1 term makes the solution sparse:
\[ \min_{w} \left[ \frac{1}{2n} \|Xw - y\|_2^2 + \frac{\lambda(1 - \rho)}{2} \|w\|_2^2 + \lambda\rho |w|_1 \right] \]
A similar idea is used for matrix factorization [Ning et al., 2011]:
Minimize
\[ \|A - AW\|_{\mathrm{Fro}}^2 + \frac{\lambda(1 - \rho)}{2} \|W\|_{\mathrm{Fro}}^2 + \lambda\rho |W|_1 \]
subject to
\[ \mathrm{diag}\, W = 0 \]
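One way to see this constrained elastic-net problem in action is a small proximal-gradient (ISTA) loop in NumPy. This is a sketch under assumptions, not the solver used by [Ning et al., 2011]: it takes the first term as the squared Frobenius norm, handles the L1 term by soft-thresholding, and enforces the zero diagonal by resetting it after every step. The names `slim_objective` and `fit_slim`, the hyperparameter values, and the toy matrix are hypothetical.

```python
import numpy as np


def slim_objective(A, W, lam, rho):
    return (np.linalg.norm(A - A @ W, "fro") ** 2
            + lam * (1.0 - rho) / 2.0 * np.linalg.norm(W, "fro") ** 2
            + lam * rho * np.abs(W).sum())


def fit_slim(A, lam=0.1, rho=0.5, iters=200):
    """Proximal gradient descent on the objective above with diag(W) = 0."""
    m = A.shape[1]
    W = np.zeros((m, m))
    G = A.T @ A
    # Lipschitz constant of the gradient of the smooth part.
    lipschitz = 2.0 * np.linalg.norm(A, 2) ** 2 + lam * (1.0 - rho)
    step = 1.0 / lipschitz
    for _ in range(iters):
        grad = 2.0 * (G @ W - G) + lam * (1.0 - rho) * W
        V = W - step * grad
        # Soft-thresholding is the proximal operator of the L1 term.
        W = np.sign(V) * np.maximum(np.abs(V) - step * lam * rho, 0.0)
        # Projection onto the constraint diag W = 0.
        np.fill_diagonal(W, 0.0)
    return W


# Hypothetical 4 users x 4 items purchase matrix.
A = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 1, 1, 1]], dtype=float)
W = fit_slim(A)
scores = A @ W  # item scores for each user
```

The zero diagonal prevents the trivial solution $W = I$, so each item's score has to be explained by the other items.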
27. Ranking prediction
Another strategy for shopping prediction
"Learn from the order" approach
Predict whether X is more likely to be bought than Y, rather than the
probability for X or Y.
28. Bayesian Personalized Ranking
[Rendle et al., 2009]
Consider a matrix factorization model, but update the elements
according to the observed "orders"
The parameters are the same as in the usual matrix factorization, but the
objective function is different
Consider a total order $>_u$ for each $u \in U$. Suppose that $i >_u j$ $(i, j \in I)$
means "the user $u$ is more likely to buy $i$ than $j$."
The objective is to calculate $p(i >_u j)$ such that $A_{ui} = 0$ and $A_{uj} = 0$ (which
means $i$ and $j$ are not bought by $u$).
29. Let
\[ D_A = \{(u, i, j) \in U \times I \times I \mid A_{ui} = 1, A_{uj} = 0\}, \]
and define
\[ \prod_{u \in U} p(>_u \mid X, Y) := \prod_{(u,i,j) \in D_A} p(i >_u j \mid X, Y) \]
where we assume
\[ p(i >_u j \mid X, Y) = \sigma(X_u^T Y_i - X_u^T Y_j), \qquad \sigma(x) = \frac{1}{1 + e^{-x}} \]
According to Bayes' theorem, the function to be optimized becomes:
\[ \prod p(X, Y \mid >_u) = \prod p(>_u \mid X, Y) \times p(X) \, p(Y) \times \mathrm{const.} \]
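30. Taking log of this,
\[
\begin{aligned}
L &:= \log \Big[ \prod p(>_u \mid X, Y) \times p(X) \, p(Y) \Big] \\
&= \log \prod_{(u,i,j) \in D_A} p(i >_u j \mid X, Y) - \lambda_X \|X\|_{\mathrm{Fro}}^2 - \lambda_Y \|Y\|_{\mathrm{Fro}}^2 \\
&= \sum_{(u,i,j) \in D_A} \log \sigma(X_u^T Y_i - X_u^T Y_j) - \lambda_X \|X\|_{\mathrm{Fro}}^2 - \lambda_Y \|Y\|_{\mathrm{Fro}}^2
\end{aligned}
\]
Now consider the following problem:
\[ \max_{X, Y} \Big[ \sum_{(u,i,j) \in D_A} \log \sigma(X_u^T Y_i - X_u^T Y_j) - \lambda_X \|X\|_{\mathrm{Fro}}^2 - \lambda_Y \|Y\|_{\mathrm{Fro}}^2 \Big] \]
This means "find a pair of matrices $X, Y$ which preserve the order of the
elements of the input matrix for each $u$."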
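As a worked step toward optimizing this objective, the gradients of a single summand can be written out. This is a sketch derived from the formulas above, writing $x_{uij} := X_u^T Y_i - X_u^T Y_j$ (a shorthand introduced here) and using $\frac{d}{dx} \log \sigma(x) = 1 - \sigma(x)$. Attaching the regularizer only to the three vectors touched by the triple is an assumption that follows common practice.

```latex
\begin{aligned}
\frac{\partial}{\partial X_u} \left[ \log \sigma(x_{uij}) - \lambda_X \|X_u\|^2 \right]
  &= (1 - \sigma(x_{uij})) (Y_i - Y_j) - 2 \lambda_X X_u \\
\frac{\partial}{\partial Y_i} \left[ \log \sigma(x_{uij}) - \lambda_Y \|Y_i\|^2 \right]
  &= (1 - \sigma(x_{uij})) X_u - 2 \lambda_Y Y_i \\
\frac{\partial}{\partial Y_j} \left[ \log \sigma(x_{uij}) - \lambda_Y \|Y_j\|^2 \right]
  &= -(1 - \sigma(x_{uij})) X_u - 2 \lambda_Y Y_j
\end{aligned}
```

The factor $1 - \sigma(x_{uij})$ is close to zero when the pair is already ranked correctly with a large margin, so such triples barely change the parameters.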
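31. Computation
The function we want to optimize:
\[ \sum_{(u,i,j) \in D_A} \log \sigma(X_u^T Y_i - X_u^T Y_j) - \lambda_X \|X\|_{\mathrm{Fro}}^2 - \lambda_Y \|Y\|_{\mathrm{Fro}}^2 \]
$U \times I \times I$ is huge, so in practice, a stochastic method is necessary.
Let the parameters be $\Theta = (X, Y)$.
The algorithm is the following:
Repeat the following
Choose $(u, i, j) \in D_A$ randomly
Update $\Theta$ with
\[ \Theta = \Theta + \alpha \frac{\partial}{\partial \Theta} \left( \log \sigma(X_u^T Y_i - X_u^T Y_j) - \lambda_X \|X\|_{\mathrm{Fro}}^2 - \lambda_Y \|Y\|_{\mathrm{Fro}}^2 \right) \]
The sign is $+$ because the function is maximized; this is the same as a gradient descent step on its negative.
This method is called Stochastic Gradient Descent (SGD).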
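The loop described above can be sketched in pure Python as follows. It is a minimal illustration under assumptions rather than the reference implementation of [Rendle et al., 2009]: the function names, the hyperparameters, the single `lam` for both regularizers, the choice to regularize only the three vectors touched by each sampled triple, and the toy purchase matrix are all hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def build_pairs(A):
    """D_A: all (u, i, j) with A[u][i] == 1 and A[u][j] == 0."""
    m = len(A[0])
    return [(u, i, j) for u, row in enumerate(A)
            for i in range(m) for j in range(m)
            if row[i] == 1 and row[j] == 0]


def log_likelihood(X, Y, D):
    return sum(math.log(sigmoid(dot(X[u], Y[i]) - dot(X[u], Y[j])))
               for u, i, j in D)


def train_bpr(A, f=2, lam=0.01, alpha=0.05, steps=5000, seed=0):
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 0.1) for _ in range(f)] for _ in A]
    Y = [[rng.gauss(0.0, 0.1) for _ in range(f)] for _ in A[0]]
    D = build_pairs(A)
    for _ in range(steps):
        u, i, j = rng.choice(D)  # choose (u, i, j) from D_A at random
        # Derivative of log(sigmoid(x)) at x = X_u^T Y_i - X_u^T Y_j.
        g = 1.0 - sigmoid(dot(X[u], Y[i]) - dot(X[u], Y[j]))
        for k in range(f):
            xu, yi, yj = X[u][k], Y[i][k], Y[j][k]
            # Ascent step: the objective is maximized, hence "+=".
            X[u][k] += alpha * (g * (yi - yj) - 2.0 * lam * xu)
            Y[i][k] += alpha * (g * xu - 2.0 * lam * yi)
            Y[j][k] += alpha * (-g * xu - 2.0 * lam * yj)
    return X, Y, D


# Hypothetical purchase matrix: 4 users x 4 items.
A = [[1, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 0, 1]]
X, Y, D = train_bpr(A)
```

After training, ranking the items a user has not bought by the inner product of the user vector and each item vector gives the recommendation order for that user.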
33. Practical Aspects of the Recommendation Problem
Computational time
Memory consumption
How many services can be integrated in a server rack?
Super high accuracy with a supercomputer is useless for real business
34. Concluding Remarks: What is Important for Good Prediction?
Theory
Machine learning
Mathematical optimization
Implementation
Algorithms
Computer architecture
Mathematics
Human factors!
Hand tuning of parameters
Domain specific knowledge
35. References (1/2)
For beginners
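比戸ら, データサイエンティスト養成読本 機械学習入門編, 技術評論社, 2016 (Hido et al., Data Scientist Training Reader: Introduction to Machine Learning, Gijutsu-Hyoron, 2016; in Japanese)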
T.Segaran. Programming Collective Intelligence, O'Reilly Media, 2007.
E.Chen. Winning the Netflix Prize: A Summary.
A.Gunawardana and G.Shani. A Survey of Accuracy Evaluation Metrics of
Recommendation Tasks, The Journal of Machine Learning Research,
Volume 10, 2009.
36. References (2/2)
Papers
Salakhutdinov, Ruslan, and Andriy Mnih. "Bayesian probabilistic matrix
factorization using Markov chain Monte Carlo." Proceedings of the 25th
international conference on Machine learning. ACM, 2008.
Sindhwani, Vikas, et al. "One-class matrix completion with low-density
factorizations." Data Mining (ICDM), 2010 IEEE 10th International
Conference on. IEEE, 2010.
Rendle, Steffen, et al. "BPR: Bayesian personalized ranking from implicit
feedback." Proceedings of the Twenty-Fifth Conference on Uncertainty in
Artificial Intelligence. AUAI Press, 2009.
Zou, Hui, and Trevor Hastie. "Regularization and variable selection via the
elastic net." Journal of the Royal Statistical Society: Series B (Statistical
Methodology) 67.2 (2005): 301-320.
Ning, Xia, and George Karypis. "SLIM: Sparse linear methods for top-n
recommender systems." Data Mining (ICDM), 2011 IEEE 11th
International Conference on. IEEE, 2011.