Machine Learning
* [**overview**](#overview)
* [**theory**](#theory)
* [**methods**](#methods)
* [**representation learning**](#representation-learning)
* [**program synthesis**](#program-synthesis)
* [**meta-learning**](#meta-learning)
* [**automated machine learning**](#automated-machine-learning)
* [**weak supervision**](#weak-supervision)
* [**interesting papers**](#interesting-papers)
- [**theory**](#interesting-papers---theory)
- [**automated machine learning**](#interesting-papers---automated-machine-learning)
- [**systems**](#interesting-papers---systems)
----
[**deep learning**](https://github.com/brylevkirill/notes/blob/master/Deep%20Learning.md)
[**reinforcement learning**](https://github.com/brylevkirill/notes/blob/master/Reinforcement%20Learning.md)
[**causal inference**](https://github.com/brylevkirill/notes/blob/master/Causal%20Inference.md)
---
### overview
#### applications
[**artificial intelligence**](https://github.com/brylevkirill/notes/blob/master/Artificial%20Intelligence.md)
[**recommender systems**](https://github.com/brylevkirill/notes/blob/master/Recommender%20Systems.md)
[**information retrieval**](https://github.com/brylevkirill/notes/blob/master/Information%20Retrieval.md)
[state-of-the-art algorithms](https://paperswithcode.com/sota)
["Machine Learning is The New Algorithms"](http://nlpers.blogspot.ru/2014/10/machine-learning-is-new-algorithms.html) by Hal Daume
Any source code computing y = f(x), where f(x) has parameters and is used to make a decision, prediction, or estimate, is a candidate for replacement by a machine learning algorithm.
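A toy illustration of this point (hypothetical code; the data and the linear form of f are made up): instead of hand-tuning the parameters of y = f(x), fit them from examples, here with closed-form one-dimensional least squares.

```python
# A hand-written rule y = f(x) = w*x + b with guessed constants can be
# replaced by fitting w and b from (x, y) examples.

def fit_linear(xs, ys):
    # closed-form 1D least squares: w = cov(x, y) / var(x), b = mean_y - w * mean_x
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - w * mean_x
    return w, b

# noisy samples of roughly y = 2x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.0, 8.1]
w, b = fit_linear(xs, ys)
f = lambda x: w * x + b   # learned replacement for the hand-tuned f(x)
```

The same substitution applies whenever the hand-tuned parameters can be scored against recorded outcomes.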
<http://metacademy.org>
<http://en.wikipedia.org/wiki/Machine_learning> ([*guide*](https://github.com/Nixonite/open-source-machine-learning-degree/blob/master/Introduction%20to%20Machine%20Learning%20-%20Wikipedia.pdf))
#### guides
#### courses
[course](http://youtube.com/playlist?list=PLE6Wd9FR--Ecf_5nCbnSQMHqORpiChfJf) by Nando de Freitas `video`
[course](http://youtube.com/playlist?list=PLE6Wd9FR--EdyJ5lbFl8UuGjecvVw66F6) by Nando de Freitas `video`
[course](http://youtube.com/playlist?list=PLTPQEx-31JXgtDaC6-3HxWcp7fq4N8YGr) by Pedro Domingos `video`
[course](http://youtube.com/playlist?list=PLZSO_6-bSqHTTV7w9u7grTXBHMH-mw3qn) by Alex Smola `video`
[course](http://dataschool.io/15-hours-of-expert-machine-learning-videos/) by Trevor Hastie and Rob Tibshirani `video`
[course](http://youtube.com/playlist?list=PLD0F06AA0D2E8FFBA) by Jeff Miller `video`
[course](http://youtube.com/c/SergeyNikolenko/playlists) by Sergey Nikolenko `video` `in russian` `2019-2020`
[course](http://youtube.com/playlist?list=PLwdBkWbW0oHFDCTvO6R8l3V3Pe2ophxpD) by Sergey Nikolenko `video` `in russian` `2020`
[course](http://youtube.com/playlist?list=PL-_cKNuVAYAWXoVzVEDCT-usTEBHUf4AF) by Sergey Nikolenko `video` `in russian` `2018`
[course](http://machinelearning.ru/wiki/index.php?title=%D0%9C%D0%B0%D1%88%D0%B8%D0%BD%D0%BD%D0%BE%D0%B5_%D0%BE%D0%B1%D1%83%D1%87%D0%B5%D0%BD%D0%B8%D0%B5_%28%D0%BA%D1%83%D1%80%D1%81_%D0%BB%D0%B5%D0%BA%D1%86%D0%B8%D0%B9%2C_%D0%9A.%D0%92.%D0%92%D0%BE%D1%80%D0%BE%D0%BD%D1%86%D0%BE%D0%B2%29) by Konstantin Vorontsov `video` `in russian` `2021`
[course](http://youtube.com/playlist?list=PLk4h7dmY2eYHHTyfLyrl7HmP-H3mMAW08) by Konstantin Vorontsov `video` `in russian` `2020`
[course](http://youtube.com/playlist?list=PLJOzdkh8T5krxc4HsHbB8g8f0hu7973fK) by Konstantin Vorontsov `video` `in russian` `2019/2020`
[course](http://youtube.com/playlist?list=PLJOzdkh8T5kp99tGTEFjH_b9zqEQiiBtC) by Konstantin Vorontsov `video` `in russian` `2014`
[course](http://youtube.com/playlist?list=PLlb7e2G7aSpSSsCeUMLN-RxYOLAI9l2ld) by Igor Kuralenok `video` `in russian` `2017`
[course](http://youtube.com/playlist?list=PLlb7e2G7aSpSWVExpq74FnwFnWgLby56L) by Igor Kuralenok `video` `in russian` `2016`
[course](http://youtube.com/playlist?list=PLlb7e2G7aSpTd91sd82VxWNdtTZ8QnFne) by Igor Kuralenok `video` `in russian` `2015`
[course](http://youtube.com/playlist?list=PL-_cKNuVAYAWeMHuPI9A8Gjk3h62b7K7l) by Igor Kuralenok `video` `in russian` `2013/2014`
course ([1](http://youtube.com/playlist?list=PL-_cKNuVAYAV8HV5N2sbZ72KFoOaMXhxc), [2](https://youtube.com/playlist?list=PL-_cKNuVAYAXCbK6tV2Rc7293CxIMOlxO)) by Igor Kuralenok `video` `in russian` `2012/2013`
[course](http://youtube.com/playlist?list=PLEqoHzpnmTfDwuwrFHWVHdr1-qJsfqCUX) by Evgeny Sokolov `video` `in russian` `2019`
[course](http://coursera.org/specializations/machine-learning-data-analysis) from Yandex `video` `in russian`
#### books
#### blogs
<http://offconvex.org>
<http://argmin.net>
<http://inference.vc>
<http://blog.shakirm.com>
<http://machinethoughts.wordpress.com>
<http://hunch.net>
<http://machinedlearnings.com>
<http://nlpers.blogspot.com>
<http://timvieira.github.io>
<http://ruder.io>
<http://danieltakeshi.github.io>
<http://lilianweng.github.io>
#### podcasts & newsletters
<https://twimlai.com>
<https://thetalkingmachines.com>
<https://lexfridman.com/ai>
<https://jack-clark.net/import-ai>
<https://newsletter.ruder.io>
<https://getrevue.co/profile/seungjaeryanlee>
<https://getrevue.co/profile/wildml>
<https://reddit.com/r/MachineLearning>
#### conferences
- NeurIPS 2019 [[videos](https://slideslive.com/neurips)] [[videos](https://youtube.com/playlist?list=PLderfcX9H9MpK9LKziKu_gUsTL_O2pnLB)] [[notes](https://david-abel.github.io/notes/neurips_2019.pdf)] [[summary](https://github.com/hindupuravinash/nips2017)] [[summary](https://github.com/kihosuh/nips_2017)] [[summary](https://github.com/sbarratt/nips2017)]
[video collection](https://github.com/dustinvtran/ml-videos)
---
### theory
----
problems:
- How can we make sure that what we learn will generalize to future data?
- How do we design learning algorithms?
- How do we quantify knowledge/uncertainty?
frameworks:
----
ingredients:
- distributions
- i.i.d. samples
- learning algorithms
- predictors
- loss functions
*A priori analysis*: How well will a learning algorithm perform on new data?
- (statistics) Can we match the best possible loss, assuming the data-generating distribution belongs to a known family?
----
----
----
[**deep learning**](https://github.com/brylevkirill/notes/blob/master/Deep%20Learning.md#theory)
[**reinforcement learning**](https://github.com/brylevkirill/notes/blob/master/Reinforcement%20Learning.md#theory)
---
### methods
**challenges**
----
["The Three Cultures of Machine Learning"](https://www.cs.jhu.edu/~jason/tutorials/ml-simplex.html) by Jason Eisner
["Algorithmic Dimensions"](https://justindomke.wordpress.com/2015/09/14/algorithmic-dimensions/) by Justin Domke
----
[state-of-the-art algorithms](https://paperswithcode.com/sota)
[algorithms](http://en.wikipedia.org/wiki/List_of_machine_learning_algorithms) (Wikipedia)
[algorithms](http://metacademy.org) (Metacademy)
[algorithms](http://scikit-learn.org/stable/tutorial/machine_learning_map/index.html) (scikit-learn)
[cheat sheet](http://eferm.com/wp-content/uploads/2011/05/cheat3.pdf)
[cheat sheet](http://github.com/soulmachine/machine-learning-cheat-sheet/blob/master/machine-learning-cheat-sheet.pdf)
---
----
----
[**deep learning**](https://github.com/brylevkirill/notes/blob/master/Deep%20Learning.md)
[**probabilistic programming**](https://github.com/brylevkirill/notes/blob/master/Probabilistic%20Programming.md)
[**knowledge representation**](https://github.com/brylevkirill/notes/blob/master/Knowledge%20Representation%20and%20Reasoning.md#knowledge-representation)
---
### program synthesis
programmatic representations:
- *well-specified*
- *compact*
- *combinatorial*
- *hierarchical*
challenges:
- *open-endedness*
- *over-representation*
- *chaotic execution*
- *high resource-variance*: programs in the same search space may vary greatly in the time and memory they require to execute
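These properties show up even in the smallest enumerative synthesizer. A sketch under assumptions not in the text (a made-up three-leaf arithmetic DSL, depth-bounded enumeration, `eval` for execution): the space grows combinatorially with depth, is over-represented (many distinct expressions compute the same function), and candidate programs vary in execution cost.

```python
from itertools import product

# Tiny hypothetical DSL: expressions over the input x, constants 1 and 2,
# and the operators + and *.
LEAVES = ["x", "1", "2"]
OPS = ["+", "*"]

def gen(depth):
    # enumerate expression strings up to the given depth (combinatorial growth;
    # shallower expressions are re-yielded, so the space is over-represented)
    if depth == 0:
        yield from LEAVES
        return
    yield from gen(depth - 1)
    for op in OPS:
        for a, b in product(list(gen(depth - 1)), repeat=2):
            yield f"({a} {op} {b})"

def synthesize(examples, max_depth=2):
    # return the first enumerated program consistent with all (x, y) examples
    for prog in gen(max_depth):
        if all(eval(prog, {"x": x}) == y for x, y in examples):
            return prog
    return None

# target behavior: f(x) = 2*x + 1
prog = synthesize([(0, 1), (1, 3), (2, 5)])
```

Even here, `(x + (x + 1))`, `((x + x) + 1)`, and `((2 * x) + 1)` are distinct programs computing the same function, which is the over-representation challenge in miniature.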
----
["Program Synthesis"](https://microsoft.com/en-us/research/wp-content/uploads/2017/10/program_synthesis_now.pdf) by Gulwani, Polozov, Singh `paper`
----
[overview](https://facebook.com/nipsfoundation/videos/1552060484885185?t=5412) by Scott Reed `video`
----
[**interesting recent papers**](https://github.com/brylevkirill/notes/blob/master/interesting%20recent%20papers.md#program-induction)
[**selected papers**](https://yadi.sk/d/LZYQN7Lu3WxVVb)
---
### meta-learning
[overview](https://facebook.com/icml.imls/videos/400619163874853?t=500) by Chelsea Finn and Sergey Levine `video`
[overview](https://facebook.com/nipsfoundation/videos/1554594181298482?t=277) by Pieter Abbeel `video`
[overview](https://slideslive.com/38915714/metalearning-challenges-and-frontiers) by Chelsea Finn `video`
[overview](http://videolectures.net/deeplearning2017_de_freitas_learning_to_learn/#t=631) by Nando de Freitas `video`
----
[meta-learning](https://lilianweng.github.io/lil-log/2018/11/30/meta-learning.html) overview by Lilian Weng
----
[overview](http://people.idsia.ch/~juergen/metalearning.html) by Juergen Schmidhuber
[overview](https://youtu.be/3FIo6evmweo?t=4m6s) by Juergen Schmidhuber `video` *(meta-learning vs transfer learning)*
[overview](https://youtube.com/watch?v=nqiUFc52g78) by Juergen Schmidhuber `video`
[overview](http://people.idsia.ch/~juergen/metalearner.html) by Juergen Schmidhuber
[**Goedel Machine**](https://github.com/brylevkirill/notes/blob/master/Artificial%20Intelligence.md#meta-learning---goedel-machine) *(Juergen Schmidhuber)*
----
----
---
### automated machine learning
problems:
- aspect ratio Ptr/N of the training data matrix: Ptr >> N, Ptr = N, or Ptr << N
----
[**interesting papers**](#interesting-papers---automated-machine-learning)
----
[tutorial](https://facebook.com/nipsfoundation/videos/199543964204829)
by Frank Hutter and Joaquin Vanschoren `video`
["Automated Machine Learning"](https://youtube.com/watch?v=AFeozhAD9xE) by Andreas Mueller `video`
----
[*auto-sklearn*](https://github.com/automl/auto-sklearn) project
[*TPOT*](https://github.com/EpistasisLab/tpot) project
[*auto_ml*](http://auto-ml.readthedocs.io) project
[*H2O AutoML*](http://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html) project
- [overview](https://youtube.com/watch?v=aPDOZfu_Fyk) by Zoubin Ghahramani `video`
- [overview](https://youtu.be/H7AMB0oo__4?t=53m20s) by Zoubin Ghahramani `video`
- [overview](https://youtube.com/watch?v=WW2eunuApAU) by Zoubin Ghahramani `video`
[*AlphaD3M*](https://github.com/brylevkirill/notes/blob/master/Reinforcement%20Learning.md#alphad3m-machine-learning-pipeline-synthesis-drori-et-al) project
----
[AutoML challenge](http://automl.chalearn.org)
---
### weak supervision
[**data programming**](#weak-supervision---data-programming) ([post](https://microsoft.com/en-us/research/blog/using-transfer-learning-to-address-label-noise-for-large-scale-image-classification), [post](https://blogs.bing.com/search-quality-insights/2018-06/Artificial-intelligence-human-intelligence-Training-data-breakthrough))
---
[Snorkel](http://github.com/HazyResearch/snorkel) project
[Snorkel](http://hazyresearch.github.io/snorkel) blog
----
[overview](http://videolectures.net/kdd2018_re_hand_labeled_data) by Chris Re `video`
[overview](https://slideslive.com/38916920/building-and-structuring-training-sets-for-multitask-learning) by Alex Ratner `video`
----
["Structure Learning: Are Your Sources Only Telling You What You Want to Hear?"](https://hazyresearch.github.io/snorkel/blog/structure_learning.html) `post`
----
----
---
- [**theory**](#interesting-papers---theory)
- [**automated machine learning**](#interesting-papers---automated-machine-learning)
- [**systems**](#interesting-papers---systems)
---
> "Proof that if you have a finite number of functions, say N, then every
training error will be close to every test error once you have more than log N
training cases by a small constant factor. Clearly, if every training error is
close to its test error, then overfitting is basically impossible (overfitting
occurs when the gap between the training and the test error is large)."
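The claim quoted above is the standard finite-class generalization bound: Hoeffding's inequality for each of the N functions, combined with a union bound over the class.

```latex
% For a class \mathcal{F} of N functions, m i.i.d. training cases,
% and losses bounded in [0, 1]:
\Pr\left[\exists f \in \mathcal{F} :\;
    \left|\hat{L}_{\mathrm{train}}(f) - L_{\mathrm{test}}(f)\right| > \epsilon\right]
  \;\le\; 2N e^{-2m\epsilon^{2}}
% Setting the right-hand side to \delta gives
%   m \ge \frac{\ln(2N/\delta)}{2\epsilon^{2}},
% i.e. on the order of \log N training cases suffice, as stated.
```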
> "There are two cultures in the use of statistical modeling to reach
conclusions from data. One assumes that the data are generated by a given
stochastic data model. The other uses algorithmic models and treats the
data mechanism as unknown. The statistical community has been committed
to the almost exclusive use of data models. This commitment has led to
irrelevant theory, questionable conclusions, and has kept statisticians from
working on a large range of interesting current problems. Algorithmic
modeling, both in theory and practice, has developed rapidly in fields outside
statistics. It can be used both on large complex data sets and as a more
accurate and informative alternative to data modeling on smaller data sets.
If our goal as a field is to use data to solve problems, then we need to move
away from exclusive dependence on data models and adopt a more diverse
set of tools."
> "Machine learning algorithms can figure out how to perform important
tasks by generalizing from examples. This is often feasible and cost-effective
where manual programming is not. As more data becomes available, more
ambitious problems can be tackled. As a result, machine learning is widely
used in computer science and other fields. However, developing successful
machine learning applications requires a substantial amount of “black art”
that is hard to find in textbooks. This article summarizes twelve key lessons
that machine learning researchers and practitioners have learned. These
include pitfalls to avoid, important issues to focus on, and answers to
common questions."
> "During last fifty years a strong machine learning theory has been
developed. This theory includes: 1. The necessary and sufficient conditions
for consistency of learning processes. 2. The bounds on the rate of
convergence which in general cannot be improved. 3. The new inductive
principle (SRM) which always achieves the smallest risk. 4. The effective
algorithms, (such as SVM), that realize consistency property of SRM principle.
It looked like the general learning theory had been completed: it answered almost all standard questions that are asked in the statistical theory of inference. Meanwhile, the common observation was that human students require far fewer training examples than learning machines. Why? The talk is an attempt to answer this question. The answer is that human students have an Intelligent Teacher, and Teacher-Student interactions are based not only on the brute-force methods of function estimation from observations. Speed of learning is also based on Teacher-Student interactions, which have additional mechanisms that boost the learning process. To learn from a smaller number of observations, a learning machine has to use these
mechanisms. In the talk I will introduce a model of learning that includes the
so called Intelligent Teacher who during a training session supplies a Student
with intelligent (privileged) information in contrast to the classical model
where a student is given only outcomes y for events x. Based on additional
privileged information x* for event x two mechanisms of Teacher-Student
interactions (special and general) are introduced: 1. The Special Mechanism: To control the Student's concept of similarity between training examples; and 2.
The General Mechanism: To transfer knowledge that can be obtained in
space of privileged information to the desired space of decision rules. Both
mechanisms can be considered as special forms of capacity control in the
universally consistent SRM inductive principle. Privileged information exists
for almost any inference problem and can make a big difference in speed of
learning processes."
- `press` <http://learningtheory.org/learning-has-just-started-an-interview-with-prof-vladimir-vapnik/>
- <https://github.com/brylevkirill/notes/blob/master/Reinforcement%20Learning.md#alphad3m-machine-learning-pipeline-synthesis-drori-et-al>
> "Neural networks dominate the modern machine learning landscape, but
their training and success still suffer from sensitivity to empirical choices of
hyperparameters such as model architecture, loss function, and optimisation
algorithm. In this work we present Population Based Training, a simple
asynchronous optimisation algorithm which effectively utilises a fixed
computational budget to jointly optimise a population of models and their
hyperparameters to maximise performance. Importantly, PBT discovers a
schedule of hyperparameter settings rather than following the generally sub-
optimal strategy of trying to find a single fixed set to use for the whole
course of training. With just a small modification to a typical distributed
hyperparameter training framework, our method allows robust and reliable
training of models. We demonstrate the effectiveness of PBT on deep
reinforcement learning problems, showing faster wall-clock convergence and
higher final performance of agents by optimising over a suite of
hyperparameters. In addition, we show the same method can be applied to
supervised learning for machine translation, where PBT is used to maximise
the BLEU score directly, and also to training of Generative Adversarial
Networks to maximise the Inception score of generated images. In all cases
PBT results in the automatic discovery of hyperparameter schedules and
model selection which results in stable training and better final
performance."
> "Two common tracks for the tuning of hyperparameters exist: parallel
search and sequential optimisation, which trade-off concurrently used
computational resources with the time required to achieve optimal results.
Parallel search performs many parallel optimisation processes (by
optimisation process we refer to neural network training runs), each with
different hyperparameters, with a view to finding a single best output from
one of the optimisation processes – examples of this are grid search and
random search. Sequential optimisation performs few optimisation processes
in parallel, but does so many times sequentially, to gradually perform
hyperparameter optimisation using information obtained from earlier training
runs to inform later ones – examples of this are hand tuning and Bayesian
optimisation. Sequential optimisation will in general provide the best
solutions, but requires multiple sequential training runs, which is often
unfeasible for lengthy optimisation processes."
> "In this work, we present a simple method, Population Based Training
which bridges and extends parallel search methods and sequential
optimisation methods. Advantageously, our proposal has a wallclock run time
that is no greater than that of a single optimisation process, does not require
sequential runs, and is also able to use fewer computational resources than
naive search methods such as random or grid search. Our approach
leverages information sharing across a population of concurrently running
optimisation processes, and allows for online propagation/transfer of
parameters and hyperparameters between members of the population based
on their performance."
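The exploit/explore loop described above can be sketched in a few lines (illustration only, not the paper's implementation; the toy objective, population size, and perturbation factors are made up):

```python
import random

# Minimal Population Based Training sketch: each member trains with its
# own hyperparameter; periodically, poor performers copy a better member
# (exploit) and then perturb the hyperparameter (explore).

random.seed(0)

def eval_score(lr):
    # stand-in objective: performance peaks at lr = 0.1
    return -(lr - 0.1) ** 2

population = [{"lr": random.uniform(0.0, 1.0)} for _ in range(8)]

for step in range(30):
    # "train" then evaluate each member
    for m in population:
        m["score"] = eval_score(m["lr"])
    population.sort(key=lambda m: m["score"], reverse=True)
    # bottom half exploits the top half, then explores multiplicatively
    for loser, winner in zip(population[4:], population[:4]):
        loser["lr"] = winner["lr"] * random.choice([0.8, 1.2])

best = max(population, key=lambda m: eval_score(m["lr"]))
```

Because losers copy winners mid-run, the population traces out a hyperparameter schedule rather than committing to a single fixed value, which is the point the abstract emphasizes.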
- `post` <https://deepmind.com/blog/article/how-evolutionary-selection-can-train-more-capable-self-driving-cars/>
> "Large labeled training sets are the critical building blocks of supervised
learning methods and are key enablers of deep learning techniques. For
some applications, creating labeled training sets is the most time-consuming
and expensive part of applying machine learning. We therefore propose a
paradigm for the programmatic creation of training sets called data
programming in which users provide a set of labeling functions, which are
programs that heuristically label large subsets of data points, albeit noisily.
By viewing these labeling functions as implicitly describing a generative
model for this noise, we show that we can recover the parameters of this
model to “denoise” the training set. Then, we show how to modify a
discriminative loss function to make it noise-aware. We demonstrate our
method over a range of discriminative models including logistic regression
and LSTMs. We establish theoretically that we can recover the parameters of
these generative models in a handful of settings. Experimentally, on the
2014 TAC-KBP relation extraction challenge, we show that data programming
would have obtained a winning score, and also show that applying data
programming to an LSTM model leads to a TAC-KBP score almost 6 F1 points
over a supervised LSTM baseline (and into second place in the competition).
Additionally, in initial user studies we observed that data programming may
be an easier way to create machine learning models for non-experts."
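The labeling-function idea can be illustrated in a few lines (hypothetical code, not the Snorkel API; the heuristics are made up, and votes are combined by simple majority instead of the learned generative model from the paper):

```python
# Labeling functions: noisy, heuristic programs that vote on each data
# point or abstain; their votes are combined into a weak training label.

ABSTAIN, SPAM, HAM = 0, 1, -1

def lf_contains_free(text):
    return SPAM if "free" in text.lower() else ABSTAIN

def lf_contains_meeting(text):
    return HAM if "meeting" in text.lower() else ABSTAIN

def lf_many_exclamations(text):
    return SPAM if text.count("!") >= 3 else ABSTAIN

LFS = [lf_contains_free, lf_contains_meeting, lf_many_exclamations]

def weak_label(text):
    # majority vote; ABSTAIN contributes nothing
    score = sum(lf(text) for lf in LFS)
    if score > 0:
        return SPAM
    if score < 0:
        return HAM
    return ABSTAIN
```

The paper's contribution is replacing this majority vote with a generative model of the labeling functions' accuracies and correlations, which "denoises" the resulting training set.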
- `video` <https://youtube.com/watch?v=iSQHelJ1xxU>
- `post` <http://hazyresearch.github.io/snorkel/blog/weak_supervision.html>
- `post` <http://hazyresearch.github.io/snorkel/blog/dp_with_tf_blog_post.html>
- `audio` <https://soundcloud.com/nlp-highlights/28-data-programming-creating-large-training-sets-quickly> (Ratner)
- `notes` <https://github.com/b12io/reading-group/blob/master/data-programming-snorkel.md>
- `code` <https://github.com/HazyResearch/snorkel>
- [Snorkel](https://github.com/brylevkirill/notes/blob/master/Knowledge%20Representation%20and%20Reasoning.md#machine-reading-projects---snorkel) project `summary`
- `video` <https://youtube.com/watch?v=0gRNochbK9c>
- `post` <http://hazyresearch.github.io/snorkel/blog/socratic_learning.html>
- `code` <https://github.com/HazyResearch/snorkel>
- [Snorkel](https://github.com/brylevkirill/notes/blob/master/Knowledge%20Representation%20and%20Reasoning.md#machine-reading-projects---snorkel) project `summary`
- `notes` <https://blog.acolyer.org/2018/08/22/snorkel-rapid-training-data-creation-with-weak-supervision>
- [Snorkel](https://github.com/brylevkirill/notes/blob/master/Knowledge%20Representation%20and%20Reasoning.md#machine-reading-projects---snorkel) project `summary`
---
> "Machine learning offers a fantastically powerful toolkit for building useful complex prediction systems quickly. This paper argues it is dangerous to think of these quick wins as coming for free. Using the software engineering framework of technical debt, we find it is common to incur massive ongoing maintenance costs in real-world ML systems. We explore several ML-specific risk factors to account for in system design. These include boundary erosion, entanglement, hidden feedback loops, undeclared consumers, data dependencies, configuration issues, changes in the external world, and a variety of system-level anti-patterns."
- `notes` <https://blog.acolyer.org/2016/02/29/machine-learning-the-high-interest-credit-card-of-technical-debt>
- `post` <http://john-foreman.com/blog/the-perilous-world-of-machine-learning-for-fun-and-profit-pipeline-jungles-and-hidden-feedback-loops>
`Vowpal Wabbit`
> "We present a system and a set of techniques for learning linear predictors
with convex losses on terascale data sets, with trillions of features, billions of training examples and millions of parameters in an hour using a cluster of
1000 machines. Individually none of the component techniques are new, but
the careful synthesis required to obtain an efficient implementation is. The
result is, up to our knowledge, the most scalable and efficient linear learning
system reported in the literature. We describe and thoroughly evaluate the
components of the system, showing the importance of the various design
choices."
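Two of the ingredients behind such scalability can be sketched in plain Python (an illustration of the general techniques, not Vowpal Wabbit's actual code; feature names and sizes are made up): the hashing trick, which maps arbitrarily many sparse string features into a fixed-size weight vector, and online SGD, which touches only the features present in each example.

```python
# Hashing trick + online SGD for a sparse linear predictor.

NUM_WEIGHTS = 2 ** 18            # fixed memory regardless of feature count
weights = [0.0] * NUM_WEIGHTS

def hashed(features):
    # map string features to indices in the fixed-size weight vector
    return [hash(f) % NUM_WEIGHTS for f in features]

def predict(features):
    return sum(weights[i] for i in hashed(features))

def sgd_update(features, y, lr=0.5):
    # squared-loss gradient step on a single sparse example
    err = predict(features) - y
    for i in hashed(features):
        weights[i] -= lr * err

for _ in range(20):
    sgd_update(["user:alice", "ad:shoes"], 1.0)
    sgd_update(["user:bob", "ad:cars"], 0.0)
```

Hash collisions trade a small amount of noise for constant memory, which is what makes trillions of raw features feasible in a fixed weight vector.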
- <https://github.com/JohnLangford/vowpal_wabbit/wiki>
- <https://github.com/brylevkirill/notes/blob/master/Reinforcement%20Learning.md#making-contextual-decisions-with-low-technical-debt-agarwal-et-al>
> "In this paper we present CatBoost, a new open-sourced gradient boosting
library that successfully handles categorical features and outperforms
existing publicly available implementations of gradient boosting in terms of
quality on a set of popular publicly available datasets. The library has a GPU
implementation of learning algorithm and a CPU implementation of scoring
algorithm, which are significantly faster than other gradient boosting libraries
on ensembles of similar sizes."
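The gradient boosting at the core of such libraries can be sketched in pure Python (a toy with squared loss and depth-1 stumps; CatBoost adds ordered boosting, categorical-feature handling, and GPU training on top of this basic idea). Each round fits a stump to the residuals of the current ensemble:

```python
# Minimal gradient boosting with squared loss and one-split stumps.

def fit_stump(xs, residuals):
    # pick the threshold split minimizing squared error of the leaf means
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = sum((r - lmean) ** 2 for r in left) \
            + sum((r - rmean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x, t=t, a=lmean, b=rmean: a if x <= t else b

def boost(xs, ys, rounds=50, lr=0.3):
    # each round fits a stump to the residuals (the squared-loss gradient)
    ensemble = []
    preds = [0.0] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        ensemble.append(stump)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    return lambda x: sum(lr * s(x) for s in ensemble)

model = boost([0, 1, 2, 3], [0.0, 0.0, 1.0, 1.0])
```

The shrinkage factor `lr` is the usual learning-rate trade-off: smaller values need more rounds but regularize the ensemble.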
- <https://catboost.yandex>
- `video` <https://youtube.com/watch?v=jLU6kNRiZ5o>
- `code` <https://github.com/catboost/catboost>