
AI Cheat Sheet and Machine Learning with Python Cheat Sheet

AI Cheat Sheet provides an overview of artificial intelligence, machine learning, and deep learning concepts and applications. It defines key AI terms and outlines common supervised, unsupervised, and reinforcement learning algorithms. Examples of how these algorithms are used include ranking in bioinformatics, classification for email spam filtering, and reinforcement learning for warehouse inventory management and delivery routing. Popular open-source frameworks for machine learning are also listed.

Uploaded by A.K. Mars

AI Cheat Sheet

AI Basics

ARTIFICIAL INTELLIGENCE (AI)
“The theory and development of computer systems able to perform tasks normally requiring human intelligence.”
Oxford English Dictionary

BUSINESS USE CASES
1. AI and CUSTOMER SERVICE ENHANCEMENT: Business value is generated through optimization of the “front-office” operations.
2. AI and PROCESSES OPTIMIZATION: The value is generated through optimization of the “back-office” operations in order to reduce costs and improve compliance.
3. AI and INSIGHTS GENERATION: New business value is created from the existing data by enabling better, more consistent, and faster decision making.

MACHINE LEARNING (ML)
ML is one of the AI approaches, which uses statistical techniques to give computer systems the ability to “learn” (i.e., progressively improve performance on a specific task) from some data, without being explicitly programmed.

DEEP LEARNING (DL)
DL is a machine learning method. It uses neural networks and allows us to train an algorithm to predict outputs, given a set of inputs. A neural network consists of an input layer, one or more hidden layers, and an output layer. The “deep” in deep learning refers to having more than one hidden layer of neurons in a neural network. Both supervised and unsupervised learning can be used for training.

TYPES OF ML ALGORITHMS
Supervised learning algorithms make predictions based on a set of examples. They are trained using labeled data sets that have inputs and expected outputs.

Semi-supervised learning algorithms use unlabeled examples together with a small amount of labeled data to improve the learning accuracy.

Unsupervised learning algorithms work with totally unlabeled data. They are designed to discover the intrinsic patterns that underlie the data, such as a clustering structure, a low-dimensional manifold, or a sparse tree and graph.

Reinforcement learning algorithms analyze and optimize the behavior of an agent based on the feedback from the environment. Machines try different scenarios to discover which actions yield the greatest reward, rather than being told which actions to take.

Supervised learning

ALGORITHMS
Regression [Linear, Polynomial, Nonparametric]: the algorithm is asked to predict a numerical value given some input. “How much money would a bank gain (lose) by lending to a certain client?”

Classification [Naive Bayes, k-NN, SVM, Random Forest, Neural Networks]: the algorithm is asked to specify which of k categories some input belongs to. “Will a client be able to pay his loan back?”

Learning to rank [HITS, SALSA, PageRank]: the algorithm is asked to rank (i.e., to produce a permutation of items in new, unseen lists) in a way that is similar to the rankings in the training data. “What are the top 10 world’s safest banks?”

Forecasting [Trending, Time-Series Modeling, Neural Networks]: the algorithm is asked to generate predictions based on available data.

USE CASES
Ranking is used in bioinformatics, drug discovery, information retrieval, sentiment analysis, machine translation, and online advertising.

Classification is applied to e-mail spam filtering, prediction of bank customers' willingness to pay back loans, cancer tumour cell identification, sentiment analysis, drug classification, facial keypoint detection, and pedestrian detection in automotive driving.

Regression is employed for pricing optimization, modeling historical sales in order to determine a pricing strategy and predict demand for products that have not been sold before.

Supervised ML methods are leveraged to minimize the number of returns for online purchases.

DL methods are used for predicting the next purchases of customers in advance.

Forecasting models are the bread and butter for business intelligence.

OPEN-SOURCE FRAMEWORKS
R Data Science libraries: Caret, randomForest, nnet, dplyr

Python Data Science libraries: Scikit-learn, SciPy, NumPy, NLTK, Matplotlib, CatBoost, XGBoost, PyTorch, Caffe2, Theano, Keras, TensorFlow, OpenCV

Semi-supervised learning

ALGORITHMS
Pseudo labeling is an algorithm used for expanding training data sets. It requires some data to be labeled first, but then it uses this data in conjunction with a large amount of unlabeled data to learn a model for a domain. It is compatible with almost all neural network models and training methods.

Generative models [VAE and GANs] are neural network models that can replicate the data distribution given as an input. This makes it possible to generate “fake-but-realistic” data points from real data points.

USE CASES
Pseudo labeling is applicable to malware/fraud detection, document structure analysis, stock predictions, real-time diagnostics, NLP/speech recognition, and any other type of problem where a small labeled data set represents a constraint.

Generative models are used for real-time visual processing, text-to-image generation, image-to-image translation, increasing image resolution, or predicting the next video frame.

OPEN-SOURCE FRAMEWORKS
TensorFlow, NumPy, Scikit-learn

Reinforcement learning

ALGORITHMS
Q-learning: the algorithm is based on a mathematical optimization method known as dynamic programming. Given current states of a system, the algorithm finds an optimal policy (i.e., set of actions) that maximizes the Q-value function.

State-Action-Reward-State-Action (SARSA): the algorithm resembles Q-learning a lot, but learns the Q-value based on the action performed by the current policy instead of the greedy policy.

Deep Q Network: the algorithm leverages a neural network to estimate the Q-value function. It resolves some limitations of the Q-learning algorithm.

Deep Deterministic Policy Gradient: the algorithm is designed for such problems as physical control tasks, where the action space is continuous.

USE CASES
Manufacturing. Robots use deep reinforcement learning to pick a device from one box and put it in a container. They learn from successful and failed attempts.

Inventory management. RL algorithms reduce transit time for stocking and retrieving products in the warehouse to optimize space utilization and warehouse operations.

Delivery management. Reinforcement learning is used to solve the problems of operational research and logistics (e.g., the split delivery vehicle routing problem).

Finance sector. The Q-learning algorithm is able to learn an optimal stock market trading strategy with a single instruction: maximize the value of a portfolio.

OPEN-SOURCE FRAMEWORKS
RL-Glue, OpenAI Gym, RLPy, BURLAP

Unsupervised learning

ALGORITHMS
Clustering [k-means, Mean-Shift, Hierarchical, Fuzzy c-means]: the algorithm is asked to group similar data items so that all the items in the same group (called a cluster) are more similar to each other than to the items in the other groups (clusters).

Anomaly detection [Density-Based, SVM-Based, Clustering-Based]: the computer program sifts through a set of events or objects and flags some of them as unusual or atypical.

Dimensionality reduction [PCA, Singular Value Decomposition, LDA]: the algorithm is asked to reduce the number of random variables under consideration by obtaining a set of principal variables. Dimensionality reduction can be divided into feature selection and feature extraction.

Missing data imputation [mean imputation, k-NN]: the algorithm is given examples with some missing entries and is asked to provide the values of the missing entries.

Association rule learning [AIS, SETM, APRIORI, FP-GROWTH]: a rule-based method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness.

USE CASES
Unsupervised learning methods are used in healthcare and pharma for such tasks as human genetic clustering and genome sequence analysis. They are also widely used across all industries for customer segmentation, recommender systems, chatbots, topic modeling, anomaly detection, grouping of shopping items, search results grouping, etc.

OPEN-SOURCE FRAMEWORKS
R Data Science libraries: Caret, Rattle, e1071, nnet, dplyr

Python Data Science libraries: BigARTM, Tesseract, Scrapy, Scikit-learn, PyTorch, Caffe2, Theano, Keras, TensorFlow

Information sources:
Ian Goodfellow, Yoshua Bengio and Aaron Courville (2016) “Deep Learning”, MIT Press
Andrew Burgess (2017) “The Executive Guide to Artificial Intelligence”, Springer
Hui Li (2017) “Which machine learning algorithm should I use?”, SAS Blog
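The Q-learning and SARSA entries in the reinforcement learning section can be illustrated with a minimal tabular sketch. The corridor environment, the reward of 1 at the goal, and every parameter value below are assumptions made up for illustration; none of them come from the cheat sheet or from any of the listed frameworks.

```python
import random


def train_q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1. Action 0 moves left, action 1 moves right.
    Reaching the last state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy behaviour: mostly exploit, sometimes explore.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Q-learning bootstraps from the greedy value of the next state.
            # SARSA would use the value of the action actually taken next.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train_q_learning()
# Greedy policy for every non-terminal state (1 means "move right").
policy = [0 if left > right else 1 for left, right in q_table[:-1]]
print(policy)
```

The learned table should prefer moving right in every non-terminal state, since that is the only way to reach the rewarded goal. Swapping `max(q[next_state])` for the Q-value of the next epsilon-greedy action would turn the same loop into SARSA.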

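The clustering entry in the unsupervised learning section names k-means. A rough pure-Python sketch of the usual two-step loop (assign points to the nearest centroid, then move each centroid to the mean of its points) might look as follows. The sample points are invented, and seeding the centroids with the first k points is a simplifying assumption; library implementations such as scikit-learn typically use smarter seeding.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm) on a list of (x, y) tuples."""
    # Assumption for brevity: seed the centroids with the first k points.
    centroids = list(points[:k])
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, (x, y) in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2
                + (y - centroids[c][1]) ** 2,
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, assignment


# Two hypothetical, well-separated groups of 2-D points.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
        (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
centroids, labels = kmeans(data, k=2)
print(labels)
```

On this toy data the first three points should end up in one cluster and the last three in the other, even though both starting centroids lie in the first group.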
Get the latest version at: http://altoros.com/visuals.html


Machine Learning with Python Cheat Sheet
General-purpose machine learning

The Auto_ml framework is developed for automating a machine learning process and making it easier to get real-time predictions in production. It automates analytics, feature engineering, feature selection, model selection, data formatting, hyperparameter optimization, etc.

The machine-learning framework provides a web interface and an API for classification and regression. The support vector machines and support vector regression algorithms are available via the framework out of the box.

XGBoost implements machine learning algorithms under the Gradient Boosting technique. XGBoost provides parallel tree boosting (also known as GBDT or GBM), which solves many data science problems in a fast and accurate manner.

scikit-learn is a Python module for machine learning built on top of the SciPy framework. The module encapsulates methods for data preprocessing, classification, regression, clustering, model selection, etc.

SimpleAI is a library for solving search and statistical classification problems. The search module includes traditional and local search algorithms, a constraint satisfaction problem algorithm, and interactive execution of search algorithms. The classification module of SimpleAI supports decision tree, Naive Bayes, and k-nearest neighbours classifiers.

MLlib in Apache Spark is a distributed machine learning library in Spark. Its goal is to make practical machine learning scalable and easy. It provides a set of common machine learning algorithms, as well as utilities for linear algebra, statistics, data handling, featurization, etc.

Theano is a numerical computation library for Python. It allows you to efficiently define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays.

TensorFlow is an open-source software library for numerical computation using data flow graphs. Originally developed by the Google Brain team, TensorFlow makes it easy to deploy computations across a variety of platforms (CPUs, GPUs, or TPUs), as well as on clusters of servers, mobile and edge devices, etc. It is widely used in a bundle with neural networks.

Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow or Theano. It was developed with a focus on enabling fast experimentation.

Caffe is a deep learning framework that supports many different types of architectures geared towards image classification and image segmentation.

Caffe2 is a lightweight, modular, and scalable deep learning framework. Based on the original Caffe, Caffe2 aims to provide an easy and straightforward way to experiment with deep learning and leverage community contributions of new models and algorithms.

PyTorch is a Python package that provides two high-level features: tensor computation (like NumPy) with strong GPU acceleration and deep neural networks.

CatBoost is a general-purpose library for gradient boosting on decision trees with categorical features support out of the box. It is an easy-to-install and well-documented package. It supports CPU and GPU (even multi-GPU) computation.

Computer vision

scikit-image is a collection of algorithms for image processing in Python. It includes algorithms for segmentation, geometric transformations, color space manipulation, analysis, filtering, morphology, feature detection, etc.

OpenCV is a computer vision framework designed for computational efficiency with a strong focus on real-time applications. Usage ranges from interactive art to mines inspection and advanced robotics.

SimpleCV is a framework that gives access to several high-powered computer vision libraries, such as OpenCV. To use the framework, you don’t need to first learn bit depths, file formats, color spaces, buffer management, eigenvalues, and matrix versus bitmap storage.

OpenFace is a Python and Torch implementation of face recognition with deep neural networks.

The face_recognition framework allows for recognizing and manipulating faces from Python or from the command line.

Dockerface is a Docker-based solution for face detection using Faster R-CNN.

Detectron is a software system by Facebook AI Research that implements state-of-the-art object detection algorithms, including Mask R-CNN. It is written in Python and powered by the Caffe2 framework.

Natural language processing

NLTK (the Natural Language Toolkit) is a suite of open-source Python modules, data sets, and tutorials supporting research and development in natural language processing.

TextBlob is a Python library for processing textual data. It provides a simple API for diving into common natural language processing tasks, such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, etc.

PyNLPl is a library for natural language processing that contains various modules useful for a variety of natural language processing tasks, such as extraction of n-grams and frequency lists or building simple language models.

Polyglot is a multilingual text processing toolkit. It supports language detection (196 languages), tokenization (165 languages), named entity recognition (40 languages), part-of-speech tagging (16 languages), sentiment analysis (136 languages), and other features.

Fuzzy Wuzzy is a fuzzy string matching implementation in Python. The algorithm uses Levenshtein Distance to calculate the differences between sequences.

jellyfish is a Python library for approximate and phonetic matching of strings.

Topic modeling

BigARTM is a powerful tool for topic modeling. Additive regularization of topic models is the innovative approach lying at the core of the BigARTM library. The solution helps to build multi-objective models by adding the weighted sums of regularizers to the optimization criterion. BigARTM supports different features, including sparsing, smoothing, topics decorrelation, etc.

Gensim is a Python library for topic modelling, document indexing, and similarity retrieval with large corpora.

topik is a topic modeling toolbox, which provides a full-suite and high-level interface for anyone interested in applying topic modeling. It includes a bunch of utilities beyond statistical modeling algorithms.

Chatbots

End-to-end-negotiator is a PyTorch implementation of the research paper “Deal or No Deal? End-to-End Learning for Negotiation Dialogues” by Facebook AI Research. The code trains neural networks to hold negotiations in natural language and enables reinforcement learning self-play and rollout-based planning.

DeepPavlov is an open-source library for building end-to-end dialog systems and training chatbots built on TensorFlow and Keras.

awesome-bots is a GitHub repository with a collection of materials dedicated to chatbots.

Reinforcement learning

DeepMind Lab is a first-person 3D game platform designed for research and development of general artificial intelligence and machine learning systems. DeepMind Lab can be used to study how autonomous artificial agents may learn complex tasks in large, partially observed, and visually diverse worlds.

OpenAI Baselines is a set of high-quality implementations of reinforcement learning algorithms.

OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms.

RLPy is a framework for conducting sequential decision-making experiments. The current focus of this project is on value-function-based reinforcement learning.

universe is a software platform for measuring and training an AI’s general intelligence across the world’s supply of games, websites, and other applications.

Data analysis and data visualization

Apache Spark is a fast and general cluster computing system for big data. It provides high-level APIs in Python and an optimized engine that supports general computation graphs for data analysis.

NumPy is a fundamental package needed for scientific computing with Python.

SciPy is open-source software for mathematics, science, and engineering. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, etc.

Pandas is a library providing high-performance, easy-to-use data structures and data analysis tools for the Python language.

PyMC is a Python module that implements Bayesian statistical models and fitting algorithms, including Markov chain Monte Carlo methods. Its flexibility and extensibility make it applicable to a large variety of problems. Along with core sampling functionality, PyMC includes methods for summarizing output, plotting, goodness-of-fit, and convergence diagnostics.

statsmodels is a package for statistical modeling and econometrics in Python. It provides a complement to SciPy for statistical computations, including descriptive statistics and estimation, as well as inference for statistical models.

Matplotlib is a Python 2D plotting library, which produces publication-quality figures in a variety of hard copy formats and interactive environments across platforms.

ggplot is a plotting system for Python built for making professional-looking plots quickly and with a minimum of code.

scikit-plot is a visualization library for quick and easy generation of common plots in data analysis and machine learning.

Other projects

The deepdream repository contains an IPython Notebook with sample code, complementing the Google Research blog post about neural network art.

NeuralTalk2 is efficient image captioning code based on recurrent neural networks.

Kaggle-cifar contains code for the CIFAR-10 Kaggle competition on image recognition. It uses a cuda-convnet architecture.

The Lime project is about explaining what machine learning classifiers (or models) are doing. At the moment, it supports explaining individual predictions for text classifiers or classifiers that act on tables (e.g., NumPy arrays of numerical or categorical data) or images. The project aims at helping users to understand and interact meaningfully with machine learning.

DeepJ is a deep learning model for style-specific music generation.

deep-neuroevolution is a GitHub repository containing an implementation of the neuroevolution approach, where neural networks are optimized through evolutionary algorithms. It is an effective method to train deep neural networks for reinforcement learning tasks.

Free online books

Understanding Machine Learning: From Theory to Algorithms by Shai Shalev-Shwartz and Shai Ben-David (2014)
Natural Language Processing with Python by Steven Bird, Ewan Klein, and Edward Loper (2009)
Deep Learning by Yoshua Bengio, Ian Goodfellow, and Aaron Courville (2015)
Neural Networks and Deep Learning by Michael Nielsen (2014)
Deep Learning by Microsoft Research (2013)
Deep Learning in Neural Networks: An Overview by Jurgen Schmidhuber (2014)
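The Fuzzy Wuzzy entry in the natural language processing section mentions Levenshtein distance. As a hedged illustration of that metric only, here is a standard dynamic-programming version in plain Python. The `similarity` helper is a hypothetical normalization added for this example; it is not Fuzzy Wuzzy's scoring formula or API.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between a[:i-1] and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical 0..1 score where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


print(levenshtein("kitten", "sitting"))
print(round(similarity("kitten", "sitting"), 3))
```

Keeping only two rows of the table instead of the full matrix keeps memory proportional to the length of the second string.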

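The PyNLPl entry in the natural language processing section refers to extracting n-grams and frequency lists. The idea can be sketched with the standard library alone. The sample sentence and the whitespace tokenization are assumptions for illustration, and the function below is not PyNLPl's API.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous window of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


text = "the cat sat on the mat and the cat slept"
tokens = text.split()  # naive whitespace tokenization, for brevity only

# Frequency list: how often each bigram occurs in the token stream.
bigram_counts = Counter(ngrams(tokens, 2))
print(bigram_counts.most_common(1))
```

Counts like these are the raw material for a simple n-gram language model, where the probability of a word is estimated from how often it follows the preceding words.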
