PUSHKAR
Machine Learning is the science of getting computers to learn without being explicitly
programmed. It is closely related to computational statistics, which focuses on making
predictions using computers. In its application across business problems, machine learning
is also referred to as predictive analytics. Machine Learning focuses on the development of
computer programs that can access data and use it to learn for themselves. The process of
learning begins with observations or data, such as examples, direct experience, or
instruction, in order to look for patterns in data and make better decisions in the future
based on the examples that we provide. The primary aim is to allow computers to learn
automatically, without human intervention or assistance, and to adjust their actions
accordingly.
Arthur Samuel, an early American leader in the field of computer gaming and artificial
intelligence, coined the term “Machine Learning” in 1959 while at IBM. He defined
machine learning as “the field of study that gives computers the ability to learn without
being explicitly programmed”. However, there is no universally accepted definition for
machine learning; different authors define the term differently.
Machine learning is commonly divided into three broad categories: supervised learning,
unsupervised learning, and reinforcement learning. Supervised learning involves training a
model on labeled data, while unsupervised learning involves training a model on unlabeled
data. Reinforcement learning involves training a model through trial and error. Machine
learning is used in a wide variety of applications, including image and speech recognition,
natural language processing, and recommender systems.
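A minimal sketch of the difference between supervised and unsupervised learning, assuming scikit-learn is installed and using its bundled iris dataset purely for illustration:

```python
# Supervised vs. unsupervised learning in scikit-learn (illustrative only).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)

# Supervised: the model is trained on features X together with labels y.
clf = LogisticRegression(max_iter=1000)
clf.fit(X, y)
print("supervised predictions:", clf.predict(X[:3]))

# Unsupervised: the model sees only X and must find structure by itself.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
km.fit(X)
print("unsupervised cluster labels:", km.labels_[:3])
```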
REVIEW OF LITERATURE
While there has been much progress in machine learning, there are also challenges. For
example, mainstream machine learning technologies are black-box approaches, which
raises concerns about their potential risks. To tackle this challenge, we may want to make
machine learning more explainable and controllable. As another example, the
computational complexity of machine learning algorithms is usually very high, and we
may want to invent lightweight algorithms or implementations. Furthermore, in many
domains such as physics, chemistry, biology, and the social sciences, people usually seek
elegantly simple equations (e.g., the Schrödinger equation) to uncover the underlying
laws behind various phenomena, a kind of simplicity that machine learning models rarely
offer. Machine learning also takes much more time: you have to gather and prepare data,
then train the algorithm, and there are many more uncertainties. That is why, while in
traditional website or application development an experienced team can estimate the time
quite precisely, a machine learning project used, for example, to
provide product recommendations can take much less or much more time than expected.
Why? Because even the best machine learning engineers do not know how deep learning
networks will behave when analyzing different sets of data. It also means that machine
learning engineers and data scientists cannot guarantee that the training process of a
model can be replicated.
NEED OF THE STUDY
The future of Machine Learning is as vast as the limits of the human mind. We can always
keep learning, and keep teaching computers how to learn, while wondering how some of
the most complex machine learning algorithms have been running in the back of our own
minds so effortlessly all the time. There is a bright future for machine learning.
Companies like Google, Quora, and Facebook hire people with machine learning
expertise, and there is intense research in machine learning at the top universities in the
world. The global machine-learning-as-a-service market is rising rapidly, mainly due to
the Internet revolution. The process of connecting the world virtually has generated a vast
amount of data, which is boosting the adoption of machine learning solutions. Considering
all these applications and the dramatic improvements that ML has brought us, it does not
take a genius to realize that in the coming future we will see ever more advanced
applications of ML, applications that will stretch its capabilities to an unimaginable
level.
Machine learning is one of the most exciting technologies one could come across. As is
evident from the name, it gives the computer the ability that makes it more similar to
humans: the ability to learn. Machine learning is actively being used today, perhaps in
many more places than one would expect. We probably use a learning algorithm dozens
of times a day without even knowing it. Applications of Machine Learning include:
• Web Search Engines: One of the reasons why search engines like Google and Bing
work so well is that the system has learnt how to rank pages through a complex
learning algorithm.
• Photo Tagging Applications: Be it Facebook or any other photo tagging application, the
ability to tag friends makes the experience far richer. It is all possible because of a face
recognition algorithm that runs behind the application.
• Spam Detectors: Mail agents like Gmail or Hotmail do a lot of hard work for us in
classifying mail and moving spam messages to the spam folder. This is achieved by a
spam classifier running in the back end of the mail application.
There are many types of Machine Learning algorithms, each suited to different use cases.
When we work with datasets, a machine learning algorithm works in two stages: training
and testing. Under supervised learning, we split a dataset into training data and test data,
usually around 80% for training and 20% for testing.
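As a brief sketch, assuming scikit-learn is installed, a dataset can be split into 80% training and 20% test data with the train_test_split helper:

```python
# Hold out 20% of the rows as a test set; train on the remaining 80%.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(len(X_train), "training rows,", len(X_test), "test rows")
```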
The following are some of the algorithms used in Python Machine Learning -
1. Linear Regression -
Linear regression is one of the supervised Machine Learning algorithms in Python that
observes continuous features and predicts a continuous outcome. Depending on whether it
runs on a single variable or on many features, we call it simple linear regression or
multiple linear regression.
This is one of the most popular Python ML algorithms, yet it is often under-appreciated. It
assigns optimal weights to the variables to create a line y = a*x + b for predicting the
output. We often use linear regression to estimate real values, such as the number of calls
or the cost of houses, based on continuous variables. The regression line is the best-fitting
line Y = a*X + b that denotes the relationship between the independent and dependent
variables.
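A minimal sketch of simple linear regression, assuming scikit-learn and NumPy are installed; the data below is synthetic and generated only to show how the line Y = a*X + b is recovered:

```python
# Fit a line y = a*x + b to noisy synthetic data with scikit-learn.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))              # one continuous feature
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 1, 100)    # true a = 3, b = 2, plus noise

model = LinearRegression()
model.fit(X, y)
print("estimated a:", model.coef_[0], "estimated b:", model.intercept_)
print("prediction at x = 5:", model.predict([[5.0]])[0])
```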
2. Logistic Regression -
Logistic regression is a supervised classification algorithm in Python that finds its use in
estimating discrete values like 0/1, yes/no, and true/false, based on a given set of
independent variables. We use a logistic function to predict the probability of an event,
and this gives us an output between 0 and 1. Although it says ‘regression’, this is actually
a classification algorithm. Logistic regression fits data to a logit function and is also
called logit regression.
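A minimal sketch of logistic regression as a binary classifier, assuming scikit-learn is installed and using its bundled breast-cancer dataset for illustration; the features are standardized first only to help the solver converge:

```python
# Logistic regression outputs a probability between 0 and 1,
# which is thresholded into a discrete 0/1 label.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

clf = make_pipeline(StandardScaler(), LogisticRegression())
clf.fit(X_train, y_train)
print("probability of class 1:", clf.predict_proba(X_test[:3])[:, 1])
print("predicted 0/1 labels:  ", clf.predict(X_test[:3]))
print("test accuracy:", clf.score(X_test, y_test))
```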
Naive Bayes classifiers are highly scalable, requiring a number of parameters linear in the
number of variables (features/predictors) in a learning problem. Maximum-likelihood
training can be done by evaluating a closed-form expression, which takes linear time,
rather than by the expensive iterative approximation used for many other types of
classifiers.
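A minimal Gaussian naive Bayes sketch of the classifier described above, assuming scikit-learn is installed and using its bundled iris dataset for illustration:

```python
# Gaussian naive Bayes: training reduces to closed-form per-class
# means and variances, which is why it scales so well.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

nb = GaussianNB()
nb.fit(X_train, y_train)
print("test accuracy:", nb.score(X_test, y_test))
```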
OBJECTIVES OF THE STUDY
DreamUny Education was created with a mission to create skilled software engineers for
our country and the world. It aims to bridge the gap between the quality of skills
demanded by industry and the quality of skills imparted by conventional institutes. With
assessments, learning paths and courses authored by industry experts, DreamUny helps
businesses and individuals benchmark expertise across roles, speed up release cycles and
build reliable, secure products.
• Python Programming
• ML Algorithms
METHODOLOGY
Several facilitation techniques were used by the trainer, including question and answer,
brainstorming, group discussions, case study discussions, and practical implementation of
some of the topics by trainees on flip charts and paper sheets. This multitude of training
methodologies was used to make sure all the participants grasped the concepts and
practised what they learnt, because what is only heard from the trainers can be forgotten,
but what the trainees do by themselves they will never forget.
After the post-tests were administered and the final course evaluation forms were filled in
by the participants, the trainer gave his closing remarks and reiterated the importance of
the training for the trainees in their daily activities and their readiness to apply the learnt
concepts in their assigned tasks. Certificates of completion were distributed among the
participants at the end.
The analytical parametric model approach requires human groundwork: the model
designer usually first visualizes the data and considers the science behind the system to
create a parametric model template, without specified parameters. Popular examples
would be a polynomial model, an exponential model, or a Fourier series. That model
template is then fitted to a set of data by minimizing error under some definition,
commonly the least-squares error. The choice of model template and error definition
introduces a human bias into how the final model turns out, based on the experiences and
preferences of the designer. One way to reduce such bias would be to perform a
cross-over analysis to determine the most accurate model. However, this usually requires
a lot more work, and there would still be a human bias, since different designers may
have different preferences when it comes to performance evaluation. Since two different
designers could always design two different analytical parametric models for the same
problem, there is no single analytical model that alone can represent the entirety of the
analytical modelling approach. However, by comparing the machine learning approach
with at least one of this infinite number of potential analytical models, there are still
important insights to be gained.
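A minimal sketch of the analytical approach described above, assuming NumPy is available; the degree-2 polynomial template and the synthetic data are illustrative choices standing in for the designer's decisions:

```python
# Fit a designer-chosen parametric template (a degree-2 polynomial)
# to data by minimizing the least-squares error.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 5, 50)
y = 1.5 * x**2 - 2.0 * x + 0.5 + rng.normal(0, 0.5, x.size)  # synthetic data

coeffs = np.polyfit(x, y, deg=2)             # parameters minimizing squared error
sse = np.sum((np.polyval(coeffs, x) - y) ** 2)
print("fitted coefficients:", coeffs)
print("sum of squared errors:", sse)
```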
With this reference method in place, the relative performance and suitability of different
models can be discussed. An analytical parametric model was chosen as the
reference method since this approach is the usual method used when creating models. It
was also the method used by ABB prior to this project to create their primary model. In
the case of the primary model, it was created by doing a pre-study to determine the first
principles of the model and then fitting it to the available experimental data. The two
approaches are fundamentally different, and most machine learning models can be
considered nonparametric.
FINDINGS and ANALYSIS
Machine Learning algorithms do not work well with raw data. Before we can feed such
data to an ML algorithm, we must preprocess it by applying some transformations to it.
With data preprocessing, we convert raw data into a clean data set. To perform this
preprocessing, there are 7 techniques, several of which are described below; a short
scikit-learn sketch follows the list –
1. Rescaling Data - For data with attributes of varying scales, we can rescale the attributes
so that they all share the same scale. Rescaling attributes into the range 0 to 1 is called
normalization. We use the MinMaxScaler class from scikit-learn, which gives us values
between 0 and 1.
5. Mean Removal - We can remove the mean from each feature to center it on zero.
6. One Hot Encoding - When dealing with a feature that takes only a few scattered
discrete values, we may not want to feed those values to the model directly. Instead, we
can perform One Hot Encoding: for k distinct values, we transform the feature into a
k-dimensional vector with a single value of 1 and 0 for the rest.
7. Label Encoding - Some labels can be words or numbers. Usually, training data is
labelled with words to make it readable. Label encoding converts word labels into
numbers to let algorithms work on them.
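The sketch below illustrates the techniques listed above, assuming scikit-learn and NumPy are installed; the small arrays are made up for demonstration:

```python
# Rescaling, mean removal, one hot encoding, and label encoding in scikit-learn.
import numpy as np
from sklearn.preprocessing import (LabelEncoder, MinMaxScaler,
                                   OneHotEncoder, StandardScaler)

X = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 600.0]])

# Rescaling (normalization): map each attribute into the range 0 to 1.
print(MinMaxScaler().fit_transform(X))

# Mean removal: centre each feature on zero (and scale to unit variance).
print(StandardScaler().fit_transform(X))

# One hot encoding: k distinct values become a k-dimensional 0/1 vector.
colours = np.array([["red"], ["green"], ["red"]])
print(OneHotEncoder().fit_transform(colours).toarray())

# Label encoding: word labels become integer codes the algorithm can use.
print(LabelEncoder().fit_transform(["spam", "ham", "spam"]))
```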
CONCLUSION
Machine learning is a field of artificial intelligence that deals with the design and
development of algorithms that can learn from and make predictions on data. The aim of
machine learning is to automate analytical model building and enable computers to learn
from data without being explicitly programmed to do so.