
[English] Python Tutorial _ Python Tutorial for Beginners - Full Course _ Python Programming _ Simplilearn [DownSub.com]

The document provides a comprehensive tutorial on installing Python and using Jupyter Notebook, emphasizing its user-friendly features and capabilities for programming. It details the installation process for Python on Windows, including setting up the environment with Anaconda and accessing Jupyter Notebook. Additionally, it covers basic functionalities of Jupyter Notebook, such as running code cells, managing kernels, and utilizing markdown for documentation.

Uploaded by

keniemally
Copyright
© All Rights Reserved

Python is one of the most widely used programming languages today. This easily
readable, English-like language has lured many developers and has proven to be one
of the most rewarding contributions to technology. Hi guys, welcome to the complete
Python tutorial by Simplilearn. In this video we will be covering some of the most
important topics related to Python. Now let's have a look at them. First we have
Anjali and Richard to take you through the installation process for Python,
followed by a tutorial on Jupyter Notebook.

A very warm welcome to all our viewers. I'm Anjali from Simplilearn, and today
I'll be showing you how you can install Python on your Windows system. It's a
pretty straightforward method, so let's begin.
The first thing you do is open your browser and search for Python. Go to the very
first link; this is the official Python page. If you scroll down here you
can see the latest version of Python, which as of today is 3.7.1, and we will
be installing the latest version. So go to Downloads right at the top, and
since we are installing on Windows, click on Windows. Here you have the various
Python releases for Windows. As you can see, 3.7.1 has a web-based installer,
an executable installer, an embeddable zip file, and so on. We want the executable
installer, and since my machine is a 64-bit one, I'll go for this one. Of course, you
need to select the file that is suited to your system. Click here and the
setup file starts downloading. Once that's done we'll open it, and this is
your setup page. Now we need to install it, but before that, tick "Add Python 3.7
to PATH". What this does is ensure that you can access Python from the
command line by just entering its name rather than the entire path where it's
stored, so this really does make things simpler. Once you have ticked that,
click on Install Now and go for Yes. The installation process has begun, and with
that our installation is complete.

Before we wind up, let's just test that
everything is working fine. Go to your search and type in Python. As you can
see here, Python 3.7, which is the latest version, is installed, and we also
have IDLE for Python, which is the integrated development environment. Let's
test this out first. Now this is not the typical place where you write programs, but you
can run one-line commands here. To test this out I'll just enter a print command
and press Enter. "hello world" is printed, so this is working fine. Now there's
also a command-line interpreter for Python, which is Python 3.7 (64-bit); that opens
your command line. This works the same as IDLE. Of course, it's simpler to write
there; it's more convenient and aesthetic. Let's test the command line out too.
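The installation check described here can also be done from Python itself. The sketch below is only an illustration: `find_on_path` is a made-up helper name, and it relies on the standard-library `shutil.which` to report whether the `python` command is reachable through PATH, which is what the "Add Python to PATH" option is meant to arrange.

```python
import shutil
import sys


def find_on_path(command="python"):
    """Return the full path the shell would resolve `command` to, or None."""
    return shutil.which(command)


# The same one-line test used in IDLE and the command-line interpreter.
print("hello world")

# Which interpreter is running this code, and is `python` visible on PATH?
print("running interpreter:", sys.executable)
print("'python' found on PATH at:", find_on_path())
```

If the last line prints `None`, the installer's PATH option was probably not ticked, and Python has to be started by its full path instead.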
And that's working perfectly fine. Now the command line and IDLE for
Python are great places to start with Python coding, but when you move on to
actual programs you'd want to opt for, say, Jupyter Notebook or PyCharm, which we
will cover later in our videos. For now that is all. I hope everything was clear
and you were able to install Python successfully.

Welcome to Simplilearn, that's
www.simplilearn.com. Get certified, get ahead. Let's go ahead and take a look at the
Jupyter Notebook for doing your Python programming in. We're going to cover the
basics of the Jupyter Notebook; I'm showing you how it works and what it looks
like. Let's go ahead and start with the install. If you go to jupyter.org, that's
j-u-p-y-t-e-r dot o-r-g, you can click on the Install button, and then you can see the
prerequisites, Python, and the downloads here, and you can see the setup on this.
You'll see the very first thing they suggest is that you install Jupyter using
Anaconda. So we have Jupyter Notebook and then we have the Anaconda setup, and
that's www.anaconda.com. You just go up to Downloads. Once you're under
Downloads you'll see that Anaconda is offered for Python version 3.7 or 2.7.
I generally work in the newer version, although I did have to reset my Anaconda to
3.6 for working with Google's TensorFlow, which you can do very easily in Anaconda
this way. I remember the first time someone showed me Jupyter Notebook and
Anaconda; I was so excited because of all the cool things you could do with it. So
this is your install here; you can download it, open it and run it. I happen to be
under the Windows setup; it works on Apple, works on Linux. What's nice about
Anaconda is just the cool things you can do. The first one was Jupyter Notebook;
someone showed me the Jupyter Notebook, and you'll see this, let me just flip
back on over here, for the Jupyter symbol. And then Anaconda creates environments,
so it makes it very easy to create a Python 3.7 environment with the different
modules installed. So if you're working with, as I referenced, Google TensorFlow and have a
little trouble with that and need to go to Python 3.6, you can easily do that: you can
create an environment for Python 3.6, and that's all through Anaconda. So once
you've installed this, there's a couple of ways to get to your Jupyter Notebook.
You'll go ahead and just download and run the Anaconda install, and then in this
case I go into my Windows menu and I actually have Anaconda 64-bit, and we can go to
Anaconda; oh, there's Jupyter Notebook, you can directly access it that way. Or you can
open the Anaconda Prompt and type jupyter notebook. Well, actually I'm going to go
back and do this the way I like to do it, but let me just show you what this looks
like. When I hit Enter on here it opens up a browser window, and I am now in
Jupyter Notebook, where I can create new projects and go in here and create a new
file. Running, we'll look at that in just a minute and what that means. But let's
go back a step. I'm going to go ahead and close out of this and we'll open up my
Anaconda, and when I run my Anaconda it'll start up here in just a second. This is
the same, by the way; they have a shortcut icon, but it's the same thing as going
down here to Anaconda Navigator. So I'm running out of the Anaconda
Navigator, which I just love. I just adore the Anaconda Navigator, and the Anaconda
Navigator comes up with all kinds of cool tools. You'll see over here I have
"Applications on" and it says base (root). What you find out is, if you click on
Environments on the left-hand side, base (root) is the one that it defaults to, but
I also have my data science, I have my stock poll, and another one is called
node gpu, because I was working with some GPU setup. You can create as many
environments as you want. I can go down here and create a new environment; I can
tell it what I want in the environment and which version of Python. You can click on
one of the environments and clone it. At the very end of the environment there's this little
triangle; if you click on there you can open a terminal window, or you can open it in
different ways. The terminal window then allows you to, if you're
used to Python, use pip install. You can use pip or conda install. Conda does the
same thing as pip except it looks for all the dependencies. So if I'm in a
rush I use conda; if I'm working in this and I need to be tracking exactly what I
install on here, then I use pip. When you're in an environment, don't mix and
match the two. Stick with conda or stick with pip, because you can run into
problems with imports and dependencies and stuff like that. So we'll go ahead and go
under data science, because that's what I enjoy doing, data science. When I
go back to my Home menu you'll see my data science opens up, and I have my Jupyter
Notebook first thing. There's a lot of tools: you can open up RStudio from
in here. I've never really used Spyder that much, but Spyder is another Python
editor. There's JupyterLab and VS Code, so there's a lot of stuff in here. There's
even stuff you can add in. Mostly I use Jupyter Notebook, and then occasionally I
use RStudio, so I try to wrap everything in here. The first time you open Jupyter
Notebook you'll have this little Install button, because it doesn't automatically install
Jupyter Notebook underneath this environment until you're ready for it. And
then we go ahead and just launch it, and it's going to open it up in whatever my
default browser is. This is going to go to my Chrome browser and open that,
and once it's in the browser I'm now in Jupyter Notebook, just like we did the
other way. So we did two different ways to get in here. I've set this to my
data science setup. You'll see down here I have folders. It's actually on my D
drive; I pointed this to my D drive because that's where I keep everything. I can go
under Simplilearn and you can see we have different tutorials and different
things we've done in here over the years. Actually over the last year; since this
is a fairly new computer, this is over the last couple of months. I think my oldest
one here, yeah, oh, I do have something from five months ago, but that was
brought in from earlier, so it's about two months old, my computer. I want
you to notice that the extension on these is .ipynb, and this is important
because when you create a new Python file to run in here it's going to create what
they call an IPython notebook file. So in here you'll see .ipynb. We'll just go ahead
and zoom in so you can see that a little better; there's our .ipynb. We'll
go up here to the upper right, and you can upload data and all kinds of
things in here, but we're going to start with just creating a new notebook. So I'm
going to click on here, and I'm going to click this one. We're going to work in
Python 3, so I'm going to click on Python 3. By default, if you installed 3.7,
it's going to be running 3.7. I had to actually set this to 3.6;
I went back and changed
the environment to make sure it's running 3.6 in here. We're now
actually running Python script, so we can just go right in here and type in print,
and everybody does this, the hello world. Then I can go up here and click on
the big Run button and it's going to run it and it's going to print hello world.
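Since the narration distinguishes a 3.7 install from an environment pinned to 3.6, a first notebook cell can confirm which interpreter the kernel is actually using. This is a minimal sketch; `kernel_python_version` is a hypothetical helper name, not something Jupyter provides.

```python
import sys


def kernel_python_version():
    """Return the (major, minor) version of the interpreter running this cell."""
    return sys.version_info[:2]


# The classic first cell.
print("hello world")

# Confirm which Python the notebook's kernel is running.
print("kernel is running Python %d.%d" % kernel_python_version())
```

Running this in notebooks launched from different Anaconda environments should report each environment's own Python version.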
And just like any notebook, I can go up here to View and toggle the header, so now my
header is on. This is Untitled number four. We can click on here and change this;
call this "jupyter notebook basics", and then I'll hit Rename. So now you can see
"jupyter notebook basics", and if I hit the little save disk it's now saved to my hard
drive, so I can go back and open this up at any time. If we go back to this window,
"jupyter notebook basics", right there it is, and it's running. So these are the
files I've got up and going, and it's got an open kernel. Now, kernel: it opens up a
kernel to execute the program in, and when we're looking at this it's important to
know that it's only setting up one kernel. So everything in your Jupyter Notebook
is going to work great except for multiprocessing. Even multithreading works fine
in Jupyter Notebook, but if you get into multiprocessing you'll start seeing
some problems, because it's only opening up one kernel to run it in here. If we
go under Kernel you can see how we can restart it: "Are you sure you want to restart it?"
So any variables that were saved... let's do this, b, or let's just call it hello:
hello equals "hello Simplilearn". We'll just print
hello. So if I run this, hello is now stored, but if I go up to Kernel and I
interrupt, if I restart the kernel, hello is no longer stored. So if I print hello
and I run it down here it's going to give me an error because it's not stored. Or
if I come up here and hit Run, it's now stored, and then I hit Run on this
cell; it still continues in memory, so it continues down there as long as
it's not interrupted. This brings up another interesting point: we run each
cell one at a time. So when I hit the Run button it doesn't go through all the
cells unless I want it to. It's so easy to click on the Run button when you're
doing a lot of editing, but there are shortcuts for that: I can also do Shift+Enter,
and that's the same as hitting the Run button. So it's running this first cell,
which is labeled number four; we have our output from that cell, and then we can go
down here and run the second cell, and there's our output from that cell. So it's
literally running one section at a time. When we go up to Kernel we can
interrupt the kernel; if you have a long run you can just stop it, which stops
the running code. We did the restart, which resets the kernel. I always use
Restart & Clear Output, which then clears all your outputs on here so you know
that those aren't saved anymore. You can go to Restart & Run All; Restart &
Run All will start from the top cell, run it, then go to the next cell
and run it, and it goes all the way down. You can also do the same thing under Cell:
you can do Run Cells, so it's going to run the top cell and go down; Run Cells and
Select Below, you can see how that just runs the first cell and then moves down to
the next cell. Then you can do Run Cells and Insert Below, Run All, Run All Above,
Run All Below, pretty self-explanatory. So you have a lot of choices to run. The most common is
either to click on the Run arrow or hit Shift+Enter to run your code. One
of the really great things about this is once you start playing with it it's so
visual: I click on the cell and I'm editing that cell; I come down here and click
on the next cell and I'm editing that cell, whether I'm deleting it or whatever I'm
doing on here. So you can see it's pretty straightforward as far as the setup, and
these are the most basic edits you can do. Of course that's just to put your code
in, and because it's such an easy input it's so easy just to keep scrolling down
and adding new cells in so that you can go back up and execute different portions
of your program. Let's go ahead and do an input and we'll set up, there we go.
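Stepping back to the kernel behaviour described a moment ago (cells share one memory, and a restart wipes it), the idea can be imitated in plain Python. This is a simplified model, not how Jupyter is implemented: the "kernel" is just a dictionary, and `run_cell` is a made-up helper that executes a string of code inside it.

```python
def run_cell(source, namespace):
    """Execute one 'cell' of source code inside the given kernel namespace."""
    exec(source, namespace)


# First kernel session: each cell sees what earlier cells stored.
kernel = {}
run_cell('hello = "hello Simplilearn"', kernel)
run_cell('greeting = hello.upper()', kernel)
kernel_greeting = kernel["greeting"]
print(kernel_greeting)

# A restart is modelled as a brand-new, empty namespace.
kernel = {}
try:
    run_cell('print(hello)', kernel)
    error_message = ""
except NameError as err:
    # `hello` was never defined in the fresh namespace.
    error_message = str(err)
print("after restart:", error_message)
```

Re-running the cell that assigns `hello` in the new namespace would make the later cell work again, which mirrors re-running cells after a kernel restart.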
Let's add a space on there just so it looks nice, and then we'll put print "hi",
comma, name. Let's go ahead and run this. A couple of things to note:
an asterisk appears here on the left, so this piece of code is running
right now, and the input box comes in inline. So I can go ahead and type in my
name, Richard, hit Enter, and it says "hi Richard". See our little quick short program on
there. This inline editing and output mix is great for doing presentations. The
first time I did this we were doing a data science project predicting when the huge
blowers at the sewage plant go out, these huge aerators. When they go out they
cost a lot of money to replace, so we were trying to come up with code
that would look at the different wobbling so they could replace the pieces
instead of having to replace a whole aerator; so they could replace the bushing
instead of having to go in there and replace the whole fan unit. The fan unit runs
about five to ten thousand dollars; the bushing is an hour's worth of labor and runs
about ten dollars. You know, between five and ten thousand dollars versus ten
dollars is huge, so if you could predict when it starts to wobble too much and
something's going wrong... But they did it visually, so we could actually see the
plots on here and stuff like that, and the output. We'll show that in just a
minute, what it looks like to put a plot on here. But just for a quick rehash,
print, let's see, "hope we see you in class soon". When you run the cells they run top
to bottom, and it only runs one cell at a time. So whatever cell I click on, that's
the cell that's running, and you can see the asterisk appears, which shows it's running.
Once we type in our name it actually prints whatever the output is down below.
In this case, "enter your name", Richard, Enter, and it won't print "hope to see you in
class soon" until I click on this cell and run it, and then you'll see "hope to see
you in class soon". So you have a lot of control. You can work on one piece of code;
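The two cells narrated here might look roughly like the sketch below. The exact prompt text and the `greet` helper are assumptions made for illustration; in the notebook the name would be read with `input`, which is shown only as a comment so the sketch runs without waiting at a prompt.

```python
def greet(name):
    """Build the greeting the first cell prints after a name is entered."""
    return "hi " + name


# Cell 1: in a notebook this would read the name from the inline prompt:
#     name = input("enter your name ")
# A fixed value stands in for the typed answer here.
name = "Richard"
print(greet(name))

# Cell 2: prints only when this cell itself is run.
print("hope to see you in class soon")
```

While `input` is waiting for text, the notebook shows an asterisk next to the cell, which is the "running" marker mentioned in the narration.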
maybe you're loading your variables up, and then you can start executing the code
based on those variables. But you do have to remember, if the problem is in the
cell above you've got to fix that; you can't just keep working on the cell below and
expect it not to change the answer. Another important thing to notice: "this is a
title". You know, you use comments to comment things, but in Jupyter
Notebook I can come in here to the cell and change the cell type to
Markdown. You can see in Markdown it changes the colors and everything, and
when I run it I end up with "this is a title", "this is a bigger title". So I
can create nice titles in here. If I'm working with a project and I'm actually
doing a demo, I'm actually doing some kind of production, I'm showing the graphs
and I've generated the graphs already in my Jupyter Notebook, instead of going
back out and putting that into a presentation I can just open up the notebook, scroll
down, add my titles in, and I'm ready to go. So you can see right here it's very
useful to be able to tag a box as, in this case, Markdown for our cell.
Then I mentioned to you that we can also do our plots in here. So let's go ahead
and plot: import matplotlib as plt, very commonly used that way. I will do a
plot, let's just keep it simple, one two three four, and we'll just keep it as a
straight line, so we'll do [1, 2, 3, 4], [1, 2, 3, 4], x-y
coordinates. This is x and y, if you want to call it that, because it's your first
set of coordinates and your second set for matplotlib. And we'll simply
show this, plt.show(), and when we do Shift+Enter or hit the Run button... typo there, I
forgot to put in the pyplot, since that's what we're using: import
matplotlib.pyplot as plt. When I run that we're drawing a nice straight line. So
it's going to go ahead and do the figure size; it does a basic figure size, 640 by
480 with one axis, and you can see it displays it right inline. So we can do a lot
of work with this; any of your matplotlib output is going to come
in inline, and you'll see a nice display here, again just a diagonal straight line
we're displaying. If I wanted to I could do something... let's do this and just
change this, double-click on it to edit it: "below is a graph of a diagonal line". And
so we have our title, "below is a graph of a diagonal line", and then we have our
nice plot of a diagonal line. I can even break this up. Let's do this:
Insert Cell Below. So now I've added a cell below this one. Let's
take our plt.show() and I'm going to put it down two cells. I'm going to do this
one as a markup, and we'll do Cell Type, or Markdown, I call it markup, and we'll
say "welcome to my simple graph". So when I run this it makes a nice markup, and
then I run this cell and we get a nice plt.show(). So we have a plt.plot and I run
it, and you can see "welcome to my simple graph", so everything's nice and orderly.
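The plotting cell described above can be sketched as follows, with the corrected `matplotlib.pyplot` import. This assumes matplotlib is installed in the active environment; the variable names `xs`, `ys` and `lines` are chosen here for illustration, and the title is set in code rather than in a separate Markdown cell.

```python
import matplotlib.pyplot as plt

# Two matching lists: the first holds x coordinates, the second y coordinates.
xs = [1, 2, 3, 4]
ys = [1, 2, 3, 4]

# plt.plot returns a list with one line object per plotted series.
lines = plt.plot(xs, ys)
plt.title("welcome to my simple graph")

# In a notebook the figure appears inline below the cell.
plt.show()
```

Importing plain `matplotlib` under the name `plt` fails at the `plt.plot` call because the plotting functions live in the `pyplot` submodule, which is the typo the narration corrects.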
This is kind of nice, because once you set this up you can see how you can create
a nice presentation while working on your project. You don't even have to get out
of your project to generate the information you want to show to the shareholders.
And if it takes too long, let's say we're running this script, you know what, let's
just kind of overload it here, we'll do a lot of plotting. I'm going to run it, and
it happened too fast. If your kernel gets stuck you can always interrupt the
kernel; you can restart it or interrupt it. Usually you restart it because it's
loaded data in there and you want to reload the data. But if I restart it here,
remember we did that
before, whatever I had that ran up here where I set hello to "hello Simplilearn",
that's gone; I have to rerun this cell to reload that data into the variable
hello. Of course I can do Run Cells and Select Below, or
Run All, and it goes all the way to the bottom, so it's not a big deal if
you forget; you can easily run all the cells. I do want to point out one thing.
Since we did a Run All it's still doing some plotting in here and coming down, for
whatever reason. If you go to the top you'll see up here in the tab there's an
hourglass; that means this kernel is running. We'll go ahead and interrupt this
kernel, and it'll take a moment to interrupt the kernel and stop it, and you'll
see that shut down in just a minute. There's so many cool things you can
do with Jupyter; I get so excited, and it's so simple. There's not a huge
number of hidden commands on the page, although certainly there's all
kinds of back-end stuff you can do. One of the things you can do in here is
go up to File, and if you go under File you'll see down here Download As. I can
download it as a notebook, which is how it automatically saves; I can download it as a
Python .py file, so it would remove the non-Python stuff in there and you just
have your regular Python file; and I can also download it as HTML. There's also
Reveal.js slides, reST, Markdown, but I love the HTML. My goodness, I click on here and
on this machine I'm going to go ahead and open it, and it opens up in my browser, and I
can actually take this code and just put it onto a web page. So now I have my HTML
code of what I just did. You know, that's pretty cool; you can flip
that over so quick and easy. So we've covered a lot of stuff. We've covered that it
runs a single cell at a time; we've covered going through the Kernel menu: Interrupt, Restart,
Restart & Clear Output, Restart & Run All. We've discussed cells, where you can run the
cells below, run the cells above, or run all, which is most common. We covered cell type. We've
gone under File and we've seen where you can go ahead and download as a different
format. There's a lot of other things in here, but those are the main ones. You can
save and create a checkpoint; you can rename it. We clicked up here to rename, but
if you're under View, let's say I don't want the header on, I can still just go
under File and rename. So if I don't want to see the header and I want that extra
screen space, which I like, I can toggle that on and off. I can toggle the toolbar
off and put that back on with all the shortcuts. And to wrap it up, one more reference.
Let me just close these out. If you search for Jupyter Notebook repositories on GitHub, and
I just go to trending notebook repositories on GitHub, you'll see all kinds of
stuff on here that you can go practice with. You can pull what somebody else is
working on. They have practical AI, deep learning, a TF2 course, TensorFlow examples,
that's the Google TensorFlow I mentioned earlier, which is for neural networks,
hands-on machine learning. I don't know what any of these actually are other than
by the name, but you can see they have a lot of stuff that's published on GitHub
which helps you get started. You find something you're interested in, you do a
search on GitHub and you'll find that Jupyter Notebook on there, which gets you some
hands-on practice. And then just regular coding: how do you become a good Python programmer?
You write Python code. That's the basics. So thank you for joining us today. We
covered Anaconda and Jupyter Notebooks.

Now that that's done, we have Richard and
Anjali to teach you about Python variables, numbers, data structures like arrays
and lists, conditional statements, functions, objects, classes, threading, and
scripting.

I'm Anjali from Simplilearn, and today I'll be taking you through one
of the most basic topics in Python: variables. So here's a very simple statement, x
= 100. A variable's definition is basically an entity of a program that
holds a value. So in this case x would be our variable and 100 would be the
value it holds. To better visualize this statement, let's consider a box. Now if
this box holds a value, say 100, then the name we give to this box, which in our
case is x, would be the variable name, and 100, that is, the content within the box
or within the variable, would be the value of the variable. Now this is the basics
of what a variable is. As we go through the various topics today you'll have a
better understanding of why we use variables and how to use them. So let's move on
to the next topic, which is the various data types of variables. Before we move on
to this, let me explain to you what data types are. If you have dealt with other
programming languages before, you probably already know what a data type is, but just
in case you haven't, a data type is basically the type of value that you assign to
the variable. So in our previous example, where we said x = 100, in layman's
terms 100 is a number, so the data type would be number. But when we come to
programming languages there are two main types of numbers: there are integers and there
are floats. Integers are basically numbers without a decimal point; floats are
numbers with a decimal point. Other than this, a very common data type that we have
in Python and other languages is string. A string is any word; in technical terms
we call it a collection of characters. Python also has a few of its own data
types, as we will see, so let's move on. I'll move on to my Jupyter Notebook. So
the first data type that we'll be looking at is integer. As you saw in our
previous example, x = 100 is an apt example for this. Let me run this
statement. As you can see, there's no error; that means this assignment is proper.
Now we'll check the type of this variable, so type and within brackets you enter x.
Run this, and as you can see the type is int. int is short for integer; any
number without a decimal point comes under int in Python. Now this is not the only
way you can assign values to variables. You can also assign values to variables
while simultaneously performing arithmetic operations on them, so x = 654 * 6734,
Enter. Now let's check out the type of this variable, which would be int again. Now
if you print the value of this variable you'll see that the value x stores is
the product of these two numbers. So now let's move on to the next data type, which
is float: x = 3.14. As you can see, this is a number with a decimal point. If
you run this there won't be any errors, so this is a proper assignment statement.
You print x and the value is printed. Now you print the type of x, run, and it's
float. We'll move on to our next data type, which is strings. A string is any word;
it's a collection of characters. So x equal to, and within quotes I'll type
"simply learn", and run. No error. Print x, and the value of x is printed, which is
"simply learn". So guys, whenever you're assigning strings to a variable it's
important that you do them within quotes. Now you can do them within single or
double quotes. I'll show you how it's done within single quotes, and the
exact same output. So it does not matter whether you use single or double quotes.
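The integer, float and string assignments walked through so far can be collected into one cell. This is a sketch of the narrated steps; the extra names (`int_type`, `product`, and so on) are added here only so each result can be inspected afterwards.

```python
# Integer: a number without a decimal point.
x = 100
int_type = type(x)

# Assigning the result of an arithmetic expression still gives an int.
x = 654 * 6734
product = x

# Float: a number with a decimal point.
x = 3.14
float_type = type(x)

# Strings must be quoted; single and double quotes give the same value.
double_quoted = "simply learn"
single_quoted = 'simply learn'
same_text = double_quoted == single_quoted

print(int_type, product, float_type, type(double_quoted), same_text)
```

Reusing `x` for each assignment, as the narration does, shows that a Python variable is not tied to one type: its type is whatever value it currently holds.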
Now let's check out the type of this variable: type(x), and it's str, which is a short
form for string. So now that we've had a look at integers, floats and strings, let's
move on to some of the data types specific to Python. The first one we'll look at is
list. A list is basically a collection of values. So far we assigned a single
value to a variable; with lists we can assign multiple values, and this is how
it's done: give the variable name, equal to, open square brackets, and within this
you type your values, so x = [14, 67, 9]. Run this, no error. Now let's print x and see
what this results in. As you see, when you print x all the values you stored within
it are visible. Now let's check out the type of x, and the type is list. Now with
a list, since there are multiple values stored within a variable, you might be
wondering how you can extract a single value. This is simple: each value in a
list has an index, and you always start with zero. This is the zeroth position, first
position, second position. So if you want to extract 14 then you type print(x[0])
and that will display 14. If you want to print the last value, print(x[2]), because
we start from zero, so this is one and this is two. So that should print nine.
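The list indexing just described looks like this in a cell. The negative index on the last line is an addition that the narration does not mention; it is included only as a common shortcut for "last item".

```python
x = [14, 67, 9]

# Indexes start at zero, so the three positions are 0, 1 and 2.
first = x[0]
last = x[2]

# Not covered in the narration: -1 also refers to the last item.
also_last = x[-1]

print(type(x), first, last, also_last)
```

Asking for `x[3]` on this list would raise an `IndexError`, because a three-item list has no position 3.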
Great. Now we'll change the values which are stored within the list. So if I say x[2],
which right now is 9, but I want to change this to 67, so x[2] = 67; that is a simple
statement for how you can do so, and run. Now let's see what x holds. When you just
type x you're printing every value within x, and when you give a particular number
within the square brackets with x you're printing the value at that particular
location. So print(x), run, and as you can see here guys, previously x held 14, 67, 9;
now it holds 14, 67 and 67 again, because we changed x[2] to 67. So now that we
are done with list, we'll move on to another data type in Python. This is called
tuple. With tuple too it's about the same thing: you can store multiple values in
a single variable. The syntax for doing so, though, is slightly different with
tuples. Instead of square brackets you now have round brackets, and within that
you type in your values, so I'll just put x = (4, 8, 6), and let's print x. So
all the values within x are printed. Now let's check the type of x, and the type
is tuple. So you might be wondering what the difference is between list and
tuple. Now the core difference between list and tuple is that tuples are immutable.
What I mean here is that in the case of tuples too you access each value within it
in the same way you do with a list. So if I want to access the value 8 in x right
now, I'll
just give x within square brackets 1, print this, and it outputs 8. But
now if I want to change this value, the way of doing that, as we saw with list,
is x[1] and the value I want to change it to, so let's say 5, and I run this: we
get an error. So you cannot change the values in the case of tuples. Once you have
stored the values within the variable for a tuple it remains that way right to the
end. In technical terms this means that tuples are immutable, while lists, for
which you can change the values, are mutable. So now we have something
slightly different. When we deal with files we need a variable which points to a
particular file, so in general these are called file pointers. The advantage of
having file pointers is that when you need to perform various operations on a
file, instead of providing the file's entire path name or the file's name every
time, we can just assign it to a particular variable and use the variable instead.
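A minimal sketch of the file-pointer idea follows. It does not use the notebook file named in the narration; instead it creates a throwaway text file with the standard-library `tempfile` module so the example runs anywhere. The names `path`, `handle_type` and `first_line` are illustrative.

```python
import os
import tempfile

# Create a throwaway text file so the sketch does not depend on any real path.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("hello world\n")

# x now refers to the open file; later operations go through x, not the path.
x = open(path, "r")
handle_type = type(x).__name__
first_line = x.readline()
x.close()

# Clean up the temporary file.
os.remove(path)

print(handle_type, repr(first_line))
```

Closing the file when finished is worth the extra line; in longer programs a `with open(...) as x:` block does the closing automatically.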
so this is exactly the advantage variables have with all other values too but the
syntax for doing so with files is slightly different so i give x equal to open
and within brackets open quotes enter your file name so i want to open say a
file called variable underscore com and this is dot ipynb so it's my python
notebook and the mode i'd like to open it in so r let me run this no error so
it's fine this kind of an assignment is completely legit now we check the type of
x so as you see here the type for x is _io.TextIOWrapper so in
python this is the particular type assigned to this variable but in general terms
you can refer to them as file pointers now suppose you want to assign values to
multiple variables what you can do here is instead of having statements like x
equal to 5 enter y equal to 10 and z equal to 7 instead of having three such
statements you can write x comma y comma z within brackets equal to 5 comma 10 comma 7 and this
works exactly the same so now if you print x y and z you'll
see that they have been assigned their respective values of course the number of
variables and the number of values on either side should match so if you give
x comma y equal to 5 comma 10 comma 7 this would result in an error
because you have only two variables on your left hand side while you have three
values on your right hand side now suppose you want to assign the same value to
multiple variables say x equal to 1 y equal to 1 and z equal to 1 a short form
to this as you saw previously would be x comma y comma z equal to 1 comma 1
comma 1 but then again you'll have to type the same value 3 times instead what we
can do is x equal to y equal to z equal to 1 and this would work perfectly fine
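
The assignment forms described here can be sketched as follows. The names `a`, `b`, `first_values` and `mismatch_error` are hypothetical and only serve the illustration.

```python
# One statement assigns three variables, matched by position.
x, y, z = 5, 10, 7
first_values = (x, y, z)

# Two names on the left and three values on the right do not match,
# so Python raises ValueError.
try:
    a, b = 5, 10, 7
    mismatch_error = None
except ValueError as err:
    mismatch_error = str(err)

# Chained assignment stores the same value in every name.
x = y = z = 1

print(first_values, mismatch_error, x, y, z)
```
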
so if i print the value of x y and z now they all have the value 1 stored within
them so we have covered the various data types in python and how you can assign
values to a variable in multiple ways we'll next move on to the various rules for
naming the variables now there are certain rules that you must follow while
naming the variables we'll go through each of these rules and simultaneously i'll
also demonstrate to you in our notebook the validity of each variable so our
first rule is variable name must begin with an alphabet or an underscore so let's
move on to our notebook so abc equal to 100 this should be valid because it starts
with an alphabet at the same time underscore abc would also be valid so if the
variable starts with an alphabet or an underscore it's a valid name for the
variable but if we say 3a equal to 10 this would result in an error of course
because the variable name starts with a number and this is invalid same way we
cannot start the variable name with a special character other than underscore
so if i say @abc this would also result in an error as you see the
error is invalid syntax because that is the invalid syntax for a variable name
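
One way to check such names without triggering a syntax error is the built-in string method `str.isidentifier()`. This is a sketch, not something shown in the session, and the `candidates` list is just the four names tried above. Note that `isidentifier()` does not check for reserved words.

```python
# Names tried in the notebook, written as strings so nothing fails to parse.
candidates = ["abc", "_abc", "3a", "@abc"]

# isidentifier() is True when the text follows the naming rules.
validity = {name: name.isidentifier() for name in candidates}

for name, ok in validity.items():
    print(name, "valid" if ok else "invalid")
```
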
let's now look at the second rule for naming variables the first character can be
followed by alphabets numbers or underscore so our first character must be an
alphabet or an underscore and after this first character we can have alphabet we
can have numbers or we can have underscore so back to our notebook by the second
rule a100 should be valid as it is and if we put underscore which is a
valid starting character then a984 and another underscore so our first character is followed by a letter
three digits and then another underscore this should also be valid because at the
end of the day it only consists of numbers letters and underscores however if a
variable name is a9967 so a few digits and then a dollar sign would this work well
no this would result in an error invalid syntax because as already mentioned for
variable names the only special character you can use is underscore what about
xyz hyphen 2 again this would be invalid because it includes a special character
that is not underscore and the error here is can't assign to operator because the hyphen is read as a minus sign let's have a look at our third
rule variable names are case sensitive so what case sensitive means is that
small letters and capital letters are treated differently so let's move back to our
notebook and say i assign a value 100 to a variable a100 with a small a and i
assign another value 200 to a different variable capital A100 so as you see
here guys between these two variables the only difference is that the first
variable has a small a and the second variable has a capital a let me just
execute this if i print the value of a100 and capital A100 you'll see
that both these variables are recognized as completely different variables they
have their individual values so now that we have established that variables are case
sensitive we can also say that python itself is a case sensitive language now
our final rule is that reserved words cannot be used as variable names now if you
have gone through any other languages like c c plus plus java you know this is a
common rule among all of them you cannot use words which have special meaning to
the language as variable names so some of these words are break class try you
have other words such as continue while which is a loop or if which is a
statement a conditional statement so let's go back to our notebook and see how
assigning values to these reserved words can result in errors so if i give break
equal to 10 that would be an error similarly class equal to 5 would result in an
error try equal to 100 would also result in an error because these words break
class try have special meaning to python and with that we cover the various rules
for naming variables the next thing we'll check out is the arithmetic operations
that you can perform on these variables once you have stored integer or float
values in them so let's have a look at some of these operations that we can
perform we'll start with initializing two variables so x equal to 20 and y equal
to 10 and now we'll perform our operations with these two variables so first
let's check out how addition works result equal to x plus y so result is a new
variable and it should store the value of x plus y let's print this out and as
you see x plus y equal to 30 so now result stores 30. similarly we can perform
subtraction we'll print this out too so 20 minus 10 is 10 we can also perform
multiplication we use an asterisk for the multiplication sign so 20 into 10 that's
200 next we'll check out division so as you can see here in case of division the
result is not an integer but rather a float if you want your result in integer
what you can do is instead of putting just one slash you can apply two slashes so
x slash slash y and print the result and there you go now you have the resultant
integer so in this particular case since 10 completely divides 20 the requirement
of the double slash is not very visible but if we do division for numbers that
do not divide each other completely for example if we have result equal to 2 by
3 and we print the result we get 0.66 and so on but if we give double slash here
and run this we get the result as zero so in a lot of cases the result from
double slash makes way more sense than from single slash so this is an extra
benefit you have with python which is not present in many other languages we
also have the modulus operator here which gives the remainder rather than the
quotient so if i say x mod y it'll give me the remainder of the division of x
by y and the remainder is 0 because 10 completely divides 20 on the other hand
if you do want to see a non-zero remainder you can give numbers such as 3 mod 2
and then print the result and as you see the remainder of this operation would be
1. now so far we saw arithmetic operations with integers all these operations can
also be performed with floating point numbers so if your x value is 3.14 and your
y value is 5.7 let's just perform one operation using the new values of x and y
to see what it looks like so let's say result equal to x by y print result and
there's your quotient so arithmetic operations work both with integers and
floating point numbers next let's check out some of the operations you can
perform on string variables as i mentioned earlier strings are basically a
collection of characters so let's look at a few string operations first we'll
initialize a string variable so var equal to simplilearn with a capital s and we'll perform all
our operations on this var variable now the first thing we'll do with this string
variable is we'll see how we can extract each character in the string so just like
in case of lists and tuples every character in your string has an index and it always
starts with zero so the first character is at location zero second at one third
at two and so on so if you want to extract the first character of your string
which is s in our case we just do var of 0 and as you see s is printed similarly
if you want to extract the fifth character of our string we'll put var of four
so if you want to extract the nth character your index for it would be n minus one
because we start our index from zero and that prints l so as you can see here s
is 0 and then we have 1 2 3 4 and that is l so now that we learned how to extract
a single character let's see how we can extract multiple characters from the
string suppose i want to extract the first three characters in that case
my starting character would be at zero so if i
want to extract the first three characters of my string what i'm basically saying
is that i want to extract the characters at the zeroth first and second positions
then i'd say print of var of starting from the zeroth location up to the third
location we say 3 and not 2 because the last index is always excluded so if i put
2 here instead of 3 then only the 0th and the first character would be printed so
i want my first three characters to be printed so var of 0 to 3 and i'll run this
and as you see the first three characters have been printed capital s followed by
i m now if you want to start from your first character you can also just ignore
the zero so you can say var of colon three that is we are not typing the zero in
this case it's automatically understood that you start extracting from the first
character when you miss out the zero so you run this and as you see it's the same
result now if you want to print the characters starting from the fifth index
until the end of the string you can go with print of var of the fifth index so
five and up to the end in this case you do not need to mention the last index you
can just leave it blank and it's automatically understood that you're printing
till the very last character of your string and as you see from your fifth index
to the very last character has been printed now suppose you give var of 0 to 20
and it's pretty obvious that you do not have 20 characters in this particular
string so you do not have 20 indexes so what do you think would happen if i do 0
to 20 well the entire string is printed so although we have only 11 characters in
our string and we have given the ending index as 20 this does not result in an
error just our entire string gets printed now if you want to find out the length
of your string without actually having to count it manually you can use the
function len of var len is an inbuilt function in python now run the command
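
The string operations from this part of the session can be collected in one sketch. The spelling "Simplilearn" is taken from the 11-character length and the letters mentioned, and the result names are hypothetical.

```python
var = "Simplilearn"

first_char = var[0]        # index 0 is the first character
fifth_char = var[4]        # the nth character sits at index n - 1
first_three = var[0:3]     # the end index 3 is excluded
same_three = var[:3]       # a missing start index means 0
from_five = var[5:]        # a missing end index means "to the end"
whole = var[0:20]          # an end index past the length is not an error
length = len(var)

print(first_char, fifth_char, first_three, same_three, from_five, whole, length)
```
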
and as you can see we have 11 characters in our string

today we look at what
numbers in python are so i'll be using jupyter notebook to explain all the
concepts in numbers so let's move on to this now the first thing we look at is
the different type of numbers python supports so first and foremost we have our
most common type of number which is int so here i have a variable num which is
equal to the value 5. now if i check the type of this variable num and i run this
as you can see here num is type int so int is basically any whole number that is
it does not have a decimal point also now the thing with integers in python is they
can be of any length so a very long number like this can also be stored in a variable num and it's stored
as an integer itself so the only restriction you have with size when it comes to
python and integers is the limitation of your system's memory now let's look at
the second type of number that python supports and that is when we have floating
points or decimal points so if i have a value 5.4 which i store in num variable
and then i'll check the type of my variable num and as you see here this is a
float type variable because now it stores a decimal value now unlike integers
floats have a fixed size and limited precision now here's something new you probably remember
complex numbers that is complex numbers have two parts there's the real part and
there's the imaginary part now python also supports these so if i have num equal
to 2 plus 5j so the j represents the imaginary part now often when you write this in maths
instead of j you use i it's basically the same and i'll check the type of num now
and as you see here it's complex now you can also view the real and the imaginary
parts of your variable num separately and for that i just need to type num dot
real which displays the real part that is 2.0 so as you notice here the real part
although we entered it as an int that is just 2 is printed as a float because
complex numbers are automatically stored in python in floating points so once you
print out the real part it's 2.0 and not two so now it's float and in the
similar manner i can also print out the imaginary part with num dot imag and there you go so both
the real and the imaginary part for a complex number in python are stored in
floating points irrespective of whether you enter them as an int or a float now
numbers can also be negative so you can have num equal to minus 5 6 7 point so
on and if i just print this out so a variable can also hold a negative value not
just positive now when you think about numbers the obvious thing are the
mathematical operations or the arithmetic operations that can be performed on
these numbers so python too supports a number of these operations so to demonstrate
this i'll create two variables in which i'll store two integer values say num1
which has value 10 and num2 which holds 2. now the first and the most basic
arithmetic operation is of course addition so num1 plus num2 and that's how
simple it is to use your arithmetic operations in python just like you write them
the exact same thing you need to type them out so 10 plus 2 is of course 12
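
A minimal sketch of this setup, with the remaining operators that the session goes on to cover included for reference. The result names are hypothetical.

```python
num1 = 10
num2 = 2

total = num1 + num2        # addition
difference = num1 - num2   # subtraction
product = num1 * num2      # multiplication
quotient = num1 / num2     # single slash always gives a float
floored = 10 // 3          # double slash keeps only the whole-number part
power = num1 ** num2       # num1 raised to the power of num2
remainder = 10 % 3         # remainder of the division

print(total, difference, product, quotient, floored, power, remainder)
```
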
similarly we have num1 minus num2 10 minus 2 which is 8 there's multiplication 10
into 2 which is 20 division 10 by 2 which is 5.0 now as you see so far since num1
and num2 were holding int values our results were also int that is for addition
subtraction and multiplication but now when we did division our result is in
decimal points so division always in python gives out decimal values or floating
point values this is because if we had divided 10 by 3 the answer wouldn't be a
whole number there would be some value after the decimal point now python gives
importance to these digits unlike many other languages so that there is minimum
accuracy lost now say you did not want the answer to be in floating points so
although 10 by 3 actually does give 3.333 and so on you want the result to be
just three that is you want to focus only on the whole number part in that case
you can have division with two slashes so when you put two slashes here the
result is an int value now suppose you want to raise 10 to the power of 2 that is
you want to raise num 1 to the power of num2 in that case we just use two
asterisk symbols so num1 star star num2 and as you know 10 to the power of 2 is
100 so in case of division we saw how to get just the integer value or get the
floating value now what if you do not want the quotient at all you want the
remainder of the division in that case you can use the modulus operator so you
have print of num1 mod num2 and that gives 0 because 10 is completely divisible
by 2 the quotient is 5 but the remainder is 0. so if i use this operator for 10
by 3 that's 10 mod 3 now we'll have a remainder which is 1 so that's because 3
into 3 is 9 and you have a remainder of 1. so the next thing we look at is
conversions now if i say x equal to 192 as we saw earlier this would mean that x
is of type int but if i just enclose this 192 value within quotes it makes it a
string so now although essentially what we are storing is a number 192 because
it's enclosed in quotes it's seen as a string so now x stores a string value if i
check the type of x you see it's of type str which is short for
string now often when you take input from the user it's always in a string format
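
A small sketch of why this matters, using a literal string in place of real user input so that it runs without prompting. The names `as_int`, `as_float`, `as_complex` and `built` are hypothetical.

```python
x = "192"                   # quotes make this a str, like text typed by a user
original_type = type(x)

as_int = int(x)             # explicit conversion to int
as_float = float(x)         # explicit conversion to float
as_complex = complex(x)     # real part 192.0, imaginary part 0.0
built = complex(2, 6)       # complex(real, imaginary) builds a new number

print(original_type, as_int, as_float, as_complex, built)
```
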
so python irrespective of what the value that is entered by the user always
stores it in a string format and now if you need this to be of integer type
floating point complex or any other type you need to explicitly convert this so
that is what we look at over here now initially we have stored a string 192 in x
but we want to use this as a number and not a string so then we need to convert
it to integer type for that we use the int function you say int and you just pass
the variable within the brackets so when you do that as you can see here the
value printed out is an integer but now it just printed out x as an integer x
still holds a string value so to change this we say x equal to int of x now x is
converted to an integer type and then stored back into x and now if i check the
type of x as you can see here it's now an int type variable now we can also
convert this to float so if i say x equal to float of x it's converted to float and now if
i print out the value of x you see it's 192.0 so what we did here is initially
we stored 192 as a string into x we checked the type of x and it was str that is
string then we converted x to an integer type and stored it back into x so now
x is int type after that we converted x to a float value and stored it back into x
and that changed the value from 192 which was a string to 192.0 which is now a
float now you can also change this to a complex number so i say x equal to
complex of x and then print out x so as you see here 192 which was our value
stored in x is the real part and since we didn't have any imaginary part previously j's
coefficient is just 0 now there's also a function to create complex numbers it's
called complex just that and within brackets you can pass your real and imaginary
part so suppose i want my real part to be 2 and my imaginary part to be 6 so i
just pass that as parameters to my complex function and i'll print out whatever
this function returns and as you see here it returned a complex number which was
created from the two numbers we passed to complex let's now explore some inbuilt
functions that work with numbers so the first one we'll see is absolute which
returns the absolute value of a number that is whether your number is negative or
positive the return value is always positive so if i have a variable say x which
is equal to minus 7.5 and i print out the absolute value of x so the function
name is abs and within the brackets i pass x so as you see although x held a
negative number the value returned by the abs function would always be positive
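
A short sketch of `abs`, using the value -7.5 from the session. The other two calls are added only for comparison.

```python
x = -7.5

magnitude = abs(x)           # the sign is dropped
unchanged = abs(7.5)         # a positive value comes back as it is
int_magnitude = abs(-3)      # ints stay ints

print(magnitude, unchanged, int_magnitude)
```
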
next we'll check out the exp function that is the exponent function now the
exponent function takes a single parameter and it returns that parameter as the
power to e that is e raised to the power of the parameter that you pass to exp
now before we use the exp function we need to import the math library because exp
is present within the math library so if i say x equal to 10 and i want to print
out e raised to the power of 10. so math dot exp and you pass x let me run this
and that is the value of e raised to the power of x now if you're wondering what
the value of e is python holds e as a constant so if i say math dot e you can see
the value of e here similarly python also has pi stored as a constant which is
3.14 and so on now let's check out another function within the math library which
is the square root function so you pass some value to this function and it
returns the square root of that value so i'll print out math dot sqrt of 9 and
that returns 3.0 so it prints out a floating point number not an integer type so
even if i pass 6 which is not a square number it still returns a pretty accurate
value now we have another function now this is not within the math library it's
present by default and this is the max function so if you have a large number of
values you can pass all these values to the max function and it will return the
maximum out of all of them so let me just pass a few random values and run this
and it returns the largest of the numbers passed just like max we also have min
which returns the minimum of the values passed and the minimum is one obviously
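
The functions from this part of the session, sketched together. The numbers passed to `max` and `min` are assumed, since the session only says a few random values were used.

```python
import math

x = 10
e_to_x = math.exp(x)          # e raised to the power of x
e_constant = math.e           # the constant e
pi_constant = math.pi         # the constant pi
root_of_9 = math.sqrt(9)      # returned as a float
root_of_6 = math.sqrt(6)      # works for non-square numbers too

largest = max(4, 1, 9, 7)     # built in, no import needed
smallest = min(4, 1, 9, 7)

print(e_to_x, e_constant, pi_constant, root_of_9, root_of_6, largest, smallest)
```
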
so numbers have very basic concepts and we have covered most of the important
ones here

today you'll be learning what lists in python are so we'll start with
the basic definition of what a list is we'll then see how you can create the list
we then have accessing of the elements in the list the various operations that
can be done on the list and then methods and built-in functions specifically for
lists and finally we'll have a short exercise using some of the concepts that we
learned today so what are lists a list is a collection of data and this data
could be of any data type in fact a list could hold data of different data types
as we can see in our example down here we have a list named x and x holds 1 89
the single letter a the string hey another number zero and the capital letter b
now another important factor with lists is its indices every element in the list
has a position and this position is called an index the first position in a list
is always 0 so 1 is at index 0 89 at index 1 a at index 2 and so on so this means
that the position of your last element would be the length of your list minus one
so now that you know what a list is let's move on to how you can create a list
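
The example list and its indices can be sketched like this. The name `last_index` is hypothetical.

```python
# A list can mix data types: ints, single letters and a longer string.
x = [1, 89, "a", "hey", 0, "B"]

first = x[0]                  # the first position is index 0
third = x[2]
last_index = len(x) - 1       # the last position is the length minus one
last = x[last_index]

print(first, third, last_index, last)
```
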
so this is my jupyter notebook and we'll start by creating the most basic list
which is an integer list so my list name would be num and then open square
brackets and within this just put in your numbers separated by commas so that is
my first list run this and there seems to be no error let's now print this list
out just to see what it looks like and as you can see the elements that we
inserted in this list are printed which are 1 2 3 and 4. next let's create a list
with just single characters so my list name is letter and the syntax is exactly
the same you open square brackets and now that we have letters inside which are
characters you type these within quotes let's print this too and once again
after the run you can see the letters that we inserted into our list letter next
let's create a list of strings so i'll name my list stg open square braces and
again strings should always be written within double or single quotes and that's
my list of strings i'll print this too great so now we have created a list of
only integers only letters and only strings now as i mentioned before list can
contain a combination of different data types so let's try that out i'll name my
list mix because it's going to be a mix of numbers and strings so two numbers and then
i'll follow this up with a few strings and let's run it so that works perfectly
fine now lists have another format if you have worked with any other programming
language previously you would have come across the term matrix so usually in
other programming languages matrices are associated with arrays so they are
basically 2d arrays in case of python a matrix can also be made of lists so you can have
a list containing two or more lists so let's do that i'll name my list mat as
in short for matrix open square braces and since the elements of this list are
also lists put these too in square braces so our first list would be 1 comma 2
and then comma because our second list is basically the second element of our
list mat and our second list would be a b and that's how you create a list
containing list so our parent list is mat and this contains two lists the first
list is one comma two and the second list is a comma b so let's print this out
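
The lists created so far, as a sketch. The contents of `letter` are assumed because the session does not spell them out, and the spelling "simplilearn" is assumed as well.

```python
num = [1, 2, 3, 4]                              # only integers
letter = ["a", "b", "c", "d"]                   # single characters (assumed)
stg = ["get", "certified", "get", "ahead"]      # only strings
mix = [1, 6, "simplilearn", "get", "certified"] # numbers and strings
mat = [[1, 2], ["a", "b"]]                      # a list containing two lists

# Each element of mat is itself a list.
first_row = mat[0]
second_row = mat[1]

print(num, letter, stg, mix, mat)
```
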
and there you go so we also created a list containing list so these are the
different type of lists you can create next we'll move on to how you can access
the elements within this list so there's no point of creating a list if you
cannot access each individual element within the list now here to understand how
to access the elements of the list we'll be using the list mix that we created
previously so before we start i'll just print out mix so you can have an idea of
what is inside so we have five elements within mix the first two are numbers or
integers one comma six and that's followed up by three strings simplilearn get
certified we previously saw that every element in the list has an index position
so the simplest way of accessing an individual element in the list is by
following the list name with square braces and within the square braces mention
the index number of the particular element so if i want to access the element get
my index would be 0 1 2 3 so mix of 3 and i'll run this we got our element get
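
A minimal sketch of this lookup, assuming the list `mix` as described above. The `positions` dictionary is only there to make each index visible.

```python
mix = [1, 6, "simplilearn", "get", "certified"]

# enumerate pairs every element with its index, starting from 0.
positions = {index: item for index, item in enumerate(mix)}

element = mix[3]              # the list name followed by the index

print(positions)
print(element)
```
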
now in a similar way you can also use negative indices to access the elements so
when you put a negative sign the indices are basically counted from the back of
the list now if i want to access the same word get but all i know about this word's
position is that it's the second last word in the list i also do not know the
length of the list in this case i can just say mix of minus 2 so 2 as in it's
the second last element and the minus signifies that you count from the end of
the list and not from the beginning so i'll run this and it extracted get now as
you probably would have noticed in case of indices when counted from the front of
the list it starts with 0. on the other hand when you count the indices from the
end of the list it starts with minus 1 and not 0. so now we saw how you can
access a single element from the front of the list and from the back of the list
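
Negative indices, sketched with the same assumed list `mix`. The result names are hypothetical.

```python
mix = [1, 6, "simplilearn", "get", "certified"]

second_last = mix[-2]                 # counting from the end starts at -1
last = mix[-1]
same_element = mix[len(mix) - 2]      # the positive index for the same element

print(second_last, last, same_element)
```
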
now what if you want to access a range of elements say i want to extract the
first three elements of the list mix in that case i could just type mix again
square braces and within here i'll put a colon followed by three so ideally there
would be a number on either side of the colon but in this case we are taking from
the beginning of the list so if the digit before the colon is supposed to be
zero you can just leave it blank and automatically it's interpreted as zero also
we know that our third element would be at the index position two but here we
have written index position 3 this is because our last index position is always
excluded so if you give 2 instead of 3 then the only elements that would be
extracted would be the ones at zeroth and first position since we want the
element at the second position too we give the ending index as three so let's
print this and we received our first three elements of the list mix which are 1 6
and the word simplilearn now what if we want to extract all the elements from a
certain position of the list say our third index position of the list up till the
end of the list in that case again we have mix square braces and we have the
same colons as we used previously but now since we want our starting index to be
3 we enter 3 and the element at the third index would be included but now we
want to extract all elements till the end of the list so you can just leave the
last position blank when you do this it's automatically interpreted that you're
taking all the elements up till the end of the list let's run this and we
received get and certified now i want to extract the words simplilearn and get
from our list mix in this case what would be the parameters within our square braces so simplilearn is at
the index location 2 and get is at the index location 3 so our starting index
would be 2 but as we already mentioned previously if i put 3 as my ending index i
would not receive the element at the index position 3. therefore i'll put my
ending index as 4 and run this and there you go we received our two words simplilearn
and get now here's something slightly more fun so what if i want to extract
every second element of my list in that case i put two colons followed by
two because i want every second element of my list and run this so as you see
here 1 has been printed which is followed by 6 in our list mix but 6 is skipped
and then we have simplilearn again the word get is skipped and we have certified
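
The slices walked through so far, sketched against the same assumed list `mix`. The result names are hypothetical.

```python
mix = [1, 6, "simplilearn", "get", "certified"]

first_three = mix[:3]        # start defaults to 0, index 3 is excluded
from_third = mix[3:]         # end left blank means "up to the end"
middle_two = mix[2:4]        # indices 2 and 3, index 4 is excluded
every_second = mix[::2]      # a step of 2 skips every other element

print(first_three, from_third, middle_two, every_second)
```
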
now there's one more thing you can do with list indices you can print them in the
reverse order so to do that just two colons and minus one let's print that and
as you can see your list mix is printed in the reverse order so now if we go back
to line 16 these two colons basically mark the positions for your start and end indices so
suppose you want every second element from the first index and not the zeroth index
you can write one here and if you want every second element from the first index
up to say the fifth index you can put five here but in our case we do not have a
fifth index so i'll just leave it at this you can obviously try that out in fact
we'll be using a variation of this in our exercise later on so now that we saw
how you can access the various elements of the list let's look at the various
operations you can perform on the list suppose you want to create a list of a
hundred zeros if we go by the traditional methods we literally have to type in a
hundred zeros separated by commas within the square braces and this gets very
tedious so a short way of doing this is type in that one element that you want
repeated multiple times within your list and outside the square braces follow it
up with an asterisk and the number of times you want this element to be
repeated so 0 into 100 and let me print this and as you see a hundred
zeroes exist within our list c now we can also create a new list concatenating
two or more lists so let's create a list concatenating the two lists that we
previously created letter and stg so i'll just print these lists out first
that's our list letter and that's our list stg now we'll create
a new list called con a short form for concat and how do you concatenate two
lists well you just add an addition sign between them so letter plus stg and
that's all it takes to join two lists let's print out con and there you go so the
first four elements in our list con are from the list letter and the last four
elements are from the list stg now there's another operation with list called
unpacking so here you basically give a string as an argument to this function
list and that unpacks each element of the string as a separate element in the
list so i'll show you this with an example if i name a list var and say list of and
within the brackets i enter the string say hey there and now i'll print var
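
A minimal sketch of that call, assuming the string is exactly "hey there". Note that the space also becomes an element of the list.

```python
# list() walks over the string and stores each character separately.
var = list("hey there")

count = len(var)

print(var)
print(count)
```
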
so as you can see every letter of this string hey there has become a single
element in the list var now let's check out another operation on list using the
list num that we created previously so once again i'll just print num out so we
can recall what was within it we had the four numbers one two three four now
suppose i want the first element of my list num which is 1 to be in a variable
named one and i want the rest of the elements to be stored in another list called say
other how do i do this i can just put one which is my first variable a comma and
then star other equal to num so what happens here is that the star signifies
that every element that has not been put into the previous variable or variables
is put into this list other so let's print out the variable one and the list other
separately and see what they contain and as expected one contains the value 1
which is the first element of our list num and the following elements of our list
which are 2 3 and 4 are stored in the list other let's now look at a few methods
specific to list so for the following few examples we'll be using the list that
we created previously called num i'll print the list out first so num has the
element 1 2 3 and 4. now the first method that we'll be using is the append
method and the syntax for that is the list name which is num dot append followed
by bracket and within these brackets you basically insert the element that you
want to add to the end of your list so suppose i want my list to be one two three
four six all i do is enter six here and that should ideally add six to the end of
your list now so now let's print out num and there you go so previously our list
contained the numbers one two three four and after we executed the num dot append
method six was added to the end now what if we do not want to add a single
element to the end of the list but an entire other list so in this case we can
use the method extend the syntax for this is num dot extend and within the
brackets you'll enter the name of the list you want to add to your previous list
so here i want to add the list sdg that we created previously to the end of my
list num now before i execute this command let me just print out sdg so we know
what's in there so within sdg i have get certified get ahead i'll just delete
this cell and now i'll run the command so let's print out num and as you see num
previously had one two three four six from here and then we added get certified
get ahead to the end of this list now these kinds of additions only take place
at the end of the list what if we want to insert an element somewhere in between
the list for this we use the method insert and the syntax is your list name num
dot insert and within brackets you first enter the position or the index at
which you want to enter the new element so i want to insert the string simply
learn just before get so this is my fifth position so my first parameter here
would be 5 followed by the element i want to enter which is a string simply learn
so i'll type it within quotes and let me run that let's now see what num looks
like and as you see here we have one two three four six and then we have the new
element that we just inserted simply learn followed by the elements which were
previously in our sdg list alone now there's another method you might come across
regularly which is the remove method so this method as the name suggests removes
an element it basically removes the first occurrence of that particular element so
the syntax for this is num dot remove and within the brackets you enter the
element that you want to remove so i'll remove the element simply learn and run
let's now see what num holds so as you can see simply learn has been removed now
if there were two occurrences of simply learn one here and maybe one after ahead
only this word would be removed and not the one later on because remove deletes only the
first occurrence of that particular word or that particular element okay so now
let's create a new list say var1 and we'll just insert a few letters in here
now imagine you receive a list like this where there are few elements but in a
jumbled order and you want to sort this list one simple method sort is all you
need so write var1 dot sort run it and now let's print out what var1 holds so
all your elements in var1 are now sorted python also provides a few built-in
functions that can be used with lists so we'll have a look at the most basic ones
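
Before the built-in functions, here is a rough sketch of the star unpacking and the list methods covered above. The letters in `var1` are placeholders because the transcript does not list them, and `after_insert` is a hypothetical name added only to keep a snapshot of the list.

```python
num = [1, 2, 3, 4]
sdg = ["get", "certified", "get", "ahead"]

# Star unpacking: the first element goes to one, the rest to the list other.
one, *other = num
print(one, other)

num.append(6)                  # add a single element at the end
num.extend(sdg)                # add every element of another list at the end
num.insert(5, "simplilearn")   # place an element at index 5, just before "get"
after_insert = num.copy()      # snapshot (hypothetical helper, not in the walkthrough)
num.remove("simplilearn")      # delete the first occurrence only
print(after_insert)
print(num)

# Placeholder letters in a jumbled order.
var1 = ["d", "a", "c", "b"]
var1.sort()                    # sorts the list in place and returns None
print(var1)
```
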
so before we begin let's create a list x and i'll fill my list with just
integers so that's my list and we'll use this list throughout all the built-in
functions now the most basic built-in function and the most commonly used is len
what len does is it tells you the length of your list so len of x and the length
is 6. our next built-in function that we'll try out is min min gives you
basically the minimum of all the elements that's present in your list so min of x
and the minimum element in our list is 4. similarly we also have max which gives
you the maximum of the elements present in the list max of x which is 90. now we
also have a function sum which basically gives you the sum of all the elements
present in your list so sum of x and that's 189 now python does not have a
built-in function which gives you the average but it's very simple to calculate this we do
so using the sum function so you find the sum and then you divide that by the
length of the list so sum of x by len of x would give you the average of x so
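
A small sketch of these built-in functions. The transcript never lists the integers in `x`, so the values below are hypothetical, chosen only so that they reproduce the results read out in the walkthrough (length 6, minimum 4, maximum 90, sum 189).

```python
# Hypothetical values: picked to match the reported len, min, max and sum.
x = [10, 4, 90, 20, 30, 35]

print(len(x))   # number of elements
print(min(x))   # smallest element
print(max(x))   # largest element
print(sum(x))   # total of all elements

# No built-in average, so divide the sum by the length.
average = sum(x) / len(x)
print(average)
```
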
with that we covered the basic built-in functions for python lists we also covered
the methods in list a few operations in list and we saw how we can access the various
elements in list

today we look into what tuples in python are so let's begin what
are tuples a tuple is a collection of immutable heterogeneous python objects so
if you have gone through a previous video on lists you would have noticed that
even list is a collection of heterogeneous python objects the only difference
between tuple and list here is the extra term that comes in case of tuples which is
that they're immutable so over here we have a tuple that's x and within x we'll
store various elements 1 89 a single letter a which is a string also a word which
is again a string in python hey o and b so as you can see this tuple here may not
be necessarily completely filled with integers completely filled with floats or
strings it can be a mixture of all of these data elements so that is one advantage
we have when it comes to tuples and lists as compared to arrays because with
arrays you can have elements of only one data type within it now we look at indices
so every element in a tuple takes up a position and this position is what we
refer to as an index in plural that's indices so the positioning or the
indices for tuples lists and also strings in case of python and most other
programming languages start with zero so your element one is at the zeroth
position your element 89 is at the first position and so on where b is at the
fifth position so in a way you can say that the length of your tuple x is six but
then the last index of your tuple x is 5 because it starts from 0 onwards so now
that we have a basic idea of what a tuple is let's begin with creating tuples
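
The indexing described above can be sketched as follows. The exact elements of `x` are partly unclear in the transcript, so the fifth element here is an assumption; only the values 1, 89 and "b" and the length of six are taken from the walkthrough.

```python
# Assumed contents: the element at index 4 is a guess, the rest follow the transcript.
x = (1, 89, "a", "hey", "o", "b")

print(len(x))          # six elements in total
print(x[0])            # the first element sits at index 0
print(x[1])            # the second element sits at index 1
print(x[len(x) - 1])   # the last index is always the length minus one
```
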
creating tuples now the first thing that we are going to do is we will create an
empty tuple so say my empty tuple's name is emp all you have to do to create an
empty tuple is give these curved brackets with no elements inside so this is a
lot like list but in case of lists we had square brackets in this case we have
curved brackets so that's the only difference now if i print the type of emp
you'll see it's of class tuple i can also print out just emp and since emp is an
empty tuple there's no element within it but then of course the brackets are
still printed because that is the basic characteristic of a
tuple so even if it's empty these brackets would still be existing so we saw how
to create empty tuples but how often do we actually use empty tuples so let's put
some elements into our tuple so let's start with creating a tuple with just one
element so say my tuple's name is city and i'll have one element pune if you're
creating a tuple with one element if you're not putting the bracket it's
completely valid as long as you have a comma after that particular element so
with this kind of a syntax you can create a tuple so i'll create a tuple say city
and i'll just have one element within this which is pune now if you're not
putting this element within these curved brackets that's very characteristic to
tuples it's completely okay while creating a tuple as long as you have a comma
after your element so although you have no element after this comma that one
single comma is important so let me run this and just type of city as you see
city has been stored as a tuple now in the most standard way of creating tuples
you'll say city and within brackets you'll pass your element also print out city
after that and the same syntax can be used and then you can also combine these
both although that's completely unnecessary effort but it's possible it does not
result in an error so you have your brackets and the comma works just fine now i
want to add more elements to city because clearly pune is not the only city we
know so pune bangalore these are some of the cities in india if you're from
another country and wondering so after this demo you not only know tuples you
also know a few cities in another country so i have four elements in this tuple
and this is how you create it all the elements separated by commas and within
brackets print out city now i can do the same thing without the brackets too just
the commas indicate that this is a tuple now those are the various ways you can
create a tuple of course all these ways do not come handy usually you use this
method to create a tuple with n number of elements and when it comes to one
element you usually go for this just put one element within the brackets this is
also used sometimes now another thing we saw while learning what a tuple is or
looking at the definition of a tuple is the word immutable so we said lists are
mutable which means that they can be changed and tuples are immutable as in they
cannot be changed so here we'll see a small example as to what the difference
between tuples and lists are when it comes to the execution of these both so say
i have a list list one and that has elements one two three four and then i have a
tuple say tuple one which has the same elements one two three and four now with
list one i'll use this method that you probably have encountered before called
append so i say list one dot append of five so what this method does is it adds
five at the end of this list which is list one now we'll print out list one run
this code and as you can see here five has been appended to our list one now
let's try the very same thing in tuple so tuple one dot append five and i'll run
this and see you get an error here so the tuple object has no attribute append in
fact it has none of the methods which you usually use to modify lists because these
methods aim at changing the original list and not giving back a new list in
tuple's case you cannot change the original tuple once your tuple is set that is
how it's going to be as long as it exists so that is the main difference between
lists and tuples list is mutable whereas tuple is immutable and other than that
of course you must have noticed the syntactic difference which is lists are with
square brackets and tuples may or may not use parentheses or the curved brackets
so every time you print out a tuple or you view a tuple it's always got those
brackets but as you saw while creating a tuple sometimes a comma is enough so now
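
A rough sketch of the ways of creating a tuple and of the immutability check. The third city is a placeholder because the transcript names only pune, bangalore and mumbai. The line showing that brackets without a comma give a plain string is an added note, not something taken from the walkthrough.

```python
emp = ()                       # empty tuple
print(type(emp))

city = "pune",                 # the trailing comma alone makes a one-element tuple
same_city = ("pune",)          # brackets plus comma: same result
just_a_string = ("pune")       # added note: no comma, so this is a str, not a tuple
print(type(city), type(just_a_string))

# "delhi" is a placeholder for the city the transcript does not name.
city = ("pune", "bangalore", "delhi", "mumbai")
also_city = "pune", "bangalore", "delhi", "mumbai"   # brackets are optional

list1 = [1, 2, 3, 4]
tuple1 = (1, 2, 3, 4)
list1.append(5)                # lists can be changed in place
try:
    tuple1.append(5)           # tuples have no append method
except AttributeError as err:
    print(err)
```
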
let's access these elements in a tuple if you want to print out the entire tuple
we already saw just put the tuple
name in a print statement and you'll have your tuple printed now if you want to
access some particular element of the tuple say i want to take out bangalore so
in that case we use the indices as we saw earlier indices start from zero and
bangalore is therefore at the first position so i have to just say city and
within square brackets one so this particular line or the syntax is exactly the
same as you would use for a list or string just run that and here we go we have
bangalore now we can also extract elements from the end of the tuple just like
we did in case of list so suppose i
want to access the very last element of my tuple but i have no clue as to what
the length of my tuple is in that case what index do i use well i can just go
with -1 now every time you give this minus sign it starts accessing elements from
the end of the tuple so minus 1 means the first element from the end of the tuple
which is mumbai and that's what we got as our output now although tuples are
immutable there are a number of operations you can still perform on them for
example the very first thing we look at is concatenation so i already have my
tuple city we'll just print it out once again here to see what it consists of now
we'll create another tuple say num and this will have the elements one two three
four now sure i can't modify tuples but i can definitely concatenate these two so
how do you do that extremely simple city plus num the plus sign works perfectly
fine for concatenation so i will print this out and we have an error here that is
because we have not run the line that defines num first so num does not exist
right now i ran my line 19 now i can run the next line and we have one complete tuple
which has all the elements of city followed by the elements of num now you can
also nest tuples that is you can have one tuple within another tuple so so far we
have a tuple city and num now what i'm going to do
is create a tuple within another tuple so we have a new tuple i'll
call it nest and since this is a tuple the basic syntax is that i'll have the brackets
now what will be the elements within this the elements would be my other two
tuples which are city and num so i put in both my elements here and let's see
what nest contains now so nest is a tuple which itself consists of other tuples
which are city and num so that is how you nest tuples extremely simple isn't it
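
The access, concatenation and nesting steps above can be sketched like this. As before, the third city is a placeholder, and `combined` is a hypothetical name for the result that the walkthrough simply prints.

```python
# "delhi" is a placeholder for the city the transcript does not name.
city = ("pune", "bangalore", "delhi", "mumbai")
num = (1, 2, 3, 4)

print(city[1])     # index 1 is the second element
print(city[-1])    # -1 is the first element counted from the end

# + does not modify either tuple, it builds a new one.
combined = city + num
print(combined)

# A tuple whose two elements are themselves tuples.
nest = (city, num)
print(nest)
print(nest[0][1])  # first inner tuple, then its second element
```
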
now here's something else say i want to create a tuple rep which has an element
python but that element is repeated five times so what you would generally go for
is type out python five times but thank god for python there's a much simpler way
of doing this type in python once and multiply this five times let's print out
rep now and as you see here python exists within this tuple five times one two
three four five now maybe this is not how you want to create your tuple you want
your tuple to just have that one element python initially but later on while
printing you decide you know what i want python to be printed 10 times within
this tuple not a problem put your tuple name here and then multiply it by 10 and
as you see here the tuple has now 10 elements within it so guys remember when
you're putting the statement here it's not actually modifying the tuple it's just
printing it this way so if i say rep once again and i run this this tuple just
has one element so next we look at slicing in tuples so we created our tuple num
previously i'll just print that out once again so it has the elements 1 2 3 4.
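
Before the slicing demo continues, the repetition step described just above can be sketched as follows. `printed` is a hypothetical name for the value that the walkthrough passes straight to print.

```python
# Multiplying a one-element tuple repeats that element.
rep = ("python",) * 5
print(rep)

# Start again with a single element and repeat only while printing.
rep = ("python",)
printed = rep * 10   # a new 10-element tuple
print(printed)
print(rep)           # rep itself still has one element
```
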
now suppose i want all the elements from 2 up to the end of this tuple to be
printed in that case what i do is i go for slicing where i will mention my
beginning and my end index so my beginning index starting from the element 2 is of course 1
and my end index is basically the position 4 is at or up to the end of this tuple
now when i want all the elements up to the end i do not need to mention my end
index it's by default taken so
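
A short sketch of tuple slicing, including the reversed form that the walkthrough reaches a little further on. The names on the left are hypothetical; the walkthrough prints the slices directly.

```python
num = (1, 2, 3, 4)

tail = num[1:]          # from index 1 to the end, end index left out
same_tail = num[1:4]    # the same slice with the end index written out
reversed_num = num[::-1]  # start and end left empty, step of -1

print(tail)
print(same_tail)
print(reversed_num)
print(num)              # slicing never changes the original tuple
```
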
if i just say num of 1 colon and leave the end blank run this your tuple is sliced it gives out
the elements at the first second and third position considering 0th is our first
position of course now you can also print this num tuple in reverse using these
indices so you say num i leave my start and end index empty give two colons and
then in the end i have minus one so when you put this minus one here so it
basically prints out the reverse of your tuple let's check that out and as you
see your last element is now your first element and your first element is now
your last element if you have any doubt regarding how slicing works please check
out our video on python list over there we have a detailed description on how
slicing is and there will be no confusion since in case of slicing at least
tuples and lists are exactly similar so let's look at the string that i have
here simply learn now i want every letter of the string simply learn to be an
element of a tuple how do i do this well pass the string to the tuple function so write
tuple of and within this you have your string which is again within quotes right
we'll run this command and as you see here every character is now a separate
element in a tuple so this procedure is called unpacking now i'll just print out
my num tuple once again so we can see what's in it okay so i want to put these
elements which are inside a tuple into various variables now in this case i know
that my tuple has four elements so i can have four variables a b c d and i'll put
this equal to num run this and then i'll print out a b c and d so every element
of the tuple has gone into one of these variables the first element went into a
second element into b and so on now what if we had no clue how many elements are
there in num in that case i could say a i want my first element to go into my
first variable a no matter what and i want my last element to go into this
variable c no matter what now all the elements in between i'm not very concerned
about where they go so in that case i can have another variable b but this
variable will be a list so the star indicates that b is a list and then this is
equal to num we print out a b and c so you see what happens here is my first
element went into a and my last element went into c because these were clearly
indicated but every other element in between went into my list b so that is how
unpacking works so since tuples are immutable it's impossible to delete the
elements within it but that does not mean you cannot delete the tuple entirely so
that is what the del keyword does we'll implement this so i'll create a tuple
tuple one and put some elements into it one two three four now
i'll just print out this tuple so we know it exists and what all
elements are within it so our tuple is created it has the elements one two
three four within it now i'll use the del keyword to delete this tuple so i say
del space followed by your tuple name run this command now let's try printing
tuple one once again of course since we deleted it this tuple
should not exist and this should result in an error and it is resulting in an
error it says tuple 1 is not defined so you can completely delete the tuple if
not delete elements within the tuple so as i mentioned previously most of the
methods that you have with list do not work with tuples because these methods
change the original list itself but there are a few built-in functions which do
work with tuples they return values or new lists and they do not have any impact
on the original tuple so it works just fine we'll explore some of these built-in
functions instead of the tuple num that i created before i'll
create a new tuple num1 which has the elements three five four twos six five
eight and now we'll perform our built-in function operations on this num1 tuple
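
Two things are sketched here under stated assumptions: a recap of unpacking and del from the passage above, and the calls that the walkthrough applies to `num1` next. The order of the elements in `num1` is inferred from how they are read out.

```python
num = (1, 2, 3, 4)

# Recap: one name per element.
a, b, c, d = num

# Recap: the starred name collects everything between first and last as a list.
a, *b, c = num
print(a, b, c)

# Recap: del removes the name itself, so using it afterwards raises NameError.
tuple1 = (1, 2, 3, 4)
del tuple1
try:
    print(tuple1)
except NameError as err:
    print(err)

# Assumed element order for num1: three, five, four twos, six, five, eight.
num1 = (3, 5, 2, 2, 2, 2, 6, 5, 8)
print(num1.count(2))   # how many times 2 occurs (a tuple method)
print(sum(num1))       # total of the elements
print(len(num1))       # number of elements
print(max(num1))       # largest element
print(min(num1))       # smallest element
```
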
so the first one that we look at is counting the number of occurrences of a
particular element so that is the count method so you give your tuple name dot
count of and the element whose count you wanna know so i want to know the number
of twos in my tuple num1 run that and as you see there are four twos now you can
also find the sum of all the elements within your tuple so that is sum of num1
you just have your function sum pass num1 to it now the sum function works for
tuples lists everything it's not a function particularly for tuples as such so
the sum of all the elements within a tuple is 35 and in a similar way you also
have the len function which is in fact even used for strings so len of num1 gives
the number of elements within num1 which is 9 we have 9 elements but remember our
last index is 8 not 9 and our last element is also 8 coincidentally now we have the max
function for finding out the maximum number in your tuple so say max of num1
which is eight and we also have a function for checking the minimum of the
numbers within a tuple which is two in this particular case so now lists and
tuples are very similar in the sense of what they hold so what if i have a list
say lst which holds the elements one two three four and i want to convert this
list to a tuple so we'll see how that is done first i stored my list one two
three four so i'll just check the type of the variable lst ensure that it's a
list which it is now we can convert this to a tuple so to convert this to a tuple
enter your variable name say tpl and then pass your list name within the brackets of the
function tuple run this code and now let's print out tpl and as you see our list
lst has been converted to a tuple we can also check its type so yeah the conversion
has worked out fine now this kind of a situation often arises when you put
certain elements into a list so you create a list that you desire and then you
realize that you don't want this to be changed at all throughout the program so
then you can convert the list to a tuple you can also convert a tuple back to a list
with the list function though that gives you a new list and leaves the tuple itself
unchanged so we previously saw how we can have nested tuples that is tuples within
tuples now we can also have tuples within lists so we'll check that out now say i
have a list lst and within this list i'll define my tuples so two of the elements
of my list lst will be separate tuples i have a tuple one two three and i'll have
another tuple as the second element of my list as four five six so i'll run this
code print out lst so lst is created just fine it's a list which holds two tuples
now tuples are immutable but lists are mutable so we can add as many tuples as we
want inside this list so we can make modifications to the list so i could say
lst dot append and to this i'll pass another tuple which has three elements say
tuple inside list run this code now if i print out lst you'll see that the
third tuple has also been added to the list now we can also remove tuples from
the list just use the remove method so lst dot remove and within this i'll pass
the tuple that i want removed which is my first tuple in this case so run that code
and print out the value of lst to check what's in it right now so the first tuple has
been removed now we just saw how you can nest tuples within lists now we'll check
out how we can nest lists within tuples so i'll create a tuple tpl and within
this tuple i have two elements each of the elements being a list so they have a b
c in my first list and d e f in my second list run this and print out the value
of tpl before we proceed yeah so that's our tuple with two elements each
element a list now we cannot modify tuples so initially when we had tuples within
list we could add more tuples to that particular list but now we have lists
within tuples and we cannot add more elements to this already defined tuple but
what we can do is we can modify the list within the tuples so i can add elements
to the first element and the second element of my object tpl so i'll try that
out if i say tpl of 0 dot append so when i say tpl of 0 it's accessing the first
element of the tuple and the first element of the tuple is a list so on this i can
use the append method and i'll append an element z run this code and then check
out the elements within tpl and in the similar manner we can also remove
elements from the list within this tuple so i can say tpl of 0 dot remove z and
then print out tpl so you see z has been removed here now just for the sake of it
let's see what happens when we try to add another list to this tuple so i say tpl
dot append and pass a list x y z and then print out tpl so we get an error
because this time our object was a tuple not a list and tuples cannot be changed
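
The conversion and nesting steps above can be sketched as follows. `back_to_list` is a hypothetical name, and the three strings in the appended tuple are an assumption about what was typed on screen.

```python
# Converting a list to a tuple, and back again with list().
lst = [1, 2, 3, 4]
tpl = tuple(lst)
back_to_list = list(tpl)       # hypothetical name; a new list, tpl is unchanged
print(type(tpl), type(back_to_list))

# Tuples inside a list: the list can grow and shrink.
lst = [(1, 2, 3), (4, 5, 6)]
lst.append(("tuple", "inside", "list"))   # assumed contents of the third tuple
lst.remove((1, 2, 3))
print(lst)

# Lists inside a tuple: the inner lists can change, the tuple cannot.
tpl = (["a", "b", "c"], ["d", "e", "f"])
tpl[0].append("z")
tpl[0].remove("z")
try:
    tpl.append(["x", "y", "z"])            # tuples have no append method
except AttributeError as err:
    print(err)
print(tpl)
```
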
so with that we saw the many concepts that are used with tuples along with what
a tuple is

today we'll cover the concepts in strings so before we begin let's
look at what exactly a string is now just like you have integers floats bytes and
so on string is a data type now what does this data type hold it holds a
collection of characters and when i say characters that could include letters
numbers and even special characters such as space period dollar the at symbol
and so on now the syntax for storing a string is much like that for any other
variable you have the variable name followed by an equal to sign and the value of
your variable but in case of strings it's important that you always put your
values within double or even single quotes so that is string at the very basic
let's start our demos and i'm sure through the demos you'll have a better
understanding of how to use string and where to use strings so let's start with
the most basic demo where we'll store a string within a variable and then print
the value of this variable so my variable name is sdg within which i'll store the
string simply learn so i'll enter the string within
double quotes as you can see here or i could also do this within single quotes
like that and then in the very next line i'll print the value of sdg let's run
the code and as you can see here the value is printed now let me store a
different value in sdg i want to store the string tim's birthday so i insert
this within single quotes tim's birthday let me run the program but here you
see there's an error it says syntax error which is an invalid syntax so if you
come back to your string you'll notice this is because our string itself has a
single quote so we try to enclose our entire string within single quotes but what
happened is when the python interpreter encountered the single quote within the
string it assumed this to be the end of the string so sdg became equal to tim and
the rest of the word that is the s space and birthday was just there for no
reason and this resulted in the syntax error so in a case like this what you do
if your string contains within it a single quote enclose the string in double
quotes like this and now if you run your program as you see tim's birthday is
printed now in a similar manner if my string has within it double quotes then
i'll store the entire string within single quotes so let's say my string is tim
said i am busy today so since my string has double quotes within it i'll store
it in single quotes and a string is printed right now suppose tim said i'm busy
today that is i apostrophe m so now in this case our string has both single and
double quote so here what we can do is we can simply escape the single quote here
so for that you use the backslash and when you insert the backslash what it does is the
character that just follows the backslash is taken as simply a part of the string
without any special meaning so the python interpreter in this case sees this
apostrophe and ignores it for any special meaning so if i print the string now
both my double and single quotes are printed correctly let's try something else
out what if my string needs to be of multiple lines so let's say my string is hey
there and on the next line i'm typing welcome to simply learn so if i try to do
this in the similar manner that we did for the other strings that is insert the
entire string within double or single quotes as you see here that does not work
welcome to simply learn is just not taken as a part of the string even if i try
single quotes let's try that out still no different still our second line of the
string is not considered a part of the entire string so here what you do is you
start and end your string with three quotes now these three quotes can be
either three single or three double quotes so if i'm going with single quotes
i'll have three single quotes here and three single quotes at the end so in this
manner you can have a string with multiple lines let's print this out too so we
have our multiple line string so now that we saw the various ways in which you
can store strings in variables let's try out something new now the first thing we
look at right now is the length function so the length function is something that
you'll come across very often when you start programming what the len function
does is it returns the length of your string so if my string is say simply learn
and i want to print the length of sdg variable i just say print len of and
within the brackets enter your variable name print that so simply learn is of
length 11. what that means is that it has 11 characters within it so even if you
add a space in between even that will be counted as a character because as i
told you a string is a collection of characters and this character could be
anything we print that out as you see 12 is the new length of your string now
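
A sketch of the quoting rules and of len, assuming the string is spelled "Simplilearn" (which matches the reported length of 11). Where exactly the space was added is not stated, so its position below is a guess. The variable names other than `sdg` are hypothetical.

```python
sdg = "Tim's birthday"                      # single quote inside double quotes
said = 'Tim said "I am busy today"'         # double quotes inside single quotes
both = 'Tim said "I\'m busy today"'         # backslash escapes the inner single quote
print(sdg)
print(said)
print(both)

# Triple quotes allow a string to span several lines.
multi = '''hey there
welcome to simplilearn'''
print(multi)

sdg = "Simplilearn"
print(len(sdg))             # every character counts
sdg = "Simpli learn"        # assumed position of the added space
print(len(sdg))             # the space counts as a character too
```
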
just like in case of arrays even each character of your string can be extracted
separately and again just like in arrays this is done through the indices so if
i want the character at the fifth index of my string to be printed i can just say print sdg
and just as the syntax is for an array the exact same way you have your square
brackets within which you have your index number mentioned and print that so the
character at index five in the string sdg is i now remember the indices in strings too
start from zero capital s is at the zeroth index i is at the first m is at the
second index and so on now if i want to access every character of my string one
by one i can do this using a for loop because as we learned previously for loop
is used to iterate over a sequence and the string too is a sequence it's a
sequence of characters so my counter variable is i for i in sdg and print i so i
will iterate through the string start from capital s and go all the way up till n
and every time you're printing the value of i that is the character that's
holding at that particular point of time so when i print it as you see every
character is printed in a new line now if you want all the characters to be
printed in the same line of course you can have your end parameter here print
that and so the string is printed the same line but again it was printed one by
one that is one character at a time now with strings there's a concept called
slicing so slicing is extracting a part of the string right now we saw how you
can extract exactly one character from a string now what if you want to extract
say the first five characters of your string sdg so this is where slicing comes
in and it's a very simple concept all you have to do is you give your variable
name your string variable name and within your square brackets where previously you
mentioned one index this time we'll mention two indexes and a colon in
between so i want my first five characters so i'll start from zeroth index and go
up till five with a colon in between them so let's print this and as you see i
got my first five characters of the string sdg now when you're starting from the
beginning of your string you can avoid giving this first index what i mean is
you can just put in your variable name and instead of having 0 to 5 here you can
just ignore the 0. so python assumes that your first index is 0 when you just
skip it when we print this you see both the results are the same now let's say i
want to extract all the characters from the fifth index up to the end of the
string in this case i can just say print of sdg and within the square brackets
since we want all the characters starting from the fifth index our first index
would be five colon and since we want every character till the end of the string
to be extracted we can leave the second index space empty so when you do this
the interpreter assumes that your end index is basically till the end of the
string so let's run this and as you see here all elements from the fifth index up
till the end of the string have been extracted now suppose you want to extract
some elements from in between the string so in this case
you must enter both your beginning and your end index so say print sdg and
suppose i want to extract all characters from the second index up till the fourth
index so in this case my first index would be two because i want to start
extracting from my second index position and my end index would be five and not
4. so the reason here is that the element at the end index position is always
excluded so if you put 5 here as your end index then all the characters till the
4th index position would be extracted and if i enter four here then all the
elements till the third index position would be extracted let me print this too
so our last output is the elements in the second third and the fourth index
position now python has a number of inbuilt methods which make it easy and
efficient to deal with the string data type now there are a large number of such
methods and of course we don't have the time to go through all of them so we'll
go through some of the most commonly used ones let's start by creating a string
so my string is welcome to simplilearn now the first method we look at is to
print our entire string in uppercase so print your variable name period upper
so all the inbuilt methods this is how they are called you have your variable
name your string variable name followed by period and then your method name and
the parenthesis so some of the methods require some parameters to be passed
within it and some of the methods do not so the upper method prints our string in
uppercase now you must remember that these methods do not change the value of the
original string they just return a new string so stg.upper let's print that and
our entire string is printed in uppercase in the similar manner we have a method
to print our string in lowercase too there you go that's just dot lower so here
our entire string is printed in lowercase
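As a minimal sketch of the slicing rules and the two case methods shown so far, the snippet below reuses one string for both. The variable name stg follows the narration; the exact spelling "Welcome to Simplilearn" is an assumption inferred from the later remark that capital S sits at index 11.

```python
# Assumed value: the narration describes the string but never spells it out.
stg = "Welcome to Simplilearn"

first_five = stg[0:5]   # indices 0..4; the character at the end index is excluded
same_five = stg[:5]     # a missing start index defaults to 0
from_fifth = stg[5:]    # a missing end index runs to the end of the string
middle = stg[2:5]       # indices 2, 3 and 4

print(first_five)       # Welco
print(same_five)        # Welco
print(from_fifth)       # me to Simplilearn
print(middle)           # lco

# upper() and lower() return new strings; stg itself is left unchanged.
upper_copy = stg.upper()
lower_copy = stg.lower()
print(upper_copy)       # WELCOME TO SIMPLILEARN
print(lower_copy)       # welcome to simplilearn
print(stg)              # Welcome to Simplilearn
```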
initially we saw that if you want to extract a particular character at certain
location you can just give the location index and the character is extracted now
what if you want to find out the index of a particular character in that case we
have a method called find find returns the index of the first occurrence of the
character so the syntax is very similar your variable name dot find now for this
method you need a parameter passed and the parameter of course is the character
whose index we want to find out so suppose i want to find the index of the
character capital s so the character capital s is at index 11 with the
index starting from 0 onwards now there's another function
which is very similar to find which is the
index function and the index function too returns the index
of the character that you pass within the brackets the difference is that find
returns minus 1 when the character is not in the string while index raises a
value error so if i say s here both of
them return the index which is 11 now these methods give the index of the
first occurrence of the character since we only have one capital s it's obvious
it's giving the index of this particular character now if i say l as you can see
our string has an l here and one here too so if i give l in these two places
and run the program you see the index returned is for this l that is our very first
l now let's look at another inbuilt method for strings which is the split
function so what the split function does is it converts your string to a list
and it does this based on a delimiter that we pass as a parameter to the method
so if i say print of stg.split and within these brackets i'll pass the
parameter now if you look at our string here it's a sentence or a phrase where each
word is separated by a space so if we want each word to be a separate element of
the list then what's separating these words is the space and that would be our
delimiter here so within quotes i pass space let's run this if you notice the
last output that we got here this is a list and the elements of this list are the
words in our string so if we observe the return value of these methods you'll
notice that the first two return a string these two they return an integer or the
index number and the last one that we executed that is a split method that
returns a list now often this is a very useful method so what you do is you store
the return value of this method in a variable say x equal to stg.split and
then you can use this list in further operations in this case i have
nothing to do with x i'm just printing it but when you start coding you'll find
that the split function comes in very handy now what if you want to replace a
certain part of your string for this we have the replace method so you have stg
dot replace now within the brackets you have two parameters to pass the first
parameter is that part of the string that is to be replaced so suppose i want to
replace the word simplilearn in my string so that will be my first parameter
and my second parameter is what i want to replace this particular part of the
string by so i want my string to say welcome to python tutorial instead of
welcome to simplilearn so i'll replace simplilearn with python tutorial let me
print that out so this method too will return a string it will not change our
original string none of these methods do that yeah so as you can see here it's
returned welcome to python tutorial instead of welcome to simplilearn now python
has another data type called tuple if you have gone through a previous
video on variables we introduced tuples there so tuples are a lot like lists
except they're immutable now our next method is the rpartition method so the
rpartition method is a lot like the split method but here it creates a tuple so it
returns a tuple and this tuple always has three elements let's start writing this
method and then i'll explain to you how it works so stg dot rpartition and
within the brackets you enter the separator that is the one part of the string
at whose last occurrence the partition is made now if i pass within my brackets the string space to space
since my tuple always has to have three elements if you look at
the original string this part here that is the space to space forms the middle
element of my tuple that is the second element of my tuple everything before this
part that is the entire welcome word will become the first element of my tuple
and everything after this part which is the word simplilearn becomes the third
element of my tuple so i'll print this out and you can see what it looks like so
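The methods walked through in this part (find, index, split, replace and rpartition) can be sketched together as below. The string value is assumed to be "Welcome to Simplilearn", which is consistent with capital S being found at index 11; the variable names other than stg are illustrative.

```python
stg = "Welcome to Simplilearn"  # assumed spelling of the narrated string

s_by_find = stg.find("S")       # index of the first capital S
s_by_index = stg.index("S")     # same result when the character is present
first_l = stg.find("l")         # several l characters exist; the first one wins
absent = stg.find("z")          # find signals "not found" with -1

words = stg.split(" ")                                    # a list of words
replaced = stg.replace("Simplilearn", "Python Tutorial")  # a new string
parts = stg.rpartition(" to ")                            # a 3-element tuple

print(s_by_find, s_by_index)    # 11 11
print(first_l, absent)          # 2 -1
print(words)                    # ['Welcome', 'to', 'Simplilearn']
print(replaced)                 # Welcome to Python Tutorial
print(parts)                    # ('Welcome', ' to ', 'Simplilearn')
print(stg)                      # the original string is unchanged
```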
as you see here rpartition returned a tuple and split here returned a
list so our tuple's second element is what we passed within our brackets and the
other two elements are everything before and after it respectively so these are
some of the basic and commonly used inbuilt methods with strings in python now
let's move on to concatenating strings so string concatenations are pretty
simple with python say you have two strings stg1 which holds good and stg2
which holds morning now using these two strings you want to create a third string
which holds good morning so what i can do is i'll just have a variable stg and
that will be equal to stg1 plus now i need a space between these two words so
plus space which will be inserted within quotes because at the end it's all a
string again the plus and our next variable which is stg2 and then i'll print
stg out so you can see here stg printed out good morning good from stg1 and
morning from stg2 now if you have no operations to be done with the concatenation
of two strings that is you do not need this third variable at all you can
directly put this concatenation part within our print statement like that it's
exactly the same now things get slightly more complicated when you have a number
of strings to add now let's say i have three strings so stg1 which holds hey
stg2 which holds there and stg3 which holds all now what i want my new string
to have is hey there comma all followed by an exclamation now this could be done
in the previous way we can have the plus symbol to add the various strings
including the space the comma and everything inserted within quotes but that gets
a little messy when you have a number of variables to add and other extra strings
to be added in between so here is where we use the inbuilt format method for
strings so what we're going to do here is have a fourth variable stg now
i'll just write down the format first and then i'll explain it to you okay so
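A minimal sketch of both concatenation styles follows: the plus operator for the good morning case and str.format for the three-string case. The lowercase spellings of the values are assumptions, since the narration only speaks the words aloud.

```python
# Two strings joined with +, with an explicit space in between.
greeting = "good" + " " + "morning"
print(greeting)  # good morning

# Three strings placed into a template; each {} is a placeholder that is
# filled, in order, by the arguments passed to format().
stg1 = "hey"
stg2 = "there"
stg3 = "all"
stg = "{} {}, {}!".format(stg1, stg2, stg3)
print(stg)       # hey there, all!
```

Everything outside the curly braces (the space, the comma and the exclamation mark) is copied into the result exactly as written.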
how this works is first you have to give a format for your string so that is
inserted within double quotes now these braces here the curly braces they are
called placeholders wherever the curly braces are placed that is exactly where
the respective string from your format function will go to that is our first
placeholder will be replaced by the value of stg1 our second placeholder will be
replaced by the value of stg2 and our third placeholder here would be replaced by
the value of stg3 and everything that is in between is printed exactly the same
so here i have inserted a space this space will be printed exactly as it is in
the same location that i have given the comma and the space here too will be
printed exactly the same way and the exclamation would be printed the same too so
at first you give a string within which you have placeholders for these strings
that already exist followed by dot and then the format function to which you
pass the variables that take the place of these placeholders so now let's print
out stg and as you see stg holds exactly what we wanted hey there comma space all
exclamation dictionaries and sets in python so first let's have a look at what a
dictionary is by definition a dictionary is a collection of data
stored as pairs of key and value so a dictionary basically is another data type
just like your lists and your tuples but it is not a sequence
because elements are looked up by key and not by position since python 3.7 a
dictionary does keep insertion order but you still cannot index it by position also every element is a pair of values so there's the key
and there's the value the key is basically like your index in a list or a tuple
but here in dictionaries keys can be not just integers but also strings so that
is the main difference between dictionaries and other sequences such as lists and
tuples now let's have a look at exactly how this works but first we look at its
syntax so you have your variable name or the dictionary name equal to and within
curly braces you have your data stored so each item here is a key value pair as i
mentioned earlier so our first item would be key one and then you have the
colon so the colon separates the key and the value so value 1 is
assigned to key 1 value 2 is assigned to key 2 and so on and every key value
pair is separated by a comma so now we'll move to jupyter notebook to have a
look at how dictionaries work let's start with creating dictionaries so first
we'll make an empty dictionary so how is this done you have your variable name
which will be your dictionary name basically so i'm taking my variable name as d1
and as we previously saw dictionary elements are enclosed within curly braces so
you can just put down curly braces here and since you are creating an empty
dictionary have no element inside and that is pretty much it that's how you
create an empty dictionary i'll just print this out print d1 also print out its
type so as you see here guys this is an empty dictionary the class is dict
short for dictionary of course now let's create a dictionary with elements so my
dictionary name this time would be d2 again curly braces and now within these
curly braces we'll enter our elements so my first element's key is say 1 and the
value assigned to this key would be welcome my second element's key is 2 and the value
assigned to this would be to then the third key and third value and finally the fourth key
and its corresponding fourth value now let me print d2 out so we created our
second dictionary successfully now here these integers are the keys and the
strings are the values now i mentioned previously too that in case of
dictionaries the keys are like indices but here these indices slash keys do not
need to be integers they can also be strings so next up let's create a dictionary
with the keys being strings too so my next
dictionary's name is d3 and now within curly braces enter your elements again so
my first element's key would be name then the colon and then enter the name
although name is a key since it's still a string we need to enter it in
quotes do not forget that the next key would be age which is 22 and finally the
profession and sam here is a student so that's our dictionary d3 i'll print this
out too so we created an empty dictionary we created a dictionary with its
indices as just integers now there's another method of creating dictionaries
which is using the dict method so let's check this out my dictionary name is d4
and i'll be using the dict method now i'll pass the dictionary elements
within the dict method again using curly braces now this kind of creation
of a dictionary is not needed the above methods are much simpler and
that's what we usually go with but just for the sake of your knowledge we'll
check this out too so again i'll just put down these elements here copy paste
them inside these curly braces and then print out d4 so as you see it's just the
same thing we get the exact same result using this dict method too now when
you're using the dict method make sure that you have not named any variable
dict previously if you did then when you call
dict it'll give you a type error saying the object is not callable so if you did
by mistake create a variable or a function named dict make sure you delete it
before you run this line now another syntax for using the method and creating a
dictionary let's look at that so now my dictionary is d5 using that method again
so this time we'll pass every key and its corresponding value as a pair within
round brackets so we are basically passing these pairs as elements of a list or
by syntax you can say as tuples within a list so you have one is welcome
remember to put the quotes always for strings two so as you notice in this case
we are not using colons what we are doing is
passing the method a list of pairs and each of these pairs is enclosed
within these round brackets here now print out d5 so there's an error here the
error is of course that i typed colons in place of commas so i'll correct
that and run the code again and there you go so d5 is also created it's exactly
the same as d2 or d4 it's just a different method of creating it now the final
type of dictionary that we'll be creating are nested dictionaries so you can have
a dictionary within a dictionary just like you can have a list within a list or a
tuple within a tuple so let's see how this is done and why this would be done too
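The creation styles covered up to this point, together with the nested form being introduced, can be sketched as follows. The values for keys 3 and 4 are placeholders, because the narration does not say what they were; the rest follows the spoken description.

```python
d1 = {}                                  # empty dictionary
print(d1, type(d1))                      # {} <class 'dict'>

# Integer keys; "third" and "fourth" are placeholder values (assumed).
d2 = {1: "welcome", 2: "to", 3: "third", 4: "fourth"}

# String keys must be quoted like any other string.
d3 = {"name": "sam", "age": 22, "profession": "student"}

# dict() wrapped around a literal gives the same result as the literal alone.
d4 = dict({1: "welcome", 2: "to", 3: "third", 4: "fourth"})

# dict() also accepts a list of (key, value) tuples; note commas, not colons.
d5 = dict([(1, "welcome"), (2, "to"), (3, "third"), (4, "fourth")])

# A nested dictionary: the value stored under "name" is itself a dictionary.
d6 = {"name": {"first": "sam", "last": "crew"}, "age": 22, "profession": "student"}

print(d2 == d4 == d5)                    # True
print(d6["name"])
```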
so if you look at d3 here we had this key name which had one value sam but now
what if our key name has two values that's the first name and the second name now
in this case name can be the key to another dictionary and within this dictionary
we can store the first and the second name again with their corresponding keys so
we look at that i have my dictionary d6 here and we'll use a simple method of
creating a dictionary i'll just copy paste this line where i'm storing sam's
details and edit this out so name is the key to a dictionary and within this
dictionary i'll have two keys first which is basically your first name that is
sam and last that is sam's last name which is crew and that's how you create a
nested dictionary print d6 out so it's done as you can see here name is the key
for our dictionary which is inside our main dictionary and inside this nested
dictionary too we have two keys first and last each of them with values so now
that we learned how to create dictionaries let's see how you can add more
elements to them so i'll create an empty dictionary first say d and now i want to
add my first element to this dictionary so i can say d of 0 with the same square
bracket syntax as a list equal to and the value then i'll print out d so that's how it's
done 0 becomes your key and welcome is the value now i can also add a tuple to
this dictionary so if i want my second element or the value of key 1 to be a
tuple i go like d of 1 equal to and then just how you would define a tuple go
ahead and create that and then i'll print out d so now we have two elements in
our dictionary d the first is the string welcome followed by a tuple with three
elements so i just missed out a comma here and forgot to close my strings so yeah
our dictionary has two elements the first one is a string welcome and the second
element is a tuple with three elements within it which are how are and the string
you so so far we used numbers as indices for adding elements which made the number
the key now as we saw previously dictionaries can also have keys which are
strings so how do you assign these well the same way but now instead of number
put down a string and i say d of name equal to sam right and then print out d so
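The three assignments described here can be sketched as below. Assigning to a key that does not exist yet creates it, which is what makes this work on an empty dictionary.

```python
d = {}                           # start from an empty dictionary

d[0] = "welcome"                 # integer key, string value
d[1] = ("how", "are", "you")     # integer key, tuple value
d["name"] = "sam"                # string key, string value

print(d)
# {0: 'welcome', 1: ('how', 'are', 'you'), 'name': 'sam'}
```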
our first two elements have the key 0 and 1 integers our last element is having
the key as name which is a string and a value which is again a string now
similarly you can also add dictionary as an element to this already existing
dictionary now in this particular case i don't want to add another element in
fact i want to update the value of a key name i want the value to be another
dictionary just like we had previously so i'll again say name where the index is
supposed to be and then equal to curly braces where i'll create my dictionary so
that'll be the key first which will have the value sam and the key last which
will have the value crew print d out and there you go so your value for the key
name has been updated now so we saw how we can add elements or update elements
now let's have a look at how you can access the elements which are already
present in your dictionary so the syntax we use again here is very similar to
that for lists or tuples before we move ahead though i'll just print out d so
that we can remind ourselves of whatever was stored in d so we have three elements here
key 0 has the value welcome key 1 with the tuple how are you and then finally the
third key which is name which has a dictionary as its value and the dictionary holds the
first and the last name so now if i want to retrieve the value of the key name i
can just say d of name and it'll give me the value of the key name now if i want
the value of just the key first which is within the key name i can go ahead with
the sort of syntax you'd use for lists and tuples again when they are nested so d
of name which returns this dictionary here as we saw previously and now from this
dictionary if i want to extract the value of the key first then i would just say
first within the square brackets and that is how you retrieve the value within a
nested dictionary now this is a very simple approach but we also have a method
which is specific to dictionaries for retrieving values which is the get method
so if i say d dot get and within the brackets i pass the key say 1 run this
line here and that too retrieves the value of the particular key unlike square
brackets get returns none instead of raising a key error when the key is missing so we created
dictionaries we inserted elements into them updated those elements and even saw how
to access these elements now let's look at how you can delete the elements so
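The update and the lookups covered above can be sketched as follows. The dictionary contents mirror the narration; the key "age" used for the missing-key case is a hypothetical example that is not in the walkthrough.

```python
d = {0: "welcome", 1: ("how", "are", "you"), "name": "sam"}

# Assigning to an existing key replaces its value instead of adding an element.
d["name"] = {"first": "sam", "last": "crew"}

whole_name = d["name"]           # the nested dictionary
first_name = d["name"]["first"]  # index twice to reach into the nested one
via_get = d.get(1)               # same result as d[1]
missing = d.get("age")           # hypothetical absent key: get returns None

print(whole_name)                # {'first': 'sam', 'last': 'crew'}
print(first_name)                # sam
print(via_get)                   # ('how', 'are', 'you')
print(missing)                   # None
```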
once again before we begin i'll just print out our dictionary on which we'll be
operating so that's d and these are the elements within d now the first method
that you can use for deleting elements is the pop method so i'll say d dot pop
and within the brackets insert the key whose value you want to delete so
suppose i want to delete welcome which is at key 0 so i just enter 0 in here run
the code and the pop method basically deletes the value for the key that's given
and it returns this value so it's returned welcome also if we check d now welcome
won't exist within it anymore now the other method which is popularly used for
deleting elements is popitem now with pop we specified the element we want to
delete with popitem we do not specify the element because in current python versions it always removes the last
inserted element of the dictionary so you just say d dot popitem run the code again
popitem just like pop returns what it deletes here as a key value pair so i'll print out d
now and as you see here d just has one element left after the two deletions which
is the value of key one and that's basically how you delete elements from a
dictionary now we'll also check out some inbuilt methods or inbuilt functions
with dictionaries before we move on to that though we'll put our dictionary back
into its previous state so we'll just run the line where we created our
dictionary once more this line here let's check out d now so our dictionary's
back to its original state and now we can perform the next few operations on it
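The two deletions and the reset can be sketched as below. The helper build_d is hypothetical; it stands in for re-running the notebook cell that created the dictionary.

```python
def build_d():
    """Hypothetical helper: recreate the dictionary used in this walkthrough."""
    return {
        0: "welcome",
        1: ("how", "are", "you"),
        "name": {"first": "sam", "last": "crew"},
    }

d = build_d()

popped = d.pop(0)        # removes key 0 and returns its value
print(popped)            # welcome

last_pair = d.popitem()  # removes the last inserted pair, returned as a tuple
print(last_pair)         # ('name', {'first': 'sam', 'last': 'crew'})

remaining = dict(d)      # copy of what is left after the two deletions
print(remaining)         # {1: ('how', 'are', 'you')}

d = build_d()            # put the dictionary back into its original state
```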
so there are a number of inbuilt functions used with dictionaries but here we'll
just focus on some of the very important and commonly used ones so the first one
we're going to look at is values so you say d dot values now what this
method does is return all the values of the dictionary so it returns just the
values not the keys run this so as you can see here all the values are returned
as a dict_values type object now there's a slightly different method of creating
a dictionary which is using the fromkeys method so what this method does is
basically it takes a sequence of keys and one single value and then
it creates a dictionary where every key gets that value so first i'll create a
sequence for my keys my keys would be a b c d and then i'll just have one single
value here one and i want each of these keys to have the value as one so this is
where the fromkeys method comes in very
handy say dict dot fromkeys and then you pass on the sequence of keys and the
value let's run the command so a dictionary is created here with our keys abcd
and as i mentioned previously in older versions of python
dictionaries were unordered so the order a b c d was not guaranteed to hold in the
dictionary that's created but from python 3.7 onwards a dictionary keeps its keys in insertion order
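A short sketch of values and fromkeys as just described, followed by the clear call that closes this part. The key list is written out as four separate one-letter strings, which is an assumption about how the sequence was typed.

```python
d = {
    0: "welcome",
    1: ("how", "are", "you"),
    "name": {"first": "sam", "last": "crew"},
}

# values() returns a dict_values view; list() takes a snapshot of it.
values_snapshot = list(d.values())
print(d.values())

# fromkeys() builds a dictionary in which every key shares one value.
keys = ["a", "b", "c", "d"]      # assumed form of the key sequence
d_from = dict.fromkeys(keys, 1)
print(d_from)                    # {'a': 1, 'b': 1, 'c': 1, 'd': 1}

# clear() empties the dictionary in place; the variable d still exists.
d.clear()
print(d)                         # {}
```

On Python 3.7 and later the keys of d_from come out in the same order as the input list.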
and now finally we'll use the clear method to remove every element from this dictionary
now if i print d here it's an empty dictionary so with that we come to the
end of what dictionaries are and next we look at sets so what are sets in python
by definition a set is an unordered collection of unique elements so the terms we
must concentrate on here are unordered just like dictionaries in sets also
there's no particular order to the elements stored in it also unique element as
in suppose you have the letter a and you enter this twice while creating the set
ultimately when the set is stored or when you print out the set you'll notice
that a only occurs once so every element of the set is unique now set basically
brings out the mathematical notion of a set that means you can easily find out
the union of two sets or the intersection of two sets or any such operation that
you'd normally do on mathematical sets also sets are iterable and mutable and
they provide a very optimized way of checking if an element is within it or not
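The properties just listed (uniqueness, membership checks and mutability) can be sketched as below. The variable names and sample elements are illustrative.

```python
# Duplicates collapse: "a" is passed twice but stored once.
letters = set(["a", "a", "b"])
print(len(letters))          # 2

# A set built from a list of four integers.
s = set([1, 2, 3, 4])
print(type(s))               # <class 'set'>

# Membership tests use the in operator.
has_three = 3 in s
has_nine = 9 in s
print(has_three, has_nine)   # True False

# Sets are mutable, and elements of different types can be mixed.
s.add("a")
print(len(s))                # 5
```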
now let's have a look at the syntax of a set you have the variable name or the
set name equal to and then you use the set method to create the set you pass to
this a list of the elements that you want inside the set so unlike dictionaries
in sets you don't have key value pairs it's just the values now we'll move back
to our jupyter notebook to demonstrate sets so first things first let's create a
set my set name would be s and then i'll use the set method and within this i'll
pass four elements one two three four and that's how i create my set very simple
isn't it i'll just print out my set and also print out its type so we can confirm
that what's stored in s is actually a set run the code so as you see here we have
our set one two three four belongs to the class set now sets need not have just
integers or just strings it can be mixed of course and we look at this while also
looking at how you can add elements to a set so for adding elements you use the
add method so s dot add and then within the brackets you pass whatever the
element is that you want to add in this particular case i'm adding a so i'll run
that and then print out s so yes a is added to the end of the set but again since
this is a set the order does not matter so now sets are mutable so i was able to
add a new element to the set now we have another thing called frozen set now
frozen sets are just like sets except they are immutable so here i'll create a
frozen set for which we use the function called frozenset pass on the
elements here and then i'll print out fs so the syntax for creating a frozen set
is just like that for creating a set and yeah we have the elements here now i
want to add an element to fs if i say fs dot add and pass e here you'll notice
you get an attribute error saying a frozenset object has no attribute add
this is simply because add modifies a set and frozen sets do not allow
this kind of a modification now as i mentioned before sets in python represent the
notion of sets in math so that means that we can also have operations such as
union and intersection on set and these are the most common operations that you
do perform with sets so let's have a look at these so i'll create two sets
initially s1 which will have the elements 1 3 7 2 and then s2 having the elements
three two eight nine so these are our two sets now first we'll apply the union
operation so you say s1 dot union and within the brackets pass s2 run this so
what union does is it brings all the elements of s1 and s2 together into another
set and of course any kind of repetition is removed because sets have unique
elements so as you can see here you have three in s1 and s2 both but three
appears just once in our union set now we can also have intersection of s1 and s2 which keeps only the elements present in both sets
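The frozen set behaviour and the set operations in this part of the walkthrough can be sketched as below. The frozen set elements are assumed, since the narration does not name them; s1 and s2 use the values given. Results are sorted only to make the printed order predictable.

```python
# A frozenset has no add method, so the attempt raises AttributeError.
fs = frozenset(["a", "b", "c", "d"])   # elements assumed for illustration
try:
    fs.add("e")
    add_worked = True
except AttributeError:
    add_worked = False
print(add_worked)                      # False

s1 = {1, 3, 7, 2}
s2 = {3, 2, 8, 9}

union = s1.union(s2)                   # everything from both, no repeats
common = s1.intersection(s2)           # only elements found in both
only_s1 = s1.difference(s2)            # elements of s1 that are not in s2

print(sorted(union))                   # [1, 2, 3, 7, 8, 9]
print(sorted(common))                  # [2, 3]
print(sorted(only_s1))                 # [1, 7]
```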
now we can also subtract a set from another so s1 dot difference of s2 basically
removes all the elements from s1 which are also found in s2 run this line here
and the result is 1 and 7 so we'll go back to our sets here and confirm this now
s1 has 1 3 7 2 and s2 also has the elements 3 and 2 so when you remove s2's elements from s1 all you're
left with is 1 and 7 if else statement in python so if else is a decision making
statement and there are various formats of this so let's begin and one by one we
look at each of these formats so the first one and the most primitive form of the
if else statement is the if statement here's the syntax for if so you have the
keyword if followed by the condition and then a colon on the
next line you can put in your statements which will be executed if this condition
is true so all these statements are indented slightly towards the right
this indicates that these statements are within the if block so how exactly does
this if statement work let's look at the flow chart to understand this so at
first we have the test expression or the condition now if this condition results
in true that means that the statements which are inside the if block those will
be executed if this condition results in false then the control of the program
will skip to the statement after the if block and in either case the program then
proceeds to execute the statements right after the if block so we'll better
understand this through an example i'll move on to my jupyter notebook where i'll
demonstrate a code now i'll have a variable a which is equal to the value 20. so
i'll write an if statement the keyword if followed by the condition and my
condition here is if a is greater than 50. so if a is greater than 50 we want to
enter the if statement or the if block and in here i'll print out this is the if
body and if a is not greater than 50 we would skip this entire part and directly
move on to the statement which follows the if block and here i'll have the
statement print this is outside the if block so i'll run this program now so in
this particular case the value of a was 20 which is obviously less than 50 which
is why we did not enter the if block we simply skip to the statement after the
if block now if i make the value of a equal to 60 and then run the program you
see that we have entered the if block so this print statement here that is the
body so this print statement which says this is the if body is printed and then
the control moves on to the next statement which is outside the if block which is
this is outside the if block so both these statements are printed this is how if
works so the next format for if is the if else statement so so far we could specify
the statements that will be executed if a condition is true now in case of if
else we can specify the statements which will be executed in case the condition
is true and a separate set of statements in case the condition is false so that
is the part that comes under else now let's look at the syntax here
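A sketch of the plain if demo from above, next to the if else shape whose syntax is introduced here. The messages are collected in lists and returned rather than printed directly so that each call can be checked; the function names are hypothetical, and the odd or even check is one illustrative condition.

```python
def plain_if(a):
    """Plain if: the body runs only when the condition is true."""
    messages = []
    if a > 50:
        messages.append("this is the if body")
    # This line is outside the if block, so it runs in either case.
    messages.append("this is outside the if block")
    return messages

def if_else(i):
    """if / else: exactly one of the two blocks runs."""
    messages = []
    if i % 2 == 0:
        messages.append("this is the if block")
        messages.append("i is an even number")
    else:
        messages.append("this is the else block")
        messages.append("i is an odd number")
    return messages

print(plain_if(20))   # ['this is outside the if block']
print(plain_if(60))   # ['this is the if body', 'this is outside the if block']
print(if_else(20))    # ['this is the if block', 'i is an even number']
print(if_else(23))    # ['this is the else block', 'i is an odd number']
```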
there's if followed by the condition and then the colon just as before
under which you have your statements which will be executed slightly indented to
the right and following the if block you have the else block so if your condition
results in false the execution of the program moves on to the else block and
under the else block again you have your statements which will be slightly
indented to the right so here we'll have a look at the flowchart for the if else
statement if the test expression or the condition results in true we move on to
the body of if and the statements within this body of if will be executed after
which we'll move on to the statements just below if if the condition initially is
false then we'll move on to the else block and we'll have a separate set of
statements within the else block which will be executed after which again we'll
move back to the statements just below the if so again we have a demo here to
better understand how if else works we will print out if a number is odd or even
so initially i have a variable i which is equal to 20 and then i'll have the if
statement to check if i is odd or even so i
have i mod 2 equal to equal to 0 so if i mod 2 is equal to 0 that means the
number is even because i is completely divisible by 2. so in that case i'll print
out this is the if block and i is an even number and if i mod 2 is not equal
to 0 we need to print out that i is odd so initially we did not have this
capability with just the if statement but when you have the else too if i mod 2 is not 0
the control goes to the else statement and only the statements which are inside
the else block get executed so here i have print this is the else block and then
print i is an odd number now let's run our program so as we know 20 is an even
number so i mod 2 results in 0 we go inside the if statement and these two
statements get printed this is the if block and i is an even number now i'll
change the value of i to 23 which is an odd number so then the statements inside
the else block get printed now let's look at the nested if so nested if is when
you have an if statement within an if statement this kind of a situation often
occurs when you have to filter a variable multiple times so let's look at the
syntax for nested if you have if followed by condition and the colon just
as before and
now within the if statement indented again to the right you'll have the second
condition so you'll have if and your second condition which is condition two
followed by a colon now you can have as many levels of nesting as required but of
course the more levels of nesting you have the harder the program is to read so
you should always try to minimize or where possible avoid deeply nested ifs now
we look at a flowchart for nested if you have your first expression let's call
this expression one if this results in true then the control moves on to the
second expression that is the nested if now if this condition results in true
you'll have the execution for the body of if if this condition results in false
you'll have the execution for the body of else now of course this else is not
mandatory you can either have an else statement or completely avoid it altogether
the control however after else follows the statements right after the if block
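The nested structure in the flowchart can be sketched with an outer threshold check and an inner even or odd check. The function name classify and the use of a return value instead of print are illustrative choices.

```python
def classify(c):
    """Outer if filters on the threshold; the inner if/else filters on parity."""
    if c < 25:
        # This inner if/else is indented one level deeper than the outer if.
        if c % 2 == 0:
            return "c is an even number less than 25"
        else:
            return "c is an odd number less than 25"
    else:
        # Paired with the outer if. Note that c == 25 also lands here,
        # although the message says "greater than".
        return "c is greater than 25"

print(classify(21))   # c is an odd number less than 25
print(classify(10))   # c is an even number less than 25
print(classify(50))   # c is greater than 25
```

Which else belongs to which if is decided purely by indentation: each else lines up with the if it is paired with.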
now i'll move on to my jupyter notebook to show yet another demo for nested if so
here i'll take an integer and my first filter would be whether the
integer is less than or greater than 25. if this integer is less than 25 i
further need to specify if this number is even or odd and if the number is
greater than 25 i just need to print out that yes this is greater than 25 so
let's begin i'll have a variable c and for now i'll have the value 21 stored in c
now if c is less than 25 i'll first check if c is an even number so if c is less
than 25 and then c mod 2 is equal to equal to 0 i'll print out c is an even
number less than 25 else that is if c mod 2 is
not equal to 0 then i'll print out c is an odd number less than 25 now if c is
not less than 25 that is if c is greater than 25 that would come under the else
block which matches with our parent if statement so if you notice the indentation
here this else statement which goes with this if statement is indented at the
same position and on the other hand our parent if statement and the else
statement paired with it are both indented at the zeroth position so now under
my parent else statement i'll just print out c is greater than 25 let me now run
this program and c is 21 so it gets into this if statement here because it's less
than 25 and then c mod 2 which is 21 mod 2 is not equal to 0 so the control
moves to this else statement and c is an odd number less than 25 gets printed if
i had 50 here run my program again this if statement would not get executed
control would transfer to this else statement here and c is greater than 25 gets
printed now so far with if and with if else we had a very binary approach that is
if this condition results in true print out the statement a if this results in
false print out statement b now what if we have various other conditions for
example we have a number a and we need to check if a is within the range 1 to 10
in which case we need to print out that this variable is within the range 1
to 10 else if it's within the range 10 to 20 we need to print that out now if
it's in the range 20 to 30 we need to print that out that is it's in the range 20
to 30. so in this case we cannot just stick to if else statement we need
something more we need something which has more steps and that is where the if elif
else ladder comes into the picture so here if we look at the syntax we have your if
condition first then a colon followed by your statements and under this we do
not directly have the else statement but we have another keyword elif so the
elif statement lets you provide another condition and not just go with the false
of if so you have elif and then condition 2 under which you have statements
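A small sketch of this syntax, applied to the range example mentioned just before it. The function name which_range and the handling of the shared boundaries (10 counted in the first range, 20 in the second) are assumptions, since the narration does not say which range a boundary value belongs to.

```python
def which_range(a):
    """Report which of three ranges a falls into, using if / elif / else."""
    if 1 <= a <= 10:
        return "a is within the range 1 to 10"
    elif 10 < a <= 20:
        # Checked only if the first condition was false.
        return "a is within the range 10 to 20"
    elif 20 < a <= 30:
        # Checked only if both conditions above were false.
        return "a is within the range 20 to 30"
    else:
        # Runs only when none of the conditions above matched.
        return "a is outside the range 1 to 30"


print(which_range(15))  # a is within the range 10 to 20
```

Only one branch of the ladder ever runs: the first condition that is true wins and the rest are not evaluated.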
you can have any number of elif statements and finally you'll have your else
statement which will be executed if none of the previous statements have been
executed so we'll have a look at the flowchart to understand basically how the if
elif else ladder works you have your first condition expression one if expression one
results in true body of if gets executed and immediately you move on to the
statement which is right after the whole ladder if your expression 1 results in
false that means any of the following elif conditions could still result in true
so the program keeps checking these conditions in order so your next elif condition is
checked which is expression 2 and then if expression 2 results in true then the
body of this elif statement is executed and no further statements are checked in
the if elif else block directly you move on to the statement right after these
blocks if none of the previous elif conditions or the if condition match the
control moves on to the else statement and the else statement will definitely get
executed so once the else statement is executed the control moves to the
statements which are right after the if elif else block so that is how the control
flows in the if elif else ladder we'll have a look at another demo to better understand
this so here i'll have a variable say var which stores a single letter which is
for now z and then i'll have the if elif else ladder to check if
the value stored within the variable is a vowel or a consonant and moreover which
specific vowel it is so first i have if var equal to equal to a print this is a
vowel and print this is the vowel a now we have five vowels that's a e i o u
so we cannot directly go with an else statement here we need to check the
condition for each of these letters or these vowels so now i'll have the elif
var equal to e print this is the vowel e just copy paste this elif again the
same for i then for o and u and finally if your variable is not equal to any of
these five letters then we just print out that it's a consonant so that can come
in the else part so if none of these statements on top execute the control
directly comes to else statement and else definitely gets executed so here we'll
print out this is a consonant before i run this program a small correction it
should be double equal to since single equal to is used for assignment okay now we can
run this code and it prints out this is a consonant so var held the value z which
is of course a consonant i'll put in a vowel now say o run the code again and it
prints out this is the vowel o so this works perfectly fine loops in python so
jake here wants to print his name 10 times now there are two ways of doing this
one is he can literally give 10 print statements for printing his name or he
could use a smarter method and he could have just one print statement and somehow
repeat the execution of this one statement now this is where loops come in so a
loop is basically an instruction that repeats multiple times as long as some
condition is met so jake could simply use a loop to execute that one print
statement 10 times now in python we have three main types of loops there's the
while loop the for loop and finally the nested loop now before we move into these
loops let's look at the flow of a loop so the dot at the top represents the start
of the program now all the statements of the program are executed one by one
until a condition for the loop is encountered now if this condition results in
true the statements within the loop are executed and once again the condition is
checked now this loop continues as long as the condition is satisfied the minute
the condition fails the program's control exits out of the loop and all these
statements following it are executed so now that we got a basic idea of what
loops are and how they run let's look into our three basic loops the first is the
while loop so the while loop is used to repeat a section of code an unknown
number of times until a specific condition is met so suppose jake did not want
his name to be printed exactly 10 times but until the end of the page in this
case you can use while loop and the condition would be print until end of page is
reached so what is the syntax of a while loop you have the keyword while followed
by the expression or the condition that has to be met and then a colon and
following the colon on the very next line you can start the statements which
will exist within your while block indented slightly to the right now let's look
at a short demo to better understand how while loop works so i'll move into
pycharm so we'll write a very simple program that asks the user to enter a
multiple of 7 and the program keeps doing this until a multiple of 7 is correctly
entered by the user so if the user enters say 14 or 21 you simply print yes this
is a multiple of 7 but on the other hand if the user enters say 10 or 12 or any
other number which is not a multiple of 7 you got to take the input again so
let's begin first i'll take an input from the user and i'll store this in my
variable val so in python every input that you take from the user is
automatically taken in the string format and you need to convert this into int
so that is what this int function is for and within the parentheses you take the input
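To see why the conversion matters, here is a short sketch that does not wait for keyboard input. The string "14" is an assumed stand-in for whatever input() would return if the user typed 14.

```python
raw = "14"        # stands in for the text that input() returns
val = int(raw)    # convert the text to an integer

print(type(raw).__name__, type(val).__name__)  # str int
print(raw * 2)    # repeating a string gives 1414
print(val * 2)    # multiplying a number gives 28
```

If the text is not a whole number, for example "seven", int raises a ValueError instead of returning a value.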
now we need to make sure that the number entered by the user is a multiple of 7
and if it's not we need to take the input again so here's where the while loop
comes in we check the modulus of val against 7 and if this is not equal to 0
that is if val divided by 7 leaves a remainder that means val is not a
multiple of 7. so in this case again we need to take the input from the user so
we'll repeat our first line once again let's have a look
at the syntax for a while before we move ahead so we have the first keyword
which is while and then val mod 7 is not equal to 0 is a condition now every
time this condition results in true this statement is executed so this is the
only statement within the while loop now we need to also take care of the
condition where the value is a multiple of seven so this will come under else now
if you've gone through other programming languages this might be something new
you're coming across because there else is always used with if you don't use it with
while or for or any other loop statement as such but in case of python this can
be done else colon and next line print so if the user did finally enter a
multiple of 7 we'll just say that this number is a multiple of seven so
when you're printing the value of a variable within a sentence this is how you do it
wherever you want the value to appear in the sentence place
the placeholder for the variable there so in our case val is an integer type of
variable therefore we have the placeholder percent d if val was a string type
we'd have percent s and after the closing quote you put the modulus sign followed by
your variable name if you have two or more variables after the modulus sign
within brackets you can enter your variable names separated by commas like this
so i just have one variable and that's it now let's run this save it first so
your first line enter a multiple of 7 is printed let's enter 18 which is not a
multiple of 7 and see what happens so once again as we wanted the input needs to
be taken from the user and this will continue until you do give a multiple of 7.
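The program described so far can be sketched roughly as follows. Two things are assumptions added so that the sketch runs unattended: the wrapper function ask_for_multiple_of_7 and its read parameter, which defaults to input but is replaced here by a lambda that feeds in the answers 18 and then 14. The prompt text is also a guess based on the narration.

```python
def ask_for_multiple_of_7(read=input):
    """Keep asking until the value entered is a multiple of 7."""
    val = int(read("enter a multiple of 7: "))
    while val % 7 != 0:
        # Remainder is not 0, so val is not a multiple of 7: ask again.
        val = int(read("enter a multiple of 7: "))
    else:
        # The else of a while runs once the condition becomes false.
        print("%d is a multiple of 7" % val)
    return val


# Simulated keyboard input: first 18 (rejected), then 14 (accepted).
answers = iter(["18", "14"])
ask_for_multiple_of_7(read=lambda prompt: next(answers))

# With two or more values, put them in brackets after the % sign.
print("%d mod %d is %d" % (18, 7, 18 % 7))
```

Calling ask_for_multiple_of_7() with no argument would read from the keyboard, as in the demo.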
so now if i give 14 there you go 14 is a multiple of 7 and the program has
successfully terminated now to better understand the flow of while loop let's
debug this program so the first thing you do is you keep a breakpoint on your
first line and then you go to run and debug the program so in debugging we
basically see the execution line by line and to do this we press f8 so after your
breakpoint is placed press f8 and the first line is printed in your console
so let's give an input say 87 let's go back to our debugger as you can see here
the value of val is now 87 and we have moved to the second line where val mod 7
was checked against 0 and because it was not 0 this means that the while's
condition is true and you move into the statements within the while loop so now
we have reached the third line where we are taking the input from the user again
so once again we do f8 and as you see the user input needs to be taken again so
now i'll give 42 which is a multiple of 7 enter and it goes back to the while
loop because now once again it needs to check if val mod 7 is not equal to 0 and
as you can see here now the value of val is 42 we press f8 you'll notice how we
jumped from line 2 to line 5. this is because this time 42 mod 7 was equal to 0.
so the while condition resulted in false and the block right after the while which is your
else is what got executed so now you have the execution for the last print
statement and if we press f8 once again you see the program has terminated with the
final output 42 is a multiple of 7. so i hope you understood while loop let's
move on to the next loop which is the for loop now for loop is used to iterate
over a sequence the sequence could be a list it could be a tuple it could be an
array a string or it could even be a range basically if you have certain elements
arranged one after the other a for loop can be used to iterate over these
elements so now let's look at the syntax of the for loop you have the for keyword
followed by counter so counter is basically a variable say you want to repeat
your name 10 times what you do is you always keep track of the number of times
you've already repeated your name on your fingers and that is exactly what a
counter is to a for loop a counter keeps track of the position that you're in in
the sequence after counter you have another keyword in and then you have the
sequence so the sequence you can literally give your list there or you can have
the variable which stores your list tuple array or string now let's move into a
demo for the for loop once again i'll open my pycharm so now we'll write a
program to iterate over a list now i'll store my list in the variable x and my
list will have the elements 1 comma 6 comma simplilearn now using the
for loop i'll iterate over x so my counter in this case will be i then in and my
sequence which is x always followed by the colon and now i'll just print i let's
save this and run it so as you can see here the elements of x are printed
there's one six followed by simplilearn so what it does is basically when you
give i in x i assumes the value of the first element in x prints this value and
every time you go back to the for loop i moves on to the next element so the second time
you reach the for loop i is now holding the value six and the third time i holds
the value simplilearn now the same thing can be done with just a string so if x
is equal to simplilearn which is a string and we run this code now you see all
the letters of the string are printed one by one so in case of strings i holds the
value of each character in the string right from the beginning up till the end
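A minimal sketch of both cases above. The helper name visit is an assumption added so that the values taken by i can be inspected, and the spelling "simplilearn" is a guess at the string used on screen.

```python
def visit(sequence):
    """Return the values that i takes, in order, while looping over sequence."""
    seen = []
    for i in sequence:
        # i is bound to each element in turn; it is not a numeric
        # index that gets increased by one.
        seen.append(i)
    return seen


print(visit([1, 6, "simplilearn"]))  # the three list elements
print(visit("simplilearn"))          # one entry per character
```

The same loop works unchanged for a list and for a string because both are sequences.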
now python also allows nested loops and by nested loops we mean loops within
loops so now there are various ways that nested loops could be implemented it
could be a for loop within a for loop it could be a while within a while loop a for
within a while loop or a while within a for loop so in our demo for nested loops
we'll see one of the most popular applications for it which is accessing the
elements of a matrix so first we'll begin with creating a matrix i'll store my matrix
in variable x and my matrix will have two lists one would contain the elements
one two three and the other will be of the letters a b c now with a matrix as you can
see there are two lists within it our first for loop will iterate over the
elements of x so the sequence is x so when i say that i iterates over the
elements of x i basically takes the value of the first list in the first
iteration and the second list in the second iteration now our second for loop
which is nested within our first for loop will iterate over the values within
that list so for j in i let me explain this once again so if i points to your
first list j will be used to iterate over every element in your first list and in
your second iteration i will point towards your second list and j will iterate
over every element in your second list we'll just print out the elements and
let's run this just increase the size of my console here run it and as you can
see the elements of your matrix are printed 1 2 3 abc now what if you want 1 2 3
to be printed in one line and your next list elements which are abc to be printed
in the next line so every time you print j you do not want a next line so put end
equal to quotes what this does is that it removes the new line which is
automatically put by the print function in python and once you exit the inner for
loop you're basically moving to the second list so now you want a change in line
and hence i just put a print statement here let's run this and as you see one two
three abc now nested loops can get a little confusing with the flow for this
purpose we'll also debug this code place your breakpoint on the first line
and debug f8 and as you see x holds the two lists one two three a b c now
f8 again and our first for loop is executed and i now holds the value one two
three as you can see here we now have moved into the second for loop and j holds
the value 1 so i right now points to our first list in x and j points to the first
element in our first list in x now once we execute the print statement your
first element gets printed and once the first element is printed you go back to
your inner for loop and now 2 gets printed once again you go back to your inner
for loop three is printed and again you go back to your inner for loop now you
have reached the end of the ith list the next time you click the f8 you're
basically jumping over to the print statement at the end which shifts to a new
line and now we are back to our outer for loop so the outer for loop's
counter which is i now points to the second list in x which is abc so
we click f8 again and j has taken the value of the first element in your
second list which is a once again f8 and as you can see a is printed in the
console below we are back to our inner for loop press f8 again b gets
printed again we are back to our inner for loop and c gets printed now we have
reached the end of our second list too so the next time we press f8 the control
skips to the last print statement which is a new line statement and it goes back
to the outer for loop now here too the counter i has reached the end of our
matrix x so once again when we hit the f8 our program terminates
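The program that was just stepped through can be sketched as follows. Wrapping it in a function named print_matrix is an assumption made so the loops can be reused; the demo runs the two loops at the top level of the file.

```python
def print_matrix(x):
    """Print each inner list of x on its own line."""
    for i in x:               # outer loop: i is one inner list
        for j in i:           # inner loop: j is one element of that list
            print(j, end="")  # end="" suppresses the automatic newline
        print()               # inner loop finished: move to the next line


x = [[1, 2, 3], ["a", "b", "c"]]
print_matrix(x)
# 123
# abc
```

The plain print() sits at the indentation of the inner for, so it runs once per inner list rather than once per element.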
loops have a certain flow and you might come across certain scenarios where you
want to break out of this flow so this is where the loop control statement comes
in we look at the two most popular loop control statements used in python the
first is the break keyword suppose you're executing a loop for say 10 times but
if a certain condition occurs between our first and our tenth iteration we want
to immediately exit out of the loop in this case you can use the break statement
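The demo narrated next can be sketched roughly as follows. The function name print_first_sentence and the returned string are assumptions added for illustration, and the exact text of the example string is reconstructed from the narration.

```python
def print_first_sentence(x):
    """Print the characters of x up to, but not including, the first full stop."""
    printed = ""
    for i in x:
        if i == ".":
            break             # leave the loop at the first full stop
        print(i, end="")      # end="" keeps the characters on one line
        printed += i
    print()                   # finish the line once the loop is over
    return printed


print_first_sentence("hey there. how are you")  # prints: hey there
```

Everything after the full stop is never visited, because break ends the loop itself and not just the current iteration.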
so i'll just move back to my pycharm and show you a simple demo on how break can
be used so the first thing we do is we'll create a string i'll store
this in my variable x now to iterate through the string of course we'll use the
for loop for i in x and colon now what we want our program to do is that it
should print only the first sentence of our string so it should print just hey
there and as we can see in our string here hey there is followed by a full stop
so we can say that whenever we encounter the full stop we'll break out of the
loop and print no more so if i is equal to full stop will break out of the loop
and for every other case we'll simply print i which is the character in our
string so let's run this code and as you can see here all the characters of hey
there are printed and once i is equal to full stop the if statement results in
true and we encounter the break statement which breaks the loop so as you can see
here every character of our string is printed on a new line which is not what we
want so in our print statement put end equal to quotes and this will ensure that
you're not going to a new line every time let's run break again and yeah hey
there appears in the single line so once again i'll repeat how the flow of this
program works goes to the for loop i iterates over every character in the string
hey there how are you and every time i is checked against a full stop so if i is
not equal to the full stop we come to this print statement which is a
part of our for loop and only the character is printed we go back to our
for loop every time and i moves ahead so it keeps pointing to the next
character and once i is equal to the full stop we encounter the break
statement which takes the control outside the loop and our program terminates
successfully now we'll have a look at our second loop control statement which is
continue so in some cases when a certain condition occurs within a loop you do
not want to break out of the loop but you want to skip that particular iteration
of the loop so in this case you can use continue once again we'll move back to
pycharm so i can show you a short demo on how continue works now for our demo for
the continue keyword we want to print all the numbers in a list which are less
than 10. so to iterate through a list we have for i in and our list which is 1 13
56 4 6. now let's see how we write this program without using the continue
statement first so within the for loop if i is greater than 10 which is
basically the number we did not want to print we have no action to give under if
because there's nothing we want to do if the number is greater than 10 and under
else which is if the number is not greater than 10 we'd want to print it now
this seems simple but the problem here is that you cannot leave if without any
action this would result in an error as you can see here there's an indent
expected and this is where continue comes in so if i is greater than 10 you
continue so basically you skip all statements after this continue statement in
the for loop and when you insert your continue statement the control
automatically goes back to the for loop so in that case you do not even require
your else statement here and you can just have print under your for loop
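A minimal sketch of the finished version with continue. The function name print_small_numbers and the returned list are assumptions added for illustration. Note that the check is i > 10, so a value of exactly 10 would also be printed.

```python
def print_small_numbers(numbers):
    """Print every number that is not greater than 10 and return them."""
    kept = []
    for i in numbers:
        if i > 10:
            continue   # skip the rest of this iteration, go to the next i
        print(i)       # reached only when i is 10 or less
        kept.append(i)
    return kept


print_small_numbers([1, 13, 56, 4, 6])  # prints 1, 4 and 6
```

Unlike break, continue does not end the loop: 13 and 56 are skipped, and the loop still goes on to 4 and 6.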
again i'll explain how this works i iterates over our list so i points to our
elements 1 13 56 4 and 6 in each iteration respectively and every time we check
if i is greater than 10 if i is greater than 10 we do not want to print i we just
say continue so the control goes back to the for loop and if i is not greater
than 10 that is if i is 10 or any smaller number we simply come here and the
number is printed so i'll save this and let's run this code so as you can see
here one gets printed 4 and 6 is printed we can debug this too just to see how
the flow works so place your breakpoint here and debug continue as you can see
the first time i points to 1 now i is checked if it's greater than 10 of course i
is not greater than 10 so the control skips to the print statement and if you go
back to the console press f8 you can see one gets printed now the control goes
back to the for loop and this time i is 13 is 13 greater than 10 yes 13 is
greater than 10 so the if condition is satisfied and we reach the continue
statement so once the continue is executed as you can see here the control
transfers back to for loop and i moves on to the next iteration and i currently
holds a value of 56 so this way all your numbers which are less than 10 get
printed one by one so in our previous video we went through what loops in python
are and we briefly went through for loops while loop and nested loop today we'll
focus solely on for loop so let's start from scratch and look at what exactly for
loop is so for loop is used to iterate over a sequence now the sequence could be
a list tuple array string or even a range of numbers so if you have multiple
elements stored consecutively for loop is exactly what you need to access each of
those elements now here is the syntax of the for loop so first you have the keyword
for followed by counter counter is just a variable which keeps track of the
position of the element that you're accessing right now and then you have the
keyword in followed by sequence now sequence is basically the name of your list
your tuple array string or even a range of integers and then it's followed by
a colon and in the very next line you can put down your statements which come
within the for loop so to better understand this we have a series of demos which
i'll be showing on pycharm so let's move on to that so our first demo is a
classic example of where you use for loop that is to access the element of a list
so first we'll create our list i'll name my list x and we'll have the elements 1
4.2 and the string simplilearn now this particular list of ours has just three
elements so we could print the elements of our list
specifying the indexes individually so say x of 0 comma x of 1 comma x of 2 and
if we run this code save this and run it as you see here all the elements are
printed correct but what if our list had hundred elements this would be quite a
tedious approach then wouldn't it so here is where you use the for loop with for
you can have a counter variable which is i in my case in and this is the place
where you enter your list name so x colon and i'll just say print of i so what
we are doing here is we have this variable i which is a counter and will iterate
over every element of x so the first iteration i will hold the value 1 the second
iteration i will hold the value 4.2 and the third it will hold the value
simplilearn so let's run this code and there you go all our elements are printed so
this is a much simpler way to print the elements of your list now for can also be
used to print every character in your string separately so if x is just equal to
simplilearn and we run this code once again you see all the characters of your
string are printed now if you want all of them to be printed together that is in
one line in your print statement just put end equal to quotes and then run this
code and your entire string is printed here but character by character so we can
debug this to see how it's printed just place a breakpoint on your first line
and debug sample and i'll enlarge my console press f8 to move from one line to
the other and our first line is executed so x holds the value simplilearn and now
we are on our for loop line the first time we execute it i takes the value of the
very first character in our string x which is s so now when we say print of i s
gets printed go back to the console we can see this here and next time we press f8
now i will take the value i from x and i gets printed in this manner all the
characters of your string are printed one by one and then your program terminates
so that's the very basic example of how you can use for loop now let's try
another case where we want to print all the even numbers in the range of 0 to 20.
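A minimal sketch of this task, covering the two approaches the narration goes on to describe: a step of 2 in range, and an if test on every value. The function names are assumptions, and the running total is called total here instead of sum so that Python's built-in sum function is not shadowed.

```python
def evens_by_step(limit):
    """Collect the even numbers from 0 to limit using a step of 2."""
    result = []
    # The end value of range is excluded, so limit + 1 is needed
    # for limit itself to be included.
    for i in range(0, limit + 1, 2):
        result.append(i)
    return result


def sum_of_evens(limit):
    """Add up the even numbers from 0 to limit using an if test."""
    total = 0                      # must exist before the loop starts
    for i in range(0, limit + 1):
        if i % 2 == 0:             # remainder 0 means i is even
            total = total + i
    return total


print(evens_by_step(20))
print(sum_of_evens(20))  # 110
```

Both functions visit the same even numbers; the step version simply never generates the odd ones.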
so once again we have for i which is our counter variable in now when you want to
specify a range of values you use the function range and give your starting value
and your ending value so the thing with range is guys that if you keep your
ending value as 20 it will only consider the numbers from 0 to 19 but we want 20
also to be included so our ending value is 21 here put your colon and let's
just see what this prints first so print of i and we'll run this code so here the
for loop printed all the values from 0 up till 20. now we want only the even
values between 0 to 20 to be printed one way of doing this is that you can check
using an if statement if every value that is stored in i is a multiple of 2 or
you can simply modify this range function slightly and put comma 2 as the step so here what
this means is that only every second value is printed so zero will be printed
one would be skipped and then two would be printed let's see that run it so as
you see here in just two lines of code we got all our even numbers between 0 to
20 printed now say we want to find the sum of all these even numbers but this
time because we want to use a slightly different approach we'll use the if
statement to know if the value in i is even or not so the first line would be
about the same except you delete the two from the range and then you have if
statement so if i mod 2 is equal to equal to 0 you will have a variable say sum
which is equal to sum plus i so what we're doing here is i will hold a value
between 0 and 20 and then this value is tested to check if it is a multiple of 2
that is if division by 2 gives any remainder
or not in the if statement and if it does not give any remainder that is if the
remainder is 0 then we come in and we keep adding this i to another variable
which in our case is sum so sum is the addition of all the even numbers now we
have not declared sum as yet and sum needs to have some initial value so just
outside your for loop put sum equal to 0 so here we have added all our even
numbers now once this entire procedure is done we can print the variable sum let's run
this code and the sum of all the even numbers between 0 to 20 is 110 now let's
move on to our next demo where we'll be printing patterns so patterns are a great
way to implement for loops and also to sharpen your programming skills in a
previous video we had a pattern with asterisk symbols we printed an inverted
triangle using asterisks so here this is the pattern we'll be printing with numbers
so for this particular program we'll be taking input from the user specifying a
number so in this example the number entered by the user would be 5 and as you
can see the number given by the user is the last digit of the last row also it is
the number of rows in the pattern so let's begin coding this first we'll take the
input from the user in a variable say n so all the inputs given by the user are
always in string format in case of python we need to convert this into int so
input enter a number now we'll have the first for loop which is for every row so
for i in range of 1 to n plus 1 because if we give n then the range will be
taken only from 1 to n minus 1. now that's our outer for loop and in case of
patterns as you go through others you'll notice that there are at least two for
loops even for the most simple patterns so here is where we implement nested
loops specifically nested for loops now the outer for loop as i mentioned earlier
is for the number of rows and we'll have the inner for loop for every element in
the row for j in range of again 1 up till i so when we consider our outer for
loop it goes from 1 then 2 3 4 and then 5 and the inner for loop prints from 1 up
till the ith number now inside our inner for loop we'll just print j and we do
not want to go to the next line immediately after printing j so put end equal to
quotes but once we complete printing the entire row then we want to move on to
the next line so after our inner for loop that is under our outer for loop we'll
print the new line so just put a plain print statement here now let's run this
code enter a number we start with five and as you can see here one one two one
two three one two three four has been printed but the fifth row is not printed
yet so if you go back to your code you'll see where the error is it's in the
second for loop we have run it from 1 to i and not i plus 1. so what happened
here is that since we ran it till i when i was 1 the inner for loop ran zero
times and therefore it's the first row that was not printed and the second row
printed just one our third row printed just one two and so on so we'll make this
correction and run it again enter 5 and there you go the pattern is printed
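The corrected program can be sketched roughly as follows. The function name number_triangle is an assumption, and the rows are collected into a list of strings before printing so the result is easy to inspect; the demo prints each j directly with an empty end and a plain print after the inner loop.

```python
def number_triangle(n):
    """Build the rows 1, 12, 123, ... up to the row that ends in n."""
    rows = []
    for i in range(1, n + 1):          # one iteration per row
        row = ""
        # range(1, i) would stop at i - 1, which was the bug in the
        # first attempt; i + 1 makes the row end at i itself.
        for j in range(1, i + 1):
            row += str(j)
        rows.append(row)
    return rows


for line in number_triangle(5):
    print(line)
```

With n equal to 5 the last row is 12345, so the number entered is both the number of rows and the last digit of the last row.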
run it again and try a different number this time say 10 so a pattern is printed
for any number that you enter now another very popular application of nested for
loops is accessing the elements of a matrix so here we implement a matrix as a list
containing lists and what we'll do is we'll take two such lists or two such
matrices from the user and we'll find their sum so to add two matrices it's very
important that they have the same dimensions that is they have the same number of
rows and columns so we'll take the number of rows from the user first in my variable r
and convert this to int next take the number of columns in my variable c now
we'll create our list our first list which is x right now x is empty okay so our
first for loop will be to iterate over the elements in our list x so for i in range of r
because r is the number of elements in x and c is the number of elements in the
lists in x so for i in range of r and now our inner for loop so our outer for loop is for
the lists in x that is for the elements in x and our inner for loop will be for the
elements within those lists in x so for j in range of c so the approach we'll be taking
is that we'll first create those individual lists within x and then we'll add
that list to x so let's name our inside list as val and to add elements to a
list use the function insert now the element would be inserted in the jth
position and the value for the element would be again taken from the user so
input enter the i comma j element so we'll put in the placeholders percent d
percent d for element i comma j so we haven't declared val yet let's do that it's
initially an empty list so all the values within our list val would be inserted
within the inner for loop so once we are out of the inner for loop our list val
would be ready so now we can add this list val to our list x so outside here put
x dot insert at position i we insert val i'll explain this part once again so
our outer for loop for i in r where r is the number of elements in x that is the
number of lists within our parent list x would be counted in i and our inner for
loop which is for the elements within our child list would be counted in j and
inside our inner for loop we'll take the input from the user and insert it into a
temporary list val and then once we exited the inner for loop we'll add this list
val to our parent list x and we are done with taking the input for the first list
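The input step described above can be sketched roughly like this. It is only a sketch: the helper name read_matrix and its read_value parameter are made up for illustration so the example can run without a keyboard, and it loops over range(rows) because an integer on its own cannot be iterated.

```python
def read_matrix(rows, cols, read_value=input):
    """Build a rows x cols list of lists, asking for one element at a time."""
    matrix = []
    for i in range(rows):        # outer loop: one pass per child list (row)
        val = []                 # fresh temporary list for this row
        for j in range(cols):    # inner loop: one pass per element in the row
            val.insert(j, read_value("enter element %d,%d: " % (i, j)))
        matrix.insert(i, val)    # add the finished row to the parent list
    return matrix


# Non-interactive usage: feed the answers from a list instead of typing them.
answers = iter(["1", "2", "3", "4"])
x = read_matrix(2, 2, read_value=lambda prompt: next(answers))
print(x)  # [['1', '2'], ['3', '4']]
```

Note that the stored elements are still text at this point, because that is what input hands back.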
now in the similar manner we take the input for the second list too we'll name our
second list y which is initially empty and you can just copy paste this code
here change the x to y and that should do it now one thing we missed out here is
clearing val and we do have to do this because insert does not overwrite the
value already at the jth position of val it shifts it to the right so the values
from the earlier rows would pile up in val so we'll reset val to a new empty list
every time we exit the inner for loop okay so we are done with taking the input for
both our list containing lists and now we move on to the part where we find their
sum so first we create a variable sum which will be an empty list now this list
will hold the added values from the other two lists so again for accessing the
elements of a list containing lists we need nested for loops so for i in r once
again and for j in c so just like when we took input when we are finding the
sum too we will first add the elements of the child lists so say find the sum of the
first list within x and the first list within y and once we find the sum of
these two lists we'll put that into our parent list sum so we'll use val again
for a temporary list and val dot insert at the j position the sum of the
elements in the j position of our ith list in x and y so x of i j plus y of i
j and once we come out of the inner for loop that means our val list is complete
we can add this list to our parent list sum so sum dot insert at the i position
insert val and every time we exit the inner for loop we also clear val and
that's how you take two lists of lists from the user find their sum and finally
print it so just print sum let's run this code enter the number of rows let's
start with 2 number of columns 2 okay so we have an error here for i in r so the
error here is r is an integer and when we want to iterate over a range of numbers
we need to specify the function range so range of r and the indices in lists or
matrices start from 0 so 0 up to r in the similar way let's make this correction
for all our for loops 0 to c and here too and here too okay let's run the
program once again and now we've got to enter our elements so our first element
second third fourth element so with that we completed all the elements of our
first list the zero zero element is the first element in your first list of x zero
one is the second element in your first list one zero is the first element
in your second list and one one would be the second element in your second list
all in x now we start entering the elements for y the first element of the first
list in y so we entered all the elements for both x and y and this is our list sum
but then this is not the sum of the elements what has happened here is that it's
just concatenated the elements of the two lists so our first element in list x
that is one and our first element in list y which is 2 is just concatenated and
it's given 12 whereas we want 3 the sum of them this is because we took the input
from the user but we forgot to convert it to int that means all the elements were
stored in the list as strings and when we use just the plus sign on strings it
just concatenates the values so now we'll convert each of the inputs to integer
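A small sketch of the difference, and of the addition loop once the values are integers. The function name add_matrices and the variable total are illustrative choices (total is used instead of sum so the built-in sum is not shadowed); the sample values are the ones from the second run in this demo.

```python
# input() always returns text, so "+" joins the strings instead of adding them.
print("1" + "2")            # 12
print(int("1") + int("2"))  # 3


def add_matrices(x, y):
    """Add two matrices of the same dimensions, element by element."""
    total = []
    for i in range(len(x)):              # one pass per row
        val = []                         # new temporary row on every pass
        for j in range(len(x[i])):       # one pass per column
            val.insert(j, x[i][j] + y[i][j])
        total.insert(i, val)
    return total


# Convert the typed text to int before storing it.
x = [[int(v) for v in row] for row in [["1", "2"], ["3", "4"]]]
y = [[5, 6], [7, 8]]
print(add_matrices(x, y))  # [[6, 8], [10, 12]]
```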
and there you go our inputs are now converted to integer now we can run the
program once again enter the elements and this time our sum is printed right so
1 plus 5 is 6 2 plus 6 is 8 3 plus 7 which is 10 and 4 plus 8 is 12

while loops in python so previously we briefly went through while for and nested
loops we
saw that for loops are used mainly to iterate over a sequence while loops are
basically used to repeat a section of code an unknown number of times so if you
have a certain code that needs to be repeated say n number of times where n will
be decided during the runtime you can definitely opt for a while loop and every
time you move into a while loop a specific condition is checked the minute that
condition results in false that's when you exit out of the while loop so here's
the syntax for a while loop you have the while keyword followed by the expression
or the condition this is what needs to result in true to enter the while loop
and the expression is followed by a colon and in the very next line you can put
down the various statements that go into your while loop so now that we saw what
a while loop is typically used for and the basic syntax of a while loop let's
move into a few demos and explore the various applications of while loop so i'll
be performing my demos on pycharm now my first demo is a typical example of any
loop statement suppose i want to print the string simply learn 10 times i
could do this using the print statement alone and have the statement copy pasted
10 times that could work but with loops we have a much simpler way of doing
this since we are talking about while loops that's what we'll use so say while
and i want to print my string 10 times so i is my counter and i say i less than
equal to 10 colon and in here i put my print statement just once now i is the
counter as i mentioned previously so this counter needs to be initialized first
we need to give it a starting value so i's starting value will be 1 and i will
move from 1 to 10 that means every time you're inside the loop every time simply
learn is printed i needs to be incremented by one so i equal to i plus one now
a shorthand of doing this is you can have i plus equal to 1 this is exactly the
same as i equal to i plus 1 just shorter and simpler so i'll explain this once
again before we run it we have i equal to 1 where i is something that we're using
as a counter and then we have a while loop so in our while loop we are checking if i
is less than equal to 10 and since the first time i is equal to 1 we will enter
the while loop simply learn will be printed and in the very next line we are
incrementing i so every time simply learn is printed i gets incremented we go
back to the while loop and it's checked against this condition whether it's less
than equal to 10. so the minute i is equal to 10 you'll have simply learn printed
one last time the next time we go back to the while loop i will be 11 and the
while condition would result in false so we exit out of the loop and since we
have no more statement here our program terminates so let's run the code and
check if everything works as we expect it to so here we go simply learn is
printed 10 times now can you only increment your counter no we can also decrement
our counter so if i say i is equal to 10 that is i start my counter from 10 and i
want to decrease the counter till 1 so we'll have the while loop executed as long
as i is greater than one and now that we're decrementing i every time we need to
change the plus sign to a minus so it's the exact same thing that we are doing
now as we did previously we are printing simply learn the string 10 times but
this time our counter starts from 10 and we go down to one so every time we check
if i is greater than 1 and the minute i is not greater than one that is i is
equal to one we do not print simply learn anymore we just exit out of the while
loop so let's run this code too so here the string simply learn is printed only nine
times whereas we want it to be printed ten times this is because in our while
condition statement we put greater than one what we need is actually greater than
equal to one so that even when i is equal to one we are printing simply learn
once so now if we run it again we'll see that we'll have simply learn printed 10
times so for our next demo instead of printing something 10 times let's find the
sum of the first 10 natural numbers so in that case i would start from 1 and
then we go all the way up till 10 so i is less than equal to 10 and within your
while loop we can remove this print statement and now every time i needs to be
added to some other variable so we'll call that variable sum and sum equal to
sum plus i so i starts from 1 and every time you go inside the while loop you
have to check if i is less than equal to 10 so as long as i is less than or equal to 10 you
move into the while loop and in the while loop you keep adding this i to another
variable sum and after it's added you increment the value of i so that i goes all
the way from 1 up till 10 and once i is 11 the while condition results in false
and you break out of the while loop moving to the next statement after the while
loop which in our case would be the print statement to print the final sum so
let's run this code so can you guess what the error is in this case our variable
sum here has not been declared or initialized previously so outside our while
loop let's have a variable sum and sum's initial value should be 0 let's run the
code now and there you go so the sum of your first 10 natural numbers would be
55. now let's modify this code slightly so we can find the sum of the even
numbers between 1 to 10. so our first few lines would remain the same i will
start from 1 sum will be equal to 0 our while condition too will be the same but this
time inside our while loop we'll have an if statement and the if statement will
check if i is an even number so to check if a number is even the remainder from
the division of the number with 2 should be equal to 0 so i mod 2 equal to equal
to 0 and if that is the case we can find the sum of the number and whether
that's the case or not we still need to increment i every time within the while
loop so sum would come under the if statement but i plus equal to 1 would remain
under the while loop and not within the if statement so i'll repeat this once
again i which is our counter variable will start from 1 sum will be equal to 0
every time our while loop will check if i is less than equal to 10 and every time i
is less than or equal to 10 we can enter the while loop now within the while loop we need to
check if i is an even number because we want to find the sum of only the even
numbers so if i mod 2 is 0 that means i is an even number and we can add i to our
variable sum now whether i is an even number or an odd number every time we still
need to increment i so incrementing statement would be outside the if statement
and under the while loop and finally once we exit the while loop we'll print our
sum we run the code and the sum of all the even numbers between 1 to 10 is 30.
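Both sums from these two demos can be sketched in one loop. The names total and even_total are assumptions for this example; the demo itself calls its variable sum, which works but hides Python's built-in sum function.

```python
i = 1            # counter, starts at 1
total = 0        # running sum of every number from 1 to 10
even_total = 0   # running sum of only the even numbers

while i <= 10:
    total += i
    if i % 2 == 0:       # remainder 0 after dividing by 2 means i is even
        even_total += i
    i += 1               # outside the if: runs on every pass, even or odd

print(total)       # 55
print(even_total)  # 30
```

If the increment were indented under the if statement, i would stay at 1 forever and the loop would never end.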
now let's explore while loop for the very purpose it was made that is to run a
certain code unknown number of times so for this demo we'll work on reversing an
integer so the first thing we do is we need to take the input from the user so
the user will give us the integer and i'll store my input in the variable n we
also need to convert the input from the user into integer because it'll be
provided in the string format so we have the int function within which we'll have
our input function now we'll declare another variable nr which is basically our
reverse n and this will be initialized to 0. now suppose n is equal to 5 6 7 8 n
r should be equal to 8 7 6 5. so if you look at the values of n and nr you'll
notice that what we basically need to do is we need to remove the last digit from
n and we need to add it to the front of nr so to remove the last digit from a
number we can use the modulus operator that is n mod 10 and we need to keep
doing this every time with n so we'll keep doing this until n mod 10 is equal to
zero that is until there are no more digits left in n so this will be the
condition for a while loop so while n mod 10 is not equal to 0 we can continue
with our statements now within the while loop let's take a variable c where we'll
extract this digit that is n mod 10 would be stored in c now we need to add c to
the beginning of nr but now nr is an integer and so is c so if we just say nr
equal to nr plus c what this will do is it will add the two integers and will
give you the sum so in our case if 8 is put into nr the first time and 7 is put
into nr the second time when we come to the statement where nr is equal to nr
plus c what it will potentially do is give you the sum of 7 and 8 which is not
what we want we want 87 to be stored in nr we want the digits 8 followed by the
digit 7 to be stored in nr but now if we multiply nr with 10 and then add c to it
our problem is solved so if the first time we are putting 8 into nr since nr is 0
it would be 0 into 10 plus 8 and nr would be 8. now the second time when we
extract 7 from n and nr is already 8 and we come to this statement here it says 8
into 10 which is 80 plus 7 which will give you 87. so this way we can slowly
build up nr now a small piece of the puzzle that we are missing is that the value
of n remains constant throughout the while loop so far so what happens if the
value of n is constant is that every time you do n mod 10 you're extracting the
digit 8. you're not moving forward to 7. so what we can do here is that once 8 is
extracted we completely remove the digit 8 from n so that can be done by dividing
n by 10. so now n equal to n by 10. but now in python again if we say n slash
10 it puts a decimal point just before 8 so 8 is still there but on the other
hand if we say n double slash 10 our result would be an integer and only 5 6 7 would be
returned to n now since n has reduced from 5 6 7 8 to 5 6 7 the next time you go
back to the while loop and do n mod 10 it's 7 that is extracted and this way we
can extract every digit from n and keep adding it to nr in the reverse order now
that we are done with our while loop outside our while loop we'll print the value
of nr let's run this code enter your number so i entered 4652 and my output was
2 5 6 4 which is just as expected now we can debug this code too here's my
debugger so press f8 to move from one line to the next our first line is n where you
need to put the input so go back to your console put your input i'll put a
small number since we're just trying to figure out how it works so 467 should do
right now go back to your debugger so as you can see the value of n is 4 6 7
press f8 again now the value of nr is 0. we have reached a while loop statement
and n mod 10 is not equal to 0 that is n mod 10 is basically 7 right now so we move
into the while loop and the number at the end of n is extracted and put into c so
c as you can see here is 7. now we have reached the part where we add this number
to nr so nr is now 7 because nr was previously just 0 0 into 10 is 0 and plus c
which is 7 so nr is 7 press f8 again and this time as you can see down here the
value of n has been changed to 46 because we divided it by 10. so right now c is
7 from a previous iteration n is 46 and nr is 7. now once again we go back to our
while loop the while condition is checked and n mod 10 is still not equal to 0.
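The same trace can be reproduced without a debugger by recording the variables after every pass. This is a sketch, and reverse_with_trace is a made-up helper name. One deliberate difference from the demo: it loops while n is greater than 0 instead of while n mod 10 is not 0, because the second condition stops as soon as the last remaining digit is a zero (405 would come out as 5).

```python
def reverse_with_trace(n):
    """Reverse the digits of a positive integer, recording each step."""
    nr = 0
    steps = []
    while n > 0:
        c = n % 10          # extract the last digit
        nr = nr * 10 + c    # shift nr one place left, then append the digit
        n = n // 10         # floor division drops the last digit from n
        steps.append((c, nr, n))
    return nr, steps


nr, steps = reverse_with_trace(467)
for c, nr_so_far, n_left in steps:
    print("c =", c, "nr =", nr_so_far, "n =", n_left)
print(nr)  # 764
```

Each printed row corresponds to one trip through the while loop in the debugger.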
so we go into the while loop now c is equal to 6 and nr is now equal to 76
because it was previously equal to 7 into 10 plus 6 and once again the value of n
is changed now the value of n is just 4. so n mod 10 is still not 0 go into the
while loop c is extracted c is 4 n r is calculated which is now 7 6 4 and n is
changed now n is 0. so now 0 mod 10 is 0 and the while condition results in false
so we exit out of the while loop and we reach the print statement and now our
nr should be printed in the console as you can see here it is now let's write
another program to calculate the length of a list without using the len function
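Here is a minimal sketch of one way that can look, ahead of the step-by-step build-up. The helper name list_length is made up for illustration. It keeps indexing until Python raises IndexError, and it touches x[i] inside the loop body rather than in the while condition, so an element such as 0 or an empty string does not end the count early.

```python
def list_length(x):
    """Count the elements of a list without calling len()."""
    length = 0
    i = 0
    try:
        while True:
            x[i]           # raises IndexError once i runs past the last element
            length += 1
            i += 1
    except IndexError:
        pass               # reaching the end is expected, so just stop counting
    return length


print(list_length([1, 2.3, "simply learn"]))  # 3
```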
so the first thing we'll do there is we'll create our list so my list name is x
and it has three elements 1 2.3 and the string simply learn now let's initialize
our variable length which will hold the length of our list and we'll be using the
while loop to do this we'll have a counter i which is equal to zero now we'll run
our while loop as long as x of i has anything within it so we can just say while x of i
that means if x of i holds something we'll be able to enter this while loop
now within this while loop we'll increment a length variable that is every time
we are entering the while loop it means that x of i is holding some value so our
length is incremented by one so you can say length equal to length plus one or
as we saw earlier you can just go with length plus equal to one now x of i holds
a value we incremented the value of length now we need to move on to the next
element whether it exists or not so we increment the value of i so i plus equal
to 1 and that's pretty much it so the minute we reach the end of our list that
is x of i does not hold any value while should result in false and we'll exit out
of our while loop so then we'll print the value within length so let's run this
program but as you can see here there's an error at while x of i index error list
index out of range this means that the first time we're going through the while
loop x of i holds some value the second time too x of i is
holding some value and so it is the third time but the next time when i is equal
to 3 that is we are trying to access the fourth element of our list x there is no
value and this is resulting in the index error so this is where we use the try
block the block that produces an error is put within the try block and the try
block is written like this with the keyword try followed by colon and everything
within it must be spaced or indented slightly to the right so while loop is the
one producing the error so the while loop comes within the try block now what
happens is once the error has occurred try will catch this error and we will
write another block which will handle this error so this is our except block and
this is how it's written you have your keyword except followed by the error that
this block will be handling so in our case it's the name of our error index error
so put that down here colon again and everything within the except block is again
indented slightly to the right and there you go so our error which was occurring
previously that is the index error will be caught within the try block and our
except block will handle this error let's now run our code and as you see here
this time our error is handled perfectly and it's just ignored and the value of
length gets printed which is three so this is one way you can find the length of
a list using while loop and not your len function now a while loop can be nested
within another while loop so for our next demo we'll have a pattern printed using
nested while loops so this is the pattern we're attempting to print here we'll
take a value n from the user which will be an integer and that number n would
basically be the number of rows in your pattern so in this particular example
here n is five and we start by printing one so the digit one is printed once and
then in the next line the digit two needs to be printed twice the digit 3 3 times
digit 4 4 times and then finally the digit 5 five times so let's begin coding
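As a reference point before coding it step by step, the nested while loops for this pattern can be sketched as follows. pattern_rows is a hypothetical helper that collects each row as a string instead of printing with end set to an empty string, which makes the result easy to check.

```python
def pattern_rows(n):
    """Return the rows of the pattern: digit i repeated i times, for i = 1..n."""
    rows = []
    i = 1
    while i <= n:             # outer loop: one pass per row
        j = 1
        row = ""
        while j <= i:         # inner loop: add the digit i exactly i times
            row += str(i)
            j += 1
        rows.append(row)
        i += 1                # move on to the next row
    return rows


for row in pattern_rows(5):
    print(row)
```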
first thing we take the input from the user store it in our variable n now we'll
have one while loop which will be to take care of the number of rows so our
number of rows should be till n so i is less than equal to n and i of course will
start from 1 so i will go from 1 up till n okay so the outer while loop takes care
of printing each row now within each row as you can see elements are printed so
if we are on our first row then the number 1 is printed just once if we are on
our second row the number 2 is printed twice and so on so we'll have another
counter variable within our outer while loop which is j and j will be
initialized to 1. now j again has to run all the way from 1 up till i because i
number of times an element is printed on the ith row now we'll have a second or
a nested while loop here and the condition for this is j is less than equal to i
because j will go from 1 up till i and as long as j is less than or equal to i
we enter the second while loop now within the second while loop the only element
that is printed in every row is element i so if you're on the first row one is
printed second row two is printed third row three is printed and so on so what we
need to print here is i and every time we print i we need to make sure that we
are still on the same line so end equal to quotes and increment j of course now
once we exit our inner while loop we increment the value of i so that we can move
on to the next row so now that we have completed with one row we can then move
on to the next line to print our next row so here we put in an empty print
statement which will shift to the next line and that's pretty much it we can now
run our program so enter a number let's say 5 and as you can see our pattern is
printed here let's try out another number say seven so our pattern can be printed
for any integer now let's get on to our final demo so here we'll try out a simple
game so our program will generate a four digit number using the random function and
our user needs to guess this number so for every digit that the user guesses at
the correct position we print that one place is correctly guessed so suppose say
a program generates the number seven six three two and a user enters eight nine
three one if you compare these two numbers you can see that the second last digit
here is both three so three has been entered in the correct position therefore we
can output one digit entered at the correct position and in this manner the
user needs to guess all the digits in their correct order if the user wants to
quit this game at any point of time he or she just needs to enter 10. so let's
begin coding first as i already mentioned we'll be generating a four digit number
using the random function so let's import the random library let's now generate a
random number so i'll store this random number in num p where the p is for
permanent as we'll have another variable which stores the same number but the
value in that particular variable will be altered the num p variable on the other
hand will remain constant so this is a random number generated and we want a
four digit number so our boundaries will be set for that now let's take the input from
the user so i'll store the input in a variable n okay so now let's begin a while
loop and this while loop will run as long as n is not equal to 10 because as i
already mentioned if the user enters 10 basically the user wants to quit the game
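The heart of this game is comparing two numbers digit by digit. A rough sketch of that comparison, using the 7632 and 8931 example from above, could look like this. The function name count_correct_positions is made up, and the bounds 1000 and 9999 passed to random.randint are an assumption about how the four digit range is set. The loop runs while the secret is greater than 0, since a test on secret mod 10 alone would stop at once for a secret ending in 0.

```python
import random


def count_correct_positions(secret, guess):
    """Count the digits of guess that match secret in the same position."""
    cor = 0
    while secret > 0:
        if secret % 10 == guess % 10:   # compare the last digits
            cor += 1
        secret //= 10                   # drop the last digit from both numbers
        guess //= 10
    return cor


num_p = random.randint(1000, 9999)      # assumed bounds for a four digit number
print(count_correct_positions(7632, 8931))  # 1
print(count_correct_positions(3345, 3545))  # 3
print(count_correct_positions(3345, 3345))  # 4
```

Working on copies inside the function leaves num_p untouched, which is the same reason the demo keeps a separate permanent variable.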
now in here the first thing we need to do is we'll create another variable num
and into this variable we'll put the value of num p because num is going to be
altered throughout this program and yet we want to store the actual value
somewhere so we'll alter num and num p where p stands for permanent will remain
constant now we'll have another variable cor short for correct which will
specify how many digits have been guessed in the correct places so initially the
value of correct will be 0. now to check if a digit is in its correct place so
we need to extract each digit from both n which is the number entered by the user
and num p which is the number generated randomly by our program so to extract the
digits as we did in our previous example we'll use the mod operator so we'll run
the while loop as long as num mod 10 exists so you can just put while num mod 10
so what this means is that every time you do a number mod 10 it extracts the last
digit so if a number mod 10 does not have anything to return that means the number
is empty and then we can exit
out of our while loop so now let's begin extracting the numbers now we'll have a
variable num c which will store the last digit from the variable num which is
basically the number generated by our program so num mod 10 would be stored in num c
and we have another variable which will hold the digit extracted from the number
guessed by the user so i'll name that variable nc and that will be equal to n mod
10. now just like we did in one of our previous demos every time you extract a
digit you also need to remove that digit from the number itself so now num should
be equal to num double slash 10 and n should be equal to n double slash 10. okay so
now we have the extracted digits now we need to check if num c and nc are equal
so if num c is equal to equal to nc that means there's this one digit which is
held in both num c and nc which is also present at the correct position in that
case we increment cor and if this is not the case we simply continue with
our while loop now this process will continue as long as there's any digit left
in num or a variable n basically now once we exit out of the while loop we must
check if correct is equal to 4 so if correct is equal to 4 that means all the
four digits guessed by the user were right and in their correct places and that
means that the user guessed the entire number correctly so over here we have if
correct equal to equal to 4 you print congrats you guessed it right and if
this is not the case that is all the digits were not correctly guessed then we'll
print how many digits were actually correctly guessed in the correct positions so
else percent d digits were guessed right now if all the digits were guessed
right that is the user guessed the number right we can also break out of our
while loop so you put a break statement there and we'll come back here where we
print how many digits were guessed right since all of them weren't we need to
give the user another try where the user again enters a four digit number so once
again we'll just copy paste this input line here so that the user can again
guess the number and this process will continue until all the numbers are
guessed right and the control goes into this if statement and breaks out of this
while loop now once the control flows out of this while loop the program should
be terminated because we have nothing else to print but if the user does enter
10 instead of a four digit number we need to exit the program immediately that is
we need to exit this while loop the outer while loop immediately so here we'll
have an else statement that is if n is equal to 10 then we print you quit the
game and the program will end right there okay so that is done let's now run our
program enter a four-digit number so i'm trying out five six seven three okay i
made a slight mistake here percent d digits were guessed right but what is
this percent d is just a placeholder so we need to place the value of our
variable cor there let's run this program once again 6748 is my guess so zero
digits were guessed right let's try again eight two five six again zero digits
were guessed right okay so there are a number of permutations to make it simpler
for the purpose of our demo i'll just put in a print statement here which will
print the value of num p and accordingly i'll show you how the program flows let
me run the code again so our number is three three four five let's start with a
completely incorrect guess so six seven three two zero digits were guessed right
now suppose i guess my first digit is 3 and then 9 7 6 so one digit was guessed right
now since i know my number is 3 3 4 5 it's easy for me to say it was the first
digit it's pretty obvious let me try three five four five so three digits were
guessed right and three three four five and congrats you guessed it right now if i
had to quit the program zero guesses but now i want to quit the program so i
enter just 10 and you quit the game so as you can see our program is running just
fine to decrease the complexity of the game you can change it to like two digit
numbers so you can go from 10 to 99 or so you can put a smaller range too so
with that we come to an end to while loops in python

today we're going to cover
the basic array in python and some of the functionality around it and before we
jump in i'd like to please remind you that you can always post something in the
notes here on the youtube videos or you can go to www.simplylearn.com and go
under our forums and ask questions there we have a team that monitors these and
they'll be happy to answer those questions

array in python an array is a container
that holds multiple values of the same type and this is very key is that the
array has to be of the same type so the syntax for developing your basic array
is going to be your variable whatever you want to call it my array or whatever
you're working on equals array your type code and then the elements in the array
this is the main type that they have for arrays and you'll see a quick list here
character is basically your character as you know it you have a b c d e f g
they're represented by a number there's like up to 128 characters
that's how many numbers are in there so when you have a signed character that
means that instead of using it as a character notation it's being
used as just an actual value plus or minus then unsigned character you don't usually
use a lot of these as far as signed and unsigned characters but it does come up
for using them for containing a small amount of space to do something and then
you have your Py_UNICODE the Py_UNICODE is your unicode characters so if you
remember if you're in the american set of characters they use only half the
memory but they don't have all the different characters used in different
languages that's why a lot of times you'll see especially when you go
international you need to be very aware of whether unicode or just a regular
character and then there are the regular integers where they have signed short
so it has a smaller range of values unsigned short your signed integer plus or
minus again your unsigned then you have your long your signed long unsigned
long so a long is basically double that so each one of these
roughly doubles in size and how many numbers they can hold until you get all the way
to float and double so again those are just different numbers and just depends on
how many significant digits you want to keep on there that's a brief on type code
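To make the list of type codes a little more concrete, here is a small sketch that prints how many bytes each element takes for the integer and floating point codes. The exact sizes depend on your platform, so treat the numbers you see as an observation on your machine rather than a rule; the selection of codes in the string is just the ones discussed here.

```python
from array import array

# lowercase codes are signed, uppercase codes are unsigned
for code in "bBhHiIlLfd":
    empty = array(code)                 # an empty array of that type
    print(code, empty.itemsize, "bytes per element")
```

The itemsize attribute is what tells you how much room one element of the array needs.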
definitely we're not going to go into too much detail any more detail on that but
we do want to jump in and actually do an array and start showing you how the
arrays work and what you can do with them now to do this you're going to use your
favorite python editor or ide your interface i myself go through anaconda and
jupyter notebook certainly if you're using any other interface it'll work just the
same this is very basic code in python i use anaconda because it's a great
navigator for following your different modules you install into your python and
it has different packages and then jupyter notebook sits on there there's also
spyder which is another python editor there's a ton of python editors out there
but we'll go ahead and launch our jupyter notebook once i'm in the jupyter
notebook it opens up whatever folder it's set to and i can simply go under new
and i'm going to create a new python3 and that'll open up my python3 notebook so
now i can start writing code and we'll start off the bat by importing so we're
going to do from array import star so we import all the different functionality
that the array offers and then we're going to create our first array we'll just
call it arr or we can call it my array or whatever you'd like and it is simply
array and then we're going to put in the type and then we'll go ahead and just
put in brackets what we want for an array and we'll do 1 comma 2 comma 3 comma
four comma five real simple array and then let's go ahead and just print that out
and see what that looks like and we get an array with type code i and the values one two three
four five and then let's flip over and just remember what the i is lowercase is
signed and uppercase is unsigned this is a lowercase i so we have a signed
integer and the lowercase h is a signed short integer while the capital h is
the unsigned short integer and to see what that means when we're assigning this
let's go up here and change one of our values to minus one and we have it set to
i and we run it and it comes out okay no errors but what happens when we change
this to capital h which is the unsigned short integer we run this we're going to get an
error why because we gave it a minus value and if you remember correctly
capital h is an unsigned short integer so there should be no negative values in there
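Here is a minimal sketch of that experiment. In Python's array module the lowercase codes are signed and the uppercase ones are unsigned, so it is the capital 'H' (unsigned short) that rejects a negative value. The sketch imports just the array type instead of everything, and the variable names are only for illustration.

```python
from array import array

# a simple array of signed integers
arr = array('i', [1, 2, 3, 4, 5])
print(arr)  # array('i', [1, 2, 3, 4, 5])

# a signed type code accepts a negative value
signed = array('i', [-1, 2, 3, 4, 5])
print(signed)

# an unsigned type code (capital H) does not
try:
    unsigned = array('H', [-1, 2, 3, 4, 5])
except OverflowError as exc:
    error_message = str(exc)
    print("error:", error_message)
```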
and we'll go ahead and go back and just switch this to i and run it and then
there's so many things we can do with this but let's start by looking at what's
going on in the computer with arr dot buffer underscore info there we go and we do an
arr dot buffer info we're just going to print this out and we run this this is
going to show us where our buffer lives and how big it is this first number is the
actual memory address and then the second is the length it holds five elements
so you can actually see where this array sits in your computer's memory
we very rarely use something like that but it's kind of interesting that
we can look that up so easily and when we're manipulating the array we can simply
do print here's our array we'll put our square brackets in here and if we print
index two let me go ahead and run this
and you look at this we've got minus one two three and it printed three and the
reason it does that is we always count from zero so in python you
always start zero one two and index two is going to be the spot where it says
three on there so that's how we print just one element in there this next one is so
basic the for statement for i in array this is a simple iteration so what this
means is we're going to take each value in the array we're going to do something
with it and in this case we're going to go ahead and just print it out so if we
want to print out all the values you can go for i in array print i i want you
to notice the difference here is instead of it printing out that this is an array
with the values in it it comes down and just prints each individual value in here
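The three steps just covered (buffer info, looking up one position, and the simple for loop) can be sketched like this. The names address, length and values are only there for illustration, and the address printed will differ from run to run.

```python
from array import array

arr = array('i', [-1, 2, 3, 4, 5])

# buffer_info() gives back the memory address and the number of elements
address, length = arr.buffer_info()
print(address, length)

# positions count from zero, so index 2 is the third value
print(arr[2])  # 3

# a plain for loop visits every value in order
values = []
for i in arr:
    print(i)
    values.append(i)
```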
now this does all of them but let's say you are looking at this you say hey i
don't want to do all of them we're going to do for we'll do pointer instead of i
for pointer in range five and instead we're going to print and here's our
arr and we're going to look up the index pointer so this is going to print out
let's just see what this looks like here we go it printed out the same thing so
we have our minus one two three four five and the pointer goes zero one two three
four that's how this is actually read
on here so if we actually print out the pointer let's do that let's just print
out the pointer on here so you can see what's going on pointer comma arr of pointer
and let's run that and you can see that the pointer goes 0 1 2 3 4 so it does not
include the 5 and it prints each value next to it we can do something like this we can go
range four and you can see it just drops the last one off and we can even do
something like this range one comma four and it drops the first one off too so by changing
your iteration loop you can change the pointer and pull up any of these values in
your array and another really cool thing we can do we have our array we can
actually do arr dot reverse so instead of iterating and creating a new array
there's functionality built into our array that just lets us reverse all the different values in place
and then if we go ahead and print the array let me go and run that you'll see
that the array is now in reverse five four three two minus one and let's just do
a quick rehash of what we covered so far there's
another section or two still to come we imported our
array so make sure you always import it we set our value so we created arr
equals array we have a type code and then we have the values of that type and if you
print it out you can see that it's an array with type code i and minus one two three four and
five we can look up the buffer info so we can actually pull out the actual memory
location and then the number of elements that's what the five
represents and then we came in here we printed just one value out of it and
remember that it starts with zero so we have zero one two which prints our
third position there our number three we learned how to iterate through the whole
array for i in arr and we just printed the i out and then we
did for pointer in range one comma four we started off with range five and you can
see here where it goes one two three and it printed out the values that
correspond with that pointer and finally we went ahead and reversed the array
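As part of the recap, the range loops and the reverse can be sketched as follows. The list called middle is an illustrative extra so the result of range(1, 4) is easy to see in one line.

```python
from array import array

arr = array('i', [-1, 2, 3, 4, 5])

# range(5) yields the positions 0 1 2 3 4
for pointer in range(5):
    print(pointer, arr[pointer])

# range(1, 4) yields only the positions 1 2 3
middle = [arr[pointer] for pointer in range(1, 4)]
print(middle)  # [2, 3, 4]

# reverse() flips the array in place, no new array is created
arr.reverse()
print(arr)  # array('i', [5, 4, 3, 2, -1])
```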
we have our arr dot reverse print arr and then the next step with our array is we
want to go ahead and add something onto it and we do that simply with an append
so here's our append we're going to append the value 10 and let's go ahead and
print array and now we have five four three two minus one ten so we've added that
right on to the end of the array and if we can append something
we can also remove something so the simple command is remove and we're gonna
remove two and then let's go ahead and print that print arr run and you'll see
here the value two is now missing from the array so it goes through and it finds
number two in here and this leads into an interesting question what happens if we
have two of the same value say two twos there we go we're going to put in two
twos so let's go ahead and i'm just going to copy this down here and recreate
our array and i'm going to add a second 2 in here 2 comma 2 and let's see what
happens when we remove the two from there it removes one of the two and the way
it works is it removes only the first two in the list so we do the remove value
on there you'd have to rerun this a number of times to get all the different twos
out now earlier we did print here's our arr and we can do a position let's do
position three we'll just run that and position three happens to point to four
what if we wanted to do something in reverse like print arr dot index and i'm
going to put the 4 in there and let's see what happens it prints 3 which was our
pointer so index does the reverse lookup and let
me go ahead and just take this whole array and recreate it again with the two twos and
i'm going to change the position from 3 to 2 position 2 is actually going to point to
one of the twos because we count 0 1 2 and then we're going to do index of 2
let's run that we have 0 1 2 and position 2 comes up
as 2 but when i did my index the index is only 1 why because counting 0 1 it's going to
find the first 2 in the array so when you do your index remember if there are
multiple twos in that array it's only going to look at the first one
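Append, remove and index can be sketched together like this. The second array, called dup here purely for illustration, is the recreated array with two twos in it.

```python
from array import array

arr = array('i', [5, 4, 3, 2, -1])

arr.append(10)   # adds 10 at the end
arr.remove(2)    # removes the first 2 it finds
print(arr)       # array('i', [5, 4, 3, -1, 10])

# with duplicates, index() and remove() only deal with the first match
dup = array('i', [-1, 2, 2, 3, 4, 5])
print(dup[2])            # 2, the value stored at position 2
first_two = dup.index(2)
print(first_two)         # 1, the position of the first 2
dup.remove(2)
print(dup)               # array('i', [-1, 2, 3, 4, 5])
```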
we'll go in here up front we already have our array imported and let's go ahead and
create a new array this one we're just going to make as an empty array it's going
to be type code i we'll stick with that and it has no values in it so that's what this
means and if we print it out just print our array we see we just have an array
with no values in there nothing coming through and what we want to go
ahead and do on this array is we're going to create an input and we'll set a
variable x equal to int so it has to be an integer it's going to restrict it and inside that it's
going to be an input and for our input we'll give it the prompt enter size
of array there we go and then let's do print
enter percent d elements percent x this is kind of a fun thing we can set up on
here this has to do with print formatting so let's take a look at this real
quick and see what's going on before we do our print let me just take the print
out there and let's run this and of course it helps if i match my brackets
correctly there we go so we're going to enter the size of array so this is what
this is going to generate the first one generates our input box and let's just
put a four in there just to see what that looks like and we're going to run that
and so at this point we're not getting anything coming out make sure i get all my
brackets in the right place there and try it again so we're going to enter the
array size 4 i hit the enter key and it's going to print out enter 4 elements so the percent d
is a marker in your print statements for formatting and it just lets us know
we're going to take the x value and put it in there and then from here
we're going to do for i in range in this case x and this should look familiar from
earlier so since it's in range of x and i entered the number 4
it's going to go 0 1 2 3 it always starts at 0 on the range and we're going to
do n equals int of input there we go and if you look at this this is just
another input statement just like we had int of input enter size of array now i
have int of input again we could actually have it
print a prompt out but we're just going to leave that blank i'll show you what
that looks like in a second and we're going to append that new value to our array
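Put together, the input loop might look like the sketch below. Wrapping it in a function called read_array, and the read parameter that stands in for input, are assumptions of this sketch so the loop can be tried without typing; the prompt text follows what was said above.

```python
from array import array

def read_array(read=input):
    """Build an integer array from values that are typed in.

    read is a hypothetical hook: it defaults to the built-in input,
    but canned answers can be passed in for a dry run.
    """
    arr = array('i', [])                     # empty array of signed integers
    x = int(read("enter size of array "))    # how many values to ask for
    print("enter %d elements" % x)           # %d is replaced by x
    for i in range(x):                       # runs x times, starting at 0
        n = int(read())                      # no prompt, just wait for a number
        arr.append(n)
    return arr

# dry run with canned answers instead of the keyboard
answers = iter(["4", "1", "2", "3", "4"])
demo = read_array(lambda prompt="": next(answers))
print(demo)  # array('i', [1, 2, 3, 4])
```

Calling read_array() with no arguments would prompt at the keyboard instead.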
and then let's print it out let's print out our array so let's go ahead and take
a look and see what this looks like when i run it let's create an array of four
since that seems to be fun and it says enter four elements so element one two
three four i hit enter and we print out our array we see that i have an array of
one two three four so i can use a user input just like this a real simple setup
but it allows me to enter the data into the array next we look into what functions in
python are so first let's look at the definition of a function a function is a
set of code that performs some task so what this essentially means is that you
can have a number of instructions which are bundled together in a function that
is given a particular name and now using this name you can call this function to
execute these instructions from anywhere in the program any number of times so
this is the syntax of a function you have the keyword def following which you
have the function name and parentheses and then a colon and then in the next line indented you
can put your statements or your instructions so that is basically what a function
is we'll go through a number of demos to understand the concepts of functions
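The syntax just described can be sketched in a few lines. The function name say_hello and what it returns are made up for illustration.

```python
# def, then the function name with parentheses, then a colon
def say_hello():
    # the indented lines are the instructions bundled under that name
    return "hello"

# the same function can be called by name any number of times
print(say_hello())
print(say_hello())
```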
i'll move on to pycharm let's start from the very basic that is how to create a
function now this is called the definition of a function and as we are creating
the definition of the function the keyword that we begin with is def that is the
short form of definition and now following def we have the function name so here
i'm going to have a function called welcome and you always follow your function name
with parentheses now these parentheses may be empty or may be filled which we'll look
into later for now they're empty and then we have the colon so i want this function to just print
out good morning so that is my instruction and with that my function welcome is
complete now this function has no purpose unless it's called and for that we
write the call statement for a function welcome so for calling the function just
put in the function name followed by parenthesis now let's run the code and as
you see here our call was made to the welcome function and the instruction within
it was executed thus we have good morning printed out here so as i said
previously these parentheses here these could be filled or left empty we saw what
happens when it's left empty now let me create another function where i'm taking
in values so sometimes the operation that a function does requires some values to
be passed to it so this is where we have parameters or arguments coming in so now
i'm going to write a function called add which will basically add two numbers now
these two numbers need to be passed to this function so the function does not
know what these two numbers are until you literally give it the numbers so these
numbers will be stored in variables say a and b colon and within this i'll have a
variable say total which will hold the sum of a and b and then we'll just print
out the sum and with that our add function is also defined now we'll make a call
to our add function add and now as you see here we are accepting two values in
this add function that means we also need to pass two values to this add function
now suppose i want to add the numbers 10 and 20 i can go ahead and just put the
numbers 10 and 20 here or i can have two variables say x which is equal to 2 and
y which is equal to 3 and then to add i can just pass x and y either of these
methods work fine now i'll run the program so as you see here the add function is
called twice this is our first call to the add function where we pass 10 and 20.
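A minimal sketch of the add function and its two calls looks like this. The exact wording of the printed message is an assumption based on the output described.

```python
def add(a, b):
    # a and b receive whatever values are passed in the call
    total = a + b
    print("the sum is", total)

# first call: pass the numbers directly
add(10, 20)   # the sum is 30

# second call: pass the numbers through variables
x = 2
y = 3
add(x, y)     # the sum is 5
```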
so a holds 10 and b holds 20 so the total of this is 30 which is printed out the
sum is 30. now after this we have another call to our add function where we are
passing x and y x holding 2 and y holding 3 so over here when we pass x we're
essentially passing 2 and when we pass y we are passing 3 now this 2 and 3 get
stored in a and b respectively and our total is again printed 2 plus 3 which is
5. now this is one of the main advantages of using a function if i had to write
this code total equal to a plus b twice to add two pairs of numbers that would be a
lengthier procedure than having a function and just making a single line call to
the function every time i want to add two numbers now as you notice the values
that we are passing to a function are very position dependent that is since i'm
passing 10 first 10 goes into my first variable a and 20 which i'm passing second
goes into my second variable b now just to make sure that this is how it works
let me print out the value of a and b separately so now let me run my program and
as you see here since 10 was passed first it's stored in a and 20 which was passed
second is stored in b same here 2 which was passed first is stored in a and 3 is stored
in b now what if i don't know whether the function accepts a first or b first now
in this particular case it does not matter because 10 plus 20 or 20 plus 10 still
gives 30 but we still want to ensure that 10 does go into a and 20 does go into b
so let's say our function accepts b and then a now we want 10 to go into a but
we don't know that a is the second or the first argument that's accepted in the
function so this is where we have keyword arguments so with keyword argument what
happens is that since you want 10 to go into a you can specify here a equal to 10
and you want 20 to go into b so specify b equal to 20. now let's run this program
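Here is a small sketch of keyword arguments. The function takes b first and then a, as described, and the printed messages are assumptions for illustration.

```python
def add(b, a):
    # the parameters are declared in the order b, a
    print("a is", a, "and b is", b)
    total = a + b
    print("the sum is", total)

# naming the parameters makes the call independent of their position
add(a=10, b=20)
```

Even though a is written first in the call and second in the definition, 10 still ends up in a and 20 in b.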
so as you see here although the function accepts b first and then a and we passed
a first and then b still the values that are held are correct because we said a
should be 10. so 10 went into a and we specify that b should be 20 so 20 went
into b so here we are making our values that we are passing independent of the
position of the arguments so far we pass exactly the number of variables or
values that the function is expecting now what if although this function add is
expecting two values we only pass one i'll just remove this part here and i only
pass 10 i run the program and there's an error now we don't want this error to
happen so even if we are passing fewer values than expected that is we don't
know how many values the function takes we are just passing values i want the
function to still work and not result in an error so in this case what i can do is
i can give all the parameters that i'm accepting a default value let me change this back to a
comma b so if the value a is not passed its default value should be 0 and if value b
is not passed its default value should be 0 again so this is how i set my default
values now let me run the program so now as you see although we only pass 10 it
does not result in an error because the first value went into our first argument
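Default values can be sketched like this. The call with no arguments at all is an extra case added here for illustration.

```python
def add(a=0, b=0):
    # any parameter that is not passed falls back to its default of 0
    total = a + b
    print("the sum is", total)

add(10)       # a is 10, b falls back to 0, so the sum is 10
add(10, 20)   # both values passed, so the sum is 30
add()         # nothing passed, so the sum is 0
```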
a holds 10 and since b has a default value it's 0 and the sum is 10. now what if
the one who's writing the program does not know how many inputs the user could
give or the user could give variable number of inputs to the function now suppose
i want this function add to be able to find the sum of two numbers or five
numbers or even 10 numbers in that case the argument that the add function would
accept could be a list and as you know a list can hold any number of values so
now in my call i can just pass n number of numbers and i'll just delete this line
because now we don't have a b we have one list a and we want to find the sum of
all the elements in this list a so for this i can use a for loop to iterate
through the list so for i in a and total equal to total plus i and total needs to
have an initial value so total equal to zero and let me run this program now so
the sum is hundred and ten so what happens here is we are passing four numbers
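A sketch of the list version follows. The four numbers are assumptions picked so that they add up to 110, since the actual values are not spelled out.

```python
def add(a):
    # a is a list, so it can carry any number of values
    total = 0
    for i in a:
        total = total + i
    print("the sum is", total)

add([10, 20, 30, 50])   # the sum is 110
add([7])                # a single number works too
```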
now even if i want to pass just one number that would also be completely
acceptable by this function all these numbers are put into a list and a list can
have variable number of values now we are using a for loop to iterate through
this list and we keep adding each number to our variable total and finally we
print out total so if you don't know in advance how many values a function will receive or
you want to give the function the capability of accepting a variable
number of values you can go for a list we now move back to a previous version of
our program where we are accepting a and b just adding a and b and displaying the sum and we
are passing our values to the add function through our variables x and y now we
will make certain modifications here so x holds 10 y holds 20 we are passing x
and y to our function add now in add what if i change the value of a and b so i
say a equal to 2 and b equal to 3 and then i'll also put a print statement here
to print out the sum of x and y let me run this program now so the first line of
our output that is the sum is 5 comes from our function add and add has changed
the value of a to 2 b to 3 but now when we are printing the sum outside our
function add it seems like our variables are still holding 10 and 20 and not 2 and
3 that means that the change in the value was reflected on our variables a and b
within the function but then that did not affect our variables x and y so
although we are passing the value of x and y to our function add and a is holding
the value of x initially and b is holding the value of y initially are a and x
essentially different that is does a hold a different memory space and x hold a
different memory space so let's find this out let's print out the addresses so
here after we created x and y let's print out the address of x and y then within
our function before we actually change the values of a and b let's print out the
addresses of a and b and after we change the values let's again print out the
address i'm running the program so when you look at the ids here our first line
of output which is the id for x and y comes from here our second ids come from
this line here and a third line of ids comes from here so our first line prints
out the id of x and y which is exactly the same as our id for a and b within the
function but then again after our values of a and b are changed the id differs so
the id here printed out in this line is different from the id printed out in
this line so what this means is that the objects x and y refer to are actually passed to a and b so a and
b and x and y refer to the same memory location but the minute you assign a new
value to a and b they point to a different location so this kind of passing from the
call to the definition is often called call by reference or more precisely call by object reference which is when
you're passing a reference to the object itself and not a copy of its value so you're
not copying the values of x and y you're passing the very same objects to a and b which is why the
id here that is the id of x and y is exactly the same as the id of a and b but
once you assign a new value within the function this is not reflected back in x and y
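The id experiment can be sketched as below. Returning the ids from the function, and the names ids_on_entry and ids_after, are additions of this sketch so the comparison can be made in one place; the narration simply prints them.

```python
def add(a, b):
    ids_on_entry = (id(a), id(b))   # same objects the caller passed in
    a = 2                           # a now refers to a different object
    b = 3                           # and so does b
    ids_after = (id(a), id(b))
    print("the sum is", a + b)      # the sum is 5
    return ids_on_entry, ids_after

x = 10
y = 20
ids_outside = (id(x), id(y))
ids_on_entry, ids_after = add(x, y)

print("the sum is", x + y)            # the sum is 30, x and y are untouched
print(ids_outside == ids_on_entry)    # True
print(ids_on_entry == ids_after)      # False
```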
what python does is point a and b at different objects that hold the new
values now we know that integers and strings in python are
immutable so this is why we end up with a different object every time we
are trying to change the values so what if we pass something that is mutable and
try to make a change to this as we know lists are mutable so now let's pass a
list to our function and try to change the value in the list within our function
and see if this is reflected back on the original list
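That experiment with a mutable list can be sketched in a few lines, keeping the function name add and the list name lst.

```python
def add(lst):
    # change the element at index 2 of the list that was passed in
    lst[2] = 0

lst = [0, 1, 2]
print(lst)   # [0, 1, 2]
add(lst)
print(lst)   # [0, 1, 0]
```

Because the list is changed in place rather than replaced, the caller sees the change.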
i'll just delete all of this although we won't be using the function to add
anything right now let's leave the name of the function as add now i'm passing
a list lst to my function add and then i say lst of 2 equal to 0 and outside my
function i'll first create my lst so 0 1 2 are my three values then i'll
call add and i'll pass lst to it since we have no print statement there's no
output so i'll have a print statement here lst and i'll also have a print
statement before the call so again print lst run the program so as you see here
the first time we created our list the values we gave for it are 0 1 2. so this
is what's printed in this line here then we made a call to our add function and
in the add function we change the value of the element in the second index
position or the third element of the list we change that to 0 and when we printed
the value of lst even outside our function this change was reflected in our
original list so it printed 0 1 0 and not 0 1 2. so with this we come to a
conclusion that with integers we were creating new objects every time we wanted
to change the value because integers are immutable but with list since they are
mutable the changes are reflected back in the original object so we are back to
our basic program of the add function accepting two values a and b finding their
total and printing it out now what if we don't want total to be printed within
the add function so the add function is like sure i'll do the operations but i do
not want to give the output right here so in this case what we want is for the add
function to be able to somehow send the value of total back to this line of
call and then we can print out the value of total so first of all add will not
print the value of total anymore add will instead return the value of total to
its call so for this we use the keyword return followed by the value that it's
going to return which is total in our case and the line where we are making the
call this is the same line where the value will be returned so now we need a
variable here which will hold the returned value so let me say result which is our
variable so result is a variable which will hold the return value of our function
add and then over here we can print out result let's run this code and that
worked perfectly fine so i'll explain this flow once again execution starts from
this line here result equal to add 10 comma 20 first add of 10 comma 20 is
called so the add function is called a holds 10 b holds 20 and the total is
calculated then this function returns this value total to the calling line
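The return flow can be sketched like this, using the same names add, total and result.

```python
def add(a, b):
    total = a + b
    return total          # hand the value back to the calling line

result = add(10, 20)      # the returned value is stored in result
print(result)             # 30
```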
this is a call line and we have a variable here result which will hold the
returned value then we use this variable we just print out the value today we'll
cover what objects and classes in python are so previously we saw that python is
an object-oriented programming language so what this means is that python is
completely focused around the presence of an object and this is why we will
stress more today on what this object actually means and what classes are in
relation to objects so let's begin first we look at what an object is so in the
real world everything tangible is an object the table the chair your mobile phone
your laptop even you are an object and in a similar manner in python since you
are focusing so much on objects and programming around objects every
instance in python is an object now if that is what an object is what is a class
a class is a blueprint of similar objects so you'll have multiple objects in
programming often and all these objects have some similar features so a class
basically holds all these objects together and gives them a common definition now
let's better understand this through an example so here we are considering person
as a class now if person's a class every person has certain features that is name
gender age irrespective of who the person is the person has to have these
features so this is something common shared by all the people now a person also
has a behavior and behavior basically means the functions a person is entitled to
perform so obviously there are a number of things that a person can do but in
this particular case we look at just two of them which is talk and vote
definitely there are some exceptions which we'll not dwell into right now so now
that we saw what features and what behavior are and how person which is a class
describes these things let's look at what an object would look like under this
class person so here we have two objects of the class person our first object
which is on your left hand side has the features name gender age and the
behaviors talk and vote now every object has to have these features because these
are what's defined under the class person and in a similar manner the object
that's defined to your right also has the same features and the same behaviors
now what is it that makes these two objects two separate entities then it's the
value of the features so our features name gender age in either case has a
different value for example our first object has the name sam our second object
has the name mia gender is male and for a second object gender is female age is
also a different value for both objects now they could have the same value that
is there could be some common features but if all of them have the same value
that wouldn't make sense they wouldn't make them two separate objects so to
summarize this person is a class and the class defines the features name gender
and age it defines behaviors talk and vote and then we defined two separate
objects and gave value to the features in these objects now let's begin to code
this example so i'll move on to pycharm first thing first let's create a class so
our class is created using the keyword class and then put down the name of your
class which in our case is person followed by colon and enter so your class is
created now once that your class is created the next thing we need to focus on
are the features that this class defines so the person class defines the features
name age and gender now in our previous video on oops we saw that the features or
the members of a class are defined using a constructor usually so let's create a
constructor which is def __init__ and once you click on __init__ the self parameter
automatically appears self basically refers to the object that you're passing to
this constructor or the object that is being created when the constructor is
called now we'll give value to our three features so our features are name and
that will have the value say sam gender male and age 22. now one thing that
we're missing out here is that we need to remember that name gender age all these
are features of the class that means these features are strictly tied to an
object that would be the object that's created when this init function is called
for that particular time so the reference to that object would be stored in self
therefore we won't just write name equal to sam but instead self dot name equal
to sam and in a similar manner self.gender and self.age so we have defined our
features of the class person and also given it value now next thing is we have to
define the behaviors so behaviors are implemented through functions or we can say
methods so methods are basically functions but the functions which are called
through an object or tied to an object are called methods so let's define our
methods our first behavior is the talk behavior so if the talk method is called
we just want to print out hi i am and the value of the name attribute of the
object that's calling talk now our next behavior to be implemented is the vote
behavior so for that we'll create a method vote and here we'll put in a condition
so if the age of the person is less than 18 the person's not eligible to vote so
we'll print this out and if the person is 18 or above then we'll print out that the
person is eligible to vote so if self dot age is less than 18 print i am
not eligible to vote it has to be less than 18 because if the person is 18 they are
eligible to vote else let's print i am eligible to vote so the two behaviors of
the class person are implemented through methods now now all that's left for us
to do is to create the actual object so how do you create the object if my
object name is obj i'll write obj equal to and then put in the type of the
object and the type of object is of course the class now here's something that
you need to understand everything that you create in python is actually an object
so i'll explain this a little more in my console so if i have a variable say a
and i give this variable the value 100 we know that 100 is an integer type value
therefore that makes a an integer type variable but then if i check the type of
this variable it shows class int that means the type of a is the integer class
which makes a an object in a similar manner if i give another value to a say the string
simplilearn and now if i check the type of a we'll see that a is now an object
of class string str is basically the short form of string therefore everything in
python is an object of some class and that is how our object here obj is an
object of the type person where person is our class so now that we have created
an object during this creation of the object our constructor will be called and
the features of this object will be given some value now we can use this object
to call its behaviors so this can be done either in this manner so you put in
person which is the type of our object dot the method and within the bracket as a
parameter you can pass the object in a similar manner we'll also call our behavior
vote and again pass obj within it now let's run this code just pull up my
console here run the program and as you can see here so we created an object and
the object got its values as mentioned in the constructor that is __init__ and then
we called the talk behavior or the talk method using our object obj and this
printed our first line which is hi i am sam and then we called the vote behavior and
in the vote behavior the age of our object was checked since our object's age was
greater than 18 it says i am eligible to vote now these two lines where we are
calling our methods can be done in a different way instead of calling it with the
type of our object and then passing our object we can directly call it with our
object so we can have it in this way obj.talk() and over here
obj.vote() so we must remember that in this particular case where we are creating
classes and objects and have this kind of a structure all these methods are
actually tied to the object which is why we cannot just say talk() we need an
object to call it or at least pass an object to this method let's run the code
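The person class as built so far might look like the sketch below, with both ways of calling the methods. The message strings follow the narration, and the type check at the end of the object creation is an extra added for illustration.

```python
class Person:
    def __init__(self):
        # features, tied to the object through self
        self.name = "sam"
        self.gender = "male"
        self.age = 22

    def talk(self):
        print("hi i am", self.name)

    def vote(self):
        if self.age < 18:
            print("i am not eligible to vote")
        else:
            print("i am eligible to vote")

obj = Person()                 # creating the object runs __init__
print(type(obj) is Person)     # True, obj is an object of class Person

# calling through the class and passing the object
Person.talk(obj)
Person.vote(obj)

# calling directly on the object gives the same result
obj.talk()
obj.vote()
```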
so it's the exact same result it works just fine now in this case we just
created one object and the values for this object were predetermined and just put
in our constructor now what if we want to create two separate objects and the
values for these objects vary so to demonstrate this we'll create
two objects just like in our example previously we'll have our first object which
is obj1 now both these objects will be of course of the type person which is our
class so put that there now every time the object is created automatically our
init method is called so if our object needs to have separate values what we can
do is instead of putting in predetermined values here we can pass the values for
the features of the object in the brackets so if i want my name to be sam i pass
sam then the gender and the age so these are the three values for the three features
that this object has which we'll pass into the constructor or the init method now
in the init method we need to be able to accept these values so other than self
we'll have one variable which will store the value of the name when it's passed
from here we'll store it in variable n n for name the gender we'll store in
variable g and the age we'll store in variable a so what happens here is basically these
values that you pass within the parentheses when you're creating your object are
sent to your init method now in the init method we are putting down these three
variables to accept the values so these three variables store these three values
so sam goes to n male goes to g and 22 goes into a now instead of putting in the
strings here self.name can be equal to n so the value which n got is now put into
self.name and the value which g received is put into self.gender and the value
which a received will be put into self.age so that way we created our object and
provided a way in which we can actually have unique values for each object so now
if we have another object say obj2 which is also of the type person we can put in
different values into these parentheses so say our second object's name is
jessie she's female and 16 years old so now these values will go into these
variables and will be assigned to these features so for obj1 if i say obj1.name
the name would be sam and if i say obj2.name it would result in jessie we'll see that
now let me just print that out obj1 dot name and obj2 dot name so as you see
although our feature is the same we are printing the feature name in both the
cases the object to which this feature is tied is different so the first time
it's printing name related to object one and the second time it's printing the
name related to object two now this is not all we want to show here what we
want to do is after creating our two objects call the methods using these
objects so obj1.talk() and obj1.vote() now we'll call the same two
methods using obj2 so obj2.talk() and obj2.vote() run the program
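
A minimal sketch of the program as described so far. The parameter names `n`, `g` and `a` and the values for both objects come from the talk; the bodies of `talk` and `vote` and their exact message strings are assumptions based on the output that is read out.

```python
class Person:
    def __init__(self, n, g, a):
        # Values passed in the parentheses at creation time land in n, g, a
        self.name = n
        self.gender = g
        self.age = a

    def talk(self):
        print("Hi, I'm", self.name)

    def vote(self):
        # Assumed check, following the "greater than 18" wording in the talk
        if self.age > 18:
            print("I am eligible to vote")
        else:
            print("I am not eligible to vote")


obj1 = Person("Sam", "Male", 22)
obj2 = Person("Jessie", "Female", 16)

# Same feature name, different object, different value
print(obj1.name)
print(obj2.name)

# Same methods, called once per object
obj1.talk()
obj1.vote()
obj2.talk()
obj2.vote()
```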
and here's our output so what happened here is obj1 was created and the three
features which are name gender and age were given the respective values for obj1
and in a similar manner obj2 also received the values for its various features now
we used obj1 to call our two methods talk and vote in talk all we did is print
the name of the object so the first time when we called it with obj1 it printed hi
i'm sam because sam was the name related to our first object and then we call the
vote method using obj1 itself so our first object's age was 22 which means it's
eligible to vote and that is what's printed here i am eligible to vote then we
call the same two methods using obj2 and got different results so obj2's name
was printed here which is jessie and then jessie is under age that is she is 16 so
it printed out i am not eligible to vote when obj2.vote() was called so that is the
basic crux of what objects and classes are

in this session we're going to cover a basic understanding of object oriented
programming and we'll be specifically working in python code oops here is not
the accident you just had when you knocked your coffee cup off your desk it
actually stands for object oriented programming and whether you're working in
python which we'll be doing today or in another mainstream language many of them
support object-oriented programming it is a programming paradigm that focuses on
objects so everything is an object in the
coding every instance in python is an object and again this is true for any of
the main programming languages you know if you're working in java or scala or any
of these you're going to be looking at object oriented programming it just makes
a lot of sense and really is one of the cornerstones of today's programming
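
The claim that every value in Python is an object of some class can be checked directly with the built-in `type` and `isinstance` functions. This is a small illustrative sketch; the variable name `a` and the sample values are only examples.

```python
a = 5
print(type(a))                  # <class 'int'>

a = "Simplilearn"
print(type(a))                  # <class 'str'>

# Numbers, strings and even functions are all instances of object
print(isinstance(5, object))    # True
print(isinstance(a, object))    # True
print(isinstance(len, object))  # True
```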
and when we talk about an object in programming an object has two different things
attributes which are data describing the object and behavior which is
methods acting on the attributes so the first one is your actual
data and the second is what are you doing to that data and if you're a car lover
we're going to use cars as an example this is kind of easy to see how that might
fit in to the different objects and so we're going to start with a bmw for
example that's an object and with the bmw you might have the year of
manufacture and the maximum speed those would be attributes of that car and then you
have the behavior these are methods on the attributes for example you might
display the speed or change the speed so even though you have a maximum speed you
might have current speed also which changes depending on what you want it to do
step on the accelerator go faster step on the brakes go slower and the primary
terminology in programming is usually class and it's definitely true in python so
class is a collection of similar objects and in this example we have car as our
class so you might have trucks bicycles and airplanes but in cars we have some
very specific things cars have four wheels that are on the ground maybe a spare
hanging down headlights steering wheel things you would expect in a car doors to
get in so those all belong to the class and in here we actually have a number of
objects in the collection we have our bmw our ford our audi you can guess the
guys in the back like the bmw the best i don't blame them and so we work with
classes or collections of items
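
As a rough sketch of the idea above, a class can hold the attributes (year, maximum speed, current speed) and the behaviors (display the speed, change the speed), and each car is one object of that class. The method names `display_speed` and `change_speed`, the `__init__` constructor and the Audi's values are assumptions made for illustration.

```python
class Car:
    def __init__(self, year, max_speed):
        # Attributes: data describing the object
        self.year = year
        self.max_speed = max_speed
        self.current_speed = 0

    # Behaviors: methods that work on the attributes
    def display_speed(self):
        print(f"Current speed: {self.current_speed} mph")

    def change_speed(self, delta):
        # Accelerate (positive delta) or brake (negative delta),
        # staying between 0 and the maximum speed
        new_speed = self.current_speed + delta
        self.current_speed = max(0, min(self.max_speed, new_speed))


# Three objects that belong to the same class
bmw = Car(2018, 155)
ford = Car(2016, 140)
audi = Car(2017, 150)   # illustrative values

bmw.change_speed(60)    # step on the accelerator
bmw.change_speed(-20)   # step on the brakes
bmw.display_speed()     # Current speed: 40 mph
```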
today i'm going to use the pycharm ide it's one of the very popular
python editors i myself have never used it so this is a new experience for me it's
pleasantly fun to use and easy to set up you can go to jetbrains.com that's the
people behind it so www.jetbrains.com/pycharm/ and once you're on the
pycharm page if you go under downloads you'll have two options let me just hit the
download now button it'll take you to the two options one is the professional
which you would pay for which allows integration between teams and company setup
and has a bunch of extra cool tools you need for a business and then there's a
community version which is the lighter weight one for the purposes of this demo
and most of the work you do the community one's probably just fine you can download
it it's open source and it's free i don't know about you but i like open source
and free for a lot of my stuff and i've already opened up a project file but you
can create a new project file up here put a note down in the youtube comments and
they can send you some of this code or go visit simplilearn they'll send you the code
itself if you need it there's a lot of different like i said functionality in
here we're mostly going to be underneath run just running the demo in this case i'm
doing oops simplilearn so we'll just be running what i'm working in here let's
go ahead and start with creating a class in python and you simply just tell it
it's a class so the class means this is going to be our python template whatever
we are working on i'll call this car and this is interesting because class itself
let me just highlight that so when i put class here that is a python keyword
and then i'm using that to build car which is my specific class and let me
go ahead and just change that font i have it pretty large already in my editor
let's go back down here to editor font and i put it up all the way to 26 i think
we can get by with 30 so you can really see it hopefully that's going to be
big enough on there and so once we have our class we want it to do something and
so in a class when you put the word def that stands for define and whatever
follows it is a function inside a class it's called a method and it's going to
actually do something and what's nice with pycharm is that
it automatically puts self in there why because self refers to
whatever the object is now it doesn't refer to the template class car it refers
to the instance of that and i'll show you in a minute what that exactly means but
in this case we're going to do something simple we're just going to have it print
and you see it automatically indents it so python if you remember or if you're
new to python is based on indentations you have to indent so everything under
this indent is part of the def everything under class that's indented is part of
the class and so we're just going to look down at our speedometer and we're going
to print 155 miles per hour maybe it's on the autobahn in europe i don't know or a race
track it's not very often you go 155 miles an hour in a car at least not in my
experience maybe in the future we'll have self-driving cars that do hundreds of miles per
hour and shoot down the freeways and let's go ahead and create an instance of the
class and we'll simply do bmw equals car and we'll put brackets on there and
let's also do a ford ford equals car so if you remember car is like a
template it's still an object so i have bmw equals car and i have ford equals car
these are two separate instances they're not copies of car each one keeps a
reference to car so everything under car is available to bmw and the same thing with
ford and we can do something like this we can go car dot get speed and put in
the bmw there or we could also do car dot get speed with the ford given the choice i
think i'd go with the bmw and there's another way you can do this we can also do
bmw dot get speed because remember it points back to the original template so all the
functionality is in the original template and then we could also do ford dot
get speed these are a little bit more common than what you'll see in the top part
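
A minimal sketch of the class and the four calls just described. The spelling `get_speed` for the method and the capitalised class name `Car` are assumptions; the talk only gives the spoken names.

```python
class Car:
    def get_speed(self):
        # self is whichever instance the method was called on
        print("155 miles per hour")


bmw = Car()
ford = Car()

# Calling through the class: the instance must be passed in explicitly
Car.get_speed(bmw)
Car.get_speed(ford)

# Calling through the instance: Python supplies it as self
bmw.get_speed()
ford.get_speed()

# The instances are separate objects that share one class
print(bmw is ford)        # False
print(type(bmw) is Car)   # True
```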
but let's go ahead and run this and if we go under run we find out that alt shift
f10 you can use the hotkeys which i'll probably start doing and i just simply hit
make sure i highlight the one i'm working on and it runs it down here and so you
can see right here we have 155 miles per hour printed out by the bmw and 155 miles
per hour printed by the ford using car dot get speed and then i did it in a different format
bmw dot get speed and ford dot get speed the thing to notice is the term self
that is a python convention it names the instance the method is working on whatever comes
in here is what's being processed and so if i call the class or the template
directly in this case i did car dot get speed i have to include what that self is in
this case it's the bmw or if i'm calling it from the instance bmw because it
is built from the car class it finds get speed and it knows automatically that
the instance is self that's where the term self comes in so that's important to note
in python it's one of the more pythonic conventions that when
you see the term self that refers to the instance that you created in this case
the bmw instance and the ford instance when you're setting up a class there's
a lot of things you can do in here there is a special method init written __init__ what this means is
that when i create a new instance whatever happens in here is run automatically
this sets up your information let me show you how that works what we're going to
do is pass year and speed in here and this is really important
i need to take self that's going to be the new instance i created which in
this example will be our bmw and our ford and put a period here and then year
equals year that's the variable that was passed and we'll do the same thing with
speed self.speed equals speed so when we create this we now have two different
variables in here and if you remember from our slide an object has attributes
data describing the object and behavior methods on the object and so in our
class or our template we're going to init that is initialize it we're going
to initialize it with the year and the speed on here but if we initialize it here
when we create the instance we have to send that information so it knows what
the year is and what the speed is for the bmw and the ford and that means when we
create this instance let's go ahead and do 2018 and maybe the speed is the maximum speed
that shows up on the speedometer i don't know but 2018 and we'll put it at 155
miles per hour and then we'll take the ford a little bit older runs a little
slower it's a 2016 and this will be the maximum speed we'll do 140 miles per
hour that's the fastest it can go and so now we've set up some data on
our object right here coming through and so bmw now has a year and a speed and
so does ford and when we get speed let's go ahead and change this and in fact i'm
going to do let's just get rid of this middle part because i tend to do the
standard bmw.getspeed and we'll have this print maximum speed is and
then we'll just put down self.speed so that's our variable so it's whatever the
instance is so self.speed is going to refer to our bmw or our ford and if we go
ahead and run this and that was go up here to run and alt shift f10 i'm just
going to do alt shift f10 it's already set to the tab i'm on hit enter and we
come down here and you can see i had an error there init needs a double underscore
on each side the code that they sent me had just a single and i forgot to double
check that so we have a double underscore init it takes in year and speed the
bmw has 2018 with the maximum speed of 155 and when we print that out with bmw get
speed you'll see it right here the maximum speed is 155 and then it does it for
the ford maximum speed is 140 and then we're done processing it so we've stored
some data in here we stored the year we stored the speed and then we've had a
process on that data and that is to go ahead and just print it out print out
self.speed and let's go ahead and briefly look at encapsulation def set speed
with self and speed so we're going to send it a new speed value to set the speed and from
here we'll go ahead and do self.speed equals speed and we'll just work with
the bmw on this one let's take the ford out of there for a moment and we have
our bmw get speed and then we're going to set the speed and the new speed is
going to be oh let's make it 143 just an odd number so it's easy to see and then
we'll go ahead and do bmw dot get speed again and when we go ahead and run this i'll
do the alt shift f10 and run it as you can see here it comes out and says 155
because that's what the speed was when we first entered it and then we went ahead
and changed it with bmw dot set speed and it is now 143 so you can see here how we can
alter the information in our instance and if we remember we had the ford the ford
is completely separate from this so if we still have ford down here equals car
2016 140 this 143 isn't going to affect the
ford at all because this ford instance is separate from the bmw so if i do
ford dot get speed and we do an alt shift f10 to run it there we go here's our
ford 140 so it hasn't changed even though we set the speed on the bmw
to 143. the first thing we want to look at is inheritance and so inheritance is a
mechanism for a new class to use the features of another class and if you've ever
done like word processing you might use a template from that template you can
have all kinds of stuff you put into the template and you can save it in
different instances depending on what you're doing and then those particular
instances you've saved could be used as a template for your next thing i use that
all the time when i was in sales well i still do some sales on the side as part
of my job working with them to support them and build the software for them we'd
put together a document and then we'd take that document and it would become the
template for the next three documents that are very similar this is the same way
you have a class and from that class we can inherit everything from there and
then make changes on it and then we're going to go ahead and cover let's do this
ford bmw just close that up take this out of here for now okay let's go ahead and
do uh inheritance that's where we're going to take the class car and we're going
to bring all these features from class car into a new template or a new class and
let's start with a class sedan and the sedan is going to inherit from car so it's
going to bring the car in and now we have a sedan coming down once we create the
sedan this is now a child class to car coming down and under this class i'll just
go ahead and type in a def accelerate of course with self and on here we're
just gonna have it print and oh let's have it do let's say i would do 150 that
seems very high let's go 133 just because i don't know why i like the 133
let's change it to 137 there we go and then let's do another one here we'll do
a class suv and it also is going to inherit from car and this too is a
child class of car so we'll just put a label there so we can remember that and
we'll also give it an accelerate and when we do an accelerate keep the self and
we'll print let's see uh this year's production whatever it is 180 i don't know
that seems awfully high for an suv versus a sedan let's do um uh 127 the suv
goes a little bit slower than the sedan there we go so let's go and hit enter on
there and let's take a look at what we got going here i have my bmw i have my
ford and let's add a honda and we're going to make it an instance of
the sedan and just for fun let's add one more thing def open trunk with self and a
little print trunk has been opened there we go let's take a look and see what we've
got going here i've created my honda if you remember up above we have our bmw and
i'll just do the bmw dot get speed and then i want to do the honda dot get speed
and when we run this we can see our bmw is 155 and our maximum speed on our honda
is 150 because that's what we set now remember this inherited from car so
this init is now part of the sedan and so is the open trunk let's do this let's do
the honda dot open trunk and then
we'll do the bmw dot open trunk just to throw an error here this is going to give us an error it
doesn't even auto complete it for me so we'll do alt shift f10 and run it and you can
see down here that we had our bmw get speed okay there's our 155 we had our
honda get speed we had our honda open the trunk and because our car when we go up
here to car does not have open trunk the bmw does not have that definition so it
gives us an error so we can't do that unless we add that to the top as part of
car so let me just take the bmw call out for whatever
reason this bmw doesn't have a trunk i think they all have trunks but that's okay
for this example we'll just leave it like that so you can see right here we've
inherited and then we've gone ahead and added new features to our new class so
this child class sedan has come in here and now has both accelerate and open trunk the
suv which we're going to look at in just a second also has open trunk so we've
got a lot of functionality from our original car template
whatever's set up up here comes down here and then we've added accelerate
let me just do that right here honda dot accelerate and if we run
that you can see the honda has its acceleration come in and prints out in this
case 137 down here that's why i like to mark it with a weird number like 137 so
you can tell where it's coming from so we have our accelerate so we've
added features into our sedan encapsulation is an important feature bundling data
with its methods and preventing data from direct access this is what we call
encapsulation we'll discuss that
more later on but what it means in this particular case is that when you create
a class from another class you don't alter the original template the child class
can make all its own changes and do all kinds of things but it
doesn't make any direct changes to the parent let's go ahead and take a look at
encapsulation in the code itself so we can have a better idea of what we're
talking about so we have our class car and we have our definitions here's our
init if you remember from the last slide we had our init maybe we had
a couple more functions or methods in it and we have get speed and set
speed all of this under class car is encapsulated what that means is that if i
create an instance of it in this case i'm going to create a bmw wouldn't that be
nice i can just whip out a bmw make a bmw when we create our instance bmw it does
not change the class car whatever we do to this instance does not make
changes to our class and the same thing with the ford when the bmw calls get speed
it might use the definitions under the class but it doesn't change the initial
class if i create another car let's say we take our bmw we set the speed to 155
like we did here and we create a ford whatever i do with the ford does not change
the class car and if i create like we did before with our inheritance a subclass
that's not going to change the class car that's what they call encapsulation it
bundles things together and it really is central to oops object oriented
programming we're basically bundling things together and then we keep them in
a nice safe bundle anything we do from that we either point to it and say hey
let's use this definition or we make a copy of it so we kind of do a combination
of those and say okay here's our new inheritance but you can see right here this
is encapsulation and again that's central to oops this is part of how oops
originally came about when they realized hey why don't we
just bundle things together instead of rewriting the code over and over and over
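
The points made in the inheritance and encapsulation walkthroughs can be sketched together as below. The years, speeds and the numbers 137 and 143 come from the talk; the snake_case method names, the capitalised class names, the Honda's year and the exact message strings are assumptions.

```python
class Car:
    def __init__(self, year, speed):
        self.year = year
        self.speed = speed

    def get_speed(self):
        print("Maximum speed is", self.speed)

    def set_speed(self, speed):
        self.speed = speed


class Sedan(Car):
    # Child class: inherits __init__, get_speed and set_speed from Car
    def accelerate(self):
        print("137 miles per hour")

    def open_trunk(self):
        print("Trunk has been opened")


bmw = Car(2018, 155)
ford = Car(2016, 140)
honda = Sedan(2017, 150)   # the year is an assumed value

bmw.get_speed()      # Maximum speed is 155
bmw.set_speed(143)
bmw.get_speed()      # Maximum speed is 143
ford.get_speed()     # Maximum speed is 140 (unaffected by the bmw)

honda.get_speed()    # inherited from Car
honda.accelerate()   # added by Sedan
honda.open_trunk()   # added by Sedan

# Car itself was not changed by the subclass, so this fails
try:
    bmw.open_trunk()
except AttributeError as err:
    print("Error:", err)
```

Changing one instance leaves the other instances and the class untouched, and adding methods in `Sedan` leaves `Car` untouched.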
and then polymorphism we talked about encapsulation in classes just now and like
i said we'll go into this as we open up the code polymorphism is where we take a
feature in the code and then we can alter that feature so we can use the same
function name in multiple ways that's a lot of fun a lot of power in that let's go
ahead and take our polymorphism and take a look at what's going on in the
actual code when writing our python code so here we have our parent class car and
we switch colors here we have a sedan which inherits from car and an suv which also
inherits and when we look at this we have up here our init
so we have our definition where we're initializing it in the first child class we've added
accelerate so what that means is that accelerate is not part of the
initial car class until we add it in so we have our definition
accelerate which prints 150 miles per hour so all sedans print 150 when we
call accelerate all other cars don't have anything going on in them right now
and then with suv we're going to define accelerate differently so we have the
accelerate down here with self printing 180 so we have two different forms
going on here one name with two behaviors in this example we've added
accelerate to both of these classes and so it doesn't matter which one i
accelerate it's going to do something and so in this case we're
going to create objects down here instances let me do this in pink here are our
instances oops that's a little light let's do something a little darker than that
one there we go so we have our instances and we're going to create one object
which is a sedan a camry and one object which is an suv a scorpio and then we can
just go through the objects we can actually just loop through the objects one
object at a time and if we do print of the object's name with an end argument
it's going to print the name in this case it'll print camry and it'll print
scorpio and then we have object dot accelerate and for the sedan since that's our
camry it's going to print 150 and for the suv it's going to print 180 for the
scorpio and so by doing this we polymorph the same definition to do two different
things depending on what our child class is so we can polymorph all kinds of
things and we'll actually look at a couple more objects in there when you go
through the full code later on there's a couple more examples in this but this is
a polymorphism coming down and you can see how we're able to leverage the same
name for the definition to do different things depending on what our object is
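
A minimal sketch of the polymorphism example described above. The slide code is not reproduced in the transcript, so the constructor taking a `name`, the capitalised class names and the exact print strings are assumptions.

```python
class Car:
    def __init__(self, name):
        self.name = name


class Sedan(Car):
    def accelerate(self):
        print("150 miles per hour")


class SUV(Car):
    # Same method name as in Sedan, different behavior
    def accelerate(self):
        print("180 miles per hour")


cars = [Sedan("Camry"), SUV("Scorpio")]

# The loop does not need to know which child class each object is
for obj in cars:
    print(obj.name, end=": ")
    obj.accelerate()
```

The loop body is identical for both objects; which `accelerate` runs is decided by the class of each object.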

today we'll look into threading in python so before we jump into what threading is
let's have a look at what a process is because that is a core concept that you
need to understand before you understand threads so a process is an executable
instance of a computer program that means anything that is running on your
computer is actually a process so a word pad that's opened a mail that you're
sending anything that's running that's working could be a process now that we
know what a process is let's look at what a thread is now a thread exists within
the process so the definition of a thread is that it's a sequence of instructions
in a program that can be executed independently of the remaining program so if we
write a large program of course there are many small tasks that the program
performs now each of these small tasks can be a thread as long as their execution
is independent of the execution of the other tasks but then why do we need
threads so consider your system now these days we all use multi-core processors
that is we have multiple cores in our system now the point of threading is to
make the maximum utilization of these cores so if a program runs on a
single thread that means it utilizes just one cpu or one core but
if we can divide a program into multiple threads and have each of these threads
use a different cpu the program can execute faster and more efficiently keep in
mind that in standard cpython the global interpreter lock lets only one thread
run python code at a time so the gain is mostly for tasks that wait on input or
output so this is why we use threads so now let's look at threading in python
through some demos so first let's look at the most basic and the simplest way of
creating a thread i have a function here show and this function will just print
this is a child thread as we are going to have the child thread execute this
function so first thing we need to create our thread now before we create the
thread let's import the threading module so we have included everything from the
threading module now we'll use Thread() to create a thread which
will be called t and to Thread() we'll pass the target of our
thread that is basically the task that our thread will be performing which in
this particular case is the show() function so you say target so we set target's
value equal to show our function show and now outside here we'll just print this
is the parent thread so i'll just explain in general how a thread works now the
minute you write a program and you start running it there's a thread created now
this default thread that is created is called the parent thread so the execution
of your program is performed on the parent thread unless you explicitly create a
separate thread so if we had not created this thread here our entire program
would be run on the main thread like every other program is but
now that we created a thread here this thread that is t will run only this show()
function because that is the target of this particular thread and everything
else in the program will be run on the parent thread which is why we have this
print statement here saying this is the parent thread because this print
statement is not within the show() function and hence it will be run by the
parent thread now so far we have only created the thread though so we need to
start this thread and to start a thread use the thread name followed by dot and
you call the start() method so the start() method belongs to the Thread class
and as you have probably noticed already a thread here t is an object of the
Thread class so once you say t.start() the target function of this particular
thread is run which in our case is show now i'll run this program pull this up
here and as you can see first we have our child thread executed and then we have
the parent thread executed now let's look at the second way that you can create a
thread which is by extending the Thread class so you still need your threading
module and i'll create a class my thread which is a user defined class so i'll
be defining this class and this class will now be a subclass of the Thread class
now the Thread class already exists it's a predefined class and we're extending
this Thread class into our class my thread what this does is all the
properties of the Thread class are now present within our my thread class now
within the my thread class i'll be creating a function def run within which i'll
just print this is a child class say five times using the for loop and outside
here i'll create my thread so now that our class my thread has extended the
thread class we can create an object for my thread and that by itself would be a
thread so t is my object of my thread which would also be the thread now if you
notice here i'm not passing any target so this is because when you don't pass any
target by default your thread will call the run function so it will come here and
it will execute this function so the thread class has a blueprint of the run
function and every thread that is created as an object of the thread class is
designed to call this run function by default so what we do here in our my thread
class is we redefine this run function for our purpose now of course you need to
start your thread and outside here which is the part that will be executed by my
main thread i'll just print out this is the main thread and let's run this code
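
The first two ways of creating a thread described so far can be sketched as below. The class name `MyThread`, the loop in the main thread and the `join()` calls are assumptions added so the sketch is self-contained; the talk does not mention `join()`.

```python
from threading import Thread


# Way 1: pass a function as the thread's target
def show():
    print("This is a child thread")


t = Thread(target=show)
t.start()                      # runs show() on the new thread
print("This is the parent thread")
t.join()                       # wait for the child thread to finish


# Way 2: extend Thread and redefine run()
class MyThread(Thread):
    def run(self):
        # start() calls run() when no target is given
        for _ in range(5):
            print("This is a child class")


t2 = MyThread()
t2.start()
for _ in range(5):
    print("This is the main thread")
t2.join()
```

The order in which the child and main lines appear can differ from run to run, because the two threads are scheduled independently.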
so we'll just add the new lines so here there's child class main thread main
thread main thread so on and then we're back to child class so if you're
wondering why this kind of a mix-up is happening i'll explain that to you now our
entire program was initially planned to be run on the main thread but we created
a child thread here which is t and made the child thread take care of the
execution of this function run now while our child
thread t was executing this for loop in the gap between printing this
out and moving to the next iteration our main thread was free to run so the
main thread took up this time to start printing out its own for loop and that is why
you have this mix up of child class and the main thread printouts and now we'll
have a look at the third way you can create a thread this time we'll not be
extending the thread class so i'll have a class say demo within which i'll have a
function show and within this function i'll print out this is the child thread
five times using the for loop now outside here i'll create an object of my class
demo and now i'll create a thread which is basically an object of the thread
class and to this i'll pass the target just like we did previously but this time
because the function exists within a class we'll refer to the function through an
object so we have a reference now and once that's done back to starting our
thread and outside here i'll print out this is the parent thread five times using
the for loop let me run this code now so you can see the output here we have this
is the child thread printed five times and then this is the parent thread five times
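
The third way, where the target is a method of an ordinary class, might look like the sketch below. The class name `Demo` and method name `show` follow the talk; the capitalisation, the message strings and the final `join()` are assumptions.

```python
from threading import Thread


class Demo:
    # An ordinary class; it does not extend Thread
    def show(self):
        for _ in range(5):
            print("This is the child thread")


obj = Demo()

# The method is referenced through the object, without calling it
t = Thread(target=obj.show)
t.start()

for _ in range(5):
    print("This is the parent thread")

t.join()   # wait for the child thread before the program ends
```

With such short loops the child thread often finishes before the parent starts printing, which is why the output can look sequential even though two threads are involved.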
python can be quite deceiving so now that we looked at what thread is and how
threading works in python exactly we can move on to multi-threading so
multi-threading is a model where you have multiple threads within a process and all
these threads can execute independent of one another but they are also sharing
all the resources of the process so by the resources of the process we could mean
the various data the process holds the files or the memory and so on now all the
threads that are taking care of the execution of this process will share these
resources but they will be independent of each other's execution too so let's
look at multi-threading through an example here we'll write a program where we
want to print out in the first line a number say 1 followed by the double of the
number and then the square of the number now we want these three statements to be
printed for every number from 1 to 5. so what we'll be first doing is we'll have
a class and within this class we'll have three functions where each function is
responsible for one of the following which is printing the number printing the
double of the number and printing the square of the number so let's begin writing
our code since we'll be using threads here the first thing we do is we'll import
our thread package then i'll name my class demo and within this i'll start defining my
functions so my first function is def num and this function will just be printing
out the number so we want all the numbers from 1 up till 5 so that's all our
first function is now our second function which is for printing out the double of
the number 2 into i and finally our third function where we'll be printing out
the square of the number so now we have the three functions now here's our
challenge we want one iteration of every function to happen one by one so what i
mean is for i in range one to six in the function num we want the number is 1
to be printed and then immediately after that we need to move on to the first
iteration of the function double and then of the function square now for this
kind of an execution of course we
need three different child threads and each thread will take the responsibility
for execution of one of these functions so we'll create our threads down here but
before that first of all we need to create an object of our class demo and now
we'll create our threads t1 t2 and t3 and while we are creating our threads
we also need to specify the target so t1 is responsible for execution of the num
function t2 will take care of the execution of the double and t3 for square so
we have created the threads now we need to run these threads or we need to start
these threads so for that we have t1 dot start t2 dot start and t3 dot start
now we need to end these threads or we need to make sure that the
program does not terminate until these threads have completed their work so for
that we use the join function and out here which will be run by the main thread
we'll just print out this is the main thread and that's all it is that's our
program so everything in this program was initially meant to be run by the main
thread but since we created our child threads the main thread is responsible for
running only this much of the program now t1 t2 t3 will take care of the
execution of these functions respectively now let's run this program well we are
not getting what we expected all these iterations are being completed in blocks
so the first function is completing itself first followed by the second function
but now in between the second function the third function also started printing
and this is a complete mix up so how do we fix this now the first thing we need
to see is as we saw in context switching we cannot completely predict the order
in which these print statements would be carried out or the order in which the
threads would be executing so what's happening here is while double was printing
this while the thread that is t2 which is doing the execution of double function
was printing this there was some time where t2 went into an idle mode and during
this time t3 started printing so how can we fix this so first of all to prevent
this kind of an output where a print statement is printing something out but then
it stops for a while and another thread takes on the execution and the previous
thread was not able to print it correctly so to prevent this what we can do is we
can have a sleep function on each of these threads so for that first i'll import
the time package now what the sleep function does is it basically puts that
particular thread into sleep or in idle mode for the amount of time that we
specify in seconds so if i say time dot sleep of one it holds this particular
thread which is t1 at sleep for one second and during this time we expect t2 to
begin its execution now once t2 has finished one iteration let's put t2 also into
sleep for one second and we hope that during this one second of hold for t2 t3's
first execution will take place so we put here time dot sleep of one after the
execution of t3 now t3 is put into hold and again we can start off with t1's second
execution or t1's second iteration followed by t2 then t3 and so on so this too has
not fixed our problem there's still a mix up of statements now if you see the
statement here you'll notice that this mainly occurred because your second thread
started before the completion of the previous thread so in that case our problem
is not with the child threads but in fact in the main thread so what if i put a
sleep for the main
thread after every thread has started its execution and even after t2 put a sleep
now since i'm putting a sleep of 0.2 seconds during this time we can ensure that
the previous thread has completed at least printing out one complete statement
and the printout is not halted or is not interrupted by the second threads
process now let's run our program once again see if it worked this time and yeah
it did so once we put sleep on our main thread we kind of had a control over how
our threads are starting we saw that the output that we wanted is exactly what we
got so this is how threads work and how you can control threads using time dot
sleep function and that is pretty much all the basics about threading python
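Putting the whole multi-threading demo together, a sketch might look like the following. The constants PAUSE and STAGGER are assumptions introduced here so the example finishes quickly; the video uses 1 second and 0.2 seconds.

```python
import time
from threading import Thread

PAUSE = 0.1      # per-iteration sleep; the video uses 1 second
STAGGER = 0.02   # main-thread sleep between starts; the video uses 0.2 seconds


class Demo:
    def num(self):
        for i in range(1, 6):
            print(f"number is {i}")
            time.sleep(PAUSE)      # give the other threads a turn

    def double(self):
        for i in range(1, 6):
            print(f"double is {2 * i}")
            time.sleep(PAUSE)

    def square(self):
        for i in range(1, 6):
            print(f"square is {i * i}")
            time.sleep(PAUSE)


obj = Demo()
threads = [
    Thread(target=obj.num),
    Thread(target=obj.double),
    Thread(target=obj.square),
]

for t in threads:
    t.start()
    time.sleep(STAGGER)            # stagger the starts from the main thread

for t in threads:
    t.join()                       # do not finish until every child is done

print("this is the main thread")
```

Sleeping only makes the intended order likely; it does not guarantee it. Where strict ordering matters, a synchronization primitive such as threading.Lock or threading.Event is the safer tool.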
scripting so what is scripting scripting is writing programs to automate a task
and python by itself is a scripting language so what this means is that the lines
that you write the code lines they are interpreted rather than compiled so they
are executed at run time so in this tutorial we look at a few libraries which
demonstrate the scripting feature of python very clearly other than that we'll
also look at a few libraries which demonstrate its runtime behavior so let's
begin i'll be coding on pycharm today and the first library that we look into is
the os library which stands for operating system so to use the os library first
we need to import it so say import os now in here i'll have two functions i'll
just demonstrate two of the most common functions that you can utilize from the
os library first one is to extract the current working directory so i create a
function and i'll have a variable cwd which stands for current working directory
now this variable will store the return value from a function which retrieves the
current working directory so you say os dot getcwd and that is it one line and you
have your present directory stored in our variable cwd so i'll print this out and
that's our first function so out here i'll call the function so now we can run
our program so this is our current directory you can see the users within the
user my particular user name and pycharm projects slash demo is this particular
project under pycharm okay so that is our first function let's have a look at the
second one now i'll demonstrate how you can retrieve the path of a particular
file so create another function here and to this function i'll pass a file name
now in here have another variable i say path which holds the value of the path
that this file is stored in and to retrieve that you say os dot path dot abspath
which stands for absolute path now you know there are two types of paths
there's the absolute and the relative path name the absolute path name
gives the path right from your c drive or d drive whichever drive it belongs
to and the relative path name is the path with respect to our current working
directory now i'll have a variable file name out here which will hold the name of
my file whose path i need to find out so i say sample.txt and then i'll pass this
file name to my function file path and that's it now let's run our code okay so
we have not printed the value out let's do that print path run your code again
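The two os functions described here can be sketched as below. The function names current_directory and file_path are assumptions based on the narration.

```python
import os


def current_directory():
    cwd = os.getcwd()                  # current working directory
    print(cwd)
    return cwd


def file_path(file_name):
    path = os.path.abspath(file_name)  # absolute form of the given name
    print(path)
    return path


current_directory()
file_name = "sample.txt"
file_path(file_name)
```

Note that os.path.abspath does not search for the file: it resolves the name against the current working directory, so the result is only where the file really lives if the file sits in that directory.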
the first function that we call is to retrieve our current working directory and
the second function is to retrieve the path of this file sample.txt and this is
the right path this is exactly where i've stored my file so all is good so those
are the two most common functions that you use with the library or the module os
there are quite a lot more which you can definitely check out so next we look
into the time module so as usual first import the module now before i show you a
few functions in the time module you need to understand what epoch is so first
january of the year 1970 is the starting point of a certain timeline and epoch
time is basically the number of seconds that have passed since first january of
1970 so to get your epoch time use the function time dot time and
this function returns the epoch time and i'll store it in a variable
epc print that out so that's the number of seconds that have passed as of right
now since first january 1970 now obviously this has no meaning to us so what we
do is we'll convert this to a format that means something to us and for that
purpose we use the function localtime so i create a variable local time and use
the function localtime to which you pass the epc variable now print out local
time so as you see here now we actually have numbers that make sense to us we are
in the year 2019 the third month which is march date is 18 12 hours 42 minutes 47
seconds so this is something we can actually decipher now again this is a
structure it's not in a traditional formatted way so if you want to extract
certain information from this structure you can easily do it saying local time
that is you use the variable and then just use these keys in here to extract that
particular value for example i want to extract the year so i say tm underscore
year print that and you see here the year alone is extracted now if i want the
entire thing to be displayed in a more formatted way that's exactly what we have
the function ctime for so you say print and use the function ctime again to
this you need to pass the epoch value that's epc run our program so now you see
we have it in a more formatted way monday march 18th so we even got the day of
the week in here so those are the basic things you can do with the time module
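As a rough sketch, the time module steps above come down to a few lines. The variable names epc and local_time follow the narration.

```python
import time

epc = time.time()                   # seconds since 1 January 1970 (the epoch)
print(epc)

local_time = time.localtime(epc)    # struct_time with named fields
print(local_time)
print(local_time.tm_year)           # pull out a single field, here the year

print(time.ctime(epc))              # readable string with weekday, month, day
```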
let's move on to our next module so the next module we'll look at is the smtp
module now just like http smtp is also a protocol this is a protocol for sending
emails so it stands for simple mail transfer protocol so first we import
smtplib now in this demo what we'll be doing is of course we'll be sending out
an email so we'll send an email to someone's email id through python now
obviously python itself cannot send an email out what you have to do is you'll be
accessing a domain so you'll give the username and password for this domain and
you'll put in commands here such that it automates a process of sending a mail
from that particular domain to another email id so first we need to set a domain
and a port and for that we use the function smtplib dot smtp and this
takes two arguments the first argument is our domain name now the domain name is
basically like if you're using gmail it'll be smtp.gmail.com if you're using
hotmail it would be smtp.hotmail.com and so on so i'll be using smtp.gmail.com
and the second argument is the port number which is generally 587 now 587 is the
standard port number for connections that are encrypted using tls
now that you have set the domain and the port number the next thing you need to
do is make these servers communicate with one another that is basically the
sender server telling the receiver server hi i'm present here and the receiver
server gives an appropriate response now in here we'll be using smtp dot e h l o
instead of h e l o that is basically just an extended version of helo it
adds a few additional features to your smtp now before that we need a variable
into which we'll store the object returned by this function so i'll have smtp obj
so smtp obj is the object through which we'll be referencing our domain and port
number so now you say smtp obj dot ehlo now the next thing we do is we'll put our
smtp in the tls mode so for that we use the function starttls so as i mentioned
before guys this port number is the standard port number for tls so now it must
be clear why we are using this particular port number it's because we are using
the tls connection so you say starttls okay so now the connection is set up next
we log in to the account through which we'll be sending out the email so you say
smtp obj dot login and now
you pass two parameters in here the first is of course the email id and that is
followed by your password so to create our file first we need to import the os
package and in particular we need to import path library from the os package now
create a function and to this function i'll be passing the path at which i'll create
my new file so now before you create the file you actually need to check if the
file is already present so you say if not path dot isfile dest so you're passing
the destination here to this function isfile and if your file is already
present then this portion of the code will be skipped because the entire purpose
of our program is to build a file
which we do not need to do if it already exists now in here you say open dest
comma w so guys this function isfile actually returns true only if this path is
present and this particular path is actually a file so suppose this destination
whatever path this is is present but it's a folder not a file we'll still enter this
particular block and so in here what you do is you open this file in write mode
and say f dot write welcome to python scripting so let's assume there is no file
at the destination yet you enter this block here a file is created which is opened
in write mode and within that file you enter the content welcome to python scripting
now you're ready to close the file now outside here i'll assign the path to
dest so this is the path guys i want to create this file sample.txt under
this path so basically along with the path name where you're creating your file
you also pass the file name and then you call this function create file passing
dest and once your file is created you can just print out file created right okay
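A sketch of the file-creation script follows. The video writes to a folder on the desktop; a fresh temporary directory is assumed here instead so the example runs anywhere.

```python
import tempfile
from os import path


def create_file(dest):
    # only build the file when nothing exists at that path yet
    if not path.isfile(dest):
        with open(dest, "w") as f:      # "with" closes the file for us
            f.write("welcome to python scripting")


# assumed location: a new temporary directory rather than the desktop
dest = path.join(tempfile.mkdtemp(), "sample.txt")
create_file(dest)
print("file created")
```

Using a with block is a small design choice: it replaces the explicit close call mentioned in the narration and still closes the file if the write fails.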
now we need to run our code well we don't need an input from the user so we say
print so we print file created now let's run our code according to the statement
here there's no error our program executed completely and created a file so now
let's check if this actually worked so i go to my desktop this is the folder
where sample should have been created and yes it is we'll open the file and as
you can see the content too matches so we were successfully able to automate the
creation of a file so the first argument is the email id so you have the email id
within quotes put it within single quotes and then your second is the password
for this particular id and that's my password so with these credentials your smtp
will basically log in to your account and then use the services of this account
to send an email now the next
function that we'll be using is the one that basically sends your email out so
that's called sendmail and in here you will put in your from and to email ids
okay first is the from id and next is the id that we are sending the email to i'll
be sending an email to the id anj2 simply learn at gmail.com now the third
argument to this function is the message that we'll be actually sending so i'll
be sending a very short message here the first thing you enter is the subject and
the subject would be smtp check and then to move on to the next line which is
basically the body of your message put in backslash n this is a test email and
that's it that's the body of my message with the subject preceding it and the
last thing we need to do is we will quit the smtp connection so once your mail is
sent out we'll close the connection this terminates the program let's now run
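The smtplib steps walked through in this demo can be sketched as one function. The helper names build_message and send_test_mail and their parameters are hypothetical. The function is defined but not called, because it needs real credentials and network access, and some providers only accept an app-specific password.

```python
import smtplib


def build_message(subject, body):
    # a blank line separates the headers from the body of the mail
    return f"Subject: {subject}\n\n{body}"


def send_test_mail(sender, password, recipient):
    """Hypothetical helper: log in and send one short test mail."""
    smtp_obj = smtplib.SMTP("smtp.gmail.com", 587)   # domain and TLS port
    try:
        smtp_obj.ehlo()                  # introduce ourselves to the server
        smtp_obj.starttls()              # switch the connection to TLS
        smtp_obj.login(sender, password)
        message = build_message("smtp check", "this is a test email")
        smtp_obj.sendmail(sender, recipient, message)  # from, to, message
    finally:
        smtp_obj.quit()                  # always close the connection
```

A call would look like send_test_mail("me@example.com", "my-password", "you@example.com"), where all three values are placeholders.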
this code so programs run now to check if this actually worked we need to go
open our account at anj2 simply learn and check if this particular message is
present in our inbox so guys this is the mailbox that i sent the email from and
this is the one that i sent the email to as you can see here anj2 simply learn
and here is the message which has been sent so we checked out three libraries in
python the os library the smtp and the time library next we'll check out a few
programs which determine things at run time so now i'm going to write a function
func1 and this function will receive a few arguments i'll fill in the space
later on now all this function does is it prints out the various arguments that
it receives so suppose the arguments are stored in i and what you print here is
also i so you pass a number to this function and this function prints out that
number so out here we'll be calling the function and passing a value to it say
10 now what if i want to pass a varying number of arguments so say 10 20 30 for
this particular run and maybe the next one i want to pass four arguments i want
to pass 40 too so how do we make this function flexible enough that it accepts any
number of arguments and prints these out that is where star args comes in
so when you put this within the brackets what happens is that your function will
take any number of arguments that you pass to it and store them in the variable
args so now to print out each value that's passed to this function you take a for
loop and you iterate that over args and then one by one you can print each of
these values out so let's run this program now and as you see 10 20 30 40 each of
them has been printed out on a separate line and the beauty with this function
is you can change the number of values that you print in fact you can even give
different types of arguments so i have all integers here followed by a string and
it all works so that is where you use the variable args now similarly we have
another variable which is used in this argument space which is called kwargs so
that's k w a r g s and you always precede this with two stars two asterisks so
far we looked at how a function can accept a variable number of arguments now we
look into how the function can do this while accepting the arguments as values to
labels so what i mean here is instead of just passing 10 20 30 what if i want to
pass a equal to 10 b equal to 20 c equal to 30 and again the number of such pairs
could vary so in that case we have kwargs with args for the varying part and kw
for the keyword or label part so what you do here in a similar manner you will
have a for loop which will iterate through kwargs and we want to extract every
item in it so we say kwargs dot items and print out i run this code and as you
can see here you get all
the pairs printed out so we saw how python allows a function to determine the
number of arguments it receives at runtime instead of it being fixed beforehand
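Both variants can be sketched side by side. The video reuses one function for both; the second name, func2, is an assumption made here to keep them apart.

```python
def func1(*args):
    # args is a tuple holding every positional argument that was passed
    for i in args:
        print(i)


def func2(**kwargs):
    # kwargs is a dict mapping each label to its value
    for i in kwargs.items():
        print(i)


func1(10, 20, 30, 40, "hello")   # any number and any mix of types
func2(a=10, b=20, c=30)          # any number of label=value pairs
```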
itself now let's move on to another program which demonstrates automation and
that is creation of a file so this is one of the most common and the most basic
things that you come across when you're learning scripting not just in python but
any language so so far we saw how a function works we have a function name and
within it we have some code to perform a certain function now functions in python
can also be nested and that is what we look at now so i'll first create a
function func one and in here i'll have a variable say x which is equal to 10 and
within this function i'll create another function so that will be the nested
function func2 and to func2 i'll pass x and within func2 i'll just
increment x by 1 and return this value so here's the thing now within function 1
we need to have a call to function 2 so you say return func 2 and you pass x to it
and out here we'll call function1 so result equal to func1 print out the value of
result so what happens here is you're calling the function func one control goes
in here x is assigned the value 10 and over here there's a definition for func2
but right now it's skipped it moves on to this return line now in the return line
you're basically returning func 2. so what that does is it will call function2
along with the argument x and function 2 returns x plus 1. so where will this
value be returned of course it will be returned to the statement from where the
call is made so that is still within function 1 and that means function 1 will
return the value that function 2 is returning so what we have in result would be
10 plus 1 which is 11 let's now run this code so as you see here 10 plus 1 11 is
what's passed that is how nested functions work now why are nested functions
important they are important to understand the concept of how functions can also
be passed as an object in python and why we would do so so i'll delete this
entire code create a function func one once again and in here i'll print this is
first function so that's func one now within function one i'll have my second
function the nested function and within the nested function i'll print out this
is the nested function and out here i'll have a third function which i'll call
the outer function which will print out this is the outer function now what i can
do here is instead of directly making a call to outer function so sometimes what
happens is you do not want to make a direct call to a particular function you
want to make a direct call to say function a and through function a you want to
call a function b or c so for that what you do is you make a call to func one and
as an argument to func one you pass another function name as an object so i
pass outer func so as you see here guys outer func is essentially a function but
i'm passing this function as an object to func one so func one will receive
this function i'll store that in a variable called func and func will
also be passed to our nested function and within the nested function we'll make a
call to this passed function and of course finally the return statement where we'll
be calling our nested function let's now run this code our outer function was not
called and that's because we did not put these parentheses here without which it
is not really a call statement let's run this code now and as you see all our
three functions were called but through a call statement for just one so we just
called func one passed outer func as an argument
to this and that took care of executing all the three functions that we have this
kind of a situation does not come up very often but in certain applications
where you need to decide which function needs to be executed at runtime
having the capability of passing a function name as an object really helps
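A sketch of passing a function as an object, following the narration. The names func1, nested and outer_func mirror the spoken ones; the parameter f is an assumption.

```python
def func1(func):
    print("this is the first function")

    def nested(f):
        print("this is the nested function")
        f()                      # the parentheses make this an actual call

    return nested(func)          # hand the received function on and run it


def outer_func():
    print("this is the outer function")


func1(outer_func)                # pass the function object, no parentheses
```

Writing f without parentheses inside nested would only name the function, which is the mistake corrected in the video.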
so we saw how you can create functions at runtime now python also allows you to
create classes at runtime so let's check that out now the term for creating class
at run time is called factory so what you do here is you have to basically create
the skeleton or just write down the basic description for all your possible
classes so you'll mention the class that it inherits from and the attributes
associated with these classes but you do not create it you just give that sort of
definition and then later you'll have certain expressions to evaluate which class
is to be created depending on certain situations so let's begin creating the
definition for our classes so my first class would be a base class of course so
you say type with the name base class and this will inherit from the mother of
all classes which is the object class and following this you'll have the list of all the
attributes associated with this class so i have no attributes associated with my
base class now the second class i create c1 will inherit from my base class which
is b and with this class c1 we'll have an attribute named val associated to it to
which we'll assign
the value 5. now my third class c2 will be about the same thing as c1 just that
this time my attribute val will have a different value 10 just so we can
differentiate between the two and of course our class name would be c2 now we'll
have a function say def class creator and to this i'll pass a variable say bool
now if the variable is true you say return c1 and that will essentially create
your class c1 else return c2 which will create class c2 now outside here i'll
call the class creator function and i'll call it with both the values true and
false so we can see the difference now of course this just creates a class so to
extract an attribute value which is how we'll be able to differentiate between
the creation of the two classes say dot attribute name which is val in our case
and in a similar manner we'll call class creator again but this time with false
and extract the value so let's run the code now so as you can see here guys first
time we pass true to class creator and if it is true it's creating class c1 which
is this and that has the value 5 associated with attribute val similarly when we
pass false it creates c2 with the value 10 associated with the attribute val so
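The class factory can be sketched with the three-argument form of the built-in type, which takes a name, a tuple of base classes and a dict of attributes. The parameter name flag is an assumption; the narration calls it bool, which would shadow the built-in.

```python
# type(name, bases, attributes) builds a class object at run time
BaseClass = type("BaseClass", (object,), {})
C1 = type("C1", (BaseClass,), {"val": 5})
C2 = type("C2", (BaseClass,), {"val": 10})


def class_creator(flag):
    # decide at run time which class to hand back
    return C1 if flag else C2


print(class_creator(True).val)    # attribute of C1
print(class_creator(False).val)   # attribute of C2
```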
that's how python allows you to create classes even during runtime so now that we
looked into that we are back to our nested functions or program in which we call
functions as objects now suppose i want to call this outer function alone now the
question is is this the only way of calling the outer function within the func
one what i mean is func one is basically your wrapper function right now and
every function that's called through func one comes inside this so i want to call
all the functions in the same manner in the same serial as before but is there an
easy way to do this well yes there is this is where decorators come in handy so
just on top of outer func if i say at func1 this allows me to call outer
func through func one but just with this one statement in this format so i just
say outer func but now that i have added the decorator func one although i'm
calling outer func it'll be called through func one just as before let's run this
code and as you see here your func one has been called first and outer func has
just been passed as an argument to func one so that is what decorators basically
do they allow you to change the operation or the functionality of a function
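Here is a sketch of the decorator idea in its more common form. In the video, func1 calls the nested function straight away; the version below returns the nested function instead, so the decorated name stays callable. The use of functools.wraps is an addition made here, not something shown in the video.

```python
import functools


def func1(func):
    @functools.wraps(func)              # keep the original name and docstring
    def nested(*args, **kwargs):
        print("this is the first function")
        print("this is the nested function")
        return func(*args, **kwargs)    # finally run the decorated function
    return nested                       # return the wrapper, do not call it


@func1                                  # same as: outer_func = func1(outer_func)
def outer_func():
    print("this is the outer function")


outer_func()
```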
thanks guys now we have apeksha to teach you about five important libraries
used in python python is the most widely used programming language today when it
comes to solving data science tasks and challenges python never ceases to
surprise its audience most data scientists out there are already leveraging the
power of python every day hi i'm apeksha from simplilearn and well after some
thought and a bit more research i was finally able to narrow down my choice of
top python libraries for data science what are they let's find out so let's talk
about this amazing library tensorflow which is also one of my favorites so
tensorflow is a library for high performance numerical computations with around
35 000 github commits and a vibrant community of around 1500 contributors and
it's used across various scientific domains it's basically a framework where we
can define and run computations which involves tensors and tensors we can say
are partially defined computational objects again where they will eventually
produce a value that was about tensorflow let's talk about the features of
tensorflow so tensorflow is majorly used in deep learning models and neural
networks where we have other libraries like torch and theano also but tensorflow
has hands down better computational graphical visualizations when compared to
them also tensorflow reduces the error largely by 50 to 60 percent in neural
machine translations it's highly parallel in a way where it can train multiple
neural networks and multiple gpus for highly efficient and scalable models this
parallel computing feature of tensorflow is also called pipelining also
tensorflow has the advantage of seamless performance as it's backed by google it
has quicker updates frequent new releases with the latest of features now let's
look at some applications tensorflow is extensively used in speech and image
recognition text based time series analysis and forecasting and various other
applications involving video detection so my favorite thing about tensorflow is
that it's already popular among the machine learning community and most are open to
trying it and some of us are already using it now let's look at an example of a
tensorflow model in this example we will not dive deep into the explanation of
the model as it is beyond the scope of this video so here we're using the mnist
dataset which consists of images of handwritten digits handwritten digits can be
easily recognized by building a simple tensorflow model let's see how when we
visualize our data using matplotlib library the inputs will look something like
this then we create our tensorflow model to create a basic tensorflow model we
need to initialize the variables and start a session then after training the
model we can validate the data and then predict the accuracy this model has
predicted 92 percent accuracy which is pretty good for this model so that's all
for tensorflow if you need to understand this tutorial in detail then you can go
ahead and watch our deep learning tutorial from simplilearn as shown in the
right corner interesting right let's move on to the next library now let's talk
about a common yet a very powerful python library called numpy numpy is a
fundamental package for numerical computation in python it stands for numerical
python as the name suggests it has around 18 000 commits on github with an
active community of 700 contributors it's a general purpose array processing
package in a way that it provides high performance multi-dimensional objects
called arrays and tools for working with them also numpy addresses the slowness
problem partly by providing these multi-dimensional arrays that we talked about
and then functions and operators that operate efficiently on these arrays
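To make the point about operators working directly on arrays concrete, here is a small sketch. The array values are made up for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# operators apply element by element, with no explicit python loop
print(a + b)
print(a * 2)

# functions such as sum also work on the whole array at once
print(a.sum())
```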
interesting right now let's talk about features of numpy it's very easy to work
with large arrays and matrices using numpy numpy fully supports object oriented
approach for example coming back to nd array once again it's a class possessing
numerous methods and attributes nd array provides for larger and repeated
computations numpy offers vectorization it's faster and more compact than
traditional methods i always wanted to get rid of loops and vectorization of
numpy clearly helps me with that now let's talk about the applications of numpy
numpy along with pandas is extensively used in data analysis which forms the
basis of data science it helps in creating the powerful n-dimensional array
whenever we talk about numpy we cannot do it without the
mention of the powerful n-dimensional array also numpy is extensively used in
machine learning when we are creating machine learning models as in where it
forms the base of other libraries like scipy scikit-learn etc when you start
creating the machine learning models in data science you will realize that all
the models will have their basis in numpy or pandas also when numpy is used with
scipy and matplotlib it can be used as a replacement of matlab now let's look at
a simple example of an array in numpy as you can see here there are multiple
array manipulation routines like there are basic examples where you can copy the
values from one array to another we can give a new shape to an array from maybe
one dimensional we can make it as a two dimensional array we can return a copy of
the array collapsed into one dimension now let's look at an example where this is
a jupyter notebook and we will just create a basic array and for a detailed
explanation you can watch our other videos which target the explanations of
each library so first of all whenever we are using any library in python we
have to import it so now this np is the alias which we will be using let's create
a simple array let's look at what the type of this array is so this is an ndarray
type of array also let's look at what the shape of this array is so this is the shape
of the array now here we saw that we can expand the shape of the array so this
is where you can change the shape of the array using all those functions now
let's create an array using the arange function if i give arange 12 it will give
me a 1d array of 12 numbers like this now we can reshape this array to 3 comma 4
or we can write it here itself so this is how the arange function and the reshape
function work for numpy now let's discuss the next library which is scipy
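Before moving on to scipy, the numpy steps just described could be sketched roughly as follows. The exact cells from the notebook are not shown in the transcript, so the values and the use of `np.expand_dims` for "expanding the shape" are assumptions.

```python
import numpy as np                    # np is the conventional alias

arr = np.array([1, 2, 3])             # a simple 1-D array
print(type(arr))                      # the ndarray class
print(arr.shape)                      # one axis with three elements

# assumed illustration of "expanding" the shape: add a leading axis
row = np.expand_dims(arr, axis=0)

flat = np.arange(12)                  # 1-D array of the 12 numbers 0..11
grid = flat.reshape(3, 4)             # reshape afterwards ...
grid_direct = np.arange(12).reshape(3, 4)   # ... or chain it in one line
flat_again = grid.flatten()           # copy collapsed back into one dimension
```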
this is another free and open source python library extensively used in data
science for high level computations so this library as the name suggests stands
for scientific python and it has around 19,000 commits on github with an active
community of 600 contributors it is extensively used for scientific and technical
computations also as it extends numpy it provides many user-friendly and
efficient routines for scientific calculations now let's discuss some
features of scipy so scipy has this collection of algorithms and functions which
is built on the numpy extension of python secondly it has various high level
commands for data manipulation and visualization also the ndimage module of scipy
is very useful in multi-dimensional image processing and it includes built-in
functions for solving differential equations linear algebra and many more so that was
about the features of scipy now let's discuss its applications so scipy is used
in multi-dimensional image operations it has functions to read images from disk
into numpy arrays to write arrays to disk as images resize images etc solving
differential equations fourier transforms then optimization algorithms linear
algebra etc let's look at a simple example to learn what kind of functions are
there in scipy here i'm importing the constants package of the scipy library so in
this package it has all the constants so here i am just mentioning c or h or
N_A and this library already knows what it has to fetch like speed of light
planck's constant etc so this can be used in further calculations data analysis
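The constants lookup described above might look like the sketch below. The photon-energy calculation and the 500 nm wavelength are hypothetical additions, included only to show a constant being used in a further calculation.

```python
from scipy import constants

speed_of_light = constants.c          # metres per second
planck = constants.h                  # joule seconds

print(speed_of_light, planck)

# hypothetical follow-up calculation: energy of a photon, E = h * c / wavelength
wavelength = 500e-9                   # assumed wavelength of 500 nm, in metres
photon_energy = planck * speed_of_light / wavelength
print(photon_energy)
```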
is an integral part of data science data scientists spend most of the day in data
munging and then cleaning the data also hence mention of pandas is a must in
the data science life cycle yes pandas is the most popular and widely used python
library for data science along with numpy and matplotlib the name itself stands
for python data analysis with around 17,000 commits on github and an active
community of 1200 contributors it is heavily used for data analysis and cleaning
as it provides fast flexible data structures like data frames and series which are
designed to work with structured data very easily and intuitively now let's talk
about some features of pandas so pandas offers this eloquent syntax and rich
functionalities like there are various methods in pandas like dropna and fillna
which give you the freedom to deal with missing data also pandas provides a
powerful apply function which lets you create your own function and run it across
a series of data now forget about writing those for loops while using pandas also
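A small sketch of the missing-data methods and the apply function mentioned above. The series values and the `double` helper are made up for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])     # a series with one missing value

dropped = s.dropna()                  # remove the missing entry
filled = s.fillna(0.0)                # or replace it with a chosen value

def double(x):
    """Hypothetical user-defined function to run across the series."""
    return x * 2

doubled = filled.apply(double)        # no explicit for loop needed
print(doubled)
```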
this library is a high level abstraction over low level numpy which is largely written in
c then it also contains these high level data structures and manipulation
tools which make it very easy to work with data like data frames and
series now let's discuss the applications of pandas so pandas is extensively used
in general data wrangling and data cleaning then pandas also finds its usage in
etl jobs for data transformation and data storage as it has excellent support for
loading csv files into its data frame format then pandas is used in a variety of
academic and commercial domains including statistics finance neuroscience
economics web analytics etc then pandas is also very useful in time series
specific functionality like date range generation moving window linear regression
date shifting etc now let's look at a very simple example of how to create a data
frame so data frame is a very useful data structure in pandas and it has very
powerful functionalities so here i'm only listing important libraries in data
science you can explore more of our videos to learn about these libraries in
detail so let's just go ahead and create a data frame i'm using jupyter notebook
again and in this before using pandas here i'm importing the pandas library let
me go and run this so in data frame we can import a file a csv file excel files
there are many functions doing these things and we can also create our own data
and put it into data frame so here i am taking random data and putting in a data
frame also i'm creating an index and then also giving the column names so pd is
the alias we've given pandas random data of 6x4 index which is taking a range of
six numbers and column name i'm giving as abcd now let's go ahead and look at it
so here it has created a data frame with my column names abcd my index has six
numbers zero to five and a random data of six by four so dataframe is just
another table with rows and columns where you can apply various functions to it
also i can go ahead and describe this data frame to see so it's giving me all
these statistics like count and mean and standard deviation etc okay so
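The data frame walk-through above could be reproduced with something like this sketch. The transcript only says "random data of 6x4", so the use of `np.random.randn` is an assumption.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.random.randn(6, 4),            # assumed source of the 6x4 random data
    index=np.arange(6),               # index 0..5
    columns=list("ABCD"),             # column names A, B, C, D
)
print(df)

summary = df.describe()               # count, mean, std, min, quartiles, max
print(summary)
```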
that was about pandas now let's talk about the next library and the last one so
matplotlib for me is the most fun library out of all of them why because it has
such powerful yet beautiful visualizations we'll see in the coming slides plot
in matplotlib suggests that it's a plotting library for python it has around
26,000 commits on github and a very vibrant community of 700 contributors and
because of such graphs and plots that it produces it's majorly used for data
visualization and also because it provides an object oriented api which can be
used to embed those plots into our applications let's talk about the features of
matplotlib the pyplot module of matplotlib provides a matlab-like interface so
matplotlib is designed to be as usable as matlab with an advantage of being free
and open source also it supports dozens of back-ends and output types which means
you can use it regardless of which operating system you're using or which output
format you wish pandas itself can be used as a wrapper around matplotlib's api so
as to drive matplotlib via cleaner and more modern apis also when you start using
this library you will realize that it has very little memory consumption and
very good runtime behavior now let's talk about the applications of matplotlib
it's important to discover the unknown relationship between the variables in
your data set so this library helps to visualize the correlation analysis of
variables also in machine learning we can visualize the 95 percent confidence
interval of the model just to communicate how well our model fits the data then
matplotlib finds its application in outlier detection using scatter plots etc
and to visualize the distribution of data to gain instant insights now let's make
a very simple plot to get a basic idea i've already imported the libraries here
so this function matplotlib inline will help you show the plots in the jupyter
notebook this is also called a magic function i won't be able to display my plots
in the jupyter notebook if i don't use this function i'm using this function in
numpy to fix random state for reproducibility now i'll take my n as 30 and will
assign random values to my variables so this function is generating 30 random
numbers here i'm trying to create a scatter plot so i want to decide the area
let's put this so this is multiplying 30 with random numbers to the power 2 so
that we get the area of the plot which we will see in just a minute so using the
scatter function and the alias of matplotlib as plt i've created this if i don't
use this then i have very small circles in my scatter plot it's colorful it's nice
so that's one very easy plot i suggest that you explore more of matplotlib and
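A rough sketch of the scatter plot just described. The seed value, the `alpha` setting and the Agg backend line are assumptions; in a jupyter notebook you would use the `%matplotlib inline` magic instead of selecting a backend.

```python
import matplotlib
matplotlib.use("Agg")                 # assumption: render without a display
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(0)                     # fix random state (seed value assumed)

n = 30
x = np.random.rand(n)                 # 30 random x positions
y = np.random.rand(n)                 # 30 random y positions
colors = np.random.rand(n)            # one random colour value per point
area = (30 * np.random.rand(n)) ** 2  # marker areas; without s= the dots are small

points = plt.scatter(x, y, s=area, c=colors, alpha=0.5)
```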
i'm sure you will enjoy it let's create a histogram so i'm using the style
ggplot and assigning some values to these variables any random values now we
are assigning bars and colors and alignment to the plot and here we get the graph
so we can create different type of visualizations and plots and then work upon
them using matplotlib and it's just that simple so that was about the leading
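The second plot is only described loosely (ggplot style, bars, colours, alignment), so the sketch below is a guess at its shape using a bar chart. The labels and values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")                 # assumption: render without a display
import matplotlib.pyplot as plt

plt.style.use("ggplot")               # the ggplot style mentioned in the video

labels = ["a", "b", "c", "d"]         # hypothetical category names
values = [5, 3, 8, 6]                 # hypothetical values
positions = range(len(labels))

fig, ax = plt.subplots()
bars = ax.bar(positions, values, color="green", align="center")
ax.set_xticks(list(positions))
ax.set_xticklabels(labels)
```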
python libraries in the field of data science but along with these libraries data
scientists are also leveraging the power of some other useful libraries for
example like tensorflow keras is another popular library which is extensively
used for deep learning and neural network models keras wraps both tensorflow
and theano backends so it is a good option if you don't want to dive into details
of tensorflow then scikit-learn is a machine learning library it provides almost
all the machine learning algorithms that you need and it is designed to
interoperate with numpy and scipy then we have seaborn which is another library for
data visualization we can say that seaborn is an enhancement of matplotlib as it
introduces additional plot types now richard will focus on libraries like numpy
pandas matplotlib and scikit learn following this richard will also focus on
helping you understand web scraping better welcome to numpy what's in it for you
well today we're going to do part one of numpy in a two-part series now we're
going to go over what is numpy installing and importing numpy numpy array numpy
array versus python list basics of numpy finding size and shape of any array
range and arange functions numpy string functions and then in part two i'll move
on to cover axes array manipulation and much more so let's start with what is
numpy numpy is the core library for scientific and numerical computing in python
it provides high performance multi-dimensional array object and tools for
working with arrays and i'll go a step further and say there are so many other
modules in python built on numpy so the fundamentals of numpy are so important to
latch onto in python so you can understand the other modules and what
they're doing numpy's main object is a multi-dimensional array it's a table of
elements usually numbers all of the same type indexed by a tuple of non-negative
integers in numpy dimensions are called axes take a one-dimensional array where we
have remember dimensions are also called axes you can say this is the first axis
zero one two three four five and you can see down here it has a shape of six why
because there are six different elements in it in the one dimensional array and they
usually denote that as six comma with nothing after the comma and then we have a
two dimensional array where you can see zero one two three four five six seven
and in here we have two axes or two dimensions and the shape is two four so if
you were looking at this as a matrix or in other mathematical functions you can
see there's all kinds of importance on shape we're not going to cover shape today
but we will cover that in part two
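The two arrays from the slide can be checked directly. This is a minimal sketch using the same values the narration reads out.

```python
import numpy as np

one_d = np.array([0, 1, 2, 3, 4, 5])
two_d = np.array([[0, 1, 2, 3],
                  [4, 5, 6, 7]])

print(one_d.ndim, one_d.shape)        # one axis, shape written as (6,)
print(two_d.ndim, two_d.shape)        # two axes, shape (2, 4)
```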
did you know that numpy's array class is called ndarray for n-dimensional array
now we're going to take a detour here because we're working in python and two of
my favorite tools in python are the jupyter notebook and then i like to use that
sitting on top of anaconda and if you flip over to jupyter.org that's j-u-p-y-t-
e-r.org you can go in here you can install it off of here if you don't want to
use the anaconda notebook but this is the jupyter setup the documentation on
jupyter jupyter opens up in your web browser that's what makes it so nice is it's
portable the files are saved on your computer they do run in ipython or interactive
python and you can create all kinds of different environments in there which i'll
show you in just a minute i myself like to use anaconda that's www.anaconda.com
if you install anaconda it will install the jupyter notebook with anaconda
or separately you can install jupyter notebook on its own and it'll run completely separate
from anaconda's jupyter notebook and you can see here i've now opened up my
anaconda navigator what i like about the navigator this is a fresh install on a
new computer which is always nice i can launch my jupyter notebook from in here i
can bring in other tools so anaconda does a lot more and under environments i
only have the one environment and i can open up the terminal specific to this
environment this one happens to have python 3.7 in it the most current version as
of this tutorial and then you open a terminal if you're going to do your pip
installs and stuff like that for different modules you can also create different
environments in here so maybe you need a python 3.6 or python 3.5 you can see where
having a nice framework like anaconda really helps so you don't have to track
that on your own in the jupyter notebook in your different jupyter notebook
setups we'll go ahead and launch this jupyter notebook and then i've set my
browser window for a default of chrome so it's going to open up in chrome and you
can see here this opens up a folder on my computer we have a couple different
options on here remember i set the environment up as python 3.7 you would install
any additional modules that aren't already installed in your python on this and
it keeps them separate so you do have to for each environment install the
separate modules so they match the environment on there and in here we have a
couple things we can look up what's running you have your different clusters
again this is i just installed this on a new machine so i just have the one a
couple things in here that were run on here recently and what we do on here is we
then have on the upper right new and from the pull down menu you'll see python 3
and this will open up a new window and now we're in a jupyter python notebook so this is a
python window and we'll just do a print and this of course is let's go hello
world and we'll run that and it prints out hello world in the command line
there's a couple special things you have to know we're not going to do today
which is on graphics if you've never seen this one of the things you can do you
can also do a equals hello world and if you just put the a in there now if you do
a bunch of these where you have a equals hello world b equals goodbye world and
you put a and then b on separate lines you'll only see the last one but you can see here if
you put the variable down here it will show you what's in that variable and that
has to do with the jupyter notebook inline coding so that's not basic python
that's just jupyter notebook shorthand which you'll see in a little bit so back
to our numpy numpy array versus python list python list being the basic list in
your python why should we use numpy array when we have python list well first
it's fast the numpy array has been optimized over years and years by multiple
programmers and it's usually very quick compared to the basic python list setup
it's convenient so it has a lot of functionality in there that's not in the basic
python list and it also uses less memory so it's optimized both for speed and
memory use and let's go ahead and jump into our jupyter notebook since we're
coding best way to learn coding is to code just like the best way to learn how to
write is to write and the best way to learn how to cook is to cook so let's do some
coding here today and just like any module we have to import numpy we almost
always import it as np that is such a standard so you'll see that very commonly
we can just run that and now we have access to our numpy module inside our python
and then the most common thing of course is to go and create a numpy array and
in here we can send it a regular list and so we'll go ahead and send this a
regular array uh let's do one two three to make it simple and then i'm just going
to type in a and we'll run this as you can see down here the output is an array
of one two three and we could also do print just a reminder that this is an
inline feature so that wouldn't work if you're using a different editor you can
see that it's array 1 2 3 but we'll go and leave it as is it's kind of a nice feature so
you can see what you're doing really quick in the jupyter notebook and just like
all your other standard arrays i can go a of 0 which is going to be a value of 1
of course we do a of 1 you go all the way through this a of 1 has a value of 2
in it so whether you're using the numpy array or the basic python list that's
going to be the same that should all look pretty familiar and be pretty
straightforward remember the first index is always 0 when we set it up on there so
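A minimal sketch of the array creation and indexing just walked through:

```python
import numpy as np

a = np.array([1, 2, 3])               # build an array from a regular list
print(a)                              # in a notebook, a bare `a` also displays it

first = a[0]                          # indexing starts at 0, so this is 1
second = a[1]                         # and this is 2, same as a python list
print(first, second)
```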
let's take a look why we're using numpy because we went over the slide a little
bit but let's just take a look and see what that actually looks like and what we
want to look at is the fact that it's fast convenient and uses less memory so
let's take a glance at that in code and see what that actually looks like when
we're writing it in python and what the differences are and to do this i'm going
to go ahead and import a couple other modules we're going to import the time
module so we can time it and we're going to import the sys module so that we
can take a look at how much memory it uses and we'll go and just run those so
those are imported so we'll do b equals oh range of one yeah one thousand is
fine and so that's going to create a sequence of one thousand values zero to nine hundred
ninety nine remember it starts at zero and it stops right at the one thousand
without actually going to the one thousand and let's go ahead and print and we
want sys dot getsizeof and we'll pick any integer because we have you know
zero to a thousand we'll just throw one in there five it doesn't matter
whatever integer we put in there it's going to generate the same value
because we're looking at how much memory it takes to store an integer
and then we want to have the length of b that's how many integers are in
there and if we go ahead and execute this and run this in a line we'll see oops
i did that wrong times not comma if we multiply them together we'll see it generates 28 000
so that's the size we're looking at is 28 000 i believe that's bytes that sounds
about right so let's go ahead and create this in numpy and we'll go with c
equals np dot arange so that's the numpy command to do the same thing that
we were just doing in a list and we'll also use the same value on there the 1000
and then once we've created the value of c with np dot arange let's go ahead
and print and we can do that by doing c dot size times c dot itemsize which
is very similar to what we did before with getsizeof so c dot size is the number
of elements in the array and the item size is just like before the size of one integer
in bytes let's just take a look and see
what that generates and wow okay we got four thousand versus twenty eight thousand a
significant difference in memory how much memory we're using with the array and
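The memory comparison above can be sketched as follows. The exact byte counts depend on the python build and on numpy's default integer type for the platform (the video shows 4 bytes per item; many current installs use 8), so no specific totals are claimed here.

```python
import sys
import numpy as np

b = range(1000)

# rough estimate for a python list: bytes of one small int object times the count
# (this ignores the list's own pointer storage, as the video does)
list_bytes = sys.getsizeof(5) * len(b)

c = np.arange(1000)
numpy_bytes = c.size * c.itemsize     # element count times bytes per element

print(list_bytes, numpy_bytes)
```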
then let's go ahead and take a look at speed let's do um oh let's do size we
tried this with lower values and it would happen so fast that the numpy array kept
coming up with zero because it just rounded it off so size and let's create an l1
equals range of size and we'll do an l2 we'll just set it up to the same thing it's
also range of size on there there we go and then we can do an a1 equals np dot
arange size and then let's do an a2 equals np dot arange we'll keep it the
same size and what we're going to do is we're going to take these two different
arrays and we're going to perform some basic functions on them but let's go
ahead and just load these up now we'll go ahead and run this so those are all set
in memory except for the typo here quickly fix that there we go so these are now
all loaded in here and let's do a start equals time dot time so it's just going
to look at my clock time and see what time it is and we'll do result equals and
let's do let's say we got an array and we're going to say let's do some addition
here x plus y for x comma y in and we'll zip it up here two different arrays
so here's our two different arrays we're gonna add each of the individual
things on here l1 l2 there we go so that should add up each value so l1 plus l2
each value in each array then we want to go ahead and print and let's say
python list took and then we'll do time dot time we'll just subtract the start
out of there so time whoops i messed up on some of the quotation marks on there
okay there we go time minus the start and we'll convert that to milliseconds so we'll
go times one thousand and let's hit the run on there
it's kind of fun because you also
get a view while we're doing this of some ways to manipulate the script and as
you can see also my bad typing there we go okay so we'll go ahead and run this
and we can see here that the python list took 34 milliseconds actually i have to go back and
look at the conversion on there but you can see it takes roughly 0.034 of a second
and we go ahead and print the result in here too let's do that we'll run that
just so you can see what the what kind of data we're looking at and we have the 0
four six eight so it's just adding them together it looks pretty straightforward
on there and if we scroll down to the bottom of the answer again we see python
list took 46 a little different time on there depending on what um core because
i have this is on an eight core computer so it depends on what core it's running
on what else is pulling on the computer at the time and let's go back up here and
do our start time paste that into here and this time we're gonna do a result
equals and this is really cool notice how elegant this is so straightforward this
is a lot of reason people started using numpy is because i can add the two arrays
together by simply going a1 plus a2 it makes a lot of sense both looking at it
and it's just very convenient remember that slide we're looking at fast
convenient and less memory so look how convenient that is really easy to read
real easy to see and i don't know if we don't need to print the result again so
let's just go ahead and print the time on here and we'll borrow this from the
top part because i really am a lazy typer and this isn't the python list this is
the numpy array let's go ahead and see how that comes out and we
get 2.99 so let's take a look at these two numbers 46 versus 2.99 so we'll just
round this up to 3 that's a huge difference that's like more than 10
times faster that's like 15 times roughly at a quick glance i'd have to go do the
math to look at it and it's going to vary a little bit depending on what's
running in the background the computer obviously so we've looked at this and if
we go back here we found out it's much faster yes there are going to be
different speeds depending on what you're doing with the array very convenient
easy to read and it uses less memory so that's the core of the numpy that's why a
lot of people base so many other modules on numpy and why it's so widely used so
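The timing experiment above might be reproduced with a sketch like this one. The value of `size` is never stated in the video, so 1,000,000 is an assumption, and the timings will vary by machine and background load.

```python
import time
import numpy as np

size = 1_000_000                      # assumed; the video does not state the value

l1 = range(size)
l2 = range(size)
a1 = np.arange(size)
a2 = np.arange(size)

# plain python: add element by element
start = time.time()
list_result = [x + y for x, y in zip(l1, l2)]
print("python list took:", (time.time() - start) * 1000, "ms")

# numpy: add the two arrays in one expression
start = time.time()
array_result = a1 + a2
print("numpy took:", (time.time() - start) * 1000, "ms")
```

Both results hold the same values (0, 2, 4, 6, ...); only the time taken differs.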
we did glance at a couple operations when we were looking at speed and size let's
dive into a little bit more into the basic operations these are always nice to
see i mean certainly you want to go get a cheat sheet if you're using it for the
first time you know look things up google is your friend we did this with the
most basic numpy dot array or np.array and we'll go ahead and create an array
let's do pairs one comma two and then let's do a three comma four and if we're
gonna do that let's do five comma six there we go and if we go ahead and take
this and run this and go ahead and do our a down here so it's in line and i'll
print that out you can see it makes a nice array for us so we have a and if you
look at that we have three different objects each with two values in them and
hopefully you're starting to think well how many dimensions or indexes is that
and you'll see three by two so let's go ahead and take a look and let's go how
about a dot ndim speaking of which we'll run that and we have two
dimensions and then we can do the item size so a dot itemsize we saw this
earlier up here where we multiplied
item size times the size of the array to get the total memory being used
and we should see four there four bytes per item memory is compact that's
always a good thing and then the shape the shape is so important when you're
working with data science and you're moving it from one format to another so we
have our shape we just talked about that we have three by two three rows by two
objects in each one generally i don't look too much at the size but the
dimensions i'm always looking up this is nice you can automate it so you might be
converting something you might need to know how many dimensions are going into
the next machine learning package so that you can automatically just have it send
that information over so we looked at a shape let's go and create a slightly
different array np dot array let's go ahead and just use our original setup
here and one of the features we can do which is really important is we can do d
type equals in this case let's do np dot float64 and so what we've done is
convert all of these into floats and we type in a and now instead of
having one two three four five six you see they're all float values one dot
there's no actual zero shown in there it's just one period two period
three period four period five period six period and this again in data science i
don't know how many times i've had to convert something from an integer to a
float so that's going to work correctly in the model i'm using so very common
features to be aware of and to be able to get around and use and we'll also do
let's just curiosity item size we'll go and run that and we see that it doubled
in size so it's not a huge increase well doubling is always a big increase in
computers but it's not a huge increase compared to what it would be if you're
running this in the python list format and then we did the shape earlier without
having it set to the float64 let's go ahead and do a shape with it set to float64 and
it should be the same three comma two so it all matches so we've gone through and
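The attribute checks from this part of the walk-through can be sketched as below. The integer item size is platform dependent (4 in the video, 8 on many current installs), so it is printed rather than assumed.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
print(a.ndim)                         # number of dimensions (axes)
print(a.itemsize)                     # bytes per element; platform dependent for ints
print(a.shape)                        # three rows by two columns

# same data, but forcing 64-bit floats with the dtype argument
f = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.float64)
print(f)                              # values display as 1., 2., 3. ...
print(f.itemsize, f.shape)            # a float64 takes 8 bytes; shape is unchanged
```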
remember if this is all brand new to you according to a
study at cambridge university if you're learning a brand new word
in a foreign language the average person has to repeat it 163 times before it's
memorized so a lot of this you build off of it so hopefully you don't have to
repeat it 163 times but we did manage to repeat it at least twice here if not a
little bit more and let's go ahead and take this we're going to look at one more
setup on here and let me just take this last statement here on the converting
our properties of our data and instead of float 64 let's do complex let's just
see what that looks like and let's go ahead and print that out and run it and so
we now have a complex data set up and you'll see it's denoted by the one dot plus
zero dot j and if we flip over here and do a basic search for numpy data types
better to go to the original web page but pull up a bunch of these you can see
there's a whole list of different numpy data types shorthand complex we have
complex complex64 complex128 complex number represented by two 64-bit floats real
and imaginary components one option on there float16 float32 float shorthand for
float64 most commonly used and of course all the different ones that you can
possibly put into your numpy array so we covered basic addition up there when we were
comparing how fast it runs and some very basic components how to set up a numpy
array how many dimensions it has item size data type and again we went to item
size and there's also the shape probably one of the more used i use shape all
the time very commonly used and then down here you can see where we actually
created a numpy complex data type so let's look at some other features in numpy
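A short sketch of the complex data type step, assuming the same six values as before:

```python
import numpy as np

z = np.array([[1, 2], [3, 4], [5, 6]], dtype=complex)
print(z)                              # entries display like 1.+0.j
print(z.dtype)                        # complex128: two 64-bit floats per element
print(z.itemsize)                     # 16 bytes per element
```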
one of them is you could do numpy dot zeros and we're gonna do three comma four
there we go and we'll go ahead and run this and you can see if i do np dot zeros
i create a numpy array of zeros this is really important i was rebuilding my own
neural network and i needed to create an array where i initialized the weights
and i want them all to be the same weight in this case i want them to start off
as zero for the particular project i was working on and there's other options
like you can do numpy ones and we'll do the same thing three comma four we'll run
that and you can see i've created an array of numpy ones in this case it comes
out as a float array and this is interesting to note because we have let's go
back to our python and do l equals range five and we'll print the l so there's our range
and if i run that it doesn't create the values until after the fact until you
actually iterate over it that's an upgrade in python 3 python 2.7 actually created the
list zero one two three four this one actually creates a range object and then once
it's used it then actually generates the values and if we do that in numpy arange
remember that from before and if we do a numpy arange 5 and let's do l equals
or we can just leave it as numpy that's fine there we go just run that you can
see there we actually get an array zero one two three four for the value
numpy arange five generates the actual array and for part one we're
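The zeros, ones, range and arange comparison above, as a minimal sketch:

```python
import numpy as np

zeros = np.zeros((3, 4))              # 3x4 array filled with 0.0
ones = np.ones((3, 4))                # 3x4 array filled with 1.0 (float by default)

lazy = range(5)                       # python 3: a range object, values made on demand
print(lazy)                           # displays the range object, not the numbers
print(list(lazy))                     # iterating over it produces 0..4

eager = np.arange(5)                  # numpy builds the actual array right away
print(eager)
```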
going to do just one more section on basic setup and we're going to
do a concatenation example there we go we're going to do
strings let's take a look at strings what's going on with there and let's do oh
let's see print let's do an np dot char something new here and we're going to
add and then here's our brackets for what we're going to add oh and let's say
let's do hello comma hi and in the brackets on there let's create another one
and this one's going to be a b c and we'll do x y z so we're just creating some
randomly making some up on here and then we'll go ahead and just print this if we
run that and come down here and of course make sure all your brackets are open
and closed correctly and then you can see in here when we concatenate the
example in numpy it takes
the two different arrays that we set up in there and it combines the hello with
the abc and the hi with xyz and we can also do something like print oh
let's do np dot char dot multiply so there's a lot of different functions in
here again you can look these up it's probably good to look them all up and see
what they are but it's good to also just see them in action let's do hello space
comma three and we'll run this one and once it runs without the error you'll see it
does hello hello hello so we multiplied it by three and we can also let's just
take this whole thing here instead of retyping it and we can do char dot center
so instead of multiply let's do center and over here keep our hello going take
the space out of there and let's do center at 20 and fillchar equals and
we'll fill it with dashes so if we run this you can see it prints out the hello
with dashes on each side and we keep going with that we can also in addition to
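The three string calls described above can be sketched with the `np.char` module. The exact strings typed in the video (for instance whether they include spaces) are not fully recoverable from the transcript, so the inputs here are approximations.

```python
import numpy as np

# element-wise concatenation of two arrays of strings
joined = np.char.add(["hello", "hi"], ["abc", "xyz"])
print(joined)

# repeat a string three times
repeated = np.char.multiply("hello ", 3)
print(repeated)

# centre a string in a field of width 20, padded with dashes
centered = np.char.center("hello", 20, fillchar="-")
print(centered)
```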
doing the fill function we can play with capitalize we can title we can do
lowercase we can do uppercase we can split split line strip join these are all
the most common ones and let's go ahead and just look at those and see what those
look like each one of them here we're going to do the hello world all-time
favorite of mine i always like to say hello universe and you can see here we do
capital h with the world but so we want to capitalize so capitalize is the first
one in the array so we get hello world on there and we can also take this and
instead of capitalizing another feature in here is title and let's just change
this to how are we doing how are you doing instead of do you let's run that
and you can see here because we created as a title it capitalizes the first
letter in each word and in this one we're going to do np.char.lower with two
different examples here we have an array we have hello world all capitalized and
we have just hello and you can see that one is an array and one is just a string
if we run that you get an array with hello world lowercase and hello lowercase
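the string helpers used up to this point all live in numpy's np.char module. a minimal sketch of those cells might look like the following, assuming the inputs read out in the video; the variable names are made up here for illustration and are not from the original notebook.

```python
import numpy as np

# element-wise concatenation of two arrays of strings
combined = np.char.add(["hello", "hi"], ["abc", "xyz"])
print(combined)

# repeat a string three times
repeated = np.char.multiply("hello ", 3)
print(repeated)

# pad "hello" out to a width of 20, filling both sides with dashes
centered = np.char.center("hello", 20, fillchar="-")
print(centered)

# capitalize: only the first letter of the string
capitalized = np.char.capitalize("hello world")
# title: the first letter of every word
titled = np.char.title("how are you doing")
# lower works on an array of strings as well as on a single string
lowered = np.char.lower(["HELLO", "WORLD"])
print(capitalized, titled, lowered)
```

each call returns a numpy array (a zero-dimensional one when a single string goes in), which is why the printed results can look slightly different from plain python strings.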
and if we're going to do it that way we can also do it the opposite way there's
also upper and let's paste those in there and you can see here we have
np.char.upper the opposite there on python data and we'll also do python is easy
hopefully you're starting to get the picture that most of the python and the
scripting is very simple it's when you put the bigger picture together and start
building these puzzles and somebody asks you hey i need the first letter
capitalized unless it's the title and then you start realizing that this
can get really complicated so numpy just makes it simple and we like that and so
in this case we did python data it's all uppercase python is easy like shouting
in your messenger python is easy and then if you're ever processing text and
tokenizing it a lot of times the first thing you do is just split the text and
we're just going to run this np.char.split are you coming to the
party if we do that returns an array of each of the individual words are you
coming to the party splitting it by the spaces and then if we're going to split
it by spaces we also need to know how to split it by lines and just like we have
the basic split command we also have splitlines hello and you'll see here the
backslash n for our new line and when we run that if you're following the split part
with the words you should see hello how are you doing the two different lines are
now split apart and let's just review three more before we wrap this up commonly
used string manipulations we have strip and in this case we have nina
admin anita and we're going to strip a off of there let's see what that looks
like and then you end up with nin dmin nit it basically takes off all leading and
trailing a's in this case a more common one would be a space in
there but it might also be punctuation or anything like that that you need to
remove from your letters and words and if we're going to strip and clean data we
also need to be able to reformat it or join it together so you see here we have
np.char.join we'll go ahead and run this and on the first one it separates
the letters with the colon and the second one with the dash and you can see how
this is really useful if you're processing in this case a date we have day month
year and year month day very common things to have to always switch around and
manipulate depending on what they're going into and what you're working with
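the split, splitlines, strip and join calls described above can be sketched as follows. the strip inputs are the ones from the video; the join inputs ("dmy" and "ymd") are an assumption standing in for the day month year example, and the variable names are made up for illustration.

```python
import numpy as np

# split on whitespace: the result holds a python list of words
words = np.char.split("are you coming to the party")
print(words)

# split on line breaks
lines = np.char.splitlines("hello\nhow are you doing")
print(lines)

# strip leading and trailing "a" characters from each element
stripped = np.char.strip(["nina", "admin", "anita"], "a")
print(stripped)

# join the characters of each string with the matching separator
joined = np.char.join([":", "-"], ["dmy", "ymd"])
print(joined)
```

note that strip only touches the ends of each string, so the a in the middle of a word would be left alone.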
finally let's look at one last string function we're going to do replace if
you're dealing with misinformation this is good for pulling news articles and replacing
words in this case we're just doing he is a good dancer and we're going to
replace is with was and you can see here he was a good dancer hopefully that's
not because he had a bad fall he just was from like you know the 1920s and has gotten
old so there we go we covered a lot of the basics in numpy as far as creating an
array very important stuff here when you're feeding it in how do we know the
shape of it the size of it what happens when we convert it from a regular integer
into a float value as far as how much space it takes we saw that that doubled the
item size you have your n dimensions and probably the most used is shape and
we'll cover more on shape in part two so make sure you join us on part two
there's a lot of important things on shaping in there and setting them up we also
saw that you can create a zeros based array you can create one with ones if we
do arange you can see how much easier it is to create a range with
arange as it is in numpy you saw how easy it was to add two arrays we saw that
earlier just plus sign then we got into doing strings and working with strings
and how to concatenate so if you have two different arrays of strings you can
bring them together we also saw how you can center and fill so you can add a nice headline
dash dash dash we also saw how to capitalize the first letter we saw about turning
it into a title so all the first letters are capitalized doing lowercase on all
the letters upper for all the letters just lower and upper nice abbreviation we
also covered how to split the string how to strip it so if you want to
strip all the leading a's and ending a's or spaces you can do that
very easily also how to join so there's an np.char.join option for
your strings and finally we did the replace today we're going to do the
pandas tutorial pandas really is a core python module you need for doing data science
and data processing there's so many other modules that come off of it and it
actually sits kind of on numpy so if you've already had our numpy arrays hopefully
you've already gone through the numpy tutorial one and two so today we're going
to cover what is pandas we'll discuss series we'll discuss basic operations on
series then we'll get into a data frame itself basic operations on the data frame
file related operations on a data frame visualization and then some practice
examples roll up our sleeves and get some coding underneath there and let's start
with just some real general what is pandas pandas is a tool for data processing
which helps in data analysis it provides functions and methods to efficiently
manipulate large data sets now this is a step down from say using spark or
hadoop in big data so we're not talking about big data here but
with pandas there are some connections there's like an interface going on
with that so there is availability but you really should know your pandas because
if you're working in big data you'll know there's data frames well pandas is
primarily a data frame it has a couple different pieces we'll look at here and if
you've never worked with data frames before a data frame is basically like an
excel spreadsheet you have rows and columns you can access your data either by
the row or the column and you have an index and that kind of setup and
we'll dig more into that as we get deeper into pandas but think of it as a giant
excel spreadsheet that's optimized to run larger data on your computer and
as i said it's a data frame so the data structures in pandas are series
one-dimensional arrays and then we have data frame two-dimensional array and it
really centers around the data frame the series just happens to be part of that
data frame and here's a closer look at a pandas series a series is a
one-dimensional array with labels it can contain any data type including integers
strings floats python objects and more so it's very diverse if you remember from
numpy we studied they had to be all uniform not in pandas in pandas we can do a
lot more and pandas actually kind of sits on numpy so you really need to know
both of those if you haven't done the numpy tutorials and you can see here we
have our index one two three four five and then our data a b c d and e very
straightforward it's just two columns and we have a nice index label and a column
label for the data and then a data frame is a two-dimensional data structure with
labels we can use labels to locate data and you can see here we had if we go back
one we had our index one two three four five so in each one of these series they
would share the same index over there the row index so you have your row index df
dot index and then you have a column index df.columns and like i
said this should look really familiar if you've done any work with spreadsheets
like excel it kind of resembles that and this does make it a lot easier to manipulate
data and add columns delete columns move them around same thing with the rows so you have a lot of
control over all of this now we're of course going to do this in our jupyter
notebook you can use any of your python editors but i highly suggest if you
haven't installed jupyter and haven't worked with it it is probably one of the
best ways for easily displaying a project you're working on i skip between a lot
of different user interfaces or ides for editing my python and it's just simply
jupyter.org j-u-p-y-t-e-r.org and then i always let mine sit on anaconda
anaconda.com and just real quick we'll open that up for you oops offline mode
don't show me that again but you can see here that i have different tools that i
can actually install in my anaconda including the jupyter notebook which comes by
default and then i have access to the environments and again that's
anaconda.com named after one of the world's largest snakes
and then jupyter notebook in this case jupyter.org and when we're in our i'm
going to go in here to our jupyter notebook and we're going to go ahead and just
do new and a python 3 and this will open up a python 3 untitled notebook so diving
right in let's go ahead and give this a title pandas tutorial and we'll go up to
cell and we'll change the cell type to markdown so it doesn't execute it as
actual code one of those wonderful tools when you have jupyter notebooks so you
can do demos with this and let's go ahead and import pandas and usually people
just call it pd that has become such a standard in the industry so we'll go ahead
and run that now we have our pandas has been imported into our jupyter notebook
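a minimal sketch of that first cell, assuming pandas is already installed in the environment. printing the version string is just one quick way to confirm the import worked; the number you see will depend on your own install.

```python
import pandas as pd

# pd is the conventional alias for pandas
version = pd.__version__
print(version)
```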
and then oh we can go ahead and let me do the control plus since it's internet
explorer i can enlarge it very easily so you have a nice pretty view oops too big
there we go and whenever you're working with a new module it's good to check your
version of the module in pandas you just use in this case pd dot underscore
underscore version underscore underscore that's actually pretty common in most of
our python modules there's different ways to look up the version but that's one
of the more common ones and we'll go ahead and run that we get 0.23.4 and if we
go to the pandas site we see 0.23.4 as the latest release and of course a
reminder that if you're going to a new environment you need to install it so you'll
need to do pip install pandas if you're using the pip installer we'll go and
close out of that and the first thing we want to do is we're going to work with
series a lot of stuff you do in series you can then do on the whole data set we
need to do what create one we need to manipulate it take pieces of it so query it
delete so you can delete different parts of it so we want to do all
those things with the series and we'll start with the series and then almost all
the code in fact all the code does transfer right into the actual data table so
we go from a series of a single list of one column and then we'll take that and
we'll transfer that over to the whole table and we'll start by creating let's put
up there we go creating a series from list and let's just call this arr equals
and we'll do 0 1 2 3 4 if you remember from our last one we could easily do arr
equals range of 5 which would be 0 to 4 but we'll do arr equals zero to four and
we'll call this s1 and we'll go pd dot series and series is capitalized this always
throws me which letters do you capitalize on these modules they're getting
more and more uniform but you gotta watch that with python and we're just going
to go ahead and do arr so we're just going to take this python list and we're
going to turn it into a series and then because we're in jupyter we don't have
to put the print statement we can just put s1 and it'll print out this series for
us and let's go ahead and run that and take a look and you'll see we have two
columns of numbers so the first one is the index now it automatically creates the
index starting with zero unless you tell it to do differently so we get index
0 is 0 1 is 1 2 is 2 3 is 3 4 is 4 and because it's a series it doesn't need a title for
the column there's only one column so why title it and this also lets you know
that it's a data type of integer 64. so we print this out this is our series our
basic series we've just created and let's do a second series pd and we'll use
the same data list and let's go ahead and do order we'll give it an order equals
oh let's do it this way let's go index equals order and it helps if we actually
give it an order so we'll do order equals and let's do one two three four five
so instead of starting with zero we're going to give it an order starting with
one we're going to run that and we'll go ahead and print it out down here s2 and
we'll see that we now have an index of 1 2 3 4 5 and that represents 0 1 2 3 4 in
the series and we're still data type integer 64. and very common as you're
messing with numpy arrays is we can import our numpy as np remember that from our
numpy tutorials we can go ahead and create a numpy array out of random with
five random numbers and let's just see what that n looks like so we can see what
our numbers look like so we have some nice random float values here 2.33 so on
and this is from our last tutorial the numpy tutorial one and two and instead of
calling it order let's call it index and we're going to set our index equal to a
b c d and e i want to show you that the index doesn't have to be an integer so it
can be something very different here and then let's go ahead and create our we'll
just use s2 again and here's our np for numpy series capital s and n is our np
for numpy pd for pandas there we go switching my acronyms so we have
pd.series of n and we're going to do our index equals our index we just created
and then let's go ahead and see what that looks like s2 to print it and let's
run that and we can see here we have a nice series going on a b c d and e for our
indexes so instead of it being 0 1 2 3 or 4 we can make this index whatever we
want and you can see the numbers here going down that we randomly generated from
the numpy array so we use numpy to create our pandas series right here and so
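the three ways of building a series covered so far can be sketched like this. the use of np.random.randn is an assumption (the video only says five random float values), and s2_order is a made-up name so the first s2 is not overwritten in the sketch.

```python
import numpy as np
import pandas as pd

# from a plain python list: pandas assigns the index 0..4 automatically
arr = [0, 1, 2, 3, 4]
s1 = pd.Series(arr)

# same data with an explicit index starting at 1
order = [1, 2, 3, 4, 5]
s2_order = pd.Series(arr, index=order)

# from a numpy array, with string labels as the index
n = np.random.randn(5)  # assumed generator for the five random floats
index = ["a", "b", "c", "d", "e"]
s2 = pd.Series(n, index=index)

print(s1)
print(s2_order)
print(s2)
```

the index list has to be the same length as the data, otherwise pandas raises an error when the series is created.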
continuing on with creating our series this one i use so often we create a series
from a dictionary so we have our dictionary in this case we went ahead and did a
of 1 b is 2 c is 3 d is 4 e is 5 so each one of those is a key and then a value and then
we're going to use oh let's use s3 equals pd for pandas series and then we want
to go ahead and just do d in here print out s3 here and let's go ahead and run
this and you can see we got a is 1 b is two c is three d is four e is five and
it's still of integer 64 because the actual data is one two three four five and
it's all integer 64 type and the last thing we want to do in the creating
section of our series is to go ahead and modify the index because we're going to
start modifying all this data so let's start with modifying the index of the
series and if you remember let's do a print this time s1 i'll go ahead and run
this and the reason i did print is because it only prints out the last variable
so if i put s1 up here and we're going to do another variable back down lower it
won't print the first one just the last one and we're going to go ahead and take
s1 the index and we're just going to set it equal to a new index and obviously
the number of objects in our index has to equal the number of objects in our data
and then because it's the last variable we can go ahead and just do an s1 and
let's run that and you can see how we went from 0 1 2 3 4 as our index
we've now altered it to a b c d and e so this would be much more readable or
might be representational of a larger database you're working with so cool tools
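a minimal sketch of the dictionary and index steps just described, using the same a through e keys as the video; everything else about the cells is assumed.

```python
import pandas as pd

# a dictionary's keys become the index and its values become the data
d = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
s3 = pd.Series(d)
print(s3)

# an existing series can be relabelled by assigning a new index
# (the new index must have one label per element)
s1 = pd.Series([0, 1, 2, 3, 4])
s1.index = ["a", "b", "c", "d", "e"]
print(s1)
```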
we've covered creating a series based on our basic python list we've
showed you how to reset the index then we showed you how to use a numpy array so
you can put a numpy array in there it's all the same you know pd.series a numpy
array and then we can set the index on there and the same thing with the
dictionary so it's very versatile how it pulls in data and you can pull in data
from different sources and different setups and create a new series very easily
in the pandas and then we left on changing your index so now we have a new index
on here and then we want to go ahead and do some selection let's do some basic
slicing most common thing you'll probably do on here and we'll just do s1 this
notation should start to look really familiar again this is going to put an
output it doesn't change s1 this just selects it so we might do a
equals s1 and then print a and you'll see that it just looks at the first three
zero one two and we can do the same thing by not having the a in there i'll go
ahead and take that out but just a reminder that it's not actually changing s1
it's just viewing s1 so simple slicing on here and we can likewise do an append
so before we do append let's just do a quick kind of fun one we'll do up to minus
one and you'll see it covers everything but the e of course you can do minus two
on this side so another way to select it is to go how far from the end and
likewise we can do a two here c d e to the end so it starts at position two and
another way we can do this is we can do a minus two over here and that looks at
just the last two in the slice so you can see how easy it is to slice the data
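the slices above can be sketched as follows. the video types the slices directly on s1; this sketch uses .iloc instead, which is a choice made here to keep the slicing explicitly positional, and the variable names are made up for illustration.

```python
import pandas as pd

s1 = pd.Series([0, 1, 2, 3, 4], index=["a", "b", "c", "d", "e"])

first_three = s1.iloc[:3]    # a, b, c
all_but_last = s1.iloc[:-1]  # everything except e
from_third = s1.iloc[2:]     # c, d, e
last_two = s1.iloc[-2:]      # d, e

print(first_three)
print(all_but_last)
print(from_third)
print(last_two)

# slicing returns a new selection; s1 itself is unchanged
print(len(s1))
```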
and of course there's no reason to do this but you could select all of them if
you wanted to view all of them on there helps 32 there's not 32 so it's just
going to show the
first three there we go and then we can also append so i can take and oh let's
create another series and append one to it and if you remember we had s3 there's
our s3 and we have our s1 we'll go ahead and do s1 and let's go ahead and do oh
let's call it s4 equals s1 append s3 so we're just going to combine those two
into s4 and if we go ahead and print s4 on here you'll now see that we have a b c
d e a b c d e zero one two three four one two three four five because we started
the data at one so very easy to append one series to the next and if we're
going to append one series to the next we need to go ahead and drop or delete one
and drop is a key word for that and let's just do e our index e and so if i run
this you'll see that it'll print it out and a b c d there's no e and remember
all these changes if i type in s4 again you'll see that s4 still has e in it so
this change does not affect the series unless you tell it to so i'd have to do
like s4 equals s4 dot drop e and there's another way to do that which we'll
show you later on let me just cut this one out there we go all right so we've
covered all kinds of cool tools here we have appending we have slicing we did all
the creating stuff earlier as you can see here on the setup how easy it is to
manipulate the series so next what we want to get into is we want to get into
operations that happen on the series let me go ahead and change this cell to
markdown there we go and run that so series operations what can we do with the series
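as a preview of the cells in this part, here is a minimal sketch of the arithmetic and summary methods on two series of different lengths. the data matches what is read out in the walkthrough (zero one two three four five seven, and six seven eight nine five); the names arr1, arr2 and the result variables are assumptions.

```python
import numpy as np
import pandas as pd

arr1 = [0, 1, 2, 3, 4, 5, 7]
arr2 = [6, 7, 8, 9, 5]

s5 = pd.Series(arr2)  # five values, index 0..4
s6 = pd.Series(arr1)  # seven values, index 0..6

# values are aligned by index; positions missing from s5 come back as NaN
added = s5.add(s6)
subtracted = s5.sub(s6)
multiplied = s5.mul(s6)
divided = s5.div(s6)  # 6 / 0 in the first position gives inf

print(added)
print(divided)

# summary statistics skip NaN values by default
print("median", s6.median(), "max", s6.max(), "min", s6.min())
s7 = multiplied
print("median", s7.median(), "max", s7.max(), "min", s7.min())
```

because the NaN rows are silently skipped, a call such as s7.isna().sum() is a quick way to check how many values were left out of a median or max.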
and let's start by creating a couple arrays we'll call it array one and we'll do
zero through seven and array two six seven eight nine five i don't
know why we threw the five on the end let's go ahead and run those so those load
up into jupyter and we'll do this a little backwards we're going to do s5 equals
a pandas series of array two so i'm doing this in reverse and then when we do s5
you'll see that we have zero to 4 it automatically assigned the index and 6 7 8 9 5 for
our series and let's go ahead and do the same and we'll call this s6 and we'll
set this equal to pd series for our first array and if we do an s6 down here to
print it out we'll see something similar we got zero through six for the index and zero one two
three four five seven for the data so those are two series we just created series
five and six and one of the first things we can do is we can add one series
to the next so i can do s5 dot add s6 and let's see what that generates and just
a quick thing if you never use pandas what do you think is going to happen with
the fact that this only has five different values in it and this one has seven
values so let's see what that does and we end up with six eight ten twelve 9 and
it goes oh i can't add this there's nothing there so it gives us a null return
very different than the numpy that would have given you an error this instead
tells you there's no value here because we couldn't generate one so we can easily
add s5 dot add s6 and likewise we can do s5 dot sub for subtract s6 and we'll
run that and on the add the subtract and you guessed it we're going to do
multiply and divide next again you can see there's null values where it can't
subtract the two because there's no values there to subtract we can also do s5
multiply mul they're all three letters on these that's one of the ways to
remember how they figured out the code for this so remember these are all three
letters mul we'll go ahead and run this and you can see how they're
multiplied together and then we can also do the s5 div three letters again s6
and run that and you'll see here this goes to infinity because we have zero in
the wrong position so it actually gives you a whole different answer here that's
important to notice and then there are the null values because there's no data and it
can't actually produce an answer off of missing data and since
we're in data science let's do s6 median so let's look at the median data which
is simply median sorry for those who are following the three letters because
median is not three letters and you can see an s6 is 3.0 and let's do a print
here and we'll do median or average s6 and let's print max comma s6 and just
like median there's max value and if we're going to have a max value we should
also have a minimum value so let's pop in minimum we'll go ahead and run this
and you're starting to see something that would be generated like say an r where
you're starting to get your different statistics we have a median value of 3 max
value of 7 and a minimum value of 0. and what it does when it hits these null
values if there is no values in there because we could still do that we could
actually you know what let's go up here and do let's pick this one where we
multiplied let's go s7 equals i'll go and print the s7 just so i keep it nice
and uniform so i still have my s7 down there and run it and then i want to take
the s7 because s7 now has null values and an infinity value and let's see what
happens this is going to be interesting because i want to see what it does with
infinity and we end up with a median of 16 maximum of 27 and minimum of zero which
is correct it drops those values so when it gets to there and it doesn't know
what to do with them it just drops those values and then it computes it on the
remaining data on there so it's important to know when you're making these
computations you're looking at min and max and median you're not going to know
that there's null values unless you double check your data for the null values it's
a very important thing to note on there so just a real quick review on there
we've done our created our pd series and we've gone ahead and done addition
subtraction multiplication division all those are three letters so sub mul div
add and then we looked at median maximum and minimum so we're going to go ahead
and jump into the next big topic which is to create a data frame so now we're
going to go from series and we're going to create a number of series and bundle
them together to make a data frame there we go cell type markdown let me go ahead
and run that so we have a nice title on there it's always good to have a good
title all right so our first data frame we'll jump in with some stuff that looks
a little complicated we'll break it down first i'm going to create some dates and
you know what let's just go ahead and do this i want you to see what that looks
like what i'm creating here i've created a series of dates pd date range and
we're going to use these for the index okay so when you look at this you'll see
that it's just basically it comes out kind of like a basic python list or a numpy
array however you want to look at it with our different dates going down and
we've generated six of them and it's going to have whatever time it is right now
on the date that's that time stamp right there
and then you'll see we have 11 19 2008 11 20 11 19 and looking into the future
there so that's all this is is generating a series of dates that we're going to
use as our index and this is a pandas command so we have a date range which is
nice it's one of the tools hidden in there in the pandas that you can use and
next we're going to use numpy to go ahead and generate some random numbers in
this case we'll do the np.random.randn six comma four you can look at this
as rows and columns as we move it into the pandas and of course you could reshape
this if you had those backwards on your data but we want the six to match the
rows and we have six periods so our indexes should match along with the rows on
there and then you know before we do the next one let's go ahead and just print
out our numpy array so you can see what that looks like here we have it one two
three four by one two three four five six four by six so it's a nice little setup
on there and since working with data frames can be very visual let's give our
columns we have four columns and we're going to give them names a b c and d so
now we have columns on there also and then let's put this all together in a data
frame and we can actually you know what let's do this since i did it with
everything else let's go ahead and do columns and you can see there's our columns
on there and we'll go ahead and do df1 equals pandas dot data frame and note
that the d and the f are capitalized in series it was just the s and i always
highlight this because you don't know how many times these things get retyped
when you forget what's capitalized on there it's a minor thing you'll pick it up
right away if you do a lot of it and the first thing we want to do is we want to
go ahead and take our numpy array because that's what we're going to create our
data frame off of is the numpy array and then we want our index equal to our
dates so there's our index in there and then we also have columns equals columns
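putting those pieces together, a minimal sketch of the data frame build might look like this. it assumes the dates start from the current timestamp and that the random numbers come from np.random.randn; the name num_arr is made up for illustration.

```python
import numpy as np
import pandas as pd

# six consecutive daily timestamps starting from right now
dates = pd.date_range(pd.Timestamp.today(), periods=6)

# six rows by four columns of random floats (generator is an assumption)
num_arr = np.random.randn(6, 4)

columns = ["A", "B", "C", "D"]

# rows are labelled by the dates, columns by the letters
df1 = pd.DataFrame(num_arr, index=dates, columns=columns)
print(df1)
```

the number of index labels has to match the number of rows in the array and the number of column names has to match the number of columns, which is why six periods go with a six by four array.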
and then finally let's see what that looks like now remember we had all the
different data that just looks like a jumble of data we have our column names and
everything else our numpy array kind of just a jumble array over there four by
six you could sort of read it but look how nice this looks i mean this is you
come into a board meeting you're working with your shareholders this is pretty
readable this is you know this is our date this is our a b c d whatever it is
maybe one of these has your leads closures lost leads total dollars
made you know whatever it is that fits in a business maybe it's measurements on some
scientific equipment or weather research material you know where this is like
the high temperature the low of the day
humidity of the day whatever it is so you can see that we can really create a
nice clear chart and it looks just like a spreadsheet you know we have our rows
and we have our columns and we have our data in there now this one i use all the
time if we're going to create we can create it like you saw here with our numpy
array very easy to do that and reshape it you can also create it with a
dictionary so here we have some data let me just go down a notch so you can
see all the data on there we have an animal in this case cat cat snake dog dog
cat snake cat dog we have the age so we have an array of ages we have the number
of visits and the priority was it a high priority yes no and then we're going to
take that we're going to create some labels we have a b c d e f g h i and what i
want you to notice on this is we have a title animal and then we have basically a
python list and these lists need to be the same length but we
can have missing data you know np.nan numpy's null value but we want to go ahead
and create labels that are equal to the number in the list so a the first cat b
the second cat c the snake d the dog and so on so we'll go ahead and create our
labels which we're going to use as an index and we'll call this df let's do it
this way we'll call this df2 equals pd for pandas data frame and then we have
our data just like we did before and we have our index equals labels and if
we're going to go from there let's go ahead and print it out so we can see what
that looks like df2 so let's go ahead and run that and again you have a nice
very clean chart to look at we've gone from this mess of data here to what looks
like a very organized spreadsheet very visual and easy to read animal age visits
priority and then a through j cats and all your different animals so on and so on
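a shortened sketch of building a data frame from a dictionary of lists. only the column names (animal, age, visits, priority) come from the video; the five rows of values and the a through e labels below are made up for illustration and are not the video's data.

```python
import numpy as np
import pandas as pd

# hypothetical values: each key is a column name, each list is that column's data
data = {
    "animal": ["cat", "cat", "snake", "dog", "dog"],
    "age": [2.5, 3.0, 0.5, np.nan, 5.0],  # np.nan marks a missing age
    "visits": [1, 3, 2, 3, 2],
    "priority": ["yes", "yes", "no", "yes", "no"],
}

# one label per row, used as the index
labels = ["a", "b", "c", "d", "e"]

df2 = pd.DataFrame(data, index=labels)
print(df2)
```

the column names are picked up from the dictionary keys automatically, so only the row labels need to be passed in separately.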
and then when you do programming a lot of times it's important to know what the
data types are so we can simply do df2.dtypes and if we run that we can see that
our animal is an object because it's just a string but it comes in as an object
age is a float64 visits is integer 64 and then priority again is just an object and
exploring this this one's very popular let's go df2 head and if we
print that out the df2 head returns the first 5 and we can change this you don't
have to do five you might want to just look at the top two maybe you want to look
at let's see oh let's do six so maybe we'll look at just the top six in
your data frame and this actually creates another data frame
so i could have a df3 equal to df2 head and this now takes the df2 and just the
first six values so if we do df3 and run we get the same answer and if we can do the
head of the data we can also do the tail it's the same thing df tail you can look
at the last we'll just do the tail which by default does five the last five and
of course you can just look at the last three of those real quick just to see
what's at the end of the data and this is the tail i love doing the tail of one
because i'll have like the index or something like that and it will just show me
the last whatever the last entry was looking at stock values and i might want to
look at just the last five days of the stock values i can do that with the data
frame tail and some other key things to look up are the index so we can do df2
dot index and i want you to notice that this isn't a call function so if i put
the brackets on the end it'll give me an error because index is not callable it's
just an object in there so we do df2.index there's also columns so we can go
ahead and let's do a let's print this remember the first one is not going to
show unless i print it and then df2 columns so now we can see we have our indexes
and we have our columns listed here df2.columns animal age visits priority it
tells you what kind of object it is or what kind of data type it is and they're
both object and then finally df2 dot values and again there's no brackets on the
end of df2.values because this is an actual object it's not a callable function
so we'll go ahead and run that and it just displays a nice array a very
easy way to convert this back to a numpy array basically so before i go into the
next section let's just take a quick look at what we covered so far with the data
frame we came up here we created our data frame we did it from a numpy array
first setting the columns and the index setting up the index is the same as
when we set up the series so that should look very familiar so is the whole
format the numpy array index equals dates and columns equals columns and remember in
our numpy array we're looking at row comma column so six rows four columns is how
that reads in the data frame and we went ahead and also did that from a
dictionary in this case animal was the column name with all the data
underneath that column and then age with that data visits that data priority that
data and then of course we added our labels in there for our index so there's no
difference in there but it automatically pulled the column names important to
know when you're dealing with a data frame and importing a data frame this way
and then we looked up the dtypes we looked at head and tail for looking at your data
really quick we also did index and columns and values and note these don't have
the brackets on the end so the next thing we want to do since we're
dealing with data science is we want to go and describe the data so we have
df2.describe() to do that and we're going to manipulate it in just a minute but
let's just see what this generates and you can see right here we have age and
visits so looking at our data from up above let me just go all the way up here
animal age visits priority and it does a nice job summarizing your age and
visits the numeric columns you have your count your mean your standard
deviation your minimum value your 25th 50th and 75th percentiles and your maximum value so
this will look familiar as a data science setup with your describe for a quick
look at your data frame data so let's start manipulating this data frame moving
stuff around and we'll start with transposing and it is simply df2.T capital t for
transpose and when we run that it flips the columns and the indexes so now the
index holds the column names and the columns are the old index labels animal age visits
priority so if we had come in here with our data shaped wrong up above where we
had a 4x6 we can quickly just swap it if we had it backwards not a big deal and
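The describe and transpose steps above can be sketched as follows. As before, the frame is a hypothetical stand-in for df2 with illustrative values; only the column names come from the walkthrough.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df2 (illustrative values).
df2 = pd.DataFrame(
    {
        "animal": ["cat", "cat", "snake", "dog", "dog", "cat", "snake", "cat", "dog", "dog"],
        "age": [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
        "visits": [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
        "priority": ["yes", "yes", "no", "yes", "no", "no", "no", "yes", "no", "no"],
    },
    index=list("abcdefghij"),
)

# describe() summarizes the numeric columns only: count, mean, std,
# min, the 25/50/75 percent quantiles, and max.
summary = df2.describe()
print(summary)

# .T is an attribute (no parentheses) that swaps rows and columns.
transposed = df2.T
print(transposed)
```

Missing ages are skipped in the count, which is why the count for age can be lower than the count for visits.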
we can also sort our data something that is more difficult to
do with a lot of other packages and with the data frame is really easy to do take
our data frame df2 and we're going to do sort_values by equals age and so
when we run this you'll see the default is ascending so we have 0.5 2 2.5 3 and
everything else is organized so if you look at your indexes they've been moved
around because it moves a whole row with its index not just the one piece of data
being sorted so very quick way to sort by age or different data in the data
frame and in addition to sorting it we can also slice the data frame so i can do
df2 and this should look familiar from earlier we'll just do one to three so
we're going to pull out oops it does help if i use df instead of just d and
we're going to pull out just between one and three so we have no zero which is a
we have b which is one and c which is two so one two and then
it does not include three which is the standard in python and we can even do
something like this we can combine them which is always fun because remember this
returns a data frame so if i take df2.sort_values and we'll do by equals
age this is just kind of fun and then i'm going to slice it there we go double
check my typing and run it and now you should see f and a because f and a are now
positions 1 and 2 on there so you can very quickly chain a whole string of calls
on here which narrows it you know that you can sort it then slice it and do all
kinds of fun things with your data frame we'll just go back to the original one
run there we go and if we
can slice it by row we can also select columns of the data frame so we can do df2
and this is a little different because i'm going to put a list inside the brackets
and in this case we're going to look at oh let's do age comma visits so look at
the different format in here we have one to three so we've done this by slicing by
an integer position and then on here i've done df2 with age comma visits in a list
and when i run this you can see that we get just these two columns on here we get
age and visits so it's a quick way to select just two columns or whatever number of
columns you're working with and if you step back to where we did the slicing almost
identical to the slice is iloc which uses the integer location one colon three
there's a push in pandas to move to this particular setup instead of doing just a
regular slice and that's because this can be confusing when we slice one to three
and then we select age and visits with the same brackets so there is a push to go
ahead and move to iloc which does the same thing you can see here b c it's the same
as up above there's also a copy command so we can do df3 equals df2.copy() we're just going
to create a straight copy of it and of course if we do df3 it'll be the same as
the df2 on there so df3 equals df2.copy() and then let's do df3.isnull() so
we're looking for null values and this will return a nice true false map and
you'll see that everything is false except when you go up here under the cat at h
they had a null there and so if we go they have another one up here also underneath
let's see the dog okay there's a couple of nulls in here there's d up here so let's look at d
down here and you'll see false true there it is there's our null value so we can
create a quick chart of null values you can use this to do other things we can
leverage that null value to maybe take an average or something and fill those
null spaces with data and we can also modify a value by location so here's our
df3.loc and notice this is loc not iloc iloc has i for integer
location loc uses in this case the index labels on the left and what we can do on
here and we'll go and just set this equal to 1.5 and then let's um i'll
pick a spot let's go back up here where we had let's see what
are we looking at oh here we go let's do f and age and up here f is set to age of
2.0 and we find out that that's incorrect data so we go ahead and set df3.loc f
age equal to 1.5 and then we're going to print out our df3 and if we go to f and
age it is now 1.5 so we're just changing the value in the df3 and this is changing
the actual data frame remember with a lot of our stuff like when we do a slice it
returns another data frame this changes the actual data frame and that value in the
data frame so we've covered loc and iloc isnull making a copy here's
our iloc which is equivalent of a slice and also selecting columns so now
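The sorting, slicing, column selection, iloc, copy, isnull and loc steps recapped above can be sketched in one place. The frame is a hypothetical stand-in for df2 with illustrative values; the row f and column age used for the assignment follow the walkthrough.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df2 (illustrative values).
df2 = pd.DataFrame(
    {
        "animal": ["cat", "cat", "snake", "dog", "dog", "cat", "snake", "cat", "dog", "dog"],
        "age": [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
        "visits": [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
        "priority": ["yes", "yes", "no", "yes", "no", "no", "no", "yes", "no", "no"],
    },
    index=list("abcdefghij"),
)

by_age = df2.sort_values(by="age")              # ascending; missing ages go last
first_rows = df2[1:3]                           # positional row slice: rows b and c
sorted_slice = df2.sort_values(by="age")[1:3]   # chained: sort, then slice
cols = df2[["age", "visits"]]                   # a list of names selects columns
same_rows = df2.iloc[1:3]                       # explicit integer-position slice

df3 = df2.copy()             # independent copy; df2 is left untouched
nulls = df3.isnull()         # boolean frame, True where a value is missing
df3.loc["f", "age"] = 1.5    # label-based assignment, modifies df3 in place

print(sorted_slice)
print(nulls)
```

Using `iloc` for positions and `loc` for labels keeps the two kinds of lookup visibly separate, which is the reason given above for preferring them over bare brackets.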
we want to just take a little detour here and let's look at df3.mean() and
this is kind of nice because you can do this a couple of ways you can
select a single column here by the way you can just add the column selection
right here like we did before so we could have age and look up the mean that just
gives the mean for age if i run that there's our age but if i take that out instead
of selecting it we can do the whole frame and it has age and visits so why doesn't
it have priority or animal well those are not numbers so it's really hard
they're non-numerical values so what is the average of a string i guess you could
do a histogram which we'll probably look at later on but the only two things we
can really look at is age and visits and we have the average or the mean on the
age is 3.375 and the mean on visits is 1.9 and let's do df3 visits we'll go and
steal the visits again and remember all those different functions we looked at
for a series well we can do those here we can do the sum so if we run that we'll
see that these sum up to 19. we could also look up minimum if you remember that
from before the minimum is 1 there's also max so all that functionality is here i'll just go
back to summing it up and adding it all together so real quick we've shown you
how to take the series operations and use them on the data frame and then we
can actually this is an interesting one we can just do df3.sum() run and you'll see
the different summations on there it just combines them i like the way it just
combines the strings on there for priority and animal we've looked at is null
we've also looked at copying along with the different slices which we talked
about earlier so let's talk about strings let's dive into the string setup on
there and let's go ahead and create a string series string equals pd.Series and
we just put it right in there we have a c d aaa baca popped in a null value
cow and owl i don't know why they picked cow and owl in the background someone must
like those animals and of course we can just do string if we run that you'll see
if i leave the r out we'll get an error but if we put it in there you'll see that we
have a simple series 0 a 1 c 2 d and it automatically indexes it 0 to 8 and then
we can go string.str.lower so when we're talking about our data frame or in this
case our data series string we use the string accessor str and we're
going to make it lower and if we go ahead and put the brackets on there and run
you'll see that we've gone from capital a capital c and so on to a c d aaa and baca
cow and owl
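A minimal sketch of the `.str` accessor on a Series. The nine values below are an illustrative assumption chosen to resemble the series described (mixed case, one missing value, cow and owl at the end); they are not taken verbatim from the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical series: mixed-case strings with one missing value.
string = pd.Series(["A", "C", "D", "Aaa", "BaCa", np.nan, "CBA", "cow", "owl"])

lower = string.str.lower()   # element-wise lower case; the missing value stays missing
upper = string.str.upper()   # element-wise upper case

print(lower)
print(upper)
```

The methods live under `.str` rather than on the Series itself, and they need the parentheses because they are method calls.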
they were all lower case already and of course if you want to go lower you can
also do upper we'll go ahead and run that and you can see we now have acd aaa
baca everything is capitalized except for the null value which is still null all
right so we looked at a few basic string functions upper
and lower we're going to jump into a very important topic i'm even going to give
it its own header on here because it's such an important topic what do you do
with missing values pandas has some great tools for that so we'll dive into those
we'll work with df4 and if you remember the copy from above we're
just going to make a copy of df3 and let's just take a quick look at the data
we're working with oops df3 forgot the three on there there we go so here we
have our cats snakes and dogs hopefully not all in the same container because
that would be just probably mean to all of them so we made a copy we're going to
be working with df4 and the reason we made a copy is we want to go ahead and fill
the data and we just simply do fillna and then we're going to give it the
value we want to put in there we'll give it the value 4 so i can run in here and
you'll see now that in df4 the spot where the nan was is filled with the value of
four same thing down here a lot of times we'll compute the mean first so i might
do mean_age equals df4 and then we want to go ahead and select age and do dot mean
and then i'll do something like this df4 i only want to select the age and i want
to fill that with the mean age and i run in there and you'll see that our df4 age
now has the mean in there just a quick way of showing you how you can combine
these let me go back to our original one there we go and run that and keeping
with good practices df5 equals df3 dot copy and we'll print our df5 which should
be the original one and then on the df5 we can now drop our missing data i'm
going to simply do dropna and we're going to use how equals any so i'm going to
drop any row that has missing data in it and you'll see we had d here with
missing data and h and then let's go ahead and see what df5 looks like when we do
that there we go and there it is d is gone and so is h so we create a new data
frame off of this without those rows now if you have a lot of data dropping
rows is a good way to take care of it because you won't miss a few rows if you
have not a whole lot of data you're working with like the iris data set or
something like that or something small you want to start trying to find a way to
fill that data in so you don't lose the information in the data you've got
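The two approaches to missing values described above, filling and dropping, can be sketched as follows. The frame is a hypothetical stand-in for df3 with illustrative values; `filled_const` and `dropped` are names introduced here for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df3 (illustrative values, two missing ages).
df3 = pd.DataFrame(
    {
        "animal": ["cat", "cat", "snake", "dog", "dog", "cat", "snake", "cat", "dog", "dog"],
        "age": [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
        "visits": [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
        "priority": ["yes", "yes", "no", "yes", "no", "no", "no", "yes", "no", "no"],
    },
    index=list("abcdefghij"),
)

# Option 1: fill every missing value with a constant (returns a new frame).
df4 = df3.copy()
filled_const = df4.fillna(4)

# Option 2: fill one column with its own mean, assigning the result back.
mean_age = df4["age"].mean()
df4["age"] = df4["age"].fillna(mean_age)

# Option 3: drop any row that contains at least one missing value.
df5 = df3.copy()
dropped = df5.dropna(how="any")

print(df4)
print(dropped)
```

Both `fillna` and `dropna` return new objects by default, so the result has to be assigned to a name (or back to the column) to be kept.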
so just a quick look at processing null values or missing values you can fill
them usually with the mean some people use the median or the mode there's different
ways you can fill it one way is the mean and we can also just drop those rows those
are the two main things we do with missing data here we go what we're going to
cover next i so love data frames for this file operations it saved me so
much time because they have so many different tools for bringing data in and
saving data so we're looking at the data frame file operations it's really
streamlined i don't know how many times i'll go on to different data
download sites and they'll have a pandas download as standard on there just because
it's so widely used so let's start with the most common file which is a csv so we
have df3.to_csv for animal and let me just show you the folder it's going into
right now i have some untitled and a few things in here but nothing labeled animal
so we go ahead and run this and this has now saved the animal file to my hard drive
and you can now see the animal file up here and if i let's do edit with a notepad
oh let's open up with just a regular notepad there we go or wordpad if i open that
up you can see it's comma separated our titles the index doesn't have a name so the
column names on the top start with a comma then all the different data is
separated by commas standard csv file on there and if we're going to send it to
csv notice the format is dot to underscore csv and it's just the name of the
file we're sending it to you can also put the complete path by default it's going
to go to whatever the active directory is that this program is running in that's
why those other files are in there so we have our df3.to_csv and then if we're
going to put it in there we want to also get it back out and we'll call this one
df_animal equals pd.read_csv i always have to remember it's to
underscore csv and read underscore csv i always want to do like a capital in
there and not the underscore we're reading in here again from the active directory
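A minimal sketch of the CSV round trip. The walkthrough writes into the notebook's working directory; this version writes to a temporary directory instead so it leaves nothing behind, and the file name animal.csv and the frame values are assumptions made for the example.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical stand-in for df3 (illustrative values).
df3 = pd.DataFrame(
    {
        "animal": ["cat", "cat", "snake", "dog", "dog", "cat", "snake", "cat", "dog", "dog"],
        "age": [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
        "visits": [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
        "priority": ["yes", "yes", "no", "yes", "no", "no", "no", "yes", "no", "no"],
    },
    index=list("abcdefghij"),
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "animal.csv")   # assumed file name
    df3.to_csv(path)                         # the index is written as an unnamed first column
    df_animal = pd.read_csv(path)            # read it back into a new frame

print(df_animal.head(3))
```

Because the index had no name, the frame read back has an extra leading column holding the old labels; passing `index_col=0` to `read_csv` would restore them as the index.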
so if i now print out my df_animal and let's just do the head we only want to
look at the first three lines so if i go ahead and run this we'll see the first
three lines and they should match up here what we saved to our csv so very easy
to save and import from our csv files on here and it turns out df3 also has a
to_excel they actually have a lot of different formats but you know old school excel
was real popular for so long still is we can go ahead and save it as animal dot
xlsx we're going to call the sheet name sheet1 and then i can also do df we'll
call it animal2 and this one's going to come from the same file on
here there we go so we still have our animal xlsx and sheet1 that's where it's
coming from index_col equals none so we're going to
suppress using any column as the index and it'll just assign that zero
one and up as your index so if it says index_col equals none that's what it
does and then we've added na_values because there are missing values in here and we
want to just make sure that they're marked as n a and we'll go ahead and just
print out animal2 there we go and let's run that let's just do the whole thing so
we'll go ahead and run that and it probably doesn't help that i completely forgot
the read so animal2 equals pd.read_excel there we go excel so now we go ahead
and run it and what we expect is happening here we have the same data frame on
here and if i flick back to my folder you can now see that we have the animal one
of these is in excel and one of these is a csv on here and so there's our two
file types on there and they have other formats these are just the two most
common ones used and i don't know how many times i've had stuff from excel i need
to pull out if you've ever played with excel it's a nightmare in the back end
because of the way they do the indexing so this just makes it quick and easy to
pull in an excel spreadsheet today we're going to study the matplotlib library
and the python code so let's start with what is matplotlib matplotlib
is an open source drawing library which supports rich drawing types it is
used to draw 2d and 3d graphics and there are so many modules in
matplotlib we're going to cover the basics and there are so many packages that sit
on top of matplotlib that we can't even cover them all today but we'll
hit the main ones so you have a good understanding of what matplotlib is
and what the basics can do you can understand your data easily by visualizing it
with the help of matplotlib you can generate plots histograms bar charts and
many other charts with just a few lines of code and here we have some basic types
of plots you can see here that we'll go into we have the bar chart the histogram
boy i use a lot of histograms in my stuff scatter plot line chart pie chart and
area graph let's start plotting them and to do this i'm going to be using jupyter
notebook you can use any of your python interfaces for programming or scripting
and running it of course we here really like the jupyter notebook for doing
a lot of basic stuff because it's so visual and in our jupyter notebook which
opens up in this case i'm using google chrome you can go up here to new and we'll
create a new python 3 notebook and set that up if you're not familiar with jupyter
notebook we do have a tutorial that covers some of the basics of that you can look
at any of our tutorials i usually cover it in a number of them showing how to set up
jupyter and anaconda i myself use jupyter through anaconda in fact let's go ahead
and open that up and just take a look at this see what that looks like you can
see your anaconda navigator if you install it it will automatically install the
jupyter notebook and that also installs a lot of other things i know some people
like the qt console for doing python or spyder i've never used them i actually
use notepad plus plus as one of my editors and then i use the jupyter notebook a
lot because it's so easy to have a visual while i'm programming even a simple
script in python i'll take it from the jupyter notebook and then do a save as you
can always go under file and you can download as a python program so that will
download it as an actual python file versus the ipython notebook that this saves it
as so let's go ahead and dive in and see what we've got going here and let's go
ahead and put matplotlib tutorial and i'm going to turn this cell into a markdown
cell so it doesn't actually run it you can see it has a nice little title there
that's all jupyter notebook and then from matplotlib let's import pylab and then
let's go ahead and just print we'll go pylab and the version let's go ahead and
run this so we're going to import our pylab module from matplotlib and
we find out that we're in version 1.15.1 always important to note the version
you're in probably i was reading an article that said the number one thing that
python programmers struggle with is remembering what version they're working in
and making sure that they're going from one platform to the other with the same
version and if we're going to graph things i think we need some data to graph so
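A hedged aside on the version check: pylab pulls NumPy's names into its own namespace, so the 1.15.1 reported above looks like a NumPy version number rather than a matplotlib one. That reading is an inference from the number itself, not something stated in the walkthrough. A minimal sketch that prints both versions explicitly, so there is no ambiguity about which library is being reported:

```python
import matplotlib
import numpy as np

# Each package exposes its own version string.
mpl_version = matplotlib.__version__
np_version = np.__version__

print("matplotlib:", mpl_version)
print("numpy:", np_version)
```

Recording both is useful when moving a notebook between machines, since the two libraries are versioned independently.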
we're going to import numpy as np now if you're not familiar with numpy
definitely go back and check out our numpy tutorial there's so many different
things you can do with it dealing with reshaping the data and creating the data
we're just going to use it to create some data for us and there are a lot of ways
to create data but we're going to use np.linspace so we're going to create
a numpy array and the way you read this is we're going to create numbers between
0 and 10 and we're going to create 25 of these numbers so we're just going to
divide that equally up between 0 and 10 and if we have x coordinates we should
probably have some y coordinates and we'll do something simple like x times x
plus 2 and let's just take a look we're going to print x and print y let me go
ahead and run this and let's see what we've got going on here so we have our x
coordinates which is 0 0.416 0.83 etc and you can look at this as an xy plot so
for x of 0 we have 2 for 0.416 we have 2.17 and just as a quick reminder we're
going to do print np.array of x comma y dot reshape 25 comma 2 and the reason i
want to do this is i want to show you something here a lot of times a program
returns x comma y and it's an array of x comma y x comma y x comma y and so when
you're working with pyplot you have to separate it out and reshape it so
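A minimal sketch of the data setup, with one caveat worth checking for yourself: `reshape` only regroups the flattened values in order, so reshaping a (2, 25) array to (25, 2) does not produce (x, y) pairs, although reshaping it straight back does recover the original. For data that genuinely arrives as one pair per row, a transpose is what separates the columns. The names `pairs`, `xs`, `ys`, `stacked` and `round_trip` are introduced here for illustration.

```python
import numpy as np

x = np.linspace(0, 10, 25)   # 25 evenly spaced values from 0 to 10 inclusive
y = x * x + 2

# One (x, y) pair per row, the layout many programs hand back.
pairs = np.column_stack((x, y))   # shape (25, 2)

# Transposing turns rows of pairs into one row of x's and one row of y's.
xs, ys = pairs.T

# Reshape regroups the flattened values; it is not a transpose.
stacked = np.array([x, y])                            # shape (2, 25)
round_trip = stacked.reshape(25, 2).reshape(2, 25)    # back where we started

print(pairs[:3])
print(xs[:3], ys[:3])
```

Either way, the plotting calls want two separate one-dimensional arrays of the same length.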
if i start off with pairs like this i can reshape them if i know there's 25 pairs
in there i can switch the 2 and the 25 and this is kind of goofy but we'll do it
anyways reshape so i'm going to reshape my 25 by 2 back to 2 by 25 and if i run
that you'll see i end up with the same output as the x y the two different arrays
in here and this is important that we want x and y separate again that's all
numpy stuff but it's important to understand that this is the format that matplotlib
works with it works with an array of x's and they should match your array
of y's so each one has 25 different entries in it and then for our basic
plotting of this data it only takes one command to draw a graph of this data and so
from up here where we imported pylab we take our pylab and the key
function under there is plot for plotting a line and then we want our x coordinates
and our y coordinates and we'll throw in r and the r simply means red so we're going
to draw the line in red let me go and run that you can actually switch this
around if you wanted to do a different color there's b for blue we can have a lot of
fun y for yellow hard to see yellow there we go but we'll go ahead and stick with red run
and when you're doing presentations with these try to be consistent you know if
the business and the shareholders send you a spreadsheet and they have losses in
red use red for losses in your graph try to be consistent use green for profit
for money you don't have to necessarily use green but it's whatever they're using
whatever the company is using try to mirror that that way people aren't going to
be confused if you switch your data around every time one graph has red for loss
and one graph has blue for loss it gets really confusing so make sure you're
consistent in your graphs and your coloring and something to know because we're
going to cover this in a minute this is your canvas size so we have a canvas here
and what we're going to do next is we're going to look at subplots okay so
let's take our pylab and create a subplot and one of the things also to know
when we're working with matplotlib i'm not setting up a canvas when i do this
pylab is my drawing canvas so once i've imported pylab i'm drawing my
images on there very important to know and with the subplot we're going to give
it some different values and they represent the number of rows the number of columns
and the index and let's do one two one so it's going to be a grid with one row and
two columns and the index picks which cell of that grid we're drawing in so index
one is the left cell so for rows and columns we want to go ahead and use one row
and two columns and if we're going to have one plot we should probably have
two but before we do that we have to plot data onto the subplot so the order is
very important and we're going to stick with our x comma y and let's do this
we're going to add in a third parameter here remember we did red we're going to
add the shorthand dash dash for dashed lines so this plots the data into the first
cell of the one by two grid and if we're going to do that let's do another one
pylab dot subplot and if we're gonna do one row let's do two columns and index two
and this time we're gonna add g for green and a star and this denotes a style and if
we're going to set up our pylab subplot there we go we've got to go ahead and plot
that pylab plot and instead of x y we want y comma x oops i messed up this is in the
wrong spot there we go we'll move that down here real quick because that goes in
the plot part so the subplot tells it the rows columns and index and the plot call
tells it what data in this case we switched them and the color and then the style
shorthand now let's go ahead and run that and you'll see it takes this canvas
splits it in two and now we have two different graphs and we have the red one
with dashed lines and we have the green one which has little stars going up
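A minimal sketch of the two side-by-side subplots. The three numbers passed to `subplot` are the number of rows, the number of columns, and the index of the cell to draw in (counting from 1). The exact format string for the green plot is an assumption: the walkthrough only says green with stars, so `"g*-"` (green, star markers, solid line) is used here.

```python
import numpy as np
from matplotlib import pylab

x = np.linspace(0, 10, 25)
y = x * x + 2

pylab.figure()                   # start from a fresh canvas

left = pylab.subplot(1, 2, 1)    # 1 row, 2 columns, first cell
pylab.plot(x, y, "r--")          # red dashed line

right = pylab.subplot(1, 2, 2)   # same grid, second cell
pylab.plot(y, x, "g*-")          # assumed style: green line with star markers
```

Each `plot` call draws into whichever subplot was selected most recently, which is why the order of the calls matters.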
and if we take this and let's just um just for fun let's change this and run that
with an index of one it puts them both on the same index and also gives me a
warning because it's a strange way of doing two subplots that behavior is deprecated
there's another way to do it but most people just ignore that warning because
it's not going to go away any time soon now that's using the same setup what
happens if we do instead of this let's change the column on here and find out
what happens and if we do the column it didn't really like that on the setup it
just disappears so let's keep our column as 2 and let's change the row on the
second one to two and run that and you'll see again it kind of squishes
everything together and causes some issues so let's take the index so these need
a unique index and you can see here where i made some changes i said row two and
look what happens when i change to column two so i now have row two column two
index two i squished it up here so you could put another graph underneath is what
that does and there's all kinds of different things you really have to just play
with these numbers till you get a handle on them because you know you have to
repeat it 164 times according to cambridge university if it's completely new to
you and you can see right here where we go three run there we go but you can see
it takes a little bit sometimes to play with these and get the numbers right
oops i hit the wrong one that's why let's go three there three there run there
we go now it's overlapping so i have this doubled over here on the right for now
we'll just go ahead and leave this where we have one row and two columns and
the two different indexes so they appear nice and neatly side by side and then as
we just saw as we were flashing through them we can put them on top of each other
and let me just highlight that and copy it down here paste it down there and here
we have one two one and then we'll do one two one also for this one and that puts
the two subplots directly on top of each other gives us that warning and you can
see we now have two different sets of data graphed on top of each other and you
can also see how it did the axes since one of them is from 0 to 10 that's the
green one on the x axis and the other one is from 0 to 10 on the y-axis so it
took the greatest value of either one and then used those as a shared value so
let's next look at operator description and we'll go ahead and turn this cell
into a markdown and run that so it looks nice so fig and you remember i talked
about the canvas earlier i briefly mentioned it we're going to look a little bit
more at the canvas later on but that's what the figure is fig we're going to add
axes so we're going to initialize a subplot add the subplot in rows and columns
and all kinds of different things you can do with this let's look at that code
and see exactly what's going on and i want you to notice that there's fig which
is the actual canvas in matplotlib and ax is commonly used to refer to
the subplots so we're creating subplots you'll see ax equals plt.subplot earlier
we did the pylab so let's go ahead and import pyplot from matplotlib and
we're going to do it as plt you'll see that a lot that's really the standard in
the industry is to call it plt just like pandas is pd and numpy array is np
certainly you can import it as whatever you want but i would stick to the
standards and we're going to do the same graph as we did above with the pylab
but with plt so if it looks familiar there's a reason we're doing this because
we want to show you how the figure part works and how working with the canvas goes
but we're going to do the same plot as we did before and we'll call it fig and
we're going to set that equal to plt.figure() so there's our figure or canvas on
there and let's create a variable called axes and we're going to set that equal
to fig.add_axes and in this we're going to control the left the bottom the width
and the height of the axes as fractions of the canvas from zero to one so we can go
ahead and i'm just going to put some stuff in there i got point five point one
point eight point eight so when you're looking at this this is zero to one or you
could say fifty percent ten percent eighty eighty percent but it's going to control
your left and your bottom along with the width and the height so the width and the
height we're gonna use eighty percent and we're going to have like a little
indent on the left and the bottom and this should look familiar from above axes
dot plot x comma y and then let's give it a color how about red since we're
recreating the same graph let's keep it uniform oops and it helps if i spell axes
right i don't know where that came from but this looks identical to
the one we had up above so here's our axes.plot x comma y in red same graph same
setup but this time we've added a variable equal to fig.add_axes so
our plt.figure is our canvas our axes is what we're working in and then our
axes.plot x comma y and again we can draw subplots let me put that down here
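A minimal sketch of the figure and axes approach. The list passed to `add_axes` is `[left, bottom, width, height]`, each given as a fraction of the figure from 0 to 1. The values `[0.1, 0.1, 0.8, 0.8]` are an assumption chosen so the axes fit inside the canvas; they are not necessarily the numbers typed in the walkthrough.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 25)
y = x * x + 2

fig = plt.figure()   # the canvas

# [left, bottom, width, height] as fractions of the figure (assumed values).
axes = fig.add_axes([0.1, 0.1, 0.8, 0.8])

axes.plot(x, y, "r")   # same red line as before, drawn on this specific axes
```

Holding on to `fig` and `axes` as variables makes it explicit which canvas and which plot area each later call applies to.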
just like we did before and a little different notation here we're going
to do fig comma axes equals plt.subplots and in here it's going to be the number
of rows we're going to do nrows equals one and ncols equals two so if you remember
what we did before this we had one row with two different graphs on it we're
going to do the same thing but note how we did this here's our figure our canvas
and our axes we're going to create actually two different axes we're going to
create one row and two columns and so axes is an array of axes so we can simply do
a for loop let's do for ax in axes this will now look familiar ax.plot we're going
to do x comma y we'll go ahead and make it red keep everything looking the same
remember nice uniform graphs everything looks the same and if we go ahead and run
this you'll see we get two nice side-by-side graphs so just as we had before the
same look the same setup and just for fun let's change ncols to three we'll run
that and now you see we'll have three on there and let's see if we make it a
little bit more interesting we'll do nrows equals two and you can see down
here we're going to get an attribute error that's because with two rows and three
columns axes comes back as a two dimensional array so the loop hands us a whole
row of axes instead of a single one and a row has no plot method flattening the
array first gets around that and if the plots look scrunched together you can fix
that by changing the canvas size
which we'll look at in just a minute and there's other ways to change it on here
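A minimal sketch of `plt.subplots` with a loop over the returned axes. With a single row (or a single column) the result is a one-dimensional array, so a plain loop works. With more than one row and more than one column the result is two-dimensional, and looping over it yields rows of axes rather than individual axes, which is what raises the AttributeError; iterating over `.flat` visits every axes in turn. The names `fig2` and `grid` are introduced here for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 25)
y = x * x + 2

# One row, two columns: axes is a 1-D array of two Axes objects.
fig, axes = plt.subplots(nrows=1, ncols=2)
for ax in axes:
    ax.plot(x, y, "r")

# Two rows, three columns: grid is a 2-D array, so grid[0] is a whole row
# (an ndarray with no plot method). Iterating over .flat reaches each Axes.
fig2, grid = plt.subplots(nrows=2, ncols=3)
for ax in grid.flat:
    ax.plot(x, y, "r")

fig2.tight_layout()   # adjusts spacing so labels on neighboring plots do not collide
```

`tight_layout` is one of the other ways to deal with crowded subplots besides enlarging the figure.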
but here we go we can do nrows 2 and ncols equals 1 you can see two nice
images right above each other we'll go back to the original one row two columns
side by side left to right and we can also draw a picture or graph inside
another graph and that's kind of a fun thing to do it's important to note that we
can layer our stuff on top of each other which makes for a really nice
presentation so let's start by uh fig we'll create another figure so we're going
to start over again with our canvas we set that equal to plt.figure so there's
our new canvas and let's do axes we'll call it axes one and two axis one equals
fig dot add axes remember this from earlier and this here similar numbers we
used before saying how big this axis is this figure in the axis is so this is
going to be the big axis and let's do axes 2 equals another figure add axes and
then 0.2.5.4.3 and if we're going to do this they need data on them so let's go
ahead and plot some data on our axes so axis1 dot plot and we'll make this simply
x comma y comma make it red and then let's go axes2 dot plot and let's reverse
them y comma x comma green there we go doing what i told you not to do you
shouldn't be swapping axes around and plotting your data in five different
directions because it's confusing let's go ahead and run this and see what this
looks like and then let's talk a little bit about this we talked about the
0.2 0.5 0.4 0.3 and let me just grab the annotation for that that's left right width
and height so we have in here that this is going to be left right so here's our
left is point two and point five and we you know what let's just play with this a
little bit what happens when i change this to point one moves it way over to the
left so there's our point one so we can make this point four run that there we go
so you can see how you can move it around the branches on here 0.2 0.5 is the
left so that's our right so see what happens when we do point oh let's make this
point one that actually is they had it down at left right i thought this was
wrong it's actually how far from the bottom let me switch that on here bottom
there we go so we had here on this we can go ahead and put that back to 0.5 and
run that and this is 0.3 let's make this 0.3 also and that is the width and then
of course there's the height we can make that really tiny actually let's do 0.2
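A small sketch of the graph-inside-a-graph setup. The four numbers passed to add_axes are left, bottom, width and height as fractions of the figure. The main axes rectangle and the sample data are assumptions; the inset rectangle 0.2, 0.5, 0.4, 0.3 is the one from the walkthrough.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Assumed sample data.
x = np.linspace(0, 5, 11)
y = x ** 2

fig = plt.figure()

# [left, bottom, width, height], each a fraction of the figure size.
axes1 = fig.add_axes([0.1, 0.1, 0.8, 0.8])  # assumed rectangle for the big axes
axes2 = fig.add_axes([0.2, 0.5, 0.4, 0.3])  # inset axes drawn on top

axes1.plot(x, y, "r")
axes2.plot(y, x, "g")  # swapped on purpose, as in the walkthrough
```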
and let's run that and you can see it changes the height on there we make it even
smaller 0.2 by 0.2 and as you can see you can get stuck playing with this to make
it look just right it can sometimes take a little bit certainly once you have the
settings if you're doing a presentation you try to keep it uniform unless it
doesn't make sense for the graph you're working on try to keep the same colors
the same position and the same look and feel and i mentioned earlier we can
adjust the canvas size so this is from earlier i just copied it down below we're
going to replot the same data we've been looking at and what we can do is we can
change the figure size to 16 by nine let me run that and show you what that looks
like so it fills the whole screen and then if you are normally when you're
working on the screen you don't worry too much about this but we can set the dpi
to 300 run that there it goes this is your dots per inch and if you are doing an
output of this and you're
printing a hard copy you want the higher quality i would suggest nothing under
300 if it's a professional print you might get a little less than that but
whenever i'm doing professional graphics and printing them out on something 300
dots per inch is kind of the minimal on there you can go a lot higher too but
keep in mind the higher you get the more memory it takes and the more lag time
and the more resources you use so usually 300 is a good solid number to use your
dots per inch and you can see it draws a nice large canvas here
which is 16 by 9 and then the dpi is 300 on here so it's a little higher quality
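As a rough sketch, figure size (in inches) and dpi are both set when the figure is created, and together they determine the pixel dimensions. The sample data and the axes rectangle are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Assumed sample data.
x = np.linspace(0, 5, 11)
y = x ** 2

# figsize is (width, height) in inches; dpi is dots per inch.
fig = plt.figure(figsize=(16, 9), dpi=300)
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])  # assumed rectangle
ax.plot(x, y, "r")

# Pixel dimensions are inches times dpi.
width_px, height_px = fig.get_size_inches() * fig.dpi

# The same keyword arguments work with plt.subplots.
fig_small, ax_small = plt.subplots(figsize=(16, 9), dpi=100)
ax_small.plot(x, y, "r")
```

At 16 by 9 inches and 300 dpi the canvas is 4800 by 2700 pixels, which is why higher dpi values take noticeably longer to draw.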
and just out of curiosity i wonder how long it takes to draw something double
that size 600 and you can see here we're at 600 dpi it's going to take a while
there it goes just because it's utilizing a lot more graphics on there and let me
just go back to the 300 no actually let's do a 100 you're not going to
see a difference on this because it is web based and web graphics are pretty low
resolution and up here you saw i did this with plt dot figure this works the same
if i do fig comma axes equals plt dot subplots with fig size and then we'll go
ahead and do axes dot plot x comma y
comma we'll stick to red let's go ahead and run this and you should get almost
the same thing here here's our axis on the subplot on here with the fixed size
and the dpi let me take this all out let me just remove all that real quick run
it again there we go now we're back to our original figure and let's look at some
of the other things you can do with this one thing we do is we can set a title
for the axis so axis set title you'll see right here since i put this on the axis
it's the main title for the whole graph and if you're going to have a title you
should also label so we can label our x label and we can set our y label in this
case we're just going to call it x and y keep it nice and uniform and if we run
this you'll see that we've added a nice x label and y label whoops where'd they
go and it turns out in this environment that you have to put it before the title
so let me go ahead and put it before the title and there's our xy let me run that
and of course we can also up our size a little bit so you can see what's going
on a little better so here we have x label x if you come down here you'll see our
x label and our y label we can of course change this to x label you can change
this to y be whatever you want on here of course and our title graph there we go
run so here we have our title graph our y label and our x label all set up on our
nice little plot and then before we move on to the next section let's do one more
thing on here we have a thing called the legend and we're going to
set our ax dot legend label 1 label 2 up here that's the format for it but let's go
down here and actually use it i'm going to do two different plots we're going to
have axes plot x by x squared and x by x cubed and if i run this you'll see it
puts two nice lines on the graph but it's nice to have a legend telling
you what's going on so for the legend we can actually do axes dot legend since we
have the two plots and in here we've created a list and we have y equals x
squared y equals x cubed you can actually put this as whatever you want those are
just strings and then location two and let's go ahead and run this and see what
that looks like and you can see it puts a nice legend on the upper left hand
corner location two we can do location three and run it and it drops it down to
the bottom location one i can't remember where that's at there we go upper right
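The title, axis labels and legend steps can be sketched like this. The sample data is an assumption, and the label strings are just illustrative. The numeric loc codes used here are matplotlib's documented ones: 0 is "best", 1 upper right, 2 upper left, 3 lower left, 4 lower right.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Assumed sample data.
x = np.linspace(0, 5, 11)

fig, ax = plt.subplots(dpi=100)
ax.plot(x, x ** 2)
ax.plot(x, x ** 3)

ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("title graph")

# One label per plotted line, in plotting order; loc=2 places it upper left.
legend = ax.legend(["y = x**2", "y = x**3"], loc=2)
```

The string forms, such as loc="upper left", are equivalent and easier to remember than the numbers.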
so each one of these is a number that refers to the different locations on the
screen so you kinda have to play with them or look them up to remember where
they're at but they do work it just kind of moves around depending on where you
want your legend out on there so on this section we cover the title of the graph
the y labels and legends this is we're getting into some starting to look really
fancy here so we now have something we can actually put out you'll see the title
the graph looks a little fuzzy so i might in a web setup put the dpi up a couple
notches maybe put it at 200 100 might work fine just so you know something to
notice on here when you're playing with these different things we had our
subplots dpi equals oh let's do 200 to see what that looks like so you can see
now it's a lot clearer it's also larger so it's a nice little feature you can
throw in there with your dpi dots per inch so the next section is let's look at
some graph features we're going to look at line color transparency size and a few
more things on here and oops i forgot the main title so we have our figure and our
axes equals our plt dot subplots and i'm going to do a dpi equals 150 so the
graph comes out nice and large and easy for you to see and let's go ahead and do
three plots on here we'll do x by x plus one which is going to be a straight
line plot x by x plus 2 and axes dot plot x x plus 3 this looks like we're
doing a nearest neighbor setup where we're showing how it uh located data putting your
lines on there between the nearest neighbors there we go so it draws a nice
little graph with three lines on it one of the things we can do is we can control
the alpha on this oops and you can actually see the when they did these lines it
automatically pulls in different colors for your setup so there's some
automatic things going on in there and a lot of times we did that comma r here
we're going to do color equals red another notation on here let's go ahead and
run this now we have a bright red line down there and with the matplotlib library
you're not limited to red you can also use one of many different color
references as you see here with the pound sign one one five five dd which is
just blue and we can do the same thing with another color on here which turns out
to be green i can just as easily do this green blue oops there we go blue and run
that and you'll see here we have red blue and green and what i want to do is i
want to make this we're going to set what's called the alpha on this and we're
going to set this equal to 0.5 so this is halfway see-through when i run this
and it's almost going to look pink because you can see through it and let's
change this just a little bit just to make this kind of fun let's square it there
we go run it so now we have this nice square that comes up and you can see when
it crosses it because i plotted these two lines after it and they have no alpha
the red is behind those lines or in this case pink because we did the alpha
halfway through so let's go ahead and do this alpha equals 0.5 and oh you know
what instead of squaring it let's take it to the 0.5 power that'll be kind of
interesting to see what that does we'll just go to keep it squared there we go
and run that and let's go back and look at this where it crosses over and the
first thing you see right here is on the blue is kind of light blue now you can
see how the two colors add together you get almost a purple on there so i can
clearly see where the red crosses the blue line and then the green just blanks it
over because i didn't do any opaqueness no alpha on there so this is great if you
have lots of data that crosses over and you need to be able to track those lines
better and we'll go ahead and do this 0.5 and we'll run that oops i did equals
0.5 let me go ahead and run that and so you can see right here now you can easily
see the red line how it crosses the green and the blue down here and if we want
to we can do this as the default is one solid so we can change this all to point
eight let me just do that oops point eight there we go run oops i must have hit a
wrong button there let me try that again i accidentally got rid of a bracket and
let's go ahead and run that and we come down here and look at this you can still
see where it passes behind them but the green dominates and the blue dominates
because we're now at eighty percent instead of fifty percent and you can do less
that's kind of fun although at some point the lines kind of fade so point five
is usually the best setting on there we have a nice pastel here at point three
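A short sketch of the color and alpha options discussed here. The sample data is an assumption; the hex value is the one read out in the walkthrough.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Assumed sample data.
x = np.linspace(0, 5, 11)

fig, ax = plt.subplots(dpi=150)

# Named color with 50 percent transparency, plotted first so it sits behind.
red_line, = ax.plot(x, x ** 2, color="red", alpha=0.5)

# Hex color string for a precise shade; no alpha given, so fully opaque.
blue_line, = ax.plot(x, x + 2, color="#1155dd")
green_line, = ax.plot(x, x + 3, color="green")
```

Lines are drawn in the order they are plotted, so a later opaque line covers an earlier one unless the later line is also given an alpha below 1.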
and you can easily see where they cross over and just like you can play with the
colors we can play with line width and you know let's do let's try dpi 100 and
see what that looks like on my screen equals 100 and we'll go ahead and just take
our ax plot let's do four of these lines just you can see how they look next to
each other real quick here there we go if i run this they should all appear the
same it automatically does different colors on there so let's do color equals
blue i forgot my quotation marks there we go and we'll go ahead and just make
these all blue just for purposes of being nice and uniform and then what i want
to do is i want to do the line width line width equals 0.25 and let's just copy and
paste that down here let's do equals one how about 1.5 and let's do one more just make
this equal to two let's see what that looks like and when we do that you can see it
goes from a very thin line at 0.25 to 1 to 1.5 and 2 which is twice the width of
the 1. and if we're going to do different sizes we had different colors we had
our alpha scheme let's take this whole thing here let's paste it down here and do
another one but instead of line width let's look at styles and something to know
here you can actually abbreviate this with lw so line width can also be point
let's just do everything point two and let's set up a line style we'll do the
first one dashes let me just paste that down here so i'm not doing
a lot of extra typing there we go take this out so we have our dashed we can do
a dash dot let's do the dash dot here and a colon here there we go and there's a
lot of different options we'll look at a few more as we go down for different
ways of highlighting data but when you look at this we have everything is a line
width of two and now we have a straight line we have a dashed line or a dot dash
and a dot dot dot line and then another thing we can add on here is we're going
to do here's our ax plot and we did x let's do x plus 4 so it goes right on the
top then do color black line width 1.5 so it's a smaller line and we're going to
take the line and we're going to set dashes so look i've changed some of the
notation here for my line and my ax plot so i can set my line comma equal to ax
dot plot and then i can change the line settings this way and when i run this let me
run that on here you'll see the 5 10 15 10 creates a series of dashes that are
varied in length in this case they alternate between a short dash and a long
dash we can play with these numbers curiosity always has me what happens when you
play with the numbers just to see what they look like let's do this let's paste
this down here i'll do two of these just because they're kind of fun to play with
and let's change this from 10 to 3 and we're going to change this one from 15 to
4 and let's run that and you can see the differences in the lines oops that's a
little bit confusing on there because i forgot to change them so the lines are all on top
of each other so let me change that really quick here and let's run that and now
you can see here's our original dashed line alternating when i change these
numbers on the second one the very end value to 3 you can see now we have the
dashes of five let's see i'm going to guess this is a dash is a five skip ten
dashes of fifteen skip three and then it goes back to the beginning dash is five
dashes skip ten dashes skip three and of course the last one we just switched up
a little bit it looks a lot more uniform because i'm using two sets of ten or if
i did something like this and change it to 30 it really becomes pronounced as far
as the distances between them and instead of 4 let's go oh let's put 30 here also
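The line width, line style and custom dash steps might look roughly like this. The sample data and the vertical offsets are assumptions; the dash sequence 5, 10, 15, 10 is the one from the walkthrough and reads as dash 5, gap 10, dash 15, gap 10, measured in points.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Assumed sample data.
x = np.linspace(0, 5, 11)

fig, ax = plt.subplots(dpi=100)

# lw is shorthand for linewidth; linestyle picks the built-in pattern.
ax.plot(x, x + 1, color="blue", lw=2, linestyle="-")   # solid
ax.plot(x, x + 2, color="blue", lw=2, linestyle="--")  # dashed
ax.plot(x, x + 3, color="blue", lw=2, linestyle="-.")  # dash dot
ax.plot(x, x + 4, color="blue", lw=2, linestyle=":")   # dotted

# ax.plot returns a list of lines; the trailing comma unpacks the single line.
line, = ax.plot(x, x + 5, color="black", lw=1.5)
line.set_dashes([5, 10, 15, 10])  # on, off, on, off lengths in points
```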
30 by 30 there we go really pronounced on that one and let's look at one more
important group for plotting our data and in this we're going to here's our plot
we started with with the x plus one x plus two x plus three and did it in blue on
this one's three or four different blue lines and this property we want to add
the actual plots so you can see where the plots are on the graph and for that we
might have marker equals o and if we run this you'll see it puts a dot for each
of these and there's 25 dots because we have 25 x values so we actually have zero
and each of the different values of x y are then plotted here with the dots and
we don't want to just limit ourselves to dots you can also do plus sign that's
another option dots is most common i actually like the dots the best if we do
the plus sign you see it puts a nice crosshairs or plus sign on there and we can
do a marker there's a number of different markers you can use and i think this
one was it s is another one which is a nice square and that's actually a good
one s for square o for period okay that's just kind of weird so you can see that
probably on these markers another one is uh number one so if we run that you'll
see we now have these little hatch marks and let's take oh let's just go with the
o on this one by the way this works with square really nicely some stuff we're
gonna do here on just a second let's do marker size equals two and change that
to five and run that and you can see here it's a nice little tiny dot versus uh
the size dot here this is interesting because it said two i thought it would be
bigger but if you do 0.5 it gets even smaller and let's just do 10 to see what
that looks like run that looks huge so marker size a lot of these are dependent
on the dpi and the setups there's things that switch around as far as the way the
size shows up you got to be a little careful when you change one setting it can
change all the other markers and then let's take our square on here and we'll do
we have marker size so we also have marker face we'll set that equal to red of
course i mean change the so it's up one notch we'll run that whoops must have
mistyped something on here and i did it's marker face color equals red and so
when i run that you can now see i have the squares on there with the marker face
color of course we can mix and match these come down here and we'll make this
instead of let's make this plus seven and we'll make this size 15 marker face
color equals and we'll do what green just because there we go run very hard to
actually see what's going on there still 25 dots they kind of overlap as you can
see they print them over each other of course if we really wanted to make it look
horrible we could just make that really huge generally though you want something
a little bit smaller and cuter we'll just try doing it this way there we go
that's too small to even see the face so four you can start to see the face on
there around four and maybe an eight eight might be a good number for this there
we go eight again that all just depends on what you're trying to show and display
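A sketch of the marker options covered above. The sample data, offsets and sizes are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Assumed sample data.
x = np.linspace(0, 5, 11)

fig, ax = plt.subplots(dpi=100)

ax.plot(x, x + 1, color="blue", marker="o", markersize=5)  # circles
ax.plot(x, x + 2, color="blue", marker="+")                # plus signs
ax.plot(x, x + 3, color="blue", marker="s", markersize=8,
        markerfacecolor="red")                             # red-filled squares
ax.plot(x, x + 4, color="blue", marker="1")                # small tick marks
```

One marker is drawn per data point, so the number of markers always matches the length of x.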
so we've covered a lot of stuff here as far as our lines we've covered opacity
with our alpha setting on there which gives us some nice pastels you can see how they
overlap and how they cross over we covered the line width with different sizes on there
different formats for the line itself and you can combine all these
so you can have our line width equals two line style equals dashed you can bring
this down here also to the markers and then we added markers we just entered a
circle a plus sign the square a little tick which uses a one then we had a marker
size and a marker face color and when we combine those you see we get a nice different
series of representations we also briefly mentioned color where you didn't have
to use like in here we used color black someplace up here and i have to find it
we use the actual number for the color as opposed to i changed them to red and
blue so you can get very precise on the color if you have a very specific color set
that you need to match your website or whatever you're working on all those are
tools in the matplotlib library so we have one more piece to formatting the graph
that we want to show you and then we have two big sections we're going to go over
the different graphs that they have along with a challenge problem so let's go to
the last section we're going to look at is limits we're going to limit our data
so this first primer is going to paste in there we're going to create our
subplots one two so one row two columns we're going to do a figure size of 10
comma five this should all look familiar now since we've done a number of them
and we're going to go ahead and plot and this is an interesting notation you
should notice here our axes of zero so instead of what we've used before you can just
iterate through them but they're just an array so axes of zero is
the first axes out of two and we're going to plot x x squared x x
cubed line width two so we're going to go ahead and just plot two lines right on
top of each other with a single plot call on here and we'll set the grid
equal to true on here let's go ahead and run that and you can see here our two
plots with the x value going across and i'm going to do something similar and by
the way as you can just if you look at it you can see the grid on there that's
all that is easier to spot the data going across we're going to take the same
data for axes one so we have our plot of x x squared x and x cubed line width two
and this time we're going to take our axes one and do y limit it's actually set
underscore ylim this is the y axis so it's going to be a list of two
values and we'll do 0 comma 60 i'm just making these numbers up the guys in the
back actually made them up i'm just using their numbers we're going to set the x
limit with set underscore xlim and we'll set it as don't forget our brackets there two comma
five so it's the same data going in but we're setting a limit on it let's go
ahead and run that and let's see what it comes out as and here we have the y
limit zero to sixty so we're looking at just the lower part of this curve here up
to here and we have the x limit two to five so that starts right here at two and
you can see very different graphs this is kind of nice because you could actually
put one of these on top of the other if you wanted to draw focus to one part of a
graph remember how we did that earlier one inside the other but just a quick note
you can easily limit your graph and re kind of reshape the way it looks quite
easily and we can also add that grid down there if you want a grid we'll run that
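The axis limits and grid can be sketched as follows. The sample data is an assumption; the limits 0 to 60 and 2 to 5 and the 10 by 5 figure size come from the walkthrough.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Assumed sample data.
x = np.linspace(0, 5, 11)

fig, axes = plt.subplots(1, 2, figsize=(10, 5))

# Left: two curves drawn with one plot call, default limits, grid on.
axes[0].plot(x, x ** 2, x, x ** 3, lw=2)
axes[0].grid(True)

# Right: same data, but zoomed in by limiting both axes.
axes[1].plot(x, x ** 2, x, x ** 3, lw=2)
axes[1].set_ylim([0, 60])
axes[1].set_xlim([2, 5])
axes[1].grid(True)
```

In this sketch the grid call is placed after the limits on the right-hand axes; calling it before set_ylim and set_xlim gives the same picture.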
and add the grid in there oops i guess you have to do the grid beforehand switch
that there we go sometimes the order on this is really important so you may
double check your order when you're printing these things out it also helps if i
change it to one so in this case might not be the order i wonder if we'll go back
here as one there we go so it doesn't matter the order and grid but you can set
the grid for easy viewing here nice setup on there but you can see how we can
limit the data so let's start looking at some other 2d graphs and make this cell
a markdown so we run it as a nice pretty title to it and let's go ahead and
create some data with an np array we'll do 0 to
5 on here there we go and let's look at four common graphs we'll put them side
by side so we'll do fig comma axes equals plt dot subplots one row four columns and
then figure size hopefully it'll fit nicely on here it seems to do pretty good
on here and i'll go and just run that since we're in there run you'll see i have
my four blank plots on here and we'll start with axes of zero let's set title
and we want this to be a scatter plot a scatter plot just means it has a bunch of
dots on it so here's our axes of zero dot scatter easy to remember scatter a
bunch of plots on there we'll do our in we can do x or n there we go and let's go
ahead and do axes set title scatter already did that we're just going to do
scatter that's how you do it on there notice how you create a scatter plot with
simply with the scatter control and we'll do let's do the variable x x plus
let's throw some randomness in here usually scatter plots are i have a lot of
random numbers connected to them that's why they do them on there and so the
bigger the x gets the bigger the randomness so 0.25 times the randomness and what
we should end up doing here is with the scatter plot and you can see as you go up
it just kind of has some random numbers and moves up and down the line but plots
just the points so if you remember from back up here where we did marker this is
plotting basically just the marker so it's a scatter plot probably less used is a
step plot so for axes one we'll go ahead and do a step plot so you can see what
that looks like and this time we'll use our n value instead of x we generated
that n value up here and so for this we have n and n squared
line width equals 2 and if we run that it creates a nice step up let's
see so we've got a scatter plot we've got a step plot let's do a bar plot and
we'll use the same formula n and n squared alignment centered because you can have
them centered or aligned to the edge width 0.5 and alpha if you remember correctly that's how opaque
it is let's see what that looks like on there so we have some nice you can see
here a nice bar plot it should look very similar to the step plot but colored in
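A sketch of the four side-by-side plots. The x values, the random noise and the figure size are assumptions; the fourth panel uses fill_between, which the walkthrough reaches a little further on.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)       # seeded so the sketch is repeatable
n = np.array([0, 1, 2, 3, 4, 5])
x = np.linspace(0, 5, 11)            # assumed sample data

fig, axes = plt.subplots(1, 4, figsize=(12, 3))

# Scatter: markers only, with a little random noise added to y.
axes[0].scatter(x, x + 0.25 * rng.standard_normal(len(x)))
axes[0].set_title("scatter")

# Step: values are held constant between points.
axes[1].step(n, n ** 2, lw=2)
axes[1].set_title("step")

# Bar: align may be "center" (the default) or "edge".
bars = axes[2].bar(n, n ** 2, align="center", width=0.5, alpha=0.5)
axes[2].set_title("bar")

# Fill between: shades the region between x**2 and x**3.
axes[3].fill_between(x, x ** 2, x ** 3, color="green", alpha=0.5)
axes[3].set_title("fill_between")
```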
and we can change the width let's see what happens we do 0.9 run and if we take
width out completely run that you can see it starts coming together on there and
we can change the alpha we can take the alpha out too and run that so now you
have the solid colors and if we take out the center and run that everything you
really can't see the shift on here because that's actually the default on this
but these are common settings for the bar graph let me just put them back in
there there we go alignment center and alpha now i can't say i've used the step
graph very much there's certain i guess domains of expertise that
require a step graph but the scatter plot and the bar graph are very common
especially the bar graph and we'll look at histograms here in just a minute so i
use histograms a lot especially in data science but this is nice if you have very
concrete objects say how many people are wearing yellow hats that kind of thing
but if we're going to do that let's go ahead and do the last one which i see a
lot more in the sciences certainly used in data science but more like for
mapping i saw publication on solar flares and they were discussing the energy
and so filling in the graph gives it a very different look so we're going to do
the fill between and it's just like you think it'd be it's fill between but with
a underscore between them and we'll do x and x squared and x and x cubed and
we'll do color green and alpha again in case you had other data you want to plot
on there you can see it forms a nice squared coming up here and also if you look
at the bottom one is your squared value the upper line is your cubed value and
then it fills in everything in between if you remember from calculus this would
be if you had like a car a motor an efficiency they would talk about the
efficiency going up and the loss and you're looking for the space or the area
between the two lines so it gives you a nice visual of that now let's look at a
few more basic two dimensionals so we have our figure figure size on here we're
going to do a radar chart to be honest i've never used a radar chart in business
or in data science i have to find a reason to use one now so the first line for
doing a radar chart we have to add axes to the figure and with this this
actually creates our oh let's let's run it so you can see what it creates it
creates a nice looks like you're on a submarine and you're tracking the hunt for
red october or something like that and it needs all of these the polar is the
fact that we're doing polar coordinates zero zero point six point six has to do
with the size if you take out any of these things and run them you get just a box
if you take out the other half you pretty much get nothing in there and if you
change these numbers and change them a little bit you can see it gets bigger they
had 0.6 on here i'll go ahead and leave it as one because that's just kind of fun
that's all about the size on here the height and the width and then let's create
some data t equals np dot linspace and this is 0 to 2 times np dot pi so if you
remember that is one full turn around the circle in radians and we're going to generate 100 points so
this is just the thing of data we're putting together then we can simply do an ax
dot plot and in this case let's do t comma t which would be a diagonal line on a
regular chart and we'll give it a nice color equals blue and line width equals
three let's see what that looks like and we can see here a spiral coming out
remember this would be just a diagonal line on a regular chart what happens if we
take this and instead do t times 0.5 there we go and you can see it slightly
alters the way it spirals out we can do t times two spirals that a little quicker
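A minimal sketch of the polar plot (the chart called a radar chart here). The rectangle 0, 0, 0.6, 0.6 is the one mentioned in the walkthrough; everything else follows the description.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure()

# polar=True switches the axes to polar coordinates (angle, radius).
ax = fig.add_axes([0.0, 0.0, 0.6, 0.6], polar=True)

# 100 angles covering one full turn, in radians.
t = np.linspace(0, 2 * np.pi, 100)

# Radius equal to the angle gives a spiral; on regular axes this
# same call would draw a straight diagonal line.
ax.plot(t, t, color="blue", lw=3)
```

Replacing the second argument with t * 0.5 or t * 2 changes how quickly the radius grows with the angle.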
so it's kind of just a fun one like i said i've never used a radar chart it's
uncommon but you can always think of radar on a submarine it kind of looks like one of
those or in an airplane and none of this would be complete if we didn't discuss
histograms oh my gosh do i use a histogram so much and we'll use our numpy that
we have set as np to generate oh it looks like we have a hundred thousand
variables we're going to set equal to n and of course we create our figure and
our axes from subplots one two figure size 12 14. so we're going to look at two
different variations of the histogram and we'll set a title default histogram set
our title there and then this is simply hist for histogram and we'll just go
ahead and put in our n in there and let me run this and see what that looks like
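A rough sketch of the histogram calls. The data is assumed to be 100 000 standard normal samples, and the figure size is a guess. The right-hand panel shows the same call with the bins and cumulative arguments that the walkthrough turns to next.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)   # seeded so the sketch is repeatable
n = rng.standard_normal(100_000) # assumed: normally distributed samples

fig, axes = plt.subplots(1, 2, figsize=(12, 4))  # assumed figure size

# Default histogram: matplotlib chooses the number of bins.
counts, edges, _ = axes[0].hist(n)
axes[0].set_title("default histogram")

# Cumulative histogram with an explicit bin count: each bar is the
# running total of all samples up to that bin.
cum_counts, cum_edges, _ = axes[1].hist(n, cumulative=True, bins=50)
axes[1].set_title("cumulative detailed histogram")
```

hist returns the bin counts and bin edges, which is handy when you want the numbers and not just the picture.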
and let's talk about what is going on here so we generated an array here of data
a hundred thousand random values it looks like they're mostly between -4 and 4 and then it adds
up each one it says around 0 you have 35 000 so that's what's most common
on here and we have 20 000 that are somewhere in this range right here between
the minus two and well it looks like one minus two and somewhere between zero and
1 there's 30 000 numbers so all this is saying is this is how common these
values are and this points you in so many directions when you're
looking at data science to go ahead and run your histogram so you should always
have your histogram and you can always put limits and all the other different
things on your array just like you did on the other graphs on there and then
we're going to do a cumulative detailed histogram and all it is is a histogram
let me just do that and we set cumulative equal to true and bins equal 50 and i
really want to highlight that the cumulative equals true is important but we can
now choose how many bins we have in the first one it kind of selected them for us
in this case let me go ahead and run this and you'll see it prints the data
out for us and here's our whoops must have missed oh there we go it doesn't help
that i put it over the old one there we go okay so now you have your default
histogram and then we have a cumulative histogram and we should have 50 steps in
there and let's just find out if that's true not so much by counting them i'm not
going to count them if you want to you can count them let's just change it to 10
and see what happens and we see here we have now 10 counts of that and we could
set that for 5 and run that and then we have our 5 on there and we go ahead and
take the cumulative equals true out just so you can see what that looks like and
let me run that on here too that looks just like it did before i think there's
what one two three four five six seven eight they have eight different bins on
here is what the default came out of put that back in there run and so now it
should look almost identical and it does and then we can put the cumulative back
in see what that looks like with the cumulative and run that and we can see how
that shifts everything over and has a slightly different look wait it shifts it
all to the right no it doesn't actually shift it to the right it's cumulative so
it's the running total of the different occurrences and so what that means is like if you
consider this like for the year of rainfall we have like day one you had a little
bit of rain day two we have more rain and so if you look at the number this is a
hundred thousand instead of thirty five thousand so it's a cumulative detailed histogram
of the occurrences as it grows and rainfall is a good one because that would be a
cumulative histogram of how much rain occurred throughout the year and we're
going to look at two more graphs we've already looked at a bunch of them we
looked at our radar graph we've looked at scatter step bar fill in basic plots we've
looked at different ways of showing the data and we can increase the size of the
line the look the color the alpha setting so let's look at contour maps let's put
that in there there we go draw a contour map and before we draw a contour map we
need to go ahead and create data for it and if you have contours your data is all
going to have three different values so let's go ahead and create the data here
we have our you'd import your matplot library your numpy so we have our numbers
array we'll import matplot.cm and that's your color maps so you have all these
different color maps you can look at there's like hundreds of color maps so if
you don't want to do your own color you can even do your own color map they're
pretty diverse and of course our plt that's our pyplot and to generate
our different data we're going to create a delta of 0.025 and we'll start with x and
we're going to create an array between -3 and 3 in delta increments of 0.025
and we'll have our y we'll do something similar and then we'll create our x y
into a mesh grid again these are all numpy commands so if you're not familiar
with these you'll want to go back and review our numpy tutorial and we'll do an
exponential on here minus x squared minus y squared for z1 we'll do a z2 so we
have two different areas and z equals z1 minus z2 times two so we've created a
number of values here and let me go ahead and run this and let's plug that in so
you can see where those values are going so once we've set these we're going to
create our figure and our ax from our plt subplots we're going to create the
variable cs and this is going to be our contour so right here cs is our contour
surface and we're feeding it x y and z if you remember x y we created as our x
and y components using our mesh grid and you know what let's do this just because
it's kind of good to see this let's go ahead and print x and let's print y and
i always like to do this when i'm working with something that's either really
complicated which in this case is what we're looking at or you don't understand yet so
we've created a mesh grid we have x y and when we're done with this we end up
with here's our x and this set of values and our y so these are x and y
coordinates and then we've also created z based on our x and y so we have x
capital x capital y and capital z is our three components x and y being the
coordinates while z is going to be our actual height since we're doing a contour
map so we created our contour map from our x y and z coordinates we want to go
ahead and put in a clabel maybe we want to go ahead and do a title on here
we'll put that in our set title and this is a contour there we go contour map
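A rough sketch of the contour cell described above. The range used for y and the shift applied inside Z2 are assumptions (the walkthrough only says "something similar" and "a z2"), and the viridis color map is just one pick from matplotlib.cm, so treat this as an illustration rather than the exact notebook code.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.pyplot as plt

delta = 0.025
x = np.arange(-3.0, 3.0, delta)   # -3 to 3 in steps of delta, as in the walkthrough
y = np.arange(-2.0, 2.0, delta)   # assumed range; the source only says "something similar"
X, Y = np.meshgrid(x, y)          # 2-d coordinate grids built from the 1-d arrays

Z1 = np.exp(-X**2 - Y**2)                  # bump centred on the origin
Z2 = np.exp(-(X - 1)**2 - (Y - 1)**2)      # second bump; the (1, 1) offset is an assumption
Z = (Z1 - Z2) * 2                          # height values for the contour lines

fig, ax = plt.subplots()
CS = ax.contour(X, Y, Z, cmap=cm.viridis)  # draw contour lines of Z over the x/y grid
ax.clabel(CS, inline=True, fontsize=10)    # write the level value on each line
ax.set_title("contour map")
# in a plain script (not a notebook) you would call plt.show() here
```

Printing X and Y, as the walkthrough does, shows that both have the same two-dimensional shape as Z, with X varying along columns and Y along rows.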
and let's go ahead and run this and see what that looks like and you'll see we
generated a nice little contour map there's different settings you can play with
on this but you can picture this being you're on a mountain climb and here we
have a line that represents zero maybe that's sea level and then moving on up
you have your contours of 0.5 and then minus 1 and different setups little hills
i guess if it's minus that's like a pit so i guess you're going down into a pit
at minus 0.5 and minus 1 while on the other side you can see you're going up in levels so
here's a mountaintop and here's like a basin of some kind and in data science
this could represent a lot of things this could also be representing two
different values and maybe profits and loss i don't know if i'd ever really do
that as a contour map but i'm sure you can be creative and find something fun to
do with a contour map and then we're going to look at one last map which is the
3d map and those can be really important as a final product because they can
show so much additional information that you can't fit on two-dimensional graphs
and we're going to cover the scikit-learn tutorial which has a lot of features
and all kinds of api in it to explore data and do your data science with in fact
it is probably one of the top data science packages out there so what is
scikit-learn it's a simple and efficient tool for data mining and data analysis it's built
on numpy scipy and matplotlib so it interfaces very well with these other
modules and it's an open source commercially usable bsd license bsd originally
stood for berkeley software distribution license but it means it's open source
with very few restrictions as far as what you can do with it another reason to
really like the scikit learn setup so you don't have to pay for it as a
commercial license versus many other copyrighted platforms out there what we can
achieve using scikit-learn the two main things are
classification and regression models classification identifying which category an
object belongs to one application very commonly used is spam detection so is
it a spam or is it not a spam yes no in banking it might be is this a good loan
bad loan today we'll be looking at wine is it going to be a good wine or a bad
wine and regression is predicting an attribute associated with an object one
example is stock prices prediction what is going to be the next value if the
stock today sold for 23.5 cents a share what do you think it's going to sell for
tomorrow and the next day and the next day so that would be a regression model
same thing with weather weather forecasting any of these are regression models
where we're looking at one specific prediction on one attribute today we'll be
doing classification like i said we're gonna be looking at whether a wine is good
or bad but certainly the regression model which is in many cases more useful
because you're looking for an actual value is also a little harder to follow
sometimes so classification is a really good place to start we can also do
clustering and model selection clustering is taking an automatic grouping of
similar objects into sets customer segmentation is an example so we have these
customers like this they'll probably also like this or if you like this
particular kind of features on your objects maybe you like these other objects so
a referral is a good one especially on amazon.com or any of your shopping
networks model selection comparing validating and choosing parameters and models
now this is actually a little bit deeper as far as scikit-learn we're looking
at different models for predicting the right course or the best course or what's
the best solution today like i said we're looking at wine so it's going to be how
do you get the best wine out of this so we can compare different models and we'll
look a little bit at that and improve the model's accuracy via different
parameters and fine tuning now this is only part one so we're not gonna do too
much tuning on the models we're looking at but i'll point them out as we go two
other features dimensionality reduction and pre-processing dimensionality
reduction is we're reducing the number of random variables to consider this
increases the model efficiency we won't touch that in today's tutorial but be
aware if you have you know thousands of columns of data coming in thousands of
features some of those are going to be duplicated or some of them you can combine
to form a new column and by reducing all those different features into a smaller
amount you can increase the efficiency of your model it can
process faster and in some cases you'll be less biased because if you're weighing
it on the same feature over and over again it's going to be biased to that
feature and pre-processing these are both pre-processing but pre-processing is
feature extraction and normalization so we're going to be transforming input data
such as text for use with machine learning algorithms we'll be doing a simple
scaling in this one for our pre-processing and i'll point that out when we get to
that and we can discuss pre-processing at that point with that let's go ahead and
roll up our sleeves and dive in and see what we got here now i like to use the
jupyter notebook and i use it out of the anaconda navigator so if you install the
anaconda navigator by default it will come with the jupyter notebook or you can
install the jupyter notebook by itself this code will work in any of your python
setups i believe i'm running an environment of 3.7 set up on there i'd have to go
in here in environments and look it up for the python setup it's one of the three
x's and we go and launch this and this will open it up in a web browser so it's
kind of nice it keeps everything separate and in this anaconda you can actually
have different environments different versions of python different modules
installed in each environment so it's a very powerful tool if you're doing a lot
of development and jupyter notebook is just a wonderful visual display certainly
you can use i know spyder is another one which is installed with the anaconda i
actually use a simple notepad plus plus when i'm doing some of my python script
any of your ides will work fine jupyter notebook runs on ipython because it's
designed for the interface but it's good to be aware of these different tools
and when i launch the jupyter notebook it'll open up like i said a web page in
here and we'll go over here to new and create a new python setup like i said i
believe this is python 3.7 but scikit-learn works with
any of the three x's there's even 2.7 versions so it's been around a long
time so it's very big on the development side and then the guys in the back guys
and gals developed they went ahead and put this together for me
and let's go ahead and import our different packages now if you've been reading
some of our other tutorials you'll recognize pandas as pd pandas library is
pretty widely used it's a data frame setup so it's just like columns and rows in
a spreadsheet with a lot of different features for looking stuff up seaborn sits
on top of matplotlib this is for our graphing and we'll see how
quick it is to throw a graph out there to view in the jupyter notebook for demos
and showing people what's going on and then we're going to use the random forest
the svc or support vector classifier and also the neural network so we're going
to look at this we're actually going to go through and look at three different
classifiers that are most common some of the most common classifiers and let's
show how those work in the scikit-learn setup and how they're different and then
if you're going to do your setup on here you'll want to go ahead and import some
metrics so the sklearn.metrics on here and we use the confusion matrix and the
classification report out of that and then we're going to use from the sklearn
pre-processing the standard scaler and label encoder standard scaler is probably
the most commonly used pre-processing there's a lot of different pre-processing
packages in the sklearn and then model selection for splitting our data up it's
one of the many ways we can split data into different sections and the last line
here is our percent matplotlib inline some of the seaborn and matplotlib
plots will go ahead and display perfectly inline without this and some won't
it's good to always include this when you're in the jupyter notebook this is
jupyter notebook so if you're in an ide when you run this it will actually open up a
new window and display the graphics that way so you only need this if you're
running it in an editor like this one specifically jupyter notebook i'm
not even familiar with other editors that are like this but i'm sure they're out
there i'm sure there's a firefox version or something jupyter notebook just
happens to be the most widely used out there and we can go ahead and hit the run
button and this now has saved all this underneath the packages so my packages are
now all loaded i've run them whether you run it on top or run it to the left and
all the packages are up there so we now have them all available to us for our
project we're working on and i'm just gonna make a little side note on that when
you're playing with these and you delete something out and add something in even
if i went back and deleted this cell and just hit the scissors up here these are
still loaded in this kernel so until i go under kernel and restart or restart and
clear or restart and run all i'll still have access to pandas important to know
because i've done that before i've loaded up maybe not a module here but i've
loaded up my own code and then changed my mind and wondering why is it keep
putting out the wrong output and then i realize it's still loaded in the kernel
and you have to restart the kernel just a quick side note for working with a
jupyter notebook and one of the troubleshooting things that comes up and we're
going to go and load up our data set we're using the pandas so if you haven't yet
go look at our pandas tutorial a simple read csv with the separator on here
let me go ahead and run that and that's now loaded into the variable wine and
let's take a quick look at the actual file i always like to look at the actual
data i'm working with in this case we have wine quality dash red i'll just open
that up i have it in my open office set up separated by semicolons that's
important to notice and we open that up you'll see we have go all the way down
here it looks like 1600 lines of data minus the first one so 1599 lines and
we have a number of features going across the last one is quality and right off
the bat we see the quality has different numbers in it five six seven it's
not really i'm not sure how high of a level it goes but i don't see anything over
a seven so it's kind of five through seven is what i see here five six and seven
four five six and seven looking to see if there's any other values in there
looking through the demo to begin with i didn't realize the setup on this so you
can see there's a different quality values in there alcohol sulfates ph density
total sulfur dioxide and so on those are all the features we're going to be
looking at and since this is a pandas we'll just do wine head and that prints
our first five rows of data that's of course a pandas command and we can
see that looks very similar to what we were looking at before we have everything
across here it's automatically assigned an index on the left that's what pandas
does if you don't give it an index and for the column names it has assigned the
first row so we have our first row of data pulled off our comma separated
values file in this case semicolon separated and it shows the different
features going across and we have what one two three four five six seven eight
nine ten eleven features twelve including quality but that's the one we want to
work on and understand and then because we're in a pandas data frame we can
also do wine dot info and let's go ahead and run that this tells us a lot about
our variables we're working with you'll see here that there are 1599 that's what
i said from the spreadsheet so that looks correct non-null float64 this is very
important information especially the non-null so there's no null values in here
that can really trip us up in pre-processing and there's a number of ways to
process null values one is just to delete that data out of there so if you
have enough data in there you might just delete your null values another one
is to fill that information in with like the average or the most common values or
other such means but we're not gonna have to worry about that but we'll look at
another way because we can also do wine dot isnull and sum it up and this will give
us a similar result it won't tell us that these are float values but it will give us a
summation of there we go let me run that it'll give us a summation on here how
many null values in each one so if you wanted to you know from here you would be
able to say okay this is a null value but it doesn't tell you how many are null
values this one would clearly tell you that you have maybe five null values here
two null values here and you might just if you had only seven null values and all
that different data you'd probably just delete them out where if ninety percent
of the data was null values you might rethink either a different data collection
set up or find a different way to deal with the null values we'll talk about that
just a little bit in the models too because the models themselves have some
built-in features especially the forest model which we're going to look at at
this point we need to make a choice and to keep it simple we're going to do a
little pre-processing of the data and we're going to create some bins and bins
we're going to do is 2 comma 6.5 comma 8. what this means is that we're going to
take those values if you remember up here let me scroll back up here we had our
quality the quality comes out between two and eight basically or one and eight we
have five five five six you can see just in the just in the first five lines of
variation in quality we're going to separate that into just two bins of quality
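A minimal sketch of the two-bin split, assuming a short made-up list of quality scores in place of the real wine quality column. The name group_names is illustrative; pd.cut with bins and labels is the pandas call the walkthrough refers to.

```python
import pandas as pd

# hypothetical quality scores standing in for the wine["quality"] column
quality = pd.Series([5, 5, 6, 7, 3, 8, 4, 6])

bins = (2, 6.5, 8)                 # edges: (2, 6.5] and (6.5, 8]
group_names = ["bad", "good"]      # one label per bin, in order

binned = pd.cut(quality, bins=bins, labels=group_names)
print(binned.tolist())
print(binned.cat.categories.tolist())   # category order: bad comes before good
```

Scores of 7 and 8 land in the "good" bin and everything from 3 to 6 lands in "bad". Running the cut a second time on a column that has already been replaced by these labels fails, which is the error the walkthrough runs into before restarting the kernel.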
and so we've decided to create two bins and we have bad and good it's going to be
the labels on those two bins we have a split point of 6.5 and an upper edge of eight
the upper edge is eight because that's the top of the quality scale the six point five
we can change we could actually make this smaller or greater but we're only
looking for the really good wine we're not looking for the zero one two three
four five six we're looking for wines with seven or eight on them so high quality
you know like this is what i want to put on my dinner table at night i might
taste the good wine not the semi good wine or mediocre wine and then this is
pandas so pd remember stands for pandas pd dot cut means we're cutting up the
wine quality and we're replacing it and then we have our bins equals bins that's
the command bins is the actual command and then our variable bins 2 comma 6.5 comma 8 so
two different bins and our labels bad and good and we can also do let me just do
it this way wine quality since that's what we're working on and let's look at
unique another pandas command and we'll run this and i get this lovely error why
did i get an error well because i replaced wine quality and i did this cut here
which changes things on here so i literally altered one of the variables that's saved
in the memory so we'll go up here to the kernel restart and run all that starts
it from the very beginning and we can see here that that fixes the error because
i'm not cutting something that's already been cut we have our wine quality unique
and the wine quality unique is a bad or good so we have two qualities objects bad
is less than good meaning bad's going to be zero and good's going to be one and
to make that happen we need to actually encode it so we'll use the label quality
equals label encoder and the label encoder let me just go back there since this
is part of sklearn that was one of the things we imported was a label encoder you
can see that right here from the sklearn dot preprocessing import standard scaler
which we're going to use in a minute and label encoder and that's what tells it
to use bad equals zero and good equals one and we'll go ahead and run
that and then we need to apply it to the data and when we do that we take our
wine quality that we had before and we're going to set that equal to label
quality which is our encoder and let's look at this line right here we have dot
fit transform and you'll see this in the pre-processing the most commonly
used are fit and fit transform because so often you're also
transforming the data when you fit it they just combine them into one command and
we're just going to take the wine quality feed it back into there and put that
back in our wine quality setup and run that and now when we do the wine and the
head of the first five values and we go ahead and run this you can see right here
underneath quality zero zero zero i have to go down a little further to look at
the better wines let's see if we have some that are ones yeah there we go there's
some ones down here so when we look at 10 of them you can see all the way down to
zero or one that's our quality and again we're looking at high quality we're
looking at the seven and the eights or six point five and up and uh let's go
ahead and grab our or was it here we go wine quality let's take a look at what
else more information about the wine quality itself and we can do a simple pandas
thing value counts hopefully i type that in there correctly and we can see that
we only have 217 of our wines which are going to be the higher quality so 217
and the rest of them fall into the bad bucket and the zero which is uh 1382 so
again we're just looking for the top percentage of these the top what is that
it's about 14 percent on there so we're looking
for our top wines our seven and eights and let's use our let's plot this on a
graph so we take a look at this and the sns if you remember correctly that is let
me just go back to the top that's our seaborn seaborn sits on top of matplotlib
it has a lot of added features plus all the features of
matplotlib and also makes it quick and easy to put out a graph we'll do a simple bar
graph and they actually call it count plot and then we want to just do count plot
the wine quality so let's put our wine quality in there and let's go ahead and
run this and see what that looks like and nice inline remember this is why we did
the inline so make sure it appears in here and you can see the blue space or the
first space represents low quality wine and our second bar is a high quality wine
and you can see that we're just looking at the top quality wine here most of the
wine we want to just give it away to the neighbors no maybe if you don't like
your neighbors maybe give them the good quality wine and i don't know what you do
with the bad quality wine i guess use it for cooking there we go but you can see
here it forms a nice little graph for us with seaborn on there and you can
see our setup on that so now we've looked at we've done some pre-processing we've
described our data a little bit we have a picture of how much of the wine what we
expect it to be high quality low quality checked out the fact that there's none
we don't have any null values to contend with or any odd values some of the other
things you sometimes look at these is if you have like some values that are just
way off the chart so the measurement might be off or miscalibrated equipment if
you're in the scientific field so the next step we want to go ahead and do is we
want to go ahead and separate our data set or reformat our data set and we
usually use capital x and that denotes the features we're working with and we
usually use a lowercase y that denotes what uh in this case quality what we're
looking for and we can take this and go wine it's going to be our full thing of
wine dropping what are we dropping we're dropping the quality so these are all
the features minus quality now make sure we have our axis equals one if you left
it out it would look for a row labeled quality and throw an error because the
default is axis equals zero and then our y if we're going to remove quality for our x that's
just going to be one and it is just the quality that we're looking at for y so
we put that in there and we'll go ahead and run this so now we've separated the
features that we want to use to predict the quality of the wine and the quality
itself the next step is if you're going to create a data set in a model we got to
know how good our model is so we're going to split the data train and test
splitting data and this is one of the packages we imported from sklearn and the
actual package was train test split and we're going to do x y test size 0.2
random state 42. and this returns four variables and most common you'll see is
capital x train so we're going to train our set with capital x test that's the
data we're going to keep on the side to test it with y train y remember stands
for the quality or the answer we're looking for so when we train it we're going
to use x train and y train and then y test to see how good our x test does and
the train test split let me just go back up to the top that was part of the
sklearn model selection import train test split there is a lot of ways to split
data up this is when you're first starting you do your first model you probably
start with the basics on here you have one set for training one for test our
test size is point two or twenty percent and random state just means we just
start with a it's like a random seed number so it's not too important back there
we're randomly selecting which ones we're going to use since this is the most
common way this is what we're going to use today there is another approach called
k-fold cross-validation which sklearn model selection also supports where
they split the data into thirds and then they'll run the model three times
each time combining two of those thirds for training and keeping one for
testing and so you actually go through all the data and you come up with three
different test results from it which is pretty cool that's a pretty cool way of
doing it you could actually do that with this by just splitting this into thirds
and then or you have a test size one test set third and then split the training
set also into thirds and also do that and get three different data sets this
works fine for most projects especially when you're starting out it works great
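The split described above can be sketched as follows. The small random data frame is a made-up stand-in for the wine data, and its column values are not from the real file. The three-way variant uses scikit-learn's KFold splitter, which is one way to get the "thirds" scheme; it is shown here as an illustration, not as what the notebook itself runs.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold, train_test_split

# toy frame standing in for the wine data (values are made up)
rng = np.random.default_rng(0)
wine = pd.DataFrame({
    "alcohol": rng.uniform(8, 14, size=30),
    "chlorides": rng.uniform(0.05, 0.1, size=30),
    "quality": rng.integers(0, 2, size=30),
})

X = wine.drop("quality", axis=1)   # axis=1 drops a column; features only
y = wine["quality"]                # the label we want to predict

# hold back 20 percent of the rows for testing; random_state makes the split repeatable
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)

# three folds: each pass trains on two thirds and tests on the remaining third
kf = KFold(n_splits=3, shuffle=True, random_state=42)
fold_sizes = [(len(train_idx), len(test_idx)) for train_idx, test_idx in kf.split(X)]
print(fold_sizes)
```

With 30 rows the single split gives 24 training rows and 6 test rows, and each of the three folds trains on 20 rows and tests on 10.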
so we have our x train our x test our y train and our y test and then we need to
go ahead and do the scaler and let's talk about this because this is really
important some models do not need to have scaling going on most models do and so
we create our scaler variable we'll call it sc standard scaler and if you
remember correctly we imported that here along with the label encoder the
standard scaler setup so there's our scaler and this is going to convert the
values instead of having some values that go from zero if you remember up here we
had some values are 54 60 40 59 102. so our total sulfur dioxide would have these
huge values coming into our model and some models would look at that and they'd
become very biased to sulfur dioxide it'd have the hugest impact and then a value
that had 0.076 0.098 for chlorides would have very little impact because it's such
a small number so we take the scaler we kind of level the playing field and
depending on our scaler it sets it up between 0 and 1 or like the standard scaler
centers it around zero let's go ahead and take a look at that and we'll go ahead and start with our
x train and our x train equals sc fit transform we talked about that earlier
that's an sklearn setup it's going to both fit and transform our x train into our
x train variable and if we have an x train we also need to do that to our test
and this is important because you need to note that you don't want to refit the
data we want to use the same fit we used on the training on the testing
otherwise you get different results and so we'll do just oops not fit transform
we're only going to transform the test side of the data so here's our x test that
we want to transform and let's go ahead and run that and just so we have an idea
let's go ahead and take and just print out our x train oh let's do the first 10
rows very similar to the way you do with the head on a data frame you can
see here our variables are now much more uniform and they've scaled them to the
same scale so they're between certain numbers and with the basic scaler you can
fine tune it i just let it do its defaults on this and that's fine for what we're
doing in most cases you don't really need to mess with it too much it does look
like it goes between like minus probably minus 2 to 2 or something like that
that's just looking at the train variable i'll go ahead and cut that one out of
there so before we actually build the models and start discussing the sk-learn
models we're going to use we covered a lot of ground here most of the time when you're
working with these models you put a lot of work into pre-prepping the data so we
looked at the data notice that it's separated loaded it up we went in there we
found out there's no null values that's hard to say no no no values we have
there's none there's none nobody i can't say it and of course we sum it up if you
had a lot of null values this would be really important coming in here so is
there a null summary we looked at pre-processing the data as far as the quality
and we're looking at the bins so this would be something you might start playing
with maybe you don't want super fine wine you don't want the seven and eights
maybe you want to split this differently so certainly you can play with the bins
and get different values
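A compact recap of the pre-processing steps so far, as a sketch. The few semicolon-separated rows are a made-up stand-in for the real wine quality file and only carry three columns. The read_csv, isnull, cut, LabelEncoder and value_counts calls are the ones the walkthrough names, and astype(str) is an extra step added here to hand the encoder plain strings.

```python
import io
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# tiny semicolon-separated stand-in for the wine quality file (made-up rows)
csv_text = """alcohol;chlorides;quality
9.4;0.076;5
9.8;0.098;5
10.5;0.065;7
11.2;0.071;8
9.5;0.080;6
"""
wine = pd.read_csv(io.StringIO(csv_text), sep=";")   # sep=";" because of the semicolons

null_counts = wine.isnull().sum()    # number of missing values per column
print(null_counts)

bins = (2, 6.5, 8)                   # change 6.5 to move the bad/good threshold
wine["quality"] = pd.cut(wine["quality"], bins=bins, labels=["bad", "good"])

label_quality = LabelEncoder()
# classes are sorted alphabetically, so "bad" becomes 0 and "good" becomes 1
wine["quality"] = label_quality.fit_transform(wine["quality"].astype(str))
print(wine["quality"].value_counts())
```

Lowering the middle bin edge, for example to 5.5, would move the sixes into the "good" group, which is the kind of adjustment described here.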
and make the bins smaller or lean more towards the lower quality so you then
have like medium to high quality and we went ahead and gave it labels again this
is all pandas we're doing in here setting up with unique labels and group names
bad good bad is less than good that could be so important you don't know how many
times people go through these models and they have them reversed or something and
then they go back they're like why is this data not looking correct so it's
important to remember what you're doing up here and double check it and we used
our label encoder so that was to set that up as quality 0 1 in this case we
have bad good zero one and we just double check that to make sure that's what
came up in the quality there and we threw it into a graph because people like to
see graphs i don't know about you but you start looking all these numbers and all
this text and you get down here and you say oh yes you know here this is how much
of the wine we're going to label as subpar not good this is how much we're going
to label as good and then we go down here to finally separating out our data so
it's ready to go into the models and the models take x and a y in this case x is
all of our features minus the one we're looking for and then y is the feature
we're looking for so in this case we dropped quality and in the y case we added
quality and then because we need to have a training set and a test set so we can
see how good our models do we went ahead and split the data up x train x test y
train y test and that's using the train test split which is part of the sklearn
package and we did as far as our testing size point two or twenty percent the
default is twenty five percent so if you leave that out it'll do default setup
and we did a random state equals forty two if you leave that out the default is
none so you get a different random split on every run and then
finally we scaled the data this is so important to scale the data going back up
to here if you have something that's coming out as a hundred is going to really
outweigh something that's 0.071 that's not in all the models different models
handle it differently and as we look at the different models i'll talk a little
bit about that we're going to look at three models today three of the top models
used for this and see how they compare and how the numbers come out between them
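A small sketch of the scaling step, using made-up numbers for two features on very different scales (think total sulfur dioxide next to chlorides). The point is that the scaler is fitted on the training rows only and then reused unchanged on the test rows.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# made-up values: a large-scale feature next to a small-scale one
X_train = np.array([[34.0, 0.076], [67.0, 0.098], [54.0, 0.092], [60.0, 0.075]])
X_test = np.array([[102.0, 0.071]])

sc = StandardScaler()
X_train_scaled = sc.fit_transform(X_train)   # learn mean and spread from training data, then scale it
X_test_scaled = sc.transform(X_test)         # reuse the same mean and spread; no refit on test data

print(X_train_scaled.mean(axis=0))   # each column is centred close to 0
print(X_train_scaled.std(axis=0))    # each column has a spread close to 1
print(X_test_scaled)
```

After scaling, both columns of the training data sit around zero with a similar spread, so neither feature dominates just because its raw numbers are larger.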
so we're going to look at three different setups let me change my cell here to
markdown there we go and we're going to start with the random forest classifier
so the three sets we're looking at is the random forest classifier support vector
classifier and then a neural network now we start with the random forest
classifier because it has the least amount of parts moving parts to fine-tune and
let's go ahead and put this in here so we're going to call it rfc for random
forest classifier and if you remember we imported that so let me go back up here
to the top real quick and we did an import of the random forest classifier from
sklearn ensemble and then we also let me just point this out here's
our svm where we imported our support vector classifier so svm is support vector
machine svc is support vector classifier and then we also have our neural network and
we're going to import from there the multi-layer perceptron classifier kind of a
mouthful for the perceptron don't worry too much about that name it's just
a neural network there's a lot of different options on there and setups which is
where they came up with the perceptron but so we have our three different models
we're going to go through on here and then we're going to weigh them here's our
metrics we're going to use a confusion matrix also from the sklearn package to
see how good our model does with our split so let's go back down there and take
a look at that and we have our rfs equals random forest classifier and we have n
estimators equals 200. this is the main value you play with on a random forest
classifier how many trees do you need in the forest so how
many models are in here that makes it pretty good as a starter model because
we're only playing with one number and it's pretty clear what it is and you can
lower this number or raise it you usually start with a higher number and then
bring it down to see if it keeps the same score because the smaller the model the
easier it is to send out to somebody else
if you're going to distribute it now the random forest classifier everything i
read says it's used for kind of a medium-sized data set so you can run it on
big data you can run it on smaller data obviously but it tends to work best in the
mid-range and we'll go ahead and take our rfc and i just copied this from the
other side dot fit x train comma y train so we're sending it our features and
then the quality in the y train what we want to predict in there and we just do a
simple fit now remember this is sklearn so everything is fit or transform
another one is predict which we'll do in just a second here let's do that now
predict rfc equals and it's our rfc model dot predict and what are we predicting on
well we trained it with our train values so now we need our test our x test so
these are the three lines of code we need to
create our random forest variable fit our training data to it so we're training
it in this case it's got 200 different trees it's going to build and then
we're going to predict on here let me go ahead and just run that and we can
actually do something like oh let's do predict rfc just real quick we'll look at
the first 20 values of it let's go ahead and run that and in our first 20
values we have three wines that make the cut and the other 17 don't so the
other 17 are bad quality and three of them are good quality in our predicted
values and if you remember correctly we'll go ahead and take this out of here
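
The three steps just described (create the classifier, fit it, predict on the test features) can be sketched as follows. The wine data isn't reproduced here, so this uses synthetic stand-in data from make_classification with 11 features and a minority of "good" labels; that data, the variable name pred_rfc, and the random_state values are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): 11 numeric features, mostly class 0 ("bad"),
# a minority of class 1 ("good").
X, y = make_classification(n_samples=600, n_features=11, n_informative=6,
                           weights=[0.86, 0.14], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Scale using statistics learned from the training split only.
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# The three lines: create the model, fit it, predict on the test features.
rfc = RandomForestClassifier(n_estimators=200, random_state=42)
rfc.fit(X_train, y_train)
pred_rfc = rfc.predict(X_test)

# Peek at the first 20 predicted labels (0 = bad, 1 = good).
print(pred_rfc[:20])
```
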
this is based on our test so these are the first 20 rows in our x test and this
has as you can see all the different features listed in there and they've been
scaled so when you look at these they're a little bit confusing to look at and
hard to read because they're printed with an e minus 01 on the end so this one is
0.36 this one is 0.164 and this one is minus 0.9 so they're small numbers centered
around zero i think i was confused earlier and said between negative 2 and 2 most
of these sit between -1 and 1 which is about what you'd expect from the standard
scaler although it doesn't strictly cap the values and we'll go ahead and just cut
that out of there run this we have our setup on here so now we've run the
prediction and we have predicted values well one you could publish them but what
do we do with them what we want to do with them is we want to see how well our
model performed that's the whole reason for splitting it between a training
and testing set and for that remember we imported the classification report
that was again from sklearn there's our confusion matrix and classification
report and the classification report is built from the same counts as the
confusion matrix and for our classification report we want to compare our
y test that's the actual values versus our predicted rfc so we'll go ahead and
print this report out and let's take a look and we can see here for the zero
class we have a precision of about .92 so of the wines we labeled as bad about 92
percent were actually bad and the precision for the quality wines is running
about 78 percent so that kind of gives us an overall 90 percent and you can see
our f1 score our support setup on there our recall you could also do the
confusion matrix on here which gives you a little bit more information but for
this this is going to be good enough for right now we're just going to look at
how good this model was
because we want to compare the random forest classifier with the other two models
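
To show where the report's numbers come from, here is a small worked sketch in plain python. It uses the four counts this walkthrough reads off the random forest confusion matrix (266 and 7 for the bad wines, 22 and 25 for the good wines). The layout is assumed to follow sklearn's convention of rows for actual classes and columns for predicted classes; the variable names are made up for illustration.

```python
# Counts from the random forest confusion matrix in this walkthrough.
# Assumed layout (sklearn's convention): rows = actual, columns = predicted.
bad_as_bad, bad_as_good = 266, 7     # actual bad wines
good_as_bad, good_as_good = 22, 25   # actual good wines

total = bad_as_bad + bad_as_good + good_as_bad + good_as_good

# Precision: of everything predicted as a class, how much really was that class.
precision_bad = bad_as_bad / (bad_as_bad + good_as_bad)
precision_good = good_as_good / (good_as_good + bad_as_good)

# Recall: of everything that really was a class, how much did we catch.
recall_bad = bad_as_bad / (bad_as_bad + bad_as_good)
recall_good = good_as_good / (good_as_good + good_as_bad)

# Accuracy: share of all wines labeled correctly.
accuracy = (bad_as_bad + good_as_good) / total

print(f"precision bad={precision_bad:.2f} good={precision_good:.2f}")
print(f"recall    bad={recall_bad:.2f} good={recall_good:.2f}")
print(f"accuracy  {accuracy:.2f}")
```

The precision values come out to about 0.92 and 0.78, matching the report, while recall for the good wines is only about 0.53, which is the "not so good at predicting good wine" part in numbers.
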
and you know what let's go ahead and put in the confusion matrix just so you can
see that on there with y test and prediction rfc so in the confusion matrix we
can see here that we had 266 correct and seven wrong these are the mislabels
for bad wine and we had a lot of mislabels for good wine so our quality
labels aren't that good we're good at predicting bad wine not so good at
predicting whether it's a good quality wine important to note on there so that is
our basic random forest classifier and let me go up to cell and change cell
type to markdown and run that so we have a nice label let's look at our svm
classifier our support vector machine and this should look familiar we have our clf
that's what we'll call it just like we called this one rfc and then
we'll have our clf dot fit and this should be identical to up above x train comma
y train and uh just like we did before let's go ahead and do the prediction and
here is our clf predict and it's going to equal the clf dot predict and we want
to go ahead and use x underscore test and right about now you can realize that
you can create these different models and actually just create a loop to go
through your different models and put the data in and that's how they designed it
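
The loop idea mentioned here can be sketched like this, since every sklearn classifier shares the same fit and predict calls. The data is synthetic stand-in data (an assumption, not the wine set), the dictionary and variable names are made up, and the model settings are the ones used elsewhere in this walkthrough.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): 11 features, binary label.
X, y = make_classification(n_samples=400, n_features=11, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Every model exposes the same fit/predict interface, so one loop handles all.
models = {
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "svm": SVC(),
    "neural network": MLPClassifier(hidden_layer_sizes=(11, 11, 11),
                                    max_iter=500, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = accuracy_score(y_test, pred)
    print(f"{name}: {scores[name]:.2f}")
```
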
they designed it to have that ability let's go ahead and run this and then let's
go ahead and do our classification report and i'm just going to copy this right
off of here they say you shouldn't copy and paste your code and the reason is
when you go in here and edit it you inevitably will miss something we only have
two lines so i think i'm safe to do it today and let's go ahead and run this and
let's take a look at how the svm classifier came out so up here we had a 90 percent
and down here we're running about an 86 so it's not doing as well
now remember we randomly split the data
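
Because the split is random, the score depends on which rows happen to land in the test set. A rough sketch of that effect, using synthetic stand-in data (an assumption) and ten different split seeds rather than a hundred to keep it quick:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): 11 features, binary label.
X, y = make_classification(n_samples=400, n_features=11, n_informative=6,
                           random_state=0)

scores = []
for seed in range(10):
    # A different seed gives a different random train/test split.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed)
    # The pipeline refits the scaler on each training split before the svm.
    clf = make_pipeline(StandardScaler(), SVC())
    clf.fit(X_train, y_train)
    scores.append(accuracy_score(y_test, clf.predict(X_test)))

spread = max(scores) - min(scores)
print(f"min={min(scores):.2f} max={max(scores):.2f} spread={spread:.2f}")
```
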
so if i run this a bunch of times you'll see some changes down here so with these
numbers and this size of data if i ran it 100 times it would probably be within plus
or minus three or four on here in fact if i ran this 100 times you'd probably see
these come out almost the same as far as how well they do in classification and
then on the confusion matrix let's take a look at this one this had 22 by 25 this
one has 35 by 12. so it's not doing quite as well and that shows up here 71
percent versus 78 percent and then if we're going to do an svm classifier we also
want to show you one more and before i do that kind of tease you a little bit
here before we jump into neural networks the big save all deep learning because
everything else must be shallow learning that's a joke let's just talk a little
bit about the svm versus the random forest classifier the svm tends to work
better on smaller data sets a lot of times you
convert things into numbers and bins and things like that and the random forest
tends to do better with those at least that's my brief experience with it whereas
if you have just a lot of raw data coming in the svm is usually the fastest and
easiest model to apply on there so they each have their own benefits you'll find
though again that when you run these like 100 times the difference between these
two on a data set like this is going to just go away there's randomness involved
depending on which data we took and how they classify them the big one is the
neural networks and what makes the neural networks nice is
they can look into huge amounts of data so for a project like this you probably
don't need a neural network but it's important to see how they work
differently and how they come up differently so you can work with huge amounts of
data in many respects they also work really well with text analysis
especially if it's time sensitive more and more you have an order to the text and
they've just come out with different ways of feeding that data in where the
series and the order of the words is really important same thing with
predicting the stock market if you have tons of data coming in from different
sources the neural network can really process that in a powerful way to pull up
things that aren't seen before when i say lots of data coming in i'm not talking
about just the high lows that you can run an svm on real easily i'm talking about
the data that comes in where you have maybe you pulled off the twitter feeds and
have word counts going on and you've pulled off the different news feeds that
businesses are looking at and the different releases when they release the
different reports so you have all this different data coming in and the neural
network does really well with that picture processing now is really
moving heavily into the neural network if you have a pixel 2 or pixel 3 phone put
out by google it has a neural network for doing it's kind of goofy but you can
put little star wars androids dancing around your pictures and things like that
that's all done with the neural network so it has a lot of different uses but
it also requires a lot of data and is a little heavy-handed for something like
this and this should now look familiar because we've done it twice before we have
our multi-layer perceptron classifier we'll call it an mlpc and this is
what we imported the mlp classifier there's a lot of settings in here the first one
is the hidden layers you have to have the hidden layers in there we're going to
do three layers of 11 each so that's how many nodes are in each layer as it comes
in and that was based on the fact we have 11 features coming in then i went ahead
and just did three layers probably get by with a lot less on this but yeah i
didn't want to sit and play with it all afternoon again this is one of those
things you play with a lot because the more hidden layers you have the more
resources you're using you can also run into problems with overfitting with too
many layers and you also have to run higher iterations the max iteration we have
is set to 500 the default's 200 because i use three layers of 11 each which is by
the way kind of a default i use i realized that usually you have about three
layers going down and the number of features going across you'll see that's
pretty common for the first classifier when you're working in neural networks but
it also means you have to do higher iterations so we upped the iterations to 500 so
that means it can go through the data up to 500 times to train those different
layers and carefully adjust them and we do have full tutorials you can go look
up on neural networks to understand the neural network settings a lot more and
of course if you're looking over here where we had our previous model where
we fit it same thing here mlpc fit x train y train and then we're going to create
our prediction so let's do our predict mlpc and it's going to equal the mlpc
and we'll just take the same thing here predict x test let's just put that down
here dot predict x test and if i run that we've now trained it we now have our
prediction here same as before and we'll go ahead and do the copy paste again
always be careful with the copy paste now because you always run the chance
of missing one of these variables so if you're doing a lot of coding you might
want to skip that copy and paste and just type it in and let's go ahead and run
this and see what that looks like and we came up with an 88 we're going to
compare that with the 86 from our svm classifier and our 90 from the
random forest classifier and keep in mind random forest classifiers do well
on mid-sized data the svm on smaller amounts of data although to be honest i
don't think that's necessarily the split between the two and these things will
actually come together if you run them a number of times and we can see down here
the number of good wines mislabeled with the setup on there it's on par with our
random forest so it had 22 and 25 shouldn't be a surprise it's identical it just
didn't do as well with the bad wines labeling what's a bad wine and what's not see
yeah because up there we had 266 and 7 and down here we had 260 and 13 so it
mislabeled a few more bad wines as good wines so we've explored three of these basic
classifiers these are probably the three most widely used right now i might even
throw in the decision tree if we open up their website we go under supervised
learning there's a linear model we didn't do that with most data you usually
just start with a linear model because it's going to process the quickest and use
the least amount of resources but you can see they have linear quadratic they
have kernel ridge there's our support vector stochastic gradient nearest
neighbors nearest neighbors is another common one that's used a lot very similar
to the svm gaussian process cross decomposition naive bayes this is more of an
intellectual one that i don't see used a lot but it's like the basis of a lot of
other things decision tree there's another one that's used a lot ensemble methods
not as much multi-class and multi-label algorithms feature selection neural
networks that's the other one we use down here and of course the forest so you
can see in sklearn there are so many different options and they've
just developed them over the years we covered three of the most commonly used
ones in here and went a little bit over why they're different the neural network
just because it's fun to work in deep learning and not in shallow learning as i
told you that doesn't mean that the svm is actually shallow
it covers a lot of things and same thing with the random forest
classifier and we noticed that there's a number of other different classifier
options in there these are just the three most common ones and i'd probably throw
the nearest neighbor in there and the decision tree which is usually part of the
random forest depending on what back end you're using and since we're human
beings if i was in the shareholder's office i wouldn't want to leave them with a
confusion matrix they need that information for making decisions but we want to
give them just one particular score and so i would go ahead and we have our
sklearn metrics we're going to import the accuracy score and i'm just going to do
this on the random forest since that was our best model and we have our cm
accuracy score and i forgot to print it remember in jupyter notebook the last
variable we leave out there will print and so our cm accuracy score
we get is 90 and that matches the overall number we already saw up here in the
classification report so you can quote that but a lot of times people like to see it
highlighted at the very end this is our accuracy on this model and then the
final stage is we would like to use this in the future so let's go ahead and take
our wine if you remember correctly we'll do wine dot head of 10 we'll run that
remember our original data set we've gone through so many steps now we're going
to go back to the original data and we can see here we have our top 10 and of our
top 10
on the list only two of them make it as having high enough quality wine for us to
be interested in them and then let's go ahead and create some data here we'll
call it x new equals and this is important this data has to be we just kind of
randomly selected some data it looks an awful lot like some of the other numbers on
here which is what it should look like and so we have our x new
equals 7.3 .58 and so on and then it is so important this is where people forget
this step x new equals sc remember sc that was our standard scaler variable we
created if we go right back up here before we did anything else we created an sc
we fit it and we transformed it and then we need to do what transform the data
we're going to feed in so we're going to go back down here and we're going to
transform our x new and then we're going to go ahead and use the where are we at
here we go our random forest and if you remember all it is is our rfc predict
model right there let's go ahead and just grab that down here and so our y new
equals here's our rfc predict and we put our x new in and then it's kind of nice
to know what it actually puts out so according to this it should print out what
our prediction is for this wine and oh it's a bad wine okay so we didn't pick out
a good wine for our x new and that should be expected most of the wine if you
remember correctly only a small percentage of the wine met our quality
requirements so we can look at this and say oh we'll have to try another wine out
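
The step people forget, scaling the new sample with the scaler that was already fit and only then predicting, can be sketched like this. The training data is synthetic stand-in data and the new sample is a made-up row of 11 values (both assumptions); the names x_new and y_new are chosen to mirror the walkthrough.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

# Stand-in training data (assumption): 11 features, binary label.
X, y = make_classification(n_samples=300, n_features=11, n_informative=6,
                           random_state=1)

# Fit the scaler once, on the training features.
sc = StandardScaler()
X_scaled = sc.fit_transform(X)

rfc = RandomForestClassifier(n_estimators=200, random_state=1)
rfc.fit(X_scaled, y)

# A made-up new sample in raw (unscaled) units: one row, 11 features.
x_new = np.array([X[0] + 0.1])

# Use transform, not fit_transform: the new row must be scaled with the
# mean and std learned from the training data.
x_new_scaled = sc.transform(x_new)
y_new = rfc.predict(x_new_scaled)
print(y_new)
```

Calling fit_transform on the single new row instead would rescale it against itself and hand the model values it was never trained to interpret.
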
which is fine by me because i like to try out new wines and i certainly have a
collection of old wine bottles and very few of them match but you can see here
we've gone through the whole process just a quick rehash we had our imports we
touched a lot on sklearn our random forest our svm and our mlp classifier so
we had our support vector classifier we had our random forest and we have our
neural network three of the top used classifiers in the sklearn system and we
also have our confusion matrix and our classification report and we used
our standard scaler for scaling and our label encoder and of course we needed
to go ahead and split our data up with train test split and we explored the
data in here for null values we set up our quality into bins we took a look at
the data and what we actually have and put a nice little plot to show our quality
what we're looking at and then we went through our three different models and
it's always interesting because you spend so much time getting to these models
and then you kind of go through the models and play with them until you get the
best training on there without becoming biased that's always a challenge
not to over train your model to the point where you're training it to fit the test
values and finally we went ahead and actually used it and applied it to a new wine
which unfortunately didn't make the cut it's going to be the one that we drink a
glass out of and save the rest for cooking certainly there are many reasons to
be able to go online and scrape different websites they range everywhere from
pulling out different links to pulling data off of websites as a data scientist
you might need to get some information off a website that doesn't have a direct
api to pull that information and in python we have a wonderful tool when you talk
python and you talk web scraping we're talking beautiful soup which is a package
you add into the python that you're running and we can come over here to the
website www.crummy.com/software/BeautifulSoup you can actually read a
little bit about it beautiful soup 4 is the current version if you
don't remember the full website for it you can always do what i do which is go
over and do a search for beautiful soup official site it almost always comes up
right at the top and you click on there and it'll take you to the crummy.com
software site for beautiful soup now we're going to use whatever python
interface or ide you want i'm going to use jupyter lab which is built on jupyter
notebook through anaconda so when i open up my anaconda navigator you'll see that
i have my different tools available again you might be using a different editor
and that's okay you might be in pycharm or something like that you don't need to
do this in jupyter lab jupyter lab is jupyter notebook with added tabs and some added
features it's basically in beta testing so it's got a few little glitches when
you're saving things and moving between projects but for the most part it's a
great upgrade to the jupyter notebook and you can use them together
i mean it's built on jupyter notebooks anything you do in jupyter
notebook you can open up in jupyter lab and the first thing we need to do is we
need to go ahead in this case i'm going under my environments since it partitions
the environments out and i'm going to open up a terminal window we have to
install some packages in here to work with now there's a lot of choices on this
because of the simplicity we'll be using conda install now you can use pip
install for the same thing and we're going to install our beautifulsoup4 and
you have to type out the whole thing beautifulsoup4 and you can use a pip
install if you're using a different environment and i am using python version
3.6 although according to the beautiful soup website
beautiful soup 4 works on anything from python 2.7
all the way through any of the python 3 versions this just happens to be python
3.6 because there's a lot of other packages that don't work on 3.7 yet and
we'll go ahead and run this install on here and let it go through its
environmental setup and of course with conda it goes in there and finds all the
dependencies pip doesn't do as much as far as finding dependencies but you know
exactly what's on there with pip so if you're doing a huge distribution you
probably want to use your pip install so you can track what's going on there with
the conda i like to just let it take over since this isn't a major distributed
package going out another quick note between pip and conda is that if you start
on a project in one of these environments and you're using pip in there stick
with pip if you're using conda stick with conda they track the packages and you
can run into some issues where they're not tracking the same packages and
something gets overwritten so it's important to stay very consistent with your
install on your environments and we'll also need to go ahead and install our
numpy package and our pandas on here so go ahead and do that if you haven't
added those packages in go ahead and install those into the environment that
you're working in and of course pandas is just simply uh install pandas and let's
just install a couple more packages in this case let's install our
matplotlib because we're going to plot at the end since we're going to be
collecting data and for this project that will be all the packages we'll need so
we can go and close out of our installer or whatever setup you have and we'll go
back to home and we'll just launch our jupyter lab and that will open up in our
browser window now if you are coming from jupyter notebooks and it's your first
time in lab
we can go ahead and just create our first notebook python3 you can also do it
under file launcher and you'll see new notebook it automatically opens up and
we just click right on there it'll pop open on the left and i'll right click this
and we'll rename this we'll rename it just beautiful and it is an ipynb file on
there so that should look familiar because that's the jupyter notebook file this
is a new one now i have notes in the past i usually hid these on the other
computer all my notes for the lesson today but these are my notes going down and
we'll go ahead and just start going through this and see what it looks like to do
a data pull from front to end and see how that works as a data scientist pulling
that information in from the website and the first thing i want to do is i want
to go ahead and close this side window that way we get the nice full screen
and we can also up the size a little bit one of the wonderful things about
working in a browser window just do that control plus thing the first of the
packages we
talked about is pandas so we imported our pandas if you haven't already that's
our data frame if you haven't done our pandas tutorial definitely worthy of the
time to go through there and understand pandas it's such a powerful tool this
basically turns your data into a spreadsheet data frame our numpy is our number
array so it kind of works with pandas very closely as far as manipulating data in
arrays matplotlib we want to go ahead and bring that in as our plt so that we can plot
the data at the end and this line right here that says matplotlib inline is
for the jupyter notebook specifically it tells it to print that on this page a
lot of the newer versions don't actually require you to have that line they'll still
print it on the page but you should still include that if you're in the jupyter
lab setup and then we have our urllib.request we're going to import urlopen
for opening up the website and then we have our bs4 that's your beautiful soup 4.
we're going to import beautiful soup and then our last one is our re that is for
working with regular expressions so when we get to that part of importing our
data we have to do a lot of reformatting so it's something we can use and the re
is one of those tools we'll go ahead and run this and just bring all that in so
this is all imported all these packages are now in the web scraping program
we're going to run now if we're going to dive in and pull data we need a
nice website to pull from and let's go ahead and we'll use the hubertiming.com
results for the 2018 martin luther king race and if we take this you can actually
just take this where did we get this from
well you can go in here and find the website you're going to scrape from and
you'll see right here it says you just copy that link right in there that http
and this is the website that we're looking at you can see right here all the
information that we're looking at let's say we wanted to run some statistics on
this it sure would be nice to be able to pull it off of here and if they don't
have a direct api that means we need to pull it from their website some of these
will have a download although if you've ever had a download click and
maybe you're paging through a hundred websites uh in one case i was uh pulling
all the different united states bills that are passed to track who voted on them
for a project and you can imagine that there's you know hundreds and hundreds
even thousands of these documents that they voted on who voted on it it goes
through the senate goes back to the congress so i opened up a website pulled all
the links off of there that matched a certain criteria and we'll look at that in
just a minute how we go through the html and then i had to reformat them or i
could hand download each one one at a time which would just be a nightmare so
it's nice to automate it in this case we're going to be pulling up this chart we
want to figure out how to pull this chart off of this website and so we go back
into our jupyter notebook i've got my url that's just our name for it and it's just a
string that's all this is nothing fancy there you'll notice that on the slashes
we have forward slashes a single forward slash in the path and a double forward
slash after the http that's just how the address has to be written and then
it's going to go ahead and use the html equals urlopen url and that's from our
urllib request so it's opening a link to that website or at least pointing
to it and if we run this this just sets it up so this is all set up and then once
we've done our setup let's go ahead and create an object called soup this will be
and if you remember up here here's our beautiful soup that we imported from the
bs4 and this is the package that we're working with and so we're going to do our
beautiful soup on here and on this we need to go ahead and send it our html so it
knows what it's opening and then the second part is we have to tell it which
parser to use and the most common one for your html pulls is the lxml setup
and so almost all of them you'll end up using the lxml there's a few other
options and because this is so common a lot of times people
just leave it out and let beautiful soup pick the parser we'll go ahead and
leave it in here just to remind us that it's there we'll go ahead and run that
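
Creating the soup object can be sketched without touching the network by parsing a small inline page. The html string below is a made-up stand-in for the downloaded page, and the parser is swapped to "html.parser" (which ships with python) so the sketch runs even where lxml is not installed; passing "lxml" works the same way when it is available.

```python
from bs4 import BeautifulSoup

# A tiny inline page standing in for the downloaded html (made-up content).
html = """
<html>
  <head><title>Example Race Results</title></head>
  <body><p>hello</p></body>
</html>
"""

# First argument: the markup (a string here; the object returned by urlopen
# can be passed the same way). Second argument: which parser to use.
soup = BeautifulSoup(html, "html.parser")

print(type(soup).__name__)
print(soup.p.text)
```

Naming the parser explicitly keeps the result consistent from one machine to the next, since different parsers can build slightly different trees from messy html.
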
and on the newer versions if you leave it out beautiful soup picks the best parser
you have installed and warns you that it guessed so we could just pass it the
html on its own either way it's just going to pull from
this url and when we run that on here we've now created an object soup that has
pulled the website into it so soup contains the page contents along with
information on that website and what's going on so let's just go over what we did
real quick before we start digging into the actual soup before we start scooping
out stuff we imported our different modules that we're going to use with our
package specifically the beautiful soup we did install the beautiful soup if you
remember correctly you have to call it beautifulsoup4 specifically so it knows
what you're bringing in and this line right here is very key from bs4 because
that's the name the module is installed under we're importing our beautiful soup
and then we
found our url in this case we're going to go pull information from the martin
luther king dream run and then we set our html to our urlopen url and you can
see right here we imported that so here's our urllib dot request import urlopen so
we're requesting a connection and once we send that connection into the beautiful
soup it creates an object called soup and then this one of course we chose soup
just because it goes with beautiful soup i guess we could have chosen beautiful
and now we can start extracting information from our website because we pulled it
down onto our computer under soup now we can start by looking at the title of the
website soup dot title and if we print title dot text you'll see this a lot in
beautiful soup because title contains all kinds of information and if we want
just the text from that title you add the dot text on the end and you can see
right here we have our 2018 mlk dream run 5k race results if you look at the tab
that's the actual title up here 2018 mlk dream run 5k race that's what the title
is on the website and then you might be curious what's in title what's the whole
title that it's storing up there well let's go ahead and print it out here's
print title and print title.text and we run that you can see it has the html tags
title on it and then the forward slash title to end it and so we're really just
pulling off this piece of the html code and then we look at the text inside that
particular part of the html and earlier i mentioned links what if you want to get
all the links off this page oh that would be fun uh we could do soup dot and
we'll do find underscore all put this in brackets and then quotation marks we're
going to put a a is the tag to find and you'll start seeing div and all the
different options you have for finding these elements in a website and then let's
go ahead and just print our links and you can see here that it now shows all the
different links in here that are marked by a tags we did a find all a and then
because it's a little bit hard to pull the href off of this we can
also fine-tune our find all in this case with href equals true
which filters out anything without one and then finally we might do a for link in links
and we can simply do something like this for each link we want to actually pull
the href because we know there's an href in it and if i run this
you can see it just comes through and prints them out one at a time some of these
are really useful so you might be looking for something that has https in it and
you know that's a link running to something else or you might be looking for the
mailto links you know that's all the mail addresses but either way you can easily
find all the links in your html document that you're paging through and of course
as with any package that has evolved over time you can also do link dot get href
which should do the same thing as our other format and you can see it certainly
does we get the same printout up here in this particular case we really want to
get the data off the page uh so let's go ahead and do that let's see what that
looks like and for the data let's call it all rows there we go equals and then we have
our soup dot find underscore all there's our brackets and then if we're looking
for each row in a table you'll remember from your html code we're looking for the
tag tr so we want to find all tr and we can take this and let's go ahead and just
take all our rows and do a print all rows and about this time you're going to
guess that we're going to get a huge amount of information just dumped onto our
page and sure enough we do if you look at this it just kind of goes on forever
but this works like a list each row is an item in that list so because of that we can do
something as simple as putting brackets and just print the first let's do the first
five rows so from beginning to five and you can see here's our first five rows on
here i sometimes like to just do let's just do row zero and we see that row zero
is finishers 191 and just out of curiosity if that's zero what's
row one male okay so we're starting to see titles going across here so if we come
up here and we do rows we did what up to 10. let's just take a look and see what
10 does again and just take a look at that information that comes across place
bib name gender age city state chip chip pace gender and so on so it comes all
the way down here we kind of have an ending right here and then we have one and
then it actually looks like we start to have information so we have our one
1191 max randolph that must be the name male 29 and so on and start seeing
how the information starts getting displayed going down so the next thing we want
to do with this i'll go back up here and just edit the space we're in so it
starts to make a little bit more sense keep it all together and so we want to do
for each row in all rows we're looking at what information are we looking at well
we have our th up here that's a header our td down here which looks like the
individual information and we really are looking for the actual data so we're
looking for td tags in the rows and we can do that because when you remember when
it stores the row it also stores the tags underneath that so all rows have all
the different tags in it and you can see right here as we print each one of those
out and so we look at each row we can create another variable we'll call it row
list and we'll set this equal to in this case row because we've already pulled
all the rows out of soup so now we want to find for each row and in there we want
to find our td and if we go ahead and just print i'm going to do it if you notice
i changed the indent so i'm just going to print row list what this does is the
last value to go into row list our last row is going to print now and of course
make sure you have an underscore instead of a period when you're typing so
row.find underscore all td and if we print the last row you can see i have all
the data coming across here we have our 191 our 1216 zuma
ochoa i hope i said that right female i believe that's age 40 and so on and then
we can take our row list and there's a lot of things we can do with the row list
what we'll do for let's do object or let's just do cell in row list and so we're
going to look at each cell because this is if you look at this they have commas
separated between the different objects and then we're going to go ahead and
print cell dot text let's just take a look and see what that looks like and we
can see here for each row we get 191 there's our 191 there's our 12 16 12 16 our
individual who's in the race and so forth all the way down for those different
settings let's go ahead and create a new variable up here uh we'll call this all
let's just call it data we'll keep it simple uh so here's our data and then we
have our row we take our row we break it up into individual cells so we'll call
this data row and we'll set this to an empty list and we're going to take
our cells tab this we know that each cell has a text and so what we want to
do is i want to take my data row let's just replace that let's take our data row
and let's append our cell dot text so each row is going to
be a list of the different text on here and then once i create each row i want my
data which is going to be everything to append each row and here's our data row
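The row and cell loop described above could look roughly like this. The tiny table and the runner names are invented placeholders, and the names all_rows, row_list, data_row and data simply follow the spoken names in the walkthrough.

```python
from bs4 import BeautifulSoup

# Hypothetical miniature results table standing in for the real page.
html = """
<table>
  <tr><th>Place</th><th>Name</th><th>Gender</th></tr>
  <tr><td>1</td><td>Runner A</td><td>M</td></tr>
  <tr><td>2</td><td>Runner B</td><td>F</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
all_rows = soup.find_all("tr")

data = []
for row in all_rows:
    # Each row still holds its child tags, so we can search inside it.
    row_list = row.find_all("td")
    data_row = []
    for cell in row_list:
        data_row.append(cell.text)
    data.append(data_row)

# The header row has only <th> cells, so it comes through as an empty list.
runner_rows = data[1:]  # slice into a new name and leave the original data alone
print(runner_rows)
```

Slicing into a separate variable mirrors the point made in the video about not touching the original data in case it is needed later.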
and then if we go ahead and come down here and let's just print data now if we
were working with large data we'd be very careful about just throwing all our
data on the page but you can see here we throw the data on the page and we get
finishers 191 male 78 female 113 one and so on and if you look at this this is
the headers on the file we have finishers male females just like some general
statistics on the first one and then we have actually an empty data set and then
we have our data that continues which is actually the information we're
looking for so we have one 1191 max randolph male 29 washington dc run time uh
one of 78 and so on on here so we could really quickly get rid of that number of
different ways to do that one of them is just to do we're going to set we do data
two on uh we should get rid of everything but we want to keep randolph so make
sure randolph is in there oh we lost randolph let's try one on there we go
there's max randolph on there uh so we can just simply redo our data on here
and we can do data probably want to do it in all rows from one on but i'm just
going to do my data equals data one on down here and there's reasons to split it
this way in data science sometimes you don't want to touch the original data in
case you need it in case we do need the first row so we'll put it down here and
uh maybe we'll just call this titles titles equals data of zero and so we could
do something here where we print we'll print up our titles and we'll print our
data in this case instead of one on let's go minus two let's look at the last two
rows of data so here we have our titles and for some reason it just put in finishers
191 i was expecting a little bit more up there and we have our last couple people
and they look like the data on these looks just fine on here turns out this is
just some generic statistics up here so we'll get rid of titles completely
doesn't really do us any good but we know that data comes in here and we can look
at our data and look at the very end of the data too the minus two to the end and
we can see it pulls the data in pretty good we don't have anything too funky in
here we're looking at it looks pretty clean now you got to be a little careful
because at some point we might have to come back here and clean up the data if we
get an error for running data analysis we might find out there's some unusual
characters or something is missed in the data itself and you also notice that
everything is a string so when we're bringing it in we might have to do some
conversions to test it out and convert them to whatever kind of data format we're
working with so at this time we want to go ahead and bring in our pandas um and
let's go ahead and call this df for data frame we'll set it equal to and if you
remember correctly we imported pandas as pd and that's standard you'll see that
in most code examples where they call the import pandas as pd and it is capital
d capital f for data frame and we're just going to bring in our data that's what
we called it on here and let's uh take this and we'll print now when you're
working with data frames you're usually talking large amounts of data and so you
almost never want to print the whole data frame out we're going to go ahead and
do that anyway just so we can see what that looks like and you can see in here
brings in our data frame coming in here we just have a mess of information this
is our data let's go ahead and print df and see what that looks like in the data
frame and this is nice because it organizes it into a very easy to read table and
we have they set the label 0 1 2 3 4 5 six and so on and then we have each row uh
we have male 78 none going across when we get all the way down here we'll see max
randolph at about number three and the first thing this does is this flags me that i
brought in a bunch of information up here that we really didn't want it's from 3
on that we want and we can clean this up in one of two ways we can try to clean
it up under the data or we can clean it up under the data frame depending on what
it is we're trying to do and so to fix this um i want to go ahead and just change
it up here in the actual data pull in we don't need that information so i'll
rerun it reload our data from four on and then when i run this we see we have max
randolph is right at the top of the list like he should be and we have all the
data going down now with the data frame remember i said we don't usually print
the whole data frame we'll go ahead and do df.head and this prints the first five
rows and you can see that we have 14 columns numbered 0 to 13 here's max randolph all the way to
theo kinman and i usually also print df dot tail and the reason i like to do these
particular two setups i'm going to change it just to two rows because you can do
that you can put as many rows as you want is it's good to look at the first part
and the end because those are usually where you have extra data brought in
something's messed up and you can also see that we have 190 rows in here and it
comes in with our zuma lisha and they're both on here on the list so now we have
a nice data frame columns and rows we can easily look at it we can see the setup
on here and we can look at the names and everything now at some point you might
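A minimal sketch of the data frame step, assuming the scraped rows are already a list of lists of strings. The three rows below are made-up placeholders, not the race results.

```python
import pandas as pd

# Hypothetical cleaned rows; in the walkthrough these come from the scraped table.
data = [
    ["1", "101", "Runner A", "M"],
    ["2", "102", "Runner B", "F"],
    ["3", "103", "Runner C", "F"],
]

df = pd.DataFrame(data)  # columns are labelled 0, 1, 2, ... until names are assigned

first_two = df.head(2)  # head() and tail() default to 5 rows; pass a number to change it
last_two = df.tail(2)

print(first_two)
print(last_two)
print(df.shape)  # (rows, columns)
```

Checking both ends is a cheap way to spot stray header or footer rows that came along with the table.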
be looking at these individual columns and find different information that needs
to be re-edited if you can you try to do it with the whole column under pandas or
you can go up in the upper part of the code where you went from cell to cell or row
to row and look at individual cells maybe find a marker in that cell that's
something specific like remove all colons or semicolons or
brackets so there's a lot of options in there but you'll find that this one
actually comes in pretty clean on here all the way down and the next thing we
really want to do is we want to look at the headers i don't know about you but
doesn't make any sense to me when i have column one i don't know what 1191 is or
1080 need name kaiser runner i'm guessing that's column two is names third one
looks like male or female then probably age but i don't want to guess i want this to
bring in my column names so i know exactly what i'm looking at so how can we make
beautiful soup do that for us well let's take our column headers we're going to
set that equal to our soup find underscore all and then we're going to look for
our headers our th tags and since we're in jupyter lab in this case jupyter
notebook i can just type out column headers if it's the last variable i have
listed it will automatically print it so it's kind of a shorthand and we can see
right here we have place i'm guessing that's bib name gender age city state chip
time chip pace and so on so we have all our headers right there i shouldn't have
to type them all in and before we go any further we'll go ahead and do a
header list equals our empty list and then we can do for column in column
headers and we can take our header list and just append and what do we want we
want the text from the column so we'll just do column.text and then if we come
down here and we print our header list let's see what that looks like if we did
it right we should get a nice list of all the different column headings we want
so we have place bib name and so on and then pandas just because pandas is so
cool we can simply do df dot columns equals our header list we simply
set df columns to the header list and then if i print df.head we'll take a look at
that and we'll see right here it has nicely placed our values on here place bib
name gender age and so on so very quickly we've created this nice data frame we
have the data displayed in nice rows and columns and easy to read and then as a
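Here is one way the header step might be written end to end. The one-row table is a made-up placeholder, and the list comprehensions are just a compact form of the append loops described in the video.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical table with a header row and a single data row.
html = """
<table>
  <tr><th>Place</th><th>Bib</th><th>Name</th></tr>
  <tr><td>1</td><td>101</td><td>Runner A</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Pull the text out of every <th> tag to build the column names.
column_headers = soup.find_all("th")
header_list = [column.text for column in column_headers]

# Build the data rows and skip rows that have no <td> cells (the header row).
rows = [[cell.text for cell in tr.find_all("td")] for tr in soup.find_all("tr")]
rows = [row for row in rows if row]

df = pd.DataFrame(rows)
df.columns = header_list  # the list length must match the number of columns

print(df.head())
```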
data scientist the first thing we want to know is the info so df dot info what is in these
columns and rows and headers and you see right here they all come up with non-null
object that's a big flag so if i want to do anything with this these are
all coming through as strings an object usually means strings in this case
that they're a string variable and we have you can quickly read through this 191
entries data columns total 14 columns there's a total of 14 columns in the data
and it shows you all the different names and what type of column they are
and it's probably good also to look at the shape of the data df.shape we'll go
ahead and just run that you see it's 191 by 14 14 columns 191 entries this is
just like how we look at a numpy array 191 by 14 for the shape and remember this is
a variable so if it's the last variable or last value in the
cell jupyter automatically prints it out so if you're in a different ide you
want to go ahead and use the print statement on here then one of the things you'd
want to also go through we'll create a second one df2 equals df dot dropna now
the axis is automatically equal to zero so a lot of times you'll see something
like axis equals zero comma how equals any axis equals zero is the default that means
we're going down the rows and dropping rows or you could set it to one and drop columns
let's remove the how any for a second that's just going to confuse you the axis is whether
you're looking at it row by row by row
or you could be looking at it column by column by column with axis equals one this would drop any
column that has an n a in it and then how equals and we want any
i always confuse all and any because they both start with a all means that all of
them have to be null where any means that any one of them being null is enough to drop
it so with one this would drop any column with a null value in it but we want 0 and it'll
drop any row with a null value and then because 0 is always the default we'll
just leave it out and then i'm curious as to what the shape is now did we lose
anything was there any null values in our df2 that we dropped from the df and
we'll go ahead and run that and we see 191 and 14. so we didn't really drop
anything but it's always good to check there's other ways you can also do are
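To make the axis and how options concrete, here is a small sketch on a made-up frame with a few missing values. The column names and values are placeholders chosen for illustration.

```python
import pandas as pd

# Hypothetical frame: row 1 is missing Age, row 2 is missing Name and Age.
df = pd.DataFrame({
    "Place": ["1", "2", "3"],
    "Name": ["Runner A", "Runner B", None],
    "Age": ["29", None, None],
})

df.info()        # prints column names, non-null counts and dtypes
print(df.shape)  # (3, 3)

# axis=0 (the default) drops rows; how="any" drops a row if any value is null.
rows_any = df.dropna(axis=0, how="any")

# how="all" drops a row only when every value in it is null.
rows_all = df.dropna(axis=0, how="all")

# axis=1 drops columns instead: any column containing a null goes away.
cols_any = df.dropna(axis=1, how="any")

print(rows_any.shape, rows_all.shape, cols_any.shape)
print(df.isna().sum())  # count of nulls per column, a quick way to detect them
```

Comparing the shape before and after, as done in the video, is the simplest check on how much data a dropna call removed.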
there let's see any n a's you can detect n a's in here null values infinite values
that's another one you got to watch out for we're working with data that we're
going to do something with here in a minute so you got to be a little careful
also in the conversions are you going to convert something where people typed in
weird characters to describe the data a certain way so now we've got to this
point where we have all our different columns we have our different data and at
this point maybe you're asking or maybe the shareholder the company is asking hey
can we look at the based on the chip time here's our chip time can we plot that
versus gender how does gender versus chip time compare and so we can do that we
can take that and the first thing we look at is we say hey well chip time came in
as a string and that's going to be an issue now there's a number of ways you can
change this one of them is we could go all the way back up here where we created
the data and find a way to tag it and say hey whenever this cell text maybe
instead of appending this i notice that anytime there's two colons in it that's
probably a time signature and let's convert all the time signatures to a date time
field or whatever a lot of times you don't get that you don't get that option
and that's always a question in bringing in data whether you convert the data
coming in at the beginning or do you wait till you have it open it up and then
convert it when you go to use it we're going to go ahead and convert it after we
got it into our data frame so we have our df2 here we've dropped all of our n a's
we dropped our we have a shape in fact let's do this since there's no difference
between df and df2 well we'll just go ahead and use df2 so let's go ahead and
take our df2 and we want to take those that specific field and convert it into
some kind of numerical value we can use and let's add another column a lot of
times this is something you want to do is where you want to go ahead and keep all
the original columns and just add a new column in there and this new column is
going to be based on if you remember correctly we had chip time that's what we're
going to look at okay we want chip time versus gender if we go into our pandas we
find out we have pandas to underscore timedelta and this is actually time delta and then we just
want to take our df2 and we're going to use the chip time column so this is going
to say hey let's look at let's convert everything in df2 chip time into a time
delta format that's the data type we're going to put it as let me go ahead and
just run this and if we go in here and we do info df2 and we'll keep our we're
going to look at this particular column but we want to keep it as a data frame so
this is a list of all the columns we want to look at we'll just do dot info on
here and run that and we do an info on that you can see is now a time delta 64
nanoseconds uh well we really don't want nanoseconds we actually probably want to
do it in minutes uh so let's take a look at that and let's take this whole thing
df2 let's just set the df2 we have our df2 this is the column we're working with
here and we can use the astype method in pandas and so we can set this equal
to df2 we'll take our same column in here and we'll set it as type timedelta64
seconds so it's still a delta time here so if i run this you'll see that it still
comes up as where is it hopefully it turns it into a float so we're now at a
float 64 so it's the number of seconds in that delta time and then finally we
want to go ahead and turn that from seconds to say minutes and you know there's
60 seconds in minutes and so now we divide by 60 we still have a float we have
our info it shows us it's a float and then we can go ahead and just do a print
df2 and let's just keep it small we don't want to look at all our data we just
want to do the head of it and we run this and you can see right here where we go
over to the last one here's our chip time in minutes and a lot of times just to
make life easy for viewing since we're only looking at this particular element we
can do chip time in minutes and now we just see that oops we take off we'll go
ahead and take off our info done with that and we'll run this and you can see we
have our minutes 16.8 minutes 17.51 minutes and so on it's a float number now
keep in mind this is 0.8 of a minute so that's 16 minutes and 48 seconds not 16 minutes and 80 seconds that can
always throw you if you're going through so many numbers you forget it's
important to remember that and we're also going to look at the other one we're
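The conversion to minutes can be sketched as below. This swaps the astype step from the video for the .dt.total_seconds() accessor, because in newer pandas releases astype to timedelta64 seconds may keep a timedelta dtype rather than giving a float. The column names ChipTime and ChipTime_minutes and the sample times in hours:minutes:seconds form are assumptions for illustration.

```python
import pandas as pd

# Hypothetical chip times as strings in H:MM:SS form.
df2 = pd.DataFrame({"ChipTime": ["0:16:48", "0:17:31", "1:02:30"]})

# Parse the strings into timedelta values (dtype timedelta64).
chip_delta = pd.to_timedelta(df2["ChipTime"])

# Keep the original column and add a new float column holding minutes.
df2["ChipTime_minutes"] = chip_delta.dt.total_seconds() / 60

print(df2.head())
```

The first value comes out as 16.8, which is 16 minutes and 48 seconds, since the fractional part is a fraction of a minute.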
looking at is what gender you want to look at gender to chip time in minutes and
so we can see here under the head we get male male male and a number of different
setups and let's switch this to tail real quick and just look at the end of it
and here we have female female male female female so we have two different
genders and we have our chip time in minutes and if you remember we brought in
our plt if you haven't used the plot library the matplotlib library you have a
drawing place you're putting stuff on so we have our plt we're going to do a bar
graph and we just want to simply use our df2 gender and df2 chip time in minutes
so that's going to plot the two bars and to make it pretty we'll go ahead and
give it the x label gender the y label chip time minutes and that simply is
remember it always plots x and then plots y we have our gender our chip time give
it a nice title uh comparison of average minutes run by male and female uh and if
we go ahead and run this with the correct titles in here and everything matches
you'll get a nice graph we can see here the comparison of average minutes run by
male and female here's our chip time in minutes the men seem to be slackers in
this particular case and it's actually there's a number of studies that show that
women tend to have as far as doing cross-country there's a lot of women who
have a longer endurance than men so it's not too surprising but we can see here
the average chip time around 70 and for women over 100 minutes and then another
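One thing to watch with the bar chart above: as far as I can tell, handing every row to plt.bar draws one bar per runner at each gender position, so the tallest bar in each group is what ends up visible rather than a true average. If the average is what you want on the chart, a sketch like the following computes it first with groupby. The four rows are made-up numbers, and plot_mean_minutes is a hypothetical helper that is defined here but not called.

```python
import pandas as pd

# Hypothetical sample; the real frame has one row per finisher.
df2 = pd.DataFrame({
    "Gender": ["M", "M", "F", "F"],
    "ChipTime_minutes": [50.0, 70.0, 55.0, 75.0],
})

# One mean per gender, which is what an "average minutes" chart should show.
mean_minutes = df2.groupby("Gender")["ChipTime_minutes"].mean()

# For comparison: the largest value per gender.
max_minutes = df2.groupby("Gender")["ChipTime_minutes"].max()

print(mean_minutes)
print(max_minutes)


def plot_mean_minutes(means):
    """Draw one bar per gender from a Series of means (needs matplotlib)."""
    import matplotlib.pyplot as plt

    plt.bar(means.index, means.values)
    plt.xlabel("Gender")
    plt.ylabel("Chip Time (minutes)")
    plt.title("Comparison of average minutes run by male and female")
    plt.show()
```

Calling plot_mean_minutes(mean_minutes) in a notebook would then draw exactly two bars, one per gender.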
really cool thing we can do is we can describe the data so df2.describe this
again is a pandas function just like info is we're going to include uh np number
the numpy number and if we run that you'll see here it comes up and says chip
time in minutes a count it gives you the average or the mean standard deviation
the minimum the maximum um all the different descriptive information you're going
to want from your data set on there and just because there's all kinds of fun
ways let's do a box plot to display your information uh we can do a box plot
where the column equals chip time in minutes and let's go ahead and run that keep
mistyping my chip time in minutes you can see it puts out a nice box plot showing
you the information we have our different values and outliers this is always
interesting because this is a nice way of seeing where we have these uh outliers
one up here and there's two up above and of course here there's a nice spread on
the box plot and we can also modify this a little bit and we can add in by equals
gender and then we'll give it we'll just give it a blank title i don't know why
we're going to get a blank title uh we'll just add a y label on there for
run time and if we run this you can see here box plot grouped by gender chip time
in minutes and now we have our female and male two different areas and you can
see how they vary you have your two different your outlier up here and you can
also see how there's such an overlap between the two different values so if i was
looking at this i'd be like wow you know i really could not draw a conclusive
thing on this saying that women's run time was more in general because they
overlapped too much that would be one of the conclusions i'd have to come up with
then here and then we get to maybe the partners come in from the company and say
hey we'd like to know the age versus chip time in minutes that'd be something
worth knowing on the statistics on this and the first thought is we can simply
plot it and we can do this we can actually plot the scatter plot chip time versus
df2 of age those are xy coordinates but if you remember from df2 when we did the
info let's go way back up here we're looking at an object data type as far as our chip
time on this and our age now we converted the chip time but we also need to convert
the age and if we do it right here we just plot it and it'll actually let us plot
it it shouldn't it should give us an error but it does let us plot it you'll see
the ages come up a mess over here because they're converting it to weird float
numbers and all kinds of things so what we want to do is we want to take our age
and we'll just call this age underscore i so we're going to take our age and we're
going to create another column for df2 age underscore i and the i is just going
to stand for integers our own choice of name there is a number of ways to do
this but we're going to do uh pandas to underscore numeric which is the best way in pandas and the
reason we're doing this is that to numeric creates a float value uh so right off the
bat we want all our stuff converted it converts it to the least common
denominator so if they're all integers already you'll get integers as you can see
from here it's doing some kind of conversion that converts it to a float value
the other thing that numeric does is if there is a null value or they put in like
a blank line or a dash to represent no information it'll convert it to a null so
it goes from like a string to a null versus just having some kind of made up
number that python somehow created for the graph we have below and then we want
to add our df2 age because that's what we're converting to numeric and then we want
to coerce it and there's a couple different options on this like you can have it
where it just doesn't process it in pandas but coerce means that if it gets a
weird value that becomes a null value now and since we're dealing with errors this is
what happens when you get an error converting it we want to coerce it there we go
and put the end bracket on there and then finally we want to go and round this
off so i'll put brackets around all the way around it and this rounds off
everything in this series so what we've done here is i've taken df2 age which is a dtype
object which in this case is mostly strings with a couple blank ones in there and
we're going to convert it to a numeric which will automatically go to float and
then we're going to take wherever there's an error wherever it says hey this
doesn't convert and usually that's a blank string like i said i've worked with so
many databases where someone puts down none someone types in a space sometimes
a dash to mean none and you get this really weird conversion coming up this
covers all of that in pandas so it's really a nice way of just coercing it and
saying hey if we don't have a number in there let's make it a null value and then
we're going to round it off and then finally let's go ahead and take our df2
here's our df2 and let's drop those null values dropna and we want how how
equals any so that means if there's any null values in the row it gets dropped now
this might get you in trouble because you might have null values in a different
column and so you might lose data that way at this point we could also
drop just on certain columns with null values there's all
kinds of other options in here what we're just going to do is how we're going to
just drop any and we want to do inplace equals true so that means it's going
to change df2 itself so df2 now has the age rounded to the integer
we didn't give it any decimal places and it's going to be age i and then we've dropped
all our null values that way we're not going to get any errors when we try to
plot a null value and it also cleans up the data by deleting out the rows
because that's what this does it automatically does axis zero which is your rows axis
one is your columns by doing this it automatically removes all the rows with null
values so it just cleans out the rows and then when we go ahead and plot this we
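The age cleanup might look something like this in code. The sample ages, including the dash and the blank, are invented to show what coerce does, and the column name Age_I follows the spoken name in the video.

```python
import pandas as pd

# Hypothetical sample: ages arrive as strings, some of them unusable.
df2 = pd.DataFrame({
    "Age": ["29", "40", "-", "", "35"],
    "ChipTime_minutes": [62.5, 70.1, 80.0, 91.3, 66.6],
})

# errors="coerce" turns anything that cannot be parsed into NaN instead of raising.
# round() with no argument rounds to whole numbers; the dtype stays float.
df2["Age_I"] = pd.to_numeric(df2["Age"], errors="coerce").round()

# Drop every row that has a null in any column, modifying df2 itself.
df2.dropna(how="any", inplace=True)

print(df2)
```

If nulls in unrelated columns should not cost you rows, dropna also takes a subset argument, for example subset=["Age_I"], so only that column is checked.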
see we have a nice clean data and we have age all the way up to 70. uh so we have
our chip time set and then our age going across and it makes a nice plot that you
can easily show to your
shareholders or whatever group you're working with it makes a really nice and quick
easy display and now anjali is going to explain how you can become a python
developer following that richard will cover 50 hand-picked interview questions
that you might face in your job interviews a very warm welcome to this video on
how to get a job as a python developer the job of python developer is one of the
most sought after in the market right now and for that very reason it might not
be that simple to land this job so in today's video i'll give you 12 tips on how
you can land this job so let's begin tip number one build your own github
repository so go on github create a repository and add all your files all your
python codes on there so it doesn't matter if it's a big project or some small
piece of code where you just took some input and made some manipulations
displayed it every work counts learn a bit of github version control so not only
do you upload your file once but you can make modifications to them rework on
them make it better upload it again showing your progress now this github
repository basically becomes your resume as a python developer the recruiters can
look directly on here instead of you sending them zip files or even jupyter
notebooks everything is now available online and they can access it from anywhere
through any machine so this really shows that you're not just writing code for
yourself but you want to share this with other people and that's very important
tip number two make sure that your code is readable so when you're putting your
code out in github as i mentioned previously you're writing this code not just
for yourself so if people want to learn from your code they want to view your
code it's necessary that they understand this code and of course there are a few
guidelines to follow which makes your code more readable the most important one
being you follow the pep 8 style guideline in case of python so the pep 8 style
guideline is basically some conventions that you use and that mainly talks about
indentation so in case of pep 8 you have a 4 space indentation tabs and spaces
then there's maximum line length which in case of pep 8 is 79 characters per line the
line breaks and blank lines that you need to put for example every top level class or
every top level function needs to be separated by two blank lines the
source file encoding string quotes whitespace in expressions trailing commas
naming conventions and so on so again very basic thing if you're having a
variable make sure the variable name shows what the variable stores what it is
used for and it's not just some vague name such as var one var two and so on tip
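As a small illustration of a few of those conventions (4-space indentation, two blank lines between top-level functions, short lines, descriptive names), here is a made-up example. The function names and the pace calculation are hypothetical and only there to show the layout.

```python
def average_pace(total_minutes, distance_km):
    """Return the average pace in minutes per kilometre."""
    return total_minutes / distance_km


def describe_pace(runner_name, total_minutes, distance_km):
    """Build a short, readable summary line for one runner."""
    pace = average_pace(total_minutes, distance_km)
    return f"{runner_name}: {pace:.2f} min/km"


print(describe_pace("Runner A", 50, 10))
```

Names like total_minutes and distance_km say what they hold, which is the point being made about avoiding placeholder variable names.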
number three create a good documentation again this helps with the readability
and the understandability of the code so one of the main things with creating a
good documentation is having a readme file in your github repository the readme
file should contain details regarding your project what your project does the
various libraries used in your project and so on so this is a great help to
anyone who is trying to learn from your code or implement them in a different way
now here we have a screenshot of the readme file created by raymond hettinger
it's present in his github repository now who is raymond hettinger well that
brings us to our fourth point raymond hettinger like another guy kenneth reitz
these are some of the very popular personalities on github they have a very
unique style and a very neat and organized style of coding and one of the great
ways to develop your own coding skill is to look at other people's code now when
you're looking at other people's code it's important to remember that you look at
code which is of your own skill level so if you are an intermediate coder make
sure you look at someone's github repository who again codes on the intermediate
level so that you're able to connect with that code you could probably write the
same code but he or she writes it in a better manner now these are some of the
people who have great github repositories you can definitely learn a lot from
them tip number five read books on python coding so you might know already quite
a bit of python in fact if you're looking for a python developer job there's a
good chance that you are an advanced coder but nothing beats books here are some
of the very popular and well-renowned books for python fluent python automate the
boring stuff with python and so on now fluent python is a great book to start
with what it does is it gets your python concepts really
strong so now you'll have not only great skills but also the perfect way to
portray these skills tip number six grow your python skill set of course you can
never stop learning keep learning and some of the very important things when you
go for a python developer job is to make sure that you know how to work with some
of these python libraries in fact make sure that while you can cover most of
these libraries there are certain ones that you have completely mastered some of
the very popular libraries with python are numpy scipy matplotlib tensorflow and
so on so learn these master them
create projects around them and finally put these all up on github for everyone
to view tip number seven it's never enough to just know a language you must know
how to apply it and with python some of its most important and popular
applications are in the field of ai machine learning and data science so master
ai and machine learning with python learn the various algorithms that these
fields use and implement projects on them as you can see here we have two of the
algorithms with machine learning that's linear regression k means clustering
neural networks is an algorithm used with deep learning so make sure you have
some of these applications up there in your repository this displays your skill
not only in python but also in other fields and both of these going hand in hand
just increases your value tip number eight take freelancing projects to start off
with so so far i mentioned how you write your own code you create your own
projects now that's not enough take up projects by companies now these may be
non-paid they may be really low paid does not matter as long as you have
something to show off so you have a project under your belt that really pays off
now some of the websites you can go to for freelancing work are freelancer
upwork twago truelancer.com and so on so this really shows to the recruiter that
this person did not just learn python but he or she is always looking at how to
implement them how to use them tip number nine make open source contributions so
you have your own github repository that's great but now look into others
repositories see if there's some value that you can add and if you can definitely
go for it this shows not only your skill but also that you're a team player you
want to add value to work that is already existing and that is a skill that's
again really valued in organizations so some of the popular ones include pipenv
which is the python development workflow for humans there's also chatistics
where you can convert your messenger and hangouts chat logs into data frames then
you can solve your traveling salesman problems using self-organizing maps and
there's also a python to bpf converter so these are great places to make your
contributions in fact we have the links for some of these in the description
below so please check them out tip number 10 start a blog and talk about what you
have learned so having your own personal blog will add a lot of credibility to
your profile in your blog you can definitely mention where you started off from
that is at a beginner level what all did you know how you took on your journey to
where you are now what materials you used to collect information and what
projects you took on how you went about this mention any papers you wrote and so
on all this again becomes another profile for you the recruiters can have a quick
look at your personal blog and have a good idea of what kind of a learner you are
what kind of a coder you are and if you have done everything right this will
create a great impression on the recruiters so here's a screenshot of ned
batchelder's blog on python it'll give you a good idea as to how to create a blog
and how to go about it so please check that out so here's a list of all his blog
posts he has 227 blogs just for python and he writes on various topics for
example here we have a screenshot of is python interpreted or compiled in fact
you can even include some of your personal views on python on the learning
process or the learning curve of python and so on tip number 11 follow a daily
schedule for practice so just because you think you have mastered the language do
not put it aside and let it collect dust take out some time every day write code
whether small or big make sure that every aspect of python is at your fingertips
and finally tip number 12 keep your resume and profile updated on job portals
such as linkedin indeed glassdoor and careerbuilder look out for python developer
job roles on these sites and google jobs simplyhired dice and more a recently
updated resume always captures the eye of the recruiter so these are some of the
tips you can follow to bag that python developer job if you're new to python
and require some help in gaining the skills to attain that job of a python
developer you have come to the right place just go to our website
simplilearn.com and in here we have just the course for you in fact we have a
number of courses for python but you can start off here with python training
course this course covers everything a to z for python and it's pretty much all
you require to get that job so before you take up the course you can definitely
go through everything that it covers the objectives who should take up this
course the prerequisites what projects are covered under the course and also the
subheadings of the various topics covered certainly the questions we're going to
ask in here are very general with a few specifics towards data science since
that's the main direction that python's going in and you'd want to expand your
questions for your interview depending on the domain that you're using the python
in specifically let's dive in and get started with some python interview
questions number one what is the difference between shallow copy and deep copy
and you can see with shallow copy we have object one which has child one child
two child three and so on and object two which has child one child two child
three a deep copy creates a different object and populates it with copies of the
child objects of the original object therefore changes to the original's children
are not reflected in the copy copy.deepcopy creates a deep copy a shallow copy
creates a different object and populates it with references to the child objects
of the original object therefore changes to those child objects are reflected in
the copy copy.copy creates a shallow copy and you can look at this with a shallow
copy child one is only a pointer so if you change child one through object one
it will also show that change in object two
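here's a small sketch of that difference and it's only an illustration the nested list called original is a made up stand in for object one and its child objects

```python
import copy

# hypothetical nested list standing in for "object one" and its children
original = [[1, 2], [3, 4]]

shallow = copy.copy(original)    # new outer list, but the inner lists are shared
deep = copy.deepcopy(original)   # new outer list and new inner lists

# change a child object through the original
original[0].append(99)

print(shallow[0])  # the shared child shows the change
print(deep[0])     # the independent child does not
```

so the shallow copy follows the change to the child while the deep copy keeps its own separate children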
number two how is multi-threading achieved in python oh this is a good one with
multi-processing and multi-threading this question is actually asking you do you
know the difference between multi-processing and multi-threading and how multi-
threading works multi-threading usually implies that multiple threads are
executed concurrently the python global interpreter lock doesn't allow more than
one thread to hold the python interpreter at that particular point of time so
multi-threading in python is achieved through context switching it's very
different from multi-processing which actually opens up multiple processes each
with its own interpreter number three discuss the django architecture so the
django architecture and the first thing to know is that django is a web framework
a way to build your web pages basically and so we look at the architecture you
can see here we have a nice model drawn out where the user initiates django which
initiates the url which initiates the view what they're going to view and you
have your model and your template so the model of data whatever data model you're
pulling goes into the template and then goes back up the pipeline to the user and
the important thing to note is there's a template the front end of the web page
this is what they're going to see there's a model the back end where the data is
stored so you can keep the template and the looks and everything looks the same
but you can swap out the underlying information that goes into that template then
you have your view which interacts with the model and template and maps it to the
url and then the django serves the page to the user so your django grabs it and
says okay thank you for the url and here you go user number four what advantage
does a numpy array have over a nested list so numpy's a module you import you
almost always see import numpy as np numpy is written in c such that all its
complexities are packed into a simple to use module lists on the other hand are
dynamically typed therefore python must check the data type of each element every
time it uses it this makes numpy arrays much faster than lists i would also add in that numpy has
a lot of additional functionality that you don't have in lists there's a lot of
things you can automate in the numpy quick flip over to our jupyter notebook any
ide will work if you're going through a set of interview questions taking a
quick look at code is always important we have our import numpy as np we're going
to import time here's our list a list of sub lists for i in range of 100 and what we're
doing is we're going to time it so we're going to create a list then we're going
to create a numpy zeros array and you can see here look how quick you can create
this numpy zeros array here we are appending one zero at a time for a regular
python list and here we are with numpy they're all zeros and they're all of type
integer i believe it's either float or integer on this i'd have to actually do a
type on it and then we create tl1 equals time dot time and
then we do for i in range of 100 for j in range of 100 l of i j equals l of i j
plus 5 so we're just doing a simple calculation on our array and sub arrays this
is an array of arrays and then tl2 equals time dot time minus tl1
so here we have our final time on that and then we'll do this with an array
array op and this is what i really love a equals a plus 10 and then you can just
print it right out so you can see right here with the numpy
array we're doing the same thing if i run this our time is significantly
different here we have 0.09 and 0.003 uh so you can see that the time drops
significantly when you're running this on a numpy array versus a list array also
important to note these times aren't going to be they'll change each time i run
it depending on what i have running in the background so there we go number five
what is pickling and unpickling amazes me how many times i pickle and unpickle
something converting a python object hierarchy to a byte stream is called
pickling pickling is also referred to as serialization converting a
byte stream back to a python object hierarchy is called unpickling unpickling is
also referred to as deserialization so if you just created a neural network
model you can now save that model to your hard drive pickle it and then you can
unpickle it to bring it back into another software program or to use at a later
time number six how is memory managed in python python has a private heap space
where it stores all the objects the python memory manager manages various aspects
of this heap like sharing caching segmentation and allocation the user has no
control over the heap only the python interpreter has the access you have a nice
little diagram here here's your program there's your interpreter we have our heap
memory management and the garbage collector going off of there number seven are
arguments in python passed by value or by reference arguments are passed in python
by object reference this means that an in-place change made to a mutable object
within a function is reflected on the original object so you can see here def
function of l l of 0 equals 3 l equals 1 2 3 4 function of l print l and you're
going to get 3 2 3 4 because we passed l in there so it's a pointer here we have
def function of l l equals 3 2 3 4 l equals 1 2 3 4 function of l print l and you
still get 1 2 3 4 because in this function instead of operating on a piece of l
the list i've assigned a whole new value to l and at that point it creates a new
object so if i make changes to the object it's going to change it outside the
definition if i take the variable and assign it a completely new value like l
equals 3 2 3 4 that will not show up when you're outside the function number
eight how would you generate
random numbers in python to generate random numbers in python you must first
import the random module the random function generates a random float value
between zero and one the randrange function generates a random number within a
given range and you can see here one is the lower end ten's the upper end and the
step is two so it'd be one three five and so on as far as the options in the random
generation number nine what does the double forward slash operator do in python
the forward slash operator performs division and returns the quotient as a float
for example 5 over 2 returns 2.5 the double forward slash operator on the
other hand performs floor division and for integers returns the quotient as an
integer for example 5 double slash 2 returns 2 that's 5 divided by 2 and you drop
the 0.5 number 10 what does the is operator do the is
operator compares the id of the two objects and you can see in here where list
one equals square brackets one two three and list two equals one two three list
one double equals list two is true and that's the double equals in python of
course and you can do list one is list two where list two equals one two three
and that's false list two is not the same object as list one it equals it but
it's not the same brackets and if we do list three equals list one then list one
is list three is true number eleven what is the
purpose of the pass statement the pass statement is used when there's a syntactic
but not an operational requirement for example the program below prints a string
ignoring the spaces and so here we have variable equals simplilearn with two
spaces added in it for i in variable so it goes through each i if i equals space
pass do nothing else print i and then we'll have the end equals empty quotes and
it'll print out simplilearn now of course you would probably write this as if i
does not equal blank space print but this would be another way you could do that
if you need a placeholder for that first logical set or that first area you can
also do a function like this you could do def whatever your function
name brackets colon pass so it goes into the function and does nothing but it's a
placeholder number 12 how will you check if all the characters in a string are
alphanumeric python has an inbuilt method isalnum which returns true if all
characters in the string are alphanumeric and so you can see here abcd123 dot
isalnum output equals true and the second line a b c d the at symbol 1 2 3 the
pound symbol dot isalnum output equals false so they really just want to know
about isalnum so that's the built-in way besides isalnum
which tells you the characters in the string are alphanumeric one can also use
regex instead and so we have bool of re.match what's important about this is to
note your capital a dash capital z lowercase a dash z 0 dash 9 in square brackets
means that character class includes all of those the way they have it written
out followed by a plus and a dollar sign and then we have what we're comparing it
to the string abcd123 and so we can do an re.match and if every character in the
string matches that character class
we'll get an output true and if not an output false number 13 how will you merge
elements in a sequence there are three types of sequences in python
there's lists tuples and strings python makes this easy if we have a
list one and list two and list one is square brackets one comma two comma
three list two is four five and six we can simply do list one plus list two and
our output is one two three four five six if we have tuples the
round brackets designate a tuple and again just add them together same thing with
strings s1 plus s2 equals simplilearn number 14 how will
you remove all leading white space in a string python provides the inbuilt
function lstrip to remove all leading spaces from a string and you can see here
space space space python dot lstrip gives python and you can also do strip
which removes leading and trailing and of course there's also rstrip for the
trailing end number 15
how will you replace all occurrences of a substring with a new string the
replace function can be used with strings for replacing a substring with a given
string syntax string dot replace old comma new comma count replace returns a new
string without modifying the original string hey john how are you john question
mark dot replace john with capital j-o-h-n comma one and then you can see right
here only the first john changes and since we designated it with the one that
just says we're only going to replace one of these leave the count off and it
replaces all of them number 16 what is the difference between del and remove
on lists del for delete del removes all elements of a list within a given range
syntax del list start colon end remove removes the first occurrence
of a particular element syntax list dot remove element and we see a nice example
over here if we delete list one colon three it will delete in this
case index one and two it doesn't do three remember that so we'll delete b and c
and you end up with a d where if we do remove b from the list and we have an a b
b d it's only gonna remove the first b number 17 how to display the contents of
text file in reverse order open the file using the open function store the
contents of the file into a list reverse the contents of the list run a for loop
to iterate through the list number 18 difference between append and extend
append adds an element to the end of the list you can see right here we have a
list one two three and we append four we end up with an output one two three
four and extend adds the elements from an iterable to the end of the list and we
have here list equals one two three list dot extend four five six output is one
two three four five six so if you wanna append an array to the end of another
array you want to use the extend number 19 what's the output of the below code
justify your answer this is a great interview question because these are the kind
of things that come up when you're proofing code def add to list value and list
so we have value in and a listing or list equals an empty in this case an empty
list list data pin value return list list one equals add to list one list two
equals add to list one two three empty bracket list three equals add to list a
and then we want to print them list one equals and you can see the formatting we
have our placeholder list one list two and list three so when it prints list one
we get one comma a and what you want to notice here is that list one and list
three are equal why are they equal well when we passed the information to the add
to list we passed value without passing the list equals brackets without passing
a second value what this means is that list as we have it if you don't have a
list it'll start off with empty list which we append the one to the second one
list two we appended a value to an empty list so it's only going to be one two
three doesn't matter what the list was before we've already assigned an empty
list and then list three here's the tricky one we're adding a to the list but
because we didn't designate the list list is a shared value in other words it
doesn't reset it and we end up with list one equals list three one comma a
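here's a rough sketch of that shared default list the names add_to_list items and add_to_list_safe are made up for the illustration and the second function shows one common way around it which is an addition and not something from the slide

```python
def add_to_list(value, items=[]):
    # the default list is built once, when the def statement runs,
    # so every call that omits items appends to the same list
    items.append(value)
    return items

list1 = add_to_list(1)
list2 = add_to_list(123, [])   # an explicit new list, so the default is untouched
list3 = add_to_list("a")

print(list1, list2, list3)


def add_to_list_safe(value, items=None):
    # a common workaround (an assumption, not from the source):
    # use None as the default and build a fresh list on each call
    if items is None:
        items = []
    items.append(value)
    return items
```

list1 and list3 end up being the very same list object which is why they print the same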
the default list is created only once when the function is defined and not during
each call number 20 what is the difference between a list and a tuple lists are
mutable while tuples are immutable and you can see an example down here where i
have list equals one two three square brackets denote it's a list list of two
equals four and i printed out i now have one two four if i do the same thing with
the tuple i get an error because you can't change the tuple one two three into
one two four you have to completely reassign the tuple to a new value number 21
what is a docstring in python docstrings are used in providing documentation to
various python modules classes functions and methods and so you can see here we
have def for a function add a b and this is a docstring we have the triple quotes
on there you can add carriage returns in that so that you can go multiple lines
and it says this function adds two numbers and then sum equals a plus b return
sum and so we have down here two different ways of accessing this docstring
accessing docstring method one this function adds two numbers accessing docstring
method two help on function add in module main this function adds two numbers and
so you can see the code down here has two very different end values the second
one is basically a help menu there's our help menu number 22 how to use print without
the new line the solution to this depends on the python version you are using in
python version 2 you can do print hi and then you add a comma afterwards print
how are you and you have hi how are you in version three print hi comma end
equals a space in quotes and it'll add a space on the end instead of a new line
you can put different characters in there but here you just want a space on the
end print how are you and now we get hi how are you number 23 how do you use the
split function in
python the split function splits a string into a number of strings based on a
specific delimiter so we have string dot split delimiter comma max where max is
the maximum number of splits and the delimiter is the character on which the
string is split by default it's whitespace
so here we have an example we have a variable red blue green orange and we want
to split it by commas and we only want to do the first two so if we print the
list now you'll find it has red blue and it only split it the first two
times and it gets to the third one and just groups them all together green and
orange if you leave the two off you'll split the whole thing number 24 is python
object oriented or functional programming it supports both python follows the
object-oriented paradigm and you should really know in depth what they mean by
object-oriented paradigm if you're doing any interview for scripting languages
python allows the creation of objects and their manipulation through specific
methods it supports most of the features of oop such as inheritance and
polymorphism so you have an object and you can inherit all the traits of that
object and then add new traits in or alter some of those traits that's what
object oriented means python also follows the functional programming paradigm
functions may be used as first class objects python supports lambda functions
which are characteristic of the functional paradigm
so you can set a variable to a function as opposed to setting it to an object
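just to make both sides concrete here's a small sketch the class names animal and dog and the variable names are made up for the illustration they're not from the slides

```python
# object oriented side: a hypothetical base class and a subclass
class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return f"{self.name} makes a sound"


class Dog(Animal):
    # inherits __init__ from Animal and alters one trait
    def speak(self):
        return f"{self.name} barks"


# functional side: a lambda assigned to a variable and passed to map
square = lambda x: x * x
squares = list(map(square, [1, 2, 3]))

print(Dog("rex").speak())
print(squares)
```

the dog class inherits everything from animal and overrides one method while square is a function held in a variable and handed to another function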
number 25 write a function prototype that takes a variable number of arguments
here we have def function name asterisk list so we could have in this case
whatever the list is def function of asterisk var the asterisk denotes that
we're going to take multiple arguments in one variable and we can do for i in
var print i so if you send function of 1 you'll end up with a 1 function of 1 25
6 will actually print those out one at a time the first one just prints out a 1
because it only sent one variable the second one will print 1 another line 25
another line 6
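a minimal sketch of that prototype might look like this the name show_all and the return value are assumptions added so the example is easy to check

```python
def show_all(*values):
    # values arrives as a tuple holding however many arguments were passed
    for v in values:
        print(v)
    return len(values)

show_all(1)          # prints one line
show_all(1, 25, 6)   # prints three lines, one per argument
```

the same function handles one argument or several because the asterisk packs them all into a single tuple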
number twenty six what are asterisk args and double asterisk kwargs asterisk args
is used in a function prototype to accept a varying number of arguments it's an
iterable object def function of asterisk args and you can imagine it's just a
basic tuple so if i send add the numbers a comma b or a comma b comma c it
doesn't really matter it will have that number of objects in it whatever i send
to it and there's other uses for it but that's very basic with kwargs i can
actually name what i want to send so it's used in a function prototype to accept
a varying number of keyworded arguments which arrive as a dictionary
both are iterable objects so you can go through them one at a time and with def
function of double asterisk kwargs you can now set like color equals red units
equals 2 so you'll see that especially in machine learning there's a lot of like
they'll have inline equals true that kind of thing number 27 in python functions
are first class
objects what do you understand from this this means a function can be returned
from another function i could create a function and treat it just
like an object i can assign it to a variable i can pass them as arguments to
other functions number 28 what is the output of print double underscore name
double underscore and justify your answer double underscore name double
underscore is a special variable that holds the name of the current module
program execution starts from main or code with zero indentation so double
underscore name double underscore has the value double underscore main double
underscore in the above case if the file is imported from another module then
double underscore name double underscore holds the name of this module number 29
what is a
numpy array and we briefly touched on a numpy array compared to a list earlier on
processing speed now let's go ahead and look at some of the more specifics a
numpy array is a grid of values all of the same type so they're either all
float all integer or all string and it is indexed by a tuple of non-negative
integers the number of dimensions is the rank of the array and the shape of an
array is a tuple of integers giving the size of the array along each dimension
number 30 what
is the difference between matrices and arrays a matrix comes from linear algebra
and is a two-dimensional representation of data it comes with a powerful set of
mathematical operations that allow you to manipulate the data in interesting ways
now arrays an array is a sequence of objects of similar data type an array within
another array forms a matrix like we said here two-dimensional so if you have an
array of three by four that would be a matrix number 31 how to get indexes of n
maximum values in a numpy array of course the first thing to do is to import your
numpy as np you don't necessarily have to use np but that is the most standard
use of numpy we create our array equals an np.array of one two three four five
and then if we want to get our indexes of the n maximum values in a numpy array
one way to do it is to take our array dot argsort then do minus n colon that
means once you've arg sorted it you can take the last n where n would equal
the number of entities you want so it's not the actual letter n and really this
is about understanding this notation that argsort orders the indexes from lowest
value to biggest and then we can get the top values for n indexes and then we
have our final set of brackets with colon colon minus 1 to reverse them number 32
how would you obtain
the resulting set from the train set and the test set from below and let's go
ahead and look at the two different variables we have train set equals an array
of one two three test set equals a numpy array of arrays we have 0 1 2 and 1 2 3
what's important here is that one it's a numpy array so that leaves a out and
then we're stuck with three other options and i'm going to say d is out none of
these and let's look at np.concatenate versus np.vstack concatenate would
put one set after the other so it would probably give you an error
because one set is one two three and then we're going to concatenate the arrays
0 1 2 and 1 2 3 the array of arrays onto the end of that what we
really want to do is stack and by the way there are
arguments you can put into concatenate that change this so you
could use concatenate with a lot of fudging around but really what we're looking
for is vstack v stands for vertical versus hstack which is horizontal and if
we do a vstack we can simply do train set comma test set and stack them together
and so we have c resulting set equals np.vstack of train set comma test set both
option a and b would do horizontal stacking but we would like to have the
vertical stacking option c does this again you could add the axis in and use
concatenate to stack it the correct way number 33 how would you import a decision
tree classifier in sklearn we have from sklearn dot decision tree import decision
tree classifier from sklearn dot ensemble import decision tree classifier and we
look at these and they all end in import decision tree classifier that last
part happens to be correct and it's really just vocabulary knowing where the
decision tree classifier is stored what module is that a part of and it is of
course part of sklearn.tree so option c number 34 you have uploaded the dataset in
csv format on google spreadsheet and shared it publicly how can you access this
in python what's important here is to know that we can read stuff with pandas so
we don't show it here but you can there's actually a number of ways to do this
what's important here is to know a couple things one we have our link generated
from the google docs and spreadsheets and then we can do a stringio dot stringio
of requests dot get link dot content so there's our source and then finally we
know that pandas can read a csv there's obviously many ways to read a csv but
data equals pd.read_csv of source
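here's a small sketch of that last step and it makes a few assumptions the csv text is a made up stand in for what you'd download from the shared link so the example runs without a network connection and in a real script that text would typically come from something like requests dot get of link dot text assuming the requests library is installed

```python
import io

import pandas as pd

# hypothetical csv text standing in for the downloaded spreadsheet contents
csv_text = "name,age\naa,21\nbb,16\n"

# wrap the text in a file-like object so pandas can read it like a file
source = io.StringIO(csv_text)
data = pd.read_csv(source)

print(data)
```

this uses io dot stringio which is the python 3 spelling the stringio dot stringio form mentioned above is the python 2 module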
number 35 what is the difference between the two data series given below below
we have df of name in brackets and df dot loc of colon comma name
where df equals pd data frame of aa bb xx uu comma 21 16 50 33
columns equal name and age so let's take a look and see what they're looking at
we have just glancing at the questions they want to know is it the original data
frame or is it the copy of the data frame and you can see here that one is view
of the original data frame and two is a copy of the original data frame two is a
view of the original data frame and one is a copy of the original data frame both
are copies both are views and if you're working with pandas you know that unless
you specifically in certain things tell it to do it in line and a lot of
functions don't allow you that you're always taking a slice and it is always a
copy so c both are copies of the original data frame number 36 you get the
following error while trying to read a file temp dot csv using pandas which of
the following could correct it so here's our error traceback most recent call
last file input line 1 in module unicode encode error ascii codec can't encode
character oh i hate it when that one comes up and we have four different entries
we'll go ahead and just pretend that d doesn't exist unless we really can't fit
it into one of the other answers and the first one is pd read csv has our file
compression equals gzip well g zip is just an unzipping and you actually get a
zip error on there the second one is dialect equals string again not an encoding
or coding setup and then we have encoding equals utf-8 well that would be the
encoding error switching it from the character code there's utf-8 there is
unicode that's the most common two that goes between so really this is about
understanding the difference between a utf-8 coding and a unicode and the error
that comes up quite regularly with that so option c encoding should be utf-8 number
37 how to set a line width in the plot given below so looking at this we have
import matplotlib dot pyplot as plt and you should know your way around this how
to do a plot in there plt.plot of 1 2 3 4 then plt.show and so this is a little
bit of a vocabulary test the vocabulary is it width equals three line width
equals three lw equals three or something else and the vocabulary word that
we're looking for is lw equals three which stands for line width in matplotlib
pyplot number
38 how would you reset the index of a data frame to a given list so this is a
vocabulary challenge and understanding what re-indexing is re-indexing as we have
the different values here we have the first one which is reset the index well
we're not really resetting the index reindex option b means we are double
checking our indexes to the column and to the main index and so the values match
correctly where reindex underscore like now brings in a new index from outside
our data frame so this is coming from external and thus the vocabulary word like
is our key word that is external and we have a data frame to a given list number
39 how can you copy objects in python the functions used to copy objects in
python we have copy dot copy for shallow copy and copy dot deepcopy for deep copy
number 40 what is the difference between range and xrange functions in python
well this is a good one and it applies to python 2 where range
returns a python list object and xrange returns an xrange object
xrange creates values as you need them through
yielding the key here is that xrange returns the values as you need them so it
actually processes it lazily like if you have for a variable in xrange of 10 it
is processing them as you need them zero to nine it doesn't create an array zero
to nine it just hands you zero then one two three four one at a time in python 3
xrange is gone and range itself behaves this way number 41
how can you check whether a pandas data frame is empty or not? The attribute
df.empty is used to check whether a pandas data frame is empty or not. So down
here we import pandas as pd, we create an empty pandas data frame, and
df.empty comes out as True. One of the catches you've got to remember with
these vocabulary questions is whether you need the brackets or not at the end:
empty, along with some other pandas names, is an attribute, so no brackets.
Number 42: write the code to sort an array in NumPy by the (n-1)th column.
This can be achieved using the argsort function. Let's take an array x; then, to
sort by the (n-1)th column, the code will be x[x[:, n-2].argsort()]. So let's see
what that code looks like. We import numpy as np and create our NumPy array, which
is [1, 2, 3], [0, 5, 2], [2, 3, 4], so we have three rows with three columns in
there. Then we go x[x[:, 1].argsort()]: we take all the rows of column 1, which is
actually the second column because the count goes zero, one, two, and call
argsort on it. Instead of the 1 you could also use -2 there, which picks the same
column. And then we get the rows reordered by that column: [1, 2, 3], [2, 3, 4],
[0, 5, 2].
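Written out, the steps above might look like the following sketch. It assumes NumPy is installed; the names n, order and sorted_x are added here for readability and do not come from the video.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [0, 5, 2],
              [2, 3, 4]])

n = x.shape[1]                   # number of columns, 3 here
order = x[:, n - 2].argsort()    # row order that sorts column index 1 (values 2, 5, 3)
sorted_x = x[order]              # reorder whole rows using that order

print(order)      # [0 2 1]
print(sorted_x)   # rows [1 2 3], [2 3 4], [0 5 2]
```

argsort returns the positions that would sort the column, and indexing x with those positions moves entire rows, which is why the other columns travel along with the sorted one.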
number 43: how to create a series from a list, a NumPy array, and a dictionary? So
we'll go ahead and import our numpy and our pandas, and then we have mylist. You
can see here we have mylist equals list of the letters a, b, c, d, e, f, g, all the
way through, so mylist now makes a list of that. For the array we have
np.arange(26). mydictionary will create a dictionary with a zip of mylist and the
array, so I'll just use the NumPy array we just created, myarray, to go into the
dictionary. And the solution is simple, with pd.Series(mylist), pd.Series(myarray),
and pd.Series(mydictionary). So it's all about knowing the dot capital S-e-r-i-e-s; don't
forget that capitalization. Number 44: how to get the items not common to both
series A and series B? You can see here that instead of series A and B we
have series one and two, holding 1, 2, 3, 4, 5 and 4, 5, 6, 7, 8. The solution
is we build a pandas series, series u, equal to a pandas series of np.union1d of
series one and series two, so we now have a union of them. Then we make series i,
a pandas series of the intersection (np.intersect1d is the matching NumPy call),
and then we can remove one from the other: series u where series u is not in
series i. So if a value in the union is not in the intersection, then you know
it's a unique value. There's a little bit
of logic going on there, playing with three different terms to get the answer we
want. 45: how to keep only the top two most frequent values as they are and replace
everything else with Other in a series? Again, we're talking series and data
frames, and that means we're working with pandas, so
we're going to import pandas as pd. We'll go ahead and create our pandas series,
and we're going to do that by first setting up np.random.RandomState(100), so a
seed of 100 on the NumPy side, and then we have our pandas series, which you can
see here is built from np.random.randint(1, 5, 12): twelve random integers from 1
up to but not including 5. The solution for this, with the pd.Series (remember the
capital S) created, is that we're going to print the top two frequencies, and that
is our series dot value_counts(). Then we take series.value_counts().index[:2], so
we're going to take everything up to two, and then we'll do the series isin, so if
a value is not in the first two then it's going to equal Other. This would be
something you'd want to write down on paper. If it looks confusing, take a moment,
pause the video, write this down, and see if you can figure out how the logic came
together, and try to throw yourself a couple of other little logic puzzles like this.
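If you do write it down, the logic might come out like this sketch. It assumes NumPy and pandas are installed. The names rng, top_two and result are illustrative, and it uses where on a copy instead of assigning into the series in place, which is a swapped-in variation rather than the exact code from the video.

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(100)          # seeded generator, seed 100 as in the video
ser = pd.Series(rng.randint(1, 5, 12))    # 12 random integers from 1 to 4

top_two = ser.value_counts().index[:2]    # the two most frequent values

# Cast to object first so integers and the string "Other" can share one series,
# then keep values that are in the top two and replace everything else.
result = ser.astype(object).where(ser.isin(top_two), "Other")

print(ser.value_counts())
print(result.value_counts())
```

value_counts sorts by frequency, so the first two entries of its index are the values to keep; isin then builds the true/false mask that where uses to decide what stays.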
number 46: how to find the positions of numbers that are multiples of 3 in a
series? In here we're actually going to use NumPy to solve it. The first part,
series, lets you know it's going to be a pandas series, and if we come down here we
have np.argwhere; this is a vocabulary question. We pass it series % 3 == 0;
remember the percent sign with 3 gives the remainder, so where the remainder equals
0 we're going to get back that position: the value divided by 3 has no remainder, so
then we know it's a multiple of 3. Number 47: how to compute the Euclidean
distance between two series? This one's really cool because we have our pandas
series p and q, and what I like about this one is they give us two solutions you
can go with, and really you should kind of know both. The first one would be, yes,
you know what the Euclidean distance is: we take the first series
minus the second series, square it, sum that up, and then we take the square
root, which is the same as raising to the power 0.5. Raising to the power 0.5 is
easier to type than calling the square root, so a lot of times you'll see that as
a switch, but you could have also used the square root from math in there. So
there's solution one, you should know your Euclidean distance, and then solution
two is the NumPy solution: we have np.linalg.norm, that's how we're going to
compute our Euclidean distance, of p minus q. Very elegant, very straightforward, and easy
to compute. Number 48: how to reverse the rows of a data frame? Here we have our
data frame: we're going to create a NumPy array with np.arange(25), reshape it with
(5, -1), and this creates a 5 by 5 data frame. Our solution is to use df.iloc,
and this is just understanding how steps work. For the rows you have
colon colon minus one, so we're taking all the rows with a step of minus one,
going in the reverse direction, and then a plain colon
to take all the different columns on there. Let me say
that again: the first part is your rows, starting row, stopping row, step
minus one. That's all this is about, that step of minus one, then a comma, and then all the
columns. Forty-nine: if you split your data into train and test splits, is it
possible to overfit your model? The answer is yes, it is definitely possible. One
common beginner mistake is retuning a model, or training new models with different
parameters, after seeing its performance on the test set. My favorite example of
this is you have your script put together and you keep hitting the re-run button
until you get the answer you want, not taking the answer it first gave you or
running it over an array and recording all the answers to see how they vary.
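One common way to stay out of that trap is to carve out a separate validation set for tuning and to look at the test set only once, at the very end. A minimal sketch, assuming NumPy is installed; the data is made up, and the 60/20/20 proportions and all names are illustrative, not from the video.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))     # made-up features
y = rng.normal(size=100)          # made-up targets

idx = rng.permutation(len(X))     # shuffle the row indices once
train_idx, val_idx, test_idx = np.split(idx, [60, 80])   # 60 / 20 / 20 split

X_train, y_train = X[train_idx], y[train_idx]   # fit models here
X_val, y_val = X[val_idx], y[val_idx]           # compare parameter settings here
X_test, y_test = X[test_idx], y[test_idx]       # score once, after tuning is done

print(len(train_idx), len(val_idx), len(test_idx))  # 60 20 20
```

Because every retuning decision is made against the validation rows, the test score at the end is still an honest estimate of how the model behaves on data it has never influenced.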
number 50: which Python library is built on top of matplotlib and pandas to
ease data plotting? The answer to this is Seaborn. Seaborn is a data visualization
library in Python that provides a high-level interface for drawing
informative statistical graphs. I hope this helps you in your interview. With that,
we've reached the end of this complete Python course. I hope you enjoyed this
video. Do like and share it. Thank you for watching, and stay tuned for more from
Simplilearn.
