Python Libraries -2025 (1)

The document provides an introduction to Python libraries, explaining their role as collections of pre-written code that simplify programming tasks. It covers various built-in functions and modules such as math, pandas, NumPy, and Matplotlib, along with their applications in data manipulation and visualization. Additionally, it illustrates how to create and manipulate data structures like Series and DataFrames using pandas.

Uploaded by

Jairo Romero

4/14/25, 10:04 PM Python Libraries - Jupyter Notebook

Getting Started with Python Libraries


Python libraries are powerful collections of tools. A library is a collection of pre-written code,
often organized into modules and packages, that provides reusable functions and classes for
performing specific tasks, thereby simplifying and accelerating the development process.

The Python programming language comes with a variety of built-in functions. Among these are
several common functions, including:

print() which prints expressions out
abs() which returns the absolute value of a number
int() which converts another data type to an integer
len() which returns the length of a sequence or collection
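
As a quick illustration of the built-in functions listed above, the following minimal sketch calls each one without importing anything. The variable names and sample values are chosen only for this example.

```python
# Built-in functions are always available; no import statement is required.
value = -7.5
magnitude = abs(value)       # absolute value of a number
whole = int("42")            # convert a string to an integer
count = len([10, 20, 30])    # number of items in a list

# print() writes the three results to the output, separated by spaces.
print(magnitude, whole, count)
```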

These built-in functions, however, are limited, and we can make use of modules to make more
sophisticated programs.

A module is a file with the .py extension that contains code such as functions. A library is a
collection of related modules or packages. Libraries are used by programmers, developers,
researchers and other community members.

In [1]: # Let's try to import a module, math
        import math

Since math is part of Python's standard library, your interpreter should complete the import with
no feedback, returning to the prompt. This means you don't need to install anything to start using
the math module.
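
Once imported, the module's functions and constants are reached through the `math.` prefix. The short sketch below uses three of them; the variable names are illustrative only.

```python
import math

root = math.sqrt(16)           # square root, returned as a float
area = math.pi * 2 ** 2        # area of a circle with radius 2
rounded_up = math.ceil(4.1)    # smallest integer greater than or equal to 4.1

print(root, round(area, 2), rounded_up)
```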

Let’s run the import statement with a module that you may not have installed, like the 2D plotting
library matplotlib:

In [2]: import matplotlib
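
If the library is not installed, the import statement raises a ModuleNotFoundError, and the package then has to be installed (for example with pip or conda) before it can be used. One way to check in advance, sketched below, is to ask the import system whether it can locate the module. The helper name `is_installed` is hypothetical and not part of the original notebook.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the import system can locate `module_name` (hypothetical helper)."""
    return importlib.util.find_spec(module_name) is not None


# math ships with Python, so it is always found.
print(is_installed("math"))

# The result for third-party libraries depends on the current environment.
print(is_installed("matplotlib"))
```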

Some widely used Python libraries and what they are typically used for:

NumPy:
Numerical computing, array manipulation, and scientific computing.

Pandas:
Data manipulation and analysis, especially with DataFrames.

Matplotlib:
Data visualization, creating static, interactive, and animated visualizations.

localhost:8888/notebooks/anaconda3/0_BigData_2025/Python Libraries.ipynb# 1/9



Seaborn:
Statistical data visualization, building on Matplotlib for more visually appealing and informative
plots.

Scikit-learn (imported as sklearn):
Machine learning algorithms, providing tools for classification, regression, clustering, and more.

TensorFlow:
Deep learning framework, used for building and training neural networks.

SciPy:
Scientific computing, containing modules for optimization, integration, and signal processing.

PyTorch:
Another popular deep learning framework, known for its flexibility and dynamic computation graphs.

Keras:
High-level API for building and training neural networks, often used as a front-end for TensorFlow
or Theano.

In [3]: import pandas as pd

Basic data structures in pandas

Pandas provides two types of classes for handling data:

1. Series: a one-dimensional labeled array holding data of any type, such as integers, strings,
Python objects, etc.
2. DataFrame: a two-dimensional data structure that holds data like a two-dimensional array or a
table with rows and columns.

In [4]: # let's create a pandas series that holds only integers

        s = pd.Series([1, 3, 5, 0, 6, 8])


In [5]: s

Out[5]: 0    1
        1    3
        2    5
        3    0
        4    6
        5    8
        dtype: int64

In [6]: # let's create a pandas series
        # nan is a blank cell, it is defined in numpy library

        s = pd.Series([1, 3, 5, np.nan, 6, 8])

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_84444\2328450563.py in <module>
      2 # nan is a blank cell, it is defined in numpy library
      3
----> 4 s = pd.Series([1, 3, 5, np.nan, 6, 8])

NameError: name 'np' is not defined

In [7]: # we need to import the numpy library to execute the code correctly
        import numpy as np

In [8]: s = pd.Series([1, 3, 5, np.nan, 6, 8])

In [9]: s

Out[9]: 0    1.0
        1    3.0
        2    5.0
        3    NaN
        4    6.0
        5    8.0
        dtype: float64
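
Note that the dtype changed from int64 to float64 once a NaN was included: NaN is a floating-point value, so pandas stores the whole Series as floats. The sketch below compares the two cases and shows two common ways of handling the missing entry. The variable names `ints` and `with_gap` are illustrative only.

```python
import numpy as np
import pandas as pd

ints = pd.Series([1, 3, 5, 0, 6, 8])            # integers only
with_gap = pd.Series([1, 3, 5, np.nan, 6, 8])   # one missing value

# The presence of NaN forces a floating-point dtype.
print(ints.dtype)
print(with_gap.dtype)

# Count the missing entries, then build a copy without them.
missing_count = int(with_gap.isna().sum())
cleaned = with_gap.dropna()
print(missing_count, cleaned.tolist())
```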

In [10]: # import the pandas library
         import pandas as pd


In [11]: # create a dataframe df
         df = pd.DataFrame(
             {
                 "A": 1.0,
                 "B": pd.Timestamp("20130102"),
                 "C": pd.Series(1, index=list(range(4)), dtype="float32"),
                 "D": np.array([3] * 4, dtype="int32"),
                 "E": pd.Categorical(["test", "train", "test", "train"]),
                 "F": "foo",
             }
         )

In [12]: df

Out[12]:
     A          B    C  D      E    F
0  1.0 2013-01-02  1.0  3   test  foo
1  1.0 2013-01-02  1.0  3  train  foo
2  1.0 2013-01-02  1.0  3   test  foo
3  1.0 2013-01-02  1.0  3  train  foo

In [13]: # to get the number of rows and the number of columns of a df
         df.shape

Out[13]: (4, 6)

In [14]: # position 0 of the shape tuple is the number of rows
         df.shape[0]

Out[14]: 4

In [15]: # position 1 of the shape tuple is the number of columns
         df.shape[1]

Out[15]: 6
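
Because shape is an ordinary tuple, both counts can also be unpacked in a single statement. The sketch below uses a small stand-in DataFrame named `frame` (an assumption for this example, not the notebook's `df`) so that it runs on its own.

```python
import pandas as pd

# Hypothetical stand-in frame with 4 rows and 2 columns.
frame = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0], "F": ["foo"] * 4})

# Unpack (rows, columns) in one step.
n_rows, n_cols = frame.shape
print(n_rows, n_cols)

# len() gives the same counts: rows for the frame, columns for frame.columns.
print(len(frame), len(frame.columns))
```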

In [16]: # to get the different datatypes of different columns
         df.dtypes

Out[16]: A           float64
         B    datetime64[ns]
         C           float32
         D             int32
         E          category
         F            object
         dtype: object


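In general terms, an object type is a composite datatype that encapsulates a data structure along
with the functions and procedures needed to manipulate the data. In the df.dtypes output, the
object dtype reported for column F means that pandas stores the values as generic Python objects,
which is how text such as "foo" is usually held. Data object will have memory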
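
To make the object dtype more concrete, the sketch below builds a column that mixes an integer and a string. pandas cannot assign a single numeric dtype to it, so it falls back to object and keeps each value as its original Python type. The frame `mixed` and its column names are hypothetical; also note that newer pandas releases may report a dedicated string dtype, rather than object, for columns that contain only text.

```python
import pandas as pd

# Hypothetical frame: "anything" mixes an int and a str in one column.
mixed = pd.DataFrame({"name": ["foo", "bar"], "anything": [1, "two"]})

# The mixed column is reported as object.
print(mixed.dtypes)

# Each element keeps its own Python type inside the object column.
element_types = [type(v).__name__ for v in mixed["anything"]]
print(element_types)
```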

In [17]: # after uploading the dataset onto your anaconda 3 folder (or the folder
         # if the file you want to read is in .csv (comma separated) then use pd.r
         # Import an Excel file using the read_excel() function from the pandas l
         # Set a column index while reading your data into memory.

         data = pd.read_excel('B Example 1 - Data Encoding.xlsx')
         data

Out[17]:
   S.N  Country  Hours   Salary House
0    0   France   34.0  12000.0    No
1    1    Spain   37.0  49000.0   Yes
2    2  Germany   20.0  34000.0    No
3    3    Spain   58.0  41000.0    No
4    4  Germany   40.0  43333.3   Yes
5    5   France   45.0  28000.0   Yes
6    6    Spain   39.8  51000.0    No
7    7   France   28.0  89000.0   Yes
8    8  Germany   50.0  53000.0    No
9    9   France   47.0  33000.0   Yes
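
The cell's comments mention reading .csv files and setting a column as the index while loading. A minimal sketch of both ideas follows, assuming the first three rows of the table above are available as comma-separated text. It reads from an in-memory string so that it runs without the original file; with a real file you would pass its path instead. The names `csv_text` and `sample` are illustrative only.

```python
import io

import pandas as pd

# First three rows of the table above, written as comma-separated text.
csv_text = """S.N,Country,Hours,Salary,House
0,France,34.0,12000.0,No
1,Spain,37.0,49000.0,Yes
2,Germany,20.0,34000.0,No
"""

# index_col turns the S.N column into the row index instead of a data column.
sample = pd.read_csv(io.StringIO(csv_text), index_col="S.N")
print(sample)
print(sample.index.name)
```

pd.read_excel accepts an index_col argument as well, so the same idea carries over to Excel files.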

In [18]: data.dtypes

Out[18]: S.N          int64
         Country     object
         Hours      float64
         Salary     float64
         House       object
         dtype: object

In [19]: column_name = df.columns
         column_name

Out[19]: Index(['A', 'B', 'C', 'D', 'E', 'F'], dtype='object')


In [20]: # in bigger datasets, you may want to automatically check if data type 'o
         # note: column_name is the Index of column labels (from df.columns), so
         # this checks the dtype of the labels themselves, not of a column's values
         column_type = column_name.dtype
         if column_type == 'object':
             print('The column contains string data')
         else:
             print('The column does not contain string data')

The column contains string data
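
To check the values of each column rather than the column labels, one option is to loop over the columns and test each one's dtype. The sketch below does this with pd.api.types.is_string_dtype on a small stand-in frame; the frame and the names `frame` and `text_columns` are assumptions for this example.

```python
import pandas as pd

# Hypothetical stand-in frame modelled on the Country / Hours / House columns.
frame = pd.DataFrame(
    {
        "Country": ["France", "Spain"],
        "Hours": [34.0, 37.0],
        "House": ["No", "Yes"],
    }
)

# Collect the names of the columns whose values are strings.
text_columns = [
    name for name in frame.columns
    if pd.api.types.is_string_dtype(frame[name])
]
print(text_columns)
```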

In pandas, axis = 0 refers to the rows (the index) and axis = 1 refers to the columns.
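
The axis argument is easiest to see with an aggregation. In the sketch below, summing with axis=0 moves down the rows and yields one total per column, while axis=1 moves across the columns and yields one total per row. With drop, axis=1 means the label names a column. The frame `scores` is a hypothetical example.

```python
import pandas as pd

scores = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})

# axis=0: collapse the rows, leaving one value per column.
col_totals = scores.sum(axis=0)

# axis=1: collapse the columns, leaving one value per row.
row_totals = scores.sum(axis=1)

# axis=1 with drop: the label "y" is looked up among the columns.
trimmed = scores.drop("y", axis=1)

print(col_totals.to_dict(), row_totals.tolist(), list(trimmed.columns))
```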

In [21]: d = {
             "A": 1.0,
             "B": pd.Timestamp("20130102"),
             "C": pd.Series(1, index=list(range(4)), dtype="float32"),
             "D": np.array([3] * 4, dtype="int32"),
             "E": pd.Categorical(["test", "train", "test", "train"]),
             "F": "foo",
         }
         d

Out[21]: {'A': 1.0,
          'B': Timestamp('2013-01-02 00:00:00'),
          'C': 0    1.0
          1    1.0
          2    1.0
          3    1.0
          dtype: float32,
          'D': array([3, 3, 3, 3]),
          'E': ['test', 'train', 'test', 'train']
          Categories (2, object): ['test', 'train'],
          'F': 'foo'}


In [22]: pd.DataFrame(data=d)

Out[22]:
     A          B    C  D      E    F
0  1.0 2013-01-02  1.0  3   test  foo
1  1.0 2013-01-02  1.0  3  train  foo
2  1.0 2013-01-02  1.0  3   test  foo
3  1.0 2013-01-02  1.0  3  train  foo

In [23]: df.A

Out[23]: 0    1.0
         1    1.0
         2    1.0
         3    1.0
         Name: A, dtype: float64

In [24]: df.B

Out[24]: 0   2013-01-02
         1   2013-01-02
         2   2013-01-02
         3   2013-01-02
         Name: B, dtype: datetime64[ns]

In [25]: df[A]

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_84444\4086650841.py in <module>
----> 1 df [A]

NameError: name 'A' is not defined

In [26]: df["A"]

Out[26]: 0    1.0
         1    1.0
         2    1.0
         3    1.0
         Name: A, dtype: float64
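
The NameError above occurs because A without quotes is looked up as a Python variable, whereas the column label is the string "A". Attribute access (df.A) and bracket access (df["A"]) return the same column, but only brackets work when a label is not a valid Python identifier, such as the S.N column of the Excel data. The sketch below illustrates this on a stand-in frame; the frame itself and the column name "Hours worked" are hypothetical.

```python
import pandas as pd

frame = pd.DataFrame(
    {
        "A": [1.0, 1.0],
        "S.N": [0, 1],                  # label contains a dot
        "Hours worked": [34.0, 37.0],   # hypothetical label containing a space
    }
)

# Attribute and bracket access select the same column.
same = frame.A.equals(frame["A"])

# Labels with dots or spaces can only be reached with brackets and quotes.
serial = frame["S.N"]
hours = frame["Hours worked"]

print(same, serial.tolist(), hours.tolist())
```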

In [27]: df.columns

Out[27]: Index(['A', 'B', 'C', 'D', 'E', 'F'], dtype='object')


In [28]: df.index

Out[28]: Int64Index([0, 1, 2, 3], dtype='int64')

In [29]: import pandas as pd
         data = pd.read_excel('B Example 1 - Data Encoding.xlsx')
         data

         data['Country']

Out[29]: 0     France
         1      Spain
         2    Germany
         3      Spain
         4    Germany
         5     France
         6      Spain
         7     France
         8    Germany
         9     France
         Name: Country, dtype: object

In [30]: type(data['Country'])

Out[30]: pandas.core.series.Series

In [31]: data[['Country']]

Out[31]:
   Country
0   France
1    Spain
2  Germany
3    Spain
4  Germany
5   France
6    Spain
7   France
8  Germany
9   France

In [32]: type(data[['Country']])

Out[32]: pandas.core.frame.DataFrame
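
The difference between the last two results comes from what is passed inside the brackets: a single label returns a one-dimensional Series, while a list of labels (even a list with one entry) returns a two-dimensional DataFrame. The sketch below repeats the comparison on a small stand-in frame so it can run without the Excel file; the frame and variable names are assumptions for this example.

```python
import pandas as pd

# Hypothetical stand-in for the Excel data, limited to two rows.
frame = pd.DataFrame(
    {
        "Country": ["France", "Spain"],
        "Hours": [34.0, 37.0],
        "Salary": [12000.0, 49000.0],
    }
)

one_col_series = frame["Country"]          # single label: Series
one_col_frame = frame[["Country"]]         # list with one label: DataFrame
two_cols = frame[["Country", "Salary"]]    # list of labels: DataFrame

print(type(one_col_series).__name__, one_col_series.shape)
print(type(one_col_frame).__name__, one_col_frame.shape)
print(type(two_cols).__name__, two_cols.shape)
```

The list form is also the usual way to select several columns at once, as in the last line.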
