Lecture 2 - data wrangling_update (2)

The lecture focuses on data wrangling in data science, emphasizing the use of tabular data and the pandas library for data manipulation. Key concepts include DataFrames, Series, and methods for extracting, adding, modifying, and arranging data. The lecture also introduces the five essential verbs for data wrangling: filter, select, mutate, arrange, and summarise.


Lecture 2 : Data wrangling

2025 Spring Intro to Data Science


Congratulations!!!
You have collected or have been given
a box of data.

What does this "data" actually look like?


How will you work with it?
Data Scientists Love Tabular Data

• Tabular data = data in a table.

• Typically:
• A row represents one observation (here, a single person running for president in a particular year).
• A column represents some characteristic, or feature, of that observation (here, the political party of
that person).
Standard Python Data Science Tool: pandas

Using pandas, we can:

• Arrange data in a tabular format.


• Extract useful information filtered by specific conditions.
• Operate on data to gain new insights.
• Apply NumPy functions to our data.

Stands for "panel data"


DataFrames

• In the "language" of pandas, we call a table a DataFrame


• We think of DataFrames as collections of named columns, called Series.
Series - Custom Index

• We can provide index labels for items in a Series by passing an index list.

• A Series index can also be changed.
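As a small sketch of both points (the values here are made up for illustration):

```python
import pandas as pd

# Build a Series with a custom index by passing an index list
s = pd.Series([-1, 10, 2], index=["a", "b", "c"])

# The index can also be changed after creation
s.index = ["first", "second", "third"]
```

After the reassignment, `s["second"]` still refers to the value 10.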


Selection in Series

• We can select a single value or a set of values in a Series using:


• A single label
• A list of labels
• A filtering condition
Selection in Series

• Say we want to select values in the Series that satisfy a particular condition:
1. Apply a boolean condition to the Series. This creates a new Series of boolean values.
2. Index into our Series using this boolean condition. pandas will select only the entries in the
Series that satisfy the condition.
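The two steps above can be sketched on a toy Series (values made up):

```python
import pandas as pd

s = pd.Series([4, -2, 0, 6], index=["a", "b", "c", "d"])

# Step 1: a boolean condition creates a new Series of boolean values
mask = s > 0

# Step 2: indexing with that boolean Series keeps only the True entries
positive = s[mask]
```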
DataFrames of Series!

• Typically, we will work with Series using the perspective that they are columns in a DataFrame.

• We can think of a DataFrame as a collection of Series that all share the same Index.
Creating a DataFrame

• The syntax of creating DataFrame is:


pandas.DataFrame(data, index, columns)

• Many approaches exist for creating a DataFrame. Here, we will go over the
most popular ones.
• From a CSV file.
• Using a list and column name(s).
• From a dictionary.
• From a Series.
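A minimal sketch of these approaches (toy values; the CSV line is shown as a comment since it assumes a file such as the lecture's elections.csv exists on disk):

```python
import pandas as pd

# From a CSV file (assumes the file exists):
# elections = pd.read_csv("elections.csv")

# Using a list and column name(s)
df_list = pd.DataFrame([1, 2, 3], columns=["Number"])

# From a dictionary
df_dict = pd.DataFrame({"Fruit": ["Strawberry", "Orange"], "Price": [5.49, 3.99]})

# From a Series
s = pd.Series(["a", "b"], index=["r1", "r2"])
df_series = s.to_frame(name="Letter")
```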
Indices Are Not Necessarily Row Numbers

An Index (a.k.a. row labels) can also:

• Be non-numeric.

• Have a name, e.g. "Candidate".


Indices Are Not Necessarily Unique

The row labels that constitute an index do not have to be unique.


• Left: The index values are all unique and numeric, acting as a row number.
• Right: The index values are named and non-unique.
Modifying Indices

• We can select a new column and set it as the index of the DataFrame.

• Example: Setting the index to the "Party" column.


Resetting the Index

• We can change our mind and reset the Index back to the default list of integers.
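Setting and resetting the index can be sketched on a toy stand-in for the lecture's elections table (values made up):

```python
import pandas as pd

elections = pd.DataFrame({
    "Candidate": ["Reagan", "Carter", "Anderson"],
    "Party": ["Republican", "Democratic", "Independent"],
})

# Set the "Party" column as the index...
by_party = elections.set_index("Party")

# ...and reset back to the default list of integers
restored = by_party.reset_index()
```

Note that `reset_index` turns the old index back into a regular column.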
Column Names Are Usually Unique!

• Column names in pandas are almost always unique.


• Example: We really shouldn't have two columns named "Candidate".
Retrieving the Index, Columns, and shape

• Sometimes you'll want to extract the list of row and column labels.

• For row labels, use DataFrame.index:

• For column labels, use DataFrame.columns:

• For the shape of the DataFrame, use DataFrame.shape:
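On a small made-up DataFrame, the three attributes look like this:

```python
import pandas as pd

df = pd.DataFrame({"Candidate": ["Biden", "Trump"], "Year": [2020, 2020]})

rows = df.index      # the row labels (here a default RangeIndex)
cols = df.columns    # the column labels
shape = df.shape     # (number of rows, number of columns)
```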


The Relationship Between DataFrames, Series, and Indices

• We can think of a DataFrame as a collection of Series that all share the


same Index.
• Candidate, Party, %, Year, and Result Series all share an Index from 0 to
5.
Data Science Lifecycle
Recall the Data Science Lifecycle

Data wrangling / visualization


Intro to Data Wrangling

• Raw data is often messy, incomplete, and inconsistent

• Data wrangling is the process of cleaning, transforming, and organizing data into a
structured and usable format for analysis.

• The purpose is to ensure that data is accurate, consistent, and ready for modeling and
visualization.
Intro to data wrangling

• Hadley Wickham says that these five verbs help solve 90% of data wrangling challenges:
• filter: select rows (observations) in a data frame;
• select: select columns (variables) in a data frame;
• mutate: add new columns to a data frame;
• arrange: reorder rows in a data frame;
• summarise: collapses a data frame to a single row;

• We will cover essential tools for implementing these verbs using pandas.

* Grammar of Data Manipulation with R (2020)


1. Extracting Data (filter, select)

• One of the most basic tasks for manipulating a DataFrame is to extract rows and columns of
interest.
1. Extracting Data (filter, select)

• Common ways we may want to extract data:


• Grab the first or last k rows in the DataFrame.
• Grab data with a certain label.
• Grab data at a certain position.

• We'll find that all three of these methods are useful to us in data
manipulation tasks.
.head and .tail

• The simplest scenarios: We want to extract the first or last n rows from the
DataFrame.
• df.head(k) will return the first k rows of the DataFrame df.
• df.tail(k) will return the last k rows.
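A quick sketch on a toy DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"x": range(10)})

first3 = df.head(3)   # the first 3 rows: 0, 1, 2
last2 = df.tail(2)    # the last 2 rows: 8, 9
```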
Label-based Extraction: .loc

• A more complex task: We want to extract data with specific column or index labels.

• The .loc accessor allows us to specify the labels of rows and columns we wish to extract.

• We describe "labels" as the bolded text at the top and left of a DataFrame.
Label-based Extraction: .loc

• Arguments to .loc can be:


• A list.
• A slice (syntax is inclusive of the right hand side of the slice).
• A single value.

A series

A single string
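These argument types can be sketched on a toy stand-in for the lecture's elections table (values made up):

```python
import pandas as pd

elections = pd.DataFrame({
    "Year": [1980, 1984, 1988, 1992],
    "Candidate": ["Reagan", "Reagan", "Bush", "Clinton"],
    "Result": ["win", "win", "win", "win"],
})

# A list of row labels and a list of column labels -> DataFrame
sub = elections.loc[[0, 2], ["Year", "Candidate"]]

# A slice (inclusive of the right-hand side!) -> DataFrame
sliced = elections.loc[0:2, "Year":"Candidate"]

# A single row label and a single column label -> a single value
value = elections.loc[3, "Candidate"]

# All rows of a single column label -> a Series
col = elections.loc[:, "Candidate"]
```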
Integer-based Extraction: .iloc

• Arguments to .iloc can be:


• A list.
• A slice (syntax is exclusive of the right hand side of the slice).
• A single value.
Integer-based Extraction: .iloc

• Arguments to .iloc can be:


• A list.
• A slice (syntax is exclusive of the right hand side of the slice).
• A single value.

elections.iloc[[1, 2, 3], [0, 1, 2]]


Integer-based Extraction: .iloc

• Arguments to .iloc can be:


• A list.
• A slice (syntax is exclusive of the right hand side of the slice).
• A single value.

elections.iloc[[1, 2, 3], 0:3]


Integer-based Extraction: .iloc

• Arguments to .iloc can be:


• A list.
• A slice (syntax is exclusive of the right hand side of the slice).
• A single value.

elections.iloc[:, 0:3]
Integer-based Extraction: .iloc

• Arguments to .iloc can be:


• A list.
• A slice (syntax is exclusive of the right hand side of the slice).
• A single value.

elections.iloc[[1, 2, 3], 1]

elections.iloc[0, 1]
.loc vs .iloc

• Remember:
• .loc performs label-based extraction
• .iloc performs integer-based extraction

• When choosing between .loc and .iloc, you'll usually choose .loc.
• Safer: If the order of data gets shuffled in a public database, your code still works.
• Readable: Easier to understand what elections.loc[:, ["Year", "Candidate", "Result"]]
means than elections.iloc[:, [0, 1, 4]]

• .iloc can still be useful.


• Example: If you have a DataFrame of movie earnings sorted by earnings, you can use .iloc to get the
median earnings for a given year (index into the middle).
Context-dependent Extraction: [ ]

• Selection operators:
• .loc selects items by label. First argument is rows, second argument is columns.
• .iloc selects items by integer. First argument is rows, second argument is columns.
• [] only takes one argument, which may be:
• A slice of row numbers.
• A list of column labels.
• A single column label.

• That is, [] is context sensitive.


Context-dependent Extraction: [ ]

• [ ] only takes one argument, which may be:


• A slice of row integers.
• A list of column labels.
• A single column label.

elections[3:7]
Context-dependent Extraction: [ ]

• [ ] only takes one argument, which may be:


• A slice of row integers.
• A list of column labels.
• A single column label.

elections[["Year", "Candidate", "Result"]]


Context-dependent Extraction: [ ]

• [ ] only takes one argument, which may be:


• A slice of row integers.
• A list of column labels.
• A single column label.

elections["Candidate"]
Why Use []?

• In short: [ ] can be much more concise than .loc or .iloc

• Consider the case where we wish to extract the "Candidate" column. It is far simpler to write
elections["Candidate"] than it is to write elections.loc[:, "Candidate"]

• In practice, [ ] is often used over .iloc and .loc in data science work. Typing time adds up!
Boolean Array Input for .loc and [ ]

• We learned to extract data according to its integer position (.iloc) or its label (.loc)
• What if we want to extract rows that satisfy a given condition?
• .loc and [ ] also accept boolean arrays as input.
• Rows corresponding to True are extracted; rows corresponding to False are not.

babynames_first_10_rows = babynames.loc[:9, :]
Boolean Array Input for .loc and [ ]

We can perform the same operation using .loc.

babynames_first_10_rows.loc[[True, False, True, False, True, False, True, False, True, False], :]
Boolean Array Input

Useful because boolean arrays can be generated by using logical operators on Series.
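A sketch on a toy stand-in for the lecture's babynames table (values made up):

```python
import pandas as pd

babynames = pd.DataFrame({
    "Name": ["Emma", "Liam", "Olivia"],
    "Sex": ["F", "M", "F"],
    "Count": [500, 450, 480],
})

# A logical operator on a Series produces a boolean Series...
is_female = babynames["Sex"] == "F"

# ...which can be passed to [ ] (or .loc) to keep the matching rows
female_rows = babynames[is_female]
```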
Boolean Array Input

Can also use .loc.


Boolean Array Input

• Boolean Series can be combined using various operators, allowing filtering of results by
multiple criteria.
• The & operator keeps rows where condition_1 and condition_2 both hold.
• The | operator keeps rows where condition_1 or condition_2 holds.

babynames[(babynames["Sex"] == "F") | (babynames["Year"] < 2000)]


Bitwise Operators

• & and | are examples of bitwise operators. They allow us to apply multiple logical conditions.

• If p and q are boolean arrays or Series:


• Boolean array selection is a useful tool, but can lead to overly verbose code for complex conditions.
babynames[(babynames["Name"] == "Bella") | (babynames["Name"] == "Alex") |
(babynames["Name"] == "Narges") | (babynames["Name"] == "Lisa")]

• pandas provides many alternatives, for example:


• .isin
names = ["Bella", "Alex", "Narges", "Lisa"]
babynames[babynames["Name"].isin(names)]
• .str.startswith

babynames[babynames["Name"].str.startswith("N")]
2. Adding Data

• Remember the five verbs


• filter: select rows (observations) in a data frame;
• select: select columns (variables) in a data frame;
• mutate: add new columns to a data frame;
• arrange: reorder rows in a data frame;
• summarise: collapses a data frame to a single row;
Syntax for Adding a Column

• Adding a column is easy:


1. Use [ ] to reference the desired new column.
2. Assign this column to a Series or array of the appropriate length.
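The two steps can be sketched on a toy babynames table (values made up); here the new column is derived from an existing one:

```python
import pandas as pd

babynames = pd.DataFrame({"Name": ["Emma", "Liam"], "Count": [500, 450]})

# 1. Use [ ] to reference the desired new column.
# 2. Assign it a Series/array of the appropriate length.
babynames["name_lengths"] = babynames["Name"].str.len()
```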
Syntax for Modifying a Column

• Modifying a column is very similar to adding a column.


1. Use [ ] to reference the existing column.
2. Assign this column to a new Series or array of the appropriate length.

# Modify the "name_lengths" column to be one less than its original value
babynames["name_lengths"] = babynames["name_lengths"]-1
Syntax for Renaming a Column

• Rename a column using the (creatively named) .rename() method.

• .rename() takes in a dictionary that maps old column names to their new ones.
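A minimal sketch (toy data):

```python
import pandas as pd

babynames = pd.DataFrame({"Name": ["Emma"], "name_lengths": [4]})

# .rename takes a dictionary mapping old column names to new ones.
# Like most pandas methods, it returns a modified copy.
babynames = babynames.rename(columns={"name_lengths": "Length"})
```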
Syntax for Dropping a Column (or Row)

• Remove columns using the (also creatively named) .drop() method.

• The .drop() method assumes you're dropping a row by default. Use axis="columns" to drop a column
instead.

babynames = babynames.drop("Length", axis="columns")


An Important Note: DataFrame Copies
• Notice that we re-assigned babynames to an updated value on the previous slide.
babynames = babynames.drop("Length", axis="columns")
• By default, pandas methods create a copy of the DataFrame, without changing the original
DataFrame at all.

• To apply our changes, we must update our DataFrame to this new, modified copy.
3. Arrange rows

• Remember the five verbs


• filter: select rows (observations) in a data frame;
• select: select columns (variables) in a data frame;
• mutate: add new columns to a data frame;
• arrange: reorder rows in a data frame;
• summarise: collapses a data frame to a single row;
.sort_values()
• The DataFrame.sort_values and Series.sort_values methods sort a DataFrame (or Series).

• Series.sort_values( ) will automatically sort all values in the Series.


• DataFrame.sort_values(column_name) must specify the name of the column to be used for
sorting.

babynames["Name"].sort_values()
babynames.sort_values(by="Count", ascending=False)


Sorting By Length

• Suppose we want to sort names by their length. Let's try a few different approaches.


Approach 1: Create a Temporary Column and Sort Based on the New Column

• Add a column holding each name's length, then sort the DataFrame by it as usual.


Approach 2: Sorting Using the key Argument

Approach 3: Use Series.map to Create a Column for Sorting

• Suppose we want to sort by the number of occurrences of "dr"s and "ea"s.

• Use the Series.map method.

4. Summarize data

• Remember the five verbs


• filter: select rows (observations) in a data frame;
• select: select columns (variables) in a data frame;
• mutate: add new columns to a data frame;
• arrange: reorder rows in a data frame;
• summarise: collapses a data frame to a single row;
Why Group?

Our goal:

• Group together rows that fall under the same category.


• For example, group together all rows from the same year.
• Perform an operation that aggregates across all rows in the category.
• For example, sum up the total number of babies born in that year.
.groupby()

• A .groupby() operation involves some combination of splitting the object, applying a


function, and combining the results.
• Calling .groupby() generates DataFrameGroupBy objects → "mini" sub-DataFrames
• Each subframe contains all rows that correspond to the same group (here, a particular year)
.groupby().agg()

• We cannot work directly with DataFrameGroupBy objects! The diagram below is to help understand what goes on
conceptually – in reality, we can't "see" the result of calling .groupby.
• Instead, we transform a DataFrameGroupBy object back into a DataFrame using .agg
• .agg is how we apply an aggregation operation to the data.
Putting it all together

dataframe.groupby(column_name).agg(aggregation_function)
• babynames[["Year", "Count"]].groupby("Year").agg(sum) computes the total number of
babies born in each year.
Alternatives …

• Now, we create groups for each year.

babynames.groupby("Year")[["Count"]].agg(sum)
or
babynames.groupby("Year")[["Count"]].sum()
or
babynames.groupby("Year").sum(numeric_only=True)
Concluding groupby.agg

• A groupby operation involves some combination of splitting the object, applying a function, and combining the
results.

• So far, we've seen that df.groupby("Year").agg(sum):


• Split df into sub-DataFrames based on Year.
• Apply the sum function to each column of each sub-DataFrame.
• Combine the results of sum into a single DataFrame, indexed by Year.
Groupby Review Question
Answer
Case Study: Name "Popularity"

• Goal: Find the baby name with sex "F" that has fallen in popularity the most in California.

f_babynames = babynames[babynames["Sex"]=="F"]
f_babynames = f_babynames.sort_values(["Year"])
jenn_counts_series = f_babynames[f_babynames["Name"] == "Jennifer"]["Count"]

Number of Jennifers Born in California Per Year.


What Is "Popularity"?

Goal: Find the baby name with sex "F" that has fallen in popularity the most in California.

How do we define "fallen in popularity?"

• Let’s create a metric: "Ratio to Peak" (RTP).


• The RTP is the ratio of babies born with a given name in 2022 to the maximum number of babies born
with that name in any year.

Example for "Jennifer":

• In 1972, we hit peak Jennifer. 6,065 Jennifers were born.


• In 2022, there were only 114 Jennifers.
• RTP is 114 / 6065 = 0.018796372629843364.
Calculating RTP

max_jenn = max(f_babynames[f_babynames["Name"] == "Jennifer"]["Count"])
# 6065

curr_jenn = f_babynames[f_babynames["Name"] == "Jennifer"]["Count"].iloc[-1]
# 114
# Remember: f_babynames is sorted by year, so .iloc[-1] means "grab the latest year"

rtp = curr_jenn / max_jenn
# 0.018796372629843364

def ratio_to_peak(series):
    return series.iloc[-1] / max(series)

jenn_counts_ser = f_babynames[f_babynames["Name"] == "Jennifer"]["Count"]
ratio_to_peak(jenn_counts_ser)
# 0.018796372629843364
Calculating RTP Using .groupby()

• .groupby() makes it easy to compute the RTP for all names at once!

rtp_table = f_babynames.groupby("Name")[["Count"]].agg(ratio_to_peak)
Renaming Columns After Grouping

• By default, .groupby will not rename any aggregated columns (the column is still named "Count", even
though it now represents the RTP).

• For better readability, we may wish to rename "Count" to "Count RTP".

rtp_table = f_babynames.groupby("Name")[["Count"]].agg(ratio_to_peak)
rtp_table = rtp_table.rename(columns={"Count":"Count RTP"})
Some Data Science Payoff

• By sorting rtp_table we can see the names whose popularity has decreased the most.

rtp_table.sort_values("Count RTP")
Some Data Science Payoff

• We can get the list of the top 10 names and then plot popularity with:

top10 = rtp_table.sort_values("Count RTP").head(10).index


Raw GroupBy Objects and Other Methods

•The result of a groupby operation applied to a DataFrame is a DataFrameGroupBy object.

•It is not a DataFrame!


grouped_by_year = elections.groupby("Year")
type(grouped_by_year)

• Given a DataFrameGroupBy object, we can use various functions to generate DataFrames (or Series). .agg is
only one choice:
groupby.size() and groupby.count()
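A sketch of both methods on a toy elections table (values made up):

```python
import pandas as pd

elections = pd.DataFrame({
    "Year": [1980, 1980, 1984],
    "Candidate": ["Reagan", "Carter", "Reagan"],
    "%": [50.7, 41.0, 58.8],
})

grouped = elections.groupby("Year")  # a DataFrameGroupBy object, not a DataFrame

sizes = grouped.size()    # number of rows per group -> a Series indexed by Year
counts = grouped.count()  # number of non-null entries per column, per group
```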
Filtering by Group

• Another common use for groups is to filter data.

• groupby.filter takes an argument func.


• func is a function that:
• Takes a DataFrame as input.
• Returns either True or False.
• filter applies func to each group/sub-DataFrame:
• If func returns True for a group, then all rows belonging to the group are preserved.
• If func returns False for a group, then all rows belonging to that group are filtered out.
• Notes:
• Filtering is done per group, not per row. Different from boolean filtering.
• Unlike agg(), the column we grouped on does NOT become the index!
groupby.filter()
Filtering Elections Dataset

• Going back to the elections dataset.

• Let's keep only election year results where the max '%' is less than 45%.
elections.groupby("Year").filter(lambda sf: sf["%"].max() < 45)
groupby Quiz

• We want to know the best election by each party.


• Best election: The election with the highest % of votes.
• For example, Democrat’s best election was in 1964, with candidate Lyndon Johnson
winning 61.3% of votes.
Attempt #1

• Why does the table seem to claim that Woodrow Wilson won the presidency in 2020?

elections.groupby("Party").max().head(10)
Problem with Attempt #1

• Why does the table seem to claim that Woodrow Wilson won the presidency in 2020?
• Every column is calculated independently! Among Democrats:
• Last year they ran: 2020.
• Alphabetically the latest candidate name: Woodrow Wilson.
• Highest % of vote: 61.34%.
Attempt #2: Motivation

• We want to preserve entire rows, so we need an aggregate function that does that.
Attempt #2: Solution

• First sort the DataFrame so that rows are in descending order of %.

• Then group by Party and take the first item of each sub-DataFrame.
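The two steps can be sketched on a toy elections table (values made up); `.first()` keeps each group's first entire row:

```python
import pandas as pd

elections = pd.DataFrame({
    "Party": ["Democratic", "Democratic", "Republican"],
    "Candidate": ["Johnson", "Carter", "Reagan"],
    "%": [61.34, 50.08, 58.77],
})

# Sort so each party's best election comes first, then take
# the first row of each sub-DataFrame
best = (elections.sort_values("%", ascending=False)
                 .groupby("Party")
                 .first())
```

Because whole rows are preserved, the candidate and the % in each result row come from the same election.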
Grouping by Multiple Columns

• Suppose we want to build a table showing the total number of babies born of each sex in each year.

• One way is to groupby using both columns of interest:

babynames.groupby(["Year", "Sex"])[["Count"]].agg(sum).head(6)
Pivot function

• pivot() reshapes a DataFrame, but it cannot aggregate. When a group may contain multiple entries,
use pivot_table() with an aggfunc:

babynames_pivot = babynames.pivot_table(
    index = "Year",       # rows (turned into index)
    columns = "Sex",      # column values
    values = ["Count"],   # field(s) to aggregate in each group
    aggfunc = "sum",      # group operation
)
babynames_pivot.head(6)
groupby(["Year", "Sex"]) vs. pivot

• The pivot() output more naturally represents our data.


Pivot output

babynames.groupby(["Year", "Sex"])[["Count"]].agg(sum).head(6)
Pivot Tables with Multiple Values

• pivot_table() works like pivot(), but it can aggregate and can take multiple columns as values.

babynames_pivot = babynames.pivot_table(
    index = "Year",       # rows (turned into index)
    columns = "Sex",      # column values
    values = ["Count", "Name"],
    aggfunc = "max",      # group operation
)
babynames_pivot.head(6)
Pivot Table Mechanics
Where are we?

• 5 verbs for data manipulation

• Joining tables

• Tidy data
Joining Tables

• Suppose we want to know the popularity of presidential candidates' names in 2022.


• Example: Dwight Eisenhower's name Dwight is not popular today, with only 5 babies born
with this name in California in 2022.

• To solve this problem, we’ll have to join tables.


Creating Table 1: Babynames in 2022

• Let's set aside names in California from 2022 first:

babynames_2022 = babynames[babynames["Year"] == 2022]


babynames_2022
Creating Table 2: Presidents with First Names

• To join our table, we'll also need to set aside the first names of each candidate.

elections["First Name"] = elections["Candidate"].str.split().str[0]
Joining Our Tables: Two Options

merged = pd.merge(left = elections, right = babynames_2022, left_on = "First Name", right_on = "Name")
merged = elections.merge(right = babynames_2022, left_on = "First Name", right_on = "Name")
Tidy data

• You can represent the same underlying data in multiple ways.


• Let's look at an example table that records: country, year, population, and number of
documented cases of TB (tuberculosis).
• Which one is the best representation?
Another example

<Table 1>

<Table 2>
Tidy data

• Tidy data is data where:


• 1. Each variable is in a column.
• 2. Each observation is a row.
• 3. Each value is a cell.
Tidy data
Tidy data –cont.

• Why tidy data?


• 1. You can easily add observations
• 2. You can easily add columns
• 3. Many python packages are designed with tidy data in mind
Two verbs for tidy data

<pivot>

• It makes "long" data wider.
• Useful when observations are spread over multiple rows.

<melt>

• It makes "wide" data longer.
• Useful when multiple observations are stored in one row.
melt()
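Both verbs can be sketched on a small TB-style table (the country/year counts here are illustrative, patterned after the tidyr TB example):

```python
import pandas as pd

# "Wide" data: one row per country, one column per year
wide = pd.DataFrame({
    "country": ["Afghanistan", "Brazil"],
    "1999": [745, 37737],
    "2000": [2666, 80488],
})

# melt makes wide data longer: the year headers become values in a column
long = wide.melt(id_vars="country", var_name="year", value_name="cases")

# pivot makes long data wider again
back = long.pivot(index="country", columns="year", values="cases")
```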