E2v Excel to Python Cheat Sheet 1

Uploaded by

mkarthick460

EXCEL TO PYTHON

Cheat Sheet
The Evolution from
Excel to Python

FUTURE-PROOF YOUR CAREER MASTER DATA SKILLS UPSKILL IN AI


Excel, with its user-friendly interface, has been the
cornerstone of data analysis for years. Yet, as data
grows in size and complexity, its limitations become
evident. Python, armed with libraries like Pandas, offers
advanced data capabilities, bridging the gap.
Transitioning to Python is not about replacement, but
about adapting to a data-rich era.



PANDAS

What is Pandas?
Pandas is a powerful open-source data analysis and manipulation library
for Python. It provides data structures and functions needed to work with
structured data seamlessly.
Think of it as Excel, but in the form of a programming library.



PANDAS
DATAFRAMES VS. EXCEL SHEETS

Excel Sheets
In Excel, you work with sheets, which are essentially tables of data with rows
and columns.
You can apply formulas, create charts, and use tools like PivotTables.

Pandas DataFrames
A DataFrame is the primary data structure in Pandas, similar to an Excel
sheet.
It's a two-dimensional table with labeled axes (rows and columns).
You can perform operations on DataFrames using Python code, offering
more flexibility and automation than Excel.
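
As a minimal sketch of the idea above, the snippet below builds a small DataFrame and adds a calculated column, much like filling a formula down a column in Excel. The column names and values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sales data; each dict key becomes a column, like a header row in Excel.
df = pd.DataFrame(
    {
        "product": ["Pen", "Notebook", "Stapler"],
        "units": [120, 45, 10],
        "unit_price": [1.5, 4.0, 12.0],
    }
)

# A calculated column, comparable to filling a formula down a column.
df["revenue"] = df["units"] * df["unit_price"]

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column labels
print(df)
```

Because the calculation is written once as code, rerunning the script on new data repeats it without any manual steps.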



PANDAS
SERIES VS. EXCEL COLUMNS

Excel Columns
A column in Excel represents a series of data. You can apply formulas to
these columns.

Pandas Series
A Series in Pandas is a one-dimensional labeled array.
It's similar to a column in Excel but can be manipulated using Python
functions.
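
A short sketch, using made-up product prices, of how a Series behaves like a labeled Excel column: arithmetic applies to every value at once, and items can be looked up by label.

```python
import pandas as pd

# Hypothetical prices; the index plays the role of row labels.
prices = pd.Series(
    [1.5, 4.0, 12.0],
    index=["Pen", "Notebook", "Stapler"],
    name="unit_price",
)

# One expression updates every value, like a formula filled down a column.
discounted = prices * 0.9

# Look up a single value by its label.
stapler_price = prices["Stapler"]

# Selecting one column of a DataFrame also yields a Series.
df = pd.DataFrame({"unit_price": prices})
column = df["unit_price"]
print(type(column).__name__)
```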



PANDAS
DATA MANIPULATION: EXCEL VS. PANDAS

Excel
Uses functions and formulas (e.g., VLOOKUP, SUM, AVERAGE).
Provides a graphical interface for tasks like filtering, sorting, and formatting.

Pandas
Uses methods and functions (e.g., `merge()`, `sum()`, `mean()`).
Offers more advanced and automated data manipulation capabilities
through code.



PANDAS
VISUALIZATION AND ANALYSIS

Excel
Provides tools like charts, graphs, and PivotTables for data visualization and
analysis.

Pandas
Can integrate with visualization libraries like Matplotlib and Seaborn.
Offers more customization and advanced analysis capabilities.



PANDAS
WHY TRANSITION TO PANDAS?
Scalability: Pandas can handle larger datasets more efficiently than Excel.

Automation: Repetitive tasks can be automated using Python scripts.

Integration: Pandas can integrate with other Python libraries and tools for
more advanced analytics, machine learning, and data visualization.



LARGE DATASETS

What are Large Datasets?


Datasets that are too large to be easily managed, processed, or analyzed
with traditional tools like Excel.
Often referred to as "Big Data" in certain contexts.



LARGE DATASETS
LIMITATIONS OF EXCEL

Size Limit
Excel has a limit of 1,048,576 rows per worksheet. Datasets exceeding this
cannot be loaded into a single sheet.

Performance Issues
As datasets grow, Excel can become slow, unresponsive, or even crash.

Lack of Advanced Analysis Tools


While Excel is powerful, it lacks advanced data analysis, manipulation, and
machine learning tools.



LARGE DATASETS
ADVANTAGES OF PYTHON FOR LARGE DATASETS

Scalability
Python, especially with libraries like Pandas and Dask, can handle much
larger datasets efficiently.

Flexibility
Python offers a wide range of libraries and tools for data processing, analysis,
visualization, and machine learning.

Automation
Repetitive and complex tasks can be automated using Python scripts,
making data processing more efficient.
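
One common way to work with a file that is too large to load at once is to read it in chunks. The sketch below assumes a CSV with hypothetical `region` and `amount` columns and uses an in-memory string as a stand-in for a large file on disk. Dask is not shown here.

```python
import io

import pandas as pd

# Stand-in for a large CSV file; a real file path can be passed to read_csv instead.
rows = [f"{'North' if i % 2 == 0 else 'South'},{i}" for i in range(1, 11)]
csv_data = io.StringIO("region,amount\n" + "\n".join(rows))

total = 0
rows_seen = 0

# With chunksize set, read_csv yields DataFrames of up to 4 rows each
# instead of loading the whole file into memory.
for chunk in pd.read_csv(csv_data, chunksize=4):
    total += chunk["amount"].sum()
    rows_seen += len(chunk)

print(rows_seen, total)
```

In practice the chunk size would be much larger (tens or hundreds of thousands of rows); 4 is used only to keep the example small.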



LARGE DATASETS
DATA STORAGE AND RETRIEVAL

Excel
Limited to file-based storage, which can be inefficient for very large datasets.

Python
Can integrate with databases (e.g., SQL, NoSQL) and cloud storage solutions,
allowing for efficient data storage and retrieval.
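
As an illustration of database integration, the sketch below writes a DataFrame to an in-memory SQLite database (SQLite ships with Python) and reads a filtered subset back with SQL. The `orders` table and its columns are hypothetical.

```python
import sqlite3

import pandas as pd

# In-memory database used only for this example; a file path would persist the data.
conn = sqlite3.connect(":memory:")

orders = pd.DataFrame({"order_id": [1, 2, 3], "amount": [250.0, 80.0, 40.0]})

# Store the DataFrame as a SQL table.
orders.to_sql("orders", conn, index=False)

# Let the database do the filtering, then load only the matching rows.
large_orders = pd.read_sql(
    "SELECT order_id, amount FROM orders WHERE amount > ?",
    conn,
    params=(50,),
)
conn.close()

print(large_orders)
```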



LARGE DATASETS
ADVANCED DATA ANALYSIS AND MACHINE LEARNING

Excel
Limited to basic statistical tools and data analysis functions.

Python
Offers libraries like Scikit-learn for machine learning, TensorFlow for deep
learning, and Statsmodels for advanced statistical modeling.



LARGE DATASETS
COLLABORATION AND REPRODUCIBILITY

Excel
Collaborating on large Excel files can be challenging. Reproducing analyses
can also be difficult due to manual steps.

Python
Supports version control (e.g., Git), making collaboration easier. Analyses in
Python scripts are reproducible, ensuring consistency.



WHY TRANSITION?

Efficiency: Python can process large datasets faster and more efficiently than Excel.

Capabilities: Python offers a broader range of tools and libraries for advanced analysis.

Future-Proofing: As data continues to grow, transitioning to Python ensures you have
the tools to handle data challenges of the future.



Excel to Python
Terminology Glossary



DATAFRAME
EXCEL
A table of data with rows and columns. In Excel, this is often just referred
to as a "table" or "worksheet."

PYTHON (PANDAS)
A two-dimensional, size-mutable, and heterogeneous tabular data structure with
labeled axes (rows and columns). It's similar to an Excel worksheet and can
store various types of data, including numbers, strings, and dates.



SERIES
EXCEL
A single column of data. In Excel charts, a series refers to a set of data
points plotted on a chart.

PYTHON (PANDAS)
A one-dimensional labeled array capable of holding any data type (integers,
strings, floating-point numbers, Python objects, etc.). It's essentially a
single column of a DataFrame.



PIVOT
EXCEL
A PivotTable is a data summarization tool used in Excel. It allows users to
summarize and analyze large datasets by displaying data in a more compact
format with rows, columns, values, and filters.

PYTHON (PANDAS)
The `pivot` method in Pandas reshapes data based on column values and reorients
the DataFrame. It's similar to creating a PivotTable in Excel, and you can
specify which columns become the new rows, columns, and values in the reshaped
DataFrame. Note that `pivot` only reshapes and does not aggregate; `pivot_table`
adds aggregation and is the closer match to an Excel PivotTable.
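
The sketch below contrasts the two reshaping methods on a small, made-up sales table. `pivot_table` aggregates repeated region/quarter pairs the way an Excel PivotTable does, while `pivot` expects each pair to appear only once. All column names and figures are hypothetical.

```python
import pandas as pd

sales = pd.DataFrame(
    {
        "region": ["North", "North", "South", "South", "North"],
        "quarter": ["Q1", "Q2", "Q1", "Q2", "Q1"],
        "amount": [100, 150, 80, 120, 50],
    }
)

# pivot_table aggregates duplicates (there are two North/Q1 rows here).
summary = sales.pivot_table(
    index="region", columns="quarter", values="amount", aggfunc="sum"
)

# pivot only reshapes, so keep one row per region/quarter pair first.
unique_rows = sales.drop_duplicates(subset=["region", "quarter"])
reshaped = unique_rows.pivot(index="region", columns="quarter", values="amount")

print(summary)
print(reshaped)
```

Calling `pivot` directly on `sales` would raise an error because of the duplicate North/Q1 entries, which is why the duplicates are dropped first.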



GROUPBY
EXCEL
In Excel, grouping data is often done using the "Group" feature or through
PivotTables to aggregate data based on certain criteria.

PYTHON (PANDAS)
The `groupby` method in Pandas allows you to group rows of data together based
on the values in one or more columns and then perform aggregate functions on
the grouped data, such as sum, count, mean, etc. It's a powerful tool for data
analysis and is similar to the grouping feature in Excel PivotTables.
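
A minimal `groupby` sketch on hypothetical sales data: rows are grouped by `region`, then one or several aggregate functions are applied to `amount`.

```python
import pandas as pd

sales = pd.DataFrame(
    {
        "region": ["North", "South", "North", "South"],
        "amount": [100, 80, 150, 120],
    }
)

# One aggregate per group: total amount per region.
totals = sales.groupby("region")["amount"].sum()

# Several aggregates at once, one result column per function.
stats = sales.groupby("region")["amount"].agg(["sum", "mean", "count"])

print(totals)
print(stats)
```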



CHEAT SHEET

Excel Functions and Their Python Equivalents



BASIC OPERATIONS

Excel             Python (Pandas)
SUM(A1:A10)       df['column_name'].sum()
AVERAGE(A1:A10)   df['column_name'].mean()
MAX(A1:A10)       df['column_name'].max()
MIN(A1:A10)       df['column_name'].min()
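
The four aggregates above can be tried on a small example. The `score` column and its values are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"score": [4, 8, 15, 16, 23, 42]})

total = df["score"].sum()     # like SUM
average = df["score"].mean()  # like AVERAGE
highest = df["score"].max()   # like MAX
lowest = df["score"].min()    # like MIN

print(total, average, highest, lowest)
```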



TEXT MANIPULATION

Excel                 Python (Pandas)
CONCATENATE(A1, B1)   df['A'] + df['B']
LEFT(A1, 3)           df['column_name'].str[:3]
RIGHT(A1, 3)          df['column_name'].str[-3:]
LEN(A1)               df['column_name'].str.len()
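
A short sketch of the text operations above on hypothetical name data. One assumption worth flagging: `+` concatenates only string columns directly, so a numeric column is converted with `astype(str)` first.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "first": ["Ada", "Grace"],
        "last": ["Lovelace", "Hopper"],
        "order_no": [1001, 1002],
    }
)

# CONCATENATE: join two text columns with a separator.
full_name = df["first"] + " " + df["last"]

# Numbers must be converted to text before concatenating.
label = df["first"] + "-" + df["order_no"].astype(str)

left3 = df["last"].str[:3]     # like LEFT(A1, 3)
right3 = df["last"].str[-3:]   # like RIGHT(A1, 3)
lengths = df["last"].str.len() # like LEN(A1)

print(full_name.tolist(), label.tolist())
```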



DATE AND TIME

Excel       Python (Pandas)
TODAY()     pd.Timestamp.now().date()
YEAR(A1)    df['date_column'].dt.year
MONTH(A1)   df['date_column'].dt.month
DAY(A1)     df['date_column'].dt.day
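
The `.dt` accessors above work only on datetime columns, so text dates need converting first. A minimal sketch with hypothetical dates:

```python
import pandas as pd

df = pd.DataFrame({"date_column": ["2024-01-15", "2023-12-31"]})

# Dates read from text are plain strings until converted.
df["date_column"] = pd.to_datetime(df["date_column"])

years = df["date_column"].dt.year
months = df["date_column"].dt.month
days = df["date_column"].dt.day

# Current date, like TODAY().
today = pd.Timestamp.now().date()

print(years.tolist(), months.tolist(), days.tolist(), today)
```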



LOOKUP AND REFERENCE

Excel                          Python (Pandas)
VLOOKUP(A1, Table, 2, FALSE)   df.merge(lookup_table, on='key_column', how='left')
HLOOKUP(A1, Table, 2, FALSE)   Same approach as VLOOKUP after transposing the lookup table (e.g., lookup_table.T)
INDEX(A1:A10, 5)               df['column_name'].iloc[4]
MATCH(A1, A1:A10, 0)           df['column_name'][df['column_name'] == value].index[0]
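
A sketch of the lookup patterns above using two hypothetical tables, `orders` and `lookup_table`. Two differences from Excel are worth noting: positions are 0-based, and the MATCH-style expression returns the index label of the first match rather than a 1-based position.

```python
import pandas as pd

orders = pd.DataFrame(
    {"key_column": ["P1", "P3", "P2", "P9"], "qty": [2, 1, 5, 3]}
)
lookup_table = pd.DataFrame(
    {"key_column": ["P1", "P2", "P3"], "price": [10.0, 20.0, 30.0]}
)

# VLOOKUP-style: a left merge keeps every order; keys with no match get NaN.
merged = orders.merge(lookup_table, on="key_column", how="left")

# INDEX-style: positions start at 0, so iloc[2] is the third value.
third_key = orders["key_column"].iloc[2]

# MATCH-style: index label of the first row equal to the value.
value = "P2"
match_label = orders["key_column"][orders["key_column"] == value].index[0]

print(merged)
print(third_key, match_label)
```

If no row matches, `.index[0]` raises an IndexError, so a real script would check for an empty result first.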



LOGICAL OPERATIONS

Excel                      Python (Pandas)
IF(A1 > 10, "Yes", "No")   df['column_name'].apply(lambda x: 'Yes' if x > 10 else 'No')
AND(A1 > 10, B1 < 5)       (df['A'] > 10) & (df['B'] < 5)
OR(A1 > 10, B1 < 5)        (df['A'] > 10) | (df['B'] < 5)
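
A small sketch of the logical operations on hypothetical columns `A` and `B`. Element-wise AND and OR use `&` and `|`, with each comparison in parentheses. `numpy.where` is shown as a vectorized alternative to `apply` for IF-style logic; it is an addition here, not part of the table above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [5, 12, 20, 3], "B": [1, 7, 3, 9]})

# IF-style: the apply/lambda form and np.where give the same labels.
flag_apply = df["A"].apply(lambda x: "Yes" if x > 10 else "No")
flag_where = np.where(df["A"] > 10, "Yes", "No")

# AND / OR: parentheses are required around each comparison.
both = (df["A"] > 10) & (df["B"] < 5)
either = (df["A"] > 10) | (df["B"] < 5)

print(flag_apply.tolist(), both.tolist(), either.tolist())
```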



DATA ANALYSIS

Excel                      Python (Pandas)
PivotTable                 df.pivot_table(index='...', columns='...', values='...', aggfunc='...')
Filter (A1:A10, A1 > 10)   df[df['column_name'] > 10]
Sort (A1:A10, Ascending)   df.sort_values(by='column_name')
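
Filtering and sorting can be sketched as follows on a hypothetical table. Both operations return a new DataFrame and leave the original unchanged.

```python
import pandas as pd

df = pd.DataFrame({"name": ["a", "b", "c", "d"], "value": [12, 7, 30, 10]})

# Filter: keep rows where the condition is True.
filtered = df[df["value"] > 10]

# Sort: ascending by default; pass ascending=False for descending order.
sorted_asc = df.sort_values(by="value")
sorted_desc = df.sort_values(by="value", ascending=False)

print(filtered)
print(sorted_desc)
```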



CHEAT SHEET

Common Data
Manipulation Tasks



DATA IMPORT AND EXPORT

Excel Task        Excel Method             Python (Pandas) Method
Import CSV        File > Open              pd.read_csv('filename.csv')
Export to CSV     File > Save As > CSV     df.to_csv('filename.csv', index=False)
Import Excel      File > Open              pd.read_excel('filename.xlsx')
Export to Excel   File > Save As > Excel   df.to_excel('filename.xlsx', index=False)
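
A minimal CSV round trip, written to a temporary folder so the example leaves no files behind. The data is hypothetical. The Excel methods work the same way but typically need an extra package such as openpyxl installed, so they are not run here.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"id": [1, 2], "name": ["x", "y"]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "filename.csv")

    # index=False leaves out the row labels, which Excel users rarely want in the file.
    df.to_csv(path, index=False)

    # Read the same file back into a new DataFrame.
    loaded = pd.read_csv(path)

print(loaded)
```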



DATA EXPLORATION

Excel Task          Excel Method                   Python (Pandas) Method
View first 5 rows   Scroll                         df.head()
View last 5 rows    Scroll to end                  df.tail()
Get summary stats   Right-click > Quick Analysis   df.describe()
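
A quick sketch of the exploration methods on a hypothetical ten-row table. `head()` and `tail()` return 5 rows by default and accept a different count.

```python
import pandas as pd

df = pd.DataFrame({"value": range(1, 11)})

first_rows = df.head()   # first 5 rows
last_rows = df.tail(3)   # last 3 rows
summary = df.describe()  # count, mean, std, min, quartiles, max

print(first_rows)
print(summary)
```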



DATA CLEANING

Excel Task          Excel Method               Python (Pandas) Method
Find and replace    Ctrl + H                   df.replace('old_value', 'new_value')
Remove duplicates   Data > Remove Duplicates   df.drop_duplicates()
Fill blank cells    Ctrl + Enter               df.fillna('value')
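
The cleaning steps above can be chained, as in this sketch on hypothetical data. Each method returns a new DataFrame, so the result has to be assigned to a variable. Passing a dict to `fillna` sets a different fill value per column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "city": ["NYC", "NYC", "LA", None],
        "sales": [100.0, 100.0, np.nan, 50.0],
    }
)

# Find and replace, then drop rows that are exact duplicates.
cleaned = df.replace("NYC", "New York").drop_duplicates()

# Fill blanks with a per-column default.
cleaned = cleaned.fillna({"city": "Unknown", "sales": 0})

print(cleaned)
```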



DATA TRANSFORMATION

Excel Task       Excel Method    Python (Pandas) Method
Add new column   Insert Column   df['new_column'] = df['column1'] + df['column2']
Group data       PivotTable      df.groupby('column_name').agg('sum')  (or 'mean', 'count', etc.)
Filter data      Filter button   df[df['column_name'] == 'value']



DATA ANALYSIS

Excel Task               Excel Method                    Python (Pandas) Method
Summarize data           PivotTable                      df.pivot_table(index='...', columns='...', values='...', aggfunc='...')
Create a chart           Insert > Chart                  df.plot(kind='chart_type')
Conditional formatting   Home > Conditional Formatting   Not directly applicable to the data itself; df.style can format displayed output, and values can be visualized using libraries like Seaborn or Matplotlib



ADVANCED OPERATIONS

Excel Task        Excel Method              Python (Pandas) Method
Date difference   DATEDIF function          df['date_end'] - df['date_start']
Text split        Text to Columns           df['column_name'].str.split('delimiter')
Merge data        VLOOKUP or INDEX/MATCH    pd.merge(df1, df2, on='key_column')
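
A sketch of the date difference and text split on hypothetical data. Two details go beyond the table: subtracting dates yields a Timedelta, so `.dt.days` is used to get whole days, and `expand=True` makes `str.split` return separate columns, which is closer to Text to Columns.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "date_start": pd.to_datetime(["2024-01-01", "2024-02-10"]),
        "date_end": pd.to_datetime(["2024-01-31", "2024-03-01"]),
        "full_name": ["Ada Lovelace", "Grace Hopper"],
    }
)

# Date difference as a whole number of days.
df["days"] = (df["date_end"] - df["date_start"]).dt.days

# Split one text column into two new columns.
df[["first", "last"]] = df["full_name"].str.split(" ", expand=True)

print(df[["days", "first", "last"]])
```

Note that `pd.merge` performs an inner join by default, so rows without a match are dropped unless `how='left'` is passed.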



CHEAT SHEET

Icons and Color Cues for Easy Reference



📂 DATA IMPORT AND EXPORT
Excel Task        Excel Icon   Python (Pandas) Method                      Python Icon
Import CSV        📄           pd.read_csv('filename.csv')                 🐍
Export to CSV     💾           df.to_csv('filename.csv', index=False)      📤
Import Excel      📘           pd.read_excel('filename.xlsx')              🐍
Export to Excel   💾           df.to_excel('filename.xlsx', index=False)   📤



🔍 DATA EXPLORATION
Excel Task          Excel Icon   Python (Pandas) Method   Python Icon
View first 5 rows   👓           df.head()                🖼️
View last 5 rows    👓           df.tail()                🖼️
Get summary stats   📊           df.describe()            📈



🧹 DATA CLEANING
Excel Task          Excel Icon   Python (Pandas) Method                 Python Icon
Find and replace    🔍➡️          df.replace('old_value', 'new_value')   🔄
Remove duplicates   ❌           df.drop_duplicates()                   🚫
Fill blank cells    ⬜           df.fillna('value')                     ✅



🔄 DATA TRANSFORMATION
Excel Task       Excel Icon   Python (Pandas) Method                                             Python Icon
Add new column   ➕           df['new_column'] = df['column1'] + df['column2']                   🆕
Group data       📑           df.groupby('column_name').agg('sum')  (or 'mean', 'count', etc.)   📂
Filter data      🔍           df[df['column_name'] == 'value']                                   🕵️



📊 DATA ANALYSIS
Excel Task       Excel Icon   Python (Pandas) Method                                                    Python Icon
Summarize data   📑           df.pivot_table(index='...', columns='...', values='...', aggfunc='...')   📊
Create a chart   📉           df.plot(kind='chart_type')                                                📈



🔧 ADVANCED OPERATIONS
Excel Task        Excel Icon   Python (Pandas) Method                     Python Icon
Date difference   📅           df['date_end'] - df['date_start']          ⏳
Text split        ✂️           df['column_name'].str.split('delimiter')   🧩
Merge data        🔗           pd.merge(df1, df2, on='key_column')        ⛓️



CHEAT SHEET

Easing Into Python: Transitioning Tips



PRACTICAL TIPS:

Start small: Begin with basic data manipulation tasks in Python.

Practice regularly: The more you code, the more comfortable you'll become.

Seek community support: Engage in forums, attend workshops, and join
Python groups.



WISHING YOU CONTINUED GROWTH AND
SUCCESS IN YOUR PYTHON JOURNEY!

Embrace the journey of transitioning from Excel to Python. While there's a
learning curve, the rewards in terms of efficiency, capabilities, and
future-proofing your skills are immense.

