1. Given a CSV file sales_data.csv with 10 million rows, use Pandas to:
a) Load it efficiently (consider dtype optimization).
b) Display the first 5 rows.
c) Print column data types and memory usage.
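One possible approach to exercise 1 is sketched below. It reads a tiny in-memory stand-in rather than the real sales_data.csv, and the column names (order_id, region, price, quantity) are assumptions, since the exercise does not list them. The idea is to pass narrow dtypes to read_csv so each column uses less memory than the defaults.

```python
import io
import pandas as pd

# Tiny in-memory stand-in for sales_data.csv; column names are hypothetical.
csv_text = """order_id,region,price,quantity
1,North,9.99,3
2,South,19.50,1
3,North,4.25,7
"""

# Narrow dtypes reduce memory: smaller numeric types, and category for
# string columns with few distinct values.
dtypes = {
    "order_id": "int32",
    "region": "category",
    "price": "float32",
    "quantity": "int16",
}

df = pd.read_csv(io.StringIO(csv_text), dtype=dtypes)

print(df.head())                    # first 5 rows (only 3 exist in this sample)
print(df.dtypes)                    # column data types
print(df.memory_usage(deep=True))   # bytes used per column
```

For the real file, the StringIO object would be replaced by the file path; reading with the chunksize parameter is another option if the file does not fit in memory.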
2. You have a DataFrame with missing values in columns price and quantity. Write Pandas
code to:
a) Count missing values per column.
b) Fill missing price values with the column mean.
c) Drop rows where quantity is missing.
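A minimal sketch for exercise 2, using a small made-up DataFrame (the sample values are assumptions, not part of the exercise):

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one missing price and one missing quantity.
df = pd.DataFrame({
    "price": [10.0, np.nan, 30.0, 20.0],
    "quantity": [1.0, 2.0, np.nan, 4.0],
})

# a) Count missing values per column.
missing_counts = df.isna().sum()
print(missing_counts)

# b) Fill missing price values with the column mean (NaN is skipped by mean()).
df["price"] = df["price"].fillna(df["price"].mean())

# c) Drop rows where quantity is missing.
df = df.dropna(subset=["quantity"])
print(df)
```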
3. Given a DataFrame of e-commerce transactions with columns user_id, product_id, and
purchase_amount, use Pandas to find:
a) Total purchase amount per user.
b) Top 5 users by total spending.
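A minimal sketch for exercise 3, assuming a small made-up transactions table (the values are illustrative only):

```python
import pandas as pd

# Hypothetical transactions; only the column names come from the exercise.
tx = pd.DataFrame({
    "user_id": [1, 2, 1, 3, 2, 1],
    "product_id": [10, 11, 12, 10, 13, 11],
    "purchase_amount": [10.0, 5.0, 20.0, 7.5, 2.5, 1.0],
})

# a) Total purchase amount per user.
totals = tx.groupby("user_id")["purchase_amount"].sum()
print(totals)

# b) Top 5 users by total spending (fewer are returned if fewer users exist).
top5 = totals.nlargest(5)
print(top5)
```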
4. You have two DataFrames:
a) orders with order_id, customer_id, and order_date.
b) customers with customer_id and customer_name.
Write Pandas code to merge them into a single DataFrame on customer_id.
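A minimal sketch for exercise 4 with made-up rows. The exercise does not say which join type to use; a left join is assumed here so that every order is kept even if its customer is missing from customers.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the exercise.
orders = pd.DataFrame({
    "order_id": [101, 102, 103],
    "customer_id": [1, 2, 1],
    "order_date": pd.to_datetime(["2024-01-05", "2024-01-06", "2024-01-07"]),
})
customers = pd.DataFrame({
    "customer_id": [1, 2],
    "customer_name": ["Ada", "Grace"],
})

# Merge on the shared key; how="left" keeps all rows from orders.
merged = orders.merge(customers, on="customer_id", how="left")
print(merged)
```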
5. Generate a NumPy array of shape (1000000,) with random integers from 1 to 100 (inclusive).
a) Compute the mean using NumPy.
b) Compare its execution time with computing the mean via Python’s built-in sum and len.
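A minimal sketch for exercise 5. The seed and the number of timing repetitions are arbitrary choices, and the measured times will vary by machine, so no expected timings are shown.

```python
import timeit
import numpy as np

rng = np.random.default_rng(seed=0)

# integers() excludes the upper bound, so 101 yields values from 1 to 100.
arr = rng.integers(1, 101, size=1_000_000)

# a) Mean using NumPy.
np_mean = arr.mean()

# b) Mean using Python built-ins on a plain list of the same values.
values = arr.tolist()
py_mean = sum(values) / len(values)

# Time 10 repetitions of each approach.
np_time = timeit.timeit(lambda: arr.mean(), number=10)
py_time = timeit.timeit(lambda: sum(values) / len(values), number=10)

print(f"NumPy mean:  {np_mean:.4f} in {np_time:.4f} s")
print(f"Python mean: {py_mean:.4f} in {py_time:.4f} s")
```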
6. Given a NumPy array of temperatures in Celsius, write code to convert them to Fahrenheit
using vectorized operations only.
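A minimal sketch for exercise 6, with made-up temperatures. The formula F = C * 9/5 + 32 is applied to the whole array at once, with no Python loop.

```python
import numpy as np

# Hypothetical Celsius readings.
celsius = np.array([-40.0, 0.0, 37.0, 100.0])

# Arithmetic on the array is applied element-wise.
fahrenheit = celsius * 9 / 5 + 32
print(fahrenheit)
```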
7. Given a DataFrame df with a column sales, write Pandas code to:
a) Select all rows where sales > 5000.
b) Sort these rows in descending order of sales.
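A minimal sketch for exercise 7, using made-up sales figures:

```python
import pandas as pd

# Hypothetical sample; only the column name comes from the exercise.
df = pd.DataFrame({"sales": [1200, 7500, 5000, 9100, 6400]})

# a) Boolean mask keeps rows with sales strictly greater than 5000.
# b) sort_values orders those rows from highest to lowest.
high = df[df["sales"] > 5000].sort_values("sales", ascending=False)
print(high)
```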
8. Given a Pandas Series of daily stock prices indexed by date, write code to:
a) Calculate the 7-day moving average.
b) Plot the moving average using Matplotlib.
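A minimal sketch for exercise 8, assuming one price per calendar day so that a 7-row window equals 7 days. The prices are made up, and the non-interactive Agg backend is selected only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical daily prices for 10 consecutive days.
dates = pd.date_range("2024-01-01", periods=10, freq="D")
prices = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0], index=dates)

# a) 7-day moving average; the first 6 entries are NaN (window not yet full).
moving_avg = prices.rolling(window=7).mean()

# b) Plot the raw prices and the moving average on the same axes.
fig, ax = plt.subplots()
ax.plot(prices.index, prices.values, label="Price")
ax.plot(moving_avg.index, moving_avg.values, label="7-day moving average")
ax.set_xlabel("Date")
ax.set_ylabel("Price")
ax.legend()
```

In an interactive session, calling plt.show() afterwards displays the figure.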
9. Using a sales DataFrame with columns region, product, and sales, create a Pandas pivot
table showing the sum of sales for each product in each region.
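A minimal sketch for exercise 9 with made-up rows. Products are placed on the rows and regions on the columns; fill_value=0 is an optional choice that shows 0 instead of NaN for combinations with no sales.

```python
import pandas as pd

# Hypothetical sample; only the column names come from the exercise.
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South", "North"],
    "product": ["A", "B", "A", "B", "A"],
    "sales": [100, 200, 150, 50, 25],
})

# Sum of sales for each product (rows) in each region (columns).
pivot = sales.pivot_table(
    index="product",
    columns="region",
    values="sales",
    aggfunc="sum",
    fill_value=0,
)
print(pivot)
```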
10. You have a DataFrame with object columns containing repeated string values.
a) Convert these columns to category dtype.
b) Compare memory usage before and after.
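A minimal sketch for exercise 10. The column names and values are made up, and the columns are created with dtype=object explicitly so the starting point matches the exercise regardless of the Pandas version's default string dtype.

```python
import pandas as pd

# Hypothetical DataFrame: 30,000 rows with only a few distinct strings.
df = pd.DataFrame(
    {
        "city": ["Paris", "Tokyo", "Lima"] * 10_000,
        "status": ["ok", "fail"] * 15_000,
    },
    dtype=object,
)

# Memory before conversion; deep=True counts the string contents too.
before = df.memory_usage(deep=True).sum()

# a) Convert the repeated-string columns to category dtype.
df_cat = df.astype({"city": "category", "status": "category"})

# b) Memory after conversion.
after = df_cat.memory_usage(deep=True).sum()

print(f"before: {before} bytes")
print(f"after:  {after} bytes")
```

The saving comes from storing each distinct string once and keeping small integer codes per row, so it shrinks as the number of distinct values approaches the number of rows.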