ml file

Data science is an interdisciplinary field focused on extracting insights from large data sets using statistical methods and machine learning. The document outlines the impact of data science across various industries, highlights essential Python libraries for data manipulation, and provides an overview of a housing prices dataset for analysis. It includes practical exercises using Python's Pandas library for data manipulation tasks such as filtering, sorting, and grouping data.

Uploaded by

meet008828


Machine Exercise: 1

What is Data Science?

Data science is an interdisciplinary field that involves extracting insights and knowledge
from large volumes of data. It combines statistical methods, machine learning algorithms,
and domain expertise to solve complex problems. Data scientists play a crucial role in
transforming raw data into actionable information that drives decision-making.

Impact of Data Science

Data science has revolutionized industries across the globe. Its applications range from
healthcare and finance to marketing and e-commerce. Some key impacts include:

- Improved decision-making through data-driven insights
- Enhanced customer experience through personalized recommendations
- Fraud detection and prevention
- Optimization of processes and resource allocation
- Advancements in scientific research and discovery

Essential Python Libraries for Data Science

Python has emerged as the preferred language for data scientists due to its simplicity,
readability, and extensive libraries.

Objective
This exercise aims to demonstrate basic data manipulation techniques using Python's
Pandas library.

Dataset Overview
The Raw Housing Prices dataset provides detailed information about housing sales. This
dataset is useful for understanding pricing trends, property characteristics, and market
behaviors.

Data Description
- Date House was Sold: The date when the house was sold.
- Sale Price: The price at which the house was sold.
- Zipcode: The postal (ZIP) code of the property location.
- Bedrooms: The number of bedrooms in the house.
- Bathrooms: The number of bathrooms in the house.
- Living Area (sqft): The living space size in square feet.
- Lot Area (sqft): The size of the lot in square feet.
- Floors: The number of floors in the house.
- Waterfront View: Whether the house has a view of the waterfront.
- Condition: The overall condition of the property.

Purpose
- Price Trend Analysis: Identifying pricing trends over time and across locations.
- Property Segmentation: Analyzing features that affect property prices.
- Location Insights: Understanding how location impacts housing prices.
- Market Behavior: Evaluating market behaviors to assist in real estate decision-making.
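
As a rough illustration of the price trend analysis mentioned above, the sketch below averages sale prices per year. The three rows are made-up sample values, not taken from the real dataset, and the date column is assumed to be parseable as dates.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's values and date format may differ.
sample_df = pd.DataFrame({
    'Date House was Sold': pd.to_datetime(['2016-02-15', '2017-10-14', '2017-12-14']),
    'Sale Price': [180000, 221900, 538000],
})

# Extract the sale year and average the sale price within each year.
sale_year = sample_df['Date House was Sold'].dt.year
yearly_mean = sample_df.groupby(sale_year)['Sale Price'].mean()

print(yearly_mean)
```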

Q 1.1) Basic Data Manipulation Tasks

# Import Libraries
import pandas as pd

# Load Data from the provided CSV file

df = pd.read_csv('/content/Raw_Housing_Prices3.csv')

# Display the Data

print(df.head())
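
If the CSV file is not at hand, the loading step can be tried on an in-memory sample. The sketch below is only an illustration: the rows are invented, and the ISO date format is an assumption about how the date column might be stored.

```python
import io
import pandas as pd

# Hypothetical three-row sample standing in for the real CSV file.
sample_csv = io.StringIO(
    "Date House was Sold,Sale Price,Zipcode,Waterfront View\n"
    "2017-10-14,221900,98178,No\n"
    "2017-12-14,538000,98125,No\n"
    "2016-02-15,180000,98028,Yes\n"
)

# read_csv accepts a file path or any file-like object.
sample_df = pd.read_csv(sample_csv)

# The date column is read as text; convert it to a datetime type after loading.
sample_df['Date House was Sold'] = pd.to_datetime(sample_df['Date House was Sold'])

print(sample_df.head(2))
print(sample_df.dtypes)
```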

Q 1.2) Selecting Multiple Columns

# Selecting relevant columns

selected_columns = df[['Date House was Sold', 'Sale Price', 'Zipcode', 'Waterfront View']]
print(selected_columns)
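
The double brackets matter here: a single label returns a Series, while a list of labels returns a DataFrame. A small sketch on made-up rows (the values are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical sample rows using column names from the dataset description.
sample_df = pd.DataFrame({
    'Sale Price': [221900, 538000, 180000],
    'Zipcode': [98178, 98125, 98028],
    'Waterfront View': ['No', 'No', 'Yes'],
})

one_column = sample_df['Sale Price']                 # single label: Series
as_frame = sample_df[['Sale Price']]                 # list with one label: DataFrame
two_columns = sample_df[['Zipcode', 'Sale Price']]   # columns come back in the order listed

print(type(one_column).__name__)
print(type(as_frame).__name__)
print(two_columns.columns.tolist())
```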

Q 1.3) Displaying a Concise Summary of the DataFrame

df.info()
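
df.info() reports non-null counts per column. A complementary check, sketched below on made-up rows with deliberately missing entries, counts the missing values directly:

```python
import numpy as np
import pandas as pd

# Hypothetical sample with two deliberately missing values.
sample_df = pd.DataFrame({
    'Sale Price': [221900.0, np.nan, 180000.0],
    'Zipcode': [98178, 98125, 98028],
    'Waterfront View': ['No', None, 'Yes'],
})

# isna() marks missing cells as True; summing the booleans counts them per column.
missing_per_column = sample_df.isna().sum()

print(missing_per_column)
```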

Q 1.4) Generating Descriptive Statistics

# Summary statistics (count, mean, std, min, quartiles, max) for the numeric columns
print(df.describe())
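
By default, describe() summarizes only the numeric columns. The sketch below, on made-up rows, shows how include='all' also covers text columns such as a Yes/No waterfront flag (the Yes/No encoding is an assumption):

```python
import pandas as pd

# Hypothetical sample rows; the waterfront encoding is assumed to be 'Yes'/'No'.
sample_df = pd.DataFrame({
    'Sale Price': [221900, 538000, 180000, 604000],
    'Waterfront View': ['No', 'No', 'Yes', 'No'],
})

numeric_summary = sample_df.describe()           # numeric columns only
all_summary = sample_df.describe(include='all')  # adds unique, top, freq for text columns

print(numeric_summary)
print(all_summary)
```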

Q 1.5) Display the Rows and Columns of the Dataset

# shape is a tuple: (number of rows, number of columns)
print(df.shape)

Q 2) Exporting Data

# Build a one-row DataFrame and write it to a CSV file without the index column
user_data = {'Uniroll': [2234219], 'Name': ['meet'], 'Percentage': [80]}

user_df = pd.DataFrame(user_data)
user_df.to_csv('user_data.csv', index=False)
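
One way to confirm the export worked is to read the file back. The sketch below writes to a temporary directory (an assumption made so the example leaves no files behind) and reloads the result:

```python
import os
import tempfile
import pandas as pd

user_df = pd.DataFrame({'Uniroll': [2234219], 'Name': ['meet'], 'Percentage': [80]})

# Write to a throwaway directory, then read the file back while it still exists.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, 'user_data.csv')
    user_df.to_csv(csv_path, index=False)  # index=False keeps the row index out of the file
    reloaded = pd.read_csv(csv_path)

print(reloaded)
```

Without index=False, the reloaded frame would gain an extra unnamed column holding the old row index.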

Q 3.1) Filtering Data

# Keep rows with a sale price above 500,000 and show only two columns
filtered_data = df.loc[df['Sale Price'] > 500000, ['Zipcode', 'Sale Price']]

print(filtered_data)
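
Filters can also be combined. The sketch below, on made-up rows, keeps houses that are both above 500,000 and on the waterfront; the 'Yes'/'No' encoding of the waterfront column is an assumption.

```python
import pandas as pd

# Hypothetical sample rows; the waterfront encoding is assumed to be 'Yes'/'No'.
sample_df = pd.DataFrame({
    'Sale Price': [221900, 538000, 180000, 604000],
    'Zipcode': [98178, 98125, 98028, 98136],
    'Waterfront View': ['No', 'No', 'Yes', 'Yes'],
})

# Build each condition as a boolean Series, then combine with & (and) or | (or).
expensive = sample_df['Sale Price'] > 500000
waterfront = sample_df['Waterfront View'] == 'Yes'

expensive_waterfront = sample_df.loc[expensive & waterfront, ['Zipcode', 'Sale Price']]

print(expensive_waterfront)
```

When the conditions are written inline, each one needs its own parentheses, because & binds more tightly than the comparison operators.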

Q 3.2) Sorting Data

# Sort by sale price, highest first
sorted_df = df.sort_values(by='Sale Price', ascending=False)

print(sorted_df[['Zipcode', 'Sale Price']])
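
Two related variations, sketched on made-up rows: nlargest() returns only the top rows, and sort_values() accepts several sort keys with a direction for each.

```python
import pandas as pd

# Hypothetical sample rows with repeated zipcodes.
sample_df = pd.DataFrame({
    'Zipcode': [98178, 98125, 98178, 98125],
    'Sale Price': [221900, 538000, 180000, 604000],
})

# The two most expensive sales.
top_two = sample_df.nlargest(2, 'Sale Price')

# Zipcode ascending, then sale price descending within each zipcode.
by_zip_then_price = sample_df.sort_values(
    by=['Zipcode', 'Sale Price'], ascending=[True, False]
)

print(top_two)
print(by_zip_then_price)
```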

Q 3.3) Grouping Data

# Total sale price per zipcode
grouped_df = df.groupby('Zipcode')['Sale Price'].sum()

print(grouped_df)
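
A sum per zipcode grows with the number of sales, so it can help to look at the count and mean alongside it. A minimal sketch on made-up rows, using agg() to compute several statistics at once:

```python
import pandas as pd

# Hypothetical sample rows with two sales in each zipcode.
sample_df = pd.DataFrame({
    'Zipcode': [98178, 98125, 98178, 98125],
    'Sale Price': [221900, 538000, 180000, 604000],
})

# One row per zipcode, one column per statistic.
summary = sample_df.groupby('Zipcode')['Sale Price'].agg(['count', 'mean', 'sum'])

print(summary)
```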
