
Assignment 1

The analysis of the Titanic dataset focuses on cleaning data and exploring factors influencing passenger survival. Key findings indicate that gender, class, and fare significantly impacted survival rates, with women and first-class passengers having higher chances of survival. The study suggests that further predictive modeling could enhance understanding of these factors.

Uploaded by

Mahim Jain Anwa
Name - Mahim Jain

Enrolment No. - 23116054

Department - E.C.E

1. Introduction

The Titanic dataset provides information on passengers, including
demographic details and survival status. The objective of this analysis is
to clean the data, explore patterns, and identify key factors influencing
survival.

2. Data Cleaning

2.1 Handling Missing Values

• Age: Missing values were filled with the median age.

• Cabin: Missing values were replaced with "Unknown", since most entries
were missing.

• Embarked: Missing values were filled with the most frequent
embarkation point.
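
The three imputation steps above can be sketched as follows. This is a minimal illustration that assumes pandas and the usual Kaggle-style column names (Age, Cabin, Embarked); the rows are made-up toy values, not the real dataset, and the report does not state which tooling was actually used.

```python
import pandas as pd

# Toy rows with assumed Kaggle-style column names (not the real data).
df = pd.DataFrame({
    "Age": [22.0, None, 38.0, 26.0, None],
    "Cabin": [None, "C85", None, None, "E46"],
    "Embarked": ["S", "C", None, "S", "Q"],
})

# Age: fill gaps with the median of the observed ages.
df["Age"] = df["Age"].fillna(df["Age"].median())

# Cabin: label missing entries instead of guessing a cabin.
df["Cabin"] = df["Cabin"].fillna("Unknown")

# Embarked: fill gaps with the most frequent port (the mode).
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])

# Count of remaining missing values per column.
print(df.isna().sum())
```

The median is used for Age because it is less sensitive to extreme values than the mean.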

2.2 Duplicate Check

• No duplicate rows were found.
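
A duplicate check of this kind might look like the sketch below, assuming pandas. The column names and rows are hypothetical placeholders, not the real data.

```python
import pandas as pd

# Toy rows (hypothetical values) standing in for the passenger table.
df = pd.DataFrame({
    "PassengerId": [1, 2, 3],
    "Name": ["A", "B", "C"],
    "Age": [22.0, 38.0, 26.0],
})

# duplicated() marks every row that exactly repeats an earlier row.
n_duplicates = int(df.duplicated().sum())
print(n_duplicates)
```

If the count were non-zero, `df.drop_duplicates()` would return a copy with the repeated rows removed.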

2.3 Outlier Detection

• Age and Fare showed some outliers.

• Outliers were not removed, in order to preserve the integrity of the
dataset.
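
The report does not say which outlier rule was applied. One common choice is the 1.5 × IQR rule, sketched below under that assumption with pandas; the helper name `iqr_outliers` and the fare values are illustrative only.

```python
import pandas as pd

def iqr_outliers(values: pd.Series) -> pd.Series:
    """Return a boolean mask for values more than 1.5 * IQR outside the quartiles."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    return (values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)

# Illustrative fares (not the real column); one value is far above the rest.
fare = pd.Series([7.25, 8.05, 13.0, 26.0, 30.0, 512.33], name="Fare")

mask = iqr_outliers(fare)

# Flag the outliers but keep them in the data, as the report does.
print(fare[mask])
```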

3. Exploratory Data Analysis (EDA)

3.1 Univariate Analysis

• Numerical Variables: Summary statistics showed a right-skewed
distribution for Fare and an approximately normal distribution for Age.

• Categorical Variables: The majority of passengers embarked at
Southampton (S), and most travelled in third class.
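
The univariate checks above could be reproduced roughly as follows, assuming pandas and Kaggle-style column names. The rows are toy values chosen only to make the calls concrete; they are not results from the report.

```python
import pandas as pd

# Toy rows with assumed column names (not the real data).
df = pd.DataFrame({
    "Fare": [7.25, 7.9, 8.05, 13.0, 26.0, 80.0],
    "Embarked": ["S", "S", "S", "C", "Q", "S"],
    "Pclass": [3, 3, 3, 2, 2, 1],
})

# Numerical variable: summary statistics and sample skewness
# (a positive value indicates a longer right tail).
print(df["Fare"].describe())
fare_skew = df["Fare"].skew()

# Categorical variables: frequency of each category.
port_counts = df["Embarked"].value_counts()
class_counts = df["Pclass"].value_counts()
print(port_counts)
print(class_counts)
```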

3.2 Bivariate Analysis

• Survival vs. Pclass: First-class passengers had higher survival rates.

• Survival vs. Sex: Females had a higher survival rate than males.

• Age vs. Fare: No clear trend, but high fares were generally paid by
first-class passengers.
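
Because Survived is coded 0/1, the survival rate of a group is simply the mean of that column within the group. The sketch below shows this with pandas on toy rows that were constructed to echo the pattern described above; the numbers are not the report's results.

```python
import pandas as pd

# Toy rows with assumed column names (not the real data).
df = pd.DataFrame({
    "Pclass":   [1, 1, 2, 2, 3, 3, 3, 3],
    "Sex":      ["female", "male", "female", "male",
                 "female", "male", "male", "male"],
    "Survived": [1, 1, 1, 0, 1, 0, 0, 0],
})

# Mean of a 0/1 column within each group = survival rate of that group.
rate_by_class = df.groupby("Pclass")["Survived"].mean()
rate_by_sex = df.groupby("Sex")["Survived"].mean()

print(rate_by_class)
print(rate_by_sex)
```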

3.3 Multivariate Analysis

• Survival by Pclass & Sex: First-class women had the highest
survival rate.

• Heatmap of Correlations: Pclass was negatively correlated with both
Fare and Survived (a higher Pclass number denotes a lower class).
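
A two-way survival table and the correlation matrix behind such a heatmap can be sketched as below, assuming pandas. The rows are toy values built to mirror the pattern described, not the real data, and the plotting step is left out.

```python
import pandas as pd

# Toy rows with assumed column names (not the real data).
df = pd.DataFrame({
    "Pclass":   [1, 1, 2, 2, 3, 3, 3, 3],
    "Sex":      ["female", "male", "female", "male",
                 "female", "male", "male", "male"],
    "Fare":     [80.0, 52.0, 26.0, 13.0, 8.05, 7.9, 7.25, 7.75],
    "Survived": [1, 1, 1, 0, 1, 0, 0, 0],
})

# Survival rate for every (class, sex) combination.
rate_table = df.pivot_table(index="Pclass", columns="Sex",
                            values="Survived", aggfunc="mean")

# Pearson correlations between the numeric columns; this matrix is
# what a correlation heatmap visualises.
corr = df[["Pclass", "Fare", "Survived"]].corr()

print(rate_table)
print(corr)
```

The negative sign on Pclass arises because first class is coded 1 and third class 3, so higher fares and higher survival go with a lower class number.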

4. Findings & Interpretations

• Gender Influence: Women had a significantly higher survival rate.

• Class Matters: First-class passengers had better survival chances.

• Embarkation Impact: Passengers from Cherbourg had a relatively
higher survival rate.

• Fare Factor: Higher fares were associated with higher survival
chances.
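
The embarkation and fare findings could be checked along the lines below. This is a sketch assuming pandas; splitting Fare into two equal-sized bands with `pd.qcut` is an illustrative choice rather than the report's stated method, and the rows are toy values, not the real data.

```python
import pandas as pd

# Toy rows with assumed column names (not the real data).
df = pd.DataFrame({
    "Embarked": ["C", "C", "S", "S", "S", "Q", "S", "Q"],
    "Fare":     [80.0, 52.0, 26.0, 13.0, 8.05, 7.9, 7.25, 7.75],
    "Survived": [1, 1, 1, 0, 1, 0, 0, 0],
})

# Survival rate per embarkation port.
rate_by_port = df.groupby("Embarked")["Survived"].mean()

# Split fares at the median into a "low" and a "high" band,
# then compare survival rates between the two bands.
df["FareBand"] = pd.qcut(df["Fare"], q=2, labels=["low", "high"])
rate_by_band = df.groupby("FareBand", observed=True)["Survived"].mean()

print(rate_by_port)
print(rate_by_band)
```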

5. Conclusion

This analysis highlights the factors that played a role in survival on the
Titanic. Gender, class, and fare were strong indicators of survival
probability. Further predictive modeling could refine these insights for
better accuracy.
