0% found this document useful (0 votes)

31 views43 pages

DM Project Report

Data mining project

Uploaded by

Akshet Bhadwal

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

31 views43 pages

DM Project Report

Data mining project

Uploaded by

Akshet Bhadwal

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 43

DM Project Report

Page - 1
Part 1: Clustering
Read the data and perform basic analysis such as printing a few rows (head and tail), info, data summary, null
values duplicate values, etc. ------------------------------------------------------ 4

Treat missing values in CPC, CTR and CPM using the formula given. ------------------------- 10

Check if there are any outliers. Do you think treating outliers is necessary for K-Means clustering?

Based on your judgement decide whether to treat outliers and if yes, which method to employ. -----------11

Perform z-score scaling and discuss how it affects the speed of the algorithm. ----------------16

Perform Hierarchical by constructing a Dendrogram using WARD and Euclidean distance. ---------17

Make Elbow plot (up to n=10) and identify optimum number of clusters for k-means algorithm. ---------17

Print silhouette scores for up to 10 clusters and identify optimum number of clusters. ---------18

Profile the ads based on optimum number of clusters using silhouette score and your domain understanding.

-----18

Conclude the project by providing summary of your learnings. ---------22

Part 2: PCA

Read the data and perform basic checks like checking head, info, summary, nulls, and duplicates, etc. ----------------
27

Perform detailed Exploratory analysis by creating certain questions like ----------------31

(i) Which state has highest gender ratio and which has the lowest?
(ii) Which district has the highest & lowest gender ratio?

We choose not to treat outliers for this case. Do you think that treating outliers for this case is necessary? ------------
---32

Page - 2
Scale the Data using z-score method. Does scaling have any impact on outliers? Compare boxplots before and after
scaling and comment. ----------------36

Perform all the required steps for PCA (use sklearn only) Create the covariance Matrix Get eigen values and eigen
vector. ---------------- 37

Identify the optimum number of PCs (for this project, take at least 90% explained variance). Show Scree plot.

---------------------- 39

Compare PCs with Actual Columns and identify which is explaining most variance. Write inferences about all the
principal components in terms of actual variables. ---------------- 41

Write linear equation for first PC. ----------------44

Page - 3
1.1.1 Read the data and perform basic analysis such as printing a few rows (head and tail), info, data summary,
null values duplicate values, etc.

 The data set is read from the file ‘Clustering Clean Ads_Data-2.xlsx’
 There are 23066 rows and 19 columns with 6 of the columns are of float64 data-type, 7 of them are integers and 6
of them are alphabetical type.
 Here is the Data dictionary of the columns of what they actually represent.
 1 Timestamp == The Timestamp of the particular Advertisement.

 2 InventoryType == The Inventory Type of the particular Advertisement. Format 1 to 7. This is a Categorical Variable.

 3 Ad - Length == The Length Dimension of the particular Advertisement.

 4 Ad - Width == The Width Dimension of the particular Advertisement.

 5 Ad Size == The Overall Size of the particular Advertisement. Length*Width.

 6 Ad Type == The type of the particular Advertisement. This is a Categorical Variable.

 7 Platform == The platform in which the particular Advertisement is displayed. Web, Video or App. This is a
Categorical Variable.

 8 Device Type == The type of the device which supports the particular Advertisement. This is a Categorical Variable.

 9 Format == The Format in which the Advertisement is displayed. This is a Categorical Variable.

 10 Available_Impression == How often the particular Advertisement is shown. An impression is counted each time an
Advertisement is shown on a search result page or other site on a Network.

 11 Matched_Queries == Matched search queries data is pulled from Advertising Platform and consists of the exact
searches typed into the search Engine that generated clicks for the particular Advertisement.

 12 Impressions == The impression count of the particular Advertisement out of the total available impressions.

 13 Clicks == It is a marketing metric that counts the number of times users have clicked on the particular
advertisement to reach an online property.

 14 Spend == It is the amount of money spent on specific ad variations within a specific campaign or ad set. This metric
helps regulate ad performance.

 15 Fee == The percentage of the Advertising Fees payable by Franchise Entities.

 16 Revenue == It is the income that has been earned from the particular advertisement.

 17 CTR == CTR stands for "Click through rate". CTR is the number of clicks that your ad receives divided by the number
of times your ad is shown. Formula used here is CTR = Total Measured Clicks / Total Measured Ad Impressions x 100.
Note that the Total Measured Clicks refers to the 'Clicks' Column and the Total Measured Ad Impressions refers to the
'Impressions' Column.

 18 CPM == CPM stands for "cost per 1000 impressions." Formula used here is CPM = (Total Campaign Spend /
Number of Impressions) * 1,000. Note that the Total Campaign Spend refers to the 'Spend' Column and the Number
of Impressions refers to the 'Impressions' Column.

 19 CPC == CPC stands for "Cost-per-click". Cost-per-click (CPC) bidding means that you pay for each click on your ads.
The Formula used here is CPC = Total Cost (spend) / Number of Clicks. Note that the Total Cost (spend) refers to the
'Spend' Column and the Number of Clicks refers to the 'Clicks' Column.

Page - 4
 The first 5 records and last 5 records of the data set: -

 Here is a list of columns

Page - 5
 Here is a list of the numeric field description which gives the count, mean, standard deviation, minimum, 25th
percentile, 50th percentile,75th percentile and maximum values of the Numeric fields.

 There are 14208 null values in the data set.

 Here are the list columns and the sum of their null values. These null values will be treated in the next questions

 Number of duplicate rows = 0

 Here are the object type columns with their respective value counts for each: -
o InventoryType Column
 Format4 7165
 Format5 4249
 Format1 3814
 Format3 3540
Page - 6
 Format6 1850
 Format2 1789
 Format7 659

o Ad Type
 Inter224 1658
 Inter217 1655
 Inter223 1654
 Inter219 1650
 Inter221 1650
 Inter222 1649
 Inter229 1648
 Inter227 1647
 Inter218 1645
 inter230 1644
 Inter220 1644
 Inter225 1643
 Inter226 1640
 Inter228 1639

o Platform
 Video 9873
 Web 8251
 App 4942

o Device Type
 Mobile 14806
 Desktop 8260

o Format
 Video 11552
 Display 11514

 The Timestamp column was ignored as it is not relevant in getting unique values.
 Some of the Numeric fields are converted to Categorical as there few unique values and not good for numerical analysis.
 Ad - Length (Units)
 120 - 7165
 300- 4473
 720 - 4249
 480 - 3540
 336 - 1850
 728 – 1789
 Ad- Width (Units)
 600 - 7824
 250 - 5664
 300 - 4249
 70 - 3540
 90 – 1789
 Ad Size (Ad – Length *Ad- Width)
 72000 - 7165
 216000 - 4249
 75000 - 3814
 33600 - 3540
 84000 - 1850
 65520 - 1789
 180000 - 659
 The remaining numerical type columns are not converted and they can be taken for calculation.
 The Ad - Length, Ad- Width, Ad Size columns are converted into categorical type.
 Here is the information after making changes to data type and renaming the columns for adaptation to the handle column
names in python.

Page - 7
 Here is the Count plot graphs for the categorical columns, this shows the unique values and its total count for each Categorical
Columns

 Inferences from the above graphs

 Format 1 of the InventoryType is the highest used among the remaining items in the InventoryType, Format 2 is the le
ast used
 For Ad_type column, there is no big difference between the variables (Inter217 to Inter230) as the range of the differe
nce just 20 in count.

 Inference from the graph on the left

 From the Platform columns, we can see the Video an
d App are 2 platforms widely used for the Ads
 Desktop device is the used mostly for the Ads than M
obile
 Display mode format count is greater that Video mo
de of Ads

Page - 8
 For the above Graphs which shows Fee column, 35 % is the Advertising Fees payable by Franchise Entities is mostly used amon
g the various other percentages, The least is 21%
 For the above Graphs which shows Ad_ length column, 120 units is the length of the Ad dimension is mostly used.

 For the above Graphs which shows Ad_Width column, 600 units is the Width of the Ad dimension which is mostly used.
 For the above Graphs which shows Ad_Size column, which the product of the Ad_ length and Ad_Width, We could clearly infer
that 72000 sq.units is the mostly used Ad size

Page - 9
 From the above Histogram plots for the numerical variables, we see that the columns are highly right skewed.

1.1.2 Treat missing values in CPC, CTR and CPM using the formula given.

As per the given formula provided in the question, we will be using to fill in the missing values in the CTR, CPM and CPC columns

CTR = (Clicks/Impressions) * 100

CPM = (Spend/Impressions) * 1000
Page - 10
CPC = (Spend/Clicks)

The total null values for CTR before imputing are 4736, The total null values for CTR after imputing are 0

The total null values for CPM before imputing is 4736

The total null values for CPM after imputing is 0

The total null values for CPC before imputing are 4736
The total null values for CPC after imputing are 0

Here is the snapshot of the information of the dataset

 From the below pic on the left, we can see that the values were imputed
 There are some changes in data found for those 3 columns on the pic in the right-side.

1.1.3 Check if there are any outliers. Do you think treating outliers is necessary for K-Means clustering?
Based on your judgement decide whether to treat outliers and if yes, which method to employ
The Outliers are present in all the numeric columns.

 Please find the below date which shows the summary of the boxplot is also evaluated for each of the 9 numerical
columns.
 Boxplot values for Available_Impressions -
 The Lower limit is -3707387.0
 The Minimum value is 1
 The Q1 (25th percentile) is 33672.0
 The Q2 (50th percentile) (mid value) is 483771.0
 The Q3 (75th percentile) is 2527712.0
 The Upper limit is 6268771.0
 The Maximum value is 27592861
 In Available_Impressions column there are 2378 values above the upper range, roughly 10.31 % of the total records

 Boxplot values for Matched_Queries -
 The Lower limit is -1725344.0
 The Minimum value is 1
 The Q1 (25th percentile) is 18282.0
 The Q2 (50th percentile) (mid value) is 258088.0
 The Q3 (75th percentile) is 1180700.0
 The Upper limit is 2924326.0
 The Maximum value is 14702025
 In Matched_Queries column there are 3192 values above the upper range, roughly 13.84 % of the total records

 Boxplot values for Impressions -
 The Lower limit is -1648666.0
 The Minimum value is 1
 The Q1 (25th percentile) is 7990.0
 The Q2 (50th percentile) (mid value) is 225290.0
 The Q3 (75th percentile) is 1112428.0
Page - 11
 The Upper limit is 2769086.0
 The Maximum value is 14194774
 In Impressions column there are 3269 values above the upper range, roughly 14.17 % of the total records

 Boxplot values for Clicks -
 The Lower limit is -17416.0
 The Minimum value is 1
 The Q1 (25th percentile) is 710.0
 The Q2 (50th percentile) (mid value) is 4425.0
 The Q3 (75th percentile) is 12794.0
 The Upper limit is 30919.0
 The Maximum value is 143049
 In Clicks column there are 1691 values above the upper range, roughly 7.33 % of the total records

 Boxplot values for Spend -
 The Lower limit is -4469.0
 The Minimum value is 0.0
 The Q1 (25th percentile) is 85.0
 The Q2 (50th percentile) (mid value) is 1425.0
 The Q3 (75th percentile) is 3121.0
 The Upper limit is 7676.0
 The Maximum value is 26931.87
 In Spend column there are 2081 values above the upper range, roughly 9.02 % of the total records

 Boxplot values for Revenue -
 The Lower limit is -2999.0
 The Minimum value is 0.0
 The Q1 (25th percentile) is 55.0
 The Q2 (50th percentile) (mid value) is 926.0
 The Q3 (75th percentile) is 2091.0
 The Upper limit is 5145.0
 The Maximum value is 21276.18
 In Revenue column there are 2325 values above the upper range, roughly 10.08 % of the total records

 Boxplot values for CTR -
 The Lower limit is -0.0
 The Minimum value is 0.0001
 The Q1 (25th percentile) is 0.0
 The Q2 (50th percentile) (mid value) is 0.0
 The Q3 (75th percentile) is 0.0
 The Upper limit is 0.0
 The Maximum value is 200.0
 In CTR column there are 3487 values above the upper range, roughly 15.12 % of the total records

 Boxplot values for CPM -
 The Lower limit is -15.0
 The Minimum value is 0.0
 The Q1 (25th percentile) is 2.0
 The Q2 (50th percentile) (mid value) is 8.0
 The Q3 (75th percentile) is 13.0
 The Upper limit is 30.0
 The Maximum value is 715.0
 In CPM column there are 208 values above the upper range, roughly 0.9 % of the total records

 Boxplot values for CPC -
 The Lower limit is -1.0
 The Minimum value is 0.0
 The Q1 (25th percentile) is 0.0
 The Q2 (50th percentile) (mid value) is 0.0
 The Q3 (75th percentile) is 1.0
 The Upper limit is 1.0
 The Maximum value is 7.26
 In CPC column there are 568 values above the upper range, roughly 2.46 % of the total records

 Please find the picture below which contains the boxplot of the 9 numerical columns

Page - 12
The Outlier treatment has been performed using the IQR method, the below pic shows the box plot after outliers
treatment.

Page - 13
 These are the boxplot summary after the outlier treatment

Boxplot values for Available_Impressions -

The Lower limit is -3707387.0
The Minimum value is 1.0
The Upper limit is 6268771.0
The Maximum value is 6268771.0
In Available_Impressions column there are 0 values above the upper range, roughly 0.0 % of the total records

Boxplot values for Matched_Queries -

The Lower limit is -1725344.0
The Minimum value is 1.0
The Upper limit is 2924326.0
The Maximum value is 2924326.25
In Matched_Queries column there are 0 values above the upper range, roughly 0.0 % of the total records

Boxplot values for Impressions -

The Lower limit is -1648666.0
The Minimum value is 1.0
Page - 14
The Upper limit is 2769086.0
The Maximum value is 2769085.5
In Impressions column there are 0 values above the upper range, roughly 0.0 % of the total records

Boxplot values for Clicks -

The Lower limit is -17416.0
The Minimum value is 1.0
The Upper limit is 30919.0
The Maximum value is 30919.375
In Clicks column there are 0 values above the upper range, roughly 0.0 % of the total records

Boxplot values for Spend -

The Lower limit is -4469.0
The Minimum value is 0.0
The Upper limit is 7676.0
The Maximum value is 7675.73
In Spend column there are 0 values above the upper range, roughly 0.0 % of the total records

Boxplot values for Revenue -

The Lower limit is -2999.0
The Minimum value is 0.0
The Upper limit is 5145.0
The Maximum value is 5145.297312499999
In Revenue column there are 0 values above the upper range, roughly 0.0 % of the total records

Boxplot values for CTR -

The Lower limit is -0.0
The Minimum value is 0.0001
The Upper limit is 0.0
The Maximum value is 0.45434435469063583
In CTR column there are 0 values above the upper range, roughly 0.0 % of the total records

Boxplot values for CPM -

The Lower limit is -15.0
The Minimum value is 0.0
The Upper limit is 30.0
The Maximum value is 29.974999999999998
In CPM column there are 0 values above the upper range, roughly 0.0 % of the total records

Boxplot values for CPC -

The Lower limit is -1.0
The Minimum value is 0.0
The Upper limit is 1.0
The Maximum value is 1.2400000000000002
In CPC column there are 0 values above the upper range, roughly 0.0 % of the total records

Here is the Histogram plot of columns after the outlier treatment

Page - 15
 The Histogram plots above, shows the treatment has been done all the values above the upper range has been equated to upper rang
e value.

1.1.4 Perform z-score scaling and discuss how it affects the speed of the algorithm.
The Z-score scaling was performed on the dataset. The volume of the data is high for sure the calculation time will
increase. The speed of the algorithm will depend upon the number of rows in a data set and any type of scaling alone
will not affect the speed of the algorithm

The mean value for the scaled data is 9.85752516445123e-18(~0)

The Standard deviation value for the scaled data is 1.0
The sum of the values in the scaled data is 2.0463630789890885e-12(~0)

Sorry, I am not sure how to answer this question as I don’t know how to gauge the performance of the algorithm

Page - 16
1.1.5 Perform Hierarchical by constructing a Dendrogram using WARD and Euclidean distance.

 The nearby pic is the dendrogram with

20 clusters, using WARD and Euclidean distance,
is shown.
 The Red line is drawn at 153 (WSS
value) to separate
 The optimum number of clusters of 4
clusters, can be selected from the dendrogram.

1.1.6 Make Elbow plot (up to n=10) and identify optimum number of clusters for k-means algorithm

The optimum number of clusters for k-means

algorithm is 4
The optimum number of clusters for k-means
algorithm is 4, as after red line, the slope of the line is
less and nothing much changed. So the Optimum
number of clusters can be chosen as 4

Page - 17
1.1.7 Print silhouette scores for up to 10 clusters and identify optimum number of clusters

Here is the list of the silhouette score for the clusters and minimum value silhouette width which determine how well it is clustered
 The silhouette score for 2 clusters is 0.4764675196030162. The Minimum value for the silhouette width is -0.07846579351816758
 The silhouette score for 3 clusters is 0.41202775538030506. The Minimum value for the silhouette width is -0.1213613474025022
 The silhouette score for 4 clusters is 0.4435830153133098. The Minimum value for the silhouette width is -0.05444643658905788
 The silhouette score for 5 clusters is 0.43504173598022067. The Minimum value for the silhouette width is -0.09688768275287114
 The silhouette score for 6 clusters is 0.4580612382343229. The Minimum value for the silhouette width is -0.2167164693199777
 The silhouette score for 7 clusters is 0.4436337977966384. The Minimum value for the silhouette width is -0.2183479792508185
 The silhouette score for 8 clusters is 0.4597928303095475. The Minimum value for the silhouette width is -0.15369094092604285
 The silhouette score for 9 clusters is 0.4624159152234553. The Minimum value for the silhouette width is -0.15474939888054123
 The silhouette score for 10 clusters is 0.44198839415394225. The Minimum value for the silhouette width is -0.136669270144576

The Optimum number of Clusters is identified as K=4 as (indicated as red line) the silhouette score for 4 clusters is 0.4435830153133098. The
minimum value for the silhouette width is -0.05444643658905788 ( Closer to 1 where compared to other values

1.1.8 Profile the ads based on optimum number of clusters using silhouette score and your domain
understanding
 The cluster column and sil_width is joined to the data set and means of the columns are grouped clusterwise. Clusters are from 0 to 3.

 14.27 % (3292.0) of the data rows has been mapped to Cluster 0

 17.52 % (4042.0) of the data rows has been mapped to Cluster 1
 26.25 % (6055.0) of the data rows has been mapped to Cluster 2
 41.95 % (9677.0) of the data rows has been mapped to Cluster 3

The below is the pic of the Bar plot fs the numerical variables hued by Device_type

Page - 18
 The Desktop and Mobile device usage looks almost the equal for the all the above displayed columns
 Even though the Clicks count for the Cluster 0 is high, the Cluster 1 ads is the revenue generating group
 The amount spent for the ads per 1000 impressions are high for the Cluster 0 and 3 Ad groups

The Pic below shows the Value counts of the items for each column hued by cluster

Page - 19
 From the above, Pic in the top, on the Inventorytype graph, Ads from the cluster 3 uses more Format4 of InventoryType
 Ad_ size of 72000 dimension is used more by the cluster 3 ads
 Advertising Fees of 35 % is mostly chosen percentage which is paid to the Franchise Entities by the Cluster 3 and Cluster2 groups of
ads

Here is the Graph which shows the Revenue for each column with unique value wise and hued by cluster groups .

 From the above graph, the revenue generated by the Cluster 1 group of ads is the high on all the columns.
 Format 2 of the InventoryType and 65520 of Ad_Size is mostly revenue generating for the Cluster1 group of ads.
 The revenue generated from the Ads which pays 21% Advertising Fees in the Cluster 3 group is the highest among the remaining percentages
followed by 23% Advertising Fees
 Cluster 3 gets revenue from all the Advertising Fees percentages levels when compared to other cluster groups.

Page - 20
1.1.9 Conclude the project by providing summary of your learning.
Recommendations:

More inferences from the analysed data will be provided for any recommendations
Groupby:1

The below table shows the mean of the values in the columns and grouped by Cluster 0 to 3

 The Cluster 3 group of Ads is the highest count when compared to other Clusters
 Cluster 1 group of Ads has the highest mean of the Available_Impression , Matched_Queries, Impressions, Spend, Revenue and CPC columns
 Even though Cluster 3 group has the highest in count, it the least in the means of the Available_Impression , Matched_Queries, Impressions, Clicks, Spend and Revenue
columns.
 The mean CPM (amount spent per 1000 impressions) is high for the Cluster 0 and 3 groups when compared to other Clusters, Cluster 1 has the least mean of CPM
 Even though the mean CTR (number of clicks that your ad receives divided by the number of times your ad is shown) is high for the Cluster 3 Ads the revenue generated is
the least when compared to the other clusters groups

Groupby:2

The below table shows the sum of the values in the columns and grouped by Cluster 0 to 3

 Cluster 1 group of Ads has got the highest sum of the Available_Impression , Matched_Queries, Impressions, Spend and Revenue columns.
 Followed by, the Cluster 2 has the next highest sum of the Available_Impression , Matched_Queries and Impressions columns

Crosstab 1:

 The Crosstab 1 pic on the left shows the cross tabulation of the Fee and Clusters with count as the value
 The Crosstab 1 pic on the Right shows the cross tabulation of the Fee and Clusters with Revenue as the value
 All of the Ads in the Cluster 3 group has paid the highest Advertising Fees of 35% paid by Franchise Entities and none towards the other percentages
 Cluster 1 group paid all the available categories as the Advertising Fees and Franchise Entities paid 33% Advertising Fees for the highest number of Ads
(1734)
 The Advertising Fees of 23% is the highly paid amount for the Cluster 1 group of Ads

Page - 21
Crosstab 2

Based on the Crosstab 2, We can see that only a certain combination of the Ad_Length and
Ad_width is available. Those are: -
480*70 = 33600
728*90 = 65520
300*250 = 75000
336*250 = 84000
720*300 = 216000
120*600 = 72000
300*600= 180000
As the Ad_Size = Ad_Length * Ad_width , Eventually these are the available sizes too.

Crosstab 3

Based on the Crosstab 3 with Revenas the value, Ad_Size 65520 (728*90)
is the highest revenue generating size which is used by cluster 1 group of
Ads

Crosstab 4a Crosstab 4b

The Crosstab 4a shows the Cross tabulation which involves Device_Type, Platform and Format in the row and Clusters in the column with Clicks as Value

The Crosstab 4b shows the Cross tabulation which involves Device_Type, Platform and Format in the row and Clusters in the column with Revenue as Value

Even though the Cluster 0 groups Ads has the highest number of clicks than Cluster 1, the Cluster 1 gets the more Revenue than Cluster 0

Other Inferences:

The Impression Share is the ratio of the Impression to the Available Impression

 The Impression Share for Cluster 0 is 0.547

 The Impression Share for Cluster 1 is 0.524
 The Impression Share for Cluster 2 is 0.457
 The Impression Share for Cluster 3 is 0.446

We can see that Cluster 0 Group has highest Impression Share

Page - 22
Here is the data for the clusters which shows the Matched Queries per click
 Matched_Queries per Click for Cluster 0 is 8.865
 Matched_Queries per Click for Cluster 1 is 500.579
 Matched_Queries per Click for Cluster 2 is 265.021
 Matched_Queries per Click for Cluster 3 is 10.697

We can infer that Cluster 1 group has the highest Matched Queries per click

Here is the data for the clusters which shows the Revenue per click
 Revenue per Click for Cluster 0 is 0.07
 Revenue per Click for Cluster 1 is 0.566
 Revenue per Click for Cluster 2 is 0.301
 Revenue per Click for Cluster 3 is 0.063

We can infer that Cluster 1 group has the highest Revenue per click.

Recommendations

Cluster 1 Group of Ads Spend more for the Advertising Ads strategically selecting the Ad_size of 65520 (728*90) and 75000 (300*250) paying the Fee of 23 % with
Format 2 of the inventory type is the highest in revenue Generating.

Page - 23
2.1 PCA FH (FT):

Primary census abstract for female headed households excluding institutional households (India & States/UTs - District Level), Scheduled tribes - 2011 PCA for
Female Headed Household Excluding Institutional Household.

The Indian Census has the reputation of being one of the best in the world. The first Census in India was conducted in the year 1872. This was conducted at
different points of time in different parts of the country. In 1881 a Census was taken for the entire country simultaneously. Since then, Census has been conducted
every ten years, without a break. Thus, the Census of India 2011 was the fifteenth in this unbroken series since 1872, the seventh after independence and the
second census of the third millennium and twenty first century. The census has been uninterruptedly continued despite of several adversities like wars, epidemics,
natural calamities, political unrest, etc. The Census of India is conducted under the provisions of the Census Act 1948 and the Census Rules, 1990.

The Primary Census Abstract which is important publication of 2011 Census gives basic information on Area, Total Number of Households, Total Population,
Scheduled Castes, Scheduled Tribes Population, Population in the age group 0-6, Literates, Main Workers and Marginal Workers classified by the four broad
industrial categories, namely, (i) Cultivators, ii) Agricultural Laborers, (iii) Household Industry Workers, and (iv) Other Workers and also Non-Workers.

The characteristics of the Total Population include Scheduled Castes, Scheduled Tribes, Institutional and Houseless Population and are presented by sex and rural-
urban residence. Census 2011 covered 35 States/Union Territories, 640 districts, 5,924 sub-districts, 7,935 Towns and 6,40,867 Villages.

The data collected has so many variables thus making it difficult to find useful details without using Data Science Techniques.

You are tasked to perform detailed EDA and identify Optimum Principal Components that explains the most variance in data. Use Sklearn only.

Note: The 24 variables given in the Rubric is just for performing EDA.

You will have to consider the entire dataset, including all the variables for performing PCA.

Data file - PCA India Data Census.xlsx

 Read the data and perform basic checks like checking head, info, summary, nulls, and duplicates, etc.

 Perform detailed Exploratory analysis by creating certain questions like (i) Which state has highest gender ratio and which has the lowest? (ii) Which
district has the highest & lowest gender ratio? (Example Questions). Pick 5 variables out of the given 24 variables below for EDA: No_HH, TOT_M,
TOT_F, M_06, F_06, M_SC, F_SC, M_ST, F_ST, M_LIT, F_LIT, M_ILL, F_ILL, TOT_WORK_M, TOT_WORK_F, MAINWORK_M, MAINWORK_F, MAIN_CL_M,
MAIN_CL_F, MAIN_AL_M, MAIN_AL_F, MAIN_HH_M, MAIN_HH_F, MAIN_OT_M, MAIN_OT_F

 We choose not to treat outliers for this case. Do you think that treating outliers for this case is necessary?

 Scale the Data using z-score method. Does scaling have any impact on outliers? Compare boxplots before and after scaling and comment.

 Perform all the required steps for PCA (use sklearn only) Create the covariance Matrix Get eigen values and eigen vector.

 Identify the optimum number of PCs (for this project, take at least 90% explained variance). Show Scree plot.

 Compare PCs with Actual Columns and identify which is explaining most variance. Write inferences about all the Principal components in terms of
actual variables.

 Write linear equation for first PC.

Page - 24
Page - 25
2.1.1 Read the data and perform basic checks like checking head, info, summary, nulls, and duplicates, etc.

 There are 640 rows and 61 columns

 There are 0 null values in the data set
 Number of duplicate rows = 0
 First 5 rows of the data set
 There are 0 null values in the data set
 Number of duplicate rows = 0
 The State Code column converted in Categorical variable
 The Dist.Code column converted in Categorical variable
 The State column converted in Categorical variable
 The Area Name column converted in Categorical variable

Page - 26
 Here are the last 5 rows of data.

Page - 27
 Here are the column names for the data set

 Here is the Description of Categorical data in the given data set

Page - 28
These are the list 61 column names

Page - 29
2.1.2 Perform detailed Exploratory analysis by creating certain questions like

(i) Which state has highest gender ratio and which has the lowest?

(ii) Which district has the highest & lowest gender ratio? (Example Questions).

Pick 5 variables out of the given 24 variables below for EDA: No_HH, TOT_M, TOT_F, M_06, F_06, M_SC, F_SC, M_ST, F_ST, M_LIT, F_LIT, M_ILL, F_ILL,
TOT_WORK_M, TOT_WORK_F, MAINWORK_M, MAINWORK_F, MAIN_CL_M, MAIN_CL_F, MAIN_AL_M, MAIN_AL_F, MAIN_HH_M, MAIN_HH_F, MAIN_OT_M,
MAIN_OT_F

Q1 Which state has highest gender ratio and which has the lowest?

Q1a Highest gender ratio

The state which has the highest Gender ratio is

State
Lakshadweep 0.87
Name: GenderRatio, dtype: float64

Q1b Lowest gender ratio

The state which has the Lowest Gender ratio is

State
Andhra Pradesh 0.53
Name: GenderRatio, dtype: float64

Q2 Which State/district has the highest & lowest gender ratio

The State/District which has the highest gender ratio is

State Area_Name
Lakshadweep Lakshadweep 0.87
Name: GenderRatio, dtype: float64

The State/District which has the lowest gender ratio is

State Area_Name
Andhra Pradesh Krishna 0.44
Name: GenderRatio, dtype: float64

Q3 Which are top 5 State/District that has the highest Literate Males and Literate Females?

The top 5 State/District that has the highest Literate Males are
State Area_Name
Maharashtra Mumbai Suburban 403261.00
West Bengal North Twenty Four Parganas 384839.00
Kerala Malappuram 371829.00
Maharashtra Thane 332986.00
Karnataka Bangalore 325690.00
Name: M_LIT, dtype: float64

The top 5 State/District that has the highest Literate Females are
State Area_Name
Kerala Malappuram 571140.00
Maharashtra Mumbai Suburban 568736.00
West Bengal North Twenty Four Parganas 517061.00
Maharashtra Thane 486756.00
Karnataka Bangalore 471354.00
Name: F_LIT, dtype: float64

Page - 30
2.1.3 We choose not to treat outliers for this case. Do you think that treating outliers for this case is necessary?

Page - 31
Page - 32
Boxplot values for No_HH - Boxplot values for F_ILL - Boxplot values for MAIN_OT_M -
The Upper limit is 143004.0 The Upper limit is 162627.0 The Upper limit is 47128.0
The Maximum value is 310450 The Maximum value is 254160 The Maximum value is 240855
In No_HH column there are 31 records above the In F_ILL column there are 26 records above the In MAIN_OT_M column there are 53 records
upper range, roughly 4.84 % of the total records upper range, roughly 4.06 % of the total records above the upper range, roughly 8.28 % of the
total records
Boxplot values for TOT_M - Boxplot values for TOT_WORK_M -
The Upper limit is 224454.0 The Upper limit is 104937.0 Boxplot values for MAIN_OT_F -
The Maximum value is 485417 The Maximum value is 269422 The Upper limit is 31207.0
In TOT_M column there are 25 records above the In TOT_WORK_M column there are 32 records The Maximum value is 209355
upper range, roughly 3.91 % of the total records above the upper range, roughly 5.0 % of the total In MAIN_OT_F column there are 59 records
records above the upper range, roughly 9.22 % of the
Boxplot values for TOT_F - total records
The Upper limit is 340853.0 Boxplot values for TOT_WORK_F -
The Maximum value is 750392 The Upper limit is 108939.0 Boxplot values for MARGWORK_M -
In TOT_F column there are 26 records above the The Maximum value is 257848 The Upper limit is 20094.0
upper range, roughly 4.06 % of the total records In TOT_WORK_F column there are 42 records The Maximum value is 47553
above the upper range, roughly 6.56 % of the In MARGWORK_M column there are 43 records
Boxplot values for M_06 - total records above the upper range, roughly 6.72 % of the
The Upper limit is 34200.0 total records
The Maximum value is 96223 Boxplot values for MAINWORK_M -
In M_06 column there are 32 records above the The Upper limit is 85617.0 Boxplot values for MARGWORK_F -
upper range, roughly 5.0 % of the total records The Maximum value is 247911 The Upper limit is 39061.0
In MAINWORK_M column there are 36 records The Maximum value is 66915
Boxplot values for F_06 - above the upper range, roughly 5.62 % of the In MARGWORK_F column there are 19 records
The Upper limit is 32747.0 total records above the upper range, roughly 2.97 % of the
The Maximum value is 95129 total records
In F_06 column there are 33 records above the Boxplot values for MAINWORK_F -
upper range, roughly 5.16 % of the total records The Upper limit is 73405.0 Boxplot values for MARG_CL_M -
The Maximum value is 226166 The Upper limit is 2735.0
Boxplot values for M_SC - In MAINWORK_F column there are 55 records The Maximum value is 13201
The Upper limit is 43375.0 above the upper range, roughly 8.59 % of the In MARG_CL_M column there are 55 records
The Maximum value is 103307 total records above the upper range, roughly 8.59 % of the
In M_SC column there are 29 records above the total records
upper range, roughly 4.53 % of the total records Boxplot values for MAIN_CL_M -
The Upper limit is 16202.0 Boxplot values for MARG_CL_F -
Boxplot values for F_SC - The Maximum value is 29113 The Upper limit is 5703.0
The Upper limit is 64545.0 In MAIN_CL_M column there are 25 records The Maximum value is 44324
The Maximum value is 156429 above the upper range, roughly 3.91 % of the In MARG_CL_F column there are 53 records
In F_SC column there are 29 records above the total records above the upper range, roughly 8.28 % of the
upper range, roughly 4.53 % of the total records total records
Boxplot values for MAIN_CL_F -
Boxplot values for M_ST - The Upper limit is 15335.0 Boxplot values for MARG_AL_M -
The Upper limit is 18704.0 The Maximum value is 36193 The Upper limit is 9442.0
The Maximum value is 96785 In MAIN_CL_F column there are 29 records above The Maximum value is 23719
In M_ST column there are 51 records above the the upper range, roughly 4.53 % of the total In MARG_AL_M column there are 48 records
upper range, roughly 7.97 % of the total records records above the upper range, roughly 7.5 % of the
total records
Boxplot values for F_ST - Boxplot values for MAIN_AL_M -
The Upper limit is 30556.0 The Upper limit is 18563.0 Boxplot values for MARG_AL_F -
The Maximum value is 130119 The Maximum value is 40843 The Upper limit is 20619.0
In F_ST column there are 58 records above the In MAIN_AL_M column there are 36 records The Maximum value is 45301
upper range, roughly 9.06 % of the total records above the upper range, roughly 5.62 % of the In MARG_AL_F column there are 30 records
total records above the upper range, roughly 4.69 % of the
Boxplot values for M_LIT - total records
The Upper limit is 163027.0 Boxplot values for MAIN_AL_F -
The Maximum value is 403261 The Upper limit is 24431.0 Boxplot values for MARG_HH_M -
In M_LIT column there are 30 records above the The Maximum value is 87945 The Upper limit is 784.0
upper range, roughly 4.69 % of the total records In MAIN_AL_F column there are 60 records above The Maximum value is 4298
the upper range, roughly 9.38 % of the total In MARG_HH_M column there are 58 records
Boxplot values for F_LIT - records above the upper range, roughly 9.06 % of the
The Upper limit is 180601.0 total records
The Maximum value is 571140 Boxplot values for MAIN_HH_M -
In F_LIT column there are 37 records above the The Upper limit is 2467.0 Boxplot values for MARG_HH_F -
upper range, roughly 5.78 % of the total records The Maximum value is 16429 The Upper limit is 2149.0
In MAIN_HH_M column there are 47 records The Maximum value is 15448
Boxplot values for M_ILL - above the upper range, roughly 7.34 % of the In MARG_HH_F column there are 39 records
The Upper limit is 60896.0 total records above the upper range, roughly 6.09 % of the
The Maximum value is 105961 total records
In M_ILL column there are 39 records above the Boxplot values for MAIN_HH_F -
upper range, roughly 6.09 % of the total records The Upper limit is 3216.0 Boxplot values for MARG_OT_M -
The Maximum value is 45979 The Upper limit is 8560.0
In MAIN_HH_F column there are 56 records The Maximum value is 24728
above the upper range, roughly 8.75 % of the In MARG_OT_M column there are 46 records
total record above the upper range, roughly 7.19 % of the
total records

Page - 33
Boxplot values for MARGWORK_0_3_M - Boxplot values for NON_WORK_M -
The Upper limit is 7168.0 The Upper limit is 1270.0
The Maximum value is 20648 The Maximum value is 6456
In MARGWORK_0_3_M column there are 48 In NON_WORK_M column there are 54 records
records above the upper range, roughly 7.5 % of above the upper range, roughly 8.44 % of the
the total records total records

Boxplot values for MARGWORK_0_3_F - Boxplot values for NON_WORK_F -

The Upper limit is 7776.0 The Upper limit is 1803.0
The Maximum value is 25844 The Maximum value is 10533
In MARGWORK_0_3_F column there are 40 In NON_WORK_F column there are 47 records
records above the upper range, roughly 6.25 % above the upper range, roughly 7.34 % of the
of the total records total records

Boxplot values for MARG_CL_0_3_M -

The Upper limit is 3551.0
The Maximum value is 9875
In MARG_CL_0_3_M column there are 48
records above the upper range, roughly 7.5 % of
the total records

Boxplot values for MARG_CL_0_3_F -

The Upper limit is 7564.0
The Maximum value is 21611
In MARG_CL_0_3_F column there are 35 records
above the upper range, roughly 5.47 % of the
total records

Boxplot values for MARG_AL_0_3_M -

The Upper limit is 606.0
The Maximum value is 5775
In MARG_AL_0_3_M column there are 55
records above the upper range, roughly 8.59 %
of the total records

Boxplot values for MARG_AL_0_3_F -

The Upper limit is 1258.0
The Maximum value is 17153
In MARG_AL_0_3_F column there are 59 records
above the upper range, roughly 9.22 % of the
total records

Boxplot values for MARG_HH_0_3_M -

The Upper limit is 1400.0
The Maximum value is 6116
In MARG_HH_0_3_M column there are 63
records above the upper range, roughly 9.84 %
of the total records

Boxplot values for MARG_HH_0_3_F -

The Upper limit is 3830.0
The Maximum value is 13714
In MARG_HH_0_3_F column there are 46
records above the upper range, roughly 7.19 %
of the total records

Boxplot values for MARG_OT_0_3_M -

The Upper limit is 176.0
The Maximum value is 895
In MARG_OT_0_3_M column there are 60
records above the upper range, roughly 9.38 %
of the total records

Boxplot values for MARG_OT_0_3_F -

The Upper limit is 536.0
The Maximum value is 3354
In MARG_OT_0_3_F column there are 42 records
above the upper range, roughly 6.56 % of the
total records

Page - 34
2.1.4 Scale the Data using z-score method. Does scaling have any impact on outliers? Compare boxplots before and after scaling and comment.

The Z-score scaling is applied and here are first 5 records for the columns. The below pic shows the first 5 records of all column

Page - 35
The below pic shows the Boxplot after Z-scaling, there is no change in the Outliers.

2.1.5 Perform all the required steps for PCA (use sklearn only) Create the covariance Matrix Get eigen values and eigen vector.

For PCA,
 As P-value is 0.0 we can reject the Null Hypothesis and there are significant correlations exist
 The kmo_model value is 0.804, it is above 0.7 and shows that adequate sample size

Here is the sample view of the eigen vector (Left Pic) and eigen values (Right Pic)

Page - 36
Here is the Heat map of the columns

Here is the glimpse of the (extracted_loadings ) table of the columns name and 57 PC values. (Unable to show entire table due to size restrictions)

Page - 37
2.1.6 Identify the optimum number of PCs (for this project, take at least 90% explained variance). Show Scree plot.

Here is the pic which show the Scree plot.

If the explained variance is taken as 90%, the Optimum number of PC can be taken as 6. PC 1 to PC 6.

Page - 38
Here is the PCs vs the columns on the basis of cumulative explained variance

Page - 39
2.1.7 Compare PCs with Actual Columns and identify which is explaining most variance. Write inferences about all the principal components in terms of actual
variables.

Page - 40
Page - 41
Here are the original features influence various
PCs

Page - 42
Here is the Extracted required number of PCs (as per the cumulative explained variance)

2.1.8 Write linear equation for first PC.

The Linear equation as follows

( 0.16 ) * No_HH + ( 0.17 ) * TOT_M + ( 0.17 ) * TOT_F + ( 0.16 ) * M_06 + ( 0.16 ) * F_06 + ( 0.15 ) * M_SC + ( 0.15 ) * F_SC + ( 0.03 ) * M_ST + ( 0.03 ) * F_ST + (
0.16 ) * M_LIT + ( 0.15 ) * F_LIT + ( 0.16 ) * M_ILL + ( 0.17 ) * F_ILL + ( 0.16 ) * TOT_WORK_M + ( 0.15 ) * TOT_WORK_F + ( 0.15 ) * MAINWORK_M + ( 0.12 ) *
MAINWORK_F + ( 0.1 ) * MAIN_CL_M + ( 0.07 ) * MAIN_CL_F + ( 0.11 ) * MAIN_AL_M + ( 0.07 ) * MAIN_AL_F + ( 0.13 ) * MAIN_HH_M + ( 0.08 ) * MAIN_HH_F + (
0.12 ) * MAIN_OT_M + ( 0.11 ) * MAIN_OT_F + ( 0.16 ) * MARGWORK_M + ( 0.16 ) * MARGWORK_F + ( 0.08 ) * MARG_CL_M + ( 0.05 ) * MARG_CL_F + ( 0.13 ) *
MARG_AL_M + ( 0.11 ) * MARG_AL_F + ( 0.14 ) * MARG_HH_M + ( 0.13 ) * MARG_HH_F + ( 0.16 ) * MARG_OT_M + ( 0.15 ) * MARG_OT_F + ( 0.16 ) *
MARGWORK_3_6_M + ( 0.16 ) * MARGWORK_3_6_F + ( 0.17 ) * MARG_CL_3_6_M + ( 0.16 ) * MARG_CL_3_6_F + ( 0.09 ) * MARG_AL_3_6_M + ( 0.05 ) *
MARG_AL_3_6_F + ( 0.13 ) * MARG_HH_3_6_M + ( 0.11 ) * MARG_HH_3_6_F + ( 0.14 ) * MARG_OT_3_6_M + ( 0.12 ) * MARG_OT_3_6_F + ( 0.15 ) *
MARGWORK_0_3_M + ( 0.15 ) * MARGWORK_0_3_F + ( 0.15 ) * MARG_CL_0_3_M + ( 0.14 ) * MARG_CL_0_3_F + ( 0.05 ) * MARG_AL_0_3_M + ( 0.04 ) *
MARG_AL_0_3_F + ( 0.12 ) * MARG_HH_0_3_M + ( 0.12 ) * MARG_HH_0_3_F + ( 0.14 ) * MARG_OT_0_3_M + ( 0.13 ) * MARG_OT_0_3 _F + ( 0.15 ) *
NON_WORK_M + ( 0.13 ) * NON_WORK_F +

Page - 43

Data Mining - Business Report: Clustering Clean - Ads
100% (4)
Data Mining - Business Report: Clustering Clean - Ads
24 pages
Data Mining
75% (4)
Data Mining
22 pages
Arnab Chowdhury DM
75% (4)
Arnab Chowdhury DM
14 pages
Data Mining Project DSBA Clustering Report Final
100% (4)
Data Mining Project DSBA Clustering Report Final
26 pages
Total Ad Spent Clicks Conversion Clicks Total Ad Spent Conversion CPA
No ratings yet
Total Ad Spent Clicks Conversion Clicks Total Ad Spent Conversion CPA
7 pages
Data Mininig Project
67% (3)
Data Mininig Project
28 pages
P L Lohitha 11-11-22 Data Mining Business Report
No ratings yet
P L Lohitha 11-11-22 Data Mining Business Report
47 pages
ML-1+Project
No ratings yet
ML-1+Project
30 pages
Business Report
No ratings yet
Business Report
20 pages
Graded Project
No ratings yet
Graded Project
37 pages
VARUNSAINI - 11 Dec 2022
No ratings yet
VARUNSAINI - 11 Dec 2022
16 pages
ML 1
No ratings yet
ML 1
27 pages
Rahulsharma - 03 12 23
No ratings yet
Rahulsharma - 03 12 23
25 pages
Data Mining Project DSBA Clustering Report Final
No ratings yet
Data Mining Project DSBA Clustering Report Final
26 pages
Data Mining Project DSBA Clustering Report Final
No ratings yet
Data Mining Project DSBA Clustering Report Final
26 pages
Business Report DSBA Data Mining Project - Part 2 Segmentation Using K-Means Clustering
No ratings yet
Business Report DSBA Data Mining Project - Part 2 Segmentation Using K-Means Clustering
28 pages
DATA MINING Project Report
No ratings yet
DATA MINING Project Report
28 pages
Rahulsharma - 03 12 23
No ratings yet
Rahulsharma - 03 12 23
26 pages
Data Mining - Project
100% (2)
Data Mining - Project
25 pages
Machine Learning-1 BUSINESS REPORT
No ratings yet
Machine Learning-1 BUSINESS REPORT
122 pages
RAJIV RANJAN 22 Jan 2023
No ratings yet
RAJIV RANJAN 22 Jan 2023
66 pages
Machine Learning-1 Project
No ratings yet
Machine Learning-1 Project
47 pages
Data Mining Project - Abinaya John
No ratings yet
Data Mining Project - Abinaya John
42 pages
Data Mining Assignment-Clustering Data-Ads 24x7 Summary
No ratings yet
Data Mining Assignment-Clustering Data-Ads 24x7 Summary
12 pages
Sukanya 3rd December 2023 Machine Learning1 Coded
No ratings yet
Sukanya 3rd December 2023 Machine Learning1 Coded
58 pages
Name - Atharva Navghane Roll No - 2301117 Div B Krai Flip Classroom Assignment On Regression Analysis
No ratings yet
Name - Atharva Navghane Roll No - 2301117 Div B Krai Flip Classroom Assignment On Regression Analysis
58 pages
Data Mining Project Ashwani 3 PDF
100% (1)
Data Mining Project Ashwani 3 PDF
20 pages
Great Learning DATA MINING PROJECT
No ratings yet
Great Learning DATA MINING PROJECT
15 pages
Case Study Module 1
No ratings yet
Case Study Module 1
4 pages
Data Mining Clustering PDF
No ratings yet
Data Mining Clustering PDF
15 pages
Data Mining
No ratings yet
Data Mining
24 pages
In Tenshi PPP Tte Jum Am
No ratings yet
In Tenshi PPP Tte Jum Am
23 pages
Internship Report Data Science
100% (1)
Internship Report Data Science
58 pages
Predictive Modeling
No ratings yet
Predictive Modeling
42 pages
Assignment 1
No ratings yet
Assignment 1
3 pages
Data Mining Project
No ratings yet
Data Mining Project
6 pages
Chapter1 - Introduction & Overview
No ratings yet
Chapter1 - Introduction & Overview
42 pages
Assignment Report - Data Mining
No ratings yet
Assignment Report - Data Mining
24 pages
DATA SCIENCE SAMPLE
No ratings yet
DATA SCIENCE SAMPLE
5 pages
Data Science Lab Manual
No ratings yet
Data Science Lab Manual
30 pages
A - B Testing
No ratings yet
A - B Testing
31 pages
Reading Py Spark ML Lib
No ratings yet
Reading Py Spark ML Lib
12 pages
Data Mining Project - Brahma Chari
No ratings yet
Data Mining Project - Brahma Chari
23 pages
1A
No ratings yet
1A
62 pages
Test Answer
No ratings yet
Test Answer
11 pages
Nikita Prasad - Exploratory Data Analysis (EDA)
No ratings yet
Nikita Prasad - Exploratory Data Analysis (EDA)
18 pages
Cart-Rf-Ann: Prepared by Muralidharan N
67% (3)
Cart-Rf-Ann: Prepared by Muralidharan N
33 pages
Susmita Prajapati University of Cumberlands Dr. Cynthia Mcmahon Analyzing and Visualizing Data
No ratings yet
Susmita Prajapati University of Cumberlands Dr. Cynthia Mcmahon Analyzing and Visualizing Data
10 pages
PM Guided Project Sample Business Report.docx
No ratings yet
PM Guided Project Sample Business Report.docx
35 pages
Walmart Business Case - Updated
No ratings yet
Walmart Business Case - Updated
47 pages
PSK Unit 1 Merged
No ratings yet
PSK Unit 1 Merged
125 pages
Big Data and Analytics
No ratings yet
Big Data and Analytics
86 pages
Chapter 1: Introduction To Business Analytics
No ratings yet
Chapter 1: Introduction To Business Analytics
14 pages
Monika Sree 08-06-2024
No ratings yet
Monika Sree 08-06-2024
36 pages
COMP1810 - Data and Web Analytics
100% (1)
COMP1810 - Data and Web Analytics
47 pages
Advance Stats Assignment
No ratings yet
Advance Stats Assignment
18 pages
Pca Anova PDF
No ratings yet
Pca Anova PDF
21 pages
Team 56B
No ratings yet
Team 56B
17 pages
1 - Creating A Data Transformation Pipeline With Cloud Dataprep
0% (1)
1 - Creating A Data Transformation Pipeline With Cloud Dataprep
39 pages
Manufacturing: Engineering, Management and Marketing
From Everand
Manufacturing: Engineering, Management and Marketing
S.O.T Ogaji
No ratings yet
ap_calculus_ab_chapter_6_practice_test
No ratings yet
ap_calculus_ab_chapter_6_practice_test
5 pages
Seating Arrangement
No ratings yet
Seating Arrangement
16 pages
Parameter Estimation 1: Linear Least Squares
No ratings yet
Parameter Estimation 1: Linear Least Squares
7 pages
Blob Detection Final Version PDF
No ratings yet
Blob Detection Final Version PDF
14 pages
Outline: Regression Analysis in Python
No ratings yet
Outline: Regression Analysis in Python
11 pages
Heat Conduction in A Slab: Dr. Nurul Hasan
No ratings yet
Heat Conduction in A Slab: Dr. Nurul Hasan
14 pages
Critical Success Factors Affecting E-Procurement Adoption in Public Sector Organizations in Sri Lanka PDF
No ratings yet
Critical Success Factors Affecting E-Procurement Adoption in Public Sector Organizations in Sri Lanka PDF
27 pages
CO Assessment CEL 333
No ratings yet
CO Assessment CEL 333
27 pages
class 9th motion
No ratings yet
class 9th motion
3 pages
Control Systems (1-135) PDF
No ratings yet
Control Systems (1-135) PDF
128 pages
Amity Assignment - Continuty and Differentiability
No ratings yet
Amity Assignment - Continuty and Differentiability
4 pages
Remainder Theorem
No ratings yet
Remainder Theorem
6 pages
Mathematics in Modern World Midterm Exam PDF
No ratings yet
Mathematics in Modern World Midterm Exam PDF
10 pages
The Vehicle Steer by Wire Control System by Implementing PID Controller
No ratings yet
The Vehicle Steer by Wire Control System by Implementing PID Controller
5 pages
Barredo Michael MMW Introduction-Of-Data-Management
No ratings yet
Barredo Michael MMW Introduction-Of-Data-Management
84 pages
12-Uk Windpro2 7-Loss Uncertainty
No ratings yet
12-Uk Windpro2 7-Loss Uncertainty
31 pages
Plane Geometries
No ratings yet
Plane Geometries
5 pages
Analysis with an introduction to proof. Fifth Edition, Pearson New International Edition Steven R. Lay - eBook PDF download pdf
100% (6)
Analysis with an introduction to proof. Fifth Edition, Pearson New International Edition Steven R. Lay - eBook PDF download pdf
69 pages
AIML Lab Manual
No ratings yet
AIML Lab Manual
9 pages
ML Lab Manual Arpan
No ratings yet
ML Lab Manual Arpan
48 pages
JR MPC TERM-II EXAM SYLLABUS& BLUE PRINT-2024-25
No ratings yet
JR MPC TERM-II EXAM SYLLABUS& BLUE PRINT-2024-25
1 page
Aggregate Functions - in OBIEE
No ratings yet
Aggregate Functions - in OBIEE
10 pages
Compiled by Solomon Kebede
No ratings yet
Compiled by Solomon Kebede
136 pages
IOQM Mock Test-2
100% (2)
IOQM Mock Test-2
7 pages
Speed of Sound - Detailed - Lesson Plan
No ratings yet
Speed of Sound - Detailed - Lesson Plan
9 pages
Final Report: Dynamic Behavior of A Structure: Agustín Armando Lozano Salas
No ratings yet
Final Report: Dynamic Behavior of A Structure: Agustín Armando Lozano Salas
14 pages
CPM 05 PERT_58966961_2025_04_22_16_15
No ratings yet
CPM 05 PERT_58966961_2025_04_22_16_15
4 pages
Simple Linear Regression: Yandell - Econ 216 Chap 13-1
No ratings yet
Simple Linear Regression: Yandell - Econ 216 Chap 13-1
70 pages
Verbal Reasoning - Series
0% (1)
Verbal Reasoning - Series
263 pages
Plate 5 Copy 2
No ratings yet
Plate 5 Copy 2
11 pages