
Data Science Imp Questions 2

The document outlines a series of questions related to data science, including topics such as the differences between data structures, machine learning techniques, data collection strategies, and the data science lifecycle. It also includes practical coding tasks involving Python libraries like Numpy and DataFrame operations using Pandas. Additionally, it covers the roles of data science professionals and various applications of data science across different fields.

Uploaded by

prem prasad
Copyright
© All Rights Reserved

Q1. Attempt any four parts. (4x5=20)

a) What is a Series, and how is it different from a 1-D array, a list, and a dictionary?
b) Differentiate between supervised and unsupervised learning techniques.
c) Name any four Python libraries and their applications.
d) Describe any five data collection strategies.
e) Give four ways of creating NumPy arrays.
f) Explain four major tasks in data pre-processing.
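
For part (e), the following is a minimal sketch of four common ways to create a NumPy array. The variable names and values are illustrative assumptions, and other creation routines would be equally valid answers.

```python
import numpy as np

# 1. From an existing Python list (or nested list)
from_list = np.array([1, 2, 3, 4])

# 2. From a numeric range with a fixed step: 0, 2, 4, 6, 8
from_range = np.arange(0, 10, 2)

# 3. Pre-filled arrays of a given shape
zeros = np.zeros((2, 3))
ones = np.ones((2, 3))

# 4. Evenly spaced values between two endpoints (both included)
spaced = np.linspace(0.0, 1.0, 5)
```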

Q2. (a) Explain the roles and responsibilities of any five Data Science professionals. (5)
(b) What are the various types of data in statistics? Explain with examples. (5)

Q3. (a) What is the Data Science lifecycle? Explain all stages with a diagram. (5)
(b) What are missing values? What are the strategies to handle them? Explain
four methods of imputation, giving an example of each. (5)
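
For Q3 (b), the following is a rough sketch of four imputation approaches in pandas. The question does not say which four methods are expected, so the choice here (mean, median, forward fill, linear interpolation) is an assumption, and the Series `s` is hypothetical data.

```python
import numpy as np
import pandas as pd

# Hypothetical column with two missing values
s = pd.Series([10.0, np.nan, 30.0, np.nan, 50.0])

mean_filled = s.fillna(s.mean())      # 1. replace NaN with the column mean
median_filled = s.fillna(s.median())  # 2. replace NaN with the column median
ffilled = s.ffill()                   # 3. carry the last observed value forward
interpolated = s.interpolate()        # 4. linear interpolation between neighbours
```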

Q4. (a) Explain four methods of creating a DataFrame, using: (5)
i. Multiple lists of different lengths
ii. Multiple Series objects
iii. A nested dictionary
iv. A NumPy array
(b) Explain five applications of Data Science in different fields. (5)
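
For Q4 (a), one possible sketch of the four construction methods follows. All names and values are illustrative assumptions. Part i is read here as "each list is one row"; if the lists are meant to be columns, they would first need padding to a common length.

```python
import numpy as np
import pandas as pd

# i. Lists of different lengths: shorter rows are padded with NaN
df_lists = pd.DataFrame([[1, 2, 3], [4, 5]])

# ii. Multiple Series: values are aligned on the Series index
s1 = pd.Series([1, 2, 3])
s2 = pd.Series([10, 20])
df_series = pd.DataFrame({"first": s1, "second": s2})

# iii. Nested dictionary: outer keys become columns, inner keys become row labels
df_nested = pd.DataFrame({"col1": {"r1": 1, "r2": 2}, "col2": {"r1": 3, "r2": 4}})

# iv. NumPy array: one DataFrame row per array row, with optional column names
df_array = pd.DataFrame(np.arange(6).reshape(2, 3), columns=["a", "b", "c"])
```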

Q5. Give four ways of creating NumPy arrays. (10)

Give the code or syntax to perform the following operations on two 2D NumPy arrays, array1 and array2, and
a 1D array, array3.
a. Add array1 and array2.
b. Find the sum of array1 elements over a given axis.
c. Find the product of array2 elements over a given axis.
d. Change the dimension of array3 to 2D.
e. Transpose the array created in part d.
f. Display two rows and the third column of the 2D array array1.
g. Join two 2D arrays along the row axis.
h. Convert array2 to a 1D array.
i. Split array1 into multiple subarrays.
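
The operations in parts a to i can be sketched as below. The contents of `array1`, `array2`, and `array3` are assumed for illustration, as are the chosen axes; part f is read as "first two rows, third column", which is only one possible interpretation.

```python
import numpy as np

# Assumed sample data; the question does not specify the contents
array1 = np.array([[1, 2, 3], [4, 5, 6]])
array2 = np.array([[10, 20, 30], [40, 50, 60]])
array3 = np.arange(6)

added = array1 + array2                            # a. element-wise addition
col_sums = array1.sum(axis=0)                      # b. sum down each column
row_prods = array2.prod(axis=1)                    # c. product across each row
reshaped = array3.reshape(2, 3)                    # d. 1D array to 2D
transposed = reshaped.T                            # e. transpose of the array from d
picked = array1[0:2, 2]                            # f. first two rows, third column
joined = np.concatenate((array1, array2), axis=0)  # g. join along rows
flat = array2.flatten()                            # h. 2D array to 1D
parts = np.split(array1, 3, axis=1)                # i. three equal column blocks
```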

Q6. Give four ways of creating a Series: from a list, an array, a dictionary, and a scalar value. (10)
a) Write Python code to create the following Series:
101 Harsh
102 Arun
103 Ankur
104 Harpal
105 Divya
106 Jeet
b) Show details of the first 3 employees using the head function.
c) Show details of the last 3 employees using the tail function.
d) Show details of the first 3 employees without using the head function.
e) Show details of the last 3 employees without using the tail function.
f) Show the value at index 102.
g) Show the 2nd to 4th records.
h) Show the values at index 101, 103, and 105.
i) Show details of “Arun”.
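
A possible sketch for Q6 follows. The first four Series use made-up values purely to show each construction route; the employee Series uses the IDs and names listed in part a. Variable names such as `emp` are assumptions.

```python
import numpy as np
import pandas as pd

# Four ways of creating a Series (illustrative values)
from_list = pd.Series([10, 20, 30])
from_array = pd.Series(np.array([1.5, 2.5, 3.5]))
from_dict = pd.Series({"a": 1, "b": 2})             # keys become the index
from_scalar = pd.Series(7, index=["x", "y", "z"])   # scalar repeated per label

# a) Employee Series with IDs as the index
emp = pd.Series(
    ["Harsh", "Arun", "Ankur", "Harpal", "Divya", "Jeet"],
    index=[101, 102, 103, 104, 105, 106],
)

first3 = emp.head(3)                  # b) first 3 with head
last3 = emp.tail(3)                   # c) last 3 with tail
first3_slice = emp.iloc[:3]           # d) first 3 by position
last3_slice = emp.iloc[-3:]           # e) last 3 by position
at_102 = emp.loc[102]                 # f) value at index label 102
second_to_fourth = emp.iloc[1:4]      # g) 2nd to 4th records
selected = emp.loc[[101, 103, 105]]   # h) several labels at once
arun = emp[emp == "Arun"]             # i) boolean filter on the values
```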

Q7. Create a DataFrame for the data given below. (10)

Write code to perform the following operations on the above DataFrame:
i. Print the batsman name along with runs scored in Test and T20 using column names and dot
notation.
ii. Display the batsman name along with runs scored in ODI using loc.
iii. Display the details of batsmen who scored:
a. More than 2000 runs in ODI
b. Less than 2500 runs in Test
c. More than 1500 runs in T20
iv. Display the columns using column index numbers such as 0, 2, 4.
v. Display alternate rows.
vi. Reindex the DataFrame created above by batsman name, then delete the data of Hardik Pandya and
Shikhar Dhawan from the original DataFrame using their index.
vii. Insert 2 rows into the DataFrame and delete the rows whose index is 1 and 4.
viii. Delete the column Test, add a new column Total at the end (after the T20 column), and store the sum of ODI
and T20 runs in that column.
ix. Rename the column T20 to “T20I Runs”.
x. Print the DataFrame without headers.

OR
Q8. Create the following DataFrame Sales containing year-wise sales figures for five salespersons in INR. Use
the years as column labels, and salesperson names as row labels. (10)

          2014    2015    2016    2017
Madhu     100.5   12000    2000   50000
Kusum     150.8   18000    5000   60000
Kinshuk   200.9   22000   70000   70000
Ankit     30000   30000    1000   80000
Shruti    40000   45000    1250   90000

a. Display the row labels of Sales.
b. Display the column labels of Sales.
c. Display the dimensions, shape, size and values of Sales.
d. Display the last two rows of Sales.
e. Display the first two columns of Sales.
f. Change the DataFrame Sales such that it becomes its transpose.
g. Add data to Sales for salesman Sumeet where the sales made are [196.2, 37800, 52000, 78438] in the
years [2014, 2015, 2016, 2017] respectively.
h. Delete the data for the year 2014 from the DataFrame Sales.
i. Update the sale made by Shruti in 2017 to 100000.
j. Write the values of DataFrame Sales to a comma-separated file SalesFigures.csv on the disk. Do not
write the row labels and column labels.
k. Change the name of the salesperson Ankit to Vivaan and Kinshuk to Shailesh.
l. Delete the data for salesman Madhu from the DataFrame Sales.
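
One way to approach Q8 is sketched below, using the figures from the table in the question. Lowercase `sales` and the other variable names are assumptions. For part f the transpose is stored separately so that later steps keep the original layout, and for part j the CSV text is returned as a string; passing a path such as "SalesFigures.csv" as the first argument to `to_csv` would write it to disk instead.

```python
import pandas as pd

sales = pd.DataFrame(
    {
        2014: [100.5, 150.8, 200.9, 30000, 40000],
        2015: [12000, 18000, 22000, 30000, 45000],
        2016: [2000, 5000, 70000, 1000, 1250],
        2017: [50000, 60000, 70000, 80000, 90000],
    },
    index=["Madhu", "Kusum", "Kinshuk", "Ankit", "Shruti"],
)

row_labels = sales.index                               # a. row labels
col_labels = sales.columns                             # b. column labels
dims = (sales.ndim, sales.shape, sales.size)           # c. sales.values holds the data
last_two_rows = sales.tail(2)                          # d. last two rows
first_two_cols = sales.iloc[:, :2]                     # e. first two columns
transposed = sales.T                                   # f. transpose (kept separately)

sales.loc["Sumeet"] = [196.2, 37800, 52000, 78438]     # g. add a row by label
sales = sales.drop(columns=2014)                       # h. drop the 2014 column
sales.loc["Shruti", 2017] = 100000                     # i. update a single cell
csv_text = sales.to_csv(index=False, header=False)     # j. no row or column labels
sales = sales.rename(index={"Ankit": "Vivaan", "Kinshuk": "Shailesh"})  # k. rename rows
sales = sales.drop(index="Madhu")                      # l. drop a row by label
```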
