Understanding Data Assignment 2

The document consists of multiple-choice questions, fill-in-the-blank exercises, and short answer questions focused on understanding data concepts, including structured and unstructured data, measures of central tendency, and data processing techniques. Key topics include the significance of data in decision-making, statistical techniques, and the role of metadata. The document serves as an educational resource for assessing knowledge in data understanding and analysis.

Uploaded by

rk1522414
Copyright
© © All Rights Reserved

ASSIGNMENT 2

CHAPTER 2: UNDERSTANDING DATA


MULTIPLE CHOICE QUESTIONS (MCQs)
1. What is the primary purpose of collecting data?
A. To decorate webpages

B. To store data indefinitely

C. To make decisions based on analysis

D. To reduce computer memory usage

Answer: C. To make decisions based on analysis

2. Which of the following is an example of structured data?

A. A newspaper article

B. An image on a website

C. A table showing inventory items in a shop

D. A video file

Answer: C. A table showing inventory items in a shop

3. What does metadata represent?

A. Multimedia content

B. Data about data

C. Only numerical data

D. None of the above

Answer: B. Data about data

4. Which of the following is an outlier?

A. A value that is identical to the mean

B. A value that occurs most frequently

C. An extremely high or low value compared to others

D. A value with zero frequency

Answer: C. An extremely high or low value compared to others

5. What statistical technique is most affected by outliers?

A. Median

B. Mode
C. Mean

D. Standard Deviation

Answer: C. Mean

6. In structured data, what is an attribute?

A. The format of an image

B. A column representing a characteristic

C. A row of values

D. A description of metadata

Answer: B. A column representing a characteristic

7. Which of the following is not a measure of central tendency?

A. Mean

B. Median

C. Mode

D. Range

Answer: D. Range

8. What is the correct order of the data processing cycle?

A. Input → Output → Processing

B. Collection → Preparation → Entry → Storage → Processing → Output

C. Output → Processing → Storage

D. Entry → Output → Collection

Answer: B. Collection → Preparation → Entry → Storage → Processing → Output

9. Which of the following is used to summarize data for easy understanding?

A. Audio processing

B. Metadata generation

C. Statistical techniques

D. Data hiding

Answer: C. Statistical techniques

10. What is the standard deviation used for?

A. To find the middle value

B. To count data entries

C. To measure the spread of data


D. To identify structured data

Answer: C. To measure the spread of data

11. What is the mode of the dataset [34, 34, 27, 28, 27, 34, 34]?

A. 27

B. 34

C. 28

D. 33

Answer: B. 34

12. Which of the following storage devices is volatile?

A. Pen Drive

B. CD/DVD

C. HDD

D. None of the above

Answer: D. None of the above

13. Unstructured data can include all except:

A. Audio files

B. Web pages with multimedia

C. Tables with rows and columns

D. Social media messages

Answer: C. Tables with rows and columns

14. Which of the following tasks represents data collection?

A. Calculating standard deviation

B. Retrieving data from a file

C. Filling student details in an online form

D. Printing a report card

Answer: C. Filling student details in an online form

15. Which tool is suggested for data processing in future chapters?

A. Java

B. MySQL

C. Python

D. HTML
Answer: C. Python

16. Which statistical technique is suitable for finding income disparity?

A. Mode

B. Median

C. Mean

D. Standard Deviation

Answer: D. Standard Deviation

17. What does the median represent in a dataset?

A. The average of all values

B. The value that occurs most frequently

C. The value that appears at the centre after sorting

D. The maximum value in the list

Answer: C. The value that appears at the centre after sorting

18. Which of the following best describes unstructured data?

A. Can be easily stored in tables

B. Has a clear format with fixed fields

C. Lacks predefined structure

D. Cannot be stored digitally

Answer: C. Lacks predefined structure

19. What is the formula for standard deviation (σ) as per the chapter?

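A. $\sigma = \text{Mean of values}$

B. $\sigma = \text{Maximum} - \text{Minimum}$

C. $\sigma = \sqrt{\sum (x_i - \bar{x})^2 / n}$

D. $\sigma = \sum (x_i + \bar{x}) / n$

Answer: C. $\sigma = \sqrt{\sum (x_i - \bar{x})^2 / n}$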

FILL IN THE BLANKS

1. Data which is organised and can be recorded in a well-defined format is called ________.
Answer: Structured Data
2. Data which do not follow any fixed structure or format are called ________.
Answer: Unstructured Data
3. The singular form of the word ‘data’ is ________.
Answer: Datum
4. The process of collecting, storing, and analysing data for decision making is known as
________.
Answer: Data Processing
5. ________ is a measure of central tendency that represents the average of a set of values.
Answer: Mean
6. The ________ is the middle value in a sorted list of data values.
Answer: Median
7. The value that occurs most frequently in a data set is called the ________.
Answer: Mode
8. The difference between the maximum and minimum values in a data set is called the
________.
Answer: Range
9. The standard deviation is represented by the Greek letter ________.
Answer: Sigma (𝜎)
10. The data describing other data is referred to as ________.
Answer: Metadata
11. Examples of digital storage devices include HDD, SSD, CD/DVD, Pen Drive, and ________.
Answer: Memory Card
12. Statistical techniques used to summarise data include mean, median, mode, range, and
________.
Answer: Standard Deviation
13. The process of obtaining data from reliable sources before processing is called ________.
Answer: Data Collection
14. ICT revolution has led to the generation of ________ volume of data at a very fast pace.
Answer: Large
15. The structured data is generally stored in a ________ format in computers.
Answer: Tabular

2 MARKS QUESTIONS

1. What is the difference between data and information?


Answer:
Data refers to unorganised facts that need to be processed, while information is the
processed form of data that is meaningful and useful for decision making.
2. Define the term ‘datum’.
Answer:
‘Datum’ is the singular form of the word ‘data’. It represents a single piece of information or
value.
3. What is metadata? Give an example.
Answer:
Metadata is data about data. For example, in an image file, metadata may include image size,
type (JPEG, PNG), and resolution.
4. Differentiate between structured and unstructured data.
Answer:
- Structured Data: Organised in rows and columns (e.g., database tables).
- Unstructured Data: Lacks a defined format (e.g., emails, web pages, videos).
5. Give two examples of structured data.
Answer:
1. School fee payment records with fields like StudentName, RollNo, and FeesAmount.
2. Inventory table with fields like ProductName, UnitPrice, and Quantity.
6. Mention two examples of unstructured data.
Answer: 1. Social media posts with text, images, and videos.
2. Email content with body text and attachments.

7. What is the significance of data in decision making?

Answer:

Data helps identify trends, draw conclusions, and support decisions in areas such as
business, education, healthcare, and governance.

8. Define mean and write its formula.

Answer:

Mean is the average of numeric values. Formula: $\text{Mean} = (x_1 + x_2 + \dots + x_n)/n$

9. What is median? How is it calculated for even number of values?

Answer:

Median is the middle value in an ordered list. For even number of values, it is the

average of the two middle values.
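The rule for odd and even counts can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `median_of` and the four-value list are assumptions made for the example, and the input is assumed to be non-empty.

```python
import statistics

def median_of(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)          # the median is defined on sorted data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_example = [85, 90, 100, 110, 115]
even_example = [85, 90, 100, 110]     # hypothetical list with an even count

print(median_of(odd_example))         # middle value: 100
print(median_of(even_example))        # average of 90 and 100: 95.0

# The standard library follows the same rule.
print(statistics.median(even_example))
```

Sorting first matters: taking the middle of an unsorted list would give a position, not the median.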

10. Define mode with an example.

Answer:

Mode is the value that appears most frequently in a dataset. Example: In [34, 34, 28, 27, 34],
mode is 34.

11. What is meant by range in a dataset?

Answer:

Range is the difference between the maximum and minimum values in a dataset. Formula:
Range = Maximum - Minimum

12. What does standard deviation measure?

Answer:

Standard deviation measures the spread or dispersion of values around the mean. It
considers all data points in the dataset.

13. Name two commonly used digital storage devices.

Answer:

1. Hard Disk Drive (HDD)

2. Solid State Drive (SSD)

14. Mention any two statistical techniques used for data summarisation.

Answer:
1. Measures of Central Tendency (Mean, Median, Mode)

2. Measures of Variability (Range, Standard Deviation)

15. Differentiate between range and standard deviation.

Answer:

- Range: Difference between the maximum and minimum values.

- Standard Deviation: Measures the average spread of all values from the mean.

16. What type of data is stored in an electronic voting machine?


Answer:
Structured data such as votes cast, which are accumulated and processed for quick
result declaration.
17. Give two scenarios where data is used for making decisions.
Answer:
1. Meteorological data used to predict cyclones.
2. Sales data used by businesses to offer discounts or change product placements.

18. How can Python help in data processing and analysis?

Answer: Python provides libraries that allow efficient data processing, statistical analysis,
and visualisation of large data sets.

19. What are the steps involved in data processing?

Answer:

1. Data Collection

2. Data Preparation

3. Data Entry

4. Storage and Retrieval

5. Classification and Update

6. Generation of Reports/Results

3 MARKS QUESTIONS

1. Explain the three commonly used measures of central tendency with examples.
Answer:
- Mean is the average of all values.
Example: Mean of [90, 100, 110] = (90+100+110)/3 = 100
- Median is the middle value in a sorted list.
Example: Median of [85, 90, 100, 110, 115] = 100
- Mode is the value that occurs most frequently.
Example: Mode of [90, 110, 110, 110, 100] = 110
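The three examples above can be checked with Python's standard `statistics` module. This is a small sketch; the variable names are chosen only for illustration.

```python
import statistics

# Same lists as in the worked examples.
mean_value = statistics.mean([90, 100, 110])
median_value = statistics.median([85, 90, 100, 110, 115])
mode_value = statistics.mode([90, 110, 110, 110, 100])

print("Mean:", mean_value)      # 100
print("Median:", median_value)  # 100
print("Mode:", mode_value)      # 110
```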
2. Differentiate between structured data and unstructured data with examples.
Answer:
- Structured Data: Organised in rows and columns, easy to store and analyse. Example:
Table of student records with Roll No, Name, Marks.
- Unstructured Data: Lacks predefined format; difficult to analyse. Example: Social media
posts with images and text.
3. Define standard deviation. Write its formula and explain its significance.
Answer:
Standard deviation (σ) measures the spread of data around the mean.
Formula: $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$, where $\bar{x}$ is the mean and $n$ is the number of values.
It gives insights into data variability. A smaller σ means values are
closer to the mean; a larger σ indicates more spread.
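As a rough sketch, the formula translates into Python almost term by term. The function name `population_std_dev` and the two sample lists are assumptions made for this illustration. The formula divides by n, which matches `statistics.pstdev` (the population form) rather than `statistics.stdev`.

```python
import math
import statistics

def population_std_dev(values):
    """Standard deviation with n in the denominator, as in the formula."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / n)

tight = [99, 100, 101]   # hypothetical values close to the mean
wide = [80, 100, 120]    # hypothetical values far from the mean

print(population_std_dev(tight))   # small sigma
print(population_std_dev(wide))    # larger sigma

# The standard library's population version gives the same result.
print(statistics.pstdev(wide))
```

Both lists have a mean of 100, so the difference in σ comes only from how far the values sit from that mean.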
4. What are metadata? Give three examples from different digital files.
Answer:
Metadata are data about data.
They describe content and structure.
Examples:
- In an image file: resolution, format (JPEG/PNG)
- In an email: subject, recipient, date sent
- In a document: author name, word count, creation date
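File systems keep metadata of this kind for every file. The sketch below reads a few such fields with `os.stat`; the file name `note.txt`, its contents, and the dictionary keys are assumptions made for the example, and the file is created in a temporary folder so nothing on disk is affected.

```python
import os
import tempfile
import time

with tempfile.TemporaryDirectory() as folder:
    # Create a small hypothetical file to inspect.
    path = os.path.join(folder, "note.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("Data about data")

    info = os.stat(path)  # metadata kept by the file system
    metadata = {
        "name": os.path.basename(path),
        "extension": os.path.splitext(path)[1],
        "size_bytes": info.st_size,
        "modified": time.ctime(info.st_mtime),
    }

# None of these fields is the file's content; each one describes the file.
for key, value in metadata.items():
    print(key, ":", value)
```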
5. Describe the role of data in business decision-making with any two examples.
Answer:
Businesses use data to understand market trends and improve performance.
Examples:
1. Analysing customer feedback to improve products.
2. Using sales data to implement dynamic pricing (e.g., happy-hour discounts based on past data).

6. Explain the data processing cycle with the help of a diagram or steps.

Answer:

The data processing cycle includes the following steps:

1. Input: Collecting and entering data.

2. Processing: Manipulating data to produce results.

3. Output: Presenting the results.

4. Storage and Retrieval: Saving for future use.

These steps convert raw data into useful information.

7. Distinguish between range and standard deviation with formula and example.

Answer:

- Range: Difference between maximum and minimum values.

Formula: Range = Max - Min

Example: For [85, 90, 115], Range = 115 - 85 = 30

- Standard Deviation: Measures average spread from the mean.

Formula: $\sigma = \sqrt{\sum (x_i - \bar{x})^2 / n}$

Example: For [90, 100, 110], the mean is 100, so σ = √((100 + 0 + 100)/3) ≈ 8.16, calculated using all data points.
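One way to see the difference is to compare two datasets that share the same range but are spread differently. The sketch below is only illustrative: the helper `value_range` and the lists `clustered` and `spread_out` are hypothetical, and `statistics.pstdev` is used because the formula divides by n.

```python
import statistics

def value_range(values):
    """Range: maximum minus minimum."""
    return max(values) - min(values)

# Examples taken from the answer.
print(value_range([85, 90, 115]))                    # 30
print(round(statistics.pstdev([90, 100, 110]), 2))   # 8.16

# Hypothetical datasets with the same range but different spread.
clustered = [85, 100, 100, 100, 115]
spread_out = [85, 85, 100, 115, 115]

print(value_range(clustered), value_range(spread_out))
print(statistics.pstdev(clustered), statistics.pstdev(spread_out))
```

The range is 30 for both hypothetical lists because it looks only at the two extreme values, while σ is larger for `spread_out` because it uses every value.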

8. Give three different scenarios of data collection and describe the method to convert them
into digital format.
Answer:

1. Manual Record (e.g., shopkeeper’s diary): Enter data into spreadsheet manually.

2. Digital File (e.g., CSV): Directly use data for analysis using software tools.

3. No prior data: Develop software (e.g., in Python or MySQL) to store and manage
sales digitally.
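For the second scenario, a CSV file can be read with Python's built-in `csv` module. The sketch below is a minimal example under stated assumptions: the rows are invented, the column names reuse the inventory fields mentioned earlier (ProductName, UnitPrice, Quantity), and an in-memory string stands in for a real file so the example runs on its own.

```python
import csv
import io

# Hypothetical CSV content; in practice this would come from open("sales.csv").
csv_text = """ProductName,UnitPrice,Quantity
Pen,10,5
Notebook,40,2
Eraser,5,10
"""

def load_sales(text_stream):
    """Read CSV rows and convert the numeric columns from text to numbers."""
    rows = []
    for row in csv.DictReader(text_stream):
        rows.append({
            "ProductName": row["ProductName"],
            "UnitPrice": float(row["UnitPrice"]),
            "Quantity": int(row["Quantity"]),
        })
    return rows

sales = load_sales(io.StringIO(csv_text))
total_value = sum(r["UnitPrice"] * r["Quantity"] for r in sales)

print(len(sales), "rows loaded")
print("Total stock value:", total_value)
```

The conversion step matters because CSV stores everything as text; totals and averages need numeric types.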

9. What are the limitations of file processing and how does DBMS help overcome them?

Answer:

File Processing Limitations:

- Difficult to handle large data

- Poor data integrity and redundancy

- No concurrent access or security control

DBMS Benefits:

- Centralised management

- Easy retrieval and update

- Ensures data consistency, security, and reduces redundancy
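The consistency and update benefits can be sketched with `sqlite3`, the small database engine bundled with Python. It is used here only as a stand-in for a full DBMS such as MySQL; the table name `inventory` and the rows are hypothetical.

```python
import sqlite3

connection = sqlite3.connect(":memory:")   # temporary in-memory database
connection.execute(
    "CREATE TABLE inventory ("
    "ProductName TEXT PRIMARY KEY, UnitPrice REAL, Quantity INTEGER)"
)
connection.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?)",
    [("Pen", 10.0, 5), ("Notebook", 40.0, 2)],
)

# Easy update: one statement changes the stored value in place.
connection.execute(
    "UPDATE inventory SET Quantity = Quantity - 1 WHERE ProductName = ?",
    ("Pen",),
)

# Reduced redundancy: the primary key rejects a duplicate product row.
duplicate_rejected = False
try:
    connection.execute("INSERT INTO inventory VALUES (?, ?, ?)", ("Pen", 12.0, 1))
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Easy retrieval: a query returns exactly the value asked for.
pen_quantity = connection.execute(
    "SELECT Quantity FROM inventory WHERE ProductName = ?", ("Pen",)
).fetchone()[0]
connection.close()

print("Pen quantity:", pen_quantity)
print("Duplicate rejected:", duplicate_rejected)
```

With plain files, the duplicate check and the in-place update would both have to be written by hand.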

10. A teacher wants to compare students’ test results from five months. Which statistical
technique is suitable and why?

Answer:

Mean is the suitable technique to compare average performance over five months. It
provides a quick understanding of how the class performed each month and highlights
trends in overall class performance.

5 MARKS QUESTIONS

1. What are the different types of data? Explain structured and unstructured data with
examples.
Answer:
Data can be broadly categorized into:
1. Structured Data: Organised in a defined format like tables (rows and columns). Each
column represents an attribute and each row represents an observation.
Examples:
- School records (RollNo, Name, Marks)
- ATM withdrawal data (AccountNo, Date, Amount)
2. Unstructured Data: Data not arranged in a predefined format; lacks structure.
Examples:
- Social media posts
- Email content
- News articles with images, videos, and text

2. Explain the role of data in various real-life sectors. Give at least five examples.
Answer: Data plays a crucial role in decision-making across various domains.

Examples:

1. Education: Placement data helps students choose colleges.

2. Government: Census data is used for planning policies.

3. Healthcare: Hospitals collect patient data for treatment analysis.

4. Meteorology: Weather offices use satellite data to predict storms.

5. Business: Sales data is analysed for discounts, inventory planning, and marketing
decisions.

3. Define and differentiate Mean, Median, and Mode. Include examples.

Answer:

- Mean: Average of numeric values.

Formula: Mean = (Sum of all values) / Number of values

Example: [90, 100, 110] → Mean = (90+100+110)/3 = 100

- Median: Middle value in a sorted list.

Example: [85, 90, 100, 110, 115] → Median = 100

- Mode: Most frequently occurring value.

Example: [90, 90, 100, 110, 110, 110] → Mode = 110

Difference:

- Mean considers all values.

- Median is less sensitive to outliers.

- Mode shows the most common value.

4. What is standard deviation? How is it calculated? Explain with an example.

Answer:

Standard deviation measures the dispersion or spread of data around the mean.

Formula: $\sigma = \sqrt{\sum (x_i - \bar{x})^2 / n}$

Steps:

1. Find mean of the dataset.

2. Subtract the mean from each value.

3. Square the differences.

4. Find average of squared differences.

5. Take square root of the result.


Example: For heights: [90, 102, 110, 115, 85, 90, 100, 110, 110]

- Mean ≈ 101.33

- σ ≈ √(938/9) ≈ 10.2
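The five steps can be followed one at a time in Python using the heights from the example. This is a sketch for checking the arithmetic; the variable names are chosen for illustration.

```python
import math

heights = [90, 102, 110, 115, 85, 90, 100, 110, 110]

# Step 1: find the mean.
mean = sum(heights) / len(heights)

# Step 2: deviation of each value from the mean.
deviations = [x - mean for x in heights]

# Step 3: square the differences.
squared = [d ** 2 for d in deviations]

# Step 4: average of the squared differences.
variance = sum(squared) / len(heights)

# Step 5: square root of the result.
sigma = math.sqrt(variance)

print("Mean:", round(mean, 2))
print("Sum of squared differences:", round(sum(squared), 2))
print("Standard deviation:", round(sigma, 2))
```

The sum of squared differences comes out as 938, which is the numerator used in the example.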

5. Differentiate between Range and Standard Deviation. Explain with examples and
formula.

Answer:

Feature | Range | Standard Deviation
--- | --- | ---
Definition | Difference between highest and lowest values | Average spread of all values from the mean
Formula | Max - Min | $\sigma = \sqrt{\sum (x_i - \bar{x})^2 / n}$ (see question 3 of the 3 marks questions)
Data Used | Only two values (max and min) | All values
Sensitive to Outliers | Highly sensitive | Less affected

Example:
Data: [85, 90, 90, 100, 102, 110, 110, 110, 115]
- Range = 115 - 85 = 30
- σ ≈ 10.2 (calculated using mean and all values)
6. Explain the data processing cycle with a real-life example.
Answer:
The data processing cycle includes:
1. Data Collection: Gather raw data.
2. Data Preparation: Organise, clean, and validate data.
3. Data Entry: Input data into the system.
4. Processing: Apply algorithms and logic to get results.
5. Storage/Retrieval: Save and retrieve data as needed.
6. Output/Reporting: Present results in a meaningful form.

Example: In online exam registration:

- Collect name, marks, payment details

- Check eligibility

- Generate roll numbers and admit cards
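The registration example can be sketched as a tiny pipeline in Python. Everything here is hypothetical and chosen only to illustrate the stages: the form entries, the `PASS_MARK` eligibility rule, and the function names. Storage is simulated by keeping the records in a list.

```python
# Collection: raw form entries exactly as typed by applicants (hypothetical).
raw_forms = [
    {"name": " Asha ", "marks": "78"},
    {"name": "Ravi", "marks": "41"},
    {"name": "", "marks": "90"},   # invalid entry: the name is missing
]

PASS_MARK = 50  # assumed eligibility rule for this sketch

def prepare(forms):
    """Preparation: clean and validate the raw entries."""
    cleaned = []
    for form in forms:
        name = form["name"].strip()
        if name and form["marks"].isdigit():
            cleaned.append({"name": name, "marks": int(form["marks"])})
    return cleaned

def process(records):
    """Processing: check eligibility and assign roll numbers."""
    eligible = [r for r in records if r["marks"] >= PASS_MARK]
    return [dict(r, roll_no=i) for i, r in enumerate(eligible, start=1)]

# Storage: the processed records are kept for later retrieval.
stored = process(prepare(raw_forms))

# Output: a simple report standing in for admit cards.
report = [f"{r['roll_no']}: {r['name']}" for r in stored]
print(report)
```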

7. How is data collected in digital environments? Explain three different scenarios.

Answer:

1. Manual to Digital:

- Shopkeeper keeps records in a diary.

- Data is entered into a spreadsheet or software.


2. Already in Digital Format:

- Data is in CSV or database format.

- Can be directly imported and processed.

3. Fresh Data Collection:

- A new system is developed (e.g., using Python/MySQL) to record and store sales or
transactions digitally.

8. Explain metadata with three examples. How is it useful in processing unstructured data?
Answer:

Metadata is data about data.

It helps identify, describe, and process unstructured data.

Examples:

1. Image File: Metadata includes image size, type, resolution.

2. Email: Subject, recipient, time sent.

3. Document: Author name, date created, word count.

Usefulness:

Metadata helps organise, search, and process unstructured content like emails, images, and
documents.

9. List and explain any five real-life applications of statistical techniques in data processing.
Answer:

1. Education: Teachers analyse marks using mean and median.

2. Business: Use mode to identify popular products.

3. Health: Use standard deviation to assess variability in patient recovery times.

4. Elections: Calculate vote share using mean/percentages.

5. Weather Forecasting: Analyse range of temperatures to predict extremes.

10. Compare the use of Mean, Median, and Mode with suitable scenarios.

Answer:

Measure | Best Used When | Example
--- | --- | ---
Mean | No extreme outliers, want overall average | Average marks of students
Median | Outliers present, want central tendency | Income data where few are very rich
Mode | Need most frequent value | Popular shoe size sold in a store
Each measure helps summarise data based on the context of variability, frequency, or central
position.
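A short Python sketch shows why the choice of measure depends on the data. The income figures and shoe sizes below are invented for illustration.

```python
import statistics

# Hypothetical monthly incomes: four typical values and one very rich person.
incomes = [20000, 22000, 25000, 27000, 500000]

income_mean = statistics.mean(incomes)
income_median = statistics.median(incomes)

print("Mean income:", income_mean)      # pulled upward by the outlier
print("Median income:", income_median)  # stays near the typical values

# Hypothetical shoe sizes sold in a store on one day.
shoe_sizes = [7, 8, 8, 9, 8, 10]
popular_size = statistics.mode(shoe_sizes)

print("Most popular size:", popular_size)
```

In this sketch the mean income is far above what four of the five people earn, while the median still describes a typical person. For shoe sizes, the mode answers the practical question of which size to stock.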
