Statistical Learning - Introduction
Introduction to Statistics
Outline
1. Why Statistics
2. Statistical Methods
3. Types of Statistics - Descriptive and Inferential Statistics
4. Data Sources and Types of Datasets
5. Attributes of Datasets
Why Is Statistics So Important?
Event 1
• Technological developments, the rise of the Internet and
social networks, and data generated by mobile phones and
other electronic devices produce large amounts of data from
which insights must be extracted.
Event 2
• Advances in computing power make it possible to process
and analyze massive amounts of data effectively.
Event 3
Big data
• A set of data that cannot be managed, processed, or
analyzed with traditional software/algorithms within a
reasonable amount of time.
Classification
• For example, classification models can predict whether a customer is a
buyer or a non-buyer, or whether a borrower is a defaulter or a
non-defaulter on a credit card loan.
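The defaulter versus non-defaulter example can be sketched with a very small classifier. The code below is only an illustration, not a method taken from these slides: it uses a nearest-centroid rule, and the feature names (monthly income in thousands, credit utilization) and all data values are hypothetical.

```python
from statistics import mean

# Hypothetical training data (invented for illustration):
# (monthly_income_in_thousands, credit_utilization) -> known label
training = [
    ((8.0, 0.20), "non-defaulter"),
    ((7.5, 0.30), "non-defaulter"),
    ((9.0, 0.10), "non-defaulter"),
    ((3.0, 0.90), "defaulter"),
    ((2.5, 0.80), "defaulter"),
    ((3.5, 0.95), "defaulter"),
]


def fit_centroids(rows):
    """Average the feature vectors of each class to get one centroid per label."""
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*vectors))
        for label, vectors in by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], features))


centroids = fit_centroids(training)
print(predict(centroids, (8.2, 0.25)))  # high income, low utilization
print(predict(centroids, (2.8, 0.85)))  # low income, high utilization
```

The features are left unscaled here to keep the sketch short, so income dominates the distance. A real model would normally standardize the features and be evaluated on data it was not trained on.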
Classical Definition of Statistics
Types of Datasets
• Record
  • Relational records
  • Data matrix, e.g., numerical matrix, crosstabs
  • Document data: text documents represented as term-frequency vectors
  • Transaction data
• Graph and network
  • World Wide Web
  • Social or information networks
  • Molecular structures
• Ordered
  • Video data: sequence of images
  • Temporal data: time series
  • Sequential data: transaction sequences
  • Genetic sequence data
• Spatial, image and multimedia
  • Spatial data: maps
  • Image data
  • Video data
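Two of the record types listed above, document data as term-frequency vectors and the data matrix, can be shown together in a short sketch. The two sample documents are invented for illustration, and splitting on whitespace is an assumed simplification of real tokenization.

```python
from collections import Counter

# Hypothetical documents, invented for illustration.
documents = [
    "data mining finds patterns in data",
    "statistics summarizes data",
]

# Count how often each term occurs in each document.
counts = [Counter(doc.split()) for doc in documents]

# The vocabulary fixes the column order shared by all documents.
vocabulary = sorted(set().union(*counts))

# Each row is one document's term-frequency vector; stacked together,
# the rows form a numerical data matrix (documents x terms).
matrix = [[doc_counts[term] for term in vocabulary] for doc_counts in counts]

print(vocabulary)
for row in matrix:
    print(row)
```

Each row of `matrix` is one data object and each column is one attribute (a term), which is the same layout a relational table or numerical matrix uses.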
Data Objects