Business Intelligence and Data Mining
Topic Two
BI/DM/BDA applications, DA models and frameworks
Tibebe B. (PhD)
Topics/chapters
Fundamental concepts and the need for business intelligence, data
mining and its flavors, big data analysis
DM applications, DA models and frameworks
Data and data warehousing
Data mining techniques
Association rule mining,
Classification and
Cluster analysis
Opinion mining, web mining… Current trends in data analytics/data
mining
BI/DM/BDA Applications
Prerequisite for effective application
Understanding the business or nature of area of
investigation
Domain (education, insurance, banking,…)
Understanding the concept to learn or investigate
Subject (loyalty, student performance, risk level, customer
satisfaction, …..)
What is Business Understanding:
Statement of Business Objective
Follows from identifying the objects and events a business
works on
Statement of Success Criteria
Statement of Business analytics/Data analytics objective
How is business understanding relevant
to data analytics?
Proper business understanding
Helps to focus on a specific problem worth considering
Helps in appropriate data selection
Helps in method/algorithm/tool/approach selection
Provides a platform for evaluation
How do we do business understanding?
Qualitative and/or Quantitative approaches
Interview
Observation
Document review
Questionnaire
Literature review
Major applications of BI&A
E-Commerce and Market Intelligence
The excitement surrounding BI&A and Big Data has arguably been
generated primarily from the web and e-commerce communities.
Significant market transformation has been accomplished by
leading e-commerce vendors such as Amazon and eBay through their
innovative and highly scalable e-commerce platforms and product
recommender systems.
E-Government and Politics
As government and political processes become more transparent,
participatory, online, and multimedia-rich, there is a great
opportunity for adopting BI&A research in e-government and
politics 2.0 applications.
Science and Technology
Many areas of science and technology (S&T) are reaping the
benefits of high-throughput sensors and instruments, from
astrophysics and oceanography, to genomics and environmental
research.
Smart Health and Wellbeing
Much like the big data opportunities facing the e-commerce and
S&T communities, the health community is facing a tsunami of
health- and healthcare-related content generated from numerous
patient care points of contact, sophisticated medical instruments,
and web-based health communities.
Security and Public Safety
Since the tragic events of September 11, 2001, security research
has gained much attention, especially given the increasing
dependency of business and our global society on digital
enablement.
Other ways of viewing application areas
Science
astronomy, bioinformatics, drug discovery, …
Business
CRM (Customer Relationship management), fraud detection, e-
commerce, manufacturing, sports/entertainment, telecom,
targeted marketing, health care, …
Web:
search engines, advertising, web and text mining, …
Government
surveillance (?), crime detection, profiling tax cheaters, …
Some specific examples
Energy network monitoring and optimization
Credit fraud detection
Clustering and customer segmentation
Recommendation engines
Price modeling
Medicine
Pfizer Pharmaceuticals used data mining to construct a predictive
model that was then embedded in their online cholesterol health risk
assessment, which tells patients their cholesterol risk score.
Transportation
Determining accident occurrence and severity
Locating dangerous scenarios and accident locations
Congestion and traffic analysis
Banking
Identify 'loyal' customers
Telecommunication
Detecting telephone fraud
Detect users of a telephone line that has been hijacked or stolen from
a customer
Telephone call model:
Use the destination of the call, its duration, and the time of day or week
Analyze patterns that deviate from an expected norm
Telecom operators can identify discrete groups of callers with frequent
intra-group calls, especially on mobile phones, and so break up a possible
multimillion dollar fraud.
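The telephone call model above can be illustrated with a small sketch. It is not a production fraud detector: the call records, the `is_unusual` helper, and the z-score threshold of 3 are all assumptions made for illustration. A call is flagged when its duration, destination, or hour deviates from what the subscriber's history suggests is normal.

```python
from statistics import mean, stdev

# Hypothetical call history for one subscriber:
# (destination, duration in minutes, hour of day)
history = [
    ("local", 3.0, 10), ("local", 4.5, 11), ("local", 2.0, 18),
    ("local", 5.0, 19), ("national", 6.0, 20), ("local", 3.5, 9),
]

def is_unusual(call, history, z_threshold=3.0):
    """Return True if the call deviates from the subscriber's expected norm."""
    destination, duration, hour = call

    # Duration: distance from the historical mean in standard deviations.
    durations = [d for _, d, _ in history]
    sigma = stdev(durations)
    z = abs(duration - mean(durations)) / sigma if sigma > 0 else 0.0

    # Destination: has this subscriber ever called this kind of number?
    known_destinations = {dest for dest, _, _ in history}

    # Time of day: does the call fall outside the hours seen so far?
    hours = [h for _, _, h in history]
    odd_hour = hour < min(hours) or hour > max(hours)

    return z > z_threshold or destination not in known_destinations or odd_hour

print(is_unusual(("local", 4.0, 12), history))          # an ordinary call
print(is_unusual(("international", 95.0, 3), history))  # long, new destination, odd hour
```

In practice the norm would be learned from much more history, and flagged calls would be reviewed rather than blocked outright.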
Data Analytics Models and
frameworks
General
Ben Fry’s Model - Doing Data Science
1. Acquire
2. Parse
3. Filter
4. Mine
5. Represent
6. Refine
7. Interact
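The seven stages can be sketched as a chain of small functions. This is only an illustration: the inline CSV data, the function names, and the text-bar "chart" are assumptions, and a real project would read from files or APIs and use a plotting library.

```python
import csv
import io
from statistics import mean

# Hypothetical raw data; the empty sales value simulates a missing reading.
RAW = "region,sales\nnorth,16\nsouth,21\neast,\nwest,20\n"

def acquire():
    """1. Acquire: obtain the data (stand-in for a file read or API call)."""
    return RAW

def parse(text):
    """2. Parse: give the raw text structure (a list of dict rows)."""
    return list(csv.DictReader(io.StringIO(text)))

def filter_rows(rows):
    """3. Filter: keep only the rows of interest (here, rows with a value)."""
    return [r for r in rows if r["sales"]]

def mine(rows):
    """4. Mine: apply statistics to discern a pattern (here, the mean)."""
    return {"mean": mean(float(r["sales"]) for r in rows), "rows": rows}

def represent(result):
    """5. Represent: choose a basic visual model (one text bar per region)."""
    return [f'{r["region"]:<8} {"#" * int(float(r["sales"]))}' for r in result["rows"]]

def refine(lines, mean_value):
    """6. Refine: improve the representation (add a summary line)."""
    return lines + [f"mean: {mean_value:.1f}"]

def interact(lines, keyword):
    """7. Interact: let the user control what is shown (a keyword filter)."""
    return [line for line in lines if keyword.lower() in line.lower()]

rows = filter_rows(parse(acquire()))
result = mine(rows)
view = refine(represent(result), result["mean"])
print("\n".join(view))
print(interact(view, "west"))
```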
Data Mining Process Model
SEMMA, CRISP-DM and Hybrid DM models
The most common standard in data mining is the CRoss-Industry
Standard Process for Data Mining (CRISP-DM)
CRISP-DM has the following steps
Business/research understanding (Learning the application
domain),
Data understanding (data selection for the problem)
Data Preparation which involves
collecting, cleaning, consolidating and amalgamating records,
summarizing fields, checking for data integrity, detecting
irregularities and illegal attributes, filling in for missing values,
trimming outliers.
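A few of these preparation tasks can be sketched on a toy data set. The records, the field names, and the choices made here (median imputation, capping incomes at a fixed ceiling) are assumptions for illustration; the right treatment depends on the business understanding of the data.

```python
from statistics import median

# Hypothetical customer records; None marks a missing value.
records = [
    {"id": 1, "age": 34, "income": 5200},
    {"id": 2, "age": None, "income": 4800},
    {"id": 2, "age": None, "income": 4800},    # duplicate record
    {"id": 3, "age": 29, "income": -100},      # illegal (negative) income
    {"id": 4, "age": 41, "income": 250000},    # extreme outlier
    {"id": 5, "age": 38, "income": 5100},
    {"id": 6, "age": 45, "income": 6100},
]

def prepare(records, income_cap=100000):
    # Consolidate: keep the first record seen for each id.
    seen, unique = set(), []
    for r in records:
        if r["id"] not in seen:
            seen.add(r["id"])
            unique.append(dict(r))  # copy so the input is not modified

    # Check integrity: drop records with illegal attribute values.
    valid = [r for r in unique if r["income"] >= 0]

    # Fill in missing values with the median of the observed ages.
    fill_age = median(r["age"] for r in valid if r["age"] is not None)
    for r in valid:
        if r["age"] is None:
            r["age"] = fill_age

    # Trim outliers: cap income at an assumed ceiling.
    for r in valid:
        r["income"] = min(r["income"], income_cap)
    return valid

clean = prepare(records)
for row in clean:
    print(row)
```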
Data modeling involves
selecting data mining tools, transforming the data if the tools
require it, generating samples for training and testing the model
and finally using the tools to build and select a model
Evaluating the model and
Deploying the model
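The modeling and evaluation steps can be sketched without any data mining tool. The data set, the two candidate models (a majority-class baseline and a single-threshold rule), and all function names below are assumptions chosen to keep the example short. The point is the workflow: split the data into training and testing samples, build the candidates, evaluate them on held-out data, and select one.

```python
import random

# Hypothetical labelled data: (feature value, class label).
data = [(x, 0) for x in range(1, 11)] + [(x, 1) for x in range(21, 31)]

def split(data, n_test=6, seed=0):
    """Generate training and testing samples by shuffling and cutting."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    return rows[n_test:], rows[:n_test]

def accuracy(predict, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)

def fit_majority(train):
    """Baseline model: always predict the most common training label."""
    labels = [y for _, y in train]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority

def fit_threshold(train):
    """Threshold model: predict 1 when x is at or above the best cut point."""
    xs = sorted({x for x, _ in train})
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    best = max(candidates, key=lambda t: accuracy(lambda x: int(x >= t), train))
    return lambda x: int(x >= best)

train, test = split(data)
models = {"majority": fit_majority(train), "threshold": fit_threshold(train)}

# Evaluate every candidate on the held-out test sample, then select the best.
scores = {name: accuracy(model, test) for name, model in models.items()}
selected = max(scores, key=scores.get)
print(scores, "->", selected)
```

Evaluating on data the model never saw during building is what makes the comparison meaningful; accuracy on the training sample alone would favor models that merely memorize it.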
CRoss Industry Standard Process for Data Mining (CRISP-DM)
Web Mining process model
Web data collection
Process web data/make it ready
Pattern discovery
Analysis or Evaluation of the pattern
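The four steps can be sketched for web usage mining. The log format, the visitor names, and the minimum-support threshold are assumptions for illustration; real server logs need more careful parsing and session splitting.

```python
from collections import Counter, defaultdict

# Hypothetical, simplified access log: "visitor timestamp_seconds path".
LOG = """\
u1 0 /home
u1 30 /products
u1 65 /cart
u2 10 /home
u2 40 /products
bot 12 /robots.txt
u3 5 /home
u3 50 /products
u3 90 /cart
"""

def collect():
    """Web data collection (stand-in for reading server log files)."""
    return LOG.splitlines()

def preprocess(lines):
    """Make the data ready: drop crawler hits, order each visitor's pages by time."""
    hits = defaultdict(list)
    for line in lines:
        visitor, timestamp, path = line.split()
        if visitor == "bot":
            continue
        hits[visitor].append((int(timestamp), path))
    return {v: [path for _, path in sorted(h)] for v, h in hits.items()}

def discover(sessions):
    """Pattern discovery: count page-to-page transitions across sessions."""
    transitions = Counter()
    for pages in sessions.values():
        transitions.update(zip(pages, pages[1:]))
    return transitions

def evaluate(transitions, min_support=2):
    """Evaluation: keep only transitions seen at least min_support times."""
    return {pair: n for pair, n in transitions.items() if n >= min_support}

sessions = preprocess(collect())
patterns = evaluate(discover(sessions))
print(patterns)
```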
Web Mining process
Web mining
Web usage mining
Process model for opinion mining
Select data
Select methods and tools
Pre processing
Transformation
Analysis
Evaluation
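The steps above can be sketched with a very small lexicon-based approach. The reviews, their labels, and the two word lists are made up for illustration, and counting lexicon words is only one of many possible methods.

```python
import re

# Select data: hypothetical reviews with hand-assigned labels.
reviews = [
    ("The service was great and the staff were friendly", "positive"),
    ("Terrible delay and very poor support", "negative"),
    ("Good price but slow delivery", "neutral"),
    ("Excellent product, not bad at all", "positive"),
]

# Select methods and tools: a tiny, assumed sentiment lexicon.
POSITIVE = {"great", "friendly", "good", "excellent"}
NEGATIVE = {"terrible", "poor", "slow", "bad"}

def preprocess(text):
    """Pre-processing: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def transform(tokens):
    """Transformation: turn tokens into (positive count, negative count)."""
    return (sum(t in POSITIVE for t in tokens), sum(t in NEGATIVE for t in tokens))

def analyze(counts):
    """Analysis: assign a polarity from the two counts."""
    pos, neg = counts
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

def evaluate(reviews):
    """Evaluation: share of reviews whose predicted label matches the given one."""
    correct = sum(analyze(transform(preprocess(text))) == label for text, label in reviews)
    return correct / len(reviews)

for text, label in reviews:
    print(analyze(transform(preprocess(text))), "| labelled:", label, "|", text)
print("accuracy:", evaluate(reviews))
```

The last review is misclassified because a plain word count ignores the negation in "not bad". Errors like this, found during evaluation, feed back into the choice of methods and tools.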
Big data analytics process model
Extract and store
Major task
Analyze
Present and decide
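The stages above can be sketched in miniature with a map-and-reduce style of aggregation. Everything here is an assumption for illustration: the sales events, the in-memory list standing in for distributed storage, and the function names. A real deployment would use a distributed store and processing engine.

```python
from collections import Counter
from functools import reduce

def extract():
    """Extract: pull batches of (product, quantity) events from hypothetical sources."""
    sources = [
        [("phone", 2), ("laptop", 1)],
        [("phone", 1), ("tablet", 3)],
        [("laptop", 3), ("phone", 2)],
    ]
    for batch in sources:
        yield batch

def store(batches):
    """Store: keep each batch as one partition (stand-in for distributed storage)."""
    return list(batches)

def map_partition(partition):
    """Analyze, map side: total the quantities inside a single partition."""
    totals = Counter()
    for product, quantity in partition:
        totals[product] += quantity
    return totals

def analyze(partitions):
    """Analyze, reduce side: merge the per-partition totals into one result."""
    return reduce(lambda a, b: a + b, map(map_partition, partitions), Counter())

def present(totals, k=2):
    """Present and decide: surface the top-k products for a decision maker."""
    return totals.most_common(k)

totals = analyze(store(extract()))
print(present(totals))
```

Each partition is summarized independently before the results are merged, which is what allows the analysis step to be spread across many machines.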
Big data analytics process model
Big data analytics framework
Review questions
Have you noticed similarities between process models of
different data analytics approaches?
Discuss
Thank you