BigData Mining and Analytics

BDA

Uploaded by

lekha.cce

SEMESTER - I

24PBDPC102   BIG DATA MINING AND ANALYTICS   L T P C
                                             3 0 0 3
SDG NO. 4

OBJECTIVES:
• To understand the computational approaches to modelling and feature extraction
• To understand the need for and application of MapReduce
• To understand the various search algorithms applicable to Big Data
• To analyse and interpret streaming data
• To learn how to handle large data sets in main memory and learn the
  various clustering techniques applicable to Big Data

UNIT I DATA MINING AND LARGE SCALE FILES 9


Introduction to Statistical Modeling – Machine Learning – Computational
Approaches to Modeling – Summarization – Feature Extraction – Statistical
Limits on Data Mining – Distributed File Systems – MapReduce – Algorithms
Using MapReduce – Efficiency of Cluster Computing Techniques.

UNIT II SIMILAR ITEMS 9


Nearest Neighbor Search – Shingling of Documents – Similarity-Preserving
Summaries – Locality-Sensitive Hashing for Documents – Distance Measures
– Theory of Locality-Sensitive Functions – LSH Families – Methods for High
Degrees of Similarity.

UNIT III MINING DATA STREAMS 9


Stream Data Model – Sampling Data in a Stream – Filtering Streams –
Counting Distinct Elements in a Stream – Estimating Moments – Counting
Ones in a Window – Decaying Windows.

UNIT IV LINK ANALYSIS AND FREQUENT ITEMSETS 9


PageRank – Efficient Computation – Topic-Sensitive PageRank – Link Spam
– Market-Basket Model – A-Priori Algorithm – Handling Larger Datasets in
Main Memory – Limited-Pass Algorithms – Counting Frequent Itemsets.

UNIT V CLUSTERING 9


Introduction to Clustering Techniques – Hierarchical Clustering –
Algorithms – K-Means – CURE – Clustering in Non-Euclidean Spaces –
Streams and Parallelism – Case Study: Advertising on the Web –
Recommendation Systems.

TOTAL: 45 PERIODS

TEXT BOOKS:
1. Jure Leskovec, Anand Rajaraman, Jeffrey David Ullman, “Mining of
Massive Datasets”, Cambridge University Press, Second Edition, 2014.
2. Jiawei Han, Micheline Kamber, Jian Pei, “Data Mining: Concepts
and Techniques”, Morgan Kaufmann Publishers, Third Edition,
2011.

REFERENCES:
1. Ian H. Witten, Eibe Frank, “Data Mining: Practical Machine Learning
Tools and Techniques”, Morgan Kaufmann Publishers, Third Edition,
2011.
2. David Hand, Heikki Mannila and Padhraic Smyth, “Principles of Data
Mining”, MIT Press, 2001.

WEB REFERENCES:
1. https://swayam.gov.in/nd2_arp19_ap60/preview
2. https://nptel.ac.in/content/storage2/nptel_data3/html/mhrd/ict/text/106104189/lec1.pdf

ONLINE RESOURCES:
1. https://examupdates.in/big-data-analytics/
2. https://www.tutorialspoint.com/big_data_analytics/index.htm
3. https://www.tutorialspoint.com/data_mining/index.htm

OUTCOMES:
Upon completion of the course, the student should be able to
1. Design algorithms that employ the MapReduce technique to solve Big
Data problems.
2. Design algorithms for Big Data by deciding on an apt feature set.
3. Design algorithms for handling petabyte-scale datasets.
4. Design algorithms and propose solutions for Big Data by optimizing
main memory consumption.
5. Design solutions for problems in Big Data by suggesting
appropriate clustering techniques.
