Tuesday, January 4, 2022

BIG DATA ANALYTICS


Objectives:

• To learn to analyze big data using intelligent techniques.

• To understand various search methods and visualization techniques.

• To learn various techniques for mining data streams.

• To understand applications that use MapReduce concepts.

Outcomes:

On completion of this course, the student will be able to:

• Analyze big data analytics techniques for useful business applications.

• Design efficient algorithms for mining large volumes of data.

• Analyze the Hadoop and MapReduce technologies associated with big data analytics.

• Explore big data applications using Pig and Hive.

UNIT-I: Introduction to Big Data

Introduction to Big Data Platform – Challenges of Conventional Systems – Intelligent Data Analysis – Nature of Data – Analytic Processes and Tools – Analysis vs Reporting – Modern Data Analytic Tools – Statistical Concepts: Sampling Distributions – Re-Sampling – Statistical Inference – Prediction Error.

UNIT-II: Mining Data Streams

Introduction to Stream Concepts – Stream Data Model and Architecture – Stream Computing – Sampling Data in a Stream – Filtering Streams – Counting Distinct Elements in a Stream – Estimating Moments – Counting Ones in a Window – Decaying Window – Real-Time Analytics Platform (RTAP) Applications – Case Studies – Real-Time Sentiment Analysis, Stock Market Predictions.

UNIT-III: Hadoop

History of Hadoop – The Hadoop Distributed File System – Components of Hadoop – Analyzing the Data with Hadoop – Scaling Out – Hadoop Streaming – Design of HDFS – Java Interfaces to HDFS – Basics – Developing a MapReduce Application – How MapReduce Works – Anatomy of a MapReduce Job Run – Failures – Job Scheduling – Shuffle and Sort – Task Execution – MapReduce Types and Formats – MapReduce Features.

UNIT-IV: Hadoop Environment

Setting up a Hadoop Cluster – Cluster Specification – Cluster Setup and Installation – Hadoop Configuration – Security in Hadoop – Administering Hadoop – HDFS – Monitoring – Maintenance – Hadoop Benchmarks – Hadoop in the Cloud.

UNIT-V: Frameworks

Applications on Big Data Using Pig and Hive – Data Processing Operators in Pig – Hive Services – HiveQL – Querying Data in Hive – Fundamentals of HBase and ZooKeeper – IBM InfoSphere BigInsights and Streams. Visualization: visual data analysis techniques, interaction techniques; systems and applications.

Text Books:

1. Michael Berthold, David J. Hand, Intelligent Data Analysis, Springer, 2007.

2. Tom White, Hadoop: The Definitive Guide, Third Edition, O’Reilly Media, 2012.

3. Chris Eaton, Dirk DeRoos, Tom Deutsch, George Lapis, Paul Zikopoulos, Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data, McGraw-Hill Publishing, 2012.

4. Anand Rajaraman and Jeffrey David Ullman, Mining of Massive Datasets, Cambridge University Press, 2012.

Reference Books:

1. Bill Franks, Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics, John Wiley & Sons, 2012.

2. Glenn J. Myatt, Making Sense of Data, John Wiley & Sons, 2007.

3. Pete Warden, Big Data Glossary, O’Reilly, 2011.

4. Jiawei Han, Micheline Kamber, Data Mining Concepts and Techniques, Second Edition, Elsevier, Reprinted 2008.

5. Da Ruan, Guoqing Chen, Etienne E. Kerre, Geert Wets, Intelligent Data Mining, Springer, 2007.
