JNTUK 3-2 1st Mid Big Data Analytics - (R2032121) Online Bits
001. Identify among the options below which is a general-purpose computing model and runtime system for distributed data analytics. C
A HDFS B Kafka
C MapReduce D Oozie
002. The total number of Vs of big data is ____ C
A 3 B 4
C 5 D 6
003. ___________ is a collection of data that is huge in volume, yet growing exponentially with time. D
A Big Database B Big DBMS
C Big Datafile D Big Data
004. In how many forms can Big Data be found? B
A 2 B 3
C 4 D 5
005. _________ refers to the dynamic, large and disparate volumes of data being created by people, tools and machines. A
A Big Data B Big Database
C Big DBMS D Big Datafile
006. Data in ____ bytes size is called big data D
A Meta B Giga
C Tera D Peta
007. Transaction data of a bank is a type of ____ B
A Un-structured data B Structured data
C Semi structured data D Hybrid
008. ______ is the term that is used to describe data that is high volume, high velocity and/or high variety. B
A Analytics B Bigdata
C Hadoop Data D Bigdata analytics
009. ___________ is a general-purpose computing model and runtime system for distributed data analytics. A
A MapReduce B Drill
C Oozie D Kafka
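Note on questions 001 and 009: as a rough illustration of the MapReduce model, the sketch below runs the map, shuffle and reduce phases of a word count inside a single Python process. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example; a real Hadoop job would distribute these phases across a cluster rather than run them in one process.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine the values of one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence (single process, for illustration)."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Calling `word_count(["big data", "Big analytics"])` counts "big" twice and the other words once each.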
010. Numbers, text, images, audio and video data are examples of ____ D
A Volume B Value
C Varity D Variety
011. Which of the following is not a Big Data technology? D
A Apache Hadoop B Apache Spark
C Apache Kafka D Apache Pytarch
012. Which of the following is true about big data? B
A Big data can be processed using traditional techniques
B Big data refers to data sets that are at least a petabyte in size
C Big data analysis does not involve reporting and data mining techniques
D Big data has low velocity meaning that it is generated slowly
013. Which of the following can generally be used to clean and prepare big data? D
A Pandas B Data lake
C U-SQL D Data Warehouse
014. Who popularized the term big data? B
A John deere B John Mashey
C Johny Mashe D Jhon Mash
015. Advantages of Big Data are _______. A
A Big data analysis derives innovative solutions.
B Lots of big data is unstructured.
C It can be used for manipulation of customer records.
D Big data analysis violates principles of privacy.
016. Disadvantages of Big Data are ________. D
A Big data analysis derives innovative solutions.
B Big data analysis helps in understanding and targeting customers.
C It helps in optimizing business processes.
D Big data analysis violates principles of privacy.
017. _________ splits the gap between structured and unstructured data, which, using the right datasets, can make it a huge asset. C
A Structured Data B Unstructured Data
C Semi-structured Data D Hybrid
018. The examination of large amounts of data to see what patterns or other useful information can be found is known as ___ C
A Data examination B Information analysis
C Big data analytics D Data analysis
019. Any data with unknown form or structure is classified as ___________. A
A Un-structured data B Structured data
C Semi structured data D Hybrid
020. Any data that can be stored, accessed and processed in a fixed format is termed _____________. B
A Un-structured data B Structured data
C Semi structured data D Hybrid
021. In computers, a ____ is a symbolic representation of facts or concepts from which information may be obtained with a reasonable degree of confidence. A
A Data B Knowledge
C Program D Algorithm
022. All of the following are steps followed to deploy a Big Data solution except: C
A Data Ingestion B Data Processing
C Data dissemination D Data Storage
023. The word Big data was coined by A
A Roger Mougalas B John Philips
C Simon Woods D Martin Green
024. Big data analysis does the following except B
A Collects data B Spreads data
C Organizes data D Analyzes data
025. Data in a Relational Database is A
A Structured B Un-Structured
C Semi Structured D Meta Data
026. What kind of data is in Log files? C
A Structured B Un-Structured
C Semi Structured D Meta Data
027. What does the characteristic Velocity in Big Data represent? D
A Speed of input data generation
B Speed of individual machine processors
C Speed of ONLY storing data
D Speed of storing and processing data
028. ______ refers to the ability to turn your data into something useful for business. C
A Velocity B Variety
C Value D Volume
029. Select one of the Big Data platforms. D
A HTML B Compiler
C IDE D MapR
030. ___________ is an open-source, Java-based programming framework and server software which is used to save and analyze data with the help of 100s or even 1000s of commodity servers in a clustered environment. A
A Hadoop B Cloudera
C Amazon Web Services D Hortonworks
031. _____________ refers to the connectedness of big data. D
A Value B Veracity
C Velocity D Valence
032. The word Big Data was coined in the year C
A 2000 B 1970
C 1998 D 2005
033. Concerning the Forms of Big Data, which one of these is odd? C
A Structured B Unstructured
C Processed D Semi-Structured
034. The feature of big data that refers to the quality of the stored data is ______ D
A Variety B Volume
C Variability D Veracity
035. ___________ refers to the biases, noise and abnormality in data, and the trustworthiness of data. B
A Value B Veracity
C Velocity D Volume
036. Which of the following is not a challenge of conventional systems? C
A Data B Process
C Services D Management
037. IDA stands for A
A Intelligent Data Analysis B Internal Data Analysis
C Intelligent Data Association D Intelligent Distributed Analysis
038. Which of the following is one of the first commercial Hadoop-based Big Data analytics platforms offering a Big Data solution? B
A Hadoop B Cloudera
C Hortonworks D Amazon Web Services
039. Which big data platform is designed to store and process large datasets extremely fast and in a fault-tolerant way? C
A Amazon Web Services B Hortonworks
C Hadoop D Cloudera
040. Hadoop uses _______ for storing data on a cluster of commodity computers. D
A U-SQL B Pandas
C Data lake D HDFS
041. Which ecosystem provides the necessary tools and software for handling and analyzing Big Data? B
A MapR B Hadoop
C Hortonworks D IBM Open Platform
042. What is the full form of HDFS? A
A Hadoop File System B Hadoop Field System
C Hadoop File Search D Hadoop Field search
043. Which type of Big Data analytics provides specific recommendations on what should be done better? D
A Descriptive analytics B Diagnostic analytics
C Predictive analytics D Prescriptive analytics
044. ____________ is a statistical method that utilizes algorithms and machine learning to identify trends in data and predict future behaviors. C
A Descriptive analytics B Diagnostic analytics
C Predictive analytics D Prescriptive analytics
045. Which type of Big Data analytics uses historical data to uncover patterns and make predictions on what's likely to happen in the future? C
A Descriptive analytics B Diagnostic analytics
C Predictive analytics D Prescriptive analytics
046. The goal of ____________ is to extract useful knowledge; the process demands a combination of extraction, analysis, conversion, classification, organization and reasoning. A
A IDA B U-SQL
C Pandas D HDFS
047. _____________ is the process of finding patterns, trends, and relationships in massive datasets that can't be discovered with traditional data management techniques and tools. A
A Big Data analytics B Data examination
C Information analysis D Data analysis
048. Which type of Big Data analytics is a common kind of analytics that allows you to find out what happened and when? A
A Descriptive analytics B Diagnostic analytics
C Predictive analytics D Prescriptive analytics
049. Which type of Big Data analytics explains why and how something happened by identifying patterns and relationships in available data? B
A Descriptive analytics B Diagnostic analytics
C Predictive analytics D Prescriptive analytics
050. Big data analysis does the following except? B
A Collects data B Spreads data
C Organizes data D Analyzes data
051. What makes Big Data analysis difficult to optimize? A
A Both data and cost-effective ways to mine data to make business sense out of it
B Big Data is not difficult to optimize
C The technology to mine data
D The technology to mining
052. What is an often occurring phenomenon when comparing simple/complex algorithms on small/big data? B
A On small data simple algorithms work very well.
B On large data simple algorithms work very well.
C On small data complex algorithms fail.
D On large data complex algorithms perform much better than simple algorithms.
053. A statistical method used to generate recommendations and make decisions based on the computational findings of algorithmic models is called ____ D
A Descriptive analytics B Diagnostic analytics
C Predictive analytics D Prescriptive analytics
054. Which of the following is an example of unstructured data? D
A HTML B XML
C JSON D CHAT ROOMS
055. Which of the following is an example of unstructured data? A
A LETTERS B HTML
C XML D JSON
056. Which of the following is an example of semi-structured data? A
A XML B LETTERS
C RESEARCHES D WHITE PAPERS
057. Distributed stream processing systems involve the use of geographically ________ architectures for processing large data streams in real time to increase efficiency and reliability of the data ingestion, data processing, and the display of data for analysis. A
A Distributed B Collective
C Specific D Data cleaning
058. A _______________ includes the generation of the stream data, the processing of the data, and the delivery of the data to a final location. C
A Data pipeline B Processing code
C Stream processing pipeline D Conventional Processing data
059. A ___________ framework simplifies parallel hardware and software by restricting the parallel computation that can be performed. C
A Soft computing B Cloud computing
C Stream processing D Design processing
060. The new source of big data that will trigger a Big Data revolution in the years to come is? C
A Business transactions B Social media
C Transactional data and sensor data D RDBMS
061. All of the following are steps followed to deploy a Big Data solution except: D
A Data Storage B Data Ingestion
C Data Processing D Data dissemination
062. ____ is a way to analyze and process Big Data in real time to gain current insights, to take appropriate decisions or to predict new trends in the immediate future. C
A Soft computing B Cloud computing
C Stream computing D Design Computing
063. A big data technology that focuses on the real-time processing of continuous streams of data in motion is ____ C
A Soft computing B Cloud computing
C Stream processing D Parallel processing
064. Which frameworks support query languages and focus on efficient event matching against supplied queries? C
A Processing code B Dataflow Pipeline
C Complex Event Processing D Conventional Processing
065. Stream processing is sometimes known as ____ D
A Stateful stream processing B Distributed Computing
C Conventional processing D Data Processing
066. A Stream Processing framework is a complete processing system that includes a ___________ that receives streaming inputs and generates actionable, real-time analytics. B
A Processing code B Dataflow Pipeline
C Stream processing pipeline D Conventional Processing data
067. Which type of processing merges real-time applications and value store tables (database) into a single entity? D
A Cloud computing B Stream processing
C Distributed Computing D Stateful stream processing
068. A subset of stream processing is called _______ A
A Stateful stream processing B Distributed Computing
C Conventional processing D Cloud computing
069. The technology that allows systems to process data continuously and detect conditions within seconds is ____ C
A Soft computing B Cloud computing
C Stream processing D Conventional processing
070. Stream processing has streamlined the architecture because it unifies _________ B
A data and analytics B applications and analytics
C system and analytics D applications and streams
071. ________ is the process of selecting or matching instances of a desired pattern in a continuous stream of data. D
A Streaming data architecture B Conventional Processing
C Machine learning D Stream filtering
072. Which attribute is NOT indicative of data streaming? C
A Limited amount of memory B Limited amount of processing time
C Limited amount of input data D Limited amount of processing power
073. A framework of software components built to ingest and process large volumes of streaming data from multiple sources is called a ____ A
A streaming data architecture B Conventional Processing
C Mining data streams D cloud data lakes
074. The process of extracting knowledge from continuous, rapid data records which come to the system in a stream is called ____ C
A Text Mining B Data Processing
C Data Stream Mining D Data Mining
075. Which of the following characteristics can Data Stream Mining not fulfil? D
A Continuous Stream of Data B Concept Drifting
C Volatility of data D Volume
076. Data Stream Mining is also known as ____ A
A Stream learning B Machine learning
C Deep learning D System learning
077. Streaming data refers to data that is continuously generated, usually in high ______ and at high ________. D
A Value and variety B Velocity and value
C Variety and veracity D Volumes and Velocity
078. Which of the following is NOT an example of Big Data batch processing? A
A Trending topic analysis of tweets for the last 15 minutes
B Processing 10 GB sales data every 6 hours
C Processing flights sensor data
D Web crawling app
079. The mechanism used to create replicas in HDFS is ____________. C
A Gossip protocol B Replicate protocol
C HDFS protocol D Store and Forward protocol
080. __________ is capable of handling petabytes of data at a time. D
A Apache Script B Apache Speak
C Apache Shell D Apache Spark
081. Which of the following statements about data streaming is true? B
A Stream data is always unstructured data.
B Stream data often has a high velocity.
C Stream elements cannot be stored on disk.
D Stream data is always structured data.
082. Among the following options, which component deals with ingesting streaming data into Hadoop? D
A Oozie B Hive
C Kafka D Flume
083. From the following, select the disadvantage of data streams. B
A This data is helpful in upgrading sales B Lack of security of data in the cloud
C Help in recognizing the fallacy D Helps in minimizing costs
084. From the following, select the advantage of data streams. A
A It provides details to react swiftly to risk
B Lack of security of data in the cloud
C Hold cloud donor subordination
D Off-premises warehouse of details introduces the probable for disconnection
085. Which of the following streaming windows show valid bucket representations according to the DGIM rules? D
A 1011101011110101
B 1011100001100010111001
C 1111001110101
D 101100010111011001011
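Note on question 085: the DGIM rules say that bucket sizes are powers of two, that sizes never decrease going back in time, and that there are one or two buckets of each size up to the largest one. The sketch below checks those rules for a list of bucket sizes. Representing a window as a list of sizes ordered from newest to oldest is an assumption made for this example; the bit strings in the options above do not show bucket boundaries, so this checker does not by itself decide the answer. The name `valid_dgim_buckets` is hypothetical.

```python
from collections import Counter


def is_power_of_two(n):
    """True if n is a positive power of two (1, 2, 4, 8, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def valid_dgim_buckets(sizes):
    """Check DGIM rules for bucket sizes listed from newest to oldest.

    Assumed input format: a list of integers, one per bucket,
    where each integer is the number of 1s in that bucket.
    """
    if not sizes:
        return True
    # Rule: every bucket size is a power of two.
    if not all(is_power_of_two(s) for s in sizes):
        return False
    # Rule: sizes never decrease as we move to older buckets.
    if any(newer > older for newer, older in zip(sizes, sizes[1:])):
        return False
    # Rule: one or two buckets of every size up to the maximum size.
    counts = Counter(sizes)
    size = 1
    while size <= max(sizes):
        if counts[size] not in (1, 2):
            return False
        size *= 2
    return True
```

For example, the size list `[1, 1, 2, 4, 4, 8]` satisfies the rules, while `[1, 1, 1, 2]` does not because it has three buckets of size 1.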
086. When the bit array of a Bloom filter is initialized, all bits are always set to _______. A
A ZERO B ONE
C TWO D NAN
087. Distinct elements can be found using _________. B
A PCY Algorithm B FM Algorithm
C DM Algorithm D AI Algorithm
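Note on question 087: the FM (Flajolet-Martin) algorithm estimates the number of distinct elements in a stream by hashing each element and tracking the largest number of trailing zero bits seen; the estimate is 2 raised to that number. The sketch below is a minimal single-hash version. The use of SHA-256 truncated to 64 bits as the hash function and the names `fm_estimate` and `trailing_zeros` are assumptions for illustration; practical versions combine many hash functions to reduce the variance of the estimate.

```python
import hashlib


def _hash64(item):
    """Hash an item to a 64-bit integer (SHA-256 is an illustrative choice)."""
    digest = hashlib.sha256(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def trailing_zeros(n, width=64):
    """Number of trailing zero bits in n; an all-zero value counts as `width`."""
    if n == 0:
        return width
    return (n & -n).bit_length() - 1


def fm_estimate(stream):
    """Estimate the distinct count as 2**R, R = max trailing zeros over the stream."""
    max_zeros = 0
    for item in stream:
        max_zeros = max(max_zeros, trailing_zeros(_hash64(item)))
    return 2 ** max_zeros
```

Because repeated elements hash to the same value, duplicates in the stream never change the estimate, which is why the method suits distinct-element counting.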
088. Where is the HDFS replication factor controlled? D
A mapred-site.xml B yarn-site.xml
C core-site.xml D hdfs-site.xml
089. Which of the following Hadoop config files is used to define the heap size? C
A hdfs-site.xml B core-site.xml
C hadoop-env.sh D Slaves
090. For filtering a stream, _________ is used. A
A Bloom Filter
B Sensor readings from machines.
C e-Commerce purchase data.
D Stock exchange data to predict the stock price
091. The Park, Chen, Yu algorithm is useful for __________ in Big Data applications. D
A Find Field Itemset B Find largest Itemset
C Find filtered Itemset D Find Frequent Itemset
092. _______ Streaming Analytics is a platform that delivers insights from high-velocity streams of live data from multiple sources and enables immediate action. C
A SQL B Oracle
C Cisco Connected D Apache Spark
093. _______ Streaming Analytics is a platform that provides a graphical interface to Fast Data. B
A SQL B Oracle
C Cisco Connected D Apache Spark
094. _____________ Streaming is a Big Data platform for data stream analytics in real time. D
A SQL B Oracle
C Cisco Connected D Apache Spark
095. A Bloom filter is a ___________ data structure, conceived by Burton Howard Bloom in 1970. C
A advanced B un-structures
C space-efficient probabilistic D scientific
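Note on questions 086, 090 and 095: a minimal Bloom filter sketch in Python follows. It starts with a bit array of all zeros, sets several hashed positions per added item, and answers membership queries with "possibly present" or "definitely absent". The class name `BloomFilter`, the default sizes, and the use of seeded SHA-256 digests as the hash functions are assumptions for illustration, not a reference implementation.

```python
import hashlib


class BloomFilter:
    """Space-efficient probabilistic set: false positives possible, no false negatives."""

    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [0] * size  # the bit array begins with all bits set to zero

    def _positions(self, item):
        """Yield one bit position per hash function (seeded SHA-256, illustrative)."""
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        """Set every hashed position for the item to 1."""
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        """False means definitely absent; True means possibly present."""
        return all(self.bits[pos] for pos in self._positions(item))
```

When filtering a stream, each arriving element is tested with `might_contain`; elements that return False can be dropped with certainty, while a True result may occasionally be a false positive.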
096. D