4. Sun also has the Hadoop Live CD ________ project, which allows running a fully functional Hadoop cluster using a live CD.
A. OpenOffice.org
B. OpenSolaris
C. GNU
D. Linux
ANSWER: B
8. Hadoop achieves reliability by replicating the data across multiple hosts, and hence does not require ________ storage on hosts.
A. RAID
B. ZFS
C. Operating System
D. DFS
ANSWER: A
9. Above the file systems comes the ________ engine, which consists of one Job Tracker, to which client applications submit MapReduce jobs.
A. MapReduce
B. Google
C. Functional Programming
D. Facebook
ANSWER: A
10. The Hadoop list includes the HBase database, the Apache Mahout ________ system, and matrix operations.
A. Machine learning
B. Pattern recognition
C. Statistical classification
D. Artificial intelligence
ANSWER: A
11. ________ is a platform for constructing data flows for extract, transform, and load (ETL) processing and analysis of large datasets.
A. Pig Latin
B. Oozie
C. Pig
D. Hive
ANSWER: C
13. ________ hides the limitations of Java behind a powerful and concise Clojure API for Cascading.
A. Scalding
B. HCatalog
C. Cascalog
D. All of the mentioned
ANSWER: C
17. ________ is a general-purpose computing model and runtime system for distributed data analytics.
A. MapReduce
B. Drill
C. Oozie
D. None of the mentioned
ANSWER: A
18. The Pig Latin scripting language is not only a higher-level data flow language but also has operators similar to ________.
A. JSON
B. XML
C. XSL
D. SQL
ANSWER: D
21. As companies move past the experimental phase with Hadoop, many cite the need for additional capabilities, including:
A. Improved data storage and information retrieval
B. Improved extract, transform and load features for data integration
C. Improved data warehousing functionality
D. Improved security, workload management and SQL support
ANSWER: D
23. According to analysts, for what can traditional IT systems provide a foundation when they are integrated with big data technologies like Hadoop?
A. Big data management and data mining
B. Data warehousing and business intelligence
C. Management of Hadoop clusters
D. Collecting and storing unstructured data
ANSWER: A
24. Hadoop is a framework that works with a variety of related tools. Common cohorts include:
A. MapReduce, MySQL and Google Apps
B. MapReduce, Hive and HBase
C. MapReduce, Hummer and Iguana
D. MapReduce, Heron and Trumpet
ANSWER: B
31. A ________ node acts as the Slave and is responsible for executing a Task assigned to it by the JobTracker.
A. MapReduce
B. Mapper
C. TaskTracker
D. JobTracker
ANSWER: C
33. ________ part of the MapReduce is responsible for processing one or more chunks of data and producing the output results.
A. Maptask
B. Mapper
C. Task execution
D. All of the mentioned
ANSWER: A
34. ________ function is responsible for consolidating the results produced by each of the Map() functions/tasks.
A. Map
B. Reduce
C. Reducer
D. Reduced
ANSWER: B
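The division of labour in this question — each Map() task processes a chunk, Reduce() consolidates their results — can be sketched in plain Python (no Hadoop involved; the function names here are illustrative, not Hadoop's API):

```python
from collections import defaultdict

def map_task(chunk):
    # Each Map() task processes one chunk of input and emits (key, value) pairs.
    for word in chunk.split():
        yield (word, 1)

def reduce_task(pairs):
    # Reduce() consolidates the pairs produced by all of the Map() tasks.
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

chunks = ["the quick brown fox", "the lazy dog"]
pairs = [p for chunk in chunks for p in map_task(chunk)]
print(reduce_task(pairs))  # "the" appears in both chunks, so its count is 2
```

In a real cluster the map tasks run in parallel on different nodes; the consolidation step is the same idea at scale.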
35. Point out the wrong statement:
A. A MapReduce job usually splits the input data-set into independent chunks which are processed by the map tasks in a completely parallel manner
B. The MapReduce framework operates exclusively on <key, value> pairs
C. Applications typically implement the Mapper and Reducer interfaces to provide the map and reduce methods
D. None of the mentioned
ANSWER: D
36. Although the Hadoop framework is implemented in Java, MapReduce applications need not be written in ________.
A. C
B. C++
C. Java
D. VB
ANSWER: C
37. ________ is a utility which allows users to create and run jobs with any executables as the mapper and/or the reducer.
A. HadoopStrdata
B. Hadoop Streaming
C. Hadoop Stream
D. None of the mentioned
ANSWER: B
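Hadoop Streaming works by running any executable that reads records from stdin and writes tab-separated key/value lines to stdout. A minimal streaming-style word-count mapper in Python might look like this (a sketch; the `word<TAB>1` output format is the convention Streaming expects):

```python
import sys

def stream_map(lines):
    # Emit one "word\t1" record per word: the tab-separated line format
    # that Hadoop Streaming passes on to the shuffle phase.
    records = []
    for line in lines:
        for word in line.split():
            records.append(f"{word}\t1")
    return records

if __name__ == "__main__":
    for record in stream_map(sys.stdin):
        print(record)
```

Such a script would be handed to the streaming jar via its `-mapper` (and a counterpart via `-reducer`) option; the exact jar path varies by distribution.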
41. Mapper implementations are passed the JobConf for the job via the ________ method.
A. JobConfigure.configure
B. JobConfigurable.configure
C. JobConfigurable.configureable
D. None of the mentioned
ANSWER: B
42. Point out the correct statement:
A. Applications can use the Reporter to report progress
B. The Hadoop MapReduce framework spawns one map task for each InputSplit generated by the InputFormat for the job
C. The intermediate, sorted outputs are always stored in a simple (key-len, key, value-len, value) format
D. All of the mentioned
ANSWER: D
46. The output of the ________ is not sorted in the MapReduce framework for Hadoop.
A. Mapper
B. Cascader
C. Scalding
D. None of the mentioned
ANSWER: D
47. Which of the following phases occur simultaneously?
A. Reduce and Sort
B. Shuffle and Sort
C. Shuffle and Map
D. All of the mentioned
ANSWER: B
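Shuffle and Sort happen together because sorting the intermediate pairs by key is precisely what groups each key's values for a reducer. The effect can be sketched with Python's `sorted` and `itertools.groupby`:

```python
from itertools import groupby
from operator import itemgetter

def shuffle_and_sort(pairs):
    # Sort the intermediate (key, value) pairs by key, then group them so
    # that each reducer receives (key, [all values for that key]).
    ordered = sorted(pairs, key=itemgetter(0))
    return [(key, [value for _, value in group])
            for key, group in groupby(ordered, key=itemgetter(0))]

print(shuffle_and_sort([("b", 1), ("a", 1), ("b", 1)]))
# → [('a', [1]), ('b', [1, 1])]
```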
48. Mapper and Reducer implementations can use the ________ to report progress or just indicate that they are alive.
A. Partitioner
B. OutputCollector
C. Reporter
D. All of the mentioned
ANSWER: C
49. ________ is a generalization of the facility provided by the MapReduce framework to collect data output by the Mapper or the Reducer.
A. Partitioner
B. OutputCollector
C. Reporter
D. All of the mentioned
ANSWER: B
50. ________ is the primary interface for a user to describe a MapReduce job to the Hadoop framework for execution.
A. Map Parameters
B. JobConf
C. MemoryConf
D. All of the mentioned
ANSWER: B
51. A ________ serves as the master and there is only one NameNode per cluster.
A. Data Node
B. NameNode
C. Data block
D. Replication
ANSWER: B
56. Which of the following scenario may not be a good fit for HDFS?
A. HDFS is not suitable for scenarios requiring multiple/simultaneous writes to the same file
B. HDFS is suitable for storing data related to applications requiring low latency data access
C. HDFS is suitable for storing data related to applications requiring high latency data access
D. None of the mentioned
ANSWER: A
57. The need for data replication can arise in various scenarios like:
A. Replication Factor is changed
B. DataNode goes down
C. Data Blocks get corrupted
D. All of the mentioned
ANSWER: D
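All three scenarios in this question reduce to the same condition: a block's count of live, healthy replicas has fallen below the target replication factor. A hedged sketch of that check (the names are illustrative, not the NameNode's actual internals):

```python
def blocks_needing_replication(live_replicas, replication_factor):
    # live_replicas maps a block id to its number of live, uncorrupted copies.
    # A DataNode going down, a corrupted replica, or a raised replication
    # factor all push this count below the target.
    return sorted(block for block, count in live_replicas.items()
                  if count < replication_factor)

print(blocks_needing_replication({"blk_1": 3, "blk_2": 2, "blk_3": 1}, 3))
# → ['blk_2', 'blk_3']
```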
58. ________ is the slave/worker node and holds the user data in the form of Data Blocks.
A. DataNode
B. NameNode
C. Data block
D. Replication
ANSWER: A
59. HDFS provides a command line interface called ________ used to interact with HDFS.
A. HDFS Shell
B. FS Shell
C. DFSA Shell
D. None
ANSWER: B
63. Cloudera ________ includes CDH and an annual subscription license (per node) to Cloudera Manager and technical support.
A. Enterprise
B. Express
C. Standard
D. All the above
ANSWER: A
64. Cloudera Express includes CDH and a version of Cloudera ________ lacking enterprise features such as rolling upgrades and backup/disaster recovery.
A. Enterprise
B. Express
C. Standard
D. Manager
ANSWER: D
68. ________ is an open source set of libraries, tools, examples, and documentation engineered.
A. Kite
B. Kize
C. Ookie
D. All of the mentioned
ANSWER: A
69. To configure short-circuit local reads, you will need to enable ________ on local Hadoop.
A. librayhadoop
B. libhadoop
C. libhad
D. hadoop
ANSWER: B
74. You can delete a column family from a table using the ________ method of the HBaseAdmin class.
A. delColumn()
B. removeColumn()
C. deleteColumn()
D. All of the mentioned
ANSWER: C
75. Point out the wrong statement:
A. To read data from an HBase table, use the get() method of the HTable class
B. You can retrieve data from the HBase table using the get() method of the HTable class
C. While retrieving data, you can get a single row by id, or get a set of rows by a set of row ids, or scan
an entire table or a subset of rows
D. None of the mentioned
ANSWER: D
77. The ________ class provides the getValue() method to read the values from its instance.
A. Get
B. Result
C. Put
D. Value
ANSWER: B
83. Which of the following is true about the base plotting system?
A. Margins and spacings are adjusted automatically depending on the type of plot and the data
B. Plots are typically created with a single function call
C. Plots are created and annotated with separate functions
D. The system is most useful for conditioning plots
ANSWER: C
87. Which of the following functions is typically used to add elements to a plot in the base graphics system?
A. lines()
B. hist()
C. plot()
D. boxplot()
ANSWER: A
88. Which function opens the screen graphics device for the Mac?
A. bitmap()
B. quartz()
C. pdf()
D. png()
ANSWER: B
94. In 1991, R was created by Ross Ihaka and Robert Gentleman in the Department of Statistics at the University of ________.
A. Johns Hopkins
B. California
C. Harvard
D. Auckland
ANSWER: D
97. R is technically much closer to the Scheme language than it is to the original ________ language.
A. B
B. C
C. R
D. S
ANSWER: D
98. The R-help and ________ mailing lists have been highly active for over a decade now.
A. R-mail
B. R-devel
C. R-dev
D. R-d
ANSWER: B
100. The copyright for the primary source code for R is held by the ________ Foundation.
A. A
B. C
C. C++
D. R
ANSWER: D
104. The R system contains, among other things, the base package which is required to run R.
A. root
B. child
C. base
D. none of the above
ANSWER: C
117. If a command is not complete at the end of a line, R will give a different prompt; by default it is:
A. *
B. -
C. +
D. All the above
ANSWER: C
118. Command lines entered at the console are limited to about ________ bytes.
A. 3000
B. 4095
C. 5000
D. None
ANSWER: B
119. ________ text editor provides more general support mechanisms via ESS for working interactively with R.
A. EAC
B. Emacs
C. Shell
D. None
ANSWER: B
120. What would be the result of the following R code?
> x <- 1
> print(x)
A. 1
B. 2
C. 3
D. 4
ANSWER: A
126. ________ will divert all subsequent output from the console to an external file.
A. sink
B. div
C. dip
D. exp
ANSWER: A
128. Which of the following can be used to display the names of (most of) the objects which are currently stored within R?
A. object()
B. objects()
C. list()
D. none of the above
ANSWER: B
130. What will be the output of the following code snippet?
> paste("a", "b", se = ":")
A. a+b
B. a-b
C. ab
D. none
ANSWER: D
134. You can check to see whether an R object is NULL with the ________ function.
A. is.nullobj()
B. null()
C. is.null()
D. obj.null()
ANSWER: C
138. For YARN, the ________ Manager UI provides host and port information.
A. Data Node
B. NameNode
C. Resource
D. Replication
ANSWER: C
139. Point out the correct statement:
A. The Hadoop framework publishes the job flow status to an internally running web server on the
master nodes of the Hadoop cluster
B. Each incoming file is broken into 32 MB by default
C. Data blocks are replicated across different nodes in the cluster to ensure a low degree of fault
tolerance
D. None of the mentioned
ANSWER: A
140. For ________, the HBase Master UI provides information about the HBase Master uptime.
A. Oozie
B. HBase
C. Kafka
D. Afka
ANSWER: B
142. ________ Manager's Service feature monitors dozens of service health and performance metrics about the services and role instances running on your cluster.
A. Microsoft
B. Cloudera
C. Amazon
D. None of the above
ANSWER: B
143. The IBM ________ Platform provides all the foundational building blocks of trusted information, including data integration, data warehousing, master data management, big data and information governance.
A. InfoStream
B. InfoSphere
C. InfoSurface
D. None of the mentioned
ANSWER: B
148. DataStage originated at ________, a company that developed two notable products: UniVerse database and the DataStage ETL tool.
A. VMark
B. Vzen
C. Hatez
D. SMark
ANSWER: A