
CS8031/ MCA7107

DMW

Self Assessment/Revision Quiz

1. Which of the following statements are correct?
A. A data warehouse refers to a database that is maintained separately from an
organization's operational databases.
B. Data in the data warehouse are stored to provide information from a historical
perspective.
i. only A
ii. only B
iii. Both A and B
iv. None
2. Which is not a property of a data warehouse?
i) Subject oriented
ii) Volatile
iii) Detailed and update oriented
iv) Integrated
3. Which of the following is not a kind of data warehouse application?
i) Information processing
ii) Analytical processing
iii) Data mining
iv) Transaction processing
4. ____________ is used to refer to systems and technologies that provide the
business with the means for decision-makers to extract personalized meaningful
information about their business and industry.
i. Business intelligence
ii. Data warehousing
iii. Database
iv. All
5. Which of the following statements are correct?
A. Data mining is concerned with finding hidden relationships present in business
data to allow businesses to make predictions for future use.
B. Modeling is simply the act of building a model based on data from situations
where the answer is known and then applying the model to other situations
where the answers are not known.
i. only A
ii. only B
iii. Both A and B
iv. None
6. Data mining functionalities include
i) Prediction
ii) Association mining
iii) Classification
iv) Visualization
7. The full form of OLAP is
i) Online Analytical Processing
ii) Online Advanced Processing
iii) Online Advanced Preparation
iv) Online Analytical Performance
8. The data is stored, retrieved and updated in ....................
i) OLAP
ii) OLTP
iii) SMTP
iv) FTP
9. Which of the following OLAP engines is most suitable when the refresh rate is high?
i) ROLAP
ii) MOLAP
iii) MQE
iv) All
10. Which of the following OLAP engines is most suitable when aggregation queries over
huge data are most likely?
i) ROLAP
ii) MOLAP
iii) MQE
iv) All
11. An .................. system is market-oriented and is used for data analysis by
knowledge workers, including managers, executives, and analysts.
i) OLAP
ii) OLTP
iii) Both of the above
iv) None of the above
12. Which of the following is not a component of a data warehouse?
i) Metadata
ii) Current detail data
iii) Lightly summarized data
iv) Component key
13. Metadata is not used for which of the following steps?
i. Data integration
ii. Data cleaning
iii. Data visualization
iv. Data discretization

14. Which of the following is not a kind of data warehouse application?
i) Information processing
ii) Analytical processing
iii) Data mining
iv) Transaction processing
15. With the base cuboid [year, product_group, zone], the OLAP operation suitable for
finding the aggregate of SALES of all product groups in all zones in year 2004 is
i. Slice
ii. Dice
iii. Roll Up
iv. Drill down
16. With the base cuboid [quarter, product_group, zone], the OLAP operation suitable for
finding the aggregate of SALES of all product groups in all zones in year 2004 is
i. Slice
ii. Dice
iii. Roll Up and Slice
iv. Drill down
17. With the base cuboid [year, product_group, zone], the OLAP operation suitable for
finding the aggregate of SALES of product group P1 in all zones in year 2004 is
i. Slice
ii. Dice
iii. Roll Up
iv. Drill down
18. With the base cuboid [year, product_group, zone], the OLAP operation suitable for
finding the aggregate of SALES of product group P1 in zones Z2-Z5 in year 2004 is
i. Slice
ii. Dice
iii. Roll Up
iv. Drill down
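
The OLAP operations named in questions 15 to 18 can be tried out on a toy cube. The following is only a minimal sketch in plain Python: the fact rows, the sales figures, and the helper names (slice_, dice, roll_up) are all hypothetical and do not come from any particular OLAP engine. Which operation answers which question is left to the reader.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, product_group, zone, sales).
# All figures are made up for illustration.
DIMS = ("year", "quarter", "product_group", "zone", "sales")
FACTS = [
    (2004, "Q1", "P1", "Z1", 10),
    (2004, "Q1", "P2", "Z2", 20),
    (2004, "Q2", "P1", "Z2", 30),
    (2004, "Q3", "P1", "Z5", 40),
    (2005, "Q1", "P1", "Z2", 50),
    (2005, "Q2", "P2", "Z1", 60),
]
rows = [dict(zip(DIMS, fact)) for fact in FACTS]


def slice_(rows, dim, value):
    """Slice: fix a single dimension to a single value."""
    return [r for r in rows if r[dim] == value]


def dice(rows, **allowed):
    """Dice: restrict several dimensions to sets of allowed values."""
    return [r for r in rows if all(r[d] in vals for d, vals in allowed.items())]


def roll_up(rows, keep):
    """Roll up: sum sales over every dimension not listed in `keep`."""
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[d] for d in keep)] += r["sales"]
    return dict(totals)


# Quarter-level rows rolled up to the year level.
by_year = roll_up(rows, keep=("year",))

# Slice on year 2004, then aggregate what is left.
total_2004 = roll_up(slice_(rows, "year", 2004), keep=("year",))

# Dice on three dimensions at once.
p1_z2_to_z5_2004 = dice(
    rows, year={2004}, product_group={"P1"}, zone={"Z2", "Z3", "Z4", "Z5"}
)
p1_total = sum(r["sales"] for r in p1_z2_to_z5_2004)

print(by_year, total_2004, p1_total)
```

Drill down is the inverse of roll up: it would call roll_up with a finer `keep`, for example ("year", "quarter").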
19. Attribute relevance analysis does not involve
i. Pearson correlation coefficient
ii. chi-square test
iii. Principal Component Analysis (PCA)
iv. Intuitive 3-4-5 rule
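
As a reminder of what option i of question 19 computes, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The attribute names and values are hypothetical, chosen so that one attribute is perfectly correlated with the target and the other is uncorrelated with it.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical attributes (made-up values).
income = [1, 2, 3, 4, 5]
spend = [2, 4, 6, 8, 10]   # exactly 2 * income
noise = [1, -1, 0, -1, 1]  # no linear relation to income

r_spend = pearson(income, spend)
r_noise = pearson(income, noise)
print(r_spend, r_noise)
```

A coefficient near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates no linear relationship.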
20. Histograms are not useful in
i. Data cleaning
ii. Data discretization
iii. Data transformation
iv. Data summarization
21. Data mining primitives include
i. Task relevant data
ii. Kind of knowledge: description vs. comparison
iii. Interestingness measures
iv. All
22. Data characterization is expressed using ________ in FOPL
i. t-weight
ii. d-weight
iii. both
iv. none
23. Class comparisons are expressed using ________ in FOPL
i. t-weight
ii. d-weight
iii. both
iv. none
24. _________ methods smooth a sorted data value by consulting its neighborhood,
that is, the values around it.
i. Binning
ii. Clustering
iii. Combined computer and human inspection
iv. Regression
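
Neighborhood-based smoothing of sorted values can be sketched as follows. This is a minimal illustration in plain Python, assuming equal-frequency groups of a fixed size and replacement of each value by the mean of its group; the function name and the price values are hypothetical.

```python
def smooth_by_group_means(values, group_size):
    """Sort the values, split them into consecutive groups of `group_size`,
    and replace every value by the mean of its group."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), group_size):
        group = ordered[start:start + group_size]
        mean = sum(group) / len(group)
        smoothed.extend([mean] * len(group))
    return smoothed


# Hypothetical price values.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
smoothed = smooth_by_group_means(prices, 3)
print(smoothed)
# [9.0, 9.0, 9.0, 22.0, 22.0, 22.0, 29.0, 29.0, 29.0]
```

Each value is smoothed using only the values that sit next to it in sorted order, which is why this counts as local smoothing.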
25. ___________ techniques can be used to reduce the number of values for a given
continuous attribute, by dividing the range of the attribute into intervals.
i. Discretization
ii. Transformation
iii. Generalization
iv. Smoothing
26. The shell-fragment approach to cube computation is more efficient than the
BUC approach when the cubes are
i. Dense
ii. Sparse
iii. Generalized
iv. All
27. Concept hierarchies are useful in
i. Data reduction
ii. OLAP execution
iii. Data smoothing
iv. All
28. The process of partitioning the ranges of quantitative attributes into intervals is
called _________
i. Binning
ii. Splitting
iii. Grouping
iv. None
29. _______________ is the process of removing attributes in the data that are
irrelevant to the classification or prediction task.
i. Relevance analysis
ii. Data cleaning
iii. Data transformation
iv. Normalization

30. ____________ focuses on analysis of the link structure of the web, and one of its
purposes is to identify more preferable documents.
i. Web content mining
ii. Web usage mining
iii. Web structure mining
iv. None
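
One well-known way of ranking documents from link structure alone is PageRank-style scoring. The sketch below is a simplified illustration, not a production implementation: the three-page link graph is hypothetical, and the code assumes every page has at least one outgoing link and that every link target also appears as a key.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank along links.

    `links` maps each page to the list of pages it links to. Assumes every
    page has at least one outgoing link and every target is itself a key.
    """
    pages = list(links)
    n = len(pages)
    ranks = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small base score.
        new_ranks = {p: (1.0 - damping) / n for p in pages}
        for page, targets in links.items():
            share = ranks[page] / len(targets)
            for target in targets:
                new_ranks[target] += damping * share
        ranks = new_ranks
    return ranks


# Hypothetical link graph: A links to B and C, B links to C, C links to A.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(links)
best = max(ranks, key=ranks.get)
print(best, ranks)
```

Page C receives links from both A and B, so it ends up with the highest score in this graph.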
31. _____________ is the process of determining which evidence that can be taken from
raw audit data is most useful for analysis.
i. Feature extraction
ii. Data cleaning
iii. Data transformation
iv. None

Reference:
You may refer to the following link, and to the many similar resources available online:
http://online.liebertpub.com/doi/full/10.1089/big.2014.0031
Its content helps in gaining insight into contemporary big-data business.
Aim:
The task aims at gaining enough insight and CONFIDENCE to participate in a standard conference related to
Big Data/Data Science applications or business, besides honing research and documentation skills.
Group:
Groups of 4, formed as per roll number sequence
(if the last group has fewer than 3 members, each of its members joins one of the preceding groups)
Submission Schedule:
1. Submission of the list of group-wise topics (i.e. the business/application domain) through the CR (by e-mail or
hard copy)
(Duplication of topics is not permissible)
- Feb 27 (MCA IV)
- March 10 (BE VIII CS)
2. Submission of the report and presentation/viva (group-wise presentation/viva to be scheduled after
regular class hours)
- The respective CRs are to coordinate the scheduling.
- April, 2nd/3rd week
Tasks/ Report Structure:
As part of the progressive evaluation process, you are required to submit a group-project report on
"(Start-Up) Business Plan for Big-Data Analysis and Mining Software Service".
The report should be structured as follows:
1. Abstract

2. Problem Statement: the big-data analysis perspective, the need for it in the business/application domain
chosen, and the claimed return on investment (ROI)
3. Survey and insight
4. Conceptualization and design of the business plan
5. Design of the software modules
6. Justified incorporation of the data analysis and mining methods studied
7. Demonstration of the programming tools and libraries explored
The focus during evaluation will be on:
1. Documentation skill
2. Intensity of presentation to defend individual effort
Looking forward to enthusiastic participation.
