Cs2032 Question Bank

2 Marks
1. What is a data warehouse?
2. What is the significance of a subject-oriented data warehouse?
3. Why do we use an integrated data warehouse?
4. What is the role of the time-variant feature in a data warehouse?
5. What is meant by the non-volatile nature of a data warehouse?
6. State the difference between a data warehouse and an operational DBMS.
7. List the distinguishing features of OLTP and OLAP.
8. Why do we need a separate data warehouse?
9. Give the conceptual modeling of a data warehouse.
10. Define the distributive category of data warehouse measures.
11. Define the algebraic category of data warehouse measures.
12. Define the holistic category of data warehouse measures.
13. List the OLAP operations and their functionality.
14. What are the different types of data warehouse design process?
15. Define enterprise warehouse.
16. What is meant by a data mart?
17. Define virtual warehouse.
18. What are the back-end tools and utilities of a data warehouse?
19. What are the applications of data warehousing?
20. Why do we need online analytical mining?

16 Marks
1. Give the architecture of a data warehouse and explain its usage.
2. State the difference between OLTP and OLAP in detail.
3. Explain the operations performed on a data warehouse with examples.
4. Write short notes on data warehouse metadata.
5. Explain the conceptual modeling of data warehouses.

2 Marks
1. Why do we need data preprocessing?
2. List the multidimensional measures of data quality.
3. What is meant by data cleaning?
4. Define data integration.
5. Why do we need data transformation?
6. Define data reduction.
7. What is meant by data discretization?
8. What are the discretization processes involved in data preprocessing?
9. Define concept hierarchy.
10. Why do we need data mining primitives and languages?
11. What are the types of knowledge to be mined?
12. Define Data Mining Query Language.
13. What tasks should be considered in the design of GUIs based on a data mining query language?
14. What are the types of coupling of a data mining system with a DB/DW system?
15. List the five primitives for specification of a data mining task.
16. Distinguish between descriptive and predictive data mining.
17. What is the strength of data characterization?
18. List the limitations of data characterization.
19. How is attribute-oriented induction done?
20. Give the basic principle of attribute-oriented induction.
21. What is the basic algorithm for attribute-oriented induction?
22. State the difference between characterization and OLAP.
23. What is meant by boxplot analysis?
24. Define histogram analysis.
25. What is meant by a quantile plot?
26. Define quantile-quantile plot.
27. Define scatter plot.
28. Give the definition of a loess curve.
29. How can we measure the dispersion of data?

16 Marks
1. Explain the major tasks in data preprocessing.
2. What is data cleaning? List and explain the various techniques used for data cleaning.
3. How is attribute-oriented induction implemented? Explain with an example.
4. Why do we preprocess the data? Explain how data preprocessing techniques can improve the quality of the data.
5. List and describe the primitives for specifying a data mining task.
6. Describe how concept hierarchies and data generalization are useful in data mining.

2 Marks
1. What is association rule mining?
2. What are the applications of association rule mining?
3. State the rule measures for finding associations.
4. What are the different ways to find associations?
5. Why is counting the supports of candidates a problem?
6. Give the method to find the supports of candidates.
7. List the methods to improve Apriori's efficiency.
8. What is the advantage of the FP-tree structure?
9. List the major steps to mine an FP-tree.
10. What is meant by the node-link property?
11. Define the prefix path property.
12. What is the principle of frequent pattern growth?
13. Why is frequent pattern growth fast?
14. What is meant by an iceberg query?
15. Define multiple-level association rule.
16. What is meant by uniform support?
17. What is meant by reduced support?
18. Why is progressive refinement suitable for reduced support?
19. What is the functionality of superset mining?
20. Define two-step or multi-step mining.
21. What are categorical and quantitative attributes?
22. What are the limitations of ARCS?
24. Define quantitative association rules.
25. State distance-based association rules.
26. Give the two-step mining of spatial associations.

16 Marks
1. Discuss the following in detail:
   1. Association mining
   2. Support
   3. Confidence
   4. Rule measures
2. Explain how frequent itemsets are mined, with an example.
3. Describe the join and prune steps in the Apriori algorithm.
4. Discuss the approaches for mining multidimensional association rules from transactional databases. Give suitable examples.
5. A database has four transactions. Let min_sup = 60% and min_conf = 80%.

   TID   Date      Items_bought
   T100  10/15/08  {K, A, D, B}
   T200  10/15/08  {D, A, C, E, B}
   T300  10/16/08  {C, A, B, E}
   T400  10/16/08  {B, A, D}

   (i) Find all frequent itemsets using Apriori and FP-growth respectively.
   (ii) List all strong association rules matching the following meta rule:
        ∀X ∈ transactions, buys(X, item1) ∧ buys(X, item2) ⇒ buys(X, item3)
        where X represents a customer and each item_i represents an item (A, B, etc.).
6. (i) Explain the methods to improve Apriori's efficiency.
   (ii) Construct the FP-tree for the given transaction DB.

   TID  Frequent itemsets
   100  f, c, a, m, p
   200  f, c, a, b, m
   300  f, b
   400  c, b, p
   500  f, c, a, m, p

2 Marks
1. What is the functionality of the classification process?
2. Give the role of prediction in data mining.
3. List the typical applications of classification and prediction.
4. Define supervised learning (classification).
5. What is meant by unsupervised learning (clustering)?
6. What is the process involved in data preparation?
7. How can we evaluate classification methods?
8. What is meant by a decision tree?
9. What are the two phases involved in decision tree construction?
10. What is the condition to stop the partitioning?
11. State the functionality of the greedy algorithm.
12. What is meant by information gain?
13. Define Gini index.
14. State the two approaches to avoid overfitting.
15. Why is decision tree induction used in data mining?
16. Why do we need Bayesian classification?
17. Given training data D, the posterior probability of a hypothesis h, P(h|D), follows Bayes' theorem.
18. What is meant by the k-nearest neighbor algorithm?
19. Define the case-based reasoning approach.
20. What is the role of a genetic algorithm?
21. State the functionality of the rough set approach.
22. Define prediction with classification.
23. How can we estimate error rates?
24. What are the types of prediction?
25. What is meant by boosting?
26. State the role of cluster analysis.
27. Give the applications of clustering.
28. What are the requirements for clustering in data mining?
29. What are the algorithms used for clustering?
30. What are outliers?

16 Marks
1. Discuss Bayesian classification with its theorem.
2. What is prediction? Explain the various prediction techniques.
3. Briefly outline the major steps of decision tree classification.
4. Discuss the different types of clustering methods.
5. Describe the working of the PAM (Partitioning Around Medoids) algorithm.
6. Explain the attribute selection measures in decision tree induction and outline the major steps involved in it.
7. Explain data mining algorithms in detail.
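
As an illustration only, the four-transaction database from the Apriori question above (min_sup = 60%, min_conf = 80%) can be explored with a short Python sketch. The function names `support`, `apriori`, and `strong_rules` are hypothetical helpers introduced here, not part of the question bank, and the code is a minimal level-wise search under those assumptions rather than a reference solution. It covers only the Apriori half of the question, not FP-growth.

```python
from itertools import combinations

# Transactions copied from the question; dates are left out because
# they do not affect support counts.
TRANSACTIONS = {
    "T100": {"K", "A", "D", "B"},
    "T200": {"D", "A", "C", "E", "B"},
    "T300": {"C", "A", "B", "E"},
    "T400": {"B", "A", "D"},
}
MIN_SUP = 0.60
MIN_CONF = 0.80


def support(itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    hits = sum(1 for items in TRANSACTIONS.values() if itemset <= items)
    return hits / len(TRANSACTIONS)


def apriori():
    """Level-wise search: join frequent (k-1)-itemsets, prune, then count."""
    all_items = sorted(set().union(*TRANSACTIONS.values()))
    level = [frozenset([i]) for i in all_items
             if support(frozenset([i])) >= MIN_SUP]
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets of size exactly k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: drop candidates with an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
        }
        level = [c for c in candidates if support(c) >= MIN_SUP]
    return frequent


def strong_rules(frequent):
    """Rules {item1, item2} -> item3 built from frequent 3-itemsets."""
    rules = []
    for itemset, sup in frequent.items():
        if len(itemset) != 3:
            continue
        for consequent in sorted(itemset):
            antecedent = itemset - {consequent}
            conf = sup / frequent[antecedent]
            if conf >= MIN_CONF:
                rules.append((tuple(sorted(antecedent)), consequent, sup, conf))
    return rules


if __name__ == "__main__":
    freq = apriori()
    for itemset in sorted(freq, key=lambda s: (len(s), sorted(s))):
        print(sorted(itemset), freq[itemset])
    for antecedent, consequent, sup, conf in strong_rules(freq):
        print(antecedent, "->", consequent, "support", sup, "confidence", conf)
```

With a 60% threshold on four transactions, an itemset must appear in at least three of them, so the join and prune steps leave few candidates to count at each level.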
