Data Warehousing Questions

Q:

Explain the difference between data mining and data warehousing.

Answer

Data mining is a method for comparing large amounts of data for the purpose of finding patterns. Data mining is normally used for models and forecasting. Data mining is the process of correlations, patterns by shifting through large data repositories using pattern recognition techniques.


Data warehousing is the central repository for the data of several business systems in an enterprise. Data from various resources extracted and organized in the data warehouse selectively for analysis and accessibility.

Report Error

View answer Workspace Report Error Discuss

0 1794
Q:

State the difference between View and Materialized View ?

Answer

The main differenc between View and Materialized View states that -


1. View means it stores the SQL statement in the database and let you use it as a table. Everytime you access the view, the SQL statement executes.


2. Materialized view means it stores the results of the SQL in table form in the database. SQL statement only executes once and after that everytime you run the query, the stored result set is used. Advantage include quick query results.

Report Error

View answer Workspace Report Error Discuss

6 1745
Q:

What is an OLTP system and OLAP system?

Answer

OLTP stands for OnLine Transaction Processing. Applications that supports and manges transactions which involve high volumes of data are supported by OLTP system. OLTP is based on client-server architecture and supports transactions across networks.


OLAP stands for OnLine Analytical Processing. Business data analysis and complex calculations on low volumes of data are performed by OLAP. An insight of data coming from various resources can be gained by a user with the support of OLAP.

Report Error

View answer Workspace Report Error Discuss

0 1690
Q:

Describe about Physical Data Integration?

Answer

- Physical Data Integration is all about creating new system that replicates data from the source systems.


- This process is done to manage the data independent of the original system.


- Data Warehouse is the example of Physical Data Integration.


- The benefits of PDI include data version management, combination of data from various sources, like mainframes, flat files, databases.


- A separate system is needed for handling vast data volumes.


 

Report Error

View answer Workspace Report Error Discuss

0 1595
Q:

What is Data Mart?

Answer

Data Mart is a data repository which is served to a community of people who works on knowledge (also known as knowledge workers). The data resource can be from enterprise resources or from a data warehouse.

Report Error

View answer Workspace Report Error Discuss

0 1535
Q:

Explain the use of lookup tables and Aggregate tables.

Answer

At the time of updating the data warehouse, a lookup table is used. When placed on the fact table or warehouse based upon the primary key of the target, the update is takes place only by allowing new records or updated records depending upon the condition of lookup.


The materialized views are aggregate tables. It contains summarized data. For example, to generate sales reports on weekly or monthly or yearly basis instead of daily basis of an application, the date values are aggregated into week values, week values are aggregated into month values and month values into year values. To perform this process aggregate function is used.

Report Error

View answer Workspace Report Error Discuss

0 1490
Q:

Describe the foreign key columns in fact table and dimension table.

Answer

The primary keys of entity tables are the foreign keys of dimension tables.


The Primary keys of fact dimensional table are the foreign keys of fact tablels.

Report Error

View answer Workspace Report Error Discuss

1 1486
Q:

What is data cleaning? How can we do that?

Answer

Data cleaning is also known as data scrubbing. Data cleaning is a process which ensures the set of data is correct and accurate. Data accuracy and consistency, data integration is checked during data cleaning. Data cleaning can be applied for a set of records or multiple sets of data which need to be merged.


Data cleaning is performed by reading all records in a set and verifying their accuracy. Typos and spelling errors are rectified. Mislabeled data if available is labeled and filed. Incomplete or missing entries are completed. Unrecoverable records are purged, for not to take space and inefficient operations.

Report Error

View answer Workspace Report Error Discuss

0 1481