Data Warehousing Questions

Q:

How do you perform an incremental load in DataStage?

Answer

- An incremental load moves only the records that were added or changed since the previous load, instead of reloading the full source. It is often scheduled daily.

- When data is selected from the source, only the records with a timestamp between the last load and the current time are loaded.

- Two parameters are passed to perform the load: the last loaded date and the current date.

- The first parameter, the stored last run date, is read through job parameters.

- The second parameter is the current date.
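
The timestamp window described above can be sketched in a few lines of Python. This is only an illustration of the selection logic, not DataStage code: the function name `incremental_rows`, the `updated_at` field, and the sample rows are all assumptions made for the example.

```python
from datetime import datetime

def incremental_rows(rows, last_load, current_time):
    """Return the rows changed after last_load and up to current_time."""
    return [r for r in rows if last_load < r["updated_at"] <= current_time]

# Hypothetical source rows, each carrying a last-modified timestamp
source = [
    {"id": 1, "updated_at": datetime(2024, 1, 1, 8, 0)},
    {"id": 2, "updated_at": datetime(2024, 1, 2, 9, 30)},
    {"id": 3, "updated_at": datetime(2024, 1, 3, 7, 15)},
]

last_load = datetime(2024, 1, 1, 23, 59)     # first parameter: stored last run date
current_time = datetime(2024, 1, 2, 23, 59)  # second parameter: this run's cutoff

batch = incremental_rows(source, last_load, current_time)
print([r["id"] for r in batch])  # prints [2]
```

After a successful run, the current date would be stored so that it becomes the last run date of the next load.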

Q:

What is the difference between agglomerative and divisive hierarchical clustering?

Answer

- Agglomerative hierarchical clustering is a bottom-up approach: the hierarchy is built from the individual objects (the sub-components) upward to the parent clusters. Divisive clustering is a top-down approach: the parent cluster is visited first, then its children.

- In the agglomerative method, each object starts as its own cluster, and the closest clusters are merged step by step until one large cluster contains all the objects of the child clusters. In the divisive method, all objects start in one parent cluster, which is divided into smaller clusters until each cluster holds a single object.
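
The two directions can be compared with a small Python sketch on one-dimensional points. It is a simplified illustration under stated assumptions: the agglomerative function uses single linkage (distance between the closest members), and the divisive function splits at the widest gap between neighbouring points, which only makes sense for 1-D data. The function names and the `target_clusters` stopping parameter are chosen for the example.

```python
def agglomerative(points, target_clusters=1):
    """Bottom-up: start with one cluster per point and merge the closest pair."""
    clusters = [[p] for p in points]
    merges = []  # record of merged clusters, in the order they were formed
    while len(clusters) > target_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: distance between the two closest members
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        merges.append(sorted(merged))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters, merges

def divisive(points, target_clusters):
    """Top-down: start with one cluster and split at the widest gap (1-D only)."""
    clusters = [sorted(points)]
    while len(clusters) < target_clusters:
        best = None
        for ci, c in enumerate(clusters):
            for k in range(1, len(c)):
                gap = c[k] - c[k - 1]
                if best is None or gap > best[0]:
                    best = (gap, ci, k)
        if best is None:
            break  # every cluster already holds a single point
        _, ci, k = best
        c = clusters.pop(ci)
        clusters += [c[:k], c[k:]]
    return clusters

points = [1.0, 1.5, 5.0, 5.2, 9.0]
bottom_up, merges = agglomerative(points, target_clusters=3)
top_down = divisive(points, target_clusters=3)
```

On this sample both directions arrive at the same three groups, but the agglomerative run gets there by merging and the divisive run by splitting.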

Q:

What is the difference between ER modeling and dimensional modeling?

Answer

Dimensional modeling is very flexible from the user's perspective. A dimensional data model maps directly to warehouse schemas, typically star or snowflake schemas. An ER model is not designed for that purpose and is not used to convert normalized data into denormalized form.

An ER model is used for OLTP databases and is normalized, usually to first, second, or third normal form. A dimensional data model is used for data warehousing and is usually denormalized rather than kept in third normal form.

In short, an ER model contains normalized data, whereas a dimensional model contains denormalized data.
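
To make the dimensional side concrete, here is a minimal star-schema sketch using Python's built-in `sqlite3` module. The table names (`dim_product`, `dim_date`, `fact_sales`), the columns, and the sample rows are hypothetical and only illustrate the shape of such a model: one fact table of measures surrounded by denormalized dimension tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes kept together (denormalized)
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, day TEXT, month TEXT, year INTEGER);

-- Fact table: numeric measures plus a key into each dimension
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    quantity INTEGER,
    amount REAL
);

INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware');
INSERT INTO dim_date VALUES (20240101, '2024-01-01', 'January', 2024);
INSERT INTO fact_sales VALUES (1, 20240101, 3, 30.0), (2, 20240101, 1, 25.0);
""")

# A typical analytical query: aggregate a measure, grouped by dimension attributes
totals = conn.execute("""
    SELECT p.category, d.month, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_date d ON d.date_key = f.date_key
    GROUP BY p.category, d.month
""").fetchall()
print(totals)  # prints [('Hardware', 'January', 55.0)]
```

In a normalized ER design, `category` and `month` would more likely live in their own tables, which saves redundancy for OLTP updates but adds joins to every analytical query.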

Q:

How do you generate sequence numbers in DataStage?

Answer

Sequence numbers can be generated in DataStage using certain routines. They are:

- KeyMgtGetNextVal

- KeyMgtGetNextValConn
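
The general idea behind a key-management routine can be sketched in Python as a set of named counters, each handing out the next unused value. This is a conceptual stand-in only: the `SequenceStore` class and its `next_value` method are hypothetical and do not reproduce the DataStage routines or their signatures.

```python
from collections import defaultdict

class SequenceStore:
    """Hypothetical stand-in for a key-management routine: one counter per name."""

    def __init__(self):
        # Assumption for this sketch: every named sequence starts at 1
        self._next = defaultdict(lambda: 1)

    def next_value(self, name):
        """Return the next number for the named sequence and advance it."""
        value = self._next[name]
        self._next[name] = value + 1
        return value

store = SequenceStore()
customer_keys = [store.next_value("CUSTOMER") for _ in range(3)]
order_key = store.next_value("ORDER")
print(customer_keys, order_key)  # prints [1, 2, 3] 1
```

A real implementation would also have to persist the counters between runs and stay safe under concurrent access, which this in-memory sketch does not attempt.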

Q:

What are the components of DataStage?

Answer

DataStage has two groups of components:

- Client components – Designer, Director, Manager, and Administrator

- Server components – Server, Repository, and Plug-ins

Q:

What is DataStage?

Answer

- A tool for designing extraction, transformation, and loading (ETL) jobs

- An ideal tool for data integration projects and system migrations

- Metadata can be imported, extracted, and created within these jobs

- DataStage allows jobs to be scheduled, monitored, and run

- It allows development and execution to be administered in a single environment


 

Q:

What is the difference between an operational data store (ODS) and a data warehouse?

Answer

Data warehouse

- It is a decision-support database system that serves the organization's analytical needs.

- It is a non-volatile, integrated, and time-variant collection of data.

Operational data store

- It is an integrated collection of information.

- It typically holds only a limited window of recent data, for example up to 90 days.

- It supports dynamic (volatile) data, meaning records can be updated as the source systems change.
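
The retention difference can be illustrated with a short Python sketch. It assumes a 90-day window for the ODS, as in the answer above; the function name `purge_ods`, the `loaded_on` field, and the sample rows are hypothetical.

```python
from datetime import date, timedelta

def purge_ods(rows, today, retention_days=90):
    """Keep only the rows loaded within the retention window ending today."""
    cutoff = today - timedelta(days=retention_days)
    return [r for r in rows if r["loaded_on"] >= cutoff]

loads = [
    {"id": 1, "loaded_on": date(2024, 3, 15)},
    {"id": 2, "loaded_on": date(2024, 5, 1)},
    {"id": 3, "loaded_on": date(2024, 6, 29)},
]

warehouse = list(loads)                          # non-volatile: history is kept
ods = purge_ods(loads, today=date(2024, 6, 30))  # only the recent window remains
print([r["id"] for r in ods])  # prints [2, 3]
```

The warehouse list keeps all three rows, while the ODS drops the row that is older than the window.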

Q:

Explain the difference between data mining and data warehousing.

Answer

Data mining is a method for analyzing large amounts of data to find patterns. It is normally used for building models and forecasting. It is the process of discovering correlations and patterns by sifting through large data repositories using pattern recognition techniques.

Data warehousing is the practice of maintaining a central repository for the data of several business systems in an enterprise. Data from various sources is extracted and organized in the data warehouse selectively for analysis and accessibility.
