Data Warehousing Questions

Q:

What is the difference between agglomerative and divisive Hierarchical Clustering?

Answer

- Agglomerative hierarchical clustering works bottom-up: it starts from the individual objects (the sub-components) and moves up towards the parent clusters. Divisive hierarchical clustering works top-down: it starts from the parent cluster and moves down to the children.


- In the agglomerative method, each object starts as its own cluster, and these clusters are merged step by step into larger ones. Merging continues until all the single clusters have been combined into one large cluster that contains every object of the child clusters. In the divisive method, the parent cluster is split into smaller clusters, and the splitting continues until each cluster holds a single object.
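
The bottom-up merging described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the points are one-dimensional numbers, the distance between two clusters is taken as the distance between their closest members (single linkage), and the function name `agglomerative` and its `target_clusters` parameter are hypothetical names chosen for this example.

```python
def agglomerative(points, target_clusters=1):
    """Bottom-up clustering of 1-D points using single linkage (illustrative)."""
    # Every point starts as its own cluster.
    clusters = [[p] for p in points]
    merges = []  # the merged clusters, in the order they were formed

    def distance(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(abs(x - y) for x in a for y in b)

    while len(clusters) > target_clusters:
        # Find the pair of clusters that are closest to each other.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = sorted(clusters[i] + clusters[j])
        merges.append(merged)
        # Remove the two old clusters (j first, because j > i) and add the merged one.
        del clusters[j]
        del clusters[i]
        clusters.append(merged)
    return clusters, merges


clusters, merges = agglomerative([1.0, 1.5, 5.0, 5.2, 9.0], target_clusters=2)
print(clusters)  # [[9.0], [1.0, 1.5, 5.0, 5.2]]
```

With `target_clusters=1` the loop runs until a single cluster holds every point, which matches the stopping condition in the answer. A divisive method would run the other way, starting from that single cluster and splitting it.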

Q:

How to perform incremental load in DataStage?

Answer

- An incremental load brings in only the records that are new or changed since the previous load. It is often run daily.


- When data is selected from the source, only the records with a timestamp between the last load and the current time are loaded.


- Two parameters are passed to the job: the last loaded date and the current date.


- The first parameter, the stored last run date, is read through job parameters.


- The second parameter is the current date.
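
The selection logic above can be sketched outside DataStage in plain Python. This is a minimal illustration, not a DataStage job: the function name `incremental_extract`, the `updated_at` field, and the sample rows are all assumptions made for the example.

```python
from datetime import datetime


def incremental_extract(rows, last_run, current_run):
    """Return the rows changed after last_run and up to current_run."""
    return [r for r in rows if last_run < r["updated_at"] <= current_run]


# Hypothetical source rows, each with a last-modified timestamp.
source_rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1, 9, 0)},
    {"id": 2, "updated_at": datetime(2024, 1, 2, 9, 0)},
    {"id": 3, "updated_at": datetime(2024, 1, 3, 9, 0)},
]

last_run = datetime(2024, 1, 1, 12, 0)     # first parameter: stored last run date
current_run = datetime(2024, 1, 2, 12, 0)  # second parameter: current date

delta = incremental_extract(source_rows, last_run, current_run)
print([r["id"] for r in delta])  # [2]

# After a successful load, the current date becomes the stored last run date.
last_run = current_run
```

The half-open window (strictly after the last run, up to and including the current run) is a design choice so that a row is not picked up by two consecutive loads.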

Q:

Difference between ER Modeling and Dimensional Modeling.

Answer

Dimensional modeling is flexible from the user's perspective, and a dimensional data model maps directly to the schemas that users query. An ER model is not mapped to such schemas and is not used to convert normalized data into denormalized form.


ER modeling is used for OLTP databases, which are designed in 1st, 2nd, or 3rd normal form, whereas dimensional modeling is used for data warehousing and generally relies on denormalized structures rather than 3rd normal form.


An ER model contains normalized data, whereas a dimensional model contains denormalized data.
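
The difference between normalized and denormalized data can be seen in a small sketch using Python's built-in `sqlite3` module. The table names (`city`, `customer`, `dim_customer`) and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized (ER-style) tables: each city is stored once and referenced by key.
    CREATE TABLE city (
        city_id   INTEGER PRIMARY KEY,
        city_name TEXT,
        country   TEXT
    );
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT,
        city_id     INTEGER REFERENCES city(city_id)
    );
    INSERT INTO city VALUES (1, 'Pune', 'India'), (2, 'Lyon', 'France');
    INSERT INTO customer VALUES (10, 'Asha', 1), (11, 'Marc', 2), (12, 'Ravi', 1);

    -- Denormalized dimension: city attributes are repeated on every customer row.
    CREATE TABLE dim_customer AS
    SELECT c.customer_id, c.name, ci.city_name, ci.country
    FROM customer AS c
    JOIN city AS ci ON ci.city_id = c.city_id;
""")

rows = conn.execute(
    "SELECT customer_id, name, city_name, country "
    "FROM dim_customer ORDER BY customer_id"
).fetchall()
for row in rows:
    print(row)
```

In the normalized tables, a change to a city touches one row. In the denormalized dimension, the same value appears on several rows, which costs storage but lets a query read customer and city attributes without a join.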

Q:

How do you generate sequence numbers in DataStage?

Answer

Sequence numbers can be generated in DataStage using the following routines:


-KeyMgtGetNextVal


-KeyMgtGetNextValConn
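
The general idea behind such routines, a named counter that hands out the next number on each call, can be sketched in Python. This is not the DataStage implementation: the class `KeyStore`, its `next_value` method, and the choice to start each sequence at 1 are hypothetical and used only to show the concept.

```python
class KeyStore:
    """Hypothetical in-memory store of named sequence counters."""

    def __init__(self):
        # Maps a sequence name to the next value it will hand out.
        self._next = {}

    def next_value(self, sequence_name):
        """Return the next number for sequence_name and advance the counter."""
        value = self._next.get(sequence_name, 1)  # assumed: sequences start at 1
        self._next[sequence_name] = value + 1
        return value


keys = KeyStore()
print(keys.next_value("customer"))  # 1
print(keys.next_value("customer"))  # 2
print(keys.next_value("order"))     # 1
```

Each sequence name keeps its own counter, so keys for different target tables do not interfere with each other. A production version would also need to persist the counters and handle concurrent callers, which this sketch leaves out.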

Q:

What is the purpose of cluster analysis in Data Warehousing?

Answer

Cluster analysis is used to group objects without relying on class labels. It analyzes the data held in the data warehouse and compares newly formed clusters with the clusters that already exist. It assigns sets of objects to groups, which are known as clusters. It is a data mining task that draws on techniques such as statistical data analysis, and it is applied in many fields, including machine learning, pattern recognition, image analysis, and bioinformatics. Cluster analysis is an iterative process of knowledge discovery that involves trial and error: the pre-processing and other parameters are adjusted until the result has the desired properties.

Q:

What are the components of DataStage?

Answer

DataStage has two groups of components:


- Client components – Designer, Director, Manager, and Administrator


- Server components – Server, Repository, and Plug-ins

Q:

What is the difference between an operational data store (ODS) and a data warehouse?

Answer

Data warehouse


- It is a decision support database system that serves the organization's needs.


- It is a non-volatile, integrated, and time-variant collection of data.


Operational data store


- It is an integrated collection of information.


- It holds only recent data, typically no more than 90 days.


- It supports dynamic data, that is, data that keeps changing.

Q:

What is DataStage?

Answer

- A tool for designing Extraction, Transformation, and Loading (ETL) jobs


- Well suited to data integration and system migration projects


- Importing, extracting, and creating metadata are done within these jobs


- DataStage allows jobs to be scheduled, run, and monitored


- It allows development and execution to be administered in a single environment
