Ray Wurlod

ABN 57 092 448 518

Education and Consulting Services

P.O. Box 1214

North Sydney  N.S.W.  2060

Australia

 

Email: rayw@mindless.com

 

 

Elements of DataStage™ Health Check

 

This page outlines the elements of a DataStage health check, one of the services offered by this business.

Depending on the site, not all of the items below will be covered. For example, if the site does not use parallel jobs, runtime column propagation and its impact on metadata management will not be investigated.

At the end of the health check a report is delivered. It outlines what was investigated, how well the site achieves its goals, whether there is scope for improvement in any of these areas and, if so, how that might be accomplished.

 

Structured Interview

Overview of DataStage at site

Purpose to which DataStage is being put

Other tools in use at site that interact with ETL

Restrictions on processing (for example time windows)

System description

Data sources and targets

Personnel and skill sets

Existence of in-house standards

 

Monitoring Processes

Can time windows be met?

How much scope exists for increased capacity?

Strategies for ongoing monitoring

 

Documentation

Design documents

DataStage documentation

Metadata reporting (processing and technical metadata)

 

Metadata Management

Preserving Repository linkages

Usage analysis

Preparing for MetaStage

Runtime column propagation (parallel jobs)

 

DataStage Best Practices

Job parameters

Naming of objects

Re-use of components

Structure of Repository (categories)

Staging areas

Phases of ETL

 

Performance

Definitions (developer and execution performance)

Expectation management

Efficient practices

 

Testing and Quality Assurance

Unit testing (job and routine level)

System testing in development environment

System testing in test/QA environment

User acceptance testing

 

Versioning

Promotion and regression strategies

Version numbers for jobs and routines

Version Control practices

MetaStage directory versions

 

Disaster Planning/Recovery

Development System backups

Project exports

Documented recovery strategy

 

Administration

Security

Project-wide defaults

Repository maintenance

Hashed file tuning and maintenance

 

Programming

Routines

Job sequences/job control/batch

 

 

 

DataStage and MetaStage are trademarks of International Business Machines Corporation.

Formally, the product names are IBM® WebSphere® DataStage and IBM WebSphere MetaStage.

IBM and WebSphere are registered trademarks of International Business Machines Corporation.

 

This page is copyright © 2005-2006, Ray Wurlod. All rights reserved.

Page last updated 31 October, 2006.