Data Warehouse Toolkit

Wherever your corporate data resides, the data warehouse toolkit helps you get it into your data warehouse.

Generic Data Acquisition

A facility and framework for acquisition of data from legacy and other systems. Data acquisition modules allow timed or event-triggered download of any data from a number of different data sources and systems.

For example, a download from an AS/400 general ledger system can be triggered to run every night, with the summarized results loaded into a data warehouse. Validations and business rules can be incorporated as required and are automatically applied across all related downloads, which enforces referential integrity between separate systems. For instance, a file from a UNIX system can be required to contain only valid account numbers, where the account numbers are defined in an AS/400 DB2 database.
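
The cross-system check described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the toolkit's own API: the function name `validate_accounts`, the record layout, and the in-memory set standing in for the AS/400 DB2 account table are all hypothetical.

```python
def validate_accounts(records, valid_accounts):
    """Split records into (accepted, rejected) by account number."""
    accepted, rejected = [], []
    for record in records:
        if record.get("account") in valid_accounts:
            accepted.append(record)
        else:
            rejected.append(record)
    return accepted, rejected


# Hypothetical stand-in for account numbers defined in the DB2 database.
valid_accounts = {"1000", "2000", "3000"}

# Hypothetical rows parsed from a file delivered by a UNIX system.
unix_rows = [
    {"account": "1000", "amount": 125.50},
    {"account": "9999", "amount": 40.00},
]

accepted, rejected = validate_accounts(unix_rows, valid_accounts)
```

Here the row with account "9999" ends up in `rejected`, where it could be logged or reported rather than loaded.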

Topics

Generic Data Acquisition
Administration
Extensible Framework
Handling Exceptions
Data Loads
PIPFilter

Administration

All of the tools in the data warehouse toolkit can be fully administered and controlled remotely via a standard web browser, making extracts at multiple locations easy to manage.

Extensible Framework

The data warehouse toolkit is an extensible framework designed to make building, maintaining, and accessing a data warehouse system easier. It contains tools for extracting data from legacy systems and storing that data in tables (encapsulated by database objects) in your data warehouse.

  • Extracted data is filtered through your own business rules, and exceptions are logged and handled as you define.
  • Extracted data can also be translated or manipulated as required during the extract process, for example translating the date format of a legacy system into a date-time field in your database.
  • Data extractions may be scheduled by standard OS scheduler routines (e.g. cron, NT scheduler, ROBOT), launched manually, or driven by events (e.g. load the ledger data whenever a period is closed).
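
The date translation mentioned in the list above might look something like this. It is a minimal sketch, assuming the legacy system stores dates as "YYYYMMDD" strings; the field names `posted` and `posted_at` and the function `translate_record` are hypothetical and not part of the toolkit.

```python
from datetime import datetime

# Assumed legacy layout, e.g. "20010426"; a real system may differ.
LEGACY_DATE_FORMAT = "%Y%m%d"


def translate_record(raw):
    """Return a copy of the record with the legacy date string parsed."""
    translated = dict(raw)
    translated["posted_at"] = datetime.strptime(raw["posted"], LEGACY_DATE_FORMAT)
    del translated["posted"]
    return translated


row = translate_record({"account": "1000", "posted": "20010426"})
```

A malformed date raises `ValueError` from `strptime`, which an extract process could treat as a rule violation rather than loading a bad value.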

Handling Exceptions

Handling exceptions, or violations of business rules, is important, particularly when dealing with data from legacy systems that do not, for example, enforce referential integrity. The extract process can be configured to handle exceptions as warnings, errors, or fatal errors (i.e. errors which prevent the load). Error notifications are handled through the Event Notifications System, which makes sure the appropriate people receive the exception information.

For example, let's say a payroll load is taking place and there is a business rule that no payroll transaction may exist for a position which has no budget. When a record is loaded that violates this rule, it is not very helpful simply to inform the system operator or the IT Operations Department, because they cannot resolve the problem themselves. Instead, an email with details of the problem is sent to the Payroll Manager and his/her assistant, who can then take the proper action: either correct the error or budget for the position. The load either continues or is aborted, depending on the severity of the rule violation.
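
The routing and severity behaviour in the payroll example could be sketched as below. This is an assumption-laden illustration, not the Event Notifications System itself: the `Severity` levels mirror the warning/error/fatal distinction in the text, while the rule name, the addresses, the `RECIPIENTS` table, and the `notify` callback are all hypothetical.

```python
from enum import Enum


class Severity(Enum):
    WARNING = 1
    ERROR = 2
    FATAL = 3


class LoadAborted(Exception):
    """Raised when a fatal rule violation stops the load."""


# Hypothetical routing table: rule name -> people able to resolve a violation.
RECIPIENTS = {
    "payroll_position_has_budget": [
        "payroll.manager@example.com",
        "payroll.assistant@example.com",
    ],
}


def handle_violation(rule, severity, detail, notify):
    """Notify the people registered for the rule; abort on fatal severity."""
    for address in RECIPIENTS.get(rule, []):
        notify(address, f"[{severity.name}] {rule}: {detail}")
    if severity is Severity.FATAL:
        raise LoadAborted(detail)


# Collect messages in a list instead of sending real email.
sent = []
handle_violation(
    "payroll_position_has_budget",
    Severity.ERROR,
    "position 4711 has no budget",
    lambda to, message: sent.append((to, message)),
)
```

With `Severity.ERROR` the load continues after both recipients are notified; passing `Severity.FATAL` would raise `LoadAborted` after the notifications are sent.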

Data Loads

Data loads are queued through a job queue in a database, making it easy to monitor currently queued loads. Queued entries can be handled either single-threaded or multi-threaded: some loads may need to be processed one at a time (e.g. a ledger master file and the ledger transactions related to it), while others (e.g. payroll items and inventory movements) can be loaded at the same time. This is also a good way to balance loads on database servers, as it may be much more efficient to load the same database sequentially rather than in parallel.
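
One way to picture the mix of sequential and parallel handling is the sketch below. It assumes each queued job carries a group label, runs jobs within a group in queue order, and runs different groups concurrently. The in-memory list stands in for the database-backed queue, and `run_queue` and the group names are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def run_queue(jobs, max_workers=4):
    """Run (group, name, func) jobs: in order within a group, groups in parallel."""
    by_group = defaultdict(list)
    for group, name, func in jobs:
        by_group[group].append((name, func))

    def run_group(entries):
        # Jobs that share a group are processed strictly one at a time.
        return [(name, func()) for name, func in entries]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            group: pool.submit(run_group, entries)
            for group, entries in by_group.items()
        }
        return {group: future.result() for group, future in futures.items()}


order = []
jobs = [
    ("ledger", "ledger_master", lambda: order.append("ledger_master")),
    ("ledger", "ledger_transactions", lambda: order.append("ledger_transactions")),
    ("payroll", "payroll_items", lambda: order.append("payroll_items")),
    ("inventory", "inventory_movements", lambda: order.append("inventory_movements")),
]
results = run_queue(jobs)
```

The ledger master always finishes before the ledger transactions start, while the payroll and inventory loads are free to run alongside them.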

Loads can also be chained to trigger post-processing, such as a load to an OLAP server or data mart server after the primary load is complete. The sources of data loads can vary widely: saved report files can be parsed from a print queue, legacy files can be accessed via ODBC, ports can be queried via TCP (telnet, FTP), data can be extracted from a CORBA service, and even email can be received and scanned for particular attachments.
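
The chaining described at the start of the paragraph above can be pictured as a primary load followed by dependent steps. The sketch below is illustrative only; `run_chain`, the step functions, and the row count are hypothetical and do not reflect how the toolkit configures chains.

```python
def run_chain(primary, followups):
    """Run the primary load, then pass its result to each follow-up in order."""
    result = primary()
    return [step(result) for step in followups]


def load_warehouse():
    # Hypothetical primary load reporting how many rows it wrote.
    return {"rows_loaded": 1200}


def refresh_olap_cube(result):
    return f"OLAP cube refreshed with {result['rows_loaded']} rows"


def refresh_data_mart(result):
    return f"data mart refreshed with {result['rows_loaded']} rows"


outputs = run_chain(load_warehouse, [refresh_olap_cube, refresh_data_mart])
```

If the primary load raises an exception, none of the follow-up steps run, which matches the idea that post-processing starts only after the primary load is complete.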

PIPFilter for Pilot™ Internet Publisher

The data warehouse toolkit includes a servlet that allows more advanced HTML formatting of result sets from Pilot Internet Publisher. PIP is the intranet-enabled middleware for the Pilot Decision Support Suite from Pilot Software, and the PIPFilter servlet enhances PIP's formatting capabilities to allow PIP to be used as a reporting tool in addition to its typical OLAP role.

Copyright © 2001 Jcorporate Ltd. All rights reserved.

Last Modified: 26-Apr-01