April 05, 2001, 12:38 PM — The emergence of data integration software is giving corporations the ability to move back-office, enterprise resource planning information to the Internet.
Data integration products provide software "caching," or data staging, between a company's Internet computers and back-office systems from companies such as SAP, Oracle, Sybase and PeopleSoft.
Data integration provides a mirror image of back-office information that is stored on a company's main computers. When an Internet customer needs to check on the status of an order, the inquiry is directed to the data integration software. The company's main computers do not always need to be accessed. Data integration software has enough intelligence to know when to synchronize with the main computers to keep data up to date.
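The order-status scenario above can be illustrated with a minimal read-through cache. This is only a sketch under stated assumptions: the class name `StagedCache`, the `fetch_from_backoffice` callable and the age-based refresh rule are hypothetical, and a real product may decide when to synchronize in quite different ways.

```python
import time
from typing import Callable, Dict, Tuple


class StagedCache:
    """Hypothetical staging cache between Internet requests and a back-office system."""

    def __init__(
        self,
        fetch_from_backoffice: Callable[[str], str],  # assumed stand-in for an ERP lookup
        max_age_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_backoffice
        self._max_age = max_age_seconds
        self._clock = clock
        # order id -> (cached status, time the status was fetched)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, order_id: str) -> str:
        now = self._clock()
        entry = self._entries.get(order_id)
        if entry is not None and now - entry[1] <= self._max_age:
            # Fresh enough: answer from the cache, the main computers are not touched.
            return entry[0]
        # Missing or stale: synchronize with the back-office system and restage.
        status = self._fetch(order_id)
        self._entries[order_id] = (status, now)
        return status
```

With a 60-second maximum age, repeated inquiries about the same order within a minute reach the back-office system only once.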
Integrating ERP data for e-commerce applications is done by combining data staging with direct access to ERP data, using a data server and data caches. Data integration software intelligently blends direct real-time and batch data-access methods for extracting data from an ERP system.
Data progresses from one or more sources to one or more target tables and/or message types (such as XML). Each data movement involves identifying the sources from which data should be extracted, the transformations the data should undergo, and where to send the data. Users specify data mappings and transformations through a graphical user interface.
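As a rough illustration of a source-to-target mapping with transformations, consider the sketch below. The column names (`ORDER_NO`, `AMT`, `STATUS`), the `ORDER_MAPPING` table and both functions are invented for this example; in the products described, such mappings are drawn in a graphical interface rather than written as code.

```python
import xml.etree.ElementTree as ET
from typing import Any, Callable, Dict, Iterable, List, Tuple

Row = Dict[str, Any]
# target column -> (source column, transformation applied to the source value)
Mapping = Dict[str, Tuple[str, Callable[[Any], Any]]]

# Hypothetical mapping from ERP-style source columns to an e-commerce target table.
ORDER_MAPPING: Mapping = {
    "order_id": ("ORDER_NO", str),
    "amount": ("AMT", float),
    "status": ("STATUS", lambda s: s.lower()),
}


def run_movement(source_rows: Iterable[Row], mapping: Mapping) -> List[Row]:
    """Extract each source row, transform its values and build the target rows."""
    return [
        {target: transform(row[source]) for target, (source, transform) in mapping.items()}
        for row in source_rows
    ]


def to_xml_message(row: Row, root_tag: str = "order") -> str:
    """Render one target row as an XML message instead of a table row."""
    root = ET.Element(root_tag)
    for column, value in row.items():
        ET.SubElement(root, column).text = str(value)
    return ET.tostring(root, encoding="unicode")


source = [{"ORDER_NO": 42, "AMT": "19.90", "STATUS": "SHIPPED"}]
targets = run_movement(source, ORDER_MAPPING)
message = to_xml_message(targets[0])
```

The same mapped row can feed either kind of target: `targets` holds rows ready for a table, while `message` holds the XML form.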
User-defined processes control movement of each block of data and define interdependencies between such movements. For example, if one target table depends on values from other target tables, processes are used to specify the order in which a data server should run individual data movements that fill the tables.
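One common way to derive a run order from such interdependencies is a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the table names and the `movement_order` helper are hypothetical, and nothing here is meant to describe how any particular data server schedules its work.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical target tables: each key depends on the tables in its set.
DEPENDENCIES: Dict[str, Set[str]] = {
    "customers": set(),
    "orders": {"customers"},
    "order_summary": {"orders", "customers"},
}


def movement_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which every table is filled after the tables it depends on."""
    # static_order() raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(dependencies).static_order())


order = movement_order(DEPENDENCIES)
```

Here `order_summary` is filled last because it needs values from both other tables.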
Movements can be designed to run in batch or real-time mode, and are created and managed by administrators to control data movement between ERP, e-commerce, customer relationship management, supply-chain management, and legacy and messaging applications.
Data movement uses distributed query optimization, multithreading, in-memory caching, in-memory data transformations and parallel pipelining to deliver high data throughput and scalability. To manage the extraction process and perform batch data extraction from SAP software, for example, optimized code written in ABAP (SAP's proprietary programming language) is used, obviating the need to develop and maintain customized ABAP code.
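Two of the techniques named above, multithreading and in-memory transformation, can be sketched with Python's standard thread pool. Everything here is an assumption for illustration: `extract_block` merely fabricates rows where a real system would query a source, and the doubling in `transform` is an arbitrary placeholder.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, List

Row = Dict[str, int]


def extract_block(block_id: int) -> List[Row]:
    """Hypothetical stand-in for reading one block of rows from a source system."""
    return [{"block": block_id, "value": block_id * 10 + i} for i in range(3)]


def transform(row: Row) -> Row:
    """Placeholder in-memory transformation applied to every extracted row."""
    return {**row, "value": row["value"] * 2}


def process_block(block_id: int) -> List[Row]:
    # Extraction and transformation happen in memory, inside one worker thread.
    return [transform(row) for row in extract_block(block_id)]


def move_blocks(block_ids: Iterable[int], workers: int = 4) -> List[Row]:
    """Process several blocks concurrently and return the rows in block order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the order of block_ids, whichever thread finishes first.
        blocks = pool.map(process_block, block_ids)
        return [row for block in blocks for row in block]


rows = move_blocks([0, 1])
```

Threads help most when extraction waits on a network or database, which is the usual case when reading from a remote source system.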
Data cache is key
Key to the data integration architecture is a data cache that includes a target schema, source-to-target mappings, and transformations that handle change-data capture, hierarchy extraction, error recovery and security. In addition, a data cache contains predefined data extraction jobs that automatically populate the cache with data from a company's back-office systems and/or data warehouse.
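Change-data capture can take several forms. The simplest to show is a comparison of two snapshots keyed by record id, sketched below. The `capture_changes` function and the sample records are hypothetical, and commercial products may instead rely on timestamps, logs or triggers.

```python
from typing import Any, Dict, List

Snapshot = Dict[str, Dict[str, Any]]  # record id -> record contents


def capture_changes(previous: Snapshot, current: Snapshot) -> Dict[str, Any]:
    """Compare two snapshots and report what must change in the cache."""
    inserted = {key: current[key] for key in current.keys() - previous.keys()}
    deleted: List[str] = sorted(previous.keys() - current.keys())
    updated = {
        key: current[key]
        for key in current.keys() & previous.keys()
        if current[key] != previous[key]
    }
    return {"inserted": inserted, "updated": updated, "deleted": deleted}


before = {"A1": {"status": "open"}, "A2": {"status": "open"}}
after = {"A1": {"status": "shipped"}, "A3": {"status": "open"}}
changes = capture_changes(before, after)
```

Only the records listed in `changes` need to be applied to the cache, so unchanged data is not moved again.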