The emergence of data integration software is giving corporations the ability to move back-office, enterprise resource planning information to the Internet.
Data integration products provide software "caching," or data staging, between a company's Internet computers and back-office systems from companies such as SAP, Oracle, Sybase and PeopleSoft.
Data integration provides a mirror image of back-office information that is stored on a company's main computers. When an Internet customer needs to check on the status of an order, the inquiry is directed to the data integration software. The company's main computers do not always need to be accessed. Data integration software has enough intelligence to know when to synchronize with the main computers to keep data up to date.
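The order-status scenario above can be sketched as a small staged cache. This is only an illustration, not any vendor's implementation: the `StagedCache` class, the `fetch_from_backoffice` callback and the `max_age_seconds` freshness rule are all hypothetical names, and a simple age limit stands in for whatever logic a real product uses to decide when to synchronize.

```python
import time
from typing import Callable, Dict, Tuple


class StagedCache:
    """Hypothetical mirror of back-office records that refreshes only when stale."""

    def __init__(self,
                 fetch_from_backoffice: Callable[[str], str],
                 max_age_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_from_backoffice   # assumed call into the main computers
        self._max_age = max_age_seconds       # assumed freshness rule
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # order id -> (status, fetched at)
        self.backoffice_calls = 0             # counts trips to the back office

    def order_status(self, order_id: str) -> str:
        now = self._clock()
        entry = self._entries.get(order_id)
        # Serve from the staged copy while it is still fresh enough.
        if entry is not None and now - entry[1] <= self._max_age:
            return entry[0]
        # Otherwise synchronize this record with the back-office system.
        self.backoffice_calls += 1
        status = self._fetch(order_id)
        self._entries[order_id] = (status, now)
        return status
```

Repeated inquiries about the same order within the freshness window are answered from the staged copy, so the main computers are contacted only when the copy has aged out.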
Integrating ERP data for e-commerce applications is done by combining data staging with direct access to ERP data, using a data server and data caches. Data integration software intelligently blends direct real-time and batch data-access methods for extracting data from an ERP system.
Data progresses from one or more sources to one or more target tables, and/or message types (such as XML). The steps of the data movement involve identifying the sources from which data should be extracted, transformations the data should undergo, and where to send the data. Users specify data mappings and transformations through a graphical user interface.
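The three steps described above (pick the source, transform the data, send it to a target) can be sketched in a few lines. The products in question expose this through a graphical interface, so the code below is only a stand-in for what such a mapping expresses; the column names (`ORDNUM`, `STAT`, `AMT_CENTS`), the `ORDER_MAPPING` table and the `run_movement` function are invented for illustration.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

# target column -> (source column, transformation applied to the source value)
Mapping = Dict[str, Tuple[str, Callable[[Any], Any]]]

# Hypothetical status codes used by the source system.
STATUS_NAMES = {"S": "shipped", "O": "open"}

# Hypothetical source-to-target mapping for an order table.
ORDER_MAPPING: Mapping = {
    "order_id": ("ORDNUM", str),
    "status": ("STAT", lambda code: STATUS_NAMES.get(code, "unknown")),
    "total_usd": ("AMT_CENTS", lambda cents: cents / 100),
}


def run_movement(source_rows: Iterable[Dict[str, Any]],
                 mapping: Mapping) -> List[Dict[str, Any]]:
    """Extract each source row, apply the mapped transformations, return target rows."""
    target_rows = []
    for row in source_rows:
        target_rows.append({
            target: transform(row[source])
            for target, (source, transform) in mapping.items()
        })
    return target_rows
```

Keeping the mapping as data rather than code is the design point: the same `run_movement` routine can serve any table whose mapping has been specified.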
User-defined processes control movement of each block of data and define interdependencies between such movements. For example, if one target table depends on values from other target tables, processes are used to specify the order in which a data server should run individual data movements that fill the tables.
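One way to picture the ordering problem is as a dependency graph that is sorted before the data server runs anything. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the table names and the `movement_order` function are hypothetical, and the article does not say how any particular product computes the order.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def movement_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return target tables in an order where every table follows the tables it depends on.

    `dependencies` maps each target table to the set of tables whose values it needs.
    A circular dependency raises graphlib.CycleError.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical example: the summary table needs both other tables filled first.
EXAMPLE_DEPENDENCIES = {
    "order_summary": {"orders", "customers"},
    "orders": {"customers"},
    "customers": set(),
}
```

Calling `movement_order(EXAMPLE_DEPENDENCIES)` places `customers` before `orders`, and both before `order_summary`.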
Movements can be designed to run in batch or real-time mode, and are created and managed by administrators to control data movement between ERP, e-commerce, customer relationship management, supply-chain management, and legacy and messaging applications.
Data movement uses distributed query optimization, multithreading, in-memory caching, in-memory data transformations, and parallel pipelining to deliver high data throughput and scalability. To manage the extraction process and perform batch data extraction from SAP software, for example, the software uses optimized code written in ABAP, SAP's proprietary programming language, which removes the need for users to develop and maintain customized ABAP code.
Data cache is key
Key to the data integration architecture is a data cache that includes a target schema, source-to-target mappings, and transformations that handle change-data capture, hierarchy extraction, error recovery and security. In addition, a data cache contains predefined data extraction jobs that automatically populate the cache with data from a company's back-office systems and/or data warehouse.
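Change-data capture, mentioned above, means copying only the rows that changed since the last extraction rather than reloading everything. A minimal sketch follows, assuming each source row carries an `id` and a `changed_at` timestamp; both field names and the `apply_changes` function are hypothetical, and real products may detect changes by other means, such as database logs.

```python
from typing import Any, Dict, Iterable


def apply_changes(cache: Dict[str, Dict[str, Any]],
                  source_rows: Iterable[Dict[str, Any]],
                  last_sync: int) -> int:
    """Copy rows changed after `last_sync` into the cache.

    Returns the newest `changed_at` value seen, to be passed as `last_sync`
    on the next run. Rows at or before `last_sync` are skipped.
    """
    high_water = last_sync
    for row in source_rows:
        if row["changed_at"] > last_sync:
            cache[row["id"]] = row                       # insert or overwrite the staged copy
            high_water = max(high_water, row["changed_at"])
    return high_water
```

A scheduled extraction job would call this repeatedly, feeding each returned value back in so that every run touches only new changes.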
A cache serves as a single point of integration for enterprise and e-commerce data, minimizing the need for direct access to back-office systems and for complex real-time integration. The cache offloads numerous unnecessary requests for data from back-office systems, letting e-commerce firms scale to a greater number of users while back-office systems do what they are designed to do.
Data integration software works in conjunction with products from enterprise application integration vendors and process integrators, not in lieu of them. Indeed, as data integration software becomes more pervasive as a tool used for business-to-business integration, it will dramatically reshape the way business-to-business integrators work together, as well as how businesses move to the Internet.
This story, "Data integration software speeds Web," was originally published by Network World.