Understanding Information Transformation
Anticipate the extraordinary

The transformation layer is the "Rosetta stone" of the system. It understands the format of all information being transmitted among the applications and translates that information on the fly, restructuring data from one message so that it makes sense to the receiving application or applications. It provides a common dictionary that contains information on how each application communicates outside itself (application externalization), as well as which bits of information have meaning to which applications.

Transformation layers, such as those that process XML-based messages (e.g., XSLT), generally contain parsing and pattern-matching methods that describe the structure of any message format. Message formats are then constructed from pieces that represent each field encapsulated within a message. Once the message has been broken down into its component parts, the fields may be recombined to create a new message.
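The break-down-and-recombine step described above can be sketched in a few lines. This is only an illustration, not how any particular integration server does it: the field names, the `|` delimiter, and the fixed-width layout are all assumptions made up for the example.

```python
def parse_delimited(message, field_names, delimiter="|"):
    """Break a delimited message into its named component fields."""
    values = message.split(delimiter)
    if len(values) != len(field_names):
        raise ValueError(f"expected {len(field_names)} fields, got {len(values)}")
    return dict(zip(field_names, values))


def build_fixed_width(fields, layout):
    """Recombine fields into a fixed-width message.

    `layout` is a list of (field name, width) pairs; each value is
    left-justified and padded or truncated to its width.
    """
    return "".join(fields[name].ljust(width)[:width] for name, width in layout)


# Hypothetical source message and target layout.
source = "1042|Smith|NY"
fields = parse_delimited(source, ["customer_id", "last_name", "state"])
target = build_fixed_width(
    fields, [("last_name", 10), ("state", 2), ("customer_id", 6)]
)
```

The target message reorders and pads the same three fields, which is the essence of restructuring one message so that it makes sense to a different receiver.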

Most integration servers can handle most types of information, including fixed-length, delimited, and variable-length formats. Information is reformatted using an interface that the integration server provides to the user, which may be as primitive as an API or as easy to use as a GUI.

There are a few aspects to the notion of transformation.

Differences in Application Semantics
Accounting for the differences in application semantics means changing a message by remapping its structure and data types so that it is acceptable to the target system. Although it is not difficult, application integration architects need to understand that this process must occur dynamically within the integration server.

This process can be defined within the rules-processing layer of the integration server by creating a rule to translate data dynamically, depending on its content and schema. Moving information from one system to another demands that the schema or format of the message be altered in transit.

Although most integration servers can map any schema to any other schema, it is prudent to try to anticipate extraordinary circumstances. For example, when converting information extracted from an object-oriented database and placing it in a relational database, the integration server must convert the object schema into a relational representation before it can convert the data within the message. The same holds true when moving information from a relational database to an object-oriented database. Most integration servers break each incoming message down into a common format and then translate it into the appropriate message format for the target system.
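As a rough sketch of the object-to-relational case, the function below flattens a nested order object into rows for two relational tables. The object shape, the table columns, and the name `object_to_rows` are hypothetical and chosen only for illustration.

```python
def object_to_rows(order):
    """Flatten a nested order object into relational rows.

    Returns one header row and a list of line-item rows; each line item
    carries the order id as a foreign key back to the header.
    """
    header = {"order_id": order["id"], "customer": order["customer"]["name"]}
    items = [
        {"order_id": order["id"], "line_no": n, "sku": item["sku"], "qty": item["qty"]}
        for n, item in enumerate(order["items"], start=1)
    ]
    return header, items


# Hypothetical object as it might come out of an object-oriented store.
order = {
    "id": 7,
    "customer": {"name": "Acme"},
    "items": [{"sku": "A-1", "qty": 2}, {"sku": "B-9", "qty": 1}],
}
header_row, item_rows = object_to_rows(order)
```

The schema conversion (one nested object becomes a parent row plus child rows) happens before any individual value is converted, which mirrors the order of operations described above.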

Differences in Content
Related to the concept of accounting for the differences in application semantics, accounting for content changes is another important aspect of transformation. In short, it's the reformatting of information so that it appears native when sent to a target system. The information needs to appear native so that no changes have to be made to the source or target systems.

Although many formats exist within most application integration problem domains, we will confine our attention, for the purposes of this manifesto, to the following:

  • Alphanumeric
  • Binary integers
  • Floating point values
  • Bit fields
  • IBM mainframe floating points
  • COBOL and PL/I picture data
  • BLOBs
In addition to these formats, there are a number of formatting issues to address, including the ability to convert logical operators (bits) between systems and the ability to handle data types that are not supported in the target system. These issues often require significant customization in order to facilitate successful communication between systems.
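To make one of the listed formats concrete, here is a minimal sketch that decodes a 32-bit IBM mainframe (hexadecimal) floating-point value into a native Python float. It assumes the single-precision layout: one sign bit, a 7-bit base-16 exponent with a bias of 64, and a 24-bit fraction, stored big-endian. The function name is made up for the example.

```python
def ibm32_to_float(raw):
    """Decode a 4-byte big-endian IBM hexadecimal float."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    word = int.from_bytes(raw, "big")
    sign = -1.0 if word >> 31 else 1.0
    exponent = (word >> 24) & 0x7F   # base-16 exponent, biased by 64
    fraction = word & 0x00FFFFFF     # 24-bit fraction, read as 0.ffffff in hex
    return sign * (fraction / 2**24) * 16.0 ** (exponent - 64)


value = ibm32_to_float(bytes.fromhex("C276A000"))  # -118.625
```

Because the exponent is a power of 16 rather than 2, the bit pattern cannot simply be reinterpreted as an IEEE 754 float, which is why this kind of conversion usually has to be coded explicitly.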

In data conversion, values are managed in two ways: carrying over the value from the source to the target system without change, or modifying the data value dynamically. Either an algorithm or a look-up table can be used to modify the data value. An algorithm may take one or more of the source application's attributes as input to change the data or to create new data.

Algorithms of this type are nothing more than the type of data conversions we have done for years when populating data warehouses and data marts. Now, in addition to using these simple algorithms, it is possible to aggregate, combine, and summarize the data to meet the specific requirements of the target application.

When using the look-up table scenario, it might be necessary to convert to an arbitrary value. "ARA" in the source system might refer to a value in the accounts receivable system. However this value may be determined, it must be checked against the look-up table. For instance, an integration server may use a currency conversion table to convert dollars to yen; the table may be embedded in a simple procedure or, more likely, stored in a database connected to the integration server. The integration server may also invoke a remote application server function to convert the amount.
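A look-up conversion of this kind might look like the sketch below. Everything specific in it is an assumption: the target code that "ARA" maps to, the exchange rate, and the function names are invented for illustration, and a real deployment would more likely read both tables from a database.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical look-up table: source-system code -> target-system code.
ACCOUNT_CODES = {"ARA": "ACCT-RECV-A"}

# Made-up exchange rate (yen per dollar), for illustration only.
USD_TO_JPY = Decimal("150")


def translate_code(code):
    """Replace a source code with its target value, rejecting unknown codes."""
    try:
        return ACCOUNT_CODES[code]
    except KeyError:
        raise ValueError(f"no look-up entry for code {code!r}") from None


def usd_to_jpy(amount):
    """Convert a dollar amount to whole yen using the table rate."""
    return (amount * USD_TO_JPY).quantize(Decimal("1"), rounding=ROUND_HALF_UP)


target_code = translate_code("ARA")
yen = usd_to_jpy(Decimal("12.34"))
```

Raising an error on an unknown code, rather than passing the value through, is one possible design choice; it keeps unmapped values from reaching the target system silently.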

The application integration architect or developer may encounter special circumstances that have to be finessed. The length of a message attribute may be unknown, or the value may be in an unknown order. In such situations, it is necessary to use the rules-processing capability of the integration server to convert the problem values into the proper representation for the target system.

Abstract Data Types
Transformation mechanisms also need to support abstract data types (ADTs), allowing different representations of data and behavior to meet the requirements of the application integration scenario.

ADTs provide a clear separation between the interface and the implementation of a data type. The implementation covers the representation of the data (that is, the choice of data structure) and the operations on the data.

The interface to an abstract data type is defined by its associated operations. What's more, the data structures that store the representation of an abstract data type are invisible to the integration view. The ADT also includes any operations, or algorithms, contained within the ADT.

The internal representation and execution of these operations are changeable at any time and won't affect the interface to the ADTs. Thus, a completely different representation is possible for sets storing information in the ADT.

Having said all that, ADTs consist of:

  • An interface, or a set of operations that can be performed
  • The allowable behaviors, or the way we expect instances of the ADT to respond to operations.
The implementation of an ADT consists of:
  • An internal representation of data stored inside the source or target system's variables
  • A set of methods implementing the interface
  • A set of representation invariants, true initially and preserved by all methods
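The split between interface and implementation can be illustrated with a small sketch. The `KeySet` ADT below, its two implementations, and the helper `count_distinct` are hypothetical names invented for this example. Both implementations maintain the same representation invariant (no key is stored twice) while using different internal data structures.

```python
from abc import ABC, abstractmethod


class KeySet(ABC):
    """Interface: the operations a caller may perform on a set of keys."""

    @abstractmethod
    def add(self, key): ...

    @abstractmethod
    def contains(self, key): ...

    @abstractmethod
    def size(self): ...


class ListKeySet(KeySet):
    """Implementation backed by a list. Invariant: no duplicate keys."""

    def __init__(self):
        self._keys = []

    def add(self, key):
        if key not in self._keys:  # preserves the invariant
            self._keys.append(key)

    def contains(self, key):
        return key in self._keys

    def size(self):
        return len(self._keys)


class HashKeySet(KeySet):
    """Implementation backed by a built-in set, which enforces the invariant."""

    def __init__(self):
        self._keys = set()

    def add(self, key):
        self._keys.add(key)

    def contains(self, key):
        return key in self._keys

    def size(self):
        return len(self._keys)


def count_distinct(keys, store):
    """Caller code that depends only on the KeySet interface."""
    for key in keys:
        store.add(key)
    return store.size()
```

`count_distinct` behaves the same with either implementation, which is the point: the representation can change without affecting code written against the interface.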
Information Routing
In addition to transformation, information routing is another core feature that provides a mechanism to move information from system to system. We have a few scenarios that apply, including:
  • One to one
  • Many to many
  • Many to one
It's important that your integration technology can route information from many systems to many systems, as well as split information coming from one system to be sent to multiple targets, and combine information coming from many systems for a single target. While this sounds simple, the application of the mechanism is far from simple. We must introduce the notion of behavior to operate on this information.
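A static routing table is enough to sketch the splitting and combining cases. The system names and the `ROUTES` table below are assumptions made up for the example.

```python
# Hypothetical routing table: source system -> list of target systems.
ROUTES = {
    "orders": ["billing", "warehouse"],  # split: one source, several targets
    "web_store": ["crm"],                # combine: two sources feed one target
    "call_center": ["crm"],
}


def route(source, message):
    """Return one (target, message) delivery per target configured for the source."""
    return [(target, message) for target in ROUTES.get(source, [])]


deliveries = route("orders", {"order_id": 7})
```

A table like this only says where messages go. It has no way to decide based on what a message contains, which is where behavior has to come in.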

Intelligent Routing
Intelligent routing, sometimes referred to as flow control or content-based routing, builds on the capabilities of both the rules layer and the semantic transformation layer. An integration server can "intelligently route" a message by first identifying it as coming from the source application and then routing it to the proper target application, translating it if required.

For example, when a message arrives at the integration server, it is analyzed and identified as coming from a particular system and/or subsystem. Once the message is identified and the message schema is understood, the applicable rules and services are applied to the processing of the message, including transformation. Once the information is processed, the integration server, based on how it is programmed, routes the message to the correct target system. This all takes place virtually instantaneously, with as many as a thousand of these operations occurring at the same time.
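The identify, transform, and route sequence can be sketched as an ordered list of rules. The rule conditions, the transformation, the target names, and the `dead_letter` fallback are all hypothetical; a real integration server would define these through its own rules layer.

```python
def normalize_country(message):
    """Example transformation: return a copy with an upper-cased country code."""
    return {**message, "country": message["country"].upper()}


# Each rule is (condition, transformation, target system); the first match wins.
RULES = [
    (lambda m: m["source"] == "web_store" and m["amount"] >= 1000,
     normalize_country, "fraud_review"),
    (lambda m: m["source"] == "web_store",
     normalize_country, "order_entry"),
]


def route_message(message):
    """Identify the message, apply the matching rule, and pick the target."""
    for matches, transform, target in RULES:
        if matches(message):
            return target, transform(message)
    return "dead_letter", message  # no rule matched; park the message unchanged


target, routed = route_message(
    {"source": "web_store", "amount": 2500, "country": "us"}
)
```

Here the same source system can end up at two different targets depending on the amount in the message, which is what distinguishes content-based routing from a fixed source-to-target mapping.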

Filters
In addition to intelligent routing, it's important to provide the notion of filtering, as well. In the world of application integration, filters are software subsystems that are able to analyze content and selectively leave out specific information based on content or, perhaps, source or target information.

Filters are important to application integration due to the complexity of information coming from source systems and the need to simplify that information before it's processed in the integration server or sent to the target system. The notion of filtering also relates to transaction controls.
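Two simple kinds of filter are sketched below: one that leaves out specific fields based on content, and one that drops whole messages based on their source. The field names, source names, and function names are assumptions for illustration.

```python
# Hypothetical fields that should never reach the target system.
BLOCKED_FIELDS = frozenset({"ssn", "credit_card"})


def strip_fields(message, blocked=BLOCKED_FIELDS):
    """Return a copy of the message without the blocked fields."""
    return {key: value for key, value in message.items() if key not in blocked}


def filter_by_source(messages, allowed_sources):
    """Keep only messages whose source is allowed, with blocked fields removed."""
    return [
        strip_fields(message)
        for message in messages
        if message.get("source") in allowed_sources
    ]


incoming = [
    {"source": "hr", "name": "Lee", "ssn": "000-00-0000"},
    {"source": "test_harness", "name": "dummy"},
]
outgoing = filter_by_source(incoming, {"hr"})
```

Applying the filter before further processing keeps the integration server and the target system from having to handle information they do not need.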

About David Linthicum
Dave is an internationally known cloud computing and SOA expert. He is a sought-after consultant, speaker, and blogger. In his career, Dave has formed or enhanced many of the ideas behind modern distributed computing including EAI, B2B Application Integration, and SOA, approaches and technologies in wide use today. In addition, Dave is the Editor-in-Chief of SYS-CON's Virtualization Journal. For the last 10 years, he has focused on the technology and strategies around cloud computing, including working with several cloud computing startups. His industry experience includes tenure as CTO and CEO of several successful software and cloud computing companies, and upper-level management positions in Fortune 500 companies. In addition, he was an associate professor of computer science for eight years, and continues to lecture at major technical colleges and universities, including University of Virginia and Arizona State University. He keynotes at many leading technology conferences, and has several well-read columns and blogs. Linthicum has authored 10 books, including the ground-breaking "Enterprise Application Integration" and "B2B Application Integration." You can reach him at david@bluemountainlabs.com. Or follow him on Twitter. Or view his profile on LinkedIn.
