Moving toward flexible healthcare data sharing

Dec 27, 2006

Healthcare organizations typically use a multitude of separately developed information technology (IT) applications. The exchange of healthcare-related data between organizations and systems is required for both administrative and clinical reasons. Most health information systems today are proprietary and often serve only one specific department within a healthcare organization, resulting in difficult interoperability problems.

To complicate matters, a patient's health information may be spread out over a number of different systems or organizations, which do not interoperate. This makes it very difficult for clinicians to capture a complete clinical history of a patient.

The healthcare industry is increasingly looking to extensible markup language (XML) as a solution to the data integration problem. The assumption is that Web-enabling applications by converting them to HTTP and XML will allow communication using standard protocols for cross-platform data exchange. But this is only half the solution, because differing metadata architectures still make it hard to exchange data easily. Why? Because of the naming problem.

Is XML the answer?

XML is a widely deployed standard that has revolutionized the ability of one machine to talk directly with another. It gives applications a common format for exchanging data, and it guarantees that the documents transmitted are formed according to the rules of the standard.

XML uses tags to describe and format the raw data. An XML tag is wrapped around the raw data itself so that the application can interpret the raw data. These metadata tags allow the application to put the right data in the right location on the screen so that we understand it correctly.

In the example above the raw data is the same but the metadata tags used to define the data are different. This results in a naming problem where the metadata tag names are nonaligned. An application in Hospital B would not understand the tags provided by Hospital A, and vice versa.

The naming problem exists because different systems use different metadata naming conventions. The tags themselves are given different names even though the encapsulated data is the same. They essentially are speaking different languages.
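To make the naming problem concrete, here is a minimal sketch in Python. The two records, their tag names, and the patient values are all invented for illustration and are not taken from any real hospital system or standard. Both records carry the same raw data, yet they share no tag names, so an application written for one schema finds nothing it recognizes in the other.

```python
import xml.etree.ElementTree as ET

# Hypothetical records: every tag name and value here is invented.
HOSPITAL_A = "<patient><surname>Smith</surname><dob>1960-04-12</dob></patient>"
HOSPITAL_B = "<person><lastName>Smith</lastName><birthDate>1960-04-12</birthDate></person>"


def read_fields(xml_text):
    """Return {tag: text} for each child element of the record's root."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


record_a = read_fields(HOSPITAL_A)
record_b = read_fields(HOSPITAL_B)

# The raw data is identical in both records...
same_values = sorted(record_a.values()) == sorted(record_b.values())

# ...but the two schemas have no tag names in common.
shared_tags = set(record_a) & set(record_b)

# Hospital B's application asks for its own tag and finds nothing in A's record.
lookup = record_a.get("lastName")

print(same_values, shared_tags, lookup)
```

Both documents are perfectly well-formed XML, so the failure is not one of syntax. It arises only because the two schemas name the same thing differently.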

If I speak English and you speak Cantonese, we may be saying the same thing but cannot understand one another. This is the problem the applications are having. The challenge of the naming problem is semantics. Semantics refers to the meaning of the data, in contrast to syntax, which defines only the structure and format of the markup tags.

We need to understand the meaning of the raw data even when the metadata tags are different. This requirement applies every time systems exchange data. The systems either need to speak the same language or they need a translator.

The current challenge

To speak the same language, an organization needs to adopt a standard language or vocabulary. There are many such standards across a wide range of industries, including healthcare. Industry standards such as PML, GPC, IFX, ACORN, ARTS, TINA, EMBARC, PIDX, and many others are common.

In healthcare there are numerous industry standards such as HL7, DICOM, CDA, CEN's EHRcom, and openEHR. For standards to be truly useful, all the applications across every institution need to talk to one another using the same standard language.

If a nationwide system -- let alone a global system -- of interoperable medical records is to be realized, it will mean getting every hospital, every nursing home, every pharmacy, and every one of the hundreds of thousands of physicians who belong to independent or small group practices to adopt a single standard. Yet, the healthcare industry has many standards, and it is not realistic to expect one national, let alone global, standard to dominate the market.

Why is this doomed to failure? Because no one data standard can meet every requirement. Different organizations and institutions will have unique needs that no single data structure can possibly capture. Such standards require a very precise match between source and destination tags.

In this instance, you could not make even the smallest change to a standard vocabulary to meet a particular or unique requirement that your organization, department, systems, or processes may have. In essence, regional dialects cannot exist within a standardized language. This one-size-fits-all ultimatum has not been successfully adopted across other industries and, if experience is any teacher, is not likely to be adopted globally in healthcare.

Most organizations either "extend" these standards or implement proprietary metadata schemas. A proprietary schema allows an organization to tailor its XML vocabulary to capture the specific information that is unique to its systems or processes. When each organization or department speaks its own dialect or its own language, we require a translator.

A translator in XML speak is called a stylesheet -- specialized software code written to translate each XML tag in your schema into the corresponding XML tag of another system. It is custom code that must be written for each translation. Data integration using this type of translation (called a transformation in XML speak) requires close and hard-won cooperation between partners, who must agree on the precise meaning of each markup tag before their applications can be integrated. Any mistake in the translation could have costly or dangerous consequences.
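The cost of this approach can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a real stylesheet: a plain Python function stands in for the transformation, and the tag names and the A_TO_B table are hypothetical. The table has to be agreed and written by hand for one direction of one partnership, and any tag nobody agreed on stops the translation.

```python
import xml.etree.ElementTree as ET

# Hypothetical hand-written table for ONE direction of ONE partnership.
A_TO_B = {"patient": "person", "surname": "lastName", "dob": "birthDate"}


def translate(xml_text, tag_map):
    """Rename every element according to tag_map; fail loudly on unmapped tags."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if element.tag not in tag_map:
            raise KeyError(f"no translation agreed for tag <{element.tag}>")
        element.tag = tag_map[element.tag]
    return ET.tostring(root, encoding="unicode")


def pairwise_translators(n_systems):
    """One-way translators needed if every system must talk to every other."""
    return n_systems * (n_systems - 1)


translated = translate(
    "<patient><surname>Smith</surname><dob>1960-04-12</dob></patient>", A_TO_B
)
print(translated)
print(pairwise_translators(10))
```

If every system had to exchange data directly with every other, the number of hand-written one-way tables would grow roughly with the square of the number of systems, which is why this style of integration scales poorly.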

Semantic Web

Quite a few emerging solutions are meeting these challenges. The current trend in dealing with interoperability is to manage it not on a syntactic level, but on a semantic one. The idea, known as the "Semantic Web," is to capture the shared meaning expressed by the data between common processes. The Semantic Web includes standards such as RDF and OWL, as well as a new language called the Metadata Semantic Language (MSL).

The MSL is structured as a separate layer on top of XML -- it is an integration layer, or a metadata layer. Metadata tags and naming conventions are enhanced with a separate layer of semantic information that encodes the definition of each element of data, including its relationship with other elements.

One of the big benefits of a semantic layer is its compatibility with legacy systems and schemas. In an era of underequipped emergency rooms and nursing shortages, convincing a cash-strapped hospital to invest millions in new software or consulting fees to rewrite its entire data structure would be nearly impossible, and the effort would take decades to complete.

With MSL you do not need to replace or upgrade existing systems or rewrite any of your existing metadata tags. The MSL does not replace XML. Your applications still use localized or customized XML markup languages. The MSL is used when applications in separate data domains using nonaligned schemas need to communicate with one another.

In the example above the metadata tags from Hospital A map to a common semantic layer as do the metadata tags from Hospital B. This simple semantic layer allows the metadata tags with the same semantic meaning to be aligned. Data integration can now be fully automated by aligning the corresponding semantic elements. Transformation of the XML data tags is simply a matter of aligning the semantic models. This process can be fully automated between disparate and heterogeneous applications.
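The alignment step can be sketched in Python as follows. This is a hypothetical illustration and not MSL syntax: plain dictionaries stand in for the semantic layer, and the tag names and concept names (Patient, FamilyName, DateOfBirth) are invented. Each hospital maps only its own tags to the shared concepts, and the tag-to-tag translation table is then derived automatically rather than written by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical mappings: each hospital describes only its OWN tags
# in terms of shared concepts. The concept names are invented.
A_TO_CONCEPT = {"patient": "Patient", "surname": "FamilyName", "dob": "DateOfBirth"}
B_TO_CONCEPT = {"person": "Patient", "lastName": "FamilyName", "birthDate": "DateOfBirth"}


def align(source_map, target_map):
    """Derive a source-tag -> target-tag table by matching shared concepts."""
    concept_to_target = {concept: tag for tag, concept in target_map.items()}
    return {
        tag: concept_to_target[concept]
        for tag, concept in source_map.items()
        if concept in concept_to_target
    }


def translate(xml_text, tag_map):
    """Rename every element in the record according to tag_map."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = tag_map[element.tag]
    return ET.tostring(root, encoding="unicode")


# Neither table below was written by hand; both follow from the concept mappings.
a_to_b = align(A_TO_CONCEPT, B_TO_CONCEPT)
b_to_a = align(B_TO_CONCEPT, A_TO_CONCEPT)

record_a = "<patient><surname>Smith</surname><dob>1960-04-12</dob></patient>"
record_for_b = translate(record_a, a_to_b)
print(record_for_b)
```

In this sketch a third hospital would add one mapping of its own tags to the shared concepts, and tables to and from every existing participant could be derived from it without any further pairwise agreement.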

Enabling this shift from purely syntactic to semantic interoperability are ontologies; they are semantic models of the data and they interweave human understanding of symbols with their machine processability. The MSL maps your proprietary metadata syntax into a shared ontology. This semantic language or data model expresses the underlying meaning of the XML tag in a commonly understood and shared vocabulary.

The MSL is like a universal language that we all understand. It is structured as a set of formal ontologies that are clearly understood within an industry, yet provide a high degree of flexibility within the formal structure. This allows all healthcare organizations to map low-level syntax, whether HL7 v3, EHR, or a proprietary XML schema, into the semantic model.

The MSL provides a simple and flexible framework to make this exchange of semantic meaning possible and to automate the mapping of data tags between autonomous systems. When the data has been mapped into the semantic model, you can make this model available to partners. This model allows you to exchange data with any other system or institution without the painful process of traditional integration.

By Steve Perry
AuntMinnie.com contributing writer
December 28, 2006

Perry is chief technical officer (CTO) and founder of Jumper Networks, a Plymouth, MA-based data-interoperability development and consulting firm. He can be contacted at 508-224-3292 or via the company's Web site.
