data mapping


When exchanging data, one has to map data elements from a source database to the elements of a table in a destination database. Sounds simple, doesn't it? Just get the data file, map it to where it is going, and send it over. Translate "male" to "M" and "common cold" to "Rhinitis". In reality, the process is not this simple. A problem inherent in data mapping is that databases from different organizations commonly vary in the granularity of their concepts. If two terminologies have different granularities, they have different levels of detail for similar concepts, so the concepts will not map one to one. For instance, let’s say that you store your basketballs, baseballs, golf balls, and marbles in separate boxes. I have that stuff, too, but I store my softballs and hardballs separately. The rest of my boxes are the same, so I have one more box than you. When I tell you that I have 30 hardballs and 20 softballs, you have to know to combine them in your system into 50 baseballs. Fine. But when the information is sent back to me, my distinction has been lost. "50 of what?" I say. "Is that hardballs or softballs?" You don't know. The data cannot be reconstituted.
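The hardball and softball round trip can be sketched in a few lines of Python. The category names and the MY_TO_YOUR table are hypothetical, made up only to mirror the example above; the point is that a many-to-one mapping has no unique inverse.

```python
# Hypothetical mapping from my boxes to your boxes (illustration only).
MY_TO_YOUR = {
    "hardball": "baseball",
    "softball": "baseball",
    "basketball": "basketball",
    "golf ball": "golf ball",
    "marble": "marble",
}

def send(counts, mapping):
    """Translate item counts into the receiver's categories, summing collisions."""
    out = {}
    for item, n in counts.items():
        target = mapping[item]
        out[target] = out.get(target, 0) + n
    return out

mine = {"hardball": 30, "softball": 20}
yours = send(mine, MY_TO_YOUR)
print(yours)  # {'baseball': 50}

# Going back: list every one of my categories that could have produced each of yours.
candidates = {}
for src, dst in MY_TO_YOUR.items():
    candidates.setdefault(dst, []).append(src)

# Two possible sources for "baseball", so the 50 cannot be split back into 30 and 20.
print(candidates["baseball"])  # ['hardball', 'softball']
```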

Your mission determines what you want to store in your databases. If I don't have any volleyballs, and you tell me you have two volleyballs, I don't know what to do with that information. What's more, I simply don't care. I store it as "Balls - Other". OK, fine. Next, I hand you data about badminton birdies. "It's not a ball at all," you say. You don't play badminton, so you just call it "other sports equipment". Once again, when we need to send the data back to its source, the volleyball and birdie data have been lost. My birdies are now under "Other" when I get them back from you. You don't even get any volleyball information back. There is a loss of meaning when going to the less specific or nonmatching concept and back again. So, it is not easy to exchange data meaningfully. I have syntactic understanding (it's in the fifth byte and it is 20 characters long) but not exact semantic understanding. I have a bunch of "Other balls", zero softballs, and zero hardballs, which is not too helpful if your life depends on it. What does "Other" mean now?
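A minimal sketch of the catch-all behavior described above, assuming a hypothetical receive function and a made-up set of known categories. Anything the receiver does not recognize is filed under a fallback label, and the original name is not kept anywhere.

```python
def receive(items, known, fallback):
    """Store each incoming item under its own name if known, else under the fallback."""
    stored = {}
    for name, count in items.items():
        key = name if name in known else fallback
        stored[key] = stored.get(key, 0) + count
    return stored

# Hypothetical list of the categories my database understands.
my_known = {"basketball", "hardball", "softball", "golf ball", "marble"}

stored = receive({"volleyball": 2, "basketball": 5}, my_known, "Balls - Other")
print(stored)  # {'Balls - Other': 2, 'basketball': 5}

# The word "volleyball" no longer exists on my side, so I cannot send it back.
print("volleyball" in stored)  # False
```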

If we transmit medical data often enough, will we have everyone diagnosed as "Other" taking that "other" prescription? No, it won't be that bad, but there will be problems. If the Federal government forces us to use values that are meaningful only for certain organizations, standards will be detrimental due to differing granularities. An in-depth understanding of the mission of every entity is required to establish valid, detailed concepts. The FBI's concept of the values for the field "sex" is very different from that of the Veterans Health Administration (VHA). The FBI has a purpose for the data, which is to identify people. Hence, for the FBI, the data standard needs a sexual appearance concept. The FBI has 23 values for the data field they call sex. All the values are based on what an individual looks like.

The VHA’s purpose for the data field they call sex is to treat people medically and to be able to assign beds to patients. Their data standard therefore needs an administrative sex concept. When one works at the data concept level, the data field contents make sense in the context of their purpose. Then the FBI can have their "female appearing as male" value and the VHA can have their "administrative male gender" value. That's why both the VHA and the FBI can benefit from participating in ongoing standards development work. When IT people understand how an organization carries out its mission, they can define the concepts to a very fine level. Once we reach this very fine level of definition, we can send messages without loss of meaning. We now know that in the data standard we are developing, we need both a sexual appearance field and a medical gender field. Then the data fields will contain clear meanings when they are received by yet another Federal organization like FEMA.
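One way to picture the two-field outcome is a record with a separate slot for each concept. This is only a sketch: the PersonRecord class and its field names are hypothetical and do not come from any published standard, and the values are the two quoted in the text.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PersonRecord:
    # Hypothetical field names, one per concept rather than one shared "sex" field.
    sexual_appearance: Optional[str] = None   # identification purpose (FBI)
    medical_gender: Optional[str] = None      # treatment and bed assignment (VHA)

fbi_view = PersonRecord(sexual_appearance="female appearing as male")
vha_view = PersonRecord(medical_gender="administrative male gender")

# Because the concepts live in different fields, combining the two views
# overwrites nothing and needs no "Other" bucket.
combined = PersonRecord(
    sexual_appearance=fbi_view.sexual_appearance,
    medical_gender=vha_view.medical_gender,
)
print(combined)
```

A third agency receiving the combined record can read whichever field matches its own purpose and ignore the other.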

Data Mapping Problems in Data Exchanges Between Enterprises

When attempting to map and transmit data from one agency to another, if we transmit medical data without resolution of semantic issues, we may lose the meaning of the data. If your enterprise and my enterprise store different types of details, I may be tempted to solve my problem by lumping your non-matching items into my "Other" category, just so I can view the majority of what you sent me. When I send your data back to you, however, it won't look the same to you anymore. This is because we have different concepts of our data. We may have different opinions about which details are important. When data field mappings cannot be done with one-to-one accuracy, the data concepts need to be examined closely so the data transmitted in messages can be understood and stored properly. This is why developing data standards is so crucial.
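Before any data is exchanged, a mapping table can be checked for places where it is not one to one. The sketch below assumes a hypothetical lossy_targets helper and a made-up mapping; it flags every destination value that receives more than one source value, which is where the concepts need a closer look.

```python
from collections import defaultdict

def lossy_targets(mapping):
    """Return each destination value that more than one source value maps to."""
    sources = defaultdict(list)
    for src, dst in mapping.items():
        sources[dst].append(src)
    return {dst: srcs for dst, srcs in sources.items() if len(srcs) > 1}

# Hypothetical field-value mapping between two enterprises.
mapping = {
    "hardball": "baseball",
    "softball": "baseball",
    "marble": "marble",
    "volleyball": "Other",
    "birdie": "Other",
}

problems = lossy_targets(mapping)
print(problems)
# {'baseball': ['hardball', 'softball'], 'Other': ['volleyball', 'birdie']}
```

An empty result would mean that every destination value can be traced back to exactly one source value.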

The Federal Health Information Model (FHIM) is an effort to coordinate data models for the Federal Health Architecture (FHA) partner agencies. The FHIM model may be viewed at The FHIM is a computationally independent information model.

Our mission determines our data concepts

The data concepts of an organization come from its mission. The FBI's databases contain different information from the CDC's, which in turn contain different data from FEMA's. All are after a different functional result for a different purpose. But all may have data fields with the same names. It is WHY they have these data fields that matters. From the WHY, which is the purpose of the organization's mission, we derive a data concept. The data standard then becomes a catalog of data concepts. We can map concept to concept. The technology offered by XML enables us to label data fields with their standard concept names when they are transmitted between agencies. Ideally, a reference table of each data field's purpose should be maintained.
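Labeling fields with concept names in XML might look like the following sketch, written with Python's standard xml.etree.ElementTree module. The element names, the concept attribute, and the concept labels are all hypothetical, chosen only to show two fields that share the name "sex" but carry different concepts.

```python
import xml.etree.ElementTree as ET

# Build a message in which each field carries its concept name (labels are made up).
record = ET.Element("person")
fbi_field = ET.SubElement(record, "field", name="sex", concept="SexualAppearance")
fbi_field.text = "female appearing as male"
vha_field = ET.SubElement(record, "field", name="sex", concept="AdministrativeSex")
vha_field.text = "administrative male gender"

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)

# The receiver indexes by concept, not by field name, so the two values stay distinct.
parsed = ET.fromstring(xml_text)
by_concept = {f.get("concept"): f.text for f in parsed.findall("field")}
print(by_concept["AdministrativeSex"])  # administrative male gender
```

Indexing by field name instead would collapse both values into a single "sex" entry, which is the loss of meaning the concept labels are meant to prevent.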
