
Clinical data warehouses: Sometimes it's worth being lazy


If you are a health organization, you probably have, or are thinking of having, a clinical data warehouse.

A Clinical Data Warehouse (CDW), sometimes called a Clinical Data Repository (CDR), is a database that consolidates data from a variety of clinical sources to form a unified view of the data for various purposes. Data types typically found in a CDW include clinical laboratory test results, vital signs, patient demographics, administered medications, hospital admissions, ICD-9 codes and more.

Developing and sustaining an effective CDW operations unit is a substantial effort and a long-term commitment. The main challenge in designing a CDW is defining its scope and the use cases it should support. In theory, a CDW can serve as a basis for reporting, research and planning. The use case most often mentioned in relation to CDWs is supporting clinical trials: researchers have all the information from a study in one place, and other researchers can reuse the data to further innovation.

The difficulty is that stakeholders' needs vary with the clinical procedures and data formats they work with. In trying to satisfy the fuzzy goals of the various stakeholders within the health organization, IT teams spend most of their time gathering and modeling data instead of interpreting information and finding opportunities to cut costs and improve patient care.

The two familiar design approaches are top-down and bottom-up. The top-down approach fixes the final shape of the system up front: it starts by defining an Enterprise Data Model (EDM) and then constructs the pieces needed to reach that goal. The bottom-up approach, on the other hand, starts by dividing the large problem into smaller ones and solving each individually.

The “by-the-book” development approach, which most health organizations adopt, is top-down. It is a systematic approach that reduces integration obstacles and supports team collaboration. It is, however, time consuming and difficult to implement, because consistent concepts are hard to achieve across all of a clinical organization's data.
The bottom-up design approach is preferable for designing, implementing and developing clinical data marts. It is characterized by flexibility and a low implementation cost per data mart, but it makes it difficult to integrate the various data marts into an enterprise-wide CDW.

In recent years NoSQL database engines have matured and their adoption has grown. Many NoSQL databases support schema-on-demand, which is also referred to as late binding: data is stored in its original form, and structure is applied only when a query needs it. With late binding it is possible to run complex queries on a large dataset without first having to define an EDM.
Instead of spending months or even years implementing a data warehouse, as with the traditional early-binding approach, health systems using late binding can launch their CDW in weeks.
Late-binding data warehouses are also more scalable and more adaptable to the industry-specific problems healthcare organizations are trying to solve.
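
To make the idea concrete, here is a minimal sketch of late binding in Python. The record layouts, the field names and the `bind_lab_result` helper are hypothetical, invented for illustration rather than taken from any particular product. Raw records from different source systems are stored as they arrive, and a common shape is imposed only when a query asks for it.

```python
import json

# Raw records are stored exactly as the (hypothetical) source systems emit them.
# No enterprise-wide model is defined before loading.
RAW_STORE = [
    json.dumps({"source": "lab_a", "pid": "P1", "test": "glucose", "value": 5.4}),
    json.dumps({"source": "lab_b", "patient": {"id": "P2"},
                "obs": {"code": "glucose", "result": "6.1"}}),
    json.dumps({"source": "adt", "pid": "P1", "event": "admission", "ward": "4B"}),
]


def bind_lab_result(record):
    """Map one raw record to a common lab-result shape at query time.

    Returns None when the record is not a lab result.
    """
    if record.get("source") == "lab_a":
        return {
            "patient_id": record["pid"],
            "test": record["test"],
            "value": float(record["value"]),
        }
    if record.get("source") == "lab_b":
        return {
            "patient_id": record["patient"]["id"],
            "test": record["obs"]["code"],
            "value": float(record["obs"]["result"]),
        }
    return None


def query_lab_results(test_name):
    """Scan the raw store and bind each record only while answering the query."""
    results = []
    for line in RAW_STORE:
        bound = bind_lab_result(json.loads(line))
        if bound is not None and bound["test"] == test_name:
            results.append(bound)
    return results


glucose = query_lab_results("glucose")
print(glucose)
```

Supporting a third lab system in this sketch would mean adding one more branch to the binding function; the records already stored would not need to be reloaded or remodeled.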

Successful early leaders in enterprise data warehousing (EDW) ignored the Enterprise Data Model (EDM) in favor of late binding. In a fluid environment, an EDM is outdated as soon as it is complete. Also, because the EDM process consists of continuous modeling and mapping, data architects never finish mapping. Every time there is a change in the environment, they have to go back and change the model, the ETL, and the downstream analytics.
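
The cost of change can be illustrated with a small Python sketch. The codes, the mapping and the `count_by_concept` helper below are hypothetical. The point is only that when the vocabulary mapping is kept outside the stored data and applied at query time, a change in the environment means editing the mapping rather than reworking the model and reloading the data.

```python
# Raw events keep whatever code the source system used (hypothetical codes).
raw_events = [
    {"pid": "P1", "code": "GLU"},
    {"pid": "P2", "code": "GLUC-S"},
    {"pid": "P3", "code": "HBA1C"},
]

# The vocabulary mapping lives outside the stored data.
code_map = {"GLU": "glucose"}


def count_by_concept(events, mapping):
    """Count events per concept, binding codes to concepts at query time."""
    counts = {}
    for event in events:
        concept = mapping.get(event["code"], "unmapped")
        counts[concept] = counts.get(concept, 0) + 1
    return counts


before = count_by_concept(raw_events, code_map)

# A new source code is recognized: only the mapping changes,
# the stored events are left untouched.
code_map["GLUC-S"] = "glucose"
after = count_by_concept(raw_events, code_map)

print(before)
print(after)
```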

Healthcare is undergoing changes to business rules and vocabulary at an unprecedented rate. A late-binding data warehouse not only provides faster time-to-value, it also enables the agility that today's healthcare analytics demand.
If your health organization hasn't yet implemented a CDW, you may have profited from the wait. Using late binding and the growing ecosystem of NoSQL engines, you may be able to catch up with organizations that started a long time ago.

