I recently participated in a panel on this topic at a Medicaid conference and gave it some more thought. This post summarizes what I took away from that discussion.
I am always skeptical of buzzwords and fads. When the hype subsides, we realize that promises were unrealistic and our expectations way too high. However, there is usually a kernel of value that can be extracted. That is true for Big Data, too.
Before we get into Big Data in healthcare, however, let’s define the term: paraphrasing Wikipedia, Big Data is just a whole lot of data. So much, in fact, that traditional technologies and approaches cannot keep up with processing and analyzing it fast enough. According to studies from McKinsey, Gartner, and others, data is being generated at staggering rates, with annual volumes reaching Exabytes (a million Terabytes) and even Zettabytes (a billion Terabytes). The Kaiser Family Foundation predicts that the volume of healthcare data accumulated each year will grow 50-fold between 2013 and 2020. So much for that.
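For readers who, like me, lose track of the unit prefixes, here is a quick sanity check on the parentheticals above, using decimal (SI) units:

```python
# Decimal (SI) storage units, in bytes
TB = 10**12  # terabyte
EB = 10**18  # exabyte
ZB = 10**21  # zettabyte

assert EB // TB == 10**6  # an exabyte is a million terabytes
assert ZB // TB == 10**9  # a zettabyte is a billion terabytes
```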
To me, a more useful approach to the topic is to look at the technologies and tools that have evolved to manage and analyze these massive amounts of data. That is more of a Silicon Valley approach, driven by having hit hard limitations of what off-the-shelf technology could handle. Google, Facebook, LinkedIn, and certain research organizations have had the greatest need for these technologies to date and, in my opinion, these technologies remain, for the most part, most relevant to such extremely high-volume data operations.
Another way to look at the technology is by distinguishing structured from unstructured data and observing how technology has evolved to specifically handle the latter. Historically, IT has dealt predominantly with structured data, at worst attempting to force unstructured data into a schema of one form or another. Today we see a proliferation of unstructured data, a major driver of the need for new technologies and approaches.
So what about healthcare data, then? Well, I would contend that in the realm of healthcare, we are still dealing with predominantly structured data. One could reasonably argue that data aggregated across the many flavors of EMR can only be treated as unstructured; but if we take each system’s data individually, we return to a structured world. So it is really just inconsistently structured. Now, that does not make it any less painful to deal with, but I’d like to stipulate the distinction anyway.
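To make "inconsistently structured" concrete, here is a minimal sketch: two hypothetical EMR exports carry the same facts under different field names, and a per-system mapping reconciles them into one common schema. All system and field names here are invented for illustration.

```python
# Export from hypothetical EMR "system A"
record_a = {"pat_id": "12345", "dob": "1980-04-02", "dx_code": "E11.9"}

# Export from hypothetical EMR "system B": same facts, different schema
record_b = {"PatientID": "67890", "BirthDate": "1975-11-30", "Diagnosis": "I10"}

# One field mapping per source system, into a common target schema
MAPPINGS = {
    "system_a": {"pat_id": "patient_id", "dob": "birth_date", "dx_code": "diagnosis_code"},
    "system_b": {"PatientID": "patient_id", "BirthDate": "birth_date", "Diagnosis": "diagnosis_code"},
}

def normalize(record, source):
    """Rename each source field to the common schema's field name."""
    mapping = MAPPINGS[source]
    return {mapping[key]: value for key, value in record.items()}

normalized = [normalize(record_a, "system_a"), normalize(record_b, "system_b")]
# Each record now has the same keys: patient_id, birth_date, diagnosis_code
```

Each system is perfectly structured on its own; the pain is in writing and maintaining a mapping like this for every feed, which is exactly the kind of data-quality and reconciliation work described below.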
Meanwhile, the challenges in healthcare remain the old ones: the quality of data, the feeds and processes around it, and defining and maintaining systems of record while reconciling all the other versions of the same data. Now, the Big Data movement has brought us some technologies that are also very useful with structured data: new analytics tools, statistical evaluation tools, and data visualization (we at HDVI are starting to use Tableau 8.1 both internally and as part of our SaaS platform). These tools are valuable even when used with more traditional technologies. And that should really be the takeaway for us in Healthcare IT. Whether the technology is SQL, SAS, or Hadoop (with its distributed file system) is really secondary to what should always be our primary goal: that the technology addresses a specific need when pragmatically applied to business requirements.