Healthcare Predictive Analytics

“The Formula” – great summer reading and some implications for healthcare predictive analytics.

I would like to recommend “The Formula” by Luke Dormehl as a good summer read. I am enjoying this book so far, and I think it should be a must-read for anyone interested in predictive analytics and predictive modelling. A few passages from the beginning of the book are provided below.


“Algorithms sort, filter and select the information that is presented to us on a daily basis.”  “… are changing the way that we view … life, the universe, and everything.”

“To make sense of a big picture, we reduce it …  To take an abstract concept such as human intelligence and turn it into something quantifiable, we abstract it further, stripping away complexity and assigning it a seemingly arbitrary number, which becomes a person’s IQ.”

“What is new is the scale that this idea is now being enacted upon, to the point that it is difficult to think of a field of work or leisure that is not subject to algorithmization and The Formula.  This book is about how we reached this point, and how the age of the algorithm impacts and shapes subjects as varied as human creativity, human relationships, notions of identity, and matters of law.”

“Algorithms are very good at providing us with answers in all of these cases.  The real question is whether they give us the answers we want (my emphasis).”

This takes us back to George E.P. Box’s famous quote, “all models are wrong, but some are useful”. We can create algorithms for almost anything, but how useful are they? Accurate models can be created that work really well on deterministic systems, but they are much harder to develop for complex systems. Every model is a simplification: when you strip a feature out of a complex system, the model loses the impact of that feature on the system. You try to strip away only the features that do not have a huge impact on the performance of the system, but this is often unknowable in advance.
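The point about stripping away features can be illustrated with a small toy sketch in Python. Everything here is an assumption made for illustration: the "system" is an invented formula with two features and an interaction between them, and the helper names (`fit_line`, `mean_squared_error`) are hypothetical. It is not a clinical model, only a demonstration that a model which drops a feature loses that feature's impact.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_squared_error(predicted, actual):
    """Average squared difference between predictions and actual values."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


random.seed(0)  # fixed seed so the toy data is reproducible

# Invented deterministic "system": the outcome depends on two features
# and on their interaction. The coefficients are arbitrary.
x1 = [random.uniform(0, 1) for _ in range(500)]
x2 = [random.uniform(0, 1) for _ in range(500)]
y = [2.0 * a + 3.0 * b + 4.0 * a * b for a, b in zip(x1, x2)]

# Full model: knows both features and the interaction, so it is exact.
full_pred = [2.0 * a + 3.0 * b + 4.0 * a * b for a, b in zip(x1, x2)]

# Reduced model: x2 has been stripped away, so only x1 is available.
intercept, slope = fit_line(x1, y)
reduced_pred = [intercept + slope * a for a in x1]

full_error = mean_squared_error(full_pred, y)
reduced_error = mean_squared_error(reduced_pred, y)

print("error with all features:", full_error)
print("error after dropping x2:", reduced_error)
```

On a deterministic toy system the full model has no error at all, while the reduced model cannot recover what the dropped feature contributed. In a real complex system we rarely know in advance which features are safe to drop.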

One of the great challenges in clinical medicine is trying to determine or predict what is going to happen to a patient in the future. We know generally that smoking is bad, too much alcohol is bad, being overweight is bad, not exercising is bad, and not sleeping enough is bad. We know these are bad for the overall population of people. However, we do not know how each of these affects a single patient, nor how they are interrelated. We would like to develop models that can predict what will happen if you have certain conditions (predictive modeling), and then look at what would happen if you took certain courses of action, treatments, or preventative actions (prescriptive modeling). The results of these models would allow clinicians and patients to be better informed and choose the best pathway forward.
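The difference between predictive and prescriptive modeling can be sketched in a few lines of Python. This is a minimal illustration only: the logistic form is one common choice for risk models, and the coefficients, the intercept, and the function names (`predicted_risk`, `best_single_change`) are all hypothetical values invented for this example, not taken from any study or validated tool.

```python
import math

# HYPOTHETICAL coefficients, invented for illustration only.
# A real model would estimate these from patient data.
COEFFICIENTS = {
    "smoker": 0.7,
    "heavy_alcohol": 0.4,
    "overweight": 0.5,
    "sedentary": 0.3,
    "poor_sleep": 0.2,
}
INTERCEPT = -3.0  # also invented


def predicted_risk(patient):
    """Predictive step: probability of a bad outcome given risk factors."""
    score = INTERCEPT + sum(
        COEFFICIENTS[factor] for factor, present in patient.items() if present
    )
    return 1.0 / (1.0 + math.exp(-score))  # logistic function


def best_single_change(patient):
    """Prescriptive step: try removing each risk factor that is present
    and report the resulting risk for each option."""
    options = {}
    for factor, present in patient.items():
        if present:
            changed = dict(patient)
            changed[factor] = False
            options[factor] = predicted_risk(changed)
    if not options:
        return None, options  # nothing to change
    return min(options, key=options.get), options


patient = {
    "smoker": True,
    "heavy_alcohol": False,
    "overweight": True,
    "sedentary": True,
    "poor_sleep": False,
}

baseline = predicted_risk(patient)
best_factor, options = best_single_change(patient)

print("baseline risk:", round(baseline, 3))
for factor, risk in options.items():
    print("risk if", factor, "is removed:", round(risk, 3))
print("largest single improvement:", best_factor)
```

Note that this sketch simply adds the factors together, so it ignores how they are interrelated, which is exactly the part we do not yet understand well for an individual patient.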

Of particular interest to me, I would like to be able to predict in real time what is going to happen to a patient I am seeing in the emergency room. This is a complex situation. Their current state, meaning their physiologic vital signs (level of consciousness, blood pressure, pulse, respiratory rate, temperature, blood oxygen level, respiratory variability, heart rate variability, EKG, etc.) along with their current laboratory and radiological imaging findings, will define their current problem or diagnosis. The patient’s past medical history, medications, allergies, social support, living environment, etc., will have major impacts on how they respond to their current illness or injury. We would like to aggregate all of this information into predictive and prescriptive models that could predict future states. Are the patients safe to be discharged home or do they need to be admitted? If they need to be admitted, can they go to the short stay unit, a bed with cardiac monitoring, or the intensive care unit? Given the current treatment, what will their response to this treatment be: will they get better or worse? Will they develop sepsis? Will they develop respiratory failure and require a tube be placed down their throat and a ventilator to breathe for them?

A particularly exciting area ripe for development is the internet of things.   The internet of things is going to revolutionize how we collect data, both at home and in the hospital.   This much-needed capability will allow us to monitor patients at home,  detect illnesses much earlier, monitor responses to therapies, etc.,  and will be useful for a whole host of things we haven’t even imagined yet.

These are some of the complex questions that face us now in medicine. I am excited to participate in this quest to answer some of these vexing questions using all of the analytical tools that are currently available, whether “small data” using standard descriptive and inferential statistics, predictive analytics, or big data analytics.

Becoming a Healthcare Data Scientist

My Current Baseline Data Scientist Skill Set

It will be interesting to compare my skill set once I finish the predictive analytics program to my current skill set.  I will outline my current skills so I can come back later and compare the two.

I will organize my skills using the format presented by Mitch Sanders in his blog article “Data Science – Capturing, Analyzing, and Presenting Data Skills”, posted on 8.27.13.

1.  Capturing Data

Programming and Database skills:

I am weak in this area. I have used R a bit to do some statistical analysis in the past. I am currently learning Python. So far, I have found that Codecademy’s Python course is the best learning platform for me. My next favorite resource is Zed Shaw’s book, “Learn Python the Hard Way”. I really like his practical approach. “Introducing Python: Modern Computing in Simple Packages” by Bill Lubanovic is also good, but a bit more advanced. Finally, the Visual Quickstart Guide “Python” by Toby Donaldson is a quick reference guide. Going past basic programming, my skills are near or below zero. I do not know how to use Hadoop, Java, SQL, Hive or Pig.

Business Domain Expertise and Knowledge

This is my strongest area of expertise. I started off in medicine in 1984 as a basic EMT, became an EMT-Paramedic, and then a Paramedic Educator. I finished medical school (University of Illinois College of Medicine in Peoria, Illinois) in 1994, and my Emergency Medicine Residency at Saint Francis Hospital in Peoria, Illinois in 1997. I have practiced academic and community-based emergency medicine since then. I have been a medical director for both ground-based EMS and for a flight program. I am also one of our health system’s Chief Medical Information Officers (CMIO), so I have had to learn the field of Healthcare Information Technology as well. In my current role I have a special interest in Business Intelligence and Analytics, including predictive analytics. My passion is for developing smarter systems that can provide information about a patient’s risk of developing certain diseases/conditions, risk of deterioration/death, early detection of sub-clinical illness, and information about a patient’s response to treatment and therapy. Hence my interest in predictive analytics.

Data Modeling, Warehouse, and Unstructured Data Skills.

I have minimal skills in this category.

2.  Analyzing Data

Math Skills.

I have basic math skills, but it has been a long time since I have had to do anything beyond basic math, such as calculus or linear algebra. After I finish getting a basic foundation in Python, my next step is to refresh my knowledge of math/calculus/linear algebra before starting my “Math for Modelers” course this fall.

Statistical and Analytical Skills

I do have a little better grasp of descriptive and inferential statistics.   But I will need to increase my knowledge of the advanced statistical techniques not commonly used in medicine today.  These would include predictive analytics, regression, multivariate analysis, linear models, time series analysis, machine learning, etc.

3.  Presenting Data

I am really excited to learn about and improve my data visualization skills. I am really pushing hard for our organization to move away from Excel- and PowerPoint-based presentations of data to more relevant methods.

Storytelling Skills

I am a pretty good storyteller, but I would like to improve my skills, especially in presenting the data and the stories around the data. I would like to help people understand the insight created by the data analysis, and then help them move to operationalizing that insight and driving organizational change to improve patient outcomes.

In summary, my strongest skills are my love of data and analytics, my (obsessive) desire to become a data scientist, and my domain knowledge as it pertains to healthcare.  My other skills will have to be works in progress.

I would love to hear comments on what you think, and any recommendations/advice for students just starting this journey.

June 10, 2015

Becoming a Healthcare Data Scientist

Am I a little anxious to get started with classes at Northwestern?

I had a nice conversation with my academic adviser at Northwestern this morning and laid out a preliminary plan.  I then signed up for my first class – 400-DL Math for Modelers.  This will cover a review of matrices, linear programming, probability, differential and integral calculus with an emphasis on applications.

I actually ordered my textbook, Finite Mathematics and Calculus with Applications by Lial, including MyMathLab.  Even though classes don’t start until September, I do want to start reviewing the textbook, as I am a little anxious since it has been so many years since I took these courses.