Wednesday, February 13, 2013

Predictive Modeling for the Indecisive





Decisions, Decisions, Decisions... They're Everywhere

I don't know about you, but sometimes I'd like a little help in making complex decisions.  If you are like me, evaluating the pros and cons of every element that contributes to the outcome of a decision can get exhausting.  Whether you are a business executive deciding which innovative projects to invest in or a doctor determining a diagnosis for a patient, there comes a point when a decision must be made to get the best possible outcome.  Since it is nearly impossible for one person to be well informed about every factor that affects an outcome, how can one make the right decision?

Let's consider some factors at play.  Often the data available to us in the present is data about the past, which is then used to make assumptions about the future.  You can see the danger already, right?  While historical data can give a great view of how a product has performed in the past, it may or may not be relevant to a new product in a different market.  That said, historical data does provide a reference point and insight into the challenges and successes of past circumstances, and it can be beneficial in assessing the current situation.  Likewise, a doctor pulls valuable information from historical population health data about patients with similar attributes and symptoms when considering possible diagnoses; however, population data may or may not be indicative of the condition of an individual patient.  The value comes from using the available information to identify the risks and uncertainties of the situation before coming to a decision [1].  Predictive analytics is a tool for evaluating a large amount of information and incorporating those risks into a predicted outcome.



What is Predictive Analytics?

Predictive analytics uses statistics, machine learning, and data mining to build a model from historical data; the model then predicts future outcomes when similar types of information are available.  Many tools are available to evaluate massive amounts of information and help the business executive or doctor make decisions.  Sometimes the information seems endless: cost of goods sold, selling price, capital investments, volume sold, and many other metrics may all be available.  Critical to evaluating a decision is knowing which attributes directly affect the outcome and, further, which of those attributes can be controlled.
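
To make the idea concrete, here is a minimal sketch of the simplest kind of predictive model: a straight line fitted to historical data by ordinary least squares, then used to estimate an outcome for a new input.  The prices and unit volumes are made-up numbers for illustration only, and the function names are my own; a real project would likely use one of the dedicated tools discussed later and far more data.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of the input, and how input and outcome vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Apply a fitted (intercept, slope) model to a new input value."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical historical data: selling price and units sold per period.
prices = [8.0, 9.0, 10.0, 11.0, 12.0]
units_sold = [520, 480, 450, 400, 370]

model = fit_line(prices, units_sold)
forecast = predict(model, 10.5)  # estimated units sold at a price of 10.50
print("intercept, slope:", model)
print("forecast at 10.50:", forecast)
```

The forecast is only as trustworthy as the assumption that the future resembles the past, which is exactly the danger described earlier.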

How can I get started?

Diener and Kidd identify five factors that are important for obtaining reliable results from predictive analytic models [2]:
            1. Identify relevant inputs to obtain good output.
            2. Obtain a comprehensive range of attributes in order to achieve a complete model.
            3. Have patience when aggregating and integrating data sources (there are tools available!).
            4. Use reliable modeling tools.
            5. Know how to interpret the output, or use tools with user-friendly output.

The inputs selected for predictive modeling are crucial to obtaining a model that accurately predicts a defined outcome.  Typically, the more attributes used as inputs (relevant to the outcome, of course), the more accurate and comprehensive the resulting model.  Patience is a must when dealing with a large amount of data: storage space for the data, data cleaning, and data integration may all be needed before any analysis can be performed.  There are many modeling tools available that will provide quality output.  Open source options include Weka, R, and RapidMiner, while Oracle Data Mining, IBM SPSS, and SAS are commercial products.  Working with large amounts of data from start to finish can be exhausting, so the output of any analysis or model should be easy for decision makers to interpret.
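
One rough way to screen candidate inputs for relevance is to check how strongly each one moves together with the outcome in the historical data.  The sketch below ranks a few attributes by the absolute value of their Pearson correlation with units sold.  The attribute names and all of the numbers are hypothetical, and correlation only captures straight-line association, not cause and effect, so treat this as a first pass rather than a verdict.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical historical attributes, one value per sales period.
attributes = {
    "selling_price": [8.0, 9.0, 10.0, 11.0, 12.0],
    "ad_spend": [30, 28, 25, 20, 18],
    "invoice_day": [5, 1, 9, 1, 5],  # deliberately unrelated to sales
}
units_sold = [520, 480, 450, 400, 370]  # the outcome we want to predict

# Correlation of each attribute with the outcome.
scores = {name: pearson(values, units_sold) for name, values in attributes.items()}

# Strongest linear association first, regardless of sign.
ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)

for name in ranked:
    print(f"{name}: {scores[name]:+.3f}")
```

In this made-up data, price and advertising spend track the outcome closely while the invoice day does not, so the last one would be a candidate to drop before building a model.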

With more information being generated and tools available to analyze data and build predictive models, business executives, doctors, and everyone else will be better able to manage the complexities of making difficult decisions.

References
[1] http://www.oracle.com/us/solutions/ent-performance-bi/performance-management/064129.pdf
[2] http://www.marketingprofs.com/articles/2011/6558/why-predictive-modeling-is-hot-and-five-things-you-should-know-to-do-it-well