The data scientific method is primarily used as a blueprint for framing a problem and solving it. Below are the steps involved in this method:
Data Scientific Method
- Start with a question
- Leverage your current data
- Create features and run tests
- Analyze the results and draw insights
- Let the data frame a conversation
This method is not unlike the scientific method, where a scientist starts with a question or statement, then works to find meaningful patterns that help answer that question. In the typical scientific method, the steps are:
- Hypothesis: Make an educated guess or prediction
- Observations: Start watching
- Data: Write down what is observed
- Graphs: Turn it into a meaningful picture
- Conclusions: Decide what it all means
Hence the “science” in data science. It always starts with a question, statement or problem. Then, through a series of steps that are sometimes simple but often complex, the analysis lets the data scientist and their team draw meaningful, evidence-based conclusions that can guide future action for the benefit of the organization.
If I were to apply the data scientific method to a real-world challenge, a good example would be optimizing time for a sales team. Inside sales representatives are normally graded on a few metrics: the number of dials they make, the amount of time they spend on the phone, how many units they sell, and the revenue they generate. I would use the method to see whether those metrics actually relate to each other. Following the five main steps, I imagine it would look something like this:
- Hypothesis: What is the ideal amount of time for a sales rep to be on the phone? (I would like to know how much phone time is most effective, so reps aren’t wasting time just to hit unnecessary talk-time metrics.)
- Observations: I would have the sales reps monitor their activity.
- Data: I would have the sales reps log their call time, and whether a sales opportunity came from the call, in Salesforce.
- Graphs: I would then graph the average talk time of all the reps on the x-axis and the revenue generated from those calls on the y-axis.
- Conclusions: From this I could see the ideal “break-even” point between the amount of time on the phone and the revenue uncovered from that time spent.
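The Data, Graphs and Conclusions steps above could be sketched in Python roughly as follows. This is a minimal illustration, not a finished analysis: the `calls` list, its numbers, the five-minute bucket width, and the function name `average_revenue_by_bucket` are all made up for this example and do not come from any real Salesforce export.

```python
from collections import defaultdict

BUCKET_MINUTES = 5  # assumed bucket width; pick whatever granularity suits the data

# Hypothetical call log: (talk time in minutes, revenue in dollars tied to that call).
# These values are invented purely for illustration.
calls = [
    (2, 0), (4, 0), (7, 150), (9, 200), (12, 400),
    (14, 350), (17, 300), (19, 250), (23, 100), (27, 50),
]


def average_revenue_by_bucket(calls, bucket_minutes=BUCKET_MINUTES):
    """Group calls into talk-time buckets and return {bucket_start: mean revenue}."""
    totals = defaultdict(lambda: [0.0, 0])  # bucket_start -> [revenue sum, call count]
    for minutes, revenue in calls:
        bucket_start = (minutes // bucket_minutes) * bucket_minutes
        totals[bucket_start][0] += revenue
        totals[bucket_start][1] += 1
    return {start: total / count for start, (total, count) in sorted(totals.items())}


averages = average_revenue_by_bucket(calls)

# The bucket with the highest average revenue is a rough stand-in for the
# "ideal" talk time described in the Conclusions step.
best_bucket = max(averages, key=averages.get)

for start, avg in averages.items():
    end = start + BUCKET_MINUTES - 1
    print(f"{start:>2}-{end:<2} min: average revenue {avg:.2f}")
print(f"Highest average revenue: calls starting at {best_bucket} minutes")
```

Plotting `averages` with talk time on the x-axis and average revenue on the y-axis would give the graph described in the Graphs step. With real data, the bucket width and the number of calls per bucket would both need checking before reading much into the peak.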
At the most basic level, the three components involved in data science are organizing, packaging and delivering data (the OPD of data). I would be able to come up with a few conclusions as to how each of the sales reps should spend their time most effectively. I would imagine the graph would look similar to a bell curve, which would show how much time is too little and how much is too much. This efficiency across the entire inside sales department would not only increase revenue but also decrease the cost of wasted time. According to the blog SellingPower, approximately 65% of an average sales representative’s time is spent selling by phone and generating leads. If I were to follow that structure, I would ultimately come to a conclusion that helps drive business value for a sales department. This is just one of many examples of how the data scientific method can be used to frame a business problem for a data analyst.
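The earlier question of whether the rep metrics relate to each other at all can be checked with a correlation coefficient. The sketch below computes Pearson's r by hand using only the standard library. The `reps` dictionary and every number in it are hypothetical, chosen only to show the mechanics.

```python
from math import sqrt


def pearson(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly figures, one entry per rep (invented for illustration).
reps = {
    "dials": [120, 150, 90, 200, 170],
    "talk_minutes": [300, 340, 260, 310, 420],
    "units_sold": [8, 10, 6, 9, 13],
    "revenue": [4000, 5000, 3000, 4500, 6500],
}

for metric in ("dials", "talk_minutes", "units_sold"):
    r = pearson(reps[metric], reps["revenue"])
    print(f"{metric} vs revenue: r = {r:.2f}")
```

Pearson's r only captures straight-line relationships. If revenue really does rise and then fall with talk time, as the bell-curve guess suggests, r could come out close to zero even though the two metrics are strongly related, so a low value here is a prompt to look at the graph, not proof that talk time is irrelevant.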