In an ideal world, a data analysis process is as simple as-read in data, select a suitable model to fit in data, obtain statistical estimates, and finally, interpret the analysis results. Sounds simple and straight forward, isn’t it?
But, in reality, it’s often not that simple!Data is always messy and often times we need to clean our data before we can make any sense of it. Moreover, some researchers found that more than 80% of data analysis is actually spent on data preparation or data manipulation (Dasu & Johnson, 2003), so that the data is transformed into a usable format before you even think about analysis.