MCA/BTech - Data Science - Understanding Data Preparation (Questions and Answers)
Q1: Why is there a need to prepare data?
Answer: It is estimated that over 2.5 exabytes of data are created and collected by people and organisations each day. The main reasons we need to prepare data are:
① Preparing the data takes 60% to 95% of the total time in a data mining project, and every mining tool needs some data preparation.
② The purpose of preparation is to transform data sets so that their information content is best exposed to the mining tool.
③ The prediction error rate after preparation should be lower than, or at worst the same as, it was before preparation.
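Points ② and ③ can be illustrated with a small sketch. The records below are made up for illustration (the feature names income and score, and all the values, are assumptions, not real data), and the helper functions min_max_scale and loo_error_rate are hypothetical names. The label depends only on the small-valued score, but on raw data the large-valued income dominates the distance calculation, so a 1-nearest-neighbour classifier is misled. Min-max scaling is just one possible preparation step, chosen here because it is simple.

```python
import math

# Hypothetical toy records: (income, score, label).
# By construction, the label depends only on score (1 if score > 0.5).
records = [
    (30000, 0.10, 0), (30500, 0.90, 1),
    (50000, 0.20, 0), (50500, 0.80, 1),
    (70000, 0.15, 0), (70500, 0.85, 1),
]

def min_max_scale(rows):
    """Rescale each feature column to the range [0, 1]."""
    scaled_columns = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0  # avoid division by zero for constant columns
        scaled_columns.append([(v - lo) / span for v in col])
    return [tuple(r) for r in zip(*scaled_columns)]

def loo_error_rate(features, labels):
    """Leave-one-out error rate of a 1-nearest-neighbour classifier."""
    errors = 0
    for i, x in enumerate(features):
        # Find the closest other record by Euclidean distance.
        nearest = min(
            (j for j in range(len(features)) if j != i),
            key=lambda j: math.dist(x, features[j]),
        )
        errors += labels[nearest] != labels[i]
    return errors / len(features)

features = [(income, score) for income, score, _ in records]
labels = [label for _, _, label in records]

raw_error = loo_error_rate(features, labels)
prepared_error = loo_error_rate(min_max_scale(features), labels)
print(f"error before preparation: {raw_error:.2f}")
print(f"error after preparation:  {prepared_error:.2f}")
```

On this contrived data set the error rate falls from 1.00 before scaling to 0.00 after it, because scaling puts both features on a comparable range and exposes the information in score to the classifier. Real data sets will rarely show such a clean difference.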