Blog_data_mining_pipeline

To be replaced with a blog post answering the questions below:

What is data mining? What are some examples of how data mining can be used?
What are the different steps of the pipeline? Briefly discuss each.
Why is defining the problem first so important?
Why is data cleaning/pre-processing important? What are some aspects of data that need to be cleaned (for example, dealing with null values)?
Find an example of data understanding/visualization work (e.g., a blog post or portfolio). What do you like and/or dislike about it?