Common wisdom states that “perfect is the enemy of good enough.” We can apply this wisdom to the machine learning models that we train and deploy for big data analytics. If we strive for perfection, then we may encounter several risks. It may therefore be useful to pay attention to a little bit of “machine unlearning.” For example:
Overfitting
By attempting to build a model that correctly follows every little nuance, deviation, and variation in our data set, we almost certainly end up fitting the natural variance in the data, which will never go away. After building such a model, we may find that it has nearly 100% accuracy on the training data but significantly lower accuracy on the test data set. That gap between training and test accuracy is strong evidence that we have overfit our model. Of course, we don’t want a trivial model (an underfit model) either – to paraphrase Albert Einstein: “models should be as simple as possible, but no simpler.”
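Here is a minimal sketch of that train/test gap using NumPy polynomial fits (the sine signal, noise level, polynomial degrees, and sample sizes are illustrative choices, not from the original post). A high-degree polynomial has enough free parameters to chase the noise in a small training set, driving its training error toward zero, while a low-degree fit can only capture the broad shape of the signal:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of a simple underlying signal: y = sin(2*pi*x) + noise.
x_train = np.sort(rng.uniform(0.0, 1.0, 15))
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.2, x_train.size)
x_test = np.sort(rng.uniform(0.0, 1.0, 200))
y_test = np.sin(2 * np.pi * x_test) + rng.normal(0.0, 0.2, x_test.size)

def mse(coeffs, x, y):
    """Mean squared error of the polynomial given by coeffs on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# Degree 9 can bend to follow the noise in just 15 training points;
# degree 3 is forced to model only the underlying trend.
overfit = np.polyfit(x_train, y_train, deg=9)
simple = np.polyfit(x_train, y_train, deg=3)

for name, coeffs in [("degree 9", overfit), ("degree 3", simple)]:
    print(f"{name}: train MSE = {mse(coeffs, x_train, y_train):.4f}, "
          f"test MSE = {mse(coeffs, x_test, y_test):.4f}")
```

Running this, the degree-9 fit scores better than the degree-3 fit on the training points but worse on the held-out test points – the same pattern described above, with error in place of accuracy.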
(continue reading here … https://www.mapr.com/blog/machine-unlearning-value-imperfect-models)
Follow Kirk Borne on Twitter @KirkDBorne