A while back, we made a list from A to Z of a few of our favorite things in big data and data science. Since then, we have covered several of these topics. Here’s a handy list of the write-ups completed so far:
A – Association rule mining: described in the article “Association Rule Mining – Not Your Typical Data Science Algorithm.”
C – Characterization: described in the article “The Big C of Big Data: Top 8 Reasons that Characterization is ‘ROIght’ for Your Data.”
H – Hadoop (of course!): described in the article “H is for Hadoop, along with a Huge Heap of Helpful Big Data Capabilities.” To learn more, check out the Executive’s Guide to Big Data and Apache Hadoop, available as a free download from MapR.
K – K-anything in data mining: described in the article “The K’s of Data Mining – Great Things Come in Pairs.”
L – Local linear embedding (LLE): described in detail in the blog post series “When Big Data Goes Local, Small Data Gets Big – Part 1” and “Part 2.”
N – Novelty detection (also known as “Surprise Discovery”): described in the articles “Outlier Detection Gets a Makeover – Surprise Discovery in Scientific Big Data” and “N is for Novelty Detection…” To learn more, check out the book Practical Machine Learning: A New Look at Anomaly Detection, available as a free download from MapR.
P – Profiling (specifically, data profiling): described in the article “Data Profiling – Four Steps to Knowing Your Big Data.”
Q – Quantified and Tracked: described in the article “Big Data is Everything, Quantified and Tracked: What this Means for You.”
R – Recommender engines: described in two articles: “Design Patterns for Recommendation Systems – Everyone Wants a Pony” and “Personalization – It’s Not Just for Hamburgers Anymore.” To learn more, check out the book Practical Machine Learning: Innovations in Recommendation, available as a free download from MapR.
S – SVM (Support Vector Machines): described in the article “The Importance of Location in Real Estate, Weather, and Machine Learning.”
Z – Zero bias, Zero variance: described in the article “Statistical Truisms in the Age of Big Data.”