Data mining involves “processing data and identifying patterns and trends in that information,” according to IBM. “Data mining principles have been around for many years, but, with the advent of big data, it is even more prevalent.”

Ninety percent of data in the world today has been created in the last two years alone, IBM estimates. Every day, people create 2.5 quintillion bytes of data, enough to fill 10 million Blu-ray Discs.

Data mining techniques help professionals draw insights from available data sets. These techniques give businesses and other organizations both descriptive and predictive power.

5 Data Mining Techniques

1.    Association

Association identifies a relationship between two or more items that tend to occur together. For instance, a supermarket could determine that customers often purchase whipped cream when they buy strawberries and vice versa. Association is often applied to point-of-sale data to find products that are commonly bought together.

“It’s a very simple method, but you’d be surprised how much intelligence and insight it can provide—the kind of information many businesses use on a daily basis to improve efficiency and generate revenue,” according to technology company Galvanize. Application areas include physical organization of items, marketing and the cross-selling and up-selling of products.
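The strawberries and whipped cream example can be sketched in a few lines of Python. This is a minimal illustration and not a production algorithm: the baskets are invented, the function name pair_rules is hypothetical, and only pairs of items are considered. It reports three commonly used measures: support (share of baskets containing both items), confidence (share of baskets with the first item that also contain the second) and lift (confidence divided by how often the second item appears overall).

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented purely for illustration.
BASKETS = [
    {"strawberries", "whipped cream", "milk"},
    {"strawberries", "whipped cream"},
    {"strawberries", "bread"},
    {"milk", "bread"},
    {"whipped cream", "strawberries", "bread"},
]


def pair_rules(baskets, min_support=0.5):
    """Return (lhs, rhs, support, confidence, lift) for frequent item pairs."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair is too rare to be interesting
        # Each frequent pair yields two directed rules: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            lift = confidence / (item_counts[rhs] / n)
            rules.append((lhs, rhs, support, confidence, lift))
    return rules


for lhs, rhs, support, confidence, lift in pair_rules(BASKETS):
    print(f"{lhs} -> {rhs}: support={support:.2f}, "
          f"confidence={confidence:.2f}, lift={lift:.2f}")
```

With these invented baskets, only the strawberries and whipped cream pair clears the 0.5 support threshold. A lift above 1 suggests the two items appear together more often than chance alone would predict.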

2.    Classification

Classification uses multiple attributes of an item to assign it to one of a set of predefined target categories, or classes. A model built from items whose classes are already known can then predict the class of new items.

Several industries use classification to group customers. For instance, a bank could use a classification model to identify loan applicants as low, medium or high credit risks. Other organizations classify current and target audiences into age and social groups for marketing campaigns.
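As a rough sketch of the loan example, the Python below labels a new applicant by majority vote among the most similar known applicants (a k-nearest-neighbors classifier, one of many possible classification methods). The training records, the two attributes and the function name classify are all assumptions made for illustration, not real credit data or a real scoring model.

```python
import math
from collections import Counter

# Hypothetical labeled applicants:
# (debt-to-income ratio, missed payments in the past year) -> risk class
TRAINING = [
    ((0.10, 0), "low"),
    ((0.15, 0), "low"),
    ((0.20, 1), "low"),
    ((0.35, 2), "medium"),
    ((0.40, 2), "medium"),
    ((0.45, 3), "medium"),
    ((0.60, 5), "high"),
    ((0.70, 6), "high"),
    ((0.80, 7), "high"),
]


def classify(applicant, training=TRAINING, k=3):
    """Predict a risk class by majority vote among the k closest records."""
    # Note: attributes are left unscaled here, so missed payments dominate
    # the distance. A real model would normally rescale the attributes.
    nearest = sorted(training, key=lambda row: math.dist(row[0], applicant))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(classify((0.12, 0)))
print(classify((0.38, 2)))
print(classify((0.75, 6)))
```

The key point is that the classes (low, medium, high) are fixed in advance and the model learns from records that already carry those labels.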

3.    Clustering

“Clustering is the method by which like records are grouped together,” according to Alex Berson, Stephen Smith and Kurt Thearling in the book Building Data Mining Applications for CRM. “Usually this is done to give the end user a high level view of what is going on in the database.”

Seeing how records group together can help businesses in areas like market segmentation. Here, clustering subdivides a market into subsets of customers. Each subset can then be targeted with a specific marketing strategy based on the attributes of the cluster, such as how buying patterns in one cluster differ from those in another.
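Unlike classification, clustering starts without predefined labels. The sketch below uses k-means, one common clustering method, to split a handful of invented customers into two segments. The data, the two attributes and the simplistic choice of starting centroids (the first k points) are assumptions for illustration only.

```python
import math

# Hypothetical customers: (store visits per month, items bought per visit)
CUSTOMERS = [(1, 2), (8, 9), (2, 1), (9, 8), (1, 1), (9, 9)]


def k_means(points, k, iterations=10):
    """Group points into k clusters; returns (assignments, centroids)."""
    centroids = list(points[:k])  # naive start: the first k points
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return assignments, centroids


assignments, centroids = k_means(CUSTOMERS, k=2)
for customer, cluster in zip(CUSTOMERS, assignments):
    print(customer, "-> segment", cluster)
```

On this toy data the occasional shoppers and the frequent shoppers end up in separate segments, and each segment could then receive its own marketing strategy.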

4.    Decision Trees

Decision trees are used to categorize data or make predictions. A decision tree starts with a simple question that has two or more answers. Each answer leads to a further question, and so on, until the data reaching the end of a branch can be assigned a category or a prediction.

[Figure: decision tree graph]

The graphic of a decision tree represents how a cellphone provider might classify customers who churn, that is, customers who don’t renew their phone contracts. The authors of Building Data Mining Applications for CRM offer several takeaways from the graphic:

•  It divides the data on each branch without losing any of the data. For instance, the total number of records in a parent node is equal to the sum of the records contained in its two children.

•  The number of churners and non-churners is conserved as you move up or down the tree.

•  It is fairly easy to understand how the model is being built.

•  The model would be pretty easy to use if you needed to target customers who are likely to churn with a marketing offer.

•  The company could develop intuition about its customer base; for instance, it could conclude that customers who have been with the provider for a couple of years and have up-to-date cellphones tend to be loyal.
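Two of the points above, that each split keeps every record and that the number of churners is conserved, can be checked with a small sketch. The customer records, the questions asked at each node and the thresholds are all invented and chosen by hand here; a real decision tree algorithm would learn the questions from the data.

```python
# Hypothetical records: (years with provider, phone age in years, churned)
CUSTOMERS = [
    (0.5, 3.0, True),
    (1.0, 2.5, True),
    (2.5, 0.5, False),
    (3.0, 1.0, False),
    (2.0, 3.0, True),
    (4.0, 0.5, False),
    (0.5, 0.5, False),
    (3.5, 2.5, False),
]


def split(records, question):
    """Send each record down the 'yes' or 'no' branch of one question."""
    yes = [r for r in records if question(r)]
    no = [r for r in records if not question(r)]
    return yes, no


def churners(records):
    """Count the records whose 'churned' field is True."""
    return sum(1 for r in records if r[2])


# Root question: is the phone at least two years old?
old_phone, new_phone = split(CUSTOMERS, lambda r: r[1] >= 2.0)
# Second question, asked only on the old-phone branch: short tenure?
short_tenure, long_tenure = split(old_phone, lambda r: r[0] < 2.5)

# No records are lost at either split, and churners are conserved.
assert len(CUSTOMERS) == len(old_phone) + len(new_phone)
assert len(old_phone) == len(short_tenure) + len(long_tenure)
assert churners(CUSTOMERS) == churners(old_phone) + churners(new_phone)


def predict_churn(years_with_provider, phone_age_years):
    """Read a prediction off the hand-built tree above."""
    if phone_age_years >= 2.0:
        return years_with_provider < 2.5
    return False


print(predict_churn(1.0, 3.0))
print(predict_churn(3.0, 0.5))
```

Because the tree is just a chain of readable questions, it is easy to see which customers it would flag for a retention offer.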

5.    Sequential Patterns

Sequential pattern analysis identifies trends or regularly recurring events over time. This data mining technique is often used to understand customer buying behavior. Many retailers use sequential patterns in their data to decide which products to display.

“With customer data you can identify that customers buy a particular collection of products together at different times of the year,” according to IBM. “In a shopping basket application, you can use this information to automatically suggest that certain items be added to a basket based on their frequency and past purchasing history.”
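A very simplified sketch of this idea counts how many customers bought one product and then, at some later point, another. The purchase histories and the function name frequent_sequences are hypothetical, and only two-item sequences are considered. Unlike association, the order of purchases matters here.

```python
from collections import Counter

# Hypothetical purchase histories, each listed in chronological order.
HISTORIES = [
    ["tent", "sleeping bag", "stove"],
    ["tent", "stove"],
    ["sleeping bag", "tent", "stove"],
    ["stove", "fuel"],
]


def frequent_sequences(histories, min_support=0.5):
    """Return {(earlier, later): support} for frequent ordered pairs."""
    counts = Counter()
    for history in histories:
        seen = set()  # count each ordered pair at most once per customer
        for i, earlier in enumerate(history):
            for later in history[i + 1:]:
                if earlier != later:
                    seen.add((earlier, later))
        counts.update(seen)
    n = len(histories)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


for (earlier, later), support in frequent_sequences(HISTORIES).items():
    print(f"{earlier} then {later}: {support:.0%} of customers")
```

A retailer could use a result like "tent then stove" to suggest a stove to customers who have recently bought a tent.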


Career Opportunities in Big Data

The growth of big data has created a number of emerging roles in data mining and analytics. Positions such as data analyst and data scientist are in demand and use several data mining techniques and principles.

The online master’s degree in analytics from Notre Dame of Maryland University prepares students for careers in big data. In a convenient and flexible learning environment, students gain multidisciplinary competencies in knowledge management technologies, qualitative processes and economic principles of change risk management. This program is offered fully online.