Data Mining Techniques

My last article gave an overview of data mining. In this article we will look at the different techniques and algorithms used in the data mining process.


We all know that necessity is the mother of invention. People's needs change over time, and technology keeps evolving around the world to meet them. The same job can also be done in a number of ways, and the choice depends on the requirements, the available time frame and the budget. Many data mining techniques and algorithms are available from a variety of technology vendors. The most common data mining techniques are described below.

1. Association

  • This is one of the best-known data mining techniques in use today.
  • It discovers a pattern based on the relationship between items in the same transaction or group.
  • For this reason it is also known as the relation technique.
  • We use association in Market Basket Analysis to identify sets of products that a customer frequently purchases together.
  • Retailers use this technique to research customers' buying habits.
  • Based on historical sales data, a retailer might find that customers always buy snacks/chips when they buy beer. It can then place beer and chips/snacks together to save customers time and increase sales.
  • Another example is the recommended/suggested videos of a similar nature that appear when you browse YouTube. Likewise, when you look at a cell phone on an e-commerce website, the site recommends a few more products that people mostly buy together with it, such as a cell phone cover and an external memory card.
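The market basket idea above can be sketched in a few lines of Python. This is only a minimal illustration, not a production algorithm such as Apriori: it looks at item pairs only, and the transactions, the function name `pair_rules` and both thresholds are assumptions made up for this example. Support is the share of transactions that contain both items, and confidence is the share of transactions with the first item that also contain the second.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"beer", "chips", "salsa"},
    {"beer", "chips"},
    {"beer", "diapers"},
    {"chips", "salsa"},
    {"beer", "chips", "diapers"},
]

def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) rules for item pairs."""
    n = len(transactions)
    # How many transactions contain each single item.
    item_counts = Counter(item for t in transactions for item in t)
    # How many transactions contain each unordered pair of items.
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Check the rule in both directions: a -> b and b -> a.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)

rules = pair_rules(transactions)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

With this made-up data, the rule "beer -> chips" passes both thresholds, which is the kind of finding a retailer could use when deciding what to shelve together.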

2. Classification

  • This is a classic data mining technique based on machine learning.
  • Classification is used to assign each item in a data set to one of a predefined set of classes or groups.
  • It uses mathematical techniques such as decision trees, linear programming, neural networks and statistics to classify the items.
  • Using this technique we can develop software that learns how to classify data items into groups.
  • Ex - Based on tenure and on the current salary paid by the organization compared with the market, it can identify which employees are likely to stay with the organization and which may leave in the near future.
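As a rough sketch of the employee example, the snippet below classifies a new employee by majority vote among the most similar past employees. It uses a k-nearest-neighbours vote, which is not one of the techniques listed above but is one of the simplest classifiers to show. The training data, the two features (tenure in years and salary divided by market salary) and the function name `classify` are all assumptions for illustration.

```python
import math
from collections import Counter

# Hypothetical labelled history:
# (tenure in years, salary / market salary) -> what the employee did.
training = [
    ((1.0, 0.80), "leave"),
    ((2.0, 0.85), "leave"),
    ((1.5, 0.90), "leave"),
    ((6.0, 1.05), "stay"),
    ((8.0, 1.10), "stay"),
    ((5.0, 1.00), "stay"),
]

def classify(point, training, k=3):
    """Label a point by majority vote among its k nearest training examples."""
    nearest = sorted(training, key=lambda row: math.dist(point, row[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(classify((1.2, 0.82), training))  # short tenure, paid below market
print(classify((7.0, 1.08), training))  # long tenure, paid above market
```

The features are not scaled here, so tenure dominates the distance. A real model would normally scale the features and be checked against held-out data.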

3. Clustering

  • Clustering is a data mining technique that creates meaningful clusters of objects with similar characteristics.
  • Unlike classification, where the classes are predefined, the clustering technique identifies the clusters itself from the data and then puts each object into the cluster whose members it most resembles.
  • Ex - In a library with millions of books, clustering can guide us on how to keep books on a similar topic/subject/course together on the shelves.
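One common way to form such clusters is k-means, sketched below under simplifying assumptions. Each book is reduced to two made-up numeric features, the number of clusters `k` is chosen by hand, and the starting centroids are simply the first `k` points. The names `k_means` and `books` are hypothetical.

```python
import math

def k_means(points, k, iterations=10):
    """Group points into k clusters and return (centroids, labels)."""
    centroids = [points[i] for i in range(k)]  # naive start: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels

# Hypothetical book features, invented for illustration only.
books = [(1.0, 1.0), (8.0, 9.0), (1.2, 0.8), (0.9, 1.1), (8.2, 9.1), (7.9, 8.8)]
centroids, labels = k_means(books, k=2)
print(labels)
```

Books that receive the same label would be shelved together. No labels are supplied in advance, which is what separates clustering from classification.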

4. Prediction

  • As the name suggests, this technique is used to discover the relationship between independent variables (data elements), or between dependent and independent variables.
  • After identifying the relationship between the data elements/variables, we can predict future values for some period of time based on history.
  • Ex - The prediction technique can be used in sales to predict future profit if we treat sales as the independent variable and profit as the dependent variable. Then, based on historical sales and profit data, we can fit a regression curve that predicts future profit values.
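The sales and profit example can be sketched with ordinary least squares, assuming the relationship is roughly a straight line. The figures below and the names `fit_line` and `predict_profit` are invented for illustration. Real data may need a different curve.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: sales (independent) and profit (dependent).
sales = [100.0, 150.0, 200.0, 250.0, 300.0]
profit = [12.0, 19.0, 24.0, 31.0, 36.0]

slope, intercept = fit_line(sales, profit)

def predict_profit(future_sales):
    """Predict profit for a sales figure using the fitted line."""
    return slope * future_sales + intercept

print(f"profit = {slope:.2f} * sales + {intercept:.2f}")
print(f"predicted profit at sales of 400: {predict_profit(400.0):.1f}")
```

A prediction far outside the range of the historical data, such as the last line here, should be treated with caution because the fitted relationship may not hold there.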

5. Sequential Patterns

  • It is a data mining technique that seeks to discover similar patterns, regular events or trends in transaction data over a business period.
  • A business can use this technique to identify sets of items that customers buy together at different times of the year. It can then use this information to recommend those items to customers, with better deals based on their past purchasing frequency.
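A very small sketch of the idea: for every customer, record which item was bought in an earlier basket than which other item, then keep the ordered pairs seen for enough customers. The purchase histories, the threshold and the name `sequential_pairs` are assumptions for illustration. Real sequential pattern miners handle longer sequences and time windows.

```python
from collections import Counter

# Hypothetical purchase histories: a time-ordered list of baskets per customer.
histories = {
    "c1": [{"phone"}, {"phone cover"}, {"memory card"}],
    "c2": [{"phone"}, {"memory card"}],
    "c3": [{"phone", "phone cover"}, {"memory card"}],
    "c4": [{"memory card"}, {"phone"}],
}

def sequential_pairs(histories, min_customers=2):
    """Count customers who bought `first` in an earlier basket than `second`."""
    counts = Counter()
    for baskets in histories.values():
        seen_pairs = set()
        for i, earlier in enumerate(baskets):
            for later in baskets[i + 1:]:
                for first in earlier:
                    for second in later:
                        if first != second:
                            seen_pairs.add((first, second))
        counts.update(seen_pairs)  # each customer counts once per pair
    return {pair: n for pair, n in counts.items() if n >= min_customers}

patterns = sequential_pairs(histories)
for (first, second), n in sorted(patterns.items()):
    print(f"{first} then {second}: {n} customers")
```

Unlike the association technique, order matters here: "phone then memory card" is a pattern in this data, while "memory card then phone" is not frequent enough to be kept.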

6. Decision Tree

  • The decision tree is one of the most widely used data mining techniques because it is very easy for business people to understand.
  • In this technique a tree-like structure is created that consists of a set of questions and answers (or conditions). Each answer leads to a further set of questions or conditions that help us narrow down the data, so that we can make a final decision based on it.
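The structure described above can be shown with a small hand-built tree. In practice the questions and thresholds are learned from data by a tree-building algorithm. Here they are simply assumed, as are the field names and the function name `decide`.

```python
# A hand-built tree: inner nodes ask a question, leaves hold a decision.
# The questions and thresholds are hypothetical, not learned from data.
tree = {
    "question": "tenure_years >= 3",
    "test": lambda e: e["tenure_years"] >= 3,
    "yes": {"decision": "likely to stay"},
    "no": {
        "question": "salary >= market_salary",
        "test": lambda e: e["salary"] >= e["market_salary"],
        "yes": {"decision": "likely to stay"},
        "no": {"decision": "at risk of leaving"},
    },
}

def decide(node, record):
    """Walk from the root to a leaf, following the answer at each question."""
    while "decision" not in node:
        node = node["yes"] if node["test"](record) else node["no"]
    return node["decision"]

employee = {"tenure_years": 1, "salary": 40000, "market_salary": 50000}
print(decide(tree, employee))
```

Each path from the root to a leaf reads as a plain rule, for example "tenure under 3 years and salary below market means at risk of leaving", which is why business people find the technique easy to follow.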

Looking forward to your recommendations and feedback on this article.
