Data mining can be defined as the process of extracting valid, authentic, and actionable information from large databases, using techniques from machine learning, artificial intelligence (AI), and statistics to derive the patterns and trends that exist in the data. These patterns and trends can be collected together and defined as a mining model.
Almost all industries now take advantage of this technique, including manufacturing, marketing, chemical, and aerospace, to increase their business efficiency. As a result, the need grew dramatically for a standard data mining process that is easy, comprehensive, reliable, and uniform across industries.
Data Mining Methodology
As a result, the Cross-Industry Standard Process for Data Mining (CRISP-DM) was conceived in 1996 to provide a uniform, standard process for data mining. CRISP-DM itself defines six phases; this article walks through the process in the 7 steps below.
1. Defining the problem – Understanding Business Process
Data mining is described as the process of discovering or extracting interesting and meaningful knowledge from large volumes of data stored in multiple data sources such as databases, file systems, and data warehouses. The knowledge extracted through data mining benefits business strategy, scientific and medical research, governments, and individuals. Data warehouse systems are designed to provide analytical reports that help business users make decisions, in the role of a Decision Support System (DSS).
Data Mining Architecture
My last article gave an overview of data mining. In this article we will look at the different techniques and algorithms used in the data mining process.
We all know that necessity is the mother of invention. People's needs change over time, and technology keeps evolving around the world to meet them. The same work can also be done in a number of ways, depending on the requirements, the available time frame, and the budget. Many data mining techniques and algorithms are available from a variety of technology vendors. The most common data mining techniques are described below.
- Association is one of the best-known data mining techniques in use today.
- With this technique, a pattern is discovered based on the relationship between items in the same transaction.
- For this reason, the technique is also known as the relation technique.
- We use association in Market Basket Analysis to identify sets of products that customers frequently purchase together.
- Retailers use this technique to research customers' buying habits.
- Based on historical sales data, a retailer might find that customers usually buy snacks or chips when they buy beer. The retailer can then place beer and chips together to save customers time and increase sales.
- Another example is the recommended videos of a similar nature that appear when you browse YouTube. Likewise, when you view a product on an e-commerce website, the site recommends a few more products that people mostly buy together with it, such as a phone cover and an external memory card when you are browsing cell phones.
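The market basket idea above can be sketched in a few lines of Python. This is a minimal illustration, not a production algorithm: it only counts item pairs, the `transactions` data is invented for the example, and the function name `pair_rules` and the `min_support` threshold are assumptions of this sketch. For each frequent pair it reports support (the share of baskets containing both items) and confidence (the share of baskets with the first item that also contain the second).

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets, invented for illustration only.
transactions = [
    {"beer", "chips", "salsa"},
    {"beer", "chips"},
    {"beer", "diapers"},
    {"chips", "salsa"},
    {"beer", "chips", "diapers"},
]


def pair_rules(transactions, min_support=0.3):
    """Return (antecedent, consequent, support, confidence) for frequent item pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        item_counts.update(basket)
        # Sort so that each pair is always counted under the same key.
        pair_counts.update(combinations(sorted(basket), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair is too rare to be interesting
        # Each frequent pair yields two directional rules.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    # Highest confidence first; names break ties for a stable order.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


for antecedent, consequent, support, confidence in pair_rules(transactions):
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

On this toy data, the rule beer -> chips has support 0.60 and confidence 0.75, which is the kind of signal a retailer might use to shelve the two products together. Real datasets usually call for a proper frequent-itemset algorithm such as Apriori rather than exhaustive pair counting.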
We are all well aware that in this modern era we are surrounded by technology: cell phones, laptops, computers, e-commerce portals, satellites, ATMs, banks, biometric machines, and so on. Each of them collects a huge amount of data at an enormous rate. Many relational database systems and servers have been designed to store this massive amount of data. To get the data into the database server, OLTP systems were designed. They keep the business process running smoothly by tracking each individual transaction that occurs in the system, such as a bank transaction, an ATM withdrawal, or an e-commerce purchase.
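To make the idea of tracking each individual transaction concrete, here is a small sketch using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` and `balances` helpers, and the amounts are all hypothetical, chosen only for illustration. A real OLTP system would run on a server-grade database, but the all-or-nothing behaviour shown here is the same idea.

```python
import sqlite3

# In-memory database with a hypothetical accounts table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts (id, balance) VALUES (?, ?)", [(1, 100), (2, 50)]
)
conn.commit()


def transfer(conn, source, target, amount):
    """Move amount between accounts; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; nothing was saved.
        return False


def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


print(transfer(conn, 1, 2, 30), balances(conn))
print(transfer(conn, 1, 2, 500), balances(conn))
```

The first transfer succeeds and leaves the balances at 70 and 80. The second would overdraw account 1, so the credit to account 2 that had already been applied is rolled back and the balances stay unchanged.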
An OLTP system is very good at running the business process smoothly and efficiently, because it is designed to track each and every transaction as a single unit of work. We also know that an organization's historical data is worth more than gold, and that its value keeps growing with time. We can analyse hist…