We are all aware that in this modern era we are surrounded by technology and devices such as cellphones, laptops, computers, e-commerce portals, satellites, ATMs, banking systems and biometric machines, and each of them collects a huge amount of data at an enormous rate. Many relational database systems/servers have been designed to store this massive amount of data. To get the data into the database server and keep business processes running smoothly, OLTP (Online Transaction Processing) systems were designed; they keep track of each individual transaction occurring in the system, such as a bank transaction, an ATM withdrawal or an e-commerce purchase.
An OLTP system is very good at running business processes smoothly and efficiently, because it is designed to record every transaction as a single unit of work. We also know that an organization's historical data is more valuable than gold, and its value keeps growing with time. We can analyse historical data to identify past business trends and use them to make future business decisions. To deal with historical data, a different kind of system came into the picture, known as the OLAP (Online Analytical Processing) system or data warehouse system. Data mining is the technology that turns this stored history into knowledge.
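To make the difference between the two kinds of workload concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `transactions` table, its columns and the sample rows are hypothetical and exist only for illustration; a real OLAP system would normally run on a separate data warehouse rather than on the transactional database itself.

```python
import sqlite3

# Hypothetical in-memory table standing in for an OLTP store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, account TEXT, kind TEXT, amount REAL, day TEXT)"
)

def record_transaction(account, kind, amount, day):
    """OLTP-style write: one transaction stored as a single unit of work."""
    with conn:  # commits on success, rolls back if an error is raised
        conn.execute(
            "INSERT INTO transactions (account, kind, amount, day) "
            "VALUES (?, ?, ?, ?)",
            (account, kind, amount, day),
        )

record_transaction("A-100", "atm_withdrawal", 200.0, "2024-01-05")
record_transaction("A-100", "online_purchase", 50.0, "2024-01-20")
record_transaction("B-200", "atm_withdrawal", 100.0, "2024-02-03")

def monthly_totals():
    """OLAP-style read: aggregate historical rows to expose a trend."""
    rows = conn.execute(
        "SELECT substr(day, 1, 7), SUM(amount) FROM transactions "
        "GROUP BY substr(day, 1, 7) ORDER BY substr(day, 1, 7)"
    )
    return rows.fetchall()

print(monthly_totals())  # [('2024-01', 250.0), ('2024-02', 100.0)]
```

The write path touches one row at a time, while the read path scans the whole history and summarises it, which is the kind of query OLAP systems are built for.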
- Data mining is often referred to as KDD (Knowledge Discovery in Databases), although strictly speaking it is the central step of the KDD process.
- Data mining is the process of extracting implicit, previously unknown, potentially useful, comprehensible and actionable information from large databases and using it to make crucial business decisions.
- Data mining can also be defined as a computer-aided process that digs through and analyses enormous sets of data and then extracts knowledge or information from them.
- Data mining automates the detection of relevant patterns in data/databases.
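As a small illustration of what "detecting patterns automatically" can mean, the sketch below counts which pairs of items appear together most often in a set of shopping baskets. The baskets, the function name `frequent_pairs` and the `min_support` threshold are all made up for this example; real data mining tools use far more scalable algorithms, but the idea is the same.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets; each set is one customer's purchase.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "milk", "jam"},
]

def frequent_pairs(baskets, min_support):
    """Return item pairs that occur together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

print(frequent_pairs(baskets, min_support=3))  # {('bread', 'milk'): 3}
```

Nobody told the program that bread and milk go together; the pattern was hidden in the data and surfaced by counting.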
Data Mining Process Steps:
The steps below are followed to identify hidden patterns in data during the data mining process:
- Data Integration: At the very beginning, we collect data from a variety of sources (CRM, OLTP, Excel, MS Access, Oracle, SQL Server, CSV etc.) into a single data source, called the target data/database, using an integration tool.
- Data Selection: In this step we focus only on the data sets required to test our hypothesis/research question/assumption, which means that only meaningful data is selected for further processing.
- Data Cleaning: Data imported from the sources may be in a different format from the target database, and may be incomplete or inconsistent, so we need to clean it using a data cleaning algorithm.
- Data Transformation: The data is then preprocessed and transformed into a standard format.
- Data Mining: We then analyse the data, decide which type of data mining algorithm is suitable for it, and apply the algorithm(s) to identify the hidden patterns.
- Pattern Evaluation: The patterns identified in the data are then interpreted and evaluated to gain knowledge from them.
- Knowledge Presentation: This is the goal of data mining, where the knowledge gained from the process is presented and used to make crucial business decisions for the benefit of the organization.
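The steps above can be sketched as a chain of small Python functions, one per step. Everything here is a toy assumption made for illustration: the two sources (`crm_rows`, `oltp_rows`), the field names, the "average spend per city" pattern and the threshold are hypothetical, and a real project would use dedicated ETL and mining tools rather than plain lists and dictionaries.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records from two sources with inconsistent formats.
crm_rows = [
    {"customer": "alice", "city": "Pune ", "spend": "120.50"},
    {"customer": "bob", "city": "delhi", "spend": "80"},
]
oltp_rows = [
    {"customer": "carol", "city": "PUNE", "spend": "200"},
    {"customer": "dave", "city": "Delhi", "spend": None},  # missing value
]

def integrate(*sources):
    """Data Integration: merge all sources into one target data set."""
    return [row for source in sources for row in source]

def select(rows, fields):
    """Data Selection: keep only the fields needed for the question."""
    return [{f: row[f] for f in fields} for row in rows]

def clean(rows):
    """Data Cleaning: drop rows that contain missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]

def transform(rows):
    """Data Transformation: bring values into one standard format."""
    return [
        {"city": r["city"].strip().title(), "spend": float(r["spend"])}
        for r in rows
    ]

def mine(rows):
    """Data Mining: a toy 'pattern', the average spend per city."""
    by_city = defaultdict(list)
    for r in rows:
        by_city[r["city"]].append(r["spend"])
    return {city: mean(values) for city, values in by_city.items()}

def evaluate(patterns, threshold):
    """Pattern Evaluation: keep only patterns considered interesting."""
    return {c: avg for c, avg in patterns.items() if avg >= threshold}

def present(patterns):
    """Knowledge Presentation: turn results into readable statements."""
    return [f"{c}: average spend {avg:.2f}" for c, avg in sorted(patterns.items())]

rows = integrate(crm_rows, oltp_rows)
rows = transform(clean(select(rows, ["city", "spend"])))
report = present(evaluate(mine(rows), threshold=100))
print(report)  # ['Pune: average spend 160.25']
```

Each function maps to exactly one bullet, so swapping in a real cleaning routine or a real mining algorithm only changes one link in the chain.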
We have now seen that data mining is the central part of knowledge discovery in databases and a very useful technology that identifies hidden patterns and useful information in raw data, which can then be used to make business decisions. In the next article we will look at the various data mining algorithms in detail.
Looking forward to your feedback and recommendations.