What is knowledge discovery and data mining?

Data mining is the activity in which machine learning techniques are applied to find patterns in the relationships between data elements. It is one step in the knowledge discovery process [1], which seeks to gain insight into those relationships.

A general process model helps in understanding a knowledge discovery endeavour. Such a model consists of the set of processing steps needed to complete a knowledge discovery and data mining (KDDM) project [1]. Various process models have been proposed. A popular KDDM process model is depicted in the figure below. It consists of six steps and several feedback loops.

KDDM process model

The process unfolds in intentional cycles as follows:

1. In the first step of the KDDM process, a general understanding of the application domain and the relevant prior knowledge is developed. During this step the data mining problem and the objectives of the knowledge discovery and data mining endeavour are defined.
2. The second step involves the identification and acquisition of appropriate data sources, data exploration, data sampling, and the selection of appropriate, relevant and interesting attributes.
3. Data preparation, the third step, involves preprocessing the data set into the correct structure and form for use with the selected machine learning technique. During this step the appropriate machine learning technique, or combination of techniques, is identified in line with the data mining objectives set out during step one.
4. Step four, the data mining step, involves the application of the selected machine learning techniques to the prepared data.
5. During the fifth step, the usefulness of the discovered patterns is evaluated in the context of the data mining objectives, and any alternative actions needed are identified.
6. In the final step, the useful knowledge learned is deployed for practical use.
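The six steps above can be sketched as a small pipeline. The following is a minimal illustration only, not part of the process model described in [1]: the toy records, the function names, the 0.8 accuracy objective and the single-attribute threshold rule standing in for a "machine learning technique" are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical toy data: each record describes one student.
RAW = [
    {"hours": 1.0, "attendance": 0.9, "passed": 0},
    {"hours": 2.0, "attendance": 0.4, "passed": 0},
    {"hours": 3.0, "attendance": 0.8, "passed": 0},
    {"hours": None, "attendance": 0.7, "passed": 1},  # incomplete record
    {"hours": 6.0, "attendance": 0.5, "passed": 1},
    {"hours": 7.0, "attendance": 0.9, "passed": 1},
    {"hours": 8.0, "attendance": 0.6, "passed": 1},
]


def understand_domain():
    # Step 1: define the data mining problem and its objectives (assumed values).
    return {"target": "passed", "min_accuracy": 0.8}


def understand_data(records, target):
    # Step 2: select candidate attributes (here: every field except the target).
    return [name for name in records[0] if name != target]


def prepare_data(records, attributes, target):
    # Step 3: drop incomplete records and build (features, label) pairs.
    fields = attributes + [target]
    complete = [r for r in records if all(r.get(f) is not None for f in fields)]
    return [([r[a] for a in attributes], r[target]) for r in complete]


def mine(data, n_attributes):
    # Step 4: try one threshold rule per attribute and keep the most accurate.
    best = None
    for i in range(n_attributes):
        threshold = mean(x[i] for x, _ in data)
        accuracy = mean(
            1.0 if (x[i] > threshold) == bool(y) else 0.0 for x, y in data
        )
        if best is None or accuracy > best["accuracy"]:
            best = {"attribute": i, "threshold": threshold, "accuracy": accuracy}
    return best


def evaluate(model, objectives):
    # Step 5: judge the discovered pattern against the step-one objectives.
    return model["accuracy"] >= objectives["min_accuracy"]


def deploy(model, attributes):
    # Step 6: package the learned rule as a function for practical use.
    name = attributes[model["attribute"]]
    return lambda record: int(record[name] > model["threshold"])


def run_kddm(records):
    objectives = understand_domain()
    attributes = understand_data(records, objectives["target"])
    data = prepare_data(records, attributes, objectives["target"])
    model = mine(data, len(attributes))
    if not evaluate(model, objectives):
        # In a real project this is where a feedback loop would return
        # to an earlier step; the sketch simply stops.
        return None
    return deploy(model, attributes)


predict = run_kddm(RAW)
```

If the evaluation succeeds, `predict` is a function that takes a record such as `{"hours": 5.0, "attendance": 0.1}` and returns the predicted class. A real project would replace each function with far more involved work and would follow the feedback loops rather than stopping at a failed evaluation.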

Footnotes

  1. Kurgan, L. A., Musilek, P., 2006. A survey of knowledge discovery and data mining process models. Knowledge Engineering Review 21 (1), 1-24.
