The application of genetic algorithms to knowledge discovery and data mining

Abstract

Knowledge discovery is a new discipline that applies machine learning techniques to large real-world databases in order to extract knowledge from the data. This knowledge is often expressed as rules that model the data. Genetic algorithms (GAs) are a method for evolving high-quality solutions from a potentially huge search space of candidate solutions; they use a simulated process of natural selection rather than a simulated reasoning process. Because data mining is an inductive problem, genetic algorithms are uniquely suited to it. This paper describes two GA-based data mining systems: GA-MINER, a pure GA technique, and DOGMA, a hybrid technique that uses GAs to improve on rules generated by another classifier. It also discusses the difficulty of formally analyzing a genetic algorithm in order to compare it with more conventional methods of solving the same problem.
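To make the idea of evolving rules by simulated natural selection concrete, the following is a minimal sketch of a generic genetic algorithm that searches for a single classification rule over a toy dataset. It is not the method used by either system named in the abstract. The dataset, the rule encoding (one gene per attribute, with a wildcard value), the fitness measure (plain accuracy), and all function names and parameter values are assumptions chosen for illustration.

```python
import random

# Hypothetical toy data: each record is (attributes, label), with three binary
# attributes. The hidden concept is "attribute 0 is 1 AND attribute 2 is 0".
DATA = [((a, b, c), int(a == 1 and c == 0))
        for a in (0, 1) for b in (0, 1) for c in (0, 1)]

WILDCARD = 2  # gene value meaning "this attribute is not tested"


def matches(rule, attrs):
    """A rule is a conjunction: every non-wildcard gene must equal the attribute."""
    return all(g == WILDCARD or g == x for g, x in zip(rule, attrs))


def fitness(rule):
    """Fraction of records where 'rule matches' agrees with the label."""
    correct = sum(int(matches(rule, attrs)) == label for attrs, label in DATA)
    return correct / len(DATA)


def tournament(population, rng, k=3):
    """Selection: return the fittest of k randomly drawn individuals."""
    return max(rng.sample(population, k), key=fitness)


def crossover(p1, p2, rng):
    """One-point crossover of two parent rules."""
    point = rng.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:]


def mutate(rule, rng, rate=0.1):
    """Replace each gene with a random value with probability `rate`."""
    return tuple(rng.choice((0, 1, WILDCARD)) if rng.random() < rate else g
                 for g in rule)


def evolve(generations=30, pop_size=20, seed=0):
    """Run the GA and return the fittest rule in the final population."""
    rng = random.Random(seed)
    n_attrs = len(DATA[0][0])
    population = [tuple(rng.choice((0, 1, WILDCARD)) for _ in range(n_attrs))
                  for _ in range(pop_size)]
    for _ in range(generations):
        elite = max(population, key=fitness)  # keep the best rule (elitism)
        children = [mutate(crossover(tournament(population, rng),
                                     tournament(population, rng), rng), rng)
                    for _ in range(pop_size - 1)]
        population = [elite] + children
    return max(population, key=fitness)


best = evolve()
print(best, fitness(best))
```

In this encoding the rule `(1, WILDCARD, 0)` reproduces the hidden concept exactly, so its fitness is 1.0. A hybrid approach in the spirit described above could seed the initial population with rules produced by another classifier instead of random ones; that variation is not shown here.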
