Clustering groups similar objects together and is one of the fundamental problems in data mining. It is used in many application areas, including image segmentation, marketing, fraud detection, forecasting and text analysis. Clustering is often the first step in data analysis: after groups of similar objects are identified, other methods are applied and a separate model is built for each group.
K-modes clustering partitions a set of objects into a predefined number of clusters k. The method consists of four steps:
1. Initial centers are selected for the k clusters.
2. Each object is assigned to the cluster whose center is nearest to it. The distance between an object and a cluster center is the number of attributes whose values differ between the two.
3. If no object has moved from one cluster to another, the method terminates. Otherwise, the next step is taken.
4. The cluster centers are recalculated according to the current partition. Return to step 2.
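The four steps above can be sketched in Python as follows. This is a minimal illustration, not the library's implementation. Two details are assumptions that the description does not specify: the initial centers are taken to be the first k distinct objects, and each center is recalculated as the most frequent value of every attribute among the cluster members (the usual K-modes convention). The function names hamming, mode_center and k_modes, and the max_iter safety limit, are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Distance: number of attributes whose values differ between a and b."""
    return sum(x != y for x, y in zip(a, b))


def mode_center(members):
    """Assumed recalculation rule: per-attribute most frequent value."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*members))


def k_modes(objects, k, max_iter=100):
    """Cluster tuples of categorical values; returns (labels, centers)."""
    # Step 1: initial centers. Assumption: the first k distinct objects.
    centers = list(dict.fromkeys(objects))[:k]
    if len(centers) < k:
        raise ValueError("fewer than k distinct objects")

    labels = None
    for _ in range(max_iter):  # max_iter is a safety limit, not part of the method
        # Step 2: assign each object to the nearest center (ties go to the
        # cluster with the lowest index).
        new_labels = [
            min(range(k), key=lambda j: hamming(obj, centers[j]))
            for obj in objects
        ]
        # Step 3: terminate when no object has changed its cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: recalculate centers from the current partition.
        for j in range(k):
            members = [o for o, lab in zip(objects, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = mode_center(members)
    return labels, centers


# Hypothetical sample data: (color, size, shape)
objects = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
    ("red", "medium", "round"),
    ("blue", "large", "oval"),
]
labels, centers = k_modes(objects, k=2)
```

The result depends on the initial centers and on how ties are broken, so a different initialization may produce a different partition of the same data.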
See also:
Library of Methods and Models | Detecting Categories | ISmKmeansClusterAnalysis