K-Means


This operator clusters unlabeled data and belongs to unsupervised learning. It is mainly used to group samples into clusters and to predict the cluster of each sample.

How-To-Use:

When the input features are on different scales, the input data needs to be standardized first. After configuring K-Means, you can view the output cluster labels, the number of clusters, the cluster centers, and the evaluation indicators of the clustering result (Calinski-Harabasz score and Davies-Bouldin index). Connect a table view or an image view to the node to display the clustering results.
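As a rough illustration of the two evaluation indicators, the following pure-Python sketch computes the Calinski-Harabasz score and the Davies-Bouldin index from their standard textbook definitions. The function names and the toy data are assumptions made for this example; how the node computes these values internally is not documented here and may differ in detail.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def group_by_label(points, labels):
    """Map each cluster label to the list of points carrying it."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters

def calinski_harabasz(points, labels):
    """Between-cluster over within-cluster dispersion (higher is better)."""
    clusters = group_by_label(points, labels)
    n, k = len(points), len(clusters)
    overall = centroid(points)
    between = sum(len(m) * dist(centroid(m), overall) ** 2 for m in clusters.values())
    within = sum(dist(p, centroid(m)) ** 2 for m in clusters.values() for p in m)
    return (between / (k - 1)) / (within / (n - k))

def davies_bouldin(points, labels):
    """Mean worst-case similarity of each cluster to the others (lower is better)."""
    clusters = list(group_by_label(points, labels).values())
    centers = [centroid(m) for m in clusters]
    scatter = [sum(dist(p, c) for p in m) / len(m) for m, c in zip(clusters, centers)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max((scatter[i] + scatter[j]) / dist(centers[i], centers[j])
                     for j in range(k) if j != i)
    return total / k

# Toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
good = [0, 0, 0, 1, 1, 1]   # labels that match the two groups
bad = [0, 1, 0, 1, 0, 1]    # labels that mix the two groups
ch_good, ch_bad = calinski_harabasz(points, good), calinski_harabasz(points, bad)
db_good, db_bad = davies_bouldin(points, good), davies_bouldin(points, bad)
```

On this toy data the labeling that matches the two groups gets the higher Calinski-Harabasz score and the lower Davies-Bouldin index, which is the direction to look for when comparing runs with different numbers of clusters.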

Precautions:

Collinear variables should be filtered out of the clustering model's input data; you can use the [Analysis of Correlation] node for this.
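To make the collinearity check concrete, here is a minimal sketch that flags pairs of columns whose Pearson correlation is close to 1 or -1. It does not show how the [Analysis of Correlation] node works; the function names, the sample columns, and the 0.9 threshold are all assumptions chosen for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def collinear_pairs(columns, threshold=0.9):
    """Return the column-name pairs whose |correlation| reaches the threshold."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(columns[a], columns[b])) >= threshold:
                pairs.append((a, b))
    return pairs

# Hypothetical input: height appears twice in different units.
columns = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [30.0, 22.0, 45.0, 28.0, 35.0],
}
flagged = collinear_pairs(columns)
```

Here only the two height columns are flagged, so one of them would be dropped before clustering; otherwise the same information would be counted twice in every distance calculation.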

The input data of the clustering model should be standardized; you can use the [Normalization] node for this.
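The sketch below shows why standardization matters for a distance-based method. It uses z-score standardization, which is one common choice; the methods actually offered by the [Normalization] node are not described here, and the column names and values are made up for the example.

```python
import math

def standardize(column):
    """Rescale a numeric column to mean 0 and (population) standard deviation 1."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    if std == 0:
        # A constant column carries no information for clustering.
        return [0.0] * n
    return [(v - mean) / std for v in column]

# Two features on very different scales.
income = [30000.0, 45000.0, 60000.0, 75000.0]
age = [25.0, 35.0, 45.0, 55.0]
z_income = standardize(income)
z_age = standardize(age)
```

Before standardization, differences in income would dominate any Euclidean distance; afterwards both columns contribute on the same scale.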


 

Configuration

After adding the K-Means node to the experiment, you can configure it on the "Configuration" page on the right.

[Reserved digits of performance index] The rounding precision applied to the evaluation indicators. When the precision is positive, that number of digits after the decimal point is retained; when it is negative, rounding is applied to that number of digits before the decimal point.
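The effect of a positive versus a negative precision can be seen with Python's built-in round, assuming the node rounds in a comparable way (this is an assumption; the node's exact rounding rule is not documented here).

```python
score = 1234.5678

# Positive precision: keep two digits after the decimal point.
kept_after_point = round(score, 2)    # 1234.57

# Negative precision: round away the last two digits before the decimal point.
kept_before_point = round(score, -2)  # 1200.0
```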

[Number of classification clusters] The number of clusters to form. Data requirements: an integer in the range [1, ∞).

[Initialization method] k-means++: this strategy selects initial mean vectors that are far away from each other, which usually gives better results. random: randomly select n_clusters samples from the data set as the initial mean vectors. Alternatively, provide an array of shape (n_clusters, n_features); this array is then used as the initial mean vectors.
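The difference between the two strategies can be sketched as follows. This is a simplified version of the k-means++ seeding idea (each new center is drawn with probability proportional to its squared distance from the nearest center already chosen), not the node's actual implementation; the function names and the seed argument are assumptions for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_pp_init(points, n_clusters, seed=0):
    """k-means++ style seeding: later centers favor points far from earlier ones."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < n_clusters:
        # Weight of each point = squared distance to its nearest chosen center,
        # so already-chosen points have weight 0 and far points are favored.
        # (Assumes the data has at least n_clusters distinct points.)
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

def random_init(points, n_clusters, seed=0):
    """'random' strategy: pick n_clusters distinct samples uniformly."""
    return random.Random(seed).sample(points, n_clusters)

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
pp_centers = kmeans_pp_init(points, n_clusters=2)
rand_centers = random_init(points, n_clusters=2)
```

With the random strategy both starting centers can easily land in the same group of points; the distance weighting in the k-means++ style makes that much less likely.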

[Specify the number of runs of K-means algorithm] The number of times the K-Means algorithm is run. Each run starts from a different set of initial mean vectors, and the best clustering across all runs is kept as the final result. Data requirements: an integer of at least 1, range [1, ∞).

[The maximum number of iterations] An integer that specifies the maximum number of iterations in a single run of the K-Means algorithm. The total number of iterations is therefore at most max_iter * n_init. Data requirements: an integer greater than 0, range [1, ∞).
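How the number of runs and the maximum number of iterations interact can be sketched with a plain implementation of the classic assign-and-update loop. This is a minimal illustration under stated assumptions, not the node's code: the function names are hypothetical, initialization is simple random sampling, and "best" is taken to mean the lowest sum of squared distances to the assigned centers.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]

def single_run(points, n_clusters, max_iter, rng):
    """One run: at most max_iter assign/update rounds from a random start."""
    centers = rng.sample(points, n_clusters)
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster happens to be empty.
            new_centers.append(mean(members) if members else centers[j])
        if new_centers == centers:
            break  # converged before reaching max_iter
        centers = new_centers
    labels = assign(points, centers)
    inertia = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return inertia, centers, labels

def kmeans(points, n_clusters, n_init=10, max_iter=300, seed=0):
    """Repeat single_run n_init times and keep the run with the lowest inertia."""
    rng = random.Random(seed)
    runs = [single_run(points, n_clusters, max_iter, rng) for _ in range(n_init)]
    return min(runs, key=lambda run: run[0])

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
best_inertia, best_centers, best_labels = kmeans(points, n_clusters=2)
```

Each run stops early once the centers no longer change, so max_iter * n_init is an upper bound on the work rather than the typical cost.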

[Random Seed] The seed for random number generation, such as the random choice of initial mean vectors; using the same seed makes results repeatable. Data requirements: an integer of at least 1.
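The role of a seed can be illustrated with Python's standard random module. The helper below is hypothetical and only mimics the random choice of initial samples; it is not the node's own generator.

```python
import random

def pick_initial_indices(n_samples, n_clusters, seed):
    """Choose n_clusters distinct sample indices using a seeded generator."""
    return random.Random(seed).sample(range(n_samples), n_clusters)

# The same seed always reproduces the same choice of initial samples.
first = pick_initial_indices(100, 3, seed=42)
second = pick_initial_indices(100, 3, seed=42)
```

Fixing the seed is useful when comparing parameter settings, because differences in the result then come from the settings rather than from a different random start.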

[Pre-calculated distance] This parameter specifies whether to calculate the distance between samples in advance. ‘True’: Calculated in advance. ‘False’: Do not calculate in advance.

[Algorithm] auto: automatically select the algorithm, using full for sparse data and elkan for dense data. full: use the classic EM-style algorithm. elkan: use the "elkan" variant, which speeds up the algorithm by using the triangle inequality but does not support sparse data.
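The pruning idea behind the elkan variant can be sketched as follows. For a point x with current nearest center c, any other center c' with d(c, c') >= 2 * d(x, c) cannot be closer to x, so its distance never needs to be computed. This is only the core inequality; the full algorithm also keeps distance bounds between iterations, which is omitted here, and the function names are assumptions for illustration.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_center_pruned(point, centers):
    """Return (index of nearest center, number of point-to-center distances computed)."""
    k = len(centers)
    # Center-to-center distances are computed once and shared by all points.
    center_dist = [[dist(centers[i], centers[j]) for j in range(k)] for i in range(k)]
    best, best_d, computed = 0, dist(point, centers[0]), 1
    for j in range(1, k):
        # Triangle inequality: d(x, c_j) >= d(c_best, c_j) - d(x, c_best),
        # so if d(c_best, c_j) >= 2 * d(x, c_best), c_j cannot be closer.
        if center_dist[best][j] >= 2 * best_d:
            continue
        d = dist(point, centers[j])
        computed += 1
        if d < best_d:
            best, best_d = j, d
    return best, computed

centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
index, computed = nearest_center_pruned((1.0, 1.0), centers)
```

For the sample point only one of the three point-to-center distances is evaluated, and the result matches a brute-force search over all centers.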

[Independent variable] The feature fields used by the model; multiple fields can be selected.


Right-click menu of the K-Means node:

Run K-Means Node

Runs the node: the data is passed to DM-Engine for calculation, and the output result is returned.

 

Reset K-Means Node

Resets a node that has already been run: the returned result is deleted, and the node status changes to not run.

 

Rename K-Means Node

In the right-click menu of the K-Means node, select "Rename" to rename the node.

 

Refresh K-Means Node

In the right-click menu of the K-Means node, select "Refresh" to synchronize the latest data or parameter information.

 

Save as Composite Node

In the right-click menu of the K-Means node, select "Save as Composite Node". The selected node is saved as a composite node so that it can be reused, and the parameters of the saved node are the same as those of the original node.

 

Cut K-Means Node

In the right-click menu of the K-Means node, select "Cut" to cut the node.

 

Copy K-Means Node

In the right-click menu of the K-Means node, select "Copy" to copy the node.

 

Delete K-Means Node

In the right-click menu of the K-Means node, select "Delete" or press the Delete key on the keyboard to delete the node and its input and output connections.