MLLIB.BICLUSTER(imputer, n_clusters, seed, columns)
Bisecting K-means is a hierarchical clustering algorithm that uses a divisive (top-down) approach: all observations start in one cluster, and splits are performed recursively moving down the hierarchy. Each split applies regular K-means with K = 2 to the cluster with the highest SSE (sum of squared errors). The K-means step that splits a cluster runs for 20 iterations.
Bisecting K-means is often much faster than regular K-means, but it generally produces a different clustering.
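The procedure described above can be sketched in plain Python. This is a minimal illustration, not the implementation behind MLLIB.BICLUSTER: the function names (bisecting_kmeans, split_in_two and the helpers) are hypothetical, initial centers are assumed to be drawn at random from the cluster being split, and the loop simply stops early if a cluster cannot be split further.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(rows, idx):
    """Mean point of the rows whose indices are listed in idx."""
    n = len(idx)
    return tuple(sum(rows[i][d] for i in idx) / n for d in range(len(rows[0])))


def cluster_sse(rows, idx):
    """Sum of squared errors of a cluster around its centroid."""
    c = centroid(rows, idx)
    return sum(sq_dist(rows[i], c) for i in idx)


def split_in_two(rows, idx, rng, iterations=20):
    """Regular K-means with K = 2 on one cluster (assumed random initial centers)."""
    centers = [rows[i] for i in rng.sample(idx, 2)]
    left, right = [], []
    for _ in range(iterations):
        left, right = [], []
        for i in idx:
            nearer_first = sq_dist(rows[i], centers[0]) <= sq_dist(rows[i], centers[1])
            (left if nearer_first else right).append(i)
        if not left or not right:
            break  # degenerate split, e.g. all points identical
        centers = [centroid(rows, left), centroid(rows, right)]
    return left, right


def bisecting_kmeans(rows, n_clusters, seed):
    """Return one integer label (starting at 0) per row."""
    rng = random.Random(seed)
    clusters = [list(range(len(rows)))]  # all observations start in one cluster
    while len(clusters) < n_clusters:
        splittable = [c for c in clusters if len(c) > 1]
        if not splittable:
            break
        # Split the cluster with the highest SSE.
        worst = max(splittable, key=lambda c: cluster_sse(rows, c))
        left, right = split_in_two(rows, worst, rng)
        if not left or not right:
            break
        clusters.remove(worst)
        clusters += [left, right]
    labels = [0] * len(rows)
    for label, idx in enumerate(clusters):
        for i in idx:
            labels[i] = label
    return labels


# Three well-separated groups of made-up points.
rows = [(0, 0), (0, 1), (1, 0),
        (10, 10), (10, 11), (11, 10),
        (50, 50), (50, 51), (51, 50)]
labels = bisecting_kmeans(rows, n_clusters=3, seed=555)
print(labels)
```

Because every split is a two-way K-means on a single cluster, each step works on fewer points than a full K-means pass over all data, which is one reason the method is often faster.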
Parameters

imputer – strategy for dealing with null values:

0 – Replace null values with '0'

1 – Assign null values to a designated '1' cluster


n_clusters – Number of clusters the algorithm should find, integer.

seed – Random seed, integer.

columns – Dataset columns or custom calculations.
Example: MLLIB.BICLUSTER(0, 3, 555, sum([Gross Sales]), sum([No of customers])) used as a calculation for the Color field of the Scatterplot visualization.
Input data
 Size of input data is not limited.
 Without missing values.
 Character variables are transformed to numeric with label encoding.
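As a rough illustration of the two preprocessing steps mentioned on this page, the sketch below shows label encoding of a character column and imputer strategy 0 (replacing nulls with 0) on a numeric column. The helper names label_encode and impute_zero are hypothetical, and the assignment of codes in order of first appearance is an assumption; the actual encoding order used by the function is not documented here.

```python
def label_encode(values):
    """Map each distinct non-null value to an integer code.

    Codes are assigned in order of first appearance (an assumption).
    """
    codes = {}
    encoded = []
    for v in values:
        if v is None:
            encoded.append(None)  # nulls are left for the imputer
        else:
            encoded.append(codes.setdefault(v, len(codes)))
    return encoded


def impute_zero(values):
    """Imputer strategy 0: replace null values with 0."""
    return [0 if v is None else v for v in values]


# Made-up example columns.
region = ["north", "south", "north", "east"]
sales = [12.5, None, 7.0, None]

print(label_encode(region))
print(impute_zero(sales))
```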
Result
 A column of integer values starting at 0, where each value is the cluster assigned to the corresponding record (row) by the bisecting K-means algorithm.
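One simple way to inspect such a result column is to count how many records fall into each cluster. The sketch below uses a made-up list of labels, not real output of the function.

```python
from collections import Counter

# Hypothetical result column: one cluster label per record (row).
labels = [0, 2, 1, 0, 0, 2, 1, 1, 0]

# Count the number of rows assigned to each cluster.
sizes = Counter(labels)
for cluster in sorted(sizes):
    print(f"cluster {cluster}: {sizes[cluster]} rows")
```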
Key usage points
 Less sensitive to initialization than regular K-means.
 Tends to produce clusters of similar sizes, whereas regular K-means often produces empty clusters when K is large.
 Lower computation time than regular K-means.
 Use it when you want to avoid convergence to a local minimum.
Drawbacks
 If the number of clusters is not chosen properly, the results can deviate substantially from the ideal clustering.
For the whole list of algorithms, see Data science builtin algorithms.