DCPY.AUTOCLUST(max_clusters, columns)
Automated clustering uses the K-means algorithm to cluster the data. It starts by building a model with two clusters and continues up to the specified maximum number of clusters, evaluating clustering quality after each model with the Calinski-Harabasz index. If the index value for a model is lower than for the preceding model, the search stops and the preceding model is used as the optimal clustering (the first local maximum of the Calinski-Harabasz index).
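The search described above can be sketched in a few lines of Python. This is an illustration only, not the DCPY implementation: `choose_n_clusters` and `score_for_k` are hypothetical names, and `score_for_k(k)` stands for "fit a K-means model with k clusters and return its index value". In practice that callable could wrap, for example, scikit-learn's `KMeans` and `calinski_harabasz_score`, which is an assumption about tooling rather than something this page states. The scores in the example are made up.

```python
def choose_n_clusters(score_for_k, max_clusters=10):
    """Return the cluster count at the first local maximum of the score.

    score_for_k is a hypothetical callable: given k, it fits a model with
    k clusters and returns its quality index (higher is better).
    """
    if max_clusters < 2:
        raise ValueError("max_clusters must be at least 2")

    best_k = 2
    best_score = score_for_k(2)
    for k in range(3, max_clusters + 1):
        score = score_for_k(k)
        if score < best_score:
            # The index dropped: stop and keep the preceding model.
            break
        best_k, best_score = k, score
    return best_k


# Made-up index values for k = 2..5, for illustration only.
example_scores = {2: 110.0, 3: 250.0, 4: 240.0, 5: 400.0}
chosen = choose_n_clusters(example_scores.__getitem__, max_clusters=5)
print(chosen)  # 3: the index drops at k=4, so k=5 is never evaluated
```

If the index never drops, the loop runs to the end and the model with `max_clusters` clusters is returned.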
Parameters

max_clusters – The maximum number of allowed clusters, integer (default 10).

columns – Columns to be used for clustering.
Example: DCPY.AUTOCLUST(10, sum([Gross Sales]), sum([No of customers])) used as a calculation for the Color field of the Scatterplot visualization.
Input data
 Numeric variables are automatically scaled to zero mean and unit variance.
 Character variables are transformed to numeric values using one-hot encoding.
 Dates are treated as character variables, so they are also one-hot encoded.
 The size of the input data is not limited, but many categories in character or date variables rapidly increase the dimensionality.
 Rows that contain missing values in any of their columns are dropped.
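The preprocessing rules above could look roughly like this in plain Python. This is a sketch under assumptions, not the actual DCPY code: `prepare` is a hypothetical helper, missing values are represented as `None`, scaling uses the population standard deviation, and any non-numeric value (including dates) is converted to a string category.

```python
from statistics import fmean, pstdev


def prepare(rows):
    """Drop incomplete rows, scale numeric columns, one-hot encode the rest.

    Returns (kept_indices, features), where kept_indices are the positions
    of the rows that survived and features is one numeric list per kept row.
    """
    # Drop every row that has a missing value in any column.
    kept = [(i, r) for i, r in enumerate(rows) if all(v is not None for v in r)]
    if not kept:
        return [], []

    n_cols = len(kept[0][1])
    features = [[] for _ in kept]
    for c in range(n_cols):
        values = [r[c] for _, r in kept]
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        if numeric:
            # Scale to zero mean and unit variance.
            mean = fmean(values)
            std = pstdev(values) or 1.0  # constant column: avoid dividing by zero
            for out, v in zip(features, values):
                out.append((v - mean) / std)
        else:
            # One-hot encode: each distinct category adds one output column.
            categories = sorted({str(v) for v in values})
            for out, v in zip(features, values):
                out.extend(1.0 if str(v) == cat else 0.0 for cat in categories)
    return [i for i, _ in kept], features


rows = [(1.0, "a"), (3.0, "b"), (None, "a"), (5.0, "a")]
kept_indices, features = prepare(rows)
print(kept_indices)      # [0, 1, 3]
print(len(features[0]))  # 3: one scaled column plus two one-hot columns
```

Because every distinct category becomes its own column, a character or date column with thousands of distinct values adds thousands of dimensions.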
Result

A column of integer values starting at 1, where each value identifies the cluster assigned to the corresponding record (row) by the algorithm.

Rows that were dropped (due to missing values) are not assigned to any cluster.
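As an illustration of the result shape, the following sketch maps cluster labels back onto the original rows. It rests on assumptions this page does not confirm: `expand_labels` is a hypothetical helper, the underlying model is assumed to return 0-based labels, and `None` is used here to represent "not assigned to any cluster".

```python
def expand_labels(n_rows, kept_indices, labels):
    """Return one entry per original row: a 1-based cluster number, or None.

    kept_indices lists the positions of rows that were actually clustered;
    labels holds their 0-based cluster labels in the same order.
    """
    result = [None] * n_rows
    for idx, label in zip(kept_indices, labels):
        result[idx] = label + 1  # shift so that cluster numbering starts at 1
    return result


# Row 2 was dropped because of a missing value, so it gets no cluster.
column = expand_labels(4, [0, 1, 3], [0, 1, 0])
print(column)  # [1, 2, None, 1]
```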
Key usage points

Use it when you want a quick clustering without specifying the number of clusters, or when you have no prior knowledge of the underlying data.

The same assumptions and advantages as for the K-means algorithm apply. For details, see DCPY.KMEANSCLUST(n_clusters, random_state, init, n_init, max_iter, columns).

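The first local maximum of the Calinski-Harabasz index may yield a clustering that is not optimal, because the index can reach a higher value at a larger number of clusters that is never evaluated.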

Depending on the data distribution, and especially when the data has no natural clusters, the Calinski-Harabasz index is often highest for the model with the maximum number of clusters specified, which also leads to suboptimal results.
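For reference, the index used in the points above is the ratio of between-cluster dispersion to within-cluster dispersion, each divided by its degrees of freedom (k - 1 and n - k for k clusters and n rows). The sketch below computes it in plain Python; `calinski_harabasz` is a hypothetical helper written for illustration, not the routine DCPY uses, and returning infinity when the within-cluster dispersion is zero is a choice made for this sketch.

```python
def calinski_harabasz(points, labels):
    """Compute the Calinski-Harabasz index for a labelled set of points."""
    n = len(points)
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need at least 2 clusters and fewer clusters than points")

    dim = len(points[0])
    overall = [sum(p[d] for p in points) / n for d in range(dim)]

    between = 0.0  # dispersion of cluster centroids around the overall mean
    within = 0.0   # dispersion of points around their own centroid
    for members in clusters.values():
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        between += len(members) * sum((c - o) ** 2 for c, o in zip(centroid, overall))
        within += sum(
            sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in members
        )

    if within == 0.0:
        return float("inf")  # sketch-only convention for perfectly tight clusters
    return (between / (k - 1)) / (within / (n - k))


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = calinski_harabasz(points, [0, 0, 1, 1])  # split along the wide axis
poor = calinski_harabasz(points, [0, 1, 0, 1])  # split along the narrow axis
print(good > poor)  # True: compact, well-separated clusters score higher
```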
For the full list of algorithms, see Data science built-in algorithms.