MLLIB.PEARSONCOR(columns)

Calculates the Pearson correlation coefficient between selected columns to assess the linear relationship of two continuous variables. A relationship is linear when a change in one variable is associated with a proportional change in the other variable. Pearson’s correlation coefficient is a measure based on the actual data values, and thus, it is sensitive to outliers.
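As an illustration of the outlier sensitivity described above, the following is a minimal Python sketch of the standard Pearson formula (covariance divided by the product of the standard deviations). It is not the implementation behind MLLIB.PEARSONCOR; the function name `pearson` and the sample data are made up for this example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of each value from its mean
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Sum of cross-products (numerator) and root of the product of
    # the sums of squares (denominator)
    cov = sum(a * b for a, b in zip(dx, dy))
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / denom

# Made-up data in which sales are exactly twice the customer count
customers = [10, 20, 30, 40, 50]
sales = [20, 40, 60, 80, 100]
r_clean = pearson(customers, sales)            # perfectly linear data

# The same data with the last value replaced by an outlier
sales_outlier = [20, 40, 60, 80, 0]
r_outlier = pearson(customers, sales_outlier)  # the outlier cancels the trend

print(r_clean, r_outlier)
```

With this data, replacing one of the five points moves the coefficient from 1 to 0, which is why it is worth checking a scatterplot for outliers before relying on the value.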

Input data

Two numeric variables
The size of input data is not limited
Without missing values
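Since the input must not contain missing values, the data may need to be cleaned before it reaches the function. The sketch below shows one possible way to do this outside the product, by dropping every pair in which either value is missing. The helper name `drop_incomplete_pairs` and the sample values are hypothetical, and whether dropping rows is appropriate depends on the dataset.

```python
import math

def drop_incomplete_pairs(xs, ys):
    """Keep only positions where both values are present (not None, not NaN)."""
    def is_missing(v):
        return v is None or (isinstance(v, float) and math.isnan(v))

    pairs = [(x, y) for x, y in zip(xs, ys)
             if not is_missing(x) and not is_missing(y)]
    clean_x = [x for x, _ in pairs]
    clean_y = [y for _, y in pairs]
    return clean_x, clean_y

# Made-up sample with gaps in both columns
gross_sales = [120.0, None, 310.0, 95.0, float("nan")]
customers = [12, 30, 31, None, 8]

clean_sales, clean_customers = drop_incomplete_pairs(gross_sales, customers)
print(clean_sales, clean_customers)
```

Pairs are removed together so that the two variables stay aligned row by row.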

Example: MLLIB.PEARSONCOR(sum([Gross Sales]), sum([No of Customers]))

Result

The correlation coefficient measures the strength of the linear relationship between two variables and ranges from -1 to 1. A value of -1 indicates a perfect negative correlation, a value of 1 indicates a perfect positive correlation, and a value of 0 indicates no linear relationship between the two variables.
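The three reference values can be reproduced with small made-up datasets. The sketch below assumes the equivalent "mean product of standardized values" form of the Pearson formula; `pearson_z` is an illustrative name, not part of the product. The last dataset is a reminder that 0 rules out only a linear relationship: the values follow an exact U-shaped curve.

```python
from statistics import fmean, pstdev

def pearson_z(xs, ys):
    """Pearson correlation as the mean product of z-scores.

    Assumes equal-length inputs and that neither variable is constant.
    """
    mean_x, mean_y = fmean(xs), fmean(ys)
    sd_x, sd_y = pstdev(xs), pstdev(ys)
    products = [((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                for x, y in zip(xs, ys)]
    return fmean(products)

x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # y = 2x
falling = [10, 8, 6, 4, 2]   # y = 12 - 2x
u_shaped = [4, 1, 0, 1, 4]   # y = (x - 3)^2, related to x but not linearly

r_rising = pearson_z(x, rising)      # approximately 1
r_falling = pearson_z(x, falling)    # approximately -1
r_u_shaped = pearson_z(x, u_shaped)  # approximately 0

print(r_rising, r_falling, r_u_shaped)
```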

Example

In the Scatterplot widget, add a calculation with MLLIB.PEARSONCOR(sum([Gross Sales]), sum([No of Customers])) and set it to dimension. Using the dataset manager, drag the calculation into the Color field. Because the function returns a single value, only one color is used. The coefficient value is shown in the legend and in the tooltip for each point of the visualization. In this example, the coefficient of 0.93 indicates a strong positive correlation.

For the whole list of algorithms, see Data science built-in algorithms.