Add script calculations
The R and Python scripting languages are supported.
When adding a script for a calculation, you can choose how to process the data in a script: vector or scalar.
With the Vector calculation type, multiple rows are sent to the server as a table and processed in a single request. The result is an array of values, one for each row of the column. The following is an example of the outlier detection script sending data to a table in Python.
With the Scalar calculation type, each row is sent to the server separately and calculated one by one.
Example of a scalar function:
result = _arg1.upper()
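To illustrate how the Scalar calculation type behaves, the following sketch simulates the server calling the script once per row. The wrapper function `scalar_script` and the sample rows are hypothetical and only mimic the per-row evaluation locally; in the product you would enter just the script body.

```python
def scalar_script(_arg1):
    # Script body from the example above: uppercase a single value
    result = _arg1.upper()
    return result


# Hypothetical column values; each one is processed in its own call
rows = ["alpha", "Beta", "gamma"]
results = [scalar_script(value) for value in rows]
print(results)  # ['ALPHA', 'BETA', 'GAMMA']
```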
Before you begin, make sure that:
- You have an AI connection in the AI connections pane. For details, see Add AI connections.
- You are creating a dataset.
Depending on the page, do one of the following:
On Step 2 – Refine, select a data source, and then click Calculations.
On Step 3 – Join and preview, next to the data source name, click More options and then Calculations.
The Calculations dialog appears.
- In the Calculation name field, type a name for the calculation.
In the drop-down list on the right, select the data role of the calculation: dimension, measure, or date.
In the Type calculation field, depending on the script type that you need, type script or scalar, and then click the Edit script button that appears.
Alternatively, you can drag the SCRIPT or SCALAR function from the Data Science group.
In the Script / Scalar pane that appears, do the following:
Select the following settings:
- Connector – DataClaritPy (built-in) or any other defined AI connector.
- Language – Python or R.
- Calculation – Specify how to calculate the data in the script:
- Vector – Multiple rows are sent to the server as a table and processed in one request.
- Scalar – Each row is sent to the server separately and calculated one by one.
- Result – The format of the returned value: double, string, or integer.
In the Type script field, enter your script.
Note: You can use only scripts that return a single column of data.
Click within the Click to add columns field, and select the column that you want to use as the first argument for the script. Repeat this step for each argument of the script.
If you need to change the default aggregation, click a column name, and select a new aggregation.
The script is added to the calculation pane.
The calculation column is added to the dataset.
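For a script that takes more than one argument, the sketch below shows a scalar calculation that returns a double. It assumes that the second selected column is exposed as `_arg2`, by analogy with `_arg1`; that name, the wrapper function `scalar_script`, and the sample revenue and cost values are hypothetical, so check the naming that your connector uses.

```python
def scalar_script(_arg1, _arg2):
    # _arg1: revenue for one row, _arg2: cost for the same row (assumed names)
    # Guard against division by zero so that every row yields a value
    result = (_arg1 - _arg2) / _arg1 if _arg1 else 0.0
    return result


# Hypothetical rows of (revenue, cost); each row is evaluated separately
rows = [(200.0, 150.0), (0.0, 10.0), (80.0, 100.0)]
margins = [scalar_script(revenue, cost) for revenue, cost in rows]
print(margins)  # [0.25, 0.0, -0.25]
```

Because the script yields a single floating-point value per row, the Result setting for a calculation like this would be double.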