Data science capabilities
In DataClarity Platform, you can apply data science capabilities to your data in two ways:
- You can use built-in algorithms for tasks such as clustering, outlier detection, correlation coefficient calculation, and time series forecasting. These algorithms are based on MLlib (Apache Spark's machine learning library) and on Python. You can use them as is, without writing any code and without connecting to an external platform. To apply a built-in algorithm to the data, select it from a list, and then specify the algorithm parameters and the column(s) to which the algorithm should apply. For details, see Data science built-in algorithms.
- You can connect to an external Microsoft R, Rserve, TabPy, or DataClarityPy server and create script calculations for your data. The script editor included in DataClarity Platform provides syntax highlighting for the programming languages used by these servers. You can add, edit, share, and delete AI connections. You can add script calculations when you create a dataset. For details, see Using external data science platforms.
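To give a sense of the kind of computation involved, the following minimal Python sketch calculates a Pearson correlation coefficient between two columns and flags outliers by z-score. It is an illustration only: the function names `pearson_correlation` and `zscore_outliers`, the `threshold` parameter, and the sample values are hypothetical and are not part of DataClarity Platform, and the sketch does not show how the built-in algorithms are implemented or how a script calculation is wired to a dataset.

```python
import math
from statistics import mean, pstdev


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length columns."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("columns must have the same length (at least 2 values)")
    mx, my = mean(xs), mean(ys)
    # Sum of cross-deviations and the root sums of squared deviations.
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / (sx * sy)


def zscore_outliers(values, threshold=3.0):
    """Return one boolean per value: True if its absolute z-score exceeds threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return [False] * len(values)
    return [abs(v - mu) / sigma > threshold for v in values]


if __name__ == "__main__":
    # Hypothetical sample columns, used only for illustration.
    units = [1, 2, 3, 4]
    revenue = [2, 4, 6, 8]
    print(pearson_correlation(units, revenue))

    readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 100]
    print(zscore_outliers(readings, threshold=2.0))
```

The sketch uses only the Python standard library so that it runs anywhere. In a small sample, a single extreme value inflates the standard deviation, which is why the example call lowers the threshold from the default of 3.0 to 2.0.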