About data profiling
Profiling the data in a data source (for example, a database or a file) means examining that data and collecting information about it. The examination can serve, for example, to determine whether the data is accurate and complete, or whether it is suitable for business analysis. The information collected during data profiling covers data type, structure, content, relationships, and so on.
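As an illustration of what collecting information about data can look like, here is a minimal sketch in Python. It is not the product's profiling system. The function names `profile_column` and `profile_rows` and the sample rows are hypothetical, and the sketch only gathers a few basic facts per column (row count, missing values, distinct values, and the most common value type).

```python
from collections import Counter


def profile_column(values):
    """Collect basic profile facts for one column of raw values."""
    # Treat None and empty strings as missing (an assumption of this sketch).
    non_null = [v for v in values if v is not None and v != ""]
    type_counts = Counter(type(v).__name__ for v in non_null)
    return {
        "count": len(values),
        "missing": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        # The most frequent Python type name stands in for the column's data type.
        "inferred_type": type_counts.most_common(1)[0][0] if type_counts else None,
    }


def profile_rows(rows):
    """Profile every column of a list of dict rows."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return {name: profile_column(vals) for name, vals in columns.items()}


# Hypothetical sample data.
rows = [
    {"city": "Oslo", "sales": 120.5},
    {"city": "Bergen", "sales": None},
    {"city": "Oslo", "sales": 80.0},
]
report = profile_rows(rows)
```

A report like this gives a first answer to the completeness question: a column with many missing values, or with an unexpected value type, needs attention before it is used for analysis.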
When uploading data sources to
The automatic profiling system identifies the following:
Data can be classified as follows:
For details, see About dimensions and measures.
For columns that contain measures, the system suggests an aggregation type based on the column name. If a column name is not recognized, the default aggregation is set to Sum. The following aggregation types are available when creating a dataset:
- None (the only option for OLAP data sources; not available for other data sources)
The aggregation type defined when a dataset is created is the default type for that measure in a visualization. More aggregation types are available when you click a measure in the widget settings pane. For details, see About aggregation for measures.
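The name-based suggestion described above could work roughly as in the following Python sketch. Only two behaviors come from this page: unrecognized column names fall back to Sum, and OLAP data sources allow only None. The keyword table `NAME_HINTS`, the aggregation names other than Sum and None, and the function `suggest_aggregation` are assumptions made for illustration; the actual matching rules are not described here.

```python
import re

# Hypothetical keyword-to-aggregation hints (assumed, not the product's real rules).
NAME_HINTS = {
    "avg": "Average",
    "average": "Average",
    "mean": "Average",
    "min": "Min",
    "max": "Max",
    "count": "Count",
}


def suggest_aggregation(column_name, is_olap=False):
    """Suggest a default aggregation type for a measure column."""
    if is_olap:
        # For OLAP data sources, None is the only option.
        return "None"
    # Split the name into lowercase word tokens, e.g. "Avg Price" -> ["avg", "price"].
    tokens = re.split(r"[^a-z0-9]+", column_name.lower())
    for token in tokens:
        if token in NAME_HINTS:
            return NAME_HINTS[token]
    # Column name not recognized: default to Sum.
    return "Sum"
```

For example, `suggest_aggregation("avg_price")` picks the hint for `avg`, while `suggest_aggregation("revenue")` matches no hint and falls back to Sum.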
Joins between tables are suggested based on their common columns, that is, columns with similar names and similar content.
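One way to picture such a suggestion is the Python sketch below, which pairs columns whose names are similar and whose values overlap. This is an assumed heuristic, not the product's algorithm: the similarity measures (a `difflib` ratio for names, Jaccard overlap for content), the thresholds, and the function names are all chosen for illustration.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive similarity of two column names, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def content_overlap(values_a, values_b):
    """Jaccard overlap of the distinct values in two columns, from 0.0 to 1.0."""
    set_a, set_b = set(values_a), set(values_b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def suggest_joins(table_a, table_b, name_threshold=0.7, overlap_threshold=0.5):
    """Return column pairs that pass both the name and the content threshold."""
    suggestions = []
    for col_a, vals_a in table_a.items():
        for col_b, vals_b in table_b.items():
            if (name_similarity(col_a, col_b) >= name_threshold
                    and content_overlap(vals_a, vals_b) >= overlap_threshold):
                suggestions.append((col_a, col_b))
    return suggestions


# Hypothetical tables, stored as column name -> list of values.
orders = {"customer_id": [1, 2, 3, 3], "amount": [10, 20, 30, 40]}
customers = {"CustomerID": [1, 2, 3], "name": ["Ann", "Bo", "Cy"]}
pairs = suggest_joins(orders, customers)
```

Requiring both conditions keeps the sketch conservative: two columns that merely share a name, or merely share a few values, are not proposed as a join.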
There are four join types:
- Inner join
- Left join
- Right join
- Full outer join
For details, see About data joins.
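The difference between the four join types can be seen in a small Python sketch. The function `join_rows` and the sample tables are hypothetical and only mirror the standard meaning of inner, left, right, and full outer joins; they say nothing about how the product executes joins.

```python
def join_rows(left, right, key, how="inner"):
    """Join two lists of dict rows on `key`; missing sides are filled with None."""
    if how not in {"inner", "left", "right", "full"}:
        raise ValueError(f"unknown join type: {how}")
    # Blank rows supply None for every column of the side that has no match.
    blank_left = dict.fromkeys({c for row in left for c in row})
    blank_right = dict.fromkeys({c for row in right for c in row})

    result = []
    matched_right = set()
    for l_row in left:
        hits = [i for i, r_row in enumerate(right) if r_row[key] == l_row[key]]
        for i in hits:
            matched_right.add(i)
            result.append({**l_row, **right[i]})
        # Left and full outer joins keep left rows that have no match.
        if not hits and how in {"left", "full"}:
            result.append({**blank_right, **l_row})
    # Right and full outer joins keep right rows that have no match.
    if how in {"right", "full"}:
        for i, r_row in enumerate(right):
            if i not in matched_right:
                result.append({**blank_left, **r_row})
    return result


# Hypothetical sample tables.
customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bo"}]
orders = [{"id": 2, "total": 50}, {"id": 3, "total": 75}]

inner = join_rows(customers, orders, "id", how="inner")
left_join = join_rows(customers, orders, "id", how="left")
right_join = join_rows(customers, orders, "id", how="right")
full = join_rows(customers, orders, "id", how="full")
```

With this data, only `id` 2 appears in both tables, so the inner join returns one row, the left and right joins return two rows each, and the full outer join returns three.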