Data sources

You can create a dataset from data files, IBM data sources, databases, or any combination of these.

If you create a dataset from two or more data sources (for example, a cube view and an Excel file), you need to define joins between them. For details, see Define joins between data sources.
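As a conceptual illustration only, the sketch below shows what a join between two data sources does: rows from each source are matched on a shared key column and merged into combined rows. It is not how the product implements joins. The sample data, the column names (`product_id`, `units`, `name`), and the helper functions `read_rows` and `inner_join` are all hypothetical.

```python
import csv
import io

# Hypothetical sample data standing in for two data sources.
SALES_CSV = """product_id,units
P1,10
P2,4
P3,7
"""

PRODUCTS_CSV = """product_id,name
P1,Widget
P2,Gadget
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def inner_join(left, right, key):
    """Return merged rows for every key value present in both sources."""
    # Index the right-hand rows by key so each lookup is a dictionary access.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined


sales = read_rows(SALES_CSV)
products = read_rows(PRODUCTS_CSV)
combined = inner_join(sales, products, "product_id")

for row in combined:
    print(row)
```

In this sketch, the row for `P3` is dropped because it has no match in the second source. Which rows are kept depends on the kind of join that you define.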

Each table in a data source is displayed as a separate data source. You can make changes to these data sources, for example, by removing columns or adding calculations. These changes are not saved in the original files.


You can use data files of the following types: .csv, .tsv, .txt, .xlsx, .xls, and .sav.
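If you want to check file names before uploading, a minimal sketch such as the following compares each file extension with the supported types listed above. The function name `is_supported_data_file` is hypothetical, and treating extensions as case-insensitive is an assumption, not documented behavior.

```python
from pathlib import Path

# Extensions taken from the list of supported data file types.
SUPPORTED_EXTENSIONS = {".csv", ".tsv", ".txt", ".xlsx", ".xls", ".sav"}


def is_supported_data_file(filename):
    """Return True if the file extension is one of the supported types.

    Assumption: extensions are compared case-insensitively.
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


for name in ["sales.csv", "survey.SAV", "report.pdf"]:
    print(name, is_supported_data_file(name))
```

The check looks at the file name only. It does not confirm that the file contents are valid for that type.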

To create a dataset, you can use one or more data files. You may need to prepare your files before uploading them. For details, see Preparing data files.


You can create datasets using the following database types:

  • Amazon Athena*
  • Amazon Aurora (MySQL)
  • Amazon Aurora (PostgreSQL)
  • Amazon Redshift*
  • Apache Derby
  • Apache Hive*
  • Dremio
  • Exasol*
  • FirebirdSQL
  • Google Cloud SQL (MySQL)
  • Google Cloud SQL (PostgreSQL)
  • HyperSQL
  • IBM Cognos TM1 / IBM Planning Analytics Cubes
  • IBM Cognos TM1 / IBM Planning Analytics Cube Views
  • IBM Cognos packages:
    • Relational package
    • DMR package
    • PowerCube
    • Dynamic Cube
    • TM1/PA Cube
  • IBM Db2
  • Informix
  • MariaDB
  • Microsoft Azure SQL Database*
  • Microsoft Azure SQL Data Warehouse*
  • Microsoft SQL Server
  • Microsoft SQL Server Analysis Services
  • MySQL
  • MemSQL
  • MongoDB
  • Oracle
  • PostgreSQL
  • SAP BusinessObjects*
  • SAP HANA*
  • Snowflake*

* Data sources marked with an asterisk are considered beta in the current release.

To create a dataset from a database, you must first define a connection to the database server. For details, see Add data connections.