Define joins between data sources
To define a join, you select the common column for the data sources and then specify a join type. When creating a dataset with two or more data sources, the system automatically identifies the common columns and suggests joins between them. You can view and edit these joins when creating or editing a dataset.
Prerequisites
-
You are creating or editing a dataset.
-
You uploaded at least two data sources. For example, two or more tables from one or several database sources; an Excel file with at least two sheets; a file with at least one sheet and a table from a database source.
-
The Step 3 – Join and preview page is open.
Procedure
Depending on the view, the steps for managing joins are different.
List view is set by default. To switch between the views, use the buttons in the upper-right corner of the page.
-
Review the relationships diagram displaying the data sources and the automatically recommended joins:
-
For the recommended joins, the line connecting data sources is dashed. For the approved joins, the line is solid.
-
Each data source is marked with a color to help you match the respective columns in the Data source columns pane on the left and in the Data preview pane at the bottom.
Note: If not all joins are visible, use the buttons in the lower-right corner to zoom the view in or out. To have more real estate when working with joins, you can also collapse the Data source columns and Data preview panes.
-
-
To save a dataset, review all the recommended joins. For each join, you can choose one of the following:
Edit join-
Point to the line connecting two data sources, click the join symbol between them, and then click Edit join.
The Edit join dialog appears.
-
Observe the order of the data sources in the Edit join dialog.
Important: The order of the data sources in this dialog may differ from the order displayed in the diagram. For the following steps, refer to the order shown in the Edit join dialog.
-
To change the join type, click the join symbol and select another type.
There are four join types:
- Inner join
- Left join
- Right join
- Full outer join
For details, see About data joins.
-
Check the column that is used for joining for each data source. If needed, select another column from the dropdown list.
-
(Optional) Depending on your data sources, you may need to add a secondary set of columns for joining. To do so, click add next to the first set of columns, and define the join as needed.
- Click OK to save your changes.
Approve joinIf you agree with the recommended join type and the column selected for the join, point to the line connecting two data sources, click the join symbol, and then click Approve join.
After a join is approved, the line connecting the two data sources becomes solid.
Tip: You can also approve all joins at once. In the upper-righter corner, click Approve all joins.
Delete joinTo delete a join between two data sources, point to the line connecting two data sources, click the join symbol, and then click Delete join.
Tip: You can also remove all recommended joins at once: in the upper-righter corner of the page, click Delete all joins.
Add joinTo add a new join, point to the data source box, then point to the link icon and drag it to the data source with which you want to join it until the line becomes green.
The system adds a recommended join based on the first column in both data sources. Proceed with editing a join as needed. For details, see Edit join
-
-
(Optional) Click Preview data to view all the data included in the dataset.
You can preview data if all the tables are linked.
To avoid misinterpretation of data, circular references (loops) are not allowed in the data structure.
-
Review the data sources and the automatically recommended joins in the list view:
-
Recommended joins are grouped under the Recommended joins section.
-
Each data source is marked with a color to help you match the respective columns in the left pane and in the preview pane at the bottom.
Note: To have more real estate when working with joins, you can also collapse the Data source columns and the Data preview panes.
-
-
To save a dataset, all the recommended joins should be reviewed. For each join, you can choose one of the following:
Edit join-
Point to the line connecting two data sources, and then click the join symbol between two data sources.
The Edit join dialog appears.
-
To change the join type, click the join symbol and select another type.
There are four join types:
- Inner join
- Left join
- Right join
- Full outer join
For details, see About data joins.
-
Check the column that is used for joining for each data source. If needed, select another column from the dropdown list.
-
(Optional) Depending on your data sources, you may need to add a secondary set of columns for joining. To do so, click add next to the first set of columns, and define the join as needed.
- Click OK to save your changes.
Approve joinIf you agree with the recommended column used for the join and the join type, you can approve the join. To do so, point to the line connecting two data sources, and then click Approve join.
The approved joins are moved to the Approved joins section, and the line connecting data sources becomes solid.
Tip: You can also approve all joins at once. Next to the section name, click Approve all joins.
Delete joinTo delete a join between two data sources, point to the recommended join, and then click Delete join. For the approved joins, click Delete join.
Tip: You can also delete all joins at once. Next to the section name, click Delete all joins.
Add joinIf a data source does not have any joins, it appears in the Data sources with no joins section.
To join the data source, just click the data source box. The Add join dialog appears.
A recommended join is added based on the first column in both data sources. Specify all the details for the new join, and click Add to save the new join. For details, see Edit join.
-
-
(Optional) Click Preview data to view all the data included in the dataset.
You can preview data if all the tables are linked.
To avoid misinterpretation of data, circular references (loops) are not allowed in the data structure.