Configuring Data Sources

Overview

Data sources are the foundation of your data pipelines. In Entegrata, you connect to various data sources, select the specific tables or datasets you need, and configure how they relate to each other. This guide covers everything you need to know about working with data sources in the mapping editor.

Understanding Source Types

Primary Source

The primary source is the main source of data for your entity:

Determines the base set of records
All other sources are joined to this source
Each entity must have exactly one primary source
The first source you add becomes the primary source

Example: For a Customer entity, your CRM system’s customer table would be the primary source.

Related Sources

Related sources are additional data sources joined to enrich your primary data:

An entity can have multiple related sources
Joined using one or more key fields (foreign key relationships)
Can come from different systems or databases

Example: For a Customer entity, you might join:

Client table (to get the customer’s client number)
Marketing engagement table (to get campaign responses)

Canvas showing primary source with multiple related sources
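
The way a primary source sets the base records while related sources only add fields can be sketched in plain Python. This is an illustrative sketch, not Entegrata's implementation; the `enrich` helper and the field names (`customer_id`, `client_number`, `campaign_response`) are hypothetical.

```python
# Hypothetical helper and field names, for illustration only.
def enrich(primary_rows, related_rows, key, fields):
    """Attach selected fields from related rows to each primary row."""
    # Index related rows by key; assumes at most one related row per key value.
    lookup = {row[key]: row for row in related_rows}
    enriched = []
    for row in primary_rows:
        match = lookup.get(row[key], {})
        merged = dict(row)
        for field in fields:
            merged[field] = match.get(field)  # None when no related record exists
        enriched.append(merged)
    return enriched

# Primary source: the CRM customer table.
customers = [
    {"customer_id": 1, "name": "Acme"},
    {"customer_id": 2, "name": "Globex"},
]
# Related sources: client table and marketing engagement table.
clients = [{"customer_id": 1, "client_number": "C-100"}]
engagement = [{"customer_id": 2, "campaign_response": "opened"}]

result = enrich(customers, clients, "customer_id", ["client_number"])
result = enrich(result, engagement, "customer_id", ["campaign_response"])
```

Every customer appears exactly once in `result`, and fields from a related source are empty wherever that source has no matching record.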

Adding a Primary Source

Open the Mapping Editor

Navigate to your entity in the data pipeline section and open the mapping editor.

Open Source Context Panel

Click anywhere in the editor that isn’t a node or connection to open the source catalog context panel.

Select Source

Search and choose a source from your connected sources:

Database connections
File uploads
API connections
External systems

Configure Source Identifier

The first source you add automatically becomes your primary source, the main source of data for your entity.

The primary source determines the base set of records. All other sources are joined to it.

Configure the identifier that serves as the primary means of identifying unique records for this source across all potentially related sources.

If a primary key is set up for the source, it is used as the default
You can override the identifier to any field on the source you choose
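
The default-and-override behavior of the identifier can be pictured as follows. This is a minimal sketch under assumptions: the `Source` class and `resolve_identifier` function are hypothetical names invented for illustration and are not part of any Entegrata API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical model of a source; not an Entegrata type.
@dataclass
class Source:
    name: str
    fields: Tuple[str, ...]
    primary_key: Optional[str] = None  # set when the source defines one

def resolve_identifier(source: Source, override: Optional[str] = None) -> str:
    """Return the override if given, otherwise fall back to the primary key."""
    chosen = override if override is not None else source.primary_key
    if chosen is None:
        raise ValueError(f"{source.name}: no primary key set and no identifier chosen")
    if chosen not in source.fields:
        raise ValueError(f"{source.name}: unknown field {chosen!r}")
    return chosen

crm = Source("crm_customers", ("id", "email", "name"), primary_key="id")
default_identifier = resolve_identifier(crm)                   # falls back to "id"
custom_identifier = resolve_identifier(crm, override="email")  # explicit choice wins
```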

Adding Related Sources

Related sources are joined to your primary source to enrich its data.

Open Source Context Panel

Click anywhere in the editor that isn’t a node or connection to open the source catalog context panel.

Select Source

Search and choose a source from your connected sources:

Database connections
File uploads
API connections
External systems

Configure Identifiers

Any source added after the first becomes a related source.

Each related source needs join logic that ties it to the primary source through one or more fields.

Configure the primary key for the source if it isn’t already set up from the underlying source. This is required to uniquely identify records from the source system.

Configure how data from this source relates to the primary source using fields from both sources.

Select a field on this source whose data uniquely relates to data in the primary source.
Select the field on the primary source whose data matches the selected identifier for this source.

Join Best Practices

Use exact-match joins when possible (e.g., ID fields)
Entegrata preserves all primary source records, but not necessarily all related source records
For complex or ambiguous cases, combine multiple fields to uniquely identify records
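
The practices above can be sketched as a join on a combination of fields that keeps every primary record. This is an illustration under assumptions, not the system's actual join logic; the `left_join` helper and all field names are hypothetical.

```python
# Hypothetical helper and field names, for illustration only.
def left_join(primary_rows, related_rows, key_pairs):
    """Join on one or more (primary_field, related_field) pairs, keeping all primary rows."""
    index = {}
    for row in related_rows:
        key = tuple(row[related_field] for _, related_field in key_pairs)
        index.setdefault(key, []).append(row)
    joined = []
    for row in primary_rows:
        key = tuple(row[primary_field] for primary_field, _ in key_pairs)
        matches = index.get(key)
        if not matches:
            joined.append((row, None))  # primary record kept without a match
        else:
            joined.extend((row, match) for match in matches)
    return joined

# account_no alone is ambiguous here (7 appears twice), so region is added to the key.
customers = [
    {"region": "US", "account_no": 7, "name": "Acme"},
    {"region": "EU", "account_no": 7, "name": "Globex"},
]
clients = [
    {"client_region": "US", "client_account": 7, "client_number": "C-100"},
    {"client_region": "APAC", "client_account": 9, "client_number": "C-300"},
]
pairs = [("region", "client_region"), ("account_no", "client_account")]
joined = left_join(customers, clients, pairs)
```

Both customers survive the join, while the client record that matches no customer does not appear in `joined`.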

Configuring Source Properties

Click on any source node to access its properties panel.

Basic Properties

Connection: The data connection being used (read-only)
Table/Dataset: The specific table or dataset (read-only)
Primary Key: The field(s) used to uniquely identify a record in the source system.
Entegrata Identifier: The field(s) used to join records across other data sources in the Entegrata system.

Managing Multiple Related Sources

When working with many related sources:

Nested Joins

Related sources can join to other related sources (not just the primary):

Add First Related Source

Join a related source to your primary source.

Add Second Related Source

When configuring the join, select the first related source instead of the primary source.

Configure Join

Set up the join condition between the two related sources using the identifiers configuration.

Example:

Primary: Customers
Related 1: Orders (joined to Customers)
Related 2: OrderItems (joined to Orders, not Customers)
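
The chain in this example can be sketched as two lookups, where OrderItems is reached only through Orders. The sample rows, the `group_by` helper, and field names such as `sku` are hypothetical and serve only to illustrate the shape of a nested join.

```python
# Hypothetical sample data, for illustration only.
customers = [{"customer_id": 1, "name": "Acme"}]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 1},
]
order_items = [
    {"order_id": 10, "sku": "A"},
    {"order_id": 10, "sku": "B"},
    {"order_id": 11, "sku": "C"},
]

def group_by(rows, field):
    """Group rows into lists keyed by the value of one field."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row[field], []).append(row)
    return grouped

orders_by_customer = group_by(orders, "customer_id")  # Orders joined to Customers
items_by_order = group_by(order_items, "order_id")    # OrderItems joined to Orders

def skus_for_customer(customer_id):
    """Walk Customers -> Orders -> OrderItems; items never join to Customers directly."""
    skus = []
    for order in orders_by_customer.get(customer_id, []):
        for item in items_by_order.get(order["order_id"], []):
            skus.append(item["sku"])
    return skus
```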

Troubleshooting

Source Connection Failed

Issue: Cannot connect to the data source.

Solutions:

Verify the connection credentials are current
Check network connectivity
Ensure you have permission to access the source
Contact your administrator to refresh the connection

No Tables Visible

Issue: Can’t see any tables or datasets in the source.

Solutions:

Verify you have read permissions
Check if you’re looking in the correct schema/database
Refresh the connection in the admin portal
Some sources may require specific catalog configuration

Too Many Records After Join

Issue: Join produces more records than expected.

Solutions:

Check for duplicate values in join keys
Verify you’re joining on the correct fields
Look for one-to-many relationships
Add join conditions to make the relationship unique
Consider if this is actually correct (e.g., one customer, many orders)
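
One way to check for duplicate join-key values outside the editor is to count how often each key occurs in an extract of the related source. This is a minimal sketch; the `duplicate_keys` helper and the sample rows are hypothetical.

```python
from collections import Counter

# Hypothetical helper and sample data, for illustration only.
def duplicate_keys(rows, key_fields):
    """Return join-key values that occur more than once, with their counts."""
    counts = Counter(tuple(row[f] for f in key_fields) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}

orders = [
    {"customer_id": 1, "order_id": 10},
    {"customer_id": 1, "order_id": 11},
    {"customer_id": 2, "order_id": 12},
]

# customer_id alone repeats (one customer, many orders) ...
dupes_single = duplicate_keys(orders, ["customer_id"])
# ... while adding order_id to the key makes every row unique.
dupes_composite = duplicate_keys(orders, ["customer_id", "order_id"])
```

A non-empty result means each matching primary record will be repeated once per duplicate, which may be a data problem or a legitimate one-to-many relationship.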

Best Practices

Source Configuration

Apply filters at the source level whenever possible
Choose the most specific table/view available
Ensure each related source you join has at most one record for each primary source record

Data Quality

Verify join keys have high cardinality (few repeated values)
Check for nulls in join fields
Test edge cases (no matches, duplicates)
Validate against expected record counts
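
These checks can be combined into a small report run against sample extracts of both sources before the join is configured. This is a sketch under assumptions: the `join_key_report` function and its field names are hypothetical, and the row count assumes every primary record is kept even when nothing matches.

```python
# Hypothetical helper, for illustration only.
def join_key_report(primary_rows, related_rows, primary_field, related_field):
    """Summarize nulls, unmatched primary records, and the expected joined row count."""
    null_primary = sum(1 for r in primary_rows if r.get(primary_field) is None)
    null_related = sum(1 for r in related_rows if r.get(related_field) is None)

    # Count related rows per non-null key value.
    related_counts = {}
    for r in related_rows:
        value = r.get(related_field)
        if value is not None:
            related_counts[value] = related_counts.get(value, 0) + 1

    unmatched = 0
    joined_rows = 0
    for r in primary_rows:
        n = related_counts.get(r.get(primary_field), 0)
        if n == 0:
            unmatched += 1
            joined_rows += 1  # primary record is kept without a match
        else:
            joined_rows += n  # one output row per matching related record
    return {
        "null_primary_keys": null_primary,
        "null_related_keys": null_related,
        "unmatched_primary": unmatched,
        "joined_rows": joined_rows,
    }

primary = [{"id": 1}, {"id": 2}, {"id": None}]
related = [{"cust": 1}, {"cust": 1}, {"cust": None}]
report = join_key_report(primary, related, "id", "cust")
```

Comparing `joined_rows` with the number of primary records shows at a glance whether the join multiplies records.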

Related Topics

Data Mapping Editor

Learn about the visual mapping editor interface

Mapping Fields

Map source fields to your entity fields

Multi-Field Mapping

Advanced transformations with multiple source fields

Publishing Mappings

Deploy your configured mappings to production
