Teradata walkthrough

Teradata support is currently in beta and covers database version 15.10.

Teradata is a database management system for building large-scale data warehousing applications. It can run multiple data warehouse operations concurrently.

Connection

To connect to Teradata, refer to the Teradata connection topic.

After you establish a connection, you can integrate Teradata data into a cloud target in two ways:

- Multi-Tables mode for Standard Extractions, which loads multiple tables to your target at once.
- Custom Query, which lets you use any SQL query that Teradata supports.

Multi-Tables mode

You can pull Teradata data into a target with Multi-Tables.

Procedure

Make sure to choose Multi-Tables in the Source tab before moving on to the Target tab.
Select your Target.
Select your Target connection, then click the curved arrow next to Database and Schema on the right side of the row. After the refresh is complete, click the row and choose the Database and Schema where the data is stored.
Set the loading mode for your Multi-Tables migration.
- For Storage Targets (Google Cloud Storage, Amazon S3, Azure Blob Storage), choose a bucket and a path where your data will land.
Navigate to the Schema tab once your chosen Target location has been defined.
To get the metadata of your database, select Show Me My Schemas.
Click your desired schema to get a list of its tables.
You can check the boxes next to individual tables to choose them, or you can check the box next to the Source Tables heading to select all tables.

note

A default Loading Mode is defined in the Target tab. You can change the loading mode of any individual table by selecting Edit on the right.

After choosing your tables, you can edit the following on the Columns tab:
- Check or uncheck the boxes next to the columns you want to include or exclude from the Target.
- In the Target column, rename the field by clicking the name and typing a new one.
- Double-click the current data type under Type to change the field to a new data type.
- Change the field's mode under Mode.
- By checking the Cluster Key box, you can make a field a Cluster Key, which is used for partitioning.
- To make a field the key for Upsert-Merge, click the key icon to the left of the field name so that it is highlighted.
A custom expression can be added to any target column by clicking + Add Calculated Column.

note

When adding a comment to an Expressions query, use /* comment */ rather than two hyphens (--). The query is converted to a single line, so everything after two hyphens would be interpreted as a comment, including any commands that followed it on later lines.
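
The effect can be illustrated with a minimal Python sketch. The `flatten` and `strip_sql_comments` helpers and the sample expressions are hypothetical and only approximate the single-line conversion described in the note; they are not the product's implementation, and the comment stripper does not handle comment markers inside string literals.

```python
def flatten(query: str) -> str:
    """Join a multi-line query into one line (hypothetical stand-in for
    the single-line conversion described in the note)."""
    return " ".join(line.strip() for line in query.splitlines() if line.strip())


def strip_sql_comments(single_line: str) -> str:
    """Drop comments from one line of SQL: a /* ... */ comment ends at its
    closing marker, while -- runs to the end of the line."""
    out = []
    i = 0
    while i < len(single_line):
        if single_line.startswith("/*", i):
            end = single_line.find("*/", i + 2)
            i = len(single_line) if end == -1 else end + 2
        elif single_line.startswith("--", i):
            break  # the rest of the line is a comment
        else:
            out.append(single_line[i])
            i += 1
    # Collapse the whitespace left behind by removed comments.
    return " ".join("".join(out).split())


with_hyphens = """CASE WHEN amount > 0 -- positive amounts
THEN 'credit' ELSE 'debit' END"""

with_block = """CASE WHEN amount > 0 /* positive amounts */
THEN 'credit' ELSE 'debit' END"""

effective_hyphens = strip_sql_comments(flatten(with_hyphens))
effective_block = strip_sql_comments(flatten(with_block))

print(effective_hyphens)  # the THEN/ELSE part is lost
print(effective_block)    # the full expression survives
```

With the two-hyphen comment, the rest of the expression disappears once the lines are joined; with the block comment, only the comment itself is removed.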

On the Table Settings tab, you can edit the following:

You can extract data in two ways:
- All (Default)
- Incremental: if you choose Incremental, you can specify which field will be used to define the increment.

note

Start Date is mandatory.
Data can be retrieved for the date range specified between the Start and End dates.
If you leave the End Date blank, data is pulled up to the current time of the River's run.
Date time zone: UTC.
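
The rules in this note can be sketched in Python as follows. The `resolve_window` function and its parameter names are hypothetical and are used only to illustrate how a blank End Date falls back to the run time in UTC; the product's internal logic is not documented here.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple


def resolve_window(start: Optional[datetime],
                   end: Optional[datetime],
                   run_time: datetime) -> Tuple[datetime, datetime]:
    """Return the (start, end) range of an incremental pull.

    Hypothetical helper: Start Date is mandatory, and a blank End Date
    is replaced by the time of the run.
    """
    if start is None:
        raise ValueError("Start Date is mandatory")
    effective_end = end if end is not None else run_time
    return start, effective_end


# All values are expressed in UTC, matching the note.
run_time = datetime(2024, 3, 10, 8, 30, tzinfo=timezone.utc)
start = datetime(2024, 3, 1, tzinfo=timezone.utc)

window = resolve_window(start, None, run_time)
print(window)
```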

Select Edit to change the Target table name.
Change the Merge method and the loading mode.
Filter logical key duplication between files - This option removes duplicate records from the current source pull.
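
As a rough illustration of filtering by logical key, the Python sketch below keeps one row per key within a single pull. The `dedupe_by_key` function is hypothetical, and keeping the last occurrence of each key is an assumption; this walkthrough does not state which duplicate the option retains.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def dedupe_by_key(rows: Iterable[dict], key_fields: Sequence[str]) -> List[dict]:
    """Keep one row per logical key.

    Assumption: the last row seen for a key wins. The output preserves
    the order in which each key first appeared.
    """
    seen: Dict[Tuple, dict] = {}
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        seen[key] = row
    return list(seen.values())


pulled = [
    {"id": 1, "status": "new"},
    {"id": 2, "status": "new"},
    {"id": 1, "status": "shipped"},  # same logical key as the first row
]

unique_rows = dedupe_by_key(pulled, ["id"])
print(unique_rows)
```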

To schedule your River, specify execution timeouts, and receive notifications, go to the Settings tab.
You can now click Run to run your River.

Custom query mode

Use Custom Query to define the data pull with your own query. You can use any query that Teradata supports, provided it is a single SELECT statement with no additional statements.

note

Data Integration does not support multi-statement queries or SQL scripts in the custom query field.
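
If you want to check a query before pasting it in, a minimal Python sketch such as the one below can flag obvious multi-statement input. The `is_single_statement` function is hypothetical and is not part of the product. It assumes that one trailing semicolon is harmless, and it only understands single-quoted string literals, so it does not account for semicolons inside comments.

```python
def is_single_statement(query: str) -> bool:
    """Return True if the query contains no statement separator outside
    of single-quoted string literals.

    Assumption: one trailing semicolon is ignored.
    """
    body = query.strip()
    if body.endswith(";"):
        body = body[:-1]
    in_string = False
    for ch in body:
        if ch == "'":
            # A doubled quote ('') toggles twice, leaving the state unchanged.
            in_string = not in_string
        elif ch == ";" and not in_string:
            return False
    return bool(body.strip())


single = is_single_statement("SELECT id FROM orders WHERE note = 'a;b'")
multi = is_single_statement("SELECT 1; SELECT 2")
print(single, multi)
```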

Procedure

Enter your own query.
You can extract data in two ways:
- All (Default)
- Incremental: if you choose Incremental, you can specify which field will be used to define the increment. Select one of three incremental types and enter the time periods you require.

note

Start Date is mandatory.
Data can be retrieved for the date range specified between the Start and End dates.
If you leave the End Date blank, data is pulled up to the current time of the River's run.
Date time zone: UTC.

Interval Chunks Size (Optional):

This setting divides the data pull into intervals, which is useful when dealing with large amounts of data. Each request covers one interval of the specified size. For example, with a Daily chunk and an interval size of 3, a nine-day date range is split into three requests of three days each.
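
The arithmetic behind interval chunks can be sketched in Python. The `daily_chunks` function is hypothetical, and treating each range as start-inclusive and end-exclusive is an assumption; the exact boundary handling of the product is not described here.

```python
from datetime import date, timedelta
from typing import List, Tuple


def daily_chunks(start: date, end: date, interval_size: int) -> List[Tuple[date, date]]:
    """Split [start, end) into consecutive ranges of interval_size days.

    The final range is shorter when the total span is not an exact
    multiple of interval_size.
    """
    if interval_size < 1:
        raise ValueError("interval_size must be at least 1")
    step = timedelta(days=interval_size)
    chunks = []
    cursor = start
    while cursor < end:
        chunk_end = min(cursor + step, end)
        chunks.append((cursor, chunk_end))
        cursor = chunk_end
    return chunks


# A nine-day range with a Daily chunk and an interval size of 3.
chunks = daily_chunks(date(2024, 1, 1), date(2024, 1, 10), 3)
for chunk_start, chunk_end in chunks:
    print(chunk_start, "to", chunk_end)
```

A nine-day range therefore produces three requests, while a ten-day range would produce a fourth request covering the single remaining day.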

In the Advanced Extract Options, you can also specify an Extraction array and Exporter chunk sizes.
To retrieve the metadata and all of the columns in your data, click Auto Mapping.
Your schema will be displayed, and you can modify the following:
- In the Target Field column, rename the field by clicking the name and typing a new one.
- Click the current data type under Type to change the field to a new data type.
- Change the field's mode under Mode.
- By checking the Cluster Key box, you can make a field a Cluster Key, which is used for partitioning.
- To make a field the key for Upsert-Merge, click the key icon to the left of the field name so that it is highlighted.
- To insert a custom expression, click and type a new expression.
To schedule your River, specify execution timeouts, and receive notifications, go to the Settings tab.
You can click Run to run your River.