
Developer Guide

Version: 6.0.2
Doc Build Date: 05/24/2019
Copyright © Trifacta Inc. 2019 - All Rights Reserved. CONFIDENTIAL

These materials (the “Documentation”) are the confidential and proprietary
information of Trifacta Inc. and may not be reproduced, modified, or distributed
without the prior written permission of Trifacta Inc.

EXCEPT AS OTHERWISE PROVIDED IN AN EXPRESS WRITTEN
AGREEMENT, TRIFACTA INC. PROVIDES THIS DOCUMENTATION AS-IS
AND WITHOUT WARRANTY AND TRIFACTA INC. DISCLAIMS ALL EXPRESS
AND IMPLIED WARRANTIES TO THE EXTENT PERMITTED, INCLUDING
WITHOUT LIMITATION THE IMPLIED WARRANTIES OF MERCHANTABILITY,
NON-INFRINGEMENT AND FITNESS FOR A PARTICULAR PURPOSE AND
UNDER NO CIRCUMSTANCES WILL TRIFACTA INC. BE LIABLE FOR ANY
AMOUNT GREATER THAN ONE HUNDRED DOLLARS ($100) BASED ON
ANY USE OF THE DOCUMENTATION.

For third-party license information, please select About Trifacta from the User
menu.
1. Developer
1.1 User-Defined Functions
1.1.1 Java UDFs
1.2 Create Custom Data Types Using RegEx
1.3 Command Line Interface
1.3.1 CLI Migration to APIs
1.3.2 Install CLI Tools
1.3.3 CLI for Connections
1.3.4 CLI for Jobs
1.3.4.1 CLI Example - Parameterize Job Runs
1.3.4.2 CLI Publishing Options File
1.3.5 CLI for User Admin
1.3.6 CLI Config File
1.4 API Reference
1.4.1 API Overview
1.4.2 API Authentication
1.4.2.1 Manage API Access Tokens
1.4.3 API Endpoints
1.4.3.1 v4 Endpoints
1.4.3.1.1 API AccessTokens Create v4
1.4.3.1.2 API AccessTokens Delete v4
1.4.3.1.3 API AccessTokens Get List v4
1.4.3.1.4 API AccessTokens Get v4
1.4.3.1.5 API Connections Create DryRun v4
1.4.3.1.6 API Connections Create v4
1.4.3.1.7 API Connections Delete v4
1.4.3.1.8 API Connections Get List v4
1.4.3.1.9 API Connections Get Status v4
1.4.3.1.10 API Connections Get v4
1.4.3.1.11 API Connections Patch v4
1.4.3.1.12 API Connections Permissions Create User v4
1.4.3.1.13 API Connections Permissions Delete User v4
1.4.3.1.14 API Connections Permissions Get User v4
1.4.3.1.15 API Connections Vendors Get List v4
1.4.3.1.16 API Deployments Create v4
1.4.3.1.17 API Deployments Delete v4
1.4.3.1.18 API Deployments Get List v4
1.4.3.1.19 API Deployments Get Release List v4
1.4.3.1.20 API Deployments Get v4
1.4.3.1.21 API Deployments Object Import Rules Patch v4
1.4.3.1.22 API Deployments Patch v4
1.4.3.1.23 API Deployments Run v4
1.4.3.1.24 API Deployments Value Import Rules Patch v4
1.4.3.1.25 API EMRClusters Create v4
1.4.3.1.26 API EMRClusters Delete v4
1.4.3.1.27 API EMRClusters Get Count v4
1.4.3.1.28 API EMRClusters Get List v4
1.4.3.1.29 API EMRClusters Get v4
1.4.3.1.30 API EMRClusters Patch v4
1.4.3.1.31 API Flows Create v4
1.4.3.1.32 API Flows Delete v4
1.4.3.1.33 API Flows Get List v4
1.4.3.1.34 API Flows Get v4
1.4.3.1.35 API Flows Package Get DryRun v4
1.4.3.1.36 API Flows Package Get v4
1.4.3.1.37 API Flows Package Post DryRun v4
1.4.3.1.38 API Flows Package Post v4
1.4.3.1.39 API Flows Patch v4
1.4.3.1.40 API ImportedDatasets Create v4
1.4.3.1.41 API ImportedDatasets Delete v4
1.4.3.1.42 API ImportedDatasets Get List v4
1.4.3.1.43 API ImportedDatasets Get v4
1.4.3.1.44 API ImportedDatasets Patch v4
1.4.3.1.45 API ImportedDatasets Post AddToFlow v4
1.4.3.1.46 API JobGroups Cancel v4
1.4.3.1.47 API JobGroups Create v4
1.4.3.1.48 API JobGroups Delete v4
1.4.3.1.49 API JobGroups Get Jobs v4
1.4.3.1.50 API JobGroups Get List v4
1.4.3.1.51 API JobGroups Get Publications v4
1.4.3.1.52 API JobGroups Get Status v4
1.4.3.1.53 API JobGroups Get v4
1.4.3.1.54 API JobGroups Put Publish v4
1.4.3.1.55 API OutputObjects Create v4
1.4.3.1.56 API OutputObjects Delete v4
1.4.3.1.57 API OutputObjects Get List v4
1.4.3.1.58 API OutputObjects Get v4
1.4.3.1.59 API OutputObjects Update v4
1.4.3.1.60 API People Create v4
1.4.3.1.61 API People Delete v4
1.4.3.1.62 API People Get List v4
1.4.3.1.63 API People Get v4
1.4.3.1.64 API People Patch v4
1.4.3.1.65 API Publications Create v4
1.4.3.1.66 API Publications Delete v4
1.4.3.1.67 API Publications Get List v4
1.4.3.1.68 API Publications Get v4
1.4.3.1.69 API Publications Update v4
1.4.3.1.70 API Releases Create DryRun v4
1.4.3.1.71 API Releases Create v4
1.4.3.1.72 API Releases Delete v4
1.4.3.1.73 API Releases Get v4
1.4.3.1.74 API Releases Package Get v4
1.4.3.1.75 API Releases Patch v4
1.4.3.1.76 API WrangledDatasets Create v4
1.4.3.1.77 API WrangledDatasets Delete v4
1.4.3.1.78 API WrangledDatasets Get List v4
1.4.3.1.79 API WrangledDatasets Get PrimaryInputDataset v4
1.4.3.1.80 API WrangledDatasets Get v4
1.4.3.1.81 API WrangledDatasets Patch v4
1.4.3.1.82 API WrangledDatasets Post AddToFlow v4
1.4.3.1.83 API WrangledDatasets Put PrimaryInputDataset v4
1.4.3.1.84 API WriteSettings Create v4
1.4.3.1.85 API WriteSettings Delete v4
1.4.3.1.86 API WriteSettings Get List v4
1.4.3.1.87 API WriteSettings Get v4
1.4.3.1.88 API WriteSettings Update v4
1.4.3.2 v3 Endpoints
1.4.3.2.1 API Connections Create v3
1.4.3.2.2 API Connections Delete v3
1.4.3.2.3 API Connections Get List v3
1.4.3.2.4 API Connections Get Status v3
1.4.3.2.5 API Connections Get v3
1.4.3.2.6 API Deployments Create v3
1.4.3.2.7 API Deployments Delete v3
1.4.3.2.8 API Deployments Get List v3
1.4.3.2.9 API Deployments Get Release List v3
1.4.3.2.10 API Deployments Get v3
1.4.3.2.11 API Deployments Object Import Rules Patch v3
1.4.3.2.12 API Deployments Patch v3
1.4.3.2.13 API Deployments Run v3
1.4.3.2.14 API Deployments Value Import Rules Patch v3
1.4.3.2.15 API Flows Create v3
1.4.3.2.16 API Flows Delete v3
1.4.3.2.17 API Flows Get List v3
1.4.3.2.18 API Flows Get v3
1.4.3.2.19 API Flows Package Get DryRun v3
1.4.3.2.20 API Flows Package Get v3
1.4.3.2.21 API Flows Package Post DryRun v3
1.4.3.2.22 API Flows Package Post v3
1.4.3.2.23 API Flows Patch v3
1.4.3.2.24 API ImportedDatasets Create v3
1.4.3.2.25 API ImportedDatasets Delete v3
1.4.3.2.26 API ImportedDatasets Get List v3
1.4.3.2.27 API ImportedDatasets Get v3
1.4.3.2.28 API ImportedDatasets Post AddToFlow v3
1.4.3.2.29 API JobGroups Create v3
1.4.3.2.30 API JobGroups Delete v3
1.4.3.2.31 API JobGroups Get Jobs v3
1.4.3.2.32 API JobGroups Get List v3
1.4.3.2.33 API JobGroups Get Status v3
1.4.3.2.34 API JobGroups Get v3
1.4.3.2.35 API JobGroups Put Publish v3
1.4.3.2.36 API People Create v3
1.4.3.2.37 API People Delete v3
1.4.3.2.38 API People Get List v3
1.4.3.2.39 API People Get v3
1.4.3.2.40 API People Patch v3
1.4.3.2.41 API Releases Create DryRun v3
1.4.3.2.42 API Releases Create v3
1.4.3.2.43 API Releases Delete v3
1.4.3.2.44 API Releases Get v3
1.4.3.2.45 API Releases Package Get v3
1.4.3.2.46 API Releases Patch v3
1.4.3.2.47 API WrangledDatasets Create v3
1.4.3.2.48 API WrangledDatasets Delete v3
1.4.3.2.49 API WrangledDatasets Get List v3
1.4.3.2.50 API WrangledDatasets Get PrimaryInputDataset v3
1.4.3.2.51 API WrangledDatasets Get v3
1.4.3.2.52 API WrangledDatasets Put PrimaryInputDataset v3
1.4.3.3 API Session Get
1.4.4 API Version Support Matrix
1.4.4.1 API Migration to v4
1.4.5 API - UI Integrations
1.4.5.1 UI Integration - Create Dataset
1.4.6 API Workflows
1.4.6.1 API Workflow - Develop a Flow
1.4.6.2 API Workflow - Deploy a Flow
1.4.6.3 API Workflow - Run Job on Dataset with Parameters
1.4.6.4 API Workflow - Publish Results
1.4.6.5 API Workflow - Manage Outputs
1.4.6.6 API Workflow - Swap Datasets

Developer
This section contains topics of interest to data engineers and other developers.

Use of the features documented in this section requires programming skills.


User-Defined Functions
Contents:
UDF Service
Supported UDF Language Frameworks
Running a UDF within the Platform

The Trifacta® platform enables the creation of user-defined functions (UDFs) for use in your Trifacta deployment.
A UDF is a custom function, written in one of the supported language frameworks, that specifies a process or
transformation for your Trifacta solution using familiar development languages and third-party libraries.
Through UDFs, you can apply enterprise- or industry-specific expertise consistently in your data
transformations. Each UDF has a defined set of inputs and generates a single output.

UDF Service

The following diagrams provide a high-level overview of the UDF service, which integrates
user-defined functions into recipe execution.
Diagram 1: The figure illustrates execution of a UDF in interactive mode, where a user interacts with the
Transformer grid.
Diagram 2: The figure illustrates how UDFs interact with Hadoop at job execution time.

Figure: User-Defined Service

Supported UDF Language Frameworks

Please use the following links to enable the creation of user-defined functions in the listed language.
Java UDFs

Running a UDF within the Platform

After you have created and tested your UDF, you can execute it by entering udf in the Search panel and
populating the rest of the step in the Transform Builder. In this example, the AdderUDF function is executed:

udf name:'AdderUDF' col:column1 args:'1' as:'udf_output'

Notes:
The udf command causes the named UDF to run.
After you type name, your UDF should appear in a drop-down list. If not, please verify that it has been
properly created, compiled, and registered and that the udf-service has been restarted.
The col argument is a comma-separated list of the source data to be used as inputs to the exec method.
The args argument is a string of comma-separated values used as inputs to the init method.
Optionally, the as parameter can be used to provide a specific name to the generated column. If it is not
used, a column name is generated.
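
The notes above can be read as a small lifecycle: the args values reach init once, and the col values reach exec once per record. The following Java sketch mimics that flow for the AdderUDF step shown above. It is an illustration only: UdfStepSketch and its run method are hypothetical, the TrifactaUDF interface is a local stand-in declared with the method signatures used by the Java UDF examples in this guide, and the conversion of the recipe string '1' into a Long is assumed to be done by the platform before init is called.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical driver, not part of the platform: it imitates how a udf step
// is evaluated so that the roles of args and col are visible.
final class UdfStepSketch {
    static <T> List<T> run(TrifactaUDF<T> udf, List<Object> args, List<List<Object>> rows) {
        List<T> output = new ArrayList<>();
        try {
            udf.init(args);                    // args:'1' is delivered once
            for (List<Object> row : rows) {
                output.add(udf.exec(row));     // col:column1 is delivered per record
            }
            udf.finish();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return output;                         // becomes the as:'udf_output' column
    }

    public static void main(String[] unused) {
        // Models: udf name:'AdderUDF' col:column1 args:'1' as:'udf_output'
        List<List<Object>> column1 = Arrays.asList(
            Arrays.<Object>asList(1L),
            Arrays.<Object>asList(9000L));
        System.out.println(run(new AdderUDF(), Arrays.<Object>asList(1L), column1)); // [2, 9001]
    }
}

// Local stand-in for the SDK interface (assumption: the real one may differ).
interface TrifactaUDF<T> {
    void init(List<Object> initArgs);
    T exec(List<Object> inputs) throws IOException;
    @SuppressWarnings("rawtypes")
    Class[] inputSchema();
    void finish() throws IOException;
}

// Condensed version of the AdderUDF example from the Java UDFs section.
class AdderUDF implements TrifactaUDF<Long> {
    private Long addAmount;

    @Override
    public void init(List<Object> initArgs) {
        addAmount = (Long) initArgs.get(0);
    }

    @Override
    public Long exec(List<Object> input) {
        if (input == null || input.size() != 1 || input.get(0) == null) {
            return null;
        }
        return (Long) input.get(0) + addAmount;
    }

    @Override
    @SuppressWarnings("rawtypes")
    public Class[] inputSchema() {
        return new Class[]{Long.class};
    }

    @Override
    public void finish() {
    }
}
```

In this sketch, a step with more than one source column would simply pass longer row lists to exec, in the order the columns are listed in col.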

NOTE: When a recipe containing a user-defined function is applied to text data, any non-printing (control)
characters cause records to be truncated by the running environment during Hadoop job execution. In
these cases, please execute the job on the Trifacta Server.

See Transformer Page.


Java UDFs
Contents:
Pre-requisites
Overview
Known Limitations
Enable Service
Deployment
Creating a UDF
UDF Requirements
Example - Concatenate strings
Example - Add by constant
Error Handling
Testing the UDF
Compiling the UDF
JDK version mismatches
Registering the UDF
Enabling UDF service on HDInsight cluster
Running Your UDF
Troubleshooting
"Websocket Receive()" error in Transformer page UI

This section describes how to create and deploy Java-based user-defined functions (UDFs) into your Trifacta®
deployment.

Creation of UDFs requires development experience and access to an integrated development
environment (IDE).

Pre-requisites

1. Access to the Trifacta deployment
2. IDE
3. The Java UDF SDK, which is stored in the Trifacta deployment in the following location:
libs/custom-udfs-sdk/build/distributions/java-custom-udf-sdk.zip

NOTE: If you are installing custom UDFs and the Trifacta node does not have an Internet connection, you
should download the Java UDF SDK in an Internet-accessible location, build your custom UDF JAR
there, and then upload the JAR to the Trifacta node.

Overview

Each UDF can take one or more inputs and produces a single output value (map only).
Inputs and outputs must be one of the following types:
Bool
String
Long
Double

Known Limitations

In the Trifacta application, previews are not available for user-defined functions.
Retaining state information across calls to the exec method is unstable. More information is provided below.

NOTE: When a recipe containing a user-defined function is applied to text data, any non-printing (control)
characters cause records to be truncated by the running environment during Hadoop job execution. In
these cases, please execute the job on the Trifacta Server.

Enable Service

You must enable the Java UDF service in the Trifacta platform.
Steps:
1. You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json.
For more information, see Platform Configuration Methods.
2. Enable the correct flag:

"feature.enableUDFTransform.enabled": true,

3. Save your changes.

Deployment

Steps:
1. Unzip java-custom-udf-sdk.zip.

2. Within the unzipped directory, execute the install command. The following is specific to the Eclipse IDE:

gradlew eclipse

3. Import the project into your IDE.

Creating a UDF

UDF Requirements

All UDFs must implement the TrifactaUDF interface. This interface adds the four methods that each UDF must
override: init, exec, inputSchema, and finish.
1. init method: Used for setting private variables in the UDF. This method may be a no-op function if no
variables must be set. See the Example - Concatenate strings below.

Tip: In this method, perform your data validation on the input parameters, including count, data
type, and other constraints.

NOTE: The init method must be specified but can be empty, if there are no input parameters.

2. exec method: Contains the functionality of the UDF. The output of the exec method must be one of the
supported types, and it must match the type parameter of the interface. In the Concatenate strings example
below, ConcatUDF implements TrifactaUDF<String>, so exec returns a String. This method is run on each record.

Tip: In this method, you should check the number of input columns.

Keeping state that varies across calls to the exec method can lead to unexpected behavior.
One-time initialization, such as initializing the regex compiler, is safe, but do not allow state
information to mutate across calls to exec. This is a known issue.

3. inputSchema method: The inputSchema method describes the schema of the list on which the exec
method is acting. The classes in the schema must be supported. Essentially, you should support the I/O
types described earlier.
4. finish method: The finish method is run at the end of the UDF's execution. Typically, it is a no-op.

NOTE: If you are executing your UDF on the Spark running environment, the finish method cannot
be invoked at this point. Instead, it is invoked as part of the shutdown of the Java VM. This later
execution may result in the finish method failing to be invoked in situations like a JVM crash.
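
The guidance on state can be made concrete with a minimal sketch. RegexExtractUDF below is a hypothetical UDF, not one shipped with the SDK: it compiles a regular expression once in init and keeps exec free of mutable fields, so no information carries over from one record to the next. The TrifactaUDF interface is declared locally as a stand-in, using the method signatures that appear in the examples in this guide, so that the sketch compiles on its own.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Small demo entry point for the sketch.
final class StatelessUdfSketch {
    public static void main(String[] unused) {
        RegexExtractUDF udf = new RegexExtractUDF();
        udf.init(Arrays.<Object>asList("[0-9]+"));
        System.out.println(udf.exec(Arrays.<Object>asList("order 42 shipped"))); // 42
        udf.finish();
    }
}

// Local stand-in for the SDK interface (assumption: the real one may differ).
interface TrifactaUDF<T> {
    void init(List<Object> initArgs);
    T exec(List<Object> inputs) throws IOException;
    @SuppressWarnings("rawtypes")
    Class[] inputSchema();
    void finish() throws IOException;
}

// Hypothetical UDF: returns the first substring that matches the pattern
// supplied through init, or null when there is no match.
class RegexExtractUDF implements TrifactaUDF<String> {
    // Assigned once in init and only read afterwards (safe one-time initialization).
    private Pattern pattern;

    @Override
    public void init(List<Object> initArgs) {
        // Validate count and type of the init parameters, as the tip for init suggests.
        if (initArgs == null || initArgs.size() != 1 || !(initArgs.get(0) instanceof String)) {
            throw new IllegalArgumentException("RegexExtractUDF takes exactly one String init argument");
        }
        pattern = Pattern.compile((String) initArgs.get(0));
    }

    @Override
    public String exec(List<Object> inputs) {
        // Check the number of input columns before touching the data.
        if (inputs == null || inputs.size() != 1 || inputs.get(0) == null) {
            return null;
        }
        // The Matcher is a local variable, so nothing survives between records.
        Matcher matcher = pattern.matcher((String) inputs.get(0));
        return matcher.find() ? matcher.group() : null;
    }

    @Override
    @SuppressWarnings("rawtypes")
    public Class[] inputSchema() {
        return new Class[]{String.class};
    }

    @Override
    public void finish() {
        // No-op: there is nothing to release.
    }
}
```

By contrast, a field that exec updates on every call, such as a running total or a "previous row" value, is the kind of mutating state that the warning above advises against.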

Example - Concatenate strings

The following code example concatenates two input strings in the List<Object>. This UDF can be easily
modified to concatenate more strings by modifying the inputSchema function.

Example UDF: ConcatUDF
package com.trifacta.trifactaudfs;

import java.io.IOException;
import java.util.List;

/**
 * Example UDF that concatenates two columns
 */
public class ConcatUDF implements TrifactaUDF<String> {
    @Override
    public String exec(List<Object> inputs) throws IOException {
        // A missing input list yields a null output value.
        if (inputs == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < inputSchema().length; i += 1) {
            // Any null input column makes the whole result null.
            if (inputs.get(i) == null) {
                return null;
            }
            sb.append(inputs.get(i));
        }
        return sb.toString();
    }

    // One entry per input column, in the order the columns are passed in.
    @Override
    @SuppressWarnings("rawtypes")
    public Class[] inputSchema() {
        return new Class[]{String.class, String.class};
    }

    @Override
    public void finish() throws IOException {
    }

    @Override
    public void init(List<Object> initArgs) {
    }
}

Notes:
The first line indicates that the function is part of the com.trifacta.trifactaudfs package.
The defined UDF class implements the TrifactaUDF interface, which is the base interface for UDFs.
It is parameterized with the return type of the UDF (a Java String in this case).
The input into the function is a list with input parameters in the order they are passed to the function
within the Trifacta platform. See Running Your UDF below.
The UDF checks the input data for null values, and if any nulls are detected, returns a null.
The inputSchema describes the input list passed into the exec method.
An error is thrown if the type of the data that is passed into the UDF does not match the schema.
The UDF must handle improper data. See Error Handling below.

Example - Add by constant

In this example, a constant, which is defined in the init method, is added to the input value.

The init method consumes a list of objects, each of which can be used to set a variable in the UDF. The
input into the init function is a list with parameters in the order they are passed to the function within the
Trifacta platform. See Running Your UDF below.

Example UDF: AdderUDF


package com.trifacta.trifactaudfs;

import java.io.IOException;
import java.util.List;

/**
 * Example UDF. Adds a constant amount to an integer (Long) column.
 */
public class AdderUDF implements TrifactaUDF<Long> {
    // Set once in init and only read in exec.
    private Long _addAmount;

    @Override
    public void init(List<Object> initArgs) {
        if (initArgs.size() != 1) {
            System.out.println("AdderUDF takes in exactly one init argument");
        }
        Long addAmount = (Long) initArgs.get(0);
        _addAmount = addAmount;
    }

    @Override
    public Long exec(List<Object> input) {
        if (input == null) {
            return null;
        }
        // Exactly one input column is expected.
        if (input.size() != 1) {
            return null;
        }
        return (Long) input.get(0) + _addAmount;
    }

    @Override
    @SuppressWarnings("rawtypes")
    public Class[] inputSchema() {
        return new Class[]{Long.class};
    }

    @Override
    public void finish() throws IOException {
    }
}

Error Handling

The UDF must handle any error that occurs when processing the function. There are two ways of dealing with errors:
1. For null data generated in the exec method, a null value can be returned. It appears in the final generated
column.
2. Any errors that cause the UDF to stop in the init or exec methods cause an IOException to be thrown. This
error signals the platform that an issue occurred with the UDF.

Tip: You can add to the Trifacta logs through Logger. Annotate your exceptions at the appropriate logging
level.
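
As a rough illustration of the two approaches, the following sketch returns null for unusable data and throws an IOException for a problem that should stop the UDF. SafeDivideUDF is a hypothetical example, the TrifactaUDF interface is a local stand-in declared with the method signatures used in this guide, and java.util.logging is an assumption: the guide does not say which Logger implementation feeds the Trifacta logs.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Small demo entry point for the sketch.
final class ErrorHandlingSketch {
    public static void main(String[] unused) throws IOException {
        SafeDivideUDF udf = new SafeDivideUDF();
        udf.init(Collections.<Object>emptyList());
        System.out.println(udf.exec(Arrays.<Object>asList(9.0, 3.0))); // 3.0
        System.out.println(udf.exec(Arrays.<Object>asList(9.0, 0.0))); // null
    }
}

// Local stand-in for the SDK interface (assumption: the real one may differ).
interface TrifactaUDF<T> {
    void init(List<Object> initArgs);
    T exec(List<Object> inputs) throws IOException;
    @SuppressWarnings("rawtypes")
    Class[] inputSchema();
    void finish() throws IOException;
}

// Hypothetical UDF that divides the first input column by the second.
class SafeDivideUDF implements TrifactaUDF<Double> {
    // Assumption: java.util.logging stands in for whatever Logger the platform uses.
    private static final Logger LOG = Logger.getLogger(SafeDivideUDF.class.getName());

    @Override
    public void init(List<Object> initArgs) {
        // No init parameters are needed.
    }

    @Override
    public Double exec(List<Object> inputs) throws IOException {
        if (inputs == null) {
            return null;
        }
        if (inputs.size() != 2) {
            // Way 2: a problem that should stop the UDF is signalled with an IOException.
            throw new IOException("SafeDivideUDF expects 2 input columns but received " + inputs.size());
        }
        Object dividend = inputs.get(0);
        Object divisor = inputs.get(1);
        if (dividend == null || divisor == null) {
            // Way 1: null data produces a null in the generated column.
            return null;
        }
        double numerator = (Double) dividend;
        double denominator = (Double) divisor;
        if (denominator == 0.0) {
            LOG.log(Level.WARNING, "Division by zero; returning null for this record");
            return null;
        }
        return numerator / denominator;
    }

    @Override
    @SuppressWarnings("rawtypes")
    public Class[] inputSchema() {
        return new Class[]{Double.class, Double.class};
    }

    @Override
    public void finish() {
    }
}
```

Treating division by zero as null data rather than as a fatal error is a design choice made for this sketch; a UDF for which a zero divisor indicates corrupt input could throw instead.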

Testing the UDF

JUnit can be used to test the UDF. Below are examples of testing the two example UDFs.
Example - JUnit test for Concatenate strings:

ConcatUDF Test
@Test
public void concatUDFTest() throws IOException {
    ConcatUDF concat = new ConcatUDF();
    ArrayList<Object> input = new ArrayList<Object>();
    input.add("hello");
    input.add("world");
    String result = concat.exec(input);
    String expected = "helloworld";
    assertEquals(expected, result);
}

Example - JUnit test for Add by constant:

AdderUDF Test
@Test
public void adderUDFTest() {
    AdderUDF add = new AdderUDF();
    ArrayList<Object> initArgs = new ArrayList<Object>(1);
    initArgs.add(1L);
    add.init(initArgs);
    ArrayList<Object> inputs1 = new ArrayList<Object>();
    inputs1.add(1L);
    long result = add.exec(inputs1);
    long expected = 2L;
    assertEquals(expected, result);

    ArrayList<Object> inputs2 = new ArrayList<Object>();
    inputs2.add(9000L);
    result = add.exec(inputs2);
    expected = 9001L;
    assertEquals(expected, result);
}

Compiling the UDF

After you write the UDF, you must compile it and package it in a JAR before registering it with the platform.
To compile and package the function, run the following command from the root directory:

gradlew build

The UDF code is assembled, and unit tests are executed. If the build succeeds, the JAR file is created in
build/libs.

NOTE: Custom UDFs should be compiled to one or more JAR files. Avoid using the example JAR
filename, which can be overwritten on upgrade.

JDK version mismatches

To avoid an Unsupported major.minor version error during execution, the JDK version used to compile
the UDF JAR file should be less than or equal to the JDK version on the Hadoop cluster.
If this is not possible, then set the value of the Compatibility properties in the local build.gradle file to the
JDK version on the Hadoop cluster prior to building the JAR file.
Example:
If the Hadoop cluster is on JDK 1.8, then add the following to the build.gradle file:

targetCompatibility = '1.8'
sourceCompatibility = '1.8'
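
When the error does appear, it can help to confirm which Java version a compiled UDF class actually targets. The sketch below is a generic check and not a Trifacta tool: it reads the standard class-file header, which begins with the magic number 0xCAFEBABE followed by a two-byte minor version and a two-byte major version. A major version of 52 corresponds to Java 8 and 51 to Java 7. The class name ClassFileVersion is hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

// Reads the class-file major version from the first eight bytes of a .class file.
final class ClassFileVersion {
    private static final int MAGIC = 0xCAFEBABE;

    /** Returns the class-file major version, for example 52 for Java 8. */
    static int majorVersion(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("Too short to be a class file");
        }
        ByteBuffer header = ByteBuffer.wrap(classBytes); // big-endian by default
        if (header.getInt() != MAGIC) {
            throw new IllegalArgumentException("Missing 0xCAFEBABE magic number");
        }
        header.getShort();                 // minor version, not needed here
        return header.getShort() & 0xFFFF; // major version as an unsigned value
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ClassFileVersion <path/to/SomeUDF.class>");
            return;
        }
        int major = majorVersion(Files.readAllBytes(Paths.get(args[0])));
        // For Java 5 and later, the release number is the major version minus 44.
        System.out.println("class-file major version " + major + " (Java " + (major - 44) + ")");
    }
}
```

To use it, extract one class from the UDF JAR and pass its path as the argument. The JDK's javap -verbose command reports the same major version value.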

Registering the UDF

After a function is compiled, it must be registered with the platform. Registration requires the following:
1. User-defined functions must be enabled (if not already done).
2. The path to the JAR file that was generated in the previous steps must be specified.
3. The udfPackages value should contain the package name where the UDFs can be found.

Example configuration:
To apply this configuration change, log in as an administrator to the Trifacta node. Then, edit
trifacta-conf.json. Some of these settings may not be available through the Admin Settings Page. For more
information, see Platform Configuration Methods.

Example Config
...
"feature": {
  "enableUDFTransform": {
    "enabled": true
  }
},
"udf-service": {
  "classpath": "%(topOfTree)s/services/udf-service/build/libs/udf-service.jar:%(topOfTree)s/services/udf-service/build/dependencies/*",
  "additionalJars": [
    "/vagrant/libs/custom-udfs-sdk/build/libs/custom-udfs-example.jar"
  ],
  "udfPackages": [
    "com.trifacta.trifactaudfs"
  ]
},
...

Notes:
Set enableUDFTransform.enabled to true, which enables UDFs in general.
Under udf-service:
specify the full path to the JAR under additionalJars
append the paths of any extra JAR dependencies that your UDFs require under classpath

NOTE: Do not include any extra JAR dependencies in the udf-service/build/dependencies
directory, as this directory may be purged at build time.

specify the fully qualified package names under udfPackages
This list contains the fully qualified names of all packages that contain your UDFs.
For example, if your UDF is com.company.ourudfs.MyUDF, then the package name is the
following: com.company.ourudfs

Steps:
1. After modifying the configuration, restart the appropriate service:
a. If you created a new UDF, restart the Trifacta application:

service trifacta restart

b. If you have modified an existing UDF, restart the UDF service:

NOTE: For an existing UDF, you must rebuild the JAR first. Otherwise, the changes are not
recognized during service re-initialization.

service java-udf-service restart

2. As part of the restart, any newly added Java UDFs are registered with the application.

Enabling UDF service on HDInsight cluster

By default, the UDF service utilizes compression across the websockets when running on the cluster. HDInsight
clusters do not support compression on websockets.
To make sure the UDF service works on your HDInsight cluster, please do the following.
Steps:
1. To apply this configuration change, log in as an administrator to the Trifacta node. Then, edit
trifacta-conf.json. Some of these settings may not be available through the Admin Settings Page. For more
information, see Platform Configuration Methods.
2. Locate the udf-service configuration.
3. Insert the following extra property in the udf-service configuration area:

"udf-service": {
...
"jvmOptions":
["-Dorg.apache.tomcat.websocket.DISABLE_BUILTIN_EXTENSIONS=true"],
...
}

4. Save your changes and restart the platform.

Running Your UDF

For more information on executing your UDF in the Transformer page, see User-Defined Functions.

Troubleshooting

"Websocket Receive()" error in Transformer page UI

If you execute a Java UDF, you may see an error similar to the following in the Transformer page:

Please reload page (query execution failed).pp::WebSocket::Receive()
error: Unspecified failure.

When you check the udf.log file on the server, the following may be present:

UDFWebsocket closed with status: CloseStatus[code=1009, reason=The
decoded text message was too big for the output buffer and the endpoint
does not support partial messages]

Solution

The above issue is likely caused by the Photon running environment sending too much data through the
buffer of the UDF's Websocket service. By default, this buffer size is set to 1048576 bytes (1 MB).
The Photon running environment sends data through the Websocket service in batches of 1024 (1 K) rows for
the input and output columns of the UDF. If the data in the input columns to the UDF or the output columns
from the UDF exceeds 1 KB (1024 characters) in total size for each row, the default buffer is too small, since
Photon processes 1 K rows at a time (more than 1 K characters * 1 K rows exceeds 1048576 bytes). The query
then fails.
When setting a new buffer size:
Assume that 1024 rows are processed from the buffer each time.
Identify the input columns and output columns for the UDF that is failing.
Identify the dataset that has the widest columns for both inputs and outputs here.

Tip: You can use the LEN function to do string-based computations of column width. See
LEN Function.

Perform the following estimate on the widest set of input and output columns that you are processing:
Estimate the total expected number of characters for the input columns of the UDF.
Add a 20% buffer to the above estimate.
Repeat the above estimate for the widest output columns for the UDF.
Set your buffer size to the larger of the two estimates (input columns' width or output columns'
width).
Example: A UDF takes two inputs and produces one output:
If each input column is 256 characters, then the size of 1 K rows of input would be 256 bytes * 2
(input cols) * 1024 rows = 0.5 MB.
If the output of the UDF per row is estimated to be 1024 characters, then the output estimate would
be 1024 bytes * 1024 rows = 1 MB.
So, set the buffer size to the larger of the two estimates (1 MB for the output) plus the 20% buffer. In
this example, the buffer size should be 1.2 MB or 1258291 bytes.
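The estimate above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name is hypothetical, and it assumes one byte per character, 1024 rows per batch, and a 20% margin, as described above.

```python
ROWS_PER_BATCH = 1024   # rows processed from the buffer each time
MARGIN_PERCENT = 20     # safety margin added to the estimate


def estimate_buffer_bytes(input_widths, output_widths):
    """Suggest a buffer size in bytes (hypothetical helper).

    input_widths / output_widths: widest expected number of characters
    per row for each input or output column of the UDF.
    Assumes one byte per character.
    """
    input_bytes = sum(input_widths) * ROWS_PER_BATCH
    output_bytes = sum(output_widths) * ROWS_PER_BATCH
    larger = max(input_bytes, output_bytes)
    # Integer arithmetic avoids floating-point rounding; result is rounded down.
    return larger * (100 + MARGIN_PERCENT) // 100


# Two 256-character input columns and one 1024-character output column.
print(estimate_buffer_bytes([256, 256], [1024]))  # prints 1258291
```

The printed value is the number that you would enter for the buffer size setting in the steps that follow.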
Steps:
1. You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json.
For more information, see Platform Configuration Methods.
2. Change the following setting:

"udf-service.outputBufferSize": 1048576,

3. Save your changes and restart the platform.

Create Custom Data Types Using RegEx


Contents:
Custom Types Location
Examples
Defining probabilities
Add custom types to manifest
Enable custom types
Register your custom types
Restart platform



As needed, you can deploy custom data types into the Trifacta® platform, in which type validation is performed
against regular expressions that you specify. This method is most useful for validating against patterns, as
opposed to specific values.
If your custom data type contains a pre-defined set of values, you can create the custom type using a
dictionary file for validation. See Create Custom Data Types.

Custom Types Location

On the server hosting the Trifacta platform, type definitions such as dictionaries and custom data types are stored
in the following directory:

/opt/trifacta/js-data/type-packs/trifacta

This directory is referenced as $CUSTOM_TYPE_DIR in the steps below.

Before you begin creating custom data types, you should back up the type-packs/trifacta directory
to a location outside of your Trifacta deployment.

NOTE: The trifacta-extras directory in the type-packs directory contains experimental custom
data types. These data types are not officially supported. Please use with caution.

Directory contents:
The dictionaries sub-directory contains user-defined dictionaries.

NOTE: Please use the user interface to interact with your dictionaries. See Custom Type Dialog.

The types sub-directory contains individual custom data type definitions, each in a separate file.
The manifest.json file contains a JSON manifest of all of the custom dictionaries and types in the
system.

Examples

Each custom data type is created and stored in a separate file. The following example file contains a regular
expression method for validating data against the set of days of the week:



{
  "name": "DayOfWeek",
  "prettyName": "Day of Week",
  "category": "Date/Time",
  "defaultProbability": 1E-15,
  "testCase": {
    "stripWhitespace": true,
    "regexes": [
      "^(monday|tuesday|wednesday|thursday|friday|saturday|sunday)$",
      "^(mon|tue|wed|thu|fri|sat|sun)$"
    ],
    "probability": 0.001
  }
}

Parameters:

Parameter Name Description

name Internal identifier for the custom type. Must be unique across all standard types and custom types.

NOTE: You should verify that your data type's name value does not conflict with other custom
data type names.

prettyName Display name for the custom type.

category The category to assign to the type. The current categories are displayed within the data type drop-down for
each column.

defaultProbability Assign a default probability for the custom type. See below.

testCase This block contains the regular expression specification to be applied to the column values.

stripWhitespace When set to true, whitespace is removed from each value before validation. The original
value is untouched.

regexes This array contains a set of regular expressions that are used to validate the column values. For a regex
type, the column value must match with at least one value among the set of expressions.

NOTE: All backslash escape sequences must be double-escaped in the regular expression. For
example, to use the \d pattern, you must enter: \\d.

Trifacta Wrangler Enterprise implements a version of regular expressions based on RE2 and PCRE
regular expressions.

probability (optional) Assign an incremental change to the probability when a match is found between a value and one
of the regular expressions. See Defining probabilities below.

Tip: In the types sub-directory, you can review the regex-based types that are provided with the Trifacta
platform. While you should not edit these files directly, they may provide some guidance and some regex
tips on how to configure your own custom data types.
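Before you deploy a type definition, it can help to check its regular expressions locally. The following is a minimal sketch, not the platform's validation logic. The FiveDigitCode type and the matches_type function are hypothetical, Python's re module is not the RE2/PCRE-based engine used by the platform, and stripWhitespace is assumed to remove all whitespace from the value.

```python
import json
import re

# Hypothetical type definition, as it would be written in a JSON file.
# In the JSON text, the \d pattern is double-escaped as \\d.
TYPE_JSON = r'''
{
  "name": "FiveDigitCode",
  "testCase": {
    "stripWhitespace": true,
    "regexes": ["^\\d{5}$"]
  }
}
'''

TYPE_DEF = json.loads(TYPE_JSON)


def matches_type(value, type_def):
    """Return True if value matches at least one regex of the type."""
    test_case = type_def["testCase"]
    if test_case.get("stripWhitespace"):
        # Assumption: all whitespace is removed, not only leading/trailing.
        value = re.sub(r"\s+", "", value)
    return any(re.search(regex, value) for regex in test_case["regexes"])


# After JSON parsing, the expression contains a single backslash.
print(TYPE_DEF["testCase"]["regexes"][0])
print(matches_type(" 94105 ", TYPE_DEF))
print(matches_type("9410", TYPE_DEF))
```

Patterns that pass a local check like this one may still behave differently on the platform, so verify the type in the application after you register it.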



Defining probabilities

For your custom type, the probability values are used to determine the likelihood that matching values indicate
that the entire column is of the custom data type.
The defaultProbability value specifies the baseline probability that a match between a value and one
of the regular expressions indicates that the column is the specified type. On a logarithmic scale, values
are typically 1E-15 to 1E-20.
When a value is matched to one of the regular expressions, the probability value is used to increment
the baseline probability that the next matching value is of the specified type. This value should also be
expressed on a logarithmic scale (e.g. 0.001).
In this manner, a higher number of matching values increases the probability that the type is also a match
to the custom type.
Probabilities become important primarily if you are creating a custom type that is a subset of an existing type. For
example, the Email Address custom type is a subset of String type. So, matches for the patterns expressed in the
Email Address definition should register a higher probability value than the corresponding increment in the String
type definition.

Tip: For custom types that are subsets of other, non-String types, you should lower the defaultProbability
of the baseline type by a factor of 10 (e.g. 1E-15 to 1E-16) and raise the same probability in the
custom type by a factor of 10 (e.g. 1E-14). In this manner, you can give higher probability of matching to
these subset types.

Add custom types to manifest

To the $CUSTOM_TYPE_DIR/manifest.json file, you must add the filenames of any custom types that you
have created and stored in the types directory:

{
"types": ["bodies-of-water.json", "dayofweek.json"],
"dictionaries": ["oceans", "seas"]
}
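If you add types regularly, you may prefer to script the manifest update rather than edit the file by hand. The following is a minimal sketch: the add_type_to_manifest function is hypothetical, and it assumes the manifest has the structure shown above. Back up the file before running anything like this against a real deployment.

```python
import json
from pathlib import Path


def add_type_to_manifest(manifest_path, type_filename):
    """Append type_filename to the "types" list if it is not already listed.

    Existing entries, including the "dictionaries" list, are left in place.
    Returns the manifest as a dict.
    """
    path = Path(manifest_path)
    manifest = json.loads(path.read_text(encoding="utf-8"))
    types = manifest.setdefault("types", [])
    if type_filename not in types:
        types.append(type_filename)
        path.write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")
    return manifest
```

For example, calling add_type_to_manifest with the path to manifest.json in your $CUSTOM_TYPE_DIR and the value "dayofweek.json" adds the entry once and does nothing on later calls.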

Enable custom types

To enable use of your custom data types in the Trifacta platform, locate and edit the enabledSemanticTypes
property.
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For
more information, see Platform Configuration Methods.

NOTE: Add your entries to the items that are already present in enabledSemanticTypes. Do not delete
and replace entries.



"webapp.enabledSemanticTypes": [
"<CustomTypeName1>",
"<CustomTypeName2>",
"<CustomTypeNameN>"
]

where:
<CustomTypeName1> corresponds to the internal name value for your custom data type.

Register your custom types

To add your custom types to the Trifacta platform, run the following command from the js-data directory:

node bin/load-types --manifest ${PATH_TO_MANIFEST_FILE}

Restart platform

Restart services. See Start and Stop the Platform.


Check for the availability of your types in the column drop-down. See Create Custom Data Types.

Command Line Interface

In the next release of Trifacta® Wrangler Enterprise after Release 6.0, the Trifacta command line
interface tools will be removed from the product (End of Life). Before upgrading to that release or
a later one, you must migrate your usage of the CLI to use the REST APIs. For more information,
see CLI Migration to APIs.

The Trifacta® command line interface (CLI) enables scripted execution of jobs and management of users and
connections for the Trifacta platform. This section provides documentation on how to install and deploy the
command line tools and includes example commands for each supported action.
Topics:
CLI Migration to APIs
Install CLI Tools
CLI for Connections
CLI for Jobs
CLI for User Admin
CLI Config File

Logging

The CLI submits requests to the platform through the Trifacta application, which writes its logging information to
the following file:

/opt/trifacta/logs/webapp.log



In the above log file, some CLI requests, such as job execution, can be located by searching for the following:

"ranfrom": "cli"

Tip: From the output of the CLI, you should get in the habit of capturing the job, dataset, flow, or other
object identifier that the request is creating, modifying, or removing. These IDs are useful for parsing the
log file or locating the object in the application.
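As an illustration, the following sketch filters log lines for the marker described above. The cli_log_lines function is hypothetical, and the sketch assumes only that the marker appears somewhere in the line. It makes no other assumption about the log format.

```python
import re

# Matches "ranfrom": "cli" with or without spaces around the colon.
CLI_MARKER = re.compile(r'"ranfrom"\s*:\s*"cli"')


def cli_log_lines(lines):
    """Yield the lines that contain the CLI marker, without the newline."""
    for line in lines:
        if CLI_MARKER.search(line):
            yield line.rstrip("\n")
```

You can pass an open file object for /opt/trifacta/logs/webapp.log to cli_log_lines and print or further filter the lines it yields, for example by a job identifier captured from the CLI output.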

Administrators can download log files through the Trifacta node operating system or through the web interface for
the platform. For more information, see System Services and Logs.

Log Levels

By default, the logging level for the web application is set to INFO.

If you are attempting to debug an issue related to the CLI, you can change the logging level.
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For
more information, see Platform Configuration Methods.
The log level is defined in the following parameter:

"webapp.loggerOptions.level": "INFO",

For more information, see Admin Settings Page.


CLI Migration to APIs
Contents:
General Differences
API authentication
Terminology
Use internal identifiers
Object Identifiers
Get primary object identifiers
Example - Get User Id
Important notes on CLI packages
Reference - CLI for Jobs
Run Job
File Publishing Options
Get Job Status
Publish
Get Publications
Load Data into Table
Truncate and Load
Reference - CLI for Connections
Create Connection
Edit Connection
List Connections
Delete Connection
Reference - CLI for User Admin
Create User
Show User
Edit User
Generate Password Reset URL
Disable User
Delete User

In the next release of Trifacta® Wrangler Enterprise after Release 6.0, the Trifacta Command Line Interface (CLI) will
reach its end of life (EOL). This means:
A version of the CLI that is compatible with the release will no longer be available for use.
Old versions of the CLI will not work with the new version of the platform.


Before you upgrade to the next Trifacta Wrangler Enterprise release, you must migrate any scripts or other
automation projects that currently use the CLI to use the v4 versions of the APIs. This section provides
information on how to manage that migration.

General Differences

API authentication

In CLI usage, you pass the username and password with each command.


In API usage, you must pass some form of authentication as a header in each request.

Tip: The recommended method is to create an API access token for the user account that is to be
accessing the APIs. This feature may need to be enabled in your instance of the platform. For more
information, see Enable API Access Tokens.

For more information, see API Authentication.
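As a sketch of what passing authentication as a header can look like, the following helpers build an Authorization header for either method. The function names are hypothetical. The Bearer scheme for access tokens is an assumption here, so confirm the exact header format in API Authentication.

```python
import base64


def token_auth_header(access_token):
    """Header for an API access token (assumes the Bearer scheme)."""
    return {"Authorization": f"Bearer {access_token}"}


def basic_auth_header(username, password):
    """Header for HTTP basic authentication with username and password."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
```

Either dictionary can be merged into the headers of each request that you send to the platform.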

Terminology

Depending on the version you are using, please use the following mapping:

CLI/UI term API term Notes

Connection Connection

Job JobGroup In the application, a job that you launch is composed of one or more sub-jobs, such as ingest, profiling,
transformation, or sampling.

In the APIs, you reference a job that you launch as a jobGroup.


Script/Recipe WrangledDataset Depending on your previous version of the platform, the object may be referenced as a script or a
recipe.

In the APIs, the object is referenced by its internal platform name: wrangledDataset.
User User The API endpoint is people.



Use internal identifiers

Parameters passed to the CLI are often user-friendly text values. The CLI tool then queries the appropriate REST
API endpoint and converts those values to internal identifiers.
When using the APIs, you must reference the internal identifiers directly.
Below is some information on how you can acquire the appropriate internal identifiers for each type of operation
supported by the CLI.

Object Identifiers

Get primary object identifiers

For each CLI command, there is an associated object identifier, which is used to uniquely reference the object. To
reference the object through the APIs, you must use the API unique id.

Tip: In the JSON response from the listed APIs, there may be multiple id values. You may find
it easier to use the secondary ids to locate each item.

NOTE: Each API endpoint returns only the objects to which the authenticating user has access. If other
users have personal objects that they need to migrate, they must provide access to them to the
authenticating user.

CLI object: Connection
CLI unique id: conn_id
CLI secondary id: conn_name
API endpoint: API Connections Get List v4
API secondary id: name
API unique id: id

CLI object: Job
CLI unique id: job_id
API endpoint: API JobGroups Get List v4
API unique id: id
API notes: This endpoint gets the list of jobs that have been launched.

CLI object: script.cli
CLI unique id: n/a. See below.
CLI secondary id: Open recipe in Transformer page to acquire wrangledDataset Id. See "Important notes on CLI
packages" below.
API endpoint: API WrangledDatasets Get List v4
API unique id: id
API notes: This endpoint gets the list of available wrangled datasets (recipes), which are required for launching a
new job. The endpoint for launching a job is API JobGroups Create v4.

CLI object: User
CLI unique id: username
CLI secondary id: username
API endpoint: API People Get List v4
API secondary id: email
API unique id: id

Example - Get User Id

The following example steps through the process of acquiring user ids so that you can use the APIs.
CLI - Get list of usernames:
The CLI references users via their platform usernames.
If your CLI scripts contain references to individual users, search them for:



--username <user_email_address>

If you want to acquire the list of all available usernames, it's easier to do that via the APIs.
API - Get list of users:
Use the following API endpoint to get the list of all users, including deleted and disabled users.

Endpoint http://example.com:3005/v4/people
Authentication Required

Method GET
Request Body None.

Response Status Code 200 - Ok


Response Body Contains JSON representation of each user in the system.

Parsing the JSON:


In the JSON response body, here are the key values for each user entry:

Object Description

email This value maps to the username value in your CLI scripts.

id Unique internal identifier that you can use in other people endpoints.

Tip: You must map each email address to its corresponding id value.

isAdmin This value is true if the user is a Trifacta administrator.

isDisabled If true, the user is disabled and cannot use the platform.

state If active, the user is a currently active user.

Users who are suspended or deleted cannot access the platform.
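Once you have the response body from the people endpoint, the mapping from CLI usernames to API identifiers can be built as in the sketch below. The email_to_id function is hypothetical. The sketch assumes the users are returned either as a plain JSON list or inside a top-level "data" list, and it uses only the email, id, and isDisabled keys described above.

```python
def email_to_id(people_response, include_disabled=True):
    """Map each user's email (the CLI username) to its internal id.

    people_response: parsed JSON from the people endpoint, either a list
    of user objects or a dict holding them under "data" (an assumption).
    """
    if isinstance(people_response, dict):
        users = people_response.get("data", [])
    else:
        users = people_response
    return {
        user["email"]: user["id"]
        for user in users
        if include_disabled or not user.get("isDisabled", False)
    }
```

With this mapping, each username value found in your CLI scripts can be replaced with the corresponding id when you call the other people endpoints.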

Important notes on CLI packages

Unlike connection, job, or user objects, a CLI script package does not contain any references to platform objects
by design. These independent, self-contained objects can be used to run a script snapshot as a job at any time.

NOTE: When running jobs via the CLI, you are executing against a static recipe and other configuration
files on your local desktop. When you run via the APIs, you are executing against the current state of the
recipe object. So, if it is important that you execute your jobs against a read-only version of your recipe,
you should create copies of your flows before you run the job.

After you download, however, the script package is no longer aware of any changes that have occurred to the
source objects on the platform, which has the following implications:
1. If source objects, such as the source recipe, have changed, those changes are not present in the CLI
package.
a. The above does not apply to data sources. In the downloaded CLI package, sources are referenced
by URL, which means that they should be using the latest data.
b. If the data source URL has changed, however, that is not reflected in any previously downloaded
CLI packages.
2. There is no object identifier in the CLI that directly corresponds to a unique identifier in the platform.
If you download the CLI package each time that you wish to run a job:
You can acquire the recipe identifier from the Flow View page or Transformer page where you download
the CLI package.
In Flow View page, select the recipe icon. The URL is the following:

http://example.com:3005/flows/11?recipe=39&tab=recipe

The recipe id is 39.


In the Transformer page, the URL is the following:

http://example.com:3005/data/11/39

The recipe id is 39.


This value corresponds to the value to look for in the API WrangledDatasets Get List v4 output.
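If you collect these URLs in bulk, the recipe id can be extracted programmatically. The following sketch handles only the two URL shapes shown above. The recipe_id_from_url function is hypothetical, and other URL layouts are rejected rather than guessed at.

```python
from urllib.parse import parse_qs, urlparse


def recipe_id_from_url(url):
    """Extract the recipe id from a Flow View or Transformer page URL."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    # Flow View: .../flows/11?recipe=39&tab=recipe
    if "recipe" in query:
        return int(query["recipe"][0])
    # Transformer page: .../data/11/39 (the last segment is the recipe id)
    parts = [part for part in parsed.path.split("/") if part]
    if len(parts) == 3 and parts[0] == "data":
        return int(parts[2])
    raise ValueError(f"Unrecognized URL shape: {url}")


print(recipe_id_from_url("http://example.com:3005/flows/11?recipe=39&tab=recipe"))  # prints 39
print(recipe_id_from_url("http://example.com:3005/data/11/39"))  # prints 39
```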
If you wish to use the latest version of the recipe for running a job:
Use the Transformer page. See previous.
If you have downloaded and saved off a set of CLI packages for execution as recipe snapshots:
This use case may be problematic, as there may be differences between the platform version of the recipe
and the version in the CLI package. Currently, the platform does not support importing individual recipes or
CLI packages.
Options:
1. Make a copy of the closest approximation to your downloaded recipe. Modify the recipe steps in the
application to change the copy to match your local CLI package.
2. Create a new flow and rebuild the individual steps in the recipe to match your local CLI package.

Reference - CLI for Jobs


In this section, you can review the commands available in the CLI, followed by their equivalent commands using
the v4 APIs.
CLI Docs: CLI for Jobs

Run Job

You can issue commands to the CLI to execute jobs using the local package downloaded from the Recipe panel.

NOTE: When you run a job using the CLI, you are executing against a snapshot of a recipe at the
moment in time when the package was downloaded. Please be sure that you are aware of the Important
notes on CLI packages in the previous section.

Below are the three files in the package and their API equivalents:



File: script.cli
Description: A CLI-only version of the recipe to execute.
API equivalent: The APIs reference the latest definition of the recipe through the wrangledDataset object. See
API WrangledDatasets Get v4.

File: datasources.tsv
Description: A CLI-only set of links to the data sources used to execute the recipe.
API equivalent: The APIs reference the latest saved version of any datasource using the importedDataset object.
When running a job, the data sources referenced in the wrangledDataset object are automatically pulled into job
execution.

File: publishopts.json
Description: A CLI-only set of JSON definitions of the outputs that are generated when a job is executed.
API equivalent: If these outputs are part of the output definitions for the recipe in Flow View, they are
automatically generated as part of running the job. For more information, see Flow View Page.
If these outputs are overrides to the Flow View definitions, you can insert these outputs as writesettings objects
in the request body when you launch the job. An example of this is provided below. For more information on
managing writesettings via APIs, see API Workflow - Manage Outputs.

CLI example:

./trifacta_cli.py run_job --user_name <trifacta_user> --password <trifacta_password> \
  --job_type spark --output_format json \
  --data redshift-test/datasources.tsv --script redshift-test/script.cli \
  --cli_output_path ./job_info.out --profiler on \
  --output_path hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/42/cleaned_table_1.json

API v4 example - REST client:

NOTE: Inside the platform, this identifier is a reference to the jobGroup, which is the collection of sub-jobs
for a specified job. Sub-job types include: sampling, ingestion, transformation, and profiling. Collectively,
these appear under a single job identifier in the Trifacta application, and the same value is used as the
jobGroup Id in the APIs.

Default settings: After you have captured the wrangledDataset identifier, you can launch a new job using
default settings:

Endpoint http://localhost:3005/v4/jobGroups
Authentication Required

Method POST
Request Body
{
  "wrangledDataset": {
    "id": <wrangledDatasetId>
  }
}

Response Status Code 201 - Created

Response Body Job group definition

NOTE: A job group is composed of one or more sub-jobs for sampling, ingestion, transformation, and profiling,
where applicable. You can append ?embed=jobs to include sub-job information in the response.
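The default-settings request above can be assembled with the Python standard library, as in this sketch. The build_run_job_request function is hypothetical, the base URL and identifier are placeholders, and the Bearer token header is an assumption about how you authenticate.

```python
import json
import urllib.request


def build_run_job_request(base_url, access_token, wrangled_dataset_id):
    """Build (but do not send) the POST request that launches a job."""
    body = {"wrangledDataset": {"id": wrangled_dataset_id}}
    return urllib.request.Request(
        url=f"{base_url}/v4/jobGroups",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            # Assumption: an API access token sent with the Bearer scheme.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen sends the request. On success, the response body contains the job group definition, including the id to use for later status checks.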

Specify job overrides: The above request contains only the wrangledDataset identifier. All default output settings
are used.
If needed, you can override these default settings by specifying values as part of the request body. In the
following example, the relevant parameters from the CLI have been added as elements of the JSON body of the
request.
Through the APIs, you can also override the default files, formats, and locations where you output results in the
writesettings block.

Endpoint http://localhost:3005/v4/jobGroups
Authentication Required

Method POST
Request Body
{
  "wrangledDataset": {
    "id": <wrangled_dataset_id>
  },
  "overrides": {
    "execution": "spark",
    "profiler": true,
    "writesettings": [
      {
        "path": "hdfs://hadoop:50070/trifacta/queryResults/[email protected]/MyDataset/42/clea
        "action": "create",
        "format": "json",
        "compression": "none",
        "header": false,
        "asSingleFile": false
      }
    ]
  },
  "ranfrom": "cli"
}

Response Status Code 201 - Created

Response Body Job group definition

NOTE: A job group is composed of one or more sub-jobs for sampling, ingestion, transformation, and profiling,
where applicable. You can append ?embed=jobs to include sub-job information in the response.
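When you migrate a script, the run_job options need to be translated into the overrides block. The sketch below shows one way to do that. The overrides_from_cli function is hypothetical, and the mapping between CLI options and JSON properties is inferred from the paired CLI and API examples in this section, not taken from an official mapping table.

```python
def overrides_from_cli(job_type, profiler, output_path, output_format,
                       publish_action="create", header=False, single_file=False):
    """Translate run_job options into an "overrides" block (inferred mapping).

    job_type       <- --job_type        (for example "spark")
    profiler       <- --profiler        ("on" or "off")
    output_path    <- --output_path
    output_format  <- --output_format   (for example "json" or "csv")
    publish_action <- --publish_action
    header         <- --header
    single_file    <- --single_file
    """
    return {
        "execution": job_type,
        "profiler": profiler == "on",
        "writesettings": [
            {
                "path": output_path,
                "action": publish_action,
                "format": output_format,
                "compression": "none",
                "header": header,
                "asSingleFile": single_file,
            }
        ],
    }
```

The returned dictionary is placed under the "overrides" key of the request body, next to the wrangledDataset identifier.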



Reference Docs:
API WrangledDatasets Get List v4
API JobGroups Create v4
For more information on modifying output objects and write settings, see:
API OutputObjects Create v4
API WriteSettings Create v4
API Workflow - Manage Outputs

File Publishing Options

You can specify publication options as part of your run_job command. In the following, a single CSV file with
headers is written to a new file with each job execution.
CLI example (all one command):

./trifacta_cli.py run_job --user_name <trifacta_user> --password <trifacta_password> \
  --job_type spark --output_format csv \
  --data redshift-test/datasources.tsv --script redshift-test/script.cli \
  --publish_action create --header true --single_file true \
  --cli_output_path ./job_info.out --profiler on \
  --output_path hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/43/cleaned_table_1.csv

API v4 example - REST client:


For more information, see API Workflow - Manage Outputs.
Reference Docs:
See API WriteSettings Create v4.

Get Job Status

After you queue a job through the CLI, you can review the status of the job through the application or through the
CLI.

Tip: You can acquire the job ID through the application as needed. For example, at some point in the
future, you might decide to publish to Hive the results from a job you executed two weeks ago. It might be
easiest to retrieve this job identifier from the Dataset Details page. See Dataset Details Page.



CLI example:

./trifacta_cli.py get_job_status --user_name <trifacta_user> --password <trifacta_password> \
  --job_id 42 --cli_output_path ./job_info.out

API v4 example - REST client:


Using the jobGroup identifier value, you can query the status of any job that has been launched.

Endpoint http://localhost:3005/v4/jobGroups/42/status
Authentication Required

Method GET
Request Body None.

Response Status Code 200 - OK


Response Body Response includes a message on the current status of the job. See docs below for values.

Reference Docs:
See API JobGroups Get Status v4.
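Scripts that previously called get_job_status in a loop can poll the status endpoint in the same way. The following sketch keeps the HTTP call out of the loop so that it stays easy to test. The wait_for_job function and the fetch_status callable are hypothetical, and the status strings are assumptions that should be checked against the reference documentation.

```python
import time

# Assumed terminal status values; confirm them in the API reference.
TERMINAL_STATUSES = {"Complete", "Failed", "Canceled"}


def wait_for_job(fetch_status, job_group_id, poll_seconds=10.0,
                 max_polls=360, sleep=time.sleep):
    """Poll until the job reaches a terminal status and return that status.

    fetch_status: callable that takes a jobGroup id, performs the GET
    request against the status endpoint, and returns the status string.
    """
    for _ in range(max_polls):
        status = fetch_status(job_group_id)
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"jobGroup {job_group_id} did not finish in time")
```

Capping the number of polls prevents a stuck job from blocking the calling script indefinitely.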

Publish

After a job has successfully completed, you can publish the results to another datastore with which the platform is
integrated.
CLI example:

The following command publishes the results of jobId 42 through connectionId 1 to the dev database. Let's
assume that this is a Hive database.

./trifacta_cli.py publish --user_name <trifacta_user> --password <trifacta_password> \
  --job_id 42 --database dev --table table_job_42 --conn_name 1 --publish_format avro \
  --cli_output_path ./publish_info.out

API v4 example - REST client:


You can publish results for a specified jobId to an output that can be specified through properties in the request
(see below).

Endpoint http://localhost:3005/v4/jobGroups/42/publish
Authentication Required

Method PUT



Request Body
{
  "connection": {
    "id": 1
  },
  "path": ["dev"],
  "table": "table_job_42",
  "action": "create",
  "inputFormat": "avro",
  "flowNodeId": 27
}

Response Status Code 200 - OK


Response Body
{
  "jobgroupId": 42,
  "reason": "JobStarted",
  "sessionId": "f6c5f350-2102-11e9-bb80-9faf7b15f235"
}

Reference Docs:
See API JobGroups Put Publish v4.

Get Publications

You can retrieve a JSON list of all publications that have been executed for a specific job.
A publication is an object that corresponds to the delivery of a job's results to an external datastore.
In the Trifacta application, publications are executed through the Publishing Dialog, which is available
through the Job Details page. See Publishing Dialog.
CLI example:

./trifacta_cli.py get_publications --user_name <trifacta_user> --password <trifacta_password> \
  --job_id 42 --cli_output_path ./publications.out --publish_format avro

API v4 example - REST client:


You can use the job Id to retrieve the list of publications that have been executed for that job.

Endpoint http://localhost:3005/v4/jobGroups/42/publications
Authentication Required

Method GET
Request Body None.

Response Status Code 200 - OK


Response Body List of publication objects is included. See docs reference below.



Reference Docs:
See API JobGroups Get Publications v4.

Load Data into Table

You can load data into pre-existing Redshift tables.


Data is appended after any existing rows.
If the table does not exist, the job fails.

NOTE: When appending data into a Redshift table, the columns displayed in the Transformer page must
match the order and data type of the columns in the target table.

CLI example:
In the following example, the results of jobId 42 are loaded into a Redshift table called table_42 using
connectionId 2.

./trifacta_cli.py load_data --user_name <trifacta_user> --password <trifacta_password> \
  --job_id 42 --database dev --table table_42 --conn_id 2 --publish_format avro \
  --cli_output_path ./load_info.out

API v4 example - REST client:


In the request body, note that the action parameter is set to load.

Endpoint http://localhost:3005/v4/jobGroups/42/publish
Authentication Required

Method PUT
Request Body
{
  "connection": {
    "id": 2
  },
  "path": ["dev"],
  "table": "table_42",
  "action": "load",
  "inputFormat": "avro",
  "flowNodeId": 27
}

Response Status Code 200 - Ok



Response Body
{
  "jobgroupId": 42,
  "reason": "JobStarted",
  "sessionId": "f6c5f350-2102-11e9-bb80-9faf7b15g574"
}

Reference Docs:
API JobGroups Put Publish v4

Truncate and Load

For existing tables, you can clear them and load them with results from a job. If the table does not exist, a new
one is created and populated.
CLI example:

./trifacta_cli.py truncate_and_load --user_name <trifacta_user> --password <trifacta_password> \
  --job_id 10 --database dev --table table_43 --conn_name aSQLServerConnection \
  --publish_format avro \
  --cli_output_path ./load_and_trunc_info.out

API v4 example - REST client:


In the request body, note that the action parameter is set to truncateAndLoad.

Endpoint http://localhost:3005/v4/jobGroups/10/publish
Authentication Required

Method PUT
Request Body
{
  "connection": {
    "id": 2
  },
  "path": ["dev"],
  "table": "table_43",
  "action": "truncateAndLoad",
  "inputFormat": "avro",
  "flowNodeId": 27
}

Response Status Code 200 - Ok



Response Body
{
  "jobgroupId": 10,
  "reason": "JobStarted",
  "sessionId": "f6c5f350-2102-11e9-bb80-9faf7b15v291"
}

Reference Docs:
API JobGroups Put Publish v4
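The publish, load, and truncate-and-load requests above share the same body shape and differ only in the action value. The following sketch builds that body from the values that the CLI commands took as options. The publish_body function is hypothetical, and the list of actions is limited to those that appear in the examples in this section. Note that flowNodeId has no CLI counterpart and must be looked up separately.

```python
# Actions that appear in the examples in this section; others may exist.
KNOWN_ACTIONS = {"create", "load", "truncateAndLoad"}


def publish_body(conn_id, database, table, publish_format, flow_node_id,
                 action="create"):
    """Build the request body for the jobGroups publish endpoint."""
    if action not in KNOWN_ACTIONS:
        raise ValueError(f"Unexpected action: {action}")
    return {
        "connection": {"id": conn_id},   # internal id, not the connection name
        "path": [database],              # --database
        "table": table,                  # --table
        "action": action,
        "inputFormat": publish_format,   # --publish_format
        "flowNodeId": flow_node_id,
    }
```

For the load_data example above, publish_body(2, "dev", "table_42", "avro", 27, action="load") produces the request body shown in that section.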

Reference - CLI for Connections

You can use the CLI for basic management of your connections.
CLI Docs: CLI for Connections

Create Connection

To create a connection, you specify the connection parameters as part of your command line command.
CLI example:

./trifacta_cli.py create_connection --user_name <trifacta_user> --password <trifacta_password> \
  --conn_type microsoft_sqlserver --conn_name aSQLServerConnection \
  --conn_description "This is my connection." \
  --conn_host example.com --conn_port 1234 \
  --conn_credential_type basic \
  --conn_credential_location ~/.trifacta/config_conn.json \
  --conn_params_location ~/.trifacta/p.json \
  --cli_output_path ./conn_create.out

API v4 example - REST client:

Endpoint http://localhost:3005/v4/connections
Authentication Required

Method POST



Request Body
{
  "connectParams": {
    "vendor": "sqlserver",
    "vendorName": "sqlserver",
    "host": "example.com",
    "port": "1234"
  },
  "host": "example.com",
  "port": 1234,
  "vendor": "sqlserver",
  "params": {
    "connectStrOpts": ""
  },
  "ssl": false,
  "vendorName": "sqlserver",
  "name": "aSQLServerConnection",
  "description": "",
  "type": "jdbc",
  "isGlobal": false,
  "credentialType": "basic",
  "credentialsShared": true,
  "disableTypeInference": false,
  "credentials": [
    {
      "username": "<username>",
      "password": "<password>"
    }
  ]
}

Response Status Code 201 - Created



Response Body
{
  "connectParams": {
    "vendor": "sqlserver",
    "vendorName": "sqlserver",
    "host": "example.com",
    "port": "1234"
  },
  "id": 26,
  "host": "example.com",
  "port": 1234,
  "vendor": "sqlserver",
  "params": {
    "connectStrOpts": ""
  },
  "ssl": false,
  "vendorName": "sqlserver",
  "name": "aSQLServerConnection",
  "description": "",
  "type": "jdbc",
  "isGlobal": false,
  "credentialType": "basic",
  "credentialsShared": true,
  "uuid": "fa7e06c0-0143-11e8-8faf-27c0392328c5",
  "disableTypeInference": false,
  "createdAt": "2018-01-24T20:20:11.181Z",
  "updatedAt": "2018-01-24T20:20:11.181Z",
  "credentials": [
    {
      "username": "<username>"
    }
  ],
  "creator": {
    "id": 1
  },
  "updater": {
    "id": 1
  }
}

Reference Docs:
API Connections Create v4

Edit Connection

In the CLI, you use the edit_connection action to pass in modifications to a connection that is specified using
the conn_name command line parameter.

CLI example:
In the following example, the description, host, and port number are being changed for
aSQLServerConnection.



./trifacta_cli.py edit_connection --user_name <trifacta_user> --password <trifacta_password> \
  --conn_name aSQLServerConnection \
  --conn_description "This is my connection." \
  --conn_host mynewhost.com --conn_port 1234 \
  --conn_credential_type basic --conn_credential_location ~/.trifacta/config_conn.json \
  --cli_output_path ./conn_edit.out

API v4 example - REST client:


When using the APIs, you use the internal identifier for the connection to modify.
In the body of the request, you should include only the parameters that you are modifying for the connection. In
this example, the connectionId is 8.

Endpoint http://localhost:3005/v4/connections/8
Authentication Required

Method PATCH
Request Body
{
  "description": "This is my connection.",
  "host": "mynewhost.com",
  "port": 1234
}

Response Status Code 200 - OK


Response Body
{
"id": 8,
"updater": {
"id": 1
},
"updatedAt": "2019-01-25T23:19:27.648Z"
}

Reference Docs:
See API Connections Patch v4.
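
As a minimal sketch, the PATCH request above could be assembled in Python with the standard library. The helper name build_connection_patch and its base_url default are assumptions made for illustration. Authentication is omitted because this section does not describe the scheme, and the request is built but not sent.

```python
import json
import urllib.request


def build_connection_patch(connection_id, changes, base_url="http://localhost:3005"):
    """Build (but do not send) a PATCH request for /v4/connections/<id>.

    `changes` should contain only the properties that are being modified.
    """
    url = f"{base_url}/v4/connections/{connection_id}"
    body = json.dumps(changes).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


# Same values as the REST client example above (connectionId 8).
request = build_connection_patch(
    8,
    {"description": "This is my connection.", "host": "mynewhost.com", "port": 1234},
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen after adding whatever authentication header your deployment requires.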

List Connections

The CLI command list_connections dumps the JSON objects for all connections to a local file.

CLI example:

./trifacta_cli.py list_connections --host dev.redshift.example.com
--user_name <trifacta_user> --password <trifacta_password>
--cli_output_path ./conn_list.out

API v4 example - REST client:


The following API endpoint can be used to retrieve the JSON objects for all connections to which the
authenticating user has access in the body of the response.

Tip: For any endpoint using a GET method, if you omit an object identifier, you retrieve all accessible
objects of that type from the platform.

Endpoint http://localhost:3005/v4/connections
Authentication Required

Method GET
Request Body None.

Response Status Code 200 - OK


Response Body JSON objects for all accessible connections.

Reference Docs:
See API Connections Get List v4.

Delete Connection

For the CLI, you use the delete_connection command to remove connections that are specified by conn_name.

CLI example:

./trifacta_cli.py delete_connection --user_name <trifacta_user>
--password <trifacta_password>
--conn_name aSQLServerConnection --cli_output_path ./conn_delete.out

API v4 example - REST client:


Use the internal identifier for the connection to delete it. In the following example, the connectionId is 4.

Endpoint http://localhost:3005/v4/connections/4
Authentication Required

Method DELETE
Request Body None.

Response Status Code 204 - No content


Response Body None.

Reference Docs:

See API Connections Delete v4.

Reference - CLI for User Admin

You can use the CLI for handling of some elements of user management.

NOTE: Some user account properties cannot be managed through the CLI. You must use the APIs or the
application for some tasks.

CLI Docs: CLI for User Admin

Create User

CLI example:

./trifacta_admin_cli.py --admin_username <trifacta_admin_user>
--admin_password <trifacta_admin_password>
create_user --user_name [email protected] --password Hello2U --name Joe

API v4 example - REST client:


The request body below contains the minimum set of required parameters to create a new user.
The accept parameter must be set to accept for every new user.

Endpoint http://www.example.com:3005/v4/people
Authentication Required

Method POST
Request Body
{
"accept": "accept",
"password": "Hello2U",
"password2": "Hello2U",
"email": "[email protected]",
"name": "Joe"
}

Response Status Code 201 - Created

Response Body
{
"isDisabled": false,
"forcePasswordChange": false,
"state": "active",
"id": 9,
"email": "[email protected]",
"name": "Joe",
"ssoPrincipal": null,
"hadoopPrincipal": null,
"isAdmin": false,
"updatedAt": "2019-01-09T20:23:31.560Z",
"createdAt": "2019-01-09T20:23:31.560Z",
"outputHomeDir":
"/trifacta/queryResults/[email protected]",
"lastStateChange": null,
"fileUploadPath": "/trifacta/uploads",
"awsConfig": null
}

Reference Docs:
API People Create v4

Show User

You can gather a specific user object using the username through the CLI.
CLI example:

./trifacta_admin_cli.py --admin_username <trifacta_admin_user>
--admin_password <trifacta_admin_password>
show_user --user_name [email protected]

API v4 example - REST client:


Through the APIs, you can retrieve individual users through the internal userId. In the following example, the user
corresponding to userId 4 is retrieved.

Endpoint http://www.example.com:3005/v4/people/4
Authentication Required

Method GET
Request Body None.

Response Status Code 200 - OK

Response Body
{
"id": 4,
"email": "[email protected]",
"name": "Joe2",
"ssoPrincipal": null,
"hadoopPrincipal": null,
"isAdmin": false,
"isDisabled": false,
"forcePasswordChange": false,
"state": "active",
"lastStateChange": null,
"createdAt": "2019-02-20T20:05:49.882Z",
"updatedAt": "2019-02-20T20:05:49.882Z",
"outputHomeDir":
"/trifacta/queryResults/[email protected]",
"fileUploadPath": "/trifacta/uploads",
"awsConfig": null
}

Reference Docs:
API People Get v4.

Edit User

You can edit some properties through the CLI edit_user command.

CLI example:

In this example, the ssoPrincipal for the user is being changed.

./trifacta_admin_cli.py --admin_username <trifacta_admin_user>
--admin_password <trifacta_admin_password>
edit_user --user_name [email protected] --ssoPrincipal my_principal

API v4 example - REST client:


Using the APIs, you reference the user to modify by userId. In the following example, the userId is 4.

Include only the parameters in the request that are being modified.

Endpoint http://www.example.com:3005/v4/people/4
Authentication Required

Method PATCH

Request Body
{
"ssoPrincipal": "my_principal"
}

Response Status Code 200 - OK


Response Body
{
"id": 4,
"updatedAt": "2018-01-24T23:49:08.199Z"
}

Reference Docs:
API People Patch v4

Generate Password Reset URL

Through the CLI, admins can generate password reset emails to be sent to specific users.
CLI example:

./trifacta_admin_cli.py --admin_username <trifacta_admin_user>
--admin_password <trifacta_admin_password>
edit_user --user_name [email protected] --disable

API v4 example - REST client:

NOTE: The v4 endpoint equivalent of this CLI command is not available in Release 6.0. It will be
available at or before the End of Life (EOL) of v3 endpoints.

Disable User

Through the CLI, you can disable individual users by adding the disable flag as part of an edit_user directive.

CLI example:

./trifacta_admin_cli.py --admin_username <trifacta_admin_user>
--admin_password <trifacta_admin_password>
edit_user --user_name [email protected] --disable

API v4 example - REST client:


In the APIs, you disable a specified user by patching the user object with the disable flag.

Endpoint http://www.example.com:3005/v4/people/4
Authentication Required

Method PATCH

Request Body
{
"isDisabled": true
}

Response Status Code 200 - OK


Response Body
{
"id": 4,
"updatedAt": "2018-01-24T23:56:32.834Z"
}

Reference Docs:
API People Patch v4

Delete User

CLI example:
In the following example, the user is deleted by username, and the user's assets are transferred to another user.

NOTE: Transfer of assets is not required. However, if the assets are not transferred, they are no longer
available.

./trifacta_admin_cli.py --admin_username <trifacta_admin_user>
--admin_password <trifacta_admin_password>
delete_user --user_name [email protected] --transfer_assets_to
[email protected]

API v4 example - REST client:


Via the APIs, this transfer of assets and deletion of the user must be accomplished in two steps.

NOTE: You must verify that the transfer step occurs successfully before you execute the deletion.
Deletion of a user cannot be undone.

NOTE: Transferring of assets does not check for access to the objects. It's possible that the receiving
user may not be able to access connections or datasets that were created by the original user. You may
wish to share those assets through the application before you perform the deletions.

Here is the mapping of example userIds between CLI and API:

CLI userId API userId

[email protected] 4

[email protected] 7

Transfer of assets:

The following endpoint call transfers assets from userId 4 to userId 7.

Endpoint http://www.example.com:3005/v4/people/7/assetTransfer/4
Authentication Required

Method PATCH
Request Body None.

Response Status Code 201 - Created

Response Body
[
[
1,
[
0,
[
{
"connectionId": 7,
"personId": 7,
"role": "owner",
"createdAt": "2019-02-21T19:52:22.993Z",
"updatedAt": "2019-02-21T19:52:22.993Z"
}
]
]
]
]

NOTE: Please verify that you have received a response similar to the above before you delete the user. You
should also verify that the receiving user has the assets accessible in the application.

Delete user:
After assets have been transferred, users can be deleted by userId (4).

Endpoint http://www.example.com:3005/v4/people/4
Authentication Required

Method DELETE
Request Body None.

Response Status Code 204 - No Content


Response Body None.

Reference Docs:
API People Delete v4
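
The two-step sequence above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference client: the function name transfer_then_delete and the caller-supplied send(method, path) callable are hypothetical, and the expected status codes (201 for the transfer, 204 for the deletion) are taken from the examples in this section. A successful status code is necessary but not sufficient; you should still confirm in the application that the receiving user can access the assets.

```python
def transfer_then_delete(send, from_user_id, to_user_id):
    """Transfer assets, then delete the source user only if the transfer succeeded.

    `send(method, path)` is a hypothetical caller-supplied function that issues
    an authenticated request and returns the HTTP status code.
    """
    transfer_path = f"/v4/people/{to_user_id}/assetTransfer/{from_user_id}"
    status = send("PATCH", transfer_path)
    if status != 201:
        # Stop here: deletion cannot be undone.
        raise RuntimeError(f"Asset transfer returned {status}; user was not deleted")

    status = send("DELETE", f"/v4/people/{from_user_id}")
    if status != 204:
        raise RuntimeError(f"Delete returned {status}")


# Stand-in sender that records calls instead of contacting a server.
calls = []


def fake_send(method, path):
    calls.append((method, path))
    return 201 if method == "PATCH" else 204


transfer_then_delete(fake_send, from_user_id=4, to_user_id=7)
print(calls)
```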

Install CLI Tools

In the next release of Trifacta® Wrangler Enterprise after Release 6.0, the Trifacta command line
interface tools will be removed from the product (End of Life). Before upgrading to that release or
a later one, you must migrate your usage of the CLI to use the REST APIs. For more information,
see CLI Migration to APIs.

Contents:
Download
Install
Upgrade

By default, the Trifacta® Command Line Interface (CLI) tools are installed on the Trifacta node during installation.
You can use them from there.
Optionally, you can install the CLI tools on a separate server. For example, you might want to create a dedicated
server from which you can run a set of predefined jobs on a periodic basis.

NOTE: The location from where you are running the CLI tools must be able to access the pre-installed
instance of the Trifacta platform.

This section describes how to download and install the CLI tools on a dedicated server.

Download

The Trifacta CLI installer is available through a separate file next to the software distribution provided by Trifacta.
*.RPM for CentOS/RHEL
*.DEB for Ubuntu
The appropriate file should be downloaded to the server where you are installing the tools. For more information,
see Trifacta Support.

Install

Steps:
1. On the node where you are installing, execute the command.
a. For CentOS/RHEL6:

rpm -Uvh trifacta-cli-X.Y.Z-AAAA.el6.x86_64.rpm

where:
X.Y.Z = the three-digit release number.
AAAA = internal build number.
b. For CentOS/RHEL7:

rpm -Uvh trifacta-cli-X.Y.Z-AAAA.el7.x86_64.rpm

where:
X.Y.Z = the three-digit release number.
AAAA = internal build number.
c. For Ubuntu 14.04 (Trusty):

sudo dpkg -i trifacta-cli_X.Y.Z-AAAA~trusty_amd64.deb

where:
X.Y.Z = the three-digit release number.
AAAA = internal build number.
d. For Ubuntu 16.04 (Xenial):

sudo dpkg -i trifacta-cli_X.Y.Z-AAAA~xenial_amd64.deb

where:
X.Y.Z = the three-digit release number.
AAAA = internal build number.
2. When the installation is complete, you can begin using the tools. The tools are installed in the following
directory:

/opt/trifacta/bin/

Upgrade

When you upgrade to a new version of Trifacta Wrangler Enterprise, you must complete the following steps to
ensure that your CLI tools and scripts are upgraded:

NOTE: There is no guarantee of compatibility between versions of Trifacta Wrangler Enterprise CLI tools.
You should re-install the tools with each upgrade.

1. Download and install the new version of the CLI tools. See earlier in this section.
2. Unless changes are required, you can try to run your CLI scripts using the CLI packages that you
downloaded from the previous version. If your scripts fail when running jobs, then you should try to
re-download the packages from the Transformer page. For more information, see Recipe Panel.
For more information, see Changes to the Command Line Interface.

CLI for Connections

In the next release of Trifacta® Wrangler Enterprise after Release 6.0, the Trifacta command line
interface tools will be removed from the product (End of Life). Before upgrading to that release or
a later one, you must migrate your usage of the CLI to use the REST APIs. For more information,
see CLI Migration to APIs.

Contents:
Requirements
Command Reference

Parameters
Credentials file
Params file
Examples
Create connection
Edit connection
List connections
Delete connection

NOTE: This feature requires developer-level skills to enable and use.

The command line interface lets you manage connections between the Trifacta® platform and various types of
datastores. You can use this CLI for the following:
Create, edit, or delete connections

NOTE: In this release, you cannot create SQL DW connections via the CLI. This known issue will
be fixed in a future release.

Get information on all connections

NOTE: Sharing of connections is not supported through the command line interface.

Requirements

NOTE: Some types of connections available through the UI cannot be created through the CLI. For more
information, see Connection Types.

The CLI must have access to a running Trifacta instance. You can specify the host and port of this
instance.
For each connection that you create, the Trifacta node must be able to access it through the listed host and
port.

Command Reference

The CLI tools are stored in the following directory:

/opt/trifacta/bin/

For creating or modifying connections, execute the following command:

./trifacta_cli.py (parameters)

Parameters are specified below.

Parameters

Common

These parameters are common to job or connection actions.

Parameter Description Applicable CLI Commands

command_type (Required) The type of CLI command to execute. Accepted values: All

create_connection - Create a new connection object.
edit_connection - Edit an existing connection object.
list_connections - List all connection objects for the specified
user.
delete_connection - Delete a connection object.
See Examples below.

For more information on the following commands, see CLI for Jobs.

run_job - Execute a specified job on the specified running environment.
get_job_status - Get job status information.
get_publications - Acquire publication information for a specified
job.
publish - Publish a completed job to the specified database table,
which has not been created yet.
load_data - Load data into the database table, to which a schema
has already been applied. Use to append to existing table.
truncate_and_load - Overwrite data in specified table.

user_name (Required) Trifacta username of the user to execute the job. Please specify the All
full username.

NOTE: In the response, this value is listed as user.

password (Required) Trifacta password for the username. All

If no password is specified, you are prompted to enter one.

NOTE: If you have enabled Kerberos-based access to the Trifacta platform, you do not need to provide
a password. To enable, additional configuration is required. See Configure for Kerberos Integration.

NOTE: Passwords can be stored in an external file. See CLI Config File.

cli_output_path (Optional) Defines the client-side path where the JSON output is stored for all All
commands. Default value is cli_results.out.

NOTE: The user issuing the command must also have execute permissions on all parent folders in the
specified cli_output_path.

disable_ssl_certification (Optional) When communicating over HTTPS, this setting can be used to All
override the default behavior of validating the server certificate before executing
the command.

NOTE: If you have stored a self-signed certificate on the Trifacta node, please set the
REQUESTS_CA_BUNDLE environment variable to point to the directory that contains the trusted server's
certificate(s). The CLI will verify against these certs. In this case, the
disable_ssl_certification parameter is not needed.

conn_ssl (Optional) Connect to the datastore over SSL. All

NOTE: You must modify the host parameter value to include the
appropriate port number for the SSL connection.

NOTE: SSL connections are not supported for Hive, Redshift, or SQL
Server.

Params for managing connections

The following parameters apply to managing connection objects only. Some of the preceding parameters may be
required for connection actions.

Parameter Description Applicable CLI Command

conn_type The type of connection. create_connection

NOTE: After the connection has been created, you cannot change its type.

Tip: For a list of supported connection types, enter the following at the command line:

./trifacta_cli.py create_connection -h

These connection types can be created by Trifacta admin users only:

NOTE: These connections must be created through the CLI and must be created as public connections.
Include the conn_is_global flag in the command.

NOTE: Only 1 Hive connection is permitted per Trifacta deployment.

Hadoop Hive
Amazon Redshift

NOTE: A Redshift connection requires S3 as your base storage layer. See Set Base Storage Layer.

These connection types can be created by any user with appropriate permissions:

NOTE: Jobs using sources from these connections cannot be executed on Spark.

Microsoft SQL Server
PostgreSQL Database
Oracle Database
Teradata Database

For more information on the supported connection types and the tokens to insert for this parameter,
see Connection Types.

conn_name Internal name of the connection. This name is referenced in your CLI scripts. It should be a
single value without spaces. create_connection, edit_connection, list_connections, delete_connection

NOTE: This value must be unique among your connection names.

conn_id The internal identifier for the connection. When a connection is created, it is assigned an
internal numeric identifier. This ID or the conn_name can be used to reference the connection in
future commands. edit_connection, list_connections, delete_connection

Tip: This value is available when you hover over a connection in the application. See Flows Page.

conn_host Host of the datastore to which you are connecting. create_connection, edit_connection

conn_port Port number to access the datastore. create_connection, edit_connection

conn_description This text value is displayed to users when they create or edit connections of this type
through the Trifacta application. create_connection, edit_connection

conn_credential_type The type of credentials to create. Supported values: create_connection, edit_connection

basic - Simple username/password to be provided in conn_credential_location. Used for JDBC
database connections.
aws - AWS-specific credentials to be provided in conn_credential_location. Used for Redshift
connections.
trifacta_service - Uses the Trifacta credentials specified in trifacta-conf.json. Used for Hive
connections.

conn_credential_location The path to a JSON file containing the credentials for your connection,
consistent with the conn_credential_type. For more information on the expected format, see
Credentials file below. create_connection, edit_connection

NOTE: A credential file is not needed if the credential type is trifacta_service.

conn_params_location When you create a connection, you can reference a JSON file containing parameters
to apply during the creation of any connection of this type. See Params file below.
create_connection, edit_connection

conn_skip_test If this parameter is added to the command, the connection is not tested. The default is
to test the connection. This flag requires no value. create_connection, edit_connection

Tip: After creation, you can test and modify the connection through the application. See Flows Page.

conn_is_global If this parameter is added, the connection is public and is available to all Trifacta
users after it has been created. This flag requires no value. create_connection, edit_connection

NOTE: To use this option, the executing user must be a Trifacta admin. Hive and Redshift connections
require this parameter.

NOTE: After a connection has been made public, it cannot be made private again. It must be deleted
and recreated.

For documentation on the CLI parameters, run:

./trifacta_cli.py --help

Additional documentation might be available for individual commands using the following:

./trifacta_cli.py <command> --help

Credentials file

You can store connection login credentials in a file on the Trifacta node. When managing connections, you can
reference this JSON credentials file in the command, which forces the use of encrypted versions of the credentials
stored in the file. Examples are provided below.
Example - Basic credentials:
This example applies to relational connection types: Oracle, PostgreSQL, SQL Server, and Teradata.

{
"username": "<your_username>",
"password": "<your_password>"
}

Example - AWS credentials:


This example applies to connections of AWS type (Redshift).

{
"username": "<your_user>",
"password": "<your_password>",
"iamRoleArn": "<your_IAM_role_ARN>"
}

NOTE: iamRoleArn is optional. For more information, see Configure for EC2 Role-Based Authentication.
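
As a hedged sketch, a basic credentials file in the format shown above could be generated from Python. The function name write_basic_credentials is hypothetical, and restricting the file to owner read/write (mode 0o600) is this sketch's own precaution rather than a documented requirement.

```python
import json
import os
import tempfile


def write_basic_credentials(path, username, password):
    """Write a basic credentials file that only the current user can read."""
    # Mode 0o600 is an assumption of this sketch, not a platform requirement.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        json.dump({"username": username, "password": password}, handle, indent=2)


# A temporary directory stands in for a location such as ~/.trifacta/.
credentials_path = os.path.join(tempfile.mkdtemp(), "config_conn.json")
write_basic_credentials(credentials_path, "<your_username>", "<your_password>")
```

The resulting path is the value that you pass in conn_credential_location.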

Params file

In an external file, you can create a set of parameters to pass to any object for which you are creating a
connection. For example, when you create a connection to a database, you may need to reference a default
database to which any instance of the connection connects.
The following parameters are supported for each vendor.

Vendor JSON Parameter Description Required

Hive defaultDatabase Name of the default database No

Redshift defaultDatabase Name of the default database Yes

PostgreSQL database Name of the database. Yes

Oracle service Service to use for the connection Yes

SQL Server None.

Teradata None.

Additional parameters:
Except for Redshift connections, you can submit additional configuration parameters using the connectStrOpts
key-value pair in the parameters file. Example:

"connectStrOpts": ";transportMode=http;httpPath=cliservice"

NOTE: Each vendor uses a specific separator between the connection URL and the connection string
options. For example, if you are creating or editing a Teradata connection and are submitting
connectStrOpts parameters, the string value must begin with a comma:

"connectStrOpts": ",Key1=Value1,Key2=Value2"

For more information, see the documentation provided with your database product.
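
The separator rule can be illustrated with a small Python helper. This is a sketch only: build_connect_str_opts is a hypothetical name, and it assumes that the same separator both opens the string and divides the key-value pairs, as in the two examples shown above. Check the documentation for your database before relying on it.

```python
def build_connect_str_opts(separator, options):
    """Join key=value pairs, prefixing each pair with the vendor's separator."""
    return "".join(f"{separator}{key}={value}" for key, value in options.items())


# Separators are taken from the examples in this section.
teradata_opts = build_connect_str_opts(",", {"Key1": "Value1", "Key2": "Value2"})
http_opts = build_connect_str_opts(";", {"transportMode": "http", "httpPath": "cliservice"})

print(teradata_opts)  # ,Key1=Value1,Key2=Value2
print(http_opts)      # ;transportMode=http;httpPath=cliservice
```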

Redshift uses the extraLoadParams method, which is described below.

Arbitrary parameters for JDBC connections:


For any supported JDBC connection type, you can include arbitrary parameters specific to the JDBC database as
part of your connection string options.
With the exception of Oracle, all JDBC vendors support the following example. The database to which you are
connecting supports a parameter (myView), which for the database has a value of custom1. To extend the
preceding Teradata example, your connectStrOpts value would be the following, which begins with a comma (,):

"connectStrOpts": ",Key1=Value1,Key2=Value2?myView=custom1"

When the connection is created and used, the connection string might look like the following:

jdbc:teradata://example.com:1025/DatabaseServerName?myView=custom1

For submitting arbitrary parameters to Oracle, please see the example below.
Example - Hive params:

NOTE: By default, the Hive connection is defined to use TCP. If you are using HTTP to connect to Hive,
additional configuration is required, including insertion of additional parameters in your params file. See
Configure for Hive.

NOTE: If you are connecting to a Kerberos-enabled cluster, you must include the Kerberos principal for
Hive as part of the connectStrOpts value. See Configure for Hive.

{
"connectStrOpts": ";<depends_on_deployment>",
"defaultDatabase": "default",
"jdbc": "hive2"
}

For more information on connection string options for Hive, see Configure for Hive.
Example - Redshift params:

{
"defaultDatabase":"<your_database>",
"extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
}

The first parameter defines the default database.


The second parameter is used when you publish results to Redshift. For more information on these values, see
http://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-data-conversion.html.
Example - PostgreSQL params:

{
"database":"<your_database>"
}

Example - Oracle params:

{
"service":"orcl"
}

For submitting arbitrary parameters to Oracle, the arbitrary string must follow the ORA format, in which most of the
connection string is replaced by parameters. For example:

{
"connectStrOpts": "(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=localhost)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=orcl)))",
"service":"orcl"
}

In this case, the generated connection string might look like the following:

jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=localhost)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=orcl)))

The original host, port, and service name values specified in the connection are ignored and replaced by these
values.
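
Because the descriptor is easy to mistype by hand, the following Python sketch assembles it from its parts and writes out a params object like the Oracle example above. The function name oracle_descriptor is hypothetical, and only the fields shown in that example are covered.

```python
import json


def oracle_descriptor(host, port, service_name, protocol="TCP"):
    """Return a connect descriptor containing only the fields used in this section."""
    return (
        f"(DESCRIPTION=(ADDRESS=(PROTOCOL={protocol})(HOST={host})(PORT={port}))"
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})))"
    )


params = {
    "connectStrOpts": oracle_descriptor("localhost", 1521, "orcl"),
    "service": "orcl",
}
print(json.dumps(params, indent=2))
```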

Examples

At the command line, all jobs must be executed through connection objects. For each datastore to which the
Trifacta platform is connected, you must create at least one connection object and then reference it in any
job execution tasks.

Create connection

NOTE: For Hive, connections must be created as public connections (include the --conn_is_global
flag). You can only create one connection of each of these types.
For more information on creating a Hive connection through the CLI, see Configure for Hive.

Command

Example (all one command):

./trifacta_cli.py create_connection --user_name <trifacta_user>
--password <trifacta_password>
--conn_type microsoft_sqlserver --conn_name aSQLServerConnection
--conn_description "This is my connection."
--conn_host example.com --conn_port 1234
--conn_credential_type basic
--conn_credential_location ~/.trifacta/config_conn.json
--conn_params_location ~/.trifacta/p.json
--cli_output_path ./conn_create.out

Output

Success: Connection aSQLServerConnection created
JSON results written to conn_create.out.

JSON Response

Output is written to ./conn_create.out.

{
"conn_credential_location": "~/.trifacta/config_conn.json",
"conn_credential_type": "basic",
"conn_host": "example.com",
"conn_id": 9,
"conn_name": "aSQLServerConnection",
"conn_params_location": "~/.trifacta/p.json",
"conn_port": "1234",
"conn_type": "microsoft_sqlserver",
"host": "http://example.com:3005",
"results": {
"createdAt": "2016-06-30T21:53:58.977Z",
"createdBy": 3,
"credential_type": "basic",
"credentials": [
{
"username": "<trifacta_user>"
}
],
"deleted_at": null,
"description": null,
"host": "example.com",
"id": 9,
"is_global": false,
"name": "aSQLServerConnection",
"port": 1234,
"type": "microsoft_sqlserver",
"updatedAt": "2016-06-30T21:53:58.977Z",
"updatedBy": 3
},
"status": "success",
"user_name": "<trifacta_user>"
}
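
A script that wraps the CLI can read the file written to cli_output_path to confirm the result before continuing. The following Python sketch assumes that a failed command reports a status value other than "success"; this section shows only the success case, so treat that as an assumption. The function name check_cli_result is hypothetical.

```python
import json


def check_cli_result(output_text):
    """Parse CLI JSON output and return conn_id, raising if status is not success."""
    result = json.loads(output_text)
    status = result.get("status")
    if status != "success":
        # Assumption: anything other than "success" indicates a failed command.
        raise RuntimeError(f"CLI command reported status: {status!r}")
    return result.get("conn_id")


# Abbreviated copy of the response above; in practice, read ./conn_create.out.
sample_output = json.dumps(
    {"conn_id": 9, "conn_name": "aSQLServerConnection", "status": "success"}
)
connection_id = check_cli_result(sample_output)
print(connection_id)
```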

Edit connection

Command

In the following command, all parameters specified within angled brackets are optional settings that can be
changed. The other ones are required to perform any edit.
You must specify the conn_name or the conn_id.

NOTE: If you are changing any credential parameter, you must specify the conn_credential_type in
the command. This step completely replaces the old credentials, so you must specify all connection
parameters in the command.

Example (all one command):

./trifacta_cli.py edit_connection --user_name <trifacta_user> --password <trifacta_password>
--conn_name aSQLServerConnection --conn_type microsoft_sqlserver
--conn_description "This is my connection."
--conn_host mynewhost.com --conn_port 1234
--conn_credential_type basic --conn_credential_location
~/.trifacta/config_conn.json
--cli_output_path ./conn_edit.out

Output

The following output assumes that only the host and cli_output_path values above are new:

Success: Updated connection aSQLServerConnection
JSON results written to conn_edit.out.

JSON Response

Output is written to ./conn_edit.out.

{
"conn_description": "This is my connection.",
"conn_id": 9,
"conn_name": "aSQLServerConnection",
"conn_params_location": "~/.trifacta/p.json",
"host": "http://mynewhost.com:3005",
"results": {
"createdAt": "2016-06-30T22:08:47.016Z",
"createdBy": 3,
"credential_type": "basic",
"credentials": [
{
"username": "<trifacta_user>"
}
],
"deleted_at": null,
"description": "This is my connection.",
"host": "mynewhost.com",
"id": 9,
"is_global": false,
"name": "aSQLServerConnection",
"port": 1234,
"type": "microsoft_sqlserver",
"updatedAt": "2016-06-30T22:09:03.670Z",
"updatedBy": 3
},
"status": "success",
"user_name": "<trifacta_user>"
}

List connections

Command

Example (all one command):

./trifacta_cli.py list_connections --host dev.redshift.example.com
--user_name <trifacta_user> --password <trifacta_password>
--cli_output_path ./conn_list.out

Tip: You can specify a conn_name or conn_id to return the information about a connection.

Output

Listing connections
Found 2 connections for params {'noLimit': 'true'}.
Redshift:
description: None
host: dev.redshift.example.com
credentials: ["{u'username': u'<trifacta_user>'}"]
port: 5439
is_global: True
name: Redshift
id: 2
credential_type: custom
params:
extraLoadParams: BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS
defaultDatabase: dev
type: amazon_redshift
Hive:
description: None
host: dev.hive.example.com
credentials: ["{u'username': u'<trifacta_user>'}"]
port: 10000
is_global: True
name: Hive
id: 1
credential_type: conf
params:
jdbc: hive2
connectStrOpts:
defaultDatabase: default
type: hadoop_hive
JSON results written to conn_list.out.

JSON Response

Output is written to ./conn_list.out.

{
"connections": [
{
"conn_createdAt": "2016-06-01T21:12:59.383Z",
"conn_createdBy": 2,
"conn_credential_type": "custom",
"conn_credentials": [
{
"username": "<trifacta_user>"
}
],
"conn_deleted_at": null,
"conn_description": null,
"conn_host": "dev.redshift.example.com",

"conn_id": 2,
"conn_is_global": true,
"conn_name": "Redshift",
"conn_params": {
"extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS",
"defaultDatabase": "dev"
},
"conn_port": 5439,
"conn_type": "amazon_redshift",
"conn_updatedAt": "2016-06-01T21:33:38.672Z",
"conn_updatedBy": 2
},
{
"conn_createdAt": "2016-06-01T21:11:41.222Z",
"conn_createdBy": 2,
"conn_credential_type": "conf",
"conn_credentials": [
{
"username": "<trifacta_user>"
}
],
"conn_deleted_at": null,
"conn_description": null,
"conn_host": "dev.hive.example.com",
"conn_id": 1,
"conn_is_global": true,
"conn_name": "Hive",
"conn_params": {
"jdbc": "hive2",
"connectStrOpts": "",
"defaultDatabase": "default"
},
"conn_port": 10000,
"conn_type": "hadoop_hive",
"conn_updatedAt": "2016-06-01T21:39:58.090Z",
"conn_updatedBy": 2
}
],
"host": "http://localhost:3005",

"status": "success",
"user_name": "<trifacta_user>"
}
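
Because CLI commands reference connections by conn_name while other tooling may need the numeric identifier, it can help to index this output by name. The following Python sketch assumes the response structure shown above; the function name connections_by_name is hypothetical, and the sample data is an abbreviated stand-in for the contents of ./conn_list.out.

```python
import json


def connections_by_name(output_text):
    """Map each conn_name to its connection record from list_connections output."""
    payload = json.loads(output_text)
    return {conn["conn_name"]: conn for conn in payload["connections"]}


# Abbreviated stand-in for the file written to cli_output_path.
sample_output = json.dumps(
    {
        "connections": [
            {"conn_id": 2, "conn_name": "Redshift", "conn_type": "amazon_redshift"},
            {"conn_id": 1, "conn_name": "Hive", "conn_type": "hadoop_hive"},
        ],
        "status": "success",
    }
)

lookup = connections_by_name(sample_output)
print(lookup["Redshift"]["conn_id"])
```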

Delete connection

NOTE: You cannot delete a connection that is in use by a dataset.

Tip: You can delete a connection by using its internal connection identifier (conn_id), instead of its
connection name.

Command

Example (all one command):

./trifacta_cli.py delete_connection --user_name <trifacta_user>
--password <trifacta_password>
--conn_name aSQLServerConnection --cli_output_path ./conn_delete.out

Output

Success. Deleted connection with id: 9
JSON results written to conn_delete.out.

JSON Response

Output is written to ./conn_delete.out.

{
"conn_name": "aSQLServerConnection",
"host": "http://localhost:3005",
"status": "success",
"user_name": "<trifacta_user>"
}

CLI for Jobs

In the next release of Trifacta® Wrangler Enterprise after Release 6.0, the Trifacta command line
interface tools will be removed from the product (End of Life). Before upgrading to that release or
a later one, you must migrate your usage of the CLI to use the REST APIs. For more information,
see CLI Migration to APIs.

Contents:
Requirements

Command Reference
Parameters
Examples
Run job
Get job status
Publish
Get publications
Load data into table
Truncate and load

NOTE: This feature requires developer-level skills to enable and use.

The Command Line Interface for Jobs enables programmatic control over a variety of operations on the platform.
You can use the CLI to execute any of the following types of commands:
Run a job

NOTE: In this release, you cannot run jobs using datasets imported from Redshift or SQL DW
connections via the CLI.

Check job status


Publish a completed job to other datastores asynchronously

NOTE: In this release, you cannot publish results to Redshift or SQL DW connections via the CLI.
This known issue will be fixed in a future release.

Get information on publications of a job


Load data into a new or existing table in Redshift
(Hive only) Clear an existing table and load with new data

Requirements

The CLI must have access to a running instance of the Trifacta® platform. You can specify the host and
port of this instance.
If you are running jobs for a dataset with parameters, the downloaded assets reference only the first
matching file of the dataset. To run the job across all files in the dataset with parameters, you must build
the matching logic within your CLI script. For more information on datasets with parameters, see
Overview of Parameterization.
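The matching logic mentioned above is left to your own script. As one possible approach, the Python sketch below selects source paths with a wildcard pattern. The paths, the pattern, and the function name matching_sources are hypothetical, and the listing of candidate files is assumed to come from your own storage tooling.

```python
from fnmatch import fnmatchcase


def matching_sources(candidate_paths, pattern):
    """Return every candidate path that matches the wildcard pattern, sorted."""
    return sorted(path for path in candidate_paths if fnmatchcase(path, pattern))


# Hypothetical listing of a storage directory.
candidates = [
    "hdfs://namenode:8020/data/pos/2019-05-01/sales.csv",
    "hdfs://namenode:8020/data/pos/2019-05-02/sales.csv",
    "hdfs://namenode:8020/data/pos/readme.txt",
]

for source in matching_sources(candidates, "hdfs://namenode:8020/data/pos/*/sales.csv"):
    # Each match would become the source for one job run.
    print(source)
```

Each matching path could then be written to its own datasources.tsv file and passed to a separate run_job command.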

Command Reference

Execute the following command from the top-level Trifacta directory. The Python script references script.cli and
datasources.tsv as parameters.

For repeat executions of the same script.cli file, you can parameterize the values in the datasources.tsv.

The CLI tools are stored in the following directory:

/opt/trifacta/bin/

For executing jobs, specify parameters for the following:

./trifacta_cli.py (parameters)

Parameters

Common

These parameters are common to job or connection actions.

Parameter Description Applicable CLI Commands

command_type The type of CLI command to execute. Accepted values: All

run_job - Execute a specified job on the specified running environment.
get_job_status - Get job status information.
get_publications - Acquire publication information for a specified
job.
publish - Publish a completed job to the specified database table. If
the table exists, the data is appended to the existing table.
load_data - Load data into the database table. If the table exists, the
data is appended to the existing table. If the table does not exist, the job
fails.
truncate_and_load - Overwrite data in specified table. If the table
does not exist, a new one is created and populated.

See Examples below.

For more information on the following commands, see CLI for Connections.

create_connection - Create a new connection object.
edit_connection - Edit an existing connection object.
list_connections - List all connection objects for the specified
user.
delete_connection - Delete a connection object.

user_name (Required) Trifacta username of the user to execute the job. Please specify the All
full username.

NOTE: In the response, this value is listed as user.

password Trifacta password for the username. All

If no password is specified, you are prompted to enter one.

NOTE: If you have enabled Kerberos-based access to the Trifacta
platform, you do not need to provide a password. To enable,
additional configuration is required. See
Configure for Kerberos Integration.

NOTE: You can store username and password information in an
external file so that they don't need to be included in each command.
For more information, see CLI Config File.

cli_output_path Defines the client-side path where the JSON output is stored for all commands. All
Default value is cli_results.out.

NOTE: The user issuing the command must also have execute
permissions on all parent folders in the specified cli_output_path.

disable_ssl_certification (Optional) When communicating over HTTPS, this setting can be used to All commands
override the default behavior of validating the server certificate before executing
the command.

NOTE: If you have stored a self-signed certificate on the Trifacta
node, please set the REQUESTS_CA_BUNDLE environment
variable to point to the directory that contains the trusted server's
certificate(s). The CLI will verify against these certs. In this case, the
disable_ssl_certificate parameter is not needed.

conn_ssl (Optional) Connect to the datastore over SSL. All commands

NOTE: You must modify the host parameter value to include the
appropriate port number for the SSL connection.

NOTE: SSL connections are not supported for Hive, Redshift, or SQL
Server.

Params for managing jobs

These parameters apply to managing jobs only.

Parameter Description Applicable CLI Commands

host (Required) The server and port number of the Trifacta instance. All
Replace this value with the host and port of the running Trifacta instance. If
it is not provided, localhost:3005 is assumed.

NOTE: In some environments, the http:// or https://
protocol identifier may be required as part of the host value.

conn_name Internal name of the connection. This name is referenced in your CLI load_data, publish, truncate_and_load
scripts. It should be a single value without spaces.

NOTE: This value must be unique among your connection
names.

conn_id The internal identifier for the connection. When a connection is created, it is publish, load_data, truncate_and_load
assigned an internal numeric identifier. This ID or the connection_name
can be used to reference the connection in future commands.

Tip: This value is available when you hover over a connection in
the application. See Flows Page.

job_type The execution environment in which to run the job: run_job
photon = Run on Photon running environment on Trifacta Server.

NOTE: If the job_type parameter is not specified, CLI jobs
are run on the Photon running environment.

hadoop = Run in the default running environment for your Hadoop
cluster.

NOTE: When this job type is applied, your CLI scripts
automatically transition to running jobs in Spark.

spark = Run on the Spark running environment in Hadoop.
databricksSpark = Run the job on the Azure Databricks running
environment in Azure.

For more information on these running environments, see
Running Environment Options.

job_id The internal identifier for the job. This value can be retrieved from the get_job_status, publish, get_publications, load_data
output of a completed run_job command.

profiler When on, profiling of your job is enabled. Default is off. run_job

data Full UNIX path to the source TSV file. This file contains a URL pointing to run_job
the actual Hive or HDFS source: one TSV file for each job run. Executing
user must have access to this file.

script Full UNIX path from the Trifacta root directory to the CLI script file. run_job
Executing user must have access.

publish_action (Optional) Defines the action taken on second and subsequent publish run_job
operations:

create - (default) A new file is created with each publication.
Filename is numeric identifier of the job ID.
append - Each publication appends to the existing output file.
Filename is consistent across publications.

NOTE: Compression of published files is not supported through
the command line interface.

NOTE: When publishing single files to S3, the append operation
is not supported.

replace - Subsequent publications replace the same file with each
execution.

header (Optional) The output for a CSV job with append or create publishing run_job
action includes the column headers as the first row. Default is false.

NOTE: If you use the header option, you must also include the
single_file option, or this setting is ignored.

single_file (Optional) When true, CSV or JSON outputs are written to a single file. run_job
Default is false.

output_path (Required) Defines the fully qualified URI to where the job results are run_job
written, as in the following examples:

hdfs://host:port/path/filename.csv
s3://bucketName/path/filename.csv

NOTE: The output_path must include the protocol identifier
or host and port number (if applicable).

This parameter specifies the base filename. If you are publishing files, the
publish_action parameter value may change the exact filename that
is written.

Protocol is set in webapp.storageProtocol in trifacta-conf.json.

output_format Accepted values: csv, json, pqt (Parquet), and avro (Avro). run_job

For job_type=photon, you may generate csv, json, and avro
results.

database Name of Redshift or Hive database to which you are publishing or loading. publish, load_data

table The table of the database to which you are publishing or loading. publish, load_data

publish_format The format of the output file from which to publish to Hive or Redshift publish, get_publications
tables. Accepted values: csv, json, pqt (Parquet), or avro (Avro).

publish_opt_file Path to file containing definitions for multiple file or table targets to which to run_job
write the job's results. For more information, see
CLI Publishing Options File.

skip_publish_validation By default, the CLI automatically checks for schema validation when run_job
generating results to a pre-existing source.

If this flag is set, schema validation is skipped on results output.
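Several run_job parameters in the table above constrain one another: header is ignored without single_file, job_type=photon lists only csv, json, and avro results, and publish_action accepts three values. The Python sketch below checks these combinations before a command is issued. It is an illustration built only from the rules stated in the table, and the function name check_run_job_options is hypothetical.

```python
PHOTON_FORMATS = {"csv", "json", "avro"}
ALL_FORMATS = PHOTON_FORMATS | {"pqt"}
PUBLISH_ACTIONS = {"create", "append", "replace"}


def check_run_job_options(output_format, job_type="photon",
                          publish_action="create", header=False,
                          single_file=False):
    """Return a list of problems found in a set of run_job options."""
    problems = []
    if output_format not in ALL_FORMATS:
        problems.append(f"unknown output_format: {output_format}")
    elif job_type == "photon" and output_format not in PHOTON_FORMATS:
        problems.append(f"{output_format} is not listed for job_type=photon")
    if publish_action not in PUBLISH_ACTIONS:
        problems.append(f"unknown publish_action: {publish_action}")
    if header and not single_file:
        problems.append("header is ignored unless single_file is also set")
    return problems


print(check_run_job_options("csv", job_type="spark", header=True))
```

An empty list means that none of these documented constraints is violated. The platform may apply further checks that are not modeled here.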

For documentation on the CLI parameters, run:

./trifacta_cli.py --help

Additional documentation may be available for individual commands using the following:

./trifacta_cli.py <command> --help

Examples

A key function of the CLI is to execute jobs. You can also check job status through the command line interface
and then take subsequent publication actions using other commands.

Run job

This command requires a dataset and a CLI script. The CLI script is used to programmatically run a
recipe produced in the Transformer page.
For example, if you receive raw data each day, you can parameterize the execution of the same recipe against
daily downloads written to HDFS.
Each run of the CLI script creates a new job. A finished CLI job appears on the Jobs page.
Steps:
1. A recipe is specific to a dataset. In the Transformer page, open the Recipe Panel.
2. Click Download.
3. Select CLI Script.
4. Download to your desktop. The ZIP contains the following:
script.cli Contains the necessary code and configuration to access HDFS and the script in the
Trifacta database.
datasources.tsv Contains pointers to the source storage location of your datasource(s).
If you are running jobs for a dataset with parameters, the downloaded assets reference only
the first matching file of the dataset. To run the job across all files in the dataset with
parameters, you must build the matching logic within your CLI script. For more information on
datasets with parameters, see Overview of Parameterization.
For an example of how to add parameters in a local script, see
CLI Example - Parameterize Job Runs.
publishopts.json Template file for defining one or more publishing targets for running jobs. See
CLI Publishing Options File.
5. These files must be transferred to the Trifacta Server where you can reference them from the Trifacta root
directory.
Notes on connections and jobs
In the downloaded ZIP, the datasources.tsv file may contain a reference to the connection used to import the
dataset. However, if you are running the CLI in a Trifacta platform instance that is different from its source, this
connectionId may be different in the new environment. From the new environment, please do the following:
1. Use the list_connections operation to acquire the list of connections available in the new environment.
See CLI for Connections.
2. Acquire the Id value for the connection corresponding to the one used in datasources.tsv.

NOTE: The user who is executing the CLI script must be able to access the connection in the new
environment.

3. Edit datasources.tsv. Replace the connection Id value in the file with the value retrieved through the
CLI.
4. When the job is executed, it should properly connect to the source through the connection in the new
environment.
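Step 2 above amounts to a lookup by connection name in the list_connections output. The Python sketch below shows that lookup, assuming that the connection entries have already been parsed from the JSON response into a list of dictionaries with the conn_name and conn_id keys. The sample values and the function name conn_id_for are hypothetical.

```python
def conn_id_for(connections, conn_name):
    """Return the conn_id of the connection with the given name, or None."""
    for connection in connections:
        if connection.get("conn_name") == conn_name:
            return connection["conn_id"]
    return None


# Hypothetical entries from the new environment.
connections = [
    {"conn_name": "Redshift", "conn_id": 4, "conn_type": "amazon_redshift"},
    {"conn_name": "Hive", "conn_id": 7, "conn_type": "hadoop_hive"},
]

print(conn_id_for(connections, "Hive"))
```

The returned value is the one to place in datasources.tsv in step 3.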

Command - Basic Job Run

NOTE: This method of specifying a single-file publishing action has been superseded by a newer method,
which relies on an external file for specifying publishing targets. In a future release, this method may be
deprecated. For more information, see CLI Publishing Options File.

Example (All one command):

./trifacta_cli.py run_job --user_name <trifacta_user> --password <trifacta_password>
--job_type spark --output_format json
--data redshift-test/datasources.tsv --script redshift-test/script.cli
--cli_output_path ./job_info.out --profiler on
--output_path hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/42/cleaned_table_1.json
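When the command is assembled by a program instead of typed by hand, building it as an argument list avoids quoting problems. The Python sketch below assembles a run_job command from keyword arguments. The helper build_cli_command is hypothetical, and the password is omitted on purpose because the CLI prompts for it when none is specified.

```python
import shlex


def build_cli_command(command_type, **params):
    """Build an argument list such as ['./trifacta_cli.py', 'run_job', '--job_type', 'spark']."""
    argv = ["./trifacta_cli.py", command_type]
    for name, value in params.items():
        argv.extend([f"--{name}", str(value)])
    return argv


argv = build_cli_command(
    "run_job",
    user_name="<trifacta_user>",
    job_type="spark",
    output_format="json",
    data="redshift-test/datasources.tsv",
    script="redshift-test/script.cli",
    cli_output_path="./job_info.out",
    profiler="on",
)
# Print a shell-quoted form of the command for review.
print(shlex.join(argv))
```

On the Trifacta node, a list like this could be passed to subprocess.run from the directory that contains the CLI tools.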

Output

Job #42 has been successfully launched:

You may monitor the progress of your job here: http://localhost:3005/jobs

JSON Response

JSON response written to job_info.out:

{
"status": "success",
"job_outputs": {
"other_outputs": [
"hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/42/.profiler/profilerValidValueHistograms.json",
"hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/42/.profiler/profilerSamples.json",
"hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/42/.profiler/profilerTypeCheckHistograms.json",
"hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/42/.profiler/profilerInput.json"
],
"job_result_files": [
"hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/42/cleaned_table_1.json"
]
},
"job_id": 42,
"cli_script": "/trifacta/queryResults/[email protected]/redshift-test/script.cli",
"job_type": "spark",
"profiler": "on",
"source_data": "/trifacta/queryResults/[email protected]/redshift-test/datasources.tsv",
"host": "localhost:3005",
"output_path": "hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/42/cleaned_table_1.json",
"user": "[email protected]",
"output_file_formats": [
"json"
]
}
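Follow-up commands need the job_id from this response, and downstream steps usually need the result file locations. The Python sketch below extracts both from a run_job response, using the key names shown above. The sample is abbreviated, its file path is a placeholder, and the function name summarize_run_job is illustrative.

```python
import json


def summarize_run_job(response_text):
    """Return (job_id, result_files) from a run_job JSON response."""
    response = json.loads(response_text)
    if response.get("status") != "success":
        raise ValueError(f"unexpected status: {response.get('status')!r}")
    files = response.get("job_outputs", {}).get("job_result_files", [])
    return response["job_id"], files


# Abbreviated sample; the path is a placeholder.
sample = """
{
  "status": "success",
  "job_outputs": {
    "other_outputs": [],
    "job_result_files": ["hdfs://localhost:8020/path/to/cleaned_table_1.json"]
  },
  "job_id": 42
}
"""
job_id, result_files = summarize_run_job(sample)
print(job_id, result_files)
```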

Command - File Publishing Options

You can specify publication options as part of your run_job command. In the following, a single CSV file with
headers is written to a new file with each job execution.
Example (All one command):

./trifacta_cli.py run_job --user_name <trifacta_user> --password <trifacta_password>
--job_type spark --output_format csv
--data redshift-test/datasources.tsv --script redshift-test/script.cli
--publish_action create --header true --single_file true
--cli_output_path ./job_info.out --profiler on
--output_path hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/43/cleaned_table_1.csv

Output

Job #43 has been successfully launched:

You may monitor the progress of your job here: http://localhost:3005/jobs

JSON Response

JSON response written to job_info.out:

{
"status": "success",
"job_outputs": {
"other_outputs": [
"hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/43/.profiler/profilerValidValueHistograms.json",
"hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/43/.profiler/profilerSamples.json",
"hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/43/.profiler/profilerTypeCheckHistograms.json",
"hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/43/.profiler/profilerInput.json"
],
"job_result_files": [
"hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/43/cleaned_table_1.csv"
]
},
"job_id": 43,
"cli_script": "/trifacta/queryResults/[email protected]/redshift-test/script.cli",
"output_file_formats": [
"csv"
],
"job_type": "spark",
"host": "localhost:3005",
"job_output_path": "/trifacta/queryResults/[email protected]/MyDataset/43/",
"user": "[email protected]",
"source_data": "/trifacta/queryResults/[email protected]/redshift-test/datasources.tsv",
"profiler": "on"
}

Command - publishing to multiple targets

As part of the CLI job, you can define multiple file or table targets to which to write the job results. For more
information, see CLI Publishing Options File.

Get job status

After you queue a job through the CLI, you can review the status of the job through the application or through the
CLI.

Tip: You can acquire the job ID through the application as needed. For example, at some point in the
future, you might decide to publish to Hive the results from a job you executed two weeks ago. It might be
easiest to retrieve this job identifier from the Dataset Details page. See Dataset Details Page.

Command

Example (All one command):

./trifacta_cli.py get_job_status --user_name <trifacta_user> --password <trifacta_password>
--job_id 42
--cli_output_path ./job_info.out

Output

Job status: Complete

JSON Response

JSON response written to job_info.out:

{
"status": "success",
"job_outputs": {
"other_outputs": [
"hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/42/.profiler/profilerValidValueHistograms.json",
"hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/42/.profiler/profilerSamples.json",
"hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/42/.profiler/profilerTypeCheckHistograms.json",
"hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/42/.profiler/profilerInput.json"
],
"job_result_files": [
"hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/42/cleaned_table_1.json",
"hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/42/cleaned_table_1.csv",
"hdfs://localhost:8020/trifacta/queryResults/[email protected]/MyDataset/42/cleaned_table_1.avro"
]
},
"job_id": 42,
"cli_script": "/trifacta/queryResults/[email protected]/redshift-test/script.cli",
"output_file_formats": [
"csv",
"json",
"avro"
],
"job_type": "spark",
"host": "localhost:3005",
"job_output_path": "/trifacta/queryResults/[email protected]/MyDataset/42/",
"user": "[email protected]",
"source_data": "/trifacta/queryResults/[email protected]/redshift-test/datasources.tsv",
"profiler": "on"
}
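A script typically repeats get_job_status until the job reaches a final state. The Python sketch below shows one way to structure that loop. It assumes the console line format shown above ("Job status: Complete") and treats Complete and Failed as the final states. The callable get_status_line is a stand-in for whatever code runs the CLI and returns its output line, and the interval and check limit are arbitrary choices.

```python
import time

TERMINAL_STATES = {"Complete", "Failed"}


def parse_status_line(line):
    """Extract the status from a line such as 'Job status: Complete'."""
    return line.rsplit(": ", 1)[-1].strip()


def wait_for_job(get_status_line, interval_seconds=20, max_checks=180,
                 sleep=time.sleep):
    """Poll until the job is Complete or Failed, then return that status."""
    for _ in range(max_checks):
        status = parse_status_line(get_status_line())
        if status in TERMINAL_STATES:
            return status
        sleep(interval_seconds)
    raise TimeoutError("job did not reach a terminal state")
```

Passing the sleep function as a parameter keeps the loop easy to exercise without waiting, and the check limit prevents the script from polling forever if the job never finishes.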

Publish

You can publish job results for completed jobs to specified database tables:

If the table does not exist, it is created.
If the table does exist, the data is appended to the existing table.
Results can be published to Hive or Redshift:
For Hive, you can publish Avro or Parquet results from HDFS or S3 to Hive.
For Redshift, you can publish CSV, JSON, or Avro results from S3 to Redshift.

NOTE: To publish to Redshift, results must be written first to S3.

NOTE: Even if you are publishing to the default schema, you must preface the table value with
the name of the schema to use: MySchema.MyTable.

Publish commands can be executed as soon as the job identifier has been created. After the publish command is
submitted, the publish job is queued for execution after any related transform job has been completed.

NOTE: You cannot publish ad-hoc results for a job when another publishing job is in progress for the
same job through the application or the command line interface. Please wait until the previous job has
been published before retrying to publish the failing job. This is a known issue.

You execute one publish command for each output that you wish to write to a supported database table.
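The format rules above differ by target: Hive accepts Avro or Parquet results, and Redshift accepts CSV, JSON, or Avro results. The Python sketch below encodes those two rules so that a script can reject an unsupported combination before issuing a publish command. The function name can_publish is illustrative, and the sketch does not check where the results are stored.

```python
# Formats per target, as listed in this section.
PUBLISH_FORMATS = {
    "hive": {"avro", "pqt"},
    "redshift": {"csv", "json", "avro"},
}


def can_publish(target, publish_format):
    """Return True when the format is listed for the target in this section."""
    return publish_format in PUBLISH_FORMATS.get(target, set())


for target, fmt in [("hive", "pqt"), ("redshift", "pqt"), ("redshift", "csv")]:
    print(target, fmt, can_publish(target, fmt))
```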

Command

Example (All one command):

./trifacta_cli.py publish --user_name <trifacta_user> --password <trifacta_password>
--job_id 42
--database dev --table table_job_42 --conn_name 1 --publish_format avro
--cli_output_path ./publish_info.out

Output

Create new table every run has been successfully launched:
You may monitor the progress of your publish job here: http://localhost:3005/jobs
Upon success, you may view the results of your publish job here: http://localhost:3005/jobs/42

JSON Response

JSON response written to publish_info.out:

{"status": "Job Started", "job_id": 42}

Get publications

You can retrieve a JSON list of all publications that have been executed for a specific job.

Command

Example (All one command):

./trifacta_cli.py get_publications --user_name <trifacta_user> --password <trifacta_password>
--job_id 42
--cli_output_path ./publications.out --publish_format avro

Output

Job with id 42 has 2 avro publication(s) associated with it. The list of
publications is available in "./publications.out".

JSON Response

JSON response written to publications.out:

{
"publications": [
{
"publication_target": "redshift",
"job_id": "42",
"database": "dev",
"publication_id": 69,
"app_host": "trifacta.example.com:3005",
"user": "[email protected]",
"table": "table_job_42",
"publish_format": "avro",
"connect_str": "jdbc:redshift://dev.example.com:5439/dev"
},
{
"publication_target": "hive",
"job_id": "42",
"database": "default",
"publication_id": 70,
"app_host": "trifacta.example.com:3005",
"user": "[email protected]",
"table": "table_job_42",
"publish_format": "avro",
"connect_str": "jdbc:hive2://hadoop:10000/default"
}
]
}
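The publications list can be filtered after it is read from the output file. The Python sketch below selects the entries for one publication target, using the key names from the response above. The sample is abbreviated to the keys that the code reads, and the function name publications_for is illustrative.

```python
import json


def publications_for(response_text, target):
    """Return the publication entries whose publication_target matches."""
    response = json.loads(response_text)
    return [
        entry for entry in response.get("publications", [])
        if entry.get("publication_target") == target
    ]


# Abbreviated sample based on the response shown above.
sample = """
{
  "publications": [
    {"publication_target": "redshift", "publication_id": 69,
     "database": "dev", "table": "table_job_42"},
    {"publication_target": "hive", "publication_id": 70,
     "database": "default", "table": "table_job_42"}
  ]
}
"""
for entry in publications_for(sample, "hive"):
    print(entry["publication_id"], entry["database"], entry["table"])
```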

Load data into table

You can load data into pre-existing Redshift tables.


Data is appended after any existing rows.
If the table does not exist, the job fails.

NOTE: When appending data into a Redshift table, the columns displayed in the Transformer page must
match the order and data type of the columns in the target table.
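Because the append fails to line up when columns differ in order or type, it can help to compare the two column lists before loading. The Python sketch below reports the first position at which they differ. It assumes that you have already collected the column types of the results and of the target table as lists of strings; the type names and the function name first_type_mismatch are hypothetical.

```python
def first_type_mismatch(result_types, table_types):
    """Return the index of the first differing position, or None if aligned."""
    for index, (ours, theirs) in enumerate(zip(result_types, table_types)):
        if ours.lower() != theirs.lower():
            return index
    if len(result_types) != len(table_types):
        # One list is a prefix of the other; the first extra column differs.
        return min(len(result_types), len(table_types))
    return None


# Hypothetical column types, in column order.
print(first_type_mismatch(["integer", "varchar", "date"],
                          ["integer", "date", "varchar"]))
```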

Command

Example (All one command):

./trifacta_cli.py load_data --user_name <trifacta_user> --password <trifacta_password>
--job_id 42
--database dev --table table_42 --conn_name aSQLServerConnection
--publish_format avro
--cli_output_path ./load_info.out

Output

Append to this table every run has been successfully launched:
You may monitor the progress of your publish job here: http://localhost:3005/jobs
Upon success, you may view the results of your Load data/Append job here: http://localhost:3005/jobs/42

JSON Response

JSON response written to load_info.out:

{"status": "Job Started", "job_id": 42}

Truncate and load

For existing tables, you can clear them and load them with results from a job. If the table does not exist, a new
one is created and populated.

Command

Example (All one command):

./trifacta_cli.py truncate_and_load --user_name <trifacta_user>
--password <trifacta_password> --job_id 10
--database dev --table table_43 --conn_name aSQLServerConnection
--publish_format avro
--cli_output_path ./load_and_trunc_info.out

Output

Truncate and Load has been successfully launched:
You may monitor the progress of your publish job here: http://localhost:3005/jobs
Upon success, you may view the results of your Truncate and Load job here: http://localhost:3005/jobs/10

JSON Response

JSON response written to load_and_trunc_info.out:

{"status": "Job Started", "job_id": 10}

CLI Example - Parameterize Job Runs

In the next release of Trifacta® Wrangler Enterprise after Release 6.0, the Trifacta command line
interface tools will be removed from the product (End of Life). Before upgrading to that release or
a later one, you must migrate your usage of the CLI to use the REST APIs. For more information,
see CLI Migration to APIs.

You can use the following Bash script to execute parameterized job runs on the Trifacta® node. This script
accepts parameters to identify the CLI package downloaded to the node and then runs the job, whose output
includes an identifier for the current date. In this manner, the script can be run on a daily basis on any number of
CLI packages.
The CLI package includes:
script.cli - script file
datasources.tsv - file containing a pointer to the storage location of the source data

These values are provided to the script as command-line parameters (--script and --source).

Based on the specified parameters, this script does the following:


Launches the job
Monitors job execution
Generates error messages if the execution fails
Generates a success message if the execution succeeds

#!/bin/bash

## Example script to run a scheduled job after updating the date.

## Scan the command line arguments for the script file path and for the
## datasources.tsv file path
for i in "$@"
do
case $i in
--script=*)
ScriptParam="${i#*=}"
;;

--source=*)
SourceParam="${i#*=}"
;;

*)
# unknown option
echo "${i} parameter is not recognized. Please provide --script and --source values."
echo
exit 1
;;
esac
done
if [ -z "${ScriptParam}" ]
then
echo "--script param is required."
echo
exit 1
fi
if [ -z "${SourceParam}" ]
then
echo "--source param is required."
echo
exit 1
fi

## Get the current date.
DATE=$(date +%m%d)

## Define Job parameters
AppHost=http://localhost:3005
User='[email protected]'
Password=password
## Hard-coded version: Script='/var/log/myDir/script.cli'
Script="${ScriptParam}"
## Hard-coded version: Data='/var/log/myDir/datasources.tsv'
Data="${SourceParam}"
JobType='spark'
OutputFormats='avro'
OutputPath="/data/common/output/$DATE"
extraArgs="--disable_server_certificate_verification"

## uncomment once daily job commences


## Change the job source date

## NEWPATH="hdfs://namenode:port/path/to/$DATE/file"
## echo $NEWPATH > $Data

## Launch job
echo "/opt/trifacta/bin/trifacta_cli.py run_job --script=$Script \
--data=$Data --host=$AppHost --job_type=$JobType --user_name=$User \
--password=$Password --output_formats=$OutputFormats \
--output_path=$OutputPath $extraArgs" >> stdout.txt
/opt/trifacta/bin/trifacta_cli.py run_job --script="$Script" --data="$Data" \
--host="$AppHost" --job_type="$JobType" --user_name="$User" \
--password="$Password" --output_formats="$OutputFormats" \
--output_path="$OutputPath" $extraArgs >> stdout.txt 2>> stderr.txt

JobLaunched=$?

if [ "$JobLaunched" -ne 0 ]
then
echo "Failed to launch job. See stderr.txt for details"
exit 1
fi

## Parse job id from stdout.txt
JobId=$(cat stdout.txt | sed ':a;N;$!ba;s/\n/ /g;s/.*Job #\([0-9]*\).*/\1/')

JobInfo=''
JobStatus='Pending'

## If jobId exists..
if [ "$JobId" -ge 0 ]
then
echo "Job with Id " $JobId " launched"

## Start loop to monitor job status
while [ "$JobStatus" != 'Complete' ] && [ "$JobStatus" != 'Failed' ]
do
## Get job status from server
command="/opt/trifacta/bin/trifacta_cli.py get_job_status --job_id=$JobId --host=$AppHost --user_name=$User --password=$Password $extraArgs"

JobStatus=`$command | sed -e 's/^.*: //'`


if [ "$JobStatus" != 'Complete' ] && [ "$JobStatus" != 'Failed' ]
then
echo "Waiting for job to complete..."
sleep 20
fi
done
else
echo "Failed to launch job. See stderr.txt for details"

exit 1
fi

if [ "$JobStatus" = 'Complete' ]
then
echo "Job "$JobId" is complete."
echo "Output path is $OutputPath"
else

echo "Job failed to complete. Job status = "$JobStatus
exit 1
fi

You can use the above as a basic template for execution of any type of CLI command.
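If the orchestration is written in Python instead of Bash, the job identifier can be extracted from the CLI console output with a regular expression. The sketch below assumes the launch message format shown earlier ("Job #42 has been successfully launched:") and returns the last identifier found, which matters when the output file accumulates several runs. The function name last_job_id is illustrative.

```python
import re

JOB_ID_PATTERN = re.compile(r"Job #(\d+)")


def last_job_id(stdout_text):
    """Return the most recent job ID mentioned in the CLI output, or None."""
    matches = JOB_ID_PATTERN.findall(stdout_text)
    return int(matches[-1]) if matches else None


sample = (
    "Job #42 has been successfully launched:\n"
    "You may monitor the progress of your job here: http://localhost:3005/jobs\n"
)
print(last_job_id(sample))
```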
CLI Publishing Options File

In the next release of Trifacta® Wrangler Enterprise after Release 6.0, the Trifacta command line
interface tools will be removed from the product (End of Life). Before upgrading to that release or
a later one, you must migrate your usage of the CLI to use the REST APIs. For more information,
see CLI Migration to APIs.

If needed, you can specify multiple file or table targets as part of a single CLI job. In your CLI command, the path
on the Trifacta® node to this JSON file is specified as the publish_opt_file parameter, as in the following:

./trifacta_cli.py run_job --user_name <trifacta_user> --password <trifacta_password>
--job_type spark --data redshift-test/datasources.tsv
--script redshift-test/script.cli --cli_output_path ./job_info.out
--profiler on --publish_opt_file /json/publish/file/publishopts.json

The file publishopts.json contains the specification of the targets.

Tip: To specify this file, you can run this job through the application. After the job has completed,
download the CLI script from the Recipe panel in the Transformer page. The downloaded
publishopts.json file contains the specification for the targets you just executed. See Recipe Panel.

Example publishopts.json file:

{
"file": [
{
"path": "hdfs://hadoop:50070/trifacta/queryResults/[email protected]/POS-r01.csv",
"action": "create",
"format": "csv",
"header": true,
"asSingleFile": true,
"compression": "none"
},
{
"path": "hdfs://hadoop:50070/trifacta/queryResults/[email protected]/POS-r01.json",
"action": "create",
"format": "json",
"header": false,
"asSingleFile": false,
"compression": "none"
}
],
"hive": [
{
"databaseName":"default",
"tableName":"POS-r01",
"action":"overwrite"
}
]
}

NOTE: All of the following properties require valid values, unless noted.

File targets:

Property Description

path Full path to the target file. Path must include the protocol identifier, such as hdfs:// and the port number.

action The action to take on the file. Supported actions:

create - Create a new file with each subsequent publication. Filenames for subsequent job runs are
appended with the job number identifier.
append - The results of each subsequent job run are appended to the existing file contents.
replace - The results of each subsequent job run replace the same file. Previous job run results are lost
unless moved out of the location.

Some limitations apply to these options. See Run Job Page.

format Output format for the file. Supported formats:

csv
json
avro
pqt

header If set to true, then output files in CSV format include a header row. Headers cannot be applied when compression
is enabled.

asSingleFile If set to true, then output files are written to a single file.

If set to false, then the output files are written to multiple files as needed.

compression (optional) This property can be used to specify any compression to apply to a text-based file. Supported compression
formats:

gzip
bzip2
snappy
If this is not specified, then no compression is applied to the output file.

Hive targets:

Property Description

databaseName Name of the database.

NOTE: The database must contain at least one table.

tableName Name of the table in the database to which to write.

action The write action to apply to the table. Supported actions:

create - Create a new table with each subsequent publication. Table names for subsequent job runs are
appended with a timestamp.
append - The results of each subsequent job run are appended to the existing table contents.
replace - The results of each subsequent job run are written to the same table, which has been emptied.
Previous job run results are lost unless moved out of the location (dropAndLoad).
overwrite - The results of each subsequent job run are written to a newly created table with the same name
as the output table from the previous job run (truncateAndLoad).

Some limitations apply to these options. See Run Job Page.
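A publishing options file can also be generated by a script. The Python sketch below builds the same structure as the example file and rejects values that the property tables above do not list. The helper names are hypothetical, the target path is a placeholder, and the checks cover only the rules stated in this section (for example, that headers cannot be combined with compression). The value "none" for compression is taken from the example file.

```python
import json

FILE_ACTIONS = {"create", "append", "replace"}
FILE_FORMATS = {"csv", "json", "avro", "pqt"}
COMPRESSIONS = {"none", "gzip", "bzip2", "snappy"}
HIVE_ACTIONS = {"create", "append", "replace", "overwrite"}


def file_target(path, action, fmt, header=False, as_single_file=False,
                compression="none"):
    """Return one entry for the "file" list, validating the documented values."""
    if "://" not in path:
        raise ValueError("path must include a protocol identifier such as hdfs://")
    if action not in FILE_ACTIONS:
        raise ValueError(f"unsupported file action: {action}")
    if fmt not in FILE_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    if compression not in COMPRESSIONS:
        raise ValueError(f"unsupported compression: {compression}")
    if header and compression != "none":
        raise ValueError("headers cannot be applied when compression is enabled")
    return {"path": path, "action": action, "format": fmt, "header": header,
            "asSingleFile": as_single_file, "compression": compression}


def hive_target(database_name, table_name, action):
    """Return one entry for the "hive" list."""
    if action not in HIVE_ACTIONS:
        raise ValueError(f"unsupported Hive action: {action}")
    return {"databaseName": database_name, "tableName": table_name,
            "action": action}


def publish_options(file_targets=(), hive_targets=()):
    """Serialize the targets in the layout of the example publishopts.json."""
    return json.dumps({"file": list(file_targets), "hive": list(hive_targets)},
                      indent=2)


print(publish_options(
    [file_target("hdfs://hadoop:50070/path/to/POS-r01.csv", "create", "csv",
                 header=True, as_single_file=True)],
    [hive_target("default", "POS-r01", "overwrite")],
))
```

The printed text can be saved to a file and referenced through the publish_opt_file parameter.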

CLI for User Admin

In the next release of Trifacta® Wrangler Enterprise after Release 6.0, the Trifacta command line
interface tools will be removed from the product (End of Life). Before upgrading to that release or
a later one, you must migrate your usage of the CLI to use the REST APIs. For more information,
see CLI Migration to APIs.

Contents:
Command Reference
Troubleshooting

NOTE: This feature requires developer-level skills to enable and use.

The Command Line Interface for User Administration enables administrators to perform bulk user management
tasks on the platform. You can use the CLI to manage the following tasks:
Create, edit, or delete users.
Enable or disable existing users.
Retrieve individual or all user profiles, including any security details.
Reset passwords.

Command Reference

The CLI tools are stored in the following directory:

/opt/trifacta/bin/

For creating or managing users, specify parameters for the following:

./trifacta_admin_cli.py --admin_username <trifacta_admin_username> \
  --admin_password <trifacta_admin_password> \
  create_user --user_name [email protected] \
  --password user_pwd --verbose
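
When the CLI is driven from a script, the invocation above can be assembled as an argument list. The following is a minimal sketch, assuming Python 3.8 or later. The helper name build_create_user_args and the placeholder values are hypothetical; only the flags shown in this section are used.

```python
import shlex

# Directory documented in this section; adjust if your install differs.
CLI_PATH = "/opt/trifacta/bin/trifacta_admin_cli.py"


def build_create_user_args(admin_username, admin_password, user_name, password, verbose=False):
    """Return the argument list for a create_user call (hypothetical helper)."""
    args = [
        CLI_PATH,
        "--admin_username", admin_username,
        "--admin_password", admin_password,
        "create_user",
        "--user_name", user_name,
        "--password", password,
    ]
    if verbose:
        args.append("--verbose")
    return args


# Placeholder values for illustration only.
args = build_create_user_args("<admin_user>", "<admin_pwd>", "<user_email>", "<user_pwd>", verbose=True)
print(shlex.join(args))  # shell-quoted form of the command, for review
```

On the Trifacta node, the same list could be passed to subprocess.run(args, check=True). Passing a list rather than a single string avoids shell quoting problems with passwords.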



Parameters

Parameter Description Applicable CLI Commands

admin_username Username of the admin account to be used to execute the user admin All
command. Please specify the full username.

NOTE: If Single Sign On is enabled, you must specify your SSO credentials here instead. See below.

admin_password Password of the admin account. All

NOTE: If Single Sign On is enabled, you must specify your SSO credentials here instead. See below.

NOTE: If you have enabled Kerberos-based access to the Trifacta platform, you do not need to provide a password. Enabling this access requires additional configuration. See Configure for Kerberos Integration.

NOTE: Passwords can be stored in an external file, which is automatically checked during script execution. See CLI Config File.

admin_command_type The type of CLI command to execute. Accepted values: All

create_user - Create a new user account with the specified credentials.
show_user - Retrieve account information for the specified user account.
edit_user - Modify the specified user account.
get_password_reset_url - Generate a password reset URL for a specified user.
delete_user - Delete the specified user account.

See Examples below.

host (Optional) The server and port number of the Trifacta® instance. By All
default, this value is set to http://localhost:3005. Specify a
new value if needed.

NOTE: When SSO is enabled, you must specify this value to point to the Apache server port where the Trifacta node authentication gateway listens. When running the CLI on the Trifacta node, this value is typically the following:

https://localhost:2443

See below.

NOTE: In some environments, the http:// or https:// protocol identifier may be required as part of the host value.



user_name Username of the account to be modified. This value is the user ID. It must All
resolve to a valid, accessible email address. Some features of the
platform fail to work correctly with invalid email addresses.

NOTE: In the response, this value is listed as email.

password Password of the account to be modified. All

NOTE: If you have enabled Kerberos-based access to the Trifacta platform, you do not need to provide a password. Enabling this access requires additional configuration. See Configure for Kerberos Integration.

name Display name for the user. create_user and edit_user, if making changes to this parameter.

verbose Generate verbose output. All

disable_ssl_certification (Optional) When communicating over HTTPS, this setting can be used to All
override the default behavior of validating the server certificate before
executing the command.

NOTE: If you have stored a self-signed certificate on the Trifacta node, set the REQUESTS_CA_BUNDLE environment variable to point to the directory that contains the trusted server's certificate(s). The CLI verifies against these certificates. In this case, the disable_ssl_certification parameter is not needed.

disable (Optional) Put the user in a disabled state. create_user and edit_user

enable (Optional) Put the user in an enabled state. Default is to enable the user. create_user and edit_user

transfer_assets_to (Optional) When deleting a user, you can optionally transfer all of the delete_user
user's assets to another user.

NOTE: Assets cannot be transferred to another user before or after the user deletion command. If assets are not transferred, they remain unowned in the system and are not removed from any form of storage.

For documentation on the CLI parameters, run:

./trifacta_admin_cli.py --help

Config file

You can store Trifacta platform username and password information in an external file. See CLI Config File.



User account properties

The following user account properties are exposed through the command line:

Property Description Editable through CLI?

--hadoopPrincipal Hadoop principal value that is used to connect to the Hadoop environment. This setting Y
applies only when secure impersonation is enabled.

--outputHomeDir The output home directory for the user. By default, the results of each job executed by the Y
user are generated in a sub-directory within this one.

--name The display name for the user. Y

--isDisabled When set to True, the user account is disabled and cannot be used to login to the Y
application.

--email The email address associated with the user account. The email address is also the userID for Y
the account.

--ssoPrincipal The SSO principal value associated with the user account. This value only applies to Y
environments that are integrated with an enterprise Single Sign On solution.

--enableAdmin When set to True, this user account is a system administrator account. You should limit the Y
number of accounts that have system administrator access.

--disableAdmin When set to True, this user account is not a system administrator account. You should limit Y
the number of accounts that have system administrator access.

--lastLoginTime The timestamp of when the user account was most recently used to login to the application. N

Examples
User Admin under SSO

If you are in an SSO environment, the following properties require special values to properly authenticate with
AD/LDAP. All values are required:

Property Description

admin_username Use the SSO username for the platform admin user issuing the command.

NOTE: In an SSO environment, the default admin user account for the Trifacta platform does not work.
The issuing user must be an SSO user that has been promoted to admin within the Trifacta platform.

admin_password Use the password associated with the SSO username.

host This value must point to the SSO gateway on the Trifacta node and must include the port number. If you are
running the CLI on the Trifacta node, use the following:

https://localhost:2443

For more information, see Configure SSO for AD-LDAP.

ssoPrincipal In SSO environments, this parameter is required. It must be set to the SSO principal value associated with the
user that is being modified.



Create user

Command
Example (all one command):

./trifacta_admin_cli.py --admin_username <trifacta_admin_user> \
  --admin_password <trifacta_admin_password> \
  create_user --user_name [email protected] --password Hello2U --name Joe

Notes
Add the --disable parameter to create the user in a disabled state.

Output

Create user [email protected]

Account information for [email protected]
hadoopPrincipal: None
outputHomeDir: /trifacta/queryResults/[email protected]
name: Joe
isDisabled: False
email: [email protected]
ssoPrincipal: None
enableAdmin: False
lastLoginTime: None
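
If a script needs to act on this output, the key: value lines can be collected into a dictionary. The following is a minimal sketch, assuming the output keeps the layout shown above. The function name parse_account_info and the sample text are hypothetical, and values are kept as the strings that the CLI prints.

```python
def parse_account_info(output):
    """Collect 'key: value' lines from CLI output into a dict (hypothetical helper)."""
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition(": ")
        key = key.strip()
        # Skip status lines such as "Create user ..." that are not single-word keys.
        if sep and key and " " not in key:
            info[key] = value.strip()
    return info


sample = """Create user <user_email>

Account information for <user_email>
hadoopPrincipal: None
name: Joe
isDisabled: False
"""
print(parse_account_info(sample))
```

Values such as None and False arrive as text, so compare against the strings "None" and "False" or convert them explicitly.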

Show user

Command
Example (all one command):

./trifacta_admin_cli.py --admin_username <trifacta_admin_user> \
  --admin_password <trifacta_admin_password> \
  show_user --user_name [email protected]

Output

Show user [email protected]

Account information for [email protected]
hadoopPrincipal: None
outputHomeDir: /trifacta/queryResults/[email protected]
name: Joe
isDisabled: False
email: [email protected]
ssoPrincipal: None
enableAdmin: False
lastLoginTime: None

Edit user

Command
The following command changes the Single Sign On principal for the user to a new value. The values for other
user account settings found in the response below can be inserted in the command to modify those settings.
Example (all one command):

./trifacta_admin_cli.py --admin_username <trifacta_admin_user> \
  --admin_password <trifacta_admin_password> \
  edit_user --user_name [email protected] --ssoPrincipal sso_principal

Output

Edit user [email protected]

Account information for [email protected]
hadoopPrincipal: None
outputHomeDir: /trifacta/queryResults/[email protected]
name: Joe
isDisabled: True
email: [email protected]
ssoPrincipal: sso_principal
enableAdmin: False
lastLoginTime: None

Generate password reset URL

Command
The following command generates a URL for a specified user that enables the user to reset his or her account
password.

NOTE: The script returns a URL containing the hostname with which it was invoked. You should invoke the script with a fully qualified domain name. If the returned hostname is not accessible to the designated user, then the hostname must be replaced before the URL is passed to the user.

Example (all one command):

./trifacta_admin_cli.py --admin_username <trifacta_admin_user> \
  --admin_password <trifacta_admin_password> \
  get_password_reset_url --user_name [email protected]

Output

Generating password reset url for user [email protected]

Reset url:http://localhost:3005/[email protected]&code=CD44232791

Disable user

Command
The following command disables the specified user. Disabled users can no longer log in to the application and cannot execute any jobs or commands at the command line.
Example (all one command):

./trifacta_admin_cli.py --admin_username <trifacta_admin_user> \
  --admin_password <trifacta_admin_password> \
  edit_user --user_name [email protected] --disable

Output

Edit user [email protected]

Account information for [email protected]
hadoopPrincipal: None
outputHomeDir: /trifacta/queryResults/[email protected]
name: Joe
isDisabled: True
email: [email protected]
ssoPrincipal: None
enableAdmin: False
lastLoginTime: None

Delete user

Command
Delete the user [email protected] and transfer his assets to [email protected].

NOTE: The transfer of the deleted user's assets is optional. If it is invoked, the user to whom the assets are assigned must have matching permissions on the datastores where the imported datasets are located.

If it is not invoked, the assets remain on the datastore and cannot be managed through the Trifacta platform until someone creates imported datasets from the files or directories.

Example (all one command):

./trifacta_admin_cli.py --admin_username <trifacta_admin_user> \
  --admin_password <trifacta_admin_password> \
  delete_user --user_name [email protected] \
  --transfer_assets_to [email protected]

Output

Delete user [email protected]
Transferring assets from [email protected] to [email protected]

Troubleshooting

Exceeded 30 redirects when executing Admin CLI in SSO mode

If you are executing the Admin CLI in SSO mode on the localhost, the following error message may be written to standard output:

Exceeded 30 redirects

Solution:
This problem occurs when the CLI is run against the application instead of the gateway proxy. Set the host parameter to the host and port of the gateway proxy, not the host of the application.

CLI Config File

In the next release of Trifacta® Wrangler Enterprise after Release 6.0, the Trifacta command line
interface tools will be removed from the product (End of Life). Before upgrading to that release or
a later one, you must migrate your usage of the CLI to use the REST APIs. For more information,
see CLI Migration to APIs.

As an alternative to including admin passwords in each command that is executed, you can insert a set of admin
credentials into a configuration file. The file location is the following:

~/.trifacta/config.json

One or more sets of credentials can be specified in the following format:



{
"credentials": {
"usernameX":"passwordX",
"usernameY":"passwordY"
}
}

In your scripts, you can specify just the value for admin_username, and the config file is checked for the
appropriate password, which is applied to the command.

NOTE: The permissions on this config file should be set such that only the user executing the command
can read the file.
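
The file format and the permission note above can be sketched as follows. This is a minimal example, assuming a POSIX system and Python 3. The function name write_cli_config is hypothetical, and the usernameX/passwordX values are the placeholders from the format above.

```python
import json
import os


def write_cli_config(credentials, path):
    """Write {"credentials": {...}} to path, readable only by the owner (hypothetical helper)."""
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    # Create the file with mode 0600 so that other users cannot read it.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        json.dump({"credentials": credentials}, handle, indent=2)
    # Tighten the mode even if the file already existed with looser permissions.
    os.chmod(path, 0o600)
```

A typical call would be write_cli_config({"usernameX": "passwordX"}, os.path.expanduser("~/.trifacta/config.json")), run as the user who executes the CLI.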

API Reference
This section contains reference information on the REST APIs that are made available by the Trifacta® platform.
Topics:
API Overview
API Authentication
Manage API Access Tokens
API Endpoints
v4 Endpoints
API AccessTokens Create v4
API AccessTokens Delete v4
API AccessTokens Get List v4
API AccessTokens Get v4
API Connections Create DryRun v4
API Connections Create v4
API Connections Delete v4
API Connections Get List v4
API Connections Get Status v4
API Connections Get v4
API Connections Patch v4
API Connections Permissions Create User v4
API Connections Permissions Delete User v4
API Connections Permissions Get User v4
API Connections Vendors Get List v4
API Deployments Create v4
API Deployments Delete v4
API Deployments Get List v4
API Deployments Get Release List v4
API Deployments Get v4
API Deployments Object Import Rules Patch v4
API Deployments Patch v4
API Deployments Run v4
API Deployments Value Import Rules Patch v4
API EMRClusters Create v4
API EMRClusters Delete v4
API EMRClusters Get Count v4
API EMRClusters Get List v4
API EMRClusters Get v4
API EMRClusters Patch v4



API Flows Create v4
API Flows Delete v4
API Flows Get List v4
API Flows Get v4
API Flows Package Get DryRun v4
API Flows Package Get v4
API Flows Package Post DryRun v4
API Flows Package Post v4
API Flows Patch v4
API ImportedDatasets Create v4
API ImportedDatasets Delete v4
API ImportedDatasets Get List v4
API ImportedDatasets Get v4
API ImportedDatasets Patch v4
API ImportedDatasets Post AddToFlow v4
API JobGroups Cancel v4
API JobGroups Create v4
API JobGroups Delete v4
API JobGroups Get Jobs v4
API JobGroups Get List v4
API JobGroups Get Publications v4
API JobGroups Get Status v4
API JobGroups Get v4
API JobGroups Put Publish v4
API OutputObjects Create v4
API OutputObjects Delete v4
API OutputObjects Get List v4
API OutputObjects Get v4
API OutputObjects Update v4
API People Create v4
API People Delete v4
API People Get List v4
API People Get v4
API People Patch v4
API Publications Create v4
API Publications Delete v4
API Publications Get List v4
API Publications Get v4
API Publications Update v4
API Releases Create DryRun v4
API Releases Create v4
API Releases Delete v4
API Releases Get v4
API Releases Package Get v4
API Releases Patch v4
API WrangledDatasets Create v4
API WrangledDatasets Delete v4
API WrangledDatasets Get List v4
API WrangledDatasets Get PrimaryInputDataset v4
API WrangledDatasets Get v4
API WrangledDatasets Patch v4
API WrangledDatasets Post AddToFlow v4
API WrangledDatasets Put PrimaryInputDataset v4
API WriteSettings Create v4
API WriteSettings Delete v4
API WriteSettings Get List v4
API WriteSettings Get v4



API WriteSettings Update v4
v3 Endpoints
API Connections Create v3
API Connections Delete v3
API Connections Get List v3
API Connections Get Status v3
API Connections Get v3
API Deployments Create v3
API Deployments Delete v3
API Deployments Get List v3
API Deployments Get Release List v3
API Deployments Get v3
API Deployments Object Import Rules Patch v3
API Deployments Patch v3
API Deployments Run v3
API Deployments Value Import Rules Patch v3
API Flows Create v3
API Flows Delete v3
API Flows Get List v3
API Flows Get v3
API Flows Package Get DryRun v3
API Flows Package Get v3
API Flows Package Post DryRun v3
API Flows Package Post v3
API Flows Patch v3
API ImportedDatasets Create v3
API ImportedDatasets Delete v3
API ImportedDatasets Get List v3
API ImportedDatasets Get v3
API ImportedDatasets Post AddToFlow v3
API JobGroups Create v3
API JobGroups Delete v3
API JobGroups Get Jobs v3
API JobGroups Get List v3
API JobGroups Get Status v3
API JobGroups Get v3
API JobGroups Put Publish v3
API People Create v3
API People Delete v3
API People Get List v3
API People Get v3
API People Patch v3
API Releases Create DryRun v3
API Releases Create v3
API Releases Delete v3
API Releases Get v3
API Releases Package Get v3
API Releases Patch v3
API WrangledDatasets Create v3
API WrangledDatasets Delete v3
API WrangledDatasets Get List v3
API WrangledDatasets Get PrimaryInputDataset v3
API WrangledDatasets Get v3
API WrangledDatasets Put PrimaryInputDataset v3
API Session Get
API Version Support Matrix
API Migration to v4



API - UI Integrations
UI Integration - Create Dataset
API Workflows
API Workflow - Develop a Flow
API Workflow - Deploy a Flow
API Workflow - Run Job on Dataset with Parameters
API Workflow - Publish Results
API Workflow - Manage Outputs
API Workflow - Swap Datasets

API Overview
Contents:
Design Overview
URL Format
Naming Conventions
Operations and Methods
Embedding Associations
Media Type Headers
Authentication
SSL
Upload
Versioning and Endpoint Lifecycle
HTTP Status Codes and Errors
Caching
Use Cases
REST API Tasks
UI Integrations
About This Documentation

To enable programmatic control over its objects, the Trifacta® platform supports a range of REST API endpoints
across the objects in the platform. This section provides an overview of the API design, methods, and supported
use cases.
Supported operations:
Connections: Get information about connections
Datasets: Create, list, update, and delete operations on datasets
Swap datasets
Jobs and Results:
Launch job
Get job status
Publish job results
Create dataset from results
Get profile metadata:
Quality bar status
Schema (column names and types)
Users: Create, list, delete
Uses:
Can be used for automation of resource management for end-to-end workflow
Can be used to integrate wrangling experience in third-party application
See Use Cases below.



Design Overview

URL Format

<http/https>://<my_server>:<port_number>/<version>/<endpoint>/[resource_id]/[association][?args]

Elements in square brackets [brackets] are optional.

Element Description Example

<http/https> HTTP protocol identifier. The protocol should be https in a production https
environment.

<my_server> Name of the Trifacta node wrangler.example.com

<port_number> Port number over which you access the Trifacta platform. By default, this 3005
value is 3005.

<version> API version number. v3

NOTE: Unless stated otherwise, the version for all API endpoints is v3.

<endpoint> Name of the API endpoint to use. /connections

[resource_id] Internal identifier for the specific resource requested from the endpoint. /10
This value defines the object against which the requested operation is
performed.

[association] If applicable, the association identifies the API endpoint that is /jobGroups
requested using the context determined by the <endpoint> and the
[resource_id].
Associations can also be referenced by query parameter. See
Embedding Associations below.

[?args] In some cases, arguments can be passed to the endpoint in the form of ?arg1=value1&arg2=value2
query parameters.
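
The URL format above can be assembled programmatically. The following is a minimal sketch in Python 3; the function name build_api_url and its defaults are hypothetical, with the port and version defaults taken from the table above.

```python
from urllib.parse import urlencode


def build_api_url(server, endpoint, version="v3", scheme="https", port=3005,
                  resource_id=None, association=None, args=None):
    """Assemble <scheme>://<server>:<port>/<version>/<endpoint>/[id]/[association][?args]."""
    url = f"{scheme}://{server}:{port}/{version}/{endpoint}"
    if resource_id is not None:
        url += f"/{resource_id}"
        # An association is only meaningful in the context of a specific resource.
        if association:
            url += f"/{association}"
    if args:
        url += "?" + urlencode(args)
    return url


print(build_api_url("wrangler.example.com", "people", resource_id=1, association="jobGroups"))
print(build_api_url("wrangler.example.com", "jobGroups", resource_id=6, args={"embed": "flowNode"}))
```

The second call references an association through the embed query parameter instead of a path segment.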

Naming Conventions

Resource names are plural and expressed in camelCase.


Resource names are consistent between main URL and URL parameter.

v4 conventions

The following conventions apply to v4 and later versions of the APIs:


Parameter lists are consistently enveloped in the following manner:



{ "data": [
{
...
}
]
}

Field names are in camelCase and are consistent with the resource name in the URL or with the embed URL parameter.
Foreign keys used in earlier API versions have been replaced with nested identifier objects, as in the following:

v3 and earlier        v4 and later

"createdBy": 1,       "creator": { "id": 1 },

"updatedBy": 2,       "updater": { "id": 2 },
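
Client code that was written against the earlier field names may need a small translation step. The following is a minimal sketch in Python 3 that covers only the two field pairs shown above. The function name to_v4_references is hypothetical, and other fields may have changed in ways not listed here.

```python
# Field pairs taken from the comparison above: v3 foreign key -> v4 nested object.
V3_TO_V4_KEYS = {"createdBy": "creator", "updatedBy": "updater"}


def to_v4_references(record):
    """Return a copy of a v3-style record with foreign keys as {"id": ...} objects."""
    converted = {}
    for key, value in record.items():
        if key in V3_TO_V4_KEYS:
            converted[V3_TO_V4_KEYS[key]] = {"id": value}
        else:
            converted[key] = value
    return converted


print(to_v4_references({"id": 6, "createdBy": 1, "updatedBy": 2}))
```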

Operations and Methods

The APIs support basic CRUD (Create, Read, Update, and Delete) operations across most platform objects.

NOTE: Some of these specific operations may not be supported in the current release. For a complete
list, see API Endpoints.

Operation  HTTP Method  Example URL                 Notes

Create     POST         /v3/people
Create     POST         /v3/jobResults
Read       GET          /v3/people/1                1 = internal user Id
Read       GET          /v3/jobResults/10           10 = internal job Id
Read       GET          /v3/people/1/jobGroups
Read       GET          /v3/jobGroups/4/flowNode    flowNode is a singular reference. Most resource names are plural.
List       GET          /v3/people
List       GET          /v3/jobResults
Update     PATCH        /v3/people/1                Partial replacement
Update     PATCH        /v3/jobResults/10           Partial replacement
Update     PUT          /v3/people/1                Full replacement
Update     PUT          /v3/jobResults/10           Full replacement
Delete     DELETE       /v3/people/1
Delete     DELETE       /v3/jobResults/10

Embedding Associations

An association can be referenced using the above URL structuring or by applying the embed query parameter as
part of the reference to the specific resource. Example:

https://wrangler.example.com/v3/jobGroups/6?embed=flowNode

Example response:

{
"id": 6,
"description": "A nifty job group",
"flowNode": {
"id": 1,
"script": {
"id": 1
},
"terminal": true
...
}
}

The id value of the association is always included in the response.

Media Type Headers

NOTE: Some endpoints may accept and return a custom media type. These endpoints are documented.

Action                                           Header                                             Required?

Client request that expects a response body      request header: Accept: application/json           should include
Client request that includes a request body      request header: Content-Type: application/json     required
Server response that includes a response body    response header: Content-Type: application/json    required
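
The header rules above can be applied when constructing a request. The following is a minimal sketch that uses only the Python 3 standard library. The function name build_json_request, the example URL, and the name field in the body are hypothetical, and no request is sent.

```python
import json
from urllib.request import Request


def build_json_request(url, method="GET", body=None):
    """Build a request with Accept always set and Content-Type set only when a body is sent."""
    headers = {"Accept": "application/json"}
    data = None
    if body is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    return Request(url, data=data, headers=headers, method=method)


# Hypothetical partial update; authentication headers are omitted here.
req = build_json_request("https://wrangler.example.com:3005/v3/people/1", "PATCH", {"name": "Joe"})
print(req.get_method(), req.full_url, req.header_items())
```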



Authentication

The REST APIs use the same authentication methods as the UI. Each call to an API endpoint must include
authentication credentials for a user with access to the requested objects. See API Authentication.

SSL

If SSL has been enabled for the Trifacta platform, requests to URL endpoints are automatically redirected to the
HTTPS equivalent.

Upload

Single-file upload is supported.


Multi-file upload is not supported.

Versioning and Endpoint Lifecycle

NOTE: API versioning is not synchronized to specific releases of Trifacta Wrangler Enterprise. For example, some API endpoints for v4 may be updated, while v3 instances of the API endpoint are still supported.

APIs are designed to be backward compatible so that scripts and other tooling built on a previous version of an
endpoint remain valid until the previous version has reached end-of-life. Each API is supported across a window
of Trifacta Wrangler Enterprise releases, after which you must reference a newer version of the API.
API endpoint routes support a consistent structuring and do not contain business logic.
Version information is available at the following endpoint:

<http/https>://<my_server>:<port_number>/<version>/version

For more information, see API Version Support Matrix.

HTTP Status Codes and Errors

Request Method Request Endpoint HTTP Status Code (success)

POST /v3/<resource> 201 Created

GET /v3/<resource> 200 OK

GET /v3/<resource>/<id> 200 OK, or 304 Not Modified when the client has a cached version.

See Caching below.

PATCH /v3/<resource>/<id> 200 OK

PUT /v3/<resource>/<id> 200 OK

DELETE /v3/<resource>/<id> 204 No Content

The following error codes can apply to any of the above requests:

NOTE: 5xx status codes may be generated by the server.



HTTP Status Code (client errors) Notes

400 Bad Request Potential reasons:

Resource doesn't exist.
Request is incorrectly formatted.
Request contains invalid values.

403 Forbidden Incorrect permissions to access the Resource.

404 Not Found Resource cannot be found.

410 Gone Resource has been previously deleted.

415 Unsupported Media Type Incorrect Accept header.
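
A client can encode the two tables above to decide how to react to a response. The following is a minimal sketch in Python 3. The function name classify_response and its return strings are hypothetical, and the 304 entry for GET reflects the Caching behavior described in this section.

```python
# Success codes per request method, from the first table above.
SUCCESS_CODES = {
    "POST": {201},
    "GET": {200, 304},
    "PATCH": {200},
    "PUT": {200},
    "DELETE": {204},
}

# Client error codes, from the second table above.
CLIENT_ERRORS = {
    400: "Bad Request",
    403: "Forbidden",
    404: "Not Found",
    410: "Gone",
    415: "Unsupported Media Type",
}


def classify_response(method, status):
    """Map an HTTP method and status code to a coarse outcome (hypothetical helper)."""
    if status in SUCCESS_CODES.get(method.upper(), set()):
        return "success"
    if status in CLIENT_ERRORS:
        return "client error: " + CLIENT_ERRORS[status]
    if 500 <= status <= 599:
        return "server error"
    return "unexpected"


print(classify_response("DELETE", 204))
print(classify_response("GET", 410))
```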

Caching

When a resource has been cached in the client, the client may set an If-Modified-Since header in HTTP
date format on the request. If so:

General Response HTTP Status Code

Returns full modified resource 200 OK

Returns an empty response for unmodified resource 304 Not Modified
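
A conditional request of this kind can be sketched as follows, using only the Python 3 standard library. The function name build_conditional_get and the example URL are hypothetical. The timestamp must be timezone-aware and in UTC so that it can be rendered in HTTP date format.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request


def build_conditional_get(url, cached_at):
    """Build a GET request that asks for the resource only if it changed after cached_at."""
    headers = {
        "Accept": "application/json",
        # usegmt=True renders the HTTP date format, which ends in "GMT".
        "If-Modified-Since": format_datetime(cached_at, usegmt=True),
    }
    return Request(url, headers=headers, method="GET")


req = build_conditional_get(
    "https://wrangler.example.com:3005/v3/people/1",
    datetime(2019, 5, 24, tzinfo=timezone.utc),
)
print(req.header_items())
```

When the request is sent, a 304 Not Modified response means that the cached copy can be reused. urllib.request.urlopen reports a 304 by raising HTTPError, so check the code attribute of the exception.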

Use Cases

REST API Tasks

By chaining together sequences of calls to API endpoints, you can create, read, update, and delete objects using
identifiers accessible through the returned JSON. For more information, see API Endpoints.
For more information on endpoint workflows, see API Workflows.

UI Integrations

The REST APIs can also be used for integrating the core transformation experience of the Trifacta platform into a
third-party application. Using a series of URL-based calls, you can retrieve and display specified datasets in the
Transformer page, where authenticated users can wrangle datasets controlled by the third-party application. See
API - UI Integrations.

About This Documentation

Unless otherwise noted, the documentation and examples apply to version 3 (v3) of the Trifacta platform APIs.
Examples may require modification to work in your environment.
API Authentication
Contents:
Required Permissions
API Access Token Authentication
Basic Authentication
SSO Authentication
Kerberos Authentication
Logout



The Trifacta® REST APIs support the following methods of authentication.

Required Permissions

Authenticating user must be a valid user of the deployed instance of the Trifacta platform.

API Access Token Authentication

API access tokens can be acquired and applied to your requests to obscure sensitive Personally Identifiable
Information (PII) and are compliant with common privacy and security standards. These tokens last for a
preconfigured time period and can be renewed as needed.

Tip: API access tokens are the preferred method of authentication.

NOTE: This feature may need to be enabled in your instance of the Trifacta platform. For more
information, see Enable API Access Tokens.

Basic Steps:
1. You submit a request to create a new access token.
a. You can create and delete access tokens through the Settings area of the Trifacta application. See
Access Tokens Page.
b. You can create access tokens through the REST API endpoint.
i. If you do not have a valid access token, you must submit your request to the endpoint using
one of the other forms of authentication.
ii. If you do have a valid access token, you can use it with your submission to generate a new
access token.
iii. See API AccessTokens Create v4.
2. With each request, you submit the token as part of the Authorization header.
3. Continue using the token. As needed, you can create and use additional tokens. There is no limit to the
number of tokens you can create.

Tip: API access tokens work seamlessly with platform-native SAML and LDAP SSO authentication. They
do not work with the reverse proxy method of SSO authentication. Details are below.

For more information on this process, see Manage API Access Tokens.
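
Step 2 above can be sketched as a small header builder in Python 3. This section does not state the header scheme, so the Bearer prefix used here is an assumption; confirm it against API AccessTokens Create v4 before relying on it. The function name token_auth_headers and the token value are hypothetical.

```python
def token_auth_headers(token):
    """Return request headers that carry an API access token.

    Assumption: the token is sent as "Authorization: Bearer <token>".
    """
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    }


print(token_auth_headers("<api_access_token>"))
```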

Basic Authentication

As request parameters, you can submit username/password under Basic Auth to any REST API endpoint.

NOTE: You must submit authentication credentials with each request to the platform.

NOTE: The user must have permissions to execute the endpoint action.



Example:
This example submits authentication credentials over HTTP, including the username and password (me@example.com:me_pwd):

$ curl -u me@example.com:me_pwd \
-b ~/cookies.txt -c ~/cookies.txt \
http://<platform_host>:<platform_port_number>/v3/<endpoint>

where:

Parameter Description

-u me@example.com:me_pwd Required username and password.

-b and -c Required paths and filenames of the files from which HTTP cookies are sent (-b) and to which received cookies are written (-c).

<platform_host> Fully qualified name of the host of the Trifacta platform

<platform_port_number> Port number through which to access the Trifacta platform. Default is 3005.
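
For clients that do not use cURL, the value that -u produces can be built directly. The following is a minimal sketch in Python 3 of the standard HTTP Basic scheme (base64 of username:password). The function name basic_auth_header is hypothetical, and the credentials are the example values from above.

```python
import base64


def basic_auth_header(username, password):
    """Return the Authorization header for HTTP Basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


print(basic_auth_header("me@example.com", "me_pwd"))
```

Base64 is an encoding, not encryption, so use https in production environments as recommended earlier in this section.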

SSO Authentication

You can use the APIs in SSO environments. Below, you can review the best method of authenticating to the APIs
based on your SSO environment:

SSO Type API Auth

Platform-native SAML API access tokens work seamlessly. Basic auth does not work.

Platform-native LDAP-AD API access tokens work seamlessly. Basic auth does not work.

Reverse proxy SAML Use basic auth described below. Additional configuration may be required.

Reverse proxy LDAP-AD Use basic auth described below. Additional configuration may be required.

In a reverse proxy single sign-on environment, you can use basic authentication to interact with the APIs.

NOTE: Enabling SSO integration with the Trifacta platform requires additional configuration. See
Configure SSO for AD-LDAP.

However, some changes are required:


Basic authentication to the gateway must be enabled as part of the configuration for the reverse proxy. This
feature is enabled by default, but please verify that it has not been explicitly disabled in your environment.
For more information, see Configure SSO for AD-LDAP.
You must authenticate using the SSO principal as the username and the LDAP or AD password associated
with that user.
You must authenticate to the SSO gateway. In the Trifacta platform, this value corresponds to the <platform_host>:<platform_port_number> value.

Example:



$ curl -u [email protected]:foobar \
-x http://<platform_host>:<platform_port_number> \
-b ~/cookies.txt -c ~/cookies.txt \
http://<platform_host>:<platform_port_number>/v4/<endpoint>

NOTE: For the protocol identifier, you can also use https if SSL is enabled. See Install SSL Certificate.

Parameter Description

[email protected]:foobar LDAP principal and password associated with that username.

For more information, see Configure SSO for AD-LDAP.

Kerberos Authentication

In a Kerberos environment, credentials must be submitted with each request using the SPNEGO Auth method.
Kerberos is a network authentication protocol for client/server applications.
SPNEGO provides a mechanism for extending Kerberos to Web applications through HTTP.
For more information on the differences, see
https://msdn.microsoft.com/en-us/library/ms995330.aspx#http-sso-2_topic2.
Credentials are authenticated by the KDC for each request.

NOTE: SPNEGO must be enabled and configured for your REST client or programming library.

Example 1 - Embedded in Java:
SPNEGO requires a custom client. The following SPNEGO client enables submission of URL-based authentication parameters from within Java:
http://docs.oracle.com/javase/6/docs/technotes/guides/security/jgss/lab/part5.html

Example 2 - Using cURL:
To use cURL:
1. Verify that your version of cURL supports GSS:

$ curl -V
curl 7.51.0 (x86_64-apple-darwin16.0) libcurl/7.51.0 SecureTransport
zlib/1.2.8
Protocols: dict file ftp ftps gopher http https imap imaps ldap
ldaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS IPv6 Largefile GSS-API Kerberos SPNEGO NTLM
NTLM_WB SSL libz UnixSockets

2. Verify that GSS-API and SPNEGO are in the output.
3. Run kinit and authenticate using the Hadoop principal:



$ kinit
Please enter the password for [hadoop.user.principal]@localhost:
$

4. Access using cURL:

$ curl --negotiate -u anything \
-b ~/cookies.txt -c ~/cookies.txt \
http://<platform_host>:<platform_port_number>/v3/<endpoint>

where:

Parameter Description

--negotiate Enables SPNEGO use in cURL. This option requires a library built with GSS-API or SSPI support. If this option
is used several times, only the first one is used. Use --proxy-negotiate to enable Negotiate
(SPNEGO) for proxy authentication.

-u anything Required username. However, this username is ignored. Instead, the principal used in kinit is applied.

For more information:


https://hadoop.apache.org/docs/r2.7.3/hadoop-auth/Examples.html
https://msdn.microsoft.com/en-us/library/ms995329.aspx

Logout

Since each request requires credentials, logging out is not required.


Manage API Access Tokens

Contents:
Enable
Generate New Token
Via API
Via UI
Use Token
List Tokens
Renew Token
Delete Token

This section describes how to use API access tokens as part of your API projects on
the Trifacta® platform. An access token is a hashed string that enables authentication when submitted to any
endpoint of the platform. Access tokens limit exposure of clear-text authentication values and provide an easy
method of managing authentication outside of the browser.
Notes:
An access token is linked to its creator and can be generated by submitting a username/password
combination or another valid token from the same user.
If a token is created for userA, userB can be provided the token to impersonate userA.

You cannot create access tokens for users without their authentication credentials.
Changes to passwords do not affect tokens.
After a token has been created, it cannot be modified or extended.
You can create an unlimited number of tokens.
Access tokens are available in v4 and later of the APIs. They can be used for authentication with any
supported version of the APIs.

Enable

This feature must be enabled in your instance of the platform. For more information, see
Enable API Access Tokens.

Generate New Token

Before you can use an API access token, you must create one.

NOTE: The first time that you request a new API token, you must submit a separate form of
authentication to the endpoint. After you have created a token, you can use any valid access token to
generate additional tokens.

Via API

For more information, see API AccessTokens Create v4.

Via UI

Tokens can be generated from the web application.


Steps:
1. Log in to the Trifacta application.
2. From the left nav bar, select Settings > Access Tokens.
3. Click Generate New Token.
4. Specify the lifetime of the token in days. Enter -1 to create a non-expiring token.
5. Add a user-friendly description if desired.
6. Click Generate.

NOTE: Copy the value of the token to the clipboard and store it in a secure location for use with your
scripts.

Tip: If you wish to manage your token via the APIs, you should copy the Token ID value, too. The Token
ID can always be retrieved from the Trifacta application.

For more information, see Access Tokens Page.

Use Token

After a token has been acquired, it must be included in each request to the server, for as long as it is valid.

NOTE: After a token has been created, it cannot be extended or modified.

NOTE: API access tokens are not used by users in the Trifacta application.

NOTE: When using the APIs in SSO environments, API access tokens work seamlessly with
platform-native versions of SAML and LDAP-AD. They do not work with the reverse proxy SSO methods.
For more information, see API Authentication.

After you have acquired the token, you submit it with each API request to the platform.
Example - cURL:

curl http://tri.example.com:3005/v4/jobs -X GET \
  -H "Authorization: Bearer (tokenValue)"

where:
(tokenValue) is the value returned for the token when it was created.

Example - REST client:


If you are submitting your API calls through a REST client, the Authorization header must be specified as follows:

Authorization: Bearer (tokenValue)
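
The same header can be set from a script. The following is a minimal sketch that uses only the Python standard library. The function name authorized_request and the token value are hypothetical, and the URL mirrors the cURL example above.

```python
import urllib.request


def authorized_request(url: str, token_value: str,
                       method: str = "GET") -> urllib.request.Request:
    """Build a request that carries the access token as a Bearer credential."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token_value}"},
        method=method,
    )


# Hypothetical token value; substitute the value returned when the token was created.
request = authorized_request("http://tri.example.com:3005/v4/jobs", "myTokenValue")
# To send the request: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the token handling in one place, so that every call to the platform includes the header.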

List Tokens

NOTE: For security reasons, you cannot acquire the actual token through any of these means.

Tip: You can see all of your current and expired tokens through the Trifacta application. See
Access Tokens Page.

Endpoint Description

API AccessTokens Get List v4 List all access tokens for your user account.

API AccessTokens Get v4 Get information on your access token for the specified token ID.

Renew Token

New tokens can be acquired at any time using the Create method.

NOTE: It is the responsibility of the user to acquire a new API token before the current one expires. If a
token is permitted to expire, a request for a new token must include userId and password information.

See API AccessTokens Create v4.


See Access Tokens Page.
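
Because an expired token cannot be used to request its replacement, a script may want to check how close a token is to expiring. The following sketch assumes that the expiredAt value has the format shown in the response examples in this guide (for example, 2020-01-15T20:58:28.175Z). The function names and the seven-day margin are hypothetical choices.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as 2019-01-16T00:06:41.743Z as a UTC datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def needs_renewal(expired_at: str, now: datetime,
                  margin: timedelta = timedelta(days=7)) -> bool:
    """Return True once the token is within `margin` of its expiry time."""
    return now >= parse_timestamp(expired_at) - margin
```

When needs_renewal returns True, the current token is still valid and can be used to authenticate the request for a new one.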

Delete Token

Via API: Acquire the tokenId value for the token and use the delete endpoint. See
API AccessTokens Delete v4.
Via UI: In the Access Tokens page, select Delete Token... from the context menu for the token listing. See
Access Tokens Page.
API Endpoints
The following endpoints are available in this release of Trifacta Wrangler Enterprise. Please verify that you are
referring to the correct version of the endpoint.
Topics:
v4 Endpoints
API AccessTokens Create v4
API AccessTokens Delete v4
API AccessTokens Get List v4
API AccessTokens Get v4
API Connections Create DryRun v4
API Connections Create v4
API Connections Delete v4
API Connections Get List v4
API Connections Get Status v4
API Connections Get v4
API Connections Patch v4
API Connections Permissions Create User v4
API Connections Permissions Delete User v4
API Connections Permissions Get User v4
API Connections Vendors Get List v4
API Deployments Create v4
API Deployments Delete v4
API Deployments Get List v4
API Deployments Get Release List v4
API Deployments Get v4
API Deployments Object Import Rules Patch v4
API Deployments Patch v4
API Deployments Run v4
API Deployments Value Import Rules Patch v4
API EMRClusters Create v4
API EMRClusters Delete v4
API EMRClusters Get Count v4
API EMRClusters Get List v4
API EMRClusters Get v4
API EMRClusters Patch v4
API Flows Create v4
API Flows Delete v4
API Flows Get List v4
API Flows Get v4
API Flows Package Get DryRun v4
API Flows Package Get v4
API Flows Package Post DryRun v4
API Flows Package Post v4
API Flows Patch v4
API ImportedDatasets Create v4
API ImportedDatasets Delete v4
API ImportedDatasets Get List v4
API ImportedDatasets Get v4

API ImportedDatasets Patch v4
API ImportedDatasets Post AddToFlow v4
API JobGroups Cancel v4
API JobGroups Create v4
API JobGroups Delete v4
API JobGroups Get Jobs v4
API JobGroups Get List v4
API JobGroups Get Publications v4
API JobGroups Get Status v4
API JobGroups Get v4
API JobGroups Put Publish v4
API OutputObjects Create v4
API OutputObjects Delete v4
API OutputObjects Get List v4
API OutputObjects Get v4
API OutputObjects Update v4
API People Create v4
API People Delete v4
API People Get List v4
API People Get v4
API People Patch v4
API Publications Create v4
API Publications Delete v4
API Publications Get List v4
API Publications Get v4
API Publications Update v4
API Releases Create DryRun v4
API Releases Create v4
API Releases Delete v4
API Releases Get v4
API Releases Package Get v4
API Releases Patch v4
API WrangledDatasets Create v4
API WrangledDatasets Delete v4
API WrangledDatasets Get List v4
API WrangledDatasets Get PrimaryInputDataset v4
API WrangledDatasets Get v4
API WrangledDatasets Patch v4
API WrangledDatasets Post AddToFlow v4
API WrangledDatasets Put PrimaryInputDataset v4
API WriteSettings Create v4
API WriteSettings Delete v4
API WriteSettings Get List v4
API WriteSettings Get v4
API WriteSettings Update v4
v3 Endpoints
API Connections Create v3
API Connections Delete v3
API Connections Get List v3
API Connections Get Status v3
API Connections Get v3
API Deployments Create v3
API Deployments Delete v3
API Deployments Get List v3
API Deployments Get Release List v3
API Deployments Get v3
API Deployments Object Import Rules Patch v3

API Deployments Patch v3
API Deployments Run v3
API Deployments Value Import Rules Patch v3
API Flows Create v3
API Flows Delete v3
API Flows Get List v3
API Flows Get v3
API Flows Package Get DryRun v3
API Flows Package Get v3
API Flows Package Post DryRun v3
API Flows Package Post v3
API Flows Patch v3
API ImportedDatasets Create v3
API ImportedDatasets Delete v3
API ImportedDatasets Get List v3
API ImportedDatasets Get v3
API ImportedDatasets Post AddToFlow v3
API JobGroups Create v3
API JobGroups Delete v3
API JobGroups Get Jobs v3
API JobGroups Get List v3
API JobGroups Get Status v3
API JobGroups Get v3
API JobGroups Put Publish v3
API People Create v3
API People Delete v3
API People Get List v3
API People Get v3
API People Patch v3
API Releases Create DryRun v3
API Releases Create v3
API Releases Delete v3
API Releases Get v3
API Releases Package Get v3
API Releases Patch v3
API WrangledDatasets Create v3
API WrangledDatasets Delete v3
API WrangledDatasets Get List v3
API WrangledDatasets Get PrimaryInputDataset v3
API WrangledDatasets Get v3
API WrangledDatasets Put PrimaryInputDataset v3
API Session Get

v4 Endpoints

Contents:
Access Tokens
Connections
Connection permissions
Datasets and Recipes
EMR Clusters
Flows
Flow import and export
Jobgroups and Jobs
Outputs, Publications, and WriteSettings
Deployments and Releases
Users

Miscellaneous

These endpoints apply to version 4 of the APIs for the Trifacta® platform.
For more information on support for this version, see API Version Support Matrix.

Access Tokens

Endpoint Method Behavior Documentation

/apiAccessTokens POST Create API AccessTokens Create v4

/apiAccessTokens GET List API AccessTokens Get List v4

/apiAccessTokens/:id GET Read API AccessTokens Get v4

/apiAccessTokens/:id DELETE Delete API AccessTokens Delete v4

Connections

Endpoint Method Behavior Documentation

/connections POST Create API Connections Create v4

/connections GET List API Connections Get List v4

/connections/vendors GET List API Connections Vendors Get List v4

/connections/:id GET Read API Connections Get v4

/connections/:id/status GET Read Status API Connections Get Status v4

/connections/:id DELETE Delete API Connections Delete v4

Connection permissions

Endpoint Method Behavior Documentation

/connections/:id/permissions/ POST Create API Connections Permissions Create User v4

/connections/:id/permissions/ GET Read API Connections Permissions Get User v4

/connections/:id/permissions/ DELETE Delete API Connections Permissions Delete User v4

Datasets and Recipes

Endpoint Method Behavior Documentation

/importedDatasets POST Create API ImportedDatasets Create v4

/importedDatasets GET List API ImportedDatasets Get List v4

/importedDatasets/:id GET Read API ImportedDatasets Get v4

/importedDatasets/:id PATCH Update API ImportedDatasets Patch v4

/importedDatasets/:id DELETE Delete API ImportedDatasets Delete v4

/importedDatasets/:id/addToFlow POST Create API ImportedDatasets Post AddToFlow v4

/wrangledDatasets POST Create API WrangledDatasets Create v4

/wrangledDatasets GET List API WrangledDatasets Get List v4

/wrangledDatasets/:id GET Read API WrangledDatasets Get v4

/wrangledDatasets/:id PATCH Patch API WrangledDatasets Patch v4

/wrangledDatasets/:id DELETE Delete API WrangledDatasets Delete v4

/wrangledDatasets/:id/primaryInputDatasets GET Read API WrangledDatasets Get PrimaryInputDataset v4

/wrangledDatasets/:id/primaryInputDatasets PUT Update API WrangledDatasets Put PrimaryInputDataset v4

EMR Clusters

The following endpoints apply only if the Trifacta platform has been integrated with an AWS Elastic MapReduce
(EMR) cluster.

Endpoint Method Behavior Documentation

/emrClusters POST Create API EMRClusters Create v4

/emrClusters/count GET Count API EMRClusters Get Count v4

/emrClusters GET List API EMRClusters Get List v4

/emrClusters/:id GET Read API EMRClusters Get v4

/emrClusters/:id PATCH Update API EMRClusters Patch v4

/emrClusters/:id DELETE Delete API EMRClusters Delete v4

Flows

Endpoint Method Behavior Documentation

/flows POST Create API Flows Create v4

/flows GET List API Flows Get List v4

/flows/:id GET Read API Flows Get v4

/flows/:id PATCH Update API Flows Patch v4

/flows/:id DELETE Delete API Flows Delete v4

Flow import and export

Endpoint Method Behavior Documentation

/flows/package/dryRun POST Import dry run API Flows Package Post DryRun v4

/flows/package POST Import API Flows Package Post v4

/flows/:id/package/dryRun GET Export dry run API Flows Package Get DryRun v4

/flows/:id/package GET Export API Flows Package Get v4

Jobgroups and Jobs

Endpoint Method Behavior Documentation

/jobGroups POST Create API JobGroups Create v4

/jobGroups GET List API JobGroups Get List v4

/jobGroups/:id GET Read API JobGroups Get v4

/jobGroups/:id/jobs GET Read API JobGroups Get Jobs v4

/jobGroups/:id/status GET Read Status API JobGroups Get Status v4

/jobGroups/:id/cancel POST Cancel job API JobGroups Cancel v4

/jobGroups/:id/publish PUT Create API JobGroups Put Publish v4

/jobGroups/:id DELETE Delete API JobGroups Delete v4

Outputs, Publications, and WriteSettings

Endpoint Method Behavior Documentation

/v4/outputobjects POST Create API OutputObjects Create v4

/v4/outputobjects GET List API OutputObjects Get List v4

/v4/outputobjects/:id GET Read API OutputObjects Get v4

/v4/outputobjects/:id PUT Update API OutputObjects Update v4

/v4/outputobjects/:id DELETE Delete API OutputObjects Delete v4

/v4/publications POST Create API Publications Create v4

/v4/publications GET List API Publications Get List v4

/v4/publications/:id GET Read API Publications Get v4

/v4/publications/:id PUT Update API Publications Update v4

/v4/publications/:id DELETE Delete API Publications Delete v4

/v4/writesettings POST Create API WriteSettings Create v4

/v4/writesettings GET List API WriteSettings Get List v4

/v4/writesettings/:id GET Read API WriteSettings Get v4

/v4/writesettings/:id PUT Update API WriteSettings Update v4

/v4/writesettings/:id DELETE Delete API WriteSettings Delete v4

Deployments and Releases

Endpoint Method Behavior Documentation

/deployments POST Create API Deployments Create v4

/deployments GET List API Deployments Get List v4

/deployments/:id GET Read API Deployments Get v4

/deployments/:id PATCH Update API Deployments Patch v4

/deployments/:id/objectImportRules PATCH Update API Deployments Object Import Rules Patch v4

/deployments/:id/valueImportRules PATCH Update API Deployments Value Import Rules Patch v4

/deployments/:id?embed=releases GET List API Deployments Get Release List v4

/deployments/:id/run POST Create API Deployments Run v4

/deployments/:id DELETE Delete API Deployments Delete v4

/deployments/:id/releases POST Create API Releases Create v4

/deployments/:id/releases/dryRun POST Create API Releases Create DryRun v4

/releases/:id GET Read API Releases Get v4

/releases/:id PATCH Update API Releases Patch v4

/releases/:id DELETE Delete API Releases Delete v4

Users

Endpoint Method Behavior Documentation

/people POST Create API People Create v4

/people GET List API People Get List v4

/people/:id PATCH Update API People Patch v4

/people/:id GET Read API People Get v4

/people/:id DELETE Delete API People Delete v4

Miscellaneous

Endpoint Method Behavior Documentation

/session GET Read API Session Get
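
The endpoint paths in the preceding tables use :id as a placeholder for an object identifier. The following sketch shows one way to expand such a path into a full URL. The function name endpoint_url and the base URL are hypothetical, and the sketch assumes that each path is prefixed with the API version, as in the /v4/outputobjects rows.

```python
def endpoint_url(base_url: str, template: str, version: str = "v4", **params) -> str:
    """Expand a path template such as /jobGroups/:id/status into a full URL."""
    segments = []
    for segment in template.strip("/").split("/"):
        if segment.startswith(":"):
            # Replace a placeholder such as :id with the supplied value.
            segment = str(params[segment[1:]])
        segments.append(segment)
    # Some table rows already include the version prefix; avoid doubling it.
    if segments and segments[0] == version:
        segments = segments[1:]
    return f"{base_url.rstrip('/')}/{version}/" + "/".join(segments)
```

For example, endpoint_url("http://tri.example.com:3005", "/jobGroups/:id/status", id=12) produces http://tri.example.com:3005/v4/jobGroups/12/status.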

API AccessTokens Create v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request

Response

Create an access token for the authenticated user.


Access tokens can be used for managing more secure access to the APIs. For more information, see
API Authentication.
Access tokens can be created via the Trifacta® application or via API.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/apiAccessTokens

Request Header:
If you do not have a valid access token to use at this time, you must submit a username/password
combination as part of the Authorization header.
If you have a valid access token, you can submit that token in your Authorization header with this request.
For more information, see API Authentication.
Request Body:

{
  "lifetimeSeconds": 100,
  "description": "My 100-second token"
}
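
The request above can be assembled in a script as follows. This is a minimal sketch using the Python standard library. The function names are hypothetical, and the use of the HTTP Basic scheme for the username/password combination is an assumption; see API Authentication for the methods that your deployment supports.

```python
import base64
import json
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Encode a username/password pair as an HTTP Basic credential (assumed scheme)."""
    encoded = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + encoded.decode("ascii")


def create_token_request(base_url: str, lifetime_seconds: int, description: str,
                         authorization: str) -> urllib.request.Request:
    """Build the POST request for /v4/apiAccessTokens."""
    body = json.dumps({
        "lifetimeSeconds": lifetime_seconds,
        "description": description,
    })
    return urllib.request.Request(
        f"{base_url}/v4/apiAccessTokens",
        data=body.encode("utf-8"),
        headers={"Authorization": authorization, "Content-Type": "application/json"},
        method="POST",
    )
```

If a valid access token is already available, pass "Bearer " followed by the token value as the authorization argument instead of the Basic credential.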

Response

Response Status Code - Success: 201 - Created

NOTE: If you receive a Route doesn't exist error message, please verify that the API access token
feature has been enabled in your instance of the platform. For more information, see
Enable API Access Tokens.

Response Body Example:

{
  "tokenValue": "eyJ0b2tlbklkIjoiYmFiOTA4ZjctZGNjMi00OTYyLTg1YmQtYzFlOTZkMGNhY2JkIiwic2VjcmV0IjoiOWIyNjQ5MWJiODM4ZWY0OWE1NzdhYzYxOWEwYTFkNjc4ZmE4NmE5MzBhZWFiZDk3OGRlOTY0ZWI0MDUyODhiOCJ9",
  "tokenInfo": {
    "tokenId": "bab908f7-dcc2-4962-85bd-c1e96d0cacbd",
    "description": "My 100-second token",
    "expiredAt": "2019-01-16T00:06:41.743Z",
    "lastUsed": null,
    "createdAt": "2019-01-16T00:05:01.743Z"
  }
}

Reference

Request reference:

Response reference:
API AccessTokens Delete v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Delete the specified access token.


Access tokens can be used for managing more secure access to the APIs. For more information, see
API Authentication.
Access tokens can be created via the Trifacta® application or via API. For more information on creating
them via API, see API AccessTokens Create v4.

If you delete an active access token, you may prevent the user from accessing the platform
outside of the Trifacta application.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v4/apiAccessTokens/<id>

where:

Parameter Description

<id> The tokenId value for the access token to delete

Request URI - Example:

/v4/apiAccessTokens/0bc1d49f-5475-4c62-a0ba-6ad269389ada

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

NOTE: If you receive a Route doesn't exist error message, please verify that the API access token
feature has been enabled in your instance of the platform. For more information, see
Enable API Access Tokens.

Response Body Example:


Empty.

Reference

None.
API AccessTokens Get List v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get list of all access tokens assigned to the authenticated user.


Access tokens can be used for managing more secure access to the APIs. For more information, see
API Authentication.

Access tokens can be created via the Trifacta® application or via API. For more information on creating via
API, see API AccessTokens Create v4.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/apiAccessTokens

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

NOTE: If you receive a Route doesn't exist error message, please verify that the API access token
feature has been enabled in your instance of the platform. For more information, see
Enable API Access Tokens.

Response Body Example:

{
  "data": [
    {
      "tokenId": "0bc1d49f-5475-4c62-a0ba-6ad269389ada",
      "description": "new token",
      "createdAt": "2019-01-15T20:58:28.175Z",
      "expiredAt": "2020-01-15T20:58:28.175Z",
      "lastUsed": null
    },
    {
      "tokenId": "5e9484ba-a540-4019-acc5-680ad090e1c3",
      "description": "",
      "createdAt": "2019-01-15T23:35:04.698Z",
      "expiredAt": "2020-01-15T23:35:04.698Z",
      "lastUsed": null
    }
  ]
}
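
A list response such as the one above can be used to find tokens that have never been used, for example as candidates for cleanup. The following sketch assumes that an unused token reports lastUsed as null, as in the example. The function name is hypothetical. It returns the path of the delete endpoint for each such token.

```python
def unused_token_delete_paths(list_response: dict) -> list:
    """Return the DELETE endpoint path for each token whose lastUsed is null."""
    paths = []
    for token in list_response["data"]:
        if token["lastUsed"] is None:
            paths.append("/v4/apiAccessTokens/" + token["tokenId"])
    return paths
```

Review the returned paths before issuing any DELETE request, since deleting an active token may break scripts that depend on it.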

Reference

For more information on the properties of an access token, see API AccessTokens Get v4.
API AccessTokens Get v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get information on the specified AccessToken.


Access tokens can be used for managing more secure access to the APIs. For more information, see
API Authentication.
Access tokens can be created via the Trifacta® application or via API. For more information on creating via
API, see API AccessTokens Create v4.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/apiAccessTokens/<id>

where:

Parameter Description

<id> The tokenId value for the access token

Request URI - Example:

/v4/apiAccessTokens/0bc1d49f-5475-4c62-a0ba-6ad269389ada

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

NOTE: If you receive a Route doesn't exist error message, please verify that the API access token
feature has been enabled in your instance of the platform. For more information, see
Enable API Access Tokens.

Response Body Example:

{
  "tokenId": "0bc1d49f-5475-4c62-a0ba-6ad269389ada",
  "description": "new token",
  "expiredAt": "2020-01-15T20:58:28.175Z",
  "lastUsed": null,
  "createdAt": "2019-01-15T20:58:28.175Z"
}

Reference

API Connections Create DryRun v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Performs a dry run of creating the connection, testing it, and then deleting the connection.

NOTE: In this release, you cannot create Redshift or SQL DW connections via the API. Please create
these connections through the application. This known issue will be fixed in a future release.

For more information on connections, see Connection Types.


Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/connections/dryRun

Request Body - Relational Connection:


For relational connections, the request body should look like the following. All properties are required unless
noted.

NOTE: Relational connections require the creation and installation of an encryption key file on the Trifacta
node. This file must be present before the connection is created. See Create Encryption Key File.

This example creates a Postgres connection of basic credentials type. A valid username/password combination
must be specified in the credentials property.

{
  "connectParams": {
    "vendor": "postgres",
    "vendorName": "postgres",
    "host": "postgres.example.com",
    "port": "5432",
    "database": "myDB"
  },
  "host": "postgres.example.com",
  "port": 5432,
  "vendor": "postgres",
  "params": {
    "connectStrOpts": "",
    "database": "myDB"
  },
  "ssl": false,
  "vendorName": "postgres",
  "name": "postgres",
  "description": "",
  "type": "jdbc",
  "isGlobal": false,
  "credentialType": "basic",
  "credentialsShared": true,
  "disableTypeInference": false,
  "credentials": [
    {
      "username": "<username>",
      "password": "<password>"
    }
  ]
}

For more information on these properties, see API Connections Get v4.

Response

Response Status Code - Success: 200 - OK

Response Body Example - success:

{
  "result": "SUCCESS",
  "reason": null
}

Response Body Example - failure:


In the following example, the message indicates that there was a failure to connect to the host.

{
  "result": "FAILURE",
  "reason": {
    "exception": {
      "name": "com.trifacta.connect.spring.exception.JdbcConnectionException",
      "message": "Could not connect to host",
      "details": "com.trifacta.connect.spring.exception.JdbcConnectionException",
      "cause": "ConnectException: Connection refused (Connection refused)"
    }
  }
}
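
Because the dry run returns 200 - OK for both outcomes shown above, a script must inspect the result property of the body rather than the status code. The following sketch assumes the response shapes in the two examples. The function name dry_run_failure is hypothetical.

```python
def dry_run_failure(response: dict):
    """Return None on success, otherwise a one-line description of the failure."""
    if response.get("result") == "SUCCESS":
        return None
    # "reason" is null on success and holds an "exception" object on failure.
    exception = (response.get("reason") or {}).get("exception", {})
    message = exception.get("message", "unknown error")
    cause = exception.get("cause")
    return f"{message}: {cause}" if cause else message
```

For the failure example above, the function returns "Could not connect to host: ConnectException: Connection refused (Connection refused)".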

Reference

For more information on the response body properties, see API Connections Get v4.

API Connections Create v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Create the specified connection.

NOTE: In this release, you cannot create Redshift or SQL DW connections via the API. Please create
these connections through the application. This known issue will be fixed in a future release.

For more information on connections, see Connection Types.


Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/connections/

Request Body - Relational Connection:


For relational connections, the request body should look like the following. All properties are required unless
noted.

NOTE: Relational connections require the creation and installation of an encryption key file on the Trifacta
node. This file must be present before the connection is created. See Create Encryption Key File.

This example creates a SQL Server connection of basic credentials type. A valid username/password
combination must be specified in the credentials property.

{
  "connectParams": {
    "vendor": "sqlserver",
    "vendorName": "sqlserver",
    "host": "sqlserver.example.com",
    "port": "1433"
  },
  "host": "sqlserver.example.com",
  "port": 1433,
  "vendor": "sqlserver",
  "params": {
    "connectStrOpts": ""
  },
  "ssl": false,
  "vendorName": "sqlserver",
  "name": "sqlserver_test2",
  "description": "",
  "type": "jdbc",
  "isGlobal": false,
  "credentialType": "basic",
  "credentialsShared": true,
  "disableTypeInference": false,
  "credentials": [
    {
      "username": "<username>",
      "password": "<password>"
    }
  ]
}

Property Description

name Display name of the connection

description (Optional) Display description for the connection

vendorName Set this value to sqlserver.

type For more information on the value to insert for the connection, see Connection Types.

isGlobal (Optional) If true, the connection is available to all users. The default is false.

host Host name of the relational server to which to connect.

port Port number for the relational server. The default value varies between database vendors. For more information,
please see the documentation provided with your database distribution.

vendor Set this value to sqlserver.

params (Optional) Set of JSON parameters that are passed to the database when initializing the connection. Depending on
the database vendor, you may be required to submit via this parameter the name of the default database. You can
also pass in optional parameters through the connectStrOpts parameter. For more information, see
CLI for Connections.

ssl (Optional) If set to true, the connection is made over SSL. The default is false.

NOTE: If you connect over SSL, you must modify the hostname value to use HTTPS.

NOTE: SSL connections to SQL Server are not supported.

credentialType Set this value to one of the following:

basic - Simple username/password to be provided in the credentials property.


conf - Use the connection credentials stored in trifacta-conf.json.
kerberos delegate - Connection uses the Kerberos-delegated principal to connect to the database.
No credentials are submitted as part of the connection definition. This method requires additional configuration.
See Enable SSO for Relational Connections.
custom - Connection credentials are specified in the params property, such as using key-value
parameters to specify access credentials.

credentialsShared If the connection is a global connection, the credentials to connect can be shared with other users when this
property is true. Otherwise, other users must provide their own credentials.

disableTypeInference By default, the Trifacta platform attempts to infer types when data is imported. For schematized sources, you may
prefer to disable type inference, instead using the types provided by the source.

When this setting is true, initial type inference by the platform is disabled for all data read through this connection.

credentials (Optional) If credentialType=basic, this property must contain the username and password to use to
connect to the relational source.
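
The properties described above can be assembled programmatically. The following sketch builds a request body that mirrors the SQL Server example for a connection of basic credentials type. The function name basic_jdbc_connection is hypothetical, and the sketch assumes that the vendor and vendorName values are identical, as they are in the examples in this section.

```python
import json


def basic_jdbc_connection(name: str, vendor: str, host: str, port: int,
                          username: str, password: str,
                          is_global: bool = False, description: str = "") -> dict:
    """Build a request body for a relational connection with basic credentials."""
    return {
        # The examples repeat host and port here, with the port as a string.
        "connectParams": {
            "vendor": vendor,
            "vendorName": vendor,
            "host": host,
            "port": str(port),
        },
        "host": host,
        "port": int(port),
        "vendor": vendor,
        "params": {"connectStrOpts": ""},
        "ssl": False,
        "vendorName": vendor,
        "name": name,
        "description": description,
        "type": "jdbc",
        "isGlobal": is_global,
        "credentialType": "basic",
        "credentialsShared": True,
        "disableTypeInference": False,
        "credentials": [{"username": username, "password": password}],
    }


# Placeholder credentials; serialize the body before submitting it.
payload = json.dumps(basic_jdbc_connection(
    "sqlserver_test2", "sqlserver", "sqlserver.example.com", 1433,
    "<username>", "<password>"))
```

Vendors that require a default database, such as the Postgres example for the dry run endpoint, need additional entries in params and connectParams that this sketch does not add.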

Request Body - Hive Connection:


You can create only one public connection to Hive.

{
  "connectParams": {
    "vendor": "hive",
    "vendorName": "hive",
    "host": "hadoop",
    "port": "10000",
    "jdbc": "hive2",
    "defaultDatabase": "default"
  },
  "id": 1,
  "host": "hadoop",
  "port": 10000,
  "vendor": "hive",
  "params": {
    "jdbc": "hive2",
    "connectStringOptions": "",
    "defaultDatabase": "default"
  },
  "ssl": false,
  "vendorName": "hive",
  "name": "hive",
  "description": "Hive connection",
  "type": "jdbc",
  "isGlobal": true,
  "credentialType": "conf",
  "credentialsShared": true,
  "disableTypeInference": false,
  "credentials": []
}

The following property values are specific to Hive connections.

Property Description

host Set this value to hadoop to integrate with the Hive instance for the Hadoop cluster to which the Trifacta platform
is connected.

port By default, this value should be 10000.

params Connection parameters for the Hive instance.

NOTE: The following parameter entry is required for Hive:

"jdbc": "hive2",

Others are optional.

type Set this value to jdbc.

isGlobal NOTE: For Hive connections, this value must be set to true.

credentialType Set this value to conf.

For more information, see Configure for Hive.


Request Body - Redshift Connection:

{
  "connectParams": {
    "vendor": "redshift",
    "vendorName": "redshift",
    "host": "redshift.example.com",
    "port": "5439",
    "defaultDatabase": "dev",
    "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
  },
  "host": "redshift.example.com",
  "port": 5439,
  "vendor": "redshift",
  "params": {
    "connectStrOpts": "",
    "defaultDatabase": "dev",
    "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
  },
  "ssl": false,
  "vendorName": "redshift",
  "name": "redshift2",
  "description": "Redshift connection",
  "type": "jdbc",
  "isGlobal": true,
  "credentialType": "custom",
  "credentialsShared": true,
  "disableTypeInference": false,
  "credentials": [
    {"key": "user", "value": "<username>"},
    {"key": "password", "value": "<password>"},
    {"key": "iamRoleArn", "value": "<IAM_role_ARN>"}
  ]
}

The following property values are specific to Redshift connections.

Property Description

connectParams A default database value is required.

The extraLoadParams value is used when you publish results to Redshift. For more information on these
values, see http://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-data-conversion.html.

type Set this value to jdbc.

credentialType Set this value to custom. Credentials are specified below.

credentials The user and password entries must be specified in this key-value format, although the value for
either can be an empty string.

NOTE: iamRoleArn is optional. For more information, see
Configure for EC2 Role-Based Authentication.

For more information on parameters and credentials, see Create Redshift Connections.
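
The key-value credentials format for Redshift can be generated with a small helper. The following is a minimal sketch whose function name redshift_credentials is hypothetical. The key names user, password, and iamRoleArn are taken from the request body example above.

```python
from typing import Optional


def redshift_credentials(user: str, password: str,
                         iam_role_arn: Optional[str] = None) -> list:
    """Build the key-value credentials list for a Redshift connection."""
    credentials = [
        {"key": "user", "value": user},
        {"key": "password", "value": password},
    ]
    # iamRoleArn is optional, so it is included only when a value is supplied.
    if iam_role_arn is not None:
        credentials.append({"key": "iamRoleArn", "value": iam_role_arn})
    return credentials
```

The returned list is the value of the credentials property in the request body, with credentialType set to custom.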

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
  "connectParams": {
    "vendor": "sqlserver",
    "vendorName": "sqlserver",
    "host": "sqlserver.example.com",
    "port": "1433"
  },
  "id": 26,
  "host": "sqlserver.example.com",
  "port": 1433,
  "vendor": "sqlserver",
  "params": {
    "connectStrOpts": ""
  },
  "ssl": false,
  "vendorName": "sqlserver",
  "name": "sqlserver_test2",
  "description": "",
  "type": "jdbc",
  "isGlobal": false,
  "credentialType": "basic",
  "credentialsShared": true,
  "uuid": "fa7e06c0-0143-11e8-8faf-27c0392328c5",
  "disableTypeInference": false,
  "createdAt": "2018-01-24T20:20:11.181Z",
  "updatedAt": "2018-01-24T20:20:11.181Z",
  "credentials": [
    {
      "username": "<username>"
    }
  ]
  ,
  "creator": {
    "id": 1
  },
  "updater": {
    "id": 1
  }
}

Reference

For more information on the response body properties, see API Connections Get v4.

API Connections Delete v4

This is the latest version of the APIs.

Contents:
Required Permissions

Request
Response
Reference

Delete the specified connection.


For more information on connections, see Connection Types.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v4/connections/<id>

where:

Parameter Description

<id> Internal identifier for the connection

Request URI - Example:

/v4/connections/4

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

None.
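As an illustration of the request above, the following sketch builds the DELETE request with the Python standard library without sending it. The base URL and the form of the Authorization header are assumptions; see API Authentication for the credentials that your deployment accepts.

```python
import urllib.request

# Hypothetical base URL; replace with the host and port of your platform instance.
BASE_URL = "https://trifacta.example.com:3005"


def delete_connection_request(connection_id, auth_header):
    """Build (but do not send) a DELETE request for /v4/connections/<id>."""
    url = f"{BASE_URL}/v4/connections/{int(connection_id)}"
    return urllib.request.Request(
        url,
        method="DELETE",
        # The header value is supplied by the caller; its scheme is an assumption.
        headers={"Authorization": auth_header},
    )


if __name__ == "__main__":
    request = delete_connection_request(4, "Basic <encoded-credentials>")
    print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen sends it. On success, the documented status code is 204 with an empty body.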
API Connections Get List v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get the list of accessible connections for the authenticated user.


For more information on connections, see Connection Types.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/connections

Endpoint with paged retrieval:

By default, this endpoint returns results in sets of 25.


You can apply query parameters to change the size of the default set and to page through result sets. The
following example queries for results 100 at a time. In this case, the query asks for results 201-300:

/v4/connections?limit=100&offset=2

If the count of retrieved results is less than the limit, you have reached the end of the results.
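The paging rule above can be sketched as a pair of small helpers. The function names are hypothetical. The limit and offset values are passed through unchanged, and the sketch makes no assumption about how the platform interprets offset.

```python
from urllib.parse import urlencode


def paged_endpoint(path, limit, offset):
    """Append limit and offset query parameters to an endpoint path."""
    return f"{path}?{urlencode({'limit': limit, 'offset': offset})}"


def is_last_page(count, limit):
    """Apply the rule above: fewer results than the limit means no more pages."""
    return count < limit


if __name__ == "__main__":
    print(paged_endpoint("/v4/connections", 100, 2))
```

A client can keep requesting pages until is_last_page returns True for the count of results in the latest response.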
Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"data": [
{
"connectParams": {
"vendor": "teradata",
"vendorName": "teradata",
"host": "teradata.example.com",

"port": "1025"
},
"id": 7,
"host": "teradata.example.com",
"port": 1025,
"vendor": "teradata",
"params": {
"connectStrOpts": ""
},
"ssl": false,
"vendorName": "teradata",
"name": "teradata",
"description": "",
"type": "jdbc",
"isGlobal": false,
"credentialType": "basic",
"credentialsShared": true,
"uuid": "7e807000-144b-11e9-99c7-43fee559d973",
"disableTypeInference": false,
"createdAt": "2019-01-09T20:16:48.897Z",
"updatedAt": "2019-01-09T20:16:48.904Z",
"credentials": [
{
"username": "dbc"
}
],
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"workspace": {
"id": 1
}
},
{
"connectParams": {
"vendor": "hive",
"vendorName": "hive",
"host": "hadoop",
"port": "10000",
"jdbc": "hive2",
"defaultDatabase": "default"
},
"id": 1,
"host": "hadoop",
"port": 10000,
"vendor": "hive",
"params": {
"jdbc": "hive2",
"connectStringOptions": "",
"defaultDatabase": "default"

},
"ssl": false,
"vendorName": "hive",
"name": "hive",
"description": null,
"type": "jdbc",
"isGlobal": true,
"credentialType": "conf",
"credentialsShared": true,
"uuid": "3c8672f0-13ea-11e9-82f7-3ff8e3d74b3f",
"disableTypeInference": false,
"createdAt": "2019-01-09T08:40:37.023Z",
"updatedAt": "2019-01-09T08:40:37.037Z",
"credentials": [],
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"workspace": {
"id": 1
}
}
],
"count": 2
}

Reference

For more information on the properties of a connection, see API Connections Get v4.

API Connections Get Status v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Run a connection test for the specified connection.


For more information on connections, see Connection Types.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/connections/<id>/status

where:

Parameter Description

<id> Internal identifier for the connection

Request URI - Example:

/v4/connections/10/status

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"result": "SUCCESS",
"reason": null
}

Reference

Property Description

result Results of testing the connection.

For more information on debugging failures in relational connections, see Enable Relational Connections.
For more information on debugging Hive connections, see Configure for Hive.
For more information on debugging S3 connections, see Enable S3 Access.

reason If the result value is not SUCCESS, additional information may be included here.
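A client might interpret the status response as follows. This is a sketch only: the function name is hypothetical, and the failure value used in the demonstration is invented, since the text states only that the result may be something other than SUCCESS.

```python
import json


def check_connection_status(body_text):
    """Return True when the test succeeded; raise with the reason otherwise."""
    status = json.loads(body_text)
    if status.get("result") == "SUCCESS":
        return True
    # reason may be null even on failure, so fall back to a generic message.
    reason = status.get("reason") or "no reason given"
    raise RuntimeError(f"Connection test failed: {reason}")


if __name__ == "__main__":
    print(check_connection_status('{"result": "SUCCESS", "reason": null}'))
```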

API Connections Get v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get the specified connection.


For more information on connections, see Connection Types.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/connections/<id>

where:

Parameter Description

<id> Internal identifier for the connection

Request URI - Example:

/v4/connections/3

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"connectParams": {
"vendor": "hive",
"vendorName": "hive",
"host": "hadoop",
"port": "10000",
"jdbc": "hive2",
"defaultDatabase": "default"
},
"id": 1,
"host": "hadoop",
"port": 10000,
"vendor": "hive",
"params": {
"jdbc": "hive2",
"connectStringOptions": "",
"defaultDatabase": "default"
},
"ssl": false,
"vendorName": "hive",
"name": "hive",
"description": null,
"type": "jdbc",
"isGlobal": true,
"credentialType": "conf",
"credentialsShared": true,
"uuid": "3c8672f0-13ea-11e9-82f7-3ff8e3d74b3f",
"disableTypeInference": false,
"createdAt": "2019-01-09T08:40:37.023Z",
"updatedAt": "2019-01-09T08:40:37.037Z",
"credentials": [],
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"workspace": {
"id": 1
}
}

Reference

Property Description

connectParams.vendor The type of connection. See Connection Types.

connectParams.host Host of the source

connectParams.port Port number for the source

connectParams.jdbc JDBC type for the connection

connectParams.defaultDatabase (if applicable) Name of the default database

connectParams.extraLoadParams (if applicable) If the connection type supports them, this setting contains additional parameters to be
passed to the host when making the connection.

id Internal identifier for the connection

host Host of the source

port Port number for the source

vendor String identifying the connection's vendor

params This setting is populated with any parameters that are passed to the source during connection and
operations. For relational sources, this setting may include the default database and extra load
parameters.

ssl When true, the Trifacta platform uses SSL to connect to the source.

vendorName Name of the vendor of the connection

name Internal name of the connection

description User-friendly description for the connection

type Type of connection

isGlobal If true, the connection is public and available to all users.

NOTE: After a connection has been made public, it cannot be made private again. It must be
deleted and recreated.

Default is false. A connection can be made public through the command line interface or the
Connections page. See Connections Page.

credentialType The type of credentials used for the connection. This value varies depending on where the credentials
are stored. See CLI for Connections.

credentialsShared If true, the credentials used for the connection are available to users with whom the connection
has been shared.

uuid A universal object identifier, which is unique across instances of the platform.

This internal identifier is particularly useful when creating import mapping rules.

See API Deployments Value Import Rules Patch v4.


See API Deployments Object Import Rules Patch v4.

disableTypeInference If set to true, type inferencing has been disabled for this connection. The default is false.

When type inferencing has been disabled, the Trifacta platform does not apply Trifacta types to data
when it is imported. For more information, see Configure Type Inference.

createdAt Timestamp for when the connection was made

updatedAt Timestamp for when the connection was last updated

credentials If present, these values are the credentials used to connect to the database.

NOTE: For security reasons, you can store the connection's credentials in an external file on
the Trifacta Server, after which they do not appear in this setting. See CLI for Connections.

creator.id Internal identifier of the user who created the connection.

updater.id Internal identifier of the user who last updated the connection.

workspace.id Internal identifier of the workspace with which this connection is associated.
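To show how the properties above might be consumed, the following sketch reduces a connection object to a short summary. The function name and the summary keys are hypothetical. The input is assumed to be the parsed JSON body of the response.

```python
def summarize_connection(conn):
    """Pick out a few documented properties from a parsed connection object."""
    return {
        "id": conn["id"],
        "name": conn["name"],
        "vendor": conn["vendor"],
        "endpoint": f"{conn['host']}:{conn['port']}",
        # isGlobal marks a public connection that is available to all users.
        "public": bool(conn.get("isGlobal", False)),
        # credentials may be empty, for example when they are stored externally.
        "credentials_in_response": bool(conn.get("credentials")),
    }


if __name__ == "__main__":
    sample = {"id": 1, "name": "hive", "vendor": "hive", "host": "hadoop",
              "port": 10000, "isGlobal": True, "credentials": []}
    print(summarize_connection(sample))
```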

API Connections Patch v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Modify the specified connection.


For more information on connections, see Connection Types.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PATCH

Endpoint:

/v4/connections/<id>

where:

Parameter Description

<id> Internal identifier for the connection

Request URI - Example:

/v4/connections/8

Request Body:
Only the properties that are being modified need to be included in the request. In the following example, the
default database and the description of the connection are modified:

{
"params": {
"defaultDatabase": "my_default_db"
},
"description": "This connection uses a non-default default DB."
}
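Because only the modified properties need to be sent, a client can build the PATCH body from a small dictionary of changes. The sketch below constructs the request without sending it. The base URL argument and the form of the Authorization header are assumptions; see API Authentication.

```python
import json
import urllib.request


def patch_connection_request(base_url, connection_id, changes, auth_header):
    """Build (but do not send) a PATCH request carrying only the changed properties."""
    body = json.dumps(changes).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v4/connections/{int(connection_id)}",
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            # The header value is supplied by the caller; its scheme is an assumption.
            "Authorization": auth_header,
        },
    )


if __name__ == "__main__":
    changes = {
        "params": {"defaultDatabase": "my_default_db"},
        "description": "This connection uses a non-default default DB.",
    }
    request = patch_connection_request("https://trifacta.example.com", 8, changes, "Basic <encoded-credentials>")
    print(request.get_method(), request.full_url)
```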

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 8,
"updater": {
"id": 1
},
"updatedAt": "2019-01-25T23:19:27.648Z"
}

Reference

For more information on the properties of a connection, see API Connections Get v4.

API Connections Permissions Create User v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response

Share a connection with a specified user.


For more information on connections, see Connection Types.
Version: v4

Required Permissions

The authenticated user must be the owner of the connection.

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/connections/<cid>/permissions/

Parameter Description

<cid> Internal identifier of the connection.

Request URI - Example:

/v4/connections/10/permissions/

Request Body:

[
{
"personId": 26,
"role": "readOnly"
}
]

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"data": [
{
"role": "readOnly",
"createdAt": "2019-03-21T21:01:58.266Z",
"updatedAt": "2019-03-21T21:01:58.266Z",
"person": {
"id": 26
},
"connection": {
"id": 1
}
}
]
}

API Connections Permissions Delete User v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response

Delete user access to a specified connection.


For more information on connections, see Connection Types.
Version: v4

Required Permissions

The authenticated user must be an admin.

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v4/connections/<cid>/permissions/<uid>

Parameter Description

<cid> Internal identifier of the connection.

<uid> Internal identifier of the user whose permissions you are removing.

Request URI - Example:

/v4/connections/10/permissions/6

Request Body:
None.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


None.

API Connections Permissions Get User v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Retrieve the user permissions to a specified connection.


For more information on connections, see Connection Types.
Version: v4

Required Permissions

The authenticated user must be an admin.

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/connections/<cid>/permissions/

Parameter Description

<cid> Internal identifier of the connection to retrieve.

Request URI - Example:

/v4/connections/7/permissions/

Request Body:
None.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

[
{
"name": "Administrator",
"email": "[email protected]",
"id": 1,
"connectionpermission": {
"connectionId": 7,
"personId": 1,
"role": "owner"
}
},
{
"name": "Joe Guy",
"email": "[email protected]",
"id": 6,
"connectionpermission": {
"connectionId": 7,
"personId": 6,
"role": "readOnly"
}
}
]

Reference

Property Description

connectionId Internal identifier of the connection

personId Internal identifier of the user who has access to the connection

role Role assigned to the user for the connection. Values:

owner - creator and owner of the connection
readOnly - user with whom the connection has been shared. Users cannot modify the connection.

For more information, see API Connections Get v4.
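The permissions response above is a flat array, so grouping users by role takes only a few lines. The following is a sketch with a hypothetical function name. It assumes the input is the parsed JSON array from the response.

```python
def users_by_role(permissions):
    """Group user ids by the role listed under connectionpermission."""
    grouped = {}
    for entry in permissions:
        role = entry["connectionpermission"]["role"]
        grouped.setdefault(role, []).append(entry["id"])
    return grouped


if __name__ == "__main__":
    sample = [
        {"id": 1, "connectionpermission": {"connectionId": 7, "personId": 1, "role": "owner"}},
        {"id": 6, "connectionpermission": {"connectionId": 7, "personId": 6, "role": "readOnly"}},
    ]
    print(users_by_role(sample))
```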


API Connections Vendors Get List v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get the list of all vendors of connections that are supported in the instance of the platform.

NOTE: Additional vendors can become available based on the deployment of a set of configuration files
for the vendor. Adding vendors that are not on this list may require Trifacta Customer Success Services.
For more information on the available connections, see Connection Types.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/connections/vendors

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

[
{
"name": "db2",
"displayName": "DB2",
"type": "db2",
"category": "relational",
"credentialType": "basic",
"operation": "import",
"connectionParameters": [
{
"name": "host",
"displayName": "Host",
"type": "string",
"required": true,
"category": "location",
"default": ""
},
{
"name": "port",
"displayName": "Port",
"type": "integer",
"required": true,
"category": "location",
"default": "1521"

},
{
"name": "connectStrOpts",
"displayName": "Connect String Options",
"type": "string",
"required": false,
"category": "location",
"default": ""
},
{
"name": "database",
"displayName": "Database Name",
"type": "string",
"required": true,
"category": "location",
"default": ""
}
]
},
{
"name": "hive",
"displayName": "Hive",
"type": "hive",
"category": "bigdata",
"credentialType": "conf",
"operation": "both",
"connectionParameters": [
{
"name": "host",
"displayName": "Host",
"type": "string",
"required": true,
"category": "location",
"default": ""
},
{
"name": "port",
"displayName": "Port",
"type": "integer",
"required": true,
"category": "location",
"default": "10000"
},
{
"name": "connectStrOpts",
"displayName": "Connect String Options",
"type": "string",
"required": false,
"category": "location",
"default": ""
},
{
"name": "defaultDatabase",

"displayName": "Default Database",
"type": "string",
"required": true,
"category": "location",
"default": ""
}
]
},
{
"name": "oracle",
"displayName": "Oracle Database",
"type": "oracle",
"category": "relational",
"credentialType": "basic",
"operation": "both",
"connectionParameters": [
{
"name": "host",
"displayName": "Host",
"type": "string",
"required": true,
"category": "location",
"default": ""
},
{
"name": "port",
"displayName": "Port",
"type": "integer",
"required": true,
"category": "location",
"default": "1521"
},
{
"name": "connectStrOpts",
"displayName": "Connect String Options",
"type": "string",
"required": false,
"category": "location",
"default": ""
},
{
"name": "ssl",
"displayName": "Enable SSL",
"type": "boolean",
"required": true,
"category": "location",
"default": "false"
},
{
"name": "service",
"displayName": "Service Name",
"type": "string",
"required": true,

"category": "location",
"default": ""
}
]
},
{
"name": "postgres",
"displayName": "PostgreSQL",
"type": "postgres",
"category": "relational",
"credentialType": "basic",
"operation": "both",
"connectionParameters": [
{
"name": "host",
"displayName": "Host",
"type": "string",
"required": true,
"category": "location",
"default": ""
},
{
"name": "port",
"displayName": "Port",
"type": "integer",
"required": true,
"category": "location",
"default": "5432"
},
{
"name": "connectStrOpts",
"displayName": "Connect String Options",
"type": "string",
"required": false,
"category": "location",
"default": ""
},
{
"name": "ssl",
"displayName": "Enable SSL",
"type": "boolean",
"required": true,
"category": "location",
"default": "false"
},
{
"name": "database",
"displayName": "Database",
"type": "string",
"required": true,
"category": "location",
"default": ""
}
]
},
{
"name": "redshift",
"displayName": "Amazon Redshift",
"type": "redshift",
"category": "cloudplatform",
"credentialType": "custom",
"operation": "both",
"connectionParameters": [
{
"name": "host",
"displayName": "Host",
"type": "string",
"required": true,
"category": "location",
"default": ""
},
{
"name": "port",
"displayName": "Port",
"type": "integer",
"required": true,
"category": "location",
"default": "5439"
},
{
"name": "connectStrOpts",
"displayName": "Connect String Options",
"type": "string",
"required": false,
"category": "location",
"default": ""
},
{
"name": "defaultDatabase",
"displayName": "Database",
"type": "string",
"required": true,
"category": "location",
"default": ""
}
]
},
{
"name": "salesforce",
"displayName": "SalesForce",
"type": "salesforce",
"category": "application",
"credentialType": "basic",
"operation": "import",
"connectionParameters": [
{
"name": "servername",
"displayName": "Server Name",
"type": "string",
"required": true,
"category": "location",
"default": ""
},
{
"name": "securityToken",
"displayName": "Security Token generated in account",
"type": "string",
"required": true,
"category": "location",
"default": ""
},
{
"name": "connectStrOpts",
"displayName": "Connect String Options",
"type": "string",
"required": false,
"category": "location",
"default": ""
}
]
},
{
"name": "sqldatawarehouse",
"displayName": "Microsoft SQL Data Warehouse",
"type": "sqldatawarehouse",
"category": "relational",
"credentialType": "basic",
"operation": "both",
"connectionParameters": [
{
"name": "host",
"displayName": "Host",
"type": "string",
"required": true,
"category": "location",
"default": ""
},
{
"name": "port",
"displayName": "Port",
"type": "integer",
"required": true,
"category": "location",
"default": "1433"
},
{
"name": "connectStrOpts",
"displayName": "Connect String Options",
"type": "string",

"required": false,
"category": "location",
"default": ""
},
{
"name": "database",
"displayName": "Database",
"type": "string",
"required": true,
"category": "location",
"default": ""
},
{
"name": "externalDataSource",
"displayName": "External Data Source name",
"type": "string",
"required": false,
"category": "location",
"default": ""
}
]
},
{
"name": "sqlserver",
"displayName": "Microsoft SQL Server",
"type": "sqlserver",
"category": "relational",
"credentialType": "basic",
"operation": "both",
"connectionParameters": [
{
"name": "host",
"displayName": "Host",
"type": "string",
"required": true,
"category": "location",
"default": ""
},
{
"name": "port",
"displayName": "Port",
"type": "integer",
"required": true,
"category": "location",
"default": "1433"
},
{
"name": "connectStrOpts",
"displayName": "Connect String Options",
"type": "string",
"required": false,
"category": "location",
"default": ""

}
]
},
{
"name": "tableau",
"displayName": "Tableau Server",
"type": "tableau",
"category": "application",
"credentialType": "basic",
"operation": "publish",
"connectionParameters": [
{
"name": "host",
"displayName": "Server URL",
"type": "string",
"required": true,
"category": "location",
"default": ""
}
]
},
{
"name": "teradata",
"displayName": "Teradata Data Warehouse",
"type": "teradata",
"category": "relational",
"credentialType": "basic",
"operation": "both",
"connectionParameters": [
{
"name": "host",
"displayName": "Host",
"type": "string",
"required": true,
"category": "location",
"default": ""
},
{
"name": "port",
"displayName": "Port",
"type": "integer",
"required": true,
"category": "location",
"default": "1025"
},
{
"name": "connectStrOpts",
"displayName": "Connect String Options",
"type": "string",
"required": false,
"category": "location",
"default": ""
},
{
"name": "ssl",
"displayName": "Enable SSL",
"type": "boolean",
"required": true,
"category": "location",
"default": "false"
}
]
}
]

Reference

For each connection type, the following parameters are available.

Property Description

name Name of the connection type

displayName Description value of the connection type

type Vendor identifier for the connection type

category Tab in the Connections screen where connections of this type can be created

credentialType Type of credentials that are accepted for this connection type:

basic - used for relational connections, except as noted below
aws - used for Redshift connections only
trifacta_service - used for Hive connections only
Kerberos delegate - Connection uses the Kerberos-delegated principal to connect to the database. No credentials are submitted as part of the connection definition. This method requires additional configuration. See Enable SSO for Relational Connections.

operation Supported operations for the connection type:

import - import datasets only
publish - publish datasets only
both - import and publish

connectionParameters properties:

For each of the connection properties, the following attributes are specified:

Attribute Description

name Name of connection property

displayName Display value of the property, which appears above the textbox in the application

type Data type of the connection property

required If true, the property must be populated to create a connection of this type.

category Defines the order for how the properties are listed in the Create Connection dialog.

If set to location, the order is determined by the order in which the properties are listed under the vendor object.

default The default value for the connection property
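One possible use of the vendor definitions is to check a set of connection values before submitting them. The sketch below reports required properties that have neither a supplied value nor a non-empty default. Treating the listed default as a fallback is an assumption of this example, not documented platform behavior, and the function name is hypothetical.

```python
def missing_required(vendor, supplied):
    """List required connection properties with no supplied value and no usable default."""
    missing = []
    for prop in vendor["connectionParameters"]:
        if not prop["required"]:
            continue
        # Assumption: an omitted value falls back to the vendor's listed default.
        value = supplied.get(prop["name"], prop["default"])
        if value in ("", None):
            missing.append(prop["name"])
    return missing


if __name__ == "__main__":
    # Abbreviated copy of the postgres entry from the response example above.
    postgres = {
        "name": "postgres",
        "connectionParameters": [
            {"name": "host", "required": True, "default": ""},
            {"name": "port", "required": True, "default": "5432"},
            {"name": "connectStrOpts", "required": False, "default": ""},
            {"name": "database", "required": True, "default": ""},
        ],
    }
    print(missing_required(postgres, {"host": "db.example.com"}))  # ['database']
```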

API Deployments Create v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Create a new deployment.

NOTE: Deployments pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/deployments/

Request Body:

{
"name": "Test Deployment"
}

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"id": 1,
"name": "Test Deployment",
"updatedAt": "2019-02-13T20:14:48.537Z",
"createdAt": "2019-02-13T20:14:48.537Z",
"creator": {
"id": 7
},
"updater": {
"id": 7
}
}

Reference

For more information on properties of a deployment, see API Deployments Get v4.

API Deployments Delete v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Delete the specified deployment.

Deleting a deployment removes all releases, packages, and flows underneath it. This step cannot
be undone.

NOTE: Deployments pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v4/deployments/<id>

where:

Parameter Description

<id> Internal identifier for the deployment

Request URI - Example:

/v4/deployments/4

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

None.

API Deployments Get List v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get the list of accessible deployments for the authenticated user.

NOTE: Deployments pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/deployments

Endpoint with paged retrieval:

By default, this endpoint returns results in sets of 25.


You can apply query parameters to change the size of the default set and to page through result sets. The
following example queries for results 100 at a time. In this case, the query asks for results 201-300:

/v4/deployments?limit=100&offset=2

If the count of retrieved results is less than the limit, you have reached the end of the results.
Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"data": [
{
"id": 2,
"name": "Test Deployment 2",
"createdAt": "2019-02-13T20:15:39.147Z",
"updatedAt": "2019-02-13T20:15:39.147Z",
"numReleases": 0,
"latestRelease": null,
"creator": {
"id": 7
},
"updater": {
"id": 7
}
},
{
"id": 1,
"name": "Test Deployment",
"createdAt": "2019-02-13T20:14:48.537Z",
"updatedAt": "2019-02-13T20:14:48.537Z",
"numReleases": 0,
"latestRelease": null,
"creator": {
"id": 7
},
"updater": {
"id": 7
}
}
]
}

Reference

For more information on the properties of a deployment, see API Deployments Get v4.

API Deployments Get Release List v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get the list of releases for the specified deployment for the authenticated user.

NOTE: Deployments and releases pertain to Production instances of the Trifacta® platform. For more
information, see Overview of Deployment Manager.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/deployments/:id?embed=releases

Endpoint with paged retrieval:


By default, this endpoint returns results in sets of 25.
You can apply query parameters to change the size of the default set and to page through result sets. The
following example queries for results 100 at a time. In this case, the query asks for results 201-300:

/v4/deployments/:id?embed=releases&limit=100&offset=2

If the count of retrieved results is less than the limit, you have reached the end of the results.
Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 1,
"name": "New Deployment Name",
"createdAt": "2019-02-13T20:14:48.537Z",
"updatedAt": "2019-02-13T20:17:56.608Z",
"releases": {
"data": [
{
"id": 4,
"notes": "example_flow",
"packageUuid": "9bae78c0-2fcb-11e9-9523-77f56ed58844",
"active": null,
"createdAt": "2019-02-13T20:54:11.179Z",
"updatedAt": "2019-02-13T20:55:21.277Z",
"deployment": {
"id": 1
},
"creator": {
"id": 7
},
"updater": {
"id": 7
}
},
{
"id": 1,
"notes": "example_flow",
"packageUuid": "9bae78c0-2fcb-11e9-9523-77f56ed58844",
"active": true,
"createdAt": "2019-02-13T20:39:41.764Z",
"updatedAt": "2019-02-13T20:55:21.276Z",
"deployment": {
"id": 1
},
"creator": {
"id": 7
},
"updater": {
"id": 7
}
}
]
},
"creator": {
"id": 7
},
"updater": {
"id": 7
}
}

Reference

For more information on the properties of a release, see API Releases Get v4.
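In the response example above, one release has active set to true and the other has it set to null. A client that needs the active release can scan the embedded list, as in this sketch. The function name is hypothetical, and the input is assumed to be the parsed JSON body.

```python
def active_release(deployment):
    """Return the first embedded release whose active flag is true, or None."""
    for release in deployment.get("releases", {}).get("data", []):
        # Inactive releases may carry null instead of false, so compare with True.
        if release.get("active") is True:
            return release
    return None


if __name__ == "__main__":
    sample = {"id": 1, "releases": {"data": [
        {"id": 4, "active": None},
        {"id": 1, "active": True},
    ]}}
    print(active_release(sample))
```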
API Deployments Get v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get the specified deployment.

NOTE: Deployments pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/deployments/<id>

where:

Parameter Description

<id> Internal identifier for the deployment

Request URI - Example:

/v4/deployments/1

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 1,
"name": "Test Deployment",
"createdAt": "2019-02-13T20:14:48.537Z",
"updatedAt": "2019-02-13T20:14:48.537Z",
"creator": {
"id": 7
},
"updater": {
"id": 7
}
}

Reference

Property Description

id Internal identifier for the deployment

name Display name for the deployment. This value appears in the user interface.

createdAt Timestamp for when the deployment was created.

updatedAt Timestamp for when the deployment was last updated.

creator.id Internal identifier for the user who created the deployment.

updater.id Internal identifier for the user who last updated the deployment.

API Deployments Object Import Rules Patch v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

NOTE: This feature is in Beta release.

Create a list of object-based import rules for the specified deployment. Delete all previous rules applied to the
same object.

NOTE: Import rules must be applied to individual deployments.

The generated rules apply to all flows that are imported into the deployment after the rules have been created.

NOTE: Deployments pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

The response contains any previously created rules that have been deleted as a result of this change.
You can also make replacements in the import package based on value mappings. See
API Deployments Value Import Rules Patch v4.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PATCH

Endpoint:

/v4/deployments/<id>/objectImportRules

where:

Parameter Description

<id> Internal identifier for the deployment

Request URI - Example:

/v4/deployments/4/objectImportRules

Request Body Example: Replace connection


The following JSON array describes replacing the connection specified by the UUID, which is a field on the
connection object exported from the original platform instance. This connection reference is replaced by a
reference to connection ID 1 in the local platform instance and is applied to any release uploaded into the
deployment after the rule has been created:

[
{
"tableName": "connections",
"onCondition": {
"uuid": "d75255f0-a245-11e7-8618-adc1dbb4bed0"
},
"withCondition": {
"id": 1
}
}
]

You can specify matching values using string literals.

Match Type Example Syntax

string literal {"uuid":"d75255f0-a245-11e7-8618-adc1dbb4bed0"}

string literal {"id":"22"}

NOTE: Use of Trifacta patterns or regular expressions is not supported.

For more information on patterns, see Text Matching.


Request Body Example: Replace multiple connections
This example request includes replacements for multiple connection references.

NOTE: Rules are applied in the listed order. If you are applying multiple rules to the same object in the
import package, the second rule must reference the expected changes applied by the first rule.

This type of replacement applies if the imported packages contain sources that are imported through two separate
connections:

[
{
"tableName": "connections",
"onCondition": {
"uuid": "d75255f0-a245-11e7-8618-adc1dbb4bed0"
},
"withCondition": {
"id": 1
}
},
{
"tableName": "connections",
"onCondition": {
"uuid": "d552045e0-c314-22b5-9410-acd1bcd8eea2"
},
"withCondition": {
"id": 2
}
}
]
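
The request bodies above can be assembled programmatically. The following is a minimal sketch, not a definitive client: the helper names are hypothetical, the base URL is borrowed from the curl examples in this guide, and the JSON content type is an assumption. The request is prepared but never sent.

```python
import json
import urllib.request

# Base URL borrowed from the curl examples in this guide; adjust for your instance.
BASE_URL = "http://example.com:3005"


def connection_rule(source_uuid, local_connection_id):
    """Build one object import rule that remaps a connection by its UUID."""
    return {
        "tableName": "connections",
        "onCondition": {"uuid": source_uuid},
        "withCondition": {"id": local_connection_id},
    }


def object_import_rules_request(deployment_id, rules, auth_header):
    """Prepare (but do not send) the PATCH request for a deployment's rules."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v4/deployments/{deployment_id}/objectImportRules",
        data=json.dumps(rules).encode("utf-8"),
        method="PATCH",
        # Assumption: the platform accepts a JSON content type for this body.
        headers={"Content-Type": "application/json", "Authorization": auth_header},
    )


# Rules are listed in the order in which they should be applied.
rules = [connection_rule("d75255f0-a245-11e7-8618-adc1dbb4bed0", 1)]
request = object_import_rules_request(4, rules, "Basic <credentials>")
print(request.get_method(), request.full_url)
print(json.dumps(rules, indent=2))
```

To send the request, pass it to urllib.request.urlopen() with valid credentials.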

Response

Response Status Code - Success: 200 - OK

The response body contains any previously created rules that have been deleted as a result of this update.

Response Body Example: All new rules, no deletions

If the update does not overwrite any previous rules, then no rules are deleted, and the response looks like the
following:

{
"deleted": []
}

Response Body Example: Replace connection


If you submit the request again, the response contains the rule definition of the previous update, which has been
deleted. This example applies to the one-rule change listed previously:

{
"deleted": {
"data": [
{
"onCondition": {
"uuid": "d75255f0-a245-11e7-8618-adc1dbb4bed0"
},
"withCondition": {
"id": 1
},
"id": 1,
"tableName": "connections",
"createdAt": "2019-02-13T23:07:51.720Z",
"updatedAt": "2019-02-13T23:07:51.720Z",
"creator": {
"id": 7
},
"updater": {
"id": 7
},
"deployment": {
"id": 4
}
}
]
}
}
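
The two response examples have different shapes: an empty list when nothing was deleted, and an object wrapping a data list otherwise. The following sketch shows one way to read both. The function name is hypothetical, and the sample bodies are shortened versions of the examples above.

```python
import json


def deleted_rules(response_text):
    """Return the rules reported as deleted in an import rules PATCH response."""
    deleted = json.loads(response_text).get("deleted", [])
    # No deletions: "deleted" is an empty list.
    # Deletions: "deleted" is an object whose "data" list holds the rules.
    if isinstance(deleted, dict):
        return deleted.get("data", [])
    return deleted


no_deletions = '{"deleted": []}'
one_deletion = """
{"deleted": {"data": [{"id": 1, "tableName": "connections",
  "onCondition": {"uuid": "d75255f0-a245-11e7-8618-adc1dbb4bed0"},
  "withCondition": {"id": 1}}]}}
"""

for rule in deleted_rules(one_deletion):
    print(rule["tableName"], rule["onCondition"], "->", rule["withCondition"])
```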

Reference

Property Description

onCondition The matching object identifier and the specified literal value to match.

withCondition The identifier for the object type, as specified by the tableName value, which is being modified.

id Internal identifier for the object import rule

tableName Name of the table to which the mapping is applied. Values:

connections - applies to local connections


creator.id Internal identifier of the user who created the deleted rule

updater.id Internal identifier of the user who last updated the deleted rule

deployment.id Internal identifier for the deployment to which to apply the import rule.

API Deployments Patch v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Update the specified deployment.

NOTE: Deployments pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PATCH

Endpoint:

/v4/deployments/<id>

where:

Parameter Description

<id> Internal identifier for the deployment

Request URI - Example:

/v4/deployments/1

Request Body: Example - Modify the deployment name

NOTE: For the PATCH method, only the properties that are being patched need to be submitted.

{
"name": "New Deployment Name"
}

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 1,
"updater": {
"id": 7
},
"updatedAt": "2019-02-13T20:17:56.608Z"
}

Reference

For more information on the properties of a deployment, see API Deployments Get v4.

API Deployments Run v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Run the job for the active release of the specified deployment.
At least one manual output must be specified for the main flow within the package. See Flow View Page.
An active release must be specified for the deployment. See API Releases Patch v4.

NOTE: Deployments pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/deployments/<id>/run

where:

Parameter Description

<id> Internal identifier for the deployment

Request URI - Example:

/v4/deployments/4/run

Request Body:
Empty.

Request Body - Example for dataset with parameters:
In the following example, the request body contains overrides to the default job definition. In this case, the
override sets a new value for the dataset parameter varRegion:

{
"overrides": {
"runParameters": {
"overrides": {
"data": [{
"key": "varRegion",
"value": "02"
}
]}
}
}
}
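
When several parameters must be overridden, the nested structure above can be generated from a simple mapping. This is a sketch under the assumption that each override is one key/value entry in the data list, as in the example. The function name is hypothetical.

```python
import json


def run_overrides_body(parameters):
    """Wrap parameter overrides in the nesting used by the example above."""
    data = [{"key": key, "value": value} for key, value in parameters.items()]
    return {"overrides": {"runParameters": {"overrides": {"data": data}}}}


body = run_overrides_body({"varRegion": "02"})
print(json.dumps(body, indent=2))
```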

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"data": [
{
"reason": "JobStarted",
"sessionId": "14337009-1637-4948-a36f-16479d7138c6",
"id": 3
}
]
}

Reference

Property Description

reason Action undertaken on the endpoint.

sessionId Internal identifier for the session of the job run

id JobGroup identifier. For more information, see API JobGroups Get v4.

jobs.data.id Internal identifier for the individual jobs that compose the job group being executed.

API Deployments Value Import Rules Patch v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Create a list of value-based import rules for the specified deployment. Delete any previous rules applied to the
same values.

NOTE: Import rules must be applied to individual deployments.

The generated rules apply to all flows that are imported into the Production instance after the rules have been created.

NOTE: Deployments pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

The response contains any previously created rules that have been deleted as a result of this change.
You can also make replacements in the import package based on object references. See
API Deployments Object Import Rules Patch v4.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PATCH

Endpoint:

/v4/deployments/<id>/valueImportRules

where:

Parameter Description

<id> Internal identifier for the deployment

Request URI - Example:

/v4/deployments/4/valueImportRules

Request Body Example: Single value replacement


The following JSON array describes a single replacement rule for the S3 bucket name. In this case, the
wrangle-dev bucket name has been replaced by the wrangle-prod bucket name, which means data is pulled in the
Production deployment from the appropriate S3 bucket.

NOTE: The executing user of any job must have access to any data source that is remapped in the new
instance.

[
{
"type": "s3Bucket",
"on": "wrangle-dev",
"with": "wrangle-prod"
}
]

Request Body Example: Multiple value replacements


The following JSON array describes two replacements for the fileLocation values. In this case, rules are
applied in succession.

NOTE: Rules are applied in the listed order. If you are applying multiple rules to the same object in the
import package, the second rule must reference the expected changes applied by the first rule.

[
{
"type": "fileLocation",
"on": "klamath",
"with": "klondike"
},
{
"type": "fileLocation",
"on": "/\/dev\//",
"with": "/prod/"
}
]

In the above:
The first rule replaces the string klamath in the path to the source with the following value: klondike.
The second rule performs a regular expression match on the string /dev/. Since the match is described
using the regular expression syntax, the forward slashes inside the pattern must be escaped with backslashes.
The replacement value is the following literal: /prod/.

You can specify matching values using the following types of matches:

Match Type Example Syntax

string literal {"on":"d75255f0-a245-11e7-8618-adc1dbb4bed0"}

regular expression {"on":"/[0-9a-zA-Z]{8}-a245-11e7-8618-adc1dbb4bed0/"}

NOTE: Use of Trifacta patterns is not supported.

For more information on patterns, see Text Matching.
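
To check how an ordered list of rules would rewrite a path before submitting it, the rules can be simulated locally. The sketch below is an approximation, not the platform's implementation: it assumes that an "on" value wrapped in forward slashes is a regular expression and that anything else is a string literal. The sample path is hypothetical. Note that the JSON escape \/ decodes to a plain forward slash.

```python
import json
import re

# Raw string so that the JSON escape sequence \/ reaches the JSON parser intact.
RULES_JSON = r"""
[
  {"type": "fileLocation", "on": "klamath", "with": "klondike"},
  {"type": "fileLocation", "on": "/\/dev\//", "with": "/prod/"}
]
"""


def apply_rules(path, rules):
    """Apply each rule in listed order; later rules see earlier replacements."""
    for rule in rules:
        on, replacement = rule["on"], rule["with"]
        if len(on) > 2 and on.startswith("/") and on.endswith("/"):
            # Assumed regular expression: drop the delimiting slashes.
            # A function replacement keeps the replacement text literal.
            path = re.sub(on[1:-1], lambda _match: replacement, path)
        else:
            path = path.replace(on, replacement)
    return path


rules = json.loads(RULES_JSON)
result = apply_rules("/data/dev/klamath/orders.csv", rules)
print(result)
```

Because the rules run in succession, a second rule that targets the same path must match the text as it stands after the first rule has been applied.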

Response

Response Status Code - Success: 200 - OK

The response body contains any previously created rules that have been deleted as a result of this update.

Response Body Example: All new rules, no deletions

If the update does not overwrite any previous rules, then no rules are deleted, and the response looks like the
following:

{
"deleted": []
}

Response Body Example: Replace file location, delete previous rule


If you submit the request again, the response contains the rule definition of the previous update, which has been
deleted.

{
"deleted": {
"data": [
{
"id": 1,
"type": "s3Bucket",
"on": "wrangle-dev",
"with": "wrangle-prod",
"createdAt": "2019-02-13T23:27:13.351Z",
"updatedAt": "2019-02-13T23:27:13.351Z",
"creator": {
"id": 7
},
"updater": {
"id": 7
},
"deployment": {
"id": 2
}
}
]
}
}

Reference

Property Description

id Internal identifier for the value import rule

type The type of value import rule:

fileLocation - the location of a specified file

s3Bucket - location of the S3 bucket to modify

on The specified literal or pattern to match.

with The replacement value or pattern

createdAt Timestamp for when the rule was created

updatedAt Timestamp for when the rule was last updated

creator.id Internal identifier of the user who created the rule

updater.id Internal identifier of the user who last updated the rule

deployment.id Internal identifier for the deployment from which the import rule was deleted.

API EMRClusters Create v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Create an EMR cluster identifier object in the Trifacta® platform.

NOTE: APIs for EMR clusters apply only to instances of the Trifacta platform that are integrated with
Amazon EMR clusters. These APIs can be used to manage switching between EMR clusters when
needed.

NOTE: There can be only one EMR cluster registered with the platform at any time. The registered cluster
is always the active one.

Tip: You can use the PATCH method on the current EMRCluster object to update the EMR cluster ID that
is active for the platform. See API EMRClusters Patch v4.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/emrClusters/

Request URI - Example:

/v4/emrClusters/

Request Body:

{
"emrClusterId": "a-2BFK2$KVR7QQ",
"resourceBucket": "3dog-testing-emr-spark",
"resourcePath": "",
"region": "us-west-2"
}

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"id": 1,
"emrClusterId": "a-2BFK2$KVR7QQ",
"resourceBucket": "3dog-testing-emr-spark",
"resourcePath": "",
"region": "us-west-2",
"updatedAt": "2019-02-14T01:20:51.303Z",
"createdAt": "2019-02-14T01:20:51.303Z"
}

Reference

For more information, see API EMRClusters Get v4.


API EMRClusters Delete v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Delete the specified EMR cluster identifier.

NOTE: APIs for EMR clusters apply only to deployments of the Trifacta® platform that are integrated with
Amazon EMR clusters. These APIs can be used to manage failovers or prolonged outages of a primary
EMR cluster.

NOTE: There can be only one EMR cluster registered with the platform at any time. The registered cluster
is always the active one.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v4/emrClusters/<id>

where:

Parameter Description

<id> Internal identifier for the EMR cluster

Request URI - Example:

/v4/emrClusters/1

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

None.

API EMRClusters Get Count v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get the count of EMR cluster IDs accessible to the authenticated user.

NOTE: APIs for EMR clusters apply only to instances of the Trifacta® platform that are integrated with
Amazon EMR clusters. These APIs can be used to manage switching between EMR clusters when
needed.

NOTE: There can be only one EMR cluster registered with the platform at any time. The registered cluster
is always the active one.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/emrClusters/count

Request URI - Example:

/v4/emrClusters/count

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example: count = 0:

Empty.

Response Body Example: count = 1:

{
"count": 1
}
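
Since a count of zero is returned as an empty body rather than as a JSON object, a client needs to handle both cases. A minimal sketch, with a hypothetical function name:

```python
import json


def parse_count(response_text):
    """Return the count from the response body, treating an empty body as zero."""
    if not response_text.strip():
        return 0
    return json.loads(response_text)["count"]


print(parse_count(""))
print(parse_count('{"count": 1}'))
```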

Reference

None.

API EMRClusters Get List v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get the list of all EMR cluster IDs accessible to the authenticated user.

NOTE: APIs for EMR clusters apply only to instances of the Trifacta® platform that are integrated with
Amazon EMR clusters. These APIs can be used to manage switching between EMR clusters when
needed.

NOTE: There can be only one EMR cluster registered with the platform at any time. The registered cluster
is always the active one.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/emrClusters/

Request URI - Example:

/v4/emrClusters/

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"data": [
{
"id": 22109999,
"emrClusterId": "a-2BEJOQHVRZ7BZ",
"resourceBucket": "2dog-testing-emr",
"resourcePath": "",
"region": "us-west-2",
"createdAt": "2019-01-23T17:16:46.000Z",
"updatedAt": "2019-01-23T17:16:46.000Z"
},
{
"id": 10419999,
"emrClusterId": "a-2CEKR9HVRK9H0",
"resourceBucket": "2dog-testing-emr",
"resourcePath": "",
"region": "us-west-2",
"createdAt": "2019-01-23T14:27:26.000Z",
"updatedAt": "2019-01-23T14:27:26.000Z"
}
]
}

Reference

For more information, see API EMRClusters Get v4.


API EMRClusters Get v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get information on the specified EMR cluster identifier.

NOTE: APIs for EMR clusters apply only to instances of the Trifacta® platform that are integrated with
Amazon EMR clusters. These APIs can be used to manage switching between EMR clusters when
needed.

NOTE: There can be only one EMR cluster registered with the platform at any time. The registered cluster
is always the active one.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/emrClusters/<id>

where:

Parameter Description

<id> Internal identifier for the EMR cluster

Request URI - Example:

/v4/emrClusters/22101948

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 22101948,
"emrClusterId": "a-2BFK90SF0F9K",
"resourceBucket": "2dog-testing-emr",
"resourcePath": "",
"region": "us-west-2",
"createdAt": "2019-01-23T17:16:46.000Z",
"updatedAt": "2019-01-23T17:16:46.000Z"
}

Reference

Property Description

id Internal identifier for the EMR cluster

emrClusterId EMR identifier for the cluster

resourceBucket S3 bucket that contains the Trifacta libraries for EMR and Trifacta job logs

resourcePath Default path within the resourceBucket

region AWS region where the cluster has been created

createdAt Timestamp for when the EMR cluster object was created

updatedAt Timestamp for when the EMR cluster object was last updated

API EMRClusters Patch v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Update the properties of the specified EMR cluster identifier.

Tip: You can use this endpoint to switch the currently active EMR cluster to a new EMR cluster by
changing the identifier value for it.

NOTE: Modifying a cluster while jobs are running can result in erroneous reporting of job status. Perform
these modifications during off-peak hours.

NOTE: APIs for EMR clusters apply only to instances of the Trifacta® platform that are integrated with
Amazon EMR clusters. These APIs can be used to manage switching between EMR clusters when
needed, such as during failovers or prolonged outages of a primary EMR cluster.

NOTE: There can be only one EMR cluster registered with the platform at any time. The registered cluster
is always the active one.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PATCH

Endpoint:

/v4/emrClusters/<id>

where:

Parameter Description

<id> Internal identifier for the EMR cluster

Request URI - Example:

/v4/emrClusters/1

Request Body:
Only the properties that you are updating need to be included in the request.

{
"resourceBucket": "3dog-testing-emr2",
"resourcePath": "default"
}
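
Because only changed properties need to be submitted, the request body can be derived by comparing the current object with the desired settings. The following sketch assumes you have already retrieved the current object. The function name is hypothetical, and the property values are taken from the examples in this guide.

```python
def patch_body(current, desired):
    """Return only the properties whose desired value differs from the current one."""
    return {key: value for key, value in desired.items() if current.get(key) != value}


current = {
    "id": 1,
    "emrClusterId": "a-2BFK2$KVR7QQ",
    "resourceBucket": "3dog-testing-emr-spark",
    "resourcePath": "",
    "region": "us-west-2",
}
desired = {
    "resourceBucket": "3dog-testing-emr2",
    "resourcePath": "default",
    "region": "us-west-2",  # unchanged, so it is left out of the PATCH body
}
body = patch_body(current, desired)
print(body)
```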

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 1,
"updatedAt": "2019-02-14T01:23:14.344Z"
}

Reference

For more information, see API EMRClusters Get v4.


API Flows Create v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Create a new flow with the specified name and, optionally, a description and target folder.

NOTE: You cannot add datasets to the flow through this endpoint. Moving pre-existing datasets into a
flow is not supported in this release. Create the flow first and then when you create the datasets,
associate them with the flow at the time of creation.
See API ImportedDatasets Create v4.
See API WrangledDatasets Create v4.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/flows/

Request Body:
A name value is required. Other properties are optional.

{
"name": "My Flow",
"description": "This is my flow.",
"folder": {
"id": 2
}
}
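
A small helper can enforce the rule that name is required while leaving the optional properties out when they are not supplied. This is an illustrative sketch; the function name and its argument names are hypothetical.

```python
import json


def flow_body(name, description=None, folder_id=None):
    """Build the request body for creating a flow; only the name is mandatory."""
    if not name:
        raise ValueError("A flow name is required")
    body = {"name": name}
    if description is not None:
        body["description"] = description
    if folder_id is not None:
        body["folder"] = {"id": folder_id}
    return body


body = flow_body("My Flow", "This is my flow.", folder_id=2)
print(json.dumps(body, indent=2))
```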

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"id": 18,
"updatedAt": "2019-01-08T20:27:53.422Z",
"createdAt": "2019-01-08T20:27:53.422Z",
"name": "My Flow",
"description": "This is my flow.",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"folder": {
"id": 2
},
"workspace": {
"id": 1
}
}

Reference

For more information on the properties of a flow, see API Flows Get v4.

API Flows Delete v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Delete the specified flow.


Version: v4

Required Permissions

The user deleting the flow must be its owner.

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v4/flows/<id>

where:

Parameter Description

<id> Internal identifier for the flow

Request URI - Example:

/v4/flows/2

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

For more information on the properties of a flow, see API Flows Get v4.
API Flows Get List v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Returns a list of all flows accessible to the authenticated user.


Version: v4

Required Permissions

When authenticated, you can review all flows to which you have access.

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/flows

Endpoint with paged retrieval:

By default, this endpoint returns results in sets of 25.


You can apply query parameters to change the size of the default set and to page through result sets. The
following example queries for results 100 at a time. In this case, the query asks for results 201-300:

/v4/flows?limit=100&offset=2

If the count of retrieved results is less than the limit, you have reached the end of the results.
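
The paging logic can be sketched as a loop that stops when a result set is smaller than the limit. In the sketch below, fetch_page is a hypothetical caller-supplied function standing in for the HTTP call. It also assumes that offset counts result sets, as the example above suggests (offset=2 for results 201-300). If your platform version counts individual records instead, set offset_step to the limit.

```python
from urllib.parse import urlencode


def flows_page_url(limit, offset):
    """Build the paged endpoint; query parameters are joined with '&'."""
    return "/v4/flows?" + urlencode({"limit": limit, "offset": offset})


def fetch_all(fetch_page, limit=25, offset_step=1):
    """Collect results until a page comes back shorter than the limit."""
    results, offset = [], 0
    while True:
        page = fetch_page(limit, offset)
        results.extend(page)
        if len(page) < limit:
            return results
        offset += offset_step


# Stand-in for the platform: 250 fake flows served in result sets.
FAKE_FLOWS = [{"id": i} for i in range(1, 251)]


def fake_fetch(limit, offset):
    start = offset * limit
    return FAKE_FLOWS[start:start + limit]


all_flows = fetch_all(fake_fetch, limit=100)
print(flows_page_url(100, 2), len(all_flows))
```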

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Examples:

{
"data": [
{
"id": 9,
"name": "Intern Training",
"description": "(Please don't modify)",
"createdAt": "2019-01-08T18:14:37.851Z",
"updatedAt": "2019-01-08T18:57:26.824Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"folder": {
"id": 1
},
"workspace": {
"id": 1
}
},
{
"id": 6,
"name": "2013 POS",
"description": null,
"createdAt": "2019-01-08T17:25:21.392Z",
"updatedAt": "2019-01-08T17:30:30.959Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"folder": {
"id": 4
},
"workspace": {
"id": 1
}
},
{
"id": 3,
"name": "Untitled Flow",
"description": "",
"createdAt": "2019-01-08T15:53:51.109Z",
"updatedAt": "2019-01-08T16:10:10.020Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"folder": null,
"workspace": {
"id": 1
}
}
]
}

Reference

For more information on the properties of a flow, see API Flows Get v4.

API Flows Get v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Retrieve the flow information for a specified flow identifier.


Version: v4

Required Permissions

The authenticated user must have access to the specified flow.

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/flows/<id>

Parameter Description

<id> Internal identifier of the flow to retrieve.

Request URI - Example:

/v4/flows/6

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 6,
"name": "2013 POS",
"description": null,
"createdAt": "2019-01-08T17:25:21.392Z",
"updatedAt": "2019-01-08T17:30:30.959Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"folder": null,
"workspace": {
"id": 1
}
}

Reference

Property Description

id Internal identifier for the flow

name Display text for the flow

description User-friendly description for the flow

createdAt Timestamp for when the flow was created

updatedAt Timestamp for when the flow was last modified

creator.id Internal ID of the user who created the flow.

updater.id Internal identifier of the user who last updated the flow.

folder If the flow has been added to a folder, this value references the folder by its internal identifier. Otherwise, the value is null.

workspace.id Internal identifier of the workspace to which this flow belongs. In most environments, this value is 1.

API Flows Package Get DryRun v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Performs a dry-run of generating and exporting a flow package, which checks all permissions required to export
the package.
Any permissions errors are reported in the response.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/flows/<id>/package/dryRun

Parameter Description

<id> Internal identifier of the flow to retrieve.

Request URI - Example:

/v4/flows/7/package/dryRun

Request Body:
None.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{ }

Reference

None.

API Flows Package Get v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Retrieve a package containing the definition of the specified flow.


Response body is the contents of the package. Package contents are a ZIPped version of the flow definition.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/flows/<id>/package

Parameter Description

<id> Internal identifier of the flow to retrieve.

Request URI - Example:

/v4/flows/7/package

Request Body:
None.

Response

Response Status Code - Success: 200 - OK

Response Body Example:


Response body is the contents of the ZIP file. This package should be downloaded to your local environment.

Reference

None.

API Flows Package Post DryRun v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Performs a dry-run of importing a flow package, which checks all permissions required to import the package, as
well as any specified import rules.
For more information on import rules, see Define Import Mapping Rules.
Any errors are reported in the response.
After you have successfully completed a dry-run, you can execute a formal import. See
API Flows Package Post v4.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/flows/package/dryRun

Request URI - Example:

/v4/flows/package/dryRun

Request Body:
The request body must include the following key and value combination submitted as form data. This path is the
location of the ZIP package that you are importing.

key value

data "@path-to-file"

Example request - curl:

curl -X POST http://example.com:3005/v4/flows/package/dryRun \
-H 'authorization: Basic YWRtaW5AdHJpZmFjdGEubG9jYWw6YWRtaW4=' \
-H 'cache-control: no-cache' \
-H 'content-type: multipart/form-data' \
-F data=@path/to/flow.zip
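
For clients that do not use curl, the same request can be prepared with the Python standard library. This is a sketch, not a definitive client: the helper names are hypothetical, the ZIP part content type is an assumption, and the package bytes are a stand-in (an empty ZIP archive). The Basic header in the curl example is the Base64 encoding of admin@trifacta.local:admin, which the first helper reproduces. The request is built but not sent.

```python
import base64
import urllib.request
import uuid


def basic_auth_header(username, password):
    """Encode credentials as an HTTP Basic authorization header value."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def multipart_body(field, filename, content):
    """Encode one file as multipart/form-data; returns (body, content type)."""
    boundary = uuid.uuid4().hex
    parts = [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"'.encode(),
        b"Content-Type: application/zip",  # assumption: ZIP media type for the part
        b"",
        content,
        f"--{boundary}--".encode(),
        b"",
    ]
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"


# Stand-in payload: the 22 bytes of an empty ZIP archive.
package_bytes = b"PK\x05\x06" + b"\x00" * 18
body, content_type = multipart_body("data", "flow.zip", package_bytes)
request = urllib.request.Request(
    "http://example.com:3005/v4/flows/package/dryRun",
    data=body,
    method="POST",
    headers={
        "Authorization": basic_auth_header("admin@trifacta.local", "admin"),
        "Content-Type": content_type,
    },
)
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

In practice, read the package bytes from the exported ZIP file and replace the sample credentials with your own.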

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{"importRuleChanges":{"object":[],"value":[]},"flowName":"[7dd7da30] 2013 POS"}

Reference

None.

API Flows Package Post v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Performs an import of a flow package, which also applies any specified import rules.
Before you import, you can perform a dry-run to check for errors. See API Flows Package Post DryRun v4.
For more information on import rules, see Define Import Mapping Rules.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/flows/package/

Request URI - Example:

/v4/flows/package

Request Body:

The request body must include the following key and value combination submitted as form data. This path is the
location of the ZIP package that you are importing.

key value

data "@path-to-file"

Example request - curl:

curl -X POST http://example.com:3005/v4/flows/package \
-H 'authorization: Basic YWRtaW5AdHJpZmFjdGEubG9jYWw6YWRtaW4=' \
-H 'cache-control: no-cache' \
-H 'content-type: multipart/form-data' \
-F data=@path/to/flow.zip

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"importRuleChanges":{
"object":[],
"value":[]
},
"primaryFlowIds":[
6
],
"flows":[
{
"id":6,
"name":"[b19d9a70] 2013 POS",
"description":null,
"deleted_at":null,
"cpProject":null,
"workspaceId":1,
"createdAt":"2018-04-24T16:11:59.343Z",
"updatedAt":"2018-04-24T18:26:47.522Z",
"createdBy":1,
"updatedBy":1
}
],
"datasources":[
{
"id":10,
"size":"13757",
"path":"/uploads/1/0b6a7d7a-be8d-46c9-92cd-202f39fa5b1b/REF_PROD.txt",
"dynamicPath":null,
"type":"hdfs",
"cpProject":null,
"workspaceId":1,

"bucket":null,
"connectionId":null,
"deleted_at":null,
"blobHost":null,
"container":null,
"isSchematized":true,
"isDynamic":false,
"disableTypeInference":false,
"createdAt":"2018-04-24T16:11:59.568Z",
"updatedAt":"2018-04-24T18:26:47.592Z",
"createdBy":1,
"updatedBy":1,
"parsingScriptId":15
},
{
"id":11,
"size":"128761",
"path":"/uploads/1/0b6a7d7a-be8d-46c9-92cd-202f39fa5b1b/POS-r03.txt",
"dynamicPath":null,
"type":"hdfs",
"cpProject":null,
"workspaceId":1,
"bucket":null,
"connectionId":null,
"deleted_at":null,
"blobHost":null,
"container":null,
"isSchematized":true,
"isDynamic":false,
"disableTypeInference":false,
"createdAt":"2018-04-24T16:12:00.812Z",
"updatedAt":"2018-04-24T18:26:47.597Z",
"createdBy":1,
"updatedBy":1,
"parsingScriptId":16
},
{
"id":12,
"size":"56976",
"path":"/uploads/1/0b6a7d7a-be8d-46c9-92cd-202f39fa5b1b/REF_CAL.txt",
"dynamicPath":null,
"type":"hdfs",
"cpProject":null,
"workspaceId":1,
"bucket":null,
"connectionId":null,
"deleted_at":null,
"blobHost":null,
"container":null,
"isSchematized":true,
"isDynamic":false,

"disableTypeInference":false,
"createdAt":"2018-04-24T16:12:01.431Z",
"updatedAt":"2018-04-24T18:26:47.600Z",
"createdBy":1,
"updatedBy":1,
"parsingScriptId":17
},
{
"id":13,
"size":"281032",
"path":"/uploads/1/0b6a7d7a-be8d-46c9-92cd-202f39fa5b1b/POS-r02.txt",
"dynamicPath":null,
"type":"hdfs",
"cpProject":null,
"workspaceId":1,
"bucket":null,
"connectionId":null,
"deleted_at":null,
"blobHost":null,
"container":null,
"isSchematized":true,
"isDynamic":false,
"disableTypeInference":false,
"createdAt":"2018-04-24T16:12:01.848Z",
"updatedAt":"2018-04-24T18:26:47.603Z",
"createdBy":1,
"updatedBy":1,
"parsingScriptId":18
},
{
"id":14,
"size":"1799008",
"path":"/uploads/1/0b6a7d7a-be8d-46c9-92cd-202f39fa5b1b/POS-schema.csv",
"dynamicPath":null,
"type":"hdfs",
"cpProject":null,
"workspaceId":1,
"bucket":null,
"connectionId":null,
"deleted_at":null,
"blobHost":null,
"container":null,
"isSchematized":true,
"isDynamic":false,
"disableTypeInference":false,
"createdAt":"2018-04-24T16:12:03.402Z",
"updatedAt":"2018-04-24T18:26:47.607Z",
"createdBy":1,
"updatedBy":1,
"parsingScriptId":19
}

],
"flownodes":[
{
"id":16,
"scriptId":15,
"flowId":6,
"flowNodeId":null,
"deleted_at":null,
"wrangled":false,
"referenceId":null,
"createdAt":"2018-04-24T16:11:59.930Z",
"updatedAt":"2018-04-24T18:26:47.803Z",
"createdBy":1,
"updatedBy":1,
"activesampleId":null
},
{
"id":17,
"scriptId":16,
"flowId":6,
"flowNodeId":null,
"deleted_at":null,
"wrangled":false,
"referenceId":null,
"createdAt":"2018-04-24T16:12:01.048Z",
"updatedAt":"2018-04-24T18:26:47.821Z",
"createdBy":1,
"updatedBy":1,
"activesampleId":null
},
{
"id":18,
"scriptId":17,
"flowId":6,
"flowNodeId":null,
"deleted_at":null,
"wrangled":false,
"referenceId":null,
"createdAt":"2018-04-24T16:12:01.652Z",
"updatedAt":"2018-04-24T18:26:47.834Z",
"createdBy":1,
"updatedBy":1,
"activesampleId":null
},
{
"id":19,
"scriptId":18,
"flowId":6,
"flowNodeId":null,
"deleted_at":null,
"wrangled":false,
"referenceId":null,
"createdAt":"2018-04-24T16:12:02.073Z",

"updatedAt":"2018-04-24T18:26:47.848Z",
"createdBy":1,
"updatedBy":1,
"activesampleId":null
},
{
"id":20,
"scriptId":19,
"flowId":6,
"flowNodeId":null,
"deleted_at":null,
"wrangled":false,
"referenceId":null,
"createdAt":"2018-04-24T16:12:03.664Z",
"updatedAt":"2018-04-24T18:26:47.861Z",
"createdBy":1,
"updatedBy":1,
"activesampleId":null
},
{
"id":21,
"scriptId":20,
"flowId":6,
"flowNodeId":null,
"deleted_at":null,
"wrangled":true,
"referenceId":2,
"createdAt":"2018-04-24T16:12:14.018Z",
"updatedAt":"2018-04-24T18:26:47.876Z",
"createdBy":1,
"updatedBy":1,
"activesampleId":null
},
{
"id":22,
"scriptId":21,
"flowId":6,
"flowNodeId":null,
"deleted_at":null,
"wrangled":true,
"referenceId":null,
"createdAt":"2018-04-24T17:15:36.199Z",
"updatedAt":"2018-04-24T18:26:47.886Z",
"createdBy":1,
"updatedBy":1,
"activesampleId":null
}
],
"outputobjects":[
{
"id":5,
"execution":"photon",
"profiler":true,

"flowNodeId":21,
"isAdhoc":true,
"createdAt":"2018-04-24T16:14:17.153Z",
"updatedAt":"2018-04-24T18:26:47.896Z",
"createdBy":1,
"updatedBy":1

}
]
}

Reference

Node Description

flows Internal identifier of the flow and related data

datasources Sources of data for the flow.

flownodes Objects (imported datasets, recipes, and reference objects) within the flow definition

outputobjects Output objects related to the Run Job settings for the flow. See Run Job Page.

API Flows Patch v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Update an existing flow based on the specified identifier.

NOTE: You cannot add datasets to the flow through this endpoint. Moving pre-existing datasets into a
flow is not supported in this release. Create the flow first and then when you create the datasets,
associate them with the flow at the time of creation.
See API ImportedDatasets Create v4.
See API WrangledDatasets Create v4.

Version: v4

Required Permissions

The authenticated user must be the owner of the flow that is being updated.

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PATCH

Endpoint:

/v4/flows/<id>

Parameter Description

<id> Internal identifier of the flow to update.

Request URI - Example:

/v4/flows/8

Request Body:
You can modify the following properties.

{
"name": "My Flow",
"description": "This is my flow."
}

NOTE: For the PATCH method, only the properties that are being patched need to be submitted.
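
As a rough sketch of the request above, the following Python builds the PATCH call using only the standard library. The base URL and the helper name build_flow_patch are assumptions made for illustration, and authentication headers are omitted (see API Authentication). The request is constructed but not sent.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the host of your own deployment.
BASE_URL = "https://trifacta.example.com"


def build_flow_patch(flow_id, **changes):
    """Build (but do not send) a PATCH request for /v4/flows/<id>.

    Only the properties being changed need to be passed in.
    """
    body = json.dumps(changes).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v4/flows/{flow_id}",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


# Rename flow 8 without touching its description.
request = build_flow_patch(8, name="My Flow")
```

Passing the request to urllib.request.urlopen would submit it. Because only the changed properties are serialized, the description of the flow is left as it was.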

Response

Response Status Code - Success: 200 - OK

Response Body Example:


None.

Reference

For more information on the properties of a flow, see API Flows Get v4.

API ImportedDatasets Create v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request and Response
Examples by Type
File (HDFS and S3 sources)
Hive
Relational
Relational with Custom SQL Query
Reference

Create an imported dataset from an available resource. The created dataset is owned by the authenticated user.

NOTE: When an imported dataset is created via API, it is always imported as an unstructured dataset.
Any recipe that references this dataset should contain initial parsing steps required to structure the data.

NOTE: Do not create an imported dataset from a file that is being used by another imported dataset. If
you delete the newly created imported dataset, the file is removed, and the other dataset is corrupted.
Use a new file, or first make a copy of the original file.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request and Response

Request Type: POST

Endpoint:

/v4/importedDatasets

Response Status Code - Success: 201 - Created

Examples by Type

Below, you can review the basic request body for creating imported datasets for various types of sources:
File (HDFS and S3 sources)
Hive
Relational
Relational with Custom SQL Query

File (HDFS and S3 sources)

Request Body - HDFS file:


Below, the bucket value is set to null. This parameter applies only to S3 sources.

NOTE: The path value should not include the HDFS protocol, host, or port information. You only need to
provide the path on HDFS.

{
"path":
"/tri-hdfs/uploads/1/4aee9852-cf92-47a8-8c6a-9ff2adeb3b4a/POS-r02.txt",
"type": "hdfs",
"bucket": null,
"name": "POS-r02b.txt",
"description": "POS-r02 - copy"
}

Request Body - S3 file:


For S3 sources, a bucket must be specified. Below, the bucket value is set to myBucket.

NOTE: The path value should not include the S3 protocol, host, or port information. You only need to
provide the path on S3.

{
"path":
"/tri-h26/uploads/1/343647c7-5b23-41c8-9397-b40a1ff415ea/USDA_Farmers_Market_2014.avro",
"type": "s3",
"bucket": "myBucket",
"name": "USDA Farmers Market 2014b",
"description": "USDA Farmers Market 2014 - copy"
}
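
The two notes about omitting protocol, host, and port can be sketched in Python as follows. The helper name file_dataset_body and the example URIs (including the namenode host and port) are hypothetical. The sketch assumes that you start from a full URI and need to reduce it to the path and bucket values expected in the request body.

```python
from urllib.parse import urlparse


def file_dataset_body(uri, name, description=None):
    """Build a request body for a file-based imported dataset from a full URI."""
    parsed = urlparse(uri)
    if parsed.scheme not in ("hdfs", "s3"):
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    body = {
        # Only the path is submitted; protocol, host, and port are dropped.
        "path": parsed.path,
        "type": parsed.scheme,
        # For S3 the URI authority is the bucket; for HDFS the bucket is null.
        "bucket": parsed.netloc if parsed.scheme == "s3" else None,
        "name": name,
    }
    if description is not None:
        body["description"] = description
    return body


hdfs_body = file_dataset_body(
    "hdfs://namenode:8020/tri-hdfs/uploads/1/POS-r02.txt", "POS-r02b.txt"
)
s3_body = file_dataset_body(
    "s3://myBucket/tri-h26/uploads/1/USDA_Farmers_Market_2014.avro",
    "USDA Farmers Market 2014b",
)
```

Serializing either dictionary with json.dumps turns the Python None into the JSON null shown in the HDFS example.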

Response Body - file:


The following example is for an S3 file. For an HDFS file, the type value is hdfs and the bucket value is null.

{
"id": 63,
"size": "5053205",
"path":
"/tri-h26/uploads/1/343647c7-5b23-41c8-9397-b40a1ff415ea/USDA_Farmers_Market_2014.avro",
"dynamicPath": null,
"type": "s3",
"bucket": "myBucket",
"isSchematized": true,
"isDynamic": false,
"disableTypeInference": false,
"createdAt": "2018-02-03T01:08:32.867Z",
"updatedAt": "2018-02-03T01:08:34.185Z",
"parsingRecipe": {
"id": 121
},
"runParameters": {
"data": []
},
"name": "USDA Farmers Market 2014b",
"description": "USDA Farmers Market 2014 - copy",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"connection": null
}

Hive

Request Body - Hive:


Notes:
The type value must be jdbc.
The columns key is optional. If not provided, all columns in the source table are included.

{
"visible": true,
"numFlows": 0,
"size": -1,
"type": "jdbc",
"jdbcType": "TABLE",
"jdbcPath": [
"default"
],
"jdbcTable": "farmers_market_recipe_tri",
"columns": [
"fmid",
"market_name"
],
"connectionId": 1,
"name": "Farmer's Market Data"
}
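
A minimal sketch of assembling this body programmatically, with the optional columns key left out when no column list is given. The function name jdbc_table_body is an assumption for illustration. The fixed values (visible, numFlows, size, jdbcType) are copied from the example above rather than derived from any schema.

```python
def jdbc_table_body(connection_id, database, table, name, columns=None):
    """Build a request body for a table-based (type=jdbc) imported dataset."""
    body = {
        "visible": True,
        "numFlows": 0,
        "size": -1,
        "type": "jdbc",
        "jdbcType": "TABLE",
        "jdbcPath": [database],
        "jdbcTable": table,
        "connectionId": connection_id,
        "name": name,
    }
    # columns is optional: when omitted, all columns in the table are included.
    if columns:
        body["columns"] = list(columns)
    return body


hive_body = jdbc_table_body(
    1,
    "default",
    "farmers_market_recipe_tri",
    "Farmer's Market Data",
    columns=["fmid", "market_name"],
)
all_columns_body = jdbc_table_body(
    1, "default", "farmers_market_recipe_tri", "Farmer's Market Data"
)
```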

Response Example - Hive:

{
"jdbcTable": "farmers_market_recipe_tri",
"jdbcPath": [
"default"
],
"columns": [
"fmid",
"market_name"
],
"filter": null,
"raw": null,
"id": 19,
"size": "-1",
"path": null,
"dynamicPath": null,
"type": "jdbc",
"bucket": null,
"isSchematized": true,
"isDynamic": false,
"disableTypeInference": false,
"createdAt": "2018-02-26T19:19:33.069Z",
"updatedAt": "2018-02-26T19:19:33.720Z",
"parsingRecipe": {
"id": 35
},
"relationalSource": {
"relationalPath": [
"default"
],
"columns": [
"fmid",

"market_name"
],
"filter": null,
"raw": null,
"id": 2,
"tableName": "farmers_market_recipe_tri",
"createdAt": "2018-02-26T19:19:33.074Z",
"updatedAt": "2018-02-26T19:19:33.074Z",
"importedDataset": {
"id": 19
}
},
"runParameters": {
"data": []
},
"connection": {
"name": "hive",
"creator": {
"id": 1,
"email": "[email protected]",
"name": "Administrator"
},
"id": 1,
"inaccessibleToUser": true
},
"name": "Farmer's Market Data",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"workspace": {

"id": 1
}
}

Relational

Request Body - Relational:


Notes:
If you know the size value for the table, provide it. The value helps with performance and
validation but is not required.
The columns key is optional. If not provided, all columns in the source table are included.

{
"visible": true,
"numFlows": 0,
"size": -1,
"type": "jdbc",
"jdbcType": "TABLE",
"jdbcPath": [
"public"
],
"jdbcTable": "datasources",
"columns": [
"id",
"size",
"path"
],
"connectionId": 3,
"name": "My DB Table"
}

Response Example - Relational:

{
"jdbcTable": "datasources",
"jdbcPath": [
"public"
],
"columns": [
"id",
"size",
"path"
],
"filter": null,
"raw": null,
"id": 23,
"size": "-1",
"path": null,
"dynamicPath": null,

"type": "jdbc",
"bucket": null,
"isSchematized": true,
"isDynamic": false,
"disableTypeInference": false,
"createdAt": "2018-02-26T19:32:52.898Z",
"updatedAt": "2018-02-26T19:32:53.613Z",
"parsingRecipe": {
"id": 37
},
"relationalSource": {
"relationalPath": [
"public"
],
"columns": [
"id",
"size",
"path"
],
"filter": null,
"raw": null,
"id": 4,
"tableName": "datasources",
"createdAt": "2018-02-26T19:32:52.904Z",
"updatedAt": "2018-02-26T19:32:52.904Z",
"importedDataset": {
"id": 23
}
},
"runParameters": {
"data": []
},
"connection": {
"name": "postgres",
"creator": {
"id": 1,
"email": "[email protected]",
"name": "Administrator"
},
"id": 3,
"inaccessibleToUser": true
},
"name": "My DB Table",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"workspace": {

"id": 1
}
}

Relational with Custom SQL Query

You can submit custom SQL queries to relational or Hive connections. These custom SQL queries can be used to
pre-filter the data inside the database, improving performance of the query and the overall dataset.
For more information, see Enable Custom SQL Query.
Request Body:
Notes:
See previous notes on queries to relational sources.
As part of the request body, you must submit the custom SQL query as the value for the raw property.

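The following example is valid for Hive connections, which quote identifiers with backticks. For databases that quote identifiers with double quotes, such as Oracle, escape each double-quote mark inside the raw value.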

NOTE: Syntax for the custom SQL query varies between relational systems. For more information on
syntax examples, see Create Dataset with SQL.

{
"visible": true,
"numFlows": 0,
"size": -1,
"type": "jdbc",
"jdbcType": "TABLE",
"connectionId": 1,
"raw": "SELECT * FROM `default`.`farmers_market_recipe_tri`",
"name": "Farmer's Market Data - Custom SQL Query"
}
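
When the query contains double-quote marks, serializing the request body with a JSON library handles the escaping. The sketch below is illustrative only: the helper name custom_sql_body and the Oracle-style schema and table names are hypothetical, and the fixed values are copied from the example above.

```python
import json


def custom_sql_body(connection_id, sql, name):
    """Build a request body that submits a custom SQL query as the raw value."""
    return {
        "visible": True,
        "numFlows": 0,
        "size": -1,
        "type": "jdbc",
        "jdbcType": "TABLE",
        "connectionId": connection_id,
        "raw": sql,
        "name": name,
    }


# Hypothetical query that quotes identifiers with double quotes.
body = custom_sql_body(
    1, 'SELECT * FROM "HR"."EMPLOYEES"', "Employees - Custom SQL Query"
)

# json.dumps escapes each embedded double quote as \" in the payload.
payload = json.dumps(body)
```

Decoding the payload on the receiving side restores the original query text, so the escaping only affects the serialized form.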

Response Body:
In the response, note that the source of the data is defined by the connectionId value and the SQL defined in
the raw value.

{
"jdbcTable": null,
"jdbcPath": null,
"columns": null,
"filter": null,
"raw": [
"SELECT * FROM `default`.`farmers_market_recipe_tri`"
],
"id": 21,
"size": "-1",
"path": null,
"dynamicPath": null,
"type": "jdbc",
"bucket": null,

"isSchematized": true,
"isDynamic": true,
"disableTypeInference": false,
"createdAt": "2018-02-26T19:25:33.000Z",
"updatedAt": "2018-02-26T19:25:33.884Z",
"parsingRecipe": {
"id": 36
},
"relationalSource": {
"relationalPath": null,
"columns": null,
"filter": null,
"raw": [
"SELECT * FROM `default`.`farmers_market_recipe_tri`"
],
"id": 3,
"tableName": null,
"createdAt": "2018-02-26T19:25:33.006Z",
"updatedAt": "2018-02-26T19:25:33.006Z",
"importedDataset": {
"id": 21
}
},
"runParameters": {
"data": []
},
"connection": {
"name": "hive",
"creator": {
"id": 1,
"email": "[email protected]",
"name": "Administrator"
},
"id": 1,
"inaccessibleToUser": true
},
"name": "Farmer's Market Data - Custom SQL Query",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"workspace": {

"id": 1
}
}

Reference

For more information on the properties of an imported dataset, see API ImportedDatasets Get v4.

API ImportedDatasets Delete v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Delete the specified dataset.


Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v4/importedDatasets/<id>

where:

Property Description

<id> Internal identifier for the imported dataset

Request URI - Example:

/v4/importedDatasets/2

Request Body:

Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

For more information on the properties of an imported dataset, see API ImportedDatasets Get v4.

API ImportedDatasets Get List v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get the list of accessible imported datasets for the authenticated user.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/importedDatasets

Endpoint with embedded reference:

/v4/importedDatasets/?embed=connection

For more information, see API ImportedDatasets Get v4.


Endpoint with paged retrieval:

By default, this endpoint returns results in sets of 25.

You can apply query parameters to change the size of the default set and to page through result sets. The
following example retrieves up to 100 results per request, beginning at an offset of 2:

/v4/importedDatasets?embed=connection&limit=100&offset=2

If the count of retrieved results is less than the limit, you have reached the end of the results.
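
The paging rule above can be sketched as a loop that keeps requesting pages until one comes back shorter than the limit. In this sketch, fetch_page is a caller-supplied function (an assumption, not part of the API) that performs the GET request with the given limit and offset and returns the parsed JSON response. A stand-in that serves 60 fake records is used here in place of a real HTTP call.

```python
def iter_imported_datasets(fetch_page, limit=25):
    """Yield every dataset, requesting pages until a short page is returned.

    fetch_page(limit, offset) must return the parsed response, whose
    "data" key holds the list of datasets for that page.
    """
    offset = 0
    while True:
        items = fetch_page(limit, offset)["data"]
        yield from items
        # Fewer results than the limit means the end has been reached.
        if len(items) < limit:
            break
        offset += limit


def fake_fetch(limit, offset):
    """Stand-in for the HTTP request: serves 60 fake datasets."""
    records = [{"id": i} for i in range(60)]
    return {"data": records[offset:offset + limit]}


datasets = list(iter_imported_datasets(fake_fetch, limit=25))
```

When the total count is an exact multiple of the limit, the loop issues one extra request that returns an empty page and then stops.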
Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:


In the following example, you can see the data for two separate datasets:
JDBC
File-based (uploaded to S3)

{
"data": [
{
"id": 56,
"size": "-1",
"path": null,
"dynamicPath": null,
"type": "jdbc",
"bucket": null,
"isSchematized": true,
"isDynamic": false,
"disableTypeInference": false,
"createdAt": "2018-01-31T23:51:36.179Z",
"updatedAt": "2018-01-31T23:51:37.025Z",
"parsingRecipe": {
"id": 111
},
"relationalSource": {
"relationalPath": [
"public"
],
"columns": [
"start_date",
"end_date"
],
"filter": null,
"raw": null,
"id": 10,
"tableName": "datetable",
"createdAt": "2018-01-31T23:51:36.187Z",
"updatedAt": "2018-01-31T23:51:36.187Z",
"importedDataset": {

"id": 56
}
},
"runParameters": {
"data": []
},
"name": "datetable",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"connection": {
"id": 2
}
},
{
"id": 51,
"size": "268405298",
"path":
"/server/uploads/1/462b6ec4-6f8d-408b-9caa-75b803ae8a86/MarketingAnalytics%20Extract.csv",
"dynamicPath": null,
"type": "s3",
"bucket": "3facs3",
"isSchematized": true,
"isDynamic": false,
"disableTypeInference": false,
"createdAt": "2018-01-25T22:05:55.270Z",
"updatedAt": "2018-01-25T22:05:56.007Z",
"parsingRecipe": {
"id": 80
},
"runParameters": {
"data": []
},
"name": "MarketingAnalytics Extract.csv",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"connection": null

}
]
}

Reference

For more information on the properties of an imported dataset, see API ImportedDatasets Get v4.

API ImportedDatasets Get v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get the specified imported dataset.


Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/importedDatasets/<id>

where:

Parameter Description

<id> Internal identifier for the imported dataset

Endpoint with embedded reference:

Use the following embedded reference to include in the response data about the connection used to acquire the
source dataset, if it was created from a Hive or relational connection.

/v4/importedDatasets/<id>?embed=connection

Request URI - Example:

/v4/importedDatasets/63

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:


The following response comes from an uploaded file.

{
"path":
"/trifacta/uploads/1/3630243f-4a20-4996-8fa4-cb190c565358/POS-r01.txt",
"bucket": null,
"container": null,
"type": "hdfs",
"blobHost": null,
"isDynamicOrConverted": false,
"id": 9,
"dynamicPath": null,
"isSchematized": true,
"isDynamic": false,
"isConverted": false,
"disableTypeInference": false,
"hasStructuring": true,
"createdAt": "2019-01-28T19:54:47.667Z",
"updatedAt": "2019-01-28T19:54:47.847Z",
"runParameters": {
"data": []
},
"storageLocation": {
"fullUri":
"hdfs:///trifacta/uploads/1/3630243f-4a20-4996-8fa4-cb190c565358/POS-r01.txt",
"id": 34,
"path":
"/trifacta/uploads/1/3630243f-4a20-4996-8fa4-cb190c565358/POS-r01.txt",
"size": "292817",
"workspaceId": 1,
"type": "hdfs",
"bucket": null,
"blobHost": null,
"container": null,
"hash": "215944e85eaabf9fd7ae837b36ff80711abe7ae1",
"createdAt": "2019-01-28T19:54:47.664Z",
"updatedAt": "2019-01-28T19:54:47.664Z"
},
"size": "292817",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"workspace": {
"id": 1
},
"parsingRecipe": {
"id": 25
},
"connection": null
}

Response Body Example with embedded reference:
The following response comes from a relational source and includes embedded information on the connection
used to import the data.

{
"jdbcTable": "datetable",
"jdbcPath": [
"public"
],
"columns": [
"start_date",
"end_date"
],
"filter": null,
"raw": null,
"id": 56,
"size": "-1",
"path": null,
"dynamicPath": null,
"type": "jdbc",
"bucket": null,
"isSchematized": true,
"isDynamic": false,
"disableTypeInference": false,
"createdAt": "2018-01-31T23:51:36.179Z",
"updatedAt": "2018-01-31T23:51:37.025Z",
"connection": {
"id": 2,
"name": "redshift",
"description": "",
"type": "jdbc",
"isGlobal": true,
"credentialType": "custom",
"credentialsShared": true,
"uuid": "c54bbec0-e05f-11e7-aa39-995f61171ffd",
"disableTypeInference": false,
"createdAt": "2017-12-13T23:45:59.468Z",
"updatedAt": "2017-12-13T23:46:09.039Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
}
},
"parsingRecipe": {
"id": 111
},
"relationalSource": {
"relationalPath": [
"public"
],

"columns": [
"start_date",
"end_date"
],
"filter": null,
"raw": null,
"id": 10,
"tableName": "datetable",
"createdAt": "2018-01-31T23:51:36.187Z",
"updatedAt": "2018-01-31T23:51:36.187Z",
"importedDataset": {
"id": 56
}
},
"runParameters": {
"data": []
},
"name": "datetable",
"description": "my table of dates",
"creator": {
"id": 1
},
"updater": {

"id": 1
}
}

Response Body Example - dataset with parameters:


The following example body illustrates a dataset (id=29) that has been created with a regular expression pattern
parameter.

{
"id": 29,
"size": "292817",
"path":
"/trifacta/uploads/1/efeb54fc-efee-4d5f-a92b-a44c09c60aaa/POS-r01.txt",
"dynamicPath":
"/trifacta/uploads/1/efeb54fc-efee-4d5f-a92b-a44c09c60aaa/POS-r.txt",
"type": "hdfs",
"bucket": null,
"blobHost": null,
"container": null,
"isSchematized": true,
"isDynamic": true,
"disableTypeInference": false,
"createdAt": "2018-03-26T22:33:17.386Z",
"updatedAt": "2018-03-26T22:33:18.337Z",
"parsingRecipe": {
"id": 43
},
"runParameters": {
"data": [
{
"value": {
"pattern": {
"regex": {
"value": "[0-9][0-9]"
}
}
},
"insertionIndices": [
{
"index": 62,
"order": 0
}
],
"id": 2,
"type": "path",
"createdAt": "2018-03-26T22:33:17.533Z",
"updatedAt": "2018-03-26T22:33:17.662Z",
"runParameterEdit": {
"value": {
"pattern": {
"regex": {

"value": "[0-9][0-9]"
}
}
},
"insertionIndices": [
{
"index": 62,
"order": 0
}
],
"id": 2,
"overrideKey": null,
"createdAt": "2018-03-26T22:33:17.658Z",
"updatedAt": "2018-03-26T22:33:17.658Z",
"runParameter": {
"id": 2
},
"creator": {
"id": 1
},
"updater": {
"id": 1
}
},
"importedDataset": {
"id": 29
},
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"overrideKey": null
}
]
},
"name": "Dataset with Parameters",
"description": "Dataset with parameters using regular expressions",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"connection": null

}

Reference

Common Properties:
The following properties are common to file-based and JDBC datasets.

Property Description

path For HDFS and S3 file sources, this value defines the path to the source.

For JDBC sources, this value is not specified.

For uploaded sources, this value specifies the location on the default backend storage layer where the dataset has
been uploaded.

bucket (If type=s3) Bucket on S3 where source is stored.

container (Azure only) If the dataset is stored on ADLS, this value specifies the container on the blob host where the
source is stored.

type Identifies the type of storage where the source is located. Values:

hdfs
s3
jdbc

blobHost (Azure only) If the dataset is stored on ADLS, this value specifies the blob host where the source is stored.

isDynamicOrConverted Property is true if the dataset is either a dynamic or converted dataset.

id Internal identifier of the imported dataset

dynamicPath (Dataset with parameters only) Specifies the path without the parameters inserted into it. Full path is defined based
on this value and the data in the runParameters area.

isSchematized (If source file is Avro, or type=jdbc) If true, schema information is available for the source.

isDynamic If true, the imported dataset is a dynamic dataset (dataset with parameters). For more information, see
Overview of Parameterization.

isConverted If true, the imported dataset has been converted to CSV format for storage.

disableTypeInference If true, the initial type inferencing performed on schematized sources by the Trifacta platform is disabled for this
source. For more information, see Configure Type Inference.

hasStructuring If true, initial parsing steps have been applied to the dataset.

createdAt Timestamp for when the dataset was imported

updatedAt Timestamp for when the dataset was last updated

runParameters If runtime parameters have been applied to the dataset, they are listed here. See below for more information.

size Size of the file in bytes (if applicable)

name Internal name of the imported dataset

description User-friendly description for the imported dataset

creator.id Internal identifier of the user who created the imported dataset

updater.id Internal identifier of the user who last updated the imported dataset

workspace.id Internal identifier of the workspace into which the dataset has been imported.

parsingRecipe.id If initial parsing is applied, this value contains the internal identifier of the recipe that performs the parsing.

connection.id Internal identifier of the connection to the server hosting the dataset.

If this value is null, the file was uploaded from a local file system.

To acquire the entire connection for this dataset, you can use either of the following endpoints:

/v4/importedDatasets?embed=connection
/v4/importedDatasets/:id?embed=connection

For more information, see API Connections Get v4.

runParameters reference:
The following properties are available in the runParameters area:

Property Description

value.pattern.regex.value Regular expression that is applied to the path.

insertionIndices.index Index value for the location in the path where the parameter is applied

insertionIndices.order Any applicable ordering applied to the values in the parameter.

0 - ascending
1 - descending

id Internal identifier for the parameter.

type Type of parameter. This value must be path.

createdAt Timestamp for when the parameter was created

updatedAt Timestamp for when the parameter was updated

runParameterEdit Any runtime overrides applied to the parameter during job execution

importedDataset.id Internal identifier for the dataset to which the parameter is applied

creator.id Internal identifier of the user who created the dataset with parameters

updater.id Internal identifier of the last user who modified the dataset with parameters

overrideKey Any override values applied to the dataset with parameters at run time
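
To illustrate how dynamicPath and the runParameters data might combine, the sketch below builds a regular expression that matches the full paths of a dataset with parameters. It assumes that insertionIndices.index is a character offset into dynamicPath at which the pattern is inserted. That reading is consistent with the example response (index 62 falls immediately after POS-r), but it is an interpretation and not a documented contract. The function name path_pattern is hypothetical.

```python
import re


def path_pattern(dynamic_path, run_parameters):
    """Compile a regex matching full paths for a dataset with parameters.

    Assumes each insertion index is a character offset into dynamic_path.
    """
    insertions = []
    for param in run_parameters:
        regex = param["value"]["pattern"]["regex"]["value"]
        for insertion in param["insertionIndices"]:
            insertions.append((insertion["index"], regex))
    insertions.sort(key=lambda pair: pair[0])

    pieces = []
    cursor = 0
    for index, regex in insertions:
        # Literal path text is escaped; the parameter regex is used as-is.
        pieces.append(re.escape(dynamic_path[cursor:index]))
        pieces.append("(?:" + regex + ")")
        cursor = index
    pieces.append(re.escape(dynamic_path[cursor:]))
    return re.compile("".join(pieces))


dynamic_path = "/trifacta/uploads/1/efeb54fc-efee-4d5f-a92b-a44c09c60aaa/POS-r.txt"
run_parameters = [
    {
        "value": {"pattern": {"regex": {"value": "[0-9][0-9]"}}},
        "insertionIndices": [{"index": 62, "order": 0}],
        "type": "path",
    }
]
pattern = path_pattern(dynamic_path, run_parameters)
```

Under this assumption, the compiled pattern matches the path value from the example (ending in POS-r01.txt) and rejects the bare dynamicPath, which has no two-digit region value.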

storageLocation reference:
The following properties are available in the storageLocation area:

Property Description

fullUri The full URI to the location where the dataset is stored.

id Internal identifier of the storage location

path For HDFS and S3 file sources, this value defines the path to the source.

For JDBC sources, this value is not specified.

For uploaded sources, this value specifies the location on the default backend storage layer where the dataset has been
uploaded.

size Size of the file in bytes (if applicable)

workspaceId Internal identifier of the workspace into which the dataset has been imported.

type Identifies the type of storage where the source is located. Values:

hdfs
s3
jdbc

bucket (If type=s3) Bucket on S3 where source is stored.

blobHost (Azure only) If the dataset is stored on ADLS, this value specifies the blob host where the source is stored.

container (Azure only) If the dataset is stored on ADLS, this value specifies the container on the blob host where the source is
stored.

hash Hash value for the imported dataset.

Tip: Changes in this value indicate that the source file has been modified.

createdAt Timestamp for when the dataset was imported

updatedAt Timestamp for when the dataset was last updated

Hive or Relational Source:


If the source data is from Hive or a relational system (type=jdbc), the following properties contain information on
the source table, the imported columns, and any custom SQL filters applied to the table. Other properties are part
of the common set.

Property Description

jdbcTable Name of the table from which the data is extracted.

If a custom SQL query has been applied, this value is null.

jdbcPath Name of the database from which the source was queried.

If a custom SQL query has been applied, this value is null.

columns List of columns imported from the source, pre-filtered.

If a custom SQL query has been applied, this value is null.

filter This value is empty.

raw If custom SQL has been applied to the data source to filter the data before it is imported, all SQL statements are listed.

For more information, see Enable Custom SQL Query.

id Internal identifier for the relational source

size Size in bytes of the data. For relational sources, this value is -1, as the size is not available.
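
The relationship between these properties can be sketched as a small helper that reports whether a type=jdbc dataset was defined by a table or by custom SQL. The helper name describe_jdbc_source and the shape of its return value are assumptions made for illustration. The input dictionaries are trimmed from the response examples in this guide.

```python
def describe_jdbc_source(dataset):
    """Summarize where a Hive or relational imported dataset gets its data."""
    if dataset.get("type") != "jdbc":
        raise ValueError("not a Hive or relational dataset")
    raw = dataset.get("raw")
    if raw:
        # Custom SQL: jdbcTable, jdbcPath, and columns are null.
        return {"kind": "custom_sql", "statements": list(raw)}
    return {
        "kind": "table",
        "table": ".".join(dataset["jdbcPath"] + [dataset["jdbcTable"]]),
        "columns": dataset["columns"],
    }


table_dataset = {
    "type": "jdbc",
    "jdbcTable": "datetable",
    "jdbcPath": ["public"],
    "columns": ["start_date", "end_date"],
    "raw": None,
}
sql_dataset = {
    "type": "jdbc",
    "jdbcTable": None,
    "jdbcPath": None,
    "columns": None,
    "raw": ["SELECT * FROM `default`.`farmers_market_recipe_tri`"],
}
table_info = describe_jdbc_source(table_dataset)
sql_info = describe_jdbc_source(sql_dataset)
```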

File:
File-based datasets support the common properties only.
Embedded connection:
For more information on the properties when the connection is embedded in the response, see
API Connections Create v4.

API ImportedDatasets Patch v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Modify the specified imported dataset.


Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PATCH

Endpoint:

/v4/importedDatasets/<id>

where:

Parameter Description

<id> Internal identifier for the imported dataset

Request URI - Example:

/v4/importedDatasets/8

Request Body:
Only the name and description properties should be modified. Modified properties must be included in the request.
In the following example, the name and the description of the imported dataset are modified:

{
"name": "My Imported DS",
"description": "This is my imported dataset."
}
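
Because only name and description should be modified, a client can guard against submitting other properties by mistake. The following is a minimal sketch of such a check; the function name imported_dataset_patch_body is hypothetical, and the check is enforced on the client side only.

```python
import json

# Properties that this endpoint is documented to accept.
PATCHABLE_PROPERTIES = {"name", "description"}


def imported_dataset_patch_body(**changes):
    """Serialize a PATCH body, rejecting properties other than name/description."""
    if not changes:
        raise ValueError("at least one property must be provided")
    unexpected = set(changes) - PATCHABLE_PROPERTIES
    if unexpected:
        raise ValueError(f"properties cannot be patched: {sorted(unexpected)}")
    return json.dumps(changes)


patch_body = imported_dataset_patch_body(
    name="My Imported DS", description="This is my imported dataset."
)
```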

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 8,
"updater": {
"id": 1
},
"updatedAt": "2019-02-14T23:19:27.648Z"
}

Reference

For more information on the properties of an imported dataset, see API ImportedDatasets Get v4.

API ImportedDatasets Post AddToFlow v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Add the specified imported dataset to a flow based on its internal identifier.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/importedDatasets/<id>/addToFlow

where:

Parameter Description

<id> Internal identifier for the imported dataset

Request URI - Example:

/v4/importedDatasets/18/addToFlow

Request Body:

{
"flow": {
"id": 12
}
}

Response

Response Status Code - Success: 201 - Created

Response Body:

{
"id": 47,
"wrangled": false,
"updatedAt": "2019-02-12T00:51:59.961Z",
"createdAt": "2019-02-12T00:51:59.931Z",
"referenceInfo": null,
"activeSample": {
"id": 52
},
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"recipe": {
"id": 37
},
"referencedFlowNode": null,
"flow": {
"id": 12
}
}

Reference

Property Description

id Internal identifier for the new wrangled dataset.

wrangled This value is always true.


updatedAt Timestamp for when the dataset was updated.

createdAt Timestamp for when the dataset was created.

referenceInfo Reference information for the new object in the flow. Since the dataset has just been added, this value should
be null.

activeSample.id Internal identifier for the currently active sample for the dataset.

If null, there is no currently active sample for the dataset.

creator.id Internal identifier of the user who created the flow.

updater.id Internal identifier of the user who performed the update.

recipe.id Internal identifier for the recipe for the dataset.

If null, the dataset has not been wrangled in the Transformer page and does not have initial parsing steps.

referencedFlowNode Internal identifier of the node of the flow that this dataset references. Since this dataset is an imported dataset,
there is no reference. This value should be null.

flow.id Internal identifier of the flow that contains this dataset.

For more information on the other properties, see API ImportedDatasets Get v4.

API JobGroups Cancel v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Cancel the in-progress job for the specified jobGroup.

NOTE: If the job has completed, this endpoint does nothing. You must delete the completed job instead.
See API JobGroups Delete v4.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Either of the following can execute this endpoint:

The user who created the job
A Trifacta administrator

Request

Request Type: POST

Endpoint:

/v4/jobGroups/<id>/cancel

where:

Parameter Description

<id> Internal identifier for the job group

Request URI - Example:

/v4/jobGroups/14/cancel

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Reference

For more information on the available status messages, see API JobGroups Get v4.

API JobGroups Create v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Create a jobGroup, which launches the specified job as the authenticated user.
The request specification depends on one of the following conditions:
Dataset has already had a job run against it and just needs to be re-run.
Dataset has not had a job run, or the job definition needs to be re-specified.

NOTE: In this release, you cannot execute jobs sourced from datasets in Redshift or SQL DW or publish
to these locations via the API. This known issue will be fixed in a future release.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/jobGroups

Request Body - Run job:


To run a job, you just specify the recipe identifier (wrangledDataset.id). If the job is successful, all defined
outputs are generated, as defined in the outputobject, publications, and writesettings objects associated with the
recipe.

{
"wrangledDataset": {
"id": 7
}
}
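
As an illustration, the request above could be assembled in Python as follows. This is a sketch only: the base URL http://localhost:3005 is a placeholder, build_run_job_request is a hypothetical helper, the Content-Type header is an assumption, and the authentication headers (see API Authentication) must be supplied by the caller. The request is built but not sent.

```python
import json
import urllib.request


def build_run_job_request(base_url, recipe_id, headers=None):
    """Build (but do not send) a POST /v4/jobGroups request for a recipe."""
    body = json.dumps({"wrangledDataset": {"id": recipe_id}}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v4/jobGroups",
        data=body,
        method="POST",
        # Content-Type is an assumption; add authentication headers here.
        headers={"Content-Type": "application/json", **(headers or {})},
    )


req = build_run_job_request("http://localhost:3005", 7)
print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Passing the resulting object to urllib.request.urlopen would submit it to the platform.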

Request Body - Specify job:


If you must change some outputs or other settings for the specific job, you can insert these changes in the
overrides section of the request. Below, the running environment (photon), profiling option, and writesettings
for the job are modified for this execution:

{
  "wrangledDataset": {
    "id": 1
  },
  "overrides": {
    "execution": "photon",
    "profiler": false,
    "writesettings": [
      {
        "path": "hdfs://hadoop:50070/trifacta/queryResults/[email protected]/cdr_txt.csv",
        "action": "create",
        "format": "csv",
        "compression": "none",
        "header": false,
        "asSingleFile": false
      }
    ]
  },
  "ranfrom": "cli"
}
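
A body like the one above can also be generated programmatically. The sketch below validates the running environment against the values listed in the Request Reference (photon, spark) before building the body. build_job_spec is a hypothetical helper, and the output path in the example is a placeholder.

```python
# Accepted values for overrides.execution, per the Request Reference.
ACCEPTED_EXECUTION = {"photon", "spark"}


def build_job_spec(recipe_id, execution, profiler, writesettings, ranfrom="cli"):
    """Return a request body for POST /v4/jobGroups with an overrides section."""
    if execution not in ACCEPTED_EXECUTION:
        raise ValueError(f"unsupported running environment: {execution!r}")
    return {
        "wrangledDataset": {"id": recipe_id},
        "overrides": {
            "execution": execution,
            "profiler": bool(profiler),
            "writesettings": list(writesettings),
        },
        "ranfrom": ranfrom,
    }


spec = build_job_spec(
    recipe_id=1,
    execution="photon",
    profiler=False,
    writesettings=[{
        "path": "hdfs://hadoop:50070/example/cdr_txt.csv",  # placeholder path
        "action": "create",
        "format": "csv",
        "compression": "none",
        "header": False,
        "asSingleFile": False,
    }],
)
print(spec["overrides"]["execution"])
```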

Request Body - Specify job for dataset with parameters:


If you have created a dataset with parameters, you can specify overrides for parameter values during execution
through the APIs. Through this method, you can iterate job executions across all matching sources of a
parameterized dataset. For more information on creating datasets with parameters, see
Overview of Parameterization.
In the following, the runParameters override has been specified for the varRegion variable. In this case, the
value 02 is inserted for the specified variable as part of the job execution.

{
  "wrangledDataset": {
    "id": 33
  },
  "overrides": {
    "execution": "photon",
    "profiler": false,
    "writesettings": [
      {
        "path": "hdfs://hadoop:50070/trifacta/queryResults/[email protected]/cdr_txt.csv",
        "action": "create",
        "format": "csv",
        "compression": "none",
        "header": false,
        "asSingleFile": false
      }
    ],
    "runParameters": {
      "overrides": {
        "data": [
          {
            "key": "varRegion",
            "value": "02"
          }
        ]
      }
    }
  },
  "ranfrom": "cli"
}
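
To iterate job executions across several parameter values, the runParameters section can be generated per value. The following is a minimal sketch, assuming a base job specification already exists. with_run_parameters is a hypothetical helper, and the region values 01 and 03 are invented for illustration.

```python
import copy


def with_run_parameters(job_spec, values):
    """Return a copy of job_spec with runParameters overrides for `values`."""
    spec = copy.deepcopy(job_spec)  # leave the caller's dict untouched
    data = [{"key": key, "value": value} for key, value in values.items()]
    spec.setdefault("overrides", {})["runParameters"] = {
        "overrides": {"data": data}
    }
    return spec


base = {
    "wrangledDataset": {"id": 33},
    "overrides": {"execution": "photon", "profiler": False},
    "ranfrom": "cli",
}
# One request body per region; each would be posted to /v4/jobGroups.
specs = [with_run_parameters(base, {"varRegion": r}) for r in ("01", "02", "03")]
print(len(specs))
```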

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"reason": "JobStarted",
"sessionId": "eb3e98e0-02e3-11e8-a819-25c9559a2a2c",
"id": 9
}
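
A client usually needs the returned id to track the job group afterward. The sketch below checks the status code and the reason value described in this section and returns the identifiers. parse_job_started is a hypothetical helper name.

```python
import json


def parse_job_started(status_code, body_text):
    """Validate a create-jobGroup response and return (id, sessionId)."""
    if status_code != 201:
        raise RuntimeError(f"unexpected status code: {status_code}")
    payload = json.loads(body_text)
    if payload.get("reason") != "JobStarted":
        raise RuntimeError(f"unexpected reason: {payload.get('reason')!r}")
    return payload["id"], payload["sessionId"]


example = """{
  "reason": "JobStarted",
  "sessionId": "eb3e98e0-02e3-11e8-a819-25c9559a2a2c",
  "id": 9
}"""
print(parse_job_started(201, example))
```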

Reference

Request Reference:

Property Description

wrangledDataset (required) Internal identifier for the object whose results you wish to generate. The recipes of all
preceding datasets on which this dataset depends are executed as part of the job.

overrides.execution (required, if first time running the job) Indicates the running environment on which the job is
executed. Accepted values:

photon
spark

For more information, see Running Environment Options.

overrides.profiler (required, if first time running the job) When set to true, a visual profile of the job is generated as
specified by the profiling options for the platform. See Profiling Options.

overrides.writesettings (required, if first time running the job) These settings define the publishing options for the job.

ranfrom (optional) If this value is set to null, then the job does not show up in the Job Details page.

If set to cli, the job appears as a CLI job.

See Job Details Page.

Response reference:

Property Description

reason Current state of the job group at time of API call. Since this call creates the job group, this value is always JobStarted in
the response to this call.

sessionId Session identifier for the job group.

id Internal identifier of the job group.

API JobGroups Delete v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Delete the specified jobGroup.


Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v4/jobGroups/<id>

where:

Parameter Description

<id> Internal identifier for the job group

Request URI - Example:

/v4/jobGroups/2

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

None.
API JobGroups Get Jobs v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get list of jobs for the specified jobGroup. For more information on jobGroups, see API JobGroups Get v4.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/jobGroups/<id>/jobs

where:

Parameter Description

<id> Internal identifier for the job group

Request URI - Example:

/v4/jobGroups/20/jobs

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"data": [
{
"id": 41,
"status": "Complete",
"jobType": "wrangle",
"sampleSize": 100,
"percentComplete": 100,
"lastHeartbeatAt": "2019-02-11T23:55:32.604Z",
"createdAt": "2019-02-11T23:55:32.044Z",
"updatedAt": "2019-02-11T23:55:34.563Z",
"creator": {
"id": 1
},
"jobGroup": {
"id": 20
},
"errorMessage": null
},
{
"id": 42,
"status": "Complete",
"jobType": "filewriter",
"sampleSize": 100,
"percentComplete": 100,
"lastHeartbeatAt": "2019-02-11T23:55:34.676Z",
"createdAt": "2019-02-11T23:55:32.087Z",
"updatedAt": "2019-02-11T23:55:35.006Z",
"creator": {
"id": 1
},
"jobGroup": {
"id": 20
},

"errorMessage": null
},
{
"id": 43,
"status": "Complete",
"jobType": "filewriter",
"sampleSize": 100,
"percentComplete": 100,
"lastHeartbeatAt": "2019-02-11T23:55:34.708Z",
"createdAt": "2019-02-11T23:55:32.089Z",
"updatedAt": "2019-02-11T23:55:35.439Z",
"creator": {
"id": 1
},
"jobGroup": {
"id": 20
},
"errorMessage": null

}
]
}
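
A response of this shape can be condensed into a count of jobs per type plus a list of failures. The following sketch assumes the response has already been parsed from JSON. summarize_jobs is a hypothetical helper, and the sample input keeps only the fields it reads.

```python
from collections import Counter


def summarize_jobs(response):
    """Return ({jobType: count}, [(job id, errorMessage) for failed jobs])."""
    jobs = response.get("data", [])
    by_type = Counter(job["jobType"] for job in jobs)
    failures = [
        (job["id"], job["errorMessage"])
        for job in jobs
        if job["status"] == "Failed"
    ]
    return dict(by_type), failures


# Trimmed copy of the example response above.
sample = {
    "data": [
        {"id": 41, "status": "Complete", "jobType": "wrangle", "errorMessage": None},
        {"id": 42, "status": "Complete", "jobType": "filewriter", "errorMessage": None},
        {"id": 43, "status": "Complete", "jobType": "filewriter", "errorMessage": None},
    ]
}
print(summarize_jobs(sample))
```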

Reference

Property Description

id Internal identifier of the job within the job group.

status Current status of the job. See API JobGroups Get v4.

jobType Type of job. Values:

filewriter - output results to a specified file.
ingest - internal job for transferring data into HDFS.
profile - job to compute statistical information about a set of results.
publish - job to publish previously wrangled results to a new destination.
wrangle - execute the specified set of Wrangle steps on a dataset.

sampleSize Size of sample as a percentage of the entire dataset. 100 means that the entire dataset is used as the sample.

percentComplete Percentage of completion of the job at the time of the request. 100 means that the job has finished or failed.

lastHeartbeatAt Timestamp for last automated status check on the job

createdAt Timestamp for when the job was launched.

updatedAt Timestamp for when the job was last updated.

creator.id Internal identifier of the user who launched the job.

jobGroup.id Internal identifier for the job group to which the job belongs.

errorMessage If the job failed, any error message is displayed here.

For more information on the other properties, see API JobGroups Get v4.
API JobGroups Get List v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get list of all jobGroups accessible to the authenticated user.


Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/jobGroups

Endpoint with embedded reference:

/v4/jobGroups/?embed=jobs,wrangledDataset

For more information, see API JobGroups Get v4.


Endpoint with paged retrieval:

By default, this endpoint returns results in sets of 25.


You can apply query parameters to change the size of the default set and to page through result sets. The
following example queries for results 100 at a time. In this case, the query asks for results 201-300:

/v4/jobGroups/?embed=jobs,wrangledDataset&limit=100&offset=2

If the count of retrieved results is less than the limit, you have reached the end of the results.
Request Body:
Empty.
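
The stopping rule for paged retrieval (a result set smaller than the limit marks the end) can be sketched as a generator. This is an illustration only: fetch_page is a caller-supplied function that performs the actual GET, and the default offset_for_page follows the example in this section, where offset=2 with limit=100 yields results 201-300. If the platform instead expects a record count to skip, pass a different offset_for_page.

```python
def iter_job_groups(fetch_page, limit=25,
                    offset_for_page=lambda page, limit: page):
    """Yield job groups page by page until a short page is returned."""
    page = 0
    while True:
        batch = fetch_page(limit=limit, offset=offset_for_page(page, limit))
        yield from batch
        if len(batch) < limit:
            return  # fewer results than the limit: end of the results
        page += 1


# Stand-in for the real endpoint, backed by an in-memory list.
records = list(range(60))


def fake_fetch(limit, offset):
    start = offset * limit
    return records[start:start + limit]


print(len(list(iter_job_groups(fake_fetch, limit=25))))
```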

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"data": [
{
"id": 20,
"status": "Complete",
"ranfrom": "ui",
"ranfor": "recipe",
"createdAt": "2019-02-11T23:55:31.804Z",
"profilingEnabled": true,
"updatedAt": "2019-02-11T23:55:35.445Z",
"runParameterReferenceDate": "2019-02-11T23:55:31.000Z",
"name": null,
"wrangledDataset": {
"id": 45,
"flow": {

"id": 11,
"name": "2013 POS",
"associatedPeople": {
"data": [
{
"id": 1,
"isAdmin": true,
"flowPermission": {
"role": "owner",
"person": {
"id": 1
},
"flow": {
"id": 11
}
}
}
]
}
}
},
"jobGroupRunParameterOverrides": {
"data": []
},
"runParameterEdits": {
"data": []
},
"creator": {
"id": 1
},
"updater": {
"id": 1
}
},
{
"id": 2,
"status": "Complete",
"ranfrom": "ui",
"ranfor": "recipe",
"createdAt": "2019-02-11T11:49:44.642Z",
"profilingEnabled": true,
"updatedAt": "2019-02-11T11:51:30.325Z",
"runParameterReferenceDate": "2019-02-11T11:49:44.000Z",
"name": null,
"wrangledDataset": {
"id": 8
},
"jobGroupRunParameterOverrides": {
"data": []
},
"runParameterEdits": {
"data": []
},

"creator": {
"id": 3
},
"updater": {
"id": 3
}

}
]
}

Reference

Some properties related to the jobGroup appear only in this endpoint. They are listed below.
For more information on the properties of a jobGroup, see API JobGroups Get v4.

Property Description

wrangledDataset.flow.id Internal identifier for the flow containing the recipe that was executed with the job.

wrangledDataset.flow.name Name of the flow containing the recipe that was executed with the job.

wrangledDataset.flow.associatedPeople.* All users who have access to the flow and their roles

jobGroupRunParameterOverrides.* These values define any parameter overrides that were applied during the job in the following
key-value form:

"jobGroupRunParameterOverrides": {
  "data": [
    {
      "value": {
        "variable": {
          "value": "basic_types1"
        }
      },
      "id": 1,
      "overrideKey": "name",
      "isImplicit": false,
      "createdAt": "2018-03-21T06:56:57.042Z",
      "updatedAt": "2018-03-21T06:56:57.042Z",
      "jobGroup": {
        "id": 93
      }
    }
  ]
},

runParameterEdits.* runParameterEdits contains the state history of all parameters and their values during
job execution
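
The nested override entries can be flattened into a simple key-value mapping for logging or comparison. The sketch below assumes every entry carries its value under value.variable.value, as in the example above; entries of another shape would need extra handling. flatten_overrides is a hypothetical helper.

```python
def flatten_overrides(job_group):
    """Map each overrideKey to the value applied during the job."""
    entries = job_group.get("jobGroupRunParameterOverrides", {}).get("data", [])
    return {
        entry["overrideKey"]: entry["value"]["variable"]["value"]
        for entry in entries
    }


job_group = {
    "id": 93,
    "jobGroupRunParameterOverrides": {
        "data": [
            {
                "value": {"variable": {"value": "basic_types1"}},
                "id": 1,
                "overrideKey": "name",
                "isImplicit": False,
            }
        ]
    },
}
print(flatten_overrides(job_group))
```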

API JobGroups Get Publications v4

This is the latest version of the APIs.

Contents:

Required Permissions
Request
Response
Reference

Get list of publications for the specified jobGroup. A publication is an export of job results from the platform after
they have been initially generated.
For more information on publications, see API Publications Get v4.
For more information on jobGroups, see API JobGroups Get v4.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/jobGroups/<id>/publications

where:

Parameter Description

<id> Internal identifier for the job group

Request URI - Example:

/v4/jobGroups/22/publications

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:


The following example publication is for an export to a Hive table.

{
"data": [
{
"path": [
"default"
],

"id": 6,
"tableName": "POSr01_tri",
"targetType": "hive",
"action": "create",
"createdAt": "2019-02-12T00:33:28.073Z",
"updatedAt": "2019-02-12T00:33:28.073Z",
"connection": {
"id": 1,
"host": "hadoop",
"port": 10000,
"vendor": "hive",
"params": {
"jdbc": "hive2",
"connectStringOptions": "",
"defaultDatabase": "default"
},
"ssl": false,
"vendorName": "hive",
"name": "hive",
"description": null,
"type": "jdbc",
"isGlobal": true,
"credentialType": "conf",
"credentialsShared": true,
"uuid": "6a4fbd20-2dd7-11e9-866f-a7fa75ecd0df",
"disableTypeInference": false,
"createdAt": "2019-02-11T08:31:23.635Z",
"updatedAt": "2019-02-11T08:31:23.650Z",
"credentials": [],
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"workspace": {
"id": 1
}
},
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"outputObject": null

}
]
}

Reference

For more information on the properties of a publication, see API Publications Get v4.
For more information on the other properties, see API JobGroups Get v4.
API JobGroups Get Status v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get current status of the specified jobGroup. For more information on jobGroups, see API JobGroups Get v4.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/jobGroups/<id>/status

where:

Parameter Description

<id> Internal identifier for the job group

Request URI - Example:

/v4/jobGroups/2/status

Request URI - Example for status across all jobs:

If you need to check the status of all jobs, you can use the following endpoint, querying for status values of
Failed:

/v4/jobGroups/?status=Failed

Request Body:
Empty.
Response Body:
Returned response is the jobGroup definition for all accessible jobGroups where status=Failed. See
API JobGroups Get v4.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

"Complete"

Reference

For more information on the available status messages, see API JobGroups Get v4.
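
A common use of this endpoint is polling until the job group finishes. The following is a minimal sketch under stated assumptions: fetch_status_body is a caller-supplied function that performs the GET and returns the raw response body, and Complete, Failed, and Canceled are treated as the terminal statuses based on the list in API JobGroups Get v4. The interval and check limit are arbitrary.

```python
import json
import time

# Statuses after which no further change is expected (assumption).
TERMINAL = {"Complete", "Failed", "Canceled"}


def wait_for_job_group(fetch_status_body, interval=5.0, max_checks=120,
                       sleep=time.sleep):
    """Poll until the job group reaches a terminal status, then return it."""
    for _ in range(max_checks):
        # The response body is a JSON string such as "Complete".
        status = json.loads(fetch_status_body())
        if status in TERMINAL:
            return status
        sleep(interval)
    raise TimeoutError("job group did not reach a terminal status")


# Simulated sequence of response bodies; the no-op sleep avoids real delays.
bodies = iter(['"Pending"', '"InProgress"', '"Complete"'])
print(wait_for_job_group(lambda: next(bodies), sleep=lambda _seconds: None))
```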
API JobGroups Get v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get information on the specified job group. A job group is a job that is executed from a specific node in a flow.
The job group may contain:
Wrangling job on the dataset associated with the node
Jobs on all datasets on which the selected job may depend
A profiling job for the job group
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/jobGroups/<id>

where:

Parameter Description

<id> Internal identifier for the job group

Endpoint with embedded references:


Use the following endpoint to embed additional information:

Embed Description
Parameter

jobs Embed information about the child jobs within the job group. Array includes information on transformation, profiling, and
publishing jobs that are part of the job group.

wrangledDataset This field contains the internal identifier for the dataset on which the job was run.

/v4/jobGroups/<id>?embed=jobs,wrangledDataset
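
The endpoint path, with or without embedded references, can be built with a small helper. In this sketch, job_group_path is a hypothetical name, and the set of allowed embed values is limited to the two listed in the table above.

```python
from urllib.parse import urlencode

# Embed parameters documented for this endpoint.
EMBEDDABLE = ("jobs", "wrangledDataset")


def job_group_path(job_group_id, embed=()):
    """Return the request path for a job group, optionally with embeds."""
    unknown = [name for name in embed if name not in EMBEDDABLE]
    if unknown:
        raise ValueError(f"unsupported embed value(s): {unknown}")
    path = f"/v4/jobGroups/{job_group_id}"
    if embed:
        # safe="," keeps the comma-separated list readable in the URL.
        path += "?" + urlencode({"embed": ",".join(embed)}, safe=",")
    return path


print(job_group_path(8))
print(job_group_path(8, ["jobs", "wrangledDataset"]))
```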

Request URI - Example:

/v4/jobGroups/8

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 8,
"name": null,
"description": null,
"ranfrom": "ui",
"ranfor": "recipe",
"status": "Complete",
"profilingEnabled": true,
"runParameterReferenceDate": "2018-01-25T18:01:15.000Z",
"createdAt": "2018-01-25T18:01:16.456Z",
"updatedAt": "2018-01-25T18:01:21.082Z",
"workspace": {
"id": 1
},
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"snapshot": {
"id": 8
},
"wrangledDataset": {
"id": 12
},
"flowRun": null
}

Response Body Example with embedded reference:


The following example response includes embedded information on jobs and the recipe associated with it.

{
"id": 8,
"name": null,
"description": null,
"ranfrom": "ui",
"ranfor": "recipe",
"status": "Complete",
"profilingEnabled": true,
"runParameterReferenceDate": "2018-01-25T18:01:15.000Z",
"createdAt": "2018-01-25T18:01:16.456Z",
"updatedAt": "2018-01-25T18:01:21.082Z",
"jobs": {
"data": [
{
"id": 2,
"createdAt": "2018-01-25T18:01:16.687Z",
"updatedAt": "2018-01-25T18:01:21.071Z",

"status": "Complete",
"jobType": "filewriter",
"sampleSize": 100,
"percentComplete": 100,
"creator": {
"id": 1
},
"jobGroup": {
"id": 1
},
"errorMessage": null,
"scriptResult": {
"id": 1
},
"writeSetting": {
"id": 2
}
},
{
"id": 1,
"executionLanguage": "photon",
"cpJobId": null,
"templateLocation": null,
"createdAt": "2018-01-25T18:01:16.652Z",
"updatedAt": "2018-01-25T18:01:19.558Z",
"status": "Complete",
"jobType": "wrangle",
"sampleSize": 100,
"percentComplete": 100,
"creator": {
"id": 1
},
"jobGroup": {
"id": 1
},
"errorMessage": null,
"wrangleScript": {
"id": 9
}
}
]
},
"wrangledDataset": {
"id": 12,
"wrangled": true,
"referenceName": null,
"referenceDescription": null,
"createdAt": "2018-01-25T17:53:54.703Z",
"updatedAt": "2018-01-25T17:53:54.741Z",
"activeSample": {
"id": 12
},
"creator": {

"id": 1
},
"updater": {
"id": 1
},
"recipe": {
"id": 12
},
"flow": {
"id": 4
}
},
"workspace": {
"id": 1
},
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"snapshot": {
"id": 8

},
"flowRun": null
}
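
With jobs embedded, the timestamps can be used to estimate how long each child job took. This sketch treats the gap between createdAt and updatedAt as an approximation of elapsed time, which assumes the last update coincides with completion. parse_ts and job_durations are hypothetical helpers, and the sample keeps only the fields they read.

```python
from datetime import datetime, timezone


def parse_ts(value):
    """Parse a timestamp such as 2018-01-25T18:01:16.687Z as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def job_durations(job_group):
    """Return {(job id, jobType): seconds between createdAt and updatedAt}."""
    durations = {}
    for job in job_group.get("jobs", {}).get("data", []):
        elapsed = parse_ts(job["updatedAt"]) - parse_ts(job["createdAt"])
        durations[(job["id"], job["jobType"])] = elapsed.total_seconds()
    return durations


# Trimmed copy of the embedded response above.
sample = {
    "id": 8,
    "jobs": {
        "data": [
            {"id": 2, "jobType": "filewriter",
             "createdAt": "2018-01-25T18:01:16.687Z",
             "updatedAt": "2018-01-25T18:01:21.071Z"},
            {"id": 1, "jobType": "wrangle",
             "createdAt": "2018-01-25T18:01:16.652Z",
             "updatedAt": "2018-01-25T18:01:19.558Z"},
        ]
    },
}
print(job_durations(sample))
```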

Reference

Property Description

id Internal identifier for the job group

name Internal name of the job group

description User-friendly description for the job group

ranfrom Where the job group was executed from:

ui - Trifacta application
cli - command line interface

ranfor Description of the object for which the job was run:

recipe - Recipe in a flow
sampling - Sampling job for a dataset
pendingDatasource - Job that is subjected to conversion, such as importing a Microsoft Excel workbook.

status Current status of the job group:

Created - job group has been created based on the current action.
Pending - job group is queued for execution.
InProgress - job group is currently running.
Complete - job group has completed successfully.
Failed - job group has failed.
Canceled - job group was canceled by user action.

profilingEnabled When true, a profiling job was executed as part of this job group.

runParameterReferenceDate When a recipe is executed with dynamic parameters, this parameter is stored with the timestamp at the time
of execution. It can be used in debugging execution issues. Do not modify this value.

createdAt Timestamp for when the job group was launched

updatedAt Timestamp for when the job group was last updated

jobs A list of all jobs that were launched based on this job group. For more information, see
API JobGroups Get Jobs v4.

wrangledDataset Internal identifier of the object from where the job group was executed. For more information, see
API WrangledDatasets Get v4.

workspace.id Internal identifier for the workspace where the job was executed

creator.id Internal identifier for the user who created the job group

updater.id Internal identifier for the user who last updated the job group

snapshot.id Internal identifier of the data snapshot for the job group

flowRun
NOTE: This parameter is used for internal platform purposes. Do not modify.

API JobGroups Put Publish v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

For a specified jobGroup, this endpoint performs an ad-hoc publish of the results to the designated target.
Target information is based on the specified connection.
Job results to be published are based on the specified jobGroup.
You can specify:
Database and table to which to publish
Type of action to be applied to the target table. Details are below.
Supported targets:
Hive
Redshift
For more information on jobGroups, see API JobGroups Get v4.
For additional examples, see API Workflow - Publish Results.
Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PUT

Endpoint:

/v4/jobGroups/<id>/publish

where:

Parameter Description

<id> Internal identifier for the job group

Request URI - Example:

/v4/jobGroups/31/publish

Request Body - Hive:

{
"connection": {
"id": 1
},
"path": ["default"],
"table": "test_table",
"action": "create",
"inputFormat": "pqt",
"flowNodeId": 27
}

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"jobgroupId": 31,
"reason": "JobStarted",
"sessionId": "f6c5f350-2102-11e9-bb80-9faf7b15f235"
}

Reference

Request Reference:

Property Description

connection Internal identifier of the connection to use to write the results.

path Name of database to which to write the results. This value must be enclosed in square brackets.

table Name of table in the database to which to write the results.

action Type of writing action to perform with the results. Supported actions:

create - Create a new table with each publication. This table is empty except for the schema, which is taken
from the results. A new table receives a timestamp extension to its name.
load - Append the results to a pre-existing table. The schema of the results and the table must
match.
createAndLoad - Create a new table with each publication and load it with the results data. A new table
receives a timestamp extension to its name.
truncateAndLoad - Truncate a pre-existing table and load it with fresh data from the results.
dropAndLoad - Drop the target table and load a new table with the schema and data from the results.

inputFormat Source format of the results. Supported values:

Hive:

avro
pqt

Redshift:

NOTE: For results to be written to Redshift, the source must be stored in S3 and accessed through an S3
connection.

NOTE: By default, data is published to Redshift using the public schema. To publish using a different
schema, preface the table value with the name of the schema to use: MySchema.MyTable.

csv
json
avro

flowNodeId The internal identifier for the recipe (wrangledDataset) from which the job was executed.
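
The constraints in the table above (supported actions, and input formats per target) can be checked before a publish request is sent. The following sketch builds the request body. build_publish_body is a hypothetical helper, its target argument is used only for validation and is not part of the body, and the optional schema prefix follows the Redshift note above.

```python
# Supported values, per the Request Reference.
ACTIONS = {"create", "load", "createAndLoad", "truncateAndLoad", "dropAndLoad"}
INPUT_FORMATS = {
    "hive": {"avro", "pqt"},
    "redshift": {"csv", "json", "avro"},
}


def build_publish_body(target, connection_id, database, table, action,
                       input_format, flow_node_id, schema=None):
    """Return a validated body for PUT /v4/jobGroups/<id>/publish."""
    if target not in INPUT_FORMATS:
        raise ValueError(f"unsupported target: {target!r}")
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    if input_format not in INPUT_FORMATS[target]:
        raise ValueError(f"{input_format!r} is not supported for {target}")
    if schema is not None:
        if target != "redshift":
            raise ValueError("schema prefix applies to Redshift only")
        table = f"{schema}.{table}"  # for example, MySchema.MyTable
    return {
        "connection": {"id": connection_id},
        "path": [database],  # database name, enclosed in square brackets
        "table": table,
        "action": action,
        "inputFormat": input_format,
        "flowNodeId": flow_node_id,
    }


print(build_publish_body("hive", 1, "default", "test_table", "create", "pqt", 27))
```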

For more information on the available status messages, see API JobGroups Get v4.
API OutputObjects Create v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

NOTE: This feature is in Beta release.

Create an outputobject.
Version: v4

Relevant terms:

Term Description

outputobjects An outputobject is a definition of one or more types of outputs and how they are generated. It must be
associated with a recipe.

NOTE: An outputobject must be created for a recipe before you can run a job on it. One and only one
outputobject can be associated with a recipe.

writesettings A writesettings object defines file-based outputs within an outputobject. Settings include path, format,
compression, and delimiters.

publications A publications object is used to specify a table-based output and is associated with an outputobject. Settings
include the connection to use, path, table type, and write action to apply.

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

NOTE: If an outputobject already exists for the recipe (flowNodeId) to which you are posting, you must
either modify the object instead or delete it before posting your new object.

Endpoint:

/v4/outputobjects

Request Body:
The following creates an outputobject with an embedded writesettings object to write an Avro file to the specified
location:

{
"execution": "photon",
"profiler": true,
"isAdhoc": true,
"writeSettings": {
"data": [
{
"delim": ",",
"path": "hdfs://hadoop:50070/trifacta/queryResults/[email protected]/POS_999.avro",
"action": "create",
"format": "avro",
"compression": "none",
"header": false,
"asSingleFile": false,
"prefix": null,
"suffix": "_increment",
"hasQuotes": false,
"outputObject": {
"id": 4
}
}
]
},
"flowNode": {
"id": 15
}
}

Response

Response Status Code - Success: 201 - Created

{
"id": 6,
"execution": "photon",
"profiler": true,
"isAdhoc": true,
"updatedAt": "2018-11-08T19:08:57.517Z",
"createdAt": "2018-11-08T19:08:57.517Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"flowNode": {
"id": 15
}
}
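
Because only one outputobject can be associated with a recipe, a client may want to check the existing outputobjects before posting. The sketch below picks between creating and updating, given a parsed response from API OutputObjects Get List v4. plan_output_object_request is a hypothetical helper, and the PUT path is the one described in API OutputObjects Update v4.

```python
def plan_output_object_request(existing, flow_node_id):
    """Return (HTTP method, path) to create or update a recipe's outputobject."""
    for obj in existing.get("data", []):
        if obj["flowNode"]["id"] == flow_node_id:
            # An outputobject already exists for this recipe: modify it.
            return ("PUT", f"/v4/outputobjects/{obj['id']}")
    return ("POST", "/v4/outputobjects")


# Trimmed list response with three existing outputobjects.
existing = {
    "data": [
        {"id": 4, "flowNode": {"id": 14}},
        {"id": 3, "flowNode": {"id": 13}},
        {"id": 1, "flowNode": {"id": 2}},
    ]
}
print(plan_output_object_request(existing, 14))
print(plan_output_object_request(existing, 15))
```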

Reference

For more information on the properties of an outputobject, see API OutputObjects Get v4.
API OutputObjects Delete v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

NOTE: This feature is in Beta release.

Delete the specified outputobject.


Version: v4

Relevant terms:

Term Description

outputobjects An outputobject is a definition of one or more types of outputs and how they are generated. It must be
associated with a recipe.

NOTE: An outputobject must be created for a recipe before you can run a job on it. One and only one
outputobject can be associated with a recipe.

writesettings A writesettings object defines file-based outputs within an outputobject. Settings include path, format,
compression, and delimiters.

publications A publications object is used to specify a table-based output and is associated with an outputobject. Settings
include the connection to use, path, table type, and write action to apply.

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v4/outputobjects/<id>

where:

Property Description

<id> Internal identifier for the outputobject

Request URI - Example:

/v4/outputobjects/3

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

For more information on the properties of an outputobject, see API OutputObjects Get v4.
API OutputObjects Get List v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request

Response
Reference

NOTE: This feature is in Beta release.

Get the list of outputobjects for the authenticated user.


Version: v4

Relevant terms:

Term Description

outputobjects An outputobject is a definition of one or more types of outputs and how they are generated. It must be
associated with a recipe.

NOTE: An outputobject must be created for a recipe before you can run a job on it. One and only one
outputobject can be associated with a recipe.

writesettings A writesettings object defines file-based outputs within an outputobject. Settings include path, format,
compression, and delimiters.

publications A publications object is used to specify a table-based output and is associated with an outputobject. Settings
include the connection to use, path, table type, and write action to apply.

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/outputobjects

For more information, see API OutputObjects Get v4.


Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"data": [
{

"id": 4,
"execution": "photon",
"profiler": true,
"isAdhoc": true,
"createdAt": "2018-11-08T18:56:13.916Z",
"updatedAt": "2018-11-08T18:56:47.334Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"flowNode": {
"id": 14
}
},
{
"id": 3,
"execution": "photon",
"profiler": true,
"isAdhoc": true,
"createdAt": "2018-11-08T18:51:56.633Z",
"updatedAt": "2018-11-08T18:52:44.535Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"flowNode": {
"id": 13
}
},
{
"id": 1,
"execution": "photon",
"profiler": true,
"isAdhoc": true,
"createdAt": "2018-11-08T15:56:10.050Z",
"updatedAt": "2018-11-08T15:56:10.218Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"flowNode": {
"id": 2
}

}
]
}

Reference

For more information on the properties of an outputobject, see API OutputObjects Get v4.
API OutputObjects Get v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

NOTE: This feature is in Beta release.

Get the specified outputobject.


Version: v4

Relevant terms:

Term Description

outputobjects An outputobject is a definition of one or more types of outputs and how they are generated. It must be
associated with a recipe.

NOTE: An outputobject must be created for a recipe before you can run a job on it. One and only one
outputobject can be associated with a recipe.

writesettings A writesettings object defines file-based outputs within an outputobject. Settings include path, format,
compression, and delimiters.

publications A publications object is used to specify a table-based output and is associated with an outputobject. Settings
include the connection to use, path, table type, and write action to apply.

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/outputobjects/<id>

Endpoint with embedded references:


Within the returned outputobject, you can embed any writesettings and publications objects that are associated
with it:

/v4/outputobjects/<id>?embed=writesettings,publications

Request URI - Example:

/v4/outputobjects/3/

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 3,
"execution": "photon",
"profiler": true,
"isAdhoc": true,
"createdAt": "2018-11-08T18:51:56.633Z",
"updatedAt": "2018-11-08T18:52:44.535Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"flowNode": {
"id": 13
}
}

Response Body Example with embedded reference:


The following example response includes embedded information on the writesettings and publications associated
with the outputobject:

{
"id": 3,

"execution": "photon",
"profiler": true,
"isAdhoc": true,
"createdAt": "2018-11-08T18:51:56.633Z",
"updatedAt": "2018-11-08T18:52:44.535Z",
"writeSettings": {
"data": [
{
"delim": ",",
"id": 4,
"path": "hdfs://hadoop:50070/trifacta/queryResults/[email protected]/POS_r01.avro",
"action": "create",
"format": "avro",
"compression": "none",
"header": false,
"asSingleFile": false,
"prefix": null,
"suffix": "_increment",
"hasQuotes": false,
"createdAt": "2018-11-08T18:52:44.526Z",
"updatedAt": "2018-11-08T18:52:44.526Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"outputObject": {
"id": 3
}
},
{
"delim": ",",
"id": 3,
"path": "hdfs://hadoop:50070/trifacta/queryResults/[email protected]/POS-r01.csv",
"action": "create",
"format": "csv",
"compression": "none",
"header": true,
"asSingleFile": true,
"prefix": null,
"suffix": "_increment",
"hasQuotes": true,
"createdAt": "2018-11-08T18:51:56.871Z",
"updatedAt": "2018-11-08T18:51:56.871Z",
"creator": {
"id": 1
},
"updater": {

"id": 1
},
"outputObject": {
"id": 3
}
}
]
},
"publications": {
"data": [
{
"path": [
"default"
],
"id": 1,
"tableName": "POS_r01",
"targetType": "hive",
"action": "dropAndLoad",
"createdAt": "2018-11-08T18:52:43.871Z",
"updatedAt": "2018-11-08T18:52:43.871Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"outputObject": {
"id": 3
},
"connection": {
"id": 1
}
}
]
},
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"flowNode": {

"id": 13
}
}
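
With writesettings and publications embedded, all destinations of an outputobject can be listed in one pass. The sketch below assumes the response has been parsed from JSON. list_outputs is a hypothetical helper, and the file paths in the sample are placeholders rather than the paths shown above.

```python
def list_outputs(output_object):
    """Return (file outputs, table outputs) defined for an outputobject."""
    files = [
        (ws["format"], ws["path"])
        for ws in output_object.get("writeSettings", {}).get("data", [])
    ]
    tables = [
        # Join the database path and table name, for example default.POS_r01.
        (pub["targetType"], ".".join(pub["path"] + [pub["tableName"]]), pub["action"])
        for pub in output_object.get("publications", {}).get("data", [])
    ]
    return files, tables


sample = {
    "id": 3,
    "writeSettings": {
        "data": [
            {"format": "avro", "path": "hdfs://hadoop:50070/example/POS_r01.avro"},
            {"format": "csv", "path": "hdfs://hadoop:50070/example/POS-r01.csv"},
        ]
    },
    "publications": {
        "data": [
            {"path": ["default"], "tableName": "POS_r01",
             "targetType": "hive", "action": "dropAndLoad"}
        ]
    },
}
files, tables = list_outputs(sample)
print(files)
print(tables)
```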

Reference

Property Description

id Internal identifier of the outputobject

execution The execution engine on which the output is generated. Possible values:

photon - Job is executed on the Trifacta Server. This environment is suitable for small- to medium-sized jobs.
spark - Job is executed on the Hadoop cluster to which the Trifacta platform is connected. See Configure for Spark.
emrSpark - Job is executed on integrated EMR cluster. See Configure for EMR.
databricksSpark - Job is executed on the Azure Databricks cluster connected to the platform. See
Configure for Azure Databricks.

For more information, see Running Environment Options.

profiler If true, a visual profile of the results is generated as part of the output.

isAdhoc If true, the outputobject is used for ad-hoc execution of the recipe.

If false, the outputobject is used when a scheduled job on the recipe is executed.

createdAt Timestamp for when the outputobject was created

updatedAt Timestamp for when the outputobject was last updated

creator.id Internal identifier of the user who created the object

updater.id Internal identifier of the user who last updated the object

flowNode.id Internal identifier of the recipe with which the object is associated

For more information on the writesettings properties, see API WriteSettings Get v4.
For more information on the publications properties, see API Publications Get v4.

API OutputObjects Update v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

NOTE: This feature is in Beta release.

Modify the specified outputobject.

Version: v4

Relevant terms:

Term Description

outputobjects An outputobject is a definition of one or more types of outputs and how they are generated. It must be
associated with a recipe.

NOTE: An outputobject must be created for a recipe before you can run a job on it. One and only one
outputobject can be associated with a recipe.

writesettings A writesettings object defines file-based outputs within an outputobject. Settings include path, format,
compression, and delimiters.

publications A publications object is used to specify a table-based output and is associated with an outputobject. Settings
include the connection to use, path, table type, and write action to apply.

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PUT

Endpoint:

/v4/outputobjects/<id>

Request URI - Example:

/v4/outputobjects/3/

Request Body:
The following changes the running environment used for the outputobject and enables visual profiling. For more
information on the available properties, see API OutputObjects Get v4.

{
"execution": "spark",
"profiler": true
}
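
As a rough illustration of the request above, the following Python sketch assembles the PUT request without sending it. The host and port are borrowed from the curl examples elsewhere in this guide, Basic authentication and the credentials are assumptions, and build_outputobject_update is a hypothetical helper name, not part of the platform.

```python
import base64
import json
import urllib.request

# Values listed for the "execution" property in API OutputObjects Get v4.
EXECUTION_ENGINES = {"photon", "spark", "emrSpark", "databricksSpark"}


def build_outputobject_update(base_url, outputobject_id, changes, username, password):
    """Return an unsent PUT request that modifies one outputobject."""
    engine = changes.get("execution")
    if engine is not None and engine not in EXECUTION_ENGINES:
        raise ValueError(f"unknown execution engine: {engine!r}")
    # Assumption: Basic authentication. Follow API Authentication for your deployment.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v4/outputobjects/{outputobject_id}",
        data=json.dumps(changes).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="PUT",
    )


request = build_outputobject_update(
    "http://example.com:3005",
    3,
    {"execution": "spark", "profiler": True},
    "admin@example.com",  # placeholder credentials
    "secret",
)
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would submit it. Building the request separately keeps the validation step testable without a running platform.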

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 3,
"updater": {
"id": 1
},
"updatedAt": "2018-11-08T19:15:07.846Z"
}

Reference

For more information on the properties of the outputobject, see API OutputObjects Get v4.

API People Create v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Create a new user.


Version: v4

Required Permissions

The authenticated user must be an admin.

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/people/

Request Body:

The following properties are the minimum required to create a user:

{
"accept": "accept",
"password": "foo",
"password2": "foo",
"email": "[email protected]",
"name": "abc2"
}

The following request includes all configurable parameters:

{
"accept": "accept",
"password": "foo",
"password2": "foo",
"email": "[email protected]",
"name": "abc2",
"isAdmin": false,
"ssoPrincipal": null,
"hadoopPrincipal": null,
"lastLoginTime": null,
"awsConfig": null
}

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"isDisabled": false,
"forcePasswordChange": false,
"state": "active",
"id": 9,
"email": "[email protected]",
"name": "Test1",
"ssoPrincipal": null,
"hadoopPrincipal": null,
"isAdmin": false,
"updatedAt": "2019-01-09T20:23:31.560Z",
"createdAt": "2019-01-09T20:23:31.560Z",
"outputHomeDir": "/trifacta/queryResults/[email protected]",
"lastStateChange": null,
"fileUploadPath": "/trifacta/uploads",
"awsConfig": null
}

Reference

Request properties:

Property Description

accept This property must be set to accept to create the user.

password2 This value confirms the value for password. These two property values must be identical.
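
The two constraints in this table can be checked on the client before the request is submitted. The sketch below is one possible way to do that; build_create_user_body is a hypothetical helper, and the email address is a placeholder.

```python
import json


def build_create_user_body(email, name, password, password2, **optional):
    """Assemble the JSON body for a create-user request to /v4/people/."""
    if password != password2:
        raise ValueError("password and password2 must be identical")
    body = {
        "accept": "accept",  # must be the literal string "accept"
        "password": password,
        "password2": password2,
        "email": email,
        "name": name,
    }
    # Optional properties from the full request example, such as isAdmin.
    body.update(optional)
    return body


body = build_create_user_body("user@example.com", "abc2", "foo", "foo", isAdmin=False)
print(json.dumps(body, indent=2))
```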

For more information on the properties of a user, see API People Get v4.

API People Delete v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Delete the specified user.


Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v4/people/<id>

where:

Parameter Description

<id> Internal identifier for the user

Request URI - Example:

/v4/people/6

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

For more information on the properties of a user, see API People Get v4.

API People Get List v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Returns a list of all users of the platform.


Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/people

Endpoint with paged retrieval:

By default, this endpoint returns results in sets of 25.


You can apply query parameters to change the size of the default set and to page through result sets. The
following example queries for results 100 at a time. In this case, the query asks for results 2-102:

/v4/people?limit=100&offset=2

If the count of retrieved results is less than the limit, you have reached the end of the results.
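
The paging rule above can be sketched as a small Python generator. This is an illustration under stated assumptions: offset is taken to be the number of results to skip, fetch_page is a hypothetical callable that performs the GET request and returns the list under "data", and the in-memory fake_fetch stands in for the platform.

```python
from urllib.parse import urlencode


def people_page_path(limit, offset):
    """Build the request path for one page of users."""
    return "/v4/people?" + urlencode({"limit": limit, "offset": offset})


def iter_people(fetch_page, limit=25):
    """Yield every user, requesting `limit` results at a time."""
    offset = 0
    while True:
        page = fetch_page(people_page_path(limit, offset))
        yield from page
        if len(page) < limit:  # a short page means the end of the results
            return
        offset += limit


# Stand-in for the platform: 60 fake users served from memory.
FAKE_USERS = [{"id": i} for i in range(1, 61)]
requested_paths = []


def fake_fetch(path):
    requested_paths.append(path)
    query = dict(pair.split("=") for pair in path.split("?", 1)[1].split("&"))
    start = int(query["offset"])
    return FAKE_USERS[start:start + int(query["limit"])]


users = list(iter_people(fake_fetch, limit=25))
print(len(users), requested_paths)
```

With 60 users and a limit of 25, the third page holds only 10 results, so the loop stops after three requests.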

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"data": [
{
"id": 4,
"email": "[email protected]",
"name": "Test1",
"ssoPrincipal": null,
"hadoopPrincipal": null,
"isAdmin": false,
"outputHomeDir": "/trifacta/queryResults/[email protected]",
"isDisabled": false,
"forcePasswordChange": false,
"state": "active",
"lastStateChange": null,
"createdAt": "2019-01-09T20:23:31.560Z",
"updatedAt": "2019-01-09T20:25:03.000Z",
"fileUploadPath": "/trifacta/uploads",
"awsConfig": null
},
{
"id": 3,
"email": "[email protected]",
"name": "Test User95203645",
"ssoPrincipal": null,
"hadoopPrincipal": null,
"isAdmin": false,
"outputHomeDir":
"/trifacta/queryResults/[email protected]",
"isDisabled": false,
"forcePasswordChange": false,
"state": "active",
"lastStateChange": null,
"createdAt": "2019-01-09T10:39:50.310Z",
"updatedAt": "2019-01-09T10:39:50.349Z",
"fileUploadPath": "/trifacta/uploads",
"awsConfig": null
},
{
"id": 2,
"email": "[email protected]",
"name": "Test User83466845",
"ssoPrincipal": null,
"hadoopPrincipal": null,
"isAdmin": false,
"outputHomeDir":
"/trifacta/queryResults/[email protected]",
"isDisabled": false,
"forcePasswordChange": false,
"state": "active",
"lastStateChange": null,
"createdAt": "2019-01-09T09:21:13.501Z",
"updatedAt": "2019-01-09T09:21:13.548Z",
"fileUploadPath": "/trifacta/uploads",
"awsConfig": null
},
{
"id": 1,
"email": "[email protected]",
"name": "Administrator",
"ssoPrincipal": null,
"hadoopPrincipal": null,
"isAdmin": true,
"outputHomeDir": "/trifacta/queryResults/[email protected]",
"isDisabled": false,
"forcePasswordChange": false,
"state": "active",
"lastStateChange": null,
"createdAt": "2019-01-09T08:40:02.555Z",
"updatedAt": "2019-01-09T08:40:02.647Z",
"fileUploadPath": "/trifacta/uploads",
"awsConfig": null
}
]
}

Reference

For more information on the properties of a user, see API People Get v4.

API People Get v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Retrieve the platform account information for a user specified by userId.


Version: v4

Required Permissions

The authenticated user must be an admin.

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/people/<id>

where:

Parameter Description

<id> Internal identifier of the user to retrieve.

Request URI - Example:

/v4/people/4

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 2,
"email": "[email protected]",
"name": "Joe Guy",
"ssoPrincipal": null,
"hadoopPrincipal": null,
"isAdmin": false,
"isDisabled": false,
"forcePasswordChange": false,
"state": "active",
"lastStateChange": null,
"createdAt": "2019-02-12T09:04:52.073Z",
"updatedAt": "2019-02-12T09:04:52.073Z",
"outputHomeDir": "/trifacta/queryResults/[email protected]",
"fileUploadPath": "/trifacta/uploads",
"awsConfig": null
}

Reference

Property Description

id Internal user identifier

email Email address (and loginId) for the user

name Display name for the user

ssoPrincipal (If SSO is enabled) Principal value of the user for single sign-on

hadoopPrincipal (If secure impersonation is enabled) Hadoop principal value for the user, which determines permissions on the
Hadoop cluster

isAdmin If true, the user account is an administrator account.

isDisabled If true, the account is disabled.

forcePasswordChange (if enabled) When set to true, the user must change the account password on next login.

state Current state of the user account:

active - user is active and usable in the workspace
suspended - user cannot access the workspace. User's data remains.
deleted - user account has been deleted. If not re-enabled within a period of time, the user's data is removed.

lastStateChange Timestamp for when the value of the state parameter was changed.

createdAt Timestamp for when the user account was created

updatedAt Timestamp for when the user account was last modified

outputHomeDir Home directory where the user's generated results are written

fileUploadPath Path on backend datastore where files uploaded from the user's desktop are stored for use as imported datasets.

awsConfig (If AWS integration is enabled) Value contains the S3 credentials, default bucket, and any extra buckets to which
the user has access

For more information on roles, see Manage Users.


API People Patch v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Updates the platform account information for a user specified by userId.


Version: v4

Required Permissions

The authenticated user must be an admin.

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PATCH

Endpoint:

/v4/people/<id>

where:

Parameter Description

<id> Internal identifier of the user to update.

Request URI - Example:

/v4/people/6

Request Body:

NOTE: For the PATCH method, only the properties that are being patched need to be submitted.

{
"outputHomeDir": "/trifacta/queryResults/[email protected]",
"email": "[email protected]",
"name": "Joe Example",
"ssoPrincipal": null,
"hadoopPrincipal": null,
"isAdmin": false,
"isDisabled": false,
"forcePasswordChange": true,
"awsConfig": null
}

Request Body Example - Disable User


You can use the following request to disable the specified user. This request also removes the admin role from
the user.

{
"isAdmin": false,
"isDisabled": true
}
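
Because only the properties being patched need to be submitted, a client can compute the request body as the difference between the current user record and the desired one. The following is a minimal sketch of that idea; build_patch_body is a hypothetical helper, and the PATCHABLE tuple simply mirrors the properties shown in the request body example above.

```python
# Properties that appear in the PATCH request body example.
PATCHABLE = (
    "outputHomeDir", "email", "name", "ssoPrincipal", "hadoopPrincipal",
    "isAdmin", "isDisabled", "forcePasswordChange", "awsConfig",
)


def build_patch_body(current, desired):
    """Return only the patchable properties whose values differ."""
    return {
        key: desired[key]
        for key in PATCHABLE
        if key in desired and desired[key] != current.get(key)
    }


current = {"id": 6, "name": "Joe Example", "isAdmin": True, "isDisabled": False}
desired = {"name": "Joe Example", "isAdmin": False, "isDisabled": True}
print(build_patch_body(current, desired))
```

For this input, the resulting body contains only isAdmin and isDisabled, which matches the disable-user request shown above.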

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 6,
"updatedAt": "2018-01-24T23:49:08.199Z"
}

Reference

For more information on these properties, see API People Get v4.

API Publications Create v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Create a publications object.


Version: v4

Relevant terms:

Term Description

outputobjects An outputobject is a definition of one or more types of outputs and how they are generated. It must be
associated with a recipe.

NOTE: An outputobject must be created for a recipe before you can run a job on it. One and only one
outputobject can be associated with a recipe.

writesettings A writesettings object defines file-based outputs within an outputobject. Settings include path, format,
compression, and delimiters.

publications A publications object is used to specify a table-based output and is associated with an outputobject. Settings
include the connection to use, path, table type, and write action to apply.

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/publications

Request Body:
The following creates a publications object and associates it with outputobject ID=3:

{
"path": [
"default"
],
"tableName": "newTable",
"targetType": "hive",
"action": "create",
"outputObject": {
"id": 3
},
"connection": {
"id": 1
}
}
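
A request body like the one above can be assembled and checked in code before it is posted. In this sketch, build_publication_body is a hypothetical helper, and the set of write actions is taken from the action property described in API Publications Get v4. It is an illustration, not the platform's own validation.

```python
import json

# Write actions described for the "action" property in API Publications Get v4.
WRITE_ACTIONS = {"create", "createAndLoad", "truncateAndLoad", "dropAndLoad"}


def build_publication_body(table_name, target_type, action,
                           outputobject_id, connection_id, path=("default",)):
    """Assemble the JSON body for a create request to /v4/publications."""
    if action not in WRITE_ACTIONS:
        raise ValueError(f"unknown write action: {action!r}")
    return {
        "path": list(path),
        "tableName": table_name,
        "targetType": target_type,
        "action": action,
        "outputObject": {"id": outputobject_id},
        "connection": {"id": connection_id},
    }


body = build_publication_body("newTable", "hive", "create", 3, 1)
print(json.dumps(body, indent=2))
```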

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"path": [
"default"
],
"id": 3,
"tableName": "newTable",
"targetType": "hive",
"action": "create",
"updatedAt": "2018-11-08T21:10:35.204Z",
"createdAt": "2018-11-08T21:10:35.204Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"outputObject": {
"id": 3
},
"connection": {
"id": 1
}
}

Reference

For more information on the properties of a publications object, see API Publications Get v4.

API Publications Delete v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Delete the specified publications object.


Version: v4

Relevant terms:

Term Description

outputobjects An outputobject is a definition of one or more types of outputs and how they are generated. It must be
associated with a recipe.

NOTE: An outputobject must be created for a recipe before you can run a job on it. One and only one
outputobject can be associated with a recipe.

writesettings A writesettings object defines file-based outputs within an outputobject. Settings include path, format,
compression, and delimiters.

publications A publications object is used to specify a table-based output and is associated with an outputobject. Settings
include the connection to use, path, table type, and write action to apply.

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v4/publications/<id>

where:

Parameter Description

<id> Internal identifier for the publications object

Request URI - Example:

/v4/publications/3

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

For more information on the properties of a publications object, see API Publications Get v4.

API Publications Get List v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get the list of publications objects for the authenticated user.


Version: v4

Relevant terms:

Term Description

outputobjects An outputobject is a definition of one or more types of outputs and how they are generated. It must be
associated with a recipe.

NOTE: An outputobject must be created for a recipe before you can run a job on it. One and only one
outputobject can be associated with a recipe.

writesettings A writesettings object defines file-based outputs within an outputobject. Settings include path, format,
compression, and delimiters.

publications A publications object is used to specify a table-based output and is associated with an outputobject. Settings
include the connection to use, path, table type, and write action to apply.

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/publications

For more information, see API Publications Get v4.


Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"data": [
{
"path": [
"default"
],
"id": 2,
"tableName": "POS_r01",
"targetType": "redshift",
"action": "dropAndLoad",
"createdAt": "2018-11-08T18:56:04.475Z",
"updatedAt": "2018-11-08T18:56:04.475Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"outputObject": null,
"connection": {
"id": 4
}
},
{
"path": [
"default"
],
"id": 1,
"tableName": "POS_r01",
"targetType": "hive",
"action": "dropAndLoad",
"createdAt": "2018-11-08T18:52:43.871Z",
"updatedAt": "2018-11-08T18:52:43.871Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"outputObject": {
"id": 3
},
"connection": {
"id": 1
}
}
]
}

Reference

For more information on the properties of a publication, see API Publications Get v4.

API Publications Get v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get the specified publications object.


Version: v4

Relevant terms:

Term Description

outputobjects An outputobject is a definition of one or more types of outputs and how they are generated. It must be
associated with a recipe.

NOTE: An outputobject must be created for a recipe before you can run a job on it. One and only one
outputobject can be associated with a recipe.

writesettings A writesettings object defines file-based outputs within an outputobject. Settings include path, format,
compression, and delimiters.

publications A publications object is used to specify a table-based output and is associated with an outputobject. Settings
include the connection to use, path, table type, and write action to apply.

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/publications/<id>

Request URI - Example:

/v4/publications/1/

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"path": [
"default"
],
"id": 1,
"tableName": "POS_r01",
"targetType": "hive",
"action": "dropAndLoad",
"createdAt": "2018-11-08T18:52:43.871Z",
"updatedAt": "2018-11-08T18:52:43.871Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"outputObject": {
"id": 3
},
"connection": {
"id": 1
}
}

Reference

Property Description

path Path to the database where the table is to be written.

id Internal identifier of the publications object

tableName Name of the table to write

targetType Type of table that you are writing. Example values:

hive
redshift
sqldatawarehouse
For more information, see Connection Types.

action The write action to apply to the table, in the event that the table exists:

create - create a new table with each publication
createAndLoad - append your data to the table
truncateAndLoad - truncate the table and load it with your data
dropAndLoad - drop the table and write the new table in its place

createdAt Timestamp for when the publications object was created

updatedAt Timestamp for when the publications object was last updated

creator.id Internal identifier of the user who created the object

updater.id Internal identifier of the user who last updated the object

outputObject.id Internal identifier of the outputobject with which the publications object is associated

connection.id Internal identifier of the connection to use when publishing

API Publications Update v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Modify the specified publications object.


Version: v4

Relevant terms:

Term Description

outputobjects An outputobject is a definition of one or more types of outputs and how they are generated. It must be
associated with a recipe.

NOTE: An outputobject must be created for a recipe before you can run a job on it. One and only one
outputobject can be associated with a recipe.

writesettings A writesettings object defines file-based outputs within an outputobject. Settings include path, format,
compression, and delimiters.

publications A publications object is used to specify a table-based output and is associated with an outputobject. Settings
include the connection to use, path, table type, and write action to apply.

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PUT

Endpoint:

/v4/publications/<id>

Request URI - Example:

/v4/publications/3/

Request Body:
The following changes the path, table name, and action applied on the table for the specified publication. For
more information on the properties, see API Publications Get v4.

{
"path": [
"default"
],
"tableName": "MyTable-DropAndLoad",
"action": "dropAndLoad"
}

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 3,
"updater": {
"id": 1
},
"updatedAt": "2018-11-08T21:11:54.880Z"
}

Reference

For more information on the properties of the publications object, see API Publications Get v4.

API Releases Create DryRun v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

NOTE: This feature is in Beta release.

Perform a dry run of creating a release for the specified deployment. The dry run checks all permissions
required to import the package, as well as any specified import rules.
For more information on import rules, see Define Import Mapping Rules.
Any errors are reported in the response.
After a dry run completes successfully, you can create the release through the API. See
API Releases Create v4.

NOTE: Releases pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/deployments/<id>/releases/dryRun

where:

Parameter Description

<id> Internal identifier for the deployment

Request URI - Example:

/v4/deployments/2/releases/dryRun

Request Body:
The request body must include the following key and value combination submitted as form data:

key value

data "@path-to-file"

Example request - curl:

curl -X POST \
http://example.com:3005/v4/deployments/1/releases/dryRun \
-H 'authorization: Basic c29sc29uQHRyaWZhY3RhLmNvbTphZG1pbg==' \
-H 'cache-control: no-cache' \
-H 'content-type: multipart/form-data' \
-F [email protected]
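
For clients that cannot shell out to curl, the same form-data upload can be approximated with the Python standard library. The sketch below encodes a single file under the data key and builds the POST request without sending it. The file name flow_package.zip, the application/zip part type, the placeholder authorization value, and both helper names are assumptions made for illustration.

```python
import urllib.request
import uuid


def encode_zip_form(field_name, filename, content, boundary=None):
    """Encode one file as multipart/form-data; return (body, content_type)."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: application/zip\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def build_release_request(base_url, deployment_id, package_bytes, auth_header, dry_run=True):
    """Return an unsent POST request that uploads a flow package."""
    path = f"/v4/deployments/{deployment_id}/releases"
    if dry_run:
        path += "/dryRun"
    # "flow_package.zip" is a hypothetical file name for the exported package.
    body, content_type = encode_zip_form("data", "flow_package.zip", package_bytes)
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=body,
        headers={"Authorization": auth_header, "Content-Type": content_type},
        method="POST",
    )


request = build_release_request(
    "http://example.com:3005", 1, b"PK placeholder bytes", "Basic PLACEHOLDER"
)
print(request.get_method(), request.full_url)
```

Setting dry_run to False targets the release creation endpoint instead, so a successful dry run and the real import can share the same upload code.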

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"importRuleChanges": {
"object": [],
"value": []
},
"deletedObjects": {},
"primaryFlowIds": [
8
],
"flows": [
{
"id": 8,
"name": "2013 POS",
"description": null,
"deleted_at": null,
"cpProject": null,
"workspaceId": 1,
"createdAt": "2019-02-13T18:39:03.426Z",
"updatedAt": "2019-02-13T20:52:03.707Z",
"createdBy": 7,
"updatedBy": 7,
"folderId": null
}
],
"datasources": [
{
"isDynamicOrConverted": false,
"id": 30,
"dynamicPath": null,
"cpProject": null,
"workspaceId": 1,
"connectionId": null,
"deleted_at": null,
"isSchematized": true,
"isDynamic": false,
"isConverted": false,
"disableTypeInference": false,
"hasStructuring": true,
"storageLocationId": 13,
"createdAt": "2019-02-13T18:39:03.569Z",
"updatedAt": "2019-02-13T20:52:03.803Z",
"createdBy": 7,
"updatedBy": 7,
"parsingScriptId": 40,
"size_deprecated": null,
"path_deprecated": null
},
{
"isDynamicOrConverted": false,
"id": 31,
"dynamicPath": null,
"cpProject": null,
"workspaceId": 1,
"connectionId": null,
"deleted_at": null,
"isSchematized": true,
"isDynamic": false,
"isConverted": false,
"disableTypeInference": false,
"hasStructuring": true,
"storageLocationId": 14,
"createdAt": "2019-02-13T18:39:03.876Z",
"updatedAt": "2019-02-13T20:52:03.806Z",
"createdBy": 7,
"updatedBy": 7,
"parsingScriptId": 41,
"size_deprecated": null,
"path_deprecated": null
},
{
"isDynamicOrConverted": false,
"id": 32,
"dynamicPath": null,
"cpProject": null,
"workspaceId": 1,
"connectionId": null,
"deleted_at": null,
"isSchematized": true,
"isDynamic": false,
"isConverted": false,
"disableTypeInference": false,
"hasStructuring": true,
"storageLocationId": 15,
"createdAt": "2019-02-13T18:39:04.232Z",
"updatedAt": "2019-02-13T20:52:03.808Z",
"createdBy": 7,
"updatedBy": 7,
"parsingScriptId": 42,
"size_deprecated": null,
"path_deprecated": null
},
{
"isDynamicOrConverted": false,
"id": 33,
"dynamicPath": null,
"cpProject": null,
"workspaceId": 1,
"connectionId": null,
"deleted_at": null,
"isSchematized": true,
"isDynamic": false,
"isConverted": false,
"disableTypeInference": false,
"hasStructuring": true,
"storageLocationId": 16,
"createdAt": "2019-02-13T18:39:04.546Z",
"updatedAt": "2019-02-13T20:52:03.810Z",
"createdBy": 7,
"updatedBy": 7,
"parsingScriptId": 43,
"size_deprecated": null,
"path_deprecated": null
},
{
"isDynamicOrConverted": false,
"id": 34,
"dynamicPath": null,
"cpProject": null,
"workspaceId": 1,
"connectionId": null,
"deleted_at": null,
"isSchematized": true,
"isDynamic": false,
"isConverted": false,
"disableTypeInference": false,
"hasStructuring": true,
"storageLocationId": 17,
"createdAt": "2019-02-13T18:39:04.884Z",
"updatedAt": "2019-02-13T20:52:03.813Z",
"createdBy": 7,
"updatedBy": 7,
"parsingScriptId": 44,
"size_deprecated": null,
"path_deprecated": null
},
{
"isDynamicOrConverted": false,
"id": 35,
"dynamicPath": null,
"cpProject": null,
"workspaceId": 1,
"connectionId": null,
"deleted_at": null,
"isSchematized": true,
"isDynamic": false,
"isConverted": false,
"disableTypeInference": false,
"hasStructuring": true,
"storageLocationId": 18,
"createdAt": "2019-02-13T18:39:05.237Z",
"updatedAt": "2019-02-13T20:52:03.815Z",
"createdBy": 7,
"updatedBy": 7,
"parsingScriptId": 45,
"size_deprecated": null,
"path_deprecated": null
}
],
"flownodes": [
{
"id": 39,
"scriptId": 40,
"flowId": 8,
"flowNodeId": null,
"deleted_at": null,
"wrangled": false,
"referenceId": null,
"createdAt": "2019-02-13T18:39:03.741Z",
"updatedAt": "2019-02-13T20:52:03.917Z",
"createdBy": 7,
"updatedBy": 7,
"activesampleId": null
},
{
"id": 40,
"scriptId": 41,
"flowId": 8,
"flowNodeId": null,
"deleted_at": null,
"wrangled": false,
"referenceId": null,
"createdAt": "2019-02-13T18:39:04.049Z",
"updatedAt": "2019-02-13T20:52:03.927Z",
"createdBy": 7,
"updatedBy": 7,
"activesampleId": null
},
{
"id": 41,
"scriptId": 42,
"flowId": 8,
"flowNodeId": null,
"deleted_at": null,
"wrangled": false,
"referenceId": null,
"createdAt": "2019-02-13T18:39:04.400Z",
"updatedAt": "2019-02-13T20:52:03.936Z",
"createdBy": 7,
"updatedBy": 7,
"activesampleId": null
},
{
"id": 42,
"scriptId": 43,
"flowId": 8,
"flowNodeId": null,
"deleted_at": null,
"wrangled": false,
"referenceId": null,
"createdAt": "2019-02-13T18:39:04.705Z",
"updatedAt": "2019-02-13T20:52:03.944Z",
"createdBy": 7,
"updatedBy": 7,
"activesampleId": null
},
{
"id": 43,
"scriptId": 44,
"flowId": 8,
"flowNodeId": null,
"deleted_at": null,
"wrangled": false,
"referenceId": null,
"createdAt": "2019-02-13T18:39:05.040Z",
"updatedAt": "2019-02-13T20:52:03.956Z",
"createdBy": 7,
"updatedBy": 7,
"activesampleId": null
},
{
"id": 44,
"scriptId": 45,
"flowId": 8,
"flowNodeId": null,
"deleted_at": null,
"wrangled": false,
"referenceId": null,
"createdAt": "2019-02-13T18:39:05.433Z",
"updatedAt": "2019-02-13T20:52:03.965Z",
"createdBy": 7,
"updatedBy": 7,
"activesampleId": null
},
{
"id": 45,
"scriptId": 46,
"flowId": 8,
"flowNodeId": null,
"deleted_at": null,
"wrangled": true,
"referenceId": null,
"createdAt": "2019-02-13T18:39:18.716Z",
"updatedAt": "2019-02-13T20:52:03.972Z",
"createdBy": 7,
"updatedBy": 7,
"activesampleId": null
},
{
"id": 46,
"scriptId": 47,
"flowId": 8,
"flowNodeId": null,
"deleted_at": null,
"wrangled": true,
"referenceId": null,
"createdAt": "2019-02-13T18:39:32.754Z",
"updatedAt": "2019-02-13T20:52:03.981Z",
"createdBy": 7,
"updatedBy": 7,
"activesampleId": null
}
],
"outputobjects": [],
"release": {
"id": 3,
"deploymentId": 1,
"packageUuid": "9bae78c0-2fcb-11e9-9523-77f56ed58844",
"notes": "example_flow",
"active": true,
"createdBy": 7,
"updatedBy": 7,
"updatedAt": "2019-02-13T20:52:03.696Z",
"createdAt": "2019-02-13T20:52:03.696Z"
}
}
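
A response of this size is easier to review as a set of counts. The following sketch, with the hypothetical helper summarize_dry_run, tallies the top-level collections that appear in the example response. The sample input is abbreviated and is not real platform output.

```python
def summarize_dry_run(response):
    """Count the objects that a dry-run response reports for a package."""
    rule_changes = response.get("importRuleChanges", {})
    return {
        "flows": len(response.get("flows", [])),
        "datasources": len(response.get("datasources", [])),
        "flownodes": len(response.get("flownodes", [])),
        "outputobjects": len(response.get("outputobjects", [])),
        "objectRuleChanges": len(rule_changes.get("object", [])),
        "valueRuleChanges": len(rule_changes.get("value", [])),
    }


# Abbreviated sample with the same top-level keys as the example response.
sample = {
    "importRuleChanges": {"object": [], "value": []},
    "flows": [{"id": 8}],
    "datasources": [{"id": 30}, {"id": 31}],
    "flownodes": [{"id": 39}, {"id": 40}, {"id": 41}],
    "outputobjects": [],
}
print(summarize_dry_run(sample))
```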

Reference

For more information on import rule changes, see Define Import Mapping Rules.

API Releases Create v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

NOTE: This feature is in Beta release.

Create a release for the specified deployment.


The release is created from a local ZIP file containing the package of the flow exported from the source system.

NOTE: Releases pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/deployments/<id>/releases

where:

Parameter Description

<id> Internal identifier for the deployment

Request URI - Example:

/v4/deployments/2/releases

Request Body:
The request body must include the following key and value combination submitted as form data:

key value

data "@path-to-file"

Example request - curl:

curl -X POST \
http://example.com:3005/v4/deployments/1/releases \
-H 'authorization: Basic c29sc29uQHRyaWZhY3RhLmNvbTphZG1pbg==' \
-H 'cache-control: no-cache' \
-H 'content-type: multipart/form-data' \
-F [email protected]

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"importRuleChanges": {
"object": [],
"value": []
},
"deletedObjects": {},
"primaryFlowIds": [
6
],
"flows": [
{
"id": 6,
"name": "2013 POS",
"description": null,
"deleted_at": null,
"cpProject": null,
"workspaceId": 1,
"createdAt": "2019-02-13T18:39:03.426Z",
"updatedAt": "2019-02-13T20:39:41.775Z",
"createdBy": 7,
"updatedBy": 7,
"folderId": null
}
],
"datasources": [
{
"isDynamicOrConverted": false,
"id": 18,
"dynamicPath": null,
"cpProject": null,
"workspaceId": 1,
"connectionId": null,
"deleted_at": null,
"isSchematized": true,
"isDynamic": false,
"isConverted": false,
"disableTypeInference": false,
"hasStructuring": true,
"storageLocationId": 13,
"createdAt": "2019-02-13T18:39:03.569Z",
"updatedAt": "2019-02-13T20:39:41.883Z",
"createdBy": 7,
"updatedBy": 7,
"parsingScriptId": 24,
"size_deprecated": null,
"path_deprecated": null
},
{
"isDynamicOrConverted": false,
"id": 19,
"dynamicPath": null,
"cpProject": null,
"workspaceId": 1,
"connectionId": null,
"deleted_at": null,
"isSchematized": true,
"isDynamic": false,
"isConverted": false,
"disableTypeInference": false,
"hasStructuring": true,
"storageLocationId": 14,
"createdAt": "2019-02-13T18:39:03.876Z",
"updatedAt": "2019-02-13T20:39:41.887Z",
"createdBy": 7,
"updatedBy": 7,
"parsingScriptId": 25,
"size_deprecated": null,
"path_deprecated": null
},
{
"isDynamicOrConverted": false,
"id": 20,
"dynamicPath": null,
"cpProject": null,
"workspaceId": 1,
"connectionId": null,
"deleted_at": null,
"isSchematized": true,
"isDynamic": false,
"isConverted": false,
"disableTypeInference": false,
"hasStructuring": true,
"storageLocationId": 15,
"createdAt": "2019-02-13T18:39:04.232Z",
"updatedAt": "2019-02-13T20:39:41.889Z",
"createdBy": 7,
"updatedBy": 7,
"parsingScriptId": 26,
"size_deprecated": null,
"path_deprecated": null
},
{
"isDynamicOrConverted": false,
"id": 21,
"dynamicPath": null,
"cpProject": null,
"workspaceId": 1,
"connectionId": null,
"deleted_at": null,
"isSchematized": true,
"isDynamic": false,
"isConverted": false,
"disableTypeInference": false,
"hasStructuring": true,
"storageLocationId": 16,
"createdAt": "2019-02-13T18:39:04.546Z",
"updatedAt": "2019-02-13T20:39:41.891Z",
"createdBy": 7,
"updatedBy": 7,
"parsingScriptId": 27,
"size_deprecated": null,
"path_deprecated": null
},
{
"isDynamicOrConverted": false,
"id": 22,
"dynamicPath": null,
"cpProject": null,
"workspaceId": 1,
"connectionId": null,
"deleted_at": null,
"isSchematized": true,
"isDynamic": false,
"isConverted": false,
"disableTypeInference": false,
"hasStructuring": true,
"storageLocationId": 17,
"createdAt": "2019-02-13T18:39:04.884Z",
"updatedAt": "2019-02-13T20:39:41.892Z",
"createdBy": 7,
"updatedBy": 7,
"parsingScriptId": 28,
"size_deprecated": null,
"path_deprecated": null
},
{
"isDynamicOrConverted": false,
"id": 23,
"dynamicPath": null,
"cpProject": null,
"workspaceId": 1,
"connectionId": null,
"deleted_at": null,
"isSchematized": true,
"isDynamic": false,
"isConverted": false,
"disableTypeInference": false,
"hasStructuring": true,
"storageLocationId": 18,
"createdAt": "2019-02-13T18:39:05.237Z",
"updatedAt": "2019-02-13T20:39:41.894Z",
"createdBy": 7,
"updatedBy": 7,
"parsingScriptId": 29,
"size_deprecated": null,
"path_deprecated": null
}
],
"flownodes": [
{
"id": 23,
"scriptId": 24,
"flowId": 6,
"flowNodeId": null,
"deleted_at": null,
"wrangled": false,
"referenceId": null,
"createdAt": "2019-02-13T18:39:03.741Z",
"updatedAt": "2019-02-13T20:39:41.993Z",
"createdBy": 7,
"updatedBy": 7,
"activesampleId": null
},
{
"id": 24,
"scriptId": 25,
"flowId": 6,
"flowNodeId": null,
"deleted_at": null,
"wrangled": false,
"referenceId": null,
"createdAt": "2019-02-13T18:39:04.049Z",
"updatedAt": "2019-02-13T20:39:42.004Z",
"createdBy": 7,
"updatedBy": 7,
"activesampleId": null
},
{
"id": 25,
"scriptId": 26,
"flowId": 6,
"flowNodeId": null,
"deleted_at": null,
"wrangled": false,
"referenceId": null,
"createdAt": "2019-02-13T18:39:04.400Z",
"updatedAt": "2019-02-13T20:39:42.012Z",
"createdBy": 7,
"updatedBy": 7,
"activesampleId": null
},
{
"id": 26,
"scriptId": 27,
"flowId": 6,
"flowNodeId": null,
"deleted_at": null,
"wrangled": false,
"referenceId": null,
"createdAt": "2019-02-13T18:39:04.705Z",
"updatedAt": "2019-02-13T20:39:42.021Z",
"createdBy": 7,
"updatedBy": 7,
"activesampleId": null
},
{
"id": 27,
"scriptId": 28,
"flowId": 6,
"flowNodeId": null,
"deleted_at": null,
"wrangled": false,
"referenceId": null,
"createdAt": "2019-02-13T18:39:05.040Z",
"updatedAt": "2019-02-13T20:39:42.028Z",
"createdBy": 7,
"updatedBy": 7,
"activesampleId": null
},
{
"id": 28,
"scriptId": 29,
"flowId": 6,
"flowNodeId": null,
"deleted_at": null,
"wrangled": false,
"referenceId": null,
"createdAt": "2019-02-13T18:39:05.433Z",
"updatedAt": "2019-02-13T20:39:42.035Z",
"createdBy": 7,
"updatedBy": 7,
"activesampleId": null
},
{
"id": 29,
"scriptId": 30,
"flowId": 6,
"flowNodeId": null,
"deleted_at": null,
"wrangled": true,
"referenceId": null,
"createdAt": "2019-02-13T18:39:18.716Z",
"updatedAt": "2019-02-13T20:39:42.043Z",
"createdBy": 7,
"updatedBy": 7,
"activesampleId": null
},
{
"id": 30,
"scriptId": 31,
"flowId": 6,
"flowNodeId": null,
"deleted_at": null,
"wrangled": true,
"referenceId": null,
"createdAt": "2019-02-13T18:39:32.754Z",
"updatedAt": "2019-02-13T20:39:42.048Z",
"createdBy": 7,
"updatedBy": 7,
"activesampleId": null
}
],
"outputobjects": [],
"release": {
"id": 1,
"packageUuid": "9bae78c0-2fcb-11e9-9523-77f56ed58844",
"notes": "example_flow",
"active": true,
"updatedAt": "2019-02-13T20:39:41.764Z",
"createdAt": "2019-02-13T20:39:41.764Z",
"deployment": {
"id": 1
},
"creator": {
"id": 7
},
"updater": {
"id": 7
}
}
}

Reference

For more information on import rule changes, see Define Import Mapping Rules.
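
The import response is large, so a client usually needs only a few values from it. The following Python sketch is one possible way to pull them out; `summarize_release_import` is a hypothetical helper, and the sample body is a trimmed stand-in modeled on the response above, not real platform output.

```python
import json


def summarize_release_import(body: str) -> dict:
    """Collect a few headline values from a release-import response body."""
    doc = json.loads(body)
    release = doc.get("release", {})
    return {
        "release_id": release.get("id"),
        "active": release.get("active"),
        "deployment_id": release.get("deployment", {}).get("id"),
        "primary_flow_ids": doc.get("primaryFlowIds", []),
        "flow_names": [flow.get("name") for flow in doc.get("flows", [])],
        "datasource_count": len(doc.get("datasources", [])),
        "flownode_count": len(doc.get("flownodes", [])),
    }


# Trimmed sample modeled on the response above; most properties are omitted.
sample_body = json.dumps({
    "primaryFlowIds": [6],
    "flows": [{"id": 6, "name": "2013 POS"}],
    "datasources": [{"id": 18}, {"id": 19}],
    "flownodes": [{"id": 23, "flowId": 6}],
    "release": {"id": 1, "notes": "example_flow", "active": True,
                "deployment": {"id": 1}},
})
summary = summarize_release_import(sample_body)
print(summary)
```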

API Releases Delete v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

NOTE: This feature is in Beta release.

Delete the specified release.

NOTE: Releases pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v4/releases/<id>

where:

Parameter Description

<id> Internal identifier for the release

Request URI - Example:

/v4/releases/2

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.
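
As a rough illustration of the request described above, the following Python sketch builds the DELETE call with the standard library and checks for the 204 status. The base URL is taken from the curl example elsewhere in this guide and is an assumption for your environment; `build_delete_release` and `is_deleted` are hypothetical helper names, and the credentials string is a placeholder.

```python
import urllib.request

BASE_URL = "http://example.com:3005"  # assumption: replace with your platform host


def build_delete_release(release_id: int, auth_header: str) -> urllib.request.Request:
    """Build (but do not send) the DELETE request for one release."""
    return urllib.request.Request(
        f"{BASE_URL}/v4/releases/{release_id}",
        method="DELETE",
        headers={"Authorization": auth_header},
    )


def is_deleted(status_code: int) -> bool:
    """A successful delete returns 204 - No Content with an empty body."""
    return status_code == 204


request = build_delete_release(2, "Basic <credentials>")
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request).status
```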

Reference

None.

API Releases Get v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

NOTE: This feature is in Beta release.

Get the specified release.

NOTE: Releases pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/releases/<id>

where:

Parameter Description

<id> Internal identifier for the release

Request URI - Example:

/v4/releases/1

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 1,
"notes": "example_flow",
"packageUuid": "9bae78c0-2fcb-11e9-9523-77f56ed58844",
"active": null,
"createdAt": "2019-02-13T20:39:41.764Z",
"updatedAt": "2019-02-13T20:42:08.746Z",
"deployment": {
"id": 1
},
"creator": {
"id": 7
},
"updater": {
"id": 7
}
}

Reference

Property Description

id Internal identifier for the release

notes Display value for notes that you can add to describe the release

packageUuid Internal identifier for the package

active If true, the release is the active one for the deployment.

If set to null, the release is not active.

createdAt Timestamp for when the release was created.

updatedAt Timestamp for when the release was last updated.

deployment.id Internal identifier for the deployment to which the release is assigned

creator.id Internal identifier for the user who created the release

updater.id Internal identifier for the user who last updated the release
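
Since `active` is either true or null, a client should test for true explicitly rather than for a missing value. The Python sketch below shows one way to pick the active release out of several release objects; `find_active_release` is a hypothetical helper, and the second sample release is invented for illustration.

```python
import json
from typing import List, Optional


def find_active_release(releases: List[dict]) -> Optional[dict]:
    """Return the release whose active property is true, or None if there is none."""
    for release in releases:
        if release.get("active") is True:
            return release
    return None


# The first object mirrors the example response; the second is hypothetical.
releases = json.loads("""
[
  {"id": 1, "notes": "example_flow", "active": null, "deployment": {"id": 1}},
  {"id": 2, "notes": "hypothetical_newer_release", "active": true, "deployment": {"id": 1}}
]
""")
active = find_active_release(releases)
print(active["id"] if active else "no active release")
```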

API Releases Package Get v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

NOTE: This feature is in Beta release.

Retrieve a package containing the definition of the flow for the specified release.

NOTE: Releases pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

NOTE: This method exports flows from a Production instance, which is different from exporting through the
flows/:id/package endpoint, which exports from the Dev instance. Connection identifiers and paths may differ
between the two instances. This method is typically used for archiving flows from the Deployment
Manager.

Response body is the contents of the package. Package contents are a ZIPped version of the flow definition.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/releases/<id>/package

Parameter Description

<id> Internal identifier of the release to retrieve.

Request URI - Example:

/v4/releases/7/package

Request Body:
None.

Response

Response Status Code - Success: 200 - OK

Response Body Example:


Response body is the contents of the ZIP file. This package should be downloaded to your local environment.
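
Because the response body is the ZIP file itself, a client can verify the download before archiving it. The Python sketch below is a hedged illustration: `list_package_entries` is a hypothetical helper, and the entry name `flow_definition.json` is a stand-in used to build a test archive, not the actual layout of a package.

```python
import io
import zipfile


def list_package_entries(package_bytes: bytes) -> list:
    """Check that the bytes form a readable ZIP archive and return its entry names."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        corrupt = archive.testzip()  # returns the first bad entry name, or None
        if corrupt is not None:
            raise ValueError(f"corrupt entry in package: {corrupt}")
        return archive.namelist()


# Stand-in package built in memory; a real one would be the response body bytes.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("flow_definition.json", "{}")  # hypothetical entry name

entries = list_package_entries(buffer.getvalue())
print(entries)
```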

Reference

None.

API Releases Patch v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

NOTE: This feature is in Beta release.

Update the specified release.

NOTE: Releases pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PATCH

Endpoint:

/v4/releases/<id>

where:

Parameter Description

<id> Internal identifier for the release

Request URI - Example:

/v4/releases/2

Request Body: Example - Activate the release


You can use the following example to make the current release the active one for the deployment.

NOTE: You can have only one active release per deployment. If this release is made active as part of this
execution, the currently active release is made inactive.

NOTE: For the PATCH method, only the properties that are being patched need to be submitted.

Tip: You can use this endpoint to deactivate a release, which prevents its jobs from being run. If there is
no active release for the deployment, no jobs are run via the deployment job run endpoint. See
API Deployments Run v4.

{
"active": true
}
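
A minimal sketch of sending this body from Python follows. The base URL comes from the curl example elsewhere in this guide and is an assumption; `build_release_patch` is a hypothetical helper. Only the `true` value shown above is confirmed by this page; passing `False` to deactivate a release is an assumption that you should verify against your deployment.

```python
import json
import urllib.request

BASE_URL = "http://example.com:3005"  # assumption: replace with your platform host


def build_release_patch(release_id: int, active: bool, auth_header: str) -> urllib.request.Request:
    """Build (but do not send) a PATCH request that submits only the active property."""
    body = json.dumps({"active": active}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v4/releases/{release_id}",
        data=body,
        method="PATCH",
        headers={"Authorization": auth_header, "Content-Type": "application/json"},
    )


request = build_release_patch(2, True, "Basic <credentials>")
print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```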

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 2,
"updater": {
"id": 7
},
"updatedAt": "2019-02-13T20:55:21.276Z"
}

Reference

For more information on the properties of a release, see API Releases Get v4.

API WrangledDatasets Create v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Create a new wrangled dataset from the specified imported dataset or wrangled dataset. The new wrangled dataset is
owned by the authenticated user.

Tip: In the Trifacta application UI, the WrangledDataset object is called a recipe.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/wrangledDatasets/

Request Body - Imported Dataset:

{
"name": "POS - PROD 2",
"importedDataset": {
"id": 2
},
"flow": {
"id": 1
}
}

Request Body - Wrangled Dataset:

{
"name": "POS - PROD 3",
"wrangledDataset": {
"id": 8
},
"flow": {
"id": 1
}
}
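
The two request bodies differ only in which source object they name. The following Python sketch builds either form and rejects a request that names both sources or neither; `build_create_body` is a hypothetical helper, not part of the platform.

```python
import json
from typing import Optional


def build_create_body(name: str, flow_id: int,
                      imported_dataset_id: Optional[int] = None,
                      wrangled_dataset_id: Optional[int] = None) -> dict:
    """Build the request body from exactly one source: imported or wrangled dataset."""
    if (imported_dataset_id is None) == (wrangled_dataset_id is None):
        raise ValueError("specify exactly one of imported_dataset_id or wrangled_dataset_id")
    if imported_dataset_id is not None:
        source = {"importedDataset": {"id": imported_dataset_id}}
    else:
        source = {"wrangledDataset": {"id": wrangled_dataset_id}}
    return {"name": name, **source, "flow": {"id": flow_id}}


from_imported = build_create_body("POS - PROD 2", flow_id=1, imported_dataset_id=2)
from_wrangled = build_create_body("POS - PROD 3", flow_id=1, wrangled_dataset_id=8)
print(json.dumps(from_imported, indent=2))
```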

Response

Response Status Code - Success: 201 - Created

Response Body Example - Imported Dataset:

{
"id": 8,
"wrangled": true,
"updatedAt": "2018-02-06T19:59:22.735Z",
"createdAt": "2018-02-06T19:59:22.698Z",
"name": "POS - PROD 2",
"active": true,
"referenceInfo": null,
"activeSample": {
"id": 8
},
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"recipe": {
"id": 8
},
"flow": {
"id": 1
}
}

Response Body Example - Wrangled Dataset:

{
"id": 9,
"wrangled": true,
"updatedAt": "2018-02-06T20:01:26.368Z",
"createdAt": "2018-02-06T20:01:26.329Z",
"name": "POS - PROD 3",
"active": true,
"referenceInfo": null,
"activeSample": {
"id": 9
},
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"recipe": {
"id": 9
},
"flow": {
"id": 1
}
}

Reference

For more information on the properties of a wrangled dataset, see API WrangledDatasets Get v4.

API WrangledDatasets Delete v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Delete the specified wrangled dataset.

Tip: In the Trifacta application UI, the WrangledDataset object is called a recipe.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v4/wrangledDatasets/<id>

where:

Parameter Description

<id> Internal identifier for the wrangled dataset

Request URI - Example:

/v4/wrangledDatasets/7

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

For more information on the properties of a wrangled dataset, see API WrangledDatasets Get v4.

API WrangledDatasets Get List v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get the list of accessible wrangled datasets for the authenticated user.

Tip: In the Trifacta application UI, the WrangledDataset object is called a recipe.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/wrangledDatasets

Endpoint with embedded flow:

/v4/wrangledDatasets?embed=flow

For more information, see API WrangledDatasets Get v4.


Endpoint with paged retrieval:
By default, this endpoint returns results in sets of 25.
You can apply query parameters to change the size of the default set and to page through result sets. The
following example queries for results 100 at a time. In this case, the query asks for results 201-300:

/v4/wrangledDatasets?embed=flow&limit=100&offset=2

If the count of retrieved results is less than the limit, you have reached the end of the results.
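
The stopping rule above can be sketched as a small Python generator. Everything here is illustrative: `iterate_results` and `fake_fetch` are hypothetical names, and `offset_step` is an assumption about how the offset advances. The example on this page suggests that the offset counts pages (a step of 1); if your deployment counts records instead, pass the limit as the step.

```python
from typing import Callable, Iterator, List


def iterate_results(fetch_page: Callable[[int, int], List[dict]],
                    limit: int = 25, offset_step: int = 1) -> Iterator[dict]:
    """Yield results page by page until a page returns fewer items than the limit."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:  # short (or empty) page: end of results
            return
        offset += offset_step  # assumption: see the note on offset units above


# Stand-in for the HTTP call; it treats the offset as a page index.
records = [{"id": n} for n in range(1, 8)]


def fake_fetch(limit: int, offset: int) -> List[dict]:
    start = offset * limit
    return records[start:start + limit]


ids = [item["id"] for item in iterate_results(fake_fetch, limit=3)]
print(ids)
```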
Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"data": [
{
"id": 7,
"wrangled": true,
"createdAt": "2018-02-06T19:47:56.146Z",
"updatedAt": "2018-02-06T19:47:56.183Z",
"recipe": {
"id": 7
},
"flow": {
"id": 1,
"name": "2013 POS",
"description": null,
"createdAt": "2018-02-06T19:42:55.091Z",
"updatedAt": "2018-02-06T19:42:55.091Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
}
},
"name": "REF_CAL",
"description": null,
"referenceInfo": null,
"activeSample": {
"id": 7
},
"creator": {
"id": 1
},
"updater": {
"id": 1
}
},
{
"id": 6,
"wrangled": true,
"createdAt": "2018-02-06T19:43:02.791Z",
"updatedAt": "2018-02-06T19:43:02.838Z",
"recipe": {
"id": 6
},
"flow": {
"id": 1,
"name": "2013 POS",
"description": null,
"createdAt": "2018-02-06T19:42:55.091Z",
"updatedAt": "2018-02-06T19:42:55.091Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
}
},
"name": "POS-r01",
"description": null,
"referenceInfo": null,
"activeSample": {
"id": 6
},
"creator": {
"id": 1
},
"updater": {
"id": 1
}
}
]
}

Reference

For more information on the properties of a wrangled dataset, see API WrangledDatasets Get v4.
For more information on the embedded flow properties, see API Flows Get v4.

API WrangledDatasets Get PrimaryInputDataset v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get the primary input dataset for the specified wrangled dataset. For a wrangled dataset, its primary input
dataset is the original dataset from which the wrangled dataset was created.

Tip: In the Trifacta application UI, the WrangledDataset object is called a recipe.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/wrangledDatasets/<id>/primaryInputDataset

where:

Parameter Description

<id> Internal identifier for the wrangled dataset

Request URI - Example:

/v4/wrangledDatasets/31/primaryInputDataset

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example - imported dataset:

{
"importedDataset": {
"path": "/trifacta/uploads/1/3630243f-4a20-4996-8fa4-cb190c565358/POS-r01.txt",
"bucket": null,
"container": null,
"type": "hdfs",
"blobHost": null,
"isDynamicOrConverted": false,
"id": 9,
"dynamicPath": null,
"isSchematized": true,
"isDynamic": false,
"isConverted": false,
"disableTypeInference": false,
"hasStructuring": true,
"createdAt": "2019-01-28T19:54:47.667Z",
"updatedAt": "2019-01-28T19:54:47.847Z",
"storageLocation": {
"fullUri": "hdfs:///trifacta/uploads/1/3630243f-4a20-4996-8fa4-cb190c565358/POS-r01.txt",
"id": 34,
"path": "/trifacta/uploads/1/3630243f-4a20-4996-8fa4-cb190c565358/POS-r01.txt",
"size": "292817",
"workspaceId": 1,
"type": "hdfs",
"bucket": null,
"blobHost": null,
"container": null,
"hash": "215944e85eaabf9fd7ae837b36ff80711abe7ae1",
"createdAt": "2019-01-28T19:54:47.664Z",
"updatedAt": "2019-01-28T19:54:47.664Z"
},
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"workspace": {
"id": 1
},
"parsingRecipe": {
"id": 25
},
"connection": null
}
}

Response Body Example - recipe:

{
"wrangledDataset": {
"id": 11,
"wrangled": true,
"createdAt": "2018-04-24T16:12:14.018Z",
"updatedAt": "2018-04-24T17:05:06.741Z",
"referenceInfo": {
"id": 1
},
"activeSample": {
"id": 11
},
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"recipe": {
"id": 11
},
"flow": {
"id": 3
}
}
}
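
Because the response wraps its payload in either an `importedDataset` or a `wrangledDataset` key, a client can branch on whichever key is present. The Python sketch below shows one way to do that; `describe_primary_input` is a hypothetical helper, and the sample bodies are trimmed versions of the two examples above.

```python
import json
from typing import Tuple


def describe_primary_input(body: str) -> Tuple[str, int]:
    """Return (kind, id) for the primary input found in the response body."""
    doc = json.loads(body)
    for kind in ("importedDataset", "wrangledDataset"):
        if kind in doc:
            return kind, doc[kind]["id"]
    raise ValueError("response contains neither importedDataset nor wrangledDataset")


# Trimmed samples modeled on the two response examples above.
imported_body = '{"importedDataset": {"id": 9, "type": "hdfs"}}'
recipe_body = '{"wrangledDataset": {"id": 11, "wrangled": true, "flow": {"id": 3}}}'

print(describe_primary_input(imported_body))
print(describe_primary_input(recipe_body))
```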

Reference

Imported Dataset:
For more information on these properties, see API ImportedDatasets Get v4.

Wrangled Dataset:

Property Description

id Internal identifier of the recipe associated with this dataset

wrangled If true, this object is a recipe.

createdAt Timestamp for when the recipe was created

updatedAt Timestamp for when the recipe was last updated

referenceInfo.id Internal identifier of the object that provides input to this one.

activeSample.id Internal identifier of the sample currently associated with this recipe.

creator.id Internal identifier of the user who created this recipe.

updater.id Internal identifier of the user who last modified this recipe.

recipe.id Internal identifier of the recipe associated with this dataset

flow.id Internal identifier of the flow that contains this dataset

For more information on the other properties, see API WrangledDatasets Get v4.

API WrangledDatasets Get v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get the specified wrangled dataset.

Tip: In the Trifacta application UI, the WrangledDataset object is called a recipe.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/wrangledDatasets/<id>

Endpoint with embedded reference:


Use the following endpoint to embed information about the flow that contains the dataset in the response body.

/v4/wrangledDatasets/<id>?embed=flow

where:

Parameter Description

<id> Internal identifier for the wrangled dataset

Request URI - Example:

/v4/wrangledDatasets/6/

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 6,
"wrangled": true,
"createdAt": "2018-02-06T19:43:02.791Z",
"updatedAt": "2018-02-06T19:43:02.838Z",
"recipe": {
"id": 6
},
"name": "POS-r01",
"description": null,
"referenceInfo": null,
"activeSample": {
"id": 6
},
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"flow": {
"id": 1
}
}

Response Body Example with embedded reference:

{
"id": 6,
"wrangled": true,
"createdAt": "2018-02-06T19:43:02.791Z",
"updatedAt": "2018-02-06T19:43:02.838Z",
"flow": {
"id": 1,
"name": "2013 POS",
"description": null,
"createdAt": "2018-02-06T19:42:55.091Z",
"updatedAt": "2018-02-06T19:42:55.091Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
}
},
"recipe": {
"id": 6
},
"name": "POS-r01",
"description": null,
"referenceInfo": null,
"activeSample": {
"id": 6
},
"creator": {
"id": 1
},
"updater": {
"id": 1
}
}

Reference

Wrangled Dataset:
These properties apply to the source of the wrangled dataset.

Property Description

id Internal identifier of the wrangled dataset

wrangled If true, this dataset is a wrangled dataset

createdAt Timestamp for when the dataset was created

updatedAt Timestamp for when the dataset was last updated

recipe Internal identifier of the recipe associated with this dataset

name Internal name of the wrangled dataset

description User-friendly description for the wrangled dataset

activeSample Internal identifier of the currently active sample for this dataset

creator Internal identifier of the user who created the wrangled dataset

updater Internal identifier of the user who last updated the wrangled dataset

flow Internal identifier of the flow that contains this dataset

Embedded Flow:
For more information on the embedded flow properties, see API Flows Get v4.

API WrangledDatasets Patch v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Modify the specified wrangled dataset.

Tip: In the Trifacta application UI, the WrangledDataset object is called a recipe.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PATCH

Endpoint:

/v4/wrangledDatasets/<id>

where:

Parameter Description

<id> Internal identifier for the wrangled dataset

Request URI - Example:

/v4/wrangledDatasets/12

Request Body:
Only the name and description properties should be modified. Modified properties must be included in the request.
In the following example, the name and the description of the wrangled dataset are modified:

{
"name": "Wrangled DS 2",
"description": "This is my wrangled dataset #2."
}
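
Since only `name` and `description` should be modified, a client can guard against sending anything else. The following Python sketch is one possible guard; `build_patch_body` is a hypothetical helper, and serializing with `json.dumps` also avoids hand-written JSON mistakes such as a missing comma.

```python
import json

ALLOWED_PROPERTIES = {"name", "description"}


def build_patch_body(**changes: str) -> str:
    """Serialize only the permitted properties; refuse anything else."""
    if not changes:
        raise ValueError("at least one property must be supplied")
    unexpected = set(changes) - ALLOWED_PROPERTIES
    if unexpected:
        raise ValueError(f"properties not permitted in this request: {sorted(unexpected)}")
    return json.dumps(changes)


body = build_patch_body(name="Wrangled DS 2",
                        description="This is my wrangled dataset #2.")
print(body)
```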

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 12,
"updater": {
"id": 1
},
"updatedAt": "2019-02-14T23:08:44.984Z"
}

Reference

For more information on the properties of a wrangled dataset, see API WrangledDatasets Get v4.

API WrangledDatasets Post AddToFlow v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Add the specified wrangled dataset to a flow as a reference. A reference is a link from one flow to the output of a
wrangled dataset that is sourced from another flow.

Tip: In the Trifacta application UI, the WrangledDataset object is called a recipe.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/wrangledDatasets/<id>/addToFlow

where:

Parameter Description

<id> Internal identifier for the wrangled dataset

Request URI - Example:

/v4/wrangledDatasets/15/addToFlow

Request Body:

{
"flow": {
"id": 27
}
}
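
As a hedged illustration, the request above could be assembled in Python as follows. The base URL is borrowed from the curl example elsewhere in this guide and is an assumption; `build_add_to_flow` is a hypothetical helper, and the credentials string is a placeholder.

```python
import json
import urllib.request

BASE_URL = "http://example.com:3005"  # assumption: replace with your platform host


def build_add_to_flow(wrangled_dataset_id: int, target_flow_id: int,
                      auth_header: str) -> urllib.request.Request:
    """Build (but do not send) the POST that adds a dataset to a flow as a reference."""
    body = json.dumps({"flow": {"id": target_flow_id}}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v4/wrangledDatasets/{wrangled_dataset_id}/addToFlow",
        data=body,
        method="POST",
        headers={"Authorization": auth_header, "Content-Type": "application/json"},
    )


request = build_add_to_flow(15, 27, "Basic <credentials>")
print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```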

Response

Response Status Code - Success: 201 - Created

Response Body:

{
"id": 43,
"wrangled": false,
"updatedAt": "2019-01-26T01:09:38.022Z",
"createdAt": "2019-01-26T01:09:38.022Z",
"referenceInfo": null,
"activeSample": null,
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"recipe": null,
"referencedFlowNode": {
"id": 27
},
"flow": {
"id": 15
}
}

Reference

Property Description

id Internal identifier for the new wrangled dataset.

wrangled If true, the dataset is a wrangled dataset.

If false, the dataset is a reference dataset.

updatedAt Timestamp for when the dataset was updated.

createdAt Timestamp for when the dataset was created.

referenceInfo If the wrangled dataset has a reference object defined for it, its information is listed here.

activeSample Internal identifier for the currently active sample for the dataset.

If null, there is no currently active sample for the dataset.

creator.id Internal identifier of the user who created the dataset.

updater.id Internal identifier of the user who performed the update.

recipe Internal identifier for the recipe associated with the dataset in its new flow.

If null, the dataset has not been wrangled in the Transformer page.

referencedFlowNode.id Internal identifier of the original dataset that is referenced.

flow.id Internal identifier of the flow that contains this dataset

API WrangledDatasets Put PrimaryInputDataset v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Update the primary input dataset for the specified wrangled dataset. Each wrangled dataset must have one and
only one primary input dataset, which can be an imported or wrangled dataset.

Tip: In the Trifacta application UI, the WrangledDataset object is called a recipe.

This action performs a dataset swap for the source of a wrangled dataset, which can be done through the UI. See
Flow View Page.

Tip: After you have created a job via API, you can use this API to swap out the source data for the job's
dataset. In this manner, you can rapidly re-execute a pre-existing job using fresh data. See
API JobGroups Create v4.

Version: v4

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PUT

Endpoint:

/v4/wrangledDatasets/<id>/primaryInputDataset

where:

Parameter Description

<id> Internal identifier for the wrangled dataset

Request URI - Example:

/v4/wrangledDatasets/14/primaryInputDataset

Request Body - imported dataset:

{
"importedDataset": {
"id": 9
}
}

Request Body - wrangled dataset:

{
"wrangledDataset": {
"id": 13
}
}
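
A dataset swap could be scripted along these lines. This Python sketch only builds the PUT request; the base URL is an assumption taken from the curl example elsewhere in this guide, and `build_swap_request` is a hypothetical helper that accepts exactly the two body forms shown above.

```python
import json
import urllib.request

BASE_URL = "http://example.com:3005"  # assumption: replace with your platform host
SOURCE_KEYS = ("importedDataset", "wrangledDataset")


def build_swap_request(wrangled_dataset_id: int, source_key: str, source_id: int,
                       auth_header: str) -> urllib.request.Request:
    """Build (but do not send) the PUT that replaces the primary input dataset."""
    if source_key not in SOURCE_KEYS:
        raise ValueError(f"source_key must be one of {SOURCE_KEYS}")
    body = json.dumps({source_key: {"id": source_id}}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v4/wrangledDatasets/{wrangled_dataset_id}/primaryInputDataset",
        data=body,
        method="PUT",
        headers={"Authorization": auth_header, "Content-Type": "application/json"},
    )


request = build_swap_request(14, "importedDataset", 9, "Basic <credentials>")
print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```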

Response

Response Status Code - Success: 200 - OK

Response Body Example - imported dataset:

{
"id": 14,
"wrangled": true,
"createdAt": "2019-02-13T18:39:32.754Z",
"updatedAt": "2019-02-13T18:40:11.639Z",
"recipe": {
"id": 14,
"name": "POS-r01",
"description": null,
"active": true,
"nextPortId": 2,
"createdAt": "2019-02-13T18:39:32.737Z",
"updatedAt": "2019-02-13T18:40:11.638Z",
"currentEdit": {
"id": 18
},
"redoLeafEdit": {
"id": 16
},
"creator": {
"id": 1
},
"updater": {
"id": 1
}
},
"referenceInfo": null,
"activeSample": {
"id": 14
},
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"referencedFlowNode": null,
"flow": {
"id": 3
}
}

Response Body Example - wrangled dataset:

{
"id": 14,
"wrangled": true,
"createdAt": "2019-02-13T18:39:32.754Z",
"updatedAt": "2019-02-13T18:40:11.639Z",
"recipe": {
"id": 14,
"name": "POS-r01",
"description": null,
"active": true,
"nextPortId": 2,
"createdAt": "2019-02-13T18:39:32.737Z",
"updatedAt": "2019-02-13T18:40:11.638Z",
"currentEdit": {
"id": 18
},
"redoLeafEdit": {
"id": 16
},
"creator": {
"id": 1
},
"updater": {
"id": 1
}
},
"referenceInfo": null,
"activeSample": {
"id": 14
},
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"referencedFlowNode": null,
"flow": {
"id": 3
}
}

Reference

For more information on these properties, see API WrangledDatasets Get PrimaryInputDataset v4.

API WriteSettings Create v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Create a writesettings object.


Version: v4

Relevant terms:

Term Description

outputobjects An outputobject is a definition of one or more types of outputs and how they are generated. It must be
associated with a recipe.

NOTE: An outputobject must be created for a recipe before you can run a job on it. One and only one
outputobject can be associated with a recipe.

writesettings A writesettings object defines file-based outputs within an outputobject. Settings include path, format,
compression, and delimiters.

publications A publications object is used to specify a table-based output and is associated with an outputobject. Settings
include the connection to use, path, table type, and write action to apply.

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v4/writesettings

Request Body:
The following request creates a writesettings object that writes a new Parquet file to the designated location each time that the
job is run.

{
"delim": ",",
"path": "hdfs://hadoop:50070/trifacta/queryResults/[email protected]/POS_r03.pqt",
"action": "create",
"format": "pqt",
"compression": "none",
"header": false,
"asSingleFile": false,
"prefix": null,
"suffix": "_increment",
"hasQuotes": false,
"outputObjectId": 5
}
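
One way to assemble this body in code is with a small dataclass whose defaults mirror the example above. This Python sketch is an assumption-laden illustration: the `WriteSettings` class is hypothetical, the defaults are copied from this one example rather than from any documented default, and the path below is a made-up placeholder.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class WriteSettings:
    """Request body for creating a writesettings object (defaults copied from the example)."""
    path: str
    outputObjectId: int
    format: str = "pqt"
    action: str = "create"
    compression: str = "none"
    delim: str = ","
    header: bool = False
    asSingleFile: bool = False
    prefix: Optional[str] = None
    suffix: str = "_increment"
    hasQuotes: bool = False


# Hypothetical output path; substitute a location valid in your environment.
settings = WriteSettings(
    path="hdfs://hadoop:50070/trifacta/queryResults/example-user/POS_r03.pqt",
    outputObjectId=5,
)
payload = asdict(settings)
print(json.dumps(payload, indent=2))
```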

Response

Response Status Code - Success: 201 - Created

{
"delim": ",",
"id": 7,
"path": "hdfs://hadoop:50070/trifacta/queryResults/[email protected]/POS_r03.pqt",
"action": "create",
"format": "pqt",
"compression": "none",
"header": false,
"asSingleFile": false,
"prefix": null,
"suffix": "_increment",
"hasQuotes": false,
"updatedAt": "2018-11-08T00:15:22.948Z",
"createdAt": "2018-11-08T00:15:22.948Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"outputObjectId": 5
}

Reference

For more information on the properties of a writesettings object, see API WriteSettings Get v4.

API WriteSettings Delete v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Delete the specified writesettings object.


Version: v4

Relevant terms:

Term Description

outputobjects An outputobject is a definition of one or more types of outputs and how they are generated. It must be
associated with a recipe.

NOTE: An outputobject must be created for a recipe before you can run a job on it. One and only one
outputobject can be associated with a recipe.

writesettings A writesettings object defines file-based outputs within an outputobject. Settings include path, format,
compression, and delimiters.

publications A publications object is used to specify a table-based output and is associated with an outputobject. Settings
include the connection to use, path, table type, and write action to apply.

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v4/writesettings/<id>

where:

Parameter Description

<id> Internal identifier for the writesettings object

Request URI - Example:

/v4/writesettings/2

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

For more information on the properties of a writesettings object, see API WriteSettings Get v4.

API WriteSettings Get List v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get the list of accessible writesettings objects for the authenticated user.

Version: v4

Relevant terms:

Term Description

outputobjects An outputobject is a definition of one or more types of outputs and how they are generated. It must be
associated with a recipe.

NOTE: An outputobject must be created for a recipe before you can run a job on it. One and only one
outputobject can be associated with a recipe.

writesettings A writesettings object defines file-based outputs within an outputobject. Settings include path, format,
compression, and delimiters.

publications A publications object is used to specify a table-based output and is associated with an outputobject. Settings
include the connection to use, path, table type, and write action to apply.

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/writesettings

For more information, see API WriteSettings Get v4.


Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"data": [
{
"delim": ",",
"id": 6,
"path": "hdfs://hadoop:50070/trifacta/queryResults/[email protected]/POS_r01.pqt",
"action": "create",
"format": "pqt",
"compression": "none",
"header": false,
"asSingleFile": false,
"prefix": null,
"suffix": "_increment",
"hasQuotes": false,
"createdAt": "2018-11-07T23:47:15.144Z",
"updatedAt": "2018-11-07T23:47:15.144Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"outputObject": {
"id": 5
}
},
{
"delim": ",",
"id": 1,
"path": "hdfs://hadoop:50070/trifacta/queryResults/[email protected]/AllFileFormats_2Fweb.csv",
"action": "create",
"format": "csv",
"compression": "none",
"header": true,
"asSingleFile": true,
"prefix": null,
"suffix": "_increment",
"hasQuotes": true,
"createdAt": "2018-11-07T16:01:53.699Z",
"updatedAt": "2018-11-07T16:01:53.699Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"outputObject": {
"id": 1
}
}
]
}
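
A response of this shape can be summarized per outputobject with a few lines of Python. The sketch below is illustrative only: the function name is hypothetical, and the sample string is a trimmed version of the response above that keeps just the properties the function reads.

```python
import json


def index_by_output_object(response_text):
    """Group (writesettings id, format) pairs by the owning outputobject id."""
    body = json.loads(response_text)
    grouped = {}
    for ws in body["data"]:
        output_object_id = ws["outputObject"]["id"]
        grouped.setdefault(output_object_id, []).append((ws["id"], ws["format"]))
    return grouped


# Trimmed sample based on the response body example above.
sample = (
    '{"data": ['
    '{"id": 6, "format": "pqt", "outputObject": {"id": 5}},'
    '{"id": 1, "format": "csv", "outputObject": {"id": 1}}'
    "]}"
)
summary = index_by_output_object(sample)
```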

Reference

For more information on the properties of a writesettings object, see API WriteSettings Get v4.
API WriteSettings Get v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Get the specified writesettings object.


The writesettings values allow you to specify aspects of the publication of results to the specified path location.

NOTE: writesettings values are required if you are running the specified job for the dataset for the
first time.

NOTE: To specify multiple outputs, you can include additional writesettings objects in the request.
For example, if you want to generate output to csv and json, you can duplicate the writesettings object
for csv and change the format value in the second one to json.

These settings correspond to values that you can apply through the UI or through the command line interface.
For UI information, see Run Job Page.
For CLI information, see CLI for Jobs.
Version: v4

Terms...
Relevant terms:

Term Description

outputobjects An outputobject is a definition of one or more types of outputs and how they are generated. It must be
associated with a recipe.

NOTE: An outputobject must be created for a recipe before you can run a job on it. One and only one
outputobject can be associated with a recipe.

writesettings A writesettings object defines file-based outputs within an outputobject. Settings include path, format,
compression, and delimiters.

publications A publications object is used to specify a table-based output and is associated with an outputobject. Settings
include the connection to use, path, table type, and write action to apply.

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v4/writesettings/<id>

Request URI - Example:

/v4/writesettings/6/

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"delim": ",",
"id": 6,
"path": "hdfs://hadoop:50070/trifacta/queryResults/[email protected]/POS_r01.pqt",
"action": "create",
"format": "pqt",
"compression": "none",
"header": false,
"asSingleFile": false,
"prefix": null,
"suffix": "_increment",
"hasQuotes": false,
"createdAt": "2018-11-07T23:47:15.144Z",
"updatedAt": "2018-11-07T23:47:15.144Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"outputObject": {
"id": 5
}
}

Reference

Property Description

delim The delimiter between field values in an output row

id Internal identifier of the writesettings object

path (required) The fully qualified path to the output location where the results are written

action (required) If the output file or directory exists, you can specify one of the following actions:

create - Create a new, parallel location, preserving the old results.
append - Add the new results to the old results.
overwrite - Replace the old results with the new results.

format (required) Output format for the results. Specify one of the following values:

csv
json
avro
pqt (Parquet)
tde (Tableau)

NOTE: To specify multiple output formats, create an additional writesettings object for each output format.

compression (optional) For csv and json results, you can optionally compress them using bzip2 or gzip compression. Default
is none.

NOTE: If compression is applied, the filename in the path value must end with the appropriate extension for
the type of compression:

.gz for GZIP
.bz2 for BZIP2

header (optional) For csv results with action set to create or append, this value determines if a header row with
column names is inserted at the top of the results. Default is false.

asSingleFile (optional) For csv and json results, this value determines if the results are concatenated into a single file or stored as
multiple files. Default is false.

prefix
NOTE: Do not use or modify. For internal platform use only.

suffix
NOTE: Do not use or modify. For internal platform use only.

hasQuotes If true, each field in the output is wrapped in double-quotes.

createdAt Timestamp for when the writesettings object was created

updatedAt Timestamp for when the writesettings object was last updated

creator.id Internal identifier of the user who created the object

updater.id Internal identifier of the user who last updated the object

outputObject.id If specified, this value is the internal identifier of the outputobject with which this writesettings object is associated.
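
The constraints in the table above (required properties, permitted action and format values, and the file extension that compression requires) can be checked on the client before a request is sent. The following Python sketch is an assumption-laden illustration: the function name is hypothetical, and the compression strings "gzip" and "bzip2" are assumed spellings, since the table names the compression types but shows only the literal value none.

```python
ALLOWED_ACTIONS = {"create", "append", "overwrite"}
ALLOWED_FORMATS = {"csv", "json", "avro", "pqt", "tde"}
# Assumed spellings of the compression values; only "none" appears literally above.
COMPRESSION_EXTENSIONS = {"gzip": ".gz", "bzip2": ".bz2"}


def check_writesettings(ws):
    """Return a list of problems found in a writesettings dict (empty if none)."""
    problems = []
    for key in ("path", "action", "format"):
        if not ws.get(key):
            problems.append(f"missing required property: {key}")
    if ws.get("action") and ws["action"] not in ALLOWED_ACTIONS:
        problems.append(f"unknown action: {ws['action']}")
    if ws.get("format") and ws["format"] not in ALLOWED_FORMATS:
        problems.append(f"unknown format: {ws['format']}")
    compression = ws.get("compression", "none")
    if compression != "none":
        if ws.get("format") not in ("csv", "json"):
            problems.append("compression applies only to csv and json results")
        extension = COMPRESSION_EXTENSIONS.get(compression)
        if extension is None:
            problems.append(f"unknown compression: {compression}")
        elif not (ws.get("path") or "").endswith(extension):
            problems.append(f"path must end with {extension} for {compression}")
    return problems


# Hypothetical paths, for illustration only.
good = {"path": "hdfs://hadoop:50070/out/results.csv.gz", "action": "create",
        "format": "csv", "compression": "gzip"}
bad = dict(good, path="hdfs://hadoop:50070/out/results.csv")
```

Returning a list rather than raising on the first problem lets a caller report every issue in a single pass.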

API WriteSettings Update v4

This is the latest version of the APIs.

Contents:
Required Permissions
Request
Response
Reference

Modify the specified writesettings object.


Version: v4

Terms...
Relevant terms:

Term Description

outputobjects An outputobject is a definition of one or more types of outputs and how they are generated. It must be
associated with a recipe.

NOTE: An outputobject must be created for a recipe before you can run a job on it. One and only one
outputobject can be associated with a recipe.

writesettings A writesettings object defines file-based outputs within an outputobject. Settings include path, format,
compression, and delimiters.

publications A publications object is used to specify a table-based output and is associated with an outputobject. Settings
include the connection to use, path, table type, and write action to apply.

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PUT

Endpoint:

/v4/writesettings/<id>

Request URI - Example:

/v4/writesettings/6/

Request Body:

{
"delim": ",",
"path": "hdfs://hadoop:50070/trifacta/queryResults/[email protected]/POS_r01a.pqt",
"action": "create",
"format": "pqt",
"compression": "none",
"header": false,
"asSingleFile": false,
"prefix": null,
"suffix": "_increment",
"hasQuotes": true,
"outputObjectId": 5
}

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 6,
"updater": {
"id": 1
},
"updatedAt": "2018-11-08T00:13:43.819Z"
}
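
As a rough sketch of how a client might assemble this update, the Python below builds (but does not send) the PUT request. The helper name, base URL, credentials header, and the "<path>" placeholder are assumptions for illustration. Whether the platform accepts a partial body is not stated here, so the example passes a fuller set of properties in the spirit of the request body above.

```python
import json
import urllib.request


def build_update_writesettings_request(base_url, writesettings_id, settings, auth_headers):
    """Return an unsent PUT request for /v4/writesettings/<id> with a JSON body."""
    url = f"{base_url.rstrip('/')}/v4/writesettings/{int(writesettings_id)}"
    headers = {"Content-Type": "application/json", **auth_headers}
    body = json.dumps(settings).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Placeholder values; substitute a real path and credentials.
settings = {
    "delim": ",",
    "path": "<path>",
    "action": "create",
    "format": "pqt",
    "compression": "none",
    "header": False,
    "asSingleFile": False,
    "hasQuotes": True,
    "outputObjectId": 5,
}
update_request = build_update_writesettings_request(
    "http://trifacta.example.com:3005", 6, settings, {"Authorization": "<credentials>"}
)
```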

Reference

For more information on the properties of the writesettings object, see API WriteSettings Get v4.
v3 Endpoints

These endpoints apply to version 3 of the APIs for the Trifacta® platform.
For more information on support for this version, see API Version Support Matrix.

Connections

Endpoint Action Behavior Documentation

/connections POST Create API Connections Create v3

/connections GET List API Connections Get List v3

/connections/:id GET Read API Connections Get v3

/connections/:id/status GET Read Status API Connections Get Status v3

/connections/:id DELETE Delete API Connections Delete v3

Datasets and Recipes

Endpoint Action Behavior Documentation

/importedDatasets POST Create API ImportedDatasets Create v3

/importedDatasets GET List API ImportedDatasets Get List v3

/importedDatasets/:id GET Read API ImportedDatasets Get v3

/importedDatasets/:id DELETE Delete API ImportedDatasets Delete v3

/importedDatasets/:id/addToFlow POST Create API ImportedDatasets Post AddToFlow v3

/wrangledDatasets POST Create API WrangledDatasets Create v3

/wrangledDatasets GET List API WrangledDatasets Get List v3

/wrangledDatasets/:id GET Read API WrangledDatasets Get v3

/wrangledDatasets/:id DELETE Delete API WrangledDatasets Delete v3

/wrangledDatasets/:id/primaryInputDatasets GET Read API WrangledDatasets Get PrimaryInputDataset v3

/wrangledDatasets/:id/primaryInputDatasets PUT Update API WrangledDatasets Put PrimaryInputDataset v3

Flows

Endpoint Action Behavior Documentation

/flows POST Create API Flows Create v3

/flows GET List API Flows Get List v3

/flows/:id GET Read API Flows Get v3

/flows/:id PATCH Update API Flows Patch v3

/flows/:id DELETE Delete API Flows Delete v3

Flow import and export

Endpoint Action Behavior Documentation

/flows/package/dryRun POST Import dry run API Flows Package Post DryRun v3

/flows/package POST Import API Flows Package Post v3

/flows/:id/package/dryRun GET Export dry run API Flows Package Get DryRun v3

/flows/:id/package GET Export API Flows Package Get v3

Jobgroups and Jobs

Endpoint Action Behavior Documentation

/jobGroups POST Create API JobGroups Create v3

/jobGroups GET List API JobGroups Get List v3

/jobGroups/:id GET Read API JobGroups Get v3

/jobGroups/:id/jobs GET Read API JobGroups Get Jobs v3

/jobGroups/:id/status GET Read Status API JobGroups Get Status v3

/jobGroups/:id/publish PUT Create API JobGroups Put Publish v3

/jobGroups/:id DELETE Delete API JobGroups Delete v3

Deployments and Releases

Endpoint Action Behavior Documentation

/deployments POST Create API Deployments Create v3

/deployments GET List API Deployments Get List v3

/deployments/:id GET Read API Deployments Get v3

/deployments/:id PATCH Update API Deployments Patch v3

/deployments/:id/objectImportRules PATCH Update API Deployments Object Import Rules Patch v3

/deployments/:id/valueImportRules PATCH Update API Deployments Value Import Rules Patch v3

/deployments/:id?embed=releases GET List API Deployments Get Release List v3

/deployments/:id/run POST Create API Deployments Run v3

/deployments/:id DELETE Delete API Deployments Delete v3

/deployments/:id/releases POST Create API Releases Create v3

/deployments/:id/releases/dryRun POST Create API Releases Create DryRun v3

/releases/:id GET Read API Releases Get v3

/releases/:id PATCH Update API Releases Patch v3

/releases/:id DELETE Delete API Releases Delete v3

/releases/:id/package GET Read API Releases Package Get v3

Users

Endpoint Action Behavior Documentation

/people POST Create API People Create v3

/people GET List API People Get List v3

/people PATCH Update API People Patch v3

/people/:id GET Read API People Get v3

/people/:id DELETE Delete API People Delete v3

Miscellaneous

Endpoint Action Behavior Documentation

/session GET Read API Session Get

API Connections Create v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Connections Create v4

Contents:
Required Permissions
Request
Response
Reference

Create the specified connection.

NOTE: In this release, you cannot create SQL DW connections via the API. Please create these
connections through the application. This known issue will be fixed in a future release.

For more information on connections, see CLI for Connections.


Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v3/connections/

Request Body - Relational Connection:


For relational connections, the request body should look like the following. All properties are required unless
noted.

NOTE: Relational connections require the creation and installation of an encryption key file on the Trifacta
node. This file must be present before the connection is created. See Create Encryption Key File.

This example creates a SQL Server connection of basic type. A valid username/password combination must be
specified in the credentials property.

{
"name": "sqlserver",
"description": "",
"isGlobal": false,
"type": "jdbc",
"host": "sqlserver.example.com",
"port": 1433,
"vendor": "sqlserver",
"params": {
"connectStrOpts": ""
},
"ssl": false,
"credentialType": "basic",
"credentials": [
{
"username": "<username>",
"password": "<password>"
}
]
}

Property Description

name Display name of the connection

description (Optional) Display description for the connection

isGlobal (Optional) If true, the connection is available to all users. The default is false.

type For more information on the value to insert for the connection, see Connection Types.

host Host name of the relational server to which to connect.

port Port number for the relational server. The default value varies between database vendors. For more information, please
see the documentation provided with your database distribution.

vendor For more information on the value to insert for the connection, see Connection Types.

params (Optional) Set of JSON parameters that are passed to the database when initializing the connection. Depending on the
database vendor, you may be required to submit via this parameter the name of the default database. You can also pass in
optional parameters through the connectStrOpts parameter. For more information, see CLI for Connections.

ssl (Optional) If set to true, the connection is made over SSL. The default is false.

NOTE: If you connect over SSL, you must modify the hostname value to use HTTPS.

NOTE: SSL connections to SQL Server are not supported.

credentialType Set this value to one of the following:

basic - Simple username/password to be provided in the credentials property.
conf - Use the connection credentials stored in trifacta-conf.json.
custom - Connection credentials are specified in the params property, such as using key-value parameters to
specify access credentials.

credentials (Optional) If credentialType=basic, this property must contain the username and password to use to connect to
the relational source.
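
The relational request body can also be assembled programmatically. The Python sketch below mirrors the SQL Server example and the property table above; the function name and its argument order are hypothetical, and the guard reflects the note that SSL connections to SQL Server are not supported.

```python
def relational_connection_payload(name, host, port, vendor, username, password,
                                  description="", is_global=False, ssl=False,
                                  connect_str_opts=""):
    """Assemble a request body for a relational connection with basic credentials."""
    if ssl and vendor == "sqlserver":
        # The ssl property notes above say this combination is not supported.
        raise ValueError("SSL connections to SQL Server are not supported")
    return {
        "name": name,
        "description": description,
        "isGlobal": is_global,
        "type": "jdbc",
        "host": host,
        "port": port,
        "vendor": vendor,
        "params": {"connectStrOpts": connect_str_opts},
        "ssl": ssl,
        "credentialType": "basic",
        "credentials": [{"username": username, "password": password}],
    }


payload = relational_connection_payload(
    "sqlserver", "sqlserver.example.com", 1433, "sqlserver", "<username>", "<password>"
)
```

Serializing the returned dict with json.dumps yields a body of the same shape as the example above.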

Request Body - Hive Connection:


You can create only one public connection to Hive.

{
"host": "hadoop",
"port": 10000,
"vendor": "hive",
"params": {
"jdbc": "hive2",
"connectStrOpts": "",
"defaultDatabase": ""
},
"ssl": false,
"name": "hiveAPI",
"description": "Hive conn via API",
"type": "jdbc",
"isGlobal": true,
"credentialType": "conf",
"credentialsShared": true
}

The following property values are specific to Hive connections.

Property Description

host Set this value to hadoop to integrate with the Hive instance for the Hadoop cluster to which the Trifacta platform
is connected.

params Connection parameters for the Hive instance.

NOTE: The following parameter entry is required for Hive:

"jdbc": "hive2",

Others are optional.

type Set this value to jdbc.

isGlobal
NOTE: For Hive connections, this value must be set to true.

credentialType Set this value to conf.

For more information, see Configure for Hive.
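
The Hive-specific requirements listed above lend themselves to a small pre-flight check. The following Python sketch uses a hypothetical function name and only encodes the constraints stated in the table: the jdbc parameter, the type, the isGlobal flag, and the credentialType.

```python
def hive_payload_problems(payload):
    """Return the Hive-specific constraints that a connection body violates."""
    problems = []
    if payload.get("params", {}).get("jdbc") != "hive2":
        problems.append('params must include "jdbc": "hive2"')
    if payload.get("type") != "jdbc":
        problems.append("type must be jdbc")
    if payload.get("isGlobal") is not True:
        problems.append("isGlobal must be true for Hive connections")
    if payload.get("credentialType") != "conf":
        problems.append("credentialType must be conf")
    return problems


# Values copied from the Hive request body example above.
hive_payload = {
    "host": "hadoop",
    "port": 10000,
    "vendor": "hive",
    "params": {"jdbc": "hive2", "connectStrOpts": "", "defaultDatabase": ""},
    "ssl": False,
    "name": "hiveAPI",
    "description": "Hive conn via API",
    "type": "jdbc",
    "isGlobal": True,
    "credentialType": "conf",
    "credentialsShared": True,
}
```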


Request Body - Redshift Connection:

{
"host": "redshift.example.net",
"port": 5439,
"vendor": "redshift",
"params": {
"connectStrOpts": "",
"defaultDatabase": "dev",
"extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
},
"ssl": true,
"name": "redshift",
"description": "Redshiftconn",
"type": "jdbc",
"isGlobal": true,
"credentialType": "custom",
"credentialsShared": true,
"credentials": [
{"key":"user","value":"<userId>"},
{"key":"password","value":"<PWD>"},
{"key":"iamRoleArn","value":"<IAM_role_ARN>"}
]
}

The following property values are specific to Redshift connections.

Property Description

params A default database value is required.

The extraLoadParams value is used when you publish results to Redshift. For more information on these
values, see http://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-data-conversion.html.

type Set this value to jdbc.

credentialType Set this value to custom. Credentials are specified below.

credentials username and password must be specified in this key-value format, although the value for either can be an
empty string.

NOTE: iamRoleArn is optional. For more information, see
Configure for EC2 Role-Based Authentication.

For more information on parameters and credentials, see Create Redshift Connections.
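
The key-value credential format used for Redshift can be generated by a small helper. This Python sketch is illustrative: the function name is hypothetical, the keys follow the request body example above, and iamRoleArn is added only when supplied because the note above marks it as optional.

```python
def redshift_credentials(user, password, iam_role_arn=None):
    """Build the key-value credentials list for a custom-type Redshift connection."""
    credentials = [
        {"key": "user", "value": user},
        {"key": "password", "value": password},
    ]
    if iam_role_arn is not None:
        credentials.append({"key": "iamRoleArn", "value": iam_role_arn})
    return credentials


with_role = redshift_credentials("<userId>", "<PWD>", "<IAM_role_ARN>")
without_role = redshift_credentials("<userId>", "<PWD>")
```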

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"connectString": "jdbc:sqlserver://sqlserver.example.com:1433",
"id": 5,
"host": "sqlserver.example.com",
"port": 1433,
"vendor": "sqlserver",
"params": {
"connectStrOpts": ""
},
"ssl": false,
"name": "sqlserver",
"description": "",
"type": "jdbc",
"createdBy": 1,
"isGlobal": false,
"credentialType": "basic",
"createdAt": "2017-07-05T18:00:19.165Z",
"updatedAt": "2017-07-05T18:00:19.165Z",
"updatedBy": 1,
"credentials": [
{
"username": "<username>"
}
]
}

Reference

For more information on the response body properties, see API Connections Get v3.

API Connections Delete v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.

Latest version of this endpoint: API Connections Delete v4

Contents:
Required Permissions
Request
Response
Reference

Delete the specified connection.


Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v3/connections/<id>

where:

Parameter Description

<id> Internal identifier for the connection

Request URI - Example:

/v3/connections/4

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

None.
API Connections Get List v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Connections Get List v4

Contents:
Required Permissions
Request
Response
Reference

Get the list of accessible connections for the authenticated user.


For more information on connections, see CLI for Connections.
Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/connections

Endpoint with paged retrieval:

By default, this endpoint returns results in sets of 25.


You can apply query parameters to change the size of the default set and to page through result sets. The
following example queries for results 100 at a time. In this case, the query asks for results 201-300:

/v3/connections?limit=100&offset=2

If the count of retrieved results is less than the limit, you have reached the end of the results.
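
The paging rule above (stop when a result set is smaller than the limit) can be sketched as a loop in Python. Everything here is an illustrative assumption: the function names are hypothetical, the fetch step is injected by the caller so that no HTTP call is made, and offset_step is a parameter because this section does not spell out whether offset counts result sets or individual records. The default of 1 follows the worked example, where limit=100 with offset=2 is said to return results 201-300.

```python
from urllib.parse import urlencode


def paged_path(path, limit, offset):
    """Append limit and offset query parameters to an endpoint path."""
    return f"{path}?{urlencode({'limit': limit, 'offset': offset})}"


def collect_all(fetch_page, limit=25, offset_step=1):
    """Call fetch_page(limit, offset) until a result set shorter than limit arrives."""
    results = []
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        results.extend(page)
        if len(page) < limit:
            return results
        offset += offset_step


# Stand-in for a real HTTP call: slices a local list one result set at a time.
records = list(range(7))
everything = collect_all(lambda limit, offset: records[offset * limit:(offset + 1) * limit], limit=3)
```

If the total count is an exact multiple of the limit, the loop makes one extra call that returns an empty set and then stops.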
Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"data": [
{
"connectString": "jdbc:postgresql://localhost:5432/trifacta?ssl=true",
"id": 9,
"host": "localhost",
"port": 5432,
"vendor": "postgres",
"params": {
"database": "trifacta"
},
"ssl": true,
"name": "dc31bef0_e7ea_11e6_8c19_4f07370e0072",
"description": null,
"type": "jdbc",
"createdBy": 2,
"isGlobal": false,
"credentialType": "basic",
"createdAt": "2017-01-31T19:24:37.167Z",
"updatedAt": "2017-01-31T19:24:37.167Z",
"updatedBy": 2,
"credentials": [
{
"username": "trifacta"
}
]
},
{
"connectString": "jdbc:postgresql://does-not-exist:5432/trifacta",
"id": 7,
"host": "does-not-exist",
"port": 5432,
"vendor": "postgres",
"params": {
"database": "trifacta"
},
"ssl": false,
"name": "dc3197e0_e7ea_11e6_8c19_4f07370e0072",
"description": null,
"type": "jdbc",
"createdBy": 2,
"isGlobal": false,
"credentialType": "basic",
"createdAt": "2017-01-31T19:24:31.899Z",
"updatedAt": "2017-01-31T19:24:31.899Z",
"updatedBy": 2,
"credentials": [
{
"username": "trifacta"
}
]
}
]
}

Reference

For more information on the properties of a connection, see API Connections Get v3.
API Connections Get Status v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Connections Get Status v4

Contents:
Required Permissions
Request
Response
Reference

Run a connection test for the specified connection.


For more information on connections, see CLI for Connections.
Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/connections/<id>/status

where:

Parameter Description

<id> Internal identifier for the connection

Request URI - Example:

/v3/connections/10/status

Request Body:

Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"result": "SUCCESS",
"reason": null
}

Reference

Property Description

result Results of testing the connection.

For more information on debugging failures in relational connections, see Enable Relational Connections.
For more information on debugging Hive connections, see Configure for Hive.
For more information on debugging S3 connections, see Enable S3 Access.

reason If the result value is not SUCCESS, additional information may be included here.
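
A caller can reduce this response to a pass/fail flag plus the optional reason. The Python sketch below uses a hypothetical function name; the failing sample is invented for illustration, since only the SUCCESS value is shown above.

```python
import json


def connection_test_passed(response_text):
    """Return (passed, reason) from a connection status response body."""
    body = json.loads(response_text)
    return body["result"] == "SUCCESS", body.get("reason")


ok = connection_test_passed('{"result": "SUCCESS", "reason": null}')
# Hypothetical failure payload; the actual failure values are not listed above.
failed = connection_test_passed('{"result": "FAILED", "reason": "connection refused"}')
```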

API Connections Get v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Connections Get v4

Contents:
Required Permissions
Request
Response
Reference

Get the specified connection.


For more information on connections, see CLI for Connections.
Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/connections/<id>

where:

Parameter Description

<id> Internal identifier for the connection

Request URI - Example:

/v3/connections/3

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"data": [
{
"connectParams": {
"vendor": "postgres",
"host": "localhost",
"port": "5432",
"database": "trifacta"
},
"id": 10,
"host": "localhost",
"port": 5432,
"vendor": "postgres",
"params": {
"connectStrOpts": "",
"database": "trifacta"
},
"ssl": false,
"name": "postgres",
"description": "",
"type": "jdbc",
"createdBy": 1,
"isGlobal": false,
"credentialType": "basic",
"credentialsShared": true,
"uuid": "7d173c90-c4e1-11e7-a768-71cd1fa636c3",
"createdAt": "2017-11-09T00:04:00.345Z",
"updatedAt": "2017-11-09T00:04:00.345Z",
"updatedBy": 1,
"credentials": [

{
"username": "<userId>"
}
]
},
{
"connectParams": {
"vendor": "hive",
"host": "hadoop",
"port": "10000",
"jdbc": "hive2",
"defaultDatabase": "default"
},
"id": 1,
"host": "hadoop",
"port": 10000,
"vendor": "hive",
"params": {
"jdbc": "hive2",
"connectStringOptions": "",
"defaultDatabase": "default"
},
"ssl": false,
"name": "hive",
"description": null,
"type": "jdbc",
"createdBy": 1,
"isGlobal": true,
"credentialType": "conf",
"credentialsShared": true,
"uuid": "ae41c5a0-c460-11e7-8163-c3c02bb1fb0b",
"createdAt": "2017-11-08T08:41:57.755Z",
"updatedAt": "2017-11-08T08:41:57.755Z",
"updatedBy": 1,
"credentials": []
}
],
"count": {
"owned": 2,
"shared": 0,

"count": 2
}
}

Reference

Property Description

connectParams.vendor The type of connection. See Connection Types.

connectParams.host Host of the source

connectParams.port Port number for the source

connectParams.database Name of the default database (if applicable)

id Internal identifier for the connection

host Host of the source

port Port number for the source

vendor String identifying the connection's vendor

params This setting is populated with any parameters that are passed to the source during connection and operations.
For relational sources, this setting may include the default database.

ssl When true, the Trifacta platform uses SSL to connect to the source.

name Internal name of the connection

description User-friendly description for the connection

type Type of connection

createdBy Internal identifier for the user who created the connection

isGlobal If true, the connection is public and available to all users.

NOTE: After a connection has been made public, it cannot be made private again. It must be deleted
and recreated.

Default is false. A connection can be made public through the command line interface or the Connections
page. See Connections Page.

credentialType The type of credentials used for the connection. This value varies depending on where the credentials are
stored. See CLI for Connections.

credentialsShared If true, the credentials used for the connection are available for use by users who have been shared the
connection.

uuid A universal object identifier, which is unique across instances of the platform.

This internal identifier is particularly useful when creating import mapping rules.

See API Deployments Value Import Rules Patch v3.
See API Deployments Object Import Rules Patch v3.

createdAt Timestamp for when the connection was made

updatedAt Timestamp for when the connection was last updated

updatedBy Internal identifier for the user who last updated the connection

credentials If present, these values are the credentials used to connect to the database.

NOTE: For security reasons, you can store the connection's credentials in an external file on the
Trifacta Server, after which they do not appear in this setting. See CLI for Connections.

API Deployments Create v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Deployments Create v4

Contents:
Required Permissions
Request
Response
Reference

Create a new deployment.

NOTE: Deployments pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v3/deployments/

Request Body:

{
"name": "Test Deployment"
}

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"id": 20,
"name": "Test Deployment",
"createdBy": 1,
"updatedBy": 1,
"updatedAt": "2017-10-12T23:48:54.340Z",
"createdAt": "2017-10-12T23:48:54.340Z"
}

Reference

For more information on properties of a deployment, see API Deployments Get v3.
API Deployments Delete v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Deployments Delete v4

Contents:
Required Permissions
Request
Response
Reference

Delete the specified deployment.

Deleting a deployment removes all releases, packages, and flows underneath it. This step cannot
be undone.

NOTE: Deployments pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v3/deployments/<id>

where:

Parameter Description

<id> Internal identifier for the deployment

Request URI - Example:

/v3/deployments/4

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

None.
API Deployments Get List v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Deployments Get List v4

Contents:
Required Permissions
Request
Response
Reference

Get the list of accessible deployments for the authenticated user.

NOTE: Deployments pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/deployments

Endpoint with paged retrieval:

By default, this endpoint returns results in sets of 25.


You can apply query parameters to change the size of the default set and to page through result sets. The
following example queries for results 100 at a time. In this case, the query asks for results 201-300:

/v3/deployments?limit=100&offset=2

If the count of retrieved results is less than the limit, you have reached the end of the results.
Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"data": [
{
"id": 2,
"name": "Deployment 2",
"createdAt": "2017-10-12T17:45:18.485Z",
"updatedAt": "2017-10-12T17:45:18.485Z",
"createdBy": 1,
"updatedBy": 1
},
{
"id": 1,
"name": "My First Deployment",
"createdAt": "2017-10-10T00:36:49.278Z",
"updatedAt": "2017-10-10T00:36:49.278Z",
"createdBy": 1,
"updatedBy": 1
}
],
"count": 2
}

Reference

For more information on the properties of a deployment, see API Deployments Get v3.
API Deployments Get Release List v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Deployments Get Release List v4

Contents:
Required Permissions
Request
Response
Reference

Get the list of releases for the specified deployment for the authenticated user.

NOTE: Deployments and releases pertain to Production instances of the Trifacta® platform. For more
information, see Overview of Deployment Manager.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/deployments/:id?embed=releases

Endpoint with paged retrieval:

By default, this endpoint returns results in sets of 25.


You can apply query parameters to change the size of the default set and to page through result sets. The
following example queries for results 100 at a time. In this case, the query asks for results 201-300:

/v3/deployments/:id?embed=releases&limit=100&offset=2

If the count of retrieved results is less than the limit, you have reached the end of the results.
Request Body:

Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 1,
"name": "Payment10-SteveO",
"createdAt": "2017-09-26T07:00:00.000Z",
"updatedAt": "2017-10-12T23:47:56.801Z",
"createdBy": 1,
"updatedBy": 1,
"releases": [
{
"id": 2,
"notes": "Testing with a new format",
"packageUuid": "11d472a0-a799-11e7-9c5c-9dd7feba47aa",
"active": null,
"createdAt": "2017-10-02T19:07:24.311Z",
"updatedAt": "2017-10-05T12:21:46.177Z",
"deploymentId": 1,
"createdBy": 1,
"updatedBy": 1
},
{
"id": 1,
"notes": null,
"packageUuid": "6648f8c0-a9e6-11e7-a092-8394937c7038",
"active": true,
"createdAt": "2017-10-05T16:01:27.881Z",
"updatedAt": "2017-10-12T20:07:42.143Z",
"deploymentId": 1,
"createdBy": 1,
"updatedBy": 1
}
]
}
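A common follow-up to this response is locating the release that is currently active. The following is a minimal sketch, assuming the response has already been decoded into a Python dict; active_release is a hypothetical helper name.

```python
def active_release(deployment):
    """Return the release whose "active" flag is true, or None if there is none."""
    for release in deployment.get("releases", []):
        # In the example above, inactive releases carry "active": null.
        if release.get("active") is True:
            return release
    return None
```

Applied to the example response above, the helper returns the release with id 1.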

Reference

For more information on the properties of a release, see API Releases Get v3.
API Deployments Get v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Deployments Get v4

Contents:

Required Permissions
Request
Response
Reference

Get the specified deployment.

NOTE: Deployments pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/deployments/<id>

where:

Parameter Description

<id> Internal identifier for the deployment

Request URI - Example:

/v3/deployments/3

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 3,
"name": "Payments Deployment",
"createdAt": "2017-09-27T07:00:00.000Z",
"updatedAt": "2017-09-30T00:17:00.079Z",
"createdBy": 1,
"updatedBy": 2
}

Reference

Property Description

id Internal identifier for the deployment

name Display name for the deployment. This value appears in the user interface.

createdAt Timestamp for when the deployment was created.

updatedAt Timestamp for when the deployment was last updated.

createdBy Internal identifier for the user who created the deployment.

updatedBy Internal identifier for the user who last updated the deployment.

API Deployments Object Import Rules Patch v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Deployments Object Import Rules Patch v4

Contents:
Required Permissions
Request
Response
Reference

Create a list of object-based import rules for the specified deployment. Delete all previous rules applied to the
same object.

NOTE: Import rules must be applied to individual deployments.

The generated rules apply to all flows that are imported into the deployment after the rules have been created.

NOTE: Deployments pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

The response contains any previously created rules that have been deleted as a result of this change.
You can also make replacements in the import package based on value mappings. See
API Deployments Value Import Rules Patch v3.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PATCH

Endpoint:

/v3/deployments/<id>/objectImportRules

where:

Parameter Description

<id> Internal identifier for the deployment

Request URI - Example:

/v3/deployments/4/objectImportRules

Request Body Example: Replace connection


The following JSON array describes replacing the connection specified by the UUID, which is a field on the
connection object exported from the original platform instance. This connection reference is replaced by a
reference to connection ID 1 in the local platform instance and is applied to any release uploaded into the
deployment after the rule has been created:

[{"tableName":"connections","onCondition":{"uuid":"d75255f0-a245-11e7-8618-adc1dbb4bed0"},"withCondition":{"id":1}}]

You can specify matching values using string literals.

Match Type Example Syntax

string literal {"uuid":"d75255f0-a245-11e7-8618-adc1dbb4bed0"}

string literal {"id":"22"}

NOTE: Use of Trifacta patterns or regular expressions is not supported.

For more information on patterns, see Text Matching.


Request Body Example: Replace multiple connections
This example request includes replacements for multiple connection references.

NOTE: Rules are applied in the listed order. If you are applying multiple rules to the same object in the
import package, the second rule must reference the expected changes applied by the first rule.

This type of replacement applies if the imported packages contain sources that are imported through two separate
connections:

[
{"tableName":"connections","onCondition":{"uuid":"d75255f0-a245-11e7-8618-adc1dbb4bed0"},"withCondition":{"id":1}},
{"tableName":"connections","onCondition":{"uuid":"d552045e0-c314-22b5-9410-acd1bcd8eea2"},"withCondition":{"id":2}}
]
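When several connections must be remapped, generating the request body programmatically avoids hand-editing the JSON. The following is a minimal sketch; connection_rule and object_import_rules are hypothetical helper names, and only the connections table documented on this page is assumed.

```python
import json


def connection_rule(source_uuid, local_connection_id):
    """Build one rule that maps an exported connection UUID to a local connection ID."""
    return {
        "tableName": "connections",
        "onCondition": {"uuid": source_uuid},
        "withCondition": {"id": local_connection_id},
    }


def object_import_rules(mapping):
    """Serialize rules in the given order.

    `mapping` is an ordered sequence of (source UUID, local connection ID)
    pairs. Order is preserved because rules are applied in the listed order.
    """
    return json.dumps([connection_rule(uuid, conn_id) for uuid, conn_id in mapping])
```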

Response

Response Status Code - Success: 200 - OK

The response body contains any previously created rules that have been deleted as a result of this update.
Response Body Example: All new rules, no deletions
If the update does not overwrite any previous rules, then no rules are deleted. So, the response looks like the
following:

{
"deleted": []
}

Response Body Example: Replace connection


If you submit the request again, the response contains the rule definition of the previous update, which has been
deleted. This example applies to the one-rule change listed previously:

{
"deleted": [
{
"onCondition": {
"uuid": "d75255f0-a245-11e7-8618-adc1dbb4bed0"
},
"withCondition": {
"id": 1
},
"id": 1,
"tableName": "connections",
"createdBy": 3,
"updatedBy": 3,
"createdAt": "2017-11-07T01:42:46.798Z",
"updatedAt": "2017-11-07T01:42:46.798Z",
"deploymentId": 4
}
]
}

Reference

Property Description

onCondition The matching object identifier and the specified literal or pattern to match.

withCondition The identifier for the object type, as specified by the tableName value, which is being modified.

id Internal identifier for the object import rule

tableName Name of the table to which the mapping is applied. Values:

connections - applies to local connections


deploymentId Internal identifier for the deployment to which to apply the import rule.

API Deployments Patch v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Deployments Patch v4

Contents:
Required Permissions
Request
Response
Reference

Update the specified deployment.

NOTE: Deployments pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PATCH

Endpoint:

/v3/deployments/<id>

where:

Parameter Description

<id> Internal identifier for the deployment

Request URI - Example:

/v3/deployments/2

Request Body: Example - Modify the deployment name

{
"name": "New Deployment Name"
}

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 2,
"updatedBy": 1,
"updatedAt": "2017-10-13T00:06:12.147Z"
}

Reference

For more information on the properties of a deployment, see API Deployments Get v3.
API Deployments Run v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Deployments Run v4

Contents:
Required Permissions
Request
Response
Reference

Run the job for the active release of the specified deployment.
At least one manual output must be specified for the main flow within the package. See Flow View Page.
An active release must be specified for the deployment. See API Releases Patch v3.

NOTE: Deployments pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v3/deployments/<id>/run

where:

Parameter Description

<id> Internal identifier for the deployment

Request URI - Example:

/v3/deployments/4/run

Request Body:
Empty.
Request Body - dataset with parameters:
You can apply parameter overrides when running a deployment. Add the following structure to the request body:

{
"runParameters": {
"overrides": {
"data": [
{ "key": "myParamName", "value": "override value" }
]
}
}
}

where:

Item Description

key Name of parameter to apply to the job run.

value The string value to assign to the parameter for the job run.
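The override structure above can be assembled from a plain dictionary of parameter names and values. The following is a minimal sketch; run_request_body is a hypothetical helper name, and values are converted to strings because the table above describes the value as a string.

```python
import json


def run_request_body(overrides=None):
    """Return the JSON body for the run request.

    With no overrides, the body is empty, matching the basic request above.
    """
    if not overrides:
        return ""
    data = [{"key": key, "value": str(value)} for key, value in overrides.items()]
    return json.dumps({"runParameters": {"overrides": {"data": data}}})
```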

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"data": [
{
"reason": "JobStarted",
"sessionId": "dd6a90e0-c353-11e7-ad4e-7f2dd2ae4621",
"id": 33,
"jobs": {
"data": [
{
"id": 68
},
{
"id": 69
},
{
"id": 70
}
]
}
}
]
}

Reference

Property Description

reason Action undertaken on the endpoint.

sessionId Internal identifier for the session of the job run

id JobGroup identifier. For more information, see API JobGroups Get v3.

jobs.data.id Internal identifier for the individual jobs that compose the job group being executed.
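To monitor a run, a client typically needs the job group identifier and the identifiers of its individual jobs. The following is a minimal sketch that extracts both from the decoded response; job_ids is a hypothetical helper name.

```python
def job_ids(run_response):
    """Map each job group id to the ids of the individual jobs it contains."""
    return {
        group["id"]: [job["id"] for job in group["jobs"]["data"]]
        for group in run_response["data"]
    }
```

For the example response above, the result maps job group 33 to jobs 68, 69, and 70.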

API Deployments Value Import Rules Patch v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Deployments Value Import Rules Patch v4

Contents:
Required Permissions
Request
Response
Reference

Create a list of value-based import rules for the specified deployment. Delete any previous rules applied to the
same values.

NOTE: Import rules must be applied to individual deployments.

The generated rules apply to all flows that are imported into the Production instance after the rules have been created.

NOTE: Deployments pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

The response contains any previously created rules that have been deleted as a result of this change.
You can also make replacements in the import package based on object references. See
API Deployments Object Import Rules Patch v3.
Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PATCH

Endpoint:

/v3/deployments/<id>/valueImportRules

where:

Parameter Description

<id> Internal identifier for the deployment

Request URI - Example:

/v3/deployments/4/valueImportRules

Request Body Example: Single value replacement


The following JSON array describes a single replacement rule for the S3 bucket name. In this case, the
wrangle-dev bucket name has been replaced by the wrangle-prod bucket name, which means data is pulled in the
Production deployment from the appropriate S3 bucket.

NOTE: The executing user of any job must have access to any data source that is remapped in the new
instance.

[{"type":"s3Bucket","on":"wrangle-dev","with":"wrangle-prod"}]

Request Body Example: Multiple value replacements


The following JSON array describes two replacements for the fileLocation values. In this case, rules are
applied in succession.

NOTE: Rules are applied in the listed order. If you are applying multiple rules to the same object in the
import package, the second rule must reference the expected changes applied by the first rule.

[
{"type":"fileLocation","on":"klamath","with":"klondike"},
{"type":"fileLocation","on":"/\/dev\//","with":"/prod/"}
]

In the above:
The first rule replaces the string klamath in the path to the source with the following value: klondike.
The second rule performs a regular expression match on the string /dev/. Since the match is described
using the regular expression syntax, the forward slashes inside the pattern must be escaped with backslashes.
The replacement value is the following literal: /prod/.
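The effect of applying rules in succession can be simulated locally before submitting them. The following sketch is illustrative only and does not reproduce the platform's implementation: apply_value_rules is a hypothetical function, and it assumes that an "on" value wrapped in forward slashes denotes a regular expression while any other value is a string literal.

```python
import re


def apply_value_rules(value, rules, rule_type):
    """Apply each matching rule to `value` in the listed order."""
    for rule in rules:
        if rule["type"] != rule_type:
            continue
        on, replacement = rule["on"], rule["with"]
        if len(on) > 2 and on.startswith("/") and on.endswith("/"):
            # Assumed convention: /.../ delimits a regular expression.
            # A function replacement keeps the "with" value literal.
            value = re.sub(on[1:-1], lambda _match: replacement, value)
        else:
            value = value.replace(on, replacement)
    return value
```

Because each rule sees the output of the previous one, a later rule must match the already-modified value, as the NOTE above describes.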

You can specify matching values using the following types of matches:

Match Type Example Syntax

string literal {"on":"d75255f0-a245-11e7-8618-adc1dbb4bed0"}

regular expression {"on":"/[0-9a-zA-Z]{8}-a245-11e7-8618-adc1dbb4bed0/"}

NOTE: Use of Trifacta patterns is not supported.

For more information on patterns, see Text Matching.

Response

Response Status Code - Success: 200 - OK

The response body contains any previously created rules that have been deleted as a result of this update.
Response Body Example: All new rules, no deletions
If the update does not overwrite any previous rules, then no rules are deleted. So, the response looks like the
following:

{
"deleted": []
}

Response Body Example: Replace file location, delete previous rule


If you submit the request again, the response contains the rule definition of the previous update, which has been
deleted.

{
"deleted": [
{
"on": "wrangle-dev",
"id": 1,
"type": "s3Bucket",
"with": "wrangle-prod",
"createdBy": 3,
"updatedBy": 3,
"createdAt": "2017-11-07T02:16:57.743Z",
"updatedAt": "2017-11-07T02:16:57.743Z",
"deploymentId": 1
}
]
}

Reference

Property Description

on The specified literal or pattern to match.

id Internal identifier for the value import rule

type The type of value import rule:

fileLocation - the location of a specified file.

s3Bucket - location of the S3 bucket to modify

with The replacement value or pattern

deploymentId Internal identifier for the deployment to which to apply the import rule.

API Flows Create v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Flows Create v4

Contents:
Required Permissions
Request
Response
Reference

Create a new flow with the specified name.

NOTE: You cannot add datasets to the flow through this endpoint. Moving pre-existing datasets into a
flow is not supported in this release. Create the flow first and then when you create the datasets,
associate them with the flow at the time of creation.
See API ImportedDatasets Create v3.
See API WrangledDatasets Create v3.

Version: v3

Required Permissions

The authenticated user must be an admin.

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v3/flows/

Request Body:
A name value is required.

{
"name": "My Flow",
"description": "This is my flow."
}

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"id": 6,
"name": "My Flow",
"description": "This is my flow.",
"createdBy": 1,
"updatedBy": 1,
"updatedAt": "2017-02-17T17:08:57.848Z",
"createdAt": "2017-02-17T17:08:57.848Z"
}
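The request above can be prepared with the Python standard library. This is a minimal sketch under stated assumptions: create_flow_request is a hypothetical helper, the base URL and credentials are placeholders supplied by the caller, and Basic authentication is assumed because it appears in the curl examples elsewhere in this guide.

```python
import base64
import json
import urllib.request


def create_flow_request(base_url, user, password, name, description=None):
    """Prepare (but do not send) the POST request that creates a flow."""
    if not name:
        raise ValueError("A name value is required.")
    body = {"name": name}
    if description is not None:
        body["description"] = description
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url}/v3/flows/",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The prepared request can be sent with urllib.request.urlopen. A successful call returns status 201 and a body like the example above.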

Reference

For more information on the properties of a flow, see API Flows Get v3.

API Flows Delete v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Flows Delete v4

Contents:
Required Permissions
Request
Response
Reference

Delete the specified flow.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v3/flows/<id>

where:

Parameter Description

<id> Internal identifier for the flow

Request URI - Example:

/v3/flows/2

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

For more information on the properties of a flow, see API Flows Get v3.
API Flows Get List v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Flows Get List v4

Contents:
Required Permissions
Request
Response
Reference

Returns a list of all flows accessible to the authenticated user.
Version: v3

Required Permissions

If you are not logged in or are logged in as a non-admin user, you can retrieve only your flows.
If you are logged in as an admin, you can retrieve all flows in the platform.

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/flows

Endpoint with paged retrieval:

By default, this endpoint returns results in sets of 25.


You can apply query parameters to change the size of the default set and to page through result sets. The
following example queries for results 100 at a time. In this case, the query asks for results 201-300:

/v3/flows?limit=100&offset=2

If the count of retrieved results is less than the limit, you have reached the end of the results.

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Examples:

[
{
"id": 7,
"name": "USDA_Farmers_Market_2014 Flow",
"description": null,
"deleted_at": null,
"cpProject": null,
"createdAt": "2017-11-07T19:30:29.296Z",
"updatedAt": "2017-11-07T19:30:29.296Z",
"createdBy": 1,
"updatedBy": 1,
"associatedPeople": [
{
"outputHomeDir":
"/trifacta/queryResults/[email protected]",
"name": "Administrator",
"email": "[email protected]",
"id": 1,
"flowpermission": {
"flowId": 7,
"personId": 1,
"role": "owner"
}
}
]
},
{
"id": 3,
"name": "2013 POS",
"description": null,
"deleted_at": null,
"cpProject": null,
"createdAt": "2017-11-07T17:02:34.662Z",
"updatedAt": "2017-11-07T17:02:34.662Z",
"createdBy": 1,
"updatedBy": 1,
"associatedPeople": [
{
"outputHomeDir":
"/trifacta/queryResults/[email protected]",
"name": "Administrator",
"email": "[email protected]",
"id": 1,
"flowpermission": {
"flowId": 3,
"personId": 1,
"role": "owner"
}
}
]
}
]
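The associatedPeople entries in this response can be used to filter flows by user and role on the client side. The following is a minimal sketch over the decoded response; flows_for_person is a hypothetical helper name.

```python
def flows_for_person(flows, person_id, role=None):
    """Return names of flows on which the person holds a permission.

    When `role` is given (for example "owner"), only flows where the person
    holds that role are returned.
    """
    names = []
    for flow in flows:
        for person in flow.get("associatedPeople", []):
            permission = person.get("flowpermission", {})
            if permission.get("personId") != person_id:
                continue
            if role is None or permission.get("role") == role:
                names.append(flow["name"])
                break  # count each flow at most once
    return names
```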

Reference

For more information on the properties of a flow, see API Flows Get v3.

API Flows Get v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Flows Get v4

Contents:
Required Permissions
Request
Response
Reference

Retrieve the flow information for a specified flow identifier.


Version: v3

Required Permissions

The authenticated user must be an admin.

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/flows/<id>

Parameter Description

<id> Internal identifier of the flow to retrieve.

Request URI - Example:

/v3/flows/10

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

[
{
"id": 10,
"name": "2013 POS",
"description": null,
"deleted_at": null,
"cpProject": null,
"createdAt": "2017-11-07T17:02:34.662Z",
"updatedAt": "2017-11-07T17:02:34.662Z",
"createdBy": 1,
"updatedBy": 1,
"associatedPeople": [
{
"outputHomeDir":
"/trifacta/queryResults/[email protected]",
"name": "Administrator",
"email": "[email protected]",
"id": 1,
"flowpermission": {
"flowId": 3,
"personId": 1,
"role": "owner"
}
}
]
}
]

Reference

Property Description

id Internal identifier for the flow

name Display text for the flow

description User-friendly description for the flow

cpProject Not used.

createdAt Timestamp for when the flow was created

updatedAt Timestamp for when the flow was last modified

createdBy Internal identifier of the user who created the flow

updatedBy Internal identifier of the user who last updated the flow

associatedPeople.outputHomeDir Output directory for the associated person.

associatedPeople.name Name of the associated person.

associatedPeople.email User ID (email address) of the associated person.

associatedPeople.id Internal ID for the associated person.

associatedPeople.flowpermission.flowId Internal ID for the flow.

associatedPeople.flowpermission.personId Internal ID for the permitted user.

associatedPeople.flowpermission.role Role for the permitted user.

owner - can do anything to the flow.


collaborator - can do most actions on the flow except change the name and
delete the flow.

API Flows Package Get DryRun v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Flows Package Get DryRun v4

Contents:
Required Permissions
Request
Response
Reference

Performs a dry-run of generating a flow package and exporting it, which performs a check of all permissions
required to export the package.
If they occur, permissions errors are reported in the response.
Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/flows/<id>/package/dryRun

Parameter Description

<id> Internal identifier of the flow to retrieve.

Request URI - Example:

/v3/flows/7/package/dryRun

Request Body:
None.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{}

Reference

None.
API Flows Package Get v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Flows Package Get v4

Contents:
Required Permissions
Request
Response
Reference

Retrieve a package containing the definition of the specified flow.


Response body is the contents of the package. Package contents are a ZIPped version of the flow definition.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/flows/<id>/package

Parameter Description

<id> Internal identifier of the flow to retrieve.

Request URI - Example:

/v3/flows/7/package

Request Body:
None.

Response

Response Status Code - Success: 200 - OK

Response Body Example:


Response body is the contents of the ZIP file. This package should be downloaded to your local environment.
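Because the response body is binary ZIP content, it is worth validating the file after saving it. The following is a minimal sketch, assuming the response bytes have already been retrieved; save_flow_package is a hypothetical helper name, and nothing is assumed about the file names inside the package.

```python
import zipfile


def save_flow_package(body, destination):
    """Write the response bytes to disk and return the archive's member names."""
    with open(destination, "wb") as handle:
        handle.write(body)
    # Reject truncated or non-ZIP responses, such as a JSON error body.
    if not zipfile.is_zipfile(destination):
        raise ValueError(f"{destination} is not a valid ZIP package")
    with zipfile.ZipFile(destination) as package:
        return package.namelist()
```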

Reference

None.
API Flows Package Post DryRun v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Flows Package Post DryRun v4

Contents:
Required Permissions
Request
Response
Reference

Performs a dry-run of importing a flow package, which performs a check of all permissions required to import the
package, as well as any specified import rules.
For more information on import rules, see Define Import Mapping Rules.
If they occur, errors are reported in the response.
After you have successfully completed a dry-run, you can execute a formal import. See
API Flows Package Post v3.
Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v3/flows/package/dryRun

Request URI - Example:

/v3/flows/package/dryRun

Request Body:
The request body must include the following key and value combination submitted as form data. This path is the
location of the ZIP package that you are importing.

key value

data "@path-to-file"

Example request - curl:

curl -X POST \
http://example.com:3005/v3/flows/package/dryRun \
-H 'authorization: Basic c29sc29uQHRyaWZhY3RhLmNvbTphZG1pbg==' \
-H 'cache-control: no-cache' \
-H 'content-type: multipart/form-data' \
-F 'data=@path-to-file'
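For clients that cannot call curl, the same form-data upload can be assembled with the Python standard library. This is a minimal sketch: multipart_form_data is a hypothetical helper, the application/zip part type is an assumption, and the only detail taken from this page is the data form key.

```python
import uuid


def multipart_form_data(field, filename, payload):
    """Encode one file as multipart/form-data.

    Returns (body bytes, Content-Type header value including the boundary).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/zip\r\n"
        "\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"
```

The returned body and header value can be passed as the data and Content-Type of a POST request to the endpoint above, with field set to data.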

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"importRuleChanges": {
"object": [],
"value": []
},
"flowName": "[267f4340] 2013 POS"
}

Reference

None.
API Flows Package Post v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Flows Package Post v4

Contents:
Required Permissions
Request
Response
Reference

Performs an import of a flow package, which also applies any specified import rules.
Before you import, you can perform a dry-run to check for errors. See API Flows Package Post DryRun v3.
For more information on import rules, see Define Import Mapping Rules.
Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v3/flows/package/

Request URI - Example:

/v3/flows/package

Request Body:
The request body must include the following key and value combination submitted as form data. This path is the
location of the ZIP package that you are importing.

key value

data "@path-to-file"

Example request - curl:

curl -X POST \
http://example.com:3005/v3/flows/package \
-H 'authorization: Basic c29sc29uQHRyaWZhY3RhLmNvbTphZG1pbg==' \
-H 'cache-control: no-cache' \
-H 'content-type: multipart/form-data' \
-F 'data=@path-to-file'

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"importRuleChanges": {
"object": [],
"value": []
},
"flowName": "[267f4340] 2013 POS"
}

Reference

None.
API Flows Patch v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Flows Patch v4

Contents:
Required Permissions
Request
Response
Reference

Update an existing flow based on the specified identifier.

NOTE: You cannot add datasets to the flow through this endpoint. Moving pre-existing datasets into a
flow is not supported in this release. Create the flow first and then when you create the datasets,
associate them with the flow at the time of creation.
See API ImportedDatasets Create v3.
See API WrangledDatasets Create v3.

Version: v3

Required Permissions

The authenticated user must be an admin.

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PATCH

Endpoint:

/v3/flows/<id>

Parameter Description

<id> Internal identifier of the flow to update.

Request URI - Example:

/v3/flows/6

Request Body:
You can modify the following properties.

{
"name": "My Flow",
"description": "This is my flow."
}

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 6,
"updatedBy": 1,
"updatedAt": "2017-02-17T18:28:47.675Z"
}

Reference

For more information on the properties of a flow, see API Flows Get v3.

API ImportedDatasets Create v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API ImportedDatasets Create v4

Contents:
Required Permissions
Request and Response
Examples by Type
File (HDFS and S3 sources)
Hive
Relational
Relational with Custom SQL Query
Reference

Create an imported dataset from an available resource. Created dataset is owned by the authenticated user.

NOTE: When an imported dataset is created via API, it is always imported as an unstructured dataset.
Any recipe that references this dataset should contain initial parsing steps required to structure the data.

NOTE: Do not create an imported dataset from a file that is being used by another imported dataset. If
you delete the newly created imported dataset, the file is removed, and the other dataset is corrupted.
Use a new file, or first make a copy of the original file.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request and Response

Request Type: POST

Endpoint:

/v3/importedDatasets

Response Status Code - Success: 201 - Created

Examples by Type

Below, you can review the basic request body for creating imported datasets for various types of sources:
File (HDFS and S3 sources)
Hive
Relational
Relational with Custom SQL Query

File (HDFS and S3 sources)

Request Body - HDFS file:


Below, the bucket value is set to null. This parameter applies only to S3 sources.

NOTE: The path value should not include the HDFS protocol, host, or port information. You only need to
provide the path on HDFS.

{
"path":
"/trifacta/uploads/1/4aee9852-cf92-47a8-8c6a-9ff2adeb3b4a/POS-r02.txt",
"type": "hdfs",
"bucket": null,
"name": "POS-r02b.txt",
"description": "POS-r02 - copy"
}

Request Body - S3 file:


For S3 sources, a bucket must be specified. Below, the bucket value is set to myBucket.

NOTE: The path value should not include the S3 protocol, host, or port information. You only need to
provide the path on S3.

{
"path":
"/trifacta/uploads/1/4aee9852-cf92-47a8-8c6a-9ff2adeb3b4a/POS-r02.txt",
"type": "s3",
"bucket": "myBucket",
"name": "POS-r02b.txt",
"description": "POS-r02 - copy"
}
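The two file-based request bodies differ only in the type and bucket values, so a small builder can enforce the constraints stated in the notes above. The following is a minimal sketch; file_dataset_body is a hypothetical helper name, and the protocol check is a simple heuristic rather than platform validation.

```python
def file_dataset_body(path, source_type, name, description=None, bucket=None):
    """Build the request body for an HDFS or S3 imported dataset."""
    if source_type not in ("hdfs", "s3"):
        raise ValueError("source_type must be 'hdfs' or 's3'")
    # The path must not carry protocol, host, or port information.
    if "://" in path:
        raise ValueError("path must not include protocol, host, or port")
    if source_type == "s3" and not bucket:
        raise ValueError("a bucket must be specified for S3 sources")
    if source_type == "hdfs":
        bucket = None  # the bucket parameter applies only to S3 sources
    return {
        "path": path,
        "type": source_type,
        "bucket": bucket,
        "name": name,
        "description": description,
    }
```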

Response Body - file:
The following example is for an HDFS file. For an S3 file, the type value is s3.

{
"id": 8,
"size": "281032",
"path":
"/trifacta/uploads/1/4aee9852-cf92-47a8-8c6a-9ff2adeb3b4a/POS-r02.txt",
"isSharedWithAll": false,
"type": "hdfs",
"bucket": null,
"isSchematized": false,
"createdBy": 1,
"updatedBy": 1,
"updatedAt": "2017-02-08T18:38:56.640Z",
"createdAt": "2017-02-08T18:38:56.560Z",
"connectionId": null,
"parsingScriptId": 14,
"cpProject": null
}

Hive

Request Body - Hive:


Notes:
The type value is jdbc.
The columns key is optional. If not provided, all columns in the source table are included.

{
"visible": true,
"numFlows": 0,
"size": -1,
"type": "jdbc",
"jdbcType": "TABLE",
"jdbcPath": [
"DB1"
],
"jdbcTable": "MyHiveTable",
"columns": [
"column1",
"column2"
],
"connectionId": 16,
"name": "My Hive Table"
}

Response Example - Hive:

{
"jdbcTable": "MyHiveTable",
"jdbcPath": [
"DB1"
],
"columns": [
"column1",
"column2"
],
"filter": null,
"raw": null,
"isSharedWithAll": false,
"id": 192,
"size": "-1",
"type": "jdbc",
"connectionId": 16,
"createdBy": 1,
"updatedBy": 1,
"updatedAt": "2017-02-17T18:09:17.745Z",
"createdAt": "2017-02-17T18:09:16.407Z",
"path": null,
"bucket": null,
"parsingScriptId": 366,
"cpProject": null,
"isSchematized": true
}

Relational

Request Body - Relational:


Notes:
If you know the size value for the table, provide it. Although it is not required, it helps with performance
and validation.
The columns key is optional. If not provided, all columns in the source table are included.

{
"visible": true,
"numFlows": 0,
"size": 65536,
"type": "jdbc",
"jdbcType": "TABLE",
"jdbcPath": [
"OracleDB_1"
],
"jdbcTable": "MyOracleTable",
"columns": [
"I",
"J",
"K"
],
"connectionId": 7,
"name": "My Oracle Table"
}

Response Example - Relational:

{
"jdbcTable": "MyOracleTable",
"jdbcPath": [
"OracleDB_1"
],
"columns": [
"I",
"J",
"K"
],
"filter": null,
"raw": null,
"isSharedWithAll": false,
"id": 195,
"size": "65536",
"type": "jdbc",
"connectionId": 7,
"createdBy": 1,
"updatedBy": 1,
"updatedAt": "2017-02-17T18:10:48.662Z",
"createdAt": "2017-02-17T18:10:47.441Z",
"path": null,
"bucket": null,
"parsingScriptId": 372,
"cpProject": null,
"isSchematized": true
}

Relational with Custom SQL Query

You can submit custom SQL queries to relational or Hive connections. These custom SQL queries can be used to
pre-filter the data inside the database, improving performance of the query and the overall dataset.
For more information, see Enable Custom SQL Query.
Request Body:
Notes:
See previous notes on queries to relational sources.
As part of the request body, you must submit the custom SQL query as the value for the raw property.

The following example is valid for Oracle databases. Note the escaping of the double-quote marks.

NOTE: Syntax for the custom SQL query varies between relational systems. For more information on
syntax examples, see Create Dataset with SQL.

{
"visible": true,
"numFlows": 0,
"type": "jdbc",
"jdbcType": "TABLE",
"connectionId": 7,
"raw": "SELECT INST#,BUCKET#,INST_LOB# FROM \"AUDSYS\".\"CLI_SWP$7395268a$1$1\"",
"size": -1,
"name": "SQL Dataset 1"
}
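The escaping of the double-quote marks does not need to be done by hand. As a minimal sketch in Python, a standard JSON serializer produces the escaped form of the raw value. The variable names are illustrative, and the query is the Oracle example from this section.

```python
import json

# Custom SQL taken from the example above; the double quotes are part of the SQL text.
query = 'SELECT INST#,BUCKET#,INST_LOB# FROM "AUDSYS"."CLI_SWP$7395268a$1$1"'

body = {
    "visible": True,
    "numFlows": 0,
    "type": "jdbc",
    "jdbcType": "TABLE",
    "connectionId": 7,
    "raw": query,
    "size": -1,
    "name": "SQL Dataset 1",
}

# json.dumps escapes the embedded double quotes as \" automatically.
payload = json.dumps(body)
print(payload)
```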

Response Body:
In the response, note that the source of the data is defined by the connectionId value and the SQL defined in
the raw value.

{
"jdbcTable": null,
"jdbcPath": null,
"columns": null,
"filter": null,
"raw": [
"SELECT INST#,BUCKET#,INST_LOB# FROM \"AUDSYS\".\"CLI_SWP$7395268a$1$1\""
],
"isSharedWithAll": false,
"id": 196,
"size": "-1",
"type": "jdbc",
"connectionId": 7,
"createdBy": 1,
"updatedBy": 1,
"updatedAt": "2017-02-17T19:09:10.117Z",
"createdAt": "2017-02-17T19:07:12.757Z",
"path": null,
"bucket": null,
"parsingScriptId": 378,
"cpProject": null,
"isSchematized": true
}

Reference

For more information on the properties of an imported dataset, see API ImportedDatasets Get v3.
API ImportedDatasets Delete v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API ImportedDatasets Delete v4

Contents:
Required Permissions
Request
Response
Reference

Delete the specified dataset.


Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v3/importedDatasets/<id>

where:

Property Description

<id> Internal identifier for the imported dataset

Request URI - Example:

/v3/importedDatasets/2

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.
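The following Python sketch prepares the DELETE request described above without sending it. BASE_URL and both function names are assumptions for illustration. Substitute the root URL of your own deployment and add the authentication credentials that your platform requires before sending.

```python
import urllib.request

BASE_URL = "http://trifacta.example.com"  # assumption: hypothetical host, replace with your own


def build_delete_request(dataset_id):
    """Prepare (but do not send) a DELETE request for an imported dataset."""
    url = f"{BASE_URL}/v3/importedDatasets/{int(dataset_id)}"
    # No request body is sent; add authentication headers before sending.
    return urllib.request.Request(url, method="DELETE")


def deleted_ok(status_code):
    """Success is reported as 204 - No Content, with an empty body."""
    return status_code == 204


req = build_delete_request(2)
print(req.get_method(), req.full_url)
```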

Reference

For more information on the properties of an imported dataset, see API ImportedDatasets Get v3.
API ImportedDatasets Get List v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API ImportedDatasets Get List v4

Contents:
Required Permissions
Request
Response
Reference

Get the list of accessible imported datasets for the authenticated user.
Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/importedDatasets

Endpoint with embedded reference:

/v3/importedDatasets/?embed=connection

For more information, see API ImportedDatasets Get v3.


Endpoint with paged retrieval:

By default, this endpoint returns results in sets of 25.


You can apply query parameters to change the size of the default set and to page through result sets. The
following example queries for results 100 at a time. In this case, the query asks for results 201-300:

/v3/importedDatasets?embed=connection&limit=100&offset=2

If the count of retrieved results is less than the limit, you have reached the end of the results.
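Query parameters are joined with the & character, which a standard query-string encoder handles for you. The following Python sketch builds the paged endpoint and applies the end-of-results rule stated above. The function names are illustrative assumptions, and the offset value simply mirrors the example in this section.

```python
from urllib.parse import urlencode


def list_url(limit=25, offset=None, embed=None):
    """Build the list endpoint path with optional paging and embed parameters."""
    params = {}
    if embed:
        params["embed"] = embed
    params["limit"] = limit
    if offset is not None:
        params["offset"] = offset
    return "/v3/importedDatasets?" + urlencode(params)


def reached_end(results, limit):
    """Fewer results than the limit means there are no further result sets."""
    return len(results) < limit


print(list_url(limit=100, offset=2, embed="connection"))
```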
Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:


In the following example, you can see the data for three imported datasets:
JDBC table
File on HDFS
File on S3

{
"data": [
{
"id": 15,
"size": "-1",
"path": null,
"isSharedWithAll": false,

"type": "jdbc",
"bucket": null,
"isSchematized": true,
"createdAt": "2017-02-03T07:34:40.727Z",
"updatedAt": "2017-02-03T07:34:43.910Z",
"createdBy": 1,
"updatedBy": 1,
"name": "all_objects",
"description": null,
"connection": {
"id": 2
},
"parsingRecipe": {
"id": 37
},
"relationalSource": {
"relationalPath": [
"adam_test",
"sys"
],
"columns": [
"name",
"object_id",
"principal_id",
"schema_id",
"parent_object_id",
"type",
"type_desc",
"create_date",
"modify_date",
"is_ms_shipped",
"is_published",
"is_schema_published"
],
"filter": null,
"raw": null,
"id": 1,
"tableName": "all_objects",
"createdAt": "2017-02-03T07:34:40.760Z",
"updatedAt": "2017-02-03T07:34:40.760Z",
"datasourceId": 15
}
},
{
"id": 14,
"size": "2049672",
"path": "/trifacta/uploads/1/1a0d9144-a0bc-44a2-ae93-fc3f53613930/base_type_map_array_record_large.avro",
"isSharedWithAll": false,
"type": "hdfs",
"bucket": null,
"isSchematized": true,

"createdAt": "2017-02-03T05:16:49.934Z",
"updatedAt": "2017-02-03T05:16:51.200Z",
"createdBy": 1,
"updatedBy": 1,
"name": "base_type_map_array_record_large.avro",
"description": null,
"parsingRecipe": {
"id": 35
}
},
{
"id": 13,
"size": "711558",
"path": "/trifacta/queryResults/administrator/hello_world/original_2_1_random_rows.json",
"isSharedWithAll": false,
"type": "s3",
"bucket": "3fac-jlong-test",
"isSchematized": false,
"createdAt": "2017-02-03T05:15:07.398Z",
"updatedAt": "2017-02-03T05:15:11.724Z",
"createdBy": 1,
"updatedBy": 1,
"name": "original_2_1_random_rows.json",
"description": null,
"parsingRecipe": {
"id": 33
}

}
]
}
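A list response wraps the datasets in a data array. As a minimal sketch, the following Python groups the returned datasets by their type value. The response text is a trimmed copy of the example above, keeping only the fields that the sketch uses.

```python
import json
from collections import defaultdict

# Trimmed copy of the response above; only the fields used here are kept.
response_text = """
{"data": [
  {"id": 15, "type": "jdbc", "name": "all_objects",
   "relationalSource": {"tableName": "all_objects", "raw": null}},
  {"id": 14, "type": "hdfs", "name": "base_type_map_array_record_large.avro"},
  {"id": 13, "type": "s3", "bucket": "3fac-jlong-test",
   "name": "original_2_1_random_rows.json"}
]}
"""

by_type = defaultdict(list)
for dataset in json.loads(response_text)["data"]:
    by_type[dataset["type"]].append(dataset["name"])

for storage_type, names in sorted(by_type.items()):
    print(storage_type, names)
```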

Reference

For more information on the properties of an imported dataset, see API ImportedDatasets Get v3.
API ImportedDatasets Get v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API ImportedDatasets Get v4

Contents:
Required Permissions
Request
Response
Reference

Get the specified imported dataset.


Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/importedDatasets/<id>

where:

Parameter Description

<id> Internal identifier for the imported dataset

Endpoint with embedded reference:

Use the following embedded reference to include in the response data about the connection used to acquire the
source dataset, if the dataset was created from a Hive or relational connection.

/v3/importedDatasets/<id>?embed=connection

Request URI - Example:

/v3/importedDatasets/196

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 196,
"size": "-1",
"path": null,
"isSharedWithAll": false,
"type": "jdbc",
"bucket": null,
"isSchematized": true,
"createdAt": "2017-02-17T19:07:12.757Z",
"updatedAt": "2017-02-17T19:09:10.117Z",
"createdBy": 1,
"updatedBy": 1,
"name": "SQL Dataset 1 – 4",
"description": null,
"connection": {
"id": 7
},
"parsingRecipe": {
"id": 378
},
"relationalSource": {
"relationalPath": null,
"columns": null,
"filter": null,
"raw": [
"SELECT INST#,BUCKET#,INST_LOB# FROM \"AUDSYS\".\"CLI_SWP$7395268a$1$1\""
],
"id": 109,
"tableName": null,
"createdAt": "2017-02-17T19:07:12.767Z",
"updatedAt": "2017-02-17T19:07:12.767Z",
"datasourceId": 196
}
}

Response Body Example with embedded reference:

The following response includes embedded information on the connection used to import the data.

{
"id": 313,
"size": "35651584",
"path": null,
"isSharedWithAll": false,
"type": "jdbc",
"bucket": null,
"isSchematized": true,
"createdAt": "2017-02-22T23:33:54.400Z",
"updatedAt": "2017-02-22T23:34:18.148Z",
"createdBy": 1,
"updatedBy": 1,
"name": "TestOracleDS",
"description": null,
"connection": {
"id": 7,
"name": "Oracle",
"description": "",
"type": "jdbc",
"createdBy": 1,
"isGlobal": true,
"credentialType": "basic",
"createdAt": "2017-01-11T01:21:54.950Z",
"updatedAt": "2017-01-11T01:22:20.107Z",
"updatedBy": 1
},
"parsingRecipe": {
"id": 645
},
"relationalSource": {
"relationalPath": [
"AUDSYS"
],
"columns": [
"INST#",
"BUCKET#",
"INST_LOB#",
"MAX_SEQ#",
"FLUSH_SCN",
"FLUSH_TIME",
"MIN_SCN",
"MAX_SCN",
"MIN_TIME",
"MAX_TIME",
"SID#",
"SERIAL#",
"STATUS",
"LOG_PIECE"
],
"filter": null,

"raw": null,
"id": 121,
"tableName": "CLI_SWP$7395268a$1$1",
"createdAt": "2017-02-22T23:33:54.406Z",
"updatedAt": "2017-02-22T23:33:54.406Z",

"datasourceId": 313
}
}

Reference

Common Properties:
The following properties are common to file-based and JDBC datasets.

Property Description

id Internal identifier of the imported dataset

size Size of the source file in bytes (if applicable)

path For HDFS and S3 file sources, this value defines the path to the source.

For JDBC sources, this value is not specified.

isSharedWithAll If true, the source is shared among all users of the platform.

type Identifies the type of storage where the source is located. Values:

hdfs
s3
jdbc

bucket (If type=s3) Bucket on S3 where source is stored.

isSchematized (If source file is avro, or type=jdbc) If true, schema information is available for the source.

createdAt Timestamp for when the dataset was imported

updatedAt Timestamp for when the dataset was last updated

createdBy Internal identifier of the user who created the imported dataset

updatedBy Internal identifier of the user who last updated the imported dataset

name Internal name of the imported dataset

description User-friendly description for the imported dataset

connection Internal identifier of the connection to the server hosting the dataset.

If this value is null, the file was uploaded from a local file system.

To acquire the entire connection for this dataset, you can use either of the following endpoints:

/v3/importedDatasets?embed=connection
/v3/importedDatasets/:id?embed=connection

For more information, see API Connections Get v3.

parsingRecipe Internal identifier of the recipe that is used to parse the imported dataset for wrangling.

Hive or Relational Source:


If the source data is from Hive or a relational system (type=jdbc), the following properties contain information on
the source table, the imported columns, and any custom SQL filters applied to the table:

Property Description

relationalPath Name of the database from which the source was queried.

If a custom SQL query has been applied, this value is null.

columns List of columns imported from the source, pre-filtered.

If a custom SQL query has been applied, this value is null.

filter This value is empty.

raw If custom SQL has been applied to the data source to filter the data before it is imported, all SQL statements are listed.

For more information, see Enable Custom SQL Query.

id Internal identifier for the relational source

tableName Name of the table from which the data is extracted.

If a custom SQL query has been applied, this value is null.

createdAt Timestamp for when the source was imported

updatedAt Timestamp for when the source was last updated

datasourceId Internal identifier for the datasource.

File:
File-based datasets support the common properties only.
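The reference properties above can be combined to work out where a dataset comes from. The following Python sketch is one way to do so, assuming a response that has already been parsed into a dictionary. The function name describe_source and the sample values are illustrative only.

```python
def describe_source(dataset):
    """Return a short description of where an imported dataset comes from."""
    source_type = dataset["type"]
    if source_type == "jdbc":
        rel = dataset.get("relationalSource") or {}
        if rel.get("raw"):
            # Custom SQL: relationalPath, columns, and tableName are null.
            return "custom SQL: " + " ".join(rel["raw"])
        path = ".".join(rel.get("relationalPath") or [])
        return f"table {path}.{rel.get('tableName')}"
    if source_type == "s3":
        return f"s3 bucket {dataset['bucket']}, path {dataset['path']}"
    return f"hdfs path {dataset['path']}"


sql_dataset = {
    "type": "jdbc",
    "relationalSource": {"relationalPath": None, "tableName": None,
                         "raw": ["SELECT INST# FROM T"]},
}
table_dataset = {
    "type": "jdbc",
    "relationalSource": {"relationalPath": ["AUDSYS"], "raw": None,
                         "tableName": "CLI_SWP$7395268a$1$1"},
}
print(describe_source(sql_dataset))
print(describe_source(table_dataset))
```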
API ImportedDatasets Post AddToFlow v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API ImportedDatasets Post AddToFlow v4

Contents:
Required Permissions
Request
Response
Reference

Add the specified imported dataset to a flow based on its internal identifier.
Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v3/importedDatasets/<id>/addToFlow

where:

Parameter Description

<id> Internal identifier for the imported dataset

Request URI - Example:

/v3/importedDatasets/4/addToFlow

Request Body:

{
"flow": {
"id": 4
}
}
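The following Python sketch prepares this POST request without sending it. BASE_URL, the function name, and the Content-Type header are assumptions for illustration. Replace the host with your own and add authentication credentials before sending.

```python
import json
import urllib.request

BASE_URL = "http://trifacta.example.com"  # assumption: hypothetical host, replace with your own


def build_add_to_flow_request(dataset_id, flow_id):
    """Prepare (but do not send) the POST that adds an imported dataset to a flow."""
    url = f"{BASE_URL}/v3/importedDatasets/{int(dataset_id)}/addToFlow"
    body = json.dumps({"flow": {"id": int(flow_id)}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


req = build_add_to_flow_request(4, 4)
print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```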

Response

Response Status Code - Success: 201 - Created

Response Body:

{
"id": 14,
"createdBy": 1,
"updatedBy": 1,
"scriptId": 7,
"flowId": 4,
"wrangled": false,
"updatedAt": "2017-06-28T19:38:29.275Z",
"createdAt": "2017-06-28T19:38:29.016Z",
"flowNodeId": null,
"deleted_at": null,
"activesampleId": 15
}

Reference

Property Description

id Internal identifier for the new wrangled dataset.

createdBy Internal identifier of the user who created the flow.

updatedBy Internal identifier of the user who performed the update.

scriptId Internal identifier for the recipe for the dataset.

If null, the dataset has not been wrangled in the Transformer page.

flowId Internal identifier of the flow that contains this dataset

wrangled If true, this dataset is a wrangled dataset.

updatedAt Timestamp for when the dataset was updated.

createdAt Timestamp for when the dataset was created.

flowNodeId Internal identifier of the dataset within the flow.

deleted_at Timestamp for when the dataset was deleted.

If null, the dataset has not been deleted.

activesampleId Internal identifier for the currently active sample for the dataset.

If null, there is no currently active sample for the dataset.

For more information on the other properties, see API ImportedDatasets Get v3.
API JobGroups Create v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API JobGroups Create v4

Contents:
Required Permissions
Request
Response
Reference

Create a jobGroup, which launches the specified job as the authenticated user.
The request specification depends on one of the following conditions:
Dataset has already had a job run against it and just needs to be re-run.
Dataset has not had a job run, or the job definition needs to be re-specified.

NOTE: In this release, you cannot execute jobs sourced from datasets in Redshift or SQL DW or publish
to these locations via the API. This known issue will be fixed in a future release.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v3/jobGroups

Request Body - Re-run job:


If you are re-running a job that has already executed and do not need to modify any job settings, you can use the
following simplified body to launch it:

{
"wrangledDataset": {
"id": 7
}
}

Request Body - Specify job:


If you are specifying a new job or must re-run a job with new settings, you must include a version of the following
request body. Required parameters are listed below:

{
"wrangledDataset": {
"id": 1
},
"overrides": {
"execution": "photon",
"profiler": false,
"writesettings": [
{
"path": "hdfs://hadoop:50070/trifacta/queryResults/[email protected]/cdr_txt.csv",
"action": "create",
"format": "csv",
"compression": "none",
"header": false,
"asSingleFile": false
}
]
},
"ranfrom": "cli"
}

Request body - Specify job with parameter overrides:

The spec for the previous job has been augmented with parameter overrides below. For more information on
parameters, see Overview of Parameterization.
See the runParameters block:

{
"wrangledDataset": {
"id": 1
},
"ranfrom": "cli",
"runParameters": {
"overrides": {
"data": [
{
"key": "myParamName",
"value": "override value"
}
]
}
},
"overrides": {
"execution": "photon",
"profiler": false,
"writesettings": [
{
"path": "hdfs://hadoop:50070/trifacta/queryResults/[email protected]/cdr_txt.csv",
"action": "create",
"format": "csv",
"compression": "none",
"header": false,
"asSingleFile": false
}
]
}
}
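The three request bodies above share one shape. The re-run form carries only the wrangledDataset, and the other two add overrides and runParameters. The following Python sketch assembles all three. The helper name, its arguments, and the output path are hypothetical values chosen for illustration.

```python
import json


def build_jobgroup_body(wrangled_dataset_id, overrides=None, parameters=None, ranfrom=None):
    """Build a jobGroup request body (hypothetical helper).

    With only wrangled_dataset_id, this yields the simplified re-run body.
    """
    body = {"wrangledDataset": {"id": wrangled_dataset_id}}
    if overrides is not None:
        body["overrides"] = overrides
    if parameters:
        # Parameter overrides are sent as key/value pairs under runParameters.
        body["runParameters"] = {
            "overrides": {
                "data": [{"key": k, "value": v} for k, v in parameters.items()]
            }
        }
    if ranfrom is not None:
        body["ranfrom"] = ranfrom
    return body


overrides = {
    "execution": "photon",
    "profiler": False,
    "writesettings": [{
        "path": "hdfs://hadoop:50070/trifacta/queryResults/example/cdr_txt.csv",  # hypothetical path
        "action": "create",
        "format": "csv",
    }],
}
rerun = build_jobgroup_body(7)
with_params = build_jobgroup_body(1, overrides=overrides,
                                  parameters={"myParamName": "override value"},
                                  ranfrom="cli")
print(json.dumps(rerun))
print(json.dumps(with_params, indent=2))
```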

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"jobgroupId": 3,
"jobIds": [
5,
6
],
"reason": "JobStarted",
"sessionId": "9c2c6220-ef2d-11e6-b644-6dbff703bdfc"
}

Reference

Request Reference:

Property Description

wrangledDataset (required) Internal identifier for the object whose results you wish to generate. The recipes of all
preceding datasets on which this dataset depends are executed as part of the job.

runParameters.overrides Use this section to specify key-value pairs for parameter overrides to be applied during job execution.

overrides.execution (required, if first time running the job) Indicates the running environment on which the job is
executed. Accepted values:

photon
spark - Spark job on the integrated Hadoop cluster
databricksSpark - Spark implementation on Azure Databricks
For more information, see Running Environment Options.

overrides.profiler (required, if first time running the job) When set to true, a visual profile of the job is generated as
specified by the profiling options for the platform. See Profiling Options.

overrides.writesettings (required, if first time running the job) These settings define the publishing options for the job. See
below.

ranfrom (optional) If this value is set to null, then the job does not show up in the Job Details page.

If set to cli, the job appears as a CLI job.

See Job Details Page.

writesettings Reference:
The writesettings values allow you to specify aspects of the publication of results to the specified path location.

NOTE: writesettings values are required if you are running this specified job for the dataset for the
first time.

NOTE: To specify multiple outputs, you can include additional writesettings objects in the request.
For example, if you want to generate output to csv and json, you can duplicate the writesettings object
for csv and change the format value in the second one to json.

These settings correspond to values that you can apply through the UI or through the command line interface.
For UI information, see Run Job Page.
For CLI information, see CLI for Jobs.

Property Description

path (required) The fully qualified path to the output location where the results are written

action (required) If the output file or directory exists, you can specify one of the following actions:

create - Create a new, parallel location, preserving the old results.


append - Add the new results to the old results.
overwrite - Replace the old results with the new results.

format (required) Output format for the results. Specify one of the following values:

csv
json
avro
pqt

NOTE: To specify multiple output formats, create an additional writesettings object for each output
format.

compression (optional) For csv and json results, you can optionally compress them using bzip2 or gzip compression.
Default is none.

header (optional) For csv results with action set to create or append, this value determines if a header row with
column names is inserted at the top of the results. Default is false.

asSingleFile (optional) For csv and json results, this value determines if the results are concatenated into a single file or stored
as multiple files. Default is false.
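The writesettings rules above can be checked before a request is submitted. The following Python sketch is one such check, and it also shows the duplicate-and-change-format approach for generating a second output. The function name, the output path, and the assumption that compression values are sent as the strings none, bzip2, and gzip are illustrative and not confirmed by this reference.

```python
import copy

ACTIONS = {"create", "append", "overwrite"}
FORMATS = {"csv", "json", "avro", "pqt"}
COMPRESSIONS = {"none", "bzip2", "gzip"}  # assumed string values


def validate_writesetting(ws):
    """Return a list of problems found in one writesettings object."""
    problems = []
    for key in ("path", "action", "format"):
        if key not in ws:
            problems.append(f"missing required property: {key}")
    if "action" in ws and ws["action"] not in ACTIONS:
        problems.append(f"unknown action: {ws['action']}")
    if "format" in ws and ws["format"] not in FORMATS:
        problems.append(f"unknown format: {ws['format']}")
    compression = ws.get("compression", "none")
    if compression not in COMPRESSIONS:
        problems.append(f"unknown compression: {compression}")
    elif compression != "none" and ws.get("format") not in ("csv", "json"):
        problems.append("compression applies to csv and json results only")
    return problems


csv_out = {
    "path": "hdfs://hadoop:50070/trifacta/queryResults/example/cdr_txt.csv",  # hypothetical path
    "action": "create",
    "format": "csv",
    "compression": "none",
    "header": False,
    "asSingleFile": False,
}
# A second output: duplicate the csv object and change the format to json.
json_out = copy.deepcopy(csv_out)
json_out["format"] = "json"
json_out["path"] = csv_out["path"].replace(".csv", ".json")

writesettings = [csv_out, json_out]
print([validate_writesetting(ws) for ws in writesettings])
```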

API JobGroups Delete v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API JobGroups Delete v4

Contents:
Required Permissions
Request
Response
Reference

Delete the specified jobGroup.


Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v3/jobGroups/<id>

where:

Parameter Description

<id> Internal identifier for the job group

Request URI - Example:

/v3/jobGroups/2

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

None.
API JobGroups Get Jobs v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API JobGroups Get Jobs v4

Contents:
Required Permissions
Request
Response
Reference

Get list of jobs for the specified jobGroup. For more information on jobGroups, see API JobGroups Get v3.
Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/jobGroups/<id>/jobs

where:

Parameter Description

<id> Internal identifier for the job group

Request URI - Example:

/v3/jobGroups/2/jobs

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"data": [
{
"id": 5,
"writeSetting": {
"id": 3
},
"scriptResult": {
"id": 4
},
"createdAt": "2017-05-05T20:38:15.883Z",
"updatedAt": "2017-05-05T20:38:19.411Z",
"status": "Complete",
"jobType": "filewriter",
"sampleSize": 100,
"percentComplete": 100,
"cpJob": null,
"createdBy": 1,
"errorMessage": null,
"jobGroup": {
"id": 3
}
},
{
"id": 6,

"writeSetting": {
"id": 4
},
"scriptResult": {
"id": 5
},
"createdAt": "2017-05-05T20:38:15.888Z",
"updatedAt": "2017-05-05T20:38:19.007Z",
"status": "Complete",
"jobType": "filewriter",
"sampleSize": 100,
"percentComplete": 100,
"cpJob": null,
"createdBy": 1,
"errorMessage": null,
"jobGroup": {
"id": 3
}
},
{
"id": 4,
"executionLanguage": "photon",
"createdAt": "2017-05-05T20:38:15.808Z",
"updatedAt": "2017-05-05T20:38:18.433Z",
"status": "Complete",
"jobType": "wrangle",
"sampleSize": 100,
"percentComplete": 100,
"cpJob": null,
"createdBy": 1,
"errorMessage": null,
"jobGroup": {
"id": 3
}

}
]
}

Reference

Property Description

writeSetting User-visible output settings. Contents may vary depending on the type of output.

scriptResult Internal identifier for job execution. Used by other dependent jobs to identify where to write results to or to collect
results from.

executionLanguage Indicator for the running environment where the job was executed. Values:

photon - Photon running environment internal to the Trifacta platform.


spark - Spark running environment on Hadoop cluster.
For more information on running environments, see Running Environment Options.

status Current status of the job. See API JobGroups Get v3.

jobType Type of job. Values:

filewriter - output results to a specified file


ingest - internal job for transferring JDBC-based data to backend storage. For more information, see
Configure JDBC Ingestion.
profile - job to compute statistical information about a set of results.
publish - job to publish previously wrangled results to a new destination.
wrangle - execute the specified set of Wrangle steps on a dataset.

sampleSize Size of sample as a percentage of the entire dataset. 100 means that the entire dataset is used as the sample.

percentComplete Percentage of completion of the job at the time of the request. 100 means that the job has finished or failed.

createdBy Internal identifier of the user who launched the job.

errorMessage If the job failed, any error message is displayed here.

For more information on the other properties, see API JobGroups Get v3.
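As a sketch of how the jobType, status, and errorMessage properties can be used together, the following Python summarizes the jobs in a parsed response. The function name and the shape of the summary are illustrative assumptions. The sample is a trimmed copy of the response example in this section.

```python
from collections import Counter


def summarize_jobs(response):
    """Summarize the jobs of a jobGroup from a Get Jobs response (already parsed)."""
    jobs = response.get("data") or []
    return {
        "by_type": dict(Counter(job["jobType"] for job in jobs)),
        "all_complete": bool(jobs) and all(job["status"] == "Complete" for job in jobs),
        "errors": [job["errorMessage"] for job in jobs if job.get("errorMessage")],
    }


# Trimmed copy of the response example in this section.
response = {
    "data": [
        {"id": 5, "jobType": "filewriter", "status": "Complete", "errorMessage": None},
        {"id": 6, "jobType": "filewriter", "status": "Complete", "errorMessage": None},
        {"id": 4, "jobType": "wrangle", "status": "Complete", "errorMessage": None},
    ]
}
summary = summarize_jobs(response)
print(summary)
```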
API JobGroups Get List v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API JobGroups Get List v4

Contents:
Required Permissions
Request
Response
Reference

Get list of all jobGroups accessible to the authenticated user.
Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/jobGroups

Endpoint with embedded reference:

/v3/jobGroups/?embed=jobs,wrangledDataset

For more information, see API JobGroups Get v3.


Endpoint with paged retrieval:

By default, this endpoint returns results in sets of 25.


You can apply query parameters to change the size of the default set and to page through result sets. The
following example queries for results 100 at a time. In this case, the query asks for results 201-300:

/v3/jobGroups/?embed=jobs,wrangledDataset&limit=100&offset=2

If the count of retrieved results is less than the limit, you have reached the end of the results.
Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"data": [
{
"id": 9,
"name": null,
"description": null,
"ranfrom": "ui",
"status": "InProgress",
"profilingEnabled": true,
"createdAt": "2017-01-31T20:01:57.513Z",
"updatedAt": "2017-01-31T20:01:58.984Z",
"createdBy": 2,
"updatedBy": 2,
"wrangledDataset": {
"id": 92
},
"snapshot": {
"id": 54
},
"wrangleScript": {
"id": 62
},
"jobs": {
"data": null
}
},
{
"id": 8,
"name": null,
"description": null,
"ranfrom": "ui",
"status": "Complete",
"profilingEnabled": true,
"createdAt": "2017-01-31T19:59:23.804Z",
"updatedAt": "2017-01-31T20:00:28.278Z",
"createdBy": 2,
"updatedBy": 2,
"wrangledDataset": {
"id": 92
},
"snapshot": {
"id": 53
},
"wrangleScript": {
"id": 60
},
"jobs": {
"data": null
}
}
]
}

Reference

For more information on the properties of a jobGroup, see API JobGroups Get v3.
API JobGroups Get Status v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API JobGroups Get Status v4

Contents:
Required Permissions
Request
Response
Reference

Get current status of the specified jobGroup. For more information on jobGroups, see API JobGroups Get v3.
Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/jobGroups/<id>/status

where:

Parameter Description

<id> Internal identifier for the job group

Request URI - Example:

/v3/jobGroups/2/status

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

"Complete"

Reference

For more information on the available status messages, see API JobGroups Get v3.
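The response body is a bare JSON string rather than an object. The following Python sketch parses it and decides whether the job group has finished. The status names are the documented job group status values. Treating Complete, Failed, and Canceled as final states is an assumption made for illustration, as are the function names.

```python
import json

# Status values documented for the status property of a job group.
TERMINAL = {"Complete", "Failed", "Canceled"}  # assumed to be final states
ACTIVE = {"Created", "Pending", "InProgress"}


def parse_status(response_text):
    """The status endpoint returns a bare JSON string, for example "Complete"."""
    status = json.loads(response_text)
    if status not in TERMINAL | ACTIVE:
        raise ValueError(f"unexpected status: {status!r}")
    return status


def is_finished(status):
    """True when the job group is in one of the assumed final states."""
    return status in TERMINAL


status = parse_status('"Complete"')
print(status, is_finished(status))
```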
API JobGroups Get v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API JobGroups Get v4

Contents:
Required Permissions
Request
Response
Reference

Get information on the specified job group. A job group is a job that is executed from a specific node in a flow.
The job group may contain:
Wrangling job on the dataset associated with the node
Jobs on all datasets on which the selected job may depend
A profiling job for the job group
Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/jobGroups/<id>

where:

Parameter Description

<id> Internal identifier for the job group

Endpoint with embedded references:


Use the following endpoint to embed additional information:

Embed Parameter Description

jobs Embed information about the child jobs within the job group. Array includes information on transformation, profiling, and
publishing jobs that are part of the job group.

wrangledDataset This field contains the internal identifier for the dataset on which the job was run.

/v3/jobGroups/<id>?embed=jobs,wrangledDataset

Request URI - Example:

/v3/jobGroups/8

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 8,
"name": null,
"description": null,
"ranfrom": "ui",
"status": "Complete",
"profilingEnabled": true,
"createdAt": "2017-01-31T19:59:23.804Z",
"updatedAt": "2017-01-31T20:00:28.278Z",
"createdBy": 2,
"updatedBy": 2,
"wrangledDataset": {
"id": 92
},
"snapshot": {
"id": 53
},
"wrangleScript": {
"id": 60
},
"jobs": {
"data": null
}
}

Response Body Example with embedded reference:


The following example response includes embedded information on the jobs and the wrangled dataset associated with the job group.

{
"id": 11,
"name": null,
"description": null,
"ranfrom": "ui",
"status": "Complete",
"profilingEnabled": true,
"createdAt": "2017-02-22T23:24:46.247Z",
"updatedAt": "2017-02-22T23:24:48.627Z",
"createdBy": 1,
"updatedBy": 1,
"wrangledDataset": {
"id": 118,
"wrangled": true,
"createdAt": "2017-02-22T22:54:13.782Z",
"updatedAt": "2017-02-22T22:54:22.107Z",
"createdBy": 1,
"updatedBy": 1,
"activeSample": {
"id": 56
},
"flow": {
"id": 29
},
"script": {
"id": 107
}
},
"snapshot": {
"id": 65
},
"wrangleScript": {
"id": 79
},
"jobs": {
"data": [
{
"id": 23,
"jobId": 23,
"writesettingId": 12,
"scriptresultId": 41,
"createdAt": "2017-02-22T23:24:46.789Z",
"updatedAt": "2017-02-22T23:24:48.618Z",
"status": "Complete",
"jobType": "filewriter",
"sampleSize": 100,
"percentComplete": 100,
"cpJobId": null,
"createdBy": 1,
"errorMessage": null,
"jobGroup": {
"id": 11
}

},
{
"id": 22,
"executionLanguage": "photon",
"jobId": 22,
"createdAt": "2017-02-22T23:24:46.740Z",
"updatedAt": "2017-02-22T23:24:48.426Z",
"wranglescriptId": 80,
"status": "Complete",
"jobType": "wrangle",
"sampleSize": 100,
"percentComplete": 100,
"cpJobId": null,
"createdBy": 1,
"errorMessage": null,
"jobGroup": {
"id": 11
}
}

]
}
}
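The two response examples differ in the jobs value: without the embed parameter, jobs.data is null, and with it, jobs.data is an array. The following Python sketch reads either form safely. The function name is an illustrative assumption, and the samples are trimmed copies of the responses above.

```python
def child_jobs(job_group):
    """Return the embedded child jobs, or an empty list when none are embedded."""
    jobs = job_group.get("jobs") or {}
    # Without embed=jobs, the response carries "jobs": {"data": null}.
    return jobs.get("data") or []


plain = {"id": 8, "status": "Complete", "jobs": {"data": None}}
embedded = {
    "id": 11,
    "status": "Complete",
    "jobs": {
        "data": [
            {"id": 23, "jobType": "filewriter", "status": "Complete"},
            {"id": 22, "jobType": "wrangle", "status": "Complete",
             "executionLanguage": "photon"},
        ]
    },
}
print(child_jobs(plain))
print([(job["id"], job["jobType"]) for job in child_jobs(embedded)])
```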

Reference

Property Description

id Internal identifier for the job group

name Internal name of the job group

description User-friendly description for the job group

ranfrom Where the job group was executed from:

ui - Trifacta application
cli - command line interface

status Current status of the job group:

Created - job group has been created based on the current action.
Pending - job group is queued for execution.
InProgress - job group is currently running.
Complete - job group has completed successfully.
Failed - job group has failed.
Canceled - job group was canceled by user action.

profilingEnabled When true, a profiling job was executed as part of this job group.

createdAt Timestamp for when the job group was launched

updatedAt Timestamp for when the job group was last updated

createdBy Internal identifier for the user who created the job group

updatedBy Internal identifier for the user who last updated the job group

wrangledDataset Internal identifier of the object from which the job group was executed.

snapshot Internal identifier of the data snapshot for the job group

wrangleScript Internal identifier of the Wrangle script to execute for the job group

jobs A list of all jobs that were launched based on this job group
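
The following Python sketch shows one way to read a response of the shape shown above. It is an illustration only: the function name summarize_job_group is hypothetical, treating Complete, Failed, and Canceled as terminal states is an assumption drawn from the status descriptions in the table, and the sketch assumes that individual jobs report the same status values as the job group.

```python
from typing import Any, Dict, List

# Assumption: these three statuses mean the job group will not change again.
TERMINAL_STATUSES = {"Complete", "Failed", "Canceled"}


def summarize_job_group(job_group: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize a job group response (hypothetical helper)."""
    jobs: List[Dict[str, Any]] = job_group.get("jobs", {}).get("data", [])
    return {
        "id": job_group["id"],
        "status": job_group["status"],
        "finished": job_group["status"] in TERMINAL_STATUSES,
        # Map each jobType (for example "wrangle" or "filewriter") to its status.
        "jobs": {job["jobType"]: job["status"] for job in jobs},
        # Collect the ids of any jobs that report a Failed status.
        "failed_jobs": [job["id"] for job in jobs if job["status"] == "Failed"],
    }


# Trimmed copy of the example response, keeping only the fields used here.
example = {
    "id": 11,
    "status": "Complete",
    "jobs": {
        "data": [
            {"id": 23, "jobType": "filewriter", "status": "Complete"},
            {"id": 22, "jobType": "wrangle", "status": "Complete"},
        ]
    },
}
summary = summarize_job_group(example)
print(summary)
```

A caller that polls the endpoint could stop polling once the finished flag is true and then inspect failed_jobs.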

API JobGroups Put Publish v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API JobGroups Put Publish v4

Contents:
Required Permissions
Request
Response
Reference

For a specified jobGroup, this endpoint performs an ad-hoc publish of the results to the designated target.
Target information is based on the specified connection.
Job results to be published are based on the specified jobGroup.
You can specify:
Database and table to which to publish
Type of action to be applied to the target table. Details are below.
Supported targets:
Hive
Redshift
For more information on jobGroups, see API JobGroups Get v3.
Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PUT

Endpoint:

/v3/jobGroups/<id>/publish

where:

Parameter Description

<id> Internal identifier for the job group

Request URI - Example:

/v3/jobGroups/2/publish

Request Body:

{
"connection": {
"id": 1
},
"path": ["default"],
"table": "test_table3",
"action": "create",
"inputFormat": "avro",
"flowNodeId": 10
}

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"jobgroupId": 2,
"jobIds": [
11
],
"reason": "JobStarted",
"sessionId": "d9f13aa0-3b35-11e7-9bff-c764dff4fad8"
}

Reference

Request Reference:

Property Description

connection Internal identifier of the connection to use to write the results.

path Path to the database to which to write the results.

table Name of table in the database to which to write the results.

action Type of writing action to perform with the results. Supported actions:

create - Create a new table with each publication. This table is empty except for the schema, which is taken
from the results. A new table receives a timestamp extension to its name.
load - Append the results data to a pre-existing table. The schema of the results and the table must
match.
createAndLoad - Create a new table with each publication and load it with the results data. A new table
receives a timestamp extension to its name.
truncateAndLoad - Truncate a pre-existing table and load it with fresh data from the results.
dropAndLoad - Drop the target table and load a new table with the schema and data from the results.

inputFormat Source format of the results. Supported values:

Hive:

avro
pqt

Redshift:

NOTE: For results to be written to Redshift, the source must be stored in S3 and accessed through an S3
connection.

NOTE: By default, data is published to Redshift using the public schema. To publish using a different
schema, preface the table value with the name of the schema to use: MySchema.MyTable.

csv
json
avro
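
As a rough illustration, the request body can be assembled and checked locally against the values listed in the reference above before it is sent. The helper name build_publish_body and its target argument are hypothetical and are not part of the platform API. The flowNodeId property appears in the example request body but is not described in the reference, so this sketch passes it through only when a value is supplied.

```python
from typing import Any, Dict, List, Optional

# Values copied from the request reference above.
SUPPORTED_ACTIONS = {"create", "load", "createAndLoad", "truncateAndLoad", "dropAndLoad"}
SUPPORTED_INPUT_FORMATS = {
    "hive": {"avro", "pqt"},
    "redshift": {"csv", "json", "avro"},
}


def build_publish_body(
    connection_id: int,
    target: str,
    path: List[str],
    table: str,
    action: str,
    input_format: str,
    flow_node_id: Optional[int] = None,
) -> Dict[str, Any]:
    """Return a body for PUT /v3/jobGroups/<id>/publish (hypothetical helper).

    `target` ("hive" or "redshift") is used only for local validation and is
    not sent to the platform.
    """
    formats = SUPPORTED_INPUT_FORMATS.get(target.lower())
    if formats is None:
        raise ValueError(f"unsupported target: {target}")
    if action not in SUPPORTED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    if input_format not in formats:
        raise ValueError(f"{input_format} is not listed for {target}")
    body: Dict[str, Any] = {
        "connection": {"id": connection_id},
        "path": path,
        "table": table,
        "action": action,
        "inputFormat": input_format,
    }
    if flow_node_id is not None:
        body["flowNodeId"] = flow_node_id
    return body


# Reproduces the example request body shown earlier in this section.
body = build_publish_body(1, "hive", ["default"], "test_table3", "create", "avro", flow_node_id=10)
print(body)
```

Validating locally catches a mismatched format, such as pqt for a Redshift target, before the request reaches the platform.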

For more information on the available status messages, see API JobGroups Get v3.

API People Create v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API People Create v4

Contents:
Required Permissions
Request
Response
Reference

Create a new user.


Version: v3

Required Permissions

The authenticated user must be an admin.

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v3/people/

Request Body:

{
"accept": "accept",
"password": "foo",
"password2": "foo",
"email": "[email protected]",
"name": "abc2"
}

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"outputHomeDir": "/trifacta/queryResults/[email protected]",
"isAdmin": false,
"isDisabled": false,
"id": 2,
"email": "[email protected]",
"name": "abc2",
"updatedAt": "2017-02-08T19:07:08.985Z",
"createdAt": "2017-02-08T19:07:08.985Z",
"ssoPrincipal": null,
"hadoopPrincipal": null,
"cpPrincipal": null,
"lastLoginTime": null,
"awsConfig": null
}

Reference

Request properties:

Property Description

accept This property must be set to accept to create the user.

password2 This value confirms the value for password. These two property values must be identical.
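
The two request rules above can be enforced before the request is sent. The following Python sketch is a minimal illustration; the function name build_create_user_body is hypothetical, and the email address in the usage line is a placeholder rather than a value from this guide.

```python
from typing import Dict


def build_create_user_body(email: str, name: str, password: str, password2: str) -> Dict[str, str]:
    """Return a body for POST /v3/people/ (hypothetical helper)."""
    if password != password2:
        raise ValueError("password and password2 must be identical")
    return {
        "accept": "accept",  # must be the literal string "accept"
        "password": password,
        "password2": password2,
        "email": email,
        "name": name,
    }


# "user@example.com" is a placeholder address.
body = build_create_user_body("user@example.com", "abc2", "foo", "foo")
print(body)
```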

For more information on the properties of a user, see API People Get v3.

API People Delete v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API People Delete v4

Contents:
Required Permissions
Request
Response
Reference

Delete the specified user.


Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v3/people/<id>

where:

Parameter Description

<id> Internal identifier for the user

Request URI - Example:

/v3/people/2

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

For more information on the properties of a user, see API People Get v3.

API People Get List v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API People Get List v4

Contents:
Required Permissions
Request
Response
Reference

Returns a list of all users of the platform.


Version: v3

Required Permissions

If you are not logged in or are logged in as a non-admin user, you can retrieve a very limited set of properties
for each user.
If you are logged in as an admin, you can retrieve the full property set for each user.

Request

Request Type: GET

Endpoint:

/v3/people

Endpoint with paged retrieval:

By default, this endpoint returns results in sets of 25.


You can apply query parameters to change the size of the default set and to page through result sets. The
following example queries for results 100 at a time. In this case, the query asks for results 201-300:

/v3/people?limit=100&offset=2

If the count of retrieved results is less than the limit, you have reached the end of the results.
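
The paging rule above can be sketched as a loop that stops when a page comes back shorter than the limit. This is an illustration under stated assumptions: the names page_url, iter_results, and fake_fetch are hypothetical, the fetch callable stands in for whatever HTTP client you use, and the sketch assumes that offset counts result sets, as the example above suggests (offset=2 with limit=100 returns results 201-300). If your deployment treats offset as a record count, pass offset_step=limit instead.

```python
from typing import Any, Callable, Dict, Iterator
from urllib.parse import parse_qs, urlencode, urlparse


def page_url(endpoint: str, limit: int, offset: int) -> str:
    """Append the limit and offset query parameters to an endpoint path."""
    return f"{endpoint}?{urlencode({'limit': limit, 'offset': offset})}"


def iter_results(
    fetch: Callable[[str], Dict[str, Any]],
    endpoint: str,
    limit: int = 25,
    offset_step: int = 1,
    max_pages: int = 1000,
) -> Iterator[Dict[str, Any]]:
    """Yield every item, requesting one page at a time.

    `fetch` is a caller-supplied function that performs the GET request and
    returns the decoded JSON response. `max_pages` is a safety stop.
    """
    offset = 0
    for _ in range(max_pages):
        page = fetch(page_url(endpoint, limit, offset)).get("data", [])
        yield from page
        if len(page) < limit:
            break  # fewer results than the limit: end of the results
        offset += offset_step


# Stand-in for a real HTTP call: serves 60 fake users from memory.
users = [{"id": i} for i in range(60)]


def fake_fetch(url: str) -> Dict[str, Any]:
    query = parse_qs(urlparse(url).query)
    limit, offset = int(query["limit"][0]), int(query["offset"][0])
    return {"data": users[offset * limit:(offset + 1) * limit]}


retrieved = list(iter_results(fake_fetch, "/v3/people", limit=25))
print(len(retrieved))
```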

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Examples:


When not logged in, or logged in as a non-admin user:

{
"data": [
{
"name": "ExampleUser",
"email": "[email protected]",
"id": 955
},
{
"name": "Example User 2",
"email": "[email protected]",
"id": 888
}
]
}

When logged in as an admin:

{
"data": [
{
"outputHomeDir": "/trifacta/queryResults/[email protected]",
"id": 955,
"email": "[email protected]",
"name": "ExampleUser",
"ssoPrincipal": null,
"hadoopPrincipal": "ExampleUser",
"cpPrincipal": null,
"isAdmin": false,
"isDisabled": false,
"lastLoginTime": null,
"createdAt": "2017-01-25T23:22:36.707Z",
"updatedAt": "2017-01-25T23:22:36.707Z",
"awsConfig": null
},
{
"outputHomeDir": "/trifacta/queryResults/[email protected]",
"id": 888,
"email": "[email protected]",
"name": "Example User 2",
"ssoPrincipal": null,
"hadoopPrincipal": null,
"cpPrincipal": null,
"isAdmin": false,
"isDisabled": false,
"lastLoginTime": null,
"createdAt": "2017-01-25T21:38:59.537Z",
"updatedAt": "2017-01-25T21:38:59.537Z",
"awsConfig": null
}
]
}

Reference

For more information on the properties of a user, see API People Get v3.

API People Get v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API People Get v4

Contents:
Required Permissions
Request
Response
Reference

Retrieve the platform account information for a user specified by userId.


Version: v3

Required Permissions

The authenticated user must be an admin.

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/people/<id>

where:

Parameter Description

<id> Internal identifier of the user to retrieve.

Request URI - Example:

/v3/people/2

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"outputHomeDir": "/trifacta/queryResults/[email protected]",
"id": 2,
"email": "[email protected]",
"name": "foobar379216291",
"ssoPrincipal": null,
"hadoopPrincipal": "foo",
"cpPrincipal": null,
"isAdmin": false,
"isDisabled": false,
"lastLoginTime": "2017-01-27T23:24:54.930Z",
"createdAt": "2017-01-12T00:37:03.138Z",
"updatedAt": "2017-01-27T23:24:54.936Z",
"awsConfig": null
}

Reference

Property Description

outputHomeDir Home directory where the user's generated results are written

id Internal user identifier

email Email address (and loginId) for the user

name Display name for the user

ssoPrincipal (If SSO is enabled) Principal value of the user for single sign-on

hadoopPrincipal (If secure impersonation is enabled) Hadoop principal value for the user, which determines permissions on the Hadoop
cluster

cpPrincipal (If enabled) Principal value used to integrate with cloud platform

isAdmin If true, the user account is an administrator account.

isDisabled If true, the account is disabled.

lastLoginTime Timestamp for the last time that the user logged in

createdAt Timestamp for when the user account was created

updatedAt Timestamp for when the user account was last modified

awsConfig (If AWS integration is enabled) Value contains the S3 credentials, default bucket, and any extra buckets to which the user
has access

API People Patch v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API People Patch v4

Contents:
Required Permissions
Request
Response
Reference

Updates the platform account information for a user specified by userId.


Version: v3

Required Permissions

The authenticated user must be an admin.

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PATCH

Endpoint:

/v3/people/<id>

where:

Parameter Description

<id> Internal identifier of the user to update.

Request URI - Example:

/v3/people/2

Request Body:

{
"outputHomeDir": "/trifacta/queryResults/[email protected]",
"email": "[email protected]",
"name": "Joe Example",
"ssoPrincipal": null,
"hadoopPrincipal": null,
"cpPrincipal": null,
"isAdmin": false,
"isDisabled": false,
"awsConfig": null
}

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 2,
"updatedAt": "2017-05-18T19:46:46.839Z"
}

Reference

For more information on these properties, see API People Get v3.

API Releases Create DryRun v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Releases Create DryRun v4

Contents:
Required Permissions
Request
Response
Reference

Perform a dry-run of creating a release for the specified deployment, which performs a check of all permissions
required to import the package, as well as any specified import rules.
For more information on import rules, see Define Import Mapping Rules.
If they occur, errors are reported in the response.
After you have successfully completed a dry-run, you can formally create the release via API. See
API Releases Create v3.

NOTE: Releases pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v3/deployments/<id>/releases/dryRun

where:

Parameter Description

<id> Internal identifier for the deployment

Request URI - Example:

/v3/deployments/2/releases/dryRun

Request Body:
The request body must include the following key and value combination submitted as form data:

key value

data "@path-to-file"

Example request - curl:

curl -X POST \
http://example.com:3005/v3/deployments/1/releases/dryRun \
-H 'authorization: Basic c29sc29uQHRyaWZhY3RhLmNvbTphZG1pbg==' \
-H 'cache-control: no-cache' \
-H 'content-type: multipart/form-data' \
-F [email protected]
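
For readers who script against the API in Python, the curl request above can be approximated with the standard library. This is a sketch under stated assumptions, not a definitive client: the function names basic_auth_header and build_release_request are hypothetical, the file name package.zip and the credentials are placeholders, and the part content type application/zip is an assumption. The sketch only builds the request object; sending it (for example with urllib.request.urlopen) is left to the caller.

```python
import base64
import uuid
from urllib.request import Request


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of a Basic authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_release_request(
    base_url: str,
    deployment_id: int,
    package_bytes: bytes,
    filename: str,
    auth_header: str,
    dry_run: bool = True,
) -> Request:
    """Build a multipart POST that submits the package under the form key `data`."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode("ascii"),
        f'Content-Disposition: form-data; name="data"; filename="{filename}"\r\n'.encode("utf-8"),
        b"Content-Type: application/zip\r\n\r\n",  # assumption: package is a ZIP
        package_bytes,
        f"\r\n--{boundary}--\r\n".encode("ascii"),
    ])
    url = f"{base_url}/v3/deployments/{deployment_id}/releases"
    if dry_run:
        url += "/dryRun"
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": auth_header,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Placeholder bytes and credentials; a real call would read the exported package file.
request = build_release_request(
    "http://example.com:3005", 1, b"placeholder", "package.zip",
    basic_auth_header("user", "pass"),
)
print(request.get_method(), request.get_full_url())
```

Passing dry_run=False targets the releases endpoint without the dryRun suffix.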

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"importRuleChanges": {
"object": [],
"value": []
},
"flowName": "POS-r01 Flow"
}

Reference

For more information on import rule changes, see Define Import Mapping Rules.

API Releases Create v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Releases Create v4

Contents:
Required Permissions
Request
Response
Reference

Create a release for the specified deployment.


The release is created from a local ZIP file containing the package of the flow exported from the source system.

NOTE: Releases pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v3/deployments/<id>/releases

where:

Parameter Description

<id> Internal identifier for the deployment

Request URI - Example:

/v3/deployments/2/releases

Request Body:
The request body must include the following key and value combination submitted as form data:

key value

data "@path-to-file"

Example request - curl:

curl -X POST \
http://example.com:3005/v3/deployments/1/releases \
-H 'authorization: Basic c29sc29uQHRyaWZhY3RhLmNvbTphZG1pbg==' \
-H 'cache-control: no-cache' \
-H 'content-type: multipart/form-data' \
-F [email protected]

Response

Response Status Code - Success: 201 - Created

Response Body Example:

{
"importRuleChanges": {
"object": [],
"value": []
},
"flowName": "POS-r01 Flow"
}

Reference

For more information on import rule changes, see Define Import Mapping Rules.

API Releases Delete v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Releases Delete v4

Contents:
Required Permissions
Request
Response
Reference

Delete the specified release.

NOTE: Releases pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v3/releases/<id>

where:

Parameter Description

<id> Internal identifier for the release

Request URI - Example:

/v3/releases/2

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

None.

API Releases Get v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Releases Get v4

Contents:
Required Permissions
Request
Response
Reference

Get the specified release.

NOTE: Releases pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/releases/<id>

where:

Parameter Description

<id> Internal identifier for the release

Request URI - Example:

/v3/releases/2

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 2,
"notes": "My second release",
"packageUuid": "b6b76bc0-a1c6-11e7-8c9d-f53cb0bb7b0a",
"active": null,
"createdAt": "2017-08-01T07:00:00.000Z",
"updatedAt": "2017-10-05T12:26:36.326Z",
"deploymentId": 1,
"createdBy": 1,
"updatedBy": 2
}

Reference

Property Description

id Internal identifier for the release

notes Display value for notes that you can add to describe the release

packageUuid Internal identifier for the package

active If true, the release is the active one for the deployment.

If set to null, the release is not active.

createdAt Timestamp for when the release was created.

updatedAt Timestamp for when the release was last updated.

createdBy Internal identifier for the user who created the release.

updatedBy Internal identifier for the user who last updated the release.

API Releases Package Get v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Releases Package Get v4

Contents:
Required Permissions
Request
Response
Reference

Retrieve a package containing the definition of the flow for the specified release.

NOTE: Releases pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

NOTE: This method exports flows from a Production instance, which is different from exporting using the
flows/:id/package endpoint, which exports from the Dev instance. Connection identifiers and paths may differ
between the two instances. This method is typically used for archiving flows from the Deployment
Manager.

Response body is the contents of the package. Package contents are a ZIPped version of the flow definition.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/releases/<id>/package

Parameter Description

<id> Internal identifier of the release to retrieve.

Request URI - Example:

/v3/releases/7/package

Request Body:
None.

Response

Response Status Code - Success: 200 - OK

Response Body Example:


Response body is the contents of the ZIP file. This package should be downloaded to your local environment.
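
One way to handle the downloaded bytes is to confirm that they form a ZIP archive before writing them to disk. The sketch below assumes the response body has already been read into memory by your HTTP client. The function name save_package and the file names are hypothetical, and the in-memory archive stands in for a real response body.

```python
import io
import os
import tempfile
import zipfile
from typing import List


def save_package(content: bytes, destination: str) -> List[str]:
    """Write a release package to disk and return the names of its entries."""
    if not zipfile.is_zipfile(io.BytesIO(content)):
        raise ValueError("response body is not a ZIP archive")
    with open(destination, "wb") as handle:
        handle.write(content)
    with zipfile.ZipFile(destination) as archive:
        return archive.namelist()


# Stand-in for a real response body: a small ZIP built in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("example.txt", "placeholder contents")

with tempfile.TemporaryDirectory() as folder:
    names = save_package(buffer.getvalue(), os.path.join(folder, "release-7.zip"))
print(names)
```

Rejecting non-ZIP content early helps distinguish an error response from a valid package.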

Reference

None.

API Releases Patch v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API Releases Patch v4

Contents:
Required Permissions
Request
Response
Reference

Update the specified release.

NOTE: Releases pertain to Production instances of the Trifacta® platform. For more information, see
Overview of Deployment Manager.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PATCH

Endpoint:

/v3/releases/<id>

where:

Parameter Description

<id> Internal identifier for the release

Request URI - Example:

/v3/releases/2

Request Body: Example - Activate the release


You can use the following example to make the current release the active one for the deployment.

NOTE: You can have only one active release per deployment. If this release is made active as part of this
execution, the currently active release is made inactive.

Tip: You can use this endpoint to deactivate a release, which prevents its jobs from being run. If there is
no active release for the deployment, no jobs are run via the deployment job run endpoint. See
API Deployments Run v3.

{
"active": true
}
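
The activation request can be sketched in Python as follows. The function name build_activate_release_request is hypothetical, and the application/json content type is an assumption, since this section does not state which content type the endpoint expects. The sketch builds the request object only and does not send it.

```python
import json
from urllib.request import Request


def build_activate_release_request(base_url: str, release_id: int, auth_header: str) -> Request:
    """Build a PATCH request that marks the release as the active one."""
    return Request(
        f"{base_url}/v3/releases/{release_id}",
        data=json.dumps({"active": True}).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": auth_header,
            "Content-Type": "application/json",  # assumption, not stated in this guide
        },
    )


# "Basic <token>" is a placeholder for a real authorization header value.
request = build_activate_release_request("http://example.com:3005", 2, "Basic <token>")
print(request.get_method(), request.get_full_url())
```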

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 2,
"updatedBy": 1,
"updatedAt": "2017-10-13T00:06:12.147Z"
}

Reference

For more information on the properties of a release, see API Releases Get v3.

API WrangledDatasets Create v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API WrangledDatasets Create v4

Contents:
Required Permissions
Request
Response
Reference

Create a new wrangled dataset from the specified imported dataset or wrangled dataset. The new wrangled dataset
is owned by the authenticated user.

Tip: In the Trifacta application UI, the WrangledDataset object is called a recipe.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: POST

Endpoint:

/v3/wrangledDatasets/

Request Body - Imported Dataset:

{
"name": "Copy of Imported Dataset 2",
"importedDataset": {
"id": 2
},
"flow": {
"id": 2
}
}

Request Body - Wrangled Dataset:

{
"name": "Copy of Wrangled Dataset 18",
"wrangledDataset": {
"id": 18
},
"flow": {
"id": 1
}
}
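
Because the two request bodies differ only in the key that names the source, a single helper can build either one. The following Python sketch is illustrative; the function name build_wrangled_dataset_body and its keyword arguments are hypothetical.

```python
from typing import Any, Dict, Optional


def build_wrangled_dataset_body(
    name: str,
    flow_id: int,
    *,
    imported_dataset_id: Optional[int] = None,
    wrangled_dataset_id: Optional[int] = None,
) -> Dict[str, Any]:
    """Return a body for POST /v3/wrangledDatasets/ from exactly one source."""
    if (imported_dataset_id is None) == (wrangled_dataset_id is None):
        raise ValueError("specify exactly one of imported_dataset_id or wrangled_dataset_id")
    if imported_dataset_id is not None:
        source_key, source_id = "importedDataset", imported_dataset_id
    else:
        source_key, source_id = "wrangledDataset", wrangled_dataset_id
    return {"name": name, source_key: {"id": source_id}, "flow": {"id": flow_id}}


# Reproduces the two example request bodies shown above.
from_imported = build_wrangled_dataset_body("Copy of Imported Dataset 2", 2, imported_dataset_id=2)
from_wrangled = build_wrangled_dataset_body("Copy of Wrangled Dataset 18", 1, wrangled_dataset_id=18)
print(from_imported)
print(from_wrangled)
```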

Response

Response Status Code - Success: 201 - Created

Response Body Example - Imported Dataset:

{
"id": 23,
"flowId": 2,
"scriptId": 24,
"wrangled": true,
"createdBy": 1,
"updatedBy": 1,
"updatedAt": "2017-02-08T20:28:06.067Z",
"createdAt": "2017-02-08T20:28:06.067Z",
"flowNodeId": null,
"deleted_at": null,
"activesampleId": null,
"name": "Copy of Imported Dataset 2",
"active": true
}

Response Body Example - Wrangled Dataset:

{
"id": 20,
"flowId": 2,
"scriptId": 21,
"wrangled": true,
"createdBy": 1,
"updatedBy": 1,
"updatedAt": "2017-02-08T20:26:09.446Z",
"createdAt": "2017-02-08T20:26:09.446Z",
"flowNodeId": null,
"deleted_at": null,
"activesampleId": null,
"name": "Copy of Wrangled Dataset 18",
"active": true
}

Reference

For more information on the properties of a wrangled dataset, see API WrangledDatasets Get v3.

API WrangledDatasets Delete v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API WrangledDatasets Delete v4

Contents:
Required Permissions
Request
Response
Reference

Delete the specified wrangled dataset.

Tip: In the Trifacta application UI, the WrangledDataset object is called a recipe.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: DELETE

Endpoint:

/v3/wrangledDatasets/<id>

where:

Parameter Description

<id> Internal identifier for the wrangled dataset

Request URI - Example:

/v3/wrangledDatasets/2

Request Body:
Empty.

Response

Response Status Code - Success: 204 - No Content

Response Body Example:


Empty.

Reference

For more information on the properties of a wrangled dataset, see API WrangledDatasets Get v3.

API WrangledDatasets Get List v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API WrangledDatasets Get List v4

Contents:
Required Permissions
Request
Response
Reference

Get the list of accessible wrangled datasets for the authenticated user.

Tip: In the Trifacta application UI, the WrangledDataset object is called a recipe.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/wrangledDatasets

Endpoint with embedded flow:

/v3/wrangledDatasets?embed=flow

For more information, see API WrangledDatasets Get v3.


Endpoint with paged retrieval:

By default, this endpoint returns results in sets of 25.


You can apply query parameters to change the size of the default set and to page through result sets. The
following example queries for results 100 at a time. In this case, the query asks for results 201-300:

/v3/wrangledDatasets?embed=flow&limit=100&offset=2

If the count of retrieved results is less than the limit, you have reached the end of the results.

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"data": [
{
"id": 35,
"wrangled": true,
"createdAt": "2017-02-03T05:16:55.844Z",
"updatedAt": "2017-02-03T05:16:56.998Z",
"createdBy": 1,
"updatedBy": 1,
"name": "base_type_map_array_record_large",
"description": null,
"activeSample": {
"id": 12
},
"flow": {
"id": 12
},
"script": {
"id": 36
}
},
{
"id": 33,
"wrangled": true,
"createdAt": "2017-02-03T05:15:16.145Z",
"updatedAt": "2017-02-03T05:15:18.859Z",
"createdBy": 1,
"updatedBy": 1,
"name": "original_2_1_random_rows",
"description": null,
"activeSample": {
"id": 10
},
"flow": {
"id": 11
},
"script": {
"id": 34
}
},
{
"id": 31,
"wrangled": true,
"createdAt": "2017-02-03T01:53:41.284Z",
"updatedAt": "2017-02-03T01:53:41.284Z",
"createdBy": 1,
"updatedBy": 1,
"name": "cdr.txt",
"description": null,
"flow": {
"id": 10
},
"script": {
"id": 32
}
},
{
"id": 30,
"wrangled": true,
"createdAt": "2017-02-03T01:53:40.930Z",
"updatedAt": "2017-02-03T01:53:40.930Z",
"createdBy": 1,
"updatedBy": 1,
"name": "customer_above_512k",
"description": null,
"flow": {
"id": 10
},
"script": {
"id": 31

}
}
]
}

Reference

For more information on the properties of a wrangled dataset, see API WrangledDatasets Get v3.
For more information on the embedded flow properties, see API Flows Get v3.

API WrangledDatasets Get PrimaryInputDataset v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API WrangledDatasets Get PrimaryInputDataset v4

Contents:
Required Permissions
Request
Response
Reference

Get the primary input dataset for the specified wrangled dataset. For a wrangled dataset, its primary input
dataset is the original dataset from which the wrangled dataset was created.

Tip: In the Trifacta application UI, the WrangledDataset object is called a recipe.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/wrangledDatasets/<id>/primaryInputDataset

where:

Parameter Description

<id> Internal identifier for the wrangled dataset

Request URI - Example:

/v3/wrangledDatasets/3/primaryInputDataset

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example - imported dataset:

{
"importedDataset": {
"id": 47,
"size": "292817",
"path":
"/trifacta/uploads/1/2a677cbe-ca19-4d47-b038-65cda938588d/POS-r01.txt",
"isSharedWithAll": false,
"type": "hdfs",
"cpProject": null,
"bucket": null,
"connectionId": null,
"isSchematized": false,
"createdAt": "2017-02-21T17:54:56.621Z",
"updatedAt": "2017-02-21T17:54:56.840Z",
"createdBy": 1,
"updatedBy": 1,
"parsingScriptId": 92
}
}

Response Body Example - wrangled dataset:

{
"wrangledDataset": {
"id": 50,
"scriptId": 49,
"flowId": 10,
"flowNodeId": null,
"deleted_at": null,
"wrangled": true,
"createdAt": "2017-02-21T16:53:15.619Z",
"updatedAt": "2017-02-21T17:04:03.257Z",
"activesampleId": 33,
"createdBy": 1,
"updatedBy": 1
}
}

Reference

Imported Dataset:
For more information on these properties, see API ImportedDatasets Get v3.

Wrangled Dataset:

Property Description

scriptId Internal identifier of the recipe associated with this dataset

flowId Internal identifier of the flow that contains this dataset

flowNodeId Internal identifier for the node of the flow to which the dataset is attached

deleted_at Timestamp for when the dataset was deleted.

If null, the dataset has not been deleted.

wrangled If true, this dataset is a wrangled dataset

For more information on the other properties, see API WrangledDatasets Get v3.

API WrangledDatasets Get v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API WrangledDatasets Get v4

Contents:
Required Permissions
Request
Response
Reference

Get the specified wrangled dataset.

Tip: In the Trifacta application UI, the WrangledDataset object is called a recipe.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v3/wrangledDatasets/<id>

Endpoint with embedded reference:


Use the following endpoint to embed information about the flow that contains the dataset in the response body.

/v3/wrangledDatasets/<id>?embed=flow

where:

Parameter Description

<id> Internal identifier for the wrangled dataset

Request URI - Example:

/v3/wrangledDatasets/35/

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 35,
"wrangled": true,
"createdAt": "2017-02-03T05:16:55.844Z",
"updatedAt": "2017-02-03T05:16:56.998Z",
"createdBy": 1,
"updatedBy": 1,
"name": "base_type_map_array_record_large",
"description": null,
"activeSample": {
"id": 12
},
"flow": {
"id": 12
},
"script": {
"id": 36
}
}

Response Body Example with embedded reference:

{
"id": 35,
"wrangled": true,
"createdAt": "2017-02-03T05:16:55.844Z",
"updatedAt": "2017-02-03T05:16:56.998Z",
"createdBy": 1,
"updatedBy": 1,
"name": "base_type_map_array_record_large",
"description": null,
"activeSample": {
"id": 12
},
"flow": {
"id": 12,
"name": "base_type_map_array_record_large Flow",
"description": null,
"createdAt": "2017-02-03T05:16:55.684Z",
"updatedAt": "2017-02-03T05:16:55.684Z",
"createdBy": 1,
"updatedBy": 1
},
"script": {
"id": 36
}
}
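
The two response shapes differ only in the contents of the flow object, so a client can detect whether the flow was embedded by checking for properties beyond id. The helper name embedded_flow_name in this Python sketch is hypothetical, and the dictionaries are trimmed copies of the examples above.

```python
from typing import Any, Dict, Optional


def embedded_flow_name(dataset: Dict[str, Any]) -> Optional[str]:
    """Return the flow name when the flow is embedded, otherwise None."""
    return dataset.get("flow", {}).get("name")


# Trimmed copies of the two example responses above.
plain = {"id": 35, "flow": {"id": 12}}
embedded = {"id": 35, "flow": {"id": 12, "name": "base_type_map_array_record_large Flow"}}

print(embedded_flow_name(plain))
print(embedded_flow_name(embedded))
```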

Reference

Wrangled Dataset:
These properties apply to the source of the wrangled dataset.

Property Description

id Internal identifier of the wrangled dataset

wrangled If true, this dataset is a wrangled dataset

createdAt Timestamp for when the dataset was created

updatedAt Timestamp for when the dataset was last updated

createdBy Internal identifier of the user who created the wrangled dataset

updatedBy Internal identifier of the user who last updated the wrangled dataset

name Internal name of the wrangled dataset

description User-friendly description for the wrangled dataset

activeSample Internal identifier of the currently active sample for this dataset

flow Internal identifier of the flow that contains this dataset

script Internal identifier of the recipe associated with this dataset

Embedded Flow:
For more information on the embedded flow properties, see API Flows Get v3.
API WrangledDatasets Put PrimaryInputDataset v3

The v3 APIs are scheduled for End of Life (EOL). You should migrate to using the latest version. For more
information, see API Version Support Matrix.
Latest version of this endpoint: API WrangledDatasets Put PrimaryInputDataset v4

Contents:
Required Permissions
Request
Response
Reference

Update the primary input dataset for the specified wrangled dataset. Each wrangled dataset must have one and
only one primary input dataset, which can be an imported or wrangled dataset.

Tip: In the Trifacta application UI, the WrangledDataset object is called a recipe.

This action performs a dataset swap for the source of a wrangled dataset, which can be done through the UI. See
Flow View Page.

Tip: After you have created a job via API, you can use this API to swap out the source data for the job's
dataset. In this manner, you can rapidly re-execute a pre-existing job using fresh data. See
API JobGroups Create v3.

Version: v3

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: PUT

Endpoint:

/v3/wrangledDatasets/<id>/primaryInputDataset

where:

Parameter Description

<id> Internal identifier for the wrangled dataset

Request URI - Example:

/v3/wrangledDatasets/3/primaryInputDataset

Request Body - imported dataset:

{
"importedDataset": {
"id": <id>
}
}

Request Body - wrangled dataset:

{
"wrangledDataset": {
"id": <id>
}
}
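
The request described above can be sketched with the Python standard library. This is an illustration under stated assumptions: the function name build_swap_request and the identifier values are hypothetical, the Content-Type header is an assumption, and authentication headers are omitted (see API Authentication). The request is built but not sent.

```python
import json
from urllib.request import Request


def build_swap_request(base_url, wrangled_id, source_kind, source_id):
    """Build (but do not send) the PUT request that swaps the primary input.

    source_kind must be "importedDataset" or "wrangledDataset", matching the
    two request bodies documented above.
    """
    if source_kind not in ("importedDataset", "wrangledDataset"):
        raise ValueError("source_kind must be importedDataset or wrangledDataset")
    url = f"{base_url}/v3/wrangledDatasets/{int(wrangled_id)}/primaryInputDataset"
    body = json.dumps({source_kind: {"id": int(source_id)}}).encode("utf-8")
    # Content-Type is an assumption; authentication headers are omitted here.
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


req = build_swap_request("http://www.example.com:3005", 3, "importedDataset", 8)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

To send the request, pass it to urllib.request.urlopen after adding the credentials that your deployment requires.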

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"id": 50,
"scriptId": 49,
"flowId": 10,
"flowNodeId": null,
"deleted_at": null,
"wrangled": true,
"createdAt": "2017-02-21T16:53:15.619Z",
"updatedAt": "2017-02-21T17:04:03.257Z",
"activesampleId": 33,
"createdBy": 1,
"updatedBy": 1
}

Reference

For more information on these properties, see API WrangledDatasets Get PrimaryInputDataset v3.
API Session Get

Contents:
Required Permissions
Request
Response
Reference

Get the specified session information.


Version: all

Required Permissions

NOTE: Each request to the Trifacta® platform must include authentication credentials. See
API Authentication.

Request

Request Type: GET

Endpoint:

/v2/session

Request Body:
Empty.

Response

Response Status Code - Success: 200 - OK

Response Body Example:

{
"outputHomeDir": "/trifacta/queryResults/[email protected]",
"id": 1,
"email": "[email protected]",
"name": "My Account",
"ssoPrincipal": null,
"hadoopPrincipal": null,
"cpPrincipal": null,
"isAdmin": true,
"isDisabled": false,
"forcePasswordChange": false,
"lastLoginTime": "2018-01-24T21:03:54.813Z",
"deleted_at": null,
"createdAt": "2018-01-24T08:29:11.248Z",
"updatedAt": "2018-01-24T21:03:54.813Z",
"awsconfigId": null,
"roles": [
{
"id": 1,
"role": "dataAdmin",
"createdAt": "2018-01-24T08:28:34.369Z",
"updatedAt": "2018-01-24T08:28:34.369Z",
"peopleworkspaces": {
"workspaceId": 1,
"personId": 1,
"roleId": 1,
"createdAt": "2018-01-24T08:29:11.360Z",
"updatedAt": "2018-01-24T08:29:11.360Z"
}
},
{
"id": 2,
"role": "wrangler",
"createdAt": "2018-01-24T08:28:34.369Z",
"updatedAt": "2018-01-24T08:28:34.369Z",
"peopleworkspaces": {
"workspaceId": 1,
"personId": 1,
"roleId": 2,
"createdAt": "2018-01-24T08:29:11.362Z",
"updatedAt": "2018-01-24T08:29:11.362Z"
}
},
{
"id": 4,
"role": "admin",

"createdAt": "2018-01-24T08:28:39.321Z",
"updatedAt": "2018-01-24T08:28:39.321Z",
"peopleworkspaces": {
"workspaceId": 1,
"personId": 1,
"roleId": 4,
"createdAt": "2018-01-24T08:29:11.362Z",
"updatedAt": "2018-01-24T08:29:11.362Z"
}
}
]
}

Reference

User information:
For more information on user properties, see API People Get v3.
Roles:

Property Description

id Internal identifier for the role

role User-friendly name of the role. Values:

dataAdmin
wrangler

NOTE: All valid user accounts must have the wrangler role.

createdAt Timestamp for when the role was added to the user account.

updatedAt Timestamp for when the role was last updated for the user account.

peopleworkspaces.workspaceId Internal identifier of the workspace to which the role applies.

peopleworkspaces.personId Internal identifier for the user within the workspace.

peopleworkspaces.roleId Internal identifier for this role within the workspace.

peopleworkspaces.createdAt Timestamp for when the workspace was created.

peopleworkspaces.updatedAt Timestamp for when the workspace was last updated.
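
As an illustration of reading the roles array, the following sketch extracts the role names from a parsed session response. The function name role_names is hypothetical, and the sample is a trimmed copy of the response body example above.

```python
def role_names(session):
    """Return the role names listed in a session response, in response order."""
    return [entry["role"] for entry in session.get("roles", [])]


# Trimmed copy of the response body example above.
sample_session = {
    "id": 1,
    "isAdmin": True,
    "roles": [
        {"id": 1, "role": "dataAdmin"},
        {"id": 2, "role": "wrangler"},
        {"id": 4, "role": "admin"},
    ],
}

print(role_names(sample_session))
```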

API Version Support Matrix


Versions of the Trifacta® REST APIs are supported according to the following dates.

API Version Status Active Support Start Date Active Support End/Maintenance Start Date End of Life Date

v4 Active 2018-04-06 (Release 5.0)

v3 Active 2017-02-27 (Release 4.0) The next release of Trifacta Wrangler Enterprise after Release 6.0 The next release of Trifacta Wrangler Enterprise after Release 6.0

API Migration to v4

Contents:
Connections
Datasets and Recipes
Flows
Flow import and export
Jobgroups and Jobs
Deployments and Releases
Users

This document describes how to migrate your existing usage of the v3 endpoints to their v4 equivalents.

NOTE: In the next release of Trifacta Wrangler Enterprise after Release 6.0, the v3 endpoints reach End of Life
(EOL) and will no longer be available in the product. You must migrate your API endpoint usage to v4.

This section contains a mapping of documentation between the publicly available v3 endpoints and their v4
equivalents.

NOTE: Except as noted, these v3 behaviors should be reflected in the v4 endpoints. Please be sure to
review the notes.
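
For endpoints whose URI differs only in the version segment, the change of URI can be sketched as a simple rewrite. The helper name to_v4 is hypothetical. It changes only the URI; request and response bodies may still differ between versions, as described in the Notes column of the tables below.

```python
import re


def to_v4(uri):
    """Rewrite a leading /v3/ version segment to /v4/; leave other URIs untouched."""
    return re.sub(r"^/v3/", "/v4/", uri, count=1)


print(to_v4("/v3/wrangledDatasets/35"))
print(to_v4("/v2/session"))
```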

Legend:

Item Description

URI Version-independent (vX) URI to the endpoint.
:id indicates that a numerical internal identifier for the relevant object must be included.

Method REST method to execute

v3 Documentation Link to v3 version of the documentation for the endpoint

v4 Documentation Link to v4 version of the documentation for the endpoint

Notes Any key notes on the changes between v3 and v4

Connections

URI Method v3 Documentation v4 Documentation Notes

/vX/connections POST API Connections Create v3 API Connections Create v4 Changes to request body. Changes to response body (connection object)

/vX/connections GET API Connections Get List v3 API Connections Get List v4

/vX/connections/:id GET API Connections Get v3 API Connections Get v4 Reference content on new version of connection object

/vX/connections/:id/status GET API Connections Get Status v3 API Connections Get Status v4

/vX/connections/:id DELETE API Connections Delete v3 API Connections Delete v4

Datasets and Recipes

URI Method v3 Documentation v4 Documentation Notes

/vX/importedDatasets POST API ImportedDatasets Create v3 API ImportedDatasets Create v4 More parameters returned in the generated object

/vX/importedDatasets GET API ImportedDatasets Get List v3 API ImportedDatasets Get List v4

/vX/importedDatasets/:id GET API ImportedDatasets Get v3 API ImportedDatasets Get v4 More available parameters in the new version of the importedDatasets object

/vX/importedDatasets/:id DELETE API ImportedDatasets Delete v3 API ImportedDatasets Delete v4

/vX/importedDatasets/:id/addToFlow POST API ImportedDatasets Post AddToFlow v3 API ImportedDatasets Post AddToFlow v4 More returned parameters in the generated object

/vX/wrangledDatasets POST API WrangledDatasets Create v3 API WrangledDatasets Create v4 Different set of parameters returned in the generated object

/vX/wrangledDatasets GET API WrangledDatasets Get List v3 API WrangledDatasets Get List v4

/vX/wrangledDatasets/:id GET API WrangledDatasets Get v3 API WrangledDatasets Get v4 Different set of parameters returned in the object

/vX/wrangledDatasets/:id DELETE API WrangledDatasets Delete v3 API WrangledDatasets Delete v4

/vX/wrangledDatasets/:id/primaryInputDataset GET API WrangledDatasets Get PrimaryInputDataset v3 API WrangledDatasets Get PrimaryInputDataset v4 More information is returned about the primary input dataset

/vX/wrangledDatasets/:id/primaryInputDataset PUT API WrangledDatasets Put PrimaryInputDataset v3 API WrangledDatasets Put PrimaryInputDataset v4 More information is returned about the primary input dataset

Flows

URI Method v3 Documentation v4 Documentation Notes

/vX/flows POST API Flows Create v3 API Flows Create v4

/vX/flows GET API Flows Get List v3 API Flows Get List v4

/vX/flows/:id GET API Flows Get v3 API Flows Get v4 associatedPeople parameters from v3 are no longer available.

/vX/flows/:id PATCH API Flows Patch v3 API Flows Patch v4

/vX/flows/:id DELETE API Flows Delete v3 API Flows Delete v4

Flow import and export

URI Method v3 Documentation v4 Documentation Notes

/vX/flows/package/dryRun POST API Flows Package Post DryRun v3 API Flows Package Post DryRun v4

/vX/flows/package POST API Flows Package Post v3 API Flows Package Post v4 More parameters returned in the generated object

/vX/flows/:id/package/dryRun GET API Flows Package Get DryRun v3 API Flows Package Get DryRun v4

/vX/flows/:id/package GET API Flows Package Get v3 API Flows Package Get v4

Jobgroups and Jobs

URI Method v3 Documentation v4 Documentation Notes

/vX/jobGroups POST API JobGroups Create v3 API JobGroups Create v4 v4 version supports adding overrides for datasets with parameters through API endpoint

/vX/jobGroups GET API JobGroups Get List v3 API JobGroups Get List v4 Returned information in v4 version includes runtime parameter overrides that were applied

/vX/jobGroups/:id GET API JobGroups Get v3 API JobGroups Get v4

/vX/jobGroups/:id/jobs GET API JobGroups Get Jobs v3 API JobGroups Get Jobs v4 Leaner and more informative response in v4

/vX/jobGroups/:id/status GET API JobGroups Get Status v3 API JobGroups Get Status v4 v4 doc includes method of acquiring status of all jobs for a specified status (e.g. Failed)

/vX/jobGroups/:id/publish PUT API JobGroups Put Publish v3 API JobGroups Put Publish v4

/vX/jobGroups/:id DELETE API JobGroups Delete v3 API JobGroups Delete v4

Deployments and Releases

URI Method v3 Documentation v4 Documentation Notes

/vX/deployments POST API Deployments Create v3 API Deployments Create v4

/vX/deployments GET API Deployments Get List v3 API Deployments Get List v4

/vX/deployments/:id GET API Deployments Get v3 API Deployments Get v4

/vX/deployments/:id PATCH API Deployments Patch v3 API Deployments Patch v4

/vX/deployments/:id/objectImportRules PATCH API Deployments Object Import Rules Patch v3 API Deployments Object Import Rules Patch v4

/vX/deployments/:id/valueImportRules PATCH API Deployments Value Import Rules Patch v3 API Deployments Value Import Rules Patch v4

/vX/deployments/:id?embed=releases GET API Deployments Get Release List v3 API Deployments Get Release List v4

/vX/deployments/:id/run POST API Deployments Run v3 API Deployments Run v4

/vX/deployments/:id DELETE API Deployments Delete v3 API Deployments Delete v4

/vX/releases POST API Releases Create v3 API Releases Create v4

/vX/deployments/:id/releases/dryRun POST API Releases Create DryRun v3 API Releases Create DryRun v4

/vX/releases/:id GET API Releases Get v3 API Releases Get v4

/vX/releases/:id PATCH API Releases Patch v3 API Releases Patch v4

/vX/releases/:id DELETE API Releases Delete v3 API Releases Delete v4

Users

URI Method v3 Documentation v4 Documentation Notes

/vX/people POST API People Create v3 API People Create v4 Same set of required parameters. More available parameters through v4 endpoint

/vX/people GET API People Get List v3 API People Get List v4

/vX/people/:id PATCH API People Patch v3 API People Patch v4 More editable parameters

/vX/people/:id GET API People Get v3 API People Get v4 More parameters available in the user object

/vX/people/:id DELETE API People Delete v3 API People Delete v4

API - UI Integrations
You can automate execution of tasks against the Trifacta platform by referencing URL destinations through your
Chrome browser.
Pre-requisites:
If the user has not authenticated with the Trifacta platform, the user is redirected to the Login page, where
they can log in before completing the UI integration.
The integration must be executed through Google Chrome.
How to:
To execute a UI integration, log in to the platform and execute the following:

[http|https]://<host>:<port>/<UI endpoint>

where:
http or https is the appropriate protocol identifier, depending on whether you have enabled SSL
connections
<host> is the hostname of the Trifacta platform.
<port> is the port number to connect to the Trifacta platform. Default value is 3005.
<UI endpoint> is the URI to use, as specified for the UI endpoint.

UI Integrations:
UI Integration - Create Dataset
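
The URL pattern described above can be assembled as in the following minimal sketch. The function name ui_integration_url and the example host are hypothetical. The default port of 3005 comes from the description above, and the example endpoint is the one used by the Create Dataset integration.

```python
def ui_integration_url(host, ui_endpoint, port=3005, ssl=False):
    """Assemble [http|https]://<host>:<port>/<UI endpoint>."""
    scheme = "https" if ssl else "http"
    # Strip a leading slash so the result never contains a double slash.
    return f"{scheme}://{host}:{port}/{ui_endpoint.lstrip('/')}"


print(ui_integration_url(
    "www.example.com",
    "import/data?uri=hdfs:///user/warehouse/campaign_data/",
))
```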

UI Integration - Create Dataset

Contents:
Pre-requisites
Authentication
Sources of Data
Step-by-Step Guide

Using the following URL endpoint, you can create a dataset from another application through the Trifacta
application.

NOTE: This integration is not supported in the Wrangler Enterprise desktop application.

Pre-requisites

If you are calling from a source application, you must be logged into that application first. See
Authentication below.
You must authenticate with the Trifacta platform before you are redirected to the target destination. See
API - UI Integrations.
This URL integration is supported on HDFS and S3 datastores.
It is assumed that there are no conflicting datasets with the names that are used to create the dataset in
this set of steps. No name validation is performed as part of this action.

Authentication

NOTE: Before using any UI integration, you must first log in to the application. If you are not logged in, you
are redirected to the login page, where you can enter your credentials before reaching your target URL.

In addition to authentication with the Trifacta platform, the authenticated user must also have the appropriate
permissions to access the assets on the datastore. This includes:
Permissions to access the folder or directory
Appropriate impersonated user configured for the account, if secure impersonation is enabled.
If this dataset is going to be executed later via command line interface, you must create the dataset with
the same user that will execute the job.
For more information:

Topic Section

HDFS: permissions and security See Configure Hadoop Authentication.

HDFS: usage See Using HDFS.

See HDFS Browser.

S3: permissions and security See Enable S3 Access.

S3: usage See Using S3.

See S3 Browser.

Sources of Data

You can use this integration to create datasets from single files or a single directory. Below are some example
URLs for sources from Hadoop HDFS or S3:

Datastore Source Type Example URL Results

HDFS Directory hdfs:///user/warehouse/campaign_data/ User can choose the file through the UI to use for the dataset.

HDFS File hdfs:///user/warehouse/campaign_data/d000001_01.csv User can complete the steps through the UI to create the dataset.

S3 Directory s3:///3fad-demo/data/biosci/source/ User can choose the file through the UI to use for the dataset.

S3 File s3:///3fad-demo/data/biosci/source/1-DRUG15Q1.txt User can complete the steps through the UI to create the dataset.

NOTE: The above results assume that the user has the appropriate permissions to access the file or
directory. If the user lacks permissions, an HTTP 404 error is displayed.

Step-by-Step Guide

Steps:
1. Acquire the target URL for the datastore through the Trifacta® application or through the datastore itself.
Example URLs:
a. HDFS (file):

hdfs:///user/warehouse/campaign_data/d000001_01.csv

b. S3 (directory):

s3:///3fad-demo/data/biosci/source/

2. Navigate the browser to the appropriate URL in the Trifacta platform. The following example applies to the
HDFS file example from above. It must be preceded by the base URL for the platform. For more
information, see API - UI Integrations.

<base_url>/import/data?uri=hdfs:///user/warehouse/campaign_data/d000001_01.csv

3. For file-based URLs, the file is selected automatically.
4. For directory-based URLs, select the files to include through the browser. Click Add
Datasets to a Flow. Add the datasets to an existing flow or create a new one for them.
5. After the datasets have been imported, open the flow in which your import is located. For the datasets that
you wish to execute, you should do the following in the Flow View page:
a. Click the icon for the dataset.
b. From the URL, retrieve the identifiers for the flow and the dataset. These values are needed for later
execution through the command line interface.
c. Example:

Dataset URL flowId datasetId

http://latest-dev.trifacta.net:3005/flows/31#dataset=186 31 186

The flowId is consistent across all datasets that you imported through the above steps.
6. You can open the datasets and wrangle them as needed.

7. Complete any required actions from within your source application.
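
The URL handling in steps 2 and 5 can be sketched as follows. Both function names are hypothetical. The source URI is appended without percent-encoding because that is how the example in step 2 is written; whether encoding is required is not stated here.

```python
from urllib.parse import parse_qs, urlsplit


def import_url(base_url, source_uri):
    """Build the Create Dataset URL; the source URI is appended unencoded, as in step 2."""
    return f"{base_url.rstrip('/')}/import/data?uri={source_uri}"


def flow_and_dataset_ids(flow_view_url):
    """Extract (flowId, datasetId) from a Flow View URL such as .../flows/31#dataset=186."""
    parts = urlsplit(flow_view_url)
    # The flowId is the last path segment; the datasetId is in the URL fragment.
    flow_id = int(parts.path.rstrip("/").rsplit("/", 1)[-1])
    dataset_id = int(parse_qs(parts.fragment)["dataset"][0])
    return flow_id, dataset_id


print(import_url("http://www.example.com:3005",
                 "hdfs:///user/warehouse/campaign_data/d000001_01.csv"))
print(flow_and_dataset_ids("http://latest-dev.trifacta.net:3005/flows/31#dataset=186"))
```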

You can run jobs on the dataset through the following interfaces:
UI: See Run Job Page.
API: See API JobGroups Create v3.
CLI: See CLI for Jobs.

API Workflows
In this section, you can review examples of how to execute workflows using one or more of the available API
endpoints.
Topics:
API Workflow - Develop a Flow
API Workflow - Deploy a Flow
API Workflow - Run Job on Dataset with Parameters
API Workflow - Publish Results
API Workflow - Manage Outputs
API Workflow - Swap Datasets

API Workflow - Develop a Flow

Contents:
Overview
Example Datasets
Step - Create Containing Flow
Step - Create Datasets
Step - Wrangle Data
Step - Run Job
Step - Monitoring Your Job
Step - Re-run Job

Overview

This example walks through the process of creating, identifying, and executing a job through automated methods.
For this example, these tasks are accomplished using the following methods:

NOTE: This API workflow applies to a Development instance of the Trifacta® platform, which is the
default platform instance type. For more information on Development and Production instances, see
Overview of Deployment Manager.

1. Locate or create flow. The datasets that you wrangle must be contained within a flow. You can add them
to an existing flow or create a new one through the APIs.
2. Create dataset. Through the APIs, you create an imported dataset from an asset that is accessible through
one of the established connections. Then, you create the recipe object through the API.
a. For the recipe, you must retrieve the internal identifier.
b. Through the application, you modify the recipe for the dataset.
3. Automate job execution. Using the APIs, you can automate execution of the wrangling of the dataset.
a. As needed, this job can be re-executed on a periodic basis or whenever the source files are
updated.

Example Datasets

In this example, you are attempting to wrangle monthly point of sale (POS) data from three separate regions into a
single dataset for the state. This monthly data must be enhanced with information about the products and stores
in the state. So, the example has a combination of transactional and reference data, which must be brought
together into a single dataset.

Tip: To facilitate re-execution of this job each month, the transactional data should be stored in a
dedicated directory. This directory can be overwritten with next month's data using the same filenames.
As long as the new files are structured in an identical manner to the original ones, the new month's data
can be processed by re-running the API aspects of this workflow.

Example Files:
The following files are stored on your HDFS deployment:

Path and Filename Description

hdfs:///user/pos/POS-r01.txt Point of sale transactions for Region 1.

hdfs:///user/pos/POS-r02.txt Point of sale transactions for Region 2.

hdfs:///user/pos/POS-r03.txt Point of sale transactions for Region 3.

hdfs:///user/ref/REF_PROD.txt Reference data on products for the state.

hdfs:///user/ref/REF_CAL.txt Reference data on stores in the state.

NOTE: The reference and transactional data are stored in separate directories. In this case, you can
assume that the user has read access through their Trifacta account to these directories, although this
access must be enabled and configured for real use cases.

Base URL:
For purposes of this example, the base URL for the Trifacta platform is the following:

http://www.example.com:3005

Step - Create Containing Flow

To begin, you must locate a flow or create a flow through the APIs to contain the datasets that you are importing.

NOTE: You cannot add datasets to the flow through the flows endpoint. Moving pre-existing datasets
into a flow is not supported in this release. Create or locate the flow first and then when you create the
datasets, associate them with the flow at the time of creation.
See API ImportedDatasets Create v4.
See API WrangledDatasets Create v4.

Locate:

NOTE: If you know the display name value for the flow and are confident that it is not shared with any
other flows, you can use the APIs to retrieve the flowId. See API Flows Get List v4.

1. Login through the application.


2. In the Flows page, select or create the flow to contain the above datasets.
3. In the Flow Details page for that flow, locate the flow identifier in the URL:

Flow Details URL http://www.example.com:3005/flows/10


Flow Id 10

4. Retain this identifier for later use.


Create:
1. Through the APIs, you can create a flow using the following call:

Endpoint http://www.example.com:3005/v4/flows
Authentication Required

Method POST
Request Body
{
"name": "Point of Sale - 2013",
"description": "Point of Sale data for state"
}

2. The response should be status code 201 - Created with a response body like the following:

{
"id": 10,
"updatedAt": "2017-02-17T17:08:57.848Z",
"createdAt": "2017-02-17T17:08:57.848Z",
"name": "Point of Sale - 2013",
"description": "Point of Sale data for state",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"workspace": {
"id": 1
}
}

3. Retain the flow identifier (10) for later use.

For more information, see API Flows Create v4.

Checkpoint: You have identified or created the flow to contain your dataset or datasets.
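
The locate-or-create decision in this step can be sketched as follows. Both function names are hypothetical. The sketch assumes that you have already retrieved a list of flow objects that each carry the id and name properties shown in the response above; the exact envelope returned by the flow list endpoint is not shown in this section.

```python
import json


def find_flow_id(flows, name):
    """Return the id of the single flow whose name matches; fail on zero or several."""
    matches = [flow["id"] for flow in flows if flow.get("name") == name]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one flow named {name!r}, found {len(matches)}")
    return matches[0]


def create_flow_body(name, description):
    """Serialize the request body for POST /v4/flows."""
    return json.dumps({"name": name, "description": description})


flows = [{"id": 10, "name": "Point of Sale - 2013"}, {"id": 11, "name": "Other"}]
print(find_flow_id(flows, "Point of Sale - 2013"))
print(create_flow_body("Point of Sale - 2013", "Point of Sale data for state"))
```

Failing when the name matches several flows reflects the note above: a lookup by display name is safe only when the name is unique.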

Step - Create Datasets

To create datasets from the above sources, you must:


1. Create an imported dataset for each file.
2. For each imported dataset, create a recipe, which can be used to transform the imported dataset.
The following steps describe how to complete these actions via API for a single file.
Steps:
1. To create an imported dataset, you must acquire the following information about the source. In the above
example, the source is the POS-r01.txt file.
a. path
b. type
c. name
d. description
e. bucket (if a file stored on S3)
2. Construct the following request:

Endpoint http://www.example.com:3005/v4/importedDatasets
Authentication Required

Method POST

Request Body
{
"path": "/user/pos/POS-r01.txt",
"type": "hdfs",
"bucket": null,
"name": "POS-r01.txt",
"description": "POS-r01.txt"
}

3. You should receive a 201 - Created response with a response body similar to the following:

{
"id": 8,
"size": "281032",
"path": "/user/pos/POS-r01.txt",
"dynamicPath": null,
"type": "hdfs",
"bucket": null,
"isSchematized": true,
"isDynamic": false,
"disableTypeInference": false,
"updatedAt": "2017-02-08T18:38:56.640Z",
"createdAt": "2017-02-08T18:38:56.560Z",
"parsingScriptId": {
"id": 14
},
"runParameters": {
"data": []
},
"name": "POS-r01.txt",
"description": "POS-r01.txt",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"connection": null
}

4. You must retain the id value so you can reference it when you create the recipe.
5. See API ImportedDatasets Create v4.
6. Next, you create the recipe. Construct the following request:

Endpoint http://www.example.com:3005/v4/wrangledDatasets
Authentication Required

Method POST

Request Body
{
"name": "POS-r01",
"importedDataset": {
"id": 8
},
"flow": {
"id": 10
}
}

7. You should receive a 201 - Created response with a response body similar to the following:

{
"id": 23,
"wrangled": true,
"updatedAt": "2018-02-06T19:59:22.735Z",
"createdAt": "2018-02-06T19:59:22.698Z",
"name": "POS-r01",
"active": true,
"referenceInfo": null,
"activeSample": {
"id": 23
},
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"recipe": {
"id": 23
},
"flow": {
"id": 10
}
}

8. From the recipe, you must retain the value for the id. For more information, see
API WrangledDatasets Create v4.

9. Repeat the above steps for each of the source files that you are adding to your flow.

Checkpoint: You have created a flow with multiple imported datasets and recipes.
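
Because the same two requests are repeated for each source file, the request bodies can be generated from the file paths. The following is a minimal sketch: the function names are hypothetical, and using the filename as the dataset name and description simply mirrors the POS-r01.txt example above.

```python
import posixpath

FLOW_ID = 10  # flow identifier retained from the previous step

SOURCE_PATHS = [
    "/user/pos/POS-r01.txt",
    "/user/pos/POS-r02.txt",
    "/user/pos/POS-r03.txt",
    "/user/ref/REF_PROD.txt",
    "/user/ref/REF_CAL.txt",
]


def imported_dataset_body(path):
    """Request body for creating an imported dataset from an HDFS file."""
    filename = posixpath.basename(path)
    return {"path": path, "type": "hdfs", "bucket": None,
            "name": filename, "description": filename}


def recipe_body(imported_dataset_id, path, flow_id=FLOW_ID):
    """Request body for creating the recipe on top of an imported dataset."""
    name = posixpath.splitext(posixpath.basename(path))[0]
    return {"name": name,
            "importedDataset": {"id": imported_dataset_id},
            "flow": {"id": flow_id}}


for source in SOURCE_PATHS:
    print(imported_dataset_body(source)["name"])
```

In a real run, the identifier passed to recipe_body is the id value returned when the imported dataset is created.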

Step - Wrangle Data

After you have created the flow with all of your source datasets, you can wrangle the base dataset to integrate all
of the source into it.
Steps for Transactional data:

1. Open the POS-r01 dataset. It's loaded in the Transformer page.
2. To chain together the other transactional data into this dataset, you use a union transform. In the Search
panel, enter union in the textbox and press ENTER.
3. In the Union page:
a. Click Add datasets.
b. Select the other two transactional datasets: POS-r02 and POS-r03.

NOTE: When you join or union one dataset into another, changes made in the joined
dataset are automatically propagated to the dataset where it has been joined.

c. Add the datasets and align by name.


d. Check the dataset names and fields. If all looks well, click Add to Recipe.
Steps for reference data:
The columns Store_Nbr and Item_Nbr are unique keys into the REF_CAL and REF_PROD datasets,
respectively. Using the Join panel, you can pull in the other fields from these reference datasets based on these
unique keys.
1. Open the POS-r01 dataset.
2. In the Search panel, enter join for the transform. The Join panel opens.
3. Select the REF_PROD dataset. Click Accept. Click Next.
4. Review the two keys to verify that they are the proper columns on which to structure the join. Click Next.
5. Click the All tab. Select all fields to add. Click Review.
6. After reviewing your join, click Add to Recipe.
7. For each Item_Nbr value that has a matching ITEM_NBR value in the reference dataset, all of the other
reference fields are pulled into the POS-r01 dataset.

You can repeat the above general process to integrate the reference data for stores.

Checkpoint: You have created a flow with multiple datasets and have integrated all of the relevant data
into a single dataset.

Step - Run Job

Through the APIs, you can specify and run a job. In the above example, you must run the job for the terminal
dataset, which is POS-r01 in this case. This dataset contains references to all of the other datasets. When the job
is run, the recipes for the other datasets are also applied to the terminal dataset, which ensures that the output
reflects the proper integration of these other datasets into POS-r01.

Steps:
1. Acquire the internal identifier for the recipe for which you wish to execute a job. In the previous example,
this identifier was 23.
2. Construct a request using the following:

Endpoint http://www.example.com:3005/v4/jobGroups
Authentication Required

Method POST

Request Body:

{
"wrangledDataset": {
"id": 23
},
"overrides": {
"execution": "photon",
"profiler": true,
"writesettings": [
{
"path": "hdfs://hadoop:50070/trifacta/queryResults/[email protected]/POS-r01.csv",
"action": "create",
"format": "csv",
"compression": "none",
"header": false,
"asSingleFile": false
}
]
},
"ranfrom": null
}

3. In the above example, the specified job has been launched for recipe 23 to execute on the Photon running
environment with profiling enabled.
a. Output format is CSV to the designated path. For more information on these properties, see
API JobGroups Create v4.
b. Output is written as a new file with no overwriting of previous files.
4. A response code of 201 - Created is returned. The response body should look like the following:

{
"reason": "JobStarted",
"sessionId": "9c2c6220-ef2d-11e6-b644-6dbff703bdfc",
"id": 3
}

5. Retain the id value, which is the job identifier, for monitoring.
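
The request body above can be generated from a few inputs, as in the following sketch. The function names are hypothetical, and the output path in the example is a placeholder in which <user> stands for the account's output directory. The write settings mirror the example above.

```python
import json


def job_group_body(recipe_id, output_path, execution="photon", profiler=True):
    """Build the job request body with a single CSV write setting."""
    return {
        "wrangledDataset": {"id": recipe_id},
        "overrides": {
            "execution": execution,
            "profiler": profiler,
            "writesettings": [{
                "path": output_path,
                "action": "create",
                "format": "csv",
                "compression": "none",
                "header": False,
                "asSingleFile": False,
            }],
        },
        "ranfrom": None,
    }


def job_group_id(response):
    """Return the job identifier from a parsed 201 - Created response body."""
    return response["id"]


body = job_group_body(
    23, "hdfs://hadoop:50070/trifacta/queryResults/<user>/POS-r01.csv")
print(json.dumps(body, indent=2))
print(job_group_id({"reason": "JobStarted", "id": 3}))
```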

Step - Monitoring Your Job

You can monitor the status of your job through the following endpoint:

Endpoint http://www.example.com:3005/v4/jobGroups/<id>/status
Authentication Required

Method GET
Request Body None.

When the job has successfully completed, the returned status message is the following:

"Complete"

For more information, see API JobGroups Get Status v4.
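
Monitoring usually means polling the status endpoint until the job finishes. The following sketch shows one way to do that. The function name wait_for_job is hypothetical, "Failed" is assumed to be a terminal failure value, and the "Pending" value in the usage lines is only a placeholder for any non-terminal status. The status lookup is passed in as a callable, so the loop can be exercised without a running platform.

```python
import time


def wait_for_job(fetch_status, poll_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Poll until the job group reports "Complete".

    fetch_status is any zero-argument callable that returns the status string,
    for example a function that issues the GET request described above.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status == "Complete":
            return status
        if status == "Failed":  # assumed terminal failure value
            raise RuntimeError("job group failed")
        sleep(poll_seconds)
    raise TimeoutError(f"job group not complete after {max_polls} polls")


# Usage with a canned status sequence instead of real HTTP calls.
fake_statuses = iter(["Pending", "Complete"])
print(wait_for_job(lambda: next(fake_statuses), sleep=lambda _seconds: None))
```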

Step - Re-run Job

In the future, you can re-run the job exactly as you specified it by executing the following call:

Tip: You can swap imported datasets before re-running the job. For example, if you have uploaded a new
file, you can change the primary input dataset for the dataset and then use the following API call to re-run
the job as specified. See API WrangledDatasets Put PrimaryInputDataset v4.

Endpoint http://www.example.com:3005/v4/jobGroups
Authentication Required

Method POST
Request Body
{
"wrangledDataset": {
"id": 23
}
}

The job is re-run as it was previously specified.


If you need to modify any job parameters, you must create a new job definition.
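
The monthly re-run, with an optional dataset swap beforehand, can be expressed as an ordered list of calls. This is a sketch under assumptions: the function name rerun_plan is hypothetical, and the v4 request body for the primary input dataset swap is assumed to match the v3 body documented earlier in this guide.

```python
def rerun_plan(recipe_id, new_imported_dataset_id=None):
    """Return the ordered (method, path, body) calls for a re-run."""
    calls = []
    if new_imported_dataset_id is not None:
        # Swap the primary input first so the job reads the fresh data.
        calls.append((
            "PUT",
            f"/v4/wrangledDatasets/{recipe_id}/primaryInputDataset",
            {"importedDataset": {"id": new_imported_dataset_id}},
        ))
    calls.append(("POST", "/v4/jobGroups", {"wrangledDataset": {"id": recipe_id}}))
    return calls


for method, path, body in rerun_plan(23, new_imported_dataset_id=42):
    print(method, path, body)
```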
API Workflow - Deploy a Flow

Contents:
Overview
Pre-requisites
Workflow
Step - Get Flow Id
Step - Export a Flow
Step - Create Deployment
Step - Create Connection
Step - Create Import Rules
Step - Import Release
Step - Activate Release
Step - Run Deployment
Step - Iterate
Step - Set up Production Schedule

Overview
In this workflow, you learn how to deploy a flow in development to a production instance of the platform. After you
have created and finished a flow in a Development (Dev) instance, you can deploy it to an environment designed
primarily for production execution of jobs for finished flows (Prod instance). For more information on managing
these deployments, see Overview of Deployment Manager.

Pre-requisites

Finished flow: This example assumes that you have finished development of a flow with the following
characteristics:
Single dataset imported from a table through a Redshift connection
Single JSON output
Separate Dev and Prod instances: Although it is possible to deploy flows to the same instance in which they are
developed, this example assumes that you are deploying from a Dev instance to a completely separate Prod
instance. The following implications apply:
Separate user accounts to access Dev (User1) and Prod (Admin2) instances.

Tip: You should do all of your recipe development and testing in Dev/Test. Avoid making changes
in a Prod environment.

NOTE: Although these are separate user accounts, the assumption is that the same admin-level
user is using these accounts through the APIs.

New connections must be created in the Prod instance to access the production version of the database
table.

Workflow

In this example, your environment contains separate Dev and Prod instances, each of which has a different set of
users.

Item             Dev                                     Prod

Environment      http://wrangle-dev.example.com:3005     http://wrangle-prod.example.com:3005

                 Tip: Dev environment work can be done through the UI, which may be easier.

User             User1                                   Admin2

                 NOTE: User1 has no access to Prod.

Source DB        devWrangleDB                            prodWrangleDB

Source Table     Dev-Orders                              Prod-Orders

Connection Name  Dev Redshift Conn                       Prod Redshift Conn

Example Flow:

User1 is creating a flow, which is used to wrangle weekly batches of orders for the enterprise. The flow contains:

A single imported dataset that is created from a Redshift database table.


A single recipe that modifies the imported dataset.
A single output to a JSON file.
Production data is hosted in a different Redshift database. So, the Prod connection is different from the
Dev connection.
Steps:
1. Build in Dev instance: User1 creates the flow and iterates on building the recipe and running jobs until a
satisfactory output can be generated in JSON format.
2. Export: When User1 is ready to push the flow to production, User1 exports the flow and downloads the
export package ZIP file to the local desktop.
3. Deploy to Prod instance:
a. Admin2 creates a new deployment in the Prod instance.
b. Admin2 creates a new connection (Prod Redshift Conn) in the Prod instance.
c. Admin2 creates new import rules in the Prod instance to map from the old connection (Dev Redshift
Conn) to the new one (Prod Redshift Conn).
d. Admin2 uploads the export ZIP package.
4. Test deployment: Through Flow View in the Prod instance, Admin2 runs a job. The results look fine.
5. Set schedule: Using cron, Admin2 sets a schedule to run the active release for this deployment once per
week.
a. Each week, the Prod-Orders table must be refreshed with data.
b. The dataset is now operational in the Prod environment.

Step - Get Flow Id

The first general step is for the Dev user (User1) to get the flowId and export the flow from the Dev instance.
Steps:

Tip: If it's easier, you can gather the flowId from the user interface in Flow View. In the following example,
the flowId is 21:

http://www.wrangle-dev.example.com:3005/flows/21

1. Through the APIs, you can retrieve the list of flows using the following call:

Endpoint http://www.wrangle-dev.example.com:3005/v4/flows
Authentication Required

Method GET
Request Body None.

2. The response should be status code 200 - OK with a response body like the following:

{ "data": [
{
"id": 21,
"name": "Intern Training",
"description": "null",
"createdAt": "2019-01-08T18:14:37.851Z",
"updatedAt": "2019-01-08T18:57:26.824Z",
"creator": {
"id": 2
},
"updater": {
"id": 2
},
"folder": {
"id": 1
},
"workspace": {
"id": 1
}
},
{
"id": 19,
"name": "example Flow",
"description": null,
"createdAt": "2019-01-08T17:25:21.392Z",
"updatedAt": "2019-01-08T17:30:30.959Z",
"creator": {
"id": 2
},
"updater": {
"id": 2
},
"folder": {
"id": 4
},
"workspace": {
"id": 1
}
}
]
}

3. Retain the flow identifier (21) for later use.

Checkpoint: You have identified the flow to export.

For more information, see API Flows Get List v4.
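
If you prefer to pick out the flowId programmatically rather than reading the response by eye, the lookup can be sketched as follows. This is a minimal illustration that works on a trimmed copy of the response shown above; the helper name find_flow_id is hypothetical and is not part of the platform APIs.

```python
import json


def find_flow_id(flows_response, flow_name):
    """Return the id of the first flow whose name matches, or None."""
    for flow in flows_response.get("data", []):
        if flow.get("name") == flow_name:
            return flow["id"]
    return None


# Trimmed copy of the list response from the step above.
sample = json.loads("""
{"data": [
  {"id": 21, "name": "Intern Training"},
  {"id": 19, "name": "example Flow"}
]}
""")

flow_id = find_flow_id(sample, "Intern Training")
print(flow_id)
```

Matching on the flow name assumes that names are unique among the flows you can see; if they are not, compare additional fields such as createdAt.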

Step - Export a Flow

Export the flow to your local desktop.

Tip: This step may be easier to do through the UI in the Dev instance.

Steps:
1. Export flowId=21:

Endpoint http://www.wrangle-dev.example.com:3005/v4/flows/21/package
Authentication Required

Method GET
Request Body None.

2. The response should be status code 200 - OK. The response body is the export package (ZIP file) for the flow.
3. Download and save this file to your local desktop. Let's assume that the filename you choose is
flow-WrangleOrders.zip.
For more information, see API Flows Package Get v4.
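
The export call above can be sketched in Python as shown below. This is an illustration only: it assumes that the instance accepts HTTP Basic credentials, and the helper names build_export_request and save_package are hypothetical. The request is built without being sent, so you can inspect it first.

```python
import base64
import urllib.request


def build_export_request(base_url, flow_id, username, password):
    """Build (but do not send) the GET request for a flow's export package."""
    url = f"{base_url}/v4/flows/{flow_id}/package"
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    # Assumption: the instance accepts HTTP Basic credentials.
    request.add_header("Authorization", f"Basic {token}")
    return request


def save_package(request, destination):
    """Send the request and write the returned ZIP bytes to a local file."""
    with urllib.request.urlopen(request) as response, open(destination, "wb") as out:
        out.write(response.read())


req = build_export_request(
    "http://www.wrangle-dev.example.com:3005", 21, "User1", "<password>"
)
print(req.full_url)
```

Calling save_package(req, "flow-WrangleOrders.zip") would then perform the download against a reachable Dev instance.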

Step - Create Deployment

In the Prod environment, you can create the deployment from which you can manage the new flow. Note that the
following information has changed for this environment:

Item Prod env value

userId Admin2

baseURL http://www.wrangle-prod.example.com:3005

Steps:
1. Through the APIs, you can create a deployment using the following call:

Endpoint http://www.wrangle-prod.example.com:3005/v4/deployments
Authentication Required

NOTE: Username and password credentials must be submitted for the Admin2 account.

Method POST
Request Body
{
"name": "Production Orders"
}

2. The response should be status code 201 - Created with a response body like the following:

{ "id": 3,
"name": "Production Orders",
"updatedAt": "2017-11-27T23:48:54.340Z",
"createdAt": "2017-11-27T23:48:54.340Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
}
}

3. Retain the deploymentId (3) for later use.


For more information, see API Deployments Create v4.
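
As a rough sketch, the create-deployment call and the extraction of the deploymentId could look like the following. The function names are hypothetical, authentication headers are omitted for brevity, and the request is only built, not sent.

```python
import json
import urllib.request


def build_create_deployment_request(base_url, name):
    """Build the POST request that creates a deployment with the given name."""
    body = json.dumps({"name": name}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v4/deployments", data=body, method="POST"
    )
    request.add_header("Content-Type", "application/json")
    return request


def deployment_id_from(response_text):
    """Read the deploymentId out of the JSON response body."""
    return json.loads(response_text)["id"]


request = build_create_deployment_request(
    "http://www.wrangle-prod.example.com:3005", "Production Orders"
)
print(request.full_url, request.data.decode("utf-8"))
```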

Step - Create Connection

When a flow is exported, its connections are not included in the export. Before you import the flow into a new
environment:
Connections must be created or recreated in the Prod environment. In some cases, you may need to point
to production versions of the data contained in completely different databases.
Rules must be created to remap the connection to use in the imported flow.
This section and the following step through these processes.
Steps:
1. From the Dev environment, you collect the connection information for the flow:

Endpoint http://www.wrangle-dev.example.com:3005/v4/connections
Authentication Required

NOTE: Username and password credentials must be submitted for the User1 account.

Method GET
Request Body None.

2. The response should be status code 200 - Ok with a response body like the following:

{
"data": [
{
"connectParams": {
"vendor": "redshift",
"vendorName": "redshift",
"host": "dev-redshift.example.com",
"port": "5439"
},
"id": 9,

"host": "dev-redshift.example.com",
"port": 5439,
"vendor": "redshift",
"params": {
"connectStrOpts": "",
"defaultDatabase": "devWrangleDB",
"extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
},
"ssl": false,
"vendorName": "redshift",
"name": "Dev Redshift Conn",
"description": "",
"type": "jdbc",
"isGlobal": true,
"credentialType": "custom",
"credentialsShared": true,
"uuid": "b8014610-ce56-11e7-9739-27deec2c3249",
"disableTypeInference": false,
"createdAt": "2017-11-21T00:55:50.770Z",
"updatedAt": "2017-11-21T00:55:50.770Z",
"credentials": [
{
"user": "devDBuser"
}
],
"creator": {
"id": 2
},
"updater": {
"id": 2
},
"workspace": {
"id": 1
}
}
],
"count": {
"owned": 1,
"shared": 0,

"count": 1
}
}

3. You retain the above information for use in Production.


4. In the Prod environment, you create the new connection using the following call:

Endpoint http://www.wrangle-prod.example.com:3005/v4/connections
Authentication Required

NOTE: Username and password credentials must be submitted for the Admin2 account.

Method POST
Request Body
{
"connectParams": {
"vendor": "redshift",
"vendorName": "redshift",
"host": "prod-redshift.example.com",
"port": 5439
},
"host": "prod-redshift.example.com",
"port": 5439,
"vendor": "redshift",
"params": {
"connectStrOpts": "",
"defaultDatabase": "prodWrangleDB",
"extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
},
"vendorName": "redshift",
"name": "Redshift Conn Prod",
"description": "",
"isGlobal": true,
"type": "jdbc",
"ssl": false,
"credentialType": "custom",
"credentials": [
{
"username": "prodDBUser",
"password": "<password>"
}
]
}

5. The response should be status code 201 - Created with a response body like the following:

{
"connectParams": {
"vendor": "redshift",
"vendorName": "redshift",
"host": "prod-redshift.example.com",
"port": 5439
},
"id": 12,
"host": "prod-redshift.example.com",
"port": 5439,
"vendor": "redshift",
"params": {
"connectStrOpts": "",
"defaultDatabase": "prodWrangleDB",
"extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
},
"ssl": false,
"name": "Redshift Conn Prod",
"description": "",
"type": "jdbc",
"isGlobal": true,
"credentialType": "custom",
"credentialsShared": true,
"uuid": "fa7e06c0-0143-11e8-8faf-27c0392328c5",
"disableTypeInference": false,
"createdAt": "2018-01-24T20:20:11.181Z",
"updatedAt": "2018-01-24T20:20:11.181Z",
"credentials": [
{
"username": "prodDBUser"
}
],
"creator": {
"id": 2
},
"updater": {
"id": 2
}
}

6. When you hit the /v4/connections endpoint again, you can retrieve the connectionId for this
connection. In this case, let's assume that the connectionId value is 12.

See API Connections Create v4.
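
Because the Prod request body closely mirrors the Dev connection record, one way to avoid retyping it is to derive it from the Dev response. The sketch below assumes that the fields absent from the create request shown above should be dropped before submission; the helper name to_prod_connection and the OMIT_FROM_CREATE list are hypothetical.

```python
import copy

# Fields that appear in the Dev response but not in the create request above.
OMIT_FROM_CREATE = (
    "id", "uuid", "createdAt", "updatedAt", "creator", "updater",
    "workspace", "credentialsShared", "disableTypeInference",
)


def to_prod_connection(dev_conn, host, database, name, username, password):
    """Copy a Dev connection record and repoint it at the Prod database."""
    body = copy.deepcopy(dev_conn)  # leave the Dev record untouched
    for key in OMIT_FROM_CREATE:
        body.pop(key, None)
    body["host"] = host
    body["connectParams"]["host"] = host
    body["params"]["defaultDatabase"] = database
    body["name"] = name
    body["credentials"] = [{"username": username, "password": password}]
    return body


# Abbreviated Dev connection record from the GET response above.
dev_conn = {
    "connectParams": {"vendor": "redshift", "vendorName": "redshift",
                      "host": "dev-redshift.example.com", "port": "5439"},
    "id": 9,
    "host": "dev-redshift.example.com",
    "port": 5439,
    "vendor": "redshift",
    "params": {"connectStrOpts": "", "defaultDatabase": "devWrangleDB"},
    "name": "Dev Redshift Conn",
    "type": "jdbc",
    "uuid": "b8014610-ce56-11e7-9739-27deec2c3249",
    "credentials": [{"user": "devDBuser"}],
}

prod_body = to_prod_connection(
    dev_conn, "prod-redshift.example.com", "prodWrangleDB",
    "Redshift Conn Prod", "prodDBUser", "<password>",
)
print(sorted(prod_body))
```

Keep the Dev uuid value from the original record on hand; it is needed when you create the import rules.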

Step - Create Import Rules

Now that you have defined the connection to use to acquire the production data from within the production
environment, you must create an import rule to remap from the Dev connection to the Prod connection within the
flow definition. This rule is applied during the import process to ensure that the flow is working after it has been
imported.
In this case, you must remap the uuid value for the Dev connection, which is written into the flow definition, with
the connection Id value from the Prod instance.
For more information on import rules, see Define Import Mapping Rules.
Steps:
1. From the Dev environment, you collect the connection information for the flow:

Endpoint http://www.wrangle-dev.example.com:3005/v4/connections
Authentication Required

NOTE: Username and password credentials must be submitted for the User1 account.

Method GET
Request Body None.

2. The response should be status code 200 - Ok with a response body like the following:

{
"data": [
{
"connectParams": {
"vendor": "redshift",
"vendorName": "redshift",
"host": "dev-redshift.example.com",
"port": "5439"
},
"id": 9,
"host": "dev-redshift.example.com",
"port": 5439,
"vendor": "redshift",
"params": {
"connectStrOpts": "",
"defaultDatabase": "devWrangleDB",
"extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
},
"ssl": false,
"vendorName": "redshift",
"name": "Dev Redshift Conn",
"description": "",
"type": "jdbc",
"isGlobal": true,
"credentialType": "custom",

"credentialsShared": true,
"uuid": "b8014610-ce56-11e7-9739-27deec2c3249",
"disableTypeInference": false,
"createdAt": "2017-11-21T00:55:50.770Z",
"updatedAt": "2017-11-21T00:55:50.770Z",
"credentials": [
{
"user": "devDBuser"
}
],
"creator": {
"id": 2
},
"updater": {
"id": 2
},
"workspace": {
"id": 1
}
}
],
"count": {
"owned": 1,
"shared": 0,

"count": 1
}
}

3. From the above information, you retain the following, which uniquely identifies the connection object,
regardless of the instance to which it belongs:

"uuid": "b8014610-ce56-11e7-9739-27deec2c3249",

4. Against the Prod environment, you now create an import mapping rule:

Endpoint http://www.wrangle-prod.example.com:3005/v4/deployments/3/objectImportRules
Authentication Required

Method PATCH

Request Body:

[{"tableName":"connections","onCondition":{"uuid":
"b8014610-ce56-11e7-9739-27deec2c3249"},"withCondition":{"id":12}}]

5. The response should be status code 200 - Ok with a response body like the following:

{
"deleted": []
}

Since the method is a PATCH, you are updating the rules set that applies to all imports for this deployment.
In this case, there were no pre-existing rules, so the response indicates that nothing was deleted. If another
set of import rules is submitted, then the one you just created is deleted.
See API Deployments Object Import Rules Patch v4.
See API Deployments Value Import Rules Patch v4.
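
The rule body in this step can be generated from the two values you collected, the Dev connection uuid and the Prod connectionId. The following is a minimal sketch; the function name connection_import_rule is hypothetical.

```python
import json


def connection_import_rule(dev_uuid, prod_connection_id):
    """Build the object import rule list that remaps one connection."""
    return [{
        "tableName": "connections",
        "onCondition": {"uuid": dev_uuid},
        "withCondition": {"id": prod_connection_id},
    }]


rules = connection_import_rule("b8014610-ce56-11e7-9739-27deec2c3249", 12)
print(json.dumps(rules))
```

Since a PATCH replaces the existing rule set for the deployment, build the complete list of rules each time rather than submitting one rule at a time.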

Step - Import Release

You are now ready to import the package into the release.
Steps:
1. Against the Prod environment, you now import the package:

Endpoint http://www.wrangle-prod.example.com:3005/v4/deployments/3/releases
Authentication Required

Method POST

Request Body The request body must include the following key and value combination submitted as form data:

key value

data "@path-to-flow-WrangleOrders.zip"

2. The response should be status code 201 - Created with a response body like the following:

{ "importRuleChanges": {
"object": [{"tableName":"connections","onCondition":{"uuid":
"b8014610-ce56-11e7-9739-27deec2c3249"},"withCondition":{"id":12}}],
"value": []
},
"flowName": "Wrangle Orders"
}

See API Releases Create v4.
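
The "@path" value above reads like curl's form-file notation, which suggests that the package is uploaded as a file part in a multipart/form-data request. Under that assumption, the body could be encoded with the Python standard library as sketched below. The function name encode_package_upload is hypothetical, and the application/zip part type is an assumption.

```python
import uuid


def encode_package_upload(filename, package_bytes, boundary=None):
    """Return (content_type, body) for a multipart upload with one 'data' file part."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="data"; filename="{filename}"\r\n'
        "Content-Type: application/zip\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", head + package_bytes + tail


# Placeholder bytes stand in for the contents of flow-WrangleOrders.zip.
content_type, body = encode_package_upload(
    "flow-WrangleOrders.zip", b"PK\x03\x04", boundary="example-boundary"
)
print(content_type, len(body))
```

The returned content_type goes into the Content-Type header of the POST to the releases endpoint, and body is sent as the request data.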

Step - Activate Release

When a package is imported into a release, the release is automatically set as the active release for the
deployment. If at some point in the future, you need to change the active release, you can use the following
endpoint to do so.
Steps:
1. Against the Prod environment, use the following endpoint:

Endpoint http://www.wrangle-prod.example.com:3005/v4/releases/5
Authentication Required

Method PATCH
Request Body
{
"active": true
}

2. The response should be status code 200 - OK with a response body like the following:

{
"id": 3,
"updater": {
"id": 3
},
"updatedAt": "2017-11-28T00:06:12.147Z"
}

See API Releases Patch v4.

Step - Run Deployment

You can now execute a test run of the deployment to verify that the job executes properly.

NOTE: When you run a deployment, you run the primary flow in the active release for that deployment.
Running the flow generates the output objects for all recipes in the flow.

NOTE: For datasets with parameters, you can apply parameter overrides in the request body
of the following API call. For more information, see API Deployments Run v4.

Steps:
1. Against the Prod environment, use the following endpoint:

Endpoint http://www.wrangle-prod.example.com:3005/v4/deployments/3/run
Authentication Required

Method POST
Request Body None.

2. The response should be status code 201 - Created with a response body like the following:

{
"data": [
{
"reason": "JobStarted",
"sessionId": "dd6a90e0-c353-11e7-ad4e-7f2dd2ae4621",
"id": 33
}
]
}

See API Deployments Run v4.

Step - Iterate

If you need to make changes to fix issues related to running the job:
Recipe changes should be made in the Dev environment and then passed through export and import of the
flow into the Prod deployment.
Connection issues:
Check Flow View in the Prod instance to see if there are any red dots on the objects in the package.
If so, your import rules need to be fixed.
Verify that you can import data through the connection.
Output problems could be related to permissions on the target location.

Step - Set up Production Schedule

When you are satisfied with how the production version of your flow is working, you can set up periodic schedules
using a third-party tool to execute the job on a regular basis.

The tool must hit the Run Deployment endpoint and then verify that the output has been properly generated.
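
A scheduled script might look roughly like the following. It builds the Run Deployment request and reads the started job ids from a response shaped like the one in the Run Deployment step. The function names are hypothetical, authentication is omitted, and verifying the generated output is left to your own checks.

```python
import json
import urllib.request


def build_run_request(base_url, deployment_id):
    """Build the POST request that runs the active release of a deployment."""
    url = f"{base_url}/v4/deployments/{deployment_id}/run"
    # The call takes no request body; an empty payload keeps the method POST.
    return urllib.request.Request(url, data=b"", method="POST")


def started_job_ids(response_text):
    """Return the ids of the jobs that the response reports as started."""
    payload = json.loads(response_text)
    return [
        item["id"]
        for item in payload.get("data", [])
        if item.get("reason") == "JobStarted"
    ]


run_request = build_run_request("http://www.wrangle-prod.example.com:3005", 3)
sample_response = (
    '{"data": [{"reason": "JobStarted", '
    '"sessionId": "dd6a90e0-c353-11e7-ad4e-7f2dd2ae4621", "id": 33}]}'
)
print(run_request.full_url, started_job_ids(sample_response))
```

With cron, a schedule field of `0 6 * * 1` would invoke such a script once per week, on Mondays at 06:00.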
API Workflow - Run Job on Dataset with Parameters

Contents:
Overview
Basic Workflow
Example Datasets
Step - Create Containing Flow
Step - Create Datasets with Parameters
Example 1 - Dataset with Datetime parameter
Example 2 - Dataset with Variable
Example 3 - Dataset with pattern parameter
Step - Wrangle Data
Step - Run Job
Example 1 - Dataset with Datetime parameter
Example 2 - Dataset with Variable
Example 3 - Dataset with pattern parameter
Step - Monitoring Your Job
Step - Re-run Job

Overview

This example workflow describes how to run jobs on datasets with parameters through the Trifacta® platform. A
dataset with parameters is a dataset in which some part of the path to the data objects has been parameterized.
Since one or more of the parts of the path can vary, you can build a dataset with parameters to capture data that
spans multiple files. For example, datasets with parameters can be used to parameterize serialized data by region,
date, or another variable.

NOTE: This API workflow only works with version 4 (v4) or later of the APIs.

For more information on datasets with parameters, see Overview of Parameterization.

Basic Workflow

The basic method by which you build and run a job for a dataset with parameters is very similar to the
non-parameterized dataset method, with a few notable exceptions. The steps in this workflow follow the same
sequence as the standard workflow. Where the steps overlap, links have been provided to the non-parameterized
workflow. For more information, see API Workflow - Develop a Flow.

Example Datasets

This example covers three different datasets, each of which features a different type of dataset with parameters.

Example Number   Parameter Type       Description

1                Datetime parameter   In this example, a directory is used to store daily orders transactions. This dataset must be
                                      defined with a Datetime parameter to capture the preceding 7 days of data. Jobs can be
                                      configured to process all of this data as it appears in the directory.

2                Variable             This dataset segments data into four timezones across the US. These timezones are defined
                                      using the following text values in the path: pacific, mountain, central, and eastern. In this
                                      case, you can create a parameter called region, which can be overridden at runtime to be set
                                      to one of these four values during job execution.

3                Pattern parameter    This example is a directory containing point-of-sale transactions captured into individual
                                      files for each region. Since each region is defined by a numeric value (01, 02, 03), the
                                      dataset can be defined using a pattern parameter.

Step - Create Containing Flow

You must create the flow to host your dataset with parameters.
In the response, you must capture and retain the flow identifier.
For more information, see API Workflow - Develop a Flow.

Step - Create Datasets with Parameters

NOTE: When you import a dataset with parameters, only the first matching file is used for the initial
sample. If you want to see data from other matching files, you must collect a new sample within the
Transformer page.

Example 1 - Dataset with Datetime parameter

Suppose your files are stored in the following paths:

MyFiles/1/Datetime/2018-04-06-orders.csv
MyFiles/1/Datetime/2018-04-05-orders.csv
MyFiles/1/Datetime/2018-04-04-orders.csv
MyFiles/1/Datetime/2018-04-03-orders.csv
MyFiles/1/Datetime/2018-04-02-orders.csv
MyFiles/1/Datetime/2018-04-01-orders.csv
MyFiles/1/Datetime/2018-03-31-orders.csv

When you navigate to the directory through the application, you mouse over one of these files and select
Parameterize.
In the window, select the date value (e.g. YYYY-MM-DD) and then click the Datetime icon.

Datetime Parameter:
Format: YYYY-MM-DD
Date Range: Date is last 7 days.
Click Save.
The Datetime parameter should match with all files in the directory. Import this dataset and wrangle it.
After you wrangle the dataset, return to its flow view and select the recipe. You should be able to extract the flowId
and recipeId values from the URL.
For purposes of this example, here are some key values:
flowId: 35
recipeId: 127
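
To see which of the example files a "last 7 days" range would pick up on a given run date, the selection can be approximated as below. This sketch assumes that the range covers the seven calendar days before the run date and excludes the run date itself; the platform's exact boundary handling may differ. The function name files_in_last_days is hypothetical.

```python
from datetime import date, timedelta


def files_in_last_days(paths, today, days=7):
    """Return the paths whose leading YYYY-MM-DD date falls in the prior `days` days."""
    start = today - timedelta(days=days)
    matched = []
    for path in paths:
        name = path.rsplit("/", 1)[-1]
        try:
            file_date = date.fromisoformat(name[:10])
        except ValueError:
            continue  # filename does not start with a date
        if start <= file_date < today:
            matched.append(path)
    return matched


paths = [
    "MyFiles/1/Datetime/2018-04-06-orders.csv",
    "MyFiles/1/Datetime/2018-04-05-orders.csv",
    "MyFiles/1/Datetime/2018-04-04-orders.csv",
    "MyFiles/1/Datetime/2018-04-03-orders.csv",
    "MyFiles/1/Datetime/2018-04-02-orders.csv",
    "MyFiles/1/Datetime/2018-04-01-orders.csv",
    "MyFiles/1/Datetime/2018-03-31-orders.csv",
]

print(len(files_in_last_days(paths, date(2018, 4, 7))))
```

Under these assumptions, running on 2018-04-07 selects all seven files, and running one day later drops the 2018-03-31 file from the set.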

Example 2 - Dataset with Variable

Suppose your files are stored in the following paths:

MyFiles/1/variable/census-eastern.csv
MyFiles/1/variable/census-central.csv
MyFiles/1/variable/census-mountain.csv
MyFiles/1/variable/census-pacific.csv

When you navigate to the directory through the application, you mouse over one of these files and select
Parameterize.
In the window, select the region value, which could be one of the following depending on the file: eastern,
central, mountain, or pacific. Click the Variable icon.

Variable Parameter:
Name: region
Default Value: Set this default to pacific.
Click Save.
In this case, the variable only matches one value in the directory. However, when you apply runtime overrides to
the region variable, you can set it to any value.

Import this dataset and wrangle it.


After you wrangle the dataset, return to its flow view and select the recipe. You should be able to extract the flowId
and recipeId values from the URL.
For purposes of this example, here are some key values:
flowId: 33
recipeId: 123

Example 3 - Dataset with pattern parameter

Suppose your files are stored in the following paths:

MyFiles/1/pattern/POS-r01.csv
MyFiles/1/pattern/POS-r02.csv
MyFiles/1/pattern/POS-r03.csv

When you navigate to the directory through the application, you mouse over one of these files and select
Parameterize.
In the window, select the two numeric digits (e.g. 02). Click the Pattern icon.

Pattern Parameter:
Type: Trifacta pattern
Matching regular expression: {digit}{2}
Click Save.
In this case, the Trifacta pattern should match any sequence of two digits in a row. In the above example, this
expression matches: 01, 02, and 03, all of the files in the directory.
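
The matching behavior described above can be checked locally with an ordinary regular expression. The sketch below assumes that the Trifacta pattern {digit}{2} is equivalent to the regex \d{2} for this purpose; the function name files_by_region is hypothetical.

```python
import re

# Assumption: the Trifacta pattern {digit}{2} is treated here as the regex \d{2}.
REGION_FILE = re.compile(r"POS-r(\d{2})\.csv$")


def files_by_region(paths):
    """Map each two-digit region code to the file path that carries it."""
    matched = {}
    for path in paths:
        found = REGION_FILE.search(path)
        if found:
            matched[found.group(1)] = path
    return matched


paths = [
    "MyFiles/1/pattern/POS-r01.csv",
    "MyFiles/1/pattern/POS-r02.csv",
    "MyFiles/1/pattern/POS-r03.csv",
]

# A single-digit filename is added to show that it is not matched.
regions = files_by_region(paths + ["MyFiles/1/pattern/POS-r4.csv"])
print(sorted(regions))
```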

Import this dataset and wrangle it.


After you wrangle the dataset, return to its flow view and select the recipe. You should be able to extract the flowId
and recipeId values from the URL.
For purposes of this example, here are some key values:

flowId: 32
recipeId: 121

Checkpoint: You have created flows for each type of dataset with parameters.

Step - Wrangle Data

After you have created your dataset with parameter, you can wrangle it through the application. For more
information, see Transformer Page.

Step - Run Job

Below, you can review the API calls to run a job for each type of dataset with parameters, including relevant
information about overrides.

Example 1 - Dataset with Datetime parameter

In the following example, the job is run on the dataset with the Datetime parameter, and the output filename
includes the value 2018-04-03. The runParameters object is empty because no parameter override is submitted.

NOTE: You cannot apply overrides to these types of datasets with parameters.

1. Endpoint http://www.example.com:3005/v4/jobGroups
Authentication Required

Method POST
Request Body
{
"wrangledDataset": {
"id": 127
},
"overrides": {
"execution": "photon",
"profiler": true,
"writesettings": [
{
"path":
"MyFiles/queryResults/[email protected]/2018-04-03-orders.csv",
"action": "create",
"format": "csv",
"compression": "none",
"header": false,
"asSingleFile": false
}
]
},
"runParameters": {}
}

2. In the above example, the job has been launched for recipe 127 to execute on the Photon running
environment with profiling enabled.
a. Output format is CSV to the designated path. For more information on these properties, see
API JobGroups Create v4.
b. Output is written as a new file with no overwriting of previous files.
3. A response code of 201 - Created is returned. The response body should look like the following:

{
"reason": "JobStarted",
"sessionId": "5b883530-3920-11e8-a37a-db6dae3c6e43",
"id": 29
}

4. Retain the jobgroupId=29 value for monitoring.

Example 2 - Dataset with Variable

In the following example, the region variable has been overridden with the value central to execute the job
on census-central.csv:

1. Endpoint http://www.example.com:3005/v4/jobGroups
Authentication Required

Method POST

Request Body
{
"wrangledDataset": {
"id": 123
},
"overrides": {
"execution": "photon",
"profiler": true,
"writesettings": [
{
"path":
"MyFiles/queryResults/[email protected]/region-eastern.csv",
"action": "create",
"format": "csv",
"compression": "none",
"header": false,
"asSingleFile": false
}
]
},
"runParameters": {
"overrides": {
"data": [{
"key": "region",
"value": "central"
}
]}
}
}

2. In the above example, the job has been launched for recipe 123 to execute on the Photon running
environment with profiling enabled.
a. Output format is CSV to the designated path. For more information on these properties, see
API JobGroups Create v4.
b. Output is written as a new file with no overwriting of previous files.
3. A response code of 201 - Created is returned. The response body should look like the following:

{
"reason": "JobStarted",
"sessionId": "aa0f9f00-391f-11e8-a37a-db6dae3c6e43",
"id": 27
}

4. Retain the jobgroupId=27 value for monitoring.
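
The runParameters portion of the request lends itself to a small helper that turns a mapping of variable names to values into the structure shown above. This is a sketch only; the function name build_job_request is hypothetical, and the write settings overrides are left out to keep the example short.

```python
import json


def build_job_request(recipe_id, variable_overrides=None):
    """Build a jobGroups request body with optional variable overrides."""
    data = [
        {"key": key, "value": value}
        for key, value in (variable_overrides or {}).items()
    ]
    return {
        "wrangledDataset": {"id": recipe_id},
        # With no overrides, runParameters stays empty, as in Examples 1 and 3.
        "runParameters": {"overrides": {"data": data}} if data else {},
    }


job_body = build_job_request(123, {"region": "central"})
print(json.dumps(job_body, indent=2))
```

The same helper covers re-running a job with a different variable value, since only the value in the mapping changes.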

Example 3 - Dataset with pattern parameter

In the following example, the job is run on the dataset with the pattern parameter, and the output path references POS-r02. The runParameters object is empty because no parameter override is submitted.

NOTE: You cannot apply overrides to these types of datasets with parameters.

1. Endpoint http://www.example.com:3005/v4/jobGroups
Authentication Required

Method POST
Request Body
{
"wrangledDataset": {
"id": 121
},
"overrides": {
"execution": "photon",
"profiler": false,
"writesettings": [
{
"path":
"hdfs://hadoop:50070/trifacta/queryResults/[email protected]/POS-r02.
"action": "create",
"format": "csv",
"compression": "none",
"header": false,
"asSingleFile": false
}
]
},
"runParameters": {}
}

2. In the above example, the job has been launched for recipe 121 to execute on the Photon running
environment with profiling disabled.
a. Output format is CSV to the designated path. For more information on these properties, see
API JobGroups Create v4.
b. Output is written as a new file with no overwriting of previous files.
3. A response code of 201 - Created is returned. The response body should look like the following:

{
"reason": "JobStarted",
"sessionId": "16424a60-3920-11e8-a37a-db6dae3c6e43",
"id": 28
}

4. Retain the jobgroupId=28 value for monitoring.

Step - Monitoring Your Job

After the job has been created and you have captured the jobGroup Id, you can use it to monitor the status of your
job. For more information, see API JobGroups Get Status v4.

Step - Re-run Job

If you need to re-run the job as specified, you can use the wrangledDataset identifier to re-run the most recent job.

Tip: When you re-run a job, you can change any variable values as part of the request.

Example request:

Endpoint http://www.example.com:3005/v4/jobGroups
Authentication Required

Method POST
Request Body
{
"wrangledDataset": {
"id": 123
},
"runParameters": {
"overrides": {
"data": [{
"key": "region",
"value": "central"
}
]}
}
}

For more information, see API Workflow - Develop a Flow.


API Workflow - Publish Results

Contents:
Overview
Basic Workflow
Step - Create Connections
Step - Run Job
Step - Publish Results to Hive
Step - Publish Results to Redshift
Step - Publish Results to Tableau Server
Step - Publish Results to SQL DW

Overview

After you have run a job to generate results, you can publish those results to different targets as needed. This
section describes how to automate those publishing steps through the APIs.

NOTE: This workflow applies to re-publishing job results after you have already generated them.

NOTE: After you have generated results and written them to one target, you cannot publish to the same
target. You must configure the outputs to specify a different format and location and then run a new job.

In the application, you can publish after generating results. See Publishing Dialog.

Basic Workflow

1. Create connections to each target to which you wish to publish. Connections must support write
operations.
2. Specify a job whose output meets the requirements for the target.
3. Run the job.
4. When the job completes, publish the results to the target(s).

Step - Create Connections

For each target, you must have access to create a connection to it. After a connection is created, it can be reused,
so you may find it easier to create them through the application.
Some connection types cannot be created through the APIs and must be created through the application. Links to
instructions are provided below.

NOTE: Connections created through the application must be created through the Connections page,
which is used for creating read/write connections. Do not create these connections through the Import
Data page. See Connections Page.

Connection       Required Output Format   Example Id   Create via API   Doc Link                            Other Requirements

Hive             Avro                     1            Y                Create Hive Connections             Requires integration with a Hadoop cluster.

Redshift         Avro                     2            N                Create Redshift Connections         Requires S3 set as the base storage layer. See Set Base Storage Layer.

Tableau Server   TDE                      3            Y                Create Tableau Server Connections

SQL DW           Parquet                  4            N                Create SQL DW Connections           Available only on Azure deployments. See Configure for Azure.

Step - Run Job

Before you publish results to a different datastore, you must generate results and store them in HDFS.

NOTE: To produce some output formats, you must run the job on your Hadoop cluster.

In the examples below, the following example data is assumed:

Identifier Value

jobId 2

flowId 3

wrangledDatasetId (also flowNodeId) 10

For more information on running a job, see API JobGroups Create v4.
For more information on the publishing endpoint, see API JobGroups Put Publish v4.

Step - Publish Results to Hive

The following uses the Avro results from the specified job (jobId = 2) to publish the results to the test_table
table in the default Hive schema through connectionId=1.

NOTE: To publish to Hive, the targeted database is predefined in the connection object. For the path
value in the request body, you must specify the schema in this database to use. Schema information is not
available through API. To explore the available schemas, click the Hive icon in the Import Data page. The
schemas are the first level of listed objects. For more information, see Import Data Page.

Request:

Endpoint http://www.wrangle-dev.example.com:3005/v4/jobGroups/2/publish
Authentication Required

Method PUT
Request Body
{
"connection": {
"id": 1
},
"path": ["default"],
"table": "test_table",
"action": "create",
"inputFormat": "avro",
"flowNodeId": 10
}

Response:

Status Code 200 - OK


Response Body
{
"jobgroupId":2,
"reason":"JobStarted",
"sessionId":"24862060-4fcd-11e8-8622-fda0fbf6f550"
}

Step - Publish Results to Redshift

The following uses the Avro results from the specified job (jobId = 2) to publish the results to the test_table2
table in the public Redshift schema through connectionId=2.

NOTE: To publish to Redshift, the targeted database is predefined in the connection object. For the path
value in the request body, you must specify the schema in this database to use. Schema information is
not available through API. To explore the available schemas, click the Redshift icon in the Import Data
page. The schemas are the first level of listed objects. For more information, see Import Data Page.

Request:

Endpoint http://www.wrangle-dev.example.com:3005/v4/jobGroups/2/publish
Authentication Required

Method PUT
Request Body
{
"connection": {
"id": 2
},
"path": ["public"],
"table": "test_table2",
"action": "create",
"inputFormat": "avro",
"flowNodeId": 10
}

Response:

Status Code 200 - OK


Response Body
{
"jobgroupId":2,
"reason":"JobStarted",
"sessionId":"fae64760-4fc4-11e8-8cba-0987061e4e16"
}

Step - Publish Results to Tableau Server

The following uses the TDE results from the specified job (jobId = 2) to publish the results to the test_table3
table in the default Tableau Server database through connectionId=3.

Request:

Endpoint http://www.wrangle-dev.example.com:3005/v4/jobGroups/2/publish
Authentication Required

Method PUT

Request Body
{
"connection": {
"id": 3
},
"path": ["default"],
"table": "test_table3",
"action": "createAndLoad",
"inputFormat": "tde",
"flowNodeId": 10
}

Response:

Status Code 200 - OK


Response Body
{
"jobgroupId":2,
"reason":"JobStarted",
"sessionId":"24862060-4fcd-11e8-8622-fda0fbf6f552"
}

Step - Publish Results to SQL DW

The following uses the Parquet results from the specified job (jobId = 2) to publish the results to the
test_table4 table in the dbo SQL DW database through connectionId=4.

Request:

Endpoint http://www.wrangle-dev.example.com:3005/v4/jobGroups/2/publish
Authentication Required

Method PUT
Request Body
{
"connection": {
"id": 4
},
"path": ["dbo"],
"table": "test_table4",
"action": "createAndLoad",
"inputFormat": "pqt",
"flowNodeId": 10
}

Response:

Status Code 200 - OK
Response Body
{
"jobgroupId": 2,
"jobIds": 22,
"reason": "JobStarted",
"sessionId": "855f83a0-dc94-11e8-bd1a-f998d808020d"
}
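
The publish requests above differ only in the connection ID, path, table name, action, and input format. The following is a minimal Python sketch of that pattern, not a definitive client. The helper name build_publish_request and the BASE_URL constant are illustrative assumptions; the request is only constructed, not sent.

```python
import json
import urllib.request

# Assumption: base URL copied from the examples above; replace with your own host.
BASE_URL = "http://www.wrangle-dev.example.com:3005"


def build_publish_request(job_group_id, connection_id, path, table, action,
                          input_format, flow_node_id, auth_token="<auth_token>"):
    """Build (but do not send) a PUT request to /v4/jobGroups/<id>/publish."""
    body = {
        "connection": {"id": connection_id},
        "path": path,
        "table": table,
        "action": action,
        "inputFormat": input_format,
        "flowNodeId": flow_node_id,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v4/jobGroups/{job_group_id}/publish",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Basic {auth_token}",
            "Content-Type": "application/json",
        },
    )


# Mirrors the SQL DW example: Parquet results of jobGroup 2 into dbo.test_table4.
request = build_publish_request(2, 4, ["dbo"], "test_table4",
                                "createAndLoad", "pqt", 10)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it, pass the object to urllib.request.urlopen(request).
```

Keeping the body construction in one function makes it harder for the Redshift, Tableau Server, and SQL DW variants to drift apart.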

API Workflow - Manage Outputs

Contents:
Overview
Basic Workflow
Step - Get Recipe ID
Step - Create OutputObject
Step - Run a Test Job
Step - Create WriteSettings Object
Step - Get Connection ID for Publication
Step - Create a Publication

Overview
Through the APIs, you can separately manage the outputs associated with an individual recipe. This workflow
describes how to create output objects, which are associated with your recipe, and how to publish those outputs
to different datastores in varying formats. You can continue to modify the output objects and their related write
settings and publications independently of managing the wrangling process. Whenever you need new results, you
can reference the wrangled dataset with which your outputs have been associated, and the job is executed and
published in the appropriate manner to your targets.
Relevant terms:

Term Description

outputobjects An outputobject is a definition of one or more types of outputs and how they are generated. It must be
associated with a recipe.

NOTE: An outputobject must be created for a recipe before you can run a job on it. One and only one
outputobject can be associated with a recipe.

writesettings A writesettings object defines file-based outputs within an outputobject. Settings include path, format,
compression, and delimiters.

publications A publications object is used to specify a table-based output and is associated with an outputobject. Settings
include the connection to use, path, table type, and write action to apply.

NOTE: If you need to make changes for purposes of a specific job run, you can add overrides to the
request for the job. These overrides apply only for the current job. For more information, see
API JobGroups Create v4.

Basic Workflow

1. Get the internal identifier for the recipe for which you are building outputs.
2. Create the outputobject for the recipe.
3. Create a writesettings object and associate it with the outputobject.
4. Run a test job, if desired.
5. For any publication, get the internal identifier for the connection to use.
6. Create a publication object and associate it with the outputobject.
7. Run your job.

Step - Get Recipe ID

To begin, you need the internal identifier for the recipe.

NOTE: In the APIs, a recipe is identified by its internal name, a wrangled dataset.

Request:

Endpoint http://www.wrangle-dev.example.com:3005/v4/wrangleddatasets
Authentication Required

Method GET
Request Body None.

Response:

Status Code 200 - OK


Response Body

{
"data": [
{
"id": 11,
"wrangled": true,
"createdAt": "2018-11-12T23:06:36.473Z",
"updatedAt": "2018-11-12T23:06:36.539Z",
"recipe": {
"id": 10
},
"name": "POS-r01",
"description": null,
"referenceInfo": null,
"activeSample": {
"id": 11
},
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"flow": {
"id": 4
}
},
{
"id": 1,
"wrangled": true,
"createdAt": "2018-11-12T23:19:57.650Z",
"updatedAt": "2018-11-12T23:20:47.297Z",
"recipe": {
"id": 19
},
"name": "member_info",
"description": null,
"referenceInfo": null,
"activeSample": {
"id": 20
},
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"flow": {
"id": 6
}
}
]
}

cURL example:

curl -X GET \
http://www.wrangle-dev.example.com:3005/v4/wrangleddatasets \
-H 'authorization: Basic <auth_token>' \
-H 'cache-control: no-cache'

Relevant terms:

Term Description

URL URL and method to execute.

authorization Authorization token to pass to the platform. Basic authorization works.

NOTE: This token must be passed with each request to the platform.

cache-control Cache control setting.

content-type HTTP content type to send. These applications use application/json.

Checkpoint: In the above, let's assume that the recipe identifier of interest is wrangledDataset=11.
This means that the flow where it is hosted is flow.id=4. Retain this information for later.
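
Picking the identifiers out of the listing can be scripted. The sketch below assumes you look up the recipe by its name; the function name find_wrangled_dataset is illustrative, and the sample data is an abbreviated copy of the response body above.

```python
import json

# Abbreviated copy of the response body shown above.
response_body = json.loads("""
{
  "data": [
    {"id": 11, "name": "POS-r01", "recipe": {"id": 10}, "flow": {"id": 4}},
    {"id": 1, "name": "member_info", "recipe": {"id": 19}, "flow": {"id": 6}}
  ]
}
""")


def find_wrangled_dataset(listing, name):
    """Return (wrangledDataset id, flow id) for the first entry with this name."""
    for entry in listing["data"]:
        if entry["name"] == name:
            return entry["id"], entry["flow"]["id"]
    raise LookupError(f"no wrangled dataset named {name!r}")


dataset_id, flow_id = find_wrangled_dataset(response_body, "POS-r01")
print(dataset_id, flow_id)  # 11 4
```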

For more information, see API Connections Get v4.

Step - Create OutputObject


Create the outputobject and associate it with the recipe identifier. In the following request, the wrangledDataset
identifier that you retrieved in the previous call is applied as the flowNodeId value.

The following example includes an embedded writesettings object, which generates an Avro file output. You
can remove this embedded object if desired, but you must create a writesettings object before you can
generate an output.
Request:

Endpoint http://www.wrangle-dev.example.com:3005/v4/outputobjects
Authentication Required

Method POST

Request Body
{
"execution": "photon",
"profiler": true,
"isAdhoc": true,
"writeSettings": {
"data": [
{
"delim": ",",
"path":
"hdfs://hadoop:50070/trifacta/queryResults/[email protected]/POS_01.avro",
"action": "create",
"format": "avro",
"compression": "none",
"header": false,
"asSingleFile": false,
"prefix": null,
"suffix": "_increment",
"hasQuotes": false
}
]
},
"flowNode": {
"id": 11
}
}

Response:

Status Code 201 - Created


Response Body
{
"id": 4,
"execution": "photon",
"profiler": true,
"isAdhoc": true,
"updatedAt": "2018-11-13T00:20:49.258Z",
"createdAt": "2018-11-13T00:20:49.258Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"flowNode": {
"id": 11
}
}

cURL example:

curl -X POST \
http://www.wrangle-dev.example.com:3005/v4/outputobjects \
-H 'authorization: Basic <auth_token>' \
-H 'cache-control: no-cache' \
-H 'content-type: application/json' \
-d '{
"execution": "photon",
"profiler": true,
"isAdhoc": true,
"writeSettings": {
"data": [
{
"delim": ",",
"path":
"hdfs://hadoop:50070/trifacta/queryResults/[email protected]/POS_01.avro",
"action": "create",
"format": "avro",
"compression": "none",
"header": false,
"asSingleFile": false,
"prefix": null,
"suffix": "_increment",
"hasQuotes": false
}
]
},
"flowNode": {
"id": 11
}
}'

Relevant terms:

Term Description

URL URL and method to execute.

authorization Authorization token to pass to the platform. Basic authorization works.

NOTE: This token must be passed with each request to the platform.

cache-control Cache control setting.

content-type HTTP content type to send. These applications use application/json.

Checkpoint: You've created an outputobject (id=4) and an embedded writesettings object and have
associated them with the appropriate recipe flowNodeId=11. You can now run a job for this recipe
generating the specified output.
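
The request body for the outputobject can also be assembled programmatically. This is a minimal sketch: build_output_object is a hypothetical helper, and the path value is a placeholder rather than a real location.

```python
import json


def build_output_object(flow_node_id, write_settings=(), execution="photon",
                        profiler=True, is_adhoc=True):
    """Assemble the JSON body for POST /v4/outputobjects."""
    body = {
        "execution": execution,
        "profiler": profiler,
        "isAdhoc": is_adhoc,
        "flowNode": {"id": flow_node_id},
    }
    if write_settings:
        # Embedded writesettings are optional when the outputobject is created.
        body["writeSettings"] = {"data": list(write_settings)}
    return body


avro_settings = {
    # Placeholder path: substitute your own output location.
    "path": "<output_directory>/POS_01.avro",
    "action": "create",
    "format": "avro",
    "compression": "none",
    "header": False,
    "asSingleFile": False,
}

payload = build_output_object(11, [avro_settings])
print(json.dumps(payload, indent=2))
```

Calling build_output_object(11) with no settings yields a body without the writeSettings key, which matches the case where you add writesettings in a later request.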

Step - Run a Test Job

Now that outputs have been defined for the recipe, you can execute a job on the specified recipe
(flowNodeId=11):

Request:

Endpoint http://www.wrangle-dev.example.com:3005/v4/jobGroups
Authentication Required

Method POST
Request Body
{
"wrangledDataset": {
"id": 11
}
}

Response:

Status Code 201 - Created


Response Body
{
"reason": "JobStarted",
"sessionId": "4de74ab0-e6db-11e8-89d6-a98f99482612",
"id": 2
}

NOTE: To re-run the job against its currently specified outputs, writesettings, and publications, you only
need the recipe ID. If you need to make changes for purposes of a specific job run, you can add overrides
to the request for the job. These overrides apply only for the current job. For more information, see
API JobGroups Create v4.

To track the status of the job:


You can monitor the progress through the application.
You can use the appropriate endpoint to check job status. See API JobGroups Get Status v4.

Checkpoint: You've run a job, generating one output in Avro format.
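
As a rough Python equivalent of the request above, the sketch below builds the POST request and reads the jobgroup identifier from a response. The names build_run_job_request and job_group_id_from are illustrative assumptions, and nothing is sent over the network.

```python
import json
import urllib.request

# Assumption: base URL copied from the examples above.
BASE_URL = "http://www.wrangle-dev.example.com:3005"


def build_run_job_request(wrangled_dataset_id, auth_token="<auth_token>"):
    """Build (but do not send) a POST request to /v4/jobGroups."""
    body = {"wrangledDataset": {"id": wrangled_dataset_id}}
    return urllib.request.Request(
        url=f"{BASE_URL}/v4/jobGroups",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Basic {auth_token}",
            "Content-Type": "application/json",
        },
    )


def job_group_id_from(response_text):
    """Extract the jobgroup id from the response body of a started job."""
    return json.loads(response_text)["id"]


run_request = build_run_job_request(11)
# Sample response copied from the example above.
sample_response = ('{"reason": "JobStarted", '
                   '"sessionId": "4de74ab0-e6db-11e8-89d6-a98f99482612", "id": 2}')
print(run_request.get_method(), run_request.full_url)
print(job_group_id_from(sample_response))  # 2
```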

Step - Create WriteSettings Object


Suppose you want to create another file-based output for this outputobject. You can create a second writesettings
object, which publishes the results of the job run on the recipe to the specified location.
The following example creates settings for generating a Parquet-based output.

Request:

Endpoint http://www.wrangle-dev.example.com:3005/v4/writesettings/
Authentication Required

Method POST
Request Body
{
"delim": ",",
"path":
"hdfs://hadoop:50070/trifacta/queryResults/[email protected]/POS_r03.pqt",
"action": "create",
"format": "pqt",
"compression": "none",
"header": false,
"asSingleFile": false,
"prefix": null,
"suffix": "_increment",
"hasQuotes": false,
"outputObjectId": 4
}

Response:

Status Code 201 - Created


Response Body
{
"delim": ",",
"id": 2,
"path":
"hdfs://hadoop:50070/trifacta/queryResults/[email protected]/POS_r03.pqt",
"action": "create",
"format": "pqt",
"compression": "none",
"header": false,
"asSingleFile": false,
"prefix": null,
"suffix": "_increment",
"hasQuotes": false,
"updatedAt": "2018-11-13T01:07:52.386Z",
"createdAt": "2018-11-13T01:07:52.386Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"outputObject": {
"id": 4
}
}

cURL example:

curl -X POST \
http://www.wrangle-dev.example.com:3005/v4/writesettings \
-H 'authorization: Basic <auth_token>' \
-H 'cache-control: no-cache' \
-H 'content-type: application/json' \
-d '{ "delim": ",",
"path":
"hdfs://hadoop:50070/trifacta/queryResults/[email protected]/POS_r03.pqt",
"action": "create",
"format": "pqt",
"compression": "none",
"header": false,
"asSingleFile": false,
"prefix": null,
"suffix": "_increment",
"hasQuotes": false,
"outputObject": {
"id": 4
}
}'

Relevant terms:

Term Description

URL URL and method to execute.

authorization Authorization token to pass to the platform. Basic authorization works.

NOTE: This token must be passed with each request to the platform.

cache-control Cache control setting.

content-type HTTP content type to send. These applications use application/json.

Checkpoint: You've added a new writesettings object and associated it with your outputobject (id=4).
When you run the job again, the Parquet output is also generated.
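
Because the second writesettings object differs from the first only in format and path, one way to avoid retyping the shared fields is to derive it from the first. The sketch below assumes the outputObjectId form shown in the Request Body above; derive_write_settings is a hypothetical helper and the paths are placeholders.

```python
import json

first_settings = {
    "delim": ",",
    "path": "<output_directory>/POS_01.avro",  # placeholder path
    "action": "create",
    "format": "avro",
    "compression": "none",
    "header": False,
    "asSingleFile": False,
    "prefix": None,
    "suffix": "_increment",
    "hasQuotes": False,
}


def derive_write_settings(base, new_format, new_path, output_object_id):
    """Copy an existing settings dict, changing only format, path, and owner."""
    settings = dict(base)  # shallow copy so the original is left untouched
    settings["format"] = new_format
    settings["path"] = new_path
    settings["outputObjectId"] = output_object_id
    return settings


parquet_settings = derive_write_settings(
    first_settings, "pqt", "<output_directory>/POS_r03.pqt", 4)
print(json.dumps(parquet_settings, indent=2))
```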

Step - Get Connection ID for Publication

To generate a publication, you must identify the connection through which you are publishing the results.
Below, the request returns a single connection to Hive (id=1).

Request:

Endpoint http://www.wrangle-dev.example.com:3005/v4/connections
Authentication Required

Method GET
Request Body None.

Response:

Status Code 200 - OK

Response Body
{
"data": [
{
"connectParams": {
"vendor": "hive",
"vendorName": "hive",
"host": "hadoop",
"port": "10000",
"jdbc": "hive2",
"defaultDatabase": "default"
},
"id": 1,
"host": "hadoop",
"port": 10000,
"vendor": "hive",
"params": {
"jdbc": "hive2",
"connectStringOptions": "",
"defaultDatabase": "default"
},
"ssl": false,
"vendorName": "hive",
"name": "hive",
"description": null,
"type": "jdbc",
"isGlobal": true,
"credentialType": "conf",
"credentialsShared": true,
"uuid": "28415970-e6c4-11e8-82be-9947a31ecdd5",
"disableTypeInference": false,
"createdAt": "2018-11-12T21:44:39.816Z",
"updatedAt": "2018-11-12T21:44:39.842Z",
"credentials": [],
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"workspace": {
"id": 1
}
}
],
"count": 1
}

cURL example:

curl -X GET \
http://www.wrangle-dev.example.com:3005/v4/connections \
-H 'authorization: Basic <auth_token>' \
-H 'cache-control: no-cache' \
-H 'content-type: application/json'

Relevant terms:

Term Description

URL URL and method to execute.

authorization Authorization token to pass to the platform. Basic authorization works.

NOTE: This token must be passed with each request to the platform.

cache-control Cache control setting.

content-type HTTP content type to send. These applications use application/json.

For more information, see API Connections Get List v4.
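
When the listing contains more than one connection, you may want to filter it by vendor. The sketch below works on an abbreviated copy of the response above; connection_ids_for_vendor is an illustrative name, not part of the platform.

```python
import json

# Abbreviated copy of the connections response shown above.
connections = json.loads("""
{
  "data": [
    {"id": 1, "vendor": "hive", "name": "hive", "type": "jdbc",
     "params": {"jdbc": "hive2", "defaultDatabase": "default"}}
  ],
  "count": 1
}
""")


def connection_ids_for_vendor(listing, vendor):
    """Return the ids of all connections whose vendor matches."""
    return [conn["id"] for conn in listing["data"] if conn["vendor"] == vendor]


hive_ids = connection_ids_for_vendor(connections, "hive")
print(hive_ids)  # [1]
```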

Step - Create a Publication

You can create publications that publish table-based outputs through specified connections. In the following, a
Hive table is written out to the default database through connectionId = 1. This publication is associated with
the outputObject id=4.
Request:

Endpoint http://www.wrangle-dev.example.com:3005/v4/publications
Authentication Required

Method POST
Request Body
{
"path": [
"default"
],
"tableName": "myPublishedHiveTable",
"targetType": "hive",
"action": "create",
"outputObject": {
"id": 4
},
"connection": {
"id": 1
}
}

Response:

Status Code 201 - Created


Response Body
{
"path": [
"default"
],
"id": 3,
"tableName": "myPublishedHiveTable",
"targetType": "hive",
"action": "create",
"updatedAt": "2018-11-13T01:25:39.698Z",
"createdAt": "2018-11-13T01:25:39.698Z",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"outputObject": {
"id": 4
},
"connection": {
"id": 1
}
}

cURL example:

curl -X POST \
http://www.wrangle-dev.example.com:3005/v4/publications \
-H 'authorization: Basic <auth_token>' \
-H 'cache-control: no-cache' \
-H 'content-type: application/json' \
-d '{
"path": [
"default"
],
"tableName": "myPublishedHiveTable",
"targetType": "hive",
"action": "create",
"outputObject": {
"id": 4
},
"connection": {
"id": 1
}
}'

Relevant terms:

Term Description

URL URL and method to execute.

authorization Authorization token to pass to the platform. Basic authorization works.

NOTE: This token must be passed with each request to the platform.

cache-control Cache control setting.

content-type HTTP content type to send. These applications use application/json.
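
The publication request can be sketched in Python as follows. This assumes the same base URL and Basic authorization header as the cURL example; build_publication_request is a hypothetical helper, and the request is constructed without being sent.

```python
import json
import urllib.request

# Assumption: base URL copied from the Endpoint row above.
BASE_URL = "http://www.wrangle-dev.example.com:3005"


def build_publication_request(output_object_id, connection_id, table_name,
                              database="default", target_type="hive",
                              action="create", auth_token="<auth_token>"):
    """Build (but do not send) a POST request to /v4/publications."""
    body = {
        "path": [database],
        "tableName": table_name,
        "targetType": target_type,
        "action": action,
        "outputObject": {"id": output_object_id},
        "connection": {"id": connection_id},
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v4/publications",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Basic {auth_token}",
            "Content-Type": "application/json",
        },
    )


publication_request = build_publication_request(4, 1, "myPublishedHiveTable")
print(publication_request.get_method(), publication_request.full_url)
print(publication_request.data.decode("utf-8"))
```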

Checkpoint: You're done.

You have done the following:


1. Created an output object:
a. Embedded a writesettings object to define an Avro output.
b. Associated the outputobject with a recipe.
2. Added another writesettings object to the outputobject.
3. Added a table-based publication object to the outputobject.

You can now generate results for these three different outputs whenever you run a job (create a jobgroup) for the
associated recipe.

API Workflow - Swap Datasets

Contents:
Overview
Example Datasets
Assumptions
Step - Import Dataset
Step - Swap Dataset from Recipe
Step - Rerun Job
Step - Monitor Your Job
Step - Schedule Your Job

Overview

After you have created a flow, imported a dataset, and created a recipe for that dataset, you may need to swap in
a different dataset and run the recipe against that one. This workflow steps through that process via the APIs.

NOTE: If you are processing multiple parallel datasources in a single job, you should create a dataset
with parameters and then run the job. For more information, see
API Workflow - Run Job on Dataset with Parameters .

This workflow utilizes the following methods:


1. Create an imported dataset. After the new file has been added to the backend datastore, you can
import it into the Trifacta platform as an imported dataset.
2. Swap dataset. Using the ID of the imported dataset you created, you can now assign the dataset to the
recipe in your flow.
3. Run a job. Run the job against the dataset.
4. Monitor progress. Monitor the progress of the job until it is complete.

Example Datasets

In this example, you are wrangling data from orders placed in different regions on a quarterly basis. When a new
file drops, you want to be able to swap out the current dataset that is assigned to the recipe and swap in the new
one. Then, run the job.

Example Files:
The following files are stored on your HDFS deployment:

Path and Filename Description

hdfs:///user/orders/MyCo-orders-west-Q1.txt Orders from West region for Q1

hdfs:///user/orders/MyCo-orders-west-Q2.txt Orders from West region for Q2

hdfs:///user/orders/MyCo-orders-north-Q1.txt Orders from North region for Q1

hdfs:///user/orders/MyCo-orders-north-Q2.txt Orders from North region for Q2

hdfs:///user/orders/MyCo-orders-east-Q1.txt Orders from East region for Q1

hdfs:///user/orders/MyCo-orders-east-Q2.txt Orders from East region for Q2

Assumptions

You have already created a flow, which contains the following imported dataset and recipe:

NOTE: When an imported dataset is created via API, it is always imported as an unstructured dataset.
Any recipe that references this dataset should contain initial parsing steps required to structure the data.

Tip: Through the UI, you can import one of your datasets as unstructured. Create a recipe for this dataset
and then edit it. In the Recipe panel, you should be able to see the structuring steps. Back in Flow View,
you can chain your structural recipe off of this one. Dataset swapping should happen on the first recipe.

Object Type Name Id

flow MyCo-Orders-Quarter 2

Imported Dataset MyCo-orders-west-Q1.txt 8

Recipe (wrangledDataset) n/a 9

Job n/a 3

Base URL:
For purposes of this example, the base URL for the Trifacta platform is the following:

http://www.example.com:3005

Step - Import Dataset

NOTE: You cannot add datasets to the flow through the flows endpoint. Moving pre-existing datasets
into a flow is not supported in this release. Create or locate the flow first and then when you create the
datasets, associate them with the flow at the time of creation.
See API ImportedDatasets Create v4.
See API WrangledDatasets Create v4.

NOTE: When an imported dataset is created via API, it is always imported as an unstructured dataset.
Any recipe that references this dataset should contain initial parsing steps required to structure the data.

The following steps describe how to create an imported dataset and assign it to the flow that has already been
created (flowId=2).
Steps:
1. To create an imported dataset, you must acquire the following information about the source.
a. path
b. type
c. name
d. description
e. bucket (if a file stored on S3)
2. In this example, the file you are importing is MyCo-orders-west-Q2.txt. Since the files are similar in
nature and are stored in the same directory, you can acquire this information by gathering the information
from the imported dataset that is already part of the flow. Execute the following:

Endpoint http://www.example.com:3005/v4/importedDatasets
Authentication Required

Method POST
Request Body
{
"path": "/user/orders/MyCo-orders-west-Q2.txt",
"type": "hdfs",
"bucket": null,
"name": "MyCo-orders-west-Q2.txt",
"description": "MyCo-orders-west-Q2"
}

3. The response should be a 201 - Created status code with something like the following:

{
"id": 12,
"size": "281032",
"path": "/user/orders/MyCo-orders-west-Q2.txt",
"dynamicPath": null,
"type": "hdfs",
"workspaceId": 1,
"bucket": null,
"isSchematized": false,
"isDynamic": false,
"disableTypeInference": false,
"createdAt": "2018-10-29T23:15:01.831Z",
"updatedAt": "2018-10-29T23:15:01.889Z",
"parsingRecipe": {
"id": 11
},
"runParameters": [],
"name": "MyCo-orders-west-Q2.txt.txt",
"description": "MyCo-orders-west-Q2.txt",
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"connection": null
}

4. You must retain the id value so you can reference it when you swap the dataset into the recipe.
5. See API ImportedDatasets Create v4.

Checkpoint: You have imported a dataset that is unstructured and is not associated with any flow.
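
Since the quarterly files share a directory and naming scheme, the request body can be derived from the file path alone. The sketch below assumes the name is the file name and the description is the file name without its extension, as in the request above; build_imported_dataset is a hypothetical helper.

```python
import json
import posixpath


def build_imported_dataset(path, source_type="hdfs", bucket=None):
    """Assemble the JSON body for POST /v4/importedDatasets from a file path."""
    filename = posixpath.basename(path)
    description, _extension = posixpath.splitext(filename)
    return {
        "path": path,
        "type": source_type,
        "bucket": bucket,  # only needed for files stored on S3
        "name": filename,
        "description": description,
    }


import_payload = build_imported_dataset("/user/orders/MyCo-orders-west-Q2.txt")
print(json.dumps(import_payload, indent=2))
```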

Step - Swap Dataset from Recipe

The next step is to swap the primary input dataset for the recipe to point at the newly imported dataset. This step
automatically adds the imported dataset to the flow and drops the previous imported dataset from the flow.
1. Use the following to swap the primary input dataset for the recipe:

Endpoint http://www.example.com:3005/v4/wrangledDatasets/9/primaryInputDataset
Authentication Required

Method PUT

Request Body
{
"importedDataset": {
"id": 12
}
}

2. The response should be a 200 - OK status code with something like the following:

{
"id": 9,
"wrangled": true,
"createdAt": "2019-03-03T17:58:53.979Z",
"updatedAt": "2019-03-03T18:01:11.310Z",
"recipe": {
"id": 9,
"name": "POS-r01",
"description": null,
"active": true,
"nextPortId": 1,
"createdAt": "2019-03-03T17:58:53.965Z",
"updatedAt": "2019-03-03T18:01:11.308Z",
"currentEdit": {
"id": 8
},
"redoLeafEdit": {
"id": 7
},
"creator": {
"id": 1
},
"updater": {
"id": 1
}
},
"referenceInfo": null,
"activeSample": {
"id": 7
},
"creator": {
"id": 1
},
"updater": {
"id": 1
},
"referencedFlowNode": null,
"flow": {
"id": 2
}
}

3. The new imported dataset is now the primary input for the recipe, and the old imported dataset has been
removed from the flow.
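
The swap request can be expressed in Python along these lines. The helper name build_swap_request is an assumption made for illustration; the request is constructed but not sent, and the Basic authorization header follows the cURL examples elsewhere in this guide.

```python
import json
import urllib.request

# Base URL as stated in the Assumptions section of this workflow.
BASE_URL = "http://www.example.com:3005"


def build_swap_request(recipe_id, imported_dataset_id, auth_token="<auth_token>"):
    """Build (but do not send) the PUT request that swaps a recipe's primary input."""
    body = {"importedDataset": {"id": imported_dataset_id}}
    return urllib.request.Request(
        url=f"{BASE_URL}/v4/wrangledDatasets/{recipe_id}/primaryInputDataset",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Basic {auth_token}",
            "Content-Type": "application/json",
        },
    )


swap_request = build_swap_request(recipe_id=9, imported_dataset_id=12)
print(swap_request.get_method(), swap_request.full_url)
print(swap_request.data.decode("utf-8"))
```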

Step - Rerun Job

To execute a job on this recipe, you can simply re-run any job that was executed on the old imported dataset,
since you reference the job by jobId and wrangledDataset (recipe) Id.

Endpoint http://www.example.com:3005/v4/jobGroups
Authentication Required

Method POST
Request Body
{
"wrangledDataset": {
"id": 9
}
}

The job is re-run as it was previously specified.


If you need to modify any job parameters, you must create a new job definition.

Step - Monitor Your Job

After the job has been queued, you can track it to completion. See API Workflow - Develop a Flow.

Step - Schedule Your Job

When you are satisfied with how your flow is working, you can set up periodic schedules using a third-party tool to
execute the job on a regular basis.
The tool must hit the above endpoints to swap in the new dataset and run the job.
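
For a scheduler, it can help to describe the swap and the job run as an ordered list of calls. The sketch below is one possible shape, assuming the imported dataset has already been created and its id is known; plan_swap_and_run is a hypothetical function and the plan is only printed, not executed.

```python
import json


def plan_swap_and_run(recipe_id, imported_dataset_id):
    """Return the ordered (method, path, body) calls for one scheduled run."""
    return [
        # 1. Point the recipe at the newly imported dataset.
        ("PUT",
         f"/v4/wrangledDatasets/{recipe_id}/primaryInputDataset",
         {"importedDataset": {"id": imported_dataset_id}}),
        # 2. Run the job against the recipe with its existing outputs.
        ("POST",
         "/v4/jobGroups",
         {"wrangledDataset": {"id": recipe_id}}),
    ]


steps = plan_swap_and_run(recipe_id=9, imported_dataset_id=12)
for method, path, body in steps:
    print(method, path, json.dumps(body))
```

The order matters: the job should be started only after the swap request has returned successfully.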
