
Orchestrate Your Data Pipelines on Trifacta Using Plans

December 8, 2020

If you already have a number of flows on Trifacta, you may be curious to know how you can go beyond simply running them on a schedule and automate your entire data pipelines on Trifacta.

What is a Plan?

A Plan on Trifacta, as the name suggests, enables you to plan the execution of flows by defining the sequence in which you’d like to execute them. 

Plans are created using Trifacta’s visual builder that makes it dead simple to create a workflow wherein each step is a task configured based on your needs.

Besides executing flow tasks, you can also execute an HTTP task as part of a plan. In other words, you can trigger a webhook endpoint or call any external API via a step in a Plan. Additionally, a Plan can be run manually or can be configured to run on a schedule. 

Why Create a Plan?

The short answer is to operationalize and automate your data pipelines on Trifacta. If you are running a lot of different flows, instead of running a flow in its entirety, you can configure specific recipes in a flow to run as tasks in a Plan — you no longer need to rely on our APIs or external tooling for this. 

Moreover, when creating a Plan, you have more flexibility regarding task execution with advanced features like HTTP Tasks, Conditional Execution, and Parallel Execution. 

HTTP Tasks

By adding an HTTP task in a Plan, you can: 

  • Send a Webhook Notification
  • Execute a Plan from another Plan
  • Make a request to any external API (that you have access to)

A simple use case is to send yourself a Slack message when a task in a Plan fails; you can do this by calling an incoming webhook endpoint from Slack.
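As a minimal sketch of what that HTTP task would send, here is the request in plain Python. Slack incoming webhooks accept a JSON body with a `text` field; the webhook URL and the plan/task names below are placeholders, and this runs outside Trifacta itself:

```python
import json
import urllib.request

# Placeholder -- substitute your own Slack incoming webhook URL.
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/T000/B000/XXXX"

def build_payload(plan_name: str, task_name: str) -> dict:
    """Build the JSON body a Slack incoming webhook expects."""
    return {"text": f"Plan '{plan_name}' failed at task '{task_name}'."}

def notify_failure(plan_name: str, task_name: str) -> bytes:
    """POST the notification to Slack; the webhook replies with b'ok' on success."""
    req = urllib.request.Request(
        SLACK_WEBHOOK_URL,
        data=json.dumps(build_payload(plan_name, task_name)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

In a Plan, you would configure the same URL, method, headers, and body on the HTTP task rather than writing code; the sketch just makes the request explicit.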

Conditional Execution

As the name suggests, conditional execution allows you to specify the status of a task execution as a condition for running a task. You can choose one of the following statuses as a condition to execute the subsequent tasks:

  • On success: This is the default status; the subsequent task runs only if the task completes successfully
  • On failure: Think of this as a fallback condition that runs a task when the previous one fails
  • On execution: This ignores whether the task in question succeeds or fails and runs the subsequent task either way

As suggested before, you can notify yourself of failed executions by setting an on-failure condition on a node and running an HTTP task there. Similarly, you can notify project stakeholders by email on successful executions, or on both outcomes, depending on your use case and preferences.

Parallel Execution

Running tasks in parallel can save time and increase the efficiency of your Plans. For instance, transforming and moving two files into two separate tables will be faster when the respective tasks run in parallel rather than sequentially.

Do keep in mind that for tasks to run in parallel, there must be no dependencies between them.
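The time savings are easy to see outside Trifacta as well. In this sketch, two independent stand-in tasks each take about 0.2 seconds; run concurrently, the pair finishes in roughly the time of one task rather than the sum (the task function and table names are illustrative, not Trifacta APIs):

```python
import concurrent.futures
import time

def transform_and_load(table: str) -> str:
    # Stand-in for an independent flow task that writes one output table.
    time.sleep(0.2)  # simulate the work
    return f"{table}: done"

start = time.perf_counter()
with concurrent.futures.ThreadPoolExecutor() as pool:
    # Both tasks run at the same time because neither depends on the other.
    results = list(pool.map(transform_and_load, ["orders", "customers"]))
elapsed = time.perf_counter() - start
# elapsed is ~0.2s here, versus ~0.4s if the tasks ran one after the other.
```

The same reasoning applies inside a Plan: independent branches execute side by side, so the Plan's wall-clock time is governed by its longest branch, not the total work.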

Conclusion

Currently only available as a paid feature, Plans extend the power of Trifacta by adding a layer of automated workflows on top of your data transformation flows.

If you’re one of our customers and are running a host of scheduled jobs, do have a look at the capabilities put forth by Plans — you are sure to fall in love. 

And if you’re curious to learn more about Plans or other paid features, schedule a demo here.