Connecting to Data

  • ESSENTIALS: Connecting to Data

    Overview: Before you can perform any transformations, you must first connect to the data you wish to transform. Creating a Flow: A flow is your hub for organizing and managing datasets used to generate results. You must create a flow and add a dataset to that flow in order to wrangle that data. From here you can add additional datasets or jump into the Transformer Page. Add…

  • FAQ: What is Datasource Swapping?

    Datasource Swapping is a method to apply an existing recipe to multiple datasets with the same schema. Source swapping replaces the first datasource with the second datasource. Basically, if you create a dataset using a source file and you have a different source file with the exact same columns, you can swap the original source file with the second source file and the recipe logic will be applied…
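    The "exact same columns" requirement can be checked before a swap. The following is a minimal sketch in plain Python, not a Trifacta API: the function names `read_header` and `can_swap` and the sample CSV text are hypothetical, and it assumes the sources are CSV text whose first row holds the column names.

```python
import csv
import io


def read_header(csv_text):
    """Return the column names from the first row of CSV text."""
    return next(csv.reader(io.StringIO(csv_text)))


def can_swap(original_text, replacement_text):
    """Return True when both sources have the same columns in the same order."""
    return read_header(original_text) == read_header(replacement_text)


# Hypothetical sample data standing in for source files.
january = "order_id,customer,total\n1,Ann,10.50\n"
february = "order_id,customer,total\n2,Bob,7.25\n"
renamed = "id,customer,total\n3,Cy,4.00\n"

print(can_swap(january, february))  # True: identical columns
print(can_swap(january, renamed))   # False: first column name differs
```

    The comparison is deliberately strict about column order, on the assumption that a recipe written against one layout expects the same layout in the replacement source.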

  • FAQ: Creating a Datasource from HDFS

    Which objects in HDFS can I use to create a datasource? You can create a datasource based on an HDFS file or an HDFS folder. How does Trifacta create a datasource based on a folder? When you select a folder as the basis for a datasource, Trifacta creates a single datasource that contains the contents of all of the files in the selected folder. If the selected folder contains subfolders, Trifacta wil…

  • HOW TO: Apply a Recipe to a New Dataset

    If you want to apply the logic of an existing recipe to another dataset, you can replace the imported dataset with the new dataset you would like the recipe applied to. You can either replace an existing imported dataset if you don't plan on generating output from it in the future, or you can maintain the existing flow and create copies that can be used for new datasets. You can preserve the …

  • HOW TO: Copy a Recipe

    Copying a Recipe allows the user to apply a Recipe to another dataset, or to have different steps applied to the imported dataset. You can either: Copy a Recipe without inputs, useful for applying an existing recipe to a new dataset; or Copy a Recipe with inputs, useful for applying different steps to an existing imported dataset. See Also: HOW TO: Chain Recipes; HOW TO: Apply a Recipe to a new D…

  • HOW TO: Copy a Flow

    In the Flows Page, perform the following steps: 1. Select the more options (three dots) icon to the right of the Flow you would like to copy. 2. Select the option to Make a copy. 3. A window will appear allowing you to edit the name and description of the copied flow. 4. When you are ready, select Ok, and your newly created Flow will appear in the Flows Page. 5. Select the Flow name to enter your new Flow S…

  • HOW TO: Delete a Flow

    In the Flows Page, perform the following steps: 1. Select the more options (three dots) icon to the right of the Flow you would like to delete. 2. Select the option to Delete. 3. A window will pop up asking if you are sure you would like to delete the selected Flow. Select Ok if you would like to proceed. Note: You will not be able to undo the delete, so all accompanying Recipes and Wrangled Datasets will b…

  • HOW TO: Create a Single Dataset Using Multiple Files

    You can create a dataset based on multiple files in a single directory. These files must have the same structure. 1. In the Flow view, click Add Dataset. 2. Navigate to the directory in HDFS, S3, Hive, etc. 3. Navigate to the folder that contains the files that you want to wrangle. 4. Click the '+' next to the Directory icon to the left of the folder name. This will select the en…
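    The idea of one dataset built from several same-structure files can be illustrated outside the product. This is a minimal sketch in plain Python, not how Trifacta implements it: `combine_csv_folder` is a hypothetical helper, and it assumes the files are CSVs with a header row sitting directly in one local folder.

```python
import csv
import tempfile
from pathlib import Path


def combine_csv_folder(folder):
    """Union every CSV file in a folder, requiring identical header rows."""
    header = None
    rows = []
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="") as handle:
            reader = csv.reader(handle)
            file_header = next(reader, None)
            if file_header is None:
                raise ValueError(f"{path.name} is empty")
            if header is None:
                header = file_header  # the first file defines the structure
            elif file_header != header:
                raise ValueError(f"{path.name} does not match columns {header}")
            rows.extend(reader)
    return header, rows


# Hypothetical demo: two files with the same structure in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "part1.csv").write_text("id,name\n1,Ann\n")
    Path(folder, "part2.csv").write_text("id,name\n2,Bob\n")
    header, rows = combine_csv_folder(folder)
    print(header)  # ['id', 'name']
    print(rows)    # [['1', 'Ann'], ['2', 'Bob']]
```

    Raising an error on a mismatched header mirrors the stated requirement that the files share the same structure; a real tool may handle mismatches differently.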