Connecting to Data

  • Connecting to Data

    Before you can perform any transformations, you must first connect to the data that you want to transform. Creating a Flow: A flow is your hub for organizing and managing the datasets used to generate results. You must create a flow and add a dataset to that flow in order to Wrangle that data. From here you can add additional datasets or jump into the Transformer Page. Adding dataset…

  • FAQ: What is Datasource Swapping?

    Datasource Swapping is a method to apply an existing recipe to multiple datasets with the same schema. Source swapping replaces the first datasource with the second datasource. Basically, if you create a dataset using a source file and you have a different source file with the exact same columns, you can swap the original source file with the second source file and the recipe logic will be applied…

  • HOW TO: Import Data from External Sources

    1. Click the 'Datasources' link in the User Options Menu.
    2. Click the 'Add Datasource' button on the right-hand side of the Sources page. Depending on your version of Trifacta and exact configuration, the following datasource options may be available: Use Existing Datasource, Local File, HDFS, S3, Hive.
    3. Select an option from the Add Datasource dropdown menu. In this example, we'…

  • HOW TO: Import Local Data

    1. Click the 'Datasources' link in the User Options Menu.
    2. Click the 'Add Datasource' button on the right-hand side of the Sources page.
    3. Select 'File' from the Add Datasource dropdown menu. NOTE: Trifacta Wrangler Enterprise may have additional options, such as HDFS, Hive, S3, or others.
    4. From the file browser, browse to the location on your local machine wh…

  • FAQ: Creating a Datasource from HDFS

    Which objects in HDFS can I use to create a datasource?
    You can create a datasource based on an HDFS file or an HDFS folder.
    How does Trifacta create a datasource based on a folder?
    When you select a folder as the basis for a datasource, Trifacta creates a single datasource that contains the contents of all of the files in the selected folder. If the selected folder contains subfolders, Trifacta wil…

  • ERROR: Undefined is Not a Function in Trifacta Wrangler

    Problem Description: Creating a dataset in Trifacta Wrangler fails when you try to upload a file. The following error appears: "Error: undefined is not a function."
    Cause: In Trifacta Wrangler, you can only create a dataset using files that are smaller than 100MB. This error message appears if you attempt to upload a file that is larger than 100MB.
    keywords: create dataset, error, undefine…
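
    The size limit described above can be checked before an upload is attempted. The following is a minimal sketch, not part of Trifacta Wrangler: the names `MAX_UPLOAD_BYTES`, `is_small_enough`, and `check_file` are hypothetical, and reading "100MB" as 100 × 1024 × 1024 bytes is an assumption, since the article does not say which definition of megabyte applies.

```python
import os

# Assumption: "100MB" is read as 100 * 1024 * 1024 bytes; the article does not
# say whether it means decimal or binary megabytes.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024


def is_small_enough(size_bytes: int, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True when a file of size_bytes is strictly below the limit."""
    return size_bytes < limit


def check_file(path: str) -> bool:
    """Look up the file size on disk and compare it against the limit."""
    return is_small_enough(os.path.getsize(path))
```

    Calling `check_file` on a local path before creating the dataset would flag an oversized file early, instead of surfacing as the "undefined is not a function" error.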

  • VIDEO: Creating a Datasource from HDFS/S3

    Learn how to create a datasource using data stored in HDFS/S3.
    See also: HOW TO: Create a Datasource from HDFS
    keywords: HDFS, S3, datasource, source, import data…

  • HOW TO: Create a Dataset

    1. Click the 'Workspace' tab on the menu bar.
    2. Click 'New Dataset' at the top-right side of the screen.
    3. Complete the following fields in the 'Create a new dataset' window:
       Field                 Description
       Dataset Name          Required. Enter the name of the dataset.
       Dataset Description   Optional. Enter a description of the dataset.
    4. Select the location of the dataset that your datasour…

  • HOW TO: Swap Datasources

    1. Click the dataset drop-down menu.
    2. Click the datasource.
    3. Choose the new datasource and click 'Select'.
    Datasource Swap Tips:
      • Make a copy of your dataset prior to swapping sources (see HOW TO: Copy a Dataset).
      • Swapping assumes an identical schema. Applying one script to a source with a different schema may not produce the desired results. …
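
    Because a swap assumes an identical schema, it can help to compare the column headers of the two sources first. The sketch below is an illustration only and assumes both sources are CSV files with a header row. The helpers `read_header` and `same_schema` are hypothetical names, the sample data is made up, and the comparison covers column names and order, not data types.

```python
import csv
import io


def read_header(text_stream):
    """Return the first row of a CSV stream as a list of column names."""
    return next(csv.reader(text_stream), [])


def same_schema(header_a, header_b):
    """True when both headers list the same column names in the same order."""
    return list(header_a) == list(header_b)


# Hypothetical sample data standing in for the original and replacement files.
original = io.StringIO("id,name,amount\n1,Ann,10\n")
replacement = io.StringIO("id,name,amount\n2,Bob,20\n")

print(same_schema(read_header(original), read_header(replacement)))
```

    With real files, the `io.StringIO` objects would be replaced by handles from `open(path, newline="")`.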

  • HOW TO: Create a Single Dataset Using Multiple Files

    You can create a dataset based on multiple files in a single directory. These files must have the same structure.
    1. In the Workspace view, click New Dataset.
    2. Click the HDFS Browser tab.
    3. Navigate to the folder that contains the files that you want to wrangle.
    4. Click the Directory icon to the left of the folder name. This selects the entire contents of the folder as the dataset. …
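
    The requirement that all files share the same structure can be checked before the folder is selected. This is a rough sketch under stated assumptions: the files are CSV files with a header row, and they are available in a local folder rather than in HDFS. The function names `group_by_header` and `has_single_structure` are hypothetical and not part of Trifacta.

```python
import csv
from collections import defaultdict
from pathlib import Path


def group_by_header(folder):
    """Map each distinct header row to the CSV file names that use it."""
    groups = defaultdict(list)
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="") as handle:
            # An empty file yields an empty header tuple.
            header = tuple(next(csv.reader(handle), ()))
        groups[header].append(path.name)
    return dict(groups)


def has_single_structure(folder):
    """True when every CSV file in the folder shares one header row."""
    return len(group_by_header(folder)) <= 1
```

    If `group_by_header` returns more than one key, the listed file names show which files differ from the rest.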

  • HOW TO: Copy a Dataset

    1. Open the Workspace tab in Trifacta.
    2. Identify the dataset that you want to copy.
    3. Hover over the name of the dataset that you want to copy. A set of Action Buttons will appear.
    4. Click the Copy Dataset action button. Trifacta creates a copy of your dataset. When the copy has been successfully created, a green alert window appears. In our example, the…