Start Free

Speed up your data preparation with Designer Cloud powered by Trifacta

Free Sign Up
All Blog Posts

Wrangling JSON Metadata via Un-Nesting

October 27, 2017

JSON is a popular file format used to store unstructured content. Many popular databases use JSON to, for example, store media files. While it’s very easy for machines to parse JSON, JSON presents some challenges for analysts because data is nested inside of the document.

For our example, we’ll look at Twitter data. Opening our 59MB file in a text editor reveals the data.

{“twitter_info1”:”{“hashtags”:[],”created_date”:”2015-04-16T19:38:58″,”mention”:[“USTELCOCares”],”twitter_text”:”@USTELCOcares i haven’t been an USTELCO customer in 10+ years, but suddenly received bill for svcs never used. customer svc ppl can’t find record”}”,”user_info1″:”{“user_name”:”Jason Oh”,”user_id”:”jasonoh”,”screen_name”:[“USTELCOCares”],”location”:”ManhUSTELCOan”,”friends_count”:”79″}”,”created_date”:”2015-04-16″,”created_time”:”19:38:58″,”twitter_text”:”@USTELCOcares i haven’t been an USTELCO customer in 10+ years, but suddenly received bill for svcs never used. customer svc ppl can’t find record”,”twitter_text2″:”@USTELCOcares”,”Handle”:”@USTELCOcares”,”friends_count2″:”79″, “location2″:”ManhUSTELCOan”,”screen_name2″:”[“USTELCOCares”]”,”user_id2″:”jasonoh”,”user_name2″:”Jason Oh”,”friends_count1″:”79″,”location1″:”ManhUSTELCOan”,”screen_name1″:”[“USTELCOCares”]”,”user_id1″:”jasonoh”,”user_name1″:”Jason Oh”,”friends_count”:”79″,”location”:”ManhUSTELCOan”,”user_id”:”jasonoh”,”user_name”:”Jason Oh”} {“twitter_info1”:”{“hashtags”:[“USTELCO”,”cisco”,”network”],”created_date”:”2015-04-16T19:39:45″,”mention”:[],”twitter_text”:”#USTELCO taps Ciena, #Cisco to build its 100G U.S. metro #network”}”,”user_info1″:”{“user_name”:”ExterNetworks Inc”,”user_id”:”ExterNetworks”,”screen_name”:[],”location”:”Piscataway, NJ 08854 “,”friends_count”:”1229″}”,”created_date”:”2015-04-16″,”created_time”:”19:39:45″,”twitter_text”:”#USTELCO taps Ciena, #Cisco to build its 100G U.S. metro #network”,”twitter_text2″:null,”Handle”:null,”friends_count2″:”1229″,”location2″:”Piscataway, NJ 08854 “,”screen_name2″:”[]”,”user_id2″:”ExterNetworks”,”user_name2″:”ExterNetworks Inc”,”friends_count1″:”1229″,”location1″:”Piscataway, NJ 08854 “,”screen_name1″:”[]”,”user_id1″:”ExterNetworks”,”user_name1″:”ExterNetworks Inc”,”friends_count”:”1229″,”location”:”Piscataway, NJ 08854 “,”user_id”:”ExterNetworks”,”user_name”:”ExterNetworks Inc”}

For the purpose of demonstration, we can format one record a little differently by hand so we can more clearly see the content of the tweet and the associated metadata:


   “twitter_info1”:“{“hashtags”:[],”created_date”:”2015-04-16T19:38:58″,”mention”:[“USTELCOCares”],“twitter_text”:”@USTELCOcares i haven’t been an USTELCO customer in 10+ years, but suddenly received bill for svcs never used. customer svc ppl can’t find record”}”,

   “user_info1”:“{“user_name“:”Jason Oh”,”user_id”:”jasonoh”,”screen_name”:[“USTELCOCares”],”location”:“ManhUSTELCOan”,”friends_count”:”79″}”,



   “twitter_text”:“@USTELCOcares i haven’t been an USTELCO customer in 10+ years, but suddenly received bill for svcs never used. customer svc ppl can’t find record”,







   “user_name2”:“Jason Oh”,





   “user_name1”:“Jason Oh”,




   “user_name”:“Jason Oh”


The JSON document consists of name/value pairs, where values can also be arrays. In the above example, on a user record, one attribute is named `friends_count1` and the value is `79`. As an analyst, you may have thousands or millions of JSON documents to analyze. Pulling out the values and ordering them can be extremely complex. Fortunately, with Trifacta, there are some easy shortcuts that make working with JSON data a snap.

Un-nest Transformation

When you import your JSON file, Trifacta automatically begins to organize your data and provide column headers.


Built into every edition of Trifacta is an unnest transformation. The Unnest Transform expands the contents of an array or map and creates new columns in your dataset for each element of the array or map. For maps, unnest takes each key-value pair and creates a new column with the key as the column header and the values as the columns rows. Simply selecting all the data in your column, Trifacta will recognize each name/value pair in the JSON map and split them out into separate columns with the appropriate headings.

More advanced Trifacta users can also write a recipe using the Transform Builder to achieve the same results.

Your Turn

And that’s it! As always, if you have any questions, you can visit the Trifacta Community for help.