What is Data Structure? Structuring Data & Organization Like Martha Stewart

June 2, 2016

What does Martha Stewart have to do with a blog post about data structure? Stay with us and we promise to explain.

To understand what data structure is, think of your car keys. Suppose you lose them at least once a week, and there seems to be no pattern to where they end up. Worse, you’re a bit of a hoarder: you have to slip sideways through doorways because of the piles of magazines, children’s toys, and boxes stacked to the ceiling. How long do you think it would take to find your keys?

But what if your home were organized like Martha Stewart’s, with items categorized and like kept with like, arrayed in pleasing symmetry and order? How easy would it be to find your car keys there? So when you ask “What is data structure?”, think of data structure as computing’s version of Martha Stewart.

What Is Data Structure: Why It’s Important
When you’re dealing with terabytes and petabytes of data that you somehow need to analyze and make sense of, the problem is clearly far bigger than finding a set of car keys. By commonly cited estimates, data analysts spend up to 80 percent of their time preparing data for analysis and only 20 percent actually doing the analysis.

Data structures give you a way to collect and organize data so that you can do something with it. A program’s algorithm is responsible for the “do something with it” part—it’s the underlying logic that performs operations on your data—so data structures and algorithms are inextricably linked.

When you’re looking for something (for example, you want to know in real time how your customers are talking about your product across all social networks), the speed and efficiency of your search are of utmost importance. Structuring the data is key to maximizing the speed and efficiency of your searches so that you can get to the analysis as quickly as possible.
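To make the point about search speed concrete, here is a minimal Python sketch. The sample records and the function names (`find_unstructured`, `find_structured`) are hypothetical and invented for illustration. It compares scanning an unorganized list with looking up the same data after it has been structured as a dictionary keyed by customer ID.

```python
# Hypothetical sample data: (customer_id, comment) pairs in no particular order.
mentions = [
    (101, "love the new release"),
    (205, "setup was confusing"),
    (318, "support was quick to reply"),
]


def find_unstructured(records, customer_id):
    """Check every record in turn; the work grows with the amount of data."""
    for cid, comment in records:
        if cid == customer_id:
            return comment
    return None


# Structure the same data once, as a dictionary keyed by customer ID.
mentions_by_customer = dict(mentions)


def find_structured(index, customer_id):
    """Jump straight to the record by its key; no scan is needed."""
    return index.get(customer_id)
```

Both functions return the same answer. The difference is how much work each does as the data grows: the scan may touch every record, while the keyed lookup typically takes about the same time regardless of size.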

What Is Data Structure: Some Examples
There are many types of data structures; we’ll cover just a few here.

  • An array is a fixed-length list made up of a collection of objects or values. Because each element sits at a numbered position, an array lets you determine the location of any object or value from its index using a mathematical formula. Example: Calculating racing results in a field of 300 runners.
  • Queues structure data in a “first in, first out” order. Just like a real-life queue (or line) of people at a bank, the first one to arrive also leaves first. Example: Serving print requests on a single shared printer.
  • Stacks structure data in a “last in, first out” order: the most recently added item is the first one removed. Example: Undoing an action in a computer program (one of the most familiar examples is the Back button in your browser).
  • A tree is a hierarchical data structure made up of one or more nodes. The top node is called the root; each node can have zero or more child nodes. Example: Storing data that naturally forms a hierarchy, such as an org chart.
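The four structures above can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation. The file names, page names, job titles, and the `Node` class are hypothetical. A Python list stands in for a fixed-length array, although Python lists can actually grow.

```python
from collections import deque

# Array: one slot per runner, read or written directly by position.
# (A Python list is used as a stand-in for a fixed-length array.)
finish_times = [0.0] * 300
finish_times[41] = 1312.5  # hypothetical time in seconds for the runner at index 41

# Queue: first in, first out, like print jobs on a shared printer.
print_queue = deque()
print_queue.append("report.pdf")
print_queue.append("invoice.pdf")
first_job = print_queue.popleft()  # the job that arrived first

# Stack: last in, first out, like a browser's Back button.
history = []
history.append("home")
history.append("products")
last_page = history.pop()  # the page visited most recently


# Tree: a root node with zero or more children, like an org chart.
class Node:
    def __init__(self, name):
        self.name = name
        self.children = []

    def add(self, child):
        """Attach a child node and return it so calls can be chained."""
        self.children.append(child)
        return child


def count_nodes(node):
    """Count this node plus every node beneath it."""
    return 1 + sum(count_nodes(child) for child in node.children)


ceo = Node("CEO")  # the root
cto = ceo.add(Node("CTO"))
ceo.add(Node("CFO"))
cto.add(Node("Engineer"))
```

Here `first_job` is "report.pdf" and `last_page` is "products", which shows the opposite removal orders of a queue and a stack. `count_nodes(ceo)` walks the whole hierarchy from the root.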

What Is Data Structure: Its Role in Data Wrangling
“Data wrangling” is the process of taking data in its native format and making it usable for analysis. Structuring the data is only one of the six processes involved in data wrangling—but structuring data is integral to the success of your big data efforts:

“If you talk to any data scientist worth their salt, they will tell you that the first challenge of putting data to work for your business is getting it into a structured format so that you can analyze, interpret and make decisions around your data. This is lovingly referred to as ‘Data Wrangling’ and it’s what sucks up the bulk of the unproductive (wasteful) time (4 out of 5 days, by most accounts). That’s because instead of spending time understanding the data, you’re wasting time trying to pull it all together in a useable format. This is what usually creates the bottleneck in any organization.”

What is data structure? In short, it’s the first, critical step to actually doing something useful with your data.
