The Dos and Don’ts of Big Data Success

May 18, 2016

Because Trifacta bridges data storage and processing platforms with visualization tools, our team has seen its fair share of big data stack implementations. Sometimes, customers already have one piece of their stack complete; other times, they’re starting from scratch. Architecting a modern environment typically means Hadoop is involved in some way, shape, or form, and Trifacta Wrangler Enterprise is seen as a huge help to IT in unlocking the potential of that investment. Our conversations often revolve around the vision the IT organization has for the business (driving deeper insights with access to more complex and comprehensive data) and the fact that they need to find the right tools for their business analysts before any of it comes to fruition.

Architecting and implementing the right solution is no picnic, and the IT organization has its hands full. It must consider the Hadoop distribution and cluster sizing, ingestion, data wrangling, and data visualization, and whether all of those components will be cohesive and comply with necessary security standards. Though it’s always challenging, we’ve seen some organizations take savvy shortcuts and build upon success faster than others. Below, we’ve listed a few of our biggest dos and don’ts for the IT organization when it comes to big data success:

THE DON’Ts:

1) Don’t focus solely on Hadoop and ingestion.
No doubt, Hadoop is an important component of your big data stack. But if you focus all of your efforts on implementation and ingestion, you’ll have little business value to show for your initiative. We’ve seen banks work solely on ingestion for years with little to show for it. Without obvious business value or relevance, your organization won’t see the ROI of Hadoop and may grow skeptical of the investment.

2) Don’t isolate the business from your big data strategy/environment.
It’s tempting to work exclusively within your IT organization before involving business units. After all, you’re the one with the necessary skills, and you’re responsible for the implementation. However, excluding business users from the process means that once you’re ready, odds are they won’t be. They’ll have little information about the infrastructure and, more importantly, little idea of what to do with it. Business units need runway time to plan for a Hadoop-worthy use case, which will ultimately prove the value of your investment.

3) Don’t strive for perfection before execution.
Most IT organizations isolate themselves from the business during Hadoop setup because they want every detail resolved before execution. While ideal in theory, this approach doesn’t allow you to be agile, build traction with business users, and better understand how your organization will use the tools. Meticulous preparation and planning are responsible, but lean too heavily in that direction and you risk overlooking a critical factor that you would have learned by testing and strategizing with the business.

THE DOs:

1) Do focus on the bigger picture.
Instead of thinking through one aspect of your big data stack, focus on setting up the stack in one full sweep. Think through the process, necessary hand-offs, and business outcomes, working with the right people in both your IT organization and various business units to understand the full cycle. Of course, this is a balancing act, and requires diligence across your team to appropriately deal with each component. But it pays off tremendously—with the ability to get the business involved sooner, you’ll see success much faster, too.

2) Do partner with the business early on.
Not only will a strong business/IT partnership ensure success later down the road, but it will also help build trust and excitement among both teams. Aligning with a business initiative provides direction and obvious value to your hard work.

3) Do start small.
In order to set up a full stack, you’ll have to start small. This is okay. Don’t strive for perfection; strive for results. As the business’s initiatives evolve, your technologies and process will mature, too—the key is to work toward progress together. Many of our customers, such as PepsiCo, operated this way. They understood their business needs and were able to go from not having Hadoop at all, to successful usage, fairly quickly.

THE BOTTOM LINE:
Architecting a big data stack isn’t easy, but it’s essential to remember what’s driving your effort in the first place: business value from new or more sophisticated use cases. Loop in business users early in the process to build traction, ensure that they’re bought in on tools and processes, and let them get started as early as possible.

Since there are many moving parts involved, understand that there will be a learning curve, both for your IT organization and your business users. Move fast, learn quickly, and build upon your big data success to prove ROI on your investment.

To learn more about inciting business users with the right initiatives for your Hadoop investment, read our white paper, “Best Practices for Executing New Analytics Initiatives.”
