Wrangling Exercise Bike Data

February 18, 2016

Getting in better shape and losing weight are important to many people; in fact, they are the top two New Year's resolutions. I am no different: one of my goals for 2016 is to exercise more. However, many resolutions do not last long, and fewer than 50% of them make it past the six-month mark. To increase my motivation, I decided to analyze my 2015 exercise bike sessions.

In January 2015, I bought a Kettler E3 exercise bike because I wanted to improve my fitness, and I have used it fairly regularly since. After each cycling session, I usually take a photo of the display. The Kettler E3 can also record training sessions to a USB stick, which I plan to use more in the future.

I entered the data from the photos into an Excel spreadsheet and exported it as CSV, which took about 30 minutes. Then I loaded the CSV file into Trifacta Wrangler and started with the data cleanup and transformation:


First, I wanted to know when I usually work out. Using predictive interaction and the transform editor, I replaced the dots in the time field with colons to turn it into a valid datetime representation (see the screenshot above, labeled 1). The column details profile for the “Date” field showed my training patterns over time. The more intense training before my knee injury in February 2015 and the break after it, as well as the gaps during business trips and vacations, stand out (see the screenshot below, labeled 2). I also found it interesting to learn that Wednesdays and Sundays are my favorite training days (3), and that I mostly train in the evening (4):
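For readers without Trifacta at hand, the same time cleanup and pattern counts can be sketched in plain Python. This is only an illustration, not the Wrangler steps themselves: the ISO date format, the dotted `HH.MM` time format, the function names and the sample rows are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime

WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]


def parse_session_start(date_text, time_text):
    """Combine a date and a dotted time such as '19.45' into a datetime.

    Assumes dates look like '2015-01-07'; the real spreadsheet may differ.
    """
    # Replace the dot with a colon so the time becomes a valid HH:MM value.
    normalized_time = time_text.strip().replace(".", ":")
    return datetime.strptime(f"{date_text.strip()} {normalized_time}",
                             "%Y-%m-%d %H:%M")


def training_patterns(sessions):
    """Count sessions per weekday name and per starting hour."""
    starts = [parse_session_start(d, t) for d, t in sessions]
    by_weekday = Counter(WEEKDAY_NAMES[s.weekday()] for s in starts)
    by_hour = Counter(s.hour for s in starts)
    return by_weekday, by_hour


# Made-up sample rows, not the real training log.
sample = [("2015-01-07", "19.45"), ("2015-01-11", "18.30"), ("2015-01-14", "20.05")]
by_weekday, by_hour = training_patterns(sample)
print(by_weekday.most_common(2))
print(sorted(by_hour.items()))
```

The two counters play the role of the weekday and hour-of-day histograms in the column details view.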


Depending on whether I activate my persona on the bike, the energy is reported in kJoule or kcal. To unify this, I separated the unit from the value and converted the value to kJoule if it was in kcal. Then I derived the kcal values and, using the energy of a gram of fat, calculated how much fat worth of energy I burned in each training session:

split col: Energy on: `{delim}`
rename col: Energy3 to: 'EnergyUnit'
rename col: Energy2 to: 'EnergyValue'
derive value: (EnergyUnit == 'kcal') ? (EnergyValue * 4.1868) : EnergyValue
rename col: column11 to: 'EnergyKJoule'
drop col: EnergyValue,EnergyUnit
rename col: column12 to: 'TimeInMinutes'
derive value: EnergyKJoule / 4.1868
rename col: column13 to: 'EnergyKCal'
derive value: EnergyKCal / 8.8
rename col: column14 to: 'GramFat'
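
As a cross-check on the arithmetic, the energy steps of the script above can be written as a small Python sketch. The constants 4.1868 and 8.8 are taken from the script; the function names and the assumption that the raw field looks like `"250 kcal"` (value, whitespace, unit) are made up for this example.

```python
KJ_PER_KCAL = 4.1868       # conversion factor used in the script above
KCAL_PER_GRAM_FAT = 8.8    # energy of a gram of fat, as used in the script above


def split_energy(raw):
    """Split a raw reading such as '250 kcal' into (value, unit).

    The whitespace-separated format is an assumption for this sketch.
    """
    value_text, unit = raw.split()
    return float(value_text), unit


def energy_to_kjoule(value, unit):
    """Return the energy in kJoule, converting only when the unit is kcal."""
    return value * KJ_PER_KCAL if unit == "kcal" else value


def fat_grams(energy_kjoule):
    """Convert kJoule back to kcal, then to the equivalent grams of fat."""
    energy_kcal = energy_kjoule / KJ_PER_KCAL
    return energy_kcal / KCAL_PER_GRAM_FAT


value, unit = split_energy("88 kcal")
print(fat_grams(energy_to_kjoule(value, unit)))  # roughly 10 grams
```

Normalizing everything to kJoule first keeps a single code path for both display modes of the bike.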

Finally, I cleaned out mismatched values in the pulse field, changed the time value to a minute-based number and derived the year. Then I was able to aggregate the main stats by year:
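
These last steps could look roughly like this in plain Python. It is a sketch under assumptions rather than the actual recipe: the `MM:SS` duration format, the dictionary keys, the rule that a valid pulse is a positive whole number, and the sample rows are all hypothetical.

```python
from collections import defaultdict


def clean_pulse(raw):
    """Return the pulse as an int, or None for mismatched values."""
    try:
        pulse = int(raw.strip())
    except ValueError:
        return None  # e.g. empty cells or typos such as '12O'
    return pulse if pulse > 0 else None


def duration_minutes(raw):
    """Convert an assumed 'MM:SS' duration into a minute-based number."""
    minutes, seconds = raw.strip().split(":")
    return int(minutes) + int(seconds) / 60


def aggregate_by_year(rows):
    """Sum sessions, minutes and grams of fat per year."""
    totals = defaultdict(lambda: {"sessions": 0, "minutes": 0.0, "gram_fat": 0.0})
    for row in rows:
        year = int(row["date"][:4])  # assumes ISO dates such as '2015-03-01'
        stats = totals[year]
        stats["sessions"] += 1
        stats["minutes"] += duration_minutes(row["duration"])
        stats["gram_fat"] += row["gram_fat"]
    return dict(totals)


# Made-up rows, not the real training log.
rows = [
    {"date": "2015-03-01", "duration": "30:00", "gram_fat": 20.0},
    {"date": "2015-06-10", "duration": "45:30", "gram_fat": 25.5},
]
print(aggregate_by_year(rows))
```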


The most time-consuming part of this analysis was entering the data from the photos into the spreadsheet, which I plan to automate more by recording training sessions via USB. The data wrangling in Trifacta took only 10 minutes.

Through my analysis, I learnt that I spent about 24 hours cycling in 43 sessions and burned the equivalent of 0.9 kg of fat in 2015. My goal for 2016 is to exercise twice a week and to prevent any injuries. So far, I’m on track.

At Trifacta, we apply data wrangling wherever we can, for example to runners’ data from the JP Morgan Corporate Challenge or to beer types for the perfect drink recommendation. If you’re like us, sign up for Trifacta Wrangler to try it out yourself. And if you end up with a story to tell, email it to us; we’d love to hear it!