Power Meter 301

This is our third and final segment on the nuts-and-bolts of power measurement. Part 1 is located HERE and Part 2 is located HERE.

To begin this segment, I’d like to wrap up a few take-home pieces of information that we’ve learned from the first two parts:

1. A power meter is simply a data-gathering device. In essence, it is telling you in real time how much effort you are putting forth: a quick and constant calculation of our equation, Power = Force * Distance / Time.

2. When a power meter manufacturer quotes an accuracy range (e.g. +/- 2%), this is taking everything into account in terms of actual power measurement. For example, if the power meter’s strain gauges aren’t entirely accurate, this will lead to some error (the ‘force’ part of the equation). If the time cycle over which that power is being measured is inaccurate, this can also lead to inaccurate data.

3. When power meter manufacturers quote that accuracy rating, it does not account for the potential data loss or inaccuracy due to transmission protocol or your head unit. As we saw in Powermeter 201, two head units reading off of the same power meter did not have identical data. Sorry to crash your party, but you might not actually have the fastest time on your local Strava segment – due to lost data packets and variability in GPS information. Turns out the best way to really know who is the fastest in your group is to actually ride with other people. How arcane!

Above photo © Jake Orness
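For readers who like to see the recap equation in action, here is a minimal sketch of Power = Force * Distance / Time applied to one crank revolution, next to the equivalent torque-times-angular-velocity form. The function names and the rider numbers (120 N average tangential pedal force, 172.5mm cranks, 90rpm) are made up for illustration and do not describe how any particular power meter does its math.

```python
import math

def power_linear(force_n, crank_length_m, cadence_rpm):
    """Power = Force * Distance / Time over one crank revolution."""
    distance_m = 2.0 * math.pi * crank_length_m  # pedal path per revolution
    time_s = 60.0 / cadence_rpm                  # seconds per revolution
    return force_n * distance_m / time_s

def power_rotational(force_n, crank_length_m, cadence_rpm):
    """The same quantity written as torque * angular velocity."""
    torque_nm = force_n * crank_length_m
    angular_velocity = cadence_rpm * 2.0 * math.pi / 60.0  # rad/s
    return torque_nm * angular_velocity

# Hypothetical rider: 120 N average tangential force, 172.5mm crank, 90rpm
print(power_linear(120.0, 0.1725, 90.0))
print(power_rotational(120.0, 0.1725, 90.0))
```

Both forms give the same answer (about 195 watts for these numbers), which is why an error in either the force measurement or the timing shows up directly in the displayed power.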

More Testing

So far, we’ve done a few tests. First we did a series feeding data from a single power meter to multiple head units. Next, we strapped three power meters (Stages, SRM, and Powertap) on a single bike, and sent data from each to its own head unit. All of the multi-PM rides were done indoors, to control for temperature, humidity, and to reduce or eliminate vibration of the bicycle. The first and second-round tests are available in Powermeter 201, linked at the bottom of this page.

The final test runs were done with all three power meters, but outdoors on real roads. Would temperature and altitude change fiddle with our data? Could road vibration cause hardware and accelerometers to go on the fritz? That was our quest – the perfect test.

Run 1 - Outdoors

Here’s the setup for our first outdoor test:

-Stages sending to Garmin 310XT (blue)
-SRM sending to Garmin 500 (red)
-Powertap SL+ sending to Joule GPS (black)

All power meter settings were the same as previous runs: 1-second recording, GPS off, include zeros in averaging, etc. All external sensors were the same, including the external ANT+ wheel speed sensor, sending data to both the Stages and SRM head units. Of course, all three power meters and heads had the zero offset function performed at the beginning of the ride.

This ride was just over 40 kilometers, done over variable terrain (flats, hills, descents), and on both smooth and bumpy pavement. The first hour and fifteen minutes were just good ol’ pedaling. After that, I did a lower cadence segment that also happened to be on a very bumpy bridge, to see if we could wreak any havoc on the power meters or head units.

First, let’s look at a speed x distance graph, provided by Dr. Chung:

They all look really similar, don’t they? Because we decided to use an external ANT+ speed sensor for the Stages – and not rely on GPS data for speed – everything lines up nicely. The SRM and Stages (using the same speed sensor) are almost spot-on with the Powertap, which measures speed via its axle spacer magnet.

Next, we’ll look at cadence x distance. I drew in a green circle around the area where I did the low cadence segment:

Initially, Dr. Chung’s overall response was, “I don't see an obvious cadence anomaly in the Stages data, so that's good.”

During the low cadence segment, the Stages shows a few more zeros than the other two power meters, but that comes as no surprise, given our data from the previous article. In fact, Stages publishes a 30rpm lower limit for their system, and our data confirms this.

What I notice is that, overall, the SRM data is the outlier. We can’t really explain it. There are more zeros all around. Either it is right and the other two power meters are wrong, or vice versa. I had both Garmin head units in my rear pocket, simply because I had too much other stuff on my bars, and a broken wrist strap for the 310XT. Was my body blocking the signal? If so, why didn’t it affect the Stages data? I have to think that the Powertap data is good, because we were using a supplemental cadence sensor, and a head unit on my bike’s stem. At this point, all we can say is: ‘Interesting.’

Next, we’ll look at VE, or Virtual Elevation. This is Dr. Chung’s way to compare effective wind and/or elevation of multiple systems. In essence, it compares the relative measured effort of each system against the others. There is no single objective right or wrong – all we can do is compare one system against the others to see how they trend.

Similar to the previous tests, the Stages seemed to run slightly ‘high’ compared to the others (i.e. showing higher wattage required over varied terrain and wind). VE is cumulative, meaning that once something diverges, it generally isn’t going to magically find its way back.
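To show why a small offset in power piles up in a VE plot, here is a rough sketch of how a virtual elevation profile can be built from power and speed samples. It assumes a simple force balance (rolling resistance, aerodynamic drag, gravity, and acceleration) with no wind and no drivetrain loss. The mass, Crr, CdA, and air density values are placeholders, and this is not the code Dr. Chung used for the charts in this article.

```python
G = 9.81  # gravitational acceleration, m/s^2

def virtual_elevation(power_w, speed_ms, dt=1.0, mass_kg=80.0,
                      crr=0.005, cda=0.30, rho=1.2):
    """Return cumulative virtual elevation (m), one value per sample."""
    ve = [0.0]
    for i in range(1, len(power_w)):
        v = speed_ms[i]
        if v <= 0.1:  # skip near-standstill samples to avoid dividing by ~0
            ve.append(ve[-1])
            continue
        accel = (speed_ms[i] - speed_ms[i - 1]) / dt
        # Solve P = m*g*v*(slope + Crr) + 0.5*rho*CdA*v^3 + m*a*v for slope
        slope = (power_w[i] / (mass_kg * G * v)
                 - crr
                 - rho * cda * v * v / (2.0 * mass_kg * G)
                 - accel / G)
        ve.append(ve[-1] + slope * v * dt)  # climb gained during this sample
    return ve

# Ten minutes at a steady 10 m/s on a flat road (hypothetical numbers)
v = 10.0
flat_power = 80.0 * G * v * 0.005 + 0.5 * 1.2 * 0.30 * v ** 3
speeds = [v] * 600
ve_reference = virtual_elevation([flat_power] * 600, speeds)
ve_reads_high = virtual_elevation([flat_power * 1.02] * 600, speeds)
print(ve_reference[-1], ve_reads_high[-1])
```

In this toy example, the meter that reads 2% high appears to climb a few meters over ten minutes of dead-flat road, and the gap never closes. That is the cumulative behavior described above.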

Again, the oddball in this test seemed to be SRM. Dr. Chung asked me directly: ‘Did something happen to the SRM at about the 16.4km mark (~ 10 miles)?’

Indeed, it drifts away from the others. At first they all track very similarly; at 16.4km, either the PT and Stages climb away, or the SRM takes a dive. I remarked that I could not recall anything happening on the ride at that time; I was JRA (‘just riding along’). Was it a head unit thing? A temperature thing? At this point, we can only say – that is the output.

Run 2 – Outdoors

For the second ride, I managed to put all three head units on the handlebars. I wanted to be able to see all of the head units to make sure that data wasn’t dropping for long periods of time, and we weren’t getting any interference from my body.

-Stages sending to Garmin 310XT (yellow)
-SRM sending to Garmin 500 (green)
-Powertap SL+ sending to Joule GPS (red)

Here is the full file from TrainingPeaks:

Let’s zoom in on a smaller portion:

While we see a few small ‘jiggles’ in data here and there, they all track similarly. What’s the bird’s eye view analysis from TrainingPeaks?

“Looking at the files it looks like for the outside ride, the data is so similar that none of them stand out over any of the other ones. The data are more or less identical [for training purposes].”

By now, I think they’d had enough of staring at near-identical files for me. I hear their point loud and clear.

Testing Summary

I think we’ve seen enough charts and graphs by now. For those who don’t crunch numbers for a living, what are the take-homes of all of our testing? After all of the discussions, interviews, and tests, I’ll attempt to distill everything into a few succinct points:

-Overall, our three power meters trended very similarly. Looking at ALL of my rides (and there were more than just the ones we showed), the Powertap and SRM were close enough that I’d consider them nearly ‘the same’. The Stages tracked and trended similarly (see the VE charts), but seemed to have slightly higher power readings overall.

-After discussions with several professional coaches and physiologists, all agreed that a person can adequately train to the data from any of these systems. They all noted that the most important thing is that a power meter is accurate relative to itself. For all of those people wondering our opinion of Stages’ accuracy – there it is. We saw slightly more variability in things like RPM x Crank Torque and obvious differences at low cadence, but nothing that would keep you from successfully training with the system (in my personal opinion). I do not feel that Stages is being disingenuous with their marketing messages – it is not sold as an SRM Science, and it does not carry the price.

-Where I do feel the systems begin to diverge is with testing, not training. If you are doing your own rolling resistance testing and trying to detect a 1-watt difference between tires, or trying to fine-tune your bike position on a velodrome for minimum aerodynamic drag, that is where the super-fine detail becomes necessary. Based on the data from our tests and others that I have seen, SRM takes the cake here. As we discussed in Powermeter 101, this actually has as much to do with their PC7’s fast recording rate as it does the crank itself. Powertap and Quarq can also be used for good testing, in my opinion, but their limiter is the head units’ choice to ‘dumb down’ the ANT+ protocol. Overall, this is a non-issue for 99% of the market; I can count the number of folks I know that do their own aero and Crr testing on one hand. You can also count me in the group that, despite its shortcomings, loves the Garmin 500. It doesn’t capture data as accurately as the PC7, but it has a wonderfully simple interface, long battery life, and of course, it has GPS. I consider its data ‘good enough’ for almost anyone.

Note on non-round chainrings

I’ve had a handful of people ask about how any sort of oval or non-round chainring affects power measurement. Put simply, it does affect crank-based power measurement. The reason has nothing to do with any of these power meters’ quality or design, but rather a simple physical reality: Our power equation that we keep referring to assumes a constant velocity. For a crank-based system, it assumes that your pedals are turning at a perfectly steady speed throughout each complete revolution. For a wheel-based system, it assumes that your wheel is turning at a constant speed. We know that neither of these situations is very likely.

The #1 goal of oval chainrings is to mess with this velocity. They cause your crank velocity to change during each pedal stroke, so you spend more or less time in certain parts of that pedal stroke. For example, they may want you to spend more time in the main ‘power phase’ by slowing the crank down there.

While we don’t want to dive too deep into this topic, the net effect is that you’ll get slightly inflated power numbers using an oval ring on a crank-based power meter. How much this gets inflated depends on how odd-shaped your ring is (the less round it is, the higher your power will read). How high are we talking? Through the course of my research, I heard anywhere from 0.5% to 4%. I did not have any non-round rings during my testing, so I can’t make a claim based on personal experience.
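To see where the inflation comes from, here is a toy model, not a simulation of any real chainring or power meter. It assumes torque and crank speed both vary sinusoidally twice per revolution, with the crank slowing down where torque is highest. It then compares the true average power (torque times instantaneous speed) with what you get by multiplying average torque by average speed. The ripple sizes are invented for illustration.

```python
import math

def oval_ring_inflation(torque_ripple=0.5, speed_ripple=0.05, samples=3600):
    """Fractional over-read from assuming constant crank speed (toy model)."""
    true_sum = torque_sum = speed_sum = 0.0
    for i in range(samples):
        phase = 2.0 * math.pi * i / samples
        # Torque peaks twice per revolution (one power phase per leg)
        torque = 1.0 + torque_ripple * math.cos(2.0 * phase)
        # Assumed oval-ring effect: crank is slowest where torque is highest
        speed = 1.0 - speed_ripple * math.cos(2.0 * phase)
        true_sum += torque * speed
        torque_sum += torque
        speed_sum += speed
    true_power = true_sum / samples
    assumed_power = (torque_sum / samples) * (speed_sum / samples)
    return assumed_power / true_power - 1.0

print(oval_ring_inflation())                  # oval-style speed variation
print(oval_ring_inflation(speed_ripple=0.0))  # perfectly steady crank speed
```

With these made-up ripple values the constant-speed assumption reads a little over 1% high, and with a perfectly steady crank speed the over-read vanishes. Real systems sample and average differently, so treat the number only as a sign of which way the error goes.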

Just for fun, let’s say your power meter quotes +/- 1.5% accuracy. At 200 actual watts, that means your displayed power output could be 197 to 203. If your oval chainring inflates the reading by another 1.5% on top of that, the error could reach roughly +3% in the worst case. On the high end, that would tell you that you’re putting out 206 watts, when you’re actually putting out 200. If you haven’t set your zero offset in three weeks, it could drift much farther than that. Did you PR on your Strava segment, or just get bad data?
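One way to work through the numbers in this example is to treat the meter’s accuracy as a two-sided tolerance and the oval-ring effect as a one-way bias that only pushes readings up. The sketch below does that. The function name and the choice to multiply the two effects together are my own assumptions for illustration.

```python
def displayed_power_range(actual_w, meter_accuracy=0.015, oval_inflation=0.015):
    """Return (low, high) displayed watts for a given actual power."""
    # The meter tolerance can go either way; the oval-ring bias only reads high
    low = actual_w * (1.0 - meter_accuracy) * (1.0 + oval_inflation)
    high = actual_w * (1.0 + meter_accuracy) * (1.0 + oval_inflation)
    return low, high

low, high = displayed_power_range(200.0)
print(round(low), round(high))
```

For 200 actual watts this gives a displayed range of roughly 200 to 206, so the whole window shifts upward rather than widening evenly in both directions.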

Target Customer for each system

How do you know which system is right for you? What is best? This question is discussed often in our reader forum. I will attempt to lay out what I feel is the target customer for each of the three power meters we tested.

SRM: At least with our testing, we tended to see the cleanest data in most situations with SRM. Our first outdoor ride left us stumped, but the odd data could very well be due to the head unit placement. I think SRM is for the customer who wants unlimited wheel choice and the ability to do very detailed data analysis. If you want to do aero testing or super-fine analysis of sprint workouts (i.e. small time increments with little-or-no smoothing applied), this is your system.

Stages: I see the Stages customer as someone who 1) wants the ability to swap to different wheels on their bike, 2) wants an easy-to-install system, and 3) is not going to do aero or Crr testing. Whether or not the left-leg measurement is a problem is up to you. After my testing, I don’t feel this is an issue for the vast majority of people who would train with this system. It is simple and appears to do what Stages advertises.

Powertap: Their Achilles heel is and will always be the wheel choice argument. However, their system opens up another very important area - unlimited crank and chainring choice. If you’re not yet 100% sure what crank length you want to ride or have different length cranks on different bikes, this is your system. In addition, we’ve learned that due to the realities of physics, wheel-based power measurement will produce the most accurate power numbers with any non-round chainring.

What if you want to swap your power meter between bikes? Which system is best if you only want to buy ONE? There is no simple answer to this. At first glance, Powertap is the answer. However, you might not want to have the same cassette on each bike, requiring a swap. Additionally, you cannot swap the same wheel between, say, a triathlon bike with 130mm rear spacing, and a disc brake equipped cyclocross bike with 135mm spacing. If you know what you’re doing, you can swap cranks between bikes, but with the modern problem of a million-and-one bottom bracket standards, the chance that all of your bikes are compatible with the same crank is nearly zero. Also, if you’re like me, you don’t have the same chainring sizes or even crank lengths on all bikes. None of this is intended to champion or pooh-pooh one style of power meter (crank or wheel-based); we just want to suggest that you evaluate your priorities before buying and don’t assume that one power meter is best for all situations.

Key take-homes for ALL power meters

What can you do to ensure the best data possible from your power meter? These recommendations most certainly apply to all of the systems that I have direct experience with: SRM, Quarq, Powertap, and Stages. They should also apply to similar strain gauge-based systems, such as Power2Max and Rotor.

1. Set your head unit to at least one-second recording. My opinion for all Garmin users is that you do not want to use ‘Smart Recording’.

2. Set your zero offset often! Yes, really! How often? I set mine at the beginning of every ride. On a Garmin, this is the ‘calibrate’ function. For a Joule GPS, you press ‘manual zero’. Whatever head unit you use, read the instructions and learn how to do it. This is essential.

3. Include zeros in your power averaging. Well, if you really want to cheat and show your buddies that you have super high power averages compared to them, you can choose to calculate the non-zero average, but most coaches will tell you to include the zeros.

4. I recommend not relying on GPS data for speed, or any ‘virtual’ calculations, such as Powertap’s virtual cadence. Supplemental ANT+ sensors will always be more accurate.
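To put a number on recommendation 3, here is a small sketch of how much the average moves when the zeros are left out. The function name and the one-minute sample ride are made up, and your head unit’s exact averaging may differ.

```python
def average_power(samples_w, include_zeros=True):
    """Average of 1-second power samples, with or without coasting zeros."""
    values = samples_w if include_zeros else [p for p in samples_w if p > 0]
    return sum(values) / len(values) if values else 0.0

# Hypothetical minute: 45 seconds pedaling at 250 watts, 15 seconds coasting
ride = [250] * 45 + [0] * 15
print(average_power(ride))                       # zeros included
print(average_power(ride, include_zeros=False))  # zeros dropped
```

Dropping the zeros turns a 187.5-watt minute into a 250-watt minute, which flatters the rider but does not reflect the work actually done over that time.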

Overall, I think the most important thing to understand is that a power meter is a tool, but it isn’t the end-all, be-all. In terms of measuring our efforts, they are as objective as we can get, but I hope our data show that power meters and head units are not perfect. So many people seem to take their power numbers as ‘the truth’, but fail to understand everything that goes into the number you see on your screen. Heck, I’m still working through an issue where I’ll have the same power file displayed in two different software programs – and show a 20-watt difference in average power. Yes, 20 watts! The beauty of technology is that each of these systems can improve over time via software and firmware updates, but what does that say for comparing to your historical data? If you have 10 years’ worth of data from one power meter, but just switched to something else… or they updated a critical part of their firmware… or you decided to try oval chainrings – can you really compare data?

I am admittedly playing devil’s advocate here, but that’s my job. All I’m saying is that while power meters are most certainly useful, at the end of the day, they are just tools. They are not substitutes for critical thinking. If you absolutely MUST hold 220 watts for your next time trial, but just can’t seem to do it, you have to use your noggin and back off of the effort. It could be that you’re just tired, OR that you are actually producing 220 watts and the calibration on your power meter is off. Power meters only realize their full potential as deadly weapons of cycling speed when they join their brethren of common sense and the human brain.