Measuring Power and Using the Data

FT, NP, IF and TSS Explained

In the previous two articles in this series (please read them first if you haven’t already; links follow this article), I gave the physical definition of “power” and described how a bicycle resists your efforts to move it, asking you to deliver ever more power to go faster.

I have stressed—and will continue to stress—that you should think of power in terms of work. It takes work to move down the road and the faster you can do that work, the faster you go. Work per unit time is the physical definition of power, and it is the most valuable way you can think intuitively about power in your racing and training. More work, in less time, makes you a faster cyclist.

But now we have to switch gears for a moment and think about an equivalent definition of power. If we rearrange the terms in the definition of power via a little eighth grade algebra, we find that power can also be defined as:

Force x Speed = Power

When dealing with bikes, we talk about force and speed in rotary terms, since cranks and wheels spin around. The rotating equivalents of force and speed are:

Torque x RPM = Power

Bicycle powermeters measure power in this fashion. They take rapid sample measurements of how hard you are pedaling, and how fast. They take these measurements either at the crankset, bottom bracket or rear axle. The little computer on your handlebars takes a few of these measurements, averages them together and multiplies the results. Voila—a wattage number pops up on the display.
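To put real units on that multiplication, here is a minimal Python sketch. It assumes torque is measured in newton-metres and cadence in revolutions per minute, so the RPM figure has to be converted to radians per second before multiplying. The function name and the sample numbers are mine, chosen only for illustration; they are not taken from any particular powermeter.

```python
import math


def crank_power_watts(torque_nm: float, cadence_rpm: float) -> float:
    """Power in watts from torque (newton-metres) and cadence (rev/min)."""
    # One revolution is 2*pi radians and one minute is 60 seconds,
    # so this converts rev/min into radians per second.
    angular_velocity = cadence_rpm * 2.0 * math.pi / 60.0
    return torque_nm * angular_velocity


# Hypothetical sample: 21 N-m of torque at a cadence of 90 rpm.
print(round(crank_power_watts(21.0, 90.0)))  # 198
```

The same wattage can come from more torque at a lower cadence or less torque at a higher one, which is why the meter needs both measurements.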

Average Power

The first step in using power is to figure out, after your ride or after a portion of it, what your average power was. The power output of your ride is averaged over time, since power itself is measured over time. Let’s say you did the following ride through the hills near your house:

10 minutes at 145 watts (cruising to the “start”)
20 minutes at 265 watts (the big rolling section)
30 minutes at 175 watts (moderate tempo back home)

What was the average power? We multiply each power level by the time during which we held it, add them together, and divide by total time.

(10x145)+(20x265)+(30x175) = 12,000

12,000/60 = 200 watts = Average Power

When your bike computer unit displays average power, it is using much smaller units of time than the minutes I used here—it averages the power output by summing up intervals of a second or two. I used minutes merely to illustrate the concept.
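The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the time-weighted average, using the same minute-long chunks as the example; the function name and the list-of-pairs layout are my own assumptions, not how any bike computer stores its data.

```python
def average_power(segments):
    """Time-weighted average power for a list of (minutes, watts) pairs."""
    total_time = sum(minutes for minutes, _ in segments)
    # Multiply each power level by the time it was held, then add them up.
    weighted_sum = sum(minutes * watts for minutes, watts in segments)
    return weighted_sum / total_time


# The hilly ride from the example above.
ride = [(10, 145), (20, 265), (30, 175)]
print(average_power(ride))  # 200.0
```

Feeding the function one-second samples instead of multi-minute segments changes nothing about the method, only the size of the list.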

The next thing new power users realize is that the average power displayed on the computer sometimes doesn’t match up with how tired their legs are. After a long hard group ride through the hills, I get to my driveway, look at the average power and think, “Dang. I just know I rode harder than that.” I might know from doing some flat TT intervals that I can hold 200 watts for an hour without too much trouble, but the 200 watt average from the hilly ride leaves my legs all rubbery.

What we have here is a disconnect: The one-hour 200 watt ride in the hills was significantly harder than the one-hour 200 watt flat TT effort. How can this be?

The reason the hilly ride was harder is that the physiological stress of riding harder is not linear. If I ride 10 percent harder, I don’t get just 10 percent more tired, nor can I ride just 10 percent less far. I might have to stop after only 6-7 minutes of riding instead of going for 60 minutes. A 10 percent increase in power can cut 80 to 90 percent off my ability to keep going.

Higher levels of power deliver a disproportionately larger fatigue load, or stress. In the ride described above, that hard portion in the middle placed a huge load on my legs and made the entire ride harder than the 200 watt average figure would indicate.

What we need is a way of measuring the total stress of the variable-effort ride so that the resulting number properly reflects just how hard it really was. I want to be able to compare my variable-effort rides to my constant-effort rides. A simple average power calculation doesn’t do it.

Normalized Power

We start from the notion that we seek a consistent way of measuring the overall stress incurred during rides of differing types. Effort varies within a ride, with long and short forays into high and low power. Rides also vary in how long they were and how hard the overall effort was.

Andrew Coggan, Ph.D., addressed this question in a chapter written for USA Cycling: “Training and racing using a power meter: an introduction.” In that paper, he defined the term Training Stress Score, or TSS. This is a function of the ride’s duration, its average power, and a measure of the intensity of the ride relative to the rider’s maximum capabilities. This last term is called the Intensity Factor, and is the key to our quest.

Since blood lactate concentrations provide a reliable index or marker for a variety of physiological responses to exercise, Dr. Coggan used data from 76 trained cyclists and plotted lactate concentrations against power levels. A power function of the form

Lactate is proportional to Power^4

was the final output, providing a reliable fit of the data. Blood lactate levels tend to rise with the fourth power of power output. What this means is that a 10 percent increase in power will tend to induce a 46 percent increase in blood lactate concentration (1.1^4 is about 1.46). If you are already rolling at a power sufficient to induce 2 mmol/l of blood lactate, a 10 percent increase in power might lift your lactate concentration to 2.9 and you have started the clock ticking. You have gone above your lactate threshold and you’re not going to be able to ride all day (or even all morning).
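The 46 percent figure is easy to check for yourself. The short Python sketch below assumes only the fourth-power relationship described above; the function name and the default exponent argument are mine, and the 2 mmol/l starting point is the same illustrative value used in the text.

```python
def relative_lactate(power_ratio, exponent=4):
    """Factor by which lactate changes when power changes by power_ratio."""
    return power_ratio ** exponent


# A 10 percent increase in power is a power ratio of 1.10.
increase = relative_lactate(1.10)
print(f"{(increase - 1) * 100:.0f} percent more lactate")  # 46 percent more lactate
print(f"{2.0 * increase:.1f} mmol/l")  # 2.9 mmol/l
```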

Every foray into higher power during a ride results in a disproportionately higher stress for that bit of time. In order to properly average together each piece of the ride into a meaningful number, we have to take account of the fact that sections of the ride ridden at 10 percent more power were, in fact, 46 percent more tiring. So, that’s what we’ll do.

Taking our earlier ride data:

10 minutes at 145 watts
20 minutes at 265 watts
30 minutes at 175 watts

Instead of multiplying each power segment by the time for which it was held, we will first raise each power segment to the fourth power, then do the averaging. Finally, we’ll take the fourth root to bring everything back to normal units.

(10x145^4)+(20x265^4)+(30x175^4)

Divide that sum by 60, and take the fourth root…

216 = Normalized Power (“NP”)

If we apply the same algorithm to the constant-effort ride where we just “sat on” 200 watts for the whole 60 minutes, we get NP = 200.
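Here is a rough Python sketch of the calculation just described: raise each segment to the fourth power, take the time-weighted average, then take the fourth root. It follows the simplified, segment-by-segment version used in this article and should not be read as the full algorithm that analysis software applies to a recorded power file. The function name and data layout are my own.

```python
def normalized_power(segments, exponent=4):
    """Simplified NP for (minutes, watts) pairs, as described in the text."""
    total_time = sum(minutes for minutes, _ in segments)
    # Raise each power level to the fourth power before weighting by time.
    weighted_sum = sum(minutes * watts ** exponent for minutes, watts in segments)
    # Average, then take the fourth root to get back to watts.
    return (weighted_sum / total_time) ** (1.0 / exponent)


hilly = [(10, 145), (20, 265), (30, 175)]  # the variable-effort ride
steady = [(60, 200)]                       # 60 minutes sitting on 200 watts

print(round(normalized_power(hilly)))   # 216
print(round(normalized_power(steady)))  # 200
```

Both rides average 200 watts, but only the steady one has an NP equal to its average; the 20 minutes at 265 watts pull the hilly ride’s NP well above it.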

Now the hilly ride makes sense. By adjusting each ride segment according to its true physiological stress, and only then taking the average, we see that it wasn’t equivalent to a 200 watt constant-effort ride. It was, in fact, harder. It made us more tired, and placed a higher training load on us.

The NP algorithm forces the cyclist to take account of the disproportionately higher cost of higher-power parts of the ride. It allows us to compare rides of various types. The term “normalize” describes the practice of adjusting data so that measurements taken in different settings or times can be compared to one another. The most common everyday example is the Consumer Price Index, where the price of a basket of goods is re-calculated in terms of a similar basket bought in the 1982-1984 period. If today’s CPI is 204, that means today’s basket costs 104 percent more than the basket in 82-84. Normalizing our power files allows us to compare the stress of a ride done alone today through the hills to a ride done yesterday with the bike team around the industrial park.

The NP construct has proved controversial in some circles. The notion that we can precisely compare the stress of a hilly group ride to a solo time trial strains credulity for some cyclists and coaches. While I find that the NP algorithm does allow for reliably making such comparisons, I will avoid that debate and focus on the narrower and, in my view, more important use of the tool.

Even if we allow that the NP concept doesn’t permit fair comparisons of wildly differing rides, that’s not what we’re talking about here. An individual athlete is doing the same few routes, over and over. The NP from one of those routes done in February can be compared with great reliability to that same route done in April. If an athlete is forming a pacing strategy for a given Ironman course, the NP of doing it one way can be compared with great confidence to the NP of doing it another way. Even the NP of two different athletes on the same course can be examined with confidence.

We are most interested in changes at the margin – the effect of small changes that build over a season, or that separate a too-hard triathlon bike leg from a just-right bike leg. In my view, NP is a fantastically valuable and accurate tool when used in those contexts.

Back to TSS

We began talking about NP, and its punitive treatment of harder efforts, when we were searching for a measure of relative intensity, or IF. If we are going to decide that a given ride was some percentage of our inherent capabilities, we need a benchmark for our inherent capabilities. We might consider such measures as:

Power at Lactate Threshold
Power at Ventilatory Threshold
Power at certain HR levels

In the paper mentioned above, Dr. Coggan makes a compelling case for using a practical benchmark that doesn’t require blood samples or lab equipment. Functional Threshold (“FT”) is defined as the power that a cyclist could hold in a maximal 1-hour time trial. This parallels the “T” pace that Jack Daniels defined in his landmark run training book, Daniels’ Running Formula, which was also meant to be a maximal 1-hour effort. A 1-hour benchmark captures an athlete’s ability to generate aerobic power, and it is of a duration that is convenient to either perform (via a 40k TT race) or estimate (via 20 to 30-minute efforts).

Both Coggan’s “FT” and Daniels’ “T” are based on the simple premise that the best predictor of performance is performance itself. Basing fitness benchmarks on metabolic or cardiovascular markers is interesting, but those are once-removed from the only benchmark that really matters: How fast can you go?

While FT relates only to 1-hour power, it has been convincingly shown (by Daniels, Coggan, Hunter Allen and others) that it is a reliable predictor of the power/pace that can be held for shorter and much longer periods. The entire power “curve” shifts up when FT power goes up. More on this in future articles.

So, if FT is the benchmark for your fitness, how does NP fit in? If we compute the NP from a ride, and divide by FT, we get the Intensity Factor, or IF. I might do a 2-hour ride that results in an NP of 200 watts. If my FT at that time of year is 225 watts, then my IF for the ride was 89% (200/225). I rode those two hours at 89% of my functional threshold – a pretty tough workout.

To calculate the TSS, or the accumulated training stress from a particular ride, we use the following formula, with Duration measured in hours:

TSS = IF^2 x Duration x 100

Using the data from the 2-hour ride above (IF of .89):

TSS = .89^2 x 2 x 100 = 158
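For readers who keep their training log in a spreadsheet or script, the two formulas can be sketched in Python as follows. The function names are mine, and the sketch assumes duration is entered in hours, as in the example above.

```python
def intensity_factor(np_watts, ft_watts):
    """IF: Normalized Power as a fraction of Functional Threshold power."""
    return np_watts / ft_watts


def training_stress_score(np_watts, ft_watts, hours):
    """TSS = IF^2 x Duration (hours) x 100."""
    return intensity_factor(np_watts, ft_watts) ** 2 * hours * 100


# The 2-hour ride: NP of 200 watts against an FT of 225 watts.
print(f"{intensity_factor(200, 225):.2f}")        # 0.89
print(round(training_stress_score(200, 225, 2)))  # 158
```

A useful sanity check on the formula: one hour ridden exactly at FT gives an IF of 1 and therefore a TSS of 100.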

I would write that number in my training log. One of the goals of my bike training is to keep my weekly rolling average of TSS as high as I can manage, given my time limitations and my ability to recover on a daily basis.

Another powerful use of TSS is in forming pacing strategies for long-course racing. It reliably captures the fatigue accumulated on a triathlon bike leg. Again, more on this in future articles.

Readers seeking a deeper exposition of these concepts and how to apply them to racing and training should refer to the book, Racing and Training with a Power Meter by Hunter Allen and Andrew Coggan, Ph.D.