An important unpaid resource of the Financial Times is its stable of intelligent, observant readers. Here is an example. Last week the paper published an op-ed written by British-Indian science journalist Anjana Ahuja.
Today, the FT published the letter of a reader who caught an error in Ms Ahuja’s column.
[Anjana Ahuja] states that “most American babies born in 1900 failed to live past 50”: it is true that life expectancy in the US in 1900 was approximately 47 years. However, it needs to be recalled that a life expectancy calculation is an average or mean; to be precise, a weighted average expected number of years of future life as of a given point in time, in this case, birth. However a claim about the age below which the majority of a given birth cohort will die, is a statement about the median age at death rather than the mean age at death.
Life expectancy calculations at birth, especially those calculated before the mid-20th century, can deviate significantly from the corresponding median ages at death due to the effect of infant and childhood mortality rates. As it turns out the calculated median age at death for the US 1900 birth cohort was more like 56 or 57.
Peretz Perl, “How life expectancy calculations deviate”, letter to the editor, Financial Times, 21 December 2016 (metered paywall).
This is an important point. The mean (simple average) and the median measure different things. To illustrate with an exaggerated, hypothetical case, suppose we have a sample of 100 individuals born in the same year, 40 of whom die at age 5, 50 at age 50 and 10 at age 70. The mean life expectancy at birth for this cohort would be 34 years: (40 × 5 + 50 × 50 + 10 × 70) / 100. But the median life expectancy would be 50 years. Why? Because this is the age at death of the median person: arranging all individuals in order of their years of life, the 50th and 51st (the middle of the cohort!) both die at age 50.
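For readers who like to check such things, here is a minimal sketch of the same arithmetic in Python, using only the standard library. The variable names are my own, and the cohort is the made-up one from the paragraph above, not real data.

```python
from statistics import mean, median

# Hypothetical cohort from the example above: (age at death, number of people)
cohort = [(5, 40), (50, 50), (70, 10)]

# Expand into one age-at-death entry per person (100 entries in total)
ages_at_death = [age for age, count in cohort for _ in range(count)]

mean_age = mean(ages_at_death)      # sum of all ages divided by 100
median_age = median(ages_at_death)  # middle value of the sorted ages

print("mean age at death:", mean_age)
print("median age at death:", median_age)
```

The mean is dragged down by the 40 early deaths, while the median only looks at where the middle of the ordered list falls.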
I may be wrong, but my impression is that few in a population of supposedly well-educated people are able to distinguish between a mean (simple average) and a median. If Ms Ahuja, who has a PhD in space physics, confuses the two measures, anyone can.
The median is a useful concept. I think it should be taught in middle school, or even earlier.
Alas, life expectancy changes over time, typically increasing, so calculation of the median for a specific cohort becomes more difficult. Mr Perl goes on to explain that, for this reason, the “‘calculated’ median understates the actual historical median age at death for that cohort”:
That’s because the standard calculation for such a median age at death assumes that the probabilities of death at each age are frozen at the rates that were applicable in the year of calculation — in this case 1900. The true median age at death for the 1900 US birth cohort was dramatically impacted by the ongoing improvements in longevity as the cohort made its way through the 20th century.
For example, the probability that an American born in 1900 who had survived to the age of 40 would die before age 41 was significantly lower when he or she actually reached age 40 in 1940 than what it was thought to be in when they were born in 1900, and so on. The actual median age at death to which most Americans born in 1900 survived was closer to 60.
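Mr Perl's distinction can be illustrated with a rough sketch. The mortality rates below are invented for the purpose and deliberately simplified; they are not US data, and the function names are my own. The first calculation freezes the death probabilities at their birth-year values, while the second assumes that the risk at each age has fallen by the time the cohort actually reaches that age.

```python
def median_age_at_death(death_prob, max_age=110):
    """Return the first age by which at least half of a birth cohort has died.

    death_prob(age) is the probability that someone alive at `age`
    dies before reaching `age + 1`.
    """
    surviving = 1.0
    for age in range(max_age):
        surviving *= 1.0 - death_prob(age)
        if surviving <= 0.5:
            return age + 1
    return max_age


# Hypothetical rates, NOT real data: heavy mortality in the first year
# of life, then a flat 1.2% chance of dying in each later year.
def frozen_rate(age):
    return 0.12 if age == 0 else 0.012


# Assumption for illustration: the risk at each age falls by 1% for every
# calendar year that passes before the cohort reaches that age.
def improving_rate(age):
    return frozen_rate(age) * (0.99 ** age)


period_median = median_age_at_death(frozen_rate)     # rates frozen at birth year
cohort_median = median_age_at_death(improving_rate)  # rates improve over time

print("median with frozen rates:", period_median)
print("median with improving rates:", cohort_median)
```

Under these assumptions the frozen-rate median comes out lower than the improving-rate one, which is the direction of the effect Mr Perl describes. The size of the gap here is an artefact of the invented numbers.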
Mr Perl displays an impressive knowledge of demography and statistics, far beyond what might be expected even from a Financial Times reader! I googled “Peretz Perl” and discovered he is an actuary at New York-based TIAA-CREF, one of the largest pension funds in the world. That explains everything.
We cannot all become actuaries. But it should be possible for all of us to distinguish between a mean and a median. The concepts are not difficult and, in my opinion, could be taught to students from an early age.