Monday, 23 September 2013

Cheating at Statistics

Here is an example from the Second World War, from the gigantic field of slaughter that was the Eastern Front from 1941 to 1945. Who was it who said that there are three types of lies - lies, damned lies and statistics? Anyway, here's an instance of how the same statistical term, used differently by two sides, can cause a serious discrepancy in figures. Courtesy of Tank Archives, here are links to two parts of a five-part series on statistics as used by Nazi Germany and the Soviet Union -

Cheating at Statistics, Part 1
Cheating at Statistics, Part 3

Parts 2, 4 and 5 deal with plain old cheating at statistics - hopelessly inaccurate and miscalculated figures, deliberately fudged or misstated figures, and figures that were flat-out dreamed up. But Parts 1 and 3 highlight a far greater issue than simple cheating at figures - they deal with legitimate statistics that can lead to grave error. This sort of thing goes way beyond counting knocked-out tanks or downed aircraft - it has implications everywhere in statistics. As Inigo Montoya tells Vizzini in The Princess Bride: "You keep using that word. I do not think it means what you think it means."

In Part 1, the Luftwaffe listed aircraft that were shot down or wrecked on the ground as partial losses: 40% or 50%, depending on how much of the aircraft was recoverable. This is a sensible metric when thinking of an airplane or a tank or a truck as a collection of resources and parts: "40% of the resources by value that went into the aircraft, or 40% of the thousands of parts that went into the aircraft, were lost." What it hides, however, is that the aircraft or tank as a fighting unit was out of action. While it may be a cliche to say that the whole is more than the sum of its parts, that applies literally in the case of complex machinery like tanks and aircraft. The consequence of the German method of counting losses, as noted in this article, is that no vehicle is ever a total loss unless it is lost in land captured by the enemy.

The Russian approach, described in Part 3, is to count every single unit that's out of action as a loss. Vehicles stuck in mud and needing repairs, knocked-out tanks that could simply have the holes patched up and be sent back into battle, vehicles in long-term repairs or refurbishment and so on - in effect everything that was even temporarily taken out of the fighting - became a loss. The end result is that Russian loss statistics are enormously inflated while German loss statistics are far below what you would expect.
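To make the gap concrete, here is a rough sketch in Python of the two counting conventions applied to the same made-up group of five vehicles. Everything in it is an assumption for illustration: the vehicles, the damage fractions, the field names, and the idea of simply adding up the partial-loss percentages are mine, not figures or procedures taken from the Tank Archives articles.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    name: str                 # hypothetical label, not a real vehicle record
    damage_fraction: float    # share of the vehicle written off, 0.0 to 1.0
    operational: bool         # can it fight right now?
    in_enemy_territory: bool  # wreck left on ground the enemy holds


def fractional_losses(fleet):
    """Parts-and-resources convention (assumed reading of the German method):
    each damaged vehicle counts only for its written-off fraction, and it is a
    full loss only when the wreck is left in enemy-held territory."""
    return sum(1.0 if v.in_enemy_territory else v.damage_fraction for v in fleet)


def out_of_action_losses(fleet):
    """Fighting-unit convention (assumed reading of the Soviet method):
    anything that cannot fight right now counts as one whole loss."""
    return sum(1 for v in fleet if not v.operational)


# Invented example data: five vehicles after one hypothetical engagement.
fleet = [
    Vehicle("A", 0.4, False, False),  # knocked out, recovered, "40% loss"
    Vehicle("B", 0.5, False, False),  # knocked out, recovered, "50% loss"
    Vehicle("C", 0.0, False, False),  # undamaged but stuck in mud
    Vehicle("D", 1.0, False, True),   # wreck abandoned to the enemy
    Vehicle("E", 0.0, True, False),   # still fighting
]

if __name__ == "__main__":
    print(f"Fractional count:    {fractional_losses(fleet):.1f}")
    print(f"Out-of-action count: {out_of_action_losses(fleet)}")
```

With this invented data the same battlefield yields about 1.9 losses under the fractional count and 4 under the out-of-action count. Neither number is a lie; they just answer different questions.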

I don't know how the Americans or the British counted their lost tanks, but this sort of information could clear up a whole lot of confusion. Bashing Russian equipment has become a favorite pastime among some fans of US military equipment, who point at the T-34's colossal loss statistics to suggest that the tank was vastly inferior and that its superiority was largely a propaganda myth (completely ignoring the praise heaped on it by the T-34's German evaluators, including Erwin Rommel and Heinz Guderian). It didn't help that the T-34 sent to Aberdeen in the United States was an old vehicle that had been built in a hurry and repaired after repeated battle damage. Having the statistics for T-34 and Panzer losses cleared up helps put the counter-claim about the T-34's ineffectiveness to rest: the T-34 truly was the "Queen of Tanks" and a design to make the world take notice.

