In honour of the ongoing Euro 2016 championships, our fifth and latest instalment in the increasingly inaccurately named Jellybean Trilogy on flaws in academic research kicks off with a study that ostensibly set out to discover whether football referees are more likely to give red cards to some players than to others.
A total of 29 academic teams were given the same dataset, with each using a different statistical approach to investigate referee bias. They used a wide variety of methods – from linear regression techniques to complex multi-level regressions and Bayesian approaches, whatever any of that means – and came up with a wide variety of results. Taken together, the studies did point to a single answer – yes – and yet nine teams actually found no significant relationships in any of the data they were given.
As an article called Science isn’t broken on science blog FiveThirtyEight observes: “The variability in results wasn’t due to fraud or sloppy work. These were highly competent analysts who were motivated to find the truth.” The point to bear in mind here, however, is that “even the most skilled researchers must make subjective choices that have a huge impact on the result they find”.
While the 29 teams may not have been fraudulently designing their study to come up with a particular answer, the practice of tweaking variables to achieve an acceptable – for which read ‘publishable’ – result has become prevalent enough to warrant a name. It is called ‘p-hacking’, with the ‘p’ standing for the p-value and 0.05 – a 1-in-20 chance – the threshold below which a finding counts as statistically significant.
The FiveThirtyEight article makes its point with an interactive game that exhorts you to “Hack your way to scientific glory”. The proffered political and economic variables lead to 1,800 combinations, of which 1,078 yield a publishable p-value. Having swiftly ‘proved’ – with a p-value below 0.01 – that Democrats have a negative effect on the economy we were congratulated: “Get ready to be published!”
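For the statistically curious, the arithmetic behind p-hacking can be sketched in a few lines of Python. This is a toy simulation of our own – coin-flip ‘studies’ rather than FiveThirtyEight’s political and economic variables, and all the numbers are made up – but it shows the mechanism: run one pre-registered test on pure noise and you are fooled about 1 time in 20; try 20 different analyses and report the best one, and a ‘significant’ result from nothing becomes closer to a coin toss itself.

```python
# Toy p-hacking simulation (our own sketch, not the FiveThirtyEight dataset).
# Each 'study' flips a fair coin N times and asks whether the result is
# 'significant' at the 5% level under a two-sided binomial test.
import math
import random

random.seed(1)

N = 50  # coin flips per 'study'
# Precompute the binomial probability of each possible head count.
PMF = [math.comb(N, k) * 0.5**N for k in range(N + 1)]

def p_value(heads):
    """Two-sided binomial p-value: probability of an outcome at least as extreme."""
    return min(1.0, sum(p for p in PMF if p <= PMF[heads] + 1e-12))

def one_study():
    heads = sum(random.random() < 0.5 for _ in range(N))
    return p_value(heads)

trials = 2000
# An honest researcher runs one pre-specified analysis per dataset...
honest = sum(one_study() < 0.05 for _ in range(trials)) / trials
# ...while a p-hacker tries 20 analysis variants and keeps any that 'works'.
hacked = sum(any(one_study() < 0.05 for _ in range(20)) for _ in range(trials)) / trials

print(f"one pre-registered test: {honest:.0%} false positives")
print(f"best of 20 analyses:     {hacked:.0%} false positives")
```

Both researchers are analysing pure noise; the only difference is how many ways they were allowed to slice it.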
As you might expect from a science website, the article is actually pro-science and it makes the point that, well, science is hard. It goes on to argue: “If we’re going to rely on science as a means for reaching the truth – and it’s still the best tool we have – it’s important that we understand and respect just how difficult it is to get a rigorous result.”
Well, fair enough, but science is going through a bit of an image problem at the moment. What is more, people are beginning to notice – and some of them have a sense of humour. In Jellybean IV, for example, we highlighted the spoof research that chocolate could aid weight loss, which owed a lot of its PR success to its publication – always after paying a fee – in a range of academic journals.
Topping that is the story that an academic journal published a paper called ‘Get me off your ******* mailing list’, which consisted only of those seven words repeated again and again for 10 pages. Then there was John Oliver’s recent dissection of scientific studies in Last Week Tonight, which covered a number of The Jellybean Trilogy’s main points – with a few more jokes and a lot more swearing.
Arguably of greater concern than some journals’ apparent willingness to publish whatever they are emailed, so long as there is a cheque attached – or even the supposed positive or negative impact of almost any substance on almost any aspect of human health – is the fact that many scientific studies either cannot be repeated or are not repeated, because there is little incentive for researchers to do so.
And even when studies are repeated, with the original conditions replicated as far as is possible, they can come up with a different answer. Another FiveThirtyEight article highlights a 2011 project that replicated 100 well-known psychological studies and found that, while 97% of the originals produced ‘statistically significant’ results, only 36% of the repeats did the same.
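A toy simulation – ours, with invented numbers rather than the project’s actual data – illustrates one reason that gap can open up. If journals mainly publish ‘significant’ originals, and the underlying studies have modest statistical power, then exact repeats of a perfectly real effect will clear the 0.05 bar far less often than the 97% headline rate suggests:

```python
# Toy replication simulation (our own sketch; the effect size, sample size
# and resulting rates are assumptions, not the Reproducibility Project's data).
import math
import random

random.seed(2)

N = 100  # coin flips per 'study'
# Two-sided 5% significance test against a fair coin, via a precomputed pmf.
PMF = [math.comb(N, k) * 0.5**N for k in range(N + 1)]

def significant(heads):
    return sum(p for p in PMF if p <= PMF[heads] + 1e-12) < 0.05

def study(true_p=0.55):
    """One study of a real but modest effect: a coin that is genuinely biased."""
    return significant(sum(random.random() < true_p for _ in range(N)))

# 'Publish' only the significant originals, then replicate each one exactly.
published = [study() for _ in range(5000)]
originals = [r for r in published if r]      # the journals' view: 100% significant
replications = [study() for _ in originals]  # same true effect, same sample size
rate = sum(replications) / len(replications)

print(f"replication success rate: {rate:.0%}")
```

Every study here is probing a genuine effect; the replications disappoint simply because selecting originals for significance hides how often the test misses.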
At this point, regular visitors to The Value Perspective may well be reckoning there is a statistically significant chance we are about to conclude this piece by observing how numerous academic studies have shown time and again that value tends to outperform other styles of investing over the longer term.
Tempting as that may be, we shall resist – just as we shall resist quoting Keynes’ well-rehearsed dictum on changing facts. Instead, we will turn to the Nobel prize-winning physicist Richard Feynman, whom Wikipedia assures us is known for his work in “the path integral formulation of quantum mechanics, the theory of quantum electrodynamics and the physics of the superfluidity of supercooled liquid helium”. Whatever that means, we can take it as read the man was a genius.
“There are myths and pseudo-science all over the place,” he observed. “I might be quite wrong, maybe they do know all this ... but I don't think I'm wrong – you see, I have the advantage of having found out how difficult it is to really know something. How careful you have to be about checking the experiments. How easy it is to make mistakes and fool yourself.
“I know what it means to know something. And therefore, I see how they get their information and I can’t believe that they know it. They haven’t done the work necessary, they haven’t done the checks necessary, they haven’t taken the care necessary. I have a great suspicion that they don’t know.”