It may be obfuscating the issue to suggest that p-values alone are not strongly suggestive of some genuine effect in an experiment like this. The reason is that, unless the experiment is flawed, the only probability distribution of results that can occur without an interesting effect is that of the null hypothesis. Of course, a single p-value like 0.009 is not by itself indicative of anything but a fluke: run 100 experiments and this sort of result is rather likely. This is why the particle physics community demands p-values of 0.0000003 (the "five sigma" threshold) before using the term "discovery". I don't believe they qualify this with a Bayesian criterion: the null hypothesis is simply the hypothesis that the effect is not real.
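The claim that a p-value like 0.009 is "rather likely" somewhere among 100 experiments is easy to check directly; this sketch just computes the chance that at least one of 100 independent null experiments produces such a result (the numbers 0.009 and 100 come from the text above, the rest is elementary probability):

```python
# Chance of at least one p-value <= p_single among n independent
# experiments in which the null hypothesis is true. Under the null,
# a p-value falls below p_single with probability exactly p_single,
# so: P(at least one) = 1 - (1 - p_single)^n.
p_single = 0.009
n_experiments = 100
p_at_least_one = 1 - (1 - p_single) ** n_experiments
print(p_at_least_one)  # roughly 0.6
```

So a lone 0.009 really is unremarkable across 100 tries, whereas the particle physicists' 0.0000003 threshold keeps the same 100-experiment false-alarm chance down around 0.00003.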
It is perhaps inaccurate to say that no effect was discovered years ago: for example, the book "Entangled Minds" by Dean Radin (2006) presents meta-analyses of a wide range of possible effects from experiments over several decades and finds qualitatively similar weak effects for most of them. The p-values are much lower in some cases, with much larger effective sample sizes. If I understand the data correctly, the effects cannot be adequately explained by the hypothesis that some of the experiments are flawed or fraudulent, because the distribution of results across different experiments looks much more like a weak effect with sampling variation. I would welcome the views of an expert like McConway on Radin's claims.
So it may be more accurate to say that (1) these effects have not been accepted as definitely genuine by mainstream psychology (or other sciences) and (2) there is little or no understanding of mechanisms that might explain the effects.
I feel it is this lack of any real scientific understanding of the nature of the purported phenomenon that blocks acceptance, rather than the statistics themselves. Weaker statistics have been used (rightly, in my opinion) to make major strategic decisions in other fields. Given a strong p-value (say 0.000001) in an experiment, two intelligent people can come to two different conclusions. The first might say "unless this experiment is faulty or fraudulent, there is very likely a real effect here". The second might say "this experiment is almost certainly faulty or fraudulent, since the conclusion is ridiculous". If ESP effects are as weak as Radin's analysis and this experiment (and many others) suggest, it would take a rather large experiment (or a genuine effect plus a bit of luck) to reach even these sorts of p-values, though more extreme ones can be found by combining data from many different experiments.
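To illustrate the last point about combining data: Fisher's method is one standard way to pool independent p-values (meta-analyses such as Radin's typically use more sophisticated effect-size models, so this is just a sketch of the principle, with made-up example numbers). Under the null, −2 Σ ln pᵢ follows a chi-squared distribution with 2k degrees of freedom, and for even degrees of freedom the tail probability has a closed form:

```python
from math import exp, factorial, log

def fisher_combined_p(p_values):
    """Combine k independent p-values with Fisher's method.

    Under the null, X = -2 * sum(ln p_i) ~ chi-squared with 2k degrees
    of freedom. For even degrees of freedom the survival function is
    exp(-x/2) * sum_{i<k} (x/2)^i / i!, so no external libraries are needed.
    """
    k = len(p_values)
    half = -sum(log(p) for p in p_values)  # x/2
    return exp(-half) * sum(half ** i / factorial(i) for i in range(k))

# Five individually unconvincing results (hypothetical numbers)...
weak_results = [0.04, 0.10, 0.06, 0.20, 0.03]
# ...combine into a far more suggestive overall p-value, roughly 0.003.
print(fisher_combined_p(weak_results))
```

This is exactly why several mediocre experiments can jointly reach p-values no single one of them could, provided the same weak effect really is present in all of them.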
Liam Roche