a bonobo humanity?

‘Rise above yourself and grasp the world’ Archimedes – attribution

bayesian stuff again, I think

It’ll never catch on…

I’ve been reading and listening to stuff on Bayesian inference/probability for – a few years, on and off, and I’m not effectively wrapping my aged brain around it, so again I’ll embark on the almost forlorn hope that writing about it will not only help, but ultimately win the day. 

Two recently read books (at least) have offered explanations to me – Steven Pinker’s Rationality (chapter 5 – ‘Beliefs and Evidence’) and Sean Carroll’s The Big Picture (chapters 9 and 10, ‘Learning about the world’ and ‘Updating our knowledge’), as well as the odd video and podcast, but I know that currently I couldn’t pass the basic test – if somebody asked me to explain Bayesian – let’s see – reasoning, statistics, inference, analysis, probability, optimisation… I’d just have to tell them something along the lines of ‘that’s for me to know and you to find out’. 

But really, I know that putting this sort of stuff into my own words has worked for me in the past, and I do have some faith in my intelligence, so here goes. I won’t go into the identity of the reverend Mister Thomas Bayes, but I’ll note, as most commentators have, that Bayesian thinking has achieved massive prominence in recent decades. The reason is that not taking account of it is probably the major reason that people’s reasoning isn’t reasonable (this is me channelling Gilbert and Sullivan). So, okay, I’ll add another essential contributor, Daniel Kahneman (in Thinking, Fast and Slow):

Bayes’ rule specifies how prior beliefs [or knowledge]… should be combined with the diagnosticity of the evidence, the degree to which it favours the hypothesis over the alternative. 

which sounds a bit abstract, but it identifies a common failing in statistical analysis. One example would be that a certain suite of symptoms suggests that a patient is suffering from condition x, but the symptoms may also fit with condition y, which is very prevalent in the community, whereas condition x is almost unheard of. That difference in prevalence should not necessarily be enough to convince you that it’s condition y, but it should affect the balance of probabilities. More data may be required. How much evidence do we need (and of what kind) to become convinced that we’re dealing with condition x rather than y?

Pinker gives the example of breast cancer. I’ll try not to quote – I want to describe it correctly in my own words! Suppose that 1% of women in a given community have breast cancer (and that this is a clear, proven statistic, aka the base rate). And suppose that a test for this cancer is 90% sensitive (it correctly flags 90% of the women who do have the cancer), and its false-positive rate is 9% (9% of the women who don’t have the cancer test positive anyway). Woman x returns a positive test. What, as a percentage, is the chance she has breast cancer? Most people, including doctors, put it at over 80%, but using Bayesian reasoning, 9% is more like it.
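The arithmetic behind that 9% can be sketched in a few lines of Python. This is only an illustration using the three figures from the example; the function name `posterior` and its parameter names are assumptions made for the sketch, not anything taken from Pinker’s book.

```python
def posterior(base_rate, sensitivity, false_positive_rate):
    """Chance of having the condition, given a positive test (Bayes' rule)."""
    # Share of the whole population that has the condition AND tests positive.
    true_positives = base_rate * sensitivity
    # Share of the whole population that is healthy BUT tests positive anyway.
    false_positives = (1 - base_rate) * false_positive_rate
    # Of everyone who tests positive, what fraction actually has the condition?
    return true_positives / (true_positives + false_positives)


# Figures from the example: 1% base rate, 90% sensitivity, 9% false positives.
p = posterior(base_rate=0.01, sensitivity=0.90, false_positive_rate=0.09)
print(f"Chance of cancer given a positive test: {p:.1%}")
```

Put in head-counts: out of 1,000 women, 10 have the cancer and 9 of them test positive, while about 89 of the 990 healthy women also test positive. So only 9 of the roughly 98 positive tests belong to women who actually have the cancer.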

Of course, this is presented as an abstract case, so those tested easily forget the background statistic, the base rate (1%), and focus on the test statistic. If they were dealing with a real case they might observe other factors besides the diagnostic test, symptoms which might increase that 1% figure in their experience.
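How much might such extra factors matter? As a rough sketch, the same test figures (90% sensitivity, 9% false positives) can be run against a few different starting rates. The higher priors below (5%, 20%, 50%) are invented purely for illustration and come from no study; the function name is likewise hypothetical.

```python
def posterior(prior, sensitivity=0.90, false_positive_rate=0.09):
    """Posterior chance of the condition after one positive test."""
    hits = prior * sensitivity                      # has it, tests positive
    false_alarms = (1 - prior) * false_positive_rate  # doesn't, tests positive
    return hits / (hits + false_alarms)


# 1% is the base rate from the example; the other priors are made up.
priors = [0.01, 0.05, 0.20, 0.50]
results = {prior: posterior(prior) for prior in priors}

for prior, post in results.items():
    print(f"prior {prior:.0%} -> posterior {post:.0%}")
```

The same positive result means very different things depending on where you start: with these assumed priors the posterior climbs from about 9% to roughly 34%, 71% and 91%.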

Can all this be quantified? In the case of a doctor experienced in diagnosing breast cancer, not so easy, but in the abstract case? And in the experienced case, how reliable is the experience? I’m probably setting myself too hard a task here (but to what degree of probability?). In any case, prob(hypothesis) is used to represent our degree of credence in the hypothesis (that x has breast cancer) before we look at the test result; this is the prior, here the base rate. That credence then shifts according to the evidence, or the data. So, prob(hypothesis | data) represents the posterior probability, our credence upon examination of said data.
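Written out, the rule connecting these quantities is short. The block below uses the prob(...) notation from the paragraph above and plugs in the figures from the breast cancer example, with ‘data’ standing for the positive test; the layout is just one way of setting it out, not a reproduction of Pinker’s version.

```latex
\[
\mathrm{prob}(\text{hypothesis} \mid \text{data})
  = \frac{\mathrm{prob}(\text{data} \mid \text{hypothesis}) \times \mathrm{prob}(\text{hypothesis})}
         {\mathrm{prob}(\text{data})}
\]
\[
\mathrm{prob}(\text{data}) = 0.90 \times 0.01 + 0.09 \times 0.99 = 0.0981,
\qquad
\mathrm{prob}(\text{hypothesis} \mid \text{data}) = \frac{0.90 \times 0.01}{0.0981} \approx 0.092
\]
```

The denominator counts every way of getting a positive test, whether from women who have the cancer or from women who don’t, which is why the false positives swamp the true ones when the base rate is low.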

One of the problems in our everyday reasoning is that we either ignore base rates or have little idea what they are, and having no clear idea might make that base rate diminish or rise in our estimation. Then again, there are many who believe in things, and fervently, when the base rate, or the prior occurrence, appears to be exactly nil. In fact, that is why they believe in them. Pinker quotes the most excellent Scottish philosopher, David Hume:

No testimony is sufficient to establish a miracle, unless the testimony be of such a kind, that its falsehood would be more miraculous, than the fact, which it endeavours to establish. 

If it wisnae incredibly rare and unbelievable an aw that, then it widnae be a miracle, ye ken? So it pays to be skeptical. Pinker presents an equation, of sorts, which I think is more confusing than it needs to be, considering that Bayesian thinking is more about the likelihood of a hypothesis given background information or prior knowledge, and using quasi-equations doesn’t help with this (to my non-mathematical mind).

A fine example is given by the vlogger Gutsick Gibbon (aka Erica) in analysing the interestingly large differences between the Y chromosomes in chimps and bonobos, and those in humans. Creationists (a semi-endangered species mostly found in the USA) have leapt upon these differences as evidence that chimps and bonobos are fundamentally different ‘created kinds’ from humans – created, that is, by their deity. In so doing, they’ve swept aside all the priors, such as our morphological and anatomical similarities vis-a-vis Pan paniscus and Pan troglodytes, and particularly the extreme closeness of our respective genomes in general. The Y chromosome contains the male sex-determining and developmental genes, and this chromosome, in both chimps and bonobos, does indeed differ substantially from the human version. Though this is curious, it hardly supports a creationist thesis (which takes us into miracle territory). Gutsick Gibbon, in a video linked below, takes us through a scientific paper which explains, based on the researchers’ hypothesis, why these two Pan species have Y chromosomes that differ markedly, not only from humans, but from other ‘great apes’, such as gorillas and orangutans.

As to the hypothesis, it’s about ‘elevated substitution rates’, about which I understand very little, but they have to do with ‘inversions’, ‘insertions’, ‘palindromic architecture’ and other technical genomic terminology, the overall point being that a lot of evolutionary development has occurred in the Y chromosomes of the Pan species in a very short amount of time, on evolutionary time scales. Here’s a couple of samples from the paper, highlighted by Erica:

The phylogenetic analysis of multi-species alignments for the X chromosome, and separately for the Y chromosome, revealed the expected species topology, but detected higher substitution rates on the Y chromosome than on the X chromosome for all the branches, consistent with male mutation bias. 

These results indicate a stronger male mutation bias for the Pan lineage and a weaker bias for the Pongo [orangutan] lineage than for the human lineage. Strong male mutation bias in the Pan lineage is consistent with increased sperm production due to sperm competition. 

Erica, and the paper, go into a lot more technical detail, of course, in their areas of expertise, but as she points out, this is a perfect example of the testing of reasonable hypotheses. As to the relationship with Bayesian probability or inference, the priors help to inform you of what to look for. Obviously these researchers are not looking for evidence of ‘created kinds’, as that’s a dead-end ‘solution’, ultimately the product of a once-local religion made good, to the point that some of its adherents still wish to impose it on all humanity. And while my understanding of genomics is fairly limited, I can understand the importance of sperm competition in non-monogamous species well enough.

I’m not sure if all this has dealt with Bayesian stuff effectively, but I’m clearer about the general idea, I think, and that’s good enough, for now. There’s a whole area of expertise called Bayesian statistics, though, which I’m content to avoid. For now….?

References

Steven Pinker, Rationality, 2021

Sean Carroll, The Big Picture, 2016

Daniel Kahneman, Thinking, Fast and Slow, 2011

Written by stewart henderson

July 6, 2024 at 9:28 am