This post is a bit of "thinking aloud" - I have some data, but not a full answer for why the data shows what it does.
We know that the DNA passed on by the same two parents to their children varies: although every child receives half their DNA from each parent, the amount shared between siblings depends on which 'bits' of the parents' DNA each of them received. As relationships become more distant, the quantity of DNA shared becomes even more variable for a given level of relationship.
This is why, for a specific quantity of shared DNA, several possible relationships are often predicted by the DNA testing companies.
However, I have been interested to see the quantities of DNA shared within my own family, as more of us have tested over the years.
I do have quite a few second cousin matches now, thanks to my grandmother being one of ten, but I'm concentrating here on just four of them - a single second cousin from my grandfather's side, and three second cousins from my grandmother's side, who are siblings to each other - and comparing them to myself and two of my first cousins. This is because the closer relationships, of the siblings to each other, and of the first cousins to each other, are confirmed through the shared DNA, as well as the known family history.
So this is how we all relate to each other:
And these are the levels of shared DNA:
Below is a table of the averages, and ranges, of shared cM for particular relationships, taken from the DNAPainter diagram:
So, with the exception of the 39 cM shared between match 5 and me, and of the 23 cM shared between matches 1 and 6, all of the values do actually fall within the range for possible second cousins.
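For anyone who wants to run the same check on their own matches, here is a minimal sketch of the comparison. The 41 to 592 cM range for second cousins is my assumption of the Shared cM Project 4.0 figures, so do check it against the table you are using; the names `CmRange`, `SECOND_COUSIN_RANGE` and `within_range` are just made up for this illustration.

```python
from typing import NamedTuple


class CmRange(NamedTuple):
    """Lowest and highest shared cM reported for a relationship."""
    low: float
    high: float


# ASSUMPTION: second cousin range believed to be from the Shared cM Project 4.0.
# Replace with the figures from whichever table you are working from.
SECOND_COUSIN_RANGE = CmRange(low=41, high=592)


def within_range(shared_cm: float, cm_range: CmRange) -> bool:
    """Return True if the shared cM value lies inside the range (inclusive)."""
    return cm_range.low <= shared_cm <= cm_range.high


# The two values from this post that fall outside the second cousin range.
for cm in (39, 23):
    print(f"{cm} cM within second cousin range: {within_range(cm, SECOND_COUSIN_RANGE)}")
```

Being inside the range only says the relationship is possible; it says nothing about how probable it is, which is the point of the next part.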
However, the probability of the relationships being second cousins (or even half second cousins) seems to be classed as fairly low for many of the values:
I have included the Ancestry predictions for the relationships in the following table:
As you can see, only two of the relationships (highlighted in yellow) are predicted to be possible second cousins. If there is a "half relationship" situation, another two of the predictions (highlighted in pink) would be okay.
But Ancestry's predictions for all the other comparisons are for more distant relationships.
When I received that very first match, one of the first things I did was to put the shared DNA figure into a predictor and, if it hadn't been for the match then being able to give me a couple of names that I recognised, I would probably have been looking at the wrong generation of my tree, at least initially, to try to find our shared ancestry.
It's obviously not impossible, though, so I'm not discounting it and will continue to explore the possibility, through the clustering of other shared matches.
So, one point I am trying to make is the importance of "doing the genealogy" and not just relying on such predictions. Does the predicted relationship fit with the known family history, with ages, and with locations, etc.? If not, don't just assume the "most probable" prediction is the correct one.
Another possibility I have wondered about is whether the predictions from companies such as Ancestry, and from the Shared cM Project, might have a tendency to predict more distant relationships for those of us in the UK. This could be due to much of the data coming from people with ancestry in the US. It seems those in the US often have many more matches than those of us in the UK and, potentially, a higher level of "overlapping ancestors", which might create a higher level of shared DNA for particular relationships and thus 'bias' the predictions.
I don't know enough about the wider field of DNA statistics to know whether that is possible, or whether other people in the UK have found similarly lower levels of shared DNA.
But I shall certainly be checking the predictions for all my other identified DNA matches more closely in future, to see if those show the same tendency.







