In January, I mentioned that I had now tested with MyHeritage, rather than just uploading kits from other DNA companies to that site, since MyHeritage have now brought in "Whole Genome Sequencing" (WGS). At the time, my kit was still in the processing stage, and I was looking forward to receiving the results, and the possibility of comparing them to my other kits there.
So this post is the beginning of any comparisons, looking at the new test compared to the upload I did, in November 2016, of the data from my FTDNA test.
[Please note, this is just my personal exploration of the results I have - I don't keep up with the "bigger picture" of what's happening regarding genetic genealogy so, if you're looking for more detailed analysis and comments regarding the WGS test, I suggest reading the comparisons carried out by someone such as Roberta Estes, on her blog, "DNAeXplained – Genetic Genealogy". I also don't have a subscription to MyHeritage, which might affect the level of detail available for my kits.]
One of my expectations of the new test was that there would be fewer matches than I have with the FTDNA kit. This was because, in order to compare kits from different companies, who might not all test exactly the same points in the DNA, MyHeritage uses a process of 'imputation'. This process 'fills in' gaps in the sequences. Although imputation seems to be a common process used by all of the DNA companies, and is carried out in accordance with specific principles, it can potentially lead to cases where people are incorrectly identified as matches (and possibly vice versa). Looking at my "new match" notifications from MyHeritage in the past, I've often thought that could be the case, with many of the matches showing low levels of shared DNA. Hence my expectation that the better coverage of the new test would discount these lower level matches.
But I was wrong!
When I first received my results, the new kit showed a total number of matches of 17019, whereas my FTDNA uploaded kit showed 16531. The totals as at the time of writing this (23/3/26) are 17328, and 16820 respectively.
It's currently not possible to download a list of all one's matches at MyHeritage, although that used to be possible. So I opted for a 'cut & paste' collection of the closest 2000 matches to each kit, and put those into two spreadsheets (and, yes, that did take a while). For both kits, this resulted in the lowest matches having a total shared DNA of around 19cM to 20cM.
I then did a fairly simple comparison between the two spreadsheets, and discovered that almost half of the names in each spreadsheet did not appear in the other sheet:
It is possible (and, I imagine, quite likely) that many of those who only appeared on one sheet do match the other kit but are beyond the first 2000 matches. I haven't specifically looked at many of the matches to check that yet, given the numbers of "No" matches involved. But, with my 'close' and 'extended' family only accounting for 17 of the matches, and the rest all being 'distant', I can imagine small changes in the levels of shared DNA could make quite a change to the order in which they appear on my match list.
It was also obvious from those figures that there was something a bit 'odd' with the comparisons, since one would expect the same number of 'yes' matches in each sheet.
The difference was caused by the fact that I had only compared names. I expected there might be differences in the levels of shared DNA, so hadn't included that information in the comparison criteria, but I didn't think to include other items, such as age, where the matches were from, or who manages the DNA. That left several anomalies:
- five names appeared twice in the MyH sheet, but only once in the FTDNA sheet.
- three matches appeared twice in the FTDNA sheet, but only once in the MyH sheet.
- ten matches in the MyH sheet were only identified as "DNA kit", an increase of three from the number of such matches in the FTDNA sheet.
- three 'private' matches in the FTDNA sheet did not appear in the MyH sheet.
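For anyone wanting to repeat this sort of check outside a spreadsheet, here is a minimal sketch in Python of a name-only comparison. It is not the method used for the figures in this post, the names are made up purely for illustration, and `compare_by_name` is a hypothetical helper. It shows why duplicated names make the 'yes' totals differ between the two sheets.

```python
from collections import Counter


def compare_by_name(sheet_a, sheet_b):
    """Count the rows in sheet_a whose name does / does not appear in sheet_b."""
    names_b = set(sheet_b)
    yes = sum(1 for name in sheet_a if name in names_b)
    return yes, len(sheet_a) - yes


# Hypothetical, made-up names (not real matches).
myh_names = ["A Smith", "A Smith", "B Jones", "DNA kit", "C Brown"]
ftdna_names = ["A Smith", "B Jones", "DNA kit", "D Green"]

myh_yes, myh_no = compare_by_name(myh_names, ftdna_names)
ftdna_yes, ftdna_no = compare_by_name(ftdna_names, myh_names)

# Names that occur more than once in a sheet are the ones to check by hand.
duplicated_in_myh = [name for name, count in Counter(myh_names).items() if count > 1]

print("MyH sheet:   yes =", myh_yes, "no =", myh_no)
print("FTDNA sheet: yes =", ftdna_yes, "no =", ftdna_no)
print("Duplicated in MyH sheet:", duplicated_in_myh)
```

Because "A Smith" is listed twice in one sheet and once in the other, the first sheet reports one more 'yes' than the second, which is the same kind of mismatch described above.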
At this point, I copied all the "no" entries into one spreadsheet, and the "yes" entries into another, and physically aligned all the "yes" entries for the two kits, so that I could investigate how the shared DNA levels might have changed. I took out all of the 'anomalous' entries identified above, leaving 1009 entries which appeared in both kits.
[Note, I have also now re-run the comparison between the sheets, having concatenated the "name", "age", "from", "managed by", "contact", and "tree or not" items. Doing so identified just three entries that didn't match up correctly. Two of them were cases where there was one entry in the MyH sheet but two in the FTDNA sheet, and I had picked the wrong one to include. One of these would make no difference to the figures; the other would increase the number of kits that have gained one segment. The third entry was a mismatch between two kits labelled as "unknown" that I hadn't spotted. I don't like that sort of error so, in the following, I have removed that kit (leaving 1008 in both 'yes' lists), and also updated the other two entries with the correct matching details. The new comparison also showed that I could have included some of the entries just identified as "DNA kit" in the following comparisons, since they can be matched up across the two kits. However, I haven't added them in, since none of them are particularly close matches, or make a noticeable difference to the figures/charts.]
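The re-run comparison described in the note could be sketched along these lines. This is only an illustration under stated assumptions: the field names mirror the items listed above, but the rows, the `row`, `composite_key` and `match_rows` helpers, and the choice to lower-case and trim each item are all hypothetical.

```python
KEY_FIELDS = ["name", "age", "from", "managed by", "contact", "tree"]


def row(name, age, origin, manager, contact, tree):
    """Build one match-list row (hypothetical layout)."""
    return {"name": name, "age": age, "from": origin,
            "managed by": manager, "contact": contact, "tree": tree}


def composite_key(r):
    """Concatenate the identifying items into a single comparison key."""
    return "|".join(str(r.get(field, "")).strip().lower() for field in KEY_FIELDS)


def match_rows(sheet_a, sheet_b):
    """Pair each row of sheet_a with the row of sheet_b that has the same key."""
    index_b = {}
    for r in sheet_b:
        index_b.setdefault(composite_key(r), []).append(r)

    matched, ambiguous, unmatched = [], [], []
    for r in sheet_a:
        candidates = index_b.get(composite_key(r), [])
        if len(candidates) == 1:
            matched.append((r, candidates[0]))
        elif candidates:
            ambiguous.append(r)   # identical key more than once: check by hand
        else:
            unmatched.append(r)
    return matched, ambiguous, unmatched


# Made-up rows: two different "A Smith" entries in one sheet, one in the other.
myh_rows = [
    row("A Smith", "60s", "UK", "A Smith", "yes", "yes"),
    row("DNA kit", "", "USA", "J Doe", "yes", "no"),
    row("C Brown", "40s", "Canada", "C Brown", "yes", "no"),
]
ftdna_rows = [
    row("A Smith", "60s", "UK", "A Smith", "yes", "yes"),
    row("A Smith", "30s", "Australia", "A Smith", "yes", "no"),
    row("DNA kit", "", "USA", "J Doe", "yes", "no"),
]

matched, ambiguous, unmatched = match_rows(myh_rows, ftdna_rows)
print(len(matched), "matched,", len(ambiguous), "ambiguous,", len(unmatched), "unmatched")
```

With the extra items in the key, the two "A Smith" rows no longer collide, and a "DNA kit" row can be paired up through its manager and location.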
To start with, I looked at the "no" spreadsheet and, for each kit, plotted the total DNA shared against the longest segment, just to give me an idea of the levels of sharing that didn't make it into the other kit list:
As you can see, the majority of the kits seem to be where the total shared is between 20 and 30cM, and the longest segment is less than 20cM. As I mentioned above, I suspect that many of these kits might be matching my other test, but the variations mean the matches appear in different orders and these just didn't appear in the first 2000 entries.
However, there are some where the longest segment is over 30cM, or the total shared DNA is over 40cM, as well as the longest segment being over 20cM. So I decided to check each of those, to see if they were matching the other kit, but beyond the first 2000 matches - only four of them were:
Of the four kits easily identified as also matching the other (FTDNA upload), the first in the table above had gained two new segments, on different chromosomes, the next had lost one segment, but then gained three new segments on other chromosomes, the third showed increases on the two 'existing' segments, plus the addition of a new segment on a different chromosome, and the final one showed an increase (of over 20cM) on the 'existing' segment.
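The selection of 'no' kits worth checking individually can be written as a simple filter. This sketch assumes one reading of the thresholds given above (longest segment over 30cM, or total over 40cM together with a longest segment over 20cM); the function name and the sample values are hypothetical.

```python
def worth_checking(total_cm, longest_cm):
    """Flag a 'no' kit whose sharing is high enough to look up by hand."""
    return longest_cm > 30 or (total_cm > 40 and longest_cm > 20)


# Made-up (total cM, longest segment cM) pairs, for illustration only.
no_kits = [(22.4, 9.8), (35.0, 31.2), (44.6, 21.5), (41.0, 12.3)]

flagged = [kit for kit in no_kits if worth_checking(*kit)]
print(flagged)
```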
I might come back to these details, as and when I take another look at chromosome mapping. But, for now, I moved on to look at the 'yes' sheet, ie those matches that appeared in the first 2000 entries of both my FTDNA upload, and also the new MyHeritage WGS test.
The 'yes' kits
I began by looking at a scattergram of the change in 'Total cM shared' against the change in 'Longest segment' but it's perhaps more helpful to look at the following two charts first.
This shows the numbers of matches whose 'Total cM shared' changed, within particular ranges of values (calculated by 'Total shared cM with MyH kit - Total shared cM with FTDNA upload'):
I think that's important to note, given that the specific examples I'm exploring in more detail are all 'outliers', ie the matches where the changes are more extreme. I'm looking at them because I find the situations intriguing, not because I'm saying there is anything 'wrong'.
One can see the same thing, when looking at the changes in values of the 'Longest segment' (calculated by Longest segment matching MyH kit - Longest segment matching FTDNA upload) - the majority of matches showed very little change in the longest segment value:
The following image shows the changes in Total cM shared against the changes in Longest segment for each individual match:
I think there are two different things showing up here. There are the points falling along a diagonal, indicating that there's been a change in both the Longest segment length and the Total cM shared. But there is also a horizontal line of points along the x-axis, where there's been a change in Total cM shared, but without corresponding changes to the Longest segment length - potentially indicating the loss, or gain, of other, smaller, segments.
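As a rough illustration of how each point on such a chart is derived, the sketch below computes the two differences (MyH value minus FTDNA upload value) and labels the point. The `classify_change` helper, the tolerance, the labels and the sample values are all assumptions made for this example, not taken from the spreadsheets.

```python
def classify_change(total_myh, total_ftdna, longest_myh, longest_ftdna, tolerance=0.05):
    """Return (change in total cM, change in longest segment, label)."""
    d_total = round(total_myh - total_ftdna, 1)
    d_longest = round(longest_myh - longest_ftdna, 1)

    if abs(d_total) <= tolerance and abs(d_longest) <= tolerance:
        label = "no change"
    elif abs(d_longest) <= tolerance:
        # Total moved but the longest segment did not: the 'horizontal' line.
        label = "horizontal"
    else:
        # The longest segment itself changed, moving the total with it.
        label = "longest changed"
    return d_total, d_longest, label


# Made-up values for three matches.
print(classify_change(45.2, 38.9, 20.1, 20.1))
print(classify_change(60.0, 50.0, 35.0, 25.0))
print(classify_change(25.0, 25.0, 12.0, 12.0))
```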
From the following figures, it can be seen that, although the majority of kits again showed no change in the number of segments, almost 200 matches did show either a loss or a gain:
Of the 81 who lost one segment, the Total cM shared decreased for 80 of them, but the Longest segment showed no change for 70 of those. And, for the 101 that gained one segment, 99 showed an increase in the Total cM shared, but 78 of those showed no change to the Longest segment.
So those figures would seem to support the possible explanation for the 'horizontal' line of points, that the segments being lost, or gained, are smaller segments, rather than the longest. [And whether any of it is 'significant' would be a totally different issue, given that many of the changes are only in the range of 5-10cM.]
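A tally of this kind can be sketched as follows. The five rows are invented for illustration and are not the figures quoted above; each row holds the segment count on the FTDNA upload, the segment count on the MyH kit, the change in Total cM and the change in Longest segment.

```python
from collections import Counter

# Made-up rows: (segments on FTDNA upload, segments on MyH kit,
#                change in total cM, change in longest segment)
matches = [
    (2, 1, -6.2, 0.0),
    (1, 1, 0.3, 0.3),
    (1, 2, 7.1, 0.0),
    (3, 2, -8.0, -8.0),
    (1, 2, 6.5, 0.0),
]

# How many matches lost or gained a given number of segments.
segment_change = Counter(myh - ftdna for ftdna, myh, _, _ in matches)

# Of those that lost exactly one segment, how many saw the total fall
# while the longest segment stayed the same?
lost_one = [m for m in matches if m[1] - m[0] == -1]
lost_one_longest_unchanged = [m for m in lost_one if m[2] < 0 and m[3] == 0.0]

print(dict(segment_change))
print(len(lost_one), len(lost_one_longest_unchanged))
```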
I was intrigued by one match, who had lost one segment, and yet both their Longest segment and the Total cM shared had increased (by 23.5cM and 18.7cM respectively). This was a case where a 'gap' between two small 6cM segments on chromosome 18 is now shown as matching, creating one segment of 31cM, with the other three segments remaining identical:
Another match I followed up was one where the number of matching segments increased by 3 yet the Total cM decreased by 1.8cM and the Longest segment decreased by 20.5cM:
In some ways, I don't know what to make of this - the total loss of a segment on one chromosome, but gaining four small segments on different chromosomes.
The companies give us many such small matches so, according to their science etc, it must indicate at least a 'potential' relationship. But I certainly wouldn't be spending time looking for a genealogical connection to such a match!
The other match that gained three segments had increased both their Total cM shared, and their longest segment (by 28.0cM and 4.9cM respectively):
That seems a bit more 'reasonable' than the previous case, with an increase to the existing segment, and the 'discovery' of three other segments.
But how relevant some of these segments are remains to be seen.
Closer matches
Finally, I looked for any changes to the matching with my closest relatives.
In comparisons with my mother's kit, the MyHeritage WGS kit showed different totals on seven chromosomes, from those shown with the FTDNA upload. Two chromosomes showed decreases, the other five were increases, but all individual changes were less than 7cM, producing an increase in Total cM shared of 16.7cM. I'd need to research the particular start and end RSID points of the tests, to see if differences in those explain these changes (since I should match my mother along the full length of every chromosome.)
Comparing my kits to my uncle's, with whom I share 43 segments on each kit, six of the segments had changed slightly (one increased, five reduced), all changes less than 3cM. Three of the changed segments are on chromosome one, and the change to at least the first of them is potentially due to the wider coverage of the newer test, since its starting location has moved to exactly the same RSID point as it did in the comparison with my mother's kit.
The next closest eight matches, taking me down to a Total shared DNA level of 100cM, include seven identified second and third cousins. Of these, only one shows a change in the Total shared DNA, with an increase of 18.4cM on chromosome one. In this case, the increase doesn't seem to be connected to a change in the starting location (which is actually quite interesting, since the starting location for this match on my FTDNA uploaded kit was already showing the earlier location - so why wasn't that kit showing as matching to my mother, and my uncle, from that point?)
Since this 2c should be matching my uncle over the same range, I shall investigate this further.
But that can wait until another day!








