Wednesday, August 16, 2017

Imprecise Science Part 2: MyHeritage

After the less-than-encouraging analysis of my AncestryDNA matches to see what percentage of my matches also match my parents (only 69%), I took a look at MyHeritage. These are matches obtained by uploading our raw test data from the AncestryDNA tests to the MyHeritage website (currently a free process), rather than by taking MyHeritage's native tests, so results may be affected by that. MyHeritage is also a newer entrant to DNA testing/matching.

Suffice to say, MyHeritage currently makes AncestryDNA's match processing look good.

I have 97 matches at MyHeritage (excluding parents and great-uncle), so a much smaller sample size than AncestryDNA. It's just as well, as MyHeritage doesn't currently have any tools for working with or analysing your matches (and there aren't any external tools for this either as far as I know), so you have to look at each match manually and record the necessary details to do any analysis.

4 of my matches, or 4%, also match my Mum.

Another 4 of my matches, or 4%, also match my Dad.

So that's a 92% rate for false positives for me and/or false negatives on my parents.
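For anyone wanting to check the arithmetic, here is a minimal Python sketch of how that 92% falls out of the counts above. It assumes the 4 maternal and 4 paternal matches are different people (no match shared with both parents); the function name and its `shared_both` parameter are just illustrative.

```python
def unshared_rate(total, shared_mum, shared_dad, shared_both=0):
    """Fraction of my matches that match neither parent.

    shared_both is the number of matches shared with BOTH parents;
    it is assumed to be 0 here, which the post does not state explicitly.
    """
    shared_any = shared_mum + shared_dad - shared_both
    return (total - shared_any) / total


# 97 matches in total, 4 shared with Mum, 4 shared with Dad
rate = unshared_rate(97, 4, 4)
print(f"{rate:.0%} of matches are not shared with either parent")
```

That works out as 89 of 97 matches not shared with a parent.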


Of the 4 maternal matches, MyHeritage reports me as having more (generally around double!) shared cMs with that match than my Mum does for 3 of them. The paternal matches were a bit more plausible: only one of those showed me as having more shared cMs than my Dad did.

Side note - there's no ability to search matches by birthplaces in their tree at MyHeritage, so no easy way to locate any Wing descendants yet.




Thursday, August 10, 2017

Imprecise Science Part 1: AncestryDNA


Are all your DNA matches really a match?

Debbie Kennett recently published her methodology and results when comparing her own AncestryDNA match results to her parents’ matches in order to identify false matches (either a false positive at child level, so identical by chance, or a false negative at parent level, or some horrible algorithmic glitch). Debbie’s post built on analysis done by Blaine Bettinger and other DNA genealogists and she has links to their respective results in her post.

As I have tested myself and both my parents, I thought I’d do the same. There's some extra commentary at the end regarding Wing one-place study implications, just to keep this on-topic for this blog!

Summary

Of my 14,522 matches:

  • 4,470, or 31%, are not shared with either parent
  • 4,528, or 31%, are shared with my dad (which represents 44% of his matches)
  • 5,547, or 38%, are shared with my mum (which represents 34% of her matches)
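The comparison behind those figures is essentially a set exercise: take my match list and see which entries also appear in each parent's list. Below is a rough Python sketch of that idea, not the exact process used for this post. The match identifiers and the toy data are made up for illustration, and how you export match lists from AncestryDNA is left out. Note that the three counts above add up to 14,545, slightly more than 14,522, which would be consistent with a handful of matches being shared with both parents, so the sketch keeps a separate "both" group.

```python
def partition_matches(mine, mum, dad):
    """Split my matches by which parent (if any) also has the same match."""
    mine, mum, dad = set(mine), set(mum), set(dad)
    return {
        "both": mine & mum & dad,
        "mum_only": (mine & mum) - dad,
        "dad_only": (mine & dad) - mum,
        "neither": mine - mum - dad,
    }


def share(part, whole):
    """Size of part as a fraction of whole (0.0 if whole is empty)."""
    return len(part) / len(whole) if whole else 0.0


# Hypothetical match IDs, purely for illustration
mine = {"m1", "m2", "m3", "m4", "m5", "m6"}
mum = {"m1", "m2", "x1"}
dad = {"m2", "m3", "m4", "x2"}

groups = partition_matches(mine, mum, dad)
for name, members in groups.items():
    print(f"{name}: {len(members)} ({share(members, mine):.0%} of my matches)")
```

The "neither" group is the one of interest here: those are the candidates for a false positive on my side or a false negative on a parent's side.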

15 cM looks like a good cutoff point above which any match is almost certainly going to be legitimate. Sadly only 3.75% of my matches are above that level! 

Conversely, any matches below 7cM are more likely to be identical by chance rather than identical by descent.
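As a rule of thumb, those two cutoffs could be written down as a simple triage function. This is only a sketch based on my own results: the 15 cM and 7 cM thresholds come from the observations above, while the labels and the "uncertain" middle band are my own assumptions, not anything AncestryDNA provides.

```python
LIKELY_REAL_CM = 15.0    # above this, matches were almost always shared with a parent
LIKELY_CHANCE_CM = 7.0   # below this, matches were more often not shared with a parent


def triage(total_cm):
    """Rough label for a match based only on total shared cM."""
    if total_cm > LIKELY_REAL_CM:
        return "likely legitimate"
    if total_cm < LIKELY_CHANCE_CM:
        return "more likely identical by chance"
    return "uncertain"


for cm in (25.0, 10.0, 6.5):
    print(cm, "cM:", triage(cm))
```

It is a guide for prioritising which matches to chase, not a test: a match above the upper cutoff can still fail the parent comparison.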

This is pretty similar to the kind of picture Debbie saw with her AncestryDNA results.


Data

Here are my personal matches, broken down by the cM total length:



Those 3 matches of 50cM or more are my parents plus my grandmother’s brother.

Of those 21 matches of 25cM or more, only 1 has a tree of more than a couple of generations. I’ll save that rant for another time…

Here are my matches broken down into slightly more cM bins, along with what does and doesn’t match at least one of my parents:


See that wee outlier? I have a 25 cM match that on the face of it doesn’t look like a real match, as she doesn’t match either of my parents. I’ve done some digging into this one – her kit is administered by her daughter, and the daughter IS a match to both me and my father. The daughter and my Dad have a 17cM match, yet I match the daughter at 18cM (over 2 segments, not 1 as in my Dad's match) and the mum at 25cM! I’m still trying to get my head around that.


What Now?

I knew there were some false matches knocking around, but to be honest I’m a bit disappointed the rate is as high as it is. It does mean that if I’m doing any further analysis of my matches, e.g. following Twigs of Yore’s visualisation exercises using NodeXL, I should definitely start with a match list that’s had those false positives removed if possible, rather than using a raw match list. Not only would the list be 30% shorter and more manageable, it would be more accurate.

Our DNA raw data has also been uploaded to FamilyTreeDNA, MyHeritage and GedMatch and I’m planning to run comparisons on my match results at each of those places (update: follow the links to see the comparisons). At MyHeritage, I have an alleged 94cM total match that is not a match to either of my parents! I suspect that in reality I do not have a gorgeous blonde Norwegian cousin (with whom I apparently also somehow share 0% ethnicity); I think something must be awry in the algorithms.

Wing

I currently have 8 matches at AncestryDNA who have a family tree that has someone born in Wing, Buckinghamshire in it. That doesn’t mean we’re related through Wing (or even at all), although you can bet I’m keen to find out one way or the other. In one case it looks like I can tell – one of them (who even bears one of my Wing surnames!) isn’t a match to either of my parents, so most likely he is a false positive match and we are not related. Is that a sad trombone I hear?



 