Your Genetic Genealogist

Posted: April 9, 2015 at 3:44 am

Responsible genealogists adhere to high standards of proof in their research, in the evidence that they present and in the conclusions they reach. I strongly believe that genetic genealogists should as well. When we make claims that are not supported by sound science, then we undermine the credibility of our field.

Experience has demonstrated to me that there is great folly in claiming small segments can be used as proof (yes, even as supporting evidence) in genealogical research. When I use the term "small segments" in this article, I am referring to unphased "matching" segments under 5 centiMorgans, and I am addressing their use in matching, not admixture. A few genetic genealogists have argued that there are certain instances when small segments are not only helpful in our genealogical research, but reliable. I strongly disagree.

One of the many problems with utilizing small segments is that, in general, people tend to see evidence that supports their theories and reject evidence that does not. Because the nature of small segments is so random, as I will demonstrate, it is possible that an individual will see patterns where none exist in reality, such as in a cluster of tiny, meaningless "matching" segments. This also holds true for admixture analysis.

Blaine Bettinger already wrote a great blog post explaining the work that has been done on this issue, along with some of his own comparisons, so I am going to concentrate on the multi-generational data to which I have access. Angie Bush has kindly allowed me access to her family's extensive data, although she is unable to collaborate on this post because she is on a genealogy cruise. (Thanks, Angie!)

All of these examples are the first ones I looked at, so they were not selected to favor any particular conclusion. There is a huge amount of analysis that can still be performed on this data set. Since Gedmatch was down when I wrote this, I concentrated on Family Tree DNA data. When I am able to access Gedmatch again, I will add to my analysis.

First, let's look at this simple chart of my data compared to James, a confirmed paternal fourth cousin, and then my father's data compared to that same cousin. As you can see, both my father and I have one substantial matching segment with James on Chromosome 4 (in purple). Some would argue that because we have one longer matching segment, the small matching segments reported alongside it are more valid and thus can be more responsibly attributed to our known common ancestor.

Notice the segments highlighted in red in my chart. Those are all segments that were reported to be matching between me and James that do not show up as matches with my father. So, right off the bat, we can eliminate eight segments of what some might claim is supporting evidence of the known relationship with James. That is 66.6% of the segments under 5 cM, which is in line with what was found in the 23andMe study.
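For readers who want to repeat this parent-child check on their own match lists, here is a minimal sketch in Python. It is only an illustration of the idea, not the tool that produced my charts: the segment coordinates are invented placeholders, and the names `Segment`, `overlaps`, and `split_by_parent` are hypothetical. It also assumes the cousin is related only through the tested parent, as James is through my father.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    chromosome: int
    start: int  # base-pair position where the reported match begins
    end: int    # base-pair position where the reported match ends


def overlaps(a: Segment, b: Segment) -> bool:
    """True if two segments sit on the same chromosome and share any positions."""
    return a.chromosome == b.chromosome and a.start < b.end and b.start < a.end


def split_by_parent(child_segments, parent_segments):
    """Separate the child's reported matches with a cousin into those that
    also appear (at least partly) in the parent's matches with that cousin
    and those that do not. The second group cannot have been inherited
    from this parent."""
    kept, rejected = [], []
    for seg in child_segments:
        if any(overlaps(seg, p) for p in parent_segments):
            kept.append(seg)
        else:
            rejected.append(seg)
    return kept, rejected


# Invented placeholder data, NOT the real segments from the chart.
child = [
    Segment(2, 1_000_000, 2_500_000),
    Segment(4, 10_000_000, 25_000_000),
    Segment(7, 5_000_000, 6_000_000),
]
parent = [
    Segment(2, 1_000_000, 2_500_000),
    Segment(4, 10_500_000, 25_000_000),
]

kept, rejected = split_by_parent(child, parent)
print(f"{len(rejected)} of {len(child)} child segments are absent from the parent")
```

A segment that lands in `rejected` is the equivalent of a red segment in my chart. Note that landing in `kept` does not make a segment genuine; it only means this particular test could not rule it out.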

Next, look at the green segments. In this case, it appears that I inherited those from my father, but if you look closely, they are actually longer for me than for my father. This means that they are, at least partially, false positives or pseudo-segments. Incidentally, the one substantial matching segment we have in common (purple) is also reported to be a bit longer for me than for my father, which illustrates that it is questionable to rely too heavily on what appear to be exact assignments. In my list of matching segments, only the pink segments on chromosomes 2 and 3 are left as potentially fully IBD segments. Some will say that the fact that they persist from parent to child makes them more reliable indicators of a genealogical relationship. Perhaps, but there is no proof that the pink segments weren't originally pseudo-segments interpreted as a match by the technology in my father's data and then passed to me through recombination of his two chromosomes. Does that sound far-fetched? Well, let's see by looking at multi-generational data.

Please bear with me because this is going to take a while. This chart is the matching DNA between Brynne and a known Bush cousin from her mother's father's father's branch of the family. The common ancestors are Frederick Bush and Martha White, so you can see that the expected path of inheritance for matching DNA between Brynne and this cousin is: Brynne >> Angie >> Grandpa >> Great Grandpa

Now, let's look at the same comparisons with the threshold lowered to 1 centiMorgan.


