How to Create DNA Source Citations And When to Use Them
Experienced genealogists know the value of citing our sources, but noting that information becomes a headache when using DNA evidence. So how can we cite our DNA sources, and where should we use those citations?
Citations for Documents
When we have a Deed from Fremont County, Idaho, we can quickly create a source by specifying the author, title, publishing details, and where to find the title.
"Fremont County, Deeds, 1880-1928," digital images, FamilySearch (familysearch.org: accessed 27 June 2021), Joseph E Lamborn to Orlando Gooch (1916); citing Fremont Count, County Court, book 16, page 289.
Regardless of whether FamilySearch always has this collection or not, we can track down the Fremont County Deed Book 16 and look at page 289 and find the sale of land from Joseph to Orlando.
The Problem With Citations for DNA Databases
But with genetic evidence, the DNA databases are fluid entities. Algorithms, tools, and processes change frequently. Additionally, DNA testers add and remove their results constantly. Thus, the ability to consult the original resource on which we base our conclusions decreases over time.
Additionally, the evidence about a DNA match comes from multiple data points. For example, while I could determine that a DNA match shares 104 cMs across three segments in one location on a genetic genealogy website, understanding how I am related to that person requires other inputs from multiple locations on the sites. Additionally, that information often depends on whether I have tested either one or both of my parents.
If we compare this information to a genealogy source, it's like finding the birth records for all of a couple's children on multiple pages in a birth registry. And piecing them together is contingent upon the parents' information being the same or similar enough across entries.
When we cite these entries, we create separate citations. But with DNA evidence, researchers will often strive to have one citation to serve them all.
DNA evidence can only predict so much.
Many websites will state that an 875 cM match is a first cousin. But according to the Shared cM Tool on DNA Painter, this could be any of the following relationships:
half 1st cousin
1st cousin 1x removed from a great-grandparent
1st cousin 1x removed from a grandparent
And the shared cM amount will not tell you whether the match is on your mother's or your father's side. Without additional pieces of evidence, a single citation is insufficient.
Reproducing Genetic Genealogy Research Is Difficult
The reproducibility problem increases if you use the WATO (What Are the Odds?) tool, the Leeds Method, triangulation, or clustering tools. These are derived sources, and each one depends on inputs from multiple places.
For instance, you may have a triangulated segment of three people sharing 13 cM on chromosome 1. However, that only tells you that they share DNA in that position. It doesn't reveal HOW they are related.
So, how do you cite your DNA sources in genealogy research?
It all depends on where you're citing your sources.
DNA Source Citations in Online Trees
If you're creating DNA source citations, perhaps our friends at WikiTree have some citation guidelines we should implement.
WikiTree, a free family tree building platform, offers users a standard to follow when marking family relationships as DNA confirmed.
This first example is for close genetic relationships confirmed with autosomal DNA testing:
[Paternal/Maternal] relationship is confirmed by [company] test match between [person 1] and [person 2 - either ID or relationship between the two]. Their most recent common ancestors are [couple] the [relationship to person 1] of [person 1] and [relationship to person 2] of [person 2]. The predicted relationship from [company] is a [relationship] based on sharing # cMs across # segments.
Paternal relationship is confirmed by AncestryDNA test match between Devon Noel Lee and first cousin once removed. Their most recent common ancestors are George Geiszler and Evaline Townley Peak, the paternal great-grandparents of Devon and the maternal grandparents of 1C1R. The predicted relationship from Ancestry is a first cousin based on sharing 209 cMs across 12 segments.
When I tested with Ancestry, the company could not tell me that I was related to this 1C1R. It only told me we were 1-2nd cousins. Additionally, I could only tell how the 1C1R and I were related because we both uploaded our family trees to the platform.
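If you write many of these statements, the fill-in-the-blank template above can be sketched as a small Python helper. This is only an illustration: the function name and parameter names are my own assumptions, not part of WikiTree's guidelines, and the wording simply mirrors the template.

```python
def autosomal_confirmation(side, company, person1, person2, mrca,
                           rel1, rel2, predicted, shared_cm, segments):
    """Fill in the autosomal DNA confirmation template (hypothetical helper)."""
    return (
        f"{side} relationship is confirmed by {company} test match between "
        f"{person1} and {person2}. Their most recent common ancestors are "
        f"{mrca}, the {rel1} of {person1} and the {rel2} of {person2}. "
        f"The predicted relationship from {company} is a {predicted} based on "
        f"sharing {shared_cm} cMs across {segments} segments."
    )


# Values taken from the worked example in the text.
statement = autosomal_confirmation(
    side="Paternal",
    company="AncestryDNA",
    person1="Devon Noel Lee",
    person2="1C1R",
    mrca="George Geiszler and Evaline Townley Peak",
    rel1="paternal great-grandparents",
    rel2="maternal grandparents",
    predicted="first cousin",
    shared_cm=209,
    segments=12,
)
print(statement)
```

Every blank in the template is a required argument, so a statement cannot be produced until you know the common ancestors and both relationships.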
Meanwhile, the following is a triangulation citation.
[Paternal/Maternal] relationship is confirmed by a triangulated group on [Company] consisting of [name 3 persons] who share [# segments on chromosome #]. The most recent common ancestors shared by all three are [couple]. These matches have been independently verified by [person names] via the [tool]. [Person A] and [Person B] are [relationship abbrev.]; [Person A] and [Person C] are [relationship abbrev.]; [Person B] and [Person C] are [relationship abbrev.].
Paternal relationship is confirmed by a triangulated group on MyHeritage consisting of Devon Noel Lee, N. Colwell, and M. Townley, who share 18 cM on chromosome 7. The most recent common ancestors shared by all three are Richard Townley and Anna Sexton. These matches have been independently verified by Devon Lee via the MyHeritage Chromosome Browser. Devon and N. Colwell are 1C1R; Devon and M. Townley are 3C1R; N. Colwell and M. Townley are 3C.
Notice how the triangulation is a combined source citation. The triangulation tool does not tell you how you're related. Instead, you have to look at trees to determine how the group is related.
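Because the triangulation statement combines several facts, a short Python sketch can help confirm that none are missing before you publish it. The function and argument names below are assumptions of mine for illustration. The helper refuses to build the statement unless all three pairwise relationships have been worked out from the trees.

```python
from itertools import combinations


def triangulation_statement(side, company, members, shared, mrca,
                            verified_by, tool, relationships):
    """Build a triangulation statement for exactly three people (hypothetical helper)."""
    if len(members) != 3:
        raise ValueError("this sketch expects exactly three group members")
    pairs = []
    for a, b in combinations(members, 2):
        # A KeyError here means one pairwise relationship is still undocumented.
        rel = relationships[frozenset((a, b))]
        pairs.append(f"{a} and {b} are {rel}")
    names = f"{members[0]}, {members[1]}, and {members[2]}"
    return (
        f"{side} relationship is confirmed by a triangulated group on {company} "
        f"consisting of {names}, who share {shared}. "
        f"The most recent common ancestors shared by all three are {mrca}. "
        f"These matches have been independently verified by {verified_by} "
        f"via the {tool}. " + "; ".join(pairs) + "."
    )


# Values taken from the MyHeritage example in the text.
group = ["Devon Noel Lee", "N. Colwell", "M. Townley"]
statement = triangulation_statement(
    side="Paternal",
    company="MyHeritage",
    members=group,
    shared="18 cM on chromosome 7",
    mrca="Richard Townley and Anna Sexton",
    verified_by="Devon Lee",
    tool="MyHeritage Chromosome Browser",
    relationships={
        frozenset(("Devon Noel Lee", "N. Colwell")): "1C1R",
        frozenset(("Devon Noel Lee", "M. Townley")): "3C1R",
        frozenset(("N. Colwell", "M. Townley")): "3C",
    },
)
print(statement)
```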
What happens when you don't know a relationship, but you're working on a case?
For instance, I have a 2nd-3rd cousin match on Ancestry, but I cannot tell how we are related. I don't have my parents' DNA. My match did not upload a family tree. I suspect this relationship is through the Hankinson/Cramer line. On the Shared Matches page, this 2nd-3rd cousin matches a person who descends from the Hankinson/Cramers on my maternal line.
How would I cite this unknown match?
I suppose I don't. Or, if I'm working on a case, I would have to leave off our common ancestor and the maternal line as part of the citation but include my suspicions in the report. (Which I'll reference later.)
How to Craft DNA Source Citations for Complex Analysis
With the citations above, we may be using sources to add evidence to our family trees, much like a vital record, land record, or census.
When we combine genetic evidence with paper trails, we can confirm our research and our ancestors' relationships.
However, many cases are far too complex to resolve with a simple citation as listed above. Thus, we would need to write a research report and add citations to that report. Then, we can share our findings and cite that report as supporting evidence.
Some fellow researchers will dislike citing ourselves in this fashion, since it might fall into an "I know because I said so" category. However, it might be the only way to help someone else make sense of our discoveries, in much the same way that a complex genealogy research report helps identify an ancestor for whom little direct evidence exists.
DNA Source Citations in a Research Report
I have to give credit to Robin Wirthlin, of Family Locket, for her tips on how to write a DNA research report. I especially like the structure and how it can adapt to nearly any situation.
To better explain how Andy discovered the identity of my Grannie's biological father, we could follow this process.
Identify the father of Louise Long, born Marie Anderson, daughter of Agnes Anderson.
Louise was born in 1920 in Columbus, Ohio, to Agnes Anderson.
Louise has three living descendants who have taken a DNA test - two daughters and a granddaughter of her deceased third daughter.
Agnes Anderson was a single woman, in her 30s, at the time of Louise's birth. She had worked for the B&O railroad in Newark, Licking, Ohio.
Agnes had no other children other than Louise.
Body of Report:
A female DNA matched the daughters and granddaughter. (Identify how much DNA each person shared with the female match.)
She hoped the match linked to her mother's line.
DNA comparisons between the daughters and granddaughter confirmed that the female DNA match could only be connected through Louise's birth parents. (Add additional DNA matches to prove this fact.)
The female DNA match did not have any connections to the family of Agnes Anderson.
The female match's mother's line did not have a male who lived in a town where Agnes lived in the 1920s. (Identify the paper records that place the men in these locations.)
The female match's father's line had two brothers who lived in Licking County, Ohio. One of the brothers worked for the B&O railroad and lived within a mile of Agnes's home. (Identify the paper records that place the men in these locations.)
Additional DNA matches, identified as descendants of Louise's half-siblings, confirmed this theory. (Add other DNA matches to establish this fact.)
Within this report, I could include some screenshots from the WATO tool to support the likelihood of the relationship based on the shared DNA cMs. I could also insert the Shared Centimorgan Project chart.
Louise Long, born Marie Anderson, is the established biological daughter of Agnes Anderson, based on her adoption papers. DNA evidence and genealogy records suggest that her biological father is Delbert Hankinson.
DNA research can be complicated. So seek out professional assistance from our friends over at Legacy Tree Genealogists. Tell them Devon Noel Lee referred you.
Where Do the DNA Source Citations Go in a Report?
Whenever we mention a source, we should cite the information. Thus, when I indicate the records that helped place the potential fathers in their time and place, I would cite the sources that provided that information.
With DNA, we have a different challenge. I would have to use a combination of the following citations, based on recommendations from Family Locket with a minor modification.
"Member Matches for Devon Lee and Female DNA Match," AncestryDNA (ancestry.com/dna: accessed 1 June 2018), predicted 2nd-3rd Cousin, sharing 133 cM, 6 segments.
"Member Matches for Aunt A and Female DNA Match," AncestryDNA (ancestry.com/dna: accessed 1 June 2018), predicted 2nd-3rd Cousin, sharing 277 cM, 13 segments.
"Member Matches for Aunt B and Female DNA Match," AncestryDNA (ancestry.com/dna: accessed 1 June 2018), predicted 2nd-3rd Cousin, sharing 372 cM, 12 segments.
"Member Matches for Female DNA Match and Known Paternal Match of Aunt A & B," AncestryDNA (ancestry.com/dna: accessed 1 June 2018), no DNA matches shared.
"Member Matches for Female DNA Match and Known Relatives of Agnes Anderson," AncestryDNA (ancestry.com/dna: accessed 1 June 2018), no DNA matches shared.
I'm not sure I need a citation for DNA Painter, but here's a possible citation if required to use one.
"What Are The Odds," DNA Painter (DNAPainter.com: accessed 1 June 2018), Ancestry DNA Data for descendants of Potential Father A.
Once we complete our research reports, we should upload them to online trees as media files. Then, we can use the following citation:
Lee, Devon Noel, "Establishing the Identity of Louise Long's Biological Father," research report, 18 June 2018, in possession of Andrew Lee, Los Alamos, NM.
We can add this source citation to our family trees as a source to validate our conclusions.
Citation Style Tip:
Frequently I see genealogists add the HTTPS:// or HTTP:// in their citations. The inclusion of this detail is redundant information in a source citation. Will https://www.familyhistoryfanatics.com and www.familyhistoryfanatics.com take you to the same place?
Yes. Yes, it will.
In fact, when I wrote this blog post, my editing program automatically created a URL link for both.
Thus, the https:// and http:// are redundant. Would you please stop using these in your citations? It signifies that you don't understand how the internet works.
It's also acceptable to drop the "www." That prefix isn't as unnecessary as the HTTP, though, so feel free to keep the 'www's.
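If you keep your citations in a spreadsheet or a script, this cleanup is easy to automate. Here is a small Python sketch. The function name `strip_scheme` and the `drop_www` option are my own assumptions for illustration.

```python
import re


def strip_scheme(url: str, drop_www: bool = False) -> str:
    """Remove a leading http:// or https:// and, optionally, a leading www."""
    cleaned = re.sub(r"^https?://", "", url.strip(), flags=re.IGNORECASE)
    if drop_www:
        cleaned = re.sub(r"^www\.", "", cleaned, flags=re.IGNORECASE)
    return cleaned


print(strip_scheme("https://www.familyhistoryfanatics.com"))
print(strip_scheme("https://www.familyhistoryfanatics.com", drop_www=True))
```

The "www." is kept by default, matching the advice above that dropping it is optional.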
Citations I Wouldn't Use for DNA Evidence
In reviewing the comments on the above Reddit discussion, I agree with many points regarding the citation recommendation found through a credentialing organization.
"AncestryDNA," database, Ancestry.com (http://dna.ancestry.com : 16 September 2014), predicted 4th to 6th cousin match to the user "[AncestryDNA user name]," and shaky leaf hint identifying shared ancestors as Fredrick Charles Bush (1859-1938) and Martha White (1857-1938)" whose shared ancestors is Fredrick Charles Bush (1859-1938) and Martha White (1857-1938).
My first problem echoes a critic of this citation: the "shaky leaf hint" is useless. The hint doesn't tell you how accurate the suggestion or the supporting document is.
Moreover, the critic said it best, "that [the hint] is just Ancestry telling you that one of your matches added that person to their tree. For example, I could add Odin to my tree six generations back, and if Ancestry would tell my match that that was their ancestor's name even though it obviously wasn't."
Additionally, the second half of that citation references analysis. It's my experience that analysis requires multiple techniques. We could have consulted ThruLines, Common Ancestors from the Shared Match page, a conversation with the match, and descendancy research. Much of this analysis takes multiple steps, so I would again go to the report method.
Leave off the shaky leaf recommendation and pare the citation down to:
"AncestryDNA," database, Ancestry.com (http://dna.ancestry.com: 16 September 2014), predicted 4th to 6th cousin match to the user "[AncestryDNA user name]."
Then if you want to reference trees or ThruLines, you can use something like this in the research report.
"AncestryDNA ThruLines for Devon Lee," matches through George Joseph Geiszler, Ancestry (ancestry.com: accessed 1 June 2018.)
Give these suggestions a try when citing why you think a relationship exists based on DNA evidence.