Tuesday, February 21, 2012

DNA: WEIRD need not apply

OpenSNP is offering free DNA testing kits from 23andMe to expand the genetic database beyond people who are WEIRD.  This is an acronym that stands for Western, Educated, Industrialized, Rich, and Democratic.  I did not know that there was a name for what I have been noticing.  Most of my genetic relatives are just like me, and I did not suspect this was because we are all related.  I probably have genetic relatives all over the world in all shapes, sizes, and colors.  Yet they do not show up in the database at 23andMe.  I suspected that their absence was not evidence that they were unrelated, but rather that they did not have the means or interest to undertake DNA testing, and this article gives me a more formal name for the situation.  The limited subject range also affects areas beyond genetic testing.

I am working on the results of a lady who submitted her kit to 23andMe through their Roots into the Future project, designed to gather data on African Americans.  She currently has 46 relations, while I have over 1300.  She should also have thousands of detectable distant genetic relatives, but they are not in the genetic database (yet).

Monday, February 20, 2012

DNA: Uncovering hidden matches

I am still analyzing and comparing the genetic matches of my mother and her brother at 23andMe.  At this site, you find people related to you in the "Relative Finder."  My mother has over 1300 matches.  Her brother has around 500.  My mother has few Irish matches.  Her brother has many Irish matches who also match her, but they were previously unknown to me because she had reached her limit of matches.  It is great that I now have access to these Irish cousins and have some good leads to follow.  The site does not show these matches for my mother because the number of revealed matches is capped at an undisclosed number.  Her slots were taken by matches with multiple segments.  Most of these Irish matches are identical on only one segment; therefore, their percentage of shared DNA was lower than that of her 1300 multiple-segment cousins, and they did not make the cut.

This genetic cousin matches both my mother and her brother.  This match is revealed in his account, but not hers.

The first hint I had about missing matches in the system was with my father's third cousin.  I compared my mother to my father's third cousin by chance and was surprised that they matched.  Neither appears in the other's Relative Finder.  As explained above, my mother is maxed out on matches, but this cousin has only about 600 matches, nowhere near the limit.  Yet because my mother was maxed out in her account, this cousin was prevented from seeing her in his account.  I wonder how many more cousins she has in the database that are not discoverable.
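For readers who like to see the logic spelled out, here is a minimal sketch of how a cap on matches could hide real cousins.  It is only a guess at the behavior: the actual cap and ranking rule at 23andMe are undisclosed, and the function names, the cap of 2, and every percentage below are hypothetical values made up for illustration.  It assumes matches are ranked by percentage of shared DNA and that a pair appears only when each person fits in the other's list.

```python
# Sketch only: the real ranking rule and cap are undisclosed, so this
# models them as "keep the top `cap` matches by percentage shared".

def visible_matches(candidates, cap):
    """Return the names that survive when only the top `cap` matches are kept."""
    ranked = sorted(candidates, key=lambda m: m["percent_shared"], reverse=True)
    return {m["name"] for m in ranked[:cap]}


def pair_is_shown(name_a, list_a, name_b, list_b):
    """Assumption: a pair is displayed only if each person is in the other's list."""
    return name_b in list_a and name_a in list_b


# Hypothetical percentages; a cap of 2 stands in for the real, unknown cap.
mother_candidates = [
    {"name": "multi-segment cousin A", "percent_shared": 0.40},
    {"name": "multi-segment cousin B", "percent_shared": 0.35},
    {"name": "Irish cousin", "percent_shared": 0.10},
    {"name": "father's third cousin", "percent_shared": 0.08},
]
uncle_candidates = [
    {"name": "Irish cousin", "percent_shared": 0.11},
]
third_cousin_candidates = [
    {"name": "mother", "percent_shared": 0.08},
]

mother_list = visible_matches(mother_candidates, cap=2)
uncle_list = visible_matches(uncle_candidates, cap=2)
third_cousin_list = visible_matches(third_cousin_candidates, cap=2)

# The single-segment Irish cousin is crowded out of the full list
# but still fits in the shorter one.
print("Irish cousin" in mother_list)
print("Irish cousin" in uncle_list)

# The third cousin has room for my mother, but she has no room for him.
print(pair_is_shown("mother", mother_list,
                    "father's third cousin", third_cousin_list))
```

Under these assumptions, the person with the shorter list loses a match through no fault of his own, which is what I observed with this third cousin.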

23andMe comparison of known third cousins, matching on chromosomes 13 and 21.
Also matches my mother through unknown connections.

To take this a step further, at 23andMe this third cousin and I share no genetic material.  If you compare us at GEDmatch, we do match, but strangely, not on the large areas on 13 and 21 that my father shares with this third cousin and not on the smaller segment on 18 that he shares with my mother.  He and I share a 5.8 cM match on chromosome 1, which is another area where he matches my mother.  This segment is large enough that 23andMe should have picked it up.  (Their minimum threshold is 5 cM.  Matches smaller than this may be coincidental.)
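The threshold reasoning can be written as a small filter.  This is a sketch, not how either site actually works: the 5 cM minimum is the figure given above, the 5.8 cM segment on chromosome 1 is the one from my comparison, and the two smaller segments are hypothetical stand-ins for the tiny segments that GEDmatch reports.

```python
MIN_CM = 5.0  # minimum segment length, as stated in the post

segments = [
    {"chromosome": 1, "cm": 5.8},   # from the comparison described above
    {"chromosome": 4, "cm": 3.1},   # hypothetical small segment
    {"chromosome": 9, "cm": 2.4},   # hypothetical small segment
]


def reportable(segments, min_cm=MIN_CM):
    """Keep only the segments at or above the minimum length."""
    return [s for s in segments if s["cm"] >= min_cm]


kept = reportable(segments)
for s in kept:
    print(f"chromosome {s['chromosome']}: {s['cm']} cM")
```

By this rule the chromosome 1 segment clears the minimum, which is why I expected the match to be reported.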

GEDmatch comparison between R.S. and me.
Of the nine identical segments, two can be identified as from my father and one from my mother.
Note that I did not inherit any part of the three large segments that R.S. shares with my parents.

GEDmatch comparison between R.S. and my uncle.
The two tiny segments may be coincidental and not suggestive of shared ancestry.

GEDmatch comparison between R.S. and my father.
They are third cousins and this analysis is consistent with that documented relationship.
The two larger segments on 13 and 21 are also reported in 23andMe.

GEDmatch comparison between R.S. and my mother.
The largest match on chromosome 18 is the one match identified by 23andMe.

Wednesday, February 15, 2012

Visit to the National Archives

Yesterday I visited the National Archives in New York City at 201 Varick Street.  (They plan to relocate this year.)  The tour was sponsored by the New York Genealogical and Biographical Society.  The National Archives' main genealogical holdings include federal census records, naturalization records, and military records.  You can view census records at home now, so you may find the repository more useful for the naturalization records and military records.  The presentation focused on Alien Files, or A-Files.  Such a file was created for an immigrant, or alien, who was not naturalized as of April 1, 1944.  Individuals who were born before 1909 and who have A-Files are indexed online.  I have few recent immigrants in my family trees, but I managed to find a match in their online index to show you an example.  After locating someone in the index, you can then order the Alien File from the Kansas City office.  Locating an individual in the online index can provide you with a date of birth and location, as well as other names used.  The file itself may contain information such as parentage and copies of original documents, perhaps birth certificates and marriage records.

Only result for "Regenye" from the online Archival Research Catalog of the National Archives.
You can use the "Scope & Content" tab to help determine whether the entry matches your person of interest.
If you think there is a match, go ahead and order the A-File.
Over at Ancestry.com, a check for Joseph Regenye provides us with his draft registration card for World War II.
The date of birth on the draft registration matches the date in the A-File, so we know we have the same person.


NARA will also offer the 1940 federal census on April 2, 2012.  There is no name index yet.  For guidance, you can search the digitized images by place name.  If your subject of interest was at the same address in 1940 as in 1930, you should not have too many enumeration districts to view.  If you have no address for your subject, then you may have to wait for a name index.  You can also identify enumeration districts at SteveMorse.org.

Archival Research Catalog result for digitized image of "Verona" to identify enumeration district
in the (unreleased) 1940 federal census.

In the 1930 federal census, the Newark City Home for Boys in Verona, Essex County, New Jersey
was in enumeration district 7-615.  In the 1940 census, the Home was in district 7-366.
Note the decreasing population of the institution from 252 in 1930 to 71 in 1940.