Archive for ‘Databases’

June 25, 2012

2012 – Year of the Census — #Genealogy, #Census, #State, #Territory

by C. Michael Eliasz-Solomon

NY State Censuses: Colonial | State

2012 has certainly been a very good genealogical year for this jester. Recently, Ancestry.com completed the 1940 index for NY and I was thrilled to find my grand-uncle Frank Leszczynski! Grand-Uncle Frank (aka Franciszek) was 75 in 1940, was the god-father at my aunt Catherine’s birth in 1914, and was from my great-grandfather Tomasz’s first marriage (to Julianna). He was a Naturalized citizen by 1940, having filed his Declaration of Intent in 1931. Why he was living with the Pawelczak family as a lodger is a question. After all, he had two half-siblings living nearby, including my grand-uncle Michael, with whom he was living when he filed the Declaration of Intent in 1931. So why live at 819 Oliver Street in North Tonawanda (Niagara County, NY) with the Pawelczaks, as he had since 1935 according to the census data?

I still need to find Frank’s death certificate and death notice (if possible) and his Naturalization papers (Erie County or more likely Niagara County).

Ancestry on 5th-June-2012 also released indexes and images of the NY State Census for 1892, 1915, 1925 (previously they had done 1905, partially?).

The NY Censuses & the 1940 US Census are both making my research in NY state a little more complete.

State Censuses

Family Tree Magazine had a nice “Cut & Save” chart on State Censuses (not the US Federal Census). Here are my Cut/Saved images of the States and their Colonial or Territorial or State Censuses that are available … somewhere.

Alabama .. Minnesota

Mississippi .. Wyoming

June 24, 2012

Big Data – Every Minute Of Every Day …

by C. Michael Eliasz-Solomon

Every minute of every day, you and I and the rest of the Internati produce data, big data, in some kind of Internet colony. We email or blog, or even write a Facebook post or a 140-character tweet. Being genealogists, we search databases and post trees with their connections and images, like the 1940 US Census pages that hold our family members. And every day we post more data to the Internet. That is what the picture shows.

The pace of Big Data is increasing too.

Who backs up the Internet? Who archives the web? The “Wayback Machine” seems to record our civilization’s record so this work may last as long as Babylon’s cuneiform or Egypt’s hieroglyphs. Or will it? I know the Library of Congress is wrestling with Archival Issues of Digital works.

What is the disaster recovery plan for sun spot interference or another magnetic burst? Books will survive and be immediately available but what about digital works? How do we back up all of this data explosion?

May 21, 2012

Post Office Department – Stanczyk’s Mailbag — #Polish, #Genealogy, Kuc, Kucz, Swiniary

by C. Michael Eliasz-Solomon

From the Post Office Department

From Stanczyk’s Mail Bag

Email From:   Barbara

I have been trying to do research in Pacanow but have not been very successful.  My Grandmother — Maryanna Kuc(z) is from Oblekon.  I wrote to the parish there — Parafia p.w. Najswietszej Maryi Panny Krolowej Swiata but never received a reply.  Perhaps they just couldn’t find any information.
My Grandmother:      Maryanna Kuc(z)
Born:                        March 15, 1886
Baptized                   March  25, 1887
Immigrated to USA:    September 1912
Father:                      Benedict Ku(z)
Mother’s first name:    Marianna
She had a sister Eva (born 1895)
 & a brother Jozef  (born 1893) both came to America.
I think she had other siblings but have not been able to find any records from Poland at all i.e. Marriage of parents, birth or baptisms or death of her parents.  I know her father was alive in 1912 when she came to America.
If you can help or shed any light on how I could obtain the information I am seeking, I would be extremely grateful.
Keep up the excellent work on your blog.
Thank you for any information in can provide and Thank you for your blog,  I learn a lot from it.
Barbara
I had told Barbara that I would search the Swiniary indexes that I have pictures of to see if I could find anything for her. When I searched my indexes, I found that her family name is spelled mostly as she had it: Kucz, but I did find one example where the priest wrote Kuć. There was also another family, Kuzon, but I do NOT feel like they are the same family as her Kucz/Kuć. Since this was from the era 1829-1852, the records were in Polish. I found one marriage index in the Swiniary parish:
1836 Franciszek Kuć marries Maryanna Duponką   [this is not your great-grandparents, but probably related]. 1836 was the only year that I had a marriage index picture.
1830-1840 no Kucz/ Kuć births in the indexes.
1841 Jozef Kucz birth record #23
1842 Maciej Kucz birth record #21
1843-1845 no Kucz/ Kuć births in the indexes.
1846-1849 I had no indexes (or pictures thereof)
1850 no Kucz/ Kuć births in the indexes.
1851 I had no indexes (or pictures thereof)
1852 Stanislaw Kucz birth record #28
I think I have seen Kuc in the surrounding parishes (Biechow & Pacanow).
First off, I checked the LDS website (FamilySearch.org). I wanted to see what microfilm they had. Your birthdates (1886, 1893, 1895) are rather late (most LDS microfilms stop around 1884). Here is their inventory for Swiniary (you want “Akta urodzeń”, for births):

Family History Library Catalog (Place Search): Swiniary

Akta urodzeń 1686-1811 — małżeństw 1668-1863 — zgonów 1686-1811 –  INTL Film [ 939952 ]
Akta urodzeń 1797-1811, 1826-1865 –  INTL Film [ 939951 ]
Akta urodzeń, małżeństw, zgonów 1812-1816, 1818-1825 –  INTL Film [ 939949 ]
Akta urodzeń, małżeństw, zgonów 1878-1884 –  INTL Film [ 1808854 Items 9-15 ]

Akta zgonów 1797-1839 –  INTL Film [ 939950 ]

That is all the microfilm the LDS (aka Mormons) have in their Family History Library that you can rent. Next I checked the Polish National Archives via PRADZIAD. They did have books/microfilm for the date range you are seeking. Here is the contact info for the archive that has the data you seek. You would need to write to them in Polish, and they will write you back with their findings and instructions for wiring their bank the money they require (all in Polish).

PRADZIAD:

http://baza.archiwa.gov.pl/sezam/pradziad.php?l=en&mode=showopis&id=14781&miejscowosc=swiniary

Archive:

Archiwum Państwowe w Kielcach Oddział w Pińczowie – akta przeniesione do AP w Kielcach
28-400 Pińczów, ul. Batalionów Chłopskich 32
tel: (41) 357-20-02
fax: 357-20-02
email: pinczow@kielce.ap.gov.pl

I hope this helps you out!

–Stanczyk

May 4, 2012

BIG Genealogy — #Genealogy, #FamilyTree, #GEDCOM

by C. Michael Eliasz-Solomon

When Stanczyk wrote the title, he was not referring to Ancestry.com or any other endeavor by genealogical companies from the western USA. No, Stanczyk is fascinated with numbers .. of people.

Yesterday, this jester wrote about the Confucius Family Tree. It is commonly accepted to be the largest genealogy (family tree). But I had to wonder … Why?

It is an old genealogy, dating back to Confucius’ birth in 551 BCE. It is now 2012, so we have a genealogy that is 2,563 years old. My much beloved wife/kids are Jewish. In the Hebrew calendar we are presently in the year 5772. I have even been to a Jewish Genealogical Conference and met a man who told me his genealogy went back to King David. [This jester resisted the rude/snarky comment that if he researched using both Old & New Testaments he could push his research back to Adam.]

I also did not ask him to show me his documentation, but assuming he could, his genealogy would begin another 500 years earlier (~1050 BCE). Mathematically speaking (and assuming there were other Judeo-Christian couplings before me & my wife), his tree would have the potential, if you followed all/many branches and not just the direct lineal trunk, to span approximately 100 generations (adding another 17 generations to the 83 for Confucius). This assumes a generation is 30 years. Now if we look at Confucius and see 2560 years = 83 generations, we see an average of 30.84 years per generation, so 30 years per generation is not a bad estimate.

What genealogy could be older still? Well, according to the Bible we record the Jewish peoples in Babylonia. So perhaps we can extend King David and/or one of his citizens back to King Hammurabi of Babylonia. That would yield another 650 years (~1700 BCE) or about another 22 generations. Let me see: if Confucius’ family tree is about 2 Million people for 83 generations, we get about 24,096 people per generation. So by adding 39 more generations (17 + 22), Hammurabi’s Family Tree should contain approximately another 940,000 people. So come on Iraq, produce your family tree of nearly 3 Million people!

What genealogy could be older than that? There is a quote that goes something like, “History knows no time when the Egyptians were not highly developed both physically and intellectually.” True enough, recorded history does go back furthest in the Pharaonic dynasties. That takes genealogy back to the first dynasty King (Pharaoh) Menes, who sure enough had a son who wrote about Astronomy [source: Timechart History Of The World, ISBN 0-7607-6534-0 ]. That takes us to approximately 3,000 BCE, another 1300 years/44 generations/1.06 Million people! Ok, since there is no recorded history earlier than that, we will not have a properly sourced genealogy older than this. So people who are Elizabeth Shown Mills devotees turn your heads away …

What genealogy could possibly be older than that? I read that the indigenous peoples of Australia have an oral history of 48,000 generations! The aboriginal people of Australia date back to about 50,000 BCE, which would be 52,000 years ago/1734 generations/41.8Million people in their family tree. That’s not 48,000 generations, but that is more than twice as much as genealogy researchers test using their FAN24.ged file which has 24 completely full generations with 16.8Million pseudo people.
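For readers who want to check this jester's arithmetic, here is a minimal Python sketch of the estimates above. It assumes, as the post does, 30 years per generation and a constant head count per generation derived from the roughly 2 Million people in 83 generations of the Confucius tree. The function names are made up for illustration, and the results are back-of-the-envelope figures, not demography.

```python
import math

YEARS_PER_GENERATION = 30        # the post's working assumption
CONFUCIUS_PEOPLE = 2_000_000     # "about 2 Million" from the post
CONFUCIUS_GENERATIONS = 83

# Average head count per generation (about 24,096).
PEOPLE_PER_GENERATION = CONFUCIUS_PEOPLE / CONFUCIUS_GENERATIONS


def generations_for(years):
    """Whole generations needed to span a number of years (rounded up)."""
    return math.ceil(years / YEARS_PER_GENERATION)


def people_for(generations):
    """Rough head count, assuming every generation is the same size."""
    return round(generations * PEOPLE_PER_GENERATION)


if __name__ == "__main__":
    spans = [
        ("Confucius back to King David", 500),
        ("King David back to Hammurabi", 650),
        ("Hammurabi back to Menes", 1300),
        ("Aboriginal Australia, whole span", 52_000),
    ]
    for label, years in spans:
        gens = generations_for(years)
        print(f"{label}: {gens} generations, about {people_for(gens):,} people")
    # A completely full 24th generation of ancestors, for comparison.
    print(f"2**24 = {2 ** 24:,}")
```

Rounding generations up (rather than to the nearest whole number) is what reproduces the 17, 22, 44 and 1734 figures quoted in the post.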

Now that is what I call BIG Genealogy. But where is that family tree (not FAN24.ged)? Why has no genealogy older than Confucius’ genealogy been found and carried forward to the present day? Is it possible that such a family tree exists?

–Email me!

Related Blog Articles …

“Random Musings” (10-March-2010, see musing #2)

April 22, 2012

Alytus / Olita – Udrija / Baksiai — #Polish, #Lithuanian, #Genealogy

by C. Michael Eliasz-Solomon

Recently, Stanczyk was asked about a Pennsylvania family and if I could find their ancestral villages, so they could make a family pilgrimage to get in touch with their Genealogical Roots.

See the red annotation (circle / underline) near the map center. This is the region as shown on a 1757 map of the Polish / Lithuanian Commonwealth.

One point about this region needs to be made explicit right away. It was a part of the Lithuanian Duchy first, then part of Poland, then part of the Prussian partition of Poland, then part of the Russian Empire, before becoming Lithuania in modern times.

That much border re-drawing causes a lot of languages / archives to come into play. Records can be expected to be found in Latin, Lithuanian, Polish, German, Hebrew/Yiddish and Russian.

The region is known in various languages. So I sought out JewishGen ShtetlSeeker to help me learn all of the various names and here is the pop-up if you hover over the Alytus name:

Most researchers will want to take note of it as Olita in Suwalki wojewodztwo (when in the Polish Kingdom) or as Олита (Russian/Cyrillic) in Troki uyezd, Vilna gubernia.

Family Search has microfilm for both Catholic and Jewish metrical books:


Lithuania, Alytus – Church records (1)
Metrical books, 1797-1873
Lithuania, Alytus – Jewish records (1)
Metrical books, 1835-1914

Pradziad has some archival records too. Their records are for Jewish metrical records in the year range: 1835-1872 .

Obviously, if you visit the locale, then parish records may exist in Udrija or Baksiai parishes/synagogues in the Alytus region of Lithuania. Besides the Catholic records, there may also be Lutheran records.

A more modern map (Olita/Alytus) can be found on the Polish map site mapywig.org . Please NOTE this is a large / detailed map. The area of this article is in the left-center area on the river.

April 2, 2012

1940 Census Preparations – Pays 1st Dividend

by C. Michael Eliasz-Solomon

Stanczyk,

Found his grand-uncle Michael Leszczynski (deputy sheriff) at 5071 Broadway, Depew, Erie County, NY in the 1940 US Census. He was in ED 15-37, on SHT 6-A (line 4 was Michael and his wife Felicia was on line 5). Click on the link if you have access to Ancestry.com.

Kudos to Ancestry.com for getting their 1940 US Census working in short order. Their Image Viewer is excellent, very fast.

April 1, 2012

1940 US Census – Here’s What Enumeration Districts I’m Researching

by C. Michael Eliasz-Solomon

2nd-April-2012 (72 years are up)

Here is Stanczyk’s initial research list before there are complete indexes.

Enumeration Districts (EDs)

By State/County:

MI-Wayne-Detroit — 84-590,  84-710,  84-583,  84-584,  84-586,  84-1246,  84-1471

MI-Macomb — 50-70A

MI-St Clair — 74-14

NY-Erie-Depew — 15-37

OH-Lucas-Toledo — 95-217,  95-221

PA-Philadelphia — 51-22

Families

MI — Eliasz, Epperly, Gawlik/Gawlikowski, Gronek, Kedzierski, Vespek, Wlecial/Wlecialowski

NY — Leszczynski (Frank, Michael, Teofil)

OH — Eliasz, Mylek, Sobieszczanski

PA — Solomon

Related Spreadsheet

http://mikeeliasz.wordpress.com/2012/03/24/1940-us-census-9-days-away-genealogy-preparation/

Related 1940 Census Info (EDs, etc)

http://www.archives.gov/research/census/1940/finding-aids.html#maps

March 30, 2012

Ancestry Adds 1940 US Census ED Maps — #Genealogy, #1940, #Census

by C. Michael Eliasz-Solomon

Stanczyk saw that Ancestry.com released/updated the 1940 US Census, Enumeration District Maps. It actually says ‘and Descriptions’ in its database title, but for the life of me I did not see any textual descriptions nor any images of words other than Legends and stray comments on hospitals, asylums, nunneries, etc (which were interleaved in the whitespace of the maps).

I queried on the ED I got from Steve Morse’s One-Step website (unified census page) that let me convert 1930 EDs into 1940 EDs. I used ED 84-590 (where I expect to find my grandmother and her children — including my father).

I did an exact search on 84-590 and Ancestry showed me an option for either the city map or the county map. While the county map was interesting, the city map of Detroit was what I was after. I clicked on the link to view the city map for ED 84-590, but what I got was page 1 of 46 pages (not the page where 84-590 was). Well I “gutted it out” and browsed sequentially through all of the pages, searching from one corner to the opposing corner and reading each and every ED, until I found ED 84-590 on page number 40.

That kind of brute force search was not a total waste. I did confirm 84-590 was the correct ED that I should search on Monday when they release the 1940 US Census. I was also able to confirm my Vespeks’ ED as either 84-1246 or, less likely (since it is for the prior address), 84-1252. Perhaps my dedicated readers will note that this is the one ED that was wrong in Steve Morse’s webpage lookup (it gave 84-1244 or 84-1245, which were close). The fault, as I said before, was not Steve Morse’s but the US government’s, for providing an inaccurate mapping of the 1930 ED to the 1940 ED. Still, the description of the EDs on Steve Morse’s lookup image did give me a look at the other descriptions nearby, and I was able to divine that 84-1246 should be the one I search. This also points out the value of Ancestry’s new database: I was able to look at the ED Map and confirm that 84-1246 was the correct ED and that EDs 84-1244/1245 were near misses to the known address I had.

I was also able to verify that ED 84-583/584 would probably contain my Gawliks and Wlecials [assuming they are enumerated in Detroit and not at the Macomb county farm address]. I could see how close they were to St. Adalbertus church and the last known addresses I had and how they were all closely clustered in the same area (not obvious from the addresses).

My only complaint is that Ancestry should take you to the correct page for your ED and not force you to do a brute force, page-by-page search. Detroit was a LARGE city in 1940; imagine NYC, Chicago or Philadelphia, which were (and still are) larger than Detroit, or LA, which was nearly as large. Those would be awful searches.  For my friends that have Polish family in Hamtramck, not to fear, there are only four pages to comb through. For the few people that I have emailed through the last few months about the CHENE St project, just go to image/page 40 of Detroit (or click on the link) and you are near my grandmother’s ED.

Archives.gov says you have 2 days and about 16 hours (and counting) to ready yourself for the 1940 US Census. Good Luck!

March 18, 2012

Dziennik Polski Detroit Newspaper Database App Search Page

by C. Michael Eliasz-Solomon

Stanczyk,

was finally able to use his training from Steve Morse’s presentation at RootsTech 2012 to create a One-Step Search App for the Dziennik Polski Detroit Newspaper Database.

To search on 30,920 Polish Vital Record Events, just go to the new Dziennik Polski Detroit Newspaper Database App Search page (on the right, under PAGES,  for future reference).

FAQ

For more background on the Dziennik Polski Detroit Newspaper click on the link.

You can search on the following fields:

Last Name – exact means the full last name exactly as you typed it. You can also select the ‘starts with’ radio button and just provide the first few starting characters. Do not use any wild card characters!

First Name – exact means the full first name exactly as you typed it. You can also select the ‘starts with’ radio button and just provide the first few starting characters. Do not use any wild card characters!

Newspaper Date – exact means that you need to enter the full date. Dates are of the format:

06/01/1924 (for June 1st, 1924). Format is MM/DD/YYYY. Leading zeros are required for a match.

You can use ‘contains’ radio button to enter a partial date. The most useful partial is just to provide the Year (YYYY). Do not use any wild card characters!

Event Type – exact means the full event type. This is not recommended. You SHOULD select the ‘starts with’ radio button and just provide the first few starting characters. Do not use any wild card characters! Uppercase is not required.

Valid Events Types: BIRTH,  CONSULAR,  DEATH,  or MARRIAGE

Indexer – exact means the full indexer exactly as you typed it. You can also select the ‘starts with’ radio button and just provide the first few starting characters. Do not use any wild card characters!

The Indexer is meant to be informational only, but you could conceivably want to search on this field too, so it is provided.
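For the programmers in the audience, the three radio-button modes described above can be sketched in a few lines of Python. This is only an illustration of the matching rules as the FAQ states them, not the code behind the search page. The function name, the sample record, and the choice to ignore upper/lower case on every field are all assumptions (the FAQ only says uppercase is not required for Event Type).

```python
def field_matches(value, query, mode="exact"):
    """Compare one database field against what the user typed.

    mode is one of 'exact', 'starts with' or 'contains'.
    No wildcard characters are interpreted; the query is literal text.
    """
    v, q = value.upper(), query.upper()  # assumption: case-insensitive
    if mode == "exact":
        return v == q
    if mode == "starts with":
        return v.startswith(q)
    if mode == "contains":
        return q in v
    raise ValueError(f"unknown mode: {mode!r}")


# A made-up index row, for illustration only.
record = {
    "last_name": "Leszczynski",
    "first_name": "Frank",
    "newspaper_date": "06/01/1924",   # MM/DD/YYYY, leading zeros kept
    "event_type": "DEATH",
}

if __name__ == "__main__":
    print(field_matches(record["last_name"], "Lesz", "starts with"))
    print(field_matches(record["newspaper_date"], "1924", "contains"))
    print(field_matches(record["newspaper_date"], "6/1/1924", "exact"))
```

The last call shows why the leading zeros matter: an exact match compares the text character for character, so 6/1/1924 is not the same string as 06/01/1924.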

March 10, 2012

Ancestry.com Broken ? Is Your GEDCOM Export OK? — #Genealogy, #Technology

by C. Michael Eliasz-Solomon

Stanczyk wants to know if anyone else is having problems Exporting their GEDCOM from Ancestry.com?


 This is what I see when I try to export my gedcom from the tree settings screen. It never gets past 0% complete.

I have tried to submit a Help Ticket for technical support and so far I have not received any response. What gives Ancestry?

I can still work on my tree and updates appear to be saved. I can synch to the Ancestry App (on the iPhone) and the changes are there too. 

March 3, 2012

Google’s Chrome Browser For Genealogy — #Genealogy, #Technology

by C. Michael Eliasz-Solomon

     Stanczyk was a big Mozilla/Firefox browser user. On Mac or Windows it did not matter. So it was a shock that I switched to Chrome (Google’s browser).

I did so mostly on Google’s promise that “microdata” would be another widget that would greatly enhance the search experience for genealogy data. I was waiting on that feature and still am.

On Tuesday I mentioned Virtual Keyboard 1.45, for entering your diacriticals through your browser into say Ancestry.com. Today, I was reading Kathy Judge Nemaric’s blog – “Dead Reckoning” [nice name for a genealogy blog] and she mentioned an extension to the Chrome Browser. It is called Ancestry Family Search Extension 2.4 .

     Open up a new Tab (Ctrl-T works) and click on Chrome Web Store. In the “Search Store” field, type in “Ancestry Family Search” and press the Enter key to bring up the extension (see on the left).

Click on the Add to Chrome button and then click on the Install button in the dialog box that pops up to confirm your wish. Once you have installed the extensions into your Chrome browser, it will show like the following screen:

Now you are ready to reap the rewards of that hard work. Go to Ancestry.com and perhaps open up your family tree on an individual you are working on. Now your browser’s address bar has a new  “widget”. Next to the STAR widget you have been using to Bookmark pages is a new widget shaped like a TREE.

See the red circle (and arrow)? Just click on that and it will bring up a new window on top of the current TAB in your browser with (in my case) the Tomasz Leszczynski result set from the Family Search databases. If you click on one result, then a new TAB will open to the exact record in Family Search.

This is a very nice synergy between the two websites. So I am thinking, that if Google produces their microdata widget, that 2012 will be the year of the widget in Genealogy and perhaps the year of the CHROME browser too.

There is one microdata Schema Explorer browser extension already in the Chrome Web Store. But you will want to wait for Google’s which will use the website: http://historical-data.org/ . I am guessing Google will use this website to develop schemas to guide its browser.

2012 is shaping up to be a very good year for genealogy and to switch to CHROME!

February 28, 2012

Dying For Diacriticals … Beyond ASCII — #HowTo, #Genealogy, #Polish

by C. Michael Eliasz-Solomon

Stanczyk mused recently upon a few of the NAMEs in my genealogy:

Bębel, Elijasz, Guła, Leszczyński, Kędzierski, Wątroba, Wleciał, Biechów, Pacanów, Żabiec

If you want to write Elijasz (or any of its variants) you are golden. But each of the other names requires a diacritic (aka diacritical mark). Early on, I had to drop the diacritics, because I did not have computer software to generate these characters (aka glyphs). So my genealogy research and my family tree were recorded in ASCII characters. For the most part that is not a concern unless you are like John Rys and trying to find all of the possible ways your Slavic name can be spelled/misspelled/transliterated and eventually recorded in some document and/or database that you will need to search. Then the import becomes very clear. Also, letters with a diacritic sort differently than letters without one. For years, I thought Żabiec was not in a particular Gazetteer I use, until I realized there was a dot above the Z and the dotted-Z villages came after all of the plain Z (no dot) villages, and there was Żabiec many pages later! The dot was not recorded in the Ship Manifest, nor in a Declaration of Intent document, so I might not have found the parish that Żabiec belongs to so easily. I hope you are beginning to see the import of recording diacritics in your family tree.
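The sorting surprise is easy to reproduce. Below is a small Python sketch, using only the standard library, that shows a dotted-Z name landing after every plain-Z name in a simple sort, plus a hypothetical helper (fold_to_ascii is my own name) that produces the diacritic-free spelling you may need when searching an ASCII-only index. Zagaje and Zborów are just illustrative placenames to fill out the list.

```python
import unicodedata


def fold_to_ascii(text):
    """Return the spelling with diacritical marks removed.

    Most Polish letters decompose into a base letter plus a combining
    mark; the slashed L does not, so it is handled explicitly.
    """
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.replace("ł", "l").replace("Ł", "L")


villages = ["Żabiec", "Zborów", "Zabiec", "Zagaje"]

if __name__ == "__main__":
    # A plain sort compares code points, so Ż (U+017B) follows every Z (U+005A).
    print(sorted(villages))
    # Sorting on the folded spelling interleaves the dotted and plain names.
    print(sorted(villages, key=fold_to_ascii))
    for name in ["Leszczyński", "Kędzierski", "Wleciał", "Pacanów"]:
        print(name, "->", fold_to_ascii(name))
```

A good practice is to record the name with its diacritics in the tree and keep the folded spelling only as a search aid, since the folding throws information away.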

How?

The rest of my article today teaches you how to do this. Mostly we are in a browser, surfing the ‘net, in all its www glory. After my “liberal indoctrination” (aka #RootsTech 2012), I have switched browsers to Google’s Chrome (from Mozilla Firefox) browser. Now I did this to await the promised “microdata” technology that will improve my genealogical search experience.  I am still waiting,  Mr Google !!!   But while I am waiting, I did find a new browser extension that I am rather fond of that solves my diacritical problem: Virtual Keyboard Interface 1.45. I just double-click in a text field and a keyboard pops-up:

Just double-click on a text field, say at Ancestry.com . Notice the virtual keyboard has a drop down (see “Polski“), so I could have picked Русский (for Russian) if I was entering Cyrillic characters into my family tree.

But I want to keep using my browser …            OK!  Now I used to prepare an MS Word document or maybe a Wordpad document with just the diacriticals I need (say Polish, Russian, and Hebrew) then I can cut & paste them from that editor into my browser or computer application as needed — a bit tedious and how did I create those diacritical characters anyway?

I use  Character Map in Windows and Character Palette -or- Keyboard Viewer  on the MAC:

Now if I use one of these Apps, then I can forgo the Wordpad document  ( of special chars. ) altogether and just copy / paste from these to generate my diacritical characters.

What I would like to see from web 2.0 pages and websites is what Logan Kleinwaks did on his WONDERFUL GenealogyIndexer.org website. Give us a keyboard widget like Logan’s, please ! What does a near perfect solution look like …

Logan has thoughtfully provided ENglish, HEbrew, POlish, HUngarian, ROmanian, DEutsche (German),  Slavic, and RUssian characters. Why is it only nearly perfect? Logan, may I please have a SHIFT (CAPITAL) key on the BKSP / ENTER line for uppercase characters? That’s it [I know it is probably a tedious bit of work to do this].

Beyond ASCII ?

The title said beyond Ascii. So is everything we have spoken about. Ascii is a standard that is essentially a typewriter keyboard, plus the extra keys (ex. Backspace, Enter, Ctrl-F, etc.) that do special things on a computer. So what is beyond Ascii? Hebrew characters, Chinese/Japanese glyphs (串), Cyrillic (Я), Polish slashed-L (Ł), or Dingbats (❦ – Floral Heart). You can now enter any of these beyond ascii characters (UNICODE) in any program with the above suggestions.

Programmer Jargon – others  proceed with caution …

The above are all UNICODE characters. UTF-8 can encode every UNICODE code point (about 1.1 Million are possible, far fewer are assigned so far) as a sequence of one to four 8bit bytes (called octets; because UTF-8 is a stream of single bytes, it is not concerned with big/little endianness). In fact, UTF-8’s first 128 characters are an exact 1:1 mapping of ASCII, making any ascii file a valid UTF-8 file. In fact, more than half of all web pages out on the WWW (‘Net) are encoded with UTF-8. Makes sense that our gedcom files should be too! In fact UTF-8 can have that byte-order-mark (BOM) at the front of our gedcom or not and it is still UTF-8. In fact the UNICODE standard neither requires nor recommends a byte order mark at the beginning of a UTF-8 file [see Chapter 2 of UNICODE]. So please FamilySearch remove the BOM from the GEDCOM standard.

If FamilySearch properly defines the newline character in the gedcom grammar [see Chapter 5, specifically 5.8 of UNICODE] then there is nothing in the HEAD tag that would be unreadable to a program written in say Java (whose strings are UTF-16, covering U+0000 to U+FFFF directly and the rest of UNICODE through surrogate pairs) unless there is an invalid character, which then makes the gedcom invalid. Every character in the HEAD tag, with the exception of the BOM, is within 7bit ascii, and UTF-8 encodes those characters exactly as ascii does. So you could use any computer language that is at least UTF-8 compliant to read/parse the HEAD tag (which has the CHAR tag and its value that defines the character set) of any gedcom saved in an ascii-compatible encoding. Using UTF-8 as a default encoding to read the HEAD will work even if there is a UTF-8 BOM. The one case that needs more care is a gedcom saved in a 16bit encoding such as UTF-16, where the BOM is what tells the reader the byte order.
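Here is a short Python sketch of the points above, using only the standard library. It checks that ASCII text has identical bytes in UTF-8, shows how many octets a few of the characters mentioned earlier need, and reads the start of a UTF-8 gedcom whether or not a BOM is present. The helper names are my own and the two-line gedcom fragment is made up for the demonstration.

```python
import codecs


def utf8_length(ch):
    """Number of octets UTF-8 needs for one character."""
    return len(ch.encode("utf-8"))


def decode_utf8_gedcom(raw):
    """Decode UTF-8 bytes, skipping a leading BOM if one is present.

    Python's 'utf-8-sig' codec accepts input with or without the BOM.
    """
    return raw.decode("utf-8-sig")


HEAD_FRAGMENT = "0 HEAD\n1 CHAR UTF-8\n"   # illustrative fragment only

if __name__ == "__main__":
    # ASCII text is byte-for-byte identical in UTF-8.
    print(HEAD_FRAGMENT.encode("ascii") == HEAD_FRAGMENT.encode("utf-8"))

    for ch in ["A", "Ł", "Я", "串", "❦"]:
        print(ch, utf8_length(ch), "octet(s)")

    plain = HEAD_FRAGMENT.encode("utf-8")
    with_bom = codecs.BOM_UTF8 + plain
    print(decode_utf8_gedcom(plain) == decode_utf8_gedcom(with_bom))
```

This only covers UTF-8 input. A file in UTF-16 would need its BOM inspected first, which is outside this sketch.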

February 27, 2012

#PA #Genealogy – Access To Vital Statistics — Public Access, Privacy Law, PA Act 110, SB 361

by C. Michael Eliasz-Solomon

PA Act 110 – Public Records (formerly known as Senate Bill 361)

This bill amends the Act of June 29, 1953 (P.L. 304, No. 66), known as the Vital Statistics Law of 1953, to provide for public access to certain birth and death certificates after a fixed amount of time has passed. This legislation provides that such documents become public records 105 years after the date of birth or 50 years after the date of death.

This is a mixed bag, but at least it’s consistent. I wish it was 72 years (like the census) instead of 105. Also the 50 years after death is way too long. Dead is dead. Maybe you could make a case for 5-10 years. By doing greater than 30-35 years you are forcing genealogy research to skip generations since the current generation would die before gaining access. Genealogists will have to will research plans to children in PA.

The indexes (I hate the word indices) are here: Birth Index (1906 — so far that’s it) | Death Index (1906-1961).   By the way, you will need the American Soundex of the last name as this is how the records are sorted:  American Soundex of Surname, followed by alphabetical on FirstName. Use Steve Morse’s Soundex One-Step page.
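If you would rather compute the code yourself, here is a minimal Python sketch of American Soundex as it is commonly described: keep the first letter, code the remaining consonants as digits, collapse repeated codes, let H and W pass without breaking a run, and pad to four characters. It assumes the surname is already in plain A-Z letters (anything else is dropped), and the sort-key helper simply mirrors the ordering described above. Treat it as a cross-check against Steve Morse's page rather than a replacement.

```python
SOUNDEX_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _letter in _letters:
        SOUNDEX_CODES[_letter] = _digit


def soundex(surname):
    """American Soundex code (letter + three digits) for a surname."""
    letters = [c for c in surname.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    previous = SOUNDEX_CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            continue                  # H and W do not separate equal codes
        code = SOUNDEX_CODES.get(ch, "")   # vowels and Y code to ""
        if code and code != previous:
            digits.append(code)
        previous = code               # a vowel resets the run
    return (first + "".join(digits) + "000")[:4]


def index_sort_key(surname, first_name):
    """Ordering described for the index: Soundex code, then first name."""
    return (soundex(surname), first_name.upper())


if __name__ == "__main__":
    for name in ["Leszczynski", "Eliasz", "Solomon", "Ashcraft", "Pfister"]:
        print(name, soundex(name))
```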

February 16, 2012

GEDCOM “RailRoad Tracks” (aka Graphic Syntax Diagram) – #Genealogy, #Technology

by C. Michael Eliasz-Solomon

The above diagram is what Stanczyk had been jabbering about since the #RootsTech conference. Isn’t that much easier on the eyes and the grey matter than a complex UML diagram? Who even knows what a UML diagram is or if it is correct or not?

What does it say is in a GEDCOM file (ex.  Eliasz.ged)?

A HEAD tag  optionally followed by a SUBmissioN Record followed by 1 or more GEDCOM lines followed by a TRLR tag.

ex. gedcom lines  that can be “traced” along the railroad tracks at the top.

 0 HEAD
 1 SOUR Stanczyk_Software
 1 SUBM @1@
 1 GEDC
 2 VERS   5.5.1
 2 FORM  LINEAGE-LINKED
 1 CHAR  UNICODE
 0 @1@ SUBM
 ...
 0 TRLR

OK Stanczyk_Software does not exist, but was made up as a fictitious valid SOURce System Identifier name. The GEDCOM file (*.ged) is a text file and you can view/edit the file with any text editor (vi | NotePad | WordPad | etc.). I do not recommend editing your gedcom outside of your family tree software, but there is certainly nothing stopping you from doing that (DO NOT TRY THIS AT HOME). If you knew gedcom, you could correct those erroneous/buggy gedcom statements that are generated by so many programs and that cause poor Dallan Quass to ONLY achieve 94% compatibility with his GEDCOM parser.
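To make the “level, optional cross-reference, tag, optional value” shape of a gedcom line concrete, here is a small Python sketch that splits the sample lines above into their parts and checks the outline from the railroad diagram (HEAD first, TRLR last). It is a toy for tracing the tracks, not a GEDCOM parser; the regular expression and function names are my own simplifications and ignore details such as line-length limits and escapes.

```python
import re

# level, optional @XREF@, tag, optional value
GEDCOM_LINE = re.compile(r"^\s*(\d+)\s+(?:(@[^@]+@)\s+)?(\S+)(?:\s+(.*))?$")


def parse_line(line):
    """Split one gedcom line into (level, xref, tag, value)."""
    match = GEDCOM_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a gedcom line: {line!r}")
    level, xref, tag, value = match.groups()
    return int(level), xref, tag, value or ""


def follows_outline(text):
    """True if the level-0 records start with HEAD and end with TRLR."""
    records = [parse_line(l) for l in text.splitlines() if l.strip()]
    top_level = [tag for level, _xref, tag, _value in records if level == 0]
    return len(top_level) >= 2 and top_level[0] == "HEAD" and top_level[-1] == "TRLR"


SAMPLE = """0 HEAD
1 SOUR Stanczyk_Software
1 SUBM @1@
1 GEDC
2 VERS   5.5.1
2 FORM  LINEAGE-LINKED
1 CHAR  UNICODE
0 @1@ SUBM
0 TRLR"""

if __name__ == "__main__":
    for line in SAMPLE.splitlines():
        print(parse_line(line))
    print("outline ok:", follows_outline(SAMPLE))
```

Notice that `1 SUBM @1@` and `0 @1@ SUBM` parse differently: in the first, @1@ is a value (a pointer to the submitter), while in the second it is the record's own cross-reference ID.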

Have you ever downloaded your gedcom from ANCESTRY and then uploaded it to RootsWeb? Then you might see all those crazy _APID  tags.   It is a custom tag (since it begins with an underscore  — GEDCOM rules dear boy/girl).   It really messed up my RootsWeb pages with gobbledygook. I finally decided to edit one gedcom and remove all of the _APID tags before I uploaded the file to RootsWeb. Aaah that is SO much better on the eyes. Oh I probably do not want to re-upload the edited gedcom into ANCESTRY, but at least my RootsWeb pages are so much better!   The _APID is just a custom tag for ANCESTRY (who knows what they do with it) so to appeal to my sense of aesthetics, I just removed them — no impact on the RootsWeb pages, other than improved readability. [If you try this, make a backup copy of the gedcom and edit the backup copy!]
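The hand edit described above can also be scripted. The Python sketch below drops every line carrying a given custom tag, along with any lines nested beneath it, and leaves everything else untouched. The function name is hypothetical and the sample record (including the _APID value) is invented for the demonstration. As the post says, run something like this on a backup copy only.

```python
def strip_custom_tag(lines, tag="_APID"):
    """Return the gedcom lines without `tag` lines or their sub-lines."""
    kept = []
    skipping_below = None          # level of the tag line being removed
    for line in lines:
        parts = line.strip().split(maxsplit=2)
        if not parts or not parts[0].isdigit():
            kept.append(line)      # not a normal gedcom line: pass through
            continue
        level = int(parts[0])
        if skipping_below is not None and level > skipping_below:
            continue               # still inside the removed structure
        skipping_below = None
        if len(parts) > 1 and parts[1] == tag:
            skipping_below = level
            continue
        kept.append(line)
    return kept


# Invented sample; the _APID value is a placeholder, not real data.
SAMPLE_LINES = [
    "0 @I1@ INDI",
    "1 NAME Tomasz /Leszczynski/",
    "1 SOUR @S1@",
    "2 _APID 0,0000::0",
    "2 PAGE example page",
    "0 TRLR",
]

if __name__ == "__main__":
    for line in strip_custom_tag(SAMPLE_LINES):
        print(line)
```

To use it on a file, read the backup copy into a list of lines, pass the list through the function, and write the result to a new file name so the original download stays intact.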

Now obviously the above graphic syntax diagram is not complete. It needs to be resolved to a very low level of detail such that all valid GEDCOM lines can be traced. It also requires me/you to add in some definitional things (like exactly what is a level# — you know those numbers at the beginning of each line).

I have a somewhat mid-level graphic syntax diagram that I generated using an Open Source (i.e. free) graphic syntax diagrammer. As I said in one of my comments, I will send it to whoever asks (already sent it to Ryan Heaton & Tamura Jones). You can get a copy of Ryan Heaton’s presentation from RootsTech 2012 and compare the diagram to his UML diagram (an object model). I think you will quickly realize that you cannot see how GEDCOM relates to the UML diagram, so it is difficult to ask questions or make suggestions. A skilled data architect/data modeler or a high-level object-oriented programmer could make the comparison and intuit what FamilySearch is proposing, but a genealogist without those technical skills could NOT.

I am truly asking the question, “Can a genealogist without a computer science degree or job read the above diagram and trace with his finger a valid path of correct GEDCOM syntax [assuming a whole set of diagrams were published]?” The idea is to see how the GEDCOM LINES (in v5.5.1 parlance FAMILY_RECORD, INDIVIDUAL_RECORD, SOURCE_RECORD, etc.) are defined and whether or not what FamilySearch is proposing is something complete/usable that advances the capabilities of the current generation of software without causing incompatibilities (ruining poor Dallan Quass’s 94% achievement). Will it finally allow us to move the images/audio/video multimedia types along with the textual portion of our family trees and keep those digital objects connected to the correct people when moving between software programs?

 

GEDCOM files are like pictures of our beloved ancestors. They live on many years beyond those that created them. Let’s not lose any of them OK?

February 13, 2012

Blog Bigos …

by C. Michael Eliasz-Solomon

Stanczyk added a new Page (Tech Diary) to record my technology doings.

While doing that and reading from my blogroll (and emails), I discovered some history about the “defacto standard GEDCOM” (wiki: GEDCOM ). I strongly recommend you start from the “defacto” link rather than the wikipedia link.

  • RootsTech 2012 – had two GEDCOM presentations by Ryan Heaton (FamilySearch, GEDCOMX project).
  • RootsTech 2012 – had one open source GEDCOM parser presentation by Dallan Quass. Dallan was quite remarkable in his efforts to achieve a 94% commonality amongst 7,000 different GEDCOM files. Dallan Quass has a GitHub project for his Open Source GEDCOM parser.
  • Modern Software Experience (Tamura Jones) had a couple articles that caused me to write this article. His most recent GEDCOM article that caught my eye was:  BetterGEDCOM (2/2/2012). I also noticed he had a GEDCOMX article from 12/12/2011. These two articles provide a good discussion. I also noticed that the BetterGEDCOM project had their own project blog. [also see his Gentle Introduction to GEDCOM  article].

I believe those provide the most current thoughts on GEDCOM (that I have not penned).

  • I have been studying GEDCOM v5.5 (the last GEDCOM standard).
  • I produced a partial Graphic Syntax Diagram of GEDCOM v5.5 [what I had been calling "Railroad Tracks"] just to demonstrate how I thought this diagram was a better vehicle to communicate the standard [than say UML object models].
  • I could not resist making slight tweaks to GEDCOM v5.5 even in my preliminary studies. Mostly so we could discuss GEDCOM in a readable fashion (i.e. whitespace for formatting, and comment lines ) or because the language cries out for consistency (i.e. requiring the HEAD tag to be a zero level, just like the TRLR tag).

My Graphic Syntax Diagram of GEDCOM v5.5 was produced using an open source tool. It is partial and still high level. I did put in a construct so that you can clearly see all 128 standard tags. The Graphic Syntax Diagrammer is an excellent tool. I will have to offer the author a suggestion for the PNG images that it outputs. I need to take my diagram and manually edit it to make the drawing a better fit for 8.5″ x 11.0″ (aka US Letter) paper. I need to graphically wrap the railroad tracks and add page breaks so that the image is itself usable for viewing/discussions. I will offer this sample drawing to any interested parties, including emailing the edited product to Ryan Heaton and Dallan Quass [who, since they did not request it, can feel free to ignore it].

My goal is to make minor tweaks to GEDCOM v5.5 via this diagram [not programming] and try to get DallanQ to produce a one-off parser for it (call it, say, GEDCOM 5.5.999) and hope that my tweaks will not lower the 94% compatibility Dallan worked so hard to achieve. If it turns out to have virtually no effect on the 94% compatibility of his Open Source parser, then I can think about getting some software vendors to utilize the enhancements (via end user requests), since they are trivial, just to move the standard forward and to interest the vendors in looking at how we create a new Open Standard for GEDCOM.

P.S.

Thanks to Tamura Jones, I now know I need to update my diagram to GEDCOM v5.5.1 first.

February 5, 2012

Google Me Some Shiny New Genealogical Data

by C. Michael Eliasz-Solomon

Google was at RootsTech 2012. Google was a Keynoter, Google was a Vendor and Google was a presenter. Google was in the house. The tech gear had some Android devices in the audience too.

Only Apple had more technology there. Unfortunately, it was only among the users, developers, and presenters. Tim Cook, bring Apple to RootsTech 2013!!! Your customers deserve the same presence from Apple as from Google. As I said in my last article, there were iPads, iPhones, MacBooks (mostly Pro, but some Air); the attendees were so tech laden you would have thought Ubiquitous Computing had arrived. Isn’t there a recession? Where did all these tech warriors come from? These were users a bit more than developers. Bloggers were numerous, and most wore Mardi Gras beaded necklaces so they were recognizable. Then you had secret bloggers such as Stanczyk. Everyone was a genealogist. Users encouraged Vendors/Developers with praise and requests for more/better technology. Oh, and make the tech transparent.

But this is about Google. Before the conference I had written the Google tech off as too low brow to bother with. Then Jay Verkler, who is apparently the Steve Jobs of genealogy, showed up. He was the Keynoter on day one. Stanczyk is a genealogist and I have been to genealogy conferences before. These are usually staid affairs. Genealogists are … how should I put it … umm, old. It is not unusual to see octogenarians and nonagenarians (90’s). But the energy in the auditorium of 4,200 conference attendees was electric. These were not stodgy Luddites. Notebooks and pens were almost nonexistent!! People were excited and very much anticipating (what, I do not think we had a clue), but expectations were off the charts.

Jay did not disappoint. He was personable and masterful in his presentation skills. Mr Verkler is a Visionary like Steve Jobs and the audience knew it and responded. It was Jay who wove the vision which everyone now wants ASAP. He brought up Google and my eyes were prepared to glaze over. I did not even record the Google execs’ names [shame on me]. They were good! They had prepared for RootsTech and they showed brand new tech and also Microcode. I do not have words to express what I saw, but everyone in the audience wanted it.

Google showed Microcode which would be a Google Chrome plug-in and appear as a widget/icon in the address bar that can do amazing search/exchange tricks in a Web 2.0+ way. It would utilize Historical-Data.org in some unspecified way to do this genealogy magic. It was beyond amazing. Google created a genealogy plug-in!! Google is apparently also coordinating in an API-like way to transfer these search result magics into other websites like FamilySearch, Ancestry, etc. that put this magic into the beyond amazing realm.

Firefox and Safari, take note if you do not want to see a massive shift to Chrome. I am pretty sure all genealogists will use Chrome when the Microcode widget arrives.

February 5, 2012

Is GEDCOM Dead? Date/Place of Death, Please?

by C. Michael Eliasz-Solomon

The RootsTech Conference is living up to its name. Everywhere there was a sea of iPhones/Androids, iPads (in huge numbers), and laptops. Even the very elderly were geared up. Google, Dell, and Microsoft were at RootsTech, so why not Apple, especially since their customers were present in LARGE numbers??? [Note to Tim Cook: have Apple sponsor and show up as a vendor.]

According to Ryan Heaton (FamilySearch), “GEDCOM is stale.” He went on to speak about GEDCOMX as the next standard as if GEDCOM were old and/or dead. They were not even going to make GEDCOMX backwards compatible! In a later session I had with Heaton I asked the Million dollar question, “How do I get my GEDCOM into GEDCOMX?” After a moment’s pause he said they’d write some sort of tool to import or convert the existing GEDCOM files. Well, that was reassuring??? So they want GEDCOMX to be a standard, but FamilySearch are the only ones working on it and they have not had the ability to reach out to the software vendors yet (I know, I asked).

My suggestion was to publish the language (like HTML, SQL, or GEDCOM). I asked for “railroad tracks” (what we used to call finite state automata, and what Oracle uses to demonstrate SQL syntax): statements that are valid, with options denoted, and even APIs for embedding SQL into other programming languages. With a published syntax like that, it is easy to write a parser or something akin to a validator (like W3C has for HTML).
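To give a feel for how small such a validator could start out, here is a hedged Python sketch. It checks only three structural rules as I understand them from GEDCOM v5.5 (the file begins with 0 HEAD, ends with 0 TRLR, and a level never increases by more than one from the previous line). The function name check_structure is hypothetical, and a real validator would need the whole published syntax, not this tiny subset.

```python
def check_structure(lines):
    """Return a list of (line_number, message) problems; an empty list means all checks passed."""
    problems = []
    records = []          # (level, tag) for every well-formed line, in order
    previous_level = -1   # so that the first line is only valid at level 0
    for number, raw in enumerate(lines, start=1):
        parts = raw.strip().split()
        if not parts or not parts[0].isdigit():
            problems.append((number, "line does not start with a level number"))
            continue
        level = int(parts[0])
        if level > previous_level + 1:
            problems.append((number, f"level jumps from {previous_level} to {level}"))
        previous_level = level
        # The tag is the second token, or the third when the second is an @XREF@.
        tag_index = 2 if len(parts) > 2 and parts[1].startswith("@") else 1
        tag = parts[tag_index] if len(parts) > tag_index else ""
        records.append((level, tag))
    if records and records[0] != (0, "HEAD"):
        problems.append((1, "file should begin with 0 HEAD"))
    if records and records[-1] != (0, "TRLR"):
        problems.append((len(lines), "file should end with 0 TRLR"))
    return problems


good = ["0 HEAD", "1 GEDC", "2 VERS 5.5", "0 @I1@ INDI", "1 NAME John /Doe/", "0 TRLR"]
bad = ["0 HEAD", "2 DATE 1940", "0 TRLR"]
good_report = check_structure(good)
bad_report = check_structure(bad)
```

Each rule here corresponds to one "segment" a finger could trace on the railroad tracks, which is the point: a diagram that a person can follow is also one that a program can check.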

[Image: GEDCOM Tags]

Dallan Quass took a better tack on GEDCOM. His approach was more evolutionary, rather than revolutionary. He collected some 7,000+ gedcoms and wrote an open source parser for the current GEDCOM standard (v5.5). He analyzed the flaws in the current standard and saw unused tags, tags like ALIA that were always used wrong, custom tags and errors in applying the standard. He also pointed out that the concept of a NAME is not fully defined in the standard and so is left to developers (i.e. vendors) to implement as they want. These were the issues making gedcoms incompatible between vendors. He said his open source parser could achieve 94% round trip from one vendor to another vendor.
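The kind of tag survey described above can be approximated on your own files. The Python sketch below is not Dallan Quass's parser or his method; it is just an illustration, with hypothetical function names tag_counts and custom_tags, that tallies how often each tag appears and picks out the custom ones (those beginning with an underscore).

```python
from collections import Counter


def tag_counts(lines):
    """Count how many times each tag appears in a list of GEDCOM lines."""
    counts = Counter()
    for raw in lines:
        parts = raw.split()
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # skip blank or malformed lines
        # The tag is the second token, or the third when the second is an @XREF@.
        tag = parts[2] if len(parts) > 2 and parts[1].startswith("@") else parts[1]
        counts[tag] += 1
    return counts


def custom_tags(counts):
    """Return only the tags that begin with an underscore (vendor extensions)."""
    return {tag: n for tag, n in counts.items() if tag.startswith("_")}


# Hypothetical sample lines; the _APID values are invented placeholders.
sample = [
    "0 HEAD",
    "0 @I1@ INDI",
    "1 NAME John /Doe/",
    "1 SOUR @S1@",
    "2 _APID 1,1111::0",
    "1 BIRT",
    "2 _APID 1,2222::3",
    "0 TRLR",
]
counts = tag_counts(sample)
extensions = custom_tags(counts)
```

Run over a large collection of files, the same tally would show which standard tags never appear and which vendor extensions are most common.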

Now that made the GEDCOMX guys take notice — here was their possible import/conversion tool.

The users just want true portability of their own gedcoms and the ability to not have to re-enter pics, audio, movies over and over again. RootsTech’s vision of APIs that would allow the use of “authorities” to conform names, places, and sources would also help move genealogy to the utopian future Jay Verkler spoke of at the keynote. APIs would also provide bridges into the GEDCOM for chart/output tools, utilities (merge trees), Web 2.0 sharing across websites / search engines / databases (more utopian vision).

GEDCOM is the obvious path forward. Why not improve what is mostly working and focus on the end users and their needs?

FamilySearch, get vendors involved and, for God’s sake, get Dallan Quass involved. Publish a new GEDCOM spec with RailRoad tracks (aka Graphic Syntax Diagrams) and then educate vendors and Users on the new gedcom/gedcomx. Create a new gedcom validator and let users run their current gedcoms against it to produce new gedcoms (which should be backward compatible with old gedcom, to get at least the 94% compliance that Quass can already achieve)!

Ask users for new “segments” in the railroad tracks to get new features that real users and possibly vendors want in future gedcoms. Let there be an annual RootsTech keynote where all attendees can vote via the RootsTech app on the proposed new gedcom enhancements.

How about that, FamilySearch? Is that doable? What do you, my readers, think? Email me (or comment below).


P.S.
       Do Not use UML models to communicate the standard. They are simply not accessible to genealogists. Trust me, I am a Data Architect.

January 30, 2012

Genealogy This Week … #Genealogy, #Technology, #Polish, #GroundHog

by C. Michael Eliasz-Solomon

To Stanczyk, it appears that 2012 has gotten off to a sluggish start (genealogically speaking). How about for you genealogists (email or comment)? Well, that is all about to change! Lisa Kudrow’s Who Do You Think You Are? returns this Friday with Martin Sheen as the subject.

RootsTech 2012 kicks off this week too. Did you notice they have an app (it’s free) for that? Even better, they will STREAM some of the conference for the benefit of all genealogists! Kudos to Roots Tech. All conferences (genealogical or not) should do these two things: offer an app and stream conference proceedings. This should definitely jump start genealogy.

Read these blogs. Yes, I am telling you it’s OK to read blogs other than this one. These people are “official Roots Tech bloggers”.

I discovered that I missed one of my holiday blogs (in my backlog) about the happy married couples in Pacanów parish from 1881. So I will post the names of 40 Happy couples and what record # (Akt #) they are in the Pacanów parish church book.  This is two years after my great-grandparents got married, but there is still a Jozef & Mary who are getting married (Jozef Elijasz). I once had to sort out the two Jozef Elijasz from 1879 and the one from 1881 who all married women named Mary in the village of Pacanów! Genealogy is hard.

Oh and Punxsutawney Phil will make an appearance this week and offer his weather prognostication skills (I really think his predecessor Pete was much better and more alliterative too). I am pretty sure Phil & Pete are German, so you will need a German genealogy site for their lineage. Quaint tradition (Pennsylvania), dragging a Ground Hog from its home to ask him about weather. I think Bill Murray’s movie captured it well. So be careful what you do this week, or you may be repeating it a few times.
