Posts tagged ‘Genealogy’

May 3, 2012

Greenland, China – Where Are You? — #Genealogy, #Blog, #Map

by C. Michael Eliasz-Solomon

The map you see is this blog’s reach since some time in February.

My hope with this blog is to reach Greenland and China (中国).

My blog is connected to Greenland in this one way…

My first cousin Stephen E. Eliasz (whom we always called “Butchy”) was stationed in Greenland. I remember my father’s comments about Butchy’s pictures from Greenland, which led me to picture it as icy. True enough, the Thule Air Base is the US base closest to the North Pole. I hope there is another Polish genealogist in Greenland who searches for and finds this blog. I am trying to fill in the above map with as many genealogists as possible from all over the globe.

My only connection to China, whose people are avid genealogists, is my fascination with GEDCOM and family trees. I used to think that if you were related to Genghis Khan (born circa 1162) then you would have the largest family tree, because he had a vast empire and many wives. However, time works its wonders in many ways. The people with the largest family trees are those related to Confucius (551-479 BC, the religious/philosophical founder).

The Confucius Genealogy, originally recorded by hand, was first printed in 1080 AD [80+ years before Genghis Khan's birth]. Now there is a Confucius Genealogy Compilation Committee that is responsible for collecting, collating and publishing the 2,500 years' worth of genealogical data in the latest compilation. According to a web post by Tamura Jones (2/17/2008),

Confucius family tree, last updated in 1930. Back then, the tree already had 560,000 members. Today, it has more than 2 million. The longest lines in the tree span 83 generations.

Tamura’s article was written (as a pre-announcement) before the most recent publication of the Confucius Genealogy, on September 24, 2009. That 2009 publication was the first time the Confucius Genealogy included female descendants. So I guess the extra 1,700 years of Confucius (孔子) trumps the extra wives that Genghis Khan had. That is my only connection to Chinese Genealogy (家谱).

Does anyone have more than 83 generations (with citations documenting your lineage)?

May 3, 2012

Genealogy Indexer – Logan Kleinwaks — #Genealogy, #Historical, #Directories, #Military, #Yizkor

by C. Michael Eliasz-Solomon

Stanczyk’s prior article on Genealogy Indexer, Logan Kleinwaks’ website that indexes historical city directories and other historical lists (e.g. Yizkor Books, Military Muster Lists, etc.), covered this amazing genealogical resource, which deserves a much higher rating than #116 on the current Top 125 Genealogical Websites.

Since my first blog article about GenealogyIndexer.org, Logan Kleinwaks has added a virtual keyboard (a software icon) for generating diacritical letters (think ogoneks and umlauts) as well as non-Latin characters (think Hebrew or Cyrillic) to make searching easier. This jester even uses that excellent piece of coding to generate the text for articles or data entry into genealogy software. You may remember, I wrote about that in “Dying for Diacriticals” or any of the other dozen articles (some of which cover GenealogyIndexer).

Well, in the last month Logan has really outdone himself in adding material to the website! I give up trying to keep up with the huge amounts of data he is publishing. You really need to follow Logan on Twitter (@gindexer). Thank you, Logan, for your amazing efforts.

April 30, 2012

Genealogy Top 125 Websites (2012 2nd Qtr) Released ! — #Genealogy, #Website, #Rankings, #Metrics

by C. Michael Eliasz-Solomon

The Latest Top 125 GENEALOGY Websites are out !

Not surprisingly, all things or owned by them are in the top 20.

The 1940 US Census that came out on April 2nd had a profound impact on the rankings. Obviously any web site related to the 1940 US Census had a boost in its ranking (except Ancestry, which was already number 1). Here are the Top 125 Genealogy Websites (or click the image)! The One-Step Website, which is a kind of Swiss Army knife of genealogy, actually dropped about 100K in the ranking while rising nine places on the list to become the 19th highest rated website! This impressive improvement is related to the 1940 US Census, even though this is not one of the four websites with actual census pages.

Dr. Morse’s web page helps you find the best Enumeration District (ED) to browse (until indexes are created) by utilizing an address or the 1930 ED to point you at the valid 1940 ED(s) that you should begin your search with.

Mocavo is the new genealogy search engine. You can think of this as a Google for genealogy web pages and databases. This is a fairly newly launched service and made a big splash at this year’s RootsTech (2012). Mocavo, too, was up nine places on the list and is now the 17th highest rated website.

Looking 4 Kin

This relatively unknown website jumped an astounding 38 spots (now #47) on the top 125 and this jester thought that kind of improvement had to be mentioned.

New Additions

Louis Kessler‘s two websites: (#87) and (#74) were new additions. I also added to the list because it was one of the four websites hosting the 1940 US Census images. So cracked the list at #6. Well done! You may also recognize this website as the newest acquisition by

Stanczyk has had to give his own website an honorary spot, as my blog has dropped out of the top 125??? I am a bit surprised, as last year when my popularity increased 4-fold I gained 5M in the ranking and had a nice #120 spot. In 2012, thanks to you my faithful readers, my popularity increased between 2.5-3-fold again. Surprisingly, I dropped 5M in my rankings and I had to remove my website from the top 125. are you sure?

This jester is sorely puzzled as my website stats are off the charts this year and I have already matched last year’s unique reader count and it is only the end of April! Another indicator that my readership is up 3-fold. However, I yield to the methodology and look forward to making the list next quarter.

April 28, 2012

Slavic Roots Seminar — #Polish, #Genealogy, #PA, #PGSCTNE

by C. Michael Eliasz-Solomon

Stanczyk had a good day today. On a lark, I went down to see, with a new friend (Bob S.), the Slavic Roots Seminar at Nazareth Academy in Philadelphia, PA.

I thoroughly enjoyed the short one-day seminar put on by Lisa Alzo (The Accidental Genealogist), Matthew Bielawa, and Jonathan Shea (this jester owns no fewer than four of his books, and I learned to read Polish/Russian from his two-volume series In Their Words: A Genealogist’s Translation Guide).

I know both Matthew and Jonathan are PGSCT&NE members (as is this jester) and I thought the comfort level/camaraderie between the three of these presenters meant that Lisa too was a member/officer of PGSCT&NE.

This free seminar was extremely well attended (dozens of people). When you reach that level of critical mass you can find another genealogist who researches in an area near to your own family. Sure enough, someone near Bob & me was speaking about their Lithuanian heritage (Bob’s family line).

The three presenters were very knowledgeable and also very personable; the audience was often amused at one of the presenters’ jests.

This jester was also happy to finally hook up with Donna Pointkouski, the very talented blogger, of  the rather literate genealogy blog, “What’s Past is Prologue“. While she was not a presenter today (pity), she too was a part of the seminar set-up / organization.

My thanks to all four of these expert genealogists, writers, and presenters — you made today a better day !

March 20, 2012

Finding Your Roots, With Henry Louis Gates, Jr. Premieres ! — #Genealogy, #Popular, #Media

by C. Michael Eliasz-Solomon

We have already had a few episodes of “Who Do You Think You Are?” on NBC, a couple of really good ones too. This Sunday (March 25, 2012) PBS will air the start of their 10-part series, “Finding Your Roots, With Henry Louis Gates, Jr.“. So we can watch Helen Hunt on Friday and then on Sunday night we can learn about the roots of Cory Booker & John Lewis.

This series has already made a startling announcement related to comedian Wanda Sykes. In the NYT article “Family Tree’s Startling Roots” we are told that “The bottom line is that Wanda Sykes has the longest continuously documented family tree of any African-American we have ever researched,” said Mr. Gates. The Wanda Sykes episode will air in May, so mark that one down now!

I read the New York Times article (see prior link, twice) and was startled by Wanda Sykes’s family tree, a new twist again on African American genealogy (there is a fascinating reason why her roots are so well documented).

Scan Pages App

I was so motivated that I used the NYT newspaper article to demonstrate a new App on the iPhone (yes, it’s free), called Scan Pages. I had previously downloaded another Ricoh App, ImageToText (which OCRs an iPhone image and emails you the OCR’ed text), which I had occasion to use during the RootsTech 2012 conference. The Scan Pages App is not an OCR application. It does create images (B/W are much better than color) that can be mailed in JPG or PDF format. It cleans up the images (in terms of straightening or de-skewing) and emails them to you. I did the newspaper’s article on Wanda Sykes in PDF. I also did a Scan Pages black-and-white image of a 1905 directory of Catholic Churches in Detroit. If you compare the two PDF documents, it is clear the cleaned-up B/W (1905 Churches) beats the color (Wanda Sykes).

Later on this week, I am going to try using the App, PDF Splicer, to put together in one PDF document the two emails of the two page article on Wanda Sykes. I also have two pages of that 1905 Church Directory so I will also try combining those two PDFs as well in PDF Splicer.

Mobility Genealogy is a fast-paced niche. Keep tuned in here for ideas to use in your research.


March 3, 2012

Google’s Chrome Browser For Genealogy — #Genealogy, #Technology

by C. Michael Eliasz-Solomon

     Stanczyk was a big Mozilla/Firefox browser user. On Mac or Windows it did not matter. So it was a shock that I switched to Chrome (Google’s browser).

I did so mostly on Google’s promise that “microdata” would be another widget that would greatly enhance the search experience for genealogy data. I was waiting on that feature, and I still am.

On Tuesday I mentioned Virtual Keyboard 1.45, for entering your diacriticals through your browser. Today, I was reading Kathy Judge Nemaric’s blog, “Dead Reckoning” [nice name for a genealogy blog], and she mentioned an extension to the Chrome Browser. It is called Ancestry Family Search Extension 2.4.

     Open up a new Tab (Ctrl-T works) and click on Chrome Web Store. In the “Search Store” field, type in “Ancestry Family Search” and press the Enter key to bring up the extension (see on the left).

Click on the Add to Chrome button and then click on the Install button in the dialog box that pops up to confirm your wish. Once you have installed the extension into your Chrome browser, it will look like the following screen:

Now you are ready to reap the rewards of that hard work. Go to and perhaps open up your family tree on an individual you are working on. Now your browser’s address bar has a new  “widget”. Next to the STAR widget you have been using to Bookmark pages is a new widget shaped like a TREE.

See the red circle (and arrow)? Just click on that and it will bring up a new window on top of the current TAB in your browser with (in my case) the Tomasz Leszczynski result set from the Family Search databases. If you click on one result, then a new TAB will open to the exact record in Family Search.

This is a very nice synergy between the two websites. So I am thinking, that if Google produces their microdata widget, that 2012 will be the year of the widget in Genealogy and perhaps the year of the CHROME browser too.

There is one microdata Schema Explorer browser extension already in the Chrome Web Store. But you will want to wait for Google’s which will use the website: . I am guessing Google will use this website to develop schemas to guide its browser.

2012 is shaping up to be a very good year for genealogy and to switch to CHROME!

February 28, 2012

Dying For Diacriticals … Beyond ASCII — #HowTo, #Genealogy, #Polish

by C. Michael Eliasz-Solomon

Stanczyk mused recently upon a few of the NAMEs in my genealogy:

Bębel, Elijasz, Guła, Leszczyński, Kędzierski, Wątroba, Wleciał, Biechów, Pacanów, Żabiec

If you want to write Elijasz (or any of its variants) you are golden. But each of the other names requires a diacritic (aka diacritical mark). Early on, I had to drop the diacritics, because I did not have computer software to generate these characters (aka glyphs). So my genealogy research and my family tree were recorded in ASCII characters. For the most part that is not a concern unless you are like John Rys and trying to find all of the possible ways your Slavic name can be spelled/misspelled/transliterated and eventually recorded in some document and/or database that you will need to search. Then the import becomes very clear. Also, letters with an accent character (aka diacritic) sort differently than letters without the diacritic mark. For years, I thought Żabiec was not in a particular Gazetteer I use, until I realized there was a dot above the Z and the dotted-Z named villages came after all of the plain Z (no dot) villages and there was Żabiec many pages later! The dot was not recorded in the Ship Manifest, nor in a Declaration of Intent document. So I might not have found the parish that Żabiec belongs to so easily. I hope you are beginning to see the import of recording diacritics in your family tree.
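The gazetteer surprise can be reproduced in a few lines of Python. This is only a sketch: the alphabet string and both sort keys are hand-written for illustration (not taken from any gazetteer or library), q/v/x are simply appended at the end, and Zagórze and Zborów are stand-in names for "plain Z" villages.

```python
# Polish alphabet order: each letter with a diacritic follows its base letter.
POLISH_ALPHABET = "aąbcćdeęfghijklłmnńoóprsśtuwyzźż"
# q, v, x are not native Polish letters; appended last (a simplification).
ORDER = {ch: i for i, ch in enumerate(POLISH_ALPHABET + "qvx")}

# Table that throws the diacritics away, as an ASCII-only family tree would.
TO_ASCII = str.maketrans("ąćęłńóśźż", "acelnoszz")


def polish_key(name):
    """Rank every letter by its position in the Polish alphabet."""
    return [ORDER.get(ch, len(ORDER)) for ch in name.lower()]


def ascii_key(name):
    """Sort as if the diacritics had been dropped."""
    return name.lower().translate(TO_ASCII)


villages = ["Żabiec", "Zborów", "Zagórze", "Pacanów", "Biechów"]

where_i_looked = sorted(villages, key=ascii_key)      # Żabiec filed with the plain Z's
gazetteer_order = sorted(villages, key=polish_key)    # Żabiec after every plain Z

print(where_i_looked)
print(gazetteer_order)
```

With the dot ignored, Żabiec lands at the start of the Z villages; with Polish ordering it comes after all of them, which is exactly the "many pages later" effect.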


The rest of my article today teaches you how to do this. Mostly we are in a browser, surfing the ‘net, in all its www glory. After my “liberal indoctrination” (aka #RootsTech 2012), I have switched browsers to Google’s Chrome (from Mozilla Firefox) browser. Now I did this to await the promised “microdata” technology that will improve my genealogical search experience.  I am still waiting,  Mr Google !!!   But while I am waiting, I did find a new browser extension that I am rather fond of that solves my diacritical problem: Virtual Keyboard Interface 1.45. I just double-click in a text field and a keyboard pops-up:

Just double-click on a text field, say at . Notice the virtual keyboard has a drop down (see “Polski“), so I could have picked Русский (for Russian) if I was entering Cyrillic characters into my family tree.

But I want to keep using my browser … OK! I used to prepare an MS Word document or maybe a WordPad document with just the diacriticals I need (say Polish, Russian, and Hebrew); then I could cut & paste them from that editor into my browser or computer application as needed. That is a bit tedious, and how did I create those diacritical characters anyway?

I use  Character Map in Windows and Character Palette -or- Keyboard Viewer  on the MAC:

Now if I use one of these Apps, then I can forgo the Wordpad document  ( of special chars. ) altogether and just copy / paste from these to generate my diacritical characters.

What I would like to see from web 2.0 pages and websites is what Logan Kleinwaks did on his WONDERFUL website. Give us a keyboard widget like Logan’s, please ! What does a near perfect solution look like …

Logan has thoughtfully provided ENglish, HEbrew, POlish, HUngarian, ROmanian, DEutsche (German), Slavic, and RUssian characters. Why is it only nearly perfect? Logan, may I please have a SHIFT (CAPITAL) key on the BKSP / ENTER line for uppercase characters? That’s it [I know it is probably a tedious bit of work to do this].

Beyond ASCII ?

The title said beyond ASCII, and so is everything we have spoken about. ASCII is a standard that is essentially a typewriter keyboard, plus the extra keys (ex. Backspace, Enter, Ctrl-F, etc.) that do special things on a computer. So what is beyond ASCII? Hebrew characters, Chinese/Japanese glyphs (串), Cyrillic (Я), Polish slashed-L (Ł), or Dingbats (❦ – Floral Heart). You can now enter any of these beyond-ASCII characters (UNICODE) in any program with the above suggestions.

Programmer Jargon – others  proceed with caution …

The above are all UNICODE characters. UTF-8 can encode every UNICODE code point (about 1.1 million are possible) in nice and easy 8-bit bytes (called octets; this is why UTF-8 is not concerned with big/little endianness). In fact, UTF-8’s first 128 characters are an exact 1:1 mapping of ASCII, making any ASCII text valid UTF-8. In fact, more than half of all web pages out on the WWW (‘Net) are encoded with UTF-8. Makes sense that our gedcom files are too! In fact, UTF-8 can have that byte-order-mark (BOM) at the front of our gedcom or not and it is still UTF-8. In fact, the UNICODE standard neither requires nor recommends a byte order mark [see Chapter 2 of UNICODE] at the beginning of a UTF-8 file. So please, FamilySearch, remove the BOM from the GEDCOM standard.
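For the programmers still reading, here is a small Python sketch of those UTF-8 claims. It uses only the standard library; the sample strings are mine, chosen for illustration.

```python
import codecs

ascii_text = "0 HEAD"
name = "\u017babiec"  # Żabiec, written with an escape so the code point is unambiguous

# ASCII text produces byte-for-byte identical output when encoded as UTF-8.
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

# Beyond ASCII, a character needs more than one octet: 6 characters, 7 bytes.
utf8_bytes = name.encode("utf-8")

# The UTF-8 BOM is the three octets EF BB BF. Python's "utf-8-sig" codec
# strips it when present and decodes normally when it is absent.
with_bom = codecs.BOM_UTF8 + ascii_text.encode("utf-8")
decoded_with_bom = with_bom.decode("utf-8-sig")
decoded_without_bom = ascii_text.encode("utf-8").decode("utf-8-sig")

print(same_bytes, len(name), len(utf8_bytes))
print(decoded_with_bom == decoded_without_bom == ascii_text)
```

So a reader that tolerates an optional BOM costs one codec name, which is part of why insisting on the BOM buys so little.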

If FamilySearch properly defines the newline character in the gedcom grammar [see Chapter 5, specifically 5.8 of UNICODE] then there is nothing in the HEAD tag that would be unreadable to a program written in, say, Java (whose UTF-16 strings can represent any UNICODE character) unless there is an invalid character, which then makes the gedcom invalid. Every character in the HEAD tag is actually defined within 7-bit ASCII, which is a subset of UTF-8, so you could use any computer language that is at least UTF-8 compliant to read/parse the HEAD tag (which has the CHAR tag and its value that defines the character set). Everything in the HEAD tag, with the exception of the BOM, is within the ASCII character set. Using UTF-8 as a default encoding to read the HEAD will work even if there is a UTF-8 BOM.
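A minimal sketch of that idea in Python: decode the bytes as UTF-8 (tolerating an optional UTF-8 BOM), walk the HEAD record, and return whatever the CHAR line declares. The function name `declared_charset` and the sample bytes are made up for illustration, and the sketch assumes the file really is ASCII-compatible; a file saved as UTF-16 would need its own handling.

```python
def declared_charset(gedcom_bytes):
    """Return the value of the HEAD record's CHAR line, or None if absent."""
    # "utf-8-sig" drops a leading UTF-8 BOM if there is one.
    text = gedcom_bytes.decode("utf-8-sig", errors="replace")
    in_head = False
    for line in text.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) < 2:
            continue
        level, tag = parts[0], parts[1]
        if level == "0":
            if in_head:
                break  # reached the next record without seeing CHAR
            in_head = (tag == "HEAD")
        elif in_head and level == "1" and tag == "CHAR":
            return parts[2] if len(parts) > 2 else None
    return None


# Hypothetical file contents, with a BOM in front of the HEAD tag.
sample = b"\xef\xbb\xbf0 HEAD\n1 SOUR Stanczyk_Software\n1 CHAR UTF-8\n0 TRLR\n"
print(declared_charset(sample))
```

Once the CHAR value is known, the rest of the file can be re-read with the declared encoding.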

February 27, 2012

#PA #Genealogy – Access To Vital Statistics — Public Access, Privacy Law, PA Act 110, SB 361

by C. Michael Eliasz-Solomon

PA Act 110 – Public Records (formerly known as Senate Bill 361)

This bill amends the Act of June 29, 1953 (P.L. 304, No. 66), known as the Vital Statistics Law of 1953, to provide for public access to certain birth and death certificates after a fixed amount of time has passed. This legislation provides that such documents become public records 105 years after the date of birth or 50 years after the date of death.

This is a mixed bag, but at least it’s consistent. I wish it were 72 years (like the census) instead of 105. Also, the 50 years after death is way too long. Dead is dead. Maybe you could make a case for 5-10 years. By going beyond 30-35 years you are forcing genealogy research to skip generations, since the current generation would die before gaining access. Genealogists will have to will their research plans to their children in PA.

The indexes (I hate the word indices) are here: Birth Index (1906 — so far that’s it) | Death Index (1906-1961).   By the way, you will need the American Soundex of the last name as this is how the records are sorted:  American Soundex of Surname, followed by alphabetical on FirstName. Use Steve Morse’s Soundex One-Step page.
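If you would rather compute the code yourself, here is a sketch of the commonly described American Soundex rules in Python. I have not checked it against the PA indexes or against Steve Morse's page, so treat it as an approximation; it also keeps only the letters A-Z, so fold any diacritics first (Leszczynski, not Leszczyński).

```python
# Letter groups and their Soundex digits; vowels, H, W and Y get no digit.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(surname):
    """American Soundex: first letter plus three digits, zero padded."""
    letters = [ch for ch in surname.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":
            # A vowel separates repeated codes; H and W do not.
            prev = digit
    return (code + "000")[:4]


for name in ("Eliasz", "Leszczynski", "Robert", "Ashcraft"):
    print(name, soundex(name))
```

The index is then ordered by this code first and by first name within each code.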

February 19, 2012

Meme: #RootsTech — #Genealogy, #Technology

by C. Michael Eliasz-Solomon

A while ago, Stanczyk bemoaned iOS5. Therefore, I owe it an update …

  • Portable Genealogy is sound – Ancestry App better than ever
  • The Camera App in iOS5 does have a zoom. In fact if you use the familiar “pinch-gesture” you can zoom in/out and the old zoom slider appears too. Also you can use the Volume Up button (on the side of the phone) to take a picture, which is helpful when the camera is rotated.
  • Just having the iPhone was very useful during the #RootsTech conference as my note taking device. Until iPad2(3) arrived(s) and it has both WiFi/G3 (LTE) I would have been without blogging capabilities in the Salt Palace convention center when its WiFi would go down. I utilized the #RootsTech App (for iPhone & there was one for Android too).
  • In the library it was my digital  camera.
  • In fact the ImageToText App came in handy to OCR an image of text for me
  • I used the Ancestry App to enter the transcribed text from the microfilm images right into the evidence (note area) of an individual in the app and attached the iPhone picture too.
  • In one case, I was able to get an immediate shaky leaf as a result of my data entry — much to my disbelief (and it was correct). So I could do an immediate on-site analysis and do further microfilm searching as a result.
  • I used the Bump App to swap contact info with one genealogist. I cannot wait until all genealogists become mobile-enabled and I can lose my business cards altogether. Hint to RootsTech Vendors: you should use Bump too to collect user info. Why do I have to drop a business card into a fishbowl??? Do a BUMP, get a tchotchke (swag). Leave the fishbowl for the Luddites.
  • Are you a Slavic (Czech, Pole, Russian, etc.) genealogist? Then you must be dying for diacriticals. You could add an international keyboard. But why? In iOS5, just press and hold down the ‘ l ‘ key and up will come a list including the slashed-l. Just slide your finger over onto the slashed-l to enter that. Likewise, for entering ‘S, E, A, Z, C, N, etc.’ too; it works for upper/lower case. Of course if you have German ancestors, you can get your umlauts too in the same fashion. That trick is a Latin Alphabet data entry trick (sorry Cyrillic or Hebrew readers; try the International Keyboard trick).

February 16, 2012

GEDCOM “RailRoad Tracks” (aka Graphic Syntax Diagram) – #Genealogy, #Technology

by C. Michael Eliasz-Solomon

The above diagram is what Stanczyk had been jabbering about since the #RootsTech conference. Isn’t that much easier on the eyes and the grey matter than a complex UML diagram? Who even knows what a UML diagram is or if it is correct or not?

What does it say is in a GEDCOM file (ex.  Eliasz.ged)?

A HEAD tag  optionally followed by a SUBmissioN Record followed by 1 or more GEDCOM lines followed by a TRLR tag.

ex. gedcom lines  that can be “traced” along the railroad tracks at the top.

 1 SOUR Stanczyk_Software
 1 SUBM @1@
 2 VERS   5.5.1
 0 @1@ SUBM

OK, Stanczyk_Software does not exist, but was made up as a fictitious valid SOURce System Identifier name. The GEDCOM file (*.ged) is a text file and you can view/edit the file with any text editor (vi | NotePad | WordPad | etc.). I do not recommend editing your gedcom outside of your family tree software, but there is certainly nothing stopping you from doing that (DO NOT TRY THIS AT HOME). If you knew gedcom, you could correct those erroneous/buggy gedcom statements that are generated by so many programs and that cause poor Dallan Quass to achieve ONLY 94% compatibility with his GEDCOM parser.

Have you ever downloaded your gedcom from ANCESTRY and then uploaded it to RootsWeb? Then you might see all those crazy _APID  tags.   It is a custom tag (since it begins with an underscore  — GEDCOM rules dear boy/girl).   It really messed up my RootsWeb pages with gobbledygook. I finally decided to edit one gedcom and remove all of the _APID tags before I uploaded the file to RootsWeb. Aaah that is SO much better on the eyes. Oh I probably do not want to re-upload the edited gedcom into ANCESTRY, but at least my RootsWeb pages are so much better!   The _APID is just a custom tag for ANCESTRY (who knows what they do with it) so to appeal to my sense of aesthetics, I just removed them — no impact on the RootsWeb pages, other than improved readability. [If you try this, make a backup copy of the gedcom and edit the backup copy!]
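For anyone who would rather script that clean-up than do it by hand in a text editor, here is a rough Python sketch. The function name `strip_custom_tag` and the sample lines (including the _APID value) are made up for illustration; the sketch also drops any lines nested beneath a removed tag, which is my assumption about what a tidy removal should do.

```python
def strip_custom_tag(lines, tag="_APID"):
    """Drop every GEDCOM line carrying the tag, plus any lines nested under it."""
    kept = []
    skip_below = None  # level of the line currently being removed
    for line in lines:
        parts = line.split()
        if not parts or not parts[0].isdigit():
            # Not a normal "level tag value" line; keep it unless inside a removed block.
            if skip_below is None:
                kept.append(line)
            continue
        level = int(parts[0])
        if skip_below is not None and level > skip_below:
            continue  # still underneath the removed tag
        skip_below = None
        if len(parts) > 1 and parts[1] == tag:
            skip_below = level
            continue
        kept.append(line)
    return kept


sample = [
    "0 @I1@ INDI",
    "1 NAME Tomasz /Leszczynski/",
    "1 SOUR @S1@",
    "2 _APID 1,1234::5678",
    "2 PAGE Some page",
    "0 TRLR",
]
cleaned = strip_custom_tag(sample)
print("\n".join(cleaned))
```

To use it on a real file, read the lines from a backup copy of the gedcom, run them through the function, and write the result to a new file, leaving the original untouched.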

Now obviously the above graphic syntax diagram is not complete. It needs to be resolved to a very low level of detail such that all valid GEDCOM lines can be traced. It also requires me/you to add in some definitional things (like exactly what is a level# — you know those numbers at the beginning of each line).
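One of those definitional pieces, the level number at the start of each line, is easy to show in code. The Python sketch below splits a line into level, optional cross-reference id (like @1@), tag, and optional value. The regular expression is my own simplification for illustration and is not the grammar from the GEDCOM standard.

```python
import re

# level, optional @xref@, tag, optional value
LINE = re.compile(r"^\s*(\d+)\s+(?:(@[^@]+@)\s+)?(\w+)(?:\s+(.*))?$")


def parse_line(line):
    """Split one GEDCOM line into (level, xref, tag, value)."""
    match = LINE.match(line.rstrip("\r\n"))
    if match is None:
        raise ValueError("not a GEDCOM line: %r" % line)
    level, xref, tag, value = match.groups()
    return int(level), xref, tag, value


for text in (" 1 SOUR Stanczyk_Software", " 1 SUBM @1@", " 0 @1@ SUBM", "0 TRLR"):
    print(parse_line(text))
```

The level tells you how deep in the record you are: a 0 starts a new record, and each higher number belongs to the nearest preceding line with a lower number.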

I have a somewhat mid-level graphic syntax diagram that I generated using an Open Source (i.e. free) graphic syntax diagrammer. As I said in one of my comments, I will send it to whoever asks (already sent it to Ryan Heaton & Tamura Jones). You can get a copy of Ryan Heaton’s presentation from RootsTech 2012 and compare it to his UML diagram (an object model). I think you will quickly realize that you cannot see how GEDCOM relates to the UML diagram, and therefore it is difficult to ask questions or make suggestions. A skilled data architect/data modeler or a high-level object-oriented programmer could make the comparison and intuit what FamilySearch is proposing, but a genealogist without those technical skills could NOT.

I am truly asking the question, “Can a genealogist without a computer science degree or job read the above diagram?” and trace with his finger a valid path of correct GEDCOM syntax [assuming a whole set of diagrams were published]. The idea is to see how the GEDCOM LINES (in v5.5.1 parlance FAMILY_RECORD, INDIVIDUAL_RECORD, SOURCE_RECORD, etc.) are defined and whether or not what FamilySearch is proposing is something complete/usable that advances the capabilities of the current generation of software without causing incompatibilities (ruining poor Dallan Quass’s 94% achievement). Will it finally allow us to move the images/audio/video multimedia types along with the textual portion of our family trees and keep those digital objects connected to the correct people when moving between software programs?


GEDCOM files are like pictures of our beloved ancestors. They live on many years beyond those that created them. Let’s not lose any of them OK?

February 12, 2012

GEDCOM Standards – Where Genealogy Meets Technology — #Genealogy, #Technology, #Standards

by C. Michael Eliasz-Solomon

Stanczyk has been churning since about November of last year (2011). I have a number of ideas rummaging around my brain for genealogy apps. For over a quarter century, I have been a computer professional and used and/or developed a lot of programs using a myriad of technologies. At my core, I am a data expert: design it, store it, query it, manage it, analyze it and protect it. It being the data.

Before going to #RootsTech 2012, I knew GEDCOM was the core of our hobby/business/research. GEDCOM is our de facto standard. It is how data is exchanged between us and our various programs. I say de facto because as a standard goes it is not a very open standard (one organization “owns” it, and the rest of us go along with it). It also has not changed in about a decade and a half, so Ryan Heaton was correct in calling it “stale”. It does still work ... mostly. Although if a standard does not progress then you get a lot of proprietary “enhancements” that prevent the interchange of data completely, since one vendor does not know how to deal with another vendor’s file in totality.

At present, GEDCOM maxes out at version 5.5, although there are various other variations you might  see. But 5.5 was the last standard version. I counted 128 total tags and a provision for creating non-standard tags (they start with an underscore).

[Mike thanks to Tamura Jones! Even though GEDCOM v5.5.1 was never finalized, it IS the defacto max version of GEDCOM. GEDCOM v5.5.1 added 9 tags, removed the BLOB tag, so we now have a total of 136 tags.   -- I will need to update even my high level graphic syntax diagram]

Tags are like:

INDI,   FAMC,   FAMS,   SOUR,   REPO,   HEAD,   TRLR    etc.   -or-      ALIA,   ANCE

The first bunch is familiar and is probably in your family tree (if you ever exported the GEDCOM file). The ALIA tag is one that Dallan Quass said was universally used wrong by all programs. After seeing its definition, I can see how it is confusing. As for the ANCE tag, I do not recall seeing any program letting me do any functionality that might utilize this tag. This tag is probably one of those tags that Dallan said is not used at all.

I looked at the “MULTIMEDIA” section of the standard. It looks like it is woefully out of date and probably not used at all (at least not in any standard way), which is probably why our pics, audio, and video (or any other media file like PDF, MS Word) do not move with the GEDCOM. Has any program ever used the ENCODING/DECODING of a multimedia file? The standard seems to imply a buffer of only 32K (for a line) and even if you used a large number of  CONC tags strung one after another you need 100 lines to store a 3.2MB file in-line in the GEDCOM. I do not think I have seen that in a GEDCOM. They probably stored these binary large objects (BLOBs) outside the gedcom and refer to their path on the computer/network.  I did some noodling. I have 890 MB (or approximately  890,000 KB) in pictures and scanned source documents for about 1,000 people in my family tree. So I use nearly a gigabyte (1GB) for my family tree and all other multimedia — and I do not have any audio or video!  So I use almost 1MB/person.

If we did have this magical new GEDCOM standard that could carry all of our multimedia from one GEDCOM program to another GEDCOM program, the copying would take a long time. If I uploaded/download it to/from the Internet, I might incur an overage on my ISP’s usage charges, if this were technically feasible!   Imagine if I did this multiple times a month (as I got updates). I am beginning to understand why no vendor has tackled the problem. I would also like to store PDFs and other documents besides GIF/JPG/PNG which can be displayed on the Internet web pages natively in a browser. Those are not a part of the existing GEDCOM standard. Let me sling some jargon — I’d want to store any file type that there is a MIME type definition for,  that I can currently embed in emails,  or utilize in Java programs or that the HTML5 standard will allow for multimedia.

GEDCOM 5.5 was in its infancy in dealing with character sets. It was predominantly ASCII with some funky ANSEL coding of characters to handle Latin alphabet diacriticals, although it is not clear how I would do the data entry for those and it looks incomplete. It did mention UNICODE, but only cursorily and just to remind us that the lengths in the GEDCOM standard were in ‘characters’, not bytes, which was correct. Although those multibyte characters (say in Hebrew, Russian or Japanese or Chinese) would quickly use up the 32K byte line buffer limit, which at up to 4 bytes per character would effectively become about 8K characters per line. In fact, GEDCOM 5.5 says it will only deal with LATIN alphabets and leave Cyrillic, Hebrew and Kanji for some far-flung future. Stanczyk is Slavic; I need UNICODE to represent my ancestors’ names and places. Fortunately, I do not feel the need for Cyrillic (Russian, Ukrainian, Belorussian, Macedonian, etc.) or I’d be out of luck. I’ll just use the Polish version of those names in their ‘Latinized’ forms.
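The characters-versus-bytes point is easy to check. The Python sketch below assumes UTF-8 as the encoding and takes the 32K line buffer from my reading of the standard above; both helper names are made up for illustration.

```python
def utf8_len(text):
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def chars_per_buffer(bytes_per_char, buffer_bytes=32 * 1024):
    """How many characters of a given byte width fit in a fixed byte buffer."""
    return buffer_bytes // bytes_per_char


samples = [("ASCII", "Eliasz"), ("Polish", "Leszczyński"),
           ("Cyrillic", "Я"), ("CJK", "串")]
for label, text in samples:
    print(label, len(text), "character(s),", utf8_len(text), "byte(s) in UTF-8")

for width in (1, 2, 3, 4):
    print(width, "byte(s) per character:", chars_per_buffer(width), "characters per line")
```

In UTF-8, Cyrillic and Hebrew letters take 2 bytes each and most Chinese/Japanese glyphs take 3, so the 8K figure is the worst case (4 bytes per character) rather than the typical one.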

Oh that is another area the standard needs to be enhanced. NAMES. Dallan mentioned that Personal Names do not get a thorough treatment in the standard (I am refusing to read the data model and I am a Data Architect). Location Names get almost no treatment — they do give you a place to store your locations  (PLAC tag). What language should I use, after all my ancestors are from POLAND for God’s sake. Besides the obvious Polish, I have German, Russian and Latin to deal with and being American I prefer English. Slavic names often do not translate well. For example Wladyslaw is Ladislaus in Latin, but in English there is no equivalent — maybe that is why my ancestors use ‘Walter’ instead. But the point is, how should I store the name? Can I store all of the equivalents and search on any of them? Nope.

Damn, Russian is Cyrillic. GEDCOM doesn’t deal with non-Latin alphabets, and even though I can read the Russian genealogy records, I’d rather not, nor would I want to try and do data entry that way either. Besides, the communists reformed the language in 1918 (making War & Peace considerably shorter in Russian); that reform eliminated several characters. Most modern software is not aware of the eliminated characters, much less able to generate them. This whole Language/Unicode/Name thing is complicated and I have not even mentioned the changing borders or the renaming of cities in different languages or over time or their changing jurisdictions. I cannot fault GEDCOM for all of these woes. I have them in my own research and I have not yet found any satisfying way to handle them. I find it helps to have a very good memory and keep these things in my head, but there is no backup for that.

How are we ever going to arrive at the vision Jay Verkler put forth at #RootsTech? GEDCOM needs to become an open standard. Once it is standardized again, it needs to become modern again and deal with current technology, so we can get around to the tough problems of conforming names, places, sources/repositories and calendars/dates, and of doing complex analyses like Social Network Analysis as a way to gather wayward ancestors into a family for which we lack the documentation to prove it (genealogically). I hope the future includes Beider-Morse phonetic matching and can deal with folding diacritical characters into a base character (e.g. changing ę into e) for searches.
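Folding diacritics into base characters for searching can be sketched with Python's standard `unicodedata` module. This is one possible approach and not part of any GEDCOM standard. One caveat matters for Polish: ł has no Unicode decomposition, so it needs an explicit mapping (the `EXTRA_FOLDS` table is my own addition, and it is not exhaustive).

```python
import unicodedata

# Letters with no Unicode decomposition must be mapped by hand.
# Assumption: only the Polish l-with-stroke is handled here.
EXTRA_FOLDS = str.maketrans({"ł": "l", "Ł": "L"})


def fold_diacritics(text: str) -> str:
    """Return text with accents, ogoneks, dots, etc. removed for search matching."""
    decomposed = unicodedata.normalize("NFD", text.translate(EXTRA_FOLDS))
    # After NFD, each diacritic is a separate combining character; drop those.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


for name in ("ę", "Pacanów", "Władysław", "Żabiec"):
    print(name, "->", fold_diacritics(name))
```

A search tool would fold both the stored name and the query, so that typing "Pacanow" still finds "Pacanów".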

FamilySearch, if you are going to register GEDCOM tools, then please do a few more things for the NEW standard. First, make each vendor add to an APPENDIX the name and complete definition of their NON-STANDARD tags, in case anyone else wishes to implement or deal with them. Put a section in the header (HEAD tag) that lists all NON-STANDARD tags (just once each) along with their vendor, so that someone else can go look at the standard and see what these tags mean and possibly implement the good ones. Forget that two-byte thing before the HEAD tag. Just make the HEAD tag’s CHAR sub-tag indicate the character set (ANSI | ANSEL | UNICODE). Please use a #RootsTech keynote to vote on annual changes to the GEDCOM standard. Provide a GEDCOM validator and also a GEDCOM converter webpage to allow users/vendors to validate/convert their GEDCOM file(s).
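Two of these wishes can be sketched in a few lines of Python: reading the character set from the HEAD record's CHAR sub-tag, and listing each non-standard tag once. The sketch assumes the usual "level tag value" line layout and the common convention that vendor-defined tags start with an underscore; the function names, the sample lines and the `_NICK` tag are made up for illustration.

```python
def _tokens(raw_line: str) -> list[str]:
    """Split one GEDCOM line into tokens, dropping a leading byte-order mark."""
    return raw_line.lstrip("\ufeff").split()


def declared_charset(lines):
    """Return the value of the CHAR sub-tag inside the HEAD record, or None."""
    in_head = False
    for raw in lines:
        parts = _tokens(raw)
        if len(parts) < 2:
            continue
        level, tag = parts[0], parts[1]
        if level == "0":
            if in_head:
                break  # the HEAD record has ended
            in_head = tag == "HEAD"
        elif in_head and level == "1" and tag == "CHAR":
            return parts[2] if len(parts) > 2 else None
    return None


def custom_tags(lines):
    """Return each underscore-prefixed (vendor-defined) tag once, sorted."""
    found = set()
    for raw in lines:
        parts = _tokens(raw)
        if len(parts) < 2:
            continue
        # Skip an optional cross-reference id such as @I1@ before the tag.
        tag = parts[2] if parts[1].startswith("@") and len(parts) > 2 else parts[1]
        if tag.startswith("_"):
            found.add(tag)
    return sorted(found)


sample = [
    "\ufeff0 HEAD",
    "1 SOUR DemoApp",
    "1 CHAR UTF-8",
    "0 @I1@ INDI",
    "1 NAME Jan /Eliasz/",
    "1 _NICK Janek",
    "0 TRLR",
]
print(declared_charset(sample), custom_tags(sample))
```

A validator page of the kind requested could start from exactly these two checks before attempting anything deeper.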

Make multimedia be meta-data and allow users to define “LOCATIONS” where multimedia files can be found using either a PATH or a URL (or a relative path / URL). Make it a part of the standard that the meta-data must move, but the multimedia files can optionally stay put. Multimedia should be able to be placed on a LOCAL/NETWORK drive, on the INTERNET, or on removable multimedia volume(s) [thumb drives, CDs, DVDs, etc.]. Make the multimedia “LOCATIONS” editable so a user can switch between LOCAL/NETWORK, INTERNET, or REMOVABLE, including using some of each type of LOCATION. Allow these files to exist or not (show “UNAVAILABLE” or some equivalent visual cue if they are accessed and do not exist). The mapping between an Individual (INDI) or a family (FAM) or some other future GROUP and its multimedia file(s) must move as a part of the meta-data (even if the multimedia file(s) do not). That way the end-user need only edit his LOCATIONS meta-data (and ensure the files are in that/those location(s)) when he runs the software.
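The LOCATIONS idea can be sketched as a small lookup table that the user edits, with the media references pointing at a location name instead of a hard-coded path. Everything below (the `MediaRef` class, the `resolve` function, the location names and the example URL) is hypothetical and only meant to show the shape of the proposal.

```python
from dataclasses import dataclass
from pathlib import Path
from urllib.parse import urlparse


@dataclass
class MediaRef:
    """Meta-data that travels with the tree: who the file belongs to and where to look."""
    owner: str      # e.g. an INDI or FAM identifier
    location: str   # name of a user-editable location
    filename: str


def resolve(ref: MediaRef, locations: dict[str, str]) -> str:
    """Return a usable path or URL for the media file, or "UNAVAILABLE"."""
    base = locations.get(ref.location)
    if base is None:
        return "UNAVAILABLE"
    if urlparse(base).scheme in ("http", "https"):
        # Internet locations are returned as-is; this sketch does not fetch them.
        return base.rstrip("/") + "/" + ref.filename
    path = Path(base) / ref.filename
    return str(path) if path.exists() else "UNAVAILABLE"


# The user edits only this table when files move between disk, network and web.
locations = {
    "web": "https://example.org/media/",
    "usb": "/no/such/volume",
}
photo = MediaRef("@I1@", "web", "wedding-1913.jpg")
print(resolve(photo, locations))
print(resolve(MediaRef("@I1@", "usb", "wedding-1913.jpg"), locations))
```

Because the mapping from person to file lives in `MediaRef`, moving the files means changing one entry in `locations` and nothing else.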

Define an API for GEDCOM plug-ins so that new software can access the GEDCOM without parsing the gedcom file. The API should give the external plug-in a wrapped interface to the underlying data model without having to know the data model, just the individual, family, or location, or a name list of individuals, families, or locations. This will allow new software to provide additional functionality to a family tree or to provide inter-operability between trees/websites. Obviously security/privacy rules would limit this kind of  plug-in access.
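Such a plug-in interface might be sketched as follows in Python. No GEDCOM plug-in API of this kind exists as far as this post is concerned; `TreeAPI`, `InMemoryTree` and `count_surname` are invented names, and the privacy rule shown (hiding individuals flagged as private) is only one simple possibility.

```python
from typing import Protocol


class TreeAPI(Protocol):
    """What a plug-in is allowed to see: names, not the underlying data model."""

    def individual_names(self) -> list[str]: ...

    def find_individuals(self, surname: str) -> list[str]: ...


class InMemoryTree:
    """Toy host implementation; a real host would wrap its own data model."""

    def __init__(self, individuals, private_ids=frozenset()):
        self._individuals = dict(individuals)   # id -> "Given /Surname/"
        self._private = set(private_ids)        # ids hidden from plug-ins

    def individual_names(self):
        return [n for i, n in self._individuals.items() if i not in self._private]

    def find_individuals(self, surname):
        needle = f"/{surname.casefold()}/"
        return [n for n in self.individual_names() if needle in n.casefold()]


def count_surname(tree: TreeAPI, surname: str) -> int:
    """Example plug-in: it only talks to the API and never parses a GEDCOM file."""
    return len(tree.find_individuals(surname))


tree = InMemoryTree(
    {"@I1@": "Jozef /Eliasz/", "@I2@": "Agnes /Eliasz/", "@I3@": "Stanislas /Hajek/"},
    private_ids={"@I2@"},
)
print(count_surname(tree, "Eliasz"))
```

The plug-in depends only on the two methods in `TreeAPI`, so the host remains free to change its storage and to enforce whatever privacy rules it likes.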

That’s Stanczyk’s vision of the GEDCOM future!

January 24, 2012

Genealogy 2012 – State of the Union

by C. Michael Eliasz-Solomon

If you follow Stanczyk‘s posts, then you know the first 2012 Genealogical Website Ratings were published yesterday. I wanted to follow up on that article’s meme with yet a further muse.

The ratings show that there was quite a bit of shuffling around. Overall though, genealogy websites are nascent. That is my meme for today: The State of Genealogy is Very Good and Is Improving. In a little over a week, the RootsTech 2012 conference will happen. The convention shows many of the top web sites are attending, including LegacyFamilyTree, MyHeritage, RootsMagic and AgesOnline. In the middle of this conference, the show “Who Do You Think You Are?” will debut (3-Feb-2012). Late March brings us PBS’s “FINDING YOUR ROOTS…” So the first quarter looks promising. Do you doubt this jester?

Perhaps the Barron’s Online article, “ ‘Tis the Season For” will convince you. Bob O’Brien (the author) analyzes the stock performance of Ancestry in light of this convergence. He does not reference RootsTech nor PBS, but this jester does. Also adding to the synergy for 2012 Genealogy is the release of the 1940 US Census on April 2. So 2012 has all the makings for genealogy’s best year ever. Barron’s does mention the 1940 Census too.

Now a successful business climate for genealogy (software, hardware, and services) can only mean many good things will be coming for us genealogists. Let me urge you to greater heights by putting your efforts into your own research and also into collaborating on the Internet. We can all push our own research (and of course that of those distantly related to us) forward and ride the rising tides of the 2012 Genealogy Surge.

For good measure the biennial United Polish Genealogical Societies Conference in late April is also happening this year. So Polish Genealogy should be able to ride the tide of popularity too.

RootsTech looks like it will have its emphasis on the Internet with its evolving collaborative tools (social networks, HTML5, new databases, blogs, developer tools/frameworks/standards to enhance the collaborative/connection making nature of genealogy and provide richer search/match tools/techniques, etc.). Catch this break-out year!

That’s the Meme – The State of Genealogy in 2012 is very promising.

January 23, 2012

2012 1st Quarter – Genealogy Website Rankings — #Genealogy, #Rankings, #Website

by C. Michael Eliasz-Solomon

Welcome to Stanczyk’s 2012 First Quarter Genealogy Website Rankings. I know I am a week early; c’est la vie! Since my last rankings an array of rank postings [uh, pun partly intended] have appeared. Stanczyk has also received exactly one request for inclusion in his rankings, from Tamura Jones about his website [#58 on the new Rankings]. He has a worthy Twitter page too. Keep sending in recommendations; I will keep thinking about them and including them if they are worthy. I liked Tamura’s stuff so MUCH that I added his genealogy page to my blogroll [Modern Software Experience at the right].

I really liked the survey from the Canadian website Genealogy In Time. I added their magazine/website (#13) to my rankings as well. I found them because they produced an excellent Genealogy Website Ranking (mid January 2012) that included a very thorough discussion of their methodology. They neglected a few Polish Websites that SHOULD have made their list. Also they list in all of its many global incarnations and this eats up an unnecessary number of the top 125 poll slots. But aside from those minor criticisms, their rankings are very GLOBAL and very good. Who knew there was a Chinese genealogy website (makes sense, considering their billion plus citizens and their excellent genealogical records) or a Finnish website in the top 125???

OK, Stanczyk will keep his Rankings list, because of the emphasis on Polish / Slavic genealogical websites. Stanczyk also has many in the range 100-125 that are very useful though not popular enough to be in the Genealogy in Time Rankings. However, the Genealogy-In-Time poll makes a very useful tool in another way. They have graciously included the website links (URLs) of each site, making it rather easy to build a genealogical Favorites/Bookmarks list that is broadly useful. Stanczyk admits to his list being somewhat selective in the lower 1/3 in order to be more valuable to Polish Researchers (in particular to English speaking, though not exclusively so). On a personal note, this blog you are reading is in the top 5.8 million (of all websites world-wide) and is #120 on my Website rankings. Come on readers, give me a boost, please!

Needless to say, all website rankings I read, agree on the top 20-40 websites (putting aside the multiple listing of

Here is a snippet of the Rankings and the rest are on the Rankings Page:

January 19, 2012

Genealogy and Its Popularity – #Genealogy, #Media

by C. Michael Eliasz-Solomon

Stanczyk has maintained for a few years that genealogy as a hobby ranks second only to gardening as a hobby engaged in by Americans. Perhaps this is the year we begin the assault on the number one spot.

If you love genealogy (and I assume you do because you read this blog) and/or history and biographies, then 2012 is your year! Of course we all look forward to Lisa Kudrow’s annual send-up, “Who Do You Think You Are?“.

Who Do You Think You Are, the American genealogy documentary series on NBC, returns on February 3, 2012. The third season will have shows on: Marisa Tomei, Rob Lowe, Paula Deen, Rashida Jones, Jerome Bettis, Reba McEntire, Helen Hunt, Edie Falco, Rita Wilson, Jason Sudeikis, Martin Sheen and Blair Underwood.


On PBS, if we can divert you from Downton Abbey, we have “FINDING YOUR ROOTS WITH HENRY LOUIS GATES, JR.”, which premieres Sunday, March 25th. This 10-part series will delve into the genealogy and genetics of famous Americans. Dr. Gates will cover the family trees of Kevin Bacon, Robert Downey, Jr., Branford Marsalis, John Legend, Martha Stewart, Barbara Walters and Rick Warren, among others. Make sure you watch Martha’s show: she’s a Kostyra [i.e. Polish]!


Do not forget that the BYU TV channel of course has an ongoing genealogy show (The Generations Project). This channel is not available everywhere (yet), so check it out.


So this jester maintains that genealogy is even more popular than ever!   What do you think?

January 7, 2012

OH – Cleveland/Cuyahoga County Eliasz/Elijasz #Polish, #Genealogy

by C. Michael Eliasz-Solomon

Yesterday in the blog, Stanczyk emailed in an Ancestry database of note. They had an index of Marriages from Cuyahoga County, OH (the Cleveland area) 1810-1973. Most of these are marriage returns from the officiant and list little more than the bride, groom and marriage date and the officiant. Some do in fact list ages of the bridal party or their residences and even two of mine had the parent names.

Now this plays into an earlier blog article of mine about the Cleveland Eliasz/Elijasz, asking for any ancestors to write this jester and discuss family trees. [None so far.]

I was hoping for and found the marriage record of Stanislas Hajek and Agnes Eliasz! Of all the Cleveland Eliasz/Elijasz, this marriage was most convincing to me that they are relatives, as both Stanislas and Agnes (Agnieszka) were from Pacanow, which is my grandfather’s birth village. From a Polish Genealogical Society website email I received from a Baran, whose grandmother was an Eliasz, and from Ship Manifests, I was able to place this Agnes Eliasz in my family tree as a daughter of Jozef Eliasz & Theresa Siwiec (whose direct line ancestor a while ago sent me my grandparents’ marriage records, civil and church).

Truly the Internet makes this world a smaller place. So today, I am transcribing the married couples from the Cuyahoga County, OH marriage returns of 1913 on the same page with Stanislas Hajek & Agnes Eliasz (from page 193):

Michael Blatnik & Mary Hocevar August 25th, 1913 [#21537]
John Spisak & Veronika Busoge August 25th, 1913 [#21538]
Joseph Wisniewski & Frances Kotecka August 25th, 1913 [#21539]
Stanislas Hajek & Agnes Eliasz August 25th, 1913 [#21540]
George Csepey & Helen Weiszer August 26th, 1913 [#21541]
Boleslas Zaremba & Alexandra Alicka August 26th, 1913 [#21542]
Louis Rutkowski & Anna Solecka August 26th, 1913 [#21543]
Aloys Salak & Anna Pisek August 26th, 1913 [#21544]

Almost all of them look Slavic and most of those names are Polish. Cleveland, a large Great Lakes city, was an American enclave of Polonia in the early 20th century.

Related Ancestry DBs:
US, Ohio, Cuyahoga County, Jewish Marriage Record Extracts, 1837-1934
Ohio Marriage Index, 1970, 1972-2007
Ohio Marriages, 1803-1900
Ohio Divorce Index, 1962-1963, 1967-1971, 1973-2007


January 1, 2012

Stanczyk 2011 In Review – #Genealogy, #Polish

by C. Michael Eliasz-Solomon

The stats helper monkeys prepared a 2011 annual report for this blog.

Here’s an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 8,300 times in 2011. If it were a concert at Sydney Opera House, it would take about 3 sold-out performances for that many people to see it.

Click here to see the complete report.

December 17, 2011

A Little Bit of Blog Bigos … #Genealogy, #Website #Rankings, #SSDI

by C. Michael Eliasz-Solomon

Stanczyk has a lot of catch-up to do. I blame it on the season and the Blood Red Lunar Eclipse; certainly that must be the cause of the madness this December.


So many blogs have written about the Social Security Death Master File and the many related issues. First millions of records were dropped by the SSA. Next the SSA, and this has probably been going on for months, started redacting the names of the parents on the SS5 Applications, thus eliminating the usefulness of that research tool. Now Congress has bullied the paid genealogy databases (and even Rootsweb) to drop the SS# from their databases on deaths in the last ten years. Rootsweb just dropped their Social Security Database altogether!

Now let me remind the lame (not lame duck) Congress that the Social Security Death Master File is used to inform banks/financials/loan companies/credit card companies etc. that these SS#’s are of the DECEASED and that they should not grant any NEW credit applications with the Social Security Numbers in the Social Security Death Master File! Ergo, having the SS# of a dead person should not avail any criminal and should in fact result in their arrest for fraud, as the aforementioned companies are supposed to check the Social Security Death Master File against credit apps. Therefore, there really is no need to eliminate the SS#’s from or any other database. By eliminating these numbers you cannot order the SS5 Applications, which is just as well since the SSA has made them much less useful. The result is: genealogists have less data available and the US Government has less MONEY($) available, since the genealogists now have two reasons not to order the SS5 Applications any longer. So the US Government will now lose another source of income??? Boy, is this CONGRESS the biggest bunch of idiots or what?

Eastmans / Website Rankings

Dick Eastman’s Online Newsletter recently wrote about new website rankings and gave the URL/Link to an Anglo/Celtic website. Needless to say this is the website that caused this jester to produce a BETTER set of website rankings (please see my page above or at Genealogy Website Rankings). I ask you to please utilize my Genealogy Rankings as they are based upon resources in more common use in the USA (and Canada), such as or or or any Polish-related website or blog. So I am compelled, not because I am as popular as (#12), vs Stanczyk (#120). But clearly leaving off the Steve Morse, or Ellis Island or the US NARA or Fold3 is not accurate in the USA and certainly NOT in the GLOBAL Genealogy market as a whole.

Now this is foremost a blog about Slavic Genealogy (Russian-Poland overtly emphasized) and so I have made an effort to seek out and reflect Polish Genealogy websites/blogs (when their popularity reflects the need). I have intentionally not included because its Global Ranking is too low. It is a very well known website to Polish Genealogists and I am sure in Poland itself it would be in the top 125 (just not Globally).

So while this blog has a certain voice, my website rankings deserve as much attention as those that Dick Eastman writes about. Perhaps one day will notice this blog and its Genealogy Website Rankings List; you, my faithful readers, can help me by emailing Dick Eastman and informing him about my set of Genealogy Website Rankings, which is very thorough and includes the Top 125 Genealogy Websites, including Polish & American & Jewish (re Non-Anglo-Celtic) websites too. EOGN should not be allowed to perpetuate its blind-spot to other genealogies. Now let me hasten to add that the other Rankings do in fact mostly agree with my own Rankings on the top 10 or 20 Genealogy Websites; his Rankings lack Polish/American/Jewish sites and my own Rankings miss a few Anglo websites and all of’s other country sites (UK, CA, DE, AU, etc.), which should probably be aggregated into but due to their many domains their totals are segregated by Alexa (ratings agency), and this jester chose not to include so many properties in the Rankings (which would exclude so many other worthy websites).

As before, let me remind new genealogists that this Genealogy Website Ranking could be utilized to create or augment your genealogy Bookmarks/Favorites. Obviously, they are valuable since a LOT of genealogists visit them.


I forgot to mention Mocavo (I put it into the newest Genealogy Website Rankings). I have briefly mentioned them before (when I found them in my blog analytics). They are a new search engine, akin to Google. However, they are a Genealogy Search Engine and as such are enhanced to understand GEDCOM, genealogy, dates, places, etc., and their search results are more accurate than, say, what you would get from Google. They also have the ability to search databases and include those in results, as well as GEDCOMs. You have the ability to submit your family tree (GEDCOM) to Mocavo and they can provide you with notices of potential new matches, much like does for their subscribers. So instead of Googling your Family Tree, try MOCAVOing your Family Tree.

December 2, 2011

Family Search Website – Free Central / Eastern European Records – #Genealogy, #Slavic

by C. Michael Eliasz-Solomon

Stanczyk was checking out the FamilySearch European Holdings for Slavic record counts / images to see what progress was made up through 2011.

It is good if your heritage includes the Germanic peoples or locales which were previously under their dominion. Do not get me wrong. I am thrilled that there are now over a million Polish records/images online or indexed at FamilySearch.

Country           Records                  Percentage
Austria           196,940                  0.37
Czech Republic    85,469                   0.16
Germany           50,998,675               96.98
Hungary           Browsable Images Only    0.00
Poland            1,002,155                1.91
Russia            303,146                  0.58
Slovakia          Browsable Images Only    0.00
Ukraine           14,143                   0.03

We have the ability to do better. Please consider volunteering as an indexer. You can start, stop and start your volunteering again at any time. Find out more at FamilySearch. Every little bit helps. Stanczyk managed to do over 150 records this year. Genealogy is collaborative. Helping each other, we also help ourselves. Please pitch in and make this part of your Random Act of Genealogical Kindness efforts.

