Posts tagged ‘bugs’

March 1, 2013

Thinking About @Ancestrydotcom ‘s GEDCOM — #Genealogy, #GEDCOM

by C. Michael Eliasz-Solomon

Ancestry.com (Twitter: @Ancestrydotcom) is the proverbial 800 lb (362.87 kg) gorilla in the genealogical archive. You cannot miss him, and mostly he’s lovable. So today, after you read this blog post, Stanczyk wants you to tweet at him (see the Twitter link above). I am hoping the big ape will make some improvements to his software. Hint, hint!

A couple of days ago (25-Feb-2013), I ran my Perl program against the GEDCOM file I exported from my family tree on Ancestry.com’s website. That tree, the RootsWeb tree, and this blog are Stanczyk’s main tools for collaboration with near and distant cousin-genealogists (2nd, 3rd, 4th, and 5th cousins; all are welcome).

Quick Facts —

  1. No invalid tags [good]
  2. Five custom tags [also good]
  3. CHAR tag misused: the value is ANSI [not good]
  4. My Ancestry Family Tree uses diacriticals (ą ć ę ł ń ó ś ź ż) in proper nouns, but the export strips them [not good]
  5. Phantom notes??? [really not good]

So, Mr. Ancestry (sir), can you please fix #3, #4, and #5?
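The Perl program itself is not reproduced in this post. As a rough illustration only, the first two Quick Facts checks (invalid tags and custom tags) could be sketched in Python as follows. The function name `scan_tags`, the sample lines, and the tiny `known` tag list are all hypothetical; a real check would need the full tag list from the GEDCOM 5.5.1 standard.

```python
import re
from collections import Counter

# A GEDCOM line is: level, optional @XREF@, tag, optional value.
GEDCOM_LINE = re.compile(r"^\s*(\d+)\s+(?:@[^@]+@\s+)?(\w+)")

def scan_tags(lines, known_tags):
    """Count every tag, then split the unfamiliar ones into custom vs. invalid."""
    counts = Counter()
    for line in lines:
        match = GEDCOM_LINE.match(line)
        if match:
            counts[match.group(2)] += 1
    # GEDCOM 5.5.1 requires user-defined tags to begin with an underscore.
    custom = sorted(t for t in counts if t.startswith("_"))
    invalid = sorted(
        t for t in counts if not t.startswith("_") and t not in known_tags
    )
    return counts, custom, invalid

# Hypothetical sample lines and a deliberately tiny known-tag list (assumptions).
sample = [
    "0 HEAD",
    "1 CHAR ANSI",
    "0 @I1@ INDI",
    "1 NAME Jan /Kowalski/",
    "1 _XTAG some value",
    "1 FOO bar",
    "0 TRLR",
]
known = {"HEAD", "CHAR", "INDI", "NAME", "TRLR"}

counts, custom, invalid = scan_tags(sample, known)
print("custom tags:", custom)
print("invalid tags:", invalid)
```

Passing the known-tag list in as a parameter keeps the sketch honest: it never pretends to know the whole standard, it only reports what falls outside the list it was given.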

CHAR -  I think Ancestry should use what is in the standards: ANSEL | UTF-8 | UNICODE | ASCII . I think this is easily do-able (even if all you do is just substitute ASCII).

This is not a picayune, nit-picky, persnickety, or snarky complaint. In fact, it leads right into the next problem (#4 above). Not only does Ancestry export the GEDCOM file as “ANSI”, it strips out my diacriticals too (as a result?). So now I have potentially lost valuable information from my research. For Slavic researchers, these diacriticals can be vital to finding an ancestor, as they guide how the original name was pronounced and how it might have been misspelled or mistranscribed in the many databases. Without the diacriticals that vital link is lost.

The last criticism is an insidious problem. Every time I exported the GEDCOM, I would get a note on one person in the tree. I would carefully craft the note on Ancestry, but what I received in the downloaded GEDCOM file would be different???

I reported the problem to no avail and no response. This is not very good for an 800 lb gorilla.

Digging Deeper

I have since gone on to do some experiments, and the results may astound you (or not). I copied the NOTE I was getting in my GEDCOM and saved it off to a text file, perplexed as to where it came from, since it was not the NOTE I was editing on Ancestry. Now I did something bold. I deleted the note from that person on Ancestry and then downloaded the GEDCOM file again. Do you know what I got? Wrong! I did not get my carefully crafted NOTE; I got yet another NOTE. I copied that note’s text and repeated my process of deleting the note and downloading the GEDCOM file a 3rd time. This time when I edited my GEDCOM file, I found MY note!!!

But where/how did the other two notes come about? Why were there three notes? Why could I see and edit the 3rd note, but only get the first note when I downloaded the GEDCOM file? How did notes 2 & 3 get there? Why did I not get all three notes when I downloaded the GEDCOM? All good questions that I have no answer to.

My suspicion is that Ancestry should not allow more than one EDITOR on a tree; other contributors should only be allowed to comment, or perhaps Ancestry could provide a way to leave sticky-notes on a person [that do not go into a GEDCOM file]. I do not think the notes were created by their mobile app, since I always saw my NOTE (and not the other two notes). I am chalking this up to an Ancestry.com bug and urging others who see strange things in their notes to take deliberate steps to unravel their notes. I hope Ancestry will fix this and let people know. I hope they fix all of items #3, #4, and #5.

So, my dear readers, I am asking you to tweet to Ancestry (as I will too) and ask them for bug fixes. Perhaps if enough people tweet at @Ancestrydotcom, they will respond and not give us the cold gorilla shoulder.

March 2, 2012

Diacritical Redux – Ancestry GEDCOM — #Genealogy, #Technology

by C. Michael Eliasz-Solomon

As Stanczyk has been writing about the GEDCOM standard since #RootsTech 2012, I began to pick apart my own GEDCOM file (*.ged). I did this as I was engaged with Tamura Jones (a favorite foil to debate Genealogy Technology with). During our tête-à-tête, I noticed that my GEDCOM lacked diacriticals???

What happened? At first I thought it was the software that Tamura had recommended I use (PAF), but that software was not the problem. So I looked at the gedcom file that I had imported, and the diacriticals were missing from there too, meaning my export software was the culprit.

I looked at the GEDCOM’s HEAD tag and its CHAR sub-tag, and the value was ANSI. That is not even a valid value! According to the GEDCOM 5.5.1 standard [on page 44 of the FamilySearch PDF document]:

CHARACTER_SET:= {Size=1:8}
[ ANSEL |UTF-8 | UNICODE | ASCII ]
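A minimal sketch of that check in Python, assuming the GEDCOM has already been read into a list of text lines. The four allowed values come straight from the standard quoted above; the function names and the sample header lines are hypothetical.

```python
# The four values the GEDCOM 5.5.1 standard allows for CHAR.
VALID_CHARSETS = {"ANSEL", "UTF-8", "UNICODE", "ASCII"}

def header_charset(lines):
    """Return the CHAR value from the HEAD record, or None if there is none."""
    in_head = False
    for line in lines:
        parts = line.strip().split(None, 2)
        if len(parts) < 2:
            continue
        level, tag = parts[0], parts[1]
        if level == "0":
            if in_head:
                break  # a new level-0 record means the header has ended
            in_head = (tag == "HEAD")
        elif in_head and level == "1" and tag == "CHAR":
            return parts[2].strip() if len(parts) > 2 else ""
    return None

def charset_is_valid(value):
    """True only for a value the standard lists."""
    return value in VALID_CHARSETS

# Hypothetical header lines (the SOUR value is made up for illustration).
sample_header = [
    "0 HEAD",
    "1 SOUR ExampleApp",
    "1 CHAR ANSI",
    "0 @I1@ INDI",
]

value = header_charset(sample_header)
print(value, "valid" if charset_is_valid(value) else "NOT valid")
```

The sketch only looks at level-1 CHAR lines inside the HEAD record, so a stray CHAR elsewhere in the file would not be mistaken for the header value.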

Who is this dastardly purveyor of substandard GEDCOM that strips out your diacriticals (that I assume you have been working so hard to add since my article on Tuesday, “Dying For Diacriticals”)? I’ll give you a HINT: it is the #1 Genealogy Website. Yes, it is ANCESTRY.COM!

Now what makes this error even more dastardly is that the website shows you the diacriticals in the User Interface (UI), but when you export/download, the diacriticals are not there in the gedcom. Unless you study things closely, you may be oblivious (as Stanczyk was for a long time) that these errors have crept into your research. I also found a spurious NOTE that I cannot find anywhere on anyone in my tree, which gets attributed to my home person (uh, me). This is very alarming to me too!!!

Tim Sullivan (CEO of Ancestry.com), I expected better of you and your website. I entrusted my family tree to you and that is what you did with my gedcom? Now I did some more investigating and I found that Ancestry does not strip ALL diacriticals. My gedcom had diacriticals in the PLAC tags and in NOTE tags. But NOT (I repeat NOT) in the NAME tags.
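One way to see this for yourself is to count, tag by tag, how many lines still carry a non-ASCII character. The Python sketch below is only an illustration: the function name and the sample lines are hypothetical, and it assumes you have already opened the downloaded file with whatever encoding it really uses (which the ANSI label does not tell you reliably).

```python
from collections import Counter

def non_ascii_by_tag(lines):
    """Count, per tag, the lines whose value has at least one non-ASCII character."""
    hits = Counter()
    for line in lines:
        parts = line.strip().split(None, 2)
        if len(parts) < 3:
            continue  # no value on this line
        tag, value = parts[1], parts[2]
        if tag.startswith("@"):
            continue  # record line such as "0 @I1@ INDI"; nothing to inspect
        if not value.isascii():
            hits[tag] += 1
    return hits

# Hypothetical export: the NAME has lost its diacriticals, PLAC and NOTE kept theirs.
sample_export = [
    "0 @I1@ INDI",
    "1 NAME Stanislaw /Kowalski/",
    "1 BIRT",
    "2 PLAC Łódź, Poland",
    "1 NOTE Urodzony w Łodzi",
]

hits = non_ascii_by_tag(sample_export)
for tag in ("NAME", "PLAC", "NOTE"):
    print(tag, hits[tag])
```

A zero for NAME alongside non-zero counts for PLAC and NOTE would match the pattern described above, provided your tree really does have diacriticals in its names.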

So Tim [pretend there is a shaky leaf here], if you or a reputation defender or some other minion skims the Internet (for your name), here is what I hope You/Ancestry.com will do:

  1. Do NOT strip diacriticals from the NAME tag !!!
  2. Fix the Export GEDCOM to create a gedcom file with diacriticals in NAME tags
  3. Fix the Export GEDCOM to create a valid CHAR tag value: UNICODE, UTF-8, ASCII, ANSEL. I put them in my prioritized/preferred order [from left-to-right]. I hope you will not use ASCII or ANSEL.
  4. Run a GEDCOM validator against the gedcom file your Export GEDCOM software creates, and fix the other “little things” too (Mystery NOTEs???).