
Gedcom (STLM)

The word Gedcom designates a genealogical data exchange format.

It was originally developed by the Mormon Church for religious reasons, then adopted by genealogists to exchange genealogical data between people running their genealogy applications on incompatible computer systems.

So Gedcom is a kind of genealogical language.

The word Gedcom is an acronym for GEnealogical Data COMmunication and is therefore often written in capitals: GEDCOM. By metonymy, the word also designates a genealogy file in Gedcom format. The xxxxx.ged file you are working on within Ancestris is a Gedcom.

Since the mid-1990s, with the advent of the Internet and the proliferation of digital exchanges, the Gedcom specification has gradually become an essential standard for most genealogy software and sites.

However, while most of them can export in Gedcom format, some do not strictly respect it: they add proprietary structures or use existing ones with different meanings. In some cases, proprietary data structures cannot be properly converted to the Gedcom format and some data might simply not be exported.

Ancestris is fully Gedcom compatible (versions 5.5 and 5.5.1). As a user, you can keep reliable and complete genealogy files, without risk of data loss, and share them with anyone.

Characteristics of a Gedcom file

A Gedcom file is a text file, i.e. a file containing human-readable lines of text, that can be opened and edited using any text editor, such as Notepad, Kate, Kwrite, Gedit, etc. Its file name extension is ".ged".

As a result, such a file can be used *as is* by any genealogy software, installed under any operating system, without needing any conversion.

Each line of text starts with a number and a label. The label is called a "tag". A tag is typically made up of three or four capital letters. It defines the type of information that follows on the line.

  • For example, the tag PLAC (= place) always indicates that the text that follows this tag is a place (such as place of birth, place of death, place of a ceremony, etc.)

Records of a Gedcom file

A Gedcom file contains a set of records. A record is a group of text lines, the first one of which starts with a zero "0". A record defines something in particular, which depends on the type of record.

The first and the last record of a Gedcom file are of a particular type:

  • The first record is called the header (HEAD tag) and defines some general information about the file.
  • The last record is called the trailer (TRLR tag). It marks the end of the file.

Each of the other records defines a genealogical entity, with its own set of tags.
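
As an illustration, the record structure described above can be sketched in a few lines of Python. This is only a minimal sketch, not how Ancestris reads files: the function name `split_records` and the miniature sample file are assumptions made for this example.

```python
def split_records(text):
    """Group the lines of a Gedcom text into records.

    A new record starts at every line whose level number is 0.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.split(" ", 1)[0] == "0":
            records.append([line])    # level 0 opens a new record
        elif records:
            records[-1].append(line)  # deeper levels belong to the current record
    return records


# Hypothetical miniature file, for illustration only.
SAMPLE = """\
0 HEAD
1 GEDC
2 VERS 5.5.1
0 @I5@ INDI
1 NAME John /Doe/
1 SEX M
0 TRLR
"""

records = split_records(SAMPLE)
first_tag = records[0][0].split()[-1]   # tag of the first record
last_tag = records[-1][0].split()[-1]   # tag of the last record
print(len(records), first_tag, last_tag)  # 3 HEAD TRLR
```

Checking that the first tag is HEAD and the last one is TRLR is a quick sanity test that a file has not been truncated.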

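A Gedcom file uses 7 entity categories. The records that can be found in a Gedcom file are therefore as follows:

  • INDI: individuals
  • FAM: families
  • NOTE: notes
  • SOUR: sources
  • REPO: repositories
  • OBJE: multimedia objects
  • SUBM: submitters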

The choice of these 7 categories of data as records is of course somewhat arbitrary, as is always the case when a standard is created.

One could easily imagine other types of records, such as places. The fact that a place is not a separate entity does not prevent Ancestris from managing places while still respecting the Gedcom format.

Tree structure of a record

Each record is presented in a tree structure: each tag can include any number of sub-tags.

Sub-tags depend hierarchically on the tag at the next higher level, and may in turn include one or more sub-tags, and so on.

Hierarchical levels

Hierarchical levels are numbered.

Because each line must keep its place in the hierarchy, it is assigned a number corresponding to the level it occupies in the tree structure of the record.

This is how the main level line of each record (that is, level zero) is numbered 0. A line located at the level immediately below bears the number 1. A line located at the level immediately below the previous level bears the number 2. And so on.
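
To make the numbering concrete, here is a minimal Python sketch that rebuilds the tree of a record from its level numbers alone. The function name `build_tree` and the dictionary layout are assumptions chosen for this example, and the sample lines are modelled on a simple individual record.

```python
def build_tree(lines):
    """Turn numbered lines into a nested tree using their level numbers."""
    root = {"text": None, "children": []}
    stack = [root]  # stack[n] is the parent of a line of level n
    for line in lines:
        level_str, _, rest = line.strip().partition(" ")
        level = int(level_str)
        if level > len(stack) - 1:
            raise ValueError(f"level jumps too far on line: {line!r}")
        node = {"text": rest, "children": []}
        del stack[level + 1:]                  # close deeper levels left open
        stack[level]["children"].append(node)  # attach to the parent one level up
        stack.append(node)                     # this node may receive sub-tags
    return root["children"]


# Sample lines, for illustration only.
lines = [
    "0 @I5@ INDI",
    "1 NAME John /Doe/",
    "1 SEX M",
    "1 BIRT",
    "2 DATE 16 APR 1951",
    "1 FAMC @F1328@",
]
tree = build_tree(lines)
indi = tree[0]
birt = indi["children"][2]
print(birt["text"], "->", birt["children"][0]["text"])  # BIRT -> DATE 16 APR 1951
```

A line may only go one level deeper than the line before it, which is why the sketch rejects a jump from level 1 straight to level 3.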

Identifier and entity records

As mentioned above, apart from the HEAD and TRLR records, all the other records are entity records.

Each entity record starts with a level 0 line, which contains:

  • The ID number of the entity surrounded by two at-signs (@),
  • The tag associated with the category to which the entity belongs.
    • For example, the line "0 @I5@ INDI" is the first line of the record of an INDIvidual entity whose ID is 'I5'.
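
The first line of an entity record can be recognised with a regular expression. The following Python sketch is only an illustration: the pattern and the function name `parse_entity_header` are assumptions made for this example, not part of Ancestris.

```python
import re

# Level 0, an identifier between two at-signs, then the category tag.
HEADER = re.compile(r"^0 @([^@]+)@ (\w+)(?: |$)")


def parse_entity_header(line):
    """Return (identifier, category tag) for the first line of an entity record.

    Returns None for any other line, including HEAD and TRLR.
    """
    match = HEADER.match(line.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_entity_header("0 @I5@ INDI"))     # ('I5', 'INDI')
print(parse_entity_header("0 HEAD"))          # None
print(parse_entity_header("1 FAMC @F1328@"))  # None
```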

Indentation

For greater clarity, lines of a record can be indented so as to more clearly show the relationship between the lines of the record. The information lines underneath a tag qualify the tag.

  • Non-indented record:
0 @I5@ INDI 					=> This defines individual number 'I5'
1 NAME John /Doe/ 				=> The individual's name is John Doe (the slashes mark the surname)
1 SEX M							=> This individual is a male
1 BIRT							=> What follows defines his birth event
2 DATE 16 APR 1951				=> John Doe was therefore born on April 16, 1951
1 FAMC @F1328@					=> Family F1328 is the record that defines John Doe's family (FAM) where he is a child (C)
  • Indented record:
0 @I5@ INDI 					=> This defines individual number 'I5'
	1 NAME John /Doe/ 			=> The individual's name is John Doe (the slashes mark the surname)
	1 SEX M						=> This individual is a male
	1 BIRT						=> What follows defines his birth event
		2 DATE 16 APR 1951		=> John Doe was therefore born on April 16, 1951
	1 FAMC @F1328@				=> Family F1328 is the record that defines John Doe's family (FAM) where he is a child (C)
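
Since the level number already encodes the depth of each line, the indented form can be derived mechanically from the non-indented one. A minimal Python sketch, where the function name `indent_record` and its `indent` parameter are assumptions made for this example:

```python
def indent_record(lines, indent="\t"):
    """Prefix each line with one indent unit per hierarchical level."""
    indented = []
    for line in lines:
        line = line.strip()
        level = int(line.split(" ", 1)[0])  # the level number leads the line
        indented.append(indent * level + line)
    return indented


# Sample record, for illustration only.
record = [
    "0 @I5@ INDI",
    "1 NAME John /Doe/",
    "1 BIRT",
    "2 DATE 16 APR 1951",
]
for line in indent_record(record, indent="  "):
    print(line)
```

The indentation is for display only: the level numbers, not the leading whitespace, carry the hierarchy.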

The Gedcom editor is the Ancestris editor that shows the exact information stored in the Gedcom file while enhancing its display. It uses an indented display and does not show level numbers. It also adds handles to show or hide sub-tag levels, making it easy to expand or collapse any branch.

  • This is how the same individual would show in the Gedcom editor:

[Screenshot: individual I5 displayed in the Ancestris Gedcom editor]

As you can see, the Gedcom editor enhances the display by adding icons and by fetching relevant hints.

In particular, the "@F1328@" piece of data is replaced, only in the display, not in the real Gedcom file, with the relevant information about the family. Here, we therefore immediately know that John's parents are named Martin and Kelly.

The name is also divided into its last name and first name parts.

Composition of a line in a record

Standard line

Each line of a record essentially contains the following elements:

  • The level number (from 0 to n),
  • The tag indicating the nature of the information contained on the line,
  • The information associated with that tag.

Example:

  • The line 2 DATE 16 APR 1951 can be read as follows: level 2 line, information of type DATE, with content 16 APR 1951 (16 April 1951)
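
The three elements of a standard line can be separated with a simple split on the first two spaces. This Python sketch is only an illustration: the names `GedcomLine` and `parse_line` are assumptions made for this example, and it deliberately handles only standard lines, not lines that carry an identifier before the tag.

```python
from typing import NamedTuple


class GedcomLine(NamedTuple):
    level: int  # hierarchical level, 0 to n
    tag: str    # nature of the information
    value: str  # information associated with the tag (may be empty)


def parse_line(line):
    """Split a standard line into level number, tag and value."""
    parts = line.strip().split(" ", 2)
    value = parts[2] if len(parts) > 2 else ""
    return GedcomLine(int(parts[0]), parts[1], value)


print(parse_line("2 DATE 16 APR 1951"))
# GedcomLine(level=2, tag='DATE', value='16 APR 1951')
print(parse_line("1 BIRT"))
# GedcomLine(level=1, tag='BIRT', value='')
```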

Reference to another entity

Some lines also contain a reference to another entity, which consists of a number surrounded by two at-signs (@). This reference is a special marker whose role depends on its position relative to the tag of the line.

  • A reference located to the left of the tag gives the number of the current record (a number that is always unique within its entity category): this only happens on the level 0 line of the record. Example:
    • 0 @I3@ INDI: main line of the entity described by this record, ID number of this record: I3, entity category: individual.
  • A reference located to the right of the tag gives the number of another record, and points to it in order to link it to the current record. Example:
    • 1 FAMC @F5@: level 1 line, tag FAMC (family from which the current individual descends) and reference F5 (in other words: the current individual descends from family F5)
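
The two positions of a reference can be told apart programmatically, which makes it possible to check that every reference to the right of a tag points to a record that actually exists. The following Python sketch is a simplified illustration: the patterns and the function name `find_dangling_references` are assumptions made for this example.

```python
import re

# level, optional identifier left of the tag, tag, optional value
LINE = re.compile(r"^(\d+) (?:@([^@]+)@ )?(\w+)(?: (.*))?$")
# a value made only of an identifier between two at-signs
POINTER = re.compile(r"^@([^@]+)@$")


def find_dangling_references(lines):
    """Return the identifiers that are pointed to but never defined."""
    defined, referenced = set(), set()
    for line in lines:
        match = LINE.match(line.strip())
        if match is None:
            continue  # malformed line, ignored in this sketch
        level, record_id, _tag, value = match.groups()
        if record_id is not None and level == "0":
            defined.add(record_id)            # left of the tag: the record's own ID
        pointer = POINTER.match(value or "")
        if pointer:
            referenced.add(pointer.group(1))  # right of the tag: another record
    return sorted(referenced - defined)


# Sample lines, for illustration only: family F6 is referenced but never defined.
sample = [
    "0 @I3@ INDI",
    "1 FAMC @F5@",
    "1 FAMS @F6@",
    "0 @F5@ FAM",
    "1 CHIL @I3@",
    "0 TRLR",
]
print(find_dangling_references(sample))  # ['F6']
```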

Gedcom standard

The Gedcom standard is the set of rules defining what may and may not be done, so that everyone stores genealogical information in the same way. It is therefore the grammar of the Gedcom language.

Two main versions of the standard exist, 5.5 and 5.5.1, the second being a slight evolution of the first. Some things allowed in the first are no longer allowed in the second, and vice versa. These differences are nevertheless limited.

Ancestris handles both the 5.5 and 5.5.1 standards.

At the bottom of this page you will find several links gathering all the documentation we have found on the Gedcom standards.

Here we offer a translation of the essential points of the standard and explain how they are used in Ancestris.

Gedcom 5.5 standard

Here you will find the whole 5.5 standard in detail, in the form of web links.

Table of contents

Letter from William S. Harten
Data model chart - Page 1 - Page 2
Introduction

Chapter 1: Data Representation Grammar

Chapter 2: Lineage-Linked Grammar (In French on this wiki: Gedcom grammar.)

Chapter 3: Using Character Sets in GEDCOM

Chapter 4: GEDCOM Product Registration
Appendix A: Lineage-Linked GEDCOM Tag Definition (In French on this wiki: Tag definitions)
Appendix B: Cross-References

Appendix C: LDS Temple Codes
Appendix D: ANSEL Character Set

Appendix E: Encoding and Decoding Multimedia Objects

Gedcom 5.5.1 standard

You can also consult the Gedcom 5.5.1 standard here as a PDF file in English: Gedcom 5.5.1 standard. Curiously, the two standards are not available in the same format.

The same document includes a comparison between the two standards.

Useful links