I. Description of the Data Set

This dataset is taken from the Arkiv Digital AD AB image and index database. When a child was born, the priest registered him or her in a church record book known as the Birth and Christening records. Each entry recorded the name of the child, the dates of birth and baptism, the place where the child was living, and information about the father and mother. The index is based on manual annotation of images from several books covering the years 1800 to 1840.

The dataset consists of 191,301 index rows and 15,000 images, divided into three subsets:
train: 133,941 index rows and 10,500 images
eval: 28,303 index rows and 2,250 images
test: 29,057 index rows and 2,250 images
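
As a quick sanity check, the split figures above can be summed and converted to proportions. The sketch below hard-codes the counts from the list above. The names `SPLITS` and `split_shares` are illustrative and not part of the dataset.

```python
# Counts copied from the split listed above: (index rows, images).
SPLITS = {
    "train": (133_941, 10_500),
    "eval": (28_303, 2_250),
    "test": (29_057, 2_250),
}


def split_shares(splits):
    """Return total rows, total images, and each split's share of the images."""
    total_rows = sum(rows for rows, _ in splits.values())
    total_images = sum(images for _, images in splits.values())
    shares = {name: images / total_images for name, (_, images) in splits.items()}
    return total_rows, total_images, shares


total_rows, total_images, shares = split_shares(SPLITS)
print(total_rows, total_images)
for name, share in shares.items():
    print(f"{name}: {share:.0%} of images")
```

By image count the split is 70/15/15. The index-row proportions differ slightly, presumably because pages do not all hold the same number of entries.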

Swedish county (län)
--------------------
Gävleborgs län - 23 982 index rows
Gotlands län - 9 925 index rows
Norrbottens län - 12 198 index rows
Västerbottens län - 16 118 index rows
Västernorrlands län - 21 014 index rows
Västmanlands län - 21 141 index rows
Älvsborgs län - 52 988 index rows
Örebro län - 33 935 index rows
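
The county counts use a space as the thousands separator, so a plain `int()` conversion fails on them. The sketch below shows one way to parse the lines and check that they add up to the dataset total. The counts are transcribed from the list above, and the names `COUNTY_LINES`, `LINE_RE` and `parse_county_counts` are illustrative only.

```python
import re

# County lines transcribed from the list above (space as thousands separator).
COUNTY_LINES = """\
Gävleborgs län - 23 982 index rows
Gotlands län - 9 925 index rows
Norrbottens län - 12 198 index rows
Västerbottens län - 16 118 index rows
Västernorrlands län - 21 014 index rows
Västmanlands län - 21 141 index rows
Älvsborgs län - 52 988 index rows
Örebro län - 33 935 index rows
"""

# County name, " - ", a number that may contain spaces, then "index rows".
LINE_RE = re.compile(r"^(?P<county>.+?) - (?P<count>\d[\d ]*\d|\d) index rows$")


def parse_county_counts(text):
    """Map each county name to its index-row count as an int."""
    counts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[match["county"]] = int(match["count"].replace(" ", ""))
    return counts


county_counts = parse_county_counts(COUNTY_LINES)
print(sum(county_counts.values()))
```

The eight county counts sum to the 191,301 index rows reported for the whole dataset.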

Description of the index columns
--------------------------------
id - Arkiv Digital AD AB ID in database
index_aid - Index AID (Arkiv Digital AD AB external ID)
county - County where the child was born or registered (usually not in the image)
parish - Parish where the child was born or registered (can be written at the top of the page or entirely missing from the image)
child_first_name - Given name of the child
birth_date - Date of birth, format YYYYMMDD (on the image it is usually written DD/MM, with the year at the top of the page)
baptism_date - Date of baptism, format YYYYMMDD (on the image it is usually written DD/MM, with the year at the top of the page)
birth_place - Place of birth
father_title - Title or occupation of the father
father_first_name - Given name of the father
father_last_name - Surname of the father
father_age - Age of the father when the child was born (available only in the master dataset, SHIBRm)
mother_title - Title or occupation of the mother
mother_first_name - Given name of the mother
mother_last_name - Surname of the mother
mother_age - Age of the mother when the child was born
image_aid - Image AID (Arkiv Digital AD AB external ID)
image_path - Relative path to the image (images/)
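
Because `birth_date` and `baptism_date` are stored as YYYYMMDD strings, they have to be parsed before any date arithmetic. The following is a minimal sketch, not the authors' tooling. The helper `parse_yyyymmdd` and all values in the sample row are invented for illustration. Returning `None` for empty or invalid values is an assumption about how missing dates might appear, not something the dataset description states.

```python
from datetime import date, datetime
from typing import Optional


def parse_yyyymmdd(value: str) -> Optional[date]:
    """Parse a YYYYMMDD string; return None if it is empty or not a valid date."""
    value = (value or "").strip()
    if len(value) != 8 or not value.isdigit():
        return None
    try:
        return datetime.strptime(value, "%Y%m%d").date()
    except ValueError:
        # For example an impossible day such as 30 February.
        return None


# Hypothetical index row: every value here is invented for illustration only.
row = {
    "child_first_name": "Anna",
    "birth_date": "18210314",
    "baptism_date": "18210318",
    "image_path": "images/example.jpg",
}

born = parse_yyyymmdd(row["birth_date"])
baptized = parse_yyyymmdd(row["baptism_date"])
if born and baptized:
    # Number of days between birth and baptism for this row.
    print((baptized - born).days)
```

Keeping invalid dates as `None` rather than raising lets the remaining columns of a row be used even when a date could not be read from the image.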




Figure 1. SHIBR and what it serves: per discipline (a) and per community (b).

II. Use of the Materials

The users of the SHIBR Data Set must agree that:
  1. Use of the dataset is restricted to research purposes only.
  2. No redistribution of the dataset is allowed.
  3. In any resulting publication of research that uses the dataset, due credit will be given to:
    Abbas Cheddad, Hüseyin Kusetogullari, Agrin Hilmkil, Lena Sundin, Amir Yavariabdi, Mustapha Aouache, Johan Hall; "SHIBR-The Swedish Historical Birth Records: A Semi-Annotated Dataset," Neural Computing & Applications, Springer, 2021.

Link to the paper (open access)

III. Download Links


SHIBR Data Set (this page was last updated on 2021-06-08):
  1. Training Set: This set contains 10,500 RGB images in JPG format.
  2. Validation Set (the eval split): This set contains 2,250 RGB images in JPG format.
  3. Test Set: This set contains 2,250 RGB images in JPG format.

Download_link_Kaggle

IV. Feedback or Comments


We welcome your feedback and suggestions for improving the dataset.



Karlskrona, Sweden on: 2021-06-08

Blekinge Institute of Technology