Synchronising colonial heritage: a linked data approach
Authors: Chris Dijkshoorn, Jauco Noordzij, Sjors de Valk
From the seventeenth century onwards, the Netherlands had trading posts and colonies in Asia, Africa and North and South America. Throughout this period many cultural objects from these regions were brought to our country. These objects can still be seen in museums today. However, it is often unclear which objects should be considered 'colonial heritage' and, for those that are, whether they were acquired lawfully or against the will of their original owners. A new digital platform helps users gain insight into this: the Colonial Collections datahub.
The datahub is being developed by a consortium of heritage institutions with expertise in colonial heritage. Researchers from both the countries of origin and the Netherlands can use the datahub to trace the origins of objects and determine their current location. The Rijksmuseum participates in the consortium.
The Rijksmuseum, like the other partner institutions, wants to make its information about colonial heritage available to the consortium. This means that a so-called data synchronisation must be created between the applications of the museum and those of the consortium, in order to get the information from one application to the other. But how should this synchronisation work? A team from the Rijksmuseum and the consortium has worked together on this over the past few months.
Our approach is founded on two principles. The first principle: information must be made accessible in a standardised manner. 'Standardised' means that no specific agreements or custom solutions are required. This makes it easy for both the museum and the consortium to exchange information. 'Standardised' also means that the information is published as linked data, a method to expose information in a structured way, connected to information from other sources. The Rijksmuseum uses linked data extensively, as previous blog posts show.
The second principle takes it one step further. Both the Rijksmuseum and the consortium want to offer up-to-date information to users. This may sound obvious – who wouldn't want that? But this is not that common in the heritage sector. Usually an institution provides a new version of its information periodically, for example once a week, month or year. As a result users have to use old information until a new version is published.
We investigated how up-to-date information can be made available via data synchronisation. 'Up-to-date' means that any change to the museum's data – such as the addition of a new collection item to the collection management system – is immediately passed on to the datahub. Our investigation consisted of two phases.
In the first phase we explored which solutions already exist. We reviewed the following:
- OAI-PMH is a tried and tested way to exchange information. It is widely used in the heritage sector and the Rijksmuseum has considerable experience with it. On the other hand, it is an outdated protocol, not specifically intended for synchronising linked data.
- ResourceSync is often seen as the modern-day successor to OAI-PMH. Yet ResourceSync is little used and – like OAI-PMH – not designed for linked data.
- Git is a popular version control system; you can use it to store files and their changes. The information about collection items can also be stored in files. This would allow the datahub to look in the Git system of the Rijksmuseum and retrieve the latest information. Although an elegant solution, Git is not a standard for synchronising linked data – we need specific agreements to be able to use it.
- Linked Data Notifications (LDN) is a standardised linked data solution. It allows applications to exchange information via messages. It is a well-thought-out but generic protocol – it is not tailored to data synchronisation. If we want to use LDN for that purpose, we need to extend it.
- Linked Data Event Streams (LDES) is a new protocol. It is maintained by SEMIC, the Semantic Interoperability Community of the European Commission. LDES is promising: it is based on linked data and ideal for data synchronisation. However, the protocol's specification has not yet been completed.
- IIIF Change Discovery is also a new protocol, rooted in linked data. It was created by the community that developed the International Image Interoperability Framework (IIIF), a standard for delivering images over the web. With Change Discovery, a data provider can indicate which records have been changed; a data consumer can then process those. A limitation is that the protocol is part of IIIF, which could suggest that it is only suitable for communicating changes about images, not about collection items in general. Furthermore, it only signals that a record has changed, not what has changed, such as the title or the description.
All solutions have pros and cons – there is no best solution. To be able to make a choice, we focused on the solutions developed for both data synchronisation and linked data: LDES and Change Discovery. We then created proofs of concept for both. Based on these experiences, we opted for Change Discovery: the protocol is well-specified and straightforward to implement.
In the second phase we created a data synchronisation based on Change Discovery.
First, the Rijksmuseum team developed a change discovery service. When information about an item changes in the collection management system, the service makes clear which item it concerns. Any change can be communicated in this way, such as an addition (a new item is added), an edit (the information about an existing item is updated) or a deletion (the information about an item is removed). The service is part of the museum's integration layer, the system through which collection information will be made digitally accessible.
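To make this more concrete, here is a minimal sketch in Python of how such a service could describe changes. It is not the Rijksmuseum's actual implementation. The activity types (Create, Update, Delete) and the page structure follow our reading of the IIIF Change Discovery specification; the function names, the example URIs and the object type "HumanMadeObject" are assumptions made for illustration.

```python
from datetime import datetime, timezone

# Map the three kinds of change to the activity types used by Change Discovery.
ACTIVITY_TYPES = {"addition": "Create", "edit": "Update", "deletion": "Delete"}


def make_activity(change_kind, item_uri, changed_at):
    """Describe one change: which item it concerns, what kind it is, and when."""
    return {
        "type": ACTIVITY_TYPES[change_kind],
        # Assumption: the object type is illustrative, not taken from the museum.
        "object": {"id": item_uri, "type": "HumanMadeObject"},
        # changed_at must be timezone-aware; it is written out as UTC.
        "endTime": changed_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


def make_page(page_uri, collection_uri, activities):
    """Bundle activities into one page, ordered from oldest to newest."""
    return {
        "@context": "http://iiif.io/api/discovery/1/context.json",
        "id": page_uri,
        "type": "OrderedCollectionPage",
        "partOf": {"id": collection_uri, "type": "OrderedCollection"},
        "orderedItems": sorted(activities, key=lambda a: a["endTime"]),
    }


# Hypothetical usage with made-up URIs.
edit = make_activity(
    "edit",
    "https://example.org/item/1",
    datetime(2023, 5, 1, 12, 0, tzinfo=timezone.utc),
)
page = make_page(
    "https://example.org/changes/page/1",
    "https://example.org/changes",
    [edit],
)
```

Note that an activity only names the item and the kind of change. It does not carry the changed title or description itself.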
Second, the Colonial Collections team developed a change discovery client. This is an application that queries the museum's service at regular intervals – for example once an hour – and asks whether collection items have been changed. If so, the client retrieves the latest information about the items and stores it in the datahub.
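The client's logic can be sketched roughly as follows. This is an illustration under stated assumptions, not the consortium's actual code: the pages are passed in as a plain dictionary instead of being fetched over HTTP, the datahub is represented by a dictionary, the function names are hypothetical, and all timestamps are assumed to be UTC strings in the same format so that they can be compared as text. The client walks back from the newest page, keeps only the most recent activity per item, and stops once it reaches changes it has already seen.

```python
def collect_changes(collection, pages, last_sync):
    """Return {item_uri: activity_type} for all changes made after last_sync.

    `pages` maps a page URI to its page; it stands in for HTTP requests.
    """
    latest = {}
    page_ref = collection.get("last")  # start at the newest page
    while page_ref:
        page = pages[page_ref["id"]]
        # Within a page the activities run oldest to newest, so read backwards.
        for activity in reversed(page["orderedItems"]):
            if activity["endTime"] <= last_sync:
                return latest  # everything older was handled in an earlier run
            # Keep only the first (most recent) activity seen for each item.
            latest.setdefault(activity["object"]["id"], activity["type"])
        page_ref = page.get("prev")
    return latest


def apply_changes(changes, fetch_item, datahub):
    """Remove deleted items; fetch and store the latest version of the others."""
    for item_uri, activity_type in changes.items():
        if activity_type == "Delete":
            datahub.pop(item_uri, None)
        else:
            datahub[item_uri] = fetch_item(item_uri)
```

Running `collect_changes` followed by `apply_changes` once an hour, with `last_sync` set to the time of the previous run, would give the polling behaviour described above.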
The Rijksmuseum and the Colonial Collections consortium now have a data synchronisation in place for exchanging up-to-date information. But there is still work to do. The change discovery service currently publishes changes about all collection items of the museum. However, the consortium is only interested in items that are colonial heritage. How can these be recognised? This is not just a technological question – it is foremost a question about the definition of 'colonial heritage', the provenance of items and the information that the Rijksmuseum has about its items. Once this question has been answered, the museum's colonial heritage can be presented in the datahub and give insight into our colonial past.
This blog post has also been published on the blog of the Research Services department of the Rijksmuseum.