14 May 2012

Libfocus Journal Club - Improving the presentation of library data using FRBR and Linked data

The following post is the first of a new feature on libfocus: a virtual journal club. Please feel free to comment on the article linked below.

Improving the presentation of library data using FRBR and Linked data By Anne-Lena Westrum,
Asgeir Rekkavik and Kim Tallerås


When a library end-user searches the online catalogue for works by a particular author, he will typically get a long list that contains different translations and editions of all the books by that author, sorted by title or date of issue. As an attempt to make some order in this chaos, the Pode project has applied a method of automated FRBRizing based on the information contained in MARC records. The project has also experimented with RDF representation to demonstrate how an author’s complete production can be presented as a short and lucid list of unique works, which can easily be browsed by their different expressions and manifestations. Furthermore, by linking instances in the dataset to matching or corresponding instances in external sets, the presentation has been enriched with additional information about authors and works.

Full article available here

The authors used the building of a new public library as a means of exploring how their metadata can be used in a way that can help to improve user experience. They used the work of one author to check how easy it was to access his work through the catalogue. The central problem they found was that their existing catalogue did not distinguish between the author's works and different versions of one work. The central hypothesis of the authors is that patrons generally only care about finding a title they are interested in, not a particular edition of that work.

They found that existing MARC records were not fit for this purpose, so a tool was developed for automated FRBRisation of existing MARC records. Unfortunately, a huge clean-up of the existing MARC records had to take place before FRBRisation. When the new and enriched dataset was completed, the project developed a web application that allows an end user to browse this part of the collection by choosing from a much shorter list of works.
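To give a rough feel for what automated FRBRisation involves, here is a minimal Python sketch. It is not the Pode project's tool or algorithm. It assumes records have already been flattened into simple dictionaries keyed by MARC 21 style tags (100a for author, 240a for uniform title, 245a for title proper) plus a hypothetical "lang" key, and it groups them into work, expression (by language) and manifestation. The sample records are invented for illustration.

```python
from collections import defaultdict
import re


def normalise(text):
    """Lower-case and strip punctuation so minor cataloguing variations match."""
    return re.sub(r"[^\w\s]", "", text.lower()).strip()


def work_key(record):
    """Build a grouping key from author and original (uniform) title.

    Falls back to the title proper when no uniform title is recorded.
    """
    title = record.get("240a") or record["245a"]
    return (normalise(record["100a"]), normalise(title))


def frbrise(records):
    """Group records as work -> expression (language) -> manifestation titles."""
    works = defaultdict(lambda: defaultdict(list))
    for record in records:
        works[work_key(record)][record["lang"]].append(record["245a"])
    return works


# Invented sample records, for illustration only.
records = [
    {"100a": "Hamsun, Knut", "245a": "Sult", "lang": "nor"},
    {"100a": "Hamsun, Knut", "240a": "Sult", "245a": "Hunger", "lang": "eng"},
    {"100a": "Hamsun, Knut", "240a": "Sult.", "245a": "Hunger", "lang": "eng"},
    {"100a": "Hamsun, Knut", "245a": "Pan", "lang": "nor"},
]

works = frbrise(records)
for (author, title), expressions in works.items():
    print(author, "/", title, "->", dict(expressions))
```

Note that the grouping only works when the uniform title is present and consistently entered, apart from the small punctuation differences that normalise() smooths over. A translation catalogued without its original title would end up as a separate work, which hints at why so much manual clean-up was needed first.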

Their final conclusion is that while libraries may find it tempting to convert to modern principles of metadata quality, such as RDF, this may well be a tedious and long-winded approach. Their clean-up alone took 60 hours of labour, and they were only working on a very limited data set. The authors did show that the process, however laborious, achieved a positive result for end users. Still, I think it seems like a rather limited outcome for such an investment in hours. The key issue here is that this was not a full library catalogue, only a tiny proportion of one. So the question is: is such an expenditure of time and effort worthwhile to achieve modern metadata standards?

1 comment:

  1. Nice choice Ronan, thanks :)

    I think the authors hit the nail on the head when they state that “one cannot create better services based on already existing metadata, than what the quality of the metadata will support”. The inconsistencies in cataloguing practices over time and existing MARC records make automated FRBRisation (or indeed any conversion) a time-consuming challenge. The appendix even highlights material human error issues such as the need to correct typos in the 245 field(!) and incorrect ISBD syntax. Cleaning up basic errors like this is obviously desirable even without the FRBRisation process, but poor quality or inaccurate metadata may not be as prevalent for all catalogues (*wishful thinking*). Therefore the costs may not necessarily be as substantial as the authors experienced in this instance.

    The efficiency and usability gains from being presented with 40 works rather than 585 results are very valuable in my view. If the access point for searching continues to shift away from the library catalogue and towards resource discovery layers which also offer FRBRised groupings, it makes sense to look at your catalogue in this context also. Larger libraries may have the time and resources to clean up their metadata accordingly, but for smaller libraries where users often use the ‘check the shelf’ method, cleaning up records and metadata may yield little tangible benefit, particularly given the potential costs involved.