Last week I was lucky enough to head to Amsterdam for iPres 2019, the International Conference on Digital Preservation (iPres is the shorthand for both the organizing body of this conference and the conference itself).
This year iPres was held at the EYE Film Museum, the national museum for film located on Amsterdam’s IJ harbour. The conference program included fun activities such as hackathons and the “Great Digital Preservation Bakeoff” as well as more traditional conference offerings like a poster and demo session, panels, and paper presentations.
This year at iPres I myself did a fair amount of presenting — I had a panel describing a qualitative study on workplace dissatisfaction amongst digital preservation practitioners (here are the collaborative notes), a poster for the IASGE project (view the poster online), and a paper on a project to archive data journalism (read it on the LIS Scholarship Archive). Even though I was quite busy with all the preparations, given the great spread of the program I was able to attend some wonderful sessions.
So without further ado, here are the 5 things I learned at iPres 2019!
- Memento Tracer: a framework for scalable high-quality web archiving. Martin Klein led a workshop on Monday afternoon, and you can read the collaborative notes here. Memento Tracer has 3 essential parts: a browser extension that records Traces (a set of instructions for capturing the essence of web publications of a certain class, like slides on SlideShare or GitHub repositories), a repository where anyone can upload/download/reuse Traces (this is great, because that means Traces can be versioned! and no one has to reinvent the wheel!), and a headless browser setup that uses Traces as guidance while it navigates and captures web publications (so if we have one working Trace for SlideShare, we can use it for all the SlideShare slides that we want!). Martin explained that Memento Tracer can be used in conjunction with ORCID to track researchers across all the platforms they use for scholarship and preserve their work. You can see examples of how this works for 16 test (but real!) researchers at: https://myresearch.institute.
- SARA – Software Archiving of Research Artifacts: this was the poster next to our IASGE poster, and had a very similar goal of preserving academic code! The goal of SARA is to “enable [researchers] to capture the intermediate statuses of their research work already during the process […] The collected research data and the different versions of the associated software tools are therefore traceable for later research.” Right now the requirement for capturing a Git repository is that it must exist in GitLab (which I love!), but I’m going to keep my eye on this project for next steps.
- The Universal Virtual Interactor (UVI): a part of the Emulation as a Service Infrastructure ecosystem, the UVI is a program that allows users to click on a file (like a CAD file from 1990) and have it open in the original program and computing environment (like AutoCAD 1990 in the appropriate old version of Windows!) using an emulator, in their browser! It also lets users click around and interact (hence the name) with the files and operating systems/old software. It is designed to work for any file, though for files that can be opened by many different programs (like .txt files), this can be tricky: which one should the UVI choose? Check out the gif below demonstrating clicking a link to automatically open a Microsoft Works file running in Windows 98 within a web browser (from the UVI DPC blog post):
- The file formats most present in the Library of Congress’s collections! By far, file extensions for images (.jpg, .tif, .jp2) are the most prevalent in LOC’s digital collections, followed by extensions typically associated with documents (.txt, .pdf, and .xml). Even though those are the most common by count, GZip files (.gz) dominate the collection by size. There are roughly 3,937.79 TB worth of .gz files!
- Setting Up Open Access Repositories: Challenges and Lessons from Palestine: sadly, I wasn’t able to attend this presentation (it overlapped with one of mine!) but I was really intrigued by the content of the paper, which describes a “holistic approach for deploying open access repositories and building research data management services.” The authors took four Palestinian universities as case studies. Given that some of our data and information literacy classes in the library examine the digital occupation of the Gaza Strip, I am also interested to see how this scholarship might be included.
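
A quick aside on the Library of Congress item above: the "most common by count" versus "biggest by size" distinction is easy to try on your own collection. Here is a minimal Python sketch of that comparison. The function name `tally_extensions` and the sample inventory are my own invention for illustration (the file names and sizes are made up, not LOC data), and this is not how LOC produced its numbers.

```python
from collections import Counter
from pathlib import PurePosixPath


def tally_extensions(files):
    """Tally files per extension two ways: by file count and by total bytes.

    `files` is an iterable of (file_name, size_in_bytes) pairs.
    """
    counts = Counter()
    sizes = Counter()
    for name, size_bytes in files:
        # Only the final suffix is used, so "x.tar.gz" counts as ".gz".
        ext = PurePosixPath(name).suffix.lower() or "(none)"
        counts[ext] += 1
        sizes[ext] += size_bytes
    return counts, sizes


# Hypothetical inventory: names and sizes are invented for this example.
sample = [
    ("scan_001.jpg", 2_000_000),
    ("scan_002.jpg", 2_500_000),
    ("scan_003.tif", 40_000_000),
    ("notes.txt", 4_000),
    ("bundle_2019.gz", 1_000_000_000),
]

counts, sizes = tally_extensions(sample)
top_by_count = counts.most_common(1)[0][0]
top_by_size = sizes.most_common(1)[0][0]
print("Most common by count:", top_by_count)
print("Largest by total size:", top_by_size)
```

In this toy inventory the image extension wins on count while the single compressed file wins on size, which is the same shape of result described in the LOC item, just at a much smaller scale.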