5 Things I Learned at … IASSIST 2019

At the end of May, I headed to Sydney, Australia for this year’s IASSIST conference! IASSIST is a great organization that puts on an awesome conference, and I try to attend as often as my schedule allows. This year, I also was able to present on some work with ReproZip and ReproServer (slides on the OSF!), as well as participate on a panel about promoting reproducibility as a partner in campus-wide efforts.

Great ReproZip presentation by @VickySteeves !!! #iassist19 pic.twitter.com/skTbO1ezTT
— Eimmy Solis (@eimmysolis) May 31, 2019

While I was there, I learned about some great initiatives, projects, and tools! So without further ado, here are the 5 things I learned at IASSIST 2019:

IASSIST Qualitative Social Science and Humanities Data Interest Group: formed in 2016, this interest group is meant to generate conversations around the needs of researchers who work with qualitative data and methods, and what types of services librarians and other information professionals can develop to support these researchers. I had never been to a special interest group meeting at IASSIST before, but this was a great first one! It was very well-organized, and I loved hearing about the group's agenda.
University of Washington Libraries Data Services RDM MOOC: the Data Services team at UW pulled from existing curricula in RDM and expertise in the library, and edited the materials down into a Massive Open Online Course (MOOC). They scoped it as a non-credit, 4-day class with about a 1 hr/day time commitment and a 1:8 tutor-to-student ratio. The course was offered once each in Winter, Spring, and Summer, with attendance ranging from 40 to 110. Learners in the MOOC overwhelmingly reported both that the class was very clear about its contents (expectation management!) and that the class exceeded expectations. A nice model for a global campus to learn from, too!
ipysheet: a neat add-on for Jupyter notebooks and JupyterLab that lets folks edit spreadsheets right in the notebook! The add-on makes a widget that can be embedded in a code cell. Try out a live version of it on Binder: https://mybinder.org/v2/gh/QuantStack/ipysheet/master?filepath=docs%2Fsource%2Findex.ipynb
Cornell Institute for Social and Economic Research (CISER) houses R-squared, a research verification service at Cornell University. The staff at R-squared aims to reproduce the claims from a paper or report by taking the data and code and rerunning it to check whether the reported output is valid. The staff then creates an archive package with all information in one place and deposits it into their data archive. On their site, they say the average review takes 4-8 hours. They do not check methods, replicate studies, question conclusions/theories, or make any direct changes to patrons’ work.
QAMyData, an open source data quality assurance tool for SPSS, Stata, and SAS files, written in the Rust programming language (one of my faves!). The speakers mentioned that the tool also has partial support for CSVs. QAMyData aims to “automatically detect some of the most common problems in survey and other numeric data and creates a ‘data health check’, assisting with the clean up of data and providing an assurance that data is of a high quality.” It’s in a very alpha state, so I look forward to seeing this grow and add support for open formats!
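
To make the "data health check" idea a bit more concrete, here's a minimal sketch in Python of what a couple of automated checks on a CSV could look like. To be clear, this is not QAMyData's code or its rule set: the `health_check` function, the sample data, and the list of missing-value tokens are all made up by me for illustration, and I'm only guessing that checks like "missing values per column" and "duplicate IDs" are among the common problems such a tool would look for.

```python
import csv
import io
from collections import Counter


def health_check(csv_text, id_column, missing_tokens=("", "NA")):
    """Hypothetical helper: report missing values and duplicate IDs in CSV text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    # Count, per column, how many cells are empty or hold a missing-value token.
    missing = Counter()
    for row in rows:
        for name, value in row.items():
            if name is None:
                continue  # extra cells beyond the header; ignored in this sketch
            if value is None or value.strip() in missing_tokens:
                missing[name] += 1

    # Flag any identifier that appears on more than one row.
    id_counts = Counter(row[id_column] for row in rows)
    duplicate_ids = sorted(key for key, n in id_counts.items() if n > 1)

    return {
        "rows": len(rows),
        "missing": dict(missing),
        "duplicate_ids": duplicate_ids,
    }


# Made-up survey data with one blank age, one "NA" income, and a repeated id.
SAMPLE = """id,age,income
1,34,52000
2,,48000
2,41,NA
3,29,61000
"""

report = health_check(SAMPLE, id_column="id")
print(report)
```

A real tool would need many more checks (value ranges, label consistency, disclosure risk, and so on) and would read SPSS, Stata, or SAS files directly, but the shape is similar: run a fixed set of tests and summarize what failed in one report.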