Citing & Being Cited:

Code & Data Edition


Vicky Steeves & Nick Wolf | DATE


Agenda!


  1. Data Citations: Yours & Others
  2. Code Citations: Yours & Others
  3. Managing Citations (AKA keeping your sanity)

The Why of
Data & Code Citation


Data and code should be cited within our work for the same reasons journal articles are cited:

to give credit where credit is due (original author/producer) and to help other researchers find the material.

Citing Data

A data citation includes the typical components of other citations:

  1. Author or creator: the entity/entities responsible for creating the data
  2. Date of publication: the date the data was published or otherwise released to the public
  3. Title: the title of the dataset, or a brief description of it if it has no title
  4. Publisher: entity responsible for hosting the data
  5. URL or preferably, a DOI

Citing Data

Some optional (but recommended) values include:

  1. Edition or version
  2. Date accessed online

Styling Citations

Chicago

Authors. Title. Place written: Organization, Date of publication. Distributor. Link.

MLA

Authors. Title, version. Place Written: Organization, Date of publication. Access method, date accessed. Link.

APA

Authors (first & middle name abbreviated). (Year). Title (Version number) [Description of form]. Location: Name of producer. Link.
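
To make the APA pattern above concrete, here is a minimal Python sketch that assembles a data citation from its components. The class name, field names, and the example dataset (including its DOI) are all made up for illustration; always check the result against the style guide you are actually using.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataCitation:
    authors: List[str]              # already abbreviated, e.g. "Doe, J. A."
    year: int
    title: str
    location: str
    producer: str
    link: str                       # a DOI link is preferred over a plain URL
    version: Optional[str] = None   # optional but recommended
    form: str = "Data set"          # the "description of form" slot


def join_authors(authors: List[str]) -> str:
    """Join author names with commas and an ampersand before the last one."""
    if len(authors) == 1:
        return authors[0]
    return ", ".join(authors[:-1]) + ", & " + authors[-1]


def format_apa(c: DataCitation) -> str:
    """Follow the slide's pattern: Authors (Year). Title (Version) [Form]. Location: Producer. Link."""
    version = f" (Version {c.version})" if c.version else ""
    return (
        f"{join_authors(c.authors)} ({c.year}). {c.title}{version} "
        f"[{c.form}]. {c.location}: {c.producer}. {c.link}"
    )


# Hypothetical dataset, used only to show the output shape.
example = DataCitation(
    authors=["Doe, J. A.", "Roe, R."],
    year=2017,
    title="Example survey data",
    location="New York, NY",
    producer="Example Data Archive",
    link="https://doi.org/10.1234/example",
    version="2",
)
citation = format_apa(example)
print(citation)
```

Leaving `version` unset simply drops the "(Version ...)" part, which matches its status as an optional but recommended value.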

But but....I want to make MY data citable!

Step 1: Clean & Prep Your Data!

Step 1a: Put your data into an open format!

  • Be software agnostic!

Step 1b: Package Your Materials

  • Data files
  • Documentation & description
  • Analysis tools if possible
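
One way to bundle the three pieces above is a single zip archive. The sketch below is only an illustration: the function name, the "must have a README" rule, and the file names in the demo are our assumptions, not a repository requirement.

```python
import tempfile
import zipfile
from pathlib import Path
from typing import List


def package_materials(folder: Path, archive: Path) -> List[str]:
    """Zip every file under `folder`, refusing to proceed without a README."""
    files = sorted(p for p in folder.rglob("*") if p.is_file())
    if not any(p.name.lower().startswith("readme") for p in files):
        raise FileNotFoundError("Add a README describing the data before packaging.")
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            # Store paths relative to the folder so the archive unpacks cleanly.
            zf.write(path, path.relative_to(folder).as_posix())
    with zipfile.ZipFile(archive) as zf:
        return zf.namelist()


# Demo on a throwaway folder; the file names are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "project"
    project.mkdir()
    (project / "README.txt").write_text("What the data is and how it was collected.\n")
    (project / "data.csv").write_text("id,value\n1,3.5\n")
    (project / "analysis.py").write_text("# analysis script goes here\n")
    packaged = package_materials(project, Path(tmp) / "project.zip")

print(packaged)
```

The archive is written next to the project folder rather than inside it, so it never ends up packaging itself.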

Step 2: Publish/Make Available

When you publish, you should make the underlying data available in a repository that issues DOIs! You then link that DOI out!

This means that anyone who wants to use your data must go to this repository, download it, and cite their use if they publish using it!
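
DOIs get written in several ways ("doi:10.…", a bare "10.…", an older dx.doi.org link), so it can help to normalize them before linking out. This is a small sketch under our own assumptions: the function name is hypothetical, the validity check is a loose pattern rather than an official rule, and the example DOI is made up.

```python
import re


def doi_to_link(doi: str) -> str:
    """Turn a DOI written in any common form into a https://doi.org/ link."""
    bare = re.sub(
        r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", doi.strip(), flags=re.IGNORECASE
    )
    # Loose sanity check: a "10." prefix, a registrant code, a slash, a suffix.
    if not re.match(r"^10\.\d{4,9}/\S+$", bare):
        raise ValueError(f"This does not look like a DOI: {doi!r}")
    return f"https://doi.org/{bare}"


# Hypothetical DOI, written three different ways.
links = [
    doi_to_link("doi:10.1234/example"),
    doi_to_link("10.1234/example"),
    doi_to_link("https://dx.doi.org/10.1234/example"),
]
print(links)
```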



Step 2: Publish/Make Available

You can also publish your data in a data journal! These are domain-specific or journal-specific peer-reviewed journals just for data!
These also give you a DOI that you can share.

Getting Data Cited

Advantages to Tracking Citations:

  • Demonstrate to funders/promotion committees you & your data make big impacts in your field!
    • they judge merit based on intellectual merit and wider impact
    • tangible evidence to weigh against the cost of research

  • Monitor usage of datasets!
    • You can know what forms of data prep and data publication are most effective for sharing/open science!
    • Uncover opportunities for collaboration amongst peers

Examples


Exploring Dryad

Exploring Open Science Framework

Citing Code


Just like data, sharing and archiving of software is best done in repositories and journals!

But what software should you cite?

To start, you would only cite code that's not universal. Don't cite Microsoft Office, but DO cite scikit-learn!

Code Anomalies:
Authors vs. Contributors


A piece of software might be created by dozens, if not thousands, of contributors. Do all of them get cited? No!

It's the difference between the maintainer(s) of a project (who is/are currently responsible for it), and the contributors (those who have committed code to the project, or made other contributions). You would cite the maintainers, and possibly also previous maintainers.

Code Anomalies:
Location?? Publisher??


Publisher name/location is similarly difficult. This could be optional, but not when software is produced solely by a specific university or software company.

The geographic location is probably irrelevant, unless it's necessary for distinguishing between multiple entities with the same name.

Styling Citations

Chicago

Software Name. Location: Publisher/Author, Date. Link if available.

MLA

Authors, Software Name, Place Written: Organization, Date Written. Link if available.

APA

Author Name (first & middle name abbreviated). Title of Software/Code. [Computer software]. Location: Publisher. Link if available.
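
Because publisher and location are often optional for software, a formatter has to cope with missing pieces. Here is a rough Python sketch of the APA pattern above; the function name and the example package, names, and URL are invented for illustration, and the output should still be checked against your style guide.

```python
from typing import List, Optional


def cite_software(
    maintainers: List[str],
    title: str,
    publisher: Optional[str] = None,
    location: Optional[str] = None,
    link: Optional[str] = None,
) -> str:
    """Build 'Authors. Title. [Computer software]. Location: Publisher. Link.'"""
    if len(maintainers) == 1:
        names = maintainers[0]
    else:
        names = ", ".join(maintainers[:-1]) + ", & " + maintainers[-1]
    if not names.endswith("."):
        names += "."
    parts = [names, f"{title}.", "[Computer software]."]
    if publisher:
        # Location only matters when it helps tell publishers apart.
        parts.append(f"{location}: {publisher}." if location else f"{publisher}.")
    if link:
        parts.append(link)
    return " ".join(parts)


# Hypothetical package: cite the maintainers, not every contributor.
community_project = cite_software(
    ["Doe, J.", "Roe, R."], "examplepkg", link="https://example.org/examplepkg"
)
university_project = cite_software(
    ["Doe, J."], "examplepkg", publisher="Example University"
)
print(community_project)
print(university_project)
```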

But but....I want to make MY code citable!

Step 1: Clean & Prep Your Code!

Step 1a: Comment & style your code!

Step 2: Publish/Make Available

The two most popular repositories are nanoHUB (about 2,000 DOIs for software) and Zenodo (close to 5,000 DOIs for software).

nanoHUB runs on the open source HUBzero software, which integrates a Subversion code repo. Zenodo integrates with GitHub to give a DOI to each new release of a Git repo.



GitHub + Zenodo

  1. Log into Zenodo using your GitHub account.
  2. Zenodo will redirect you back to GitHub to ask for permissions. Grant them.
  3. Pick the public repository you want to publish.
  4. Check to make sure there is now a Zenodo webhook in the repo you chose.
  5. Release a version!
  6. Add a brief description in Zenodo, and that version gets a DOI and a badge!
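
Once a release has its DOI, the badge usually goes at the top of the repo's README. The sketch below builds that Markdown line in Python. The badge URL pattern is our assumption about what Zenodo displays, and the DOI is a placeholder, so copy the real snippet from your Zenodo record page when you have one.

```python
def zenodo_badge(doi: str) -> str:
    """Return a Markdown badge that links to the DOI's landing page."""
    # Assumed badge image pattern; confirm against the snippet Zenodo gives you.
    image = f"https://zenodo.org/badge/DOI/{doi}.svg"
    target = f"https://doi.org/{doi}"
    return f"[![DOI]({image})]({target})"


# Placeholder DOI, not a real record.
badge = zenodo_badge("10.5281/zenodo.1234567")
print(badge)
```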

Step 2: Publish/Make Available

You can actually submit your software to journals for peer review! So cool and futuristic~

Examples


Exploring Zenodo

Exploring the Journal of Open Source Software

But how do I keep track of all these citations??



Citation Management Software!

Citation Management Tools enable you to:

  • Import citations from databases, websites, catalogs
  • Organize citations using folders and tags
  • Attach PDFs, images, etc. to your citations
  • Annotate your citations and/or PDFs
  • Output auto-formatted bibliographies and in-text citations (APA, MLA, & hundreds more styles)

Using Zotero for Citation Management

Zotero

  1. Zotero is what's called a bibliographic management tool.

  2. Items in a Zotero library can be annotated with tags and text notes for YOUR convenience!

  3. Zotero will also store links to files, either as a standalone library entry, or as a child element linked to a bibliographic record.

Note: NYU Libraries provides support for learning Zotero. See the Zotero LibGuide for more information.

Why Zotero over others?

  1. Free and open source! See the GitHub repo~
  2. Save citations !IN BROWSER! using their Firefox and Chrome extensions.
  3. The Zotero toolbar is easy to use and works in both Microsoft Word and OpenOffice, letting you insert citations into your paper quickly.
  4. Lots of flexibility in the types of materials you can cite!

Manually Adding Citation to Zotero

Zotero captures citation information about a variety of materials -- notice that "computer program" is one of them! And "data" is coming soon.
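
If typing entries by hand gets tedious, another route is to write a small RIS record and import it. The Python sketch below builds one; the function name and the example software are hypothetical, and exactly how a citation manager maps RIS types such as COMP (computer program) or DATA (dataset) onto its own item types is an assumption worth checking after import.

```python
from typing import List, Optional


def to_ris(
    item_type: str,
    authors: List[str],
    title: str,
    year: int,
    publisher: Optional[str] = None,
    doi: Optional[str] = None,
    url: Optional[str] = None,
) -> str:
    """Build one RIS record: a two-letter tag, two spaces, a hyphen, a space, then the value."""
    lines = [f"TY  - {item_type}"]
    lines += [f"AU  - {author}" for author in authors]
    lines.append(f"TI  - {title}")
    lines.append(f"PY  - {year}")
    if publisher:
        lines.append(f"PB  - {publisher}")
    if doi:
        lines.append(f"DO  - {doi}")
    if url:
        lines.append(f"UR  - {url}")
    lines.append("ER  - ")  # marks the end of the record
    return "\n".join(lines) + "\n"


# Hypothetical software entry.
record = to_ris(
    "COMP",
    ["Doe, Jane", "Roe, Richard"],
    "examplepkg",
    2017,
    doi="10.5281/zenodo.1234567",
)
print(record)
```

Saving the text to a file ending in .ris gives you something to try with your citation manager's import option.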

Adding Citations Via Browser

Demo time!

Group Citation Libraries!

You can collaborate with as many people as you want! One person in your group should create a group library and INVITE the other members to join it, using the email addresses they signed up to Zotero with.

Save your references into the group library. When you sync, the references you added are uploaded to the Zotero server. When the other group members sync, those references show up in the group library folder in their Zotero libraries.

Export a Bibliography

Upcoming Classes


Data Cleaning & Management with Python | October 11th, 5-6:30pm | Bobst Library

Intro to Git and GitHub | October 18th, 4-6pm | Bobst Library

Check the calendar for more!

Thank you! Questions?


Email us: vicky.steeves@nyu.edu & nicholas.wolf@nyu.edu

Learn more about RDM: guides.nyu.edu/data_management

Get this presentation: guides.nyu.edu/data_management/resources

Make an appointment: guides.nyu.edu/appointment