geospatial data – Data Dispatch
https://data-services.hosting.nyu.edu
NYU Data Services News and Updates
Fri, 20 Oct 2017 16:51:06 +0000

Data Services Adds Aerial Laser and Photogrammetry Data for Dublin, Ireland
https://data-services.hosting.nyu.edu/data-services-adds-aerial-laser-and-photogrammetry-data-for-dublin-city-ireland/
Wed, 12 Jul 2017 18:30:44 +0000

NYU Data Services is excited to publish a collection of 2015 Aerial Laser and Photogrammetry Survey Data for Dublin City, Ireland in the Spatial Data Repository. This high-density dataset was collected in March 2015 by Debra F. Laefer and a team of researchers at NYU’s Center for Urban Science and Progress (CUSP). The dataset includes aerial laser scanning (ALS) from 41 flight paths in the form of a 3D point cloud (LAZ) and 3D full waveform ALS (LAS and PulseWaves), as well as other imagery data. For more information on this collection, please read the official press release. TechCrunch refers to this set as the “largest LiDAR dataset ever to help urban development.”

Also, refer to the video below for a 3D preview of what the data looks like when visualized.

About the Collection and Data Release

The 2015 LiDAR dataset is a landmark acquisition for geospatial data collections at NYU Libraries. It is the first time since the launch of our new Spatial Data Repository in 2016 that the GIS team has worked with researchers at NYU to bring a complex, multi-format original dataset into our collection. Many thanks to Stephen Balogh, Brittney ONeill, Ahn-Vu Vo, and others who put in incredible amounts of work on organizing the data for release and developing capacity for it.

Because of the size and complexity of the data, we had to take several new steps in order to present the data with enough spatial context to be useful to a range of geospatial researchers. One of the most frequent questions we anticipate about this data is, “what is it, and what can you do with it?” To help, the team has provided a 3D rendering of what the point cloud data looks like when visualized (see below).

This is just one section of the point cloud data, which anyone can download and visualize with a library like Potree. Even this visualization presents a compressed and down-sampled version of the full waveform LiDAR, which is available in LAS and PulseWaves formats. Professor Laefer’s team has provided robust documentation about the use of this data in research and its application to urban informatics scholarship. To date, this type of data has been used to explore the detection of road curbs and obstacles, tree growth, and more.
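For readers who want a quick look at a downloaded file before loading it into a viewer, here is a minimal sketch that reads a few fields from the public header block of a LAS 1.x file using only the Python standard library. The byte offsets follow the published LAS 1.x header layout; the function name `read_las_header` and the selection of fields are our own choices for illustration, not part of the collection's documentation.

```python
import struct

LAS_HEADER_MIN_BYTES = 227  # size of the LAS 1.0-1.2 public header block


def read_las_header(data: bytes) -> dict:
    """Parse selected fields from the start of a LAS 1.x file.

    `data` must hold at least the first 227 bytes of the file.
    """
    if len(data) < LAS_HEADER_MIN_BYTES:
        raise ValueError("need at least the first 227 bytes of the file")
    if data[0:4] != b"LASF":
        raise ValueError("missing 'LASF' file signature")

    # All fields are little-endian; '<' also disables alignment padding.
    major, minor = struct.unpack_from("<BB", data, 24)
    header_size, offset_to_points = struct.unpack_from("<HI", data, 94)
    point_format, record_length, point_count = struct.unpack_from("<BHI", data, 104)
    scale = struct.unpack_from("<3d", data, 131)
    offset = struct.unpack_from("<3d", data, 155)
    max_x, min_x, max_y, min_y, max_z, min_z = struct.unpack_from("<6d", data, 179)

    return {
        "version": f"{major}.{minor}",
        "header_size": header_size,
        "offset_to_points": offset_to_points,
        "point_format": point_format,
        "record_length": record_length,
        "point_count": point_count,  # legacy 32-bit count field
        "scale": scale,
        "offset": offset,
        "bounds": {
            "x": (min_x, max_x),
            "y": (min_y, max_y),
            "z": (min_z, max_z),
        },
    }
```

In practice you would pass in the first bytes of a file, for example `read_las_header(open(path, "rb").read(227))`, where `path` points at a file you have downloaded. For anything beyond a quick check of extent and point count, a dedicated LiDAR library is the better tool.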

The size and complexity of the data associated with the 2015 aerial laser scan has also required us to revise some of the ways that we present spatial data. In total, the data associated with just a two square kilometer area in Dublin is well over one terabyte and comes in at least four different formats, including point cloud, full waveform, and infrared GeoTIFF. We needed ways for users to explore smaller subsets of the data and download files efficiently, so we expanded the GeoBlacklight interface to support discovery by individual flight path or area of coverage.

A screenshot of the navigation interface for the collection. Users can click on individual tiles or lines (which represent discrete flight paths) in order to download the datasets associated with that area or flight.

Through our spatial discovery application, GeoBlacklight, users can find sections or subsets of the data that are important to them and download accordingly. We hope that this release of LiDAR data benefits the larger geospatial community, and we encourage you to explore the complete collection within NYU’s Spatial Data Repository.

Five Things We Learned At . . . Geo4LibCamp 2017
https://data-services.hosting.nyu.edu/ftwla-geo4libcamp-2017/
Tue, 07 Feb 2017 15:15:39 +0000

Geo4Lib2017 attendees gather in front of the Branner Earth Sciences Library & Map Collections at Stanford University

Last week, Stephen Balogh and I attended the second annual Geo4LibCamp, hosted by Stanford University. The event marked a great year of progress in the GeoBlacklight community. It was also a time to reflect on why our current political situation should influence how libraries collaborate to preserve geospatial data. Here are five things I learned.

  1. Given our exigent situation, we may need to re-think the scale and process of metadata creation. In his excellent plenary talk, Stace Maples modeled ways in which librarians might want to leverage Google’s Cloud Vision API, for instance, to extract workable metadata for scanned maps. The API has the potential to generate searchable terms to help with discovery.
  2. Index map layers are an interesting organizing principle to help contextualize the discovery of physical maps. Stanford is already implementing systems for presenting the holdings of Japanese military topographic maps. This includes reference layers in EarthWorks, Stanford’s discovery portal, but also a series of maps hosted on ESRI online. These tools allow users to discover specific maps more quickly and to see where there are gaps in holdings.
  3. Jack Reed at Stanford has released a gem called GovScooper that harvests all of the metadata in Data.gov and makes a rudimentary transformation into the GeoBlacklight schema. All of the metadata is now available in OpenGeoMetadata, so anyone can bring records into GeoBlacklight and begin to sort through them. This project, in my opinion, has major potential for rescuing geospatial data and enhancing the discoverability of it.
  4. The David Rumsey Map Center, which opened this past year, is amazing! We got to see some incredibly high resolution maps and hear about the process of digitizing and stitching together images.
  5. The GeoBlacklight community is very much concerned with user experience. In the un-conference planning session, the intersection of GeoBlacklight and user experience was the most popular proposed session, and when we met, we had some great discussions about how metadata and application design affect each other.
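To make item 3 a little more concrete, here is a rough sketch of what a "rudimentary transformation" into the GeoBlacklight schema can look like. This is not GovScooper's code. The input record shape (`id`, `title`, `description`, `bbox`) is a simplified, hypothetical stand-in for harvested metadata, and the function name `to_geoblacklight` is our own. The output keys are fields from the GeoBlacklight 1.0 metadata schema.

```python
def to_geoblacklight(record: dict, institution: str) -> dict:
    """Map a simplified harvested record to GeoBlacklight 1.0 fields.

    `record` is a hypothetical dict with keys:
      id, title, description (optional), bbox = (west, south, east, north)
    """
    west, south, east, north = record["bbox"]
    slug = f"{institution}-{record['id']}".lower().replace(" ", "-")
    return {
        "geoblacklight_version": "1.0",
        "dc_identifier_s": record["id"],
        "dc_title_s": record["title"],
        "dc_description_s": record.get("description", ""),
        # Assumption for this sketch: every harvested record is public.
        "dc_rights_s": "Public",
        "dct_provenance_s": institution,
        "layer_slug_s": slug,
        # Solr envelope order is west, east, north, south.
        "solr_geom": f"ENVELOPE({west}, {east}, {north}, {south})",
    }
```

A record produced this way still needs human review. Subjects, dates, and rights statements are exactly the fields where an automated pass tends to fall short.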

Thanks to Darren Hardy and everyone at Stanford University for hosting such a great and informative conference. I’m already looking forward to next year.

Saving Data: Preservation during Political Turmoil
https://data-services.hosting.nyu.edu/saving-data-preservation-during-political-turmoil/
Thu, 26 Jan 2017 20:10:57 +0000

The first week of the Trump administration has been a disastrous assault on many fundamental human and academic rights. So far, a media blackout has been ordered for employees of the EPA, and moving forward, the administration says that “political staffers” will be required to review all published work and data produced by EPA scientists before release to the public or in academic venues.

Attempts to control access to data are spreading beyond the sciences as well. Last week, two members of Congress (Senator Mike Lee of Utah and Representative Paul Gosar of Arizona) introduced a bill that would undermine the Fair Housing Act, which prohibits racial discrimination in access to housing. The text of the bill states:

No Federal funds may be used to build, maintain, utilize, or provide access to a Federal database of geospatial information on community racial disparities or disparities in access to affordable housing.

Of course, the larger fear is that the massive amounts of data available from governmental sources, such as the EPA, the U.S. Census, the Bureau of Labor Statistics, and many more, will be taken down. And as the text of the aforementioned bill suggests, the motivation for hiding data is clearly ideological; our leaders intend to enable discrimination against people and erase human rights.

To guard against this, there have been many coordinated efforts to rescue and preserve our data. Here in New York, NYU’s ITP and Tisch School of the Arts are hosting a guerrilla data rescue event. Developers, coders, librarians, archivists, and activists will gather to work on scraping data, archiving web sources, and coming up with ways to preserve important data.

Other events have been springing up elsewhere. The New York Academy of Medicine organized a drive to save data related to climate change. In NYU’s sphere, members of the OpenGeoPortal geospatial metadata consortium have launched efforts to complete a data crawl. Thus far, they have archived 20 terabytes of data from these sources:

  • EPA Data Download Site
  • EPA Data Commons FTP Site
  • EPA eGrid
  • EPA FTP Portal
  • EPA Toxics Release Inventory (TRI)
  • EIA Open Data Portal
  • EIA Layer Information for Interactive State Maps
  • EIA Natural Gas Annual Respondent Query System
  • USGS National Land Cover Database (2011, 2006)
  • USGS National Hydrography Dataset
  • NREL GIS Data Portal
  • US Fish & Wildlife National Wetlands Inventory
  • US Census Bureau Entire 1980, 1990, 2000, 2010 Population and Housing Census; ACS 2002-2013; EEO Disability 2002-2008; Econ 1997-2015.
  • HUD Data Portal
  • BTS National Transportation Atlas Database 2011-2015 including all tabular statistical data
  • BJS Bureau of Justice Statistics Raw Data Sources
  • HRSA Data Warehouse
  • NOAA Northern Hemisphere Snow & Ice Archive 1997- Monthly
  • NOAA GSOM Global Temperature (Stations)
  • NOAA Nighttime Lights Time Series 1992 – 2013
  • NOAA Global Self-consistent, Hierarchical, High-resolution Shoreline Database (GSHHG)
  • NOAA Continually Updated Shoreline Product
  • NOAA Historical Shoreline Survey
  • NOAA USGS National Assessment of Shoreline Change Vector Shorelines
  • NASA GISTEMP Global Temperature (Global Mean)
  • National Atlas – Entire Atlas

Also, kudos to Stanford University’s Jack Reed, who developed GovScooper, a tool for scraping data from the Data.gov portal so it can be preserved.

The preservation of this data is only one part of the equation. Creating metadata for it so it can be discovered in new contexts is an important next step. At NYU, we’ve already been engaged in the process of preserving federal and local data. Our Spatial Data Repository contains a range of U.S. Census data, files from NYC’s Bytes of the Big Apple, and more. Our goal is to join these coordinated efforts and continue to make data accessible to as many people as possible.
