Spatial Data Repository – Data Dispatch

Data Services Adds Aerial Laser and Photogrammetry Data for Dublin, Ireland

NYU Libraries — Wed, 12 Jul 2017 18:30:44 +0000

NYU Data Services is excited to publish a collection of 2015 Aerial Laser and Photogrammetry Survey Data for Dublin City, Ireland, in the Spatial Data Repository. This high-density dataset was collected in March 2015 by Debra F. Laefer and a team of researchers at NYU’s Center for Urban Science and Progress (CUSP). The dataset includes aerial laser scanning (ALS) data from 41 flight paths in the form of a 3D point cloud (LAZ) and 3D full waveform ALS (LAS and Pulsewaves), along with imagery data. For more information on this collection, please read the official press release. TechCrunch refers to this set as the “largest LiDAR dataset ever to help urban development.”

Also, refer to the video below for a 3D preview of what the data looks like when visualized.

About the Collection and Data Release

The 2015 LiDAR dataset is a landmark acquisition for geospatial data collections at NYU Libraries. It is the first time since the launch of our new Spatial Data Repository in 2016 that the GIS team has worked with researchers at NYU to bring a complex, multi-format original dataset into our collection. Many thanks to Stephen Balogh, Brittney ONeill, Ahn-Vu Vo, and others who put in incredible amounts of work on organizing the data for release and developing capacity for it.

Because of the size and complexity of the data, we had to take several new steps in order to present the data with enough spatial context to be useful to a range of geospatial researchers. One of the most frequent questions we anticipate about this data is, “what is it, and what can you do with it?” To help, the team has provided a 3D rendering of what the point cloud data looks like when visualized (see below).

This is just one section of the point cloud data, which anyone can download and visualize with a library like Potree. Even this visualization presents a compressed and down-sampled version of the full waveform LiDAR, which is available in LAS and Pulsewaves formats. Professor Laefer’s team has provided thorough documentation on the use of this data in research and its application to urban informatics scholarship. To date, this type of data has been used to explore the detection of road curbs and obstacles, tree growth, and more.

The size and complexity of the data associated with the 2015 aerial laser scan has also required us to revise some of the ways that we have been presenting spatial data. In total, the data associated with just a two-square-kilometer area in Dublin is well over one terabyte and comes in at least four different formats, including point cloud, full waveform, and infrared GeoTIFF. We needed ways for users to explore smaller subsets of the data and download files efficiently, so we expanded the interface of GeoBlacklight to allow discovery by individual flight path or by area of coverage.

A screenshot of the navigation interface for the collection. Users can click on individual tiles or lines (which represent discrete flight paths) in order to download the datasets associated with that area or flight.
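The lookup behind this kind of interface can be approximated with a bounding-box overlap test. The Python sketch below is only an illustration, not the GeoBlacklight implementation: the `Tile` class, the tile names, and the coordinates are all hypothetical, and real tiles would carry projected coordinates rather than the unit squares used here.

```python
from typing import List, NamedTuple


class Tile(NamedTuple):
    """Bounding box of one downloadable tile (hypothetical structure)."""
    name: str
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def tiles_in_area(tiles: List[Tile], min_x: float, min_y: float,
                  max_x: float, max_y: float) -> List[Tile]:
    """Return the tiles whose boxes overlap the query rectangle.

    Two axis-aligned boxes overlap when they overlap on both axes.
    Boxes that only touch along an edge count as overlapping here.
    """
    return [
        t for t in tiles
        if t.min_x <= max_x and t.max_x >= min_x
        and t.min_y <= max_y and t.max_y >= min_y
    ]


# Four made-up unit tiles arranged in a 2 x 2 grid.
TILES = [
    Tile("tile_a", 0.0, 0.0, 1.0, 1.0),
    Tile("tile_b", 1.0, 0.0, 2.0, 1.0),
    Tile("tile_c", 0.0, 1.0, 1.0, 2.0),
    Tile("tile_d", 1.0, 1.0, 2.0, 2.0),
]

if __name__ == "__main__":
    # A query rectangle that straddles the two bottom tiles.
    hits = tiles_in_area(TILES, 0.2, 0.2, 1.5, 0.8)
    print([t.name for t in hits])
```

A flight path could be handled the same way by storing the bounding box of each line, although a production system would more likely rely on a spatial index than on a linear scan.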

Through our spatial discovery application, GeoBlacklight, users can find sections or subsets of the data that are important to them and download accordingly. We hope that this release of LiDAR data benefits the larger geospatial community, and we encourage you to explore the complete collection within NYU’s Spatial Data Repository.

Data Services Adds Georeferenced Soviet Maps

NYU Libraries — Tue, 24 Jan 2017 20:22:03 +0000

The results of a crowd-sourced georectification project within Data Services have come to fruition as the NYU Spatial Data Repository has now released its rectified collection of topographic maps of Saudi Arabia and nearby regions. Produced by the Soviet military in 1978 at a 1:100,000 scale, the maps were compiled through a combination of aerial intelligence and on-the-ground observation. These maps are part of an endeavor, unknown in its extent at the time, that has been described as one of the “most comprehensive global topographic mapping project ever undertaken.”¹ Among their striking features is the close detail available on each sheet.

The cities of Dhahran and Al Khobar in Saudi Arabia

Data Services team members and friends gathered for two sessions over the last few months to georectify the maps in the open-source GIS software QGIS. Using the bounding coordinate information listed on the maps, the team rectified and reprojected 441 map tiles so that they can be displayed and used in GIS software alongside other raster and vector layers in WGS 84 (the World Geodetic System 1984, a standard spatial reference system).
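To give a sense of what rectifying from bounding coordinates involves, here is a minimal Python sketch that maps a pixel position on a scanned sheet to longitude and latitude by linear interpolation. It is an illustration under simplifying assumptions, not the workflow used in QGIS: it assumes a north-up image whose edges coincide exactly with the sheet's bounding coordinates, it skips the reprojection step entirely, and the `Bounds` class, function name, and coordinate values are hypothetical.

```python
from typing import NamedTuple, Tuple


class Bounds(NamedTuple):
    """Bounding coordinates of one map sheet, in decimal degrees."""
    west: float
    south: float
    east: float
    north: float


def pixel_to_lonlat(col: float, row: float, width: int, height: int,
                    bounds: Bounds) -> Tuple[float, float]:
    """Interpolate a pixel position to (longitude, latitude).

    Pixel (0, 0) is the top-left corner of the image, so longitude grows
    with the column index while latitude shrinks as the row index grows.
    """
    lon_per_px = (bounds.east - bounds.west) / width
    lat_per_px = (bounds.north - bounds.south) / height
    lon = bounds.west + col * lon_per_px
    lat = bounds.north - row * lat_per_px
    return lon, lat


# Made-up sheet: half a degree on each side, scanned at 1000 x 1000 pixels.
SHEET = Bounds(west=50.0, south=26.0, east=50.5, north=26.5)

if __name__ == "__main__":
    # Top-left corner, then the centre of the sheet.
    print(pixel_to_lonlat(0, 0, 1000, 1000, SHEET))
    print(pixel_to_lonlat(500, 500, 1000, 1000, SHEET))
```

Real scans are rarely this tidy. Sheets have margins, and paper warps, which is why georeferencing tools usually fit a transformation to several control points instead of trusting the image edges.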

Although human settlement features can be found in detail throughout the collection, the maps also describe a range of environmental features, with hundreds of distinct land and land-use types in categories ranging from agriculture to forest, grass, soil types, and even five types of sand. Coastal areas can be compared to current coastlines to measure erosion and sea-level changes. Not conversant in Russian? An in-depth technical manual prepared by the U.S. Army in English, and available with the collection, will be your guide. To view the collection with spatial preview, click here.

¹ Alexander J. Kent and John M. Davies, “Hot Geospatial Intelligence from a Cold War: The Soviet Military Mapping of Towns and Cities.” Cartography and Geographic Information Science 40:3 (2013): 248.

Bytes of the Big Apple Data Added to NYU SDR

NYU Libraries — Thu, 28 Apr 2016 15:02:45 +0000

In our latest collection update, we have added most of the currently available files from NYC Planning’s Bytes of the Big Apple. Frequent users of the Bytes website appreciate it for its wealth of information, even while they might be frustrated with the somewhat fragmentary and arbitrary structure of data on the site.

By adding this data to our collection, we’ve not only preserved it (and attached relevant documentation), but also made it exceedingly easy to add administrative boundaries and public data to NYC-related mapping projects. For example, look at this quick visualization of the Bronx. I’ve added the MapPLUTO file of the Bronx to my account in CartoDB and created a choropleth map that shows whether each building was constructed before or after 1975.

By clicking on each parcel, you can see the year of its construction. This is just one element of the data available to be displayed; check out many others in the PLUTO codebook. In all, there are about 50 files added, and this is the first of many forthcoming additions of NYC spatial data. To browse all of the items in our collection, visit the Spatial Data Repository and search “Bytes of the Big Apple.”
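The two-class split behind the Bronx map is simple to reproduce outside of CartoDB. The Python sketch below is a rough illustration on made-up records, not the query used for the map above. It assumes the construction year lives in a field named `YearBuilt` (check the PLUTO codebook for the exact name in your release), treats a missing or zero year as unknown, and groups 1975 itself with the later class. All three choices are assumptions.

```python
from collections import Counter
from typing import Optional


def construction_era(year_built: Optional[int], cutoff: int = 1975) -> str:
    """Assign a parcel to one of the two map classes, or to 'unknown'."""
    if not year_built or year_built <= 0:
        # Assumption: a missing or zero year means the year is not recorded.
        return "unknown"
    if year_built < cutoff:
        return f"before {cutoff}"
    return f"{cutoff} or later"


# Made-up parcels; real MapPLUTO records carry many more attributes.
PARCELS = [
    {"parcel": "A", "YearBuilt": 1931},
    {"parcel": "B", "YearBuilt": 1975},
    {"parcel": "C", "YearBuilt": 2004},
    {"parcel": "D", "YearBuilt": 0},
]

# Count how many parcels fall into each class.
counts = Counter(construction_era(p["YearBuilt"]) for p in PARCELS)

if __name__ == "__main__":
    print(dict(counts))
```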

New Data Acquisitions

NYU Libraries — Tue, 22 Mar 2016 17:05:38 +0000

NYU Libraries is always trying to build our collection of interesting data that is useful for research. Here’s a highlighted list of selected data collections that we’ve acquired over the past few months. As always, if you have any questions about accessing this data, or if you would like to see NYU acquire other data, don’t hesitate to get in touch with us.

RealtyTrac Housing Foreclosure Data

NYU has renewed its license agreement with RealtyTrac to provide access to data on the locations and characteristics of properties that have been foreclosed upon. Our collection contains delimited text records of every property that was foreclosed upon in the United States between 2005 and January 2016. The files are massive and include latitude/longitude coordinates, date of construction, size, assessed value, and many other variables related to each property. You must agree to special terms of use before you can gain access to the data. To find out more about how to get access, visit our guide.

Gallup World Poll Reference Tool

NYU already has access to several key Gallup data products, including Gallup Analytics, Gallup Poll, and Gallup Brain. In these tools, you can explore specific indicators and extract data for them at the country level. While this capability is good, it’s somewhat limited in that you cannot search multiple indicators across surveys and countries at the same time. Now, NYU has access to the data behind the scenes! If you get in touch with us at Data Services, we can give you access to Gallup’s World Poll reference tool, which allows you to sift through respondent-level data and search specific indicators across time and space. Just let us know if you are interested.

MobileCollins World Cellphone Coverage Maps

NYU Libraries has acquired several vector maps that show areas of 2G, 3G, and 4G cellphone coverage across the world. These files are available in our new Spatial Data Repository. To locate the data, enter relevant search terms, or search for CollinsBartholomew, Ltd. files. If you have any questions about this data, don’t hesitate to get in touch.

Protecting geo.nyu.edu Traffic with HTTPS

Stephen Balogh — Fri, 19 Feb 2016 19:01:43 +0000

Using the HTTPS protocol for websites, instead of its older parent protocol, HTTP, offers significantly more protection and privacy for all parties involved in a web transaction. HTTPS is a version of HTTP –– the protocol that allows us to connect to websites in an internet browser –– that encrypts all traffic uploaded or downloaded between a server and its client (you).

What exactly does this mean in practice? If you are surfing on a site via the HTTP protocol, then all of the data that is transmitted to or from that site is, by default, sent in an unencrypted form. This doesn’t necessarily mean that your data is exposed for others to see, but it does mean that anyone who is in a position to intercept the packets sent between you and the server might be able to see exactly what data was exchanged.

Our Spatial Data Repository at Data Services has supported HTTPS connections, but up until very recently, only the connection between the server that runs the repository and the user who accesses it was encrypted. Some of the external sources of data, loaded when a user accesses an SDR record, still used the HTTP protocol.

To understand this better, consider the fact that most websites are in reality a combination of content from disparate sources; some of the sources are hosted locally, on the same site you navigated to in your browser, and some are hosted externally. (If you’re interested in investigating the origin of content on any site, try playing around with the “Network” section of the Chrome browser’s Developer Tools). This combination of sources certainly characterizes content found on the Spatial Data Repository; accessing any page on the SDR requires a user to load content from a variety of other servers. And this is where our problem was located –– even though geo.nyu.edu was protected by HTTPS, the feature on the site that allows users to preview map extents had to make use of an unencrypted HTTP connection to our map servers. And as a result, the security of the site was a little less than ideal –– maybe even particularly so, since the website suggests that the SDR is entirely secure just by virtue of the fact that its domain begins with an “https://”!

Chrome’s green padlock icon for a secure HTTPS site
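One rough way to look for this kind of mixed content is to scan a page's HTML for embedded resources that are requested over plain HTTP. The Python sketch below is an illustration only, not the check that was run on geo.nyu.edu. The hostnames and sample page are made up, and the list of tags it inspects is a simplifying assumption. It also cannot see requests that JavaScript issues after the page loads, which is how many map previews fetch their tiles, so the browser's "Network" panel remains the more reliable tool.

```python
from html.parser import HTMLParser
from typing import List, Tuple

# Tags whose "src" attribute makes the browser fetch a subresource.
# Ordinary links (<a href>) are navigations, not embedded content.
_RESOURCE_TAGS = {"script", "img", "iframe", "audio", "video", "source", "embed"}


class MixedContentFinder(HTMLParser):
    """Collect embedded resources that are requested over plain HTTP."""

    def __init__(self) -> None:
        super().__init__()
        self.insecure: List[Tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag not in _RESOURCE_TAGS:
            return
        for name, value in attrs:
            # Protocol-relative URLs ("//host/x") inherit the page's scheme,
            # so only an explicit "http://" prefix is flagged.
            if name == "src" and value and value.lower().startswith("http://"):
                self.insecure.append((tag, value))


def find_insecure_resources(html: str) -> List[Tuple[str, str]]:
    """Return (tag, url) pairs for resources loaded without encryption."""
    finder = MixedContentFinder()
    finder.feed(html)
    finder.close()
    return finder.insecure


# Made-up page served over HTTPS that embeds one unencrypted image.
SAMPLE_PAGE = """
<html><head>
  <script src="https://geo.example.edu/assets/app.js"></script>
</head><body>
  <a href="http://example.org/about">About</a>
  <img src="http://maps.example.edu/preview.png">
  <img src="//cdn.example.edu/logo.png">
</body></html>
"""

if __name__ == "__main__":
    for tag, url in find_insecure_resources(SAMPLE_PAGE):
        print(f"<{tag}> loads {url} without encryption")
```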

A good illustration of why this can be risky is the following.

With HTTPS content, the encryption works in such a way that all data sent between a client and a server appears completely unintelligible when viewed by a third party. This is why virtually all e-commerce sites use HTTPS: even if the traffic is intercepted, your passwords and bank details remain unreadable. I wanted to see what an attacker might see should they be able to intercept traffic between a user and the Spatial Data Repository. Using software that analyzes network packets (I used Packet Peeper, but there are many such tools), I took a look at some traffic collected as I browsed around records.

Whenever I browsed a page directly hosted by the SDR, the traffic was, as expected, unintelligible:

An example of a network packet sent via an encrypted HTTPS connection

But when I kept looking, I eventually found some traffic that was sent out to an external service via HTTP. Looking carefully, you can see the name of the map I requested, and even which subset of it I was particularly interested in:

Not encrypted! A look inside a regular packet from HTTP (visualized in Packet Peeper)
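To make this concrete, here is a small Python sketch of how little effort it takes to pull those details out of an unencrypted request. It is illustrative only. The request line is invented, not copied from the capture above, and it assumes the map preview used a WMS-style request with `LAYERS` and `BBOX` query parameters. The layer name, path, and coordinates are hypothetical.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import parse_qs, urlsplit


def summarize_request(request_line: str) -> Dict[str, object]:
    """Extract the requested layer and bounding box from an HTTP request line."""
    # An HTTP/1.1 request line has the form "<method> <target> <version>".
    _method, target, _version = request_line.split(" ", 2)
    query = parse_qs(urlsplit(target).query)
    # Parameter names are upper-cased so the lookup works for any casing.
    params = {key.upper(): values[0] for key, values in query.items()}

    bbox: Optional[Tuple[float, ...]] = None
    if "BBOX" in params:
        bbox = tuple(float(part) for part in params["BBOX"].split(","))
    return {"layer": params.get("LAYERS"), "bbox": bbox}


# Invented example of what an eavesdropper could read in a plain HTTP packet.
INTERCEPTED = (
    "GET /wms?SERVICE=WMS&REQUEST=GetMap"
    "&LAYERS=sdr:example_layer&BBOX=50.0,26.0,50.5,26.5 HTTP/1.1"
)

if __name__ == "__main__":
    print(summarize_request(INTERCEPTED))
```

With HTTPS, the request line and its query string travel inside the encrypted channel, so a parser like this would have nothing readable to work with.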

Even though my activity on the main page was encrypted, the simple fact that one of the associated connections made from the page was not encrypted meant that anyone in a position to intercept network traffic could deduce a lot about how I was using the site.

Would this be the end of the world? No. But do I think it’s an important security consideration that might be overlooked in other contexts? Yes! And luckily, it is now the case that all (knock on wood) external connections made from pages on geo.nyu.edu will be served up via HTTPS.

Stay green, ol’ padlock!