Also, refer to the video below for a 3D preview of what the data looks like when visualized.
About the Collection and Data Release
The 2015 LiDAR dataset is a landmark acquisition for geospatial data collections at NYU Libraries. It is the first time since the launch of our new Spatial Data Repository in 2016 that the GIS team has worked with researchers at NYU to bring a complex, multi-format original dataset into our collection. Many thanks to Stephen Balogh, Brittney ONeill, Ahn-Vu Vo, and others who put in incredible amounts of work on organizing the data for release and developing capacity for it.
Because of the size and complexity of the data, we had to take several new steps in order to present the data with enough spatial context to be useful to a range of geospatial researchers. One of the most frequent questions we anticipate about this data is, “what is it, and what can you do with it?” To help, the team has provided a 3D rendering of what the point cloud data looks like when visualized (see below).
This is just one section of the point cloud data, which anyone can download and visualize with a library like Potree. Even this visualization presents a compressed and down-sampled version of the full waveform LiDAR, which is available in LAS and PulseWaves formats. Professor Laefer’s team has provided very robust documentation about the use of this data in research and its application to urban informatics scholarship. To date, this type of data has been used to explore the detection of road curbs and obstacles, tree growth, and more.
The size and complexity of the data associated with the 2015 aerial laser scan has also required us to revise some of the ways that we present spatial data. In total, the data associated with just a two square kilometer area in Dublin is well over one terabyte and comes in at least four different formats, including point cloud, full waveform, and infrared GeoTIFF. Users needed a way to explore smaller subsets of the data and download only the files they want, so we expanded the GeoBlacklight interface to support discovery by individual flight path or area of coverage.
Through our spatial discovery application, GeoBlacklight, users can find sections or subsets of the data that are important to them and download accordingly. We hope that this release of LiDAR data benefits the larger geospatial community, and we encourage you to explore the complete collection within NYU’s Spatial Data Repository.
Although human settlement features can be found in detail throughout the collection, the maps also describe a range of environmental features, with hundreds of distinct land and land-use types in categories ranging from agriculture to forest, grass, soil types, and even five types of sand. Coastal areas can be compared to current coastlines to measure erosion and sea-level changes. Not conversant in Russian? An in-depth technical manual prepared in English by the U.S. Army, and available with the collection, will be your guide. To view the collection with spatial preview, click here.
1 Alexander J. Kent and John M. Davies, “Hot Geospatial Intelligence from a Cold War: The Soviet Military Mapping of Towns and Cities.” Cartography and Geographic Information Science 40:3 (2013): 248.
By adding this data into our collection, we’ve not only preserved it (and attached relevant documentation), but also made it exceedingly easy to add administrative boundaries and public data to NYC-related mapping projects. For example, look at this quick visualization of the Bronx. I’ve added the MapPLUTO file of the Bronx to my account in CartoDB and created a choropleth map that shows the year (before 1975 and after 1975) that each building was constructed.
By clicking on each parcel, you can see the year of its construction. This is just one element of the data available to be displayed; check out many others in the PLUTO codebook. In all, there are about 50 files added, and this is the first of many forthcoming additions of NYC spatial data. To browse all of the items in our collection, visit the Spatial Data Repository and search “Bytes of the Big Apple.”
RealtyTrac Housing Foreclosure Data
NYU has renewed its license agreement with RealtyTrac to provide access to data on the locations and characteristics of properties that have been foreclosed upon. Our collection contains delimited text records of every property that was foreclosed upon in the United States between 2005 and January 2016. The files are massive and include latitude/longitude coordinates, date of construction, size, assessed value, and many other variables related to each property. You must agree to special terms of use before you can gain access to the data. To find out more about how to get access, visit our guide.
Gallup WorldPoll Reference Tool
NYU already has access to several key Gallup data products, including Gallup Analytics, Gallup Poll, and Gallup Brain. In these tools, you can explore specific indicators and extract data for them at the country level. While this capability is good, it’s somewhat limited in that you cannot search multiple indicators across surveys and countries at the same time. Now, NYU has access to the data behind the scenes! If you get in touch with us at Data Services, we can give you access to Gallup’s World Poll reference tool, which allows you to sift through respondent-level data and search specific indicators across time and space. Just let us know if you are interested.
MobileCollins World Cellphone Coverage Maps
NYU Libraries has acquired several vector maps that highlight areas of 2G, 3G, and 4G cellphone coverage across the world. These files are available in our new Spatial Data Repository. To locate the data, enter relevant search terms, or search for CollinsBartholomew, Ltd. files. If you have any questions about this data, don’t hesitate to get in touch.
What exactly does this mean in practice? If you are surfing on a site via the HTTP protocol, then all of the data that is transmitted to or from that site is, by default, sent in an unencrypted form. This doesn’t necessarily mean that your data is exposed for others to see, but it does mean that anyone who is in a position to intercept the packets sent between you and the server might be able to see exactly what data was exchanged.
Our Spatial Data Repository at Data Services has supported HTTPS connections, but up until very recently, only the connection between the server that runs the repository and the user who accesses it was encrypted. Some of the external sources of data, loaded when a user accesses an SDR record, still used the HTTP protocol.
To understand this better, consider the fact that most websites are in reality a combination of content from disparate sources; some of the sources are hosted locally, on the same site you navigated to in your browser, and some are hosted externally. (If you’re interested in investigating the origin of content on any site, try playing around with the “Network” section of the Chrome browser’s Developer Tools.) This combination of sources certainly characterizes content found on the Spatial Data Repository; accessing any page on the SDR requires a user to load content from a variety of other servers.
And this is where our problem was located: even though geo.nyu.edu was protected by HTTPS, the feature on the site that allows users to preview map extents had to make use of an unencrypted HTTP connection to our map servers. As a result, the security of the site was a little less than ideal, maybe even particularly so, since the website suggests that the SDR is entirely secure just by virtue of the fact that its address begins with “https://”!
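To make the idea of mixed content concrete, here is a minimal sketch in Python that scans a page’s HTML for subresources requested over plain HTTP. The sample page and its hostnames (geo.example.edu, maps.example.org) are invented for illustration and are not the SDR’s actual markup. Note also that map previews are often requested by JavaScript after the page loads, so a static scan like this one would not catch every case; the browser’s Network panel remains the more reliable check.

```python
from html.parser import HTMLParser

# Tags whose listed attribute makes the browser fetch a subresource.
RESOURCE_ATTRS = {
    "script": "src",
    "img": "src",
    "iframe": "src",
    "link": "href",
}


class MixedContentFinder(HTMLParser):
    """Collect subresource URLs that are requested over plain http://."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        wanted = RESOURCE_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value and value.lower().startswith("http://"):
                self.insecure.append((tag, value))


def find_mixed_content(html):
    """Return a list of (tag, url) pairs for insecure subresources."""
    finder = MixedContentFinder()
    finder.feed(html)
    return finder.insecure


# Hypothetical page: served over HTTPS, but one image comes from an HTTP map server.
SAMPLE_PAGE = """
<html><head>
<link rel="stylesheet" href="https://geo.example.edu/assets/app.css">
<script src="https://geo.example.edu/assets/app.js"></script>
</head><body>
<img src="http://maps.example.org/wms?LAYERS=sdr:sample_layer">
</body></html>
"""

if __name__ == "__main__":
    for tag, url in find_mixed_content(SAMPLE_PAGE):
        print(tag, url)
```

Only the image request is flagged here; the stylesheet and script are fetched over HTTPS and pass.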
A good illustration of why this can be risky is the following.
With HTTPS content, the encryption works in such a way that all data sent between a client and a server appears completely unintelligible when viewed by a third party. This is why virtually all e-commerce sites use HTTPS: even if someone intercepts the traffic, your passwords and bank details remain unreadable to them. I wanted to see what an attacker might see should they be able to intercept traffic between a user and the Spatial Data Repository. Using software that analyzes network packets (I used Packet Peeper, but there are a ton of these), I took a look at some traffic collected as I browsed around records.
Whenever I browsed a page directly hosted by the SDR, the traffic was, as expected, unintelligible:
But when I kept looking, I eventually found some traffic that was sent out to an external service via HTTP. Looking carefully, you can see the name of the map I requested, and even which subset of it I was particularly interested in:
Even though my activity on the main page was encrypted, the simple fact that one of the associated connections made from the page was not encrypted meant that anyone in a position to intercept network traffic could deduce a lot about how I was using the site.
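As a rough sketch of what such an observer could pull out of a single request, the Python snippet below parses a map preview URL and reports which parts would be readable on the wire. It assumes the preview is a WMS-style GetMap request with LAYERS and BBOX parameters; the URL, hostname, and layer name are made up for illustration and are not taken from the captured traffic.

```python
from urllib.parse import urlsplit, parse_qs


def visible_to_eavesdropper(url):
    """Summarize what a network observer could read from one request."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        # The path and query string are encrypted; the hostname usually
        # still leaks through DNS lookups and the TLS handshake (SNI).
        return {"host": parts.hostname}

    # Plain HTTP: the whole request line is readable.
    params = {key.upper(): values[0] for key, values in parse_qs(parts.query).items()}
    exposed = {"host": parts.hostname, "path": parts.path}
    if "LAYERS" in params:
        exposed["layer"] = params["LAYERS"]
    if "BBOX" in params:
        exposed["bbox"] = [float(n) for n in params["BBOX"].split(",")]
    return exposed


# Hypothetical preview request, assuming WMS-style parameters.
SAMPLE_URL = (
    "http://maps.example.org/geoserver/wms"
    "?SERVICE=WMS&REQUEST=GetMap"
    "&LAYERS=sdr:sample_layer"
    "&BBOX=-74.05,40.68,-73.85,40.88"
)

if __name__ == "__main__":
    print(visible_to_eavesdropper(SAMPLE_URL))
    print(visible_to_eavesdropper(SAMPLE_URL.replace("http://", "https://", 1)))
```

Over HTTP, both the layer name and the bounding box of the area being viewed are exposed; over HTTPS, only the hostname is.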
Would this be the end of the world? No. But do I think it’s an important security consideration that might be overlooked in other contexts? Yes! And luckily, it is now the case that all (knock on wood) external connections made from pages on geo.nyu.edu will be served up via HTTPS.
Stay green, ol’ padlock!