This post is a brief interlude in our series on deploying GeoBlacklight and a Spatial Data Infrastructure at NYU. During the two-week stretch from July 25 to August 5, Jack Reed, Darren Hardy, Stephen Balogh, Eliot Jordan, and several others put in some incredible work (over 40 merged pull requests and 108 commits) to release version 1.0 of GeoBlacklight. This post summarizes some of that work, with particular attention to the implications of the revised GeoBlacklight metadata schema.
Version 1.0 brings an array of improvements to the design and user interface of GeoBlacklight. According to Jack Reed, these improvements include compatibility with Blacklight 6, autocomplete and spelling suggestions on text searches in the default application, a customizable Leaflet map and plugins, and a simplified metadata schema that removes some redundant and unused fields and allows for a more sophisticated way to present data documentation to users.
The release of 1.0 also coincides with the release of a large batch of GeoBlacklight metadata by the CIC Geospatial Data Discovery Project. Congrats to Karen Majewicz and everyone else on the team of Big 10 institutions for releasing hundreds of records. Their records, as well as the existing set of records, are available at OpenGeoMetadata.
Overview of Version 1.0 of the GeoBlacklight Schema: Noteworthy Changes
Perhaps the most important news to come out of the sprint is the revision of the original version of the GeoBlacklight schema, which was profiled in Darren Hardy’s and Kim Durante’s 2014 article. The overhaul and simplification of the GeoBlacklight schema (click here for the new version) is the result of both the community sprint and previous discussions about the need to show more complex relationships between spatial datasets within a single collection. In addition, several fields were deemed redundant and have been deprecated.
In order to help others adapt to the 1.0 schema, Darren Hardy has drawn up documentation that explains the changes. I encourage everyone to read this document, which is currently the closest thing we have to an authoritative guide to producing GeoBlacklight metadata. But, for the benefit of the community, here’s a bit more information about the rationale for and implications of the revisions.
What’s Stayed the Same
Many of the elements from the initial version of the GeoBlacklight schema have carried over without a change and should be applied as before. These are:
What’s Changed (Slightly)
Several of the elements have undergone a slight change. Specifically:
- identifier_s: A globally unique identifier that remains unique across all institutions
- references_s: The core key-value schema is the same, but there have been several additions. You can now include a link to a codebook, point to an image on a IIIF server, or point to layers on various ESRI ArcGIS services. See the advanced schema definitions for full details.
- layer_slug_s: The team has added a couple of points of clarification on best practices for creating layer slugs. A slug should be a human-readable identifier for a layer, it should remain unique within a single deployment of GeoBlacklight, and it should follow the convention of institution-keyword1-keyword2.
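
To make the slug convention and the key-value shape of the references field more concrete, here is a minimal sketch. It assumes that references are stored as a single JSON-serialized string, and the helper names (make_slug, make_references) are hypothetical. The reference-type keys are illustrative and should be checked against the advanced schema definitions; the URLs are placeholders.

```python
import json
import re


def make_slug(institution, *keywords):
    """Build a slug of the form institution-keyword1-keyword2."""
    tokens = []
    for part in (institution, *keywords):
        # Lowercase, then collapse every run of other characters into one hyphen.
        token = re.sub(r"[^a-z0-9]+", "-", part.lower()).strip("-")
        if token:
            tokens.append(token)
    return "-".join(tokens)


def make_references(links):
    """Serialize reference-type -> URL pairs into one string value (assumed format)."""
    return json.dumps(links, sort_keys=True)


slug = make_slug("NYU", "Subway Stations", "2016")

# Keys are illustrative; URLs are placeholders, not real endpoints.
references_s = make_references({
    "http://schema.org/downloadUrl": "https://example.org/data/subway.zip",
    "http://iiif.io/api/image": "https://example.org/iiif/subway/info.json",
})

print(slug)
print(references_s)
```

A helper like this only enforces the shape of the slug. Uniqueness within a deployment still has to be checked against the records you have already indexed.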
What’s Been Deprecated
A few fields have been removed from the original schema. They are:
- relation_sm: The RDF link that is associated with each place name was not being used in any way and had the potential to cause problems, so we have gotten rid of it.
- georss_box_s: The bounding box coordinates are being expressed in the solr_geom field, which is all GeoBlacklight needs.
- georss_point_s: See above.
- solr_year_i: This only served the GeoBlacklight date range plug-in, so getting rid of this now allows for multi-value years to describe the temporal coverage of data.
- uuid: This was the same as the identifier and has gone away.
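
As a rough illustration of what dropping these fields looks like in practice, the sketch below removes them from a record held as a Python dictionary. This is not the project's own migration script, only a minimal example. The field names are taken as written in this post, so check them against the names used in your own records, and note that the sample record is made up.

```python
# Fields removed in schema 1.0, named as they appear in this post.
DEPRECATED_FIELDS = (
    "relation_sm",
    "georss_box_s",
    "georss_point_s",
    "solr_year_i",
    "uuid",
)


def strip_deprecated(record):
    """Return a copy of the record without the deprecated fields."""
    return {key: value for key, value in record.items()
            if key not in DEPRECATED_FIELDS}


# Hypothetical record, for illustration only.
old_record = {
    "uuid": "example-id-0001",
    "identifier_s": "example-id-0001",
    "layer_slug_s": "example-subway-stations",
    "georss_box_s": "40.5 -74.3 40.9 -73.7",
    "solr_geom": "ENVELOPE(-74.3, -73.7, 40.9, 40.5)",
    "solr_year_i": 2016,
}

new_record = strip_deprecated(old_record)
print(sorted(new_record))
```

The function returns a new dictionary rather than modifying its input, so the original record stays available for comparison.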
Now for the best part: the new functionality made possible by additions to the schema. The new fields are as follows:
- source_sm: This new field allows you to display parent-child relations between individual layers or records within a catalog. For example, we have collected this ESRI Geodatabase from Baruch College (CUNY). While we have created an item record for the entire geodatabase, we’ve also created records for each of the layers within the geodatabase and linked them to it via the source field (which manifests as data relations in the application).
- geoblacklight_version: This field identifies which version of the schema is being used, which could eventually help with quality control and compatibility with pre-existing versions of GeoBlacklight.
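
Here is a small sketch of how the two new fields might look across a parent record and its child layers. It assumes that each child lists the layer slug of its parent in source_sm, which is worth confirming against the schema documentation. The slugs, the helper name children_by_parent, and the records themselves are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: one geodatabase and two layers drawn from it.
records = [
    {"layer_slug_s": "example-nyc-geodatabase",
     "geoblacklight_version": "1.0"},
    {"layer_slug_s": "example-nyc-subway-stations",
     "source_sm": ["example-nyc-geodatabase"],
     "geoblacklight_version": "1.0"},
    {"layer_slug_s": "example-nyc-bus-routes",
     "source_sm": ["example-nyc-geodatabase"],
     "geoblacklight_version": "1.0"},
]


def children_by_parent(records):
    """Map each parent slug to the slugs of the records that cite it as a source."""
    index = defaultdict(list)
    for record in records:
        # source_sm is multi-valued; records without it have no parent.
        for parent in record.get("source_sm", []):
            index[parent].append(record["layer_slug_s"])
    return dict(index)


relations = children_by_parent(records)
print(relations)
```

Grouping records this way is also a quick check that every source_sm value points at a record that actually exists in your collection.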
How to Adapt
Adapting to the new metadata schema is fairly straightforward. Darren Hardy has constructed a series of scripts that omit the obsolete fields, and he’s already submitted a pull request to correct those elements of the metadata in our repository (as well as Stanford’s). Other tweaks will still need to be made.
In terms of authoring new metadata from scratch, Stephen Balogh will likely update the GeoBlacklight plug-in for Omeka to reflect these changes. Stay tuned for more news on that as well. That’s about it for now. Congratulations again to everyone on the team, and be sure to check out all of the project documentation on the wiki and repository pages.