JSON-LD: Building Meaningful Data APIs
Everybody loves JSON!
However, JSON by itself is pretty meaningless. Well. It has meaning, but only to the original creator of that format. They can share that meaning via documentation, conversation, or usage, but those often exist in places far removed from the JSON document you’re looking at.
What if the meaning itself was encoded directly into the document? What if every key in a JSON document had its own unique identity which you could look up (because web!) and read about its meaning? Wouldn’t that be slick?
Well. It’s quite possible, actually.
Here’s what I mean. Below is a pretty typical bit of “person” information written in JSON with English key names (to match this article’s content):
{ "first_name": "Benjamin", "last_name": "Young", "alias": "BigBlueHat", "email": "byoung@bigbluehat.com" }
You likely know what all that means (more or less). However, your software doesn’t, unless you coded it exactly to those string names. While there’s some possibility we picked the same keys (yay us!), there’s a higher likelihood we didn’t. We probably fix our coffee differently too. Who knows.
Currently, we’d need to write some code that took my JSON and turned the first_name key into your firstName key (or into first which is inside your name object or into given or… whatever). We’d then need to do that again the next time we found someone else with information about a person that used different key names. Rinse. Repeat.
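To make that chore concrete, here’s a rough sketch of the hand-written approach. The adapter function names are made up for illustration, and the last_name counterparts (name.last, family) are assumptions; only first_name, name.first, and given come from the examples in this article.

```javascript
// Hypothetical per-source adapters: every new key-naming scheme needs its own function.
function fromMyFormat(doc) {
  return { firstName: doc.first_name, lastName: doc.last_name };
}

function fromNestedFormat(doc) {
  // "name.last" is an assumed counterpart to "name.first"
  return { firstName: doc.name.first, lastName: doc.name.last };
}

function fromGivenFormat(doc) {
  // "family" is an assumed counterpart to "given"
  return { firstName: doc.given, lastName: doc.family };
}

const mine = fromMyFormat({ first_name: "Benjamin", last_name: "Young" });
const nested = fromNestedFormat({ name: { first: "Benjamin", last: "Young" } });
const given = fromGivenFormat({ given: "Benjamin", family: "Young" });

// All three now share one shape, but only because we wrote three adapters by hand.
console.log(mine.firstName, nested.firstName, given.firstName);
```

Every new source means another adapter, and none of them says what firstName actually means.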
What if we could match first_name, name.first, given, and even an obfuscated sadfeqpoif key into a meaningful Thing which made each of these “mean” the same thing? It can be done!
Meet JSON-LD! JSON-LD stands for “JSON for Linked Data.” It’s a specification for encoding contextualized meaning into otherwise meaningless JSON documents.
Here’s an example:
{ "@context": { "@vocab": "http://schema.org/", "first_name": "givenName", "last_name": "familyName", "alias": "alternateName", "email": "email" }, "first_name": "Benjamin", "last_name": "Young", "alias": "BigBlueHat", "email": "byoung@bigbluehat.com" }
In this upgraded version of the earlier JSON document, you’ll see a new @context object. This object handles the mapping from my ad hoc terminology to the Schema.org vocabulary, which is referenced via the value of @vocab.
The key/value pairs below the @vocab key map directly to meaningful URLs at http://schema.org/. Now, my keys have meaning. Each key now has:
- an identity in the form of a URL: last_name now maps to http://schema.org/familyName, for instance.
- a definition found by visiting the URL for familyName: “Family name. In the U.S., the last name of a Person. This can be used along with givenName instead of the name property.”
Pretty handy, no? Now. If you found that JSON — or got it back from my API — you could know what the keys meant by looking up their URLs.
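If you’d like to see how a key turns into a URL, here’s a minimal sketch of the lookup. It’s a simplification and not the full JSON-LD term resolution: it assumes the @context only contains @vocab and plain string mappings, and termToUrl is a name I made up for this example.

```javascript
const context = {
  "@vocab": "http://schema.org/",
  "first_name": "givenName",
  "last_name": "familyName",
  "alias": "alternateName",
  "email": "email"
};

// Simplified lookup: handles only "@vocab" plus plain string mappings.
function termToUrl(ctx, key) {
  if (key.startsWith("@")) return null; // keywords such as "@context" are not terms
  const vocab = ctx["@vocab"];
  const hasTerm = Object.prototype.hasOwnProperty.call(ctx, key);
  const target = hasTerm ? ctx[key] : key;
  if (/^https?:\/\//.test(target)) return target; // already a full URL
  return vocab ? vocab + target : null; // no vocabulary means no meaning
}

for (const key of ["first_name", "last_name", "alias", "email"]) {
  console.log(key, "->", termToUrl(context, key));
}
```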
Mapping to Meaning
Knowing what keys mean within a JSON document is already an upgrade over the status quo, but let’s not stop there!
Here’s an example of the same JSON using the Schema.org terms (key names) directly:
{ "@context": "http://schema.org/", "givenName": "Benjamin", "familyName": "Young", "alternateName": "BigBlueHat", "email": "byoung@bigbluehat.com" }
This Schema.org JSON example has the same meaning as the earlier version. “BigBlueHat” in both cases means an alternateName for Benjamin Young.
In the world of JSON-LD, my JSON with first_name and this JSON with givenName are equivalent, identical, the exact same, they match, and so on. Crazy, right?
Coding for the @context
So wait. We’ve got two different JSON documents which mean the same thing. Fabulous. However, most current JSON code for processing these likely looks a bit like this:
var doc = {...}; // one of the documents above
var first = doc.first_name || doc.givenName;
console.log('First Name', first);
How does the @context object help get around that? We’re still just looking for the strings, not the Things.
We have two options:
- expansion
- compaction
Expansion
Expansion in JSON-LD parlance means taking those meaningful, URL-based names and making them (at least within the processing code) actually be the URLs. Here’s the output from the expansion process done on both the above contextualized examples (remember! They’re identical now!):
[ { "http://schema.org/alternateName": [ { "@value": "BigBlueHat" } ], "http://schema.org/email": [ { "@value": "byoung@bigbluehat.com" } ], "http://schema.org/givenName": [ { "@value": "Benjamin" } ], "http://schema.org/familyName": [ { "@value": "Young" } ] } ]
Yeah. Sorry. This is the sausage making bit.
The above is the internal, identical representation in a JSON-LD processing system of both the above documents. It’s structured that way because there’s much more data that can (and likely will, in a real-life example) be added to those arrays, objects, etc.
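As a rough illustration of why both documents land on the same expanded form, here’s a toy expansion function. It is not the JSON-LD expansion algorithm and not what json-ld.js does internally: it only understands @vocab, plain string term mappings, and string values. The names toyExpand and canonical are made up, and treating the string context "http://schema.org/" as a bare @vocab is a shortcut; a real processor would fetch that remote context.

```javascript
// Toy expansion: handles "@vocab", string term mappings, and string values only.
function toyExpand(doc) {
  const raw = doc["@context"];
  // Shortcut (assumption): treat a string context as a bare "@vocab"
  // instead of fetching the remote context document.
  const context = typeof raw === "string" ? { "@vocab": raw } : (raw || {});
  const vocab = context["@vocab"];
  const node = {};
  for (const [key, value] of Object.entries(doc)) {
    if (key.startsWith("@")) continue; // skip keywords such as "@context"
    const hasTerm = Object.prototype.hasOwnProperty.call(context, key);
    const target = hasTerm ? context[key] : key;
    const url = /^https?:\/\//.test(target) ? target : (vocab ? vocab + target : null);
    if (url === null) continue; // keys without a mapping are dropped
    (node[url] = node[url] || []).push({ "@value": value });
  }
  return [node];
}

// Order-insensitive string form, so two expansions can be compared.
function canonical(expanded) {
  const node = expanded[0];
  return JSON.stringify(Object.keys(node).sort().map((k) => [k, node[k]]));
}

const mine = {
  "@context": {
    "@vocab": "http://schema.org/",
    "first_name": "givenName",
    "last_name": "familyName",
    "alias": "alternateName",
    "email": "email"
  },
  "first_name": "Benjamin",
  "last_name": "Young",
  "alias": "BigBlueHat",
  "email": "byoung@bigbluehat.com"
};

const schemaOrg = {
  "@context": "http://schema.org/",
  "givenName": "Benjamin",
  "familyName": "Young",
  "alternateName": "BigBlueHat",
  "email": "byoung@bigbluehat.com"
};

const sameMeaning = canonical(toyExpand(mine)) === canonical(toyExpand(schemaOrg));
console.log("Same expanded form?", sameMeaning);
```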
However, you’re likely not currently inspired to switch all your JSON processing code to use this format. No worries! That’s where compaction comes in!
Compaction
JSON-LD processors will take the above two example docs, turn them into that expanded format, and then (if asked) turn them into a much more human-friendly compacted variation that looks like this:
{ "@context": { "@vocab": "http://schema.org/", "first_name": "givenName", "last_name": "familyName", "alias": "alternateName", "email": "email" }, "alias": "BigBlueHat", "email": "byoung@bigbluehat.com", "last_name": "Young", "first_name": "Benjamin" }
“Wait, what?” I hear you say. “You just copy/pasted the first example again! You, just now.”
Actually, I didn’t. Don’t trust me? Check it out in the JSON-LD Playground. Here’s a screenshot for the doubters who don’t want to click:
Yep. That’s the “straight” Schema.org example (on the left), re-contextualized via my idiosyncratic @context (seen on the right) with the fully realized and re-contextualized output document at the bottom.
So, yeah. Round tripped someone else’s JSON into my own key names. Nice, yeah? Let’s code for that.
The codes
Here’s the magic using json-ld.js:
var recontextualized = {};
jsonld.compact(schema_org_doc, my_context, function(err, compacted) {
  recontextualized = compacted;
});
Running that code, I now have the output you just saw above. Not much code for so much magic, eh?
Here’s a live environment with the content from this example, so you can test it out: JS Bin on jsbin.com
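For a peek behind the magic, here’s a toy version of the compaction step. It’s a sketch under heavy assumptions, not the json-ld.js implementation: toyCompact is a made-up name, and it only copes with a single expanded node, plain "@value" entries, and a context made of @vocab plus string mappings.

```javascript
// Toy compaction: turns URL keys back into the key names a context prefers.
function toyCompact(expanded, context) {
  const vocab = context["@vocab"] || "";
  // Reverse lookup from full URL to my preferred key name.
  const urlToTerm = {};
  for (const [term, target] of Object.entries(context)) {
    if (term.startsWith("@")) continue;
    const url = /^https?:\/\//.test(target) ? target : vocab + target;
    urlToTerm[url] = term;
  }
  const out = { "@context": context };
  for (const [url, values] of Object.entries(expanded[0])) {
    // Fall back to the vocabulary-relative name, or the full URL.
    const fallback = vocab && url.startsWith(vocab) ? url.slice(vocab.length) : url;
    const key = urlToTerm[url] || fallback;
    const plain = values.map((v) => v["@value"]);
    out[key] = plain.length === 1 ? plain[0] : plain;
  }
  return out;
}

const expanded = [{
  "http://schema.org/alternateName": [{ "@value": "BigBlueHat" }],
  "http://schema.org/email": [{ "@value": "byoung@bigbluehat.com" }],
  "http://schema.org/givenName": [{ "@value": "Benjamin" }],
  "http://schema.org/familyName": [{ "@value": "Young" }]
}];

const myContext = {
  "@vocab": "http://schema.org/",
  "first_name": "givenName",
  "last_name": "familyName",
  "alias": "alternateName",
  "email": "email"
};

const compacted = toyCompact(expanded, myContext);
console.log(JSON.stringify(compacted, null, 2));
```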
Meaningful People APIs
Now that we can transform simple JSON documents into magically meaningful JSON-LD documents, let’s look at what this gets us in practice.
Let’s look at three JSON API endpoints from three different people-information-providers (commonly called Social Networks):
- Meetup
- Twitter
- LinkedIn
First, here’s the full idiosyncratic @context to cover the content I want “normalized” from these three social networks:
{ "@vocab": "http://schema.org/", "first_name": "givenName", "last_name": "familyName", "alias": "alternateName", "job_title": "jobTitle", "city": "addressLocality", "country": "addressCountry" }
Below are the URLs, plain-old-JSON, and the site-specific custom @context needed to map their key names to the Schema.org standard URLs. There’s also a JS Bin workspace you can fork for each of them!
Meetup
URL: https://api.meetup.com/2/member/19524571?&sign=true&photo-host=public&page=20&only=country,city,link,bio,name
JSON:
{ "link": "http://www.meetup.com/members/19524571", "name": "Benjamin Young", "country": "us", "bio": "aka BigBlueHat -=- Developer, Web, & Open Source Advocate, Invited Expert in the Annotation and Digital Publishing Working Groups at the W3C. Previously an inventor and evangelist for IBM's Cloudant, Couchbase, and also CTO at InnoVenture.", "city": "Greenville" }
Meetup Mapping @context:
{ "city": "http://schema.org/addressLocality", "country": "http://schema.org/addressCountry", "bio": "http://schema.org/description", "name": "http://schema.org/name" }
You can test it out: JS Bin on jsbin.com
Twitter
URL: https://api.twitter.com/1.1/users/show.json?screen_name=bigbluehat&user_id=15841047
JSON (only partial results below ’cause it’s huge!)
{ "id": 15841047, "id_str": "15841047", "name": "bigbluehat", "screen_name": "bigbluehat", "location": "Greenville, SC", "profile_location": null, "description": "inventor & evangelist - I :heart: hypothes.is, @couchdb, open source, open communities. I organize @RESTFest & @OpenUpstate." ... }
Twitter Mapping @context:
{ "name": "http://schema.org/alternateName", "screen_name": "http://schema.org/alternateName", "description": "http://schema.org/description", "location": "http://schema.org/location" }
You can test it out: JS Bin on jsbin.com
LinkedIn
URL: https://api.linkedin.com/v1/people/~?format=json
JSON:
{ "firstName": "Benjamin", "headline": "Invited Expert, W3C", "id": "Ol-pwbI97V", "lastName": "Young", "siteStandardProfileRequest": { "url": "https://www.linkedin.com/profile/view?id=AAoAAAAAx4oBxNlPOjsrspSms5FMi7Tx0c-EBSk&authType=name&authToken=XFwT&trk=api*a3227641*s3301901*" } }
LinkedIn Mapping @context:
{ "firstName": "http://schema.org/givenName", "lastName": "http://schema.org/familyName", "headline": "http://schema.org/jobTitle" }
You can test it out: JS Bin on jsbin.com
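To tie the pieces together, here’s a small sketch of the whole normalization: look up each key in the site’s mapping @context to get its URL, then look that URL up in my own @context to get my key name. The function name recontextualize is hypothetical and this is not the json-ld.js API; it assumes flat documents, site mappings that use full URLs, and a context of mine that uses @vocab with string mappings. The Meetup document is trimmed (no bio) to keep the example short.

```javascript
const myContext = {
  "@vocab": "http://schema.org/",
  "first_name": "givenName",
  "last_name": "familyName",
  "alias": "alternateName",
  "job_title": "jobTitle",
  "city": "addressLocality",
  "country": "addressCountry"
};

// Their key -> URL (via their mapping) -> my key (via my context).
function recontextualize(doc, theirContext, mine) {
  const vocab = mine["@vocab"];
  const urlToTerm = {};
  for (const [term, target] of Object.entries(mine)) {
    if (!term.startsWith("@")) urlToTerm[vocab + target] = term;
  }
  const out = {};
  for (const [key, value] of Object.entries(doc)) {
    if (!Object.prototype.hasOwnProperty.call(theirContext, key)) continue; // unmapped keys are dropped
    const url = theirContext[key];
    const fallback = url.startsWith(vocab) ? url.slice(vocab.length) : url;
    out[urlToTerm[url] || fallback] = value;
  }
  return out;
}

const meetupContext = {
  "city": "http://schema.org/addressLocality",
  "country": "http://schema.org/addressCountry",
  "bio": "http://schema.org/description",
  "name": "http://schema.org/name"
};
const meetupDoc = {
  "link": "http://www.meetup.com/members/19524571",
  "name": "Benjamin Young",
  "country": "us",
  "city": "Greenville"
};

const linkedinContext = {
  "firstName": "http://schema.org/givenName",
  "lastName": "http://schema.org/familyName",
  "headline": "http://schema.org/jobTitle"
};
const linkedinDoc = {
  "firstName": "Benjamin",
  "headline": "Invited Expert, W3C",
  "id": "Ol-pwbI97V",
  "lastName": "Young"
};

const fromMeetup = recontextualize(meetupDoc, meetupContext, myContext);
const fromLinkedIn = recontextualize(linkedinDoc, linkedinContext, myContext);
console.log(fromMeetup, fromLinkedIn);
```

Keys that a mapping doesn’t mention (link, id) simply fall away, which matches how JSON-LD expansion treats keys with no definition in the active context.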
Conclusion
I now have three different but meaningfully normalized variations of… myself. I can merge these in a number of ways depending on which data store I’m using. The keys all match now. They all have meaningful identifiers that, when followed, result in documentation for the key names. That’s far more than I got from the raw GET requests to these various APIs.
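As one example of what merging could look like, here’s a naive sketch that combines records once their keys are Schema.org URLs. It’s only one of many possible strategies: mergeRecords is a made-up helper, it assumes flat records with string values, and it simply collects each distinct value per URL.

```javascript
// Naive merge: collect every distinct value seen for each vocabulary URL.
function mergeRecords(records) {
  const merged = {};
  for (const record of records) {
    for (const [url, value] of Object.entries(record)) {
      const seen = merged[url] || (merged[url] = []);
      if (!seen.includes(value)) seen.push(value);
    }
  }
  return merged;
}

// Values taken from the examples above, already keyed by URL.
const fromMyJson = {
  "http://schema.org/givenName": "Benjamin",
  "http://schema.org/familyName": "Young",
  "http://schema.org/alternateName": "BigBlueHat"
};
const fromTwitter = {
  "http://schema.org/alternateName": "bigbluehat",
  "http://schema.org/location": "Greenville, SC"
};
const fromLinkedIn = {
  "http://schema.org/givenName": "Benjamin",
  "http://schema.org/familyName": "Young",
  "http://schema.org/jobTitle": "Invited Expert, W3C"
};

const merged = mergeRecords([fromMyJson, fromTwitter, fromLinkedIn]);
console.log(JSON.stringify(merged, null, 2));
```

Because every source agrees on what http://schema.org/givenName means, the merge needs no per-source special cases.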
Next time you sit down to code up some JSON document — even one you think only you will use — consider mapping your key names to meaningful vocabulary terms via JSON-LD. Your future self will thank you.
Reference: JSON-LD: Building Meaningful Data APIs from our WCG partner Florian Motlik at the Codeship Blog.