Structured Data Can Get Complicated

Over the past couple of months, I’ve written several posts about structured data markup and the increasing importance of building machine-readable context and descriptions into content that’s created for the web. Structured data is already used by the major search engines to provide enhanced search result listings, and it’s also used by mobile search providers to produce more accurate local search results.

The trouble is that the world of structured data isn’t all nice, neat and orderly. There’s no single standard “how to” on using structured data markup. The W3C — the organization that develops and manages most of the web’s open standards — recommends two different structured data specifications. The Resource Description Framework in Attributes (RDFa) was developed over a period of years by a W3C working group. The Microdata specification was promoted primarily by the major search players (Google, Yahoo and Bing), and then taken on by another W3C working group in a sort of shotgun wedding.

When there are two standards, is anything really “standard”? More or less. With a little ingenuity, Microdata can be used as a subset of RDFa. The markup vocabularies aren’t interchangeable, but Microdata coding can be used inside of a document that’s written using RDFa specifications. It’s confusing enough, though, that you need to have a firm idea about what’s going on before you start working the code into your website.

If you’re interested in delivering semantic information about your content primarily to the major search engines, Microdata is certainly the easier of the two markup formats to use. Its vocabulary is managed through Schema.org and is pretty tightly controlled. RDFa, on the other hand, is a very open standard. The syntax, or “format,” used in RDFa is managed through a broad-based W3C working group, but almost any person or organization can create an RDFa descriptive vocabulary, known as a schema (not to be confused with Schema.org).
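To make the difference in syntax a little more concrete, here is a rough Python sketch that holds the same description of a person in both formats and pulls the properties back out. The sample markup, the person "Jane Doe", and the names `PropertyCollector` and `extract` are all made up for illustration. The parser is a toy built on the standard library, not a real Microdata or RDFa processor, and it assumes each property sits on a simple element with plain text inside.

```python
from html.parser import HTMLParser

# Hypothetical sample markup: the same person described two ways.
MICRODATA = """
<div itemscope itemtype="https://schema.org/Person">
  <span itemprop="name">Jane Doe</span>
  <span itemprop="jobTitle">Editor</span>
</div>
"""

RDFA_LITE = """
<div vocab="https://schema.org/" typeof="Person">
  <span property="name">Jane Doe</span>
  <span property="jobTitle">Editor</span>
</div>
"""


class PropertyCollector(HTMLParser):
    """Toy collector: records the text that follows a tagged element."""

    def __init__(self, attr_name):
        super().__init__()
        self.attr_name = attr_name  # "itemprop" (Microdata) or "property" (RDFa)
        self.current = None         # property name waiting for its text
        self.found = {}

    def handle_starttag(self, tag, attrs):
        value = dict(attrs).get(self.attr_name)
        if value:
            self.current = value

    def handle_data(self, data):
        # Attach the first non-blank text after the start tag to the property.
        if self.current and data.strip():
            self.found[self.current] = data.strip()
            self.current = None


def extract(markup, attr_name):
    """Return a dict of property name -> text for the given attribute."""
    parser = PropertyCollector(attr_name)
    parser.feed(markup)
    parser.close()
    return parser.found


if __name__ == "__main__":
    print(extract(MICRODATA, "itemprop"))
    print(extract(RDFA_LITE, "property"))
```

Both calls should yield the same two properties. The point of the sketch is only that the two formats carry the same information under different attribute names (`itemscope`, `itemtype` and `itemprop` on one side; `vocab`, `typeof` and `property` on the other).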

Some schemas, such as Dublin Core or Friend of a Friend (FOAF), are commonly used and referenced in millions of online documents. Schema.org (the name is a hint here) also promotes a widely used schema. But RDFa makes the creation of roll-your-own schemas relatively easy, which has resulted in the appearance of numerous subject-specific vocabularies.

How useful are some of these rather obscure, very specific schemas? Only time will tell, I suppose. In the meantime, it seems the best practice is to keep your structured data markup as simple as possible and use only widely accepted schemas.
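One way to act on that advice is to check the terms you plan to use against a short list of vocabularies you trust. The Python sketch below is only an illustration: the `WELL_KNOWN` table is my own pick of three vocabularies with their commonly used prefixes, the function name `audit_terms` is hypothetical, and `myvocab:widget` is an invented example of a roll-your-own term.

```python
# Assumed allow-list: prefix -> namespace URI for three widely used vocabularies.
WELL_KNOWN = {
    "dcterms": "http://purl.org/dc/terms/",   # Dublin Core terms
    "foaf": "http://xmlns.com/foaf/0.1/",     # Friend of a Friend
    "schema": "http://schema.org/",           # Schema.org
}


def audit_terms(terms):
    """Split prefixed terms into expanded well-known URIs and unrecognized terms."""
    expanded = {}
    unrecognized = []
    for term in terms:
        prefix, sep, local = term.partition(":")
        if sep and prefix in WELL_KNOWN:
            # Full URI is simply the namespace followed by the local name.
            expanded[term] = WELL_KNOWN[prefix] + local
        else:
            unrecognized.append(term)
    return expanded, unrecognized


if __name__ == "__main__":
    ok, unknown = audit_terms(["foaf:name", "dcterms:title", "myvocab:widget"])
    print(ok)
    print(unknown)
```

Anything that lands in the second list deserves a second look before it goes into your pages.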