HTML 5 Species Taxon Microdata Using Darwin Core

I just added HTML 5 species taxon microdata to globalspecies.org. This provides machine-readable data for certain properties of a species, i.e. taxonomy, common names, and synonyms. The data is embedded in the HTML and tagged as property name/value pairs.

An example of the extracted data can be seen using Google’s Rich Snippets Testing Tool.

Overview

HTML 5 introduces microdata: content that is tagged so that it is machine readable using a specific vocabulary. The content is grouped into items that have properties. The type and properties of the items are governed by the vocabulary. The item and its properties can be attached to existing HTML tags in the content, or new tags can be created. The <div> tag is the most common for an item. The <span> tag is the most common for a property. If the property value should not be displayed to the browser user, then the <meta> tag can be used with a "content" attribute.
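
A minimal sketch of how these pieces fit together, assuming a made-up vocabulary: the item type http://example.org/terms/Animal and the property names commonName and habitat are placeholders for illustration, not part of any real vocabulary.

```html
<!-- The <div> starts an item. itemscope is a boolean attribute, so it is
     written here without a value. The itemtype URL is a placeholder. -->
<div itemscope itemtype="http://example.org/terms/Animal">
  <!-- Displayed property: the value is the text inside the <span>. -->
  Common name: <span itemprop="commonName">Cougar</span>
  <!-- Hidden property: the value is taken from the "content" attribute. -->
  <meta itemprop="habitat" content="forest" />
</div>
```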

Rather than invent my own vocabulary, I went searching for an existing vocabulary. The vocabularies that major search engines support can be found at schema.org. They do not currently have one that supports species taxonomy.

Further searching yielded the Darwin Core at Biodiversity Information Standards – TDWG. This is a vocabulary used for the exchange of species information. The Taxon class is perfect for representing species taxonomy information.

I added two additional properties to the Taxon class: superFamily and synonym. Superfamily is a rank used by the ITIS Catalog of Life in their taxonomy, which is where globalspecies.org gets its taxonomy from. Synonyms are represented in the Darwin Core as another Taxon item instead of as a property of the official species item, so the synonym property lets a synonym be listed directly on the species item.

Vocabulary

Item Type: http://rs.tdwg.org/dwc/terms/Taxon
Properties

  • kingdom: Kingdom
  • phylum: Phylum
  • class: Class
  • order: Order
  • http://globalspecies.org/terms/superFamily: Super family
  • family: Family
  • genus: Genus
  • specificEpithet: Species part of scientific name
  • infraspecificEpithet: Infraspecies part of scientific name
  • scientificName: Full species name or higher taxon name, e.g. Puma concolor or Chordata
  • taxonRank: One of kingdom, phylum, class, order, superfamily, family, genus, species, infraspecies
  • vernacularName: Common name. Can have multiple.
  • http://globalspecies.org/terms/synonym: Equivalent scientific name, either as text or as a nested http://rs.tdwg.org/dwc/terms/Taxon item. Can have multiple.

Darwin Core has additional properties. The above properties are the ones used for globalspecies.org.

Example 1 (Species Puma concolor)

<div itemscope="1" itemtype="http://rs.tdwg.org/dwc/terms/Taxon">
<meta itemprop='kingdom' content='Animalia' />
<meta itemprop='phylum' content='Chordata' />
<meta itemprop='class' content='Mammalia' />
<meta itemprop='order' content='Carnivora' />
<meta itemprop='family' content='Felidae' />
<meta itemprop='genus' content='Puma' />
<meta itemprop='specificEpithet' content='concolor' />
<meta itemprop='taxonRank' content='species' />
<h1 itemprop='scientificName'>Puma concolor</h1>
<meta itemprop='vernacularName' content='Cougar' />
<meta itemprop='vernacularName' content='Puma' />
<meta itemprop='vernacularName' content='Mountain lion' />
<h2>Synonyms</h2>
<ul>
<li itemprop='http://globalspecies.org/terms/synonym'>Felis concolor</li>
</ul>
</div>

itemscope="1" defines the start of an item. itemscope is a boolean attribute, so it is its presence that starts the item, not the value.
itemtype="http://rs.tdwg.org/dwc/terms/Taxon" defines the item type.
itemprop='scientificName' defines a property.
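
Example 1 gives the synonym as plain text. The vocabulary also allows the synonym value to be a Taxon item. A minimal sketch of what that could look like, assuming standard microdata nesting (this markup is illustrative and is not taken from globalspecies.org):

```html
<div itemscope itemtype="http://rs.tdwg.org/dwc/terms/Taxon">
  <h1 itemprop="scientificName">Puma concolor</h1>
  <!-- Putting itemscope on the element that carries itemprop makes the
       property value a nested item instead of plain text. -->
  <div itemprop="http://globalspecies.org/terms/synonym"
       itemscope itemtype="http://rs.tdwg.org/dwc/terms/Taxon">
    <span itemprop="scientificName">Felis concolor</span>
    <meta itemprop="genus" content="Felis" />
    <meta itemprop="specificEpithet" content="concolor" />
    <meta itemprop="taxonRank" content="species" />
  </div>
</div>
```

The nested form costs more markup, but it lets a consumer read the synonym's own genus, epithet and rank rather than only its name.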

If you look at the Puma concolor page on globalspecies.org you may wonder why <meta> tags are used for the kingdom, vernacularName, etc. when the data is actually being displayed on the page and the property could be added to the existing <a> tag or the text wrapped in a <span> tag.

The reason that the kingdom property was not added to the <a> tag is that the property value of an <a> tag is the "href" attribute, not the contents of the <a> tag.

Ex. <a href="/ntaxa/109518">Animalia</a>

The microdata value for this property would be the link target '/ntaxa/109518' instead of 'Animalia'. Other tags that behave this way include <img>, whose value is its "src" attribute, and <object>, whose value is its "data" attribute. The vernacularName properties are not wrapped in <span> tags because the PHP code that outputs them is shared by other code that does not need the microdata markup.
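
For the link case, one alternative would be to put the property on a <span> inside the link, so the value comes from the text rather than the "href". This is only a sketch of that option, not what globalspecies.org does:

```html
<div itemscope itemtype="http://rs.tdwg.org/dwc/terms/Taxon">
  <!-- itemprop on the <a> itself would yield the link URL from href. -->
  <!-- itemprop on an inner <span> yields the text "Animalia" instead. -->
  Kingdom: <a href="/ntaxa/109518"><span itemprop="kingdom">Animalia</span></a>
</div>
```

The <meta> approach used on the site avoids touching the link markup at all, at the cost of repeating the value.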

Example 2 (Phylum Chordata)

<div itemscope="1" itemtype="http://rs.tdwg.org/dwc/terms/Taxon">
<meta itemprop='kingdom' content='Animalia' />
<meta itemprop='phylum' content='Chordata' />
<meta itemprop='taxonRank' content='phylum' />
<h1 itemprop='scientificName'>Chordata</h1>
Common Name: <span itemprop='vernacularName'>Vertebrates and tunicates</span>
</div>
