CHARMe as bibliographic database, part 2: SKOS

In a previous post I introduced the idea of using CHARMe as a bibliographic database. In this post I want to dive deeper into the topic, focusing on how to capture specific types of documents.
First, let’s look at how our local bibliographic database is currently organized and what queries we want to run against it. It turns out that our bibliographic database is a simple Excel sheet with some informally defined constraints restricting the allowed values of some columns. The most interesting constraints for this post are those on the columns DocType and Type. The column ‘Type’ specifies in more detail what kind of document a DocType.document entry is.

bibliograhic_db
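The informal column constraints could be made explicit with a small check like the one below. This is only a minimal sketch: the DocType values are the ones discussed later in this post, whereas the Type values other than ATBD, the row layout and the function name `validate_row` are assumptions made for illustration.

```python
# Allowed values per column. The DocType values are taken from this post;
# every Type value except "ATBD" is a hypothetical placeholder.
ALLOWED_DOC_TYPES = {"article", "presentation", "poster", "document"}
ALLOWED_TYPES = {"ATBD", "user manual", "validation report"}  # assumed


def validate_row(row):
    """Return a list of constraint violations for one spreadsheet row."""
    errors = []
    doc_type = row.get("DocType")
    if doc_type not in ALLOWED_DOC_TYPES:
        errors.append(f"unknown DocType: {doc_type!r}")
    # 'Type' refines rows whose DocType is 'document', so only check it there.
    if doc_type == "document" and row.get("Type") not in ALLOWED_TYPES:
        errors.append(f"unknown Type: {row.get('Type')!r}")
    return errors


rows = [
    {"DocType": "document", "Type": "ATBD"},
    {"DocType": "document", "Type": "memo"},
    {"DocType": "book", "Type": None},
]
for row in rows:
    print(row, validate_row(row))
```

Rows that pass return an empty list, so the same function can be used to clean the sheet before any migration.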

A common query against the bibliographic database looks like:
‘Please give me all items related to project documentation of a specific project/dataset.’ or ‘Please give me all Algorithm Theoretical Basis Documents of a specific project/dataset.’
In addition to these common queries, a new user requirement is to add a new level of categorization to the ‘Types’, in order to enable queries like: ‘Please give me all technical items related to project documentation of a specific project/dataset.’
That leads us to the question: what are technical items, and what other kinds of items need to be captured? Thinking about it, we come to the conclusion that we need a hierarchical ordering of those items. Additionally, it would be very valuable to make these classifications and definitions publicly available, at least within our department as a kind of department vocabulary, rather than hiding them inside an application.

cmsaf_vocab

With these fundamental requirements captured, let’s have a look at how CHARMe may help us get on the way.
In the CHARMe project we have decided to use the FRBR-aligned Bibliographic Ontology, fabio, a publicly available and widely used ontology. We should therefore look for fabio classes that we can map to our DocType and Type entries. And indeed, except for DocType.document, we find the following mappings:

docType:article maps to:

fabio_article

docType:presentation maps to:

fabio_presentation

docType:poster maps to:

fabio_conferenceposter

The nearest match to docType:document might be:

fabio_report
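The mappings above can be written down as a simple lookup table. This is a sketch under assumptions: the class names are inferred from the figures in this post and should be checked against the fabio ontology itself, and the helper name `fabio_class_for` is hypothetical.

```python
# Namespace of the FRBR-aligned Bibliographic Ontology (fabio).
FABIO = "http://purl.org/spar/fabio/"

# DocType value -> fabio class URI. Class names are inferred from the
# figures in this post, not verified against the ontology here.
DOC_TYPE_TO_FABIO = {
    "article": FABIO + "Article",
    "presentation": FABIO + "Presentation",
    "poster": FABIO + "ConferencePoster",
    "document": FABIO + "Report",  # nearest match, not an exact one
}


def fabio_class_for(doc_type):
    """Look up the fabio class URI for a DocType cell value."""
    key = doc_type.strip().lower()
    if key not in DOC_TYPE_TO_FABIO:
        raise ValueError(f"no fabio mapping for DocType {doc_type!r}")
    return DOC_TYPE_TO_FABIO[key]


print(fabio_class_for("Poster"))
```

Normalizing the cell value before the lookup keeps the mapping tolerant of the inconsistent capitalization that tends to creep into a hand-maintained spreadsheet.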

As we see, fabio:Report already has some sub-classes, but when we take a closer look at fabio:TechnicalReport we notice that it is a leaf node in the fabio hierarchy. So we need to find a way to extend the fabio ontology in order to cover our technical items, like the ATBD.
While finding an existing, public, widely used ontology that defines the term ATBD for us would be the preferred option, in this post we start to create our own definition/vocabulary and try to extend fabio to our needs by using SKOS.

SKOS, which stands for Simple Knowledge Organization System, is a W3C standard, based on other Semantic Web standards (RDF and OWL), that provides a way to represent controlled vocabularies, taxonomies and thesauri.

“The fundamental element of the SKOS vocabulary is the concept. Concepts are the units of thought —ideas, meanings, or (categories of) objects and events—which underlie many knowledge organization systems. As such, concepts exist in the mind as abstract entities which are independent of the terms used to label them.” (SKOS Primer)

So we start by expressing (using Turtle) our ATBD as a SKOS concept and adding some labeling information and a definition to it:

cmsaf_vocab:ATBD     rdf:type     skos:Concept ;
    skos:prefLabel     "ATBD" ;
    skos:altLabel     "Algorithm Theoretical Basis Document" ;
    skos:definition     "The Algorithm Theoretical Basis Documents (ATBD) are intended to describe the physical and mathematical description of the algorithms to be used in the generation of data products. The ATBD include a description of variance and uncertainty estimates and considerations of calibration and validation, exception control, and diagnostics. In some cases, internal and external data product flows are required." .

And to build the bridge to the fabio ontology we just need to add a semantic relation between our ATBD concept and a fabio class. Since the ATBD is a more specific kind of technical report, fabio:TechnicalReport is its broader concept:

cmsaf_vocab:ATBD     skos:broader     fabio:TechnicalReport .

That means that by using SKOS we can build up our own classification scheme in a common data model for knowledge organization systems (KOS), with the benefit of easily incorporating or extending an existing KOS.

fabio_extended

But by migrating from Excel to CHARMe, we also switch from relational data to graph data and from SQL to SPARQL. That means the question ‘Please give me all technical items related to project documentation of a specific project.’ becomes, in pseudo-SPARQL:
SELECT *
WHERE {
?technical_item oa:hasTarget <uri of our project> .
?technical_item rdf:type ?technical_item_type .
?technical_item_type skos:broader+ fabio:TechnicalReport .
}
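To make the transitive part of this query more tangible, here is a minimal pure-Python sketch of how a one-or-more-steps path is evaluated over a set of triples. It assumes the ATBD concept is linked to fabio:TechnicalReport as its broader concept (skos:broader); the `ex:` URIs, the sample data and the function names are hypothetical, and a real deployment would of course run SPARQL against the CHARMe triple store instead.

```python
# Triples as (subject, predicate, object) tuples. The ex: resources are
# made-up examples; only the vocabulary terms come from this post.
TRIPLES = {
    ("cmsaf_vocab:ATBD", "skos:broader", "fabio:TechnicalReport"),
    ("ex:item1", "oa:hasTarget", "ex:project"),
    ("ex:item1", "rdf:type", "cmsaf_vocab:ATBD"),
    ("ex:item2", "oa:hasTarget", "ex:project"),
    ("ex:item2", "rdf:type", "fabio:Article"),
}


def objects(subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def broader_closure(concept):
    """All concepts reachable via one or more skos:broader steps."""
    seen = set()
    stack = [concept]
    while stack:
        for parent in objects(stack.pop(), "skos:broader"):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def technical_items(project):
    """Items targeting the project whose type sits below TechnicalReport."""
    items = set()
    for s, p, o in TRIPLES:
        if p == "oa:hasTarget" and o == project:
            for item_type in objects(s, "rdf:type"):
                if "fabio:TechnicalReport" in broader_closure(item_type):
                    items.add(s)
    return items


print(technical_items("ex:project"))
```

Because the closure follows the hierarchy to any depth, adding a further level of categorization below the technical report only requires new skos:broader statements, not a change to the query.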

So there are some more steps ahead of us before we can migrate our Excel bibliographic database to CHARMe, but by using fabio and SKOS one of the big chunks can be tackled.
