One of the major challenges of tackling a project such as LD4PE is deciding where to begin and where to end. Linked Data is a huge topic, one which requires, both conceptually and in practice, an understanding of how many moving pieces work together. Linked Data involves learning the principles behind the RDF data model, OWL (Web Ontology Language), and the SPARQL query language. It involves becoming familiar with a number of different serializations used when publishing Linked Data (RDF/XML, Turtle, JSON-LD). It involves learning how to obtain data from non-RDF sources, then “clean” and convert it for use as Linked Data. It involves an understanding of how controlled vocabularies and taxonomies are built and maintained, and how inferencing schemes are used. In other words, Linked Data is a series of inter-related concepts, technologies, and best practices, and one must master all of them in order to put Linked Data to practical use.

The process of searching the Web for resources which fit the broad definition of “teaching about Linked Data” was very “hit or miss” at first; we simply did not know where to begin focusing within such a wide topic. The only guidance we had was an early draft of the Competency Index, to which the learning objects we were discovering would ultimately be tied. As the project advanced, we held individual conversations with various LD4PE project members, each of whom had their own particular areas of interest and expertise, to get their unique perspectives on what direction the Competency Index and resource discovery should take. Some felt that we should include certain background competencies, such as familiarity with taxonomy building or a basic understanding of how relational databases (and the query languages used to access them) differ from triple stores. Every suggestion made perfect sense, in that each was somehow related to the (very) wide world of creating, publishing, or consuming Linked Data.

As our monthly Skype meetings continued, all involved in the project eventually reached the consensus that we were at risk of widening the scope of LD4PE to the point where it would be unmanageable; we would not be able to deliver a finished project of the quality we all aspired to within the two-year period of our Institute of Museum and Library Services (IMLS) grant. Finally, we agreed that, at least for the next few months, we would focus on one area of the Competency Index and resource discovery: the SPARQL query language.

Moving ahead with a focus on SPARQL made sense for two reasons: 1) Resource discovery efforts to that point had revealed that there were many learning objects for some Linked Data concepts and relatively few for others. A few months into the project, we made a heat map showing the relative abundance or scarcity of resources for each area of the draft Competency Index, and SPARQL stood out as the “hottest”; 2) SPARQL, along with the RDF data model and OWL, is commonly identified as one of the three “background technologies” which underpin Linked Data. SPARQL is used to provide access to the triple stores that hold RDF data, as well as to assess the quality of those datasets and to maintain them. It is the language which enables one to pull information from a dataset for use in Linked Data applications. And, used in conjunction with RDFS and OWL, SPARQL can be used to perform reasoning, a quality which sets it (and Linked Data itself) above traditional relational data.
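
To make the idea of pulling information from a triple store a little more concrete, here is a minimal sketch in Python of how a basic graph pattern might be matched against a set of triples. It is only an illustration, not how any real SPARQL engine is implemented; the `ex:` names, the `TRIPLES` data, and the `match_pattern` and `select` helpers are all invented for this example.

```python
# Toy in-memory "triple store": a set of (subject, predicate, object) tuples.
# Every name under the ex: prefix is hypothetical, made up for illustration.
TRIPLES = {
    ("ex:alice", "rdf:type", "ex:Librarian"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "rdf:type", "ex:Librarian"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:carol", "rdf:type", "ex:Archivist"),
    ("ex:carol", "ex:name", "Carol"),
}


def is_variable(term):
    """SPARQL-style variables start with '?'."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`.

    Returns the extended binding dict, or None if they do not match.
    """
    extended = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_variable(p_term):
            # A variable already bound must agree with the triple's term.
            if extended.get(p_term, t_term) != t_term:
                return None
            extended[p_term] = t_term
        elif p_term != t_term:
            return None
    return extended


def select(patterns, triples):
    """Evaluate a basic graph pattern: all patterns must match under one binding."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# Roughly the same intent as the SPARQL query:
#   SELECT ?name WHERE { ?person rdf:type ex:Librarian . ?person ex:name ?name }
#   ORDER BY ?name
results = select(
    [("?person", "rdf:type", "ex:Librarian"), ("?person", "ex:name", "?name")],
    TRIPLES,
)
names = sorted(b["?name"] for b in results)
print(names)  # ['Alice', 'Bob']
```

The shared variable `?person` is what joins the two patterns; a real triple store would use indexes rather than scanning every triple for every pattern.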

The following “slices” from our draft Competency Index show just how important SPARQL is, as it appears in several areas:

Topic Cluster: Searching and querying

  • Topic: Discovery of RDF vocabularies and data sets
    • Competency: Monitors registries and referatories of RDF vocabularies, OWL ontologies and RDF data stores
  • Topic: Assessment of RDF vocabularies and data sets
  • Topic: Anatomy of a simple SPARQL query
  • Topic: Querying RDF data using SPARQL
    • Competency: Understands the SPARQL 1.1 query language, protocol, functions and operators
    • Competency: Uses query forms including ASK, SELECT, DESCRIBE, CONSTRUCT
    • Competency: Uses query patterns including BGP, UNION, OPTIONAL, FILTER
    • Competency: Uses sequence modifiers including DISTINCT, REDUCED, ORDER BY, LIMIT, OFFSET
  • Topic: Updating RDF with SPARQL 1.1
    • Competency: Performs data management using INSERT, DELETE, DELETE/INSERT
    • Competency: Performs graph management using LOAD, CLEAR, CREATE, DROP, COPY/MOVE/ADD
  • Topic: Reasoning over RDF
    • Competency: Understands how reasoning and data integration can be achieved by utilizing domain knowledge embodied in RDFS and OWL
    • Competency: Utilizes the entailment regimes of RDFS and SPARQL 1.1 and understands their limitations
    • Competency: Understands OWL properties, property axioms, axioms and class constructions in reasoning
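
As a rough illustration of the “Reasoning over RDF” topic, the sketch below applies two of the RDFS entailment rules (subclass transitivity and type inheritance) to a handful of triples until nothing new can be inferred. It is a simplified Python example under stated assumptions, not a full RDFS reasoner: the `ex:` class and instance names and the `rdfs_subclass_closure` helper are hypothetical, and only `rdfs:subClassOf` and `rdf:type` are handled.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical asserted data; the ex: names are invented for this sketch.
ASSERTED = {
    ("ex:Librarian", SUBCLASS_OF, "ex:InformationProfessional"),
    ("ex:InformationProfessional", SUBCLASS_OF, "ex:Person"),
    ("ex:alice", RDF_TYPE, "ex:Librarian"),
}


def rdfs_subclass_closure(triples):
    """Apply two RDFS entailment rules until no new triples appear.

    rdfs11: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    rdfs9:  (x type A), (A subClassOf B)       => (x type B)
    """
    inferred = set(triples)  # copy, so the asserted set is left untouched
    while True:
        new = set()
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS_OF]
        for sub, sup in subclass_pairs:
            for s, p, o in inferred:
                if p == SUBCLASS_OF and s == sup:
                    new.add((sub, SUBCLASS_OF, o))  # rdfs11
                if p == RDF_TYPE and o == sub:
                    new.add((s, RDF_TYPE, sup))  # rdfs9
        if new <= inferred:
            return inferred
        inferred |= new


closure = rdfs_subclass_closure(ASSERTED)
alice_types = sorted(o for s, p, o in closure if s == "ex:alice" and p == RDF_TYPE)
print(alice_types)
# ['ex:InformationProfessional', 'ex:Librarian', 'ex:Person']
```

Only one type was asserted for `ex:alice`; the other two follow from the subclass hierarchy, which is the kind of domain knowledge a query under an RDFS entailment regime can take into account.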

Topic Cluster: Creating, publishing and manipulating RDF

  • Topic: Creating and using SPARQL endpoints
    • Competency: Creates SPARQL endpoints for RDBMS
    • Competency: Uses SPARQL endpoints for RDBMS
    • Competency: Demonstrates knowledge of factors influencing whether to publish RDF or provide a SPARQL endpoint