Tuesday, October 1, 2013

How to Manage Date Range in SPARQL queries?


The following examples demonstrate how to handle date ranges in a SPARQL query:

Example 1

SELECT ?s ?date 
FROM <http://dbpedia.org> 
WHERE 
  { 
    ?s ?p ?date . FILTER ( ?date >= "1945-01-01"^^xsd:date && ?date <= "1945-12-31"^^xsd:date )
  } 
LIMIT 100
View the results of the query execution on the dbpedia instance.

Example 2

Suppose the following query uses bif:contains to match dates:
SELECT DISTINCT ?s ?date 
FROM <http://dbpedia.org>
WHERE
  {
    ?s ?p ?date . FILTER( bif:contains(?date, '"1945*"' ) && (str(?p) != str(rdfs:label)) )
  }
LIMIT 30
If ?date is of type xsd:date or xsd:dateTime and has valid syntax, then bif:contains(?date, '"1945*"' ) will not find it, because the value is parsed at load/create time and stored as a SQL DATE value.
So if the data are all accurate and properly typed, the filter should be:
(?date >= xsd:date("1945-01-01") && ?date < xsd:date("1946-01-01"))
That is, the query should be:
SELECT DISTINCT ?s ?date
FROM <http://dbpedia.org>
WHERE
  {
    ?s ?p ?date . FILTER( ?date >= xsd:date("1945-01-01") && ?date < xsd:date("1946-01-01") && (str(?p) != str(rdfs:label)) )
  }
LIMIT 10
View the results of the query execution on the dbpedia instance.
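
Whether the range comparison applies depends on how the values are actually typed. One way to check is to count the datatypes that occur for a given property. The query below is only a sketch: the property <http://dbpedia.org/ontology/birthDate> is chosen here purely for illustration and is not part of the original examples, and the query has not been run against the live endpoint.

```sparql
SELECT ?dt (COUNT(*) AS ?n)
FROM <http://dbpedia.org>
WHERE
  {
    # Illustrative property only; substitute the property you are querying.
    ?s <http://dbpedia.org/ontology/birthDate> ?date .
    FILTER ( isLiteral(?date) )
    # datatype() returns the datatype IRI of each literal value
    BIND ( datatype(?date) AS ?dt )
  }
GROUP BY ?dt
ORDER BY DESC(?n)
```

If every value is reported as xsd:date or xsd:dateTime, the range filter alone should be enough.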
If the data are not all accurate and properly typed (for example, dates stored as plain strings), the free-text search is fine for tiny examples but not for "big" cases, because bif:contains(?date, '"1945*"') requires that fewer than 200 words in the table begin with 1945. Still, some of the data may have an accurate type and syntax, so range comparison should be used for those values and the results combined via UNION.
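
A minimal sketch of such a combined query, reusing the filters from Example 2, might look like the following. It assumes a Virtuoso endpoint with free-text indexing enabled (bif:contains is Virtuoso-specific), and it has not been verified against the live DBpedia instance.

```sparql
SELECT DISTINCT ?s ?date
FROM <http://dbpedia.org>
WHERE
  {
    {
      # Branch 1: properly typed values, compared as dates
      ?s ?p ?date .
      FILTER ( ?date >= xsd:date("1945-01-01") && ?date < xsd:date("1946-01-01") )
    }
    UNION
    {
      # Branch 2: values stored as text, matched through the free-text index
      ?s ?p ?date .
      FILTER ( bif:contains(?date, '"1945*"') )
    }
    # Applies to both branches, as in the original queries
    FILTER ( str(?p) != str(rdfs:label) )
  }
LIMIT 30
```

The second branch is still subject to the wildcard limitation described above, so on large data sets it may need a narrower pattern.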

If the dates include timezones, the application can choose the beginning and the end of the year in a timezone other than the default.
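
As an illustration, the year boundaries can be written as xsd:dateTime literals with an explicit offset. The sketch below assumes the values are typed as xsd:dateTime, and the +09:00 offset is an arbitrary choice made only for the example.

```sparql
SELECT DISTINCT ?s ?date
FROM <http://dbpedia.org>
WHERE
  {
    ?s ?p ?date .
    # The year 1945 as seen in a UTC+09:00 timezone (offset chosen for illustration)
    FILTER ( ?date >= "1945-01-01T00:00:00+09:00"^^xsd:dateTime
          && ?date <  "1946-01-01T00:00:00+09:00"^^xsd:dateTime )
  }
LIMIT 10
```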
