Provenance Vocabulary Core Ontology Specification

14 March 2012

This version:
http://purl.org/net/provenance/ns-20120314
Latest version:
http://purl.org/net/provenance/ns
Previous version:
http://purl.org/net/provenance/ns-20110125
Revision:
0.6
Authors:
Olaf Hartig (Database and Information System Research Group, Department of Computer Science, Humboldt-Universität zu Berlin)
Jun Zhao (Image Bioinformatics Research Group, Department of Zoology, University of Oxford)

This work is licensed under a Creative Commons License. This copyright applies to the Provenance Vocabulary Core Ontology Specification and accompanying documentation.

Regarding underlying technology, the Provenance Vocabulary relies heavily on W3C's RDF technology, an open Web standard that can be freely used by anyone.

The visual layout and structure of this specification were adapted from the SIOC Core Ontology Specification edited by Uldis Bojars and John G. Breslin.


Abstract

The Provenance Vocabulary provides classes and properties for describing provenance of Web data. The vocabulary focuses on two main use cases: (1) it enables consumers of Web data to describe provenance of data retrieved from the Web and of data derived from such Web data; (2) it enables providers of Web data to publish provenance-related metadata about their data. Note that the vocabulary is not intended for describing provenance of other kinds of Web content. The Provenance Vocabulary is designed as a Web-data-specific specialization of the W3C PROV Ontology (PROV-O); classes and properties provided by the vocabulary are domain-specific extensions of the more general concepts introduced in PROV-O. As a consequence, any description of provenance based on the Provenance Vocabulary can be interpreted and exchanged according to the W3C PROV family of standards.

This document specifies the Provenance Vocabulary Core Ontology, which defines the main classes and properties provided by the Provenance Vocabulary.


Status of this document

NOTE: This section describes the status of this document at the time of its publication. Other documents may supersede this document.

This specification is an evolving document. This document may be updated or added to based on implementation experience, but no commitment is made by the authors regarding future updates.

The authors welcome suggestions on the Provenance Vocabulary and this document. Please send comments to the Provenance Vocabulary users' mailing list (prov-vocab-users); public archives are available. Issues with the vocabulary can be reported using the issue tracker.



1. Introduction

The Provenance Vocabulary, which is defined as an OWL-DL ontology, is partitioned into a core ontology and supplementary modules. To keep the core ontology simple, the modules provide less frequently used terms and a broad range of extensions of the core terms. At present the Provenance Vocabulary has three modules: Types, Files, and Integrity Verification.

The vocabulary closely follows the model for Web data provenance presented in [Har09]. This model comprises two dimensions of Web data provenance: data creation and data access. Accordingly, the Provenance Vocabulary consists of three main parts: general terms, terms for data creation, and terms for data access.

Detailed information about using the Provenance Vocabulary and many examples can be found in [HZ09].

The Provenance Vocabulary is a domain-specific specialization of the W3C PROV Ontology (PROV-O) developed by the W3C Provenance Working Group; classes and properties provided by the vocabulary are Web-data-specific extensions of the more general concepts introduced in PROV-O. As a consequence, any description of provenance based on the Provenance Vocabulary can be interpreted and exchanged according to the W3C PROV family of standards.

The XML Namespace URIs that must be used by implementations of this specification are:

Recommended prefixes for abbreviating these four namespaces are prv:, prvTypes:, prvFiles:, and prvIV:, respectively.

Further prefixes used in this document are:

Prefix     Namespace
prov:      http://www.w3.org/ns/prov#
dcterms:   http://purl.org/dc/terms/
foaf:      http://xmlns.com/foaf/0.1/
ir:        http://www.ontologydesignpatterns.org/cp/owl/informationrealization.owl#
irw:       http://www.ontologydesignpatterns.org/ont/web/irw.owl#
web:       http://sw.nokia.com/WebArch-1/
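
As a non-normative illustration, prefixed names used in this document can be expanded to full URIs as sketched below. The function name expand is hypothetical, and the prv: entry is an assumption taken from the term identifiers in Section 3 (for example http://purl.org/net/provenance/ns#DataItem); the namespaces of the three modules are not included.

```python
# Prefix table from this section. The prv: entry is inferred from the term
# identifiers listed in Section 3 and is an assumption of this sketch.
PREFIXES = {
    "prv": "http://purl.org/net/provenance/ns#",
    "prov": "http://www.w3.org/ns/prov#",
    "dcterms": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "ir": "http://www.ontologydesignpatterns.org/cp/owl/informationrealization.owl#",
    "irw": "http://www.ontologydesignpatterns.org/ont/web/irw.owl#",
    "web": "http://sw.nokia.com/WebArch-1/",
}


def expand(name: str) -> str:
    """Expand a prefixed name such as 'prv:DataItem' to a full URI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {name!r}")
    return PREFIXES[prefix] + local
```

For instance, expand("prv:DataItem") yields the identifier given for the DataItem class in Section 3.1.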

2. The Provenance Vocabulary Core Ontology at a glance

An alphabetical index of Provenance Vocabulary Core Ontology terms, by class and by property, is given below. All the terms are hyperlinked to their detailed description for quick reference.

Classes: | CreationGuideline | DataAccess | DataCreation | DataItem | DataProvidingService | DataPublisher | File | HumanAgent | Immutable | NonHumanAgent |

Properties: | accessedResource | accessedService | completedAt | containedBy | createdBy | deployedSoftware | operatedBy | performedBy | precededBy | retrievedBy | serializedBy | usedBy | usedData | usedGuideline |

The following figure provides an overview of the classes and properties in the Provenance Vocabulary Core Ontology.

[Figure: An overview of the classes and properties in the Provenance Vocabulary Core Ontology.]

3. Cross-reference for the Provenance Vocabulary Core Ontology classes and properties

The Provenance Vocabulary Core Ontology introduces the following classes and properties.

3.1. General classes

Class: DataItem

DataItem is a general concept that represents data items of any kind.

identifier:http://purl.org/net/provenance/ns#DataItem
sub-class of: prov:Entity ir:InformationRealization
super-class of: prv:CreationGuideline
disjoint with: prv:File
in domain of:prv:containedBy prv:createdBy prv:precededBy prv:serializedBy
in range of:prv:containedBy prv:precededBy prv:usedData

[back to overview]


Class: File

File is a general class that represents computer files/documents of any kind.

identifier:http://purl.org/net/provenance/ns#File
sub-class of: prov:Entity
disjoint with: prv:DataItem
in domain of:prv:createdBy
in range of:prv:serializedBy

[back to overview]


Class: Immutable

Immutable is a concept that represents entities which are immutable.

identifier:http://purl.org/net/provenance/ns#Immutable
sub-class of: prov:Entity
in domain of:prv:retrievedBy

[back to overview]


Class: HumanAgent

HumanAgent is a general class that represents agents who are social beings, such as persons, organizations, and companies.

identifier:http://purl.org/net/provenance/ns#HumanAgent
sub-class of: prov:Agent
super-class of: prv:DataPublisher foaf:Person foaf:Organization foaf:Group
disjoint with: prv:NonHumanAgent
in range of:prv:operatedBy

[back to overview]


Class: NonHumanAgent

NonHumanAgent is a general class that represents agents who are not social beings.

identifier:http://purl.org/net/provenance/ns#NonHumanAgent
sub-class of: prov:Agent
super-class of: prv:DataProvidingService
disjoint with: prv:HumanAgent
in domain of:prv:deployedSoftware prv:operatedBy
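
The disjointness statements in the class tables can be used as a simple consistency check on a provenance description. The following non-normative sketch assumes triples are represented as Python tuples of prefixed names; the function name and this representation are hypothetical. It inspects only explicitly asserted types and performs no sub-class reasoning.

```python
RDF_TYPE = "rdf:type"

# Disjoint class pairs stated in the class tables of this specification.
DISJOINT = [
    ("prv:HumanAgent", "prv:NonHumanAgent"),
    ("prv:DataItem", "prv:File"),
    ("prv:DataCreation", "prv:DataAccess"),
]


def disjointness_violations(triples):
    """Return (resource, class_a, class_b) for resources typed with both classes of a disjoint pair."""
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    found = []
    for resource, classes in types.items():
        for a, b in DISJOINT:
            if a in classes and b in classes:
                found.append((resource, a, b))
    return sorted(found)
```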

[back to overview]


(Deprecated) Class: Actor

This class is deprecated and will be removed from the Provenance Vocabulary in the next revision. prv:Actor was deprecated in the process of making the Provenance Vocabulary a specialization of W3C's PROV-O. Hence, use one of the more specific classes prv:HumanAgent and prv:NonHumanAgent instead; or, if such a specialization is unsuitable for the use case at hand, directly use the general class prov:Agent as defined by PROV-O.

identifier:http://purl.org/net/provenance/ns#Actor

[back to overview]


(Deprecated) Class: HumanActor

This class is deprecated and will be removed from the Provenance Vocabulary in the next revision. prv:HumanActor was renamed to prv:HumanAgent in the process of making the Provenance Vocabulary a specialization of W3C's PROV-O. Hence, use prv:HumanAgent instead.

identifier:http://purl.org/net/provenance/ns#HumanActor

[back to overview]


(Deprecated) Class: NonHumanActor

This class is deprecated and will be removed from the Provenance Vocabulary in the next revision. prv:NonHumanActor was renamed to prv:NonHumanAgent in the process of making the Provenance Vocabulary a specialization of W3C's PROV-O. Hence, use prv:NonHumanAgent instead.

identifier:http://purl.org/net/provenance/ns#NonHumanActor

[back to overview]


(Deprecated) Class: Execution

This class is deprecated and will be removed from the Provenance Vocabulary in the next revision. prv:Execution was deprecated in the process of making the Provenance Vocabulary a specialization of W3C's PROV-O. Hence, use prov:Activity instead.

identifier:http://purl.org/net/provenance/ns#Execution

[back to overview]


(Deprecated) Class: Artifact

This class is deprecated and will be removed from the Provenance Vocabulary in the next revision. prv:Artifact was deprecated in the process of making the Provenance Vocabulary a specialization of W3C's PROV-O. Hence, use prov:Entity (or the more specific prv:Immutable) instead.

identifier:http://purl.org/net/provenance/ns#Artifact

[back to overview]


3.2. General properties

Property: containedBy

This property refers to a data item that contains the described data item. Hence, this property refers to another data item of a larger granularity (e.g. an RDF triple is usually contained in an RDF graph).

Identifier:http://purl.org/net/provenance/ns#containedBy
OWL Type:ObjectProperty
Domain: prv:DataItem
Range: prv:DataItem

[back to overview]


Property: deployedSoftware

This property refers to the software that was run by a non-human agent (usually a service).

Identifier:http://purl.org/net/provenance/ns#deployedSoftware
OWL Type:ObjectProperty
Domain: prv:NonHumanAgent

[back to overview]


Property: serializedBy

This property refers to a file that serializes the described data item.

Identifier:http://purl.org/net/provenance/ns#serializedBy
OWL Type:ObjectProperty
Domain: prv:DataItem
Range: prv:File

[back to overview]


Property: performedBy

This property refers to an agent that/who performed an activity.

Identifier:http://purl.org/net/provenance/ns#performedBy
OWL Type:ObjectProperty
Sub-property of: prov:wasAssociatedWith
Domain:prov:Activity (specified for prov:wasAssociatedWith)
Range:prov:Agent (specified for prov:wasAssociatedWith)

[back to overview]


Property: operatedBy

This property refers to a human agent who was operating a non-human agent. For instance, a service provider operates a data providing service (see concept prv:DataProvidingService). Another example is a human agent who operates a non-human data creating agent.

Identifier:http://purl.org/net/provenance/ns#operatedBy
OWL Type:ObjectProperty
Sub-property of: prov:actedOnBehalfOf
Domain: prv:NonHumanAgent
Range: prv:HumanAgent
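
Domain and range declarations such as the ones above allow class membership to be inferred from property usage. The following non-normative sketch applies this idea to three of the general properties; the tuple representation of triples and the function name are assumptions made for illustration.

```python
# Domain and range declarations copied from the property tables in this section.
DOMAIN_RANGE = {
    "prv:operatedBy": ("prv:NonHumanAgent", "prv:HumanAgent"),
    "prv:containedBy": ("prv:DataItem", "prv:DataItem"),
    "prv:serializedBy": ("prv:DataItem", "prv:File"),
}


def infer_types(triples):
    """Return the rdf:type triples implied by the domain and range of each used property."""
    inferred = set()
    for s, p, o in triples:
        if p in DOMAIN_RANGE:
            domain, rng = DOMAIN_RANGE[p]
            inferred.add((s, "rdf:type", domain))
            inferred.add((o, "rdf:type", rng))
    return inferred
```

Given the single triple (ex:service, prv:operatedBy, ex:alice), the sketch infers that ex:service is a prv:NonHumanAgent and ex:alice is a prv:HumanAgent.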

[back to overview]


Property: completedAt

This property refers to the time an activity has been completed.

Identifier:http://purl.org/net/provenance/ns#completedAt
OWL Types:DatatypeProperty FunctionalProperty
Equivalent to: prov:endedAtTime
Domain:prov:Activity (specified for prov:endedAtTime)
Range:xsd:dateTime (specified for prov:endedAtTime)
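
Because the range is xsd:dateTime, the value of prv:completedAt is a typed literal. The following non-normative sketch builds such a literal in N-Triples style from a Python datetime. Requiring a timezone-aware value and normalizing to UTC are choices of this sketch, not requirements of the vocabulary, and the function name is hypothetical.

```python
from datetime import datetime, timezone

XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def completed_at_triple(activity: str, when: datetime):
    """Build a (subject, predicate, literal) triple for prv:completedAt."""
    if when.tzinfo is None:
        # Sketch-level choice: reject naive datetimes to avoid ambiguous times.
        raise ValueError("use a timezone-aware datetime")
    lexical = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (activity, "prv:completedAt", f'"{lexical}"^^<{XSD_DATETIME}>')
```

Since the property is functional, a description should carry at most one such triple per activity.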

[back to overview]


(Deprecated) Property: yieldedBy

This property is deprecated and will be removed from the Provenance Vocabulary in the next revision. prv:yieldedBy was deprecated in the process of making the Provenance Vocabulary a specialization of W3C's PROV-O. Hence, use prov:wasGeneratedBy instead.

Identifier:http://purl.org/net/provenance/ns#yieldedBy
OWL Types:ObjectProperty DeprecatedProperty

[back to overview]


(Deprecated) Property: involvedActor

This property is deprecated and will be removed from the Provenance Vocabulary in the next revision. prv:involvedActor was deprecated in the process of making the Provenance Vocabulary a specialization of W3C's PROV-O. Hence, use prov:wasAssociatedWith instead.

Identifier:http://purl.org/net/provenance/ns#involvedActor
OWL Types:ObjectProperty DeprecatedProperty

[back to overview]


(Deprecated) Property: employedArtifact

This property is deprecated and will be removed from the Provenance Vocabulary in the next revision. prv:employedArtifact was deprecated in the process of making the Provenance Vocabulary a specialization of W3C's PROV-O. Hence, use prov:used instead.

Identifier:http://purl.org/net/provenance/ns#employedArtifact
OWL Types:ObjectProperty DeprecatedProperty

[back to overview]


(Deprecated) Property: performedAt

This property has been renamed to prv:completedAt. Hence, prv:performedAt is deprecated and will be removed from the Provenance Vocabulary in the next revision.

Identifier:http://purl.org/net/provenance/ns#performedAt
OWL Types:DatatypeProperty DeprecatedProperty
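
The deprecation notes in Sections 3.1 and 3.2 name a replacement for each deprecated term. The following non-normative sketch rewrites a description accordingly; the tuple representation of triples and the function name are assumptions. Note that prv:Actor is mapped to the general prov:Agent here, whereas the note for prv:Actor prefers prv:HumanAgent or prv:NonHumanAgent where the kind of agent is known.

```python
# Replacements named in the deprecation notes of this specification.
REPLACEMENTS = {
    "prv:Actor": "prov:Agent",  # fallback only; prefer a more specific class
    "prv:HumanActor": "prv:HumanAgent",
    "prv:NonHumanActor": "prv:NonHumanAgent",
    "prv:Execution": "prov:Activity",
    "prv:Artifact": "prov:Entity",
    "prv:yieldedBy": "prov:wasGeneratedBy",
    "prv:involvedActor": "prov:wasAssociatedWith",
    "prv:employedArtifact": "prov:used",
    "prv:performedAt": "prv:completedAt",
}


def migrate(triples):
    """Replace every deprecated term in the given triples by its successor."""
    return [tuple(REPLACEMENTS.get(term, term) for term in triple) for triple in triples]
```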

[back to overview]


3.3. Data creation classes

Class: DataCreation

DataCreation is a concept that represents the execution of an activity by which data items have been created.

identifier:http://purl.org/net/provenance/ns#DataCreation
sub-class of: prov:Activity
disjoint with: prv:DataAccess
in domain of:prv:usedData prv:usedGuideline
in range of:prv:createdBy

[back to overview]


Class: CreationGuideline

CreationGuideline is a concept that represents a guideline used to guide the execution of a data creation. Examples of creation guidelines are transformation rules, mapping definitions, entailment rules, and database queries.

identifier:http://purl.org/net/provenance/ns#CreationGuideline
sub-class of: prov:Plan prv:DataItem
in range of:prv:usedGuideline
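
Since CreationGuideline has two direct super-classes, one of which has super-classes of its own, the full set of super-classes follows by transitivity. The following non-normative sketch computes that set from the sub-class statements in the class tables of this specification. Only axioms stated in this document are included (axioms of PROV-O itself are not), and the function name is hypothetical.

```python
# Direct sub-class statements from the class tables of this specification.
SUBCLASS_OF = {
    "prv:DataItem": {"prov:Entity", "ir:InformationRealization"},
    "prv:File": {"prov:Entity"},
    "prv:Immutable": {"prov:Entity"},
    "prv:CreationGuideline": {"prov:Plan", "prv:DataItem"},
    "prv:HumanAgent": {"prov:Agent"},
    "prv:NonHumanAgent": {"prov:Agent"},
    "prv:DataPublisher": {"prv:HumanAgent"},
    "prv:DataProvidingService": {"prv:NonHumanAgent"},
    "prv:DataCreation": {"prov:Activity"},
    "prv:DataAccess": {"prov:Activity"},
}


def superclasses(cls):
    """Return all direct and indirect super-classes of cls."""
    found = set()
    stack = [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found
```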

[back to overview]


3.4. Data creation properties

Property: createdBy

This property refers to the creation of a data item (or a file that serializes data items).

Identifier:http://purl.org/net/provenance/ns#createdBy
OWL Type:ObjectProperty
Sub-property of: prov:wasGeneratedBy
Domain: Union of prv:DataItem and prv:File
Range: prv:DataCreation
Property chain: prv:serializedBy, prv:createdBy
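
Reading the property chain entry above as an OWL 2 property chain axiom, a data item that is serialized by a file inherits the creation of that file. The following non-normative sketch derives the implied prv:createdBy triples; this reading of the table entry, the tuple representation of triples, and the function name are assumptions.

```python
def apply_created_by_chain(triples):
    """Infer (x, prv:createdBy, c) from (x, prv:serializedBy, f) and (f, prv:createdBy, c)."""
    triples = set(triples)
    created_by = {}
    for s, p, o in triples:
        if p == "prv:createdBy":
            created_by.setdefault(s, set()).add(o)
    inferred = set()
    for s, p, o in triples:
        if p == "prv:serializedBy":
            for creation in created_by.get(o, ()):
                inferred.add((s, "prv:createdBy", creation))
    # Report only triples that were not already stated.
    return inferred - triples
```

A single pass suffices here because the objects of prv:serializedBy are files, which are disjoint with data items.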

[back to overview]


Property: usedData

This property refers to a source data item that has been used during the creation of a data item. Examples of source data are the content of a document used for machine learning, the statements in a knowledge base used to entail a new statement, and the entries in a database used to answer a query. Note that all source data has provenance; we strongly encourage describing this provenance as well, at least as far as available information permits.

Identifier:http://purl.org/net/provenance/ns#usedData
OWL Type:ObjectProperty
Sub-property of: prov:used
Domain: prv:DataCreation
Range: prv:DataItem
Property chain: prvFiles:usedDataFile, inverse of prv:serializedBy
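
The chain listed above contains an inverse step: a data creation that used a file also used every data item serialized by that file. The following non-normative sketch evaluates a chain whose steps may be inverted; the helper names, the (property, inverse) step encoding, and the tuple representation of triples are assumptions made for illustration.

```python
def step(triples, prop, inverse):
    """Pairs (a, b) connected by prop, or by its inverse when inverse is True."""
    return {(o, s) if inverse else (s, o) for s, p, o in triples if p == prop}


def apply_chain(triples, chain, target):
    """Infer target triples from a chain given as a list of (property, inverse) steps."""
    triples = set(triples)
    pairs = step(triples, *chain[0])
    for prop, inverse in chain[1:]:
        following = step(triples, prop, inverse)
        # Join the pairs found so far with the next step on the shared node.
        pairs = {(a, c) for a, b in pairs for b2, c in following if b == b2}
    return {(a, target, c) for a, c in pairs} - triples


USED_DATA_CHAIN = [("prvFiles:usedDataFile", False), ("prv:serializedBy", True)]
```

Calling apply_chain(triples, USED_DATA_CHAIN, "prv:usedData") yields the implied prv:usedData triples; the same helper would apply to the analogous chain listed for prv:usedGuideline.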

[back to overview]


Property: usedGuideline

This property refers to a creation guideline which guided the execution of a data creation. Examples of creation guidelines are transformation rules, mapping definitions, entailment rules, and database queries. Note that all creation guidelines have provenance; we strongly encourage describing this provenance as well, at least as far as available information permits.

Identifier:http://purl.org/net/provenance/ns#usedGuideline
OWL Type:ObjectProperty
Sub-property of: prov:used
Domain: prv:DataCreation
Range: prv:CreationGuideline
Property chain: prvFiles:usedGuidelineFile, inverse of prv:serializedBy

[back to overview]


Property: precededBy

This property may be used to make the relationship between different versions of a data item explicit. More precisely, this property refers to an immediately preceding version of a data item; hence, the new version (i.e. the subject) has been created using the old version (i.e. the object). We strongly encourage describing this creation of the new version explicitly as well.

Identifier:http://purl.org/net/provenance/ns#precededBy
OWL Type:ObjectProperty
Sub-property of: the inverse of prov:wasRevisionOf; dcterms:replaces
Domain: prv:DataItem
Range: prv:DataItem
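
Because prv:precededBy links a version to its immediate predecessor, a version history can be reconstructed by following the property repeatedly. The following non-normative sketch does so under the assumption that each version has at most one stated predecessor, which this specification does not require; the function name and the tuple representation of triples are likewise hypothetical.

```python
def version_history(triples, item):
    """Follow prv:precededBy from item back to the oldest known version."""
    previous = {s: o for s, p, o in triples if p == "prv:precededBy"}
    history = [item]
    seen = {item}
    while history[-1] in previous:
        older = previous[history[-1]]
        if older in seen:  # guard against cyclic descriptions
            break
        history.append(older)
        seen.add(older)
    return history
```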

[back to overview]


3.5. Data access classes

Class: DataAccess

DataAccess is a concept that represents the completed execution of an activity by which an immutable data item has been retrieved from the Web.

identifier:http://purl.org/net/provenance/ns#DataAccess
sub-class of: prov:Activity
disjoint with: prv:DataCreation
in domain of:prv:accessedResource prv:accessedService
in range of:prv:retrievedBy
restrictions: only one prv:accessedResource, only one prv:accessedService
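
The restrictions above can be checked on a concrete description. The following non-normative sketch reads "only one" as "at most one" and reports data accesses with several distinct values; this reading is an assumption. Note also that the check treats distinct names as distinct resources, whereas an OWL reasoner could instead conclude that the values denote the same resource.

```python
from collections import defaultdict

RESTRICTED = ("prv:accessedResource", "prv:accessedService")


def cardinality_violations(triples):
    """Return (data access, property) pairs that have more than one distinct value."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p in RESTRICTED:
            values[(s, p)].add(o)
    return sorted(key for key, objs in values.items() if len(objs) > 1)
```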

[back to overview]


Class: DataProvidingService

DataProvidingService is a concept that represents a non-human agent - usually a Web service or a server - that processes data access requests and actually sends the requested Web representations over the Web.

identifier:http://purl.org/net/provenance/ns#DataProvidingService
sub-class of: prv:NonHumanAgent
super-class of: irw:Server web:Server web:Service
in domain of:prv:usedBy
in range of:prv:accessedService

[back to overview]


Class: DataPublisher

DataPublisher is a concept that represents entities such as persons, groups, or organizations who use a data providing service (see concept prv:DataProvidingService) to publish data on the Web.

identifier:http://purl.org/net/provenance/ns#DataPublisher
sub-class of: prv:HumanAgent
in range of:prv:usedBy

[back to overview]


3.6. Data access properties

Property: retrievedBy

This property refers to the data access by which an immutable entity has been retrieved from the Web. Each entity that has this property is a Web representation.

Identifier:http://purl.org/net/provenance/ns#retrievedBy
OWL Types:ObjectProperty FunctionalProperty
Sub-property of: prov:wasGeneratedBy
Domain: prv:Immutable; Union of prv:DataItem and prv:File
Range: prv:DataAccess

[back to overview]


Property: accessedResource

This property refers to the Web resource that has been accessed during the execution of a data access. More precisely, requesting the referenced Web resource yielded the representation that was retrieved by the corresponding prv:DataAccess execution.

Identifier:http://purl.org/net/provenance/ns#accessedResource
OWL Type:ObjectProperty
Domain: prv:DataAccess
Range: irw:WebResource

[back to overview]


Property: accessedService

This property refers to the service that provided the Web representation during the execution of a data access.

Identifier:http://purl.org/net/provenance/ns#accessedService
OWL Type:ObjectProperty
Sub-property of: prov:wasAssociatedWith
Domain: prv:DataAccess
Range: prv:DataProvidingService

[back to overview]


Property: usedBy

This property refers to a data publisher who used a data providing service at the time the provenance description refers to.

Identifier:http://purl.org/net/provenance/ns#usedBy
OWL Type:ObjectProperty
Sub-property of: prov:actedOnBehalfOf
Domain: prv:DataProvidingService
Range: prv:DataPublisher
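
Taken together, the data access terms describe how a Web representation was obtained. The following non-normative sketch assembles such a description and serializes it as N-Triples. The function names are hypothetical, all arguments are assumed to be absolute URIs, and the namespace constant is taken from the term identifiers in this section.

```python
PRV = "http://purl.org/net/provenance/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def describe_access(representation, access, resource, service, publisher):
    """Return triples describing how a Web representation was retrieved."""
    return [
        (representation, PRV + "retrievedBy", access),
        (access, RDF_TYPE, PRV + "DataAccess"),
        (access, PRV + "accessedResource", resource),
        (access, PRV + "accessedService", service),
        (service, PRV + "usedBy", publisher),
    ]


def to_ntriples(triples):
    """Serialize triples whose terms are all URIs, one N-Triples statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)
```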

[back to overview]


4. Design Decisions

Vocabulary design decisions:

  1. use owl:disjointWith to explicitly identify disjoint classes
  2. don't define inverse properties because they are bad for interoperability (see http://dowhatimean.net/2006/06/an-rdf-design-pattern-inverse-property-labels)
  3. rule to decide about the direction of a property (i.e. a link) when both are possible: it should be possible to describe provenance of a data item starting from the data item itself (i.e. the data item becomes subject in triples) and 'working' towards more distant provenance elements. Thus, the links should not be directed towards the data item.
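
A consequence of the third decision is that the provenance of a data item can be collected by following links in their stated direction only. The following non-normative sketch performs such a traversal breadth-first; the function name and the tuple representation of triples are assumptions made for illustration.

```python
from collections import deque


def provenance_closure(triples, start):
    """Collect every triple reachable from start by following links from subject to object."""
    outgoing = {}
    for s, p, o in triples:
        outgoing.setdefault(s, []).append((p, o))
    reached = []
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for p, o in outgoing.get(node, []):
            reached.append((node, p, o))
            if o not in seen:  # visit each provenance element once
                seen.add(o)
                queue.append(o)
    return reached
```

No inverse properties are needed for this traversal, which is in line with the second decision.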

5. References

6. Change log

Changes from rev.0.5.1 to rev.0.6

Changes from rev.0.5 to rev.0.5.1

Changes from rev.0.4 to rev.0.5

Changes from rev.0.3 to rev.0.4

Changes from rev.0.2 to rev.0.3

Changes from rev.0.1 to rev.0.2