Bathing Water Quality: Structure of the Published Linked Data
This document describes the structure of the UK Bathing Water Quality linked data that we produced for the Environment Agency.
As part of our work with data.gov.uk and the UK Location Programme we have been working to pilot the publication of both current and historic bathing water quality information as linked data. This data is available through SPARQL endpoints at data.gov.uk and TSO's OpenUPLabs and through data dumps (samples, sites, compliance_history).
The data covers the period up to the end of the 2010 season.
The Domain
The UK has a number of areas, typically beaches, that are designated as bathing waters where people routinely enter the water. The Environment Agency monitors and reports on the quality of the water at these bathing waters.
For each bathing water there is a sampling point near which the water is sampled roughly once a week during the bathing season. These samples are analysed and the water given a compliance classification of excellent, good or poor.
Data Structure
The data can be thought of as structured in 3 groups:
- There is basic reference data describing the bathing waters and sampling points
- There is a data set giving the rating for each bathing water for each year it has been monitored
- There is a data set giving the detailed weekly sampling results for each bathing water
This data is represented in RDF using the following namespaces and prefixes:
- bw: http://environment.data.gov.uk/def/bathing-water/
- bwq: http://environment.data.gov.uk/def/bathing-water-quality/
- loc-sp: http://location.data.gov.uk/def/ef/SamplingPoint/
- ref: http://reference.data.gov.uk/def/reference/
- qb: http://purl.org/linked-data/cube#
- ossr: http://data.ordnancesurvey.co.uk/ontology/spatialrelations/
- geo: http://www.w3.org/2003/01/geo/wgs84_pos#
- skos: http://www.w3.org/2004/02/skos/core#
- dcterms: http://purl.org/dc/terms/
- void: http://rdfs.org/ns/void#
- xsd: http://www.w3.org/2001/XMLSchema#
- rdfs: http://www.w3.org/2000/01/rdf-schema#
- rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
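As an illustration of how these prefixes are used, the following sketch expands a prefixed name such as bwq:sampleYear into a full URI and renders SPARQL PREFIX declarations. The table is copied from the list above, plus the standard W3C owl namespace, which is included only because owl:sameAs appears among the bathing water properties. The helper names expand and sparql_prologue are hypothetical and not part of the published data.

```python
# Prefix table taken from this document. The owl entry is the standard
# W3C OWL namespace, added because owl:sameAs is used further on.
PREFIXES = {
    "bw": "http://environment.data.gov.uk/def/bathing-water/",
    "bwq": "http://environment.data.gov.uk/def/bathing-water-quality/",
    "loc-sp": "http://location.data.gov.uk/def/ef/SamplingPoint/",
    "ref": "http://reference.data.gov.uk/def/reference/",
    "qb": "http://purl.org/linked-data/cube#",
    "ossr": "http://data.ordnancesurvey.co.uk/ontology/spatialrelations/",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "dcterms": "http://purl.org/dc/terms/",
    "void": "http://rdfs.org/ns/void#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl": "http://www.w3.org/2002/07/owl#",
}


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'bwq:sampleYear' to a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return PREFIXES[prefix] + local


def sparql_prologue(*prefixes: str) -> str:
    """Render one SPARQL PREFIX declaration per requested prefix."""
    return "\n".join(f"PREFIX {p}: <{PREFIXES[p]}>" for p in prefixes)
```

For example, expand("qb:dataSet") yields the full URI of the data cube dataSet property, and sparql_prologue("bwq", "qb") gives the two declarations needed by a query over the observations described later.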
General Reference Data
Bathing Waters
Each bathing water is identified by two URIs of the form:
- http://environment.data.gov.uk/id/bathing-water/{EU bathing water id}
- http://environment.data.gov.uk/id/bathing-water/{name}
The first of these has a last segment based on an EU bathing water identifier. The second has a last segment based on the name of the bathing water.
Each bathing water has the following properties:
| rdf:type |
|
| skos:prefLabel |
|
| rdfs:label |
|
| skos:notation |
|
| bw:eubwidNotation |
|
| loc-sp:samplingPoint |
|
| ref:uriSet |
|
| owl:sameAs |
|
Sampling Points
Each sampling point is identified by a URI of the form:
and has the following properties:
| skos:prefLabel |
|
| rdfs:label |
|
| skos:notation |
|
| loc-sp:samplePointNotation |
|
| bw:bathingWater |
|
| geo:lat, geo:long |
|
| ossr:easting, ossr:northing |
|
| ref:uriSet |
|
Compliance Classification
Compliance classifications are modelled as RDF resources and described using the SKOS vocabulary. They have URIs of the form:
They have the following properties:
| rdf:type |
|
| rdfs:label |
|
| rdfs:isDefinedBy |
|
| skos:prefLabel |
|
| skos:inScheme |
|
| skos:definition |
|
| skos:topConceptOf |
|
| skos:notation |
|
| bwq:complianceCodeNotation |
|
| dcterms:source |
|
The following compliance codes are defined:
| bwq:G |
|
| bwq:I |
|
| bwq:F |
|
| bwq:C |
|
| bwq:N |
|
Data Sets
There are two datasets, the annual compliance dataset and the samples dataset. Each of these is modelled as an n-dimensional matrix using the data cube vocabulary.
Annual Compliance Assessment Dataset
The annual compliance assessment dataset has two dimensions: the year and the sampling point. The data set is identified by the URI:
The dataset resource has the following properties:
| rdf:type |
|
| rdfs:label |
|
| dcterms:description |
|
| dcterms:modified |
|
| dcterms:license |
|
| dcterms:source |
|
| void:vocabulary |
|
| void:uriRegexPattern |
|
| void:dataDump |
|
| qb:structure |
|
| qb:slice |
|
Slices
The annual compliance assessment dataset has two kinds of slices or subsets of the dataset:
| all the observations for a specific sampling point | http://environment.data.gov.uk/data/bathing-water-quality/compliance/slice/point/{id} |
| all the observations for a specific year | http://environment.data.gov.uk/data/bathing-water-quality/compliance/slice/year/{year} |
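The two slice URI templates above can be filled in mechanically. The sketch below is a minimal illustration, assuming that {id} is the sampling point identifier and {year} is a four-digit year; the function names and the example identifier 12345 are hypothetical.

```python
from urllib.parse import quote

# Common stem of the two slice URI templates in the table above.
SLICE_BASE = (
    "http://environment.data.gov.uk/data/bathing-water-quality/compliance/slice"
)


def point_slice_uri(point_id: str) -> str:
    """URI of the slice holding all observations for one sampling point."""
    # Percent-encode the id so it is always a single path segment.
    return f"{SLICE_BASE}/point/{quote(str(point_id), safe='')}"


def year_slice_uri(year: int) -> str:
    """URI of the slice holding all observations for one year."""
    return f"{SLICE_BASE}/year/{int(year)}"


# Hypothetical sampling point id, used only for illustration.
print(point_slice_uri("12345"))
print(year_slice_uri(2010))
```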
Each slice is a resource with the following properties:
| rdf:type |
|
| rdfs:label |
|
| qb:sliceStructure |
|
| qb:observation |
|
| bwq:samplingPoint or bwq:sampleYear |
|
Annual Compliance Assessment Observations
The value of each cell in the dataset matrix is the compliance code for a given sampling point and year. Following the data cube vocabulary model, each cell in the matrix is a resource identified by a URL of the form:
- http://environment.data.gov.uk/data/bathing-water-quality/compliance/point/{sampling point id}/year/{year}
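Because the two dimensions are encoded in the observation URL, a cell can be addressed, and a URL decoded back to its coordinates, without querying the data. The following is a rough sketch under the assumptions that the sampling point id contains no slash and that the year is four digits; CellKey and both functions are hypothetical names, and the identifier 12345 is made up.

```python
import re
from typing import NamedTuple

OBS_BASE = "http://environment.data.gov.uk/data/bathing-water-quality/compliance"

# Mirrors the template .../compliance/point/{sampling point id}/year/{year}
OBS_PATTERN = re.compile(
    "^" + re.escape(OBS_BASE) + r"/point/(?P<point>[^/]+)/year/(?P<year>\d{4})$"
)


class CellKey(NamedTuple):
    """Coordinates of one cell in the two-dimensional compliance matrix."""
    point_id: str
    year: int


def observation_uri(key: CellKey) -> str:
    """Build the observation URL for a sampling point and year."""
    return f"{OBS_BASE}/point/{key.point_id}/year/{key.year}"


def parse_observation_uri(uri: str) -> CellKey:
    """Recover the sampling point id and year from an observation URL."""
    match = OBS_PATTERN.match(uri)
    if match is None:
        raise ValueError(f"not a compliance observation URL: {uri!r}")
    return CellKey(match.group("point"), int(match.group("year")))


# The matrix itself can then be held as a mapping from cell to compliance
# code. Storing the code's local name (here "G") is an illustrative choice.
matrix = {CellKey("12345", 2010): "G"}
```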
Each such observation has the following properties:
| rdf:type |
|
| rdfs:label |
|
| bwq:samplingPoint |
|
| dcterms:source |
|
| bwq:bathingWater |
|
| bwq:sampleYear |
|
| bwq:complianceClassification |
|
| qb:dataSet |
|
| bwq:inYearDetail |
|
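To show how these observation properties fit together, here is a sketch that builds a SPARQL query returning the compliance classification for each year at one sampling point. It only assembles the query text and does not contact an endpoint. The dataset URI and sampling point URI are passed in as parameters because their exact forms are not reproduced here; the function names are hypothetical.

```python
BWQ = "http://environment.data.gov.uk/def/bathing-water-quality/"
QB = "http://purl.org/linked-data/cube#"


def _check_uri(uri: str) -> str:
    """Reject strings that cannot be written safely between < and >."""
    if not uri or any(ch.isspace() or ch in '<>"{}|^`\\' for ch in uri):
        raise ValueError(f"not a plain URI: {uri!r}")
    return uri


def compliance_history_query(dataset_uri: str, sampling_point_uri: str) -> str:
    """Build a query listing year and classification for one sampling point."""
    dataset = _check_uri(dataset_uri)
    point = _check_uri(sampling_point_uri)
    return "\n".join([
        f"PREFIX bwq: <{BWQ}>",
        f"PREFIX qb: <{QB}>",
        "SELECT ?year ?classification",
        "WHERE {",
        # qb:dataSet restricts the match to the annual dataset, since the
        # in-season observations carry the same three bwq properties.
        f"  ?obs qb:dataSet <{dataset}> ;",
        f"       bwq:samplingPoint <{point}> ;",
        "       bwq:sampleYear ?year ;",
        "       bwq:complianceClassification ?classification .",
        "}",
        "ORDER BY ?year",
    ])
```

The resulting string can be submitted to either SPARQL endpoint with any HTTP or SPARQL client.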
In-Season Sample Assessment Dataset
The in-season sample assessment dataset has three dimensions:
- the year in which the sample was taken
- the week in which the sample was taken
- the sampling point at which the sample was taken.
Each observation has 6 measures:
- total coliform count
- faecal coliform count
- faecal streptococci count
- enterovirus count
- salmonella present
- sample classification
Each observation can also have the following attributes:
- the time the sample was taken
- whether there was an abnormal weather exception
- a total coliform count qualifier
- a faecal coliform count qualifier
- a faecal streptococci qualifier
- an enterovirus qualifier
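The split into dimensions, measures and attributes can be pictured as a single record type. The sketch below is an assumption-laden illustration rather than the published schema: the field names, the Python types and the choice to make every measure and attribute optional are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass(frozen=True)
class SampleObservation:
    # Dimensions: together these locate one cell in the matrix.
    year: int
    week: int
    sampling_point: str
    # Measures: the six observed values.
    total_coliform_count: Optional[int] = None
    faecal_coliform_count: Optional[int] = None
    faecal_streptococci_count: Optional[int] = None
    enterovirus_count: Optional[int] = None
    salmonella_present: Optional[str] = None
    sample_classification: Optional[str] = None
    # Attributes: context needed to interpret the measures.
    sample_time: Optional[datetime] = None
    abnormal_weather_exception: Optional[bool] = None
    total_coliform_qualifier: Optional[str] = None
    faecal_coliform_qualifier: Optional[str] = None
    faecal_streptococci_qualifier: Optional[str] = None
    enterovirus_qualifier: Optional[str] = None

    def cell_key(self) -> Tuple[int, int, str]:
        """Return the (year, week, sampling point) coordinates."""
        return (self.year, self.week, self.sampling_point)


# Made-up values, for illustration only.
obs = SampleObservation(year=2010, week=23, sampling_point="12345",
                        faecal_coliform_count=10,
                        faecal_coliform_qualifier="lessThan")
```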
The data set has the URI:
The dataset resource has the following properties:
| rdf:type |
|
| rdfs:label |
|
| dcterms:description |
|
| dcterms:modified |
|
| dcterms:license |
|
| dcterms:source |
|
| void:vocabulary |
|
| void:uriRegexPattern |
|
| void:dataDump |
|
| qb:structure |
|
| qb:slice |
|
Slices
The in-season sample assessment dataset has the following kinds of slices through the data:
| samples for a given sampling point | |
| samples for a given week | |
| samples for a given year | |
| samples for a given year and sampling point | |
| latest samples for each sampling point | |
Each slice is a resource with the following properties:
| rdf:type |
|
| rdfs:label |
|
| qb:sliceStructure |
|
| qb:observation |
|
| bwq:samplingPoint or bwq:sampleYear or bwq: |
|
In-Season Sample Assessment Observation
In the in-season sample assessment data set, each cell in the dimensional matrix has the 6 measures named above and their associated attributes. Following the data cube vocabulary model, each cell in the matrix is a resource identified by a URL of the form:
The record date is the date at which the information was published. If a sample is reanalysed and different results published, a new observation with a different record date will be created.
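One consequence of this is that a consumer may see several observations for the same sample and will usually want the most recently published one. A minimal sketch of that selection follows, assuming each observation has already been reduced to a small record; SampleRecord, its fields and the sample values are hypothetical.

```python
from datetime import date
from typing import Dict, Iterable, NamedTuple, Tuple


class SampleRecord(NamedTuple):
    point_id: str
    year: int
    week: int
    record_date: date           # date the result was published
    faecal_coliform_count: int  # one measure, kept for illustration


SampleKey = Tuple[str, int, int]


def latest_records(records: Iterable[SampleRecord]) -> Dict[SampleKey, SampleRecord]:
    """Keep, for each (point, year, week), the record published last."""
    latest: Dict[SampleKey, SampleRecord] = {}
    for rec in records:
        key = (rec.point_id, rec.year, rec.week)
        current = latest.get(key)
        if current is None or rec.record_date > current.record_date:
            latest[key] = rec
    return latest


# Made-up data: the week 23 sample was re-analysed and republished.
records = [
    SampleRecord("12345", 2010, 23, date(2010, 6, 10), 120),
    SampleRecord("12345", 2010, 23, date(2010, 6, 24), 95),
    SampleRecord("12345", 2010, 24, date(2010, 6, 17), 40),
]
current = latest_records(records)
```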
Each observation has the following properties:
| rdf:type |
|
| rdfs:label |
|
| bwq:samplingPoint |
|
| dcterms:source |
|
| bwq:bathingWater |
|
| bwq:sampleDateTime |
|
| bwq:sampleYear |
|
| bwq:sampleWeek |
|
| bwq:faecalColiformCount |
|
| bwq:faecalColiformQualifier |
|
| bwq:totalColiformCount |
|
| bwq:totalColiformQualifier |
|
| bwq:faecalStreptococciCount |
|
| bwq:faecalStreptococciQualifier |
|
| bwq:entrovirusCount |
|
| bwq:entrovirusQualifier |
|
| bwq:salmonellaPresent |
|
| bwq:complianceClassification |
|
| bwq:abnormalWeatherException |
|
| qb:dataSet |
|
| rdfs:comment |
|
| dcterms:created |
|
Count Qualifiers
The observation properties bwq:faecalColiformQualifier, bwq:totalColiformQualifier, bwq:faecalStreptococciQualifier and bwq:entrovirusQualifier specify how their corresponding count properties should be interpreted. The values of these properties are instances of bwq:CountQualifier and are defined as SKOS concepts. The following bwq:CountQualifiers are defined:
| bwq:moreThan |
|
| bwq:lessThan |
|
| bwq:actual |
|
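A count therefore only makes sense when read together with its qualifier. The sketch below pairs the two for display. The reading of each qualifier (greater than, less than, exact value) is inferred from the concept names and should be treated as an assumption; the enum and function names are hypothetical.

```python
from enum import Enum

BWQ = "http://environment.data.gov.uk/def/bathing-water-quality/"


class CountQualifier(Enum):
    # Values are the local names of the three bwq concepts listed above.
    MORE_THAN = "moreThan"
    LESS_THAN = "lessThan"
    ACTUAL = "actual"


# Assumed meaning of each qualifier, inferred from its name.
_SYMBOL = {
    CountQualifier.MORE_THAN: ">",
    CountQualifier.LESS_THAN: "<",
    CountQualifier.ACTUAL: "",
}


def qualifier_from_uri(uri: str) -> CountQualifier:
    """Map a full bwq qualifier URI to the matching enum member."""
    if not uri.startswith(BWQ):
        raise ValueError(f"not in the bwq namespace: {uri!r}")
    return CountQualifier(uri[len(BWQ):])


def describe_count(count: int, qualifier: CountQualifier) -> str:
    """Render a count with its qualifier, e.g. '<10' or '250'."""
    return f"{_SYMBOL[qualifier]}{count}"
```

With this reading, a faecal coliform count of 10 with bwq:lessThan would be displayed as "<10", meaning the true value is below the reported figure.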
Presence
The presence or absence of a substance may be indicated by an instance of bwq:Presence, which has a richer set of values than just true or false. Instances of bwq:Presence are SKOS concepts. The following instances of bwq:Presence are defined:
| bwq:present |
|
| bwq:not-present |
|
| bwq:not-accessed |
|
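In application code the three values map naturally onto a tri-state rather than a plain boolean. The following sketch assumes that bwq:not-accessed means no determination was made, which is an interpretation of the name and not something stated above; the Presence enum and as_bool method are hypothetical.

```python
from enum import Enum
from typing import Optional


class Presence(Enum):
    # Values are the local names of the three bwq concepts listed above.
    PRESENT = "present"
    NOT_PRESENT = "not-present"
    NOT_ACCESSED = "not-accessed"

    def as_bool(self) -> Optional[bool]:
        """True or False when a result exists, None otherwise (assumed)."""
        if self is Presence.PRESENT:
            return True
        if self is Presence.NOT_PRESENT:
            return False
        return None
```

Returning None for the third value keeps "no result" distinct from "not present", so a caller cannot mistake a missing determination for a clean sample.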
