Introducing our Data Integration and Publishing Platform service

We offer a number of services to support the UK public sector through GCloud13. This post highlights our cloud software offering, the Data Integration & Publishing Platform, and the associated Data Integration & Publishing support service.

Our data integration and publishing platform enables organisations to transform and integrate data from a wide variety of systems and then make that integrated data available for consumption through easy-to-use APIs. Customised data transformations can be used to ingest data from a wide variety of sources including near-real-time data streams like sensor networks.

Service Description 

The Epimorphics Data Integration and Publishing Platform is widely used for integrating and linking open and non-open data. For example, it supports the Food Standards Agency (data.food.gov.uk), among others.

Features

  • flexible, high performance data storage
  • fully standards-compliant linked data publication platform
  • widely used within the UK public sector
  • replicated for fault-tolerance and scalability
  • examples include publishing near-real-time data
  • ingest data from a wide variety of sources
  • data quality checks and validation integrated with data flows
  • data integration – high-performance with large data volumes
  • hosted and managed enterprise Data Platform Service
  • enable advanced front-end app interfaces beyond the standard UI

Benefits

  • robust, reliable publication of sustainable, trusted and usable 5-star data
  • adaptive/flexible platform can be grown to meet changing needs
  • deployment flexibility on the cloud or within your own infrastructure
  • provide customers with integrated, live updates twenty-four hours a day
  • build data integration into your transforming organisation
  • bring data from your legacy systems into one place
  • publish high-quality data that is actually used
  • enable teams to start integrating, managing, curating and using data
  • data sharing via flexible APIs and bulk download
  • create and deliver services with user needs at their heart

We offer the platform as a fully hosted and managed service for collecting, processing, integrating, merging, publishing and using data. By default we provide a hosted service on top of Amazon Web Services (AWS).

The platform includes:

  • a Linked Data API engine, providing access to the data in several developer-friendly formats (including JSON and CSV) and human-readable web pages
  • customisable text search
  • triple store for storing data as RDF
  • a fully SPARQL 1.1-compliant endpoint (made accessible as an optional extra)
  • a scale-out, fault tolerant runtime platform
  • a data management system, to enable clients to load their own data in source or RDF formats
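
As an illustration of how a client might talk to a SPARQL 1.1 endpoint such as the one listed above, the following minimal sketch builds (but does not send) a query request following the standard SPARQL 1.1 Protocol. The endpoint URL and the query are assumptions made up for this example; they are not addresses or data exposed by the platform.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint URL, used only for illustration.
ENDPOINT = "https://example.org/sparql"

# Hypothetical query: list up to ten resources with their labels.
QUERY = """
SELECT ?s ?label WHERE {
  ?s <http://www.w3.org/2000/01/rdf-schema#label> ?label .
} LIMIT 10
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL 1.1 Protocol query request using HTTP GET."""
    # The protocol passes the query text in the "query" URL parameter.
    url = endpoint + "?" + urlencode({"query": query.strip()})
    # Ask for the standard SPARQL JSON result set format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send the query; that step is left out here because the endpoint is a placeholder.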

Optionally, we can provide additional upload mechanisms and automation that integrate with clients’ existing workflows to support “business as usual” publication of linked data, including support for near-real-time data streams.
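
To give a feel for what mapping source data to RDF can involve, here is a minimal sketch that turns CSV rows into N-Triples lines. It is not the platform's own transformation tooling: the namespace, column names and sample readings are all hypothetical.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace, property and columns, chosen only for illustration.
BASE = "https://example.org/"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"

SAMPLE_CSV = """station,reading
s1,0.42
s2,1.30
"""


def escape_literal(text: str) -> str:
    """Escape characters that may not appear raw in an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def rows_to_ntriples(csv_text: str) -> list[str]:
    """Map each CSV row to one N-Triples statement."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Percent-encode the identifier so it is safe inside an IRI.
        station_id = quote(row["station"], safe="")
        value = escape_literal(row["reading"])
        triples.append(
            f"<{BASE}station/{station_id}> <{BASE}def/reading> "
            f'"{value}"^^<{XSD_DECIMAL}> .'
        )
    return triples


for line in rows_to_ntriples(SAMPLE_CSV):
    print(line)
```

A real mapping would typically also validate values and attach timestamps and units; the sketch keeps to one property per row to stay readable.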

To ensure effective and reliable maintenance, we follow the ‘infrastructure as code’ approach, meaning that each part of the system can be easily rebuilt through automated processes and infrastructure definitions can be managed as version-controlled artefacts in a similar way to software. 

The platform is customisable and can also host applications running on top of the data.

Flow Diagram - From left to right: two data source boxes flow into Data ingest and mapping, which flows into Data store, which flows into Data API and access. Underneath these is an indicative Monitoring box. The Data API and access box has dotted-line arrows to (1) data sharing with an external data user (data scientist) and (2) a data application and then a data user (end user).

Our Data Integration and Publishing support service provides setup and support services for the data integration and publishing platform, including data modelling and preparation, platform configuration and custom data presentations, as well as migration, testing and ongoing support.  Additionally, our Epimorphics Data Architecture & Strategy support service provides support for data and information architecture strategy, design and delivery using cloud / hybrid cloud approaches.

Compliance with Open Standards

As a linked-data company we have a passion for open standards, and we have played a key part in designing and developing many of the web standards and vocabularies around linked data. For more information on our open standards work, see our website at www.epimorphics.com. Linked data depends crucially on the correct implementation of the relevant open standards. Our platform fully complies with all the relevant standards, notably:

  • RDF syntaxes: RDF/XML, Turtle, N-Triples, JSON-LD
  • RDF 1.1 Turtle
  • SPARQL 1.1 Query
  • SPARQL 1.1 result set formats (XML, JSON, CSV, TSV)
  • SPARQL 1.1 Update
  • CSV
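
Because these formats are standardised, generic client code can consume them. As a minimal sketch, the following flattens a document in the SPARQL 1.1 JSON result set format into plain rows. The sample document is invented for illustration and does not come from the platform.

```python
import json

# Invented sample in the SPARQL 1.1 Query Results JSON Format.
SAMPLE_RESULTS = """
{
  "head": {"vars": ["s", "label"]},
  "results": {
    "bindings": [
      {
        "s": {"type": "uri", "value": "https://example.org/id/1"},
        "label": {"type": "literal", "value": "First", "xml:lang": "en"}
      },
      {
        "s": {"type": "uri", "value": "https://example.org/id/2"}
      }
    ]
  }
}
"""


def bindings_to_rows(document: str) -> list[dict]:
    """Flatten SPARQL JSON results into one dict per solution."""
    data = json.loads(document)
    variables = data["head"]["vars"]
    rows = []
    for binding in data["results"]["bindings"]:
        # A variable may be unbound in a given solution; use None for it.
        rows.append(
            {var: binding[var]["value"] if var in binding else None for var in variables}
        )
    return rows


rows = bindings_to_rows(SAMPLE_RESULTS)
print(rows)
```

The sketch discards each term's type, datatype and language tag for brevity; a client that needs them can read the other keys of each binding.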

Example UK public sector GCloud users

Our Data Integration & Publishing Platform service is an extension of our Linked Data Publishing Platform, which has been, and continues to be, used by a number of organisations across the public sector. The Food Standards Agency uses our Data Integration & Publishing Platform service for a number of its internal and external data services including:

Decorative image - screenshot displayed on a MacBook pro screen. Screenshot of the Food Standards Agency’s Unified View of food and feed establishment data.

More information

For more information, contact us or see our GCloud Service Offerings, our Data Platform page and our API Libraries page.