The RDG Locations Proof of Concept (PoC) brings together multiple sources of locations data from various RDG and NR services into a single service. It allows consumers to search for codes associated to a location description, or to request data associated to an NLC, CRS, TIPLOC or STANOX code.


This is a '''''Proof of Concept''''' - it is intended to evaluate the potential usefulness of such an API and should not be incorporated into any services, as it may be withdrawn, or be unavailable, at any time.


= API =
The Locations PoC API has two functions:
:* A search by location name/description, to return codes associated to locations
:* A lookup by NLC, CRS, TIPLOC or STANOX, to return data associated to the specified code
 
== Access ==
The API is provided as a RESTful JSON API and is available from the following endpoints. Both endpoints '''only''' accept a POST request with a simple JSON body:
:* Description search
:** Endpoint: https://0hvzyzu9q6.execute-api.eu-west-1.amazonaws.com/beta/locs-desc
:** JSON Body: e.g. {"id" : "leeds"}
 
:* Code search
:** Endpoint: https://0hvzyzu9q6.execute-api.eu-west-1.amazonaws.com/beta/locs-js
:** JSON Body: e.g. {"id" : "LDS", "type" : "C"}
 
== Security ==
As this is a proof of concept, a single access token has been created for use by the open community. This token needs to be included as an 'Auth-Token' header within the HTTPS request, as shown in the example below:
 
:* Auth-Token: ea8b2166-058c-4689-a72c-dd4b9b84cb82
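
As a minimal sketch of how the pieces fit together (endpoint, JSON body and Auth-Token header), the following Python example POSTs a description search using the widely-used requests library; the only values it relies on are those published above.

<syntaxhighlight lang="python">
import requests

# Endpoint and open-community token exactly as published above (Proof of Concept only)
DESC_ENDPOINT = "https://0hvzyzu9q6.execute-api.eu-west-1.amazonaws.com/beta/locs-desc"
AUTH_TOKEN = "ea8b2166-058c-4689-a72c-dd4b9b84cb82"

def search_by_description(text):
    """POST a description search and return the parsed JSON response."""
    response = requests.post(
        DESC_ENDPOINT,
        json={"id": text},                   # simple JSON body, e.g. {"id": "leeds"}
        headers={"Auth-Token": AUTH_TOKEN},  # token supplied as the 'Auth-Token' header
        timeout=30,
    )
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    for location in search_by_description("leeds"):
        print(location)
</syntaxhighlight>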
 
== Functions ==
=== Search by Location ===
: https://0hvzyzu9q6.execute-api.eu-west-1.amazonaws.com/beta/locs-desc<BR/>
: e.g. {"id" : "leeds"}
 
The search by location function allows consumers to request a list of locations matching a specified string. The request is a simple JSON object which must be POSTed to the above endpoint. The object takes the form:
:{"id" : ''value''}
where the value must be at least 3 characters in length.
 
Once submitted, the back office service will match the requested string against the ''initial'' characters of each word within location names in its database and return all matching results. For example, if a user posts {"id" : "cross"} to the endpoint the following stations would match and be returned:
 
:* CrossHill
:* CrossKeys
:* Kirby Cross
:* London Kings Cross
:* etc.
 
The response will consist of a JSON array of objects, with each object containing the matched location name plus at least one of:
 
:* NLC
:* CRS
:* TIPLOC
:* STANOX
 
'''Note:''' The response is determined by location descriptions as set in multiple data sources. So the same location may appear more than once if it has been spelt differently in different feeds. For example, searching for {"id":"clapham junction"} will return:
 
:[
::  {"description": "CLAPHAM JUNCTION LONDON", "nlc": "5595", "crs": "CLJ", "tiploc": "CLPHMJN", "stanox": "87219"},
::  {"description": "CLAPHAM JUNCTION", "nlc": "5595", "crs": "CLJ", "tiploc": "CLPHMJN", "stanox": "87219"}
:]
 
Whilst this is clearly the same location, as evidenced by the codes, it is shown twice because of the different descriptions.
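
Consumers who want one entry per physical location can therefore collapse results that differ only in description. The sketch below is illustrative only: it reuses the search_by_description helper from the earlier example and assumes the lower-case JSON keys shown in the response above.

<syntaxhighlight lang="python">
def dedupe_locations(results):
    """Collapse results that share the same NLC/CRS/TIPLOC/STANOX codes,
    keeping the first description seen for each combination."""
    unique = {}
    for loc in results:
        # Not every result carries every code, so missing keys default to None
        key = (loc.get("nlc"), loc.get("crs"), loc.get("tiploc"), loc.get("stanox"))
        unique.setdefault(key, loc)
    return list(unique.values())

# With the Clapham Junction example above, this yields a single entry
matches = dedupe_locations(search_by_description("clapham junction"))
</syntaxhighlight>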
 
=== Search by Code ===
: https://0hvzyzu9q6.execute-api.eu-west-1.amazonaws.com/beta/locs-js<BR/>
: e.g. {"id" : "LDS", "type" : "C"}
 
The search by code function allows consumers to specify either an NLC, a CRS, a TIPLOC or a STANOX code to retrieve all information associated to that code from multiple data sources. The request is a simple JSON object which must be POSTed to the above endpoint. The object takes the form:
:{"id" : ''"value"'', "type" : ''"value"''}
where the id value must be between 3 and 7 characters in length and the type must be one of the following (an example request is shown after this list):
* C - CRS
* N - NLC
* T - TIPLOC
* S - STANOX
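
As an illustrative sketch only, a code lookup can be made in the same way as the description search; this reuses the requests import, AUTH_TOKEN and header from the earlier example, and the type letters map directly to the list above.

<syntaxhighlight lang="python">
CODE_ENDPOINT = "https://0hvzyzu9q6.execute-api.eu-west-1.amazonaws.com/beta/locs-js"

# Type letters as defined above
TYPE_CRS, TYPE_NLC, TYPE_TIPLOC, TYPE_STANOX = "C", "N", "T", "S"

def search_by_code(code, code_type):
    """POST a code lookup, e.g. search_by_code("LDS", TYPE_CRS) for Leeds."""
    response = requests.post(
        CODE_ENDPOINT,
        json={"id": code, "type": code_type},  # e.g. {"id" : "LDS", "type" : "C"}
        headers={"Auth-Token": AUTH_TOKEN},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
</syntaxhighlight>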
 
'''Note''': both CRS and NLC codes contain various dummy or pseudo codes used in different systems to represent logical locations. For example, the CRS dataset contains 3-character routeing guide group codes (e.g. G01 for London Group) and the NLC dataset contains private settlement locations (e.g. H123) and Cluster codes (e.g. Q001), both used within the fares service to represent pricing points.
 
The response to any of the four coding options contains some data common to all locations followed by data associated to that specific type (NLC, CRS, etc.) and location. The fields common to all locations are:
:* type (same as specified in request)
:* id (same as specified in request)
:* last_updated (date when record was last updated in database)
:* lu_action (action taken at last update, either 'I'nsert, 'A'mend or 'D'elete)
:* deleted (true if the code isn't currently in use)
:* description (location name and the source of the description)
:* source (master (primary) source of the location code)
:* assoc_locs (a list of location codes associated to this code, if any)
 
'''Note:''' The response will '''only''' contain data directly associated to the specified code. For example, if you request data for London Euston's NLC code ({"id" : "1444", "type" : "N"}), it will indicate in "assoc_locs" that CRS code EUS is associated to NLC 1444, but it will ''not'' return data associated to EUS, only data directly associated to 1444.
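
A consumer who wants the full picture for a location therefore has to follow the associated codes with further lookups. The sketch below is an assumption, not a documented pattern: it reuses the search_by_code helper above and guesses that each entry in "assoc_locs" carries "id" and "type" fields using the same letter scheme as the request - check the actual response (or the Swagger specification below) before relying on this shape.

<syntaxhighlight lang="python">
def lookup_with_associations(code, code_type):
    """Fetch a code's record plus the records for its associated codes.
    Assumes each assoc_locs entry looks like {"id": ..., "type": ...}."""
    primary = search_by_code(code, code_type)
    associated = []
    for assoc in primary.get("assoc_locs") or []:
        try:
            associated.append(search_by_code(assoc["id"], assoc["type"]))
        except (KeyError, TypeError, requests.HTTPError):
            # Skip entries that do not match the assumed shape, or lookups that fail
            continue
    return primary, associated

# e.g. lookup_with_associations("1444", TYPE_NLC) would also pull back the EUS record
</syntaxhighlight>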
 
==== NLC ====
In addition to the common information, if an NLC is requested then data from the following sources will be provided if it exists:
:* Corpus
:* Product Management Service (PMS) - Fares data
:* IPTIS Data Management Service (IDMS)
:* Retail Control Service (RCS)
:* Travelcard Remote Origin data
 
==== CRS ====
In addition to the common information, if a CRS code is requested then data from the following sources will be provided if it exists:
:* Routeing Guide (ENRG)
:* Knowledgebase (KB) - Stations data
 
==== TIPLOC ====
In addition to the common information, if a TIPLOC code is requested then data from the following sources will be provided if it exists:
:* Product Management Service (PMS) - Timetable and Station data
:* Darwin - Location reference data
 
==== STANOX ====
In addition to the common information, if a STANOX code is requested then data from the following sources will be provided if it exists:
:* Darwin - Berth Step Mapping data
 
== Specification ==
A Swagger 2.0 YAML file has been created to define the API and JSON files used within the PoC, found here: [https://wiki.openraildata.com/images/9/9f/MDM-Locs-beta-swagger.yaml.gz MDM-Locs-beta-swagger.yaml]
 
= Data Feed =
The underlying data for the API is also available in an S3 bucket as a series of CSV-formatted files, each representing a table in the Proof of Concept database. The CSV files can be found in the following location:
 
:* nrdp-v16-logs/mdm/
 
The AWS credentials required to access and download data from the S3 bucket are:
 
:* Access key ID: AKIAZCPIPGZPU4CWXD4I
:* Secret access key: please contact [mailto:online@raildeliverygroup.com online@raildeliverygroup.com]
:* Region: eu-west-1
 
UI tools exist (e.g. [https://www.cloudberrylab.com/explorer/amazon-s3.aspx Cloudberry]) which allow manual downloading of data from S3 (equivalent to tools such as [https://filezilla-project.org/ FileZilla] for FTP), and AWS also provides a Command Line Interface (CLI) to allow programmatic downloading of data from S3. For example, using the AWS CLI, the following command would download all the CSV files into a local directory (assuming you have already entered the above credentials using the CLI command aws configure):
 
:* aws s3 cp s3://nrdp-v16-logs/mdm/ '''/path/to/local-dir/''' --exclude "*" --include "*.csv" --exclude "*/*" --recursive
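
The same download can also be scripted. The Python sketch below is illustrative only: it assumes the boto3 library and the credentials above, and mirrors the CLI command's filters by copying every .csv file directly under the mdm/ prefix into a local directory of your choice; the secret access key still has to be obtained as described above.

<syntaxhighlight lang="python">
import os
import boto3

BUCKET = "nrdp-v16-logs"
PREFIX = "mdm/"
LOCAL_DIR = "local-dir"  # hypothetical destination directory

s3 = boto3.client(
    "s3",
    region_name="eu-west-1",
    aws_access_key_id="AKIAZCPIPGZPU4CWXD4I",
    aws_secret_access_key="<secret access key from RDG>",  # supplied on request, see above
)

os.makedirs(LOCAL_DIR, exist_ok=True)
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        relative = key[len(PREFIX):]
        # Mirror the CLI filters: only .csv files directly under mdm/, no sub-folders
        if relative.endswith(".csv") and "/" not in relative:
            s3.download_file(BUCKET, key, os.path.join(LOCAL_DIR, relative))
</syntaxhighlight>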
 
The CSV file names indicate the source of data (e.g. pms, kb, etc.) and there are 5 additional tables which form the MDM part of the PoC:
 
:* mdm_crss: unique list of CRS codes, from all sources
:* mdm_nlcs: unique list of NLCs, from all sources
:* mdm_tiplocs: unique list of TIPLOC codes, from all sources
:* mdm_stanoxs: unique list of STANOX codes, from all sources
:* mdm_loc_assocs: associations between different coding schemes (used in the API for the 'associated location' data)
 
For any consumers who use MySQL, this [https://wiki.openraildata.com/images/2/2c/Mdm.sql.gz SQL] file will create the relevant databases and tables ready to hold the data. For consumers who prefer another flavour of SQL, the SQL file should provide enough information to enable you to recreate tables, indexes etc. in the database of choice.
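
As a rough sketch only (not based on the actual schema in that SQL file), the downloaded CSV files could then be loaded into MySQL with LOAD DATA LOCAL INFILE. Everything here is an assumption to be adjusted: the pymysql connection details, a database named mdm, the convention that each CSV file name matches its table name, and the absence of a header row in the files.

<syntaxhighlight lang="python">
import glob
import os
import pymysql

# Hypothetical connection details; local_infile must be enabled on client and server
conn = pymysql.connect(host="localhost", user="user", password="password",
                       database="mdm", local_infile=True)

with conn.cursor() as cur:
    for path in glob.glob(os.path.join("local-dir", "*.csv")):
        table = os.path.splitext(os.path.basename(path))[0]  # e.g. mdm_crss.csv -> mdm_crss
        # Table names come from trusted local file names; the file path is parameterised
        cur.execute(
            f"LOAD DATA LOCAL INFILE %s INTO TABLE `{table}` "
            "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
            "LINES TERMINATED BY '\\n'",
            (path,),
        )
conn.commit()
conn.close()
</syntaxhighlight>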


{{Navtable-NreDataFeeds}}


[[Category:National Rail Enquiries Data Feeds]]
