SCHEDULE: Difference between revisions

From Open Rail Data Wiki
Additional technical information on schedules, stripped out example records to be more concise
==Overview==
= Schedule data =


The Schedule feed is an extract of train schedules from the Integrated Train Planning System.
== Overview ==


Schedule data cannot be obtained via Stomp. Instead, it is downloaded as gzip files from the Amazon S3 data buckets; each file consists of a collection of JSON strings.
The Schedule feed is an extract of train schedules from Network Rail's ITPS (Integrated Train Planning System), converted into JSON format for easier parsing.  Network Rail are not planning to make raw CIF files available.


The data consists of a primary set of data (rather large, can be 1.5GB in size) and a set of daily corrections that should be applied to the base data.
Schedule files are available for all passenger TOCs and for each TOC.  Two types of file are available - a 'full' file which contains a snapshot of all schedules, and an 'update' file which can be applied to a local database to bring it up-to-date with any changes.


The data only contains Passenger Train Information.
The [http://www.atoc.org/about-atoc/rail-settlement-plan/data-feeds/types-of-data CIF User Specification], which details the format of the CIF file, is available from ATOC's website. It will be useful to developers wishing to gain a deeper understanding of how train scheduling works, beyond the information contained here.
    Freight services are not included in the schedule; all messages containing FOC codes are filtered out.


== Obtaining the Data ==
== Downloading ==


Data is downloaded from Amazon S3. Each feed has a Bucket name and a File Name.
The schedule data, compressed using gzip, is downloaded from Amazon S3 via a private URL which is valid for a few minutes after generation.


Each bucket has one or more files available within it. Normally the FULL_DAILY buckets contain a single file (toc-full), whereas the UPDATE_DAILY buckets contain seven files, one for each day.
To request schedule data, send an HTTP request with your username and password to:


Data is obtained from the Amazon S3 URL
  https://datafeeds.networkrail.co.uk/ntrod/CifFileAuthenticate?type=bucket&day=file
    https://datafeeds.networkrail.co.uk/ntrod/CifFileAuthenticate?type=[bucket]&day=[file]


So for example
For example:
    https://datafeeds.networkrail.co.uk/ntrod/CifFileAuthenticate?type=CIF_ALL_FULL_DAILY&day=toc-full


Will give you the Full Schedule for All Regions for Today.
  https://datafeeds.networkrail.co.uk/ntrod/CifFileAuthenticate?type=CIF_ALL_FULL_DAILY&day=toc-full


You must either already be logged in to [https://datafeeds.networkrail.co.uk DataFeeds] in a web browser or, if using cURL, supply HTTP Basic Auth credentials (your email address and password, not your security key) and follow HTTP redirects.
Replace '''bucket''' with the name of the bucket, and '''file''' with the name of the file.  On successful authentication, you will receive a 302 redirect to the location of the schedule file.


== Data ==
== Data ==


Like the real-time data feeds, the Schedule data is split by train provider, and then into the full schedule for that day and the daily updates.
The schedule data contains a header row, a set of zero or more association records, a set of zero or more schedule records, and an end-of-file (EOF) record.


If you are building a local schedule database from scratch, or wiping your copy to build a fresh version:
Each association and schedule record has an action - either 'create' or 'delete'.  In full files, there will be no 'delete' records.


* First download and process the Full Daily.
Update files must be applied sequentially to a full file.
* Then, each day, download and process the Daily update for that day.


Files are normally updated around midnight UTC.
== Interpretation ==


A Daily Full file will only contain CREATE transactions, whereas an Update file can contain both CREATE and DELETE transactions.
=== Validity ===


Each file contains:
Associations and schedule validities are between a start date and an end date, and on particular days of the week.  They each have a Short Term Planning (STP) indicator field as follows:


* a Data/information line,
* '''C''' - Planned cancellation: the schedule does not apply on this date, and the train will not run.  Typically seen on public holidays when an alternate schedule applies, or on Christmas Day.
* a set of Schedule/Train Associations Transactions,
* '''N''' - STP schedule: similar to a permanent schedule, but planned through the Short Term Planning process
* a set of Schedules Transactions,
* '''O''' - Overlay schedule: an alteration to a permanent schedule
* an EOF message,
* '''P''' - Permanent schedule: a schedule planned through the Long Term Planning process


In Update files, DELETE transactions are normally listed before CREATE transactions.
Schedules may be overridden on a particular day as follows:


Files are newline-delimited JSON: each line is one JSON packet.
* A permanent schedule ('P') may be overridden by an overlay ('O') or planned cancellation ('C')
* An STP schedule ('N') may be overridden by a planned cancellation ('C')


Train UIDs (any field ending in train_uid) consist of one of the letters C, G, L, P, W or Y followed by an ID, for example C05307.
If two schedules appear to be valid for a particular day, the schedule with the lowest alphabetical STP indicator wins.
  Train UID is the unique identity number used as a key field within TSDB and CIF. The format is annnnn, where 'a' is one of C, G, L, P, W, Y.


=== Examples ===
=== Schedules ===


==== Header ====
A schedule comprises a header containing a schedule UID, data about the schedule (including whether it is a train, bus or ship) and validity dates, and an ordered list of locations and times at which a particular service should arrive, depart or pass.
    "JsonTimetableV1":
        "classification":"public",
        "timestamp":1343952450,
        "owner":"Network Rail",
        "Sender":
            "organisation":"Rockshore",
            "application":"NTROD",
            "component":"SCHEDULE",
        "Metadata":
            "type":"full",
            "sequence":0
Example from CIF_ALL_FULL_DAILY


The timestamp gives the last update time of the file as a UNIX timestamp: in this example, Friday 3rd August 2012 01:07:30 +0100. All data should be sent from the Rockshore organisation.
* Originating locations will always have a WTT departure time and optionally a public departure time
* Intermediate locations in a schedule will have a passing time if they are a mandatory timing point, or an arrival and departure time if the train carries out an activity at that location
* Terminating locations will always have a WTT arrival time and optionally a public arrival time, which may be some minutes later than the WTT time
* A location may have one or more activities associated with it - for example, '''U''' for locations where the train calls to pick up passengers (i.e. not available for alighting), '''D''' for locations where the train calls to set down passengers (i.e. not available for boarding).
* A location may have engineering, pathing or performance allowances


==== Association ====
=== Associations ===


===== Create =====
Associations are relationships between two schedules - a main train and an associated train.
    "JsonAssociationV1":
        "transaction_type":"Create",
        "main_train_uid":"C05307",
        "assoc_train_uid":"C05351",
        "assoc_start_date":"2011-12-11T00:00:00Z",
        "assoc_end_date":"2012-09-09T00:00:00Z",
        "assoc_days":"0000001",
        "category":"NP",
        "date_indicator":"S",
        "location":"HTRWTM4",
        "base_location_suffix":null,
        "assoc_location_suffix":null,
        "diagram_type":"T",
        "CIF_stp_indicator":"P"
===== Delete =====
    "JsonAssociationV1":
        "transaction_type":"Delete",
        "main_train_uid":"W36743",
        "assoc_train_uid":"W37173",
        "assoc_start_date":"2012-08-03T00:00:00Z",
        "location":"STPANCI",
        "base_location_suffix":null,
        "diagram_type":"T",
        "cif_stp_indicator":null


* The transaction type indicates whether this is a new entry to create or an old entry to delete.
There are three types of association:
* location is a [[Identifying_Stations|TIPLOC]] reference
* assoc_days indicates whether the association is valid on each day of the week (MTWTFSS)
* cif_stp_indicator indicates whether the entry is P(ermanent) or O(verlay)
  *  "the Permanent data is retained in addition to the Overlay, but the Overlay is assumed to supersede the Permanent position" (Page 28 CIF End User Spec)


=== Schedule ===
* '''NP''' - Next Train.  Not present for all schedules, but indicates the UID of the next service that the vehicle on this service will work
* '''JJ''' - Join.  Occurs at the end of the associated train's schedule.
* '''VV''' - Split.  Occurs at an intermediate location of the main train's schedule and indicates another train service that part of this train will form.


===== Create =====
Associations may be for the same day ('''S'''), or cross midnight either backward ('''P''') or forward ('''N''') depending on the date indicator field.
    "JsonScheduleV1":
        "CIF_bank_holiday_running":null,
        "CIF_stp_indicator":"P",
        "CIF_train_uid":"C24056",
        "applicable_timetable":"Y",
        "atoc_code":"GW",
        "new_schedule_segment":
            "traction_class":"",
            "uic_code":""
        "schedule_days_runs":"0000010",
        "schedule_end_date":"2012-12-08",
        "schedule_segment":
            "signalling_id":"1A35",
            "CIF_train_category":"XX",
            "CIF_headcode":"1234",
            "CIF_course_indicator":1,
            "CIF_train_service_code":"25397003",
            "CIF_business_sector":"??",
            "CIF_power_type":"HST",
            "CIF_timing_load":null,
            "CIF_speed":"125",
            "CIF_operating_characteristics":null,
            "CIF_train_class":"B",
            "CIF_sleepers":null,
            "CIF_reservations":"S",
            "CIF_connection_indicator":null,
            "CIF_catering_code":"C",
            "CIF_service_branding":"",
            "schedule_location": <snip>
        "schedule_start_date":"2011-12-17",
        "train_status":"P",
        "transaction_type":"Create"
 
* atoc_code can be looked up on [[TOC_Codes]] (this example is missing)
* Schedule Segment - signalling_id can be used to follow the train on the [[TD]] feed
* Schedule Segment - CIF_power_type indicates what is pulling the train
* Schedule Segment - CIF_speed is the top speed of the train
* Schedule Segment - CIF_sleepers indicates whether the service is a sleeper service
 
===== Delete =====
    "JsonScheduleV1":
        "CIF_train_uid":"C06309",
        "schedule_start_date":"2011-12-12",
        "CIF_stp_indicator":"P",
        "transaction_type":"Delete"
 
When performing a deletion, the keys provided in this packet can match multiple schedules, normally because different sets of schedule locations run at different times on different days.
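
A minimal sketch in Python of applying Create and Delete schedule transactions to an in-memory store. It assumes the three fields shown in the Delete example (CIF_train_uid, schedule_start_date, CIF_stp_indicator) form the matching key, and that a Delete removes every schedule sharing that key; the names schedule_key, apply_transaction and store are illustrative only.

```python
def schedule_key(record):
    """Key fields carried by both Create and Delete schedule records."""
    return (
        record["CIF_train_uid"],
        record["schedule_start_date"],
        record["CIF_stp_indicator"],
    )


def apply_transaction(store, record):
    """Apply one JsonScheduleV1 payload to `store` (dict: key -> list)."""
    key = schedule_key(record)
    action = record["transaction_type"]
    if action == "Create":
        # Several schedules may share one key, so keep a list per key.
        store.setdefault(key, []).append(record)
    elif action == "Delete":
        # Assumption: a Delete removes every schedule sharing the key.
        store.pop(key, None)
    else:
        raise ValueError(f"unexpected transaction_type: {action!r}")


store = {}
base = {
    "CIF_train_uid": "C06309",
    "schedule_start_date": "2011-12-12",
    "CIF_stp_indicator": "P",
}
apply_transaction(store, dict(base, transaction_type="Create"))
apply_transaction(store, dict(base, transaction_type="Create"))
stored_before_delete = len(store[schedule_key(base)])
apply_transaction(store, dict(base, transaction_type="Delete"))
```

Keeping a list per key is one way to cope with several schedules sharing the same key; a database table indexed on those three columns would serve the same purpose.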
 
====== Schedule Location ======
 
Times are given in hhmm format, for example 2005 for five minutes past 8 PM. A trailing H, as in 2052H, denotes an additional half minute (30 seconds).
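
A minimal sketch in Python of parsing these time strings, assuming a trailing H means an extra 30 seconds and that absent times arrive as null (None after JSON decoding). The function name parse_schedule_time is illustrative.

```python
from datetime import time


def parse_schedule_time(value):
    """Convert 'hhmm' or 'hhmmH' into a datetime.time; None stays None."""
    if value is None:
        return None
    half_minute = value.endswith("H")
    digits = value[:-1] if half_minute else value
    if len(digits) != 4 or not digits.isdigit():
        raise ValueError(f"unrecognised time: {value!r}")
    # Assumption: 'H' adds half a minute to the stated hh:mm.
    return time(int(digits[:2]), int(digits[2:]), 30 if half_minute else 0)
```

For example, parse_schedule_time("2052H") gives 20:52:30, and parse_schedule_time("2005") gives 20:05:00.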
 
A schedule contains zero or more schedule locations.
 
The Path key can describe the route the train is expected to take into a station. For example, Leeds Central when approached from the south has Paths A-F.
 
Platform Data is not always provided.
 
Schedule Locations come in three types
 
* LO - Train Origin
* LI - Stopping/Passing/Timing Point
* LT - Train Terminus
 
Stopping points contain both the Arrival and Departure time for the Train
 
As the data is an array, the Index of the Array can be used to help determine Stop Order.
 
======Train Origin======
    "location_type":"LO",
    "record_identity":"LO",
    "tiploc_code":"ABRDEEN",
    "tiploc_instance":null,
    "departure":"1350",
    "public_departure":"1350",
    "platform":"7",
    "line":null,
    "engineering_allowance":null,
    "pathing_allowance":null,
    "performance_allowance":null
 
As this is a departure-only entry, only the departure keys are present.
 
======Stopping Point======
    "location_type":"LI",
    "record_identity":"LI",
    "tiploc_code":"FNPK",
    "tiploc_instance":null,
    "arrival":"0657",
    "departure":"0658",
    "pass":null,
    "public_arrival":"0657",
    "public_departure":"0658",
    "platform":"3",
    "line":null,
    "path":null,
    "engineering_allowance":null,
    "pathing_allowance":null,
    "performance_allowance":null
 
This train arrives and is expected to leave about one minute later.
 
======Passing Point======
Some LI entries are passing points only, used for routing trains over points for specific paths or lines.
    "location_type":"LI",
    "record_identity":"LI",
    "tiploc_code":"KNGXBEL",
    "tiploc_instance":null,
    "arrival":null,
    "departure":null,
    "pass":"2052H",
    "public_arrival":null,
    "public_departure":null,
    "platform":null,
    "line":"FL2",
    "path":null,
    "engineering_allowance":null,
    "pathing_allowance":null,
    "performance_allowance":null
 
======Train Terminus======
    "location_type":"LT",
    "record_identity":"LT",
    "tiploc_code":"KNGX",
    "tiploc_instance":null,
    "arrival":"2054H",
    "public_arrival":"2100",
    "platform":"4",
    "path":null
 
As this is an arrival entry, only arrival keys are present.
 
=== EOF ===
    "EOF":true
 
Just a handy note to say you reached the end of the file, in case you obtained a broken download.
 
=== Further Information ===
 
The first ~4% of the Full Daily file contains schedule associations. These link multiple train UIDs with a primary train UID, which can be looked up in the schedules.
 
Due to the Size of the Full Daily, (1.5GB when gunzip'ed) it can take some time (about 2 hours) to import the data from the file.
 
A given Schedule entry, contains information about the schedule, including its Start and End dates, and can then contain one or more schedule stops, which describe the calling points for a train on its schedule.
These calling points will have an official arrival/departure time and a public arrival/departure time. When displaying data to the end user, it's probably best to use the public versions.
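
A minimal sketch in Python of listing the calling points to show to passengers, assuming each entry in schedule_location carries optional public_arrival and public_departure keys as in the example records above. The function name public_calls and the sample list are illustrative; the sample combines fields from different example records and is not a real schedule.

```python
def public_calls(schedule_locations):
    """Return (index, tiploc, public_arrival, public_departure) per public call."""
    calls = []
    for index, location in enumerate(schedule_locations):
        arrival = location.get("public_arrival")
        departure = location.get("public_departure")
        if arrival is None and departure is None:
            # No public times: a passing point or a non-public stop.
            continue
        calls.append((index, location["tiploc_code"], arrival, departure))
    return calls


# Illustrative data only, not a real schedule.
sample_locations = [
    {"location_type": "LO", "tiploc_code": "ABRDEEN",
     "departure": "1350", "public_departure": "1350"},
    {"location_type": "LI", "tiploc_code": "KNGXBEL", "pass": "2052H",
     "public_arrival": None, "public_departure": None},
    {"location_type": "LT", "tiploc_code": "KNGX",
     "arrival": "2054H", "public_arrival": "2100"},
]
calls = public_calls(sample_locations)
```

The array index is kept in each tuple so that the stop order survives the filtering.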

Revision as of 18:06, 27 October 2012

Schedule data

Overview

The Schedule feed is an extract of train schedules from Network Rail's ITPS (Integrated Train Planning System), converted into JSON format for easier parsing. Network Rail are not planning to make raw CIF files available.

Schedule files are available for all passenger TOCs and for each TOC. Two types of file are available - a 'full' file which contains a snapshot of all schedules, and an 'update' file which can be applied to a local database to bring it up-to-date with any changes.

The CIF User Specification, which details the format of the CIF file, is available from ATOC's website. It will be useful to developers wishing to gain a deeper understanding of how train scheduling works, beyond the information contained here.

Downloading

The schedule data, compressed using gzip, is downloaded from Amazon S3 via a private URL which is valid for a few minutes after generation.

To request schedule data, send an HTTP request with your username and password to:

 https://datafeeds.networkrail.co.uk/ntrod/CifFileAuthenticate?type=bucket&day=file

For example:

 https://datafeeds.networkrail.co.uk/ntrod/CifFileAuthenticate?type=CIF_ALL_FULL_DAILY&day=toc-full

Replace bucket with the name of the bucket, and file with the name of the file. On successful authentication, you will receive a 302 redirect to the location of the schedule file.
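
A minimal sketch in Python, using only the standard library, of building the request URL and fetching the file. It assumes the server accepts HTTP Basic authentication on the first request and answers with a redirect to the private S3 URL; the redirect is followed manually so that the credentials are not forwarded to S3. The names build_schedule_url, basic_auth_header, NoRedirect and download_schedule are illustrative, not part of any Network Rail API.

```python
import base64
import shutil
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "https://datafeeds.networkrail.co.uk/ntrod/CifFileAuthenticate"


def build_schedule_url(bucket, file_name):
    """Build the authentication URL for a bucket and file name."""
    query = urllib.parse.urlencode({"type": bucket, "day": file_name})
    return f"{BASE_URL}?{query}"


def basic_auth_header(username, password):
    """Return the value of an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib following redirects, so they surface as HTTPError."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def download_schedule(bucket, file_name, username, password, dest_path):
    """Save the gzip-compressed schedule file to dest_path."""
    request = urllib.request.Request(
        build_schedule_url(bucket, file_name),
        headers={"Authorization": basic_auth_header(username, password)},
    )
    opener = urllib.request.build_opener(NoRedirect)
    try:
        opener.open(request)
    except urllib.error.HTTPError as err:
        if not 300 <= err.code < 400:
            raise
        location = err.headers["Location"]
    else:
        raise RuntimeError("expected a redirect to the schedule file")
    # Fetch the short-lived private URL without the DataFeeds credentials.
    with urllib.request.urlopen(location) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

A call such as download_schedule("CIF_ALL_FULL_DAILY", "toc-full", email, password, "toc-full.json.gz") would then write the compressed file to disk; streaming to disk avoids holding a large file in memory.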

Data

The schedule data contains a header row, a set of zero or more association records, a set of zero or more schedule records, and an end-of-file (EOF) record.

Each association and schedule record has an action - either 'create' or 'delete'. In full files, there will be no 'delete' records.

Update files must be applied sequentially to a full file.
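
A minimal sketch in Python of reading a schedule file record by record. It assumes the file is newline-delimited JSON in which each line is an object with a single top-level key (JsonTimetableV1 for the header, JsonAssociationV1, JsonScheduleV1, or EOF); the function names and the shortened sample lines are illustrative.

```python
import json
from collections import Counter


def iter_records(lines):
    """Yield (record_type, payload) for each non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        for record_type, payload in json.loads(line).items():
            yield record_type, payload


def summarise(lines):
    """Count records by type, failing if the EOF record is missing."""
    counts = Counter()
    for record_type, _payload in iter_records(lines):
        counts[record_type] += 1
    if counts["EOF"] == 0:
        raise ValueError("no EOF record: the download may be truncated")
    return counts


# Shortened, illustrative lines; real records carry many more fields.
sample_lines = [
    '{"JsonTimetableV1": {"timestamp": 1343952450, "Metadata": {"type": "full", "sequence": 0}}}',
    '{"JsonAssociationV1": {"transaction_type": "Create", "main_train_uid": "C05307"}}',
    '{"JsonScheduleV1": {"transaction_type": "Create", "CIF_train_uid": "C24056"}}',
    '{"EOF": true}',
]
counts = summarise(sample_lines)
```

For a downloaded file, the same functions can be fed from gzip.open(path, "rt", encoding="utf-8"), which yields one line at a time without decompressing the whole file first.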

Interpretation

Validity

Associations and schedule validities are between a start date and an end date, and on particular days of the week. They each have a Short Term Planning (STP) indicator field as follows:

  • C - Planned cancellation: the schedule does not apply on this date, and the train will not run. Typically seen on public holidays when an alternate schedule applies, or on Christmas Day.
  • N - STP schedule: similar to a permanent schedule, but planned through the Short Term Planning process
  • O - Overlay schedule: an alteration to a permanent schedule
  • P - Permanent schedule: a schedule planned through the Long Term Planning process

Schedules may be overridden on a particular day as follows:

  • A permanent schedule ('P') may be overridden by an overlay ('O') or planned cancellation ('C')
  • An STP schedule ('N') may be overridden by a planned cancellation ('C')

If two schedules appear to be valid for a particular day, the schedule with the lowest alphabetical STP indicator wins.
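
A minimal sketch in Python of the precedence rule above, for schedules that share one train UID. It assumes the JSON field names schedule_start_date, schedule_end_date, schedule_days_runs (seven characters, Monday first) and CIF_stp_indicator, and that both dates are inclusive; the function names and sample records are illustrative.

```python
from datetime import date


def runs_on(schedule, day):
    """True if the schedule's date range and day mask both cover `day`."""
    start = date.fromisoformat(schedule["schedule_start_date"])
    end = date.fromisoformat(schedule["schedule_end_date"])
    if not start <= day <= end:
        return False
    # weekday() is 0 for Monday, matching the MTWTFSS mask order.
    return schedule["schedule_days_runs"][day.weekday()] == "1"


def effective_schedule(schedules, day):
    """Pick the applicable schedule for `day`, or None if the train does not run."""
    candidates = [s for s in schedules if runs_on(s, day)]
    if not candidates:
        return None
    # The alphabetically lowest STP indicator wins: C, then N, O, P.
    winner = min(candidates, key=lambda s: s["CIF_stp_indicator"])
    return None if winner["CIF_stp_indicator"] == "C" else winner


# Illustrative records for one train UID.
permanent = {"CIF_stp_indicator": "P", "schedule_start_date": "2011-12-17",
             "schedule_end_date": "2012-12-08", "schedule_days_runs": "0000010"}
overlay = {"CIF_stp_indicator": "O", "schedule_start_date": "2012-06-02",
           "schedule_end_date": "2012-06-02", "schedule_days_runs": "0000010"}
cancellation = {"CIF_stp_indicator": "C", "schedule_start_date": "2012-06-09",
                "schedule_end_date": "2012-06-09", "schedule_days_runs": "0000010"}
schedules = [permanent, overlay, cancellation]
```

With these records, the overlay applies on Saturday 2 June 2012, the train is cancelled on 9 June, and the permanent schedule applies on 16 June.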

Schedules

A schedule comprises a header containing a schedule UID, data about the schedule (including whether it is a train, bus or ship) and validity dates, and an ordered list of locations and times at which a particular service should arrive, depart or pass.

  • Originating locations will always have a WTT departure time and optionally a public departure time
  • Intermediate locations in a schedule will have a passing time if they are a mandatory timing point, or an arrival and departure time if the train carries out an activity at that location
  • Terminating locations will always have a WTT arrival time and optionally a public arrival time, which may be some minutes later than the WTT time
  • A location may have one or more activities associated with it - for example, U for locations where the train calls to pick up passengers (i.e. not available for alighting), D for locations where the train calls to set down passengers (i.e. not available for boarding).
  • A location may have engineering, pathing or performance allowances

Associations

Associations are relationships between two schedules - a main train and an associated train.

There are three types of association:

  • NP - Next Train. Not present for all schedules, but indicates the UID of the next service that the vehicle on this service will work
  • JJ - Join. Occurs at the end of the associated train's schedule.
  • VV - Split. Occurs at an intermediate location of the main train's schedule and indicates another train service that part of this train will form.

Associations may be for the same day (S), or cross midnight either backward (P) or forward (N) depending on the date indicator field.
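
A minimal sketch in Python of resolving the date on which the associated train runs. It assumes that N means the associated train runs on the day after the main train and P on the day before; this reading of the date indicator is an interpretation, and the names below are illustrative.

```python
from datetime import date, timedelta

# Assumed day offsets for the date indicator field.
DATE_INDICATOR_OFFSET = {"S": 0, "N": 1, "P": -1}

ASSOCIATION_CATEGORIES = {"NP": "Next Train", "JJ": "Join", "VV": "Split"}


def associated_train_date(main_date, date_indicator):
    """Return the running date of the associated train."""
    offset = DATE_INDICATOR_OFFSET[date_indicator]
    return main_date + timedelta(days=offset)
```

For example, associated_train_date(date(2012, 8, 3), "N") gives 4 August 2012 under this assumption.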