Schedule data cannot be obtained via STOMP; instead, it is obtained by downloading GZ files from the Amazon S3 data buckets. Each GZIP file consists of a collection of JSON strings.
The data consists of a primary set of data (rather large, it can be 1.5GB in size) and a set of daily corrections that should be applied to the base data.
Obtaining the Data
Data is downloaded from Amazon S3. Each feed has a Bucket name and a File Name.
Each bucket has one or more files available within it. Normally the FULL_DAILY buckets will contain a single file (toc-full), whereas the UPDATE_DAILY buckets will contain 7 files, one for each day.
Data is obtained from the Amazon S3 URL
So for example
Will give you the Full Schedule for All Regions for Today.
You will need to already be logged into DataFeeds in a web browser to obtain the data. Alternatively, if using cURL, use HTTP Basic Auth and follow HTTP redirects to log in (using your email/password, not your security key).
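As a sketch, building an authenticated request from code might look like the following; the URL and credentials here are placeholders, not real endpoints:

```python
import base64
import urllib.request

def build_request(url: str, email: str, password: str) -> urllib.request.Request:
    """Build a GET request carrying HTTP Basic Auth credentials.

    urllib's default opener follows HTTP redirects, matching the
    'follow redirects to log in' behaviour described above.
    """
    token = base64.b64encode(f"{email}:{password}".encode()).decode()
    req = urllib.request.Request(url)
    req.add_header("Authorization", f"Basic {token}")
    return req

# Placeholder URL and credentials -- substitute the real S3 URL for the feed.
req = build_request("https://example.com/CIF_ALL_FULL_DAILY/toc-full",
                    "user@example.com", "secret")
print(req.get_header("Authorization"))
```

Passing the request to `urllib.request.urlopen` would then stream the gzipped file back.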
Like the realtime data feeds, the Schedule data is split by Train Provider, and then further into the Full Schedule for that day and the daily updates.
So if you are building a local schedule database from scratch, or are wiping your copy to build a fresh version:
- First download and process the Full Daily.
- Then each day grab the Daily Update for that day and process it.
Files are normally updated at around midnight UTC.
A Daily Full file will only contain CREATE transactions, whereas an Update can contain both CREATE and DELETE transactions.
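A minimal sketch of applying these transactions to an in-memory store; the key fields chosen here are an assumption for illustration, not a documented uniqueness rule:

```python
def apply_association(db: dict, record: dict) -> None:
    """Apply one JsonAssociationV1 transaction to an in-memory store.

    Keying on (main_train_uid, assoc_train_uid, assoc_start_date) is an
    assumption for illustration -- use whatever uniquely identifies a
    record in your own database.
    """
    assoc = record["JsonAssociationV1"]
    key = (assoc["main_train_uid"], assoc["assoc_train_uid"],
           assoc.get("assoc_start_date"))
    if assoc["transaction_type"] == "Create":
        db[key] = assoc
    elif assoc["transaction_type"] == "Delete":
        db.pop(key, None)

db = {}
apply_association(db, {"JsonAssociationV1": {
    "transaction_type": "Create", "main_train_uid": "C05307",
    "assoc_train_uid": "C05351", "assoc_start_date": "2011-12-11T00:00:00Z"}})
apply_association(db, {"JsonAssociationV1": {
    "transaction_type": "Delete", "main_train_uid": "C05307",
    "assoc_train_uid": "C05351", "assoc_start_date": "2011-12-11T00:00:00Z"}})
print(len(db))  # -> 0; the Delete removes the record just created
```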
Each file contains:
- a data/information line
- a set of Schedule/Train Associations
- a set of Schedules
- an EOF message
- a blank line
Files are newline-delimited JSON packets (one JSON object per line).
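Reading such a file record by record can be sketched as follows:

```python
import gzip
import json

def read_records(path):
    """Yield one parsed JSON object per non-blank line of a gzipped,
    newline-delimited JSON file (the trailing blank line is skipped)."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)
```

Streaming one line at a time keeps memory use flat, which matters given the size of the Full Daily file.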
Example from CIF_ALL_FULL_DAILY
Gives the last update time of the file as a UNIX timestamp; in this example, Friday 3rd August 2012 01:07:30 +0100. All data should be sent from the Rockshore organisation.
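The stated date corresponds to UNIX timestamp 1343952450; converting it for display (the +0100 offset is British Summer Time):

```python
from datetime import datetime, timedelta, timezone

ts = 1343952450  # last-update timestamp from the header line
bst = timezone(timedelta(hours=1))  # +0100, British Summer Time
print(datetime.fromtimestamp(ts, tz=bst).strftime("%A %d %B %Y %H:%M:%S %z"))
# -> Friday 03 August 2012 01:07:30 +0100
```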
{"JsonAssociationV1": {"transaction_type":"Create", "main_train_uid":"C05307", "assoc_train_uid":"C05351", "assoc_start_date":"2011-12-11T00:00:00Z", "assoc_end_date":"2012-09-09T00:00:00Z", "assoc_days":"0000001", "category":"NP", "date_indicator":"S", "location":"HTRWTM4", "base_location_suffix":null, "assoc_location_suffix":null, "diagram_type":"T", "CIF_stp_indicator":"P"}}
{"JsonAssociationV1": {"transaction_type":"Delete", "main_train_uid":"W36743", "assoc_train_uid":"W37173", "assoc_start_date":"2012-08-03T00:00:00Z", "location":"STPANCI", "base_location_suffix":null, "diagram_type":"T", "cif_stp_indicator":null}}
- The transaction type indicates whether this is a new entry to create or an old entry to delete.
- Location is a TIPLOC reference.
- assoc_days represents whether the Association is valid on each day of the week (MTWTFSS).
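Decoding the assoc_days bitmask can be sketched as:

```python
DAY_NAMES = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

def valid_days(assoc_days: str):
    """Map an assoc_days bitmask (MTWTFSS order) to weekday names."""
    return [d for d, flag in zip(DAY_NAMES, assoc_days) if flag == "1"]

print(valid_days("0000001"))  # the Create example above -> ['Sun']
```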
Just a handy note to say you have reached the end of the file, in case you obtained a broken or truncated download.
The first ~4% of the Full Daily file contains schedule associations; these link multiple Train UIDs with a primary Train UID, which can be looked up in the Schedules.
Due to the size of the Full Daily file (1.5GB when gunzipped), it can take some time (about 2 hours) to import the data.
A given Schedule entry contains information about the schedule, including its start and end dates, and can then contain one or more schedule stops, which describe the calling points for a train on its schedule. These calling points will have an official (working) arrival/departure time and a public arrival/departure time. When displaying data to the end user, it's probably best to use the public versions.
A Schedule Stop comes in three types:
- LO - Train Origin
- LI - Stopping point
- LT - Train Terminus
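A sketch of preferring the public time for display; the field names used here (public_departure, departure) are assumptions about the JSON layout, for illustration only:

```python
def display_departure(stop: dict):
    """Prefer the public departure time for display, falling back to the
    working (official) time. Field names here are assumed, not documented."""
    return stop.get("public_departure") or stop.get("departure")

# Hypothetical LO (origin) stop: working time 1030H, public time 1031.
origin = {"record_type": "LO", "departure": "1030H", "public_departure": "1031"}
print(display_departure(origin))  # -> 1031
```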