Talk:BPLAN Geography Data: Difference between revisions

From Open Rail Data Wiki
 
Latest revision as of 15:47, 28 May 2019

Files Duplicated

There are a number of duplicate BPLAN files on the wiki, all of which are orphaned. Should the duplicates be deleted?

{|
! Main File !! Duplicate File !! Notes
|-
| File:Geography 20141214 to 20150516 from 20150302.txt.gz || File:20150302 ReferenceData.gz || Zipped contents identical
|-
| File:Geography 20151213 to 20160514 from 20160126.txt.gz || File:20160126 ReferenceData.gz || Zipped contents identical
|-
| File:Geography 20161211 to 20171209 from 20170124.txt.gz || File:20170127 ReferenceData.gz || Zipped contents identical
|-
| File:Geography 20161211 to 20171209 from 20170823.txt.gz || File:20170830 ReferenceData.gz || Zipped contents '''differ''', main file does not have 1155 REF+SER records and 9 TLD records (main file also has PIT footer inconsistent with TLD record count). PIF headers for the two files are identical.
|}

--ThomasWood (talk) 04:00, 22 February 2019 (UTC)

I'll delete the first three orphaned files and take a look at the last one.

--PeterHicks (talk) 14:57, 23 February 2019 (UTC)
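
For anyone wanting to repeat the "zipped contents identical" check, a minimal sketch follows. It assumes the files are ordinary gzip archives and compares a hash of the decompressed bytes, so differences in gzip headers (timestamps, embedded file names, compression level) are ignored. The function names are illustrative only and are not part of any existing tool.

```python
import gzip
import hashlib


def gz_content_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the decompressed contents of a gzip file."""
    digest = hashlib.sha256()
    with gzip.open(path, "rb") as fh:
        # Read in chunks so that large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True if both gzip files decompress to byte-identical data."""
    return gz_content_digest(path_a) == gz_content_digest(path_b)
```

Calling `same_contents` on a main file and its suspected duplicate (both downloaded locally) gives a yes/no answer; it does not say which records differ.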

Files with PIT record consistency errors

A number of the (non-duplicate) BPLAN data files have PIT record values that are inconsistent with the number of records they actually contain. Flagging this up here in case a data processing tool upstream has an undetected bug that is causing these records to be unintentionally dropped. In particular, the 9 TLD entries missing from File:Geography 20161211 to 20171209 from 20170823.txt.gz appear to contain a character that is only valid in the Windows-1252 encoding, which may cause some tools to reject these entries. (The 9 missing rows were identified by comparing the two versions of this file published on the wiki, as noted in the previous section.)

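{|
! File !! Count Type !! REF !! TLD !! LOC !! PLT !! NWK !! TLK
|-
| rowspan="2" | File:20140116 ReferenceData.gz || PIT || 1405 || 628 || 10468 || 3487 || 37093 || 1047866
|-
| Actual || 247 || 628 || 10468 || 3487 || 37093 || 1047866
|-
| rowspan="2" | File:Geography 20151213 to 20160514 from 20160126.txt.gz || PIT || 1422 || 610 || 10874 || 3663 || 39110 || 1071772
|-
| Actual || 257 || 610 || 10873 || 3663 || 39108 || 1071769
|-
| rowspan="2" | File:Geography 20161211 to 20171209 from 20170823.txt.gz || PIT || 262 || 638 || 11062 || 3948 || 40119 || 1101022
|-
| Actual || 262 || 629 || 11062 || 3948 || 40119 || 1101022
|}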

--ThomasWood (talk) 04:00, 22 February 2019 (UTC)

File:Geography 20181209 to 20190518 from 20180618.txt.gz also has an inconsistent PIT record: it reports that the file should contain 650 TLD records, but only 641 are present. --ThomasWood (talk) 15:47, 28 May 2019 (UTC)
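
The consistency check discussed in this section could be reproduced along the following lines. This is only a sketch under stated assumptions: that each line of the file is one record, that fields are tab-separated, and that the first field is the record type (REF, TLD, LOC and so on). The layout of the PIT record itself is not assumed here, so the declared counts are passed in by hand. The sketch also lists lines that are not valid UTF-8, since such lines are candidates for the Windows-1252-only characters mentioned above. All function names are illustrative.

```python
import gzip
from collections import Counter


def count_records(path, encoding="cp1252"):
    """Count records per type and list lines that are not valid UTF-8.

    Assumes one record per line, tab-separated, record type in the first field.
    """
    counts = Counter()
    non_utf8 = []  # (line number, record type) for lines that fail strict UTF-8 decoding
    with gzip.open(path, "rb") as fh:
        for line_no, raw in enumerate(fh, start=1):
            raw = raw.rstrip(b"\r\n")
            if not raw:
                continue
            # Only the record type is decoded; errors="replace" avoids aborting on odd bytes.
            record_type = raw.split(b"\t", 1)[0].decode(encoding, errors="replace")
            counts[record_type] += 1
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError:
                non_utf8.append((line_no, record_type))
    return counts, non_utf8


def find_mismatches(declared, actual):
    """Return {record type: (declared, actual)} for every type whose counts differ."""
    return {
        rec_type: (expected, actual.get(rec_type, 0))
        for rec_type, expected in declared.items()
        if actual.get(rec_type, 0) != expected
    }
```

To use it, pass the PIT values from the table above as `declared`, for example `{"TLD": 638}`, together with the counts returned by `count_records` for the same file. A non-empty result means the PIT footer and the file contents disagree for those record types.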