Talk:BPLAN Geography Data: Difference between revisions
PeterHicks (talk | contribs) No edit summary |
ThomasWood (talk | contribs) m remove links to deleted files |
Revision as of 20:05, 27 May 2019
Files Duplicated
There are a number of duplicate BPLAN files on the wiki, all of which are orphaned. Should the duplicates be deleted?
Main File | Duplicate File | Notes |
---|---|---|
File:Geography 20141214 to 20150516 from 20150302.txt.gz | File:20150302 ReferenceData.gz | Zipped contents identical |
File:Geography 20151213 to 20160514 from 20160126.txt.gz | File:20160126 ReferenceData.gz | Zipped contents identical |
File:Geography 20161211 to 20171209 from 20170124.txt.gz | File:20170127 ReferenceData.gz | Zipped contents identical |
File:Geography 20161211 to 20171209 from 20170823.txt.gz | File:20170830 ReferenceData.gz | Zipped contents differ: the main file is missing 1155 REF+SER records and 9 TLD records (the main file's PIT footer is also inconsistent with its TLD record count). The PIF headers of the two files are identical. |
--ThomasWood (talk) 04:00, 22 February 2019 (UTC)
I'll delete the first three orphaned files and take a look at the last one.
--PeterHicks (talk) 14:57, 23 February 2019 (UTC)
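For anyone wanting to repeat the "zipped contents identical" check, here is a minimal sketch of one way to do it. It hashes the decompressed bytes, so differences in gzip metadata or compression level are ignored. The function names are illustrative and are not taken from any existing tool.

```python
import gzip
import hashlib


def decompressed_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a gzip file's decompressed contents."""
    digest = hashlib.sha256()
    with gzip.open(path, "rb") as handle:
        # Read in chunks so large BPLAN files need not fit in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True if two gzip files decompress to byte-identical data."""
    return decompressed_digest(path_a) == decompressed_digest(path_b)
```

Calling `same_contents` on a main file and its suspected duplicate should return `True` for the first three pairs in the table above and `False` for the last, assuming the comparison reported there was made on the decompressed data.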
Files with PIT record consistency errors
A number of the (non-duplicate) BPLAN data files have PIT record values that are inconsistent with the number of records they actually contain. I'm flagging this here in case an upstream data-processing tool has an undetected bug that is causing these records to be dropped unintentionally. In particular, the 9 TLD entries missing from the File:Geography 20161211 to 20171209 from 20170823.txt.gz file appear to contain a character that is only valid in the Windows-1252 encoding, which may cause some tools to reject these entries. (The 9 missing rows were identified by comparing the two different versions of this file published on the wiki, as noted in the previous section.)
File | Count Type | REF | TLD | LOC | PLT | NWK | TLK |
---|---|---|---|---|---|---|---|
File:20140116 ReferenceData.gz | PIT | 1405 | 628 | 10468 | 3487 | 37093 | 1047866 |
File:20140116 ReferenceData.gz | Actual | 247 | 628 | 10468 | 3487 | 37093 | 1047866 |
File:Geography 20151213 to 20160514 from 20160126.txt.gz | PIT | 1422 | 610 | 10874 | 3663 | 39110 | 1071772 |
File:Geography 20151213 to 20160514 from 20160126.txt.gz | Actual | 257 | 610 | 10873 | 3663 | 39108 | 1071769 |
File:Geography 20161211 to 20171209 from 20170823.txt.gz | PIT | 262 | 638 | 11062 | 3948 | 40119 | 1101022 |
File:Geography 20161211 to 20171209 from 20170823.txt.gz | Actual | 262 | 629 | 11062 | 3948 | 40119 | 1101022 |
--ThomasWood (talk) 04:00, 22 February 2019 (UTC)
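A rough sketch of how the PIT-versus-actual comparison above could be reproduced. It assumes each BPLAN record is a tab-separated line whose first field is the record type; that layout is an assumption here and is not stated on this page. The declared counts are passed in as a plain dictionary rather than parsed from the PIT record, since the PIT field layout is not described here. All function names are hypothetical.

```python
from collections import Counter

# Record types listed in the table above.
RECORD_TYPES = ("REF", "TLD", "LOC", "PLT", "NWK", "TLK")


def count_record_types(lines):
    """Count lines by their first tab-separated field (assumed to be the record type)."""
    counts = Counter()
    for line in lines:
        record_type = line.rstrip("\r\n").split("\t", 1)[0]
        if record_type:  # skip blank lines
            counts[record_type] += 1
    return counts


def find_mismatches(declared, actual, record_types=RECORD_TYPES):
    """Return {type: (declared, actual)} for every record type whose counts differ."""
    return {
        t: (declared.get(t, 0), actual.get(t, 0))
        for t in record_types
        if declared.get(t, 0) != actual.get(t, 0)
    }


# Figures from the table above for
# File:Geography 20161211 to 20171209 from 20170823.txt.gz
declared = {"REF": 262, "TLD": 638, "LOC": 11062, "PLT": 3948, "NWK": 40119, "TLK": 1101022}
actual = {"REF": 262, "TLD": 629, "LOC": 11062, "PLT": 3948, "NWK": 40119, "TLK": 1101022}
print(find_mismatches(declared, actual))  # {'TLD': (638, 629)}
```

To obtain the actual counts from a file, the lines could be read with `gzip.open(path, "rt", encoding="cp1252")` and passed to `count_record_types`. Decoding as Windows-1252 rather than UTF-8 should prevent rows containing such characters from being rejected at the decoding stage.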