Commit: 59dd6dc1b935356c717e566eb71aefd2763dbd9e
Parent: 3f16a0cd1baaa19bf62b9ca9aaf8c9aee07667fa
Author: Mike Ryan <mikeryan@lacklustre.net>
Committer: Marcel Holtmann <marcel@holtmann.org>
Date: 2015-12-27 23:55:41
Tree: 67eea93ab0787e45ae6a955e958d1100e1e824a8

tools: fix update_compids to parse newly formatted page from SIG

This patch adds tools/parse_companies.pl, a twisted Perl script that parses the SIG's HTML page in poor taste using regex. Improvements also include support for non-ASCII entities such as &eacute; as well as full Unicode support for Chinese names.
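
The approach described above (regex extraction of table rows, followed by entity decoding) can be sketched roughly as follows. This is not the code from tools/parse_companies.pl, which is written in Perl and not shown here. The row layout, the `ROW_RE` pattern, the `parse_companies` function, and the sample company names are all assumptions made for illustration; the real SIG page markup may differ.

```python
import html
import re

# Hypothetical row layout: decimal ID, hex ID, company name.
# The actual markup of the SIG page is an assumption here.
ROW_RE = re.compile(
    r"<tr[^>]*>\s*<td[^>]*>\s*(\d+)\s*</td>\s*"
    r"<td[^>]*>\s*(0x[0-9A-Fa-f]{4})\s*</td>\s*"
    r"<td[^>]*>(.*?)</td>",
    re.S,
)


def parse_companies(page):
    """Return (company_id, name) pairs found in the HTML text."""
    companies = []
    for decimal, _hexadecimal, name in ROW_RE.findall(page):
        # Drop any nested tags, then decode entities such as &eacute;.
        # Raw Unicode text (e.g. Chinese names) passes through unchanged.
        clean = html.unescape(re.sub(r"<[^>]+>", "", name)).strip()
        companies.append((int(decimal), clean))
    return companies


# Made-up sample rows, not real SIG data.
SAMPLE = """
<tr><td>0</td><td>0x0000</td><td>Example Caf&eacute; Ltd.</td></tr>
<tr><td>1</td><td>0x0001</td><td>示例公司</td></tr>
"""

if __name__ == "__main__":
    for company_id, name in parse_companies(SAMPLE):
        print(company_id, name)
```

Regex parsing of HTML is fragile, as the commit message itself admits; a sketch like this only holds while the page keeps one company per table row.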

Diffstat

A tools/parse_companies.pl | 59 ++++++++++++++++++++++++++++++++++++++++
M tools/update_compids.sh | 35 ++++++++++-------------------------

2 files changed, 69 insertions(+), 25 deletions(-)
