Web Crawlers for Constant Enumerations

Base Generator

class pcapkit.vendor.default.Vendor

Bases: object

Default vendor generator.

Subclass this class and define the FLAG and LINK attributes, etc., to implement a new vendor generator.

__init__()

Generate new constant files.

static __new__(cls)

Subclassing checkpoint.

Raises

VendorNotImplemented – If cls is not a subclass of Vendor.
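
A minimal sketch of how such a subclassing checkpoint might look. BaseGenerator, GeneratorNotImplemented and DemoGenerator are hypothetical stand-ins rather than pcapkit names, and the guard shown here (refusing direct instantiation of the base class) is an assumption about the intent, not the library's confirmed logic.

```python
class GeneratorNotImplemented(NotImplementedError):
    """Hypothetical stand-in for VendorNotImplemented."""


class BaseGenerator:
    """Hypothetical stand-in for Vendor."""

    def __new__(cls):
        # Assumed guard: the base class itself must not be instantiated.
        if cls is BaseGenerator:
            raise GeneratorNotImplemented("cannot instantiate the base generator")
        return super().__new__(cls)


class DemoGenerator(BaseGenerator):
    """A concrete subclass passes the checkpoint."""


generator = DemoGenerator()  # succeeds; BaseGenerator() would raise
```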

_request()

Fetch CSV data from LINK.

This is the low-level call of request().

If LINK is None, it will directly call the upper method request() with NO arguments.

The method will first try to GET the content of LINK. Should any exception be raised, it will retry with the proxy settings from get_proxies().

Note

Since some LINK targets are hosted on Wikipedia, etc., they might not be available in certain regions, e.g. the PRC.

Should the proxies fail as well, it will prompt for user intervention: it uses webbrowser.open() to open the page in your browser, so that you can load the page manually and save the HTML source at the location it provides.

Returns

CSV data.

Return type

List[str]

Warns

VendorRequestWarning – If the connection fails with and/or without proxies.

See also

request()
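
The fallback order described above (direct request, then proxies, then manual intervention) can be sketched roughly as follows. This is not pcapkit's implementation: fetch_with_fallback, the injected fetch callable and flaky_fetch are hypothetical names, and the final step simply opens the browser and raises instead of waiting for a saved file.

```python
import webbrowser


def fetch_with_fallback(link, fetch, proxies):
    """Fetch ``link`` directly, then via ``proxies``, then ask the user.

    ``fetch(link, proxies)`` is a caller-supplied callable (hypothetical)
    that returns the page text or raises on failure.
    """
    for attempt in (None, proxies):
        try:
            return fetch(link, attempt)
        except Exception:
            continue  # fall through to the next strategy
    # Both attempts failed: hand the page over to the user.
    webbrowser.open(link)
    raise RuntimeError(f"could not fetch {link}; save the page manually")


def flaky_fetch(link, proxies):
    """Demo fetcher that only works when proxies are supplied."""
    if not proxies:
        raise OSError("direct connection blocked")
    return "Value,Name\r\n1,ECHO"


text = fetch_with_fallback(
    "https://example.org/registry.csv",
    flaky_fetch,
    {"https": "http://127.0.0.1:8080"},
)
```

Injecting the fetcher keeps the retry logic testable without network access; a real fetcher would wrap an HTTP client call.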

context(data)

Generate constant context.

Parameters

data (List[str]) – CSV data.

Returns

Constant context.

Return type

str

count(data)

Count field records.

Parameters

data (List[str]) – CSV data.

Returns

Field records.

Return type

Counter
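
As an illustration only, counting field names from CSV lines could look like the sketch below. It assumes the name sits in the second column and that the first row is a header; count_names is a hypothetical helper, not the method itself.

```python
import collections
import csv


def count_names(data):
    """Count how often each name occurs (name assumed to be in column 2)."""
    reader = csv.reader(data)
    next(reader)  # skip the header row
    return collections.Counter(row[1] for row in reader if len(row) > 1)


sample = ["Value,Name", "0,Reserved", "1,ECHO", "2,Reserved"]
record = count_names(sample)
```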

process(data)

Process CSV data.

Parameters

data (List[str]) – CSV data.

Returns

Enumeration fields and missing fields, as two separate lists.

Return type

Tuple[List[str], List[str]]
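
A rough sketch of a two-list result of this kind, under stated assumptions: column 1 holds either a single code or a start-stop range, column 2 holds the name, and "missing fields" is interpreted here as the range rows that cannot become a single enumeration member. process_rows and the generated line formats are hypothetical.

```python
import csv


def process_rows(data):
    """Split CSV rows into member lines and range (fallback) lines."""
    reader = csv.reader(data)
    next(reader)  # skip the header row
    enum, miss = [], []
    for code, name, *_ in reader:
        if "-" in code:
            # A range such as "4-255" cannot map to one member.
            start, stop = code.split("-", 1)
            miss.append(f"{int(start)} <= value <= {int(stop)}: {name}")
        else:
            enum.append(f"{name} = {int(code)}")
    return enum, miss


enum, miss = process_rows(["Value,Name", "0,HOPOPT", "1,ICMP", "4-255,Unassigned"])
```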

rename(name, code, *, original=None)

Rename duplicated fields.

Parameters
  • name (str) – Field name.

  • code (int) – Field code.

Keyword Arguments

original (str) – Original field name (extracted from CSV records).

Returns

Revised field name.

Return type

str

Example

If name has multiple occurrences in the source registry, the field name will be sanitised as ${name}_${code}.

Otherwise, the plain name will be returned.
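
The rule above can be sketched as a standalone function. Passing the occurrence counts in as record is an assumption made for this illustration; the real method takes only name, code and original.

```python
import collections


def rename(record, name, code):
    """Append the code to names that occur more than once in ``record``."""
    if record[name] > 1:
        return f"{name}_{code}"
    return name


record = collections.Counter(["Reserved", "ECHO", "Reserved"])
print(rename(record, "Reserved", 0))  # Reserved_0
print(rename(record, "ECHO", 1))      # ECHO
```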

request(text=None)

Fetch CSV file.

Parameters

text (str) – Content fetched from LINK.

Returns

CSV data.

Return type

List[str]

safe_name(name)

Convert an enumeration name to an enum.Enum-friendly identifier.

Parameters

name (str) – Original enumeration name.

Returns

Converted enumeration name.

Return type

str
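
One plausible way to make a registry name usable as an enum.Enum member name is sketched below. The exact rules pcapkit applies are not stated here, so the substitution pattern and the FIELD prefix are assumptions.

```python
import keyword
import re


def safe_name(name, prefix="FIELD"):
    """Turn an arbitrary registry name into a valid Python identifier."""
    # Replace every run of non-word characters with a single underscore.
    candidate = re.sub(r"\W+", "_", name).strip("_")
    if candidate.isidentifier() and not keyword.iskeyword(candidate):
        return candidate
    # Names that are empty, start with a digit, or clash with a keyword
    # get a prefix (the prefix value is hypothetical).
    return f"{prefix}_{candidate}"


print(safe_name("IPv6-ICMP"))  # IPv6_ICMP
print(safe_name("3PC"))        # FIELD_3PC
```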

static wrap_comment(text)

Wrap long text into shorter comment lines.

Parameters

text (str) – Source text.

Returns

Wrapped comments.
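
Comment wrapping of this sort is commonly built on the standard textwrap module. The sketch below assumes a 72-character width and a plain "# " prefix; the actual width and prefix used by the method may differ.

```python
import textwrap


def wrap_comment(text, width=72, prefix="# "):
    """Wrap ``text`` into comment lines no longer than ``width`` characters."""
    return textwrap.fill(
        text.strip(),
        width=width,
        initial_indent=prefix,
        subsequent_indent=prefix,
    )


wrapped = wrap_comment("A long description of a registry entry. " * 5)
```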

DOCS

Docstring of constant enumeration.

Type

str

FLAG = None

Value limit checker.

Type

str

LINK = None

Link to registry.

Type

str

NAME

Name of constant enumeration.

Type

str

pcapkit.vendor.default.LINE(NAME, DOCS, FLAG, ENUM, MISS)

Default constant template for an enumeration registry generated from IANA CSV data.
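
To give a feel for what a template taking these five arguments might produce, here is a heavily simplified sketch. render_enum is a hypothetical stand-in for LINE, and the layout of the generated class (an IntEnum with a _missing_ hook guarded by FLAG) is an assumption, not the real template.

```python
def render_enum(name, docs, flag, enum, miss):
    """Render Python source for an IntEnum (hypothetical stand-in for LINE)."""
    body = "\n".join(f"    {entry}" for entry in enum)
    missing = "\n".join(f"        {stmt}" for stmt in miss)
    return (
        "from enum import IntEnum\n\n\n"
        f"class {name}(IntEnum):\n"
        f'    """{docs}"""\n\n'
        f"{body}\n\n"
        "    @classmethod\n"
        "    def _missing_(cls, value):\n"
        f"        if not ({flag}):\n"
        "            raise ValueError(f'invalid {cls.__name__} value: {value!r}')\n"
        f"{missing}\n"
    )


source = render_enum(
    name="Demo",
    docs="Demo registry.",
    flag="isinstance(value, int) and 0 <= value <= 255",
    enum=["ALPHA = 0", "BETA = 1"],
    miss=["return None"],
)
namespace = {}
exec(source, namespace)  # compile the generated module text
Demo = namespace["Demo"]
```

Here FLAG is treated as a boolean expression over value, and MISS as the statements that handle unknown values.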

pcapkit.vendor.default.get_proxies()

Get proxy for blocked sites.

The function reads the PCAPKIT_HTTP_PROXY and PCAPKIT_HTTPS_PROXY environment variables, if set, to build the proxy settings for requests.

Returns

Proxy settings for requests.

Return type

Dict[str, str]
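
A minimal sketch of reading these two variables into a requests-style proxies mapping. The environ parameter is added here only to make the example easy to exercise and is not part of the documented signature.

```python
import os


def get_proxies(environ=os.environ):
    """Build a ``requests``-style proxies mapping from environment variables."""
    proxies = {}
    http_proxy = environ.get("PCAPKIT_HTTP_PROXY")
    https_proxy = environ.get("PCAPKIT_HTTPS_PROXY")
    if http_proxy:
        proxies["http"] = http_proxy
    if https_proxy:
        proxies["https"] = https_proxy
    return proxies


demo = get_proxies({"PCAPKIT_HTTPS_PROXY": "http://127.0.0.1:8080"})
```

The resulting dictionary can be passed as the proxies argument of a requests call.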

Command Line Tool

usage: pcapkit-vendor [-h] [-V] ...

update constant enumerations

positional arguments:
  target         update targets, supply none to update all

optional arguments:
  -h, --help     show this help message and exit
  -V, --version  show program's version number and exit

pcapkit.vendor.__main__.get_parser()

CLI argument parser.

Returns

Argument parser.

Return type

argparse.ArgumentParser
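
A parser that produces usage text close to the one shown above could be built as follows. This is a sketch: the version string is a placeholder, and the use of argparse.REMAINDER for the positional targets is inferred from the "..." in the usage line rather than confirmed.

```python
import argparse


def get_parser(version="0.0.0"):
    """Build a CLI parser similar to the documented usage (sketch)."""
    parser = argparse.ArgumentParser(
        prog="pcapkit-vendor",
        description="update constant enumerations",
    )
    parser.add_argument("-V", "--version", action="version", version=version)
    parser.add_argument(
        "target",
        nargs=argparse.REMAINDER,
        help="update targets, supply none to update all",
    )
    return parser


args = get_parser().parse_args(["ipv4", "tcp"])
```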

pcapkit.vendor.__main__.main()

Entrypoint.

Warns

InvalidVendorWarning – If the vendor target is not found in the pcapkit.vendor module.

pcapkit.vendor.__main__.run(vendor)

Script runner.

Parameters

vendor (Type[Vendor]) – Subclass of Vendor from pcapkit.vendor.

Warns

VendorRuntimeWarning – If the vendor class fails to initialise.
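
The warn-instead-of-crash behaviour can be sketched like this. GeneratorRuntimeWarning is a hypothetical stand-in for VendorRuntimeWarning, and catching every Exception is an assumption about how broadly failures are handled.

```python
import warnings


class GeneratorRuntimeWarning(RuntimeWarning):
    """Hypothetical stand-in for VendorRuntimeWarning."""


def run(vendor):
    """Instantiate ``vendor`` and downgrade any failure to a warning."""
    try:
        vendor()
    except Exception as error:
        warnings.warn(
            f"{vendor.__name__} failed: {error!r}",
            GeneratorRuntimeWarning,
            stacklevel=2,
        )
```

With this shape, one failing generator does not stop the remaining targets from being updated.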