Welcome to PyPCAPKit’s documentation!¶
The PyPCAPKit project is an open source Python program focused on PCAP parsing and analysis, which works as a stream PCAP file extractor. With the support of DictDumper, it can produce reports in multiple output formats.
Important
The whole project supports Python 3.4 or later.
About¶
PyPCAPKit is an independent open source library, using only DictDumper as its formatted output dumper.
Note
There is a project called jspcapy that builds on pcapkit; it is a command line tool for PCAP extraction, but it is now *DEPRECATED*.
Unlike popular PCAP file extractors such as Scapy, dpkt, and PyShark, pcapkit uses a streaming strategy to read input files. That is, it reads the file frame by frame, which reduces memory usage and can improve efficiency.
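To make the frame-by-frame idea concrete, here is a minimal sketch of a streaming reader for the classic pcap file format (24-byte global header, then a 16-byte record header before each frame). It is not pcapkit's implementation; the names iter_frames and build_sample_capture are hypothetical, and only the microsecond-resolution magic number is handled.

```python
import io
import struct


def iter_frames(stream):
    """Yield (ts_sec, ts_usec, payload) for each record of a classic pcap stream."""
    global_header = stream.read(24)
    if len(global_header) < 24:
        raise ValueError("truncated pcap global header")
    # The magic number reveals the byte order the capture was written in.
    magic = struct.unpack("<I", global_header[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"
    elif magic == 0xD4C3B2A1:
        endian = ">"
    else:
        raise ValueError("unsupported magic number")
    while True:
        record_header = stream.read(16)
        if not record_header:
            return  # clean end of file
        if len(record_header) < 16:
            raise ValueError("truncated record header")
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", record_header
        )
        payload = stream.read(incl_len)
        if len(payload) < incl_len:
            raise ValueError("truncated frame data")
        # Only one frame is held in memory at a time.
        yield ts_sec, ts_usec, payload


def build_sample_capture(payloads):
    """Build a tiny little-endian capture in memory (illustration only)."""
    out = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
    for index, payload in enumerate(payloads):
        out += struct.pack("<IIII", index, 0, len(payload), len(payload)) + payload
    return out


capture = io.BytesIO(build_sample_capture([b"\x00" * 14, b"\x01" * 20]))
for number, (ts_sec, ts_usec, payload) in enumerate(iter_frames(capture), start=1):
    print("Frame {}: {} bytes".format(number, len(payload)))
```

Because iter_frames is a generator, memory use depends on the largest single frame rather than on the size of the capture file.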
Module Structure¶
In pcapkit, all files can be grouped into the following eight parts.
Interface (pcapkit.interface): User interface for the pcapkit library, which standardises and simplifies the usage of this library.
Foundation (pcapkit.foundation): Synthesises file I/O and protocol analysis, and coordinates information exchange across all network layers.
Reassembly (pcapkit.reassembly): Based on the algorithms described in RFC 815, implements datagram reassembly of IP and TCP packets.
Protocols (pcapkit.protocols): Collection of all protocol families, with detailed implementations and methods, as well as constructors.
CoreKit (pcapkit.corekit): Core utilities for the pcapkit implementation.
ToolKit (pcapkit.toolkit): Compatibility tools for the pcapkit implementation.
DumpKit (pcapkit.dumpkit): Dump utilities for the pcapkit implementation.
Utilities (pcapkit.utilities): Collection of four utility functions and classes.
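The Reassembly entry above refers to RFC 815. As a rough illustration of that RFC's hole-descriptor idea, the sketch below tracks the missing byte ranges ("holes") of a datagram and shrinks them as fragments arrive in any order. It is a simplified sketch, not pcapkit.reassembly: the function name reassemble is hypothetical, and offsets are plain byte offsets rather than the 8-octet units used by real IP headers.

```python
import math


def reassemble(fragments):
    """Reassemble (offset, data, more_fragments) tuples; return bytes or None.

    None is returned while holes remain, i.e. the datagram is incomplete.
    """
    holes = [(0, math.inf)]  # one hole covering the whole, unknown-length datagram
    buffer = bytearray()
    for offset, data, more_fragments in fragments:
        if not data:
            continue
        first, last = offset, offset + len(data) - 1
        remaining = []
        for hole_first, hole_last in holes:
            # Fragment does not touch this hole: keep the hole unchanged.
            if first > hole_last or last < hole_first:
                remaining.append((hole_first, hole_last))
                continue
            # Part of the hole before the fragment is still missing.
            if first > hole_first:
                remaining.append((hole_first, first - 1))
            # Part after the fragment is still missing, unless this is the
            # final fragment, which fixes the end of the datagram.
            if last < hole_last and more_fragments:
                remaining.append((last + 1, hole_last))
        holes = remaining
        # Grow the buffer if needed, then copy the fragment into place.
        if len(buffer) < last + 1:
            buffer.extend(bytes(last + 1 - len(buffer)))
        buffer[first:last + 1] = data
    return bytes(buffer) if not holes else None


# Fragments may arrive out of order; the last one carries more_fragments=False.
print(reassemble([(8, b"world!!!", False), (0, b"hello,  ", True)]))
```

The datagram is complete exactly when the hole list becomes empty, which is the termination condition RFC 815 describes.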
Engine Comparison¶
Due to the complexity of pcapkit, its extraction procedure takes around 0.0009 seconds per packet, which is not ideal. Thus pcapkit introduced alternative extraction engines to accelerate this procedure. Currently pcapkit supports Scapy, DPKT, and PyShark. In addition, pcapkit supports two multiprocessing strategies (server and pipeline). For more information, please refer to the documentation.
Test Environment¶
Operating System | macOS Mojave |
Processor Name | Intel Core i7 |
Processor Speed | 2.6 GHz |
Total Number of Cores | 6 |
Memory | 16 GB |
Test Results¶
Engine | Performance (seconds per packet) |
---|---|
 | 0.00017389218012491862 |
 | 0.00036091208457946774 |
 | 0.0009537641207377116 |
 | 0.0009694552421569824 |
 | 0.018088217973709107 |
 | 0.04200994372367859 |
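A "seconds per packet" figure like those in the table above can be obtained by timing a parsing callable over a batch of frames and dividing by the batch size. The sketch below shows one way to do that; it is an assumption about the measurement method, not the procedure used to produce the published numbers, and both seconds_per_packet and toy_parse are hypothetical names.

```python
import time


def seconds_per_packet(parse, packets, repeat=3):
    """Return the best average time (in seconds) `parse` spends on one packet."""
    if not packets:
        raise ValueError("need at least one packet")
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        for packet in packets:
            parse(packet)
        elapsed = time.perf_counter() - start
        # Keep the fastest run to reduce noise from other processes.
        best = min(best, elapsed / len(packets))
    return best


def toy_parse(frame):
    """Hypothetical stand-in engine: split a 14-byte Ethernet header off a frame."""
    return frame[:14], frame[14:]


frames = [bytes(64)] * 1000
print("toy engine: {:.3e} s/packet".format(seconds_per_packet(toy_parse, frames)))
```

To compare real engines, toy_parse would be replaced by a callable wrapping each engine's parsing step, with the same list of frames passed to every run.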
Installation¶
Note
pcapkit supports Python 3.4 and later.
Simply run the following to install the current version from PyPI:
pip install pypcapkit
Or install the latest version from the Git repository:
git clone https://github.com/JarryShaw/PyPCAPKit.git
cd PyPCAPKit
pip install -e .
# and to update at any time
git pull
Since pcapkit supports various extraction engines and extensive plug-in functions, you may want to install the optional ones:
# for DPKT only
pip install pypcapkit[DPKT]
# for Scapy only
pip install pypcapkit[Scapy]
# for PyShark only
pip install pypcapkit[PyShark]
# and to install all the optional packages
pip install pypcapkit[all]
# or to do this explicitly
pip install pypcapkit dpkt scapy pyshark
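After installing, it can be handy to check which optional engine packages are importable in the current environment. The sketch below uses only the standard library; the helper name available_engines is hypothetical, and the import names dpkt, scapy, and pyshark are the ones those packages are normally imported under.

```python
import importlib.util

# Display label -> import name of each optional engine package.
OPTIONAL_ENGINES = {"DPKT": "dpkt", "Scapy": "scapy", "PyShark": "pyshark"}


def available_engines(modules=OPTIONAL_ENGINES):
    """Map each label to True if its module can be found, without importing it."""
    return {
        label: importlib.util.find_spec(name) is not None
        for label, name in modules.items()
    }


for label, installed in sorted(available_engines().items()):
    print("{:8} {}".format(label, "installed" if installed else "missing"))
```

Using find_spec avoids importing the packages themselves, so the check stays fast even when a heavy engine is installed.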
Samples¶
Usage Samples¶
As described above, pcapkit is quite easy to use, with just three verbs as its main interface. Several scenarios are shown below.
extract a PCAP file and dump the result to a specific file (with no reassembly)
import pcapkit
# dump to a PLIST file with no frame storage (property frame disabled)
plist = pcapkit.extract(fin='in.pcap', fout='out.plist', format='plist', store=False)
# dump to a JSON file with no extension auto-complete
json = pcapkit.extract(fin='in.cap', fout='out.json', format='json', extension=False)
# dump to a folder with each tree-view text file per frame
tree = pcapkit.extract(fin='in.pcap', fout='out', format='tree', files=True)
extract a PCAP file and fetch IP packet (both IPv4 and IPv6) from a frame (with no output file)
>>> import pcapkit
>>> extraction = pcapkit.extract(fin='in.pcap', nofile=True)
>>> frame0 = extraction.frame[0]
>>> # check if IP in this frame, otherwise ProtocolNotFound will be raised
>>> flag = pcapkit.IP in frame0
>>> tcp = frame0[pcapkit.IP] if flag else None
extract a PCAP file and reassemble TCP payload (with no output file nor frame storage)
import pcapkit
# set strict to make sure full reassembly
extraction = pcapkit.extract(fin='in.pcap', store=False, nofile=True, tcp=True, strict=True)
# print extracted packet if HTTP in reassembled payloads
for packet in extraction.reassembly.tcp:
    for reassembly in packet.packets:
        if pcapkit.HTTP in reassembly.protochain:
            print(reassembly.info)
CLI Samples¶
The CLI (command line interface) of pcapkit can be accessed in two different ways.
through console scripts: use the command name pcapkit [...] directly (as shown in the samples).
through Python module: python -m pypcapkit [...] works exactly the same as above.
Here are some usage samples:
export to a macOS Property List (Xcode has special support for this format)
$ pcapkit in --format plist --verbose
🚨Loading file 'in.pcap'
 - Frame 1: Ethernet:IPv6:ICMPv6
 - Frame 2: Ethernet:IPv6:ICMPv6
 - Frame 3: Ethernet:IPv4:TCP
 - Frame 4: Ethernet:IPv4:TCP
 - Frame 5: Ethernet:IPv4:TCP
 - Frame 6: Ethernet:IPv4:UDP
🍺Report file stored in 'out.plist'
export to a JSON file (with no format specified)
$ pcapkit in --output out.json --verbose
🚨Loading file 'in.pcap'
 - Frame 1: Ethernet:IPv6:ICMPv6
 - Frame 2: Ethernet:IPv6:ICMPv6
 - Frame 3: Ethernet:IPv4:TCP
 - Frame 4: Ethernet:IPv4:TCP
 - Frame 5: Ethernet:IPv4:TCP
 - Frame 6: Ethernet:IPv4:UDP
🍺Report file stored in 'out.json'
export to a text tree view file (without extension autocorrect)
$ pcapkit in --output out --format tree --verbose
🚨Loading file 'in.pcap'
 - Frame 1: Ethernet:IPv6:ICMPv6
 - Frame 2: Ethernet:IPv6:ICMPv6
 - Frame 3: Ethernet:IPv4:TCP
 - Frame 4: Ethernet:IPv4:TCP
 - Frame 5: Ethernet:IPv4:TCP
 - Frame 6: Ethernet:IPv4:UDP
🍺Report file stored in 'out'