Ever feel like you’re drowning in network traffic, trying to pinpoint that one suspicious packet? Manually sifting through gigabytes of data with Wireshark is powerful, but it’s not always the most efficient way to hunt for threats, especially when you need to do it repeatedly or at scale. What if you could bring the power of Python to your network analysis?
That’s where PyShark comes in. If you’re looking to supercharge your threat hunting program, this handy open-source tool might just become your new best friend. Today, we’ll dive into how you can use PyShark to automate network-based threat hunting, making your investigations faster and more effective.
Simply put, PyShark is a Python wrapper for tshark. For those unfamiliar, tshark is the command-line interface (CLI) counterpart to the ever-popular Wireshark. PyShark cleverly uses tshark’s XML export features under the hood, allowing you to programmatically access packet data within your Python scripts. If you’ve used tshark’s ELK output mode, you’ll appreciate how PyShark taps into similar detailed information, but with the flexibility of Python.
One of the coolest things about PyShark is its versatility. You can use it both for live captures from a network interface and for analyzing pre-captured PCAP files.
While there might have been minor bugs in the past with specific filters (like Berkeley Packet Filters, or BPF) on pre-captured data, for the most part, PyShark offers functionally equivalent capabilities whether you’re looking at live streams or static files.
Speaking of filters, PyShark supports both capture filters (BPF) and display filters. This is crucial. On a busy network, capture filters let you tell Wireshark (and thus tshark/PyShark) to ignore irrelevant packets before they’re even processed, saving precious system resources. Display filters, on the other hand, work on the data already captured, much like when you type a filter into the Wireshark GUI’s filter bar. That same powerful filtering logic is accessible right within your Python code via PyShark.
You can find the official PyShark repository and excellent documentation on GitHub to explore further.
Installation is a breeze, thanks to pip. Just open your terminal and type:
pip install pyshark
# Or if you're using Python 3 specifically
pip3 install pyshark
Important Prerequisite: Tshark must already be installed on your system for PyShark to work. PyShark will typically find tshark if it’s in your system’s environment path. However, you also have the option to point PyShark to a specific tshark executable if needed.
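As a minimal sketch of that last option, the snippet below looks for tshark on the PATH using only the standard library and then falls back to a list of candidate locations. The helper name find_tshark and the fallback paths are illustrative assumptions, not part of PyShark. The tshark_path keyword in the trailing comment is the argument PyShark's capture classes accept for a custom executable.

```python
import os
import shutil


def find_tshark(candidates=()):
    """Return a path to a tshark executable, or None if none is found.

    `candidates` is a hypothetical list of fallback locations to try
    when tshark is not on the PATH.
    """
    on_path = shutil.which("tshark")
    if on_path:
        return on_path
    for path in candidates:
        # Accept the first fallback that exists and is executable.
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


# Example fallback locations (assumptions; adjust for your system).
FALLBACKS = [
    "/usr/bin/tshark",
    "/usr/local/bin/tshark",
    r"C:\Program Files\Wireshark\tshark.exe",
]

tshark = find_tshark(FALLBACKS)
if tshark:
    print(f"tshark found at: {tshark}")
else:
    print("tshark not found; install Wireshark/tshark first")

# With PyShark installed, the path can then be handed to a capture object:
#   capture = pyshark.FileCapture('my_capture.pcap', tshark_path=tshark)
```

Checking for tshark up front gives a clearer failure message than waiting for the first capture call to fail.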
Ready to see some live action? Capturing live network data with PyShark is surprisingly straightforward. Here’s a basic example:
import pyshark
# Start a live capture on the 'eth0' interface
# You can also add BPF or display filters here
capture = pyshark.LiveCapture(interface='eth0')
# Sniff for a specific number of packets (e.g., 5 packets)
for packet in capture.sniff_continuously(packet_count=5):
    print(f"Just captured a packet: {packet}")
    # Add your packet processing logic here
In this snippet, LiveCapture sets up the listening interface. The sniff_continuously() method then iterates through packets as they arrive. You can specify packet_count to limit how many packets are processed, or omit it to sniff indefinitely (until you stop the script). This is also where you’d pass your bpf_filter or display_filter arguments to LiveCapture.
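To make the filter arguments concrete, here is a hedged sketch that passes both a capture filter and a display filter to LiveCapture. The DNS filters, the interface name, and the helper names open_filtered_capture and describe are assumptions chosen for illustration. PyShark is imported inside the function so the rest of the snippet still runs on a machine without it, and the demo at the bottom uses a stand-in object rather than a real packet.

```python
from types import SimpleNamespace


def open_filtered_capture(interface="eth0"):
    """Open a live capture restricted to DNS traffic on UDP port 53.

    Requires pyshark, tshark, and permission to capture on `interface`.
    """
    import pyshark  # imported lazily; only needed for a real capture

    return pyshark.LiveCapture(
        interface=interface,
        bpf_filter="udp port 53",  # capture filter: drops packets before dissection
        display_filter="dns",      # display filter: applied to dissected packets
    )


def describe(packet):
    """Return a one-line summary from two attributes PyShark packets expose."""
    top_layer = getattr(packet, "highest_layer", "UNKNOWN")
    length = getattr(packet, "length", "?")
    return f"{top_layer} packet, {length} bytes"


# Stand-in object so the summary helper can be tried without a live capture.
fake_packet = SimpleNamespace(highest_layer="DNS", length="74")
print(describe(fake_packet))

# Real usage (needs capture privileges):
#   capture = open_filtered_capture("eth0")
#   for packet in capture.sniff_continuously(packet_count=5):
#       print(describe(packet))
```

On a busy link, the BPF expression does most of the work because filtered-out packets never reach tshark's dissectors.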
Got a PCAP file you need to dissect? PyShark handles that with similar ease. This is incredibly useful for offline analysis, incident response, or testing your hunting scripts.
import pyshark
# Open a PCAP file
capture = pyshark.FileCapture('my_capture.pcap')
# Access a specific packet (e.g., the first packet)
print(f"The first packet in the file is: {capture[0]}")
# Iterate through all packets in the PCAP
# Be mindful of memory with very large PCAP files!
MAX_PACKETS = 10000  # Example cap to prevent overconsumption
for packet_num, packet in enumerate(capture):
    if packet_num >= MAX_PACKETS:
        print("Reached packet limit for this iteration, consider refining filters or iterative processing.")
        break  # Or use capture.apply_on_packets(your_function, timeout=100) for large files
    print(f"Processing packet #{packet_num}")
    # Your analysis logic here
capture.close()  # Good practice to close the file handle
When working with FileCapture, you can directly access packets by their index (e.g., capture[0]). Looping through the capture object lets you process each packet. A word of caution: if you’re dealing with massive PCAP files, loading everything into memory at once can be an issue. Consider processing packets iteratively or using more specific display filters within FileCapture to limit the initial data loaded.
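One way to approach iterative processing is sketched below, under a few assumptions. The helper names process_in_batches and open_streaming_capture are made up for this example, and the batch size is arbitrary. The keep_packets=False argument is a real FileCapture option that stops PyShark from retaining every packet it has read. The batching helper works on any iterable, so the demo runs on plain numbers instead of a PCAP.

```python
from itertools import islice


def process_in_batches(packets, handler, batch_size=1000, max_packets=None):
    """Feed `packets` to `handler` in fixed-size lists; return how many were handled."""
    stream = iter(packets) if max_packets is None else islice(packets, max_packets)
    processed = 0
    while True:
        batch = list(islice(stream, batch_size))  # pulls at most batch_size items
        if not batch:
            return processed
        handler(batch)
        processed += len(batch)


def open_streaming_capture(path, display_filter=None):
    """Open a PCAP without keeping already-read packets in memory."""
    import pyshark  # imported lazily; requires tshark on the system

    return pyshark.FileCapture(
        path,
        display_filter=display_filter,
        keep_packets=False,
    )


# Demo on stand-in data: 2,500 "packets" handled in batches of 1,000.
batch_sizes = []
total = process_in_batches(range(2500), lambda b: batch_sizes.append(len(b)))
print(total, batch_sizes)

# Real usage:
#   capture = open_streaming_capture('my_capture.pcap', display_filter='http')
#   process_in_batches(capture, my_handler, max_packets=100000)
#   capture.close()
```

Note that with keep_packets=False, index access such as capture[0] is no longer available, so this pattern suits single-pass analysis.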
So, you’ve got your packets loaded, either live or from a file. How do you get to the juicy bits of information inside?
Each packet object in PyShark has a layers attribute. This is a list containing all the protocol layers that tshark identified in that packet (e.g., Ethernet, IP, TCP, UDP, HTTP).
# Assuming 'packet' is a packet object from PyShark
print(f"Layers in this packet: {packet.layers}")
# Accessing specific fields (example for an IP packet)
if 'IP' in packet:  # Check if the IP layer exists
    source_ip = packet.ip.src
    destination_ip = packet.ip.dst
    print(f"Source IP: {source_ip}, Destination IP: {destination_ip}")
if 'ETH' in packet:  # Check for Ethernet layer
    source_mac = packet.eth.src
    destination_mac = packet.eth.dst
    print(f"Source MAC: {source_mac}, Destination MAC: {destination_mac}")
You can access individual fields using dot notation, like packet.ip.src or packet.eth.dst. These layer names (e.g., ip, eth, tcp, http) generally follow Wireshark’s conventions.
Crucial Tip for Robust Scripting: Always include error handling! If a packet doesn’t contain a specific layer or field you’re trying to access (e.g., trying to get packet.http.request_uri from a non-HTTP packet), your script will throw an error. Wrap your field access in try-except blocks or check for the layer’s existence (e.g., if 'HTTP' in packet:) before trying to access its attributes.
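A small helper can centralize that defensive access so the checks are not repeated at every call site. The sketch below is one possible shape, not a PyShark feature: get_field is a hypothetical name, and the demo uses stand-in objects that only mimic the packet.layer.field attribute style.

```python
from types import SimpleNamespace


def get_field(packet, layer, field, default=None):
    """Return packet.<layer>.<field>, or `default` if the layer or field is missing."""
    layer_obj = getattr(packet, layer, None)
    if layer_obj is None:
        return default
    return getattr(layer_obj, field, default)


# Stand-ins for an HTTP request packet and a packet with no HTTP layer.
http_packet = SimpleNamespace(
    ip=SimpleNamespace(src="10.0.0.5", dst="10.0.0.1"),
    http=SimpleNamespace(request_uri="/index.html"),
)
dns_packet = SimpleNamespace(ip=SimpleNamespace(src="10.0.0.5", dst="10.0.0.53"))

print(get_field(http_packet, "http", "request_uri"))
print(get_field(dns_packet, "http", "request_uri", default="<no URI>"))
```

Because getattr with a default swallows AttributeError, the same helper should work on real PyShark packets, which raise AttributeError for absent layers and fields.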
Let’s put this all together with a concrete threat hunting scenario. Imagine we have a large PCAP file, say the 209MB 4SICS Geek Lounge PCAP (available from Netresec or GitHub), which contains about 2.27 million packets. Our goal is to find evidence of nmap scanning. We start by loading only the HTTP traffic and counting how often each request URI appears.
import pyshark
from collections import Counter
# Load the PCAP, keeping only HTTP traffic
capture = pyshark.FileCapture(
    'sics_geek_lounge.pcap',
    display_filter='http'
)
# Count how often each request URI appears
uri_counts = Counter()
for packet in capture:
    try:
        if hasattr(packet, 'http') and hasattr(packet.http, 'request_uri'):
            uri = packet.http.request_uri
            uri_counts[uri] += 1
    except AttributeError:
        # This can happen if a field is unexpectedly not present
        continue  # Skip to the next packet
capture.close()
# Print the most common URIs
for uri, count in uri_counts.most_common(10):
    print(f"{uri}: {count}")
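In the output, you might notice a URI such as nice-test.php appearing frequently (e.g., 88 times in the example transcript). This specific URI is often an indicator of nmap’s service detection phase. The next step is to find out which hosts requested it and which hosts they targeted.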
import pyshark
suspicious_uri = "/nice-test.php"
# Re-open the capture with a more specific display filter
scan_details = {}  # To store {source_ip: set of destination_ips}
refined_capture = pyshark.FileCapture(
    'sics_geek_lounge.pcap',
    display_filter=f'http.request.uri == "{suspicious_uri}"'
)
for packet in refined_capture:
    try:
        if 'IP' in packet:  # Ensure IP layer exists
            source = packet.ip.src
            destination = packet.ip.dst
            if source not in scan_details:
                scan_details[source] = set()
            scan_details[source].add(destination)
    except AttributeError:
        continue
refined_capture.close()
for scanner_ip, targets in scan_details.items():
    print(f"Scanner IP: {scanner_ip} targeted: {', '.join(sorted(targets))}")
This would reveal the source IP performing the scan (e.g., 192.168.2.137 in the transcript’s example) and the various internal IPs it probed.
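When more than one source shows up, it can help to rank sources by how many distinct hosts each one touched, since a scanner usually fans out to many targets. The sketch below assumes a scan_details-style mapping of source IP to a set of destination IPs. The function name rank_by_fanout, the min_targets threshold, and every address in the sample data are invented for illustration and do not come from the PCAP.

```python
def rank_by_fanout(scan_details, min_targets=2):
    """Return (source_ip, target_count) pairs, widest fan-out first.

    Sources that touched fewer than `min_targets` distinct hosts are dropped.
    Ties are broken by source IP string so the order is deterministic.
    """
    ranked = [
        (source, len(targets))
        for source, targets in scan_details.items()
        if len(targets) >= min_targets
    ]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Made-up sample data in the same shape as scan_details.
sample = {
    "10.0.0.5": {"10.0.0.1"},
    "10.0.0.9": {"10.0.0.1", "10.0.0.2", "10.0.0.3"},
    "10.0.0.7": {"10.0.0.1", "10.0.0.2"},
}

for source, count in rank_by_fanout(sample):
    print(f"{source} contacted {count} distinct hosts")
```

The threshold is a judgment call: a low value catches slow scans but also flags ordinary clients, so it is worth tuning against known-good traffic.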
Amazingly, a script like this can churn through millions of packets in under a minute. While Python might not be the absolute fastest language for raw execution speed compared to compiled languages, its development speed and powerful libraries make it an excellent choice for these tasks.
This nmap detection example is just scratching the surface. You can adapt the same techniques, filtering a capture, counting field values, and grouping by source and destination, for all sorts of threat hunting.
PyShark empowers you to automate network analysis and threat hunting in ways that manual inspection simply can’t match. By combining the packet-parsing prowess of tshark with the scripting flexibility of Python, you can make your investigations faster, more repeatable, and more effective.
It’s a fantastic tool that we find incredibly useful, and hopefully, this guide gives you a solid starting point to incorporate it into your own security operations.
Insane Cyber © All Rights Reserved 2025