A threat hunting playbook is a documented, hypothesis-driven procedure that guides SOC analysts and threat hunters through specific detection investigations using defined data sources, detection logic, and triage steps. According to the CrowdStrike 2025 Threat Hunting Report, 81% of hands-on-keyboard intrusions were malware-free, meaning signature-based detection alone will never catch them. Your SIEM generates alerts. A playbook generates answers.
Reactive security assumes the attacker will announce themselves. They won’t. Modern adversaries operate inside trusted tools, use legitimate credentials, and move quietly for days or weeks before anyone notices. The average dwell time for human-operated intrusions that go undetected by traditional tools still runs in the double digits of days. Proactive hunting, structured around testable hypotheses mapped to MITRE ATT&CK, cuts that window.
This guide gives you a working threat hunting playbook: the methodology, the tools, and three ready-to-run hunts for lateral movement, credential access, and persistence. Every section is practitioner-to-practitioner. No vendor pitches, no 10,000-foot theory.
What Hypothesis-Driven Threat Hunting Actually Means
Hypothesis-driven hunting starts with a specific, falsifiable statement about attacker behavior in your environment. Not “look for unusual activity” but something like: “A threat actor with initial access to a workstation is using WMI for lateral movement to avoid Sysmon process creation logs.” That hypothesis drives everything that follows: which data sources you query, which fields you inspect, and what a true positive looks like versus a false one.
There are three hypothesis sources every mature program draws from. The first is threat intelligence: you know a specific threat group (say, a North Korea-nexus actor) is active in your sector and uses a particular technique, so you test for it. The second is situational awareness: a new vulnerability gets published, or you deploy a new cloud service, and you hunt for exploitation attempts before they escalate. The third is crown-jewel analysis: you identify your highest-value assets and ask what an attacker would have to do to reach them, then hunt those paths proactively.
The PEAK framework (Prepare, Execute, Act with Knowledge), developed by Splunk's SURGe security research team and since adopted across the threat hunting community, gives a useful three-phase structure. Prepare means forming the hypothesis and identifying required data. Execute means running queries and collecting findings. Act means documenting results, tuning detections, and feeding new hypotheses from what you learned. Every hunt you run should complete that loop.
How to Build a Hunt Hypothesis Step by Step
Start with a MITRE ATT&CK technique. Each technique page tells you what data sources are relevant, what sub-techniques exist, and which threat groups have used it in real campaigns. That structure alone saves hours of scoping work.
A well-formed hunt hypothesis follows this structure: [Threat actor or attacker class] is using [specific technique, ATT&CK ID] via [specific tool or method] against [target asset class] to achieve [objective]. For example: “An attacker with domain user credentials is using Pass-the-Hash (T1550.002) via Mimikatz to authenticate to high-value servers without triggering Kerberos event 4768.”
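The template above can be captured as a small data structure so every hypothesis in your backlog has the same slots filled in. This is a minimal sketch, not a standard schema: the class name `HuntHypothesis` and its field names are illustrative assumptions, and the ID check only validates the general shape of an ATT&CK technique ID, not that the ID exists.

```python
import re
from dataclasses import dataclass

# Shape of an ATT&CK technique ID: T1234 or T1234.001 (format check only).
ATTACK_ID = re.compile(r"T\d{4}(\.\d{3})?")


@dataclass(frozen=True)
class HuntHypothesis:
    """One falsifiable hunt hypothesis, split into the slots of the template."""
    actor: str       # threat actor or attacker class
    technique: str   # human-readable technique name
    attack_id: str   # ATT&CK technique or sub-technique ID
    method: str      # specific tool or method
    target: str      # target asset class
    objective: str   # what the attacker achieves

    def __post_init__(self):
        if not ATTACK_ID.fullmatch(self.attack_id):
            raise ValueError(f"not an ATT&CK technique ID: {self.attack_id!r}")

    def render(self) -> str:
        """Return the hypothesis as a single testable sentence."""
        return (
            f"{self.actor} is using {self.technique} ({self.attack_id}) "
            f"via {self.method} against {self.target} to {self.objective}."
        )


pth = HuntHypothesis(
    actor="An attacker with domain user credentials",
    technique="Pass-the-Hash",
    attack_id="T1550.002",
    method="Mimikatz",
    target="high-value servers",
    objective="authenticate without triggering Kerberos event 4768",
)
print(pth.render())
```

Forcing each slot to be filled is the point: a hypothesis with an empty method or target is usually too vague to test.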
Once your hypothesis is written, ask three questions before running a single query. First: do you have the data to test this? If endpoint telemetry doesn’t include process memory events, a memory-based hunt will return nothing useful. Second: what does a true positive look like in your specific environment? Generic detection logic produces alert fatigue. Third: what is the expected false-positive rate, and how will you triage it? Authentication anomalies in a financial services environment will generate more noise than the same query in a 50-person professional services firm.
MITRE ATT&CK Mapping for Proactive Hunts
The ATT&CK Enterprise matrix as of Q1 2026 contains 14 tactics and more than 200 techniques, plus several hundred sub-techniques. You cannot hunt everything simultaneously, and attempting to do so produces low-signal noise. Instead, prioritize based on three factors: attacker prevalence, data availability, and business impact.
For most enterprise SOC teams, the highest-value hunt categories are Credential Access (TA0006), Lateral Movement (TA0008), and Persistence (TA0003). These map directly to the post-exploitation phase where attackers consolidate access, move toward target assets, and establish resilience. Detecting here, before exfiltration or ransomware deployment, is where hunting earns its keep.
Credential Access techniques to prioritize include OS Credential Dumping (T1003), Brute Force (T1110), and Steal or Forge Kerberos Tickets (T1558). For Lateral Movement, focus on Remote Services (T1021), Use of Alternate Authentication Material (T1550), and Exploitation of Remote Services (T1210). Persistence hunting most commonly targets Scheduled Task/Job (T1053), Boot or Logon Autostart Execution (T1547), and Create or Modify System Process (T1543).
The MITRE ATT&CK Navigator tool lets you visually layer your existing detection coverage against these techniques. Build a layer showing what your SIEM rules already detect, then hunt in the gaps. That is the fastest path to meaningful uplift without duplicating work.
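The gap analysis itself is a set difference, and the result can be written out as a layer file. The sketch below assumes a hypothetical set of techniques your SIEM already covers (`siem_covered`) and emits only the layer fields it needs (`name`, `domain`, `description`, and per-technique `techniqueID`, `score`, `color`, `comment`). The Navigator may also expect a `versions` block matching your installed release; it is omitted here rather than guessed.

```python
import json

# Techniques named in this guide as hunt priorities.
PRIORITY_TECHNIQUES = [
    "T1003", "T1110", "T1558",   # Credential Access
    "T1021", "T1550", "T1210",   # Lateral Movement
    "T1053", "T1547", "T1543",   # Persistence
]


def coverage_gaps(priorities, covered):
    """Return priority techniques with no existing detection, in priority order."""
    covered = set(covered)
    return [t for t in priorities if t not in covered]


def build_gap_layer(gaps, name="Hunt gaps"):
    """Build a minimal Navigator-style layer dict highlighting uncovered techniques."""
    return {
        "name": name,
        "domain": "enterprise-attack",
        "description": "Priority techniques with no existing SIEM rule",
        "techniques": [
            {
                "techniqueID": technique,
                "score": 1,
                "color": "#e6550d",
                "comment": "no detection; hunt here",
            }
            for technique in gaps
        ],
    }


# Hypothetical coverage: replace with the techniques your own rules map to.
siem_covered = {"T1110", "T1053", "T1021"}

gaps = coverage_gaps(PRIORITY_TECHNIQUES, siem_covered)
layer = build_gap_layer(gaps)
print(json.dumps(layer, indent=2))
```

Save the printed JSON to a file and load it in the Navigator to see the gaps highlighted on the matrix.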
Threat Hunting Tool Comparison: Elastic, Splunk, Velociraptor, and YARA
Tool selection depends on what data you have, what questions you are asking, and whether you are hunting at the network, log, or endpoint layer. These four tools cover the core hunting stack that most mid-to-large SOC teams use.
| Tool | Primary Use Case | Query Language | Best For | Limitation |
|---|---|---|---|---|
| Elastic Security | Log analysis, SIEM hunting | EQL, KQL, ES\|QL | Sequence-based correlation, timeline analysis | Complex rule tuning; requires data pipeline quality |
| Splunk Enterprise Security | SIEM, log aggregation, analytics | SPL (Search Processing Language) | Large enterprise, custom dashboards, threat intelligence integration | Licensing cost at scale; SPL learning curve for new analysts |
| Velociraptor | Live endpoint forensics, DFIR | VQL (Velociraptor Query Language) | Real-time endpoint hunting, artifact collection, memory analysis | Not a SIEM; no long-term log retention built in |
| YARA | File and memory pattern matching | YARA rules (pattern syntax) | Malware family detection, IOC-based file hunting | Requires rule writing expertise; high false-positive risk with broad rules |
Elastic Security with Event Query Language (EQL) is particularly strong for sequence-based hunts, finding chains of events that together indicate compromise. A four-event sequence showing process injection followed by network connection followed by registry modification followed by a scheduled task creation maps cleanly to a full attack chain and can be expressed in a single EQL query. The open-source Elastic Security detection rules repository on GitHub gives you 700+ pre-built starting points.
Splunk dominates in organizations with heavy investment in existing Splunk infrastructure. The Splunk Security Essentials app provides a library of over 200 content packs mapped to ATT&CK, and SPL’s statistical functions make behavioral baselining relatively accessible. The Splunk Attack Range project lets you spin up an on-demand attack simulation environment to test hunt logic before running it against production data.
Velociraptor, maintained by Rapid7, runs as a lightweight agent on endpoints and accepts VQL queries that execute in real time across your fleet. It is the right tool when your hunt hypothesis requires live forensic data: running process lists, active network connections, registry keys, browser history, or memory artifacts. For credential-based hunts, VQL can enumerate LSASS process handles, query SAM database access attempts, and inspect scheduled task XML configurations at scale.
YARA serves a different purpose in the stack. It does not query logs; it pattern-matches against file content and process memory. A YARA rule library maintained alongside your other hunt tooling lets you scan endpoints, email attachments, or sandbox results for known malware signatures and behavior patterns. The Elastic Protection Artifacts GitHub repository and CAPE Sandbox regularly publish YARA rules derived from live malware analysis.
Hunt Playbook 1: Lateral Movement via SMB and WMI
Lateral movement is the phase where attackers become hardest to distinguish from legitimate administrative traffic. The hypothesis: an attacker with valid domain credentials is moving laterally using SMB and WMI remote execution to avoid triggering process-level alerts on destination hosts.
MITRE ATT&CK techniques covered: T1021.002 (Remote Services: SMB/Windows Admin Shares), T1047 (Windows Management Instrumentation)
Data sources required: Windows Security event logs (Event ID 4624 logon type 3, 4648, 4672), Sysmon network connection events (Event ID 3), WMI activity logs (Microsoft-Windows-WMI-Activity/Operational), and ideally endpoint telemetry from your EDR.
Detection approach: Look for accounts generating Type 3 (network) logons to multiple distinct hosts within a short time window, particularly outside business hours. A legitimate admin authenticating to five servers in a maintenance window looks different from an account hitting 15 workstations between 02:00 and 03:00. Pair that with WMI process creation events on destination hosts where the spawning process is WmiPrvSE.exe.
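The fan-out check is easy to prototype offline against exported logon events before you commit it to a SIEM rule. This is a rough sketch under stated assumptions: events arrive as `(timestamp, account, destination_host)` tuples already filtered to Event ID 4624 with logon type 3, and the one-hour window and ten-host threshold are placeholders to tune against your own baseline. The account and host names are invented.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def logon_fanout(events, window=timedelta(hours=1), min_hosts=10):
    """Flag accounts that reach >= min_hosts distinct hosts inside any sliding window.

    events: iterable of (timestamp, account, dest_host) tuples.
    Returns {account: peak distinct hosts in one window} for flagged accounts.
    """
    recent = defaultdict(deque)  # account -> (timestamp, host) pairs still in window
    peak = defaultdict(int)      # account -> highest distinct-host count observed
    for ts, account, host in sorted(events):
        queue = recent[account]
        queue.append((ts, host))
        # Drop logons that have aged out of the window.
        while ts - queue[0][0] > window:
            queue.popleft()
        distinct = len({h for _, h in queue})
        peak[account] = max(peak[account], distinct)
    return {account: n for account, n in peak.items() if n >= min_hosts}


# Invented sample data: one account sweeping workstations, one admin doing maintenance.
start = datetime(2026, 1, 15, 2, 0)
events = [(start + timedelta(minutes=3 * i), "svc_backup", f"WS-{i:02d}") for i in range(15)]
events += [(start + timedelta(minutes=10 * i), "adm_jane", f"SRV-{i}") for i in range(5)]

print(logon_fanout(events))
```

The sketch ignores time of day; adding a business-hours filter on `ts` before counting is a natural next step.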
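A representative Elastic EQL query hunting WMI-based lateral movement (it assumes ECS-mapped fields, including process.entity_id, as populated by Elastic Defend or a Sysmon integration):
    sequence by host.name with maxspan=1m
      [process where event.type == "start" and process.parent.name : "WmiPrvSE.exe" and process.name : ("cmd.exe", "powershell.exe", "wscript.exe")] by process.entity_id
      [network where process.name : ("cmd.exe", "powershell.exe", "wscript.exe") and network.direction : ("outbound", "egress")] by process.entity_id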
Triage steps: For any hits, identify the source account and check its logon history for the past 24 hours. Verify whether the destination host is listed in the account’s standard access profile. Check whether the commands executed on the destination match known administrative scripts. A true positive typically shows an account accessing new hosts with no prior history, executing reconnaissance commands like whoami, ipconfig, or net user.
Hunt Playbook 2: Credential Access via LSASS Memory
LSASS memory dumping remains one of the most common credential access techniques despite years of defensive countermeasures. The hypothesis: an attacker with local administrator privileges is attempting to extract credentials from LSASS memory using a Mimikatz variant or built-in Windows tooling.
MITRE ATT&CK techniques covered: T1003.001 (OS Credential Dumping: LSASS Memory)
Data sources required: Sysmon Event ID 10 (process access events targeting lsass.exe), Windows Security Event ID 4656 (object handle requests), EDR process telemetry, and if available, Windows Defender Credential Guard status per host.
Detection approach: Any process opening a handle to lsass.exe with read memory access rights (0x10) is suspicious unless it is a known security product. Mimikatz in its default form triggers clear Sysmon Event ID 10 logs. However, attackers frequently use process hollowing, DLL injection, or custom loaders to evade this. A secondary hunt looks for unusual parent-child process relationships where a legitimate process (notepad.exe, msiexec.exe) spawns a child that immediately attempts LSASS access.
In Splunk SPL, hunting LSASS handle access:
    index=sysmon EventCode=10 TargetImage="*lsass.exe" GrantedAccess IN ("0x1010", "0x1410", "0x1fffff")
    | stats count by SourceImage, GrantedAccess, ComputerName
    | where count < 3
    | sort - count
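Matching a fixed list of GrantedAccess values misses any other mask that still carries the read-memory right. When you post-process exported Sysmon Event ID 10 records, a bitmask test is more general. The sketch below uses the documented Windows value for PROCESS_VM_READ (0x0010); the function name and the sample masks are illustrative.

```python
PROCESS_VM_READ = 0x0010  # Windows process access right: read process memory


def can_read_memory(granted_access: str) -> bool:
    """True if a Sysmon GrantedAccess hex string includes PROCESS_VM_READ."""
    return bool(int(granted_access, 16) & PROCESS_VM_READ)


# The first three masks carry the read bit; the last two do not.
for mask in ("0x1010", "0x1410", "0x1fffff", "0x1000", "0x1400"):
    print(mask, can_read_memory(mask))
```

A mask such as 0x1438 also passes this test even though it would not appear in an exact-match list, which is why the bitwise form is worth the extra step when you triage raw events.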
Triage steps: Confirm the SourceImage against known good security tools in your environment (AV engines, EDR agents, backup software all legitimately access LSASS). For unfamiliar source processes, check the binary hash against VirusTotal and look for signing certificate anomalies. Check whether the process has a network connection following the LSASS access, as credential dumping typically precedes authentication to remote systems within minutes.
Velociraptor can complement this hunt. A VQL query running across your endpoint fleet can enumerate active handles to lsass.exe in real time, identifying in-progress attacks faster than log-based detection alone.
Hunt Playbook 3: Persistence via Scheduled Tasks and Registry Run Keys
Persistence mechanisms are where attackers bet that defenders won’t look closely at things that have always been there. The hypothesis: a threat actor who compromised a user-context process has established persistence via a scheduled task or registry run key that mimics legitimate software behavior.
MITRE ATT&CK techniques covered: T1053.005 (Scheduled Task/Job: Scheduled Task), T1547.001 (Boot or Logon Autostart Execution: Registry Run Keys / Startup Folder)
Data sources required: Windows Security Event ID 4698 (scheduled task created), Event ID 4702 (scheduled task updated), Sysmon Event ID 13 (registry value set for run keys), and Windows task scheduler operational logs.
Detection approach: Focus on scheduled tasks created from user context (not SYSTEM) with command lines pointing to paths outside Windows directories: AppData, Temp, ProgramData. Attackers frequently name tasks to blend in: “MicrosoftEdgeUpdate,” “AdobeGCClient,” or “WindowsDefenderScan” are common impersonation patterns. For registry run keys, the detection pivot is on values created by non-standard processes or pointing to scripting engines with encoded payloads.
A Velociraptor VQL query over the built-in Windows.System.TaskScheduler artifact (column names can differ between Velociraptor releases, so check them against your deployment before relying on the filter):
    SELECT * FROM Artifact.Windows.System.TaskScheduler()
    WHERE Command =~ "(?i)(cmd.exe|powershell.exe|wscript.exe|mshta.exe|regsvr32.exe)"
      AND Arguments =~ "(?i)(base64|encode|hidden|bypass|http)"
Triage steps: For each suspicious task, extract the full command line and decode any Base64 arguments. Check the creation timestamp against known software deployment windows. Verify the task XML against the task’s stated purpose; a task claiming to be a browser update that executes a PowerShell download cradle is your true positive. Cross-reference the network destination in any download URL against threat intelligence.
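Decoding Base64 arguments is the triage step most worth scripting. PowerShell's -EncodedCommand value is Base64 over UTF-16LE text, and the flag is often abbreviated (for example -e or -enc). The helper below is a minimal sketch: the function name is hypothetical, it handles only the simple space-separated form of the flag, and the sample command line is built in the script rather than taken from a real intrusion.

```python
import base64
import binascii
import re

# A flag at the start of a token, followed by a Base64-looking value.
FLAG_AND_VALUE = re.compile(r"(?<!\S)-(\w+)\s+([A-Za-z0-9+/=]+)")


def decode_encoded_command(command_line: str):
    """Return the decoded script from an -EncodedCommand argument, or None."""
    for flag, value in FLAG_AND_VALUE.findall(command_line):
        flag = flag.lower()
        # PowerShell accepts prefixes of -EncodedCommand and the -ec alias.
        if flag == "ec" or "encodedcommand".startswith(flag):
            try:
                raw = base64.b64decode(value, validate=True)
                return raw.decode("utf-16-le")
            except (binascii.Error, UnicodeDecodeError):
                return None  # not valid Base64 or not UTF-16LE text
    return None


# Build a harmless sample so the round trip can be checked end to end.
script = "Write-Output 'hunt-test'"
encoded = base64.b64encode(script.encode("utf-16-le")).decode("ascii")
cmdline = f"powershell.exe -NoProfile -WindowStyle Hidden -enc {encoded}"

print(decode_encoded_command(cmdline))
```

Run it over the Arguments column exported from your task enumeration, then review the decoded text for download cradles and URLs.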
Measuring Hunt Maturity and Building a Repeatable Program
Individual hunts are valuable. A program that systematically improves over time is what separates security-capable organizations from those that get breached repeatedly. The Hunting Maturity Model (HMM), developed by David Bianco, describes five levels from HMM-0 (no hunting capability, relying entirely on automated alerts) through HMM-4 (automated hunting with continuous hypothesis generation from your own telemetry). Most enterprise teams operate at HMM-1 or HMM-2 and can reach HMM-3 within 18 months with the right process investment.
Measure three things for every hunt you run: the number of true positives found, the number of new detection rules generated from hunt findings, and the number of hypotheses queued for future hunts. Those three metrics tell you whether your hunting program is generating value or generating busy work. A mature program where every hunt generates at least one new detection rule or one new hypothesis is compounding its own value over time.
Document your hunts in a standardized template that captures: hypothesis text, ATT&CK technique mapping, required data sources, detection logic (the actual queries), analysis steps, triage decisions, findings, and next hypotheses. The GitHub repository OTRF/ThreatHunter-Playbook provides a Jupyter notebook template that works well for this. When a hunt produces a confirmed true positive, that logic becomes a detection rule. When it produces nothing, the data quality gaps it revealed become infrastructure requirements.
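If you prefer structured records to free-form notes, the template fields map directly onto a small data class that serializes to JSON. This is an illustrative sketch, not the format used by the OTRF notebooks: the class and field names are assumptions, `new_detection_rules` is added so the record can feed the metrics described earlier, and the sample values are invented.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class HuntRecord:
    """One completed hunt, using the template fields listed above."""
    hypothesis: str
    attack_techniques: list
    data_sources: list
    queries: dict  # query language -> query text
    analysis_steps: list = field(default_factory=list)
    triage_decisions: list = field(default_factory=list)
    findings: list = field(default_factory=list)
    next_hypotheses: list = field(default_factory=list)
    new_detection_rules: list = field(default_factory=list)

    def produced_value(self) -> bool:
        """True if the hunt yielded a detection rule or a follow-up hypothesis."""
        return bool(self.new_detection_rules or self.next_hypotheses)

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


# Invented example record for the LSASS hunt.
record = HuntRecord(
    hypothesis="A local administrator is dumping credentials from LSASS memory.",
    attack_techniques=["T1003.001"],
    data_sources=["Sysmon Event ID 10", "Windows Security Event ID 4656"],
    queries={"SPL": 'index=sysmon EventCode=10 TargetImage="*lsass.exe"'},
    findings=[],
    next_hypotheses=["Credential dumping via comsvcs.dll MiniDump"],
)
print(record.produced_value())
print(record.to_json())
```

Keeping records in one consistent shape makes it straightforward to count rules and hypotheses per hunt across a quarter.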
Integrate your hunting program with your incident response planning process. Every confirmed hunt finding should flow into your IR workflow with pre-defined escalation criteria. Equally, every incident your IR team investigates should generate at least one new hunt hypothesis about how that technique could have been detected earlier.
For organizations running multi-cloud workloads, hunting extends beyond Windows endpoints. Your cloud security posture introduces new hunt surfaces: unusual IAM role assumptions in AWS, service principal activity anomalies in Azure, and CloudTrail gaps that indicate log tampering attempts. The CrowdStrike 2025 Threat Hunting Report noted a 136% surge in cloud intrusions year-over-year, driven largely by credential misuse against cloud management planes rather than direct exploitation of workloads.
As threat actors increasingly use AI-generated tooling and LLM-assisted reconnaissance, your hunt hypotheses should account for behavioral patterns that look different from traditional manual attacks. As the AI security threat picture evolves through 2026, expect faster attacker iteration cycles, more convincing lure documents, and more polymorphic malware that defeats YARA rules written against static patterns.
Frequently Asked Questions
What is the difference between threat hunting and threat detection?
Threat detection is reactive: it relies on automated rules and signatures to generate alerts when known malicious behavior is observed. Threat hunting is proactive: a human analyst forms a hypothesis about attacker behavior and actively searches for evidence of that behavior, regardless of whether any alert fired. Hunting finds what detection misses, and detection rules built from hunt findings improve the overall detection baseline.
How often should a SOC team run threat hunts?
A functioning threat hunting program runs continuous or scheduled hunts rather than one-off investigations. Most mature teams run at least two to three focused hypothesis-driven hunts per week, targeting different ATT&CK technique areas on a rotating basis. Ad-hoc hunts should also trigger whenever new threat intelligence arrives about an active adversary targeting your sector or whenever a significant new vulnerability is published.
What data sources are required to start threat hunting?
The minimum viable data stack for endpoint-focused hunting includes Windows Security event logs, Sysmon with a comprehensive configuration (SwiftOnSecurity or Olaf Hartong’s modular config are standard starting points), and EDR telemetry if available. Network-level hunting additionally requires DNS query logs, proxy logs, and firewall session records. Cloud hunting requires platform-native audit logs: CloudTrail for AWS, the Azure Activity Log plus Microsoft Entra ID sign-in and audit logs for Azure, and Cloud Audit Logs for GCP.
Can a small SOC team run a threat hunting program?
Yes, but scope it realistically. A two-person SOC should run one focused hunt per week targeting the highest-risk technique for their environment, not attempt to cover the full ATT&CK matrix. Start with the three highest-value hunt categories for your threat profile, build repeatable playbooks for each, and automate the detection logic once it is validated. A small team running structured hunts is more effective than a large team running unstructured investigations.
What is a YARA rule and when should you use it in threat hunting?
A YARA rule is a pattern-matching definition that identifies files or process memory containing specific byte sequences, string patterns, or structural characteristics associated with malware families or attacker tools. Use YARA during threat hunts when you need to scan endpoints for files that match known implant signatures, when analyzing suspected malware samples, or when threat intelligence provides specific binary indicators. YARA complements behavioral detection by catching file-based indicators that behavioral logic might miss.
How do you validate that a hunt found nothing, rather than missing something?
A null result is only meaningful if you verify the underlying data quality first. Check that the required log sources are actually generating data during the hunt window. Introduce a known-benign test event that matches your detection logic, confirm it appears in the data, then re-run your hunt. If the test event shows up and nothing else does, the null result is real. This validation step separates a well-executed hunt from a hunt with invisible blind spots.
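The validation logic reduces to a short decision procedure. The sketch below assumes you have already counted events per log source for the hunt window and know whether your planted test event matched; the function name, argument names, and return strings are illustrative.

```python
def assess_null_result(source_counts, canary_found, hits):
    """Classify a hunt outcome.

    source_counts: {log source name: events seen during the hunt window}
    canary_found:  True if the planted benign test event matched the hunt query
    hits:          number of matches other than the test event
    """
    if hits:
        return f"findings: {hits} match(es) to triage"
    # A source with zero events is a blind spot, not evidence of absence.
    silent = sorted(name for name, count in source_counts.items() if count == 0)
    if silent:
        return f"inconclusive: no data from {', '.join(silent)}"
    if not canary_found:
        return "inconclusive: test event not detected"
    return "validated null result"


# Invented counts for a one-week hunt window.
counts = {"sysmon": 48210, "security": 193455, "wmi-activity": 0}
print(assess_null_result(counts, canary_found=True, hits=0))
```

Recording the outcome string alongside the hunt makes it obvious later which null results were verified and which were never actually testable.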
Start Hunting Before the Next Alert Fires
The playbooks above are starting points, not finished products. Run each one against your environment, document what you find and what you do not, update the detection logic based on what your data actually looks like, and generate the next hypothesis from there. Every hunt you complete makes the next one faster and more precise.
If your organization needs structured support building a proactive threat hunting capability, including data source assessment, playbook development, and analyst training, Shield Operations works with security teams across the UK to build programs that find threats before they find you.