How to Generate a Hypothesis for a Threat Hunt

In recent years, security teams have increasingly turned to threat hunting as a way to proactively identify and close gaps in their defenses. While many attacks can be prevented with automated security, the attacks that evade it are often extremely damaging. This is where threat hunters come in. Threat hunters search for attacks already underway in the environment that automated tools have not yet identified, or may never identify at all.


While this approach adds significant value for defense and visibility, many find the process of threat hunting too demanding. Security teams struggle to understand their environment and are overwhelmed by incoming data. What techniques do you use to threat hunt? Where do you even begin?

This blog helps you understand how to generate a hypothesis for a threat hunt. While success and progress in a threat hunt can be hard to measure, a threat hunter who builds strong, intelligent hypotheses will run hunts that deliver value, add visibility, and compound on themselves.

If you’re interested in thoroughly understanding how to identify an attack, check out our white paper, Threat Hunting: Answering, Am I Under Attack?


Generating a Hypothesis

The process of threat hunting can be broken down into three steps: creating an actionable, realistic hypothesis, executing it, and testing it to completion. Though threat hunting can be aided by tools, generating the actual hypothesis comes down to a human analyst. It’s important to build hypotheses on observations, intelligence, and experience. Hypotheses should be actionable, testable, and constantly tuned.

Threat hunters should be able to make considerations based on their industry and environment to identify and start scoping out possible threats. From there, they need to ensure their hypotheses have actionable and testable results based on the data they have available. It’s best to combine indicators of compromise, environmental factors, and industry experience to create the most effective, high-efficacy threat hunts.

Doing this correctly can have a very positive impact on the security of an organization. According to SANS, 75% of IT professionals surveyed have reduced their attack surface thanks to aggressive threat hunting. So how do you get started building a strong, actionable hypothesis?

Static IOCs or Threat Feeds? Why Not Both?

To start threat hunting, the easiest place to begin is with external intelligence. External intelligence includes any type of open-source intelligence (OSINT) that threat hunters can use to identify threats in their environment. This can include indicators of compromise (IOCs), threat reports and blogs, and various tools.

Using Indicators of Compromise

An IOC is a piece of forensic data that has the potential to identify malicious activity on a system. In other words, an IOC is a red flag that alerts threat hunters to a potential threat.

Common IOCs that are useful for external intelligence include static ones like registry keys, IP addresses, and domains. The best way to identify whether these IOCs are present is to check them against firewall or antivirus logs. For example, asking the question, “Has this domain ever been accessed by my organization?” can give you valuable insight into whether your system has been communicating with a malicious external server.
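For instance, a first pass over exported DNS or proxy logs can answer that question directly. The sketch below is a minimal illustration, assuming a hypothetical CSV export with src_host and query columns; real log formats will differ:

```python
# Sketch: answer "Has this domain ever been accessed by my organization?"
# Assumes a line-oriented CSV export of DNS/proxy logs; the log format and
# domain are hypothetical placeholders -- adapt the parsing to your tooling.
import csv
import io

SUSPECT_DOMAIN = "malicious-example.com"  # IOC pulled from external intel

def hosts_that_resolved(log_text: str, domain: str) -> set:
    """Return the set of internal hosts that queried the given domain."""
    hits = set()
    reader = csv.DictReader(io.StringIO(log_text))
    for row in reader:
        # Match the domain itself and any subdomain of it.
        queried = row["query"].lower().rstrip(".")
        if queried == domain or queried.endswith("." + domain):
            hits.add(row["src_host"])
    return hits

sample_log = """src_host,query
10.0.0.5,intranet.local
10.0.0.7,malicious-example.com
10.0.0.7,cdn.malicious-example.com
10.0.0.9,news.example.org
"""

print(sorted(hosts_that_resolved(sample_log, SUSPECT_DOMAIN)))  # ['10.0.0.7']
```

If the set comes back non-empty, the hypothesis is confirmed and the affected hosts become the scope of the investigation.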

Here are ten top indicators of compromise to get started:

  1. Outbound Network Traffic: When there are suspicious traffic patterns on the network, this could be a sign that something is amiss. For example, many types of malware often communicate with a C2 server in the course of their activities.
  2. Privileged User Account Activity: If privileged users change their typical behavior, this could be a sign of an account takeover.
  3. Geography: Geography is used to flag fraud across industries, most notably in banking. If a user is logging in from somewhere wildly outside their usual patterns, this could be a sign of an intruder accessing their account.
  4. Login Attempts: Repeated login attempts with an incorrect username or password, or logins after hours that access privileged files, may signal a malicious user.
  5. Database Reads: If an attacker has entered the system, they will most likely try to exfiltrate data. This creates a large volume of database reads, which should be flagged if it is unusual for that operation.
  6. HTML Response: A good way to identify a SQL injection trying to extract a large amount of data through a Web application is the size of the HTML response. For example, if the size of the response is many MB, this is a sign something is amiss - the normal response is only around 200 KB.
  7. Requests Across Domain for One File: Attackers may attempt to locate one file across the domain. In this instance, they may change the URL on each request, but continue to look for the same file. Most individuals will not query a file hundreds of times at different URLs on the domain, so this is suspicious activity.
  8. Port: An application using a port for an unusual type of request, or using an obscure, rarely used port, can be a sign of an attacker’s communication channel.
  9. Registry Changes: Creating persistence is an important goal for a lot of malware. Any unusual changes to the registry are a big sign of trouble.
  10. DNS Queries: DNS queries are often used to communicate back to a C2 server. This traffic often has a distinct pattern, which over time is easier to recognize.
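Several of these indicators can be checked with short scripts. As one illustration, the regular query timing described in indicator 10 can be surfaced by measuring how consistent the gaps between a host’s DNS queries are. This is a hedged sketch with illustrative timestamps and thresholds, not a production detector:

```python
# Sketch: flag hosts whose DNS queries to a domain arrive at suspiciously
# regular intervals, a pattern typical of C2 beaconing (indicator 10 above).
# The timestamps, jitter threshold, and minimum event count are assumptions.
from statistics import pstdev

def looks_like_beaconing(timestamps, max_jitter=2.0, min_events=5):
    """True if inter-arrival times are nearly constant (low jitter)."""
    if len(timestamps) < min_events:
        return False  # too few events to judge
    ordered = sorted(timestamps)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return pstdev(gaps) <= max_jitter

# A host querying every ~60 seconds vs. a host browsing normally.
beacon = [0, 60, 121, 180, 241, 300]
human = [0, 4, 90, 95, 400, 2100]

print(looks_like_beaconing(beacon))  # True
print(looks_like_beaconing(human))   # False
```

Real beacons often add deliberate jitter, so in practice you would tune the threshold against baseline traffic rather than hard-coding it.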

The drawback of using IOCs is that not every attack presents in a way that flags a static IOC. Attack methods are constantly evolving. Advanced attacks are designed to circumvent this detection method, whether through fileless malware, stolen credentials, or LOLBins.

Furthermore, IOCs are based on past attacks, so IOCs for new attacks simply aren’t available yet. If you are facing a new attack, chances are a static IOC won’t catch it. It is not enough to rely on static indicators of compromise alone when defending. However, they offer a good starting point for a SOC or a new analyst who is just beginning to understand their environment.

For example, you can develop a hypothesis around whether or not an application on your network is connecting to a recently discovered, malicious domain. This is an example of a simple IOC. This lets you ask the question, “Do any applications in my environment connect to this malicious domain?” If you are able to confirm this hypothesis, you will need to inform your security team and take the necessary steps to resolve the issue.

Reading Threat Reports and Blogs

Threat reports and blogs are less direct than IOCs, but just as useful, especially when it comes to furthering your personal knowledge of the cyber landscape. This information gives you an understanding of how other companies handle exploits, what types of malware the industry is seeing, and what new techniques are being used to defend and attack. 

We recommend reading reports from well-known research teams, like Nocturnus, SecureList, Spider Labs, Cisco, and so on. Try to look for how this can impact your industry or organization. Make it a goal to take away at least one security recommendation you can apply to your organization or job from each threat report you read.

For example, you are reading the latest report from the Nocturnus research team on the new Ursnif variant. You see that the malware collects information from the target machine using mail stealing modules for Microsoft Outlook, Internet Explorer, and Mozilla Thunderbird. It is also able to evade some security products with an Anti-PhishWall and Anti-Rapport module. With this report in mind, what questions can you ask about your own environment? Does your system use Rapport or PhishWall? How can you identify if these or similar mail stealing modules are in place on your machine?

You may find it valuable to use a tool like YARA or the Cybereason Defense Platform to define a rule that finds the malicious behavior on your systems. Try creating your own custom rule with YARA based on a malware sample you have identified as a threat to your organization.
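As a starting point, a custom YARA rule might look like the sketch below. The rule name and strings are hypothetical placeholders based on the module names mentioned in the report, not indicators extracted from a real sample; replace them with strings from your own analysis:

```yara
rule Suspected_Ursnif_Variant
{
    meta:
        description = "Hypothetical example rule - replace the strings with IOCs from your own sample"
        author = "your-team"

    strings:
        $s1 = "Anti-PhishWall" ascii wide
        $s2 = "Anti-Rapport" ascii wide
        $mz = { 4D 5A }  // PE header magic bytes

    condition:
        $mz at 0 and any of ($s1, $s2)
}
```

The condition anchors the match to PE files and requires at least one of the module-name strings, which keeps the rule from firing on arbitrary documents that merely mention those products.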

In addition to building your own hypotheses, many well-written threat reports will list their own security recommendations. 

The Value of Social Media Engagement

Social media channels are a good, and at times essential, way to remain informed. Whether this comes in the form of a threat report, a single tweet on a specific behavior or an IOC, or some other means, many social media channels have useful information. Find these channels, whether it’s spending time on #InfoSecTwitter, joining the r/cybersecurity subreddit, or joining a Cybersecurity LinkedIn Group.

Some informative Twitter accounts to follow run the gamut, from sources like DarkReading to researchers like Amit Serper. SwiftOnSecurity, Brian Krebs, Graham Cluley, ThreatPost, and InfoSec Mag are all great places to start learning. Other handles like @checkmydump monitor and actively post password dumps to Twitter.

What makes social media different from other mediums is the immediacy of knowledge. The second a new variant is discovered, you can know about it through the power of social media. It’s a way of keeping up-to-date with the latest in the community, which can be invaluable and timely.

For example, when NotPetya was first used in an attack on Ukraine, researcher Amit Serper tweeted a vaccination within hours. Without social media, it would be much more difficult to reach such a wide audience in such a short amount of time. 

Identifying Commands with Google Dorks

Google knows everything, so of course it knows about threat hunting. Google dork queries are ways to search Google with advanced search operators to pinpoint the information you are looking to find when threat hunting. It is considered a passive attack method, but can be used to find usernames, passwords, email lists, sensitive information, and even some website vulnerabilities.

This is the power of Google dorks: this particular query finds PDFs located on publicintelligence.net with the string “sensitive but unclassified” in them. They are an easy way to sift through the massive supply of data Google has available, and are particularly useful for threat hunters.

For example, if you identified specific unknown commands in your environment, you could use Google dorks to search the entirety of GitHub. By searching GitHub for strings matching the commands, you could potentially find a malicious tool used to attack your system based on those commands. With this data, you can learn more about the threat. 

Threat Hunting with MITRE ATT&CK

As a framework, MITRE ATT&CK provides analysts with a common language. However, for a threat hunter it can be so much more. It is a treasure trove of relevant tactics, techniques, and procedures (TTPs), adversary emulation planning guidance, and profiles of advanced persistent threats (APTs).

Develop an adversary emulation plan to identify each step an attacker will take. MITRE ATT&CK even provides a listing of TTPs, so you can explore the tactics, techniques, and procedures attackers commonly use. By mapping out each phase of an attack, defenders can more clearly understand the attack process and potential red flags when looking for threats.

MITRE ATT&CK also provides a listing of eighty APTs and the industries they target. By looking at which APTs commonly attack which industries, you can get a better sense of the kinds of threats that are most prevalent in yours and base your adversary emulation plans around them.

For example, a security team in the healthcare space may identify the group Deep Panda as a relevant threat to their organization. The security team can use this information to create an adversary emulation plan and look for threats from this group in their existing system. MITRE ATT&CK has identified many common techniques used by Deep Panda, including the use of PowerShell scripts to download and execute programs in memory without writing to disk. Threat hunters can use this knowledge to create a hypothesis that there may be malicious PowerShell scripts running on their system. From there, they can begin the hunt.
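That hypothesis can be turned into a concrete first query. The sketch below triages exported PowerShell command lines for flags commonly associated with download-and-execute-in-memory tradecraft; the log format and keyword list are illustrative assumptions, and any hit is a lead to investigate, not proof of compromise:

```python
# Sketch: triage exported PowerShell command lines for flags commonly seen
# in download-and-execute-in-memory tradecraft (the Deep Panda hypothesis).
# The log lines and keyword patterns here are illustrative assumptions.
import re

SUSPICIOUS_PATTERNS = [
    r"-enc(odedcommand)?\b",  # base64-encoded payloads
    r"downloadstring",        # fetch a script over the network
    r"iex\b",                 # Invoke-Expression shorthand
    r"-nop\b",                # -NoProfile, common in one-liners
    r"hidden",                # -WindowStyle Hidden
]

def triage(command_lines):
    """Return (line, matched_patterns) for every suspicious command line."""
    leads = []
    for line in command_lines:
        lowered = line.lower()
        matches = [p for p in SUSPICIOUS_PATTERNS if re.search(p, lowered)]
        if matches:
            leads.append((line, matches))
    return leads

logs = [
    "powershell.exe -NoP -Enc SQBFAFgA...",
    "powershell.exe Get-ChildItem C:\\Reports",
]
for line, why in triage(logs):
    print(line, "->", why)
```

Keyword matching like this is noisy on its own; it is a way to shrink thousands of command lines down to a reviewable pile, after which the analyst decodes and judges each hit.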

To learn more about how to close holes in your defense with MITRE ATT&CK, check out our white paper, Five Stages to Create a Closed-loop Security Process with MITRE ATT&CK.

Leveraging Internal Intelligence

In contrast to external intelligence, internal intelligence is typically more complex and based around previous incidents and an understanding of the environment. More mature SOCs are able to work better with internal intelligence. Instead of solely fixing the bug or closing the hole in the defense, these SOCs perform retrospectives to follow up on the incident. If an event occurred in the past, what did the attackers do and how can it be used to make the system stronger and improve the defense? 

Relying on Previous Incidents

There is much to learn from a previous incident in your organization. An attack may leave traces that give clues as to where there are holes in an organization’s defenses. Traces of an attack can give threat hunters an idea of where similar threats may be, as well as how they function.

Some EDRs, like the Cybereason Defense Platform, can map out the entire attack story for you. This gives you an edge, as you’re able to analyze the attack step-by-step and create behavioral IOCs, TTPs, and custom rules.

Based on behaviors and actions observed in a previous attack, ask yourself how these behaviors could impact your current environment. Did you patch the issue properly? If any users are particularly careless with security, or caused an incident, have their habits changed?

Test it Out on Yourself

The next step is to take malware and run it, either to observe it or to test your defenses in a lab environment. This will give you an intuitive understanding of the malware. 

If you are able to detect and obtain a sample of a piece of malware, you can build a hypothesis around its activity. In a lab environment, run malware that has successfully infiltrated your system in the past, or a piece of malware you find interesting. Observe its behavior. Formulate questions based on this behavior.

For example, you run the latest variant of the Astaroth Trojan in a lab environment. While it’s running, you observe that it uses WMI to maintain persistence. In this moment, you remember that the security tools you use do not alert on WMI commands, so you pose the question: what processes result in a WMI command, and is there something I should have visibility into that I don’t currently?
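One way to start answering that question is to look for processes whose parent is the WMI provider host, the typical sign of a process launched via WMI. The sketch below assumes a hypothetical process snapshot export; adapt the field names to your EDR’s data:

```python
# Sketch: from a process snapshot, flag processes whose parent is the WMI
# provider host (WmiPrvSE.exe) -- a common sign of a process launched via
# WMI, like the persistence behavior observed with Astaroth in the lab.
# The snapshot format and PIDs below are hypothetical.

def wmi_spawned(processes):
    """Return names of processes parented by WmiPrvSE.exe."""
    by_pid = {p["pid"]: p for p in processes}
    flagged = []
    for p in processes:
        parent = by_pid.get(p["ppid"])
        if parent and parent["name"].lower() == "wmiprvse.exe":
            flagged.append(p["name"])
    return flagged

snapshot = [
    {"pid": 4,    "ppid": 0,   "name": "System"},
    {"pid": 900,  "ppid": 4,   "name": "WmiPrvSE.exe"},
    {"pid": 1337, "ppid": 900, "name": "cmd.exe"},
    {"pid": 2001, "ppid": 4,   "name": "svchost.exe"},
]

print(wmi_spawned(snapshot))  # ['cmd.exe']
```

Legitimate software also spawns processes through WMI, so each flagged process still needs to be baselined against what is normal in your environment.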

For more information about how to set up your own malware analysis lab, check out Christophe Tafani-Dereeper’s great breakdown on his personal blog.

In addition, tools like mimikatz are commonly used by attackers. By running tools like mimikatz in a lab environment, you can identify behaviors that take place in your environment that signal a common attack. See how the system reacts to mimikatz, then use this information to identify its use in your actual environment. To learn more about detecting mimikatz in memory for a threat hunt, check out Roberto Rodriguez’s interesting and lively blog.

However, keep in mind that building and running malware in a lab environment is time-consuming. This kind of activity may be limited to the mature SOC, depending on how many analysts you have available, their skill level, and what their bandwidth is like. In order to perform these activities, analysts will need to have malware analysis skills. This can include dynamic analysis, static analysis, and dynamic reverse engineering. There’s a great SANS course by Lenny Zeltser that delves into the specifics of malware analysis, FOR610: Reverse-Engineering Malware: Malware Analysis Tools and Techniques.

Tying it All Together

So what makes a good threat hunting hypothesis? A good hypothesis is a question that helps you identify threats or gain information about your environment, and that can be proven right or wrong. Not all of these goals need to be met, but your hypothesis should always reach a conclusion, whether it is proven or disproven.

Additional Things to Consider When Threat Hunting

  1. Break Myths: Beware of making assumptions about your environment. Assumptions limit your scope when forming a hypothesis and create blind spots in your defenses. For example, don’t assume that all service processes are launched by services.exe. Similarly, do not assume that every unsigned process with an unknown hash is inherently malicious.
  2. Find the Right Scope: Define your hypothesis in a way that designates a clear-cut scope. Ask a specific question that is not only clearly defined, but limited to actionable success. Strike a balance: too broad, and the hypothesis may become too strenuous to prove; too narrow, and it may not be very useful. For example, a hypothesis like, “show me all processes that connected to a malicious IP” is too broad, since it includes processes that are not themselves malicious. Instead, use a more specific hypothesis such as, “show me all processes that connected to a malicious IP, excluding processes that do not indicate malicious activity (like browsers and DNS processes).” This hypothesis will be easier to verify and, more importantly, more directed.
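The scoped hypothesis above maps directly onto a simple filter. This sketch assumes a hypothetical connection-record export and a benign-process allowlist of your own choosing:

```python
# Sketch: a scoped version of "show me processes that connected to a
# malicious IP" -- exclude process categories (browsers, DNS services)
# that routinely touch bad IPs without being malicious themselves.
# The record format, IP (documentation range), and allowlist are assumptions.

MALICIOUS_IP = "203.0.113.66"
BENIGN_PROCESSES = {"chrome.exe", "firefox.exe", "msedge.exe", "dnsmasq"}

def scoped_hits(connections, bad_ip):
    """Connections to bad_ip, minus processes we expect to stumble onto it."""
    return [
        c for c in connections
        if c["dst_ip"] == bad_ip and c["process"].lower() not in BENIGN_PROCESSES
    ]

conns = [
    {"process": "chrome.exe",  "dst_ip": "203.0.113.66"},
    {"process": "updater.exe", "dst_ip": "203.0.113.66"},
    {"process": "svchost.exe", "dst_ip": "198.51.100.9"},
]

print(scoped_hits(conns, MALICIOUS_IP))
# [{'process': 'updater.exe', 'dst_ip': '203.0.113.66'}]
```

The exclusion list is where the analyst’s environmental knowledge enters: the browser connection is expected noise, while the unknown updater process is a lead worth pulling on.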

 Keep in mind that the next step after creating the hypothesis is executing it, followed by testing it to completion. Do not create a hypothesis you cannot execute or resolve.

Just Do It

Threat hunting can be broken down into three steps: creating an actionable hypothesis, executing the hypothesis, and testing the hypothesis to completion. For new threat hunters, the first step can be the most daunting.

Use the tips in this blog post to start small and build towards bigger hypotheses. Your first threat hunt doesn’t need to be a complex, high-efficacy hypothesis. Begin with what you know, even with just an indicator of compromise, and build up knowledge from there. For more experienced threat hunters, try combining IOCs with knowledge of the environment and industry experience to create a more targeted hypothesis. What’s most important is to create a hypothesis and reach a conclusion, whether the hypothesis is proven true or false. Complete the process, then begin again with more knowledge than before.

Keep in mind the most important principles of generating a hypothesis for a threat hunt: it should be actionable, testable, and continuously tuned.

Find mentors in the community or get a consulting team to come in to talk about threat hunting. Having a conversation with an expert about what threat hunting is, how to do it, what information to collect, and what to do to defend is invaluable to an organization's security defense. 

Proactively seek out malicious activity and gaps in your defense program. Talk to an expert today about Cybereason Active Hunting.

About the Author

Mor Levi

Mor Levi is a security researcher at Cybereason.