Incident Response: 6 Steps, Teams and Tools

Incident Response: 6 Steps and the Teams and Tools that Make Them Happen

Published
May 02, 2023

Reading time
20 mins

Learn how an incident response plan is used to detect and respond to incidents before they cause major damage.

What is incident response?

Incident response is an approach to handling security breaches. The aim of incident response is to identify the scope of the events, contain the damage, and mitigate or eradicate the root cause of the incident. An incident is a change in security posture, potentially a breach of law or policy or an otherwise unacceptable act, that concerns information assets such as networks, computers, or smartphones. It may or may not be materially reportable.

As the frequency and types of data breaches increase, the lack of an incident response plan can lead to longer recovery times, increased cost, and further damage to your information security effectiveness. This makes incident response a critical activity for any security organization.

This post is part of an extensive series of guides about hacking.

Why is incident response important?

When your organization responds to an incident quickly, it can reduce losses, restore processes and services, reduce the scope or effects, and mitigate exploited vulnerabilities. An incident that is not effectively contained can lead to a data breach with potentially catastrophic consequences. Incident response provides this first line of defense against security incidents, and in the long term, helps establish a set of best practices to prevent breaches before they happen.

If you fail to address an incident in time, it can escalate into a more serious issue, causing significant damage such as data loss, system crashes, and expensive remediation – or even external financial penalties depending on the type of incident and the industry involved. Effective incident response stops an attack in its tracks and can help reduce the risk posed by future incidents.

A solid incident response plan helps prepare your organization for both known and unknown risks. Reliable incident response procedures will allow you to identify security incidents immediately when they occur and implement best practices to block further intrusion. Incident response is essential for maintaining business continuity and protecting your sensitive data.

Your response strategy should anticipate a broad range of incidents. Even simpler incidents can impact your organization’s business operations and reputation long-term. In addition to the technical burden and data recovery cost, another risk is the possibility of legal and financial penalties, which could cost your organization millions of dollars. 

Types of externally-sourced security incidents

Here are several types of significant security incidents caused by malicious external threat actors that require organized incident response.

Phishing and social engineering

Phishing and social engineering are among the most common sources of security incidents. These attacks often involve manipulating individuals into disclosing confidential information, such as passwords or credit card numbers.

Social engineering uses psychological manipulation to trick people into making security mistakes or giving away sensitive information. Examples can range from emails claiming to be from a trusted source, to phone calls, to even physical impersonations.

DDoS attacks

Distributed Denial of Service (DDoS) attacks flood a network, system, or server with traffic to overwhelm it and make it inaccessible to users. These attacks can be crippling to an organization that depends on its website or systems for daily activity, causing significant downtime and potentially leading to loss of revenue and customer trust, or even physical harm in sectors such as healthcare or manufacturing, depending on the system.

DDoS attacks can be particularly hard to defend against because they are performed by botnets, which might have thousands or millions of compromised computers attacking the target simultaneously. This type of attack is often used as a smokescreen for other malicious activities, distracting security teams while other attacks or trojans are deployed.

Supply chain attacks

Software supply chain attacks, also known as value-chain or third-party attacks, occur when someone infiltrates your system through an outside partner or supplier who has access via network connection to your systems and data. This type of cyberattack can be particularly damaging because it can bypass traditional security measures and give the attacker deep access to sensitive data and intellectual property.

These attacks are difficult to both anticipate and prevent, as they exploit the trust relationship between companies and their suppliers. This highlights the importance of vetting and monitoring third-party providers for potential security risks, and of making careful decisions in the risk register about which service accounts will have access to the home network and what they are authorized to do.

Ransomware

Ransomware is a type of malicious software that encrypts a victim’s files. The attacker then demands a ransom from the victim to restore access to the data upon payment. Ransomware attacks can lead to significant business disruption and financial loss if critical systems or data are impacted.

These attacks can be delivered through various vectors (especially social engineering), including email attachments, software vulnerabilities, or infected websites. A robust incident response plan is crucial in dealing with ransomware incidents and minimizing their impact. This includes discussions at the highest levels as to whether your organization is prepared to pay a ransom for the key – or indeed whether it is legal to do so. 

An example of this grey area is the pair of 2020 advisories in which the U.S. Department of the Treasury’s Office of Foreign Assets Control and the Financial Crimes Enforcement Network warned that paying, or helping to pay, a ransom may violate U.S. sanctions regulations in some cases. There have also been cases where paying the ransom did not result in the decryption key being delivered, and malware such as NotPetya combined ransomware with data destruction.

Insider threats

Insider threats are security threats that originate from within the organization. These may come from both careless and disgruntled employees, contractors, or anyone else who has been granted insider access to the company’s network and data.

These threats can be particularly challenging to manage and detect because these individuals often already have legitimate access to the company’s systems. This type of incident can lead to significant damage, including intellectual property theft, financial fraud, and damage to the company’s reputation.

The six steps of incident response

1. Preparation

Here are steps your incident response team should take to prepare for cybersecurity incidents:

  • Form an internal incident response team, and develop policies to implement in the event of a cyber attack. Review the tools needed to defend against each of these incident types, and centralize their management.
  • Review security policies and conduct risk assessments modeled against external attacks, internal misuse/insider attacks, and situations where external parties report potential vulnerabilities and exploits. (NIST provides a good framework – and there is a 2.0 update proposed and under consideration now.)
  • Prioritize known security issues or vulnerabilities that cannot be immediately remediated – know your most valuable assets to be able to concentrate on critical security incidents against critical infrastructure and data.
  • Develop a communication plan for internal, external, and (if necessary) public breach reporting. One important example in the USA – the Securities and Exchange Commission has recently imposed requirements for publicly-traded companies to disclose cyberattacks within four business days after determining they are material incidents.
  • Outline the roles, responsibilities, and procedures of the immediate incident response team, and the extended organizational awareness, operational, or training needs.
  • Recruit and train team members, and ensure they have access to relevant systems, technologies, and tools. 
  • Plan education for the extended organization members and affiliates for how to report potential security incidents or information. This includes for partners and secondary supply chain vendors – and should be part of their negotiation and master services agreements.

2. Identification

Decide which criteria call the incident response team into action, and create a triage or information-gathering document for the focal point. (If all security incidents are reported to the security operations team, for instance, the analysts on duty will need to know how to ask the right questions to perform basic triage.)

IT systems gather events from monitoring tools, log files, error messages, firewalls, and intrusion detection systems. This data should be analyzed by automated tools and security analysts to decide if anomalous events represent security incidents. For example, just seeing someone hammering against a web server isn’t a guarantee of compromise – security analysts should look for multiple factors, changes in behavior, outbound traffic, and new event types being generated.
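
As a rough illustration of looking for multiple factors rather than reacting to one noisy signal, the sketch below counts how many distinct indicator types have been seen per host and flags only hosts that reach a threshold. The indicator names, the (host, indicator) event layout, and the threshold of three are assumptions made for this example, not values from any particular tool.

```python
from collections import defaultdict

# Hypothetical indicator names; a real deployment would map these from its own tooling.
INDICATORS = {"inbound_bruteforce", "behavior_change", "unusual_outbound", "new_event_type"}

def hosts_to_escalate(events, min_indicators=3):
    """Return hosts showing at least `min_indicators` distinct indicator types.

    `events` is an iterable of (host, indicator) pairs.
    """
    seen = defaultdict(set)
    for host, indicator in events:
        if indicator in INDICATORS:
            seen[host].add(indicator)
    return sorted(h for h, kinds in seen.items() if len(kinds) >= min_indicators)

events = [
    ("web01", "inbound_bruteforce"),
    ("web01", "inbound_bruteforce"),  # repeated hammering alone is not enough
    ("web02", "inbound_bruteforce"),
    ("web02", "unusual_outbound"),
    ("web02", "new_event_type"),
]
print(hosts_to_escalate(events))
```

Here only web02 is flagged, because web01 shows a single kind of indicator no matter how often it repeats.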

When an incident is identified, the incident response team should be alerted. Team members coordinate the appropriate response to the incident:

  • Identify and assess the incident and gather evidence.
  • Decide on the severity and type of the incident and escalate, if necessary.
  • Document actions taken, addressing “who, what, where, why, and how.” This information will be used later as evidence if the incident reaches a court of law. 

Keep in mind that every step of awareness and investigation, from logs to emails, phone calls, and personnel involved, becomes part of the record of investigation. Make sure that your incident response teams and security analysts understand the importance of recording the names, dates, times, and communications for every person involved throughout this process!
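
One way to keep such a record consistent is to capture every action in the same structure. The sketch below is a minimal example, assuming a hypothetical `InvestigationEntry` type whose fields mirror the "who, what, where, why, and how" questions plus a UTC timestamp; the sample values are invented.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass(frozen=True)  # frozen: an entry cannot be modified after it is written
class InvestigationEntry:
    who: str
    what: str
    where: str
    why: str
    how: str
    # Timestamp is taken in UTC when the entry is created.
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

entry = InvestigationEntry(
    who="analyst on duty",
    what="isolated host from the network",
    where="web02",
    why="unusual outbound traffic after brute-force attempts",
    how="switch port disabled by network operations",
)
print(asdict(entry))
```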

3. Containment

Once your team isolates a security incident, the aim is to stop further damage. This includes:

  • Short-term containment — an instant response, so the threat doesn’t cause further damage. This can include taking down production servers that have been hacked or isolating a network segment that is under attack.
  • System backup — before you wipe and reimage affected systems, back them up to capture a “current state” or forensic image. A forensic image is a bit-for-bit copy of a hard disk, or a specific disk partition. Disk images are created after an incident to preserve the state of a disk at a specific point in time and thus provide a static ‘snapshot,’ which you can use as evidence of the security incident, and to investigate how the system was compromised.
  • Long-term containment — While making temporary fixes to replace systems that have been taken down to image and restore, rebuild clean systems so you can bring them online in the recovery stage. Take measures to prevent the incident from recurring or escalating: install any security patches on affected and associated systems, remove accounts and backdoors created by attackers, alter firewall rules, and change the routes to null route the attacker address, etc.
  • Create scope documentation — Know precisely which credentials, service accounts, endpoints, servers, etc. were involved in the incident, and establish a place for storing disk images, lists, and reports for a clean chain of investigation and evidence preservation.
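
A common way to support evidence preservation, though not something this article prescribes, is to record a cryptographic hash of each disk image when it is acquired and recompute it later to show the image has not changed. The sketch below assumes the image is an ordinary file; `sha256_of_file` is a hypothetical helper name, and a small temporary file stands in for a real image.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large disk images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Demo with a small temporary file standing in for a disk image.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example image bytes")
    image_path = tmp.name

recorded_hash = sha256_of_file(image_path)          # store this alongside the image
print(recorded_hash == sha256_of_file(image_path))  # later: verify the image is unchanged
os.remove(image_path)
```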

4. Eradication

Contain the threat and restore affected systems to their initial state, or close to it. The team should isolate the root cause of the attack, remove threats and malware, and identify and mitigate vulnerabilities that were exploited to stop future attacks. These steps may change the configuration of the organization. The aim is to make changes while minimizing the effect on the operations of the organization. You can achieve this by stopping the bleeding and limiting the amount of data that is exposed.

This is done as follows:

  • Identify and fix all affected hosts, including hosts inside and outside your organization
  • Isolate the root of the attack to remove all instances of the malicious software
  • Conduct malware analysis to determine the extent of the damage
  • See if the attacker has reacted to your actions – check for any new credentials created or permission escalations going back to the publication of any public exploits or POCs
  • Make sure no secondary infections have occurred, and if so, remove them
  • Allow time to make sure the network is secure and that there is no further activity from the attacker(s)
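
The check for new credentials in the list above can be sketched as a simple date filter. This is a minimal example, assuming account records are available as (username, creation date) pairs; the account names and dates are invented for illustration.

```python
from datetime import date

def accounts_created_since(accounts, exploit_published):
    """Return usernames created on or after the date the exploit became public.

    `accounts` is an iterable of (username, created_on) pairs.
    """
    return [name for name, created_on in accounts if created_on >= exploit_published]

accounts = [
    ("alice", date(2022, 11, 3)),
    ("svc-backup2", date(2023, 4, 20)),
    ("bob", date(2023, 1, 15)),
]
# Accounts worth a closer look if the exploit was published on 2023-04-01.
print(accounts_created_since(accounts, date(2023, 4, 1)))
```

The same filter could be applied to permission changes, given a record of when each change was made.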

Ensure your team has removed malicious content and checked that the affected systems are clean. For example, if the attacker used a vulnerability, it should be patched, or if an attacker exploited a weak authentication mechanism, it should be replaced with strong authentication. This may require heavy cooperation with architecture, operations, and engineering teams, so make sure they are included in your communications plan.

5. Recovery

The purpose of this phase is to bring affected systems back into the production environment carefully, to ensure they will not lead to another incident. Always restore systems from clean backups: replace compromised files or containers with clean versions, rebuild systems from scratch, install patches, change passwords, and reinforce network perimeter security (e.g., boundary router access control lists, firewall rulesets, etc.).

Decide how long you need to monitor the affected network and endpoint systems, and how to verify that the affected systems are functioning normally. Calculate the cost of the breach and associated damages, including lost productivity and the human hours spent troubleshooting, restoring, and fully recovering systems.
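
The cost calculation can start as simple arithmetic. The sketch below is one possible breakdown, assuming lost productivity is estimated from downtime and labor from the hours spent on response; the function name, the cost categories, and all figures are illustrative assumptions.

```python
def breach_cost(downtime_hours, revenue_per_hour, response_hours, hourly_rate, other_costs=0):
    """Estimate breach cost as lost productivity + response labor + other costs."""
    lost_productivity = downtime_hours * revenue_per_hour
    labor = response_hours * hourly_rate
    return lost_productivity + labor + other_costs

# 8 h of downtime at 5,000/h, 120 response hours at 90/h, 15,000 in other costs.
print(breach_cost(8, 5000, 120, 90, other_costs=15000))  # 40000 + 10800 + 15000 = 65800
```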

6. Lessons Learned

After any incident, it’s important to hold a debriefing or lessons learned meeting to capture what happened, what went well, and evaluate the potential for improvement. The incident response team and stakeholders should communicate to improve future processes. Complete documentation that couldn’t be prepared during the response process. The team should identify how the incident was managed and eradicated.

See what actions were taken to recover the attacked system, the areas where the response team needs improvement, and the areas where they were effective. Reports on lessons learned provide a clear review of the entire incident and can be used in meetings, as benchmarks for comparison or as training information for new incident response team members.

Who handles incident response? The Computer Security Incident Response Team (CSIRT)

To prepare for and attend to incidents, you should form a centralized incident response team, responsible for identifying security breaches and taking responsive actions. In a large organization, this is a dedicated team known as a CSIRT. The CSIRT includes full-time security staff — including any specialized insider threat teams. These individuals analyze information about an incident and respond. 

In a smaller organization, the incident response team can consist of IT staff with some security training, augmented by in-house or outsourced security experts.

The incident response team also communicates with stakeholders within the organization, and is involved with offering preliminary written communication with external groups such as press, legal counsel, affected customers, and law enforcement. 

The team should include:

  • Incident response manager (team leader) — coordinates all team actions and ensures the team focuses on minimizing damages and recovering quickly. Prioritizes actions during the isolation, analysis, and containment of an incident. Oversees all actions and guides the team during high severity incidents.
  • Security analysts — the manager is assisted by a team of security analysts who work across departments to isolate and rectify flaws in the organization’s security systems, solutions, and applications. They recommend specific measures to improve the overall security posture.
  • Lead investigator — isolates root cause, analyzes all evidence, manages other security analysts, and conducts rapid system and service recovery.
  • Threat researchers — provide the context of an incident and threat intelligence. They use this information and records of previous incidents to create a database of internal intelligence. On many security teams, threat researchers are gradually replaced by automated threat intelligence tools.
  • Communications lead — communicates with all audiences inside and outside the company, including management, internal stakeholders, executives and/or board members, legal, press, and customers.
  • Documentation and timeline lead — documents team investigation, discovery, and recovery efforts, and creates a timeline for each stage of the incident. Next-generation Security Information and Event Management (SIEM) systems are able to generate documentation and incident timelines automatically. For example, see the Exabeam Advanced Analytics module offered by the Exabeam Security Management Platform.
  • HR/legal representation — there is the possibility that an incident could eventually involve criminal charges. (This is also why maintaining a clean chain of evidence and record of investigation is important.) Thus, you should have HR and legal guidance, especially if you engage secondary or external incident response professional services.

Incident response tools and technologies

SIEM

Security Information and Event Management (SIEM) solutions collect and aggregate log data generated by applications, servers, and network devices in an organization’s IT environment. This data is then analyzed and correlated to identify patterns that could indicate a security incident.

SIEM tools not only help detect potential threats but also aid in incident response by providing actionable intelligence. They generate alerts based on predefined rules and severity levels, enabling security teams to prioritize and respond to incidents more effectively. Moreover, SIEM solutions support compliance reporting by providing evidence of security measures.
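
To make the idea of predefined rules and severity levels concrete, here is a minimal sketch of rule-based alerting. It is not how any particular SIEM product works; the rule names, record fields, and thresholds are assumptions chosen for illustration.

```python
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}

# Hypothetical rules: (name, severity, predicate over a log record).
RULES = [
    ("many_failed_logins", "high", lambda r: r.get("failed_logins", 0) >= 10),
    ("admin_login_off_hours", "medium", lambda r: r.get("is_admin") and r.get("hour", 12) < 6),
    ("malware_signature", "critical", lambda r: r.get("signature_match", False)),
]

def generate_alerts(records):
    """Apply every rule to every record and return alerts, most severe first."""
    alerts = []
    for record in records:
        for name, severity, matches in RULES:
            if matches(record):
                alerts.append({"rule": name, "severity": severity, "host": record["host"]})
    return sorted(alerts, key=lambda a: SEVERITY_ORDER[a["severity"]])

records = [
    {"host": "vpn01", "failed_logins": 14},
    {"host": "db01", "is_admin": True, "hour": 3},
    {"host": "pc17", "signature_match": True},
]
for alert in generate_alerts(records):
    print(alert["severity"], alert["rule"], alert["host"])
```

Sorting by severity puts the critical alert at the top of the queue even though its record arrived last.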

SOAR

Security Orchestration, Automation, and Response (SOAR) is a stack of security solutions that allow an organization to collect data about security threats from multiple sources, and respond to security events without human assistance.

SOAR tools can automate common response actions, reducing the number of manual tasks that security teams need to perform. They can also streamline incident response workflows, making it easier for teams to track, manage, and resolve incidents. Additionally, SOAR tools can integrate with a wide range of security tools, providing a centralized platform for managing incident response.
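
The automation described here is often organized as playbooks: an ordered list of actions per alert type. The following sketch assumes hypothetical action functions that only return a description of what they would do; in practice each would call the API of an email gateway, endpoint agent, or ticketing system.

```python
# Stand-in actions: these only describe the step instead of performing it.
def block_sender(alert):
    return f"block sender {alert['sender']}"

def quarantine_message(alert):
    return f"quarantine message {alert['message_id']}"

def isolate_host(alert):
    return f"isolate host {alert['host']}"

def open_ticket(alert):
    return f"open ticket for {alert['type']} alert"

# Hypothetical playbooks mapping an alert type to an ordered list of actions.
PLAYBOOKS = {
    "phishing": [block_sender, quarantine_message, open_ticket],
    "malware": [isolate_host, open_ticket],
}

def run_playbook(alert):
    """Run each step for the alert type; unknown types only get a ticket."""
    steps = PLAYBOOKS.get(alert["type"], [open_ticket])
    return [step(alert) for step in steps]

alert = {"type": "phishing", "sender": "billing@example.test", "message_id": "msg-001"}
for action in run_playbook(alert):
    print(action)
```

Falling back to a ticket for unknown alert types keeps a human in the loop for anything the playbooks do not cover.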

EDR and XDR

Endpoint Detection and Response (EDR) and Extended Detection and Response (XDR) are security technologies that provide comprehensive threat detection and response capabilities. 

EDR focuses on endpoints like laptops, desktops, and mobile devices, monitoring them for signs of security incidents. XDR expands this coverage to include network traffic, cloud environments, and other potential attack vectors. Both EDR and XDR can play a crucial role in incident response, providing the visibility and control needed to detect, investigate, and respond to advanced threats quickly and effectively.

UEBA

User and Entity Behavior Analytics (UEBA) analyzes the normal conduct of users, endpoints, and systems, and uses it to detect anomalous behavior that deviates from the ‘norm’. These tools use machine learning, algorithms, and statistical analyses to detect meaningful anomalies from the behaviors of users, machines, and networks within an IT environment.

In incident response, UEBA tools can help security teams identify unusual or suspicious behavior that may indicate a security incident. For instance, if a user suddenly starts accessing sensitive data they don’t normally interact with, this could be a sign of an insider threat or a compromised account. By detecting such anomalies, UEBA tools can help teams respond to incidents before they result in significant damage.
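
A very simplified version of this idea is to compare today's activity with a user's own history. The sketch below uses a z-score against a per-user baseline. Real UEBA tools use far richer models, and the threshold of three standard deviations and the sample counts are assumptions for illustration.

```python
from statistics import mean, stdev

def is_anomalous(history, today, threshold=3.0):
    """Flag `today` if it deviates from the historical mean by more than
    `threshold` sample standard deviations (in either direction)."""
    mu = mean(history)
    sigma = stdev(history)  # requires at least two historical values
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > threshold

# Number of sensitive files one user read per day over the past week.
daily_sensitive_file_reads = [4, 6, 5, 7, 5, 6, 4]
print(is_anomalous(daily_sensitive_file_reads, 5))   # typical day
print(is_anomalous(daily_sensitive_file_reads, 60))  # sudden spike
```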

IPS and IDS

Intrusion Prevention Systems (IPS) and Intrusion Detection Systems (IDS) are tools designed to detect and prevent security incidents. IDS monitors network traffic for signs of potential incidents, such as malicious payloads or suspicious behavior. If an incident is detected, IDS will send an alert to the security team.

On the other hand, IPS goes a step further by actively preventing incidents. For example, if an IPS detects an attempted security breach, it can take action to block the attack, such as by closing network connections or changing firewall rules.
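
The difference between the two can be shown with a toy example in which both use the same detection logic but only the IPS acts on it. The signature check, the packet layout, and the in-memory block list are stand-ins invented for this sketch; a real IPS would close connections or update firewall rules.

```python
BLOCKED_SOURCES = set()  # stand-in for firewall rules managed by the IPS

def looks_malicious(packet):
    # Hypothetical signature: a classic SQL injection fragment in the payload.
    return b"' OR 1=1" in packet["payload"]

def ids_inspect(packet):
    """Detect only: return an alert string or None. Traffic is not affected."""
    if looks_malicious(packet):
        return f"ALERT: suspicious payload from {packet['src']}"
    return None

def ips_inspect(packet):
    """Detect and prevent: return True if the packet may pass, False if dropped."""
    if packet["src"] in BLOCKED_SOURCES or looks_malicious(packet):
        BLOCKED_SOURCES.add(packet["src"])  # keep blocking this source from now on
        return False
    return True

packet = {"src": "203.0.113.9", "payload": b"id=1' OR 1=1 --"}
print(ids_inspect(packet))                                      # alert, packet still delivered
print(ips_inspect(packet))                                      # packet dropped, source blocked
print(ips_inspect({"src": "203.0.113.9", "payload": b"id=2"}))  # later traffic also dropped
```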

5 tips for successful incident response

1. Isolate exceptions

Technology alone cannot successfully detect security breaches. You should also rely on human insight. Following are a few conditions to watch for daily:

  • Traffic anomalies — sensitive connections and servers used internally will typically have a stable traffic volume. If you notice a sudden increase or decrease in monitored traffic — and either can be suspicious — take notice.
  • Accessing accounts without permission — privileged or administrator accounts have access to more information and systems than normal employees. However, employees tend to be the easiest entry point for cybercrime. Closely monitor privileged accounts and watch for privilege escalation on normal user accounts. Privilege escalation is a common malware trait, and should be identified quickly via rules or anomalies.
  • Excessive consumption and suspicious files — if you see an unexplained increase in memory or hard drive usage on your company’s systems, it could be that someone is illegally accessing them or leaking data.

Modern security tools such as User and Entity Behavior Analytics (UEBA) automate these processes and can identify anomalies in user behavior or file access automatically. This provides much better coverage of possible security incidents and saves time for security teams. Exabeam Behavior Analytics offers UEBA for Exabeam Fusion, Exabeam Security Investigation, and Exabeam Security Analytics. The latter two products can be layered on top of an existing SIEM, whereas Exabeam Fusion offers an all-in-one combination of UEBA, SIEM, and SOAR.

2. Use a centralized approach

Gather information from security tools and IT systems, and keep it in a central location, such as a SIEM. Use this information to create an incident timeline, and conduct an investigation of the incident with all relevant data points in one place.
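
Once events from different tools sit in one place, building a timeline is largely a matter of merging and sorting them by time. The sketch below assumes each event is a dictionary with `time`, `source`, and `msg` keys; the event contents are invented for illustration.

```python
from datetime import datetime

def build_timeline(*sources):
    """Merge event lists from several tools into one list ordered by time."""
    merged = [event for source in sources for event in source]
    return sorted(merged, key=lambda event: event["time"])

firewall_events = [
    {"time": datetime(2023, 5, 2, 9, 15), "source": "firewall", "msg": "outbound connection to unknown address"},
]
endpoint_events = [
    {"time": datetime(2023, 5, 2, 9, 2), "source": "endpoint", "msg": "new scheduled task created"},
    {"time": datetime(2023, 5, 2, 9, 20), "source": "endpoint", "msg": "archive written to temp directory"},
]
auth_events = [
    {"time": datetime(2023, 5, 2, 8, 58), "source": "auth", "msg": "login from new location"},
]

for event in build_timeline(firewall_events, endpoint_events, auth_events):
    print(event["time"].isoformat(), event["source"], event["msg"])
```

This only works if all sources report time in the same time zone, which is one practical reason to normalize timestamps on ingestion.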

You can also use a centralized approach to allow for a quick automated response. Use data from security tools, apply advanced analytics, and orchestrate automated responses on systems like firewalls and email servers, using technology like Security Orchestration, Automation, and Response (SOAR).

3. Assert, don’t assume

Don’t conduct an investigation based on the assumption that an event or incident exists. Instead of making assumptions, make assertions, based on a question that you can evaluate and verify. For example: “If I’ve noted alert X on system Y, I should also see event Z occur in close proximity.”
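
An assertion of this kind can be written down as a check against the collected events. The sketch below is a minimal example, assuming events are dictionaries with `system`, `type`, and `time` keys and that "close proximity" means within five minutes; both are assumptions to adjust to your own data.

```python
from datetime import datetime, timedelta

def assertion_holds(events, alert_time, system, expected_type, window=timedelta(minutes=5)):
    """True if an event of `expected_type` occurred on `system` within `window` of the alert."""
    return any(
        e["system"] == system
        and e["type"] == expected_type
        and abs(e["time"] - alert_time) <= window
        for e in events
    )

alert_time = datetime(2023, 5, 2, 9, 0)  # alert X noted on system Y (here: web02)
events = [
    {"system": "web02", "type": "process_start", "time": datetime(2023, 5, 2, 9, 2)},
    {"system": "web01", "type": "outbound_connection", "time": datetime(2023, 5, 2, 9, 1)},
]
print(assertion_holds(events, alert_time, "web02", "process_start"))        # assertion verified
print(assertion_holds(events, alert_time, "web02", "outbound_connection"))  # not seen on web02
```

A failed assertion is useful too: it tells you either the hypothesis is wrong or the logging is incomplete.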

Create your assertions based on your experience administering systems, writing software, configuring networks, building systems, etc., imagining systems and processes from the attacker’s eyes.

4. Eliminate impossible events

You may not know exactly what you are looking for. On these occasions, eliminate occurrences that can be logically explained. You will then be left with the events that have no clear explanation. These are often represented at the start of incident triage calls, when people report symptoms without knowing what has caused them.

For example:

  • Unexplained inconsistencies or redundancies in your code
  • Issues with accessing management functions or administrative logins
  • Unexplained changes in volume of traffic (e.g., a drastic drop)
  • Unexplained changes in the content, layout, or design of your site
  • Performance problems affecting the accessibility and availability of your website or resources
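
The elimination process can be sketched as a filter: for each reported symptom, check whether a known, benign explanation applies, and keep only what is left. The symptom names and explanation checks below are hypothetical examples, not a complete catalog.

```python
# Hypothetical benign explanations, keyed by symptom.
KNOWN_EXPLANATIONS = {
    "traffic_drop": lambda e: e.get("during_maintenance", False),
    "admin_login_failure": lambda e: e.get("password_recently_rotated", False),
}

def unexplained(events):
    """Return events that have no known explanation, or whose explanation does not apply."""
    remaining = []
    for event in events:
        explain = KNOWN_EXPLANATIONS.get(event["symptom"])
        if explain is None or not explain(event):
            remaining.append(event)
    return remaining

events = [
    {"symptom": "traffic_drop", "during_maintenance": True},  # explained: planned maintenance
    {"symptom": "admin_login_failure"},                       # no rotation recorded
    {"symptom": "site_content_changed"},                      # no known explanation at all
]
print([e["symptom"] for e in unexplained(events)])
```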

5. Take post-incident measures

Continue monitoring your systems for any unusual behavior to ensure the intruder has not returned. Watch for new incidents and conduct a post-incident review to isolate any problems experienced during the execution of the incident response plan.
