Ruslan Rakhmetov, Security Vision
In today's aggressive threat landscape, an organisation cannot rely solely on preventive, prescriptive and proactive mechanisms. The Assumed Breach approach is far from new, but it remains relevant: no company can be 100% confident in its cyber security, so it makes sense to also focus on detective, deterrent, remedial, restorative and investigative defences. Even if attackers do penetrate the infrastructure, it is not too late to stop them before they do critical damage to the company. This article discusses information security (IS) incident management techniques, how to investigate cyber incidents, and the use of appropriate specialised tools.
According to NIST SP 800-61, Computer Security Incident Handling Guide, IS incident management consists of the following steps:
- Preparation;
- Detection;
- Analysis;
- Containment;
- Eradication;
- Recovery;
- Post-incident actions.
Building a cyber incident management process starts with an IS incident response policy, which allows the process to be integrated correctly and seamlessly into the company's overall information security management system (ISMS). It is important that the preparation and post-incident phases include actions that improve the company's cyber security. For example, preparation involves collecting and systematising information about the protected infrastructure (including a network map and an information flow diagram), business process dependencies and responsible persons. Keeping this information up to date in turn makes the IT infrastructure more transparent and easier to control.
In addition, NIST SP 800-61 specifically notes that it is easier to prevent an incident than to deal with its aftermath, so the knowledge and expertise of SOC analysts and response team members can and should be applied to preventing cyber incidents and improving the company's cyber resilience. The most effective preventive measures include comprehensive cyber risk management, endpoint security (including vulnerability management, updates, secure configuration, IS event logging and host monitoring), network security (including implementation of the Zero Trust principle), and security awareness training for employees.
As part of post-incident actions, it is important to analyse the causes of the incident, draw up a plan for eliminating the identified deficiencies, and use the results of the analysis to reconfigure individual protection systems, optimise ISMS elements, reassess cyber risks, and conduct additional IS training.
As part of the preparation for IS incident response, it is important to correctly configure the following security tools and features, which will be used to detect incidents and to carry out response actions:
- IS event logging should be configured on endpoints (servers and PCs), network devices, business applications, and virtualisation and containerisation environments. The list of events to log and forward to the SIEM system differs from system to system, so it is wise to consult best practices and documentation, where experts and developers give recommended logging settings. It is also worth deciding in advance how scripts and commands will be run remotely on devices during incident response (e.g. to list current processes, create a memory dump, search for indicators of compromise, or isolate the host from the network). Suitable tools include the PsExec utility, PowerShell Remoting, remote SSH connections, and IoC scanners (such as SPARK, THOR, Loki, rastrea2r and others).
- Security Information and Event Management (SIEM) systems, which collect, parse, store and efficiently search IS events, and also analyse, enrich and correlate them to generate IS incidents.
- Security Orchestration, Automation and Response (SOAR) platforms, which automate IS incident management through response scenarios whose actions are performed either directly by the SOAR platform or by the security tools that are integrated with it and managed by it.
- NTA (Network Traffic Analysis) systems that can identify threats in network traffic and record a copy of the network traffic for later examination, which can be useful in incident investigation.
- Computer forensics systems, which may be software products or combined hardware and software appliances for forensic analysis of attacked devices (RAM and storage devices) and for saving their images for later use in forensic investigations.
- Threat Intelligence Platforms (TIP), which can be set up in advance to process cyber threat intelligence and to enrich IS incidents with data from external analytical services to help investigate cyber attacks.
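To make the idea of event correlation in a SIEM more concrete, here is a minimal sketch of a single correlation rule. It is not taken from any particular product: the `Event` fields, the rule name, the threshold of five failed logins and the five-minute window are all assumptions chosen for illustration. The rule raises an incident when several failed logins from one source IP are followed by a successful login from the same IP.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """A normalised authentication event (hypothetical schema)."""
    timestamp: datetime
    source_ip: str
    user: str
    outcome: str  # "failure" or "success"


def correlate_bruteforce(events, threshold=5, window=timedelta(minutes=5)):
    """Return incidents where at least `threshold` failed logins from one
    source IP inside `window` are followed by a success from that IP."""
    recent_failures = defaultdict(deque)  # source_ip -> failure timestamps
    incidents = []
    for event in sorted(events, key=lambda e: e.timestamp):
        failures = recent_failures[event.source_ip]
        # Forget failures that have fallen out of the sliding window.
        while failures and event.timestamp - failures[0] > window:
            failures.popleft()
        if event.outcome == "failure":
            failures.append(event.timestamp)
        elif event.outcome == "success" and len(failures) >= threshold:
            incidents.append({
                "rule": "bruteforce_then_success",
                "source_ip": event.source_ip,
                "user": event.user,
                "failed_attempts": len(failures),
                "detected_at": event.timestamp,
            })
            failures.clear()  # avoid raising the same incident twice
    return incidents
```

Real correlation rules are usually written in the SIEM's own rule language and run over a stream rather than a list, but the sliding-window logic is broadly the same.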
If logging is correctly configured and logs are forwarded to the SIEM, an incident is generated according to the correlation rules and then passed to the SOAR system. The incident is then processed according to the configured response scenarios (playbooks). Typically, an L1 SOC analyst first has to decide whether the incident is a false positive, while the incident data is automatically enriched from internal and external sources (e.g. device data is pulled from the corporate ITAM system, user information is imported from the HR system, and data on file hashes, external IP addresses and DNS names is retrieved from the TIP and external analytical services).
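The enrichment step described above can be sketched as follows. This is only an illustration: the `ITAM`, `HR` and `TIP` dictionaries are hypothetical in-memory stand-ins for the real systems, and the field names and sample values are invented. In practice each lookup would be an API call made by the SOAR platform.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for the ITAM system, the HR system and the TIP.
ITAM = {"ws-0142": {"owner": "j.doe", "criticality": "high", "os": "Windows 11"}}
HR = {"j.doe": {"department": "Finance", "manager": "a.smith"}}
TIP = {"203.0.113.7": {"verdict": "malicious", "tags": ["c2"]}}


@dataclass
class Incident:
    host: str
    user: str
    indicators: list  # file hashes, external IPs, DNS names
    context: dict = field(default_factory=dict)


def enrich(incident, itam=ITAM, hr=HR, tip=TIP):
    """Attach asset, user and threat-intelligence context to the incident."""
    incident.context["asset"] = itam.get(incident.host, {})
    incident.context["user"] = hr.get(incident.user, {})
    # Keep only the indicators the TIP actually knows about.
    incident.context["intel"] = {
        ioc: tip[ioc] for ioc in incident.indicators if ioc in tip
    }
    return incident


incident = enrich(Incident("ws-0142", "j.doe", ["203.0.113.7", "198.51.100.50"]))
```

With the context attached, the analyst sees at a glance that the host is a high-criticality asset in Finance and that one of the two external addresses is already known to the TIP.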
In the analysis phase, the SOC team (primarily L2 analysts) categorises and prioritises the incident according to its properties and the characteristics of the attacked asset, investigates it directly, communicates with the responsible parties, and escalates depending on the criticality of the incident. During the investigation, the analyst tries to determine the attack vector and the attackers' entry point ("patient zero"), and to predict how the attackers are likely to move further through the network and what they may do next. At this stage the goal is to obtain the minimum information that is necessary and sufficient to quickly localise and eliminate the threat before the company suffers damage (e.g. data deletion or data theft).
An in-depth investigation with forensic analysis of compromised devices can take several days, but if a device has been infected with ransomware as a result of phishing, it is important to disconnect it from the corporate network as soon as possible to prevent the malware from spreading throughout the infrastructure. It is then necessary to check whether anyone else in the company received the same phishing email, whether other hosts show signs of compromise, and whether the attacker has had time to move laterally across the network. After that, the attacked PC can be examined in detail, starting with capturing an image of its hard drive and RAM. Specialised tools for deep investigation and forensic analysis include FTK Imager, Belkasoft RAM Capturer, Autopsy / The Sleuth Kit, Volatility Framework, Rekall Framework, NetworkMiner, utilities from the Eric Zimmerman suite, the REMnux and SANS Investigative Forensic Toolkit distributions, and other tools, a non-exhaustive list of which can be found on the Awesome Forensics project's GitHub page.
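The check for other recipients of the same phishing email can be sketched as a simple search over mail gateway records. The `MailRecord` schema and the matching criteria (same sender or same attachment hash) are assumptions made for this example. A real search would run against the mail server or SIEM and would probably also compare subjects, URLs and message headers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MailRecord:
    """One delivered message (hypothetical mail-log schema)."""
    recipient: str
    sender: str
    subject: str
    attachment_sha256: str  # empty string if there is no attachment


def other_recipients(mail_log, known):
    """Return recipients, other than the known victim, of messages that share
    the sender or the attachment hash with the known phishing message."""
    matches = set()
    for record in mail_log:
        if record.recipient == known.recipient:
            continue
        same_sender = record.sender == known.sender
        # Only compare attachments if the known message actually had one.
        same_attachment = bool(known.attachment_sha256) and (
            record.attachment_sha256 == known.attachment_sha256
        )
        if same_sender or same_attachment:
            matches.add(record.recipient)
    return sorted(matches)
```

The hosts of the returned recipients are the first candidates to check for signs of compromise.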
During the containment and eradication phases, it is important to use the integration capabilities of the SOAR solution to localise the threat as quickly as possible and return the infrastructure to a secure state. Active countermeasures can include isolating the attacked device from the network, blocking the compromised account, terminating the suspicious process, reconfiguring the firewall, moving the device to a "quarantine" VLAN, etc. The recovery phase relies on backup systems and on business continuity and disaster recovery procedures. At the post-incident stage, the incident response history kept in the SOAR system helps to assess whether the actions taken were correct and timely, identify bottlenecks, plan how to optimise the response process, and suggest ways to improve the ISMS and fine-tune protective measures to prevent similar incidents in the future.
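A response scenario of this kind can be sketched as an ordered list of containment actions per incident type. Everything in the sketch is hypothetical: the incident types, the action names and the `PLAYBOOKS` mapping are invented for illustration, and each action only records what it would do instead of calling a real firewall, EDR agent or directory service.

```python
# Each action appends a record to the log; a real SOAR platform would
# call the integrated security tool at this point.
def isolate_host(incident, log):
    log.append(f"isolate host {incident['host']}")


def terminate_process(incident, log):
    log.append(f"terminate process {incident['process']} on {incident['host']}")


def block_account(incident, log):
    log.append(f"block account {incident['user']}")


def move_to_quarantine_vlan(incident, log):
    log.append(f"move {incident['host']} to quarantine VLAN")


# Hypothetical mapping of incident type to an ordered list of actions.
PLAYBOOKS = {
    "ransomware": [isolate_host, terminate_process, block_account],
    "compromised_account": [block_account],
    "suspicious_host": [move_to_quarantine_vlan],
}


def run_playbook(incident):
    """Run the actions for the incident type and return the action log."""
    log = []
    actions = PLAYBOOKS.get(incident["type"])
    if actions is None:
        # No automated scenario: hand the incident to a human.
        log.append(f"escalate to analyst: no playbook for {incident['type']!r}")
        return log
    for action in actions:
        action(incident, log)
    return log
```

The returned log plays the role of the response history mentioned above: it records which actions were taken and in what order, which is what the post-incident review needs.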