
No-code development and ML assistants are the next generation of SOC analyst tools

19.06.2025

Eva Belyaeva, Security Vision

 

Introduction


Let's imagine what the workplace of a SOC analyst of the future might look like. In particular, we will consider which ML assistants would be useful in response and investigation: some of those mentioned in this article we have already implemented in our products, while others are still in the plans or can serve as ideas for those facing similar tasks.


First, consider how a SOC analyst's workplace is most often arranged today. In fact, the modern incident management process already actively uses available AI assistants that simplify or speed up the work.


For example, behavioral analysis of events already exists to detect incidents beyond the correlation rules. This is a whole mechanism designed to complement SIEM triggers where a simple rule is not enough. Thus, incidents can be found among the same events in different ways.


Also, in their daily work, SOC employees increasingly turn to AI for help when they need to write a correlation or normalization rule. Such requests usually arise in two cases: when it is clear what to do, but it is easier and faster to explain what should happen than to write the code itself, and when the work is too routine.


And finally, LLM for recommendations. Chances are, every company has someone who uses a chatbot when they need to write a simple script or ask a question about an investigation.


No-code


Let's start with the environment in which analysts work: it can be a console or a web interface of one or several information security systems. In our opinion, it is correct and clear to display all information and all tools in a single-window platform: on the one hand, to consolidate all information in one place, and on the other, to simplify the management of other systems for the analyst, including through no-code.


The use of no-code tools simplifies the work in three directions at once:


  • Transparency of the applied settings, such as correlation rules and response scenarios: you can immediately understand how a rule will work and whether it is correct.
  • Promptness of changes to correlation and normalization rules and response scenarios: at the post-incident stage it is often necessary to adjust the incident detection logic and reduce the number of false positives.
  • A lower entry threshold and easy setup: there is no need to learn syntax and programming languages, so the analyst is engaged in investigation and response, not development.


Limitations of using AI assistants


Now that we've made it easier for a specialist to work on incidents from a visual point of view, it's time to talk about assistants that could help the analyst and speed up the investigation.


What are the limitations and conditions for adding assistants to the workflow? The specifics of each customer must be taken into account individually before implementation.


Firstly, the expertise itself must correspond to what a specific customer has: what processes are built, in what format the team works and under what restrictions. We consider additional training of ML models on the customer's premises and on the customer's data to be mandatory.


The vendor can embed both its own expertise and best practices from international standards. Further training is also necessary so that the ML assistant's responses are relevant and up-to-date and can be immediately applied in practice.


In addition, the use of ML solutions in a SOC imposes extra restrictions: when working on-prem, the most important quality of the model is autonomous operation, since there will be no ML specialist nearby to promptly adjust it. The model still needs tuning: even if it works stably over a long period, no one is immune to irrelevant responses appearing. The cause can be changes in the traffic structure, the data structure or the infrastructure itself; if the model is not updated, it can even harm the work of the SOC. This makes automatic model shutdown and automatic additional training on the customer's side very important features.


For our part, we pay a lot of attention to automatic hyperparameter selection: the model itself applies certain retraining algorithms within given ranges, selecting the optimum for the final result based on retrospective data, so that it continues doing its job efficiently while adapting to changes.
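The idea of selecting hyperparameters against retrospective data can be sketched in a few lines. Below is a minimal illustration, not our production mechanism: a toy detector with one tunable threshold is re-scored on labeled historical events, and the value with the best F1 is kept. All names and numbers are hypothetical.

```python
from itertools import product

def f1(tp, fp, fn):
    """F1 score; 0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def auto_tune(history, param_grid, detect):
    """Pick the hyperparameter combination that maximizes F1 on
    retrospective labeled data (list of (event, is_incident) pairs)."""
    best_params, best_score = None, -1.0
    names = sorted(param_grid)
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        tp = fp = fn = 0
        for event, label in history:
            pred = detect(event, **params)
            if pred and label: tp += 1
            elif pred and not label: fp += 1
            elif not pred and label: fn += 1
        score = f1(tp, fp, fn)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy detector: flag an event when its volume exceeds a threshold.
detect = lambda event, threshold: event["volume"] > threshold

history = [({"volume": v}, v > 70) for v in range(0, 100, 5)]
params, score = auto_tune(history, {"threshold": [50, 60, 70, 80]}, detect)
```

In a real SOC the "grid" would be the given ranges mentioned above, and the evaluation window would slide so the model keeps adapting to changes.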


Main problems


When we talk about the need for an assistant, we need to start this story by discussing the current problems and shortcomings of automation or functionality of the information security system.


Firstly, there is a huge flow of incidents. When testing a pool of correlation rules, the volume of new triggers may be such that the analyst simply will not have time not only to investigate them, but even to review each one. In general, an information security system can produce many triggers if there is a zoo of solutions or the operation of a specific product has not yet been debugged. The analyst somehow needs to understand how to prioritize these incidents, and which ones need no attention at all.


Secondly, repeating cases. Regardless of whether you have an internal SOC or an MSSP, over time your systems will detect incidents that are somewhat similar to each other or even identical. The human factor cannot be ruled out here: someone may forget to configure something, or some hardening task is left undone, and now the SOC receives a stream of incidents about the same thing until someone fixes the error. It is also important that the team of analysts is not constant, and each time the new or remaining staff will try to remember or find in history what was done with such incidents a year or two ago, or what one employee did with an incident yesterday so that another can repeat it today. It would be great if this information were aggregated so that relevant tips could be provided each time during the response.


And finally, thirdly, there is the problem of accumulating expertise in the format of a single knowledge base. The task is to collect all the best developments, good solutions and response actions for the entire period and understand how the SOC responded and why.


A separate point is the task of searching for and identifying new types of attacks. Sometimes this has to be done manually, sometimes entire expert centers review event streams, and automation and the involvement of AI in this case would help distribute resources more efficiently.


We have selected 11 scenarios of how ML assistants could help cope with the tasks mentioned, and all of them are here for a reason: we have a whole concept for this, and we propose integrating assistants at each stage of the investigation.




Here is a combined incident management process based on NIST and SANS methodologies. Next, we will go through each of the stages in detail.


Stage: Preparation. Normalization of event sources


Here we solve a problem that many SIEM vendors (and not only they) face during integrations: the need to quickly and correctly normalize the data flow from a non-standard source. This can be custom software from the customer or an event source that writes to the log in a human-readable format: clear to the analyst, but problematic when you try to automate parsing.


It would be useful to train an AI assistant that would immediately parse such data into correlation rule attributes.
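The target of such an assistant is easy to show. Given a few sample lines from a hypothetical human-readable source (the line format and field names below are invented for illustration), the assistant's job would be to propose a pattern like the one hand-written here, mapping raw text onto correlation-rule attributes:

```python
import re

# Hypothetical human-readable line from a custom event source.
RAW = "2025-06-19 10:42:07 user=alice action=login src=10.0.0.5 status=failed"

# The kind of pattern an assistant could propose after seeing samples.
PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"user=(?P<user>\S+)\s+action=(?P<action>\S+)\s+"
    r"src=(?P<src_ip>\S+)\s+status=(?P<status>\S+)"
)

def normalize(line):
    """Map a raw line onto correlation-rule attributes, or None."""
    m = PATTERN.match(line)
    return m.groupdict() if m else None

fields = normalize(RAW)
```

The value of the ML step is producing and maintaining such patterns automatically as the source format drifts, instead of an engineer writing them by hand.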


Stages: Preparation, Analysis/Enrichment. Classification and Clustering of Users and Devices


When solving a clustering problem, our goal is to group objects – such as hosts or accounts – to provide additional context for an incident. This can include highlighting incidents that are active for the group whose members are currently involved in the investigation.


Depending on the results obtained, we can manage such incident criteria as mass and criticality. Some of this information may already be in the catalog services, but it is not always relevant at the time of the request. In addition, it is not enough to cluster once; this process must be repeated to take into account the current state of the infrastructure. The stability of the clusters is important here, so that with each change and the appearance of new assets, those already distributed do not begin to move chaotically between clusters.


The solution to this problem can become an additional element for other models – for example, for a recommender system.
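A minimal, self-contained sketch of the grouping idea: a tiny k-means over hypothetical asset features (the features and numbers are purely illustrative, and a production system would use a richer feature set and a proper library). A fixed seed keeps the result repeatable, which speaks to the cluster-stability concern above:

```python
import math, random

def kmeans(points, k, seed=0, iters=50):
    """Tiny k-means; a fixed seed gives stable, repeatable clusters."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Recompute each center as the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(c) / len(members)
                                   for c in zip(*members))
    return labels

# Hypothetical asset features: (open ports, daily logons).
hosts = [(2, 1), (3, 1), (2, 2),        # workstation-like
         (40, 90), (42, 95), (39, 88)]  # server-like
labels = kmeans(hosts, k=2)
```

Hosts in the same cluster can then share incident context: a trigger on one member raises attention for the whole group.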


Stage: Preparation. Asset Scoring


In terms of getting additional context for assets, you can add an AI assistant to prioritize them. For example, an analyst is working on an incident involving a host or account and needs to quickly understand what can be done with these assets in terms of response and what actions are available. Perhaps this is a very important host that cannot be touched at all, much less rebooted or shut down for a while.


The challenge, then, is to calculate the asset priority based on the current infrastructure and the incident context.


As a calculation algorithm, you can use classic PageRank, but with a multifactor model as the weight, or take results from another model (for example, trees) that accounts for the criticality of the asset, its confidentiality/integrity/availability (CIA) rating and the software installed on it. As a result, we get a score that can be dynamic: during the investigation it can be used to predict the intruder's route in the context of the incident or chain of incidents under investigation. Such scoring determines what needs attention first, be it an asset or a group of assets.
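As a sketch of the PageRank idea, here is a minimal implementation over an invented three-asset graph, where per-asset criticality weights replace the uniform teleport distribution (the asset names, edges and weights are illustrative, not a real model):

```python
def pagerank(graph, weights, damping=0.85, iters=50):
    """PageRank over an asset connectivity graph; `weights` is a
    per-asset prior (e.g. from a criticality model) used instead
    of the uniform teleport distribution."""
    nodes = list(graph)
    total_w = sum(weights.values())
    prior = {n: weights[n] / total_w for n in nodes}
    rank = dict(prior)
    for _ in range(iters):
        new = {}
        for n in nodes:
            incoming = sum(rank[m] / len(graph[m])
                           for m in nodes if n in graph[m])
            new[n] = (1 - damping) * prior[n] + damping * incoming
        rank = new
    return rank

# Hypothetical infrastructure: an edge means "communicates with".
graph = {
    "workstation": ["file-server"],
    "file-server": ["domain-controller"],
    "domain-controller": ["file-server"],
}
# Prior criticality, e.g. derived from the asset's CIA rating.
weights = {"workstation": 1, "file-server": 3, "domain-controller": 5}
scores = pagerank(graph, weights)
```

Recomputing the scores as the incident context changes is what makes the rating dynamic: an asset on the predicted intruder route rises to the top.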


Stage: Detection. Detection of typical attacks and abnormal events


Many vendors are already using ML models to search for anomalies in traffic. But in reality, not everything that could be used has been implemented yet.


For typical attacks, classic trees and boosting work; the peculiarity is that we work not with incidents but with primary events from infrastructure objects. Only on such a stream can we identify attacks such as, for example, C2 and botnet activity.


As for anomalies, a classic ML problem, we actively use both support vector machines and isolation forest; the difference, again, is that the work is carried out on raw events, those that come before correlation.
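Isolation forest and one-class SVM need a library such as scikit-learn, so as a self-contained stand-in here is the simplest form of the same idea applied to raw events: scoring each observation by its robust deviation (median absolute deviation) so a burst stands out without any correlation rule. This is an illustration of the principle, not the algorithms named above:

```python
import statistics

def mad_scores(values):
    """Robust anomaly score per value: distance from the median
    in units of the median absolute deviation (MAD)."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values) or 1.0
    return [abs(v - med) / mad for v in values]

# Hypothetical raw per-minute event counts from one source; the
# spike is a burst no correlation rule was written for.
counts = [12, 14, 13, 15, 12, 13, 240, 14, 13]
scores = mad_scores(counts)
anomalies = [c for c, s in zip(counts, scores) if s > 3.5]
```

Real detectors work over many features at once, which is exactly what the tree- and kernel-based methods add.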


Stage: Analysis/Enrichment. Definition of FP and TP


A large block of problems for a SOC analyst is information noise: it is not clear what is important and what can be neglected.


To solve this problem, you need an ML assistant that determines whether an incident is most likely a false positive or a true positive, based on the analyst's previous verdicts. This helps weed out incidents that arrived in a stream as a result of testing correlation rules, as well as those generated by misconfigurations and other incorrect software settings that analysts know about but cannot fix at the moment. Thus, only the incidents that are really important remain.


In addition, such incident labeling helps not to miss a trigger that an analyst would automatically mark as FP, for example because a familiar host is involved. In such a situation the ML assistant, having determined that the incident is genuine, raises the TP probability and with it the priority.


This is one of the most popular models and is not difficult to implement; boosting algorithms are also suitable. The peculiarity is that accumulating a representative sample may take time: only when your SOC has enough manually labeled FP and TP incidents can you count on relevant model results.
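The dependence on labeled history can be shown with an even simpler estimator than boosting: a smoothed TP rate per (rule, host) pair learned from analyst verdicts. The rule names and hosts below are invented; the point is that unseen pairs fall back to an uninformative 0.5, so the model only speaks where the sample is representative:

```python
from collections import defaultdict

class FPTPModel:
    """Estimate P(true positive) per (rule, host) pair from
    incidents already labeled by analysts; Laplace smoothing sends
    unseen pairs to an uninformative 0.5."""
    def __init__(self):
        self.tp = defaultdict(int)
        self.total = defaultdict(int)

    def learn(self, rule, host, is_tp):
        key = (rule, host)
        self.total[key] += 1
        self.tp[key] += int(is_tp)

    def p_tp(self, rule, host):
        key = (rule, host)
        return (self.tp[key] + 1) / (self.total[key] + 2)

model = FPTPModel()
for _ in range(8):   # rule fires on a known-noisy host: always FP
    model.learn("brute_force", "scan-host", is_tp=False)
for _ in range(4):   # same rule on a normal host: real incidents
    model.learn("brute_force", "db-server", is_tp=True)
```

Gradient boosting generalizes this across many incident attributes instead of a single pair, but the labeling requirement is the same.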


Only after the results have proven reliable can any automation be tied to the AI assistant's label; during data preparation, the label remains purely informational.


Stages: Analysis/enrichment, response. Search for similar incidents


We have identified three vectors of possible incident connections: identifying the attacker's pattern, building an attack chain, and inheriting recommendations.


In the case of determining the attacker's behavior pattern, we rely on the fact that whether it is a hacker group or an individual, a certain "handwriting" will still be traced when working in your infrastructure. An ML model could, by analyzing the actions performed by the attacker, determine the style of this attack; thus, an attack that occurred, for example, a year ago can be related to the one that is happening right now.


We also solved the problem of macro-correlation, aligning incidents along the Kill Chain, in different ways: linking incidents into an attack chain that can be investigated as a whole, rather than trying to respond to several separate incidents.


And ultimately, when we understand that incidents are similar, we can respond to them in similar ways if it has been helpful in the past.


As an implementation, we use two polar approaches: we can take an incident, extract the key context, build a vector from it and compute cosine similarity with other incidents; or, on the contrary, compile all the information into a short summary and then solve the task as one of measuring text similarity.
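The vector approach can be sketched with bag-of-words vectors and cosine similarity (the incident summaries below are invented; real systems use richer embeddings of the key context):

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def vectorize(summary):
    return Counter(summary.lower().split())

past = vectorize("failed login brute force on db-server from 10.0.0.5")
now = vectorize("brute force failed login on db-server from 10.0.0.9")
unrelated = vectorize("phishing email with malicious attachment")
```

A high score against a past incident lets the assistant surface its pattern, chain position and the recommendations that worked then.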


Stage: Analysis/Enrichment. Incident Scoring


Just as we needed to rank assets to understand the limitations of working with them, we need to score incidents.


For example, yesterday you investigated an incident in which your organization's website became unavailable. Let's say you are the customer company, and the website hosts your store. In the context of yesterday, the incident is serious, and a long downtime cannot be allowed, therefore, we investigate such an incident first.


But today things change: the same incident arrives among those with DDoS triggers, so the priority of the first incident should decrease as the focus shifts to the root of the problem.


In this case, the AI assistant must determine the severity of the incident based on the totality of its characteristics, attributes, associated objects and environment.


Here we can draw an analogy from another area: incident scoring is essentially no different from the scoring of people used when issuing loans or adjusting losses in insurance cases. With sufficiently trained and detailed trees, as in those areas, automation can be applied and a number of actions performed, provided the scoring model has earned enough trust.
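In the spirit of a credit scorecard, incident scoring can be sketched as weighted contributions from attributes and context. The features, weights and thresholds below are illustrative, not a calibrated model; note how the website incident from the example above outranks itself once the linked DDoS context is attached:

```python
# A toy scorecard: each attribute contributes weighted points and
# the total maps to a severity band. Weights are illustrative.
WEIGHTS = {
    "asset_criticality": 10,   # per level, 0-5
    "related_incidents": 5,    # per linked incident (e.g. DDoS root)
    "is_mass": 20,             # boolean flags
    "known_exploit": 25,
}

def score(incident):
    return sum(w * incident.get(f, 0) for f, w in WEIGHTS.items())

def severity(incident):
    s = score(incident)
    return "critical" if s >= 70 else "high" if s >= 40 else "medium"

website_down = {"asset_criticality": 4}
website_down_with_ddos = {"asset_criticality": 4,
                          "related_incidents": 6, "is_mass": 1}
```

Trained trees replace the hand-set weights, which is what lets the score stay honest as the environment changes.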


Stage: response. Digital imprint, search in knowledge base


In addition to the fact that we can use the results of ML models, we also propose the idea of forming a full-fledged digital copy of a SOC analyst. We could take either a competent analyst who knows how to investigate incidents in the infrastructure entrusted to him, or an entire SOC team that successfully investigates incidents, and form an interactive assistant in which all these skills and knowledge would be collected. This way, it will be possible to retain the expertise of an outsourced SOC, if you applied for their services for a certain period, and insure yourself in case of losing a key specialist.


To create a virtual assistant, it would be necessary to collect a knowledge base on how incidents were investigated, how additional information was searched for, what actions were performed, etc. Such a virtual analyst could be contacted both for a response prompt and to automate the response based on the collected data.


Stages: response, post-incident. Expert recommendations


Continuing with the knowledge base theme, expert recommendations for an incident can be compiled in the same way.


If we have a database of the customer's investigated incidents whose response was completed successfully, we can select the most frequently repeated steps for similar incidents. Unfortunately, in live operation it is not possible to quickly change a playbook or revise a response algorithm; most often such changes go through a chain of approvals, wasting time. Moreover, we can either use the customer's historical data to give personalized recommendations, or compile them from best practices for a more universal out-of-the-box solution.


In addition to the fact that playbooks become outdated in principle, they also cannot quickly adapt to changes in infrastructure or personnel - when, for example, a new escalation link appears.


And the ML model can automatically track these changes in the background and adjust recommendations to the current situation.


In terms of implementation, trees and boosting are used here as well. Training can be carried out on both internal and external expertise. Models pre-trained on open source data or external expertise must be extrapolated to the customer; various algorithms are used to map datasets containing ports or IP addresses onto the customer's infrastructure. This is relevant both for the initial use of the model and for its further training and retraining.


Stage: Post-incident. Reporting and briefing


Let's assume that the customer uses all the AI assistants we have developed in their work. What next?


The information noise has disappeared, analysts are busy with their work, and we are moving into the reporting phase. Even if the flow of irrelevant incidents has decreased, the volume of incidents requiring investigation may still be large.


No single person can quickly and without loss of quality review all SOC incidents and response actions in detail, and immediately assess how effectively the SOC is working, how much benefit the information security systems bring and how much they are used in the containment phase.


The purpose of the model in this case will be to provide a brief summary of the results of an incident investigation - one or several at once, as well as to conduct a strategic review of everything that happens in the SOC.


Stage: Post-incident. Update playbooks and correlation rules


The very last stage in the incident management process is working on errors. In the process of this work, it may be necessary to change the rules of normalization, correlation and the playbooks themselves.


When it comes to playbooks, it is necessary to determine what actions analysts perform more often for groups of similar incidents, what new tools have appeared, and which ones have completely stopped being used over time.


Manual implementation of such a request takes a relatively long time. In this case, the AI assistant itself could understand how it needs to rebuild the response scenario and provide recommendations.


The same goes for correlation rules: suppose the model understands which rule parameters cause the most FPs, for example in conjunction with the model that detects these false positives. Here, too, it could provide recommendations for tuning the rules.
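The first step of such tuning recommendations is just bookkeeping over labeled triggers: for each rule parameter value, the share of firings the analysts marked FP. The rule and parameter names below are invented; high shares become the tuning candidates the assistant would surface:

```python
from collections import defaultdict

def fp_rates(incidents):
    """Per (rule, parameter value) pair, the share of triggers
    that analysts marked as false positives."""
    fired = defaultdict(int)
    fp = defaultdict(int)
    for inc in incidents:
        key = (inc["rule"], inc["param"])
        fired[key] += 1
        fp[key] += not inc["tp"]
    return {k: fp[k] / fired[k] for k in fired}

history = [
    {"rule": "port_scan", "param": "threshold=10", "tp": False},
    {"rule": "port_scan", "param": "threshold=10", "tp": False},
    {"rule": "port_scan", "param": "threshold=10", "tp": True},
    {"rule": "port_scan", "param": "threshold=100", "tp": True},
]
rates = fp_rates(history)
noisy = [k for k, r in rates.items() if r > 0.5]
```

The ML layer on top decides which high-FP parameters can be changed safely, which is where the confidence requirements discussed above apply.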


Imagine you have a Red Team working for you, or you hired an external pentest. During their work you record everything the pentest team did: events and actions. In parallel, a list of incidents resulting from these attacks is formed. An AI assistant could, having examined these two data sets, independently create correlation rules relevant specifically to your organization.


There are two ways to solve this problem. On the one hand, the model can observe the actions the analyst performs and which branches he or she follows or skips; in a certain incident context, the model can remove unused branches or, conversely, automate the branches that are always executed once a certain set of parameters is reached. On the other hand, we can accept as input the entire flow of events from the Red Team (telemetry from devices, equipment and traffic) and automatically include it in the correlation rules. We also look at the actions the Blue Team performs manually and, based on this, assemble a composite response playbook. The main thing is for the model to reach a level of confidence that actions are unconditional, or otherwise to lay them down as branches in the response scenario.


Instead of a conclusion. A possible roadmap for ML assistants


When we say that assistants should definitely appear and make our work easier, we also need to think about what we are ready for now and what should wait. Let's look at the expected roadmap for implementing assistants in a SOC depending on its maturity level.


The logical first step is to get rid of visual noise and organize the incident pool so that it is immediately clear which triggers need to be investigated first:


  • Search for abnormal events;
  • Normalization of events;
  • Definition of FP and TP;
  • Scoring and classification of assets, incidents and attacks;
  • Search for similar incidents and patterns in events.


When we have learned to work with old attacks, it is time to move on to new ones. Here we can already accumulate a full-fledged knowledge base on investigated incidents.


  • Detection of typical attacks;
  • Expert recommendations;
  • Knowledge base search;
  • Playbook correction;
  • Reporting and briefing summaries.


At the very end, when the previous steps have been completed, you can already think about recreating the work of analysts in the form of a full-fledged assistant model.


  • Correction of correlation rules;
  • Digital imprint.


We hope that our developments in the area of ML assistants will really be able to simplify and speed up the work of SOC analysts, becoming a reliable aid in investigation.


For our part, we have always been and remain focused on finding a way or several ways to solve current problems, be it a high entry threshold for analysts or a lack of time to process a large data flow. And, most likely, if we close all the problems mentioned in this article, we will soon face new challenges.
