AI Cybersecurity. Part 1: Neural Networks and Machine Learning

29.01.2026

Ruslan Rakhmetov, Security Vision

 

At the end of August 2025, the world was shaken by the news that the information security company ESET had discovered PromptLock, the first ransomware of its kind based on generative AI: it generated malicious code on the fly by sending prompts through an API to the gpt-oss-20b LLM from OpenAI. The virus actually turned out to be an experimental research project from New York University's Tandon School of Engineering – for testing, the researchers uploaded their prototype to VirusTotal, where it was discovered by ESET security analysts. Despite the experimental nature of this sample, the alarm can only be considered partially false – the researchers demonstrated that AI-powered viruses can perform the typical actions of real ransomware attacks, including data discovery and analysis, data theft and encryption, and demanding a ransom that depends on the value of the files. Attackers previously spent significant resources creating such advanced viruses, but AI now greatly simplifies their development: this prototype consumes approximately 23,000 tokens to conduct a test attack, which costs less than $1 when using commercial services to access flagship LLMs, and it is also possible to use completely free open-source AI models.


The use of AI by attackers has become widespread. One example is a report on the malicious use of the Claude family of large language models from Anthropic, a company founded by former OpenAI employees. Here are just a few examples of malicious AI use from the report:

 

1) The attackers gave Claude a cover story that it was taking part in an authorized pentest, although in reality they were extorting at least 17 different companies. They used Claude Code (an AI agent that assists with software development) and the Code execution tool (a sandbox for executing commands on Claude's server side) together with the Kali Linux distribution. Claude was used to perform an initial scan of the victims' internet-facing infrastructure, brute-force possible access credentials, search for and exploit vulnerabilities, move through the infrastructure, and customize open-source hacking tools for gaining a foothold in the infrastructure, exfiltrating documents, and analyzing them. For example, during an attack on a financial institution, the attackers used Claude to analyze the stolen financial data, assess the company's financial health, and determine the ransom amount on that basis. This approach was dubbed "vibe hacking", by analogy with "vibe coding", which means simplifying software development using AI.

 

2) Another attacker, using Claude, developed ransomware that evades antivirus software and security analyzers, persists on the attacked system, detects and encrypts certain file types, and communicates with a command-and-control (C&C) server. The same attacker then offered this ransomware as RaaS (ransomware-as-a-service) on shadow forums.

 

3) Attackers used Claude to create fake IT profiles, successfully pass remote technical interviews at Fortune 500 companies, secure remote positions, and then conduct espionage at these companies while successfully completing all required work tasks. Moreover, the spies often didn't speak English and lacked professional competencies – effectively, the AI conducted the interviews for them.

 

4) Fraudsters used Claude to create profiles on dating sites and communicate with victims to extort money, as well as to create and maintain a service for buying and selling stolen payment card data.


To understand the threats inherent to AI and how to effectively protect ML models, we need to understand the basic concepts and definitions. Let's start with neural networks and machine learning.


1. The development of modern ML models and AI systems has been significantly influenced by neural networks – mathematical models that simulate the transmission of signals between neurons in the human brain. Neural networks consist of the following layers:

 

· Input layer: a set of input data (analogous to the input signal entering the human nervous system). For example, this could be a set of properties of a specific object, characteristics of an image or of speech, or words.

 

· Hidden (intermediate) layers: a set of linear functions of several variables (transformations) of the form


Y = W1*X1 + W2*X2 + ... + Wn*Xn + B


where X1 to Xn are the input data, W1 to Wn are the weights for the input data (the "importance" of the input signal, which is amplified or weakened when passing through a synapse in the human nervous system), B is the bias (it allows information to be passed to the next layer even if the input signals are weak), and Y is the weighted sum, i.e. the result of a linear transformation of the input data, which then passes through the subsequent layers.

 

· Output layer: produces the network's result. A nonlinear transformation (activation function) is applied to the weighted sum; in practice this is done in the hidden layers as well as in the output layer. Commonly used activation functions include the sigmoid, the hyperbolic tangent, the rectified linear unit (ReLU), the parametric ReLU (PReLU), the exponential linear unit (ELU), etc.
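The weighted sum and the activation functions described above can be sketched for a single neuron as follows. This is only an illustration: the input values, weights, and bias are hypothetical numbers chosen for the example, and the function names are not taken from any library.

```python
import math


def weighted_sum(inputs, weights, bias):
    """Compute Y = W1*X1 + W2*X2 + ... + Wn*Xn + B for one neuron."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def sigmoid(y):
    """Squash the weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-y))


def relu(y):
    """Rectified linear unit: pass positive values, zero out negative ones."""
    return max(0.0, y)


# Hypothetical inputs, weights, and bias, chosen only for illustration.
x = [1.0, 2.0, 3.0]
w = [0.5, -1.0, 0.25]
b = 1.75

y = weighted_sum(x, w, b)  # 0.5 - 2.0 + 0.75 + 1.75 = 1.0
print("weighted sum:", y)
print("sigmoid:", sigmoid(y))
print("tanh:", math.tanh(y))
print("ReLU:", relu(y))
```

The same weighted sum gives different outputs depending on the activation function, which is why the choice of activation is part of the network design.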


The simplest practical example of a single hidden layer in a neural network would be a linear regression, which would track the linear relationship between the output and a set of input parameters. For example, we have a list of car ads (of a specific make, model, generation, and trim level) with prices. We need to determine the relationship between price and the year of manufacture, the vehicle's mileage, and the number of previous owners – that is, we need to effectively select the coefficients (weights W1, W2, W3) in a linear function.

 

Y = W1*X1 + W2*X2 + W3*X3 + B

 

Here Y is the vehicle's price, X1 is the year of manufacture, X2 is the mileage, X3 is the number of previous owners, and B is the bias (the constant term of the function). The result will be a function approximating the processed data from the sales ads, which can be plotted as a hyperplane with a dimension equal to the number of parameters being processed (in our case, there are only three). Ultimately, using the resulting function, integrated into the car sales website, the seller will be able to set the optimal price for their car, taking into account its year of manufacture, mileage, and number of previous owners.
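As a rough sketch of how such coefficients could be selected, the example below fits W1, W2, W3, and B by ordinary least squares (solving the normal equations). Everything in it is an assumption made for illustration: the six "ads" are synthetic and were generated from an exact linear rule so that the expected result is known, the year is expressed as years since 2010 and the mileage in thousands of kilometres to keep the numbers small, and the helper names are hypothetical. Real ad data would be noisy and would not be reproduced exactly.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_linear_regression(rows, targets):
    """Least-squares fit; returns ([W1..Wn], B)."""
    design = [row + [1.0] for row in rows]  # constant column for the bias B
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(design, targets)) for i in range(k)]
    *weights, bias = solve_linear_system(xtx, xty)
    return weights, bias


def predict_price(weights, bias, features):
    """Y = W1*X1 + W2*X2 + W3*X3 + B"""
    return sum(w * x for w, x in zip(weights, features)) + bias


# Synthetic ads: [years since 2010, mileage in thousand km, previous owners]
ads = [
    [5.0, 120.0, 3.0],
    [8.0, 90.0, 2.0],
    [10.0, 60.0, 1.0],
    [12.0, 30.0, 1.0],
    [7.0, 100.0, 2.0],
    [11.0, 45.0, 2.0],
]
# Prices generated from Y = 800*X1 - 50*X2 - 300*X3 + 12000 (illustrative rule).
prices = [9100.0, 13300.0, 16700.0, 19800.0, 12000.0, 17950.0]

weights, bias = fit_linear_regression(ads, prices)
print("weights:", [round(w, 3) for w in weights], "bias:", round(bias, 3))
print("suggested price:", round(predict_price(weights, bias, [9.0, 70.0, 1.0]), 2))
```

Because the synthetic prices follow the rule exactly, the fit recovers the generating coefficients up to rounding error; with real ads the fitted function would only approximate the data.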


2. The task of neural networks is to process incoming data, identify relationships between the data and the obtained result, and then predict results for new incoming data. A key feature of neural networks is training on a large set of historical data to identify patterns and relationships, i.e., to determine the weights and biases. Training proceeds as follows: the neural network runs the data through a forward pass and produces a result, and the deviation of this result from the expected (“correct”, previously known) value is then estimated using a loss function. To reduce the deviation, a backward pass is used (backpropagation, also known as backward propagation of errors): the error is passed backward through all layers of the neural network, and at this stage the network calculates how much each weight and bias contributed to the error. The network then adjusts the weights and biases to reduce the error. Various optimization methods can be used for this, such as gradient descent, stochastic gradient descent, and adaptive gradient algorithms.
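The training cycle described above (forward pass, loss function, backward pass, weight update) can be sketched in a deliberately simplified form. The example assumes a single linear neuron with one input, a mean squared error loss, and plain gradient descent; with only one layer, the backward pass reduces to a single application of the chain rule. The data, learning rate, and number of epochs are hypothetical values chosen for illustration.

```python
def train_linear_neuron(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w*x + b with gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    loss_history = []
    for _ in range(epochs):
        # Forward pass: compute predictions with the current weight and bias.
        predictions = [w * x + b for x in xs]
        # Loss function: mean squared deviation from the expected values.
        errors = [p - y for p, y in zip(predictions, ys)]
        loss_history.append(sum(e * e for e in errors) / n)
        # Backward pass: gradients of the loss with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        # Update step: move against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, loss_history


# Hypothetical training data that follows y = 2*x + 1 exactly.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b, loss_history = train_linear_neuron(xs, ys)
print("w =", round(w, 4), "b =", round(b, 4))
print("first loss:", loss_history[0], "last loss:", loss_history[-1])
```

In a multi-layer network the same idea applies, except that the gradients are propagated backward layer by layer, and frameworks compute them automatically.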


3. There are several main types of neural network architectures, for example:

· Feedforward neural networks (FNNs, Feedforward Neural Networks) are the simplest neural networks, used for basic operations such as recognizing digits and classifying simple objects;

· Convolutional neural networks (CNNs, Convolutional Neural Networks) are used for image recognition, computer vision, figure and face recognition;

· Long short-term memory networks (LSTMs, Long Short-Term Memory Networks) are used for translating texts and transcribing speech into text;

· Generative adversarial networks (GANs, Generative Adversarial Networks) consist of two competing networks (a generator and a discriminator). The generator attempts to create realistic data (images, text), while the discriminator attempts to distinguish this artificially created data from real data. Training aims for the point at which the generated data can no longer be distinguished from real data. These networks are used to create realistic images, including deepfakes, to enhance images, and to create various media content;

· Autoencoders (AEs) are used to detect anomalies in data and remove noise and spikes in images and signals;

· Self-organizing maps (SOMs) are used for visualization of multidimensional data and data clustering;

· Radial basis function networks (RBFNs, Radial Basis Function Networks) use radial basis functions (RBFs) as activation functions and have only one hidden layer. These networks are used to solve problems of approximation, prediction, and object classification;

· Recurrent neural networks (RNNs, Recurrent Neural Networks) are used for speech and handwriting recognition and for processing data sequences;

· Graph neural networks (GNNs, Graph Neural Networks) are a logical continuation of the development of convolutional and recurrent networks and are used to work with graph data structures, for example, in the analysis of social connections, connections between various entities (knowledge graphs), in bioinformatics and molecular drug design;

· Transformers are in many ways an evolution of recurrent neural networks and are used for natural language processing (NLP, Natural Language Processing) and natural language text synthesis. The most prominent example is ChatGPT (where GPT stands for Generative Pre-trained Transformer);

· Mamba is a relatively new neural network architecture, developed in 2023 and based on state space models (SSMs, State Space Models). Mamba can be used as a language model and demonstrates higher computational efficiency than transformers, which allows it to accept very long input sequences, at the cost of keeping only a compressed representation of the preceding context.


4. Large neural networks operate with billions of parameters, for which weights and biases are selected through training. For example, the aforementioned gpt-oss-20b model, with 21 billion parameters and 24 layers, is considered compact by modern standards. Conventionally, if a neural network contains 4 or more layers, its training is called deep learning (DL). Deep learning is considered a subset of the more general process of machine learning (ML). The difference is that classical ML relies on data processing algorithms designed in advance by humans, while DL automatically learns to process large volumes of data and independently adjusts its processing algorithms. Accordingly, classical ML is considered less resource-intensive, works well with small data sets, and allows its internal decision-making logic to be understood, while DL is more resource-intensive (it requires high-performance graphics processing units (GPUs) and tensor processing units (TPUs)) and works well with Big Data and with text, speech, and images, but its internal decision-making logic is a “black box” to an outside observer.
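To give a feel for how the number of parameters grows with the size of a network, the sketch below counts the weights and biases of a plain fully connected network. This is an illustrative assumption only: it does not describe the actual architecture of gpt-oss-20b or any other real model, and the layer sizes are hypothetical.

```python
def count_dense_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of neurons per layer, input layer first.
    Each neuron has one weight per neuron of the previous layer plus one bias.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # weights between the two layers
        total += n_out         # one bias per neuron in the receiving layer
    return total


# Hypothetical layer sizes: 3 inputs, one hidden layer of 4 neurons, 2 outputs.
small = count_dense_parameters([3, 4, 2])
# Hypothetical image classifier: 784 inputs, two hidden layers, 10 outputs.
larger = count_dense_parameters([784, 128, 64, 10])

print(small)   # (3*4 + 4) + (4*2 + 2) = 26
print(larger)  # 100480 + 8256 + 650 = 109386
```

Even this small hypothetical classifier has over a hundred thousand trainable parameters, which illustrates why training large networks is resource-intensive.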


5. There are several methods of machine learning:

 

· Supervised Learning is a method of machine learning that uses labeled datasets (sets of classified objects with identified characteristic features) for which a "teacher" (a human or a training sample) specifies the correct question-answer pairs. This is then used to construct an algorithm for answering subsequent similar questions. Let's give a simple example: a model is fed a set of photographs of various animals, and the teacher indicates which photographs contain cats. The trained model should then correctly identify cats in new photographs. A more complex example: if the task is to train a neural network to predict the consequences of car accidents, it is fed historical data (date and time of past accidents, number of participants in the accident, speed and direction of travel of the participants, braking distance, location, weather conditions) and the actual consequences of past accidents. Thus, the model receives input and output data from the teacher and must learn to predict the consequences of future accidents based on new input data. Several types of supervised learning algorithms can be distinguished: regression algorithms (including the linear regression discussed above, random forests, and gradient boosting), classification algorithms (including logistic regression, k-nearest neighbors, and support vector machines), Bayesian classifiers, and decision trees. Systems for risk assessment, fraud detection, predictive analytics, anomaly detection, and pattern recognition are built on ML models with supervised learning.

 

· Unsupervised Learning is a method of machine learning that does not use labeled datasets and does not specify the correct question-answer pairs. Instead, it requires the model to find relationships between objects based on their known properties, identify dependencies and patterns, and perform predictive modeling of future results based on known historical data. Unsupervised learning can be implemented using cluster analysis algorithms, including clustering methods such as the k-means method, the k-median method, hierarchical clustering, and discriminant analysis. Various recommender systems and anomaly detection systems are built on ML models with unsupervised learning. Examples of unsupervised learning models include autoencoders (AEs) and self-organizing maps (SOMs).

 

· Self-Supervised Learning is a method that allows ML models to learn on unlabeled data, using the relationships between the input data and the results of its processing as signals for further learning. Such algorithms are used in computer vision and natural language processing, i.e., in areas where manually creating the necessary training datasets is difficult.

 

· Semi-Supervised Learning is a machine learning method that combines a small amount of labeled data with a large amount of unlabeled data. This approach is justified by the fact that obtaining high-quality labeled datasets is a resource-intensive and time-consuming process. Generative adversarial networks (GANs) are one example of models used with semi-supervised learning.

 

· Reinforcement Learning can be viewed as a special case of supervised learning in which the “teacher” is the operating environment: it provides feedback to the ML model depending on the decisions the model makes, rewarding correct decisions and penalizing incorrect ones.
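The difference between supervised and unsupervised learning can be illustrated with two of the algorithms named above: k-nearest neighbors, which needs labels from a "teacher", and k-means, which groups the same points without any labels. The sketch below is a minimal illustration under stated assumptions: the points and the labels "normal" and "anomalous" are hypothetical, and the k-means routine uses a simplistic deterministic initialization (the first k points) rather than the randomized schemes common in practice.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Supervised: label a new point by majority vote of its k nearest neighbors."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def k_means(points, k, iterations=10):
    """Unsupervised: group points into k clusters without using any labels."""
    centroids = [points[i] for i in range(k)]  # simplistic initialization
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignments, centroids


# Hypothetical two-feature observations forming two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]
# Labels supplied by the "teacher"; used only by the supervised method.
labels = ["normal", "normal", "normal", "anomalous", "anomalous", "anomalous"]

print(knn_classify(points, labels, (2.0, 2.0)))  # nearest neighbors are all "normal"
print(knn_classify(points, labels, (7.5, 8.5)))  # nearest neighbors are all "anomalous"

assignments, centroids = k_means(points, k=2)
print(assignments)
print(centroids)
```

Note that k-means recovers the two groups but only as numbered clusters: attaching a meaning such as "normal" or "anomalous" to each cluster still requires a human or a labeled sample.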
