Ruslan Rakhmetov, Security Vision
In our blog we often discuss how information security tools work, most of which concentrate on a specific perimeter or protection method, as well as general IS approaches and methodologies (e.g. Zero Trust and access control). This review covers an approach that protects the data itself rather than only the network perimeter and its defences. It is known as DCAP (Data-Centric Audit and Protection). These systems scan file shares, mail servers and other data stores to discover sensitive information, classify it by importance, and provide comprehensive control over unstructured data stored across various systems.
In addition to data discovery (finding and categorising sensitive data according to various criteria, e.g. data type, degree of confidentiality, belonging to a business process, etc.), DCAP systems perform several other important tasks:
- access control: determining who has access to what data and monitoring actions on that data, including reading, writing, deleting, copying and moving;
- preventing data breaches by detecting suspicious activity, blocking unauthorised access and generating detailed reports on security breaches;
- compliance with regulatory requirements (e.g. Russian Federal Law No. 152-FZ, GDPR and HIPAA);
- risk analysis that assesses potential threats to data;
- managing, creating and customising security policies, defining who can access what data and under what conditions;
- generating detailed reports on the state of data security, including information on breaches and incidents.
The level of security required to protect data varies with the industry and the specifics of the business. In some cases the system can automatically block suspicious activity, which partially covers the tasks of data leak prevention. DCAP can analyse natural language in text data (going beyond keyword and phrase matching) and analyse visual content in image and video files. The system uses hashing to verify data integrity and detect changes, and encryption to protect data. Because a DCAP system must process large volumes of data efficiently in real time and scale with the organisation's growing needs, vendors face additional requirements and the system architecture may differ from the classic one. Typically, a DCAP architecture includes the following components:
- an agent: software installed on the devices where the data is stored, such as servers and workstations;
- a centralised server that collects and stores the data received from the agents;
- an analytics engine responsible for processing the collected data, classification, anomaly detection and threat detection;
- a database that stores information about data classification, security policies, events and users;
- and the management console itself, a graphical interface that allows administrators to configure the system, view reports and manage security policies.
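The hashing-based change detection mentioned above can be illustrated with a minimal sketch. It assumes an agent that periodically takes a snapshot of a directory and compares it with the previous one. The function names (`file_digest`, `snapshot`, `diff_snapshots`) are hypothetical and do not come from any particular DCAP product. Real agents usually rely on file system events and much more efficient storage.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(root: Path) -> dict[str, str]:
    """Map every file under root (by relative path) to its digest."""
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Report which paths were added, removed or modified between two snapshots."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }
```

In this sketch the agent would send only the result of `diff_snapshots` to the central server, so unchanged files generate no traffic.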
DCAP systems are widely used in areas where the protection of sensitive information is of paramount importance, such as the financial sector (customer data, account and transaction information), insurance companies (policies, terms and conditions, claims, and medical data), healthcare (patient data, doctors' records, test results, clinical trials and drug formulas), the public sector (protection of state secrets, personal data of citizens and information about government projects), and law enforcement.
DCAP systems fulfil their tasks in stages. Of course, depending on the vendor chosen, the functions may vary, but typically the phases of work include:
1) scanning all data stores;
2) creating indexes to quickly find and analyse information;
3) classification, which can be performed manually (experts determine the sensitivity of the data and assign appropriate labels to it) and/or automatically (machine learning algorithms analyse file content to determine its type (text, graphics, audio, etc.) and to detect keywords, patterns and other features characteristic of sensitive information);
4) auditing and recording all user actions with the data;
5) building a profile of each user's behaviour (to detect anomalies and suspicious activity, as in UBA/UEBA solutions);
6) identifying possible links between different events to help detect complex attacks (like the correlation mechanism in SIEM solutions);
7) building a threat model that assesses the probability and consequences of threats and estimates a risk level for each type of data;
8) comparing against security policies and generating alerts when breaches are detected.
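The automatic part of the classification phase can be sketched in a simplified form. The example below uses regular expressions and a keyword list instead of machine learning, so it is a stand-in technique rather than what any vendor actually ships. The rule set (`PATTERNS`, `KEYWORDS`) and the label names are assumptions made for illustration. Card-like numbers are additionally checked with the Luhn algorithm to reduce false positives.

```python
import re

# Hypothetical rule set: label -> pattern. Real products use far richer rules.
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,19}\b"),
}
# Hypothetical keywords that hint at sensitive content.
KEYWORDS = {"confidential", "passport", "diagnosis"}


def luhn_valid(number: str) -> bool:
    """Check a candidate card number with the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def classify(text: str) -> set[str]:
    """Return the set of sensitivity labels detected in the text."""
    labels = set()
    if PATTERNS["email"].search(text):
        labels.add("email")
    if any(luhn_valid(m.group()) for m in PATTERNS["card_number"].finditer(text)):
        labels.add("card_number")
    lowered = text.lower()
    for keyword in KEYWORDS:
        if keyword in lowered:
            labels.add(f"keyword:{keyword}")
    return labels
```

For example, `classify("Contact ivanov@example.com, card 4111 1111 1111 1111")` yields the labels `email` and `card_number`, while a random 16-digit order number that fails the Luhn check gets no label.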
A DCAP solution works much like a librarian who, before each working day begins, has to walk around the premises looking for books not only on the shelves but everywhere else too. First, all the information has to be found and categorised by genre, and the books that visitors may study only in the reading room, without taking them home (because the information is important or the copy is rare), have to be marked. Then our ‘librarian’ lets the readers in and logs everything people take to study, at what time, and for how long they use the books, so that incidents can later be identified from unusual situations. Regulations matter too: readers are recorded in the logs and are required to return books on time, while the DCAP librarian checks all the logs every day.
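The librarian's daily review of the logs corresponds to the behaviour-profiling phase. A minimal sketch, assuming the system already has a history of daily file-operation counts for a user: today's count is flagged when it is far above that user's own baseline. The function name `is_anomalous`, the threshold of three standard deviations and the minimum history length are assumptions. UBA/UEBA solutions use considerably more sophisticated models.

```python
from statistics import mean, pstdev


def is_anomalous(history: list[int], today: int, threshold: float = 3.0) -> bool:
    """Flag today's count if it exceeds the user's baseline by more than
    `threshold` standard deviations."""
    if len(history) < 5:
        return False  # too little data to build a baseline
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return today > mu  # perfectly flat history: any increase stands out
    return (today - mu) / sigma > threshold
```

A user who normally opens about twenty files a day and suddenly opens four hundred would be flagged, while ordinary day-to-day variation would not.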
You would also be right to notice that DCAP functionality resembles that of DLP and is closely tied to data analysis, which is why vendors of data protection systems on the Russian market also offer data-centric auditing. Examples include Zecurion DCAP, InfoWatch Data Discovery and Solar DAG. On the other hand, there are specialised, data-focused systems that developed independently, such as the DCAP solutions from Varonis and Makves.
DCAP functionality also addresses access control tasks, which can be compared, for example, to parental control on a computer or smart TV or the work of a secretary. In the first example, parents set restrictions on their children's access to certain websites, programmes or TV shows, while in the second example, the secretary controls access to the manager's office, sorts incoming correspondence and makes sure that important documents do not fall into the wrong hands. A DCAP system performs similar functions, controlling access to data and protecting it from unauthorised use.
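The access control idea above can be sketched as a small default-deny policy check: an action is permitted only if some rule explicitly allows that role to perform it on data with that classification label. The `Rule` structure, the roles and the labels are hypothetical and serve only as an illustration. Real DCAP policies typically also take conditions such as time, location or device into account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    label: str          # classification label of the data
    roles: frozenset    # roles the rule applies to
    actions: frozenset  # actions those roles may perform


# Hypothetical policy set used only for this example.
RULES = [
    Rule("public", frozenset({"employee", "hr", "finance"}), frozenset({"read", "copy"})),
    Rule("personal_data", frozenset({"hr"}), frozenset({"read", "write"})),
    Rule("financial", frozenset({"finance"}), frozenset({"read", "write", "move"})),
]


def is_allowed(role: str, label: str, action: str, rules=RULES) -> bool:
    """Default deny: allow only if a rule matches label, role and action."""
    return any(
        rule.label == label and role in rule.roles and action in rule.actions
        for rule in rules
    )
```

With these rules, `is_allowed("hr", "personal_data", "read")` is permitted, while the same request from the `employee` role, or a `delete` action from `hr`, is denied and could be logged as a policy violation.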
With integrations between solutions (native ones from DCAP vendors or those implemented through an orchestrator such as SOAR), you can choose a solution independently of the vendor, based on your own requirements: the size of the organisation, the type of data, the budget and security needs.
Smaller companies may be suited to simpler solutions, while larger corporations will require scalable and flexible systems or serious integration with other IS/IT solutions. If an organisation works with large volumes of unstructured data (documents, emails), it is necessary to choose a system with advanced content analysis capabilities.
Choosing a DCAP system can be compared to choosing a bank safe deposit box: each customer has an individual box for their valuables, while the bank provides reliable protection and monitoring, for example video recording in the lobby. A DCAP system fulfils a similar function, protecting confidential data in a digital environment, and the choice of ‘safe deposit box’ depends on the specifics of the data (e.g. only cash or jewellery), its volume (larger boxes cost more), and so on.
With DCAP systems, you have full control over your data, even if it has never been structured or processed before. You can determine exactly who did what with your files and when, prevent unauthorised actions, quickly find the information you need and optimise storage. A data-centric approach to protection is an important step in a company's development, because with a large flow of files it is almost impossible to solve the problem effectively without automation.