Eva Belyaeva, Security Vision
Introduction
Recently, several enthusiasts within our company have become interested in DevSecOps for very different reasons: one was asked an uncomfortable question by a customer, another decided to take the path of self-development, and a third has a close colleague in AppSec and wants to be able to hold up their end of the conversation.
In this article I would like to explain, in plain information security language, the core of the secure development process for those who encounter it for the first time, or who have heard something somewhere but cannot link it into a coherent picture. It is actually very close to what you or your colleagues already know from daily work, even if you have never worked in development or touched source code.
Afterwards you can try speaking the same language with those customers who, besides wanting to build information security processes, are also full-fledged developers of their own software. Let's go!
What is SCA?
What is behind this abbreviation, and how has it been adapted in our country? In the original it is Software Composition Analysis. In Russian, SCA is most often rendered as "code composition analysis"; occasionally you hear "composite analysis", which is not quite correct. There is also an adaptation from a domestic SCA vendor: "component analysis". The latter, in my view, reflects the essence best of all.
What is that essence? First of all, SCA is a process, like asset management or incident investigation. It is not a one-time action but an ongoing, repetitive check aimed at improving the security of a product or application.
During this process, the product is decomposed to identify its dependencies, licences and vulnerabilities, as well as licence compatibility issues (for example, if you have pulled in a GPL dependency, you may now be obliged to open-source the product). Simply put, regular checks highlight weaknesses both in your own source code and in the open-source projects pulled into it, and as a result the owner knows which components are covered by checks: the artefacts needed to decide how to deal with the identified risks.
Who is it for and why?
There are three reasons why a company needs to implement an SCA process (assuming, of course, that the company is a software developer):
- Licensing: in some cases you need to establish whether an in-house product can be sold at all, for example if it contains a "not for commercial use" component.
- Certification: the regulator sets security requirements for source code; without verifying those requirements you cannot expect your software to be taken at its word.
- Security: even if you do not sell your product and do not want certification, it will be, to put it mildly, unpleasant if your brainchild gets hacked, or if your application causes your users to get hacked.
An application can be anything: a mobile app, a serious security product, even a game. Anything developed by you, by a customer, or even your pet project needs, by and large, to be tested and responded to in a timely manner.
SCA as a process
Source code and binary builds
Let's start with what people actually work with in SCA. It is as simple as it sounds: source code and binary builds. Whatever is written by humans (or with Copilot) is subject to checks for vulnerable dependencies.
More broadly, the result can be the dependency graph itself: it shows what proportion of the product is closed versus open code and how many vulnerable points the future build will contain. Some specialists start instead from the attack surface, that is, the opportunities a researcher or attacker has to break into the product and cause damage. To decide on a strategy and first steps, you need to understand how to prioritise.
How do you choose what to test?
There are actually several opinions here. Both in the SDL community and in various public talks, the picture is not entirely unambiguous. Rather, there is a generally accepted pattern of actions plus a number of factors that may adjust the strategy. But if you are implementing the process for the first time, it is better to start with the obvious and simple.
In terms of dependencies, source code may contain open-source libraries and some proprietary ones. The latter are harder to check: instead of examining the code directly, you have to trust the vendor or vulnerability knowledge bases. In effect you get a pig in a poke, with no way to confirm that the reported vulnerabilities are all that lurks in the vendor's closed code.
This is why open-source software is often prioritised. There is a second reason as well: an attacker knows about it too, and if they want to hack your product, they will start with the known vulnerabilities of open libraries.
The second step of verification is most often checking container builds.
Once these steps are in place and the process is well established, you can afford to spend time checking the software the developers work in, such as the IDE, and the OS on which the product will be installed. Your product may be very good and safe, yet your game or system can still be broken by hacking the underlying OS itself.
Dependencies. Important nuances
Once the scope of work is defined, you start collecting information about dependencies. All imports, all additional libraries are searched for in the source code in various ways.
What ways are there? Listed from least to most automated, they are:
- fully manual work: trying to run the product on a bare system and hunting for dependency packages by hand;
- manual searching combined with lookups in vulnerability knowledge bases;
- using open-source or vendor products.
Where can you look for dependencies?
- the source code itself, e.g. the requirements.txt file for Python;
- regular expressions over keywords such as import, scope, etc. (see the sketch after this list);
- project repositories, searched with the same approaches as above;
- builds of finished products;
- containers and virtual machines with the finished product.
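As an illustration of the regex approach mentioned above, here is a minimal sketch that walks a Python project and collects imported module names. The directory path is a placeholder, and a real SCA tool would also resolve package versions and transitive dependencies:

```python
import re
from pathlib import Path

# Naive pattern for 'import foo' and 'from foo import bar' statements.
IMPORT_RE = re.compile(r"^\s*(?:import|from)\s+([A-Za-z_][\w.]*)", re.MULTILINE)

def collect_imports(project_dir: str) -> set[str]:
    """Collect top-level module names imported anywhere in the project."""
    modules = set()
    for source_file in Path(project_dir).rglob("*.py"):
        text = source_file.read_text(encoding="utf-8", errors="ignore")
        for match in IMPORT_RE.finditer(text):
            modules.add(match.group(1).split(".")[0])  # keep the top-level name
    return modules

print(sorted(collect_imports("./my_project")))  # hypothetical project path
```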
The main question is how to capture all the dependencies, and how not to undermine your own work. For example, a useful piece of advice from practitioners is to always pin the versions of libraries and of their dependencies: the result of an SCA analysis loses its relevance if you do not manage updates and do not know which library versions the product uses at any given point in time.
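For a Python project, pinning can be as simple as fixing exact versions in requirements.txt; the packages and versions below are purely illustrative:

```text
# requirements.txt with pinned versions, so that an SCA scan
# describes exactly what will ship in the build
requests==2.31.0
urllib3==2.0.7
idna==3.6
```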
It is worth noting that the process described above closely mirrors the vulnerability discovery and management process already familiar to security professionals; the only difference is that here we deal with dependencies and libraries, and there with installed software. The two processes will remain similar later on, when we move to mitigation and patching.
Just as a vulnerability scanner examines files and their contents to look for software names and versions on a server or workstation, an SCA analysis tool examines code and looks for library names and versions. Where does this information go next?
SBOM and other bombshells
Meet SBOM (Software Bill of Materials): a machine-readable format that lists libraries and their versions. This document can be handed to researchers, who will either search for known vulnerabilities in the listed software or manually test a particular version.
Dependency information is also found here, so that when a vulnerability is discovered in a component, you can go back a step and understand how that component got into the product.
Using such a document, you can build an assumed attack surface and understand where the weaknesses are in the final product.
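To make the format concrete, here is a minimal fragment in the spirit of CycloneDX, one of the common SBOM standards (the component list is illustrative):

```json
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {
      "type": "library",
      "name": "requests",
      "version": "2.31.0",
      "purl": "pkg:pypi/requests@2.31.0",
      "licenses": [{"license": {"id": "Apache-2.0"}}]
    }
  ]
}
```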
There is also ML-BOM for artificial intelligence models, but I mention it purely for general awareness.
What else do we collect?
Besides what we can find in open code, we can check both open and closed code for the same purpose by drawing on other researchers' analysis results. The results of static and dynamic code analysis also often go into a common vulnerability database for a given product for further processing.
Once dependencies are established and vulnerabilities are discovered, they are fed into systems that automate the SCA process. Such systems let you configure the risk-tracking policies you need: update and development freezes, direct component updates, and notifications to those responsible for the process and for patching.
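The gate logic in such systems is conceptually simple. Here is a toy sketch of what an automated policy check might do, assuming findings carry a CVSS score and a threshold of 9.0 blocks the build (all names and values are illustrative):

```python
from dataclasses import dataclass

@dataclass
class Finding:
    component: str
    version: str
    cve: str
    cvss: float  # severity score, 0.0 to 10.0

def enforce_policy(findings: list[Finding], block_at: float = 9.0) -> bool:
    """Return True if the build may proceed, printing a notice for blockers."""
    blockers = [f for f in findings if f.cvss >= block_at]
    for f in blockers:
        print(f"BLOCK: {f.component} {f.version} has {f.cve} (CVSS {f.cvss})")
    return not blockers

# Hypothetical finding: one critical vulnerability freezes the release.
ok = enforce_policy([Finding("libfoo", "1.2.3", "CVE-0000-00000", 9.8)])
```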
We've found the vulnerabilities. What's next?
The analogy with the vulnerability management process holds here too. It is just like your normal job: you need to familiarise yourself with the risks and decide what to do next.
Whose responsibility is this problem?
Most often such problems are solved by the developers themselves, with help from AppSec colleagues who choose the right strategy together with them.
Opinions on how to properly address the discovered vulnerabilities, prioritise them and conduct an initial triage differ, just as they do at the stage of selecting what to test. What unites them is that experts base their approach on the types of dependencies found in the product.
Types of dependencies
Dependencies can be direct or indirect (indirect ones are also called transitive). What does this mean?
For example, you have Python code that makes some API requests, and you have decided to use the requests library. That is a direct dependency. It, in turn, pulls in the urllib3 library as one of its own dependencies; from your code's point of view, urllib3 is a transitive dependency.
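You can see the distinction on a live environment with the standard library alone. A small sketch, assuming requests is installed:

```python
from importlib.metadata import requires, version

# The direct dependency our project declared.
print("requests", version("requests"))

# What requests itself declares: transitive dependencies from our
# project's point of view (urllib3, idna, certifi and so on).
for requirement in requires("requests") or []:
    print("  pulls in:", requirement)
```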
Vulnerabilities are more likely to be found in transitive dependencies. Perhaps this is because large projects receive contributions more often and are reviewed more often, so more vulnerabilities get found and eliminated there. Or perhaps it is simply statistics: each direct dependency brings several transitive ones, so the probability of finding a vulnerability among them is higher.
Transitive dependencies are also more likely to figure in an attack, so some experts recommend starting with them.
There is a contrary opinion that you should start with direct dependencies, since there the developer has more leverage over the remediation process.
Plan Intercept
So what options does the developer have? The first thing to do is confirm that the vulnerability is relevant to the product: we establish that yes, someone could actually launch such an attack and everything would break. This applies to direct dependencies.
For transitive dependencies it is a bit more complicated: we have to build a trace, a chain of calls within our application. This is very similar to the attack kill chain and the intruder's route from the incident investigation process. Sometimes traces confirm only a fraction of the indirect dependencies, and the amount of remediation work shrinks significantly. However, transitive dependencies can be hard to fix in your own code. Say you opened an issue and patched the discovered vulnerability, but the direct dependency that pulls the library in has not been updated and still works with the old vulnerable version; you cannot patch the whole chain. In this case, experts suggest going another way and working with the vulnerable method calls in your own code: validating the call and setting up a protective wrapper around sensitive data there.
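A minimal sketch of that last idea: suppose a vulnerable transitive dependency can be triggered by a malicious URL, and we cannot patch the chain. We validate the input in our own code before it ever reaches the vulnerable calls (the allowlist and the function are hypothetical):

```python
from urllib.parse import urlparse

import requests  # the direct dependency that pulls in the vulnerable chain

ALLOWED_HOSTS = {"api.example.com"}  # assumed known-good allowlist

def safe_fetch(url: str) -> bytes:
    """Validate the URL before it reaches the vulnerable call chain."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or parsed.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"blocked suspicious URL: {url!r}")
    return requests.get(url, timeout=10).content
```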
There are several popular strategies for dealing with found vulnerabilities:
- Updating: check for the next version of an application or library, then update. Note that update policies vary widely. Some prefer to stay "a few versions" behind the latest release, because the latest version often only seems safer: nothing critical has been discovered in it yet. Others are motivated by application continuity: a new version can both add functionality and remove functionality the product relied on. The "just update everything" strategy is, unfortunately, not a panacea.
- Migration. It happens, for example, that you have added a library with many dependencies to your product just for the sake of a single function. Then you should weigh the risk of keeping the vulnerable application against the time and resources needed to develop the functionality yourself or to replace one library with another.
- Mitigation. Code can be protected in other ways. Many companies, either tired of waiting for a patch from the library's author or unable to move to a new version, do the patching themselves and then carry this legacy from product to product. There are also more convenient options: for example, putting the product behind a WAF that handles all unwanted calls and protects sensitive data.
- Freeze everything. This covers both updates and development. Most often this strategy is applied to a critical component of the system. There have been cases where products were not released to market and system rollouts to customers were delayed until the vulnerability was addressed one way or another.
The most important thing
The main thing I would like you to remember is that SCA is a regular process. It should evolve with the product, raising its level of security. Once one stage is complete, move on to the next; do not be afraid of updates or of a seemingly frighteningly large dependency graph. Everything is fixable.
It is important to know what your application uses and what it consists of. Only through a properly implemented and structured secure development process, built on composition analysis, can you make something truly secure.