Ruslan Rakhmetov, Security Vision
More and more often, the news mentions terms that raise questions: what are cryptography, cryptology and cryptanalysis? What does encryption mean, and why is it used? What encryption algorithms exist, how do they differ, and what are symmetric and asymmetric cryptography? What are a hash and a digital signature, and what are they used for? In this article we will try to give brief answers to these questions. Let's get started!
First, some definitions:
- Cryptology is the science that studies methods of encrypting information and of breaking that encryption. Cryptology is divided into cryptanalysis and cryptography.
- Cryptanalysis is the science of methods for recovering encrypted information without knowing the key, i.e. breaking encryption by finding the encryption key or exploiting vulnerabilities in cryptographic algorithms.
- Cryptography is the science of protecting information by means of various mathematical transformations. The objectives of cryptography are to ensure the integrity, confidentiality, authenticity and non-repudiation of information.
- Information integrity is the state of information in which it either remains unchanged or changes are made only by those who are authorised to do so.
- Confidentiality of information is the state of information in which only those with appropriate access rights have access to it.
- Authenticity of information is the state of information in which its author (owner) and its source can be unambiguously identified.
- Non-repudiation of information is the state of information in which its author (owner) cannot deny authorship or the actions performed on the information (e.g., changing, sending, receiving).
How are confidentiality, integrity, authenticity and non-repudiation ensured? Let's look at each in turn.
Confidentiality of information is achieved by encrypting it. Encryption is the transformation of information using cryptographic algorithms. The initial information is open, readable text (plaintext). After the encryption operation, it turns into encrypted information, i.e. a seemingly meaningless set of characters (ciphertext). To read it, a decryption operation must be performed, i.e. the original text must be restored using the key known to the legitimate user. If the key is unknown (as it is for an attacker), restoring the original text is called breaking the cipher (deciphering), and it is achieved using cryptanalysis techniques.
Cryptographic encryption algorithms are divided into two large groups: symmetric encryption algorithms (secret-key cryptosystems) and asymmetric encryption algorithms (public-key cryptosystems). Key material consists of secret keys or pairs of public and private keys (i.e. the values used to encrypt and decrypt information). The most important characteristic of an encryption key is its length, which is measured in bits.
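To see why key length matters, it may help to count how many keys an attacker would have to try in an exhaustive search. The sketch below is illustrative only: the function names and the guessing rate of 10^12 keys per second are assumptions made for the example, not figures from this article, and the estimate only makes sense for symmetric keys where trying every key is assumed to be the best available attack.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def keyspace_size(key_bits: int) -> int:
    """Number of distinct keys of the given length."""
    return 2 ** key_bits

def years_to_exhaust(key_bits: int, keys_per_second: float = 1e12) -> float:
    """Worst-case time to try every key at an assumed (hypothetical) rate."""
    return keyspace_size(key_bits) / keys_per_second / SECONDS_PER_YEAR

if __name__ == "__main__":
    for bits in (56, 128, 256):
        print(f"{bits}-bit key: {keyspace_size(bits):.3e} keys, "
              f"about {years_to_exhaust(bits):.3e} years to try them all")
```

At the assumed rate, a 56-bit keyspace would be exhausted in under a day, while every additional bit doubles the work.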
1. Symmetric encryption uses the same key for both encryption and decryption. Such a key is called a secret key. The advantages of symmetric encryption are speed and a shorter key length (compared to asymmetric encryption algorithms), while the disadvantages are the difficulty of transmitting the key securely in an untrusted environment (an attacker who intercepts the secret key will be able to read all messages encrypted with it) and the complexity of key management (each secret key has to be safely delivered to and stored by each sender and receiver). Symmetric encryption algorithms are divided into stream ciphers and block ciphers:
1.1 A symmetric stream cipher encrypts each unit (bit or byte) of a message separately by performing the XOR ("exclusive OR") operation with the corresponding unit of a key stream (also called gamma) derived from the key. The advantages of stream encryption are its simplicity (which allows encryption to be performed on hardware with low processing power) and the ability to encrypt a stream of messages in real time, which is used to encrypt voice and video traffic. Examples of stream ciphers include the older RC4 algorithm (which has known vulnerabilities and should no longer be used) and the newer XChaCha20 and XSalsa20 algorithms.
1.2 A symmetric block cipher encrypts fixed-length message blocks, i.e., it breaks the message into parts and encrypts each part separately. Block ciphers are the most widely used because they can work in different modes and offer high cryptographic strength. The best-known examples of symmetric block ciphers are the foreign standards DES (Data Encryption Standard, key length 56 bits, obsolete), 3DES (Triple Data Encryption Algorithm, key length 168 bits, obsolete) and AES (the Rijndael algorithm, key length 128, 192 or 256 bits, actively used), as well as the Russian standards GOST 28147-89 and GOST 34.12-2018 (the Magma and Kuznyechik ("Grasshopper") algorithms, key length 256 bits, actively used).
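The difference between the two approaches in items 1.1 and 1.2 can be sketched in a few lines. This is a toy illustration, not a usable cipher: the keystream generator built from SHAKE-256 and all helper names are assumptions made for the example, and the block half shows only how a message is padded (PKCS#7 style) and cut into fixed-size blocks, with the per-block encryption itself left out.

```python
import hashlib

BLOCK_SIZE = 16  # bytes; AES, for example, works on 128-bit blocks

def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (gamma): SHAKE-256 output of key + nonce. Illustration only."""
    return hashlib.shake_256(key + nonce).digest(length)

def stream_xor(data: bytes, key: bytes, nonce: bytes) -> bytes:
    """XOR every byte with the keystream; the same call encrypts and decrypts."""
    stream = toy_keystream(key, nonce, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))

def pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """PKCS#7-style padding: append n bytes, each with the value n."""
    n = block_size - len(data) % block_size
    return data + bytes([n]) * n

def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Pad the message and cut it into equal blocks for a block cipher."""
    padded = pad(data, block_size)
    return [padded[i:i + block_size] for i in range(0, len(padded), block_size)]

if __name__ == "__main__":
    key, nonce = b"demo key", b"demo nonce"
    message = b"attack at dawn, bring coffee"
    ciphertext = stream_xor(message, key, nonce)
    print(ciphertext.hex())
    print(stream_xor(ciphertext, key, nonce))  # XOR twice restores the plaintext
    print(split_into_blocks(message))          # two 16-byte blocks, last one padded
```

Because XOR is its own inverse, the stream half needs no separate decryption routine, whereas a block cipher needs padding (or a suitable mode of operation) to handle messages whose length is not a multiple of the block size.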
2. Asymmetric encryption uses two keys, a private key and a public key. They are created at the same time and are mathematically related, but the private key cannot feasibly be computed from the public key. The private key is kept secret by the owner, while the public key can be given to anyone. The public key is used to encrypt the message: for example, a sender who wants to encrypt an email message uses the recipient's public key, obtained from a public directory. The recipient then uses their own private key to decrypt the email message. To reply, the recipient encrypts the reply with the public key of the original sender, who decrypts it with their own private key. The best-known examples of asymmetric encryption algorithms are RSA and the ElGamal scheme, which require a key length of at least 2048 bits. In addition to encryption, asymmetric algorithms are used for digital signatures (discussed below) and for the secure exchange of symmetric encryption keys (e.g., in the TLS protocol for secure access to websites).
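The relationship between the public and the private key can be illustrated with textbook RSA on deliberately tiny numbers. This is a sketch under stated assumptions: the primes 61 and 53, the public exponent 17 and the function names are chosen for the example, no padding is applied, and a real key would be at least 2048 bits long and generated by a vetted library.

```python
from math import gcd

def make_toy_keypair(p: int = 61, q: int = 53, e: int = 17):
    """Build a textbook RSA key pair from two small primes (illustration only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    assert gcd(e, phi) == 1, "e must be coprime to phi"
    d = pow(e, -1, phi)          # modular inverse (Python 3.8+)
    return (e, n), (d, n)        # (public key, private key)

def encrypt(m: int, public_key) -> int:
    """Anyone holding the public key can encrypt."""
    e, n = public_key
    return pow(m, e, n)

def decrypt(c: int, private_key) -> int:
    """Only the holder of the private key can decrypt."""
    d, n = private_key
    return pow(c, d, n)

if __name__ == "__main__":
    public_key, private_key = make_toy_keypair()
    message = 65                               # must be smaller than n = 3233
    ciphertext = encrypt(message, public_key)
    print(ciphertext)                          # 2790
    print(decrypt(ciphertext, private_key))    # 65
```

With numbers this small the private exponent is trivial to recover by factoring n, which is exactly why practical key lengths are so much larger.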
The integrity of information can be verified by calculating the hash of the message on the sender's side and on the receiver's side and then comparing the results. A hash function is a cryptographic transformation of a message of any length into a sequence of bits of a fixed size. This transformation is called hashing, and the result is called a hash. It is infeasible to restore the original message from its hash, and even the most insignificant change to the original message leads to a completely different hash. Since the output has a fixed size, different messages with identical hashes (collisions) must exist in principle; a hash function is considered insufficiently reliable if such collisions can actually be found in practice. A typical use: the hash of a software installer file is published on the developer's site, and after downloading the file the user can calculate its hash and compare it with the published value. If the values coincide, the user has downloaded the real file, and it has not been corrupted or tampered with during the download. The best-known examples of hashing algorithms are MD5 (hash length 128 bits, insecure, collisions possible), SHA-1 (hash length 160 bits, insecure, collisions possible), SHA-2 and SHA-3 (hash length from 224 to 512 bits depending on the variant, actively used), as well as the Russian standard GOST 34.11-2018 (the Stribog algorithm, hash length 256 or 512 bits, actively used).
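The installer check described above can be sketched with Python's standard hashlib module. The helper names and the sample "installer" bytes are hypothetical; SHA-256 (a member of the SHA-2 family) is used here as one reasonable choice, not as the only option.

```python
import hashlib
import hmac

def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 hash of the data."""
    return hashlib.sha256(data).hexdigest()

def matches_published_hash(data: bytes, published_hex: str) -> bool:
    """Recompute the hash locally and compare it with the published value."""
    return hmac.compare_digest(sha256_hex(data), published_hex.strip().lower())

if __name__ == "__main__":
    installer = b"pretend these are the bytes of a downloaded installer"
    published = sha256_hex(installer)   # what the developer would publish
    print(published)
    print(matches_published_hash(installer, published))         # True
    print(matches_published_hash(installer + b"!", published))  # False
    # A one-character change produces an unrelated-looking hash:
    print(sha256_hex(b"abc"))
    print(sha256_hex(b"abd"))
```

Note that this check only helps if the published hash itself comes from a trusted source; an attacker who can replace both the file and the hash defeats it.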
The authenticity of information can be ensured by digitally signing the message being sent. In the simplest (RSA-style) description, a digital signature is a hash of the message encrypted with the private key of the message author (sender). The message itself can be sent encrypted or unencrypted (if it is not confidential and only its authenticity needs to be ensured), and it is delivered to the recipient along with the digital signature and the sender's public key. The public key can be sent in the form of a digital signature certificate, i.e. a set of data containing the public key, information about its owner (sender) and information about which certificate authority (certification centre) issued the certificate. The receiver then uses the sender's public key to decrypt the digital signature and obtain the hash value, independently calculates the hash of the received message and compares the two hashes. If they coincide, the signature is correct, and the message is genuine and was really written by the sender (the private key owner).
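The sign-then-verify flow described above can be sketched with textbook RSA. Everything here is a simplification for illustration: the key is built from two publicly known Mersenne primes (so it offers no security), the signature is the raw SHA-256 hash raised to the private exponent without any padding scheme, and the names are assumptions made for the example. Real systems use standardized signature schemes from a vetted library.

```python
import hashlib

# Toy RSA key from two known Mersenne primes (assumption for the demo; insecure).
P = 2**127 - 1
Q = 2**521 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)

def message_hash(message: bytes) -> int:
    """SHA-256 of the message as an integer (always smaller than N here)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big")

def sign(message: bytes) -> int:
    """Sender: transform the hash with the private key."""
    return pow(message_hash(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Receiver: recover the hash with the public key and compare."""
    return pow(signature, E, N) == message_hash(message)

if __name__ == "__main__":
    signature = sign(b"pay Alice 10")
    print(verify(b"pay Alice 10", signature))     # True
    print(verify(b"pay Alice 1000", signature))   # False: the message was altered
```

The receiver needs only the message, the signature and the public values N and E; the private exponent D never leaves the sender.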
Non-repudiation rests on the requirement that the private key with which the author signs the hash of the message belongs only to the author: the author must not pass it on to anyone else, and it must be protected so that intruders cannot steal it. Additionally, time stamps may be used for digitally signed messages, and the digital signature certificate may contain data about its owner (full name, company name and division), which makes it possible to establish the authorship of the message unambiguously. In Russia, the standard GOST 34.10-2018 "Processes of formation and verification of electronic digital signatures" is used, and Russian legislation uses the concept of an "enhanced qualified electronic signature", which, in accordance with Federal Law No. 63-FZ, is equivalent to a citizen's handwritten signature and can be used to certify any official electronic documents.