WHAT IS STEGANOGRAPHY, USED BY CYBERCRIMINALS? HOW DOES JIZÔ AI DETECT IT?

14 November 2024

Derived from the Greek words "steganós," meaning covered or concealed, and "graphế," meaning writing, steganography refers to a discreet method of data concealment. Much like cryptography, its purpose is to transmit a message that only the intended recipient can understand. To achieve this, a secret message or confidential information is hidden within another seemingly innocuous medium.

However, while cryptography alters the message to make it unreadable without the appropriate key or decryption method, steganography hides the message within a host element so that the very existence of a secret message goes undetected. The two methods can be combined to further enhance secrecy: the message is first encrypted, then hidden.

Two types of steganography can be distinguished: injection methods, which add content almost imperceptibly, and substitution methods, which modify properties of the host element to conceal information.

Three criteria are used to evaluate the performance of a steganography method. The first is the quality of concealment: how little the host element is altered, so that neither the hidden message nor any modification to the host can easily be detected. The second is capacity, the amount of information that can be concealed. For instance, a method that is undetectable but transfers only one bit per day is of questionable use. The third is robustness: how well the message can be retrieved after slight modifications to the host element or distortions during transmission.

Steganography has a long history. One of the earliest known uses dates back to around 500 BCE, when the tyrant Histiaeus incited the Ionian revolt. To transmit his orders, he shaved the head of one of his slaves, tattooed the message onto the scalp, and waited for the hair to regrow before sending the slave as a messenger. This primitive injection-based method was already effective.

Another example of basic steganography used for a long time involves hiding a message within text, retrievable through a predefined method, such as “reading only the first letter of each word.”

Take the phrase, "Sam eats salad and makes espresso." While it seems innocuous, taking the first letter of each word reveals "Sesame."
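
The first-letter rule can be sketched in a few lines of Python. The function name and the cover sentence below are illustrative assumptions, not taken from any real tool.

```python
def first_letters(sentence: str) -> str:
    """Return the first letter of each whitespace-separated word."""
    return "".join(word[0] for word in sentence.split())


# Hypothetical cover sentence chosen for this illustration.
cover_sentence = "Sam eats salad and makes espresso"
print(first_letters(cover_sentence).lower())  # prints "sesame"
```

Both parties only need to agree on the rule in advance; the text itself carries no visible marker.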

Methods have evolved over the centuries alongside technological advances, notably with increasingly sophisticated invisible inks and with higher-precision techniques such as microdots, used during World War II to transmit secret information.

Nowadays, steganography has entered the digital realm in various forms. It is used for legitimate purposes, such as protecting copyright in images, but also by malicious individuals to commit cybercrimes. Moreover, methods continuously evolve to stay ahead of detection equipment. Finally, the advent of artificial intelligence represents a new technological leap, benefiting both criminals and defenders.

I | MODERN STEGANOGRAPHY TECHNIQUES

The complexity of today's digital world means that vast amounts of data are constantly being exchanged, whether through documents (a PDF file often contains several million bytes) or via the internet and machine-to-machine communications. Each of these exchanges is an opportunity to conceal information through steganography.

A | STEGANOGRAPHY BASED ON DIGITAL MEDIA

Modern digital media, such as images, videos, audio files, or text documents, are more or less complex elements containing vast amounts of information. It is possible to hide secret information within them, either by directly adding it to the media in a way that is invisible to a user (injection) or by slightly modifying a part of the media, often an inconspicuous area or one associated with noise (substitution).

To understand how this works, one must look at how computers represent images. An image is broken down into a grid of tiny squares called pixels. In a color image, each pixel stores its color as a mix of red, green, and blue. The quantity of each component is stored in one byte, allowing for 2⁸ = 256 values. Thus, a color consists of three values ranging from 0 to 255, indicating a mix of primary colors.

In binary representation, the bits of a byte do not all carry the same weight. For example, 178 is written in binary as 10110010. Flipping the first (most significant) bit gives 00110010, or 50: this bit has the greatest influence on the number represented. Conversely, flipping the last (least significant) bit gives 10110011, or 179. On a color scale ranging from 0 to 255, a change from 178 to 50 in one of the components produces a significant, easily noticeable difference.
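
The arithmetic above can be checked directly. This short sketch only reproduces the 178 example; the variable names are arbitrary.

```python
value = 178
print(format(value, "08b"))        # prints "10110010"

msb_flipped = value ^ 0b10000000   # flip the most significant bit
lsb_flipped = value ^ 0b00000001   # flip the least significant bit
print(msb_flipped, lsb_flipped)    # prints "50 179"
```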

However, a change from 178 to 179 is imperceptible to the human eye. This is why one of the basic techniques in image steganography is called the least significant bit (LSB) method. It leverages the fact that modifying the least significant bit has minimal impact on the final color of a pixel, making the alteration invisible. Thus, by hiding a message in the least significant bit of the color values of pixels distributed throughout the image, it is possible to conceal a message in an image. Complex algorithms can optimize the selection of pixels for hiding the information to minimize the impact on the image while maximizing the amount of information hidden. However, this method is not very robust, as slight modifications to a few pixels in the image could make the message unrecoverable.
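
A minimal sketch of the LSB method follows, assuming the image is already available as a flat list of color values between 0 and 255. The function names are hypothetical, and the bits are written sequentially from the first value rather than spread across the image as a real tool would likely do.

```python
def embed_lsb(channels: list[int], message: bytes) -> list[int]:
    """Hide message bits in the least significant bit of each color value."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(channels):
        raise ValueError("cover is too small for this message")
    stego = list(channels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0b11111110) | bit  # clear the LSB, then set it
    return stego


def extract_lsb(channels: list[int], length: int) -> bytes:
    """Read back `length` bytes from the least significant bits."""
    out = bytearray()
    for i in range(length):
        byte = 0
        for value in channels[8 * i: 8 * i + 8]:
            byte = (byte << 1) | (value & 1)
        out.append(byte)
    return bytes(out)


# Hypothetical cover: 16 pixels x 3 color components, all set to 178.
cover = [178] * 48
stego = embed_lsb(cover, b"hi")
print(extract_lsb(stego, 2))                          # prints b'hi'
print(max(abs(a - b) for a, b in zip(cover, stego)))  # prints 1
```

No color value moves by more than 1, which is why the change is invisible, and why re-encoding or editing the image so easily destroys the message.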

Depending on the image format, the task can become more complicated, especially when the image is compressed. Information loss must be anticipated to hide a message properly. This is the case with the JPEG format, which applies a discrete cosine transform to 8x8 blocks of pixels. The PNG format is also different, as each image consists of a header followed by chunks, each made up of fields such as a length, a type, the data itself, and a checksum. Concealing a secret message means editing some bits within these chunks while keeping the structure valid.
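
To make the PNG layout concrete, the sketch below builds a tiny 1x1 PNG in memory and walks its chunks. Placing the payload in a tEXt chunk is an assumption made for illustration (it is the simplest variant, not necessarily what an attacker would choose), and the helper names are hypothetical.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (type, data) for every chunk after the 8-byte signature."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG stream")
    pos = 8
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        chunk_type = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", png[pos + 8 + length:pos + 12 + length])
        if crc != zlib.crc32(chunk_type + data) & 0xFFFFFFFF:
            raise ValueError("CRC mismatch in chunk %r" % chunk_type)
        yield chunk_type, data
        pos += 12 + length


ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)  # 1x1 pixel, 8-bit grayscale
png = (PNG_SIGNATURE
       + make_chunk(b"IHDR", ihdr)
       + make_chunk(b"tEXt", b"Comment\x00hidden payload")  # illustrative payload
       + make_chunk(b"IDAT", zlib.compress(b"\x00\x00"))    # filter byte + one pixel
       + make_chunk(b"IEND", b""))
print([chunk_type for chunk_type, _ in iter_chunks(png)])
# prints [b'IHDR', b'tEXt', b'IDAT', b'IEND']
```

Because every chunk carries its own checksum, any edit to chunk data must be followed by a recomputed CRC, otherwise the file no longer validates.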

A video is a sequence of images. It is therefore also possible to perform steganography within it. This practice has its own set of challenges due to the various video formats, particularly the coding standards like H.264, which aim to compress videos to make them smaller. Specific parameters must be manipulated to insert data, but it is entirely possible to do so, as with images, again using substitution techniques.

Similarly, it is possible to hide information in an audio file. An analog sound is also represented digitally as bits. By slightly modifying the amplitude or frequency of the high-frequency components of the signal (as given by its Fourier decomposition), only the "noise" of the signal is altered, a change that is inaudible to the human ear.

Text documents are also a suitable medium for steganography. Unlike the previously mentioned categories, this is primarily done via injection rather than substitution. Formats such as Word documents allow for adding hidden text, which is not displayed by default. This method is very easy to implement and effective, as few people enable the display of hidden text in the options. A more discreet way is to use spaces, tabs, or line breaks at the end of the text. These characters are not visible to the reader but can be interpreted to encode secret messages, for instance, by treating a space as 0 and a tab as 1.
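
The space-as-0, tab-as-1 convention can be sketched as follows. The function names are hypothetical, and the sketch assumes the visible text does not itself end with a space or a tab.

```python
def hide_in_whitespace(cover: str, message: bytes) -> str:
    """Append the message as trailing whitespace: space = 0, tab = 1."""
    bits = "".join(format(byte, "08b") for byte in message)
    return cover + bits.replace("0", " ").replace("1", "\t")


def reveal_from_whitespace(text: str) -> bytes:
    """Decode the run of spaces and tabs at the end of the text."""
    visible = text.rstrip(" \t")
    trailer = text[len(visible):]
    bits = trailer.replace(" ", "0").replace("\t", "1")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


stego_text = hide_in_whitespace("Kind regards,", b"ok")
print(reveal_from_whitespace(stego_text))  # prints b'ok'
```

Each hidden byte costs eight invisible characters, so capacity is low, but the visible text is left untouched.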

A rarer approach is known as generative steganography. Rather than embedding a message into an existing element, it creates a new element, such as a text, that appears innocuous but is built around the hidden message. For instance, some programs can generate fake spam emails containing concealed messages retrievable by anyone aware of their presence.


B | STEGANOGRAPHY BASED ON NETWORK PROTOCOLS

Network protocols codify and structure digital exchanges. They follow a specific syntax with numerous fields. Individuals can exploit these attributes to conceal information and communicate secretly through steganography within network protocols.

A network packet is a stack of encapsulated layers, each with its own possible protocols. IP sits at Layer 3 of the OSI model, with a TCP or UDP transport layer on top. In the case of TCP/IP, the data is preceded by a TCP header and an IP header. Each header carries various fields: source and destination addresses, ports, a sequence number (for TCP), Time to Live, flags, and others. Some of these fields can be used for steganography. For instance, in a network where the maximum transmission unit (MTU) is known, information can be concealed in the middle bit of the IP flags field (the "Don't Fragment" bit), which relates to fragmentation but is not needed if the packet size does not exceed the limit. The identification field of the IP header can be exploited in the same way. By combining such techniques, information can be transmitted discreetly within common network protocols themselves, making it hard to detect with basic network monitoring.
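
As a rough illustration of the two IP header fields mentioned above, the sketch below packs a bare IPv4 header with Python's struct module and reads the fields back. The function names are hypothetical, the addresses come from documentation ranges, and the checksum is deliberately left at zero, so this is not a packet ready to be sent.

```python
import socket
import struct

DF_BIT = 0x4000  # "Don't Fragment": middle bit of the 3-bit IPv4 flags field


def build_ipv4_header(ident: int, covert_bit: int) -> bytes:
    """Pack a 20-byte IPv4 header; the checksum is left at 0 for brevity."""
    flags_and_offset = DF_BIT if covert_bit else 0
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45,                # version 4, header length of 5 x 32-bit words
        0,                   # type of service
        20,                  # total length (header only in this sketch)
        ident & 0xFFFF,      # identification: 16 bits the sender may choose
        flags_and_offset,    # 3 flag bits followed by a 13-bit fragment offset
        64,                  # time to live
        socket.IPPROTO_TCP,  # protocol number of the next layer
        0,                   # header checksum (not computed here)
        socket.inet_aton("192.0.2.1"),
        socket.inet_aton("198.51.100.7"),
    )


def read_covert_fields(header: bytes) -> tuple[int, int]:
    """Return (identification, Don't Fragment bit) from a raw IPv4 header."""
    ident, flags_and_offset = struct.unpack("!HH", header[4:8])
    return ident, (flags_and_offset & DF_BIT) >> 14


header = build_ipv4_header(int.from_bytes(b"OK", "big"), covert_bit=1)
ident, df = read_covert_fields(header)
print(ident.to_bytes(2, "big"), df)  # prints b'OK' 1
```

One flag bit and sixteen identification bits per packet is a small capacity, but across thousands of ordinary-looking packets it adds up.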

It is also possible to perform steganography using another ubiquitous protocol, DNS (Domain Name System). This protocol has already been used for communication and remote control by leveraging its TXT records, though this is not the most discreet method.

Attackers have also used this protocol by sending requests at specific time intervals, a first, basic form of steganography over DNS. More advanced schemes exist, based on the structure of DNS packets. For example, the protocol handles both queries and responses and differentiates them using the QR field. By sending a message marked as a query (QR = 0) that also includes fields normally found in responses (e.g., ANCount = 1), it is possible to insert confidential data between the IP and ResourceDataLength fields of the response section, provided the message length (response IP + hidden data) is calculated correctly. Such a request is considered valid and will be processed correctly by a DNS server. Furthermore, by specifying the appropriate details for the requested IP and domain name in the response fields, the query blends into the traffic and is not flagged as suspicious. By directing these queries to a compromised DNS server, the hidden data in the requests can be retrieved while the server continues to handle legitimate requests.
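
The header combination described above (QR = 0 together with a non-zero answer count) is easy to express in code. The sketch below only decodes the fixed 12-byte DNS header and shows one way a monitor might flag that combination; the function names are hypothetical and the record layout of the hidden data is not modeled.

```python
import struct


def parse_dns_header(packet: bytes) -> dict:
    """Decode the fixed 12-byte DNS header into its six 16-bit fields."""
    ident, flags, qdcount, ancount, nscount, arcount = struct.unpack("!6H", packet[:12])
    return {"id": ident, "qr": flags >> 15, "qdcount": qdcount,
            "ancount": ancount, "nscount": nscount, "arcount": arcount}


def query_with_answers(packet: bytes) -> bool:
    """True when a message marked as a query (QR = 0) still announces answers."""
    header = parse_dns_header(packet)
    return header["qr"] == 0 and header["ancount"] > 0


# Two hand-built headers: flags 0x0100 means QR = 0 with recursion desired.
ordinary_query = struct.pack("!6H", 0x1234, 0x0100, 1, 0, 0, 0)
unusual_query = struct.pack("!6H", 0x1234, 0x0100, 1, 1, 0, 0)
print(query_with_answers(ordinary_query), query_with_answers(unusual_query))
# prints False True
```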

Finally, application-layer protocols like HTTP can also serve as vessels for concealing data. Several methods can be considered, such as embedding textual steganography within the URI or leveraging specific HTTP headers, such as "Cookie" for client-to-server communication and "Set-Cookie" for server-to-client communication. Data can be encoded and placed in these fields.
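
A minimal sketch of the cookie variant, assuming the data is simply Base64-encoded so that it resembles an opaque session token. The cookie name "session" and both function names are hypothetical.

```python
import base64


def to_cookie_header(secret: bytes, name: str = "session") -> str:
    """Encode bytes as URL-safe Base64 and wrap them in a Cookie header line."""
    token = base64.urlsafe_b64encode(secret).decode("ascii")
    return f"Cookie: {name}={token}"


def from_cookie_header(line: str) -> bytes:
    """Recover the bytes from a header line produced by to_cookie_header."""
    _, _, pair = line.partition(": ")
    _, _, token = pair.partition("=")  # split on the first "=" only
    return base64.urlsafe_b64decode(token)


line = to_cookie_header(b"10.0.0.5")
print(from_cookie_header(line))  # prints b'10.0.0.5'
```

Since real session cookies are also opaque strings, the encoded value does not stand out on its own; only its context or volume might.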

Network protocols provide an excellent means of implementing steganography, as digital exchanges are vast, occur continuously, and blend into large volumes of data.


II | THE USE OF STEGANOGRAPHY BY CYBERCRIMINALS

Cybercriminals continually refine their offensive techniques to outpace the protective capabilities implemented in their victims' systems. Their goal is to compromise systems to steal or destroy data. To achieve this, they not only require the necessary tools but must also carry out their operations discreetly to evade detection systems that might block communications or alert security teams. It is therefore unsurprising that cybercriminals have adopted steganography, adding it to their arsenal to minimize the noise their actions generate on networks.


A | DISTRIBUTION OF MALWARE

One of the earliest uses of steganography in an attack is for the propagation or deployment of malware. By embedding malware, often within an image or PDF document, cybercriminals make their operations discreet in two ways. First, it deceives most file scanners, which sometimes limit their checks to specific file extensions, excluding seemingly harmless files like images, or search only for specific strings of characters indicating malicious code. Second, once past these security layers, the malware also fools the human user, who is unlikely to suspect malicious content in seemingly legitimate files. Even a user trained in cybersecurity risks may lower their guard when a file is not an executable, does not have a double extension, and is not a shortcut. After all, who would suspect an image?

There are numerous concrete examples of criminals employing this method. A common approach is distributing Word documents via phishing campaigns. These documents contain VBA macros, which download an image when the document is opened. The image, while appearing innocuous, is processed by the macro to extract malicious code that executes in the victim's environment. This method has been observed in campaigns delivering malware such as IcedID, Formbook, and LummaStealer, as well as in operations by the OceanLotus group. In a more subtle example, the Vawtrak malware embedded its payload in the least significant bits of website favicons.

A large-scale campaign occurred as early as 2016, orchestrated by a group named AdGholas, which extensively used steganography combined with JavaScript to compromise many victims.

The group injected JavaScript code into websites to analyze parameters of each visitor, such as geographic location or time. If the conditions were met, the attacker replaced a banner on the site with another, identical in appearance but containing concealed code used by the JavaScript. This method gave rise to a series of similar attacks. One of them, a campaign targeting the Tupperware brand website, discreetly stole banking information from customers making purchases on the compromised site.

Similarly, organizations in Azerbaijan were targeted by phishing campaigns involving Word documents with macros hidden via steganography. Once reconstructed, these macros downloaded a .NET-based Remote Access Trojan (RAT) named Fairfax.


B | COMMAND AND CONTROL / DATA EXFILTRATION

Steganography is even more widely used by cybercriminals to transmit information between a control server and a victim. This is often because the amount of data to communicate is very small. In command and control (C&C) scenarios, a server may only need to provide a single number or an IP address redirecting to the next IP in the attacker's rotation. By camouflaging such elements through steganography, attackers aim to bypass detection systems. It is also used for discreetly exfiltrating data, even in highly monitored environments.

One notable example of inventive and complex techniques comes from the notorious Chinese APT group Platinum. For C&C communication, the group downloaded an HTML page from a site it controlled. The page appeared entirely innocuous, with minimal content and no hidden text. To transmit information, the group cleverly designed the HTML page by manipulating the order of tag attributes. HTML relies on tags to function, but the order in which a tag's attributes are written does not affect the page’s rendering. By changing the order of attributes like "align," "bgcolor," "colspan," and "rowspan," the attacker encoded data. Each line encoded 4 bits of information: four attributes can be ordered in 24 ways, more than the 16 values required. Using this method, the group could retrieve an AES decryption key, which was then used to decrypt content also hidden via steganography in another HTML tag, this time using spaces and tabs as described earlier. This dual use of steganography made the communication nearly undetectable. The only visible element was the connection to the site, and analyzing the communication would not reveal that it was a C&C exchange.
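
The ordering trick can be sketched as follows. The mapping between orderings and 4-bit values is an assumption made for illustration (the text does not describe the mapping the group actually used), as are the function names and the attribute values.

```python
from itertools import permutations

ATTRIBUTES = ("align", "bgcolor", "colspan", "rowspan")
ORDERINGS = list(permutations(ATTRIBUTES))  # 24 orderings; only 16 are needed


def encode_nibble(value: int) -> str:
    """Render one table cell whose attribute order encodes a 4-bit value."""
    if not 0 <= value < 16:
        raise ValueError("value must fit in 4 bits")
    values = {"align": "left", "bgcolor": "#ffffff", "colspan": "1", "rowspan": "1"}
    attrs = " ".join(f'{name}="{values[name]}"' for name in ORDERINGS[value])
    return f"<td {attrs}></td>"


def decode_nibble(tag: str) -> int:
    """Recover the 4-bit value from the order in which the attributes appear."""
    order = tuple(sorted(ATTRIBUTES, key=tag.index))
    return ORDERINGS.index(order)


lines = [encode_nibble(n) for n in (0xA, 0x5)]  # the byte 0xA5, one nibble per line
print(lines[0])
print([decode_nibble(line) for line in lines])  # prints [10, 5]
```

Every generated line renders identically in a browser, which is what makes the channel invisible to someone looking at the page.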

Another example takes place in the industrial sector, a highly sensitive environment with significant security stakes. Communications are closely monitored and usually involve minimal data, as devices are streamlined. In this attack, an NTP (Network Time Protocol) server was compromised. The attacker observed the traffic, especially NTP exchanges with PLCs (programmable logic controllers) connected as clients to the server. To send commands to a compromised PLC, the attacker broadcast NTP messages at highly precise time intervals. Much like Morse code, information and instructions were transmitted through timing variations, subtle enough to require precise and comprehensive network monitoring to detect.
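
A timing channel of this kind can be sketched without any network code. The two delay values, the threshold, and the function names below are assumptions for illustration; the intervals used in the actual attack are not given in the text.

```python
from itertools import accumulate

SHORT, LONG = 1.0, 2.0  # hypothetical delays in seconds for bit 0 and bit 1


def bits_to_delays(bits: str) -> list[float]:
    """Map each bit to the pause inserted before the next message."""
    return [LONG if bit == "1" else SHORT for bit in bits]


def delays_to_bits(timestamps: list[float]) -> str:
    """Recover bits from message arrival times by thresholding the gaps."""
    threshold = (SHORT + LONG) / 2
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return "".join("1" if gap > threshold else "0" for gap in gaps)


arrivals = [0.0] + list(accumulate(bits_to_delays("1011")))
print(arrivals)                  # prints [0.0, 2.0, 3.0, 5.0, 7.0]
print(delays_to_bits(arrivals))  # prints 1011
```

The message contents are never altered, so payload inspection sees nothing; only the spacing between messages carries information.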

Finally, to stealthily extract data from compromised environments, attackers employ various steganography techniques. One recent example involved concealing information in an image, not sent alone but embedded in an email. An employee exfiltrated several pieces of confidential company data by embedding them into the signature image of their emails, thus evading all corporate protections and security measures, including data loss prevention systems.

Numerous tools exist to easily create steganographic content. One example is the Python tool CloakifyFactory, which allows files to be easily hidden within text. It can generate text, emoji lists, or even lists of Star Trek characters.

Steganography is a technique that can be readily implemented by anyone without requiring advanced technical knowledge. It has long been employed by numerous cybercriminal groups to secretly communicate and transmit information while evading standard detection systems.


III | THE DETECTION OF STEGANOGRAPHY WITH JIZÔ

Steganography poses a significant challenge for detection teams. Its effectiveness relies on its ability to blend into the background to avoid detection. An antivirus system often struggles to detect the presence of malware hidden in an image. It is impractical to expect a single system to detect all methods of steganography, especially since these techniques evolve faster on the attackers' side than on the defenders' side. However, the Jizô observability platform can still detect a significant portion of attacks utilizing steganography, thanks to its innovative combination of detection engines that complement each other, and whose interoperability sets it apart.


A | DETECTION THROUGH SIGNATURES

It is very difficult to identify hidden information within a medium. However, if that medium has been used in a past attack, Jizô’s signature-based modules can readily detect it.

Additionally, while steganography hides the message, it still requires communication, which is detectable by Jizô. If an attacker uses steganography to exfiltrate data to a command and control (C&C) server, that connection can be detected. Using knowledge of various attackers' infrastructures, signature-based detection enables identifying the exfiltration, despite the attacker’s efforts to hide it.

Finally, most steganography tools are not natively installed on victims’ systems. An attacker who wishes to craft a steganographic image or element directly on the victim’s system for exfiltration purposes must install a public tool or retrieve a private script. In both cases, Jizô can again detect this activity through the signatures of those tools and scripts.

Thus, signature-based detection can easily identify the presence of steganography if the element has been observed previously and, if not, can detect related activities, such as the installation of a specialized tool or communication with known criminal servers.


B | DETECTION VIA ARTIFICIAL INTELLIGENCE AND BEHAVIORAL ANALYSIS

Fortunately, Jizô is equipped with additional detection engines that enhance signature-based detection to uncover a wide range of attacks using steganography in attempts to evade detection.

Jizô provides a comprehensive analysis of the network. It has full visibility over all network traffic and can identify suspicious behaviors, such as unusual spacing between messages or communications between devices that typically do not interact. Thanks to its deep learning-based artificial intelligence, coupled with deep packet inspection, it is possible to detect anomalies in the network.

For example, an attacker might successfully hide their activities within a medium, but the surrounding activity will still appear abnormal. An unusual communication, whether external or lateral, will be flagged. If an image is transmitted within a routine communication that typically involves only text, it will also be flagged. Variations in packet size or flow within a stream can also trigger alerts. Jizô also analyzes files as they transit the network. Although it is challenging to detect advanced steganography through file analysis, statistical studies, such as examining the color distribution in an image, can at least detect simpler forms of steganography, forcing attackers to employ more sophisticated techniques. In the realm of protocol-based steganography, Jizô excels, detecting unusual changes in TCP headers, for example.
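
To give an idea of what a statistical study of color distribution can look like, here is a sketch of the classic "pairs of values" chi-square idea. It is a generic textbook technique shown under simplifying assumptions, not a description of Jizô's own analysis, and the sample data is artificial.

```python
from collections import Counter


def pair_chi_square(values: list[int]) -> float:
    """Chi-square distance between each (2k, 2k+1) count pair and its mean.

    Replacing least significant bits with message bits that look random
    tends to even out the two counts in every pair, which pushes this
    statistic toward zero.
    """
    counts = Counter(values)
    statistic = 0.0
    for even in range(0, 256, 2):
        expected = (counts[even] + counts[even + 1]) / 2
        if expected > 0:
            statistic += (counts[even] - expected) ** 2 / expected
    return statistic


# Artificial cover: value 100 is far more common than its neighbor 101.
cover_values = [100] * 900 + [101] * 100
# Simulated embedding: every LSB is overwritten with alternating message bits.
stego_values = [(v & 0xFE) | (i % 2) for i, v in enumerate(cover_values)]
print(pair_chi_square(cover_values), pair_chi_square(stego_values))
# prints 320.0 0.0
```

An unusually low statistic on a natural image is a hint, not proof: embedding in only a few scattered pixels leaves the histogram almost unchanged, which is why simple tests catch only simple methods.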

The key is identifying anomalies within the network rather than directly detecting steganography. While attackers may continually improve their methods of concealing messages, they cannot completely mask the act of sending those messages. The network does not lie. They might try to mimic normal network behavior to avoid suspicion, but doing so requires technical expertise, silent observation of the network, and access to a network device, which then severely limits their options by forcing them to conform to standard usage. Jizô’s AI module characterizes what is normal and abnormal on the network with exceptional precision. Combined with its other detection engines, it can raise alerts during various phases of an attack, even when steganography is employed. Users can also provide feedback to refine the AI’s determination of normal or abnormal activities. Lastly, Jizô can correlate multiple suspicious events to raise higher-confidence alerts.

For instance, if a series of images is sent in what appears to be a routine communication shortly after unusual or rare connections from the same source device have been detected, an alert can be raised for behavior highly indicative of an attack using steganography.
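
The correlation logic in this scenario can be sketched as a simple time-window rule. The event kinds, the ten-minute window, and all names below are hypothetical; the sketch illustrates the general idea and is not how Jizô's engines are implemented.

```python
from dataclasses import dataclass


@dataclass
class Event:
    time: float   # seconds since an arbitrary start
    source: str   # device that generated the traffic
    kind: str     # "rare_connection" or "image_transfer" in this sketch


def correlate(events: list[Event], window: float = 600.0) -> list[str]:
    """Return sources where an image transfer follows a rare connection within `window` seconds."""
    flagged: list[str] = []
    last_rare: dict[str, float] = {}
    for event in sorted(events, key=lambda e: e.time):
        if event.kind == "rare_connection":
            last_rare[event.source] = event.time
        elif event.kind == "image_transfer":
            seen = last_rare.get(event.source)
            recent = seen is not None and event.time - seen <= window
            if recent and event.source not in flagged:
                flagged.append(event.source)
    return flagged


events = [
    Event(0.0, "10.0.0.8", "rare_connection"),
    Event(120.0, "10.0.0.8", "image_transfer"),   # follows a rare connection
    Event(50.0, "10.0.0.9", "image_transfer"),    # no rare connection before it
    Event(0.0, "10.0.0.7", "rare_connection"),
    Event(5000.0, "10.0.0.7", "image_transfer"),  # outside the time window
]
print(correlate(events))  # prints ['10.0.0.8']
```

Neither event is alarming on its own; it is the combination from the same device within a short time that raises confidence.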


CONCLUSION

Steganography is a formidable tool that has been used for centuries to discreetly conceal information. Technological advancements have created new possibilities in this field, particularly with the advent of the internet. Exchanges have multiplied, and communication methods and media have diversified, offering numerous opportunities to hide data. These methods have been adopted and further developed by malicious actors, who use them to carry out cybercriminal operations. Detecting these methods is a critical challenge for detection teams.

While it is very difficult to detect most steganography techniques directly, the communications carrying the steganographic elements remain visible. This is where Jizô delivers significant value. Thanks to its interconnected and complementary detection modules, led by its advanced artificial intelligence, Jizô can detect a wide range of attacks employing steganography. The platform’s full visibility of network traffic and knowledge of normal patterns within it enable precise detection of most attacker attempts.

Of course, some blind spots remain, but exploiting them would require attackers to possess highly advanced technological capabilities and deep knowledge of their victim's network, and would limit the quantity of data they could transfer.