Abstract

This paper proposes a new steganography method for hiding data in dynamic GIF (Graphics Interchange Format) images. Within the STC framework, we propose a new cost assignment algorithm based on the characteristics of dynamic GIF images, including the image palette and the correlation between frames. We also propose a payload allocation algorithm for different frames. First, we reorder the palette of GIF images to reduce the changes in pixel values caused by modifying the index values. As different modifications of index values have different impacts on pixel values, we assign small embedding costs to the elements whose modification has less impact on pixel values. Meanwhile, small embedding costs are also assigned to the elements in regions where the interframe changes are large enough. Finally, we calculate an appropriate payload for each frame using the embedding probability obtained from the proposed distortion function. Experimental results show that the proposed method achieves better security performance than state-of-the-art works.

1. Introduction

Steganography is a technology that hides secret data in covers for covert transmission [1]. The most important aim of steganography is to resist the adversary’s detection using steganalysis tools [2–4]. As digital images are the most popular covers in steganography, many methods and tools for digital image steganography have been developed since the 1990s, such as LSB (Least Significant Bitplane) [5]. To combat early steganalysis attacks, some algorithms have also been designed to preserve statistical properties [6]. However, most of the traditional works are not secure enough against modern steganalysis.

Currently, content-adaptive steganography approaches are designed to improve security by minimizing the distortion between the cover and the stego. According to this idea, a popular framework for digital image steganography is defined in [7, 8], which includes two parts, i.e., allocating the embedding cost and realizing the data encoding. In the first part, an embedding cost is allocated to each element of the cover to quantify the effect on the cover image when the element is modified [9, 10]. In the second part, an encoding method is designed to approach the theoretical embedding payload under the distortion function. Nowadays, STC (Syndrome-Trellis Codes) is the most popular tool for adaptive embedding [11, 12]. In this framework, defining a suitable distortion function is vital for steganographic security. Starting from HUGO [13], many distortion functions have been proposed for spatial and JPEG images, e.g., MVGG [7], WOW [14], UNIWARD [15], HILL [16], and UED [17]. For spatial images, distortion functions allocate the embedding costs for gray values or RGB values. For JPEG images, distortion functions allocate the embedding costs for JPEG coefficients. Besides, some spatial image distortion functions can be used for JPEG images by allocating the embedding costs for the coefficients obtained from the inverse transformation of JPEG coefficients. As a countering technique, many effective steganalysis approaches have also been proposed. Steganalysts typically train models to distinguish suspicious images from clean images [18]. Generally, a steganalysis model contains two parts. One is to extract features from a set of images, such as SRM [18], SPAM [19], DCTR [20], and GFR [21]. The other is to train a classifier using machine learning tools, e.g., the ensemble classifier [22]. Besides, both parts can also be realized by deep learning.

With the development of 4G/5G communication techniques and social networks, GIF (Graphics Interchange Format) images have become increasingly popular because of their small size and convenient display of animation effects. Different from spatial images and JPEG images, each image frame in a GIF is composed of a color list, i.e., the palette, and a set of index values, i.e., the image data. When using GIF images as covers, there are three ways to hide secret data. The first is structure steganography, which modifies certain sections in the file header to accommodate secret data. The second is to embed secret data by changing the correspondence between index values and actual pixel values in the palette, e.g., Gifshuffle [23]. The third is to embed secret data by slightly modifying the index values, such as OPA (Optimal Parity Assignment) [24] and MBA (MultiBit Assignment) [25].

In 1999, Fridrich proposed a steganography method for palette images, which searches for the closest colors and embeds data according to the parity of pixel values [26]. Some other methods for GIF steganography have also been proposed in [27–29]. In recent years, there have also been some works on GIF steganography. Generally, these methods use mathematical operations to decide whether a pixel should be modified to embed data, e.g., the adaptive method using the MOD operation by Fathurohman et al. [30]. These works have good data embedding capabilities. However, they were designed for static GIF images.

In 2015, a method for embedding data into dynamic GIF images was proposed, which uses EzStego and a chaos system to hide data [31]. From 2016 to 2017, a series of steganography approaches for dynamic GIF images were proposed [32–34]. In those methods, the steganography for static GIF images is directly applied to each frame of the dynamic GIF, and each frame is assigned an equal payload. In 2018, Basak et al. proposed to embed data into dynamic GIF images using the difference between adjacent pixels in the same frame [35]. Saleh and Merzah proposed to combine LZW, EzStego, and LSB to embed secret text messages into dynamic GIF images [36]. Shi et al. proposed to use the STC framework for emoji GIF images by designing a new distortion function [37]. These works perform well on dynamic GIF steganography. However, the correlations between frames are not fully exploited. To design a more secure steganography tool for dynamic GIF images, we must consider the changes between frames. Meanwhile, payload assignment is also important [38]. While previous works embed equal payloads into GIF frames, it would be more appropriate to distribute payloads unevenly across frames to improve statistical undetectability, since each frame has different contents.

In this paper, we propose a new cost assignment algorithm for dynamic GIF images by combining the characteristics of the dynamic GIF image palette and the interframes. Meanwhile, we propose to assign suitable embedding payloads to each frame using the entropy function. Finally, we construct a steganography framework for dynamic GIF images to improve security. The rest of this paper is organized as follows. We introduce the preliminaries of GIF steganography in Section 2. The proposed framework is described in Section 3. Section 4 shows the experimental results and analysis. Section 5 concludes the paper.

2. Preliminary

In this paper, matrices, vectors, and sets are written in boldface, and variables are written in italics. For a dynamic GIF image with frames, we denote the index values of the dynamic GIF image as . For each frame, we denote the th frame as and denote the index value sequence of as . Therefore, can be represented as , in which is the number of pixels in a frame and is the number of frames in the dynamic GIF image. For a color dynamic GIF image, the pixel of the th element of the th frame has an index value and the corresponding RGB vector (, , and ).

In order to embed secret data, a sender modifies a cover image to generate a stego image , in which is the number of pixels in a cover. The additive distortion function between and is the sum of the embedding costs of all elements, i.e., in which is the embedding cost of changing to . According to the adjusted range , the common embedding operations on are divided into two types, i.e., binary when and ternary when . For example, the ±1 embedding operation is a ternary embedding that is expressed by and the stego . We denote the embedding costs of the three cases of as , , and , respectively, in which and .

In [39], for a given embedding cost, the modification probability can be obtained: where is obtained from the constraints of the modification probability and the -bit secret data in (3).
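For reference, in the minimal-distortion embedding literature the quantities described above are commonly written as follows for ternary ±1 embedding. This is a sketch in assumed notation, not a reproduction of the paper's own (2) and (3): $\rho_i^{+}$ and $\rho_i^{-}$ are the costs of the +1 and -1 changes (with zero cost for no change), $p_i^{+}$, $p_i^{-}$, and $p_i^{0}$ are the corresponding probabilities, $\lambda$ is the parameter, $m$ is the payload in bits, and $n$ is the number of elements.

```latex
\begin{align}
p_i^{\pm} &= \frac{e^{-\lambda \rho_i^{\pm}}}{1 + e^{-\lambda \rho_i^{+}} + e^{-\lambda \rho_i^{-}}},
\qquad
p_i^{0} = 1 - p_i^{+} - p_i^{-}, \\
\sum_{i=1}^{n} H\!\left(p_i^{+}, p_i^{-}, p_i^{0}\right) &= m,
\qquad
H\!\left(p^{+}, p^{-}, p^{0}\right) = -\sum_{s \in \{+,-,0\}} p^{s} \log_2 p^{s}.
\end{align}
```

Under these assumptions, larger costs give smaller modification probabilities, and $\lambda$ is the single value that makes the total entropy equal to the payload.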

In [39], it has been theoretically proved that the minimum distortion is achievable when embedding -bit secret data into cover data.

Derivation of the minimum distortion corresponding to a fixed embedding payload is the rate-distortion bound [39]. When embedding -bit secret data, the sender should make the distortion between the cover image and the stego image as small as possible, which can be formulated as the following optimization problems:

In the actual embedding process, we first define a suitable distortion function and then combine the coding methods, e.g., STC, to embed data and limit the distortion as far as possible.
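As a rough illustration of the payload-limited setting, the sketch below finds the parameter (called `lam` here) by bisection so that the total ternary entropy matches a target payload. It assumes the standard Gibbs-form probabilities with zero cost for leaving an element unchanged; the function names and the toy costs are hypothetical and are not taken from the paper.

```python
import math

def ternary_probs(rho_plus, rho_minus, lam):
    """Probabilities of the +1 and -1 changes for one element.
    Assumption: the cost of making no change is zero."""
    a = math.exp(-lam * rho_plus)
    b = math.exp(-lam * rho_minus)
    z = 1.0 + a + b
    return a / z, b / z

def entropy3(p_plus, p_minus):
    """Ternary entropy in bits of (p_plus, p_minus, 1 - p_plus - p_minus)."""
    total = 0.0
    for p in (p_plus, p_minus, 1.0 - p_plus - p_minus):
        if p > 0.0:
            total -= p * math.log2(p)
    return total

def payload(costs, lam):
    """Total payload in bits for a list of (rho_plus, rho_minus) pairs."""
    return sum(entropy3(*ternary_probs(rp, rm, lam)) for rp, rm in costs)

def solve_lambda(costs, target_bits, iters=60):
    """Bisection on lam; the payload shrinks as lam grows."""
    lo, hi = 0.0, 1.0
    while payload(costs, hi) > target_bits and hi < 1e6:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if payload(costs, mid) > target_bits:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Toy costs (hypothetical values) for four elements and a 3-bit payload.
toy_costs = [(1.0, 2.0), (0.5, 0.5), (3.0, 1.0), (0.2, 4.0)]
lam = solve_lambda(toy_costs, 3.0)
```

A practical coder such as STC then tries to embed at a distortion close to the expected distortion implied by these probabilities.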

For secret data , we use STC to find the closest codewords to cover in coset of as stego.

The definition of coset is shown in (7). where is a parity-check matrix.

As the stego is found in the codeword , the secret data can be extracted by multiplying the stego with , as shown in (8).
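The coset search and the extraction step can be illustrated on a toy binary example. The sketch below is not STC itself: it brute-forces the coset instead of running the Viterbi algorithm on a trellis, and it works on bits rather than the ternary ±1 changes used in this paper. The matrix `H`, the cover bits, and the costs are hypothetical values chosen for illustration.

```python
from itertools import product

def syndrome(H, bits):
    """Multiply the parity-check matrix H (list of rows) by bits over GF(2)."""
    return tuple(sum(h * b for h, b in zip(row, bits)) % 2 for row in H)

def embed(H, cover_bits, costs, message):
    """Exhaustively search the coset {y : H y = message} for the stego
    with the smallest total cost of flipped positions."""
    best, best_cost = None, float("inf")
    for y in product((0, 1), repeat=len(cover_bits)):
        if syndrome(H, y) != tuple(message):
            continue
        cost = sum(c for c, x, b in zip(costs, cover_bits, y) if x != b)
        if cost < best_cost:
            best, best_cost = y, cost
    return best, best_cost

# Toy setting: 4 cover bits carry a 2-bit message.
H = [[1, 1, 0, 1],
     [0, 1, 1, 1]]
cover = (1, 0, 1, 1)
costs = (5.0, 1.0, 2.0, 4.0)
message = (1, 0)

stego, total_cost = embed(H, cover, costs, message)
extracted = syndrome(H, stego)  # the receiver only needs H and the stego
```

In this toy case, flipping the two cheap positions (cost 1.0 and 2.0) is preferred over flipping the single expensive position (cost 5.0), which is the behavior the distortion function is meant to steer.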

3. Proposed Method

In Figure 1, we show the framework of the proposed method for steganography in dynamic GIF images. First, we decompose a dynamic GIF image into frames, each of which is a static GIF image. We design a new distortion function for each frame. Based on the existing distortion functions, we obtain the initial embedding costs for each frame. Meanwhile, according to the characteristics of the GIF image palette and the difference between the current frame and the adjacent frames in the dynamic GIF image, we calculate adjustment factors and for the initial distortion function to construct a new distortion function and obtain the improved embedding costs .

If the payload contains bits, we dynamically allocate the bits to each frame. After identifying the distortion function and the payload for each frame, we obtain the stego GIF image by data embedding.

3.1. Distortion Function Initialization

We first design a steganography method for each frame of dynamic GIF images. By changing the index values in the index matrix, the secret data can be embedded into the cover with slight modification in the image content.

For a 256-color dynamic GIF image, each index value corresponds to an RGB vector (, , and ). We make an analogy between the RGB values of GIF images and the gray values of gray images. Thus, the algorithms of WOW, UNIWARD, HILL, etc. can be used to define the distortion function for each GIF frame.

For each frame, we extract the matrix of RGB values from the GIF image palette separately. We use a spatial distortion function, e.g., HILL or UNIWARD, as the initial distortion function to calculate the embedding costs of each pixel in the RGB channels, and denote the embedding costs of the th element in the th frame in the RGB channels as , , and . Therefore, we obtain an initial distortion function

3.1.1. Palette Sort Algorithm

As shown in Figure 2, the GIF image palette is not arranged in any particular order. Therefore, similar RGB vectors may be separated. For example, the difference between the RGB values corresponding to index 2 and index 100 is smaller than the difference between the RGB values corresponding to index 2 and index 3. To achieve better undetectability, the initial pixel value of the cover and the modified value of the stego should be as close as possible. Since the RGB cube is a three-dimensional space, the Euclidean distance is a natural measure of the closeness of two colors. Hence, we rearrange the GIF image palette using the Euclidean distance as the criterion.

We obtain the original palette and the original index matrix from the GIF image. The original palette contains 256 RGB vectors. We denote each element in the palette as , in which is the index value, ranging from 0 to 255.

As the example shown in Figure 3, we have . By rearranging the original palette, we can obtain a sorted palette and a new index matrix . We denote the new palette as . The steps of rearranging the palette are depicted as follows.

Step 1. We find most frequently used RGB vectors from and denote their index value as . For each vector corresponding to the index in , we find a vector from (excluding the indexes in) that has the smallest distance. We denote the index values of these vectors as .

Step 2. We generate a random number from 0 to 255. Let be the first row in the new palette . Meanwhile, we find the positions in that have the index values equal to . On the same positions in , we set the values as . In the example in Figure 3, since , we assign as . We find the positions in that have the index value 4 and set the values on these positions in as 0.

Step 3. From the palette , we find a vector that has the smallest Euclidean distance to . The vector must not have been used in previous steps. We set this vector as in the new palette . We further find the positions in that have the index values equal to . On the same positions in , we set the values as . In the example in Figure 3, since has the smallest Euclidean distance to , we assign as . Meanwhile, we find the positions in that have the index value 56 and set the values on these positions in as 1.

Step 4. If the index value of belongs to the set or , say or , we must use or to fill the -th vector in . Otherwise, we use Step 3 to fill the -th vector in . As shown in Figure 3, as and , we use () to fill the -th vector in . Meanwhile, we set the values in corresponding positions in as 3.
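The core of the rearrangement can be sketched as a greedy nearest-neighbour ordering. This is a simplified illustration under stated assumptions: it uses a fixed starting index instead of a random one, and it omits the special handling of the most frequently used colors from Step 1 and Step 4. The function names and the four-color palette are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two RGB vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def reorder_palette(palette, index_matrix, start=0):
    """Greedy ordering: each new palette row is the unused color closest
    to the previously placed one. Returns (new_palette, new_index_matrix).
    Assumption: the frequent-color pairing of Step 1 / Step 4 is left out."""
    remaining = set(range(len(palette)))
    order = [start]
    remaining.discard(start)
    while remaining:
        last = palette[order[-1]]
        # Ties are broken by the smaller original index.
        nxt = min(remaining, key=lambda k: (sq_dist(palette[k], last), k))
        order.append(nxt)
        remaining.remove(nxt)
    old_to_new = {old: new for new, old in enumerate(order)}
    new_palette = [palette[old] for old in order]
    new_index = [[old_to_new[v] for v in row] for row in index_matrix]
    return new_palette, new_index

# Toy 4-color palette in which similar colors are far apart by index.
palette = [(0, 0, 0), (250, 250, 250), (10, 10, 10), (240, 240, 240)]
index_matrix = [[0, 1], [2, 3]]
new_palette, new_index = reorder_palette(palette, index_matrix, start=0)
```

Because the index matrix is remapped together with the palette, every pixel keeps its original color; only the meaning of "index ± 1" changes.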

3.2. Distortion Function Optimization Based on Magnitude

In dynamic GIF image steganography, we embed data by modifying the index values. It is different from the steganography method for spatial images that embeds data by modifying pixel values. However, we can use the STC framework by optimizing the distortion functions defined for spatial images. Before using the ±1 embedding algorithm, we initialize the distortion function by state-of-the-art designs, e.g., WOW, UNIWARD, or HILL.

For WOW, the distortion algorithm calculates the embedding suitability as a weighted difference to obtain the embedding cost of a pixel, using the difference between the residual of the cover image and the residual of the image in which only the th pixel is modified, as shown in (10), in which “” is the convolution operator, “” is the operator of rotating by 180 degrees, is the embedding suitability, and is the absolute value operator. Besides, is calculated by where is a filter, the th filter, and is the matrix in which only the th pixel is modified, is the matrix with the same size as the cover image . When the modification amplitude is 1, the th element of is 1, while the others are all 0. The value of the weighted difference of WOW depends on the modification amplitude and the filter. Since the calculation of the weighted difference uses an absolute value, changing the pixel value in the forward direction (the +1 embedding operation) and in the reverse direction (the -1 embedding operation) gives the same result.

However, when embedding data in GIF images, we must modify the index values. The modification results in a change of the RGB values. More importantly, the modifications of +1 and -1 have different impacts on the RGB values. As shown in Figure 4, the index value of is 5 and the corresponding value is 0.42. When adding 1 to , the index value changes to 6, the corresponding value changes to 0.26, and the modification amplitude is 0.16. When subtracting 1 from , the index value changes to 4, the corresponding value changes to 0.41, and the modification amplitude is 0.01. For with index value equal to 6, the corresponding value is 0.26 and the modification amplitude is 0.16 when subtracting 1 from . Different operations on different index values thus have different modification amplitudes on RGB values. Therefore, when hiding data in GIF images, we should set different embedding costs for different operations on different index values.

Based on this observation, we calculate the modification amplitudes on RGB values when adjusting the index values by ±1 and optimize the initial distortion function according to the modification amplitude. The pixels with less impact are assigned small embedding costs, and the pixels with more impact are assigned large embedding costs.
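The modification amplitudes can be computed directly from the sorted palette. The sketch below is a hypothetical helper, not the paper's formula: it returns the per-channel absolute change for the +1 and -1 operations and uses the three single-channel values quoted from Figure 4 (for index values 4, 5, and 6) as a toy palette, so list position 1 plays the role of index value 5.

```python
def modification_amplitudes(palette, index):
    """Per-channel absolute change of the color when the index moves by
    +1 and by -1. Returns (plus, minus); an entry is None when the
    neighbouring index falls outside the palette."""
    def diff(j):
        if j < 0 or j >= len(palette):
            return None
        return tuple(abs(a - b) for a, b in zip(palette[index], palette[j]))
    return diff(index + 1), diff(index - 1)

# Single-channel values from the Figure 4 example (index values 4, 5, 6).
toy_palette = [(0.41,), (0.42,), (0.26,)]
plus, minus = modification_amplitudes(toy_palette, 1)
```

Here the +1 operation changes the value by about 0.16 while the -1 operation changes it by about 0.01, so the -1 operation on this element would deserve the smaller embedding cost.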

Accordingly, we propose to include an adjustment parameter δ, called the amplitude weight, to control the proportion of the modification amplitude in the new distortion function. The setting of δ will be discussed in Section 4.1. Since the initial distortion functions are defined for the RGB channels, we provide the adjustment factors for the three channels, respectively. In (12), , , and are the optimization factors for the +1 embedding operations of the th pixel in the th frame in the three color channels, and , , and are the optimization factors for the -1 embedding operations of the th pixel in the th frame in the three color channels. δ is the amplitude weight. , , and are the absolute differences of the RGB values of and (), and , , and are the absolute differences of the RGB values of and ().

Combining three optimization factors with the initial distortion functions, we adjust the distortion function as (13).

3.3. Distortion Function Improvement Using the Correlation of Interframes

In spatial image steganography, the texture and edge regions are more appropriate for data hiding than the other regions, as it is more secure against the steganalysis. Most STC embedding methods assign these regions with low embedding costs. This principle can also be used for steganography in dynamic GIF images. As there are many frames in a dynamic GIF image, we also use the changes in interframes. When displaying animation effects, most of the successive frames have strong correlations, which can be found in Figure 5.

We propose to integrate the correlations of interframes into distortion function optimization. The pixels that have larger changes in interframes would be assigned with smaller embedding costs. First, we calculate the motion differences of adjacent frames of each pixel. Then, we select the appropriate pixels according to these motion differences. After that, we set adjustment parameters to adjust the embedding costs of these pixels to achieve the adjustment factor .

For each pixel in dynamic GIF image , we calculate the difference between each frame and the next frame by (14). where is the difference between the th pixel in the th frame and the th pixel in the ()th frame. is the absolute difference of and , is the absolute difference of and , is the absolute difference of and . is the number of pixels in a frame and is the number of frames in dynamic GIF image.

As adjacent pixels are related, we use the average value of the differences of the pixels in a block as the difference of the pixel in the center of the block. In order to better evaluate the degree of change between frames, we average the difference between the present and the previous frames and the difference between the present and the next frames. We use this average as the motion difference of adjacent frames, which is depicted in (15), in which is the motion difference of the th pixel in the th frame, and is the difference between the th pixel in the th frame and the th pixel in the ()th frame.
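The two averaging steps can be sketched as follows. Several details are assumptions made for illustration and are not taken from (14) and (15): the per-pixel difference is taken as the sum of the absolute R, G, and B differences, the block is a 3×3 window clipped at the image border, and the first and last frames use their single available neighbour. At least two frames are required.

```python
def frame_diff(f1, f2):
    """Per-pixel sum of absolute R, G, B differences between two frames
    (assumed combination of the three channel differences)."""
    return [[sum(abs(a - b) for a, b in zip(p, q)) for p, q in zip(r1, r2)]
            for r1, r2 in zip(f1, f2)]

def block_mean(d, radius=1):
    """Mean of d over a (2*radius+1)^2 window, clipped at the borders."""
    h, w = len(d), len(d[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [d[u][v]
                    for u in range(max(0, i - radius), min(h, i + radius + 1))
                    for v in range(max(0, j - radius), min(w, j + radius + 1))]
            out[i][j] = sum(vals) / len(vals)
    return out

def motion_difference(frames, t, radius=1):
    """Average of the block-smoothed differences between frame t and its
    previous and next frames (only one neighbour at the sequence ends)."""
    maps = []
    if t > 0:
        maps.append(block_mean(frame_diff(frames[t], frames[t - 1]), radius))
    if t < len(frames) - 1:
        maps.append(block_mean(frame_diff(frames[t], frames[t + 1]), radius))
    h, w = len(frames[t]), len(frames[t][0])
    return [[sum(m[i][j] for m in maps) / len(maps) for j in range(w)]
            for i in range(h)]

# Three toy 2x2 frames given as rows of (R, G, B) tuples.
f0 = [[(0, 0, 0)] * 2 for _ in range(2)]
f1 = [[(10, 0, 0)] * 2 for _ in range(2)]
f2 = [[(10, 4, 0)] * 2 for _ in range(2)]
md = motion_difference([f0, f1, f2], 1)
```

Pixels with a large value in the resulting map are the candidates whose embedding costs are reduced in the following steps.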

After calculating the interframe changes of each pixel of the dynamic GIF image, we sort all pixels according to the motion difference and select the pixels with the top of large changes to decrease their embedding costs. Considering that changes exceeding 38 dB are imperceptible to the human eye, we combine the image content to select the pixels with changes below 38 dB to calculate the selection range adaptively. where is the mean of the motion differences of the pixels with changes below 38 dB, the motion differences of the pixels with changes equal to 38 dB, and the motion difference parameter is empirically set as 0.2. We select the pixels with the top by (17)

Subsequently, we define the adjustment factor in (18) according to the correlation of interframes.

Finally, we obtain the final distortion function

3.4. Payload Allocation

In dynamic GIF image steganography, we calculate the embedding costs for each frame. Some frames contain more pixels with small embedding costs and can therefore accommodate more secret data. To achieve better security under a constant total payload, we assign a different payload to each frame according to its own characteristics.

According to the calculation of the initial embedding costs and optimization factors, we get the optimized embedding costs for each pixel in the dynamic GIF image. During data hiding, the whole distortion should satisfy the rate-distortion bound in (20), where is the number of pixels in a frame and is the number of frames in the dynamic GIF image. is the embedding cost of the th pixel in the th frame. is the modification probability of the th pixel in the th frame. is an entropy function that calculates the payload based on the modification probability, which is depicted in (3). The modification probability can be obtained according to the embedding cost and a parameter . The parameter is obtained from the constraints of the modification probability and the payload of secret data in (20).

After obtaining the distortion function, we input the embedding costs of all frames and the payload of secret data into the constraints in (20) to calculate the parameter . Accordingly, we obtain the modification probability of each frame based on the embedding costs and the parameter . Subsequently, embedding payload of each frame can be calculated by (21)
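The allocation described above can be sketched as follows: one parameter is solved over all frames jointly, and each frame then receives the entropy of its own modification probabilities. This is an illustration under assumptions, not the paper's (21): it uses the Gibbs-form ternary probabilities with zero cost for no change, and the names `allocate`, `frame_costs`, and the toy cost values are hypothetical.

```python
import math

def pixel_entropy(rho_plus, rho_minus, lam):
    """Ternary entropy in bits of one pixel's modification probabilities.
    Assumption: leaving the pixel unchanged has zero cost."""
    a, b = math.exp(-lam * rho_plus), math.exp(-lam * rho_minus)
    z = 1.0 + a + b
    return -sum(p * math.log2(p) for p in (a / z, b / z, 1.0 / z) if p > 0.0)

def frame_payloads(frame_costs, lam):
    """Payload in bits of each frame for a shared parameter lam."""
    return [sum(pixel_entropy(rp, rm, lam) for rp, rm in frame)
            for frame in frame_costs]

def allocate(frame_costs, total_bits, iters=80):
    """Solve lam for the whole image by bisection, then split the payload
    across frames according to their own entropies."""
    lo, hi = 0.0, 1.0
    while sum(frame_payloads(frame_costs, hi)) > total_bits and hi < 1e6:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(frame_payloads(frame_costs, mid)) > total_bits:
            lo = mid
        else:
            hi = mid
    return frame_payloads(frame_costs, 0.5 * (lo + hi))

# Two toy frames of four pixels each: one cheap to modify, one expensive.
cheap_frame = [(0.1, 0.1)] * 4
costly_frame = [(5.0, 5.0)] * 4
per_frame = allocate([cheap_frame, costly_frame], total_bits=4.0)
```

With these toy costs, almost all of the 4 bits go to the cheap frame, which matches the intent of giving larger payloads to frames that contain more low-cost pixels.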

4. Experimental Results

To verify the proposed method, we have conducted many experiments. Two dynamic GIF image datasets are constructed, i.e., SPORTBase and HAPPYBase. Each dataset contains 500 dynamic GIF images, and each image has 20 frames, i.e., 10,000 static GIF frames in each dataset. They are all color dynamic images with 8-bit index values. SPORTBase consists of football game images downloaded from https://www.zhibo8.cc/. HAPPYBase consists of emoticons from https://www.soogif.com/, https://www.sina.com.cn/ and other websites. Both are available at https://github.com/jzlin1997/GIF-Image-Steganography.

We use HILL, UNIWARD, and WOW as the initial distortion function in the proposed method. After optimizing the initial distortion function and allocating the embedding payloads, we generate a new steganography method for dynamic GIF images. We name the proposed steganography based on improved HILL, UNIWARD and WOW as Gp-HILL, Gp-UNIWARD and Gp-WOW, respectively.

We use the steganalysis tools of the state-of-the-art SPAM feature set [19] and SRM feature set [18] with the ensemble classifiers [22]. We obtain the RGB values of stego GIF images to calculate the gray values and extract SPAM features from the gray values. Half of the cover and stego feature sets are used for training, and the rest are used for testing. We use the minimal total error $P_E$ with equal priors achieved on the testing sets as the criterion to evaluate the performance of the feature sets:

$P_E = \min_{P_{FA}} \frac{1}{2}\left(P_{FA} + P_{MD}\right)$ (22)

In (22), $P_{FA}$ is the false alarm rate and $P_{MD}$ is the missed detection rate. The average of $P_E$ over 10 random tests is used to evaluate the performance.
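The error criterion can be computed from detector outputs as in the sketch below. It assumes that the detector produces one real-valued score per image, with higher scores meaning "more likely stego", and sweeps a threshold over the observed scores; the function name and the toy scores are hypothetical and are not tied to the ensemble classifier used in the experiments.

```python
def min_total_error(cover_scores, stego_scores):
    """Minimal total error with equal priors: the minimum over thresholds
    of (false alarm rate + missed detection rate) / 2.
    An image is flagged as stego when its score is >= the threshold."""
    thresholds = sorted(set(cover_scores) | set(stego_scores))
    thresholds.append(thresholds[-1] + 1.0)  # threshold that flags nothing
    best = 1.0
    for th in thresholds:
        p_fa = sum(s >= th for s in cover_scores) / len(cover_scores)
        p_md = sum(s < th for s in stego_scores) / len(stego_scores)
        best = min(best, 0.5 * (p_fa + p_md))
    return best

# Toy scores: one cover image overlaps with the stego score range.
toy_cover = [0.1, 0.2, 0.3, 0.6]
toy_stego = [0.4, 0.5, 0.7, 0.8]
error = min_total_error(toy_cover, toy_stego)
```

A value near 0.5 means the detector does no better than guessing, so higher testing errors indicate better security of the embedding method.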

In Section 4.1, we discuss the adjustment parameter, i.e., the amplitude weight. Based on appropriate amplitude weight, we find a suitable value of the motion difference parameter. Subsequently, we compare our proposed method with the basic method using initial distortion functions and Shi et al. [37] in Section 4.2 and Section 4.3. We give a subjective evaluation of the stego image in Section 4.2 and a specific analysis of undetectability in Section 4.3.

4.1. Parameter Determination

In Section 3.2, we have shown that the modification amplitude is an important factor for the distortion function. Considering that different operations have different modification amplitudes on pixel values, we embed data not only in texture regions but also in pixels with small modification amplitudes. We define an adjustment parameter δ, called the amplitude weight, to adjust the proportion of the modification amplitude in the new distortion function. If δ is too large, the embedding costs of pixels that have small modification amplitudes but lie in smooth regions would become much smaller, and the benefit of embedding in texture regions would be ignored.

We use Gp-HILL and Gp-UNIWARD to embed 0.15 bpp into dynamic GIF images in SPORTBase with different values of δ. The SPAM errors are shown in Figures 6(a) and 6(b). The results indicate that the largest testing errors can be achieved when . Therefore, we set .

Based on the identified amplitude weight δ, we look for the optimal motion difference parameter in SPORTBase. We use Gp-HILL and Gp-UNIWARD to embed 0.15 bpp into GIF images and vary the values of , respectively. The results shown in Figure 7 indicate that the largest testing errors can be achieved when .

4.2. Image Quality

In Figure 8, we compare the quality of the stego frames generated by the proposed method and the other methods. We use HILL as the initial distortion function. The stegos are generated by the initial function, the function in [37], and ours. The embedding payload is 0.2 bpp. Figure 8(a) shows the original frame. Figures 8(b)–8(d) show the stegos generated by the different distortion functions. Traditional distortion functions used for GIF embedding only consider embedding in texture regions and ignore the differences in modification amplitude. Therefore, there are some drawbacks, such as the black points appearing in the blue coat in Figure 8(b). In contrast, the distortion function in [37] focuses on embedding secret data into the pixels with a smaller modification amplitude, which produces a large white patch in the blue coat in Figure 8(c). The results show that the proposed method achieves better quality.

Meanwhile, based on HILL, we calculate the mean PSNR (peak signal-to-noise ratio) of the stegos generated by the initial distortion function, by Shi et al.’s method, and by our Gp-HILL in Table 1. As shown in Table 1, the PSNR of the stegos generated by Gp-HILL is improved at all payloads. For example, the PSNR of the stego generated by Gp-HILL is 40.01 dB when the embedding payload is 0.15 bpp. Compared with the PSNR of the stego generated by the basic method using HILL, our method improves the PSNR by 5.11 dB. Moreover, from 0.05 bpp to 0.2 bpp, the PSNR of the stego generated by Gp-HILL reaches a level at which the changes are imperceptible to the human eye.
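For readers who want to reproduce the quality measurement, a minimal PSNR sketch for one frame is given below. It assumes 8-bit RGB values and a mean squared error taken jointly over all pixels and all three channels; how the channels and the frames are averaged in Table 1 is not stated here, so this is an assumption.

```python
import math

def psnr(cover, stego, peak=255.0):
    """PSNR in dB between two frames given as rows of (R, G, B) tuples.
    Assumption: the squared error is averaged over all pixels and channels."""
    squared_error, count = 0.0, 0
    for row_c, row_s in zip(cover, stego):
        for pix_c, pix_s in zip(row_c, row_s):
            for a, b in zip(pix_c, pix_s):
                squared_error += (a - b) ** 2
                count += 1
    if squared_error == 0.0:
        return float("inf")  # identical frames
    mse = squared_error / count
    return 10.0 * math.log10(peak ** 2 / mse)

# Toy 1x1 frame in which a single channel changes by one level.
toy_cover = [[(100, 100, 100)]]
toy_stego = [[(101, 100, 100)]]
value = psnr(toy_cover, toy_stego)
```

The mean PSNR of a dynamic GIF image can then be taken as the average of this value over its frames.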

4.3. Security against Steganalysis

We have also conducted many experiments to verify the capabilities of countering steganalysis. The proposed Gp-HILL, Gp-UNIWARD, and Gp-WOW are used to hide secret data into the dynamic GIF images in SPORTBase and HAPPYBase. Five different payloads from 0.05 bpp to 0.25 bpp are used.

We use SPAM and SRM to extract features from covers and stegos to evaluate the security of the proposed method. The proposed method is compared with the basic method and Shi et al.’s method in [37]. The basic method and Shi et al.’s method use HILL, UNIWARD, and WOW as the initial distortion functions.

In Table 2 and Figures 9(a)–9(f), we show the steganalysis results of data hiding in SPORTBase. Table 2 indicates that the proposed method outperforms the basic method and Shi et al.’s method for all three distortion functions with the SPAM feature set. For example, the testing error of Gp-HILL with SPAM is 0.3414 when the embedding payload is 0.25 bpp, which is 0.0465 higher than the testing error of the basic method using HILL with SPAM. As the embedding payload increases, the improvement in testing error also increases. Further, Table 2 shows that all the testing errors obtained by the proposed method with SRM are higher than those of the basic method from 0.05 bpp to 0.25 bpp and higher than those of Shi et al.’s method from 0.05 bpp to 0.15 bpp. When the payload becomes larger, embedding in texture areas loses its effect; for example, the testing error of Gp-HILL with SRM is 0.0018 when the embedding payload is 0.25 bpp. Therefore, at larger payloads the testing errors obtained by the proposed method with SRM are lower than those of Shi et al.’s method, although the image quality of Shi et al.’s method is worse. Overall, the results show that the proposed steganography method improves the resistance to steganalysis.

To further evaluate the performance of the proposed method, we conduct experiments using HAPPYBase. Figures 10(a) and 10(b) and Table 3 show the comparisons of testing errors between Gp-HILL, the basic method, and Shi et al.’s method using HILL in HAPPYBase with SPAM and SRM. As shown in Table 3, the testing errors of the proposed method are higher than those of the basic method and Shi et al.’s method in all situations in HAPPYBase, which has richer texture than SPORTBase. The results indicate that the proposed method for dynamic GIF image steganography achieves better performance in resisting modern steganalysis and verify that the parameters are also appropriate for other databases.

5. Conclusions

The paper presents a new steganography method for the dynamic GIF images based on STC framework. We rearrange the palette and calculate the modification amplitudes on pixel values according to the mapping relationships between index values and RGB values. According to the modification amplitudes on pixel values, we calculate the adjustment factors for each color channel. Considering the strong correlation of interframes, we use the motion difference of adjacent frames as an adjustment factor to adjust embedding costs. Combining adjustment factors, we design a new method of distortion function specification. Besides, with the embedding probabilities on different frames, we also assign different payloads for each frame to achieve higher security. The experimental results show that the proposed method has a better performance than previous works.

Data Availability

The links to the databases and the experimental results used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the Natural Science Foundation of China (Grant 61572308, U1736213, and U1636206) and Shanghai Excellent Academic Leader Plan (16XD1401200).