1 Introduction

Digital steganography is the art of hiding messages in redundant parts of digital media such as images, video, and audio for the purpose of covert communication. So far, many steganography algorithms have been proposed for different types of covers [13]. In contrast, steganalysis is the technique of detecting stego objects, extracting or removing the message embedded by steganography. Existing steganalysis researches mainly focus on detecting stego objects [411], but investigators are more interested in extracting the hidden secret information.

In recent years, some steganalysis techniques that can locate and even extract the hidden message, also referred to as the payload, have been reported for the following four cases. (1) In the case of that the investigator owns multiple stego images with messages embedded into the same positions, Ker first proposed to locate the payload of LSB (least significant bit) replacement by averaging the weighted stego-image residuals in the same positions [12, 13]. Subsequently, a series of payload locating steganalysis algorithms were proposed for spatial LSB steganography [1417]. In 2014, Quach used the residuals to estimate the rough sequence of stego positions when owning enough stego images embedded messages of different sizes along the same path [18]. In 2019, Yang et al. proposed a locating methodology based on quantitative steganalysis for this case [19]. Recently, Wang et al. proposed a payload locating method based on co-frequency sub-image filtering for a category of pseudo-random JPEG image steganography, such as JSteg and F5 steganography [20]. (2) In the case of that the investigator owns a single stego image, in 2012, Quach proved that the modified pixels in a stego image can be located with a lower error rate if enough independent non-random discriminant functions can be used [21]. Then, Yang et al. fused spatial and wavelet filtering results to locate the modified pixels of LSB matching steganography [22]. (3) In the case of that the investigator knows the embedding position generator, Zhang et al. proposed three attack algorithms to recover the stego key of LSB steganography if the carrier is known or reused [23]. Fridrich et al. used χ2 testing to recover the stego key of LSB steganography if the carrier is unknown [24, 25]. Later, Zhang et al. and Liu et al. used the single-key collision attack algorithm to recover the stego key of LSB steganography [2628]. Yang et al. recovered the stego key based on the optimal stego subset property of MLSB (multiple least significant bits) steganography [29]. Xu and Liu et al. utilized the statistical differences between the quantized DCT (discrete cosine transform) coefficients and the distribution differences between the extracted message bits to recover the stego key of some typical steganography algorithms, such as OutGuess, JPEG domain random LSB matching, and F5 steganography without matrix encoding [30, 31]. Additionally, some quick stego key recovery algorithms have been proposed for specific carriers or embedding position generators [32, 33]. (4) In the case of that the secret message is embedded into an image sequentially using an encoding algorithm, Chen et al. proposed a differential attack for matrix embedding under the chosen stego condition [34]. Luo et al. proposed a message extraction algorithm for HUGO (highly undetectable stego) steganography with STC (syndrome-trellis codes) based on blind coding parameters recognition when the embedded message is plaintext [35]. Gan et al. proposed an algorithm based on partially known plaintext to extract the encrypted file embedded by HUGO steganography when the file format name and message length have been embedded without encryption [36].

In a word, the existing algorithms perform well for above four cases. However, when an encoding algorithm is used to embed the secret messages into pixels or coefficients pseudo-randomly selected from an image, there are still no effective algorithms which can extract the secret messages.

F5 steganography is a typical algorithm which uses matrix encoding to embed messages into the pseudo-randomly shuffled DCT coefficients [37]. This paper proposes a stego key recovery method to recover the check matrix and shuffling key of F5 steganography. Firstly, according to the characteristic of the matrix encoding used in F5 steganography, the check matrix is recovered based on the embedding ratio estimated by quantitative steganalysis. The shuffling key is then recovered based on the difference between the distributions of bit sequences extracted by true and false shuffling keys. Additionally, the cardinality of the shuffling key space is reduced by examining the extracted encoding parameter and message length. Experimental results show that the proposed method can accurately and effectively recover the stego key containing the check matrix and shuffling key.

2 Related work—F5 steganography

F5 steganography is a typical JPEG image steganography algorithm proposed by A. Westfeld. The modification pattern in F5 steganography is very simple, but its optimality has been proved under the condition that one cannot obtain the DCT coefficients before quantization [38]. The matrix encoding was firstly introduced into steganography by A. Westfeld to design F5 steganography and combined with shuffling to significantly improved the security. Additionally, F5 steganography embeds messages much faster than the new STC-based steganography. Because of above reasons, programmers have developed many steganography tools based on F5 steganography algorithm suitable for different operating systems such as Linux, Windows, and Mac. However, the existing many stego key recovery methods were just designed for the simplified F5 steganography without matrix encoding and can not distinguish true and false stego keys of F5 steganography with matrix encoding. Therefore, it should be valuable to recover the stego key of F5 steganography with matrix encoding.

F5 steganography first shuffles the quantized DCT coefficients, and then uses matrix encoding technology to embed the secret message into the shuffled coefficients. The matrix encoding can be represented as (1,W,w), which denotes that w bits of the message are embedded into W (W=2w−1) non-zero coefficients with at most one coefficient modified.

The embedding procedure of F5 steganography is as follows (see Fig.1).

  1. 1)

    Decode the given cover JPEG image to obtain the quantized DCT coefficients.

  2. 2)

    Generate a shuffling key from a given password and shuffle the coefficients obtained in step 1.

  3. 3)

    Count the available non-zero alternating current (AC) coefficients and compute the matrix encoding parameter w based on the number of available non-zero AC coefficients and the message length.

  4. 4)

    Embed 31 bits of metadata (matrix encoding parameter w and message length l) into the shuffled non-zero AC coefficients.

  5. 5)

    Embed the message into the rest of the available non-zero AC coefficients by matrix encoding (1,W,w).

  6. 6)

    Inversely shuffle the stego quantized DCT coefficients to the original order.

  7. 7)

    Store the stego JPEG image.

Fig. 1
figure 1

Embedding procedure of F5 steganography

The extraction procedure of F5 steganography is as follows (see Fig.2).

  1. 1)

    Decode the given stego JPEG image to obtain the quantized DCT coefficients.

    Fig. 2
    figure 2

    Extraction procedure of F5 steganography

  2. 2)

    Generate a shuffling key from a given password and shuffle the coefficients obtained in step 1.

  3. 3)

    Extract the metadata (matrix encoding parameter c and message length l).

  4. 4)

    Extract the embedded message by matrix decoding.

The following two characteristics of the quantized DCT coefficients will be maintained after embedding messages into an image by F5 steganography (see Fig.3).

  1. (i)

    The quantized DCT coefficients with larger absolute values appear with lower frequency, so the bins of the quantized DCT coefficients with larger absolute values are smaller.

  2. (ii)

    As the absolute value of the quantized DCT coefficients increases, the frequency of the DCT coefficient decreases at a smaller rate. For example, the difference between two adjacent bins close to coefficient value 0 is greater than that between two adjacent bins far from coefficient value 0.

Fig. 3
figure 3

Histogram of the quantized DCT coefficients in a JPEG image

The above characteristics mean that the number of 1s in the LSBs of non-zero AC coefficients is greater than the number of 0s.

3 Method—stego key recovery method for F5 steganography

From Figs. 1 and 2, it is clear that the stego key of F5 steganography is composed of the shuffling key and the check matrix of matrix encoding. If one can obtain these two components, then the embedded message can be extracted. The brute force attack method tries all possible shuffling keys and check matrices, so it is highly inefficient. If the check matrix or shuffling key can be obtained before searching for another, then the time complexity could be reduced significantly. Following this thought, this section will present a stego key recovery method for F5 steganography with matrix encoding. The presented method contains two main procedures: recovery of check matrix and recovery of shuffling key, which are described in the following parts.

In order to simplify the description, the following symbols are defined firstly.

  1. 1)

    Let X=x1,x2,…,xN denote the quantized DCT coefficients in the cover image, where N is the number of quantized DCT coefficients in the cover image.

  2. 2)

    Let Y=y1,y2,…,yN denote the quantized DCT coefficients in the corresponding stego image.

  3. 3)

    Let M=m1,m2,…,ml denote the sequence of embedded message bits, where l is the length of message.

  4. 4)

    Let Hw denote the check matrix of matrix encoding used in F5 steganography with parameter w.

  5. 5)

    Let K denote the space of shuffling keys, where |K| is the size of the shuffling key space, and let k0 be the correct shuffling key.

  6. 6)

    Let Mi=(mi,1,mi,2,…,mi,w) denote the ith group of embedded message bits.

  7. 7)

    Let Ci=(ci,1,ci,2,…,ci,W) denote the sequence of bits expressed by the selected ith group of W non-zero AC coefficients.

  8. 8)

    Let Si=(si,1,si,2,…,si,W) denote the sequence of bits expressed by the ith group of W non-zero stego AC coefficients.

3.1 Recovery of check matrix in F5 steganography

During embedding, if F5 steganography modifies a coefficient with an absolute value of 1, one more non-zero AC coefficient is read and added to the buffer to form a new group of W coefficients. F5 steganography does not embed the message into the DC (direct current) coefficients. Therefore, one can compute the estimated capacity with no matrix encoding as follows:

$$ L=h_{DCT}-\frac{h_{DCT}}{64}-h(0)-0.51h(1), $$
(1)

where hDCT denotes the number of quantized DCT coefficients in the cover image, h(0) denotes the number of AC coefficients equal to zero, h(1) denotes the number of AC coefficients with an absolute value of 1, \(\frac {h_{DCT}}{64}\) is the number of DC coefficients, and 0.51h(1) is the estimated loss due to shrinkage.

F5 steganography computes the modified position ai as follows:

$$ a_{i}=bin2dec\left(M^{T}_{i}\bigoplus\left(H_{w}C^{T}_{i}\right)\right), $$
(2)

where the function bin2dec((b1,b2,…,bw)T) converts the binary vector (b1,b2,…,bw)T to a decimal number \({\sum \nolimits }_{j=1}^{w}b_{j}2^{w-j}, M^{T}_{i}\) and \(C^{T}_{i}\) denote the transpositions of Mi and \(C_{i}, \bigoplus \) denotes the bitwise XOR (exclusive or) operation. If ai=0, the selected ith group of non-zero AC coefficients should remain unchanged. If ai≠0, the aith bit in Ci should be changed to obtain the stego bit sequence. That is,

$$ S_{i} = \left\{ \begin{array}{cc} C_{i}, & \rm{when}\quad a_{i} = 0; \\ \left(c_{i, 1}, c_{i, 2}, \ldots, \bar{c_{i, a_{i}}}, \ldots, c_{i, W}\right), & \rm{when}\quad a_{i} \neq 0. \end{array}\right. $$
(3)

On the receiving side, F5 steganography decodes the message bits as follows:

$$ M^{T}_{i}=H_{w}S^{T}_{i}. $$
(4)

From (4), it is apparent that the check matrix Hw is determined by the parameter w and the elements in it. Fortunately, in F5 steganography, the elements in all positions of the check matrix Hw are also determined by the parameter w as follows:

$$ H_{w}\left(i, j\right) = bit\left(j, w-i+1\right) $$
(5)

where bit(j,wi+1) denotes the (wi+1)th least-significant bit of the value j, 1≤iw and 1≤j≤2w−1. For example, when w=2, the check matrix is

$$ H_{2} = \left(\begin{array}{ccc} 0 & 1 & 1 \\ 1 & 0 & 1 \end{array}\right), $$
(6)

and when w=3, the check matrix is

$$ H_{3} = \left(\begin{array}{ccccccc} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{array}\right). $$
(7)

Therefore, the recovery of check matrix in F5 steganography can be viewed as the recognition of the encoding parameter w. Because F5 steganography encodes message bits with as many available coefficients as possible, the parameter w should satisfy the following inequality:

$$ \frac{w-1}{2^{w-1}-1}< r\leq \frac{w}{2^{w}-1}, $$
(8)

where \(r=\frac {l}{L}\). Therefore, we can adopt a quantitative steganalysis algorithm to estimate the embedding ratio in the stego image, then obtain the parameter w using (8). Currently, many quantitative steganalysis algorithms have been proposed for F5 steganography. For example, Fridrich et al. calibrated the given image to estimate the coefficient histogram of the cover image, and then used a least squares method to estimate the message length l [39]. Luo et al. improved the modification ratio estimation in Fridrich’s algorithm based on relative entropy [40].

It can be found that the probability to successfully recover the check matrix depend on whether the message length can be estimated with error \(e=\hat {l}-l\) in the range \(\left (\frac {w-1}{2^{w-1}-1}L-l, \frac {w}{2^{w}-1}L-l\right ]\). Thus, this further demonstrates that it is necessary to design more accurate quantitative steganalysis algorithm.

3.2 Recovery of shuffling key in F5 steganography

Recovering the shuffling key in F5 steganography involves distinguishing the true shuffling key k0 from the key space. This section describes the principle to recover the shuffling key based on the distribution of the extracted message bits.

Firstly, the following symbols are defined:

  1. 1)

    Let u0 denote the frequency of 0 in the bit sequence extracted from the non-shuffled DCT coefficients;

  2. 2)

    Let u1 denote the frequency of 1 in the bit sequence extracted from the non-shuffled DCT coefficients;

  3. 3)

    Let I(k) denote the bit sequence extracted from the DCT coefficients shuffled by key k;

  4. 4)

    Let p0(k,n) denote the frequency of 0 in the first n bits of I(k);

  5. 5)

    Let p1(k,n) denote the frequency of 1 in the first n bits of I(k).

The difference between the distributions of the message bits extracted by the true and false shuffling keys is analyzed as follows.

  1. a)

    Statistical characteristics of message bits extracted by the true shuffling key. In F5 steganography, when the embedded message has been encrypted, it should be a stream of pseudo-random bits in which 0 and 1 appear with equal probabilities. Therefore, there should be p0(k0,n)≈p1(k0,n)≈0.5n, where 0<nl.

  2. b)

    Statistical characteristics of message bits extracted by the false shuffling key. When w=1, the matrix encoding degrades to simple LSB embedding. From statistical characteristic (i) of the quantized DCT coefficients, one can infer that there should be more 1s than 0s in the LSBs of the non-zero AC coefficients of the stego image, i.e., u1>u0. When w≥2, one can infer from statistical characteristic (i) that, in all groups of 2w−1 non-zero AC coefficients selected from the non-shuffled DCT coefficients, the groups whose elements’ LSBs are all 1 will appear most frequently. Because the group \(\phantom {\dot {i}\!}(1, 1, \ldots, 1)_{2^{w}-1}\) will be decoded as \(\phantom {\dot {i}\!}H_{w}(1, 1, \ldots, 1)^{T}_{2^{w}-1}=(0, 0, \ldots, 0)^{T}_{w}\), there should be more 0s than 1s in the bit sequence extracted from the non-shuffled DCT coefficients, i.e., u1<u0. Because the secret message bits are pseudo-randomly spread across all non-zero AC coefficients, the distribution of the message bits extracted by the false shuffling key is similar to that of the message bits extracted from the non-shuffled image. Therefore, in the message bits extracted by the false shuffling key k, there should be p1(k,n)=u1,p0(k,n)=u0, and p1(k,n)≠p0(k,n).

For example, when there are 5 coefficients whose LSBs are 1s and 1 coefficient whose LSB are 0s in a JPEG image, all of the possible stego bit sequences are showed in the following Table 1. It can be seen that there are 4 extracted message bit sequences where the number of 0s is more than half of the length.

Table 1 Bit sequences extracted with parameter w=2 from 5 coefficients whose LSBs are 1s and 1 coefficient whose LSB is 0

In a word, the distribution of the message bits extracted by the true shuffling key is different from that extracted by the false shuffling key. Therefore, we can use a non-parametric hypothesis testing to examine whether the message bits extracted by the test shuffling key conform to the distribution of the message bits extracted by the true shuffling key. Namely, the recovery of the shuffling key can be viewed as the testing with following hypothesis:

$$ \mathbf{H}_{0}: F(x)=F_{0}(x), \mathbf{H}_{1}: F(x)\neq F_{0}(x), $$
(9)

where F0(x) is the distribution of the bit sequence I(k0) extracted by the true shuffling key k0.

Let f(x) denote the probability function of the distribution F0(x). Because the relative frequencies of 1 and 0 in I(k0) are both approximately equal to 0.5, it holds that

$$ f(x)=\left\{\begin{array}{cc} 0.5 & x=0, \\ 0.5 & x=1. \end{array}\right. $$
(10)

In the first n bits of the bit sequence I(k), the actual frequencies of 0 and 1 are np0(k,n) and np1(k,n), respectively. From (10), the true shuffling key k0 produces theoretical frequencies of 0.5n for both 0 and 1. Thus, when the test shuffling key k is true, Pearson’s theorem implies that the limit distribution of the following statistics is the χ2 distribution with a single degree of freedom:

$$ t(k, n)=\frac{\left(np_{0}(k, n)-0.5n\right)^{2}}{0.5n}+\frac{\left(np_{1}(k, n\right)-0.5n)^{2}}{0.5n}\sim \chi^{2}(1). $$
(11)

The probability distribution function of the statistics is

$$ p(k)=P\left(\chi^{2}(1)\leq t\right)=\frac{1}{2^{\frac{1}{2}}\Gamma\left(\frac{1}{2}\right)}\int_{0}^{t}e^{-\frac{x}{2}}x^{-\frac{1}{2}}dx. $$
(12)

The true shuffling key k0 will generate the small value of t(k,n) which would cause small value of p(k). In contrast, if k is a false shuffling key, the limit distribution of the statistics t(k,n) should not be the χ2 distribution with one degree of freedom and generate a larger value of t(k,n) which would cause large value of p(k). Therefore, we attempt to search for the shuffling key k that generates a small value of t(k,n). Note that the search speed of the shuffling key is related to the number of samples n. Larger values of n will produce more accurate results, but reduce the search speed of the shuffling key. Thus, we need to find an appropriate value of n.

For the given significance level α, the threshold value T(α) of t(k,n) used to test the true and false shuffling keys should satisfy

$$ P\left\{\chi^{2}(1)>T(\alpha)\right\}=\alpha, $$
(13)

and the rejection region is (T(α),). When the test key is the true shuffling key k0, the corresponding statistics t(k0,n) takes the smallest value over the whole key space and t(k0,n)<T(α). The statistics t(kj,n) of the false shuffling key kj(j≠0) will be larger than T(α). Let Δ=t(kj,n)−T(α)(j≠0) denote the difference between the statistics t(kj,n) of the false key and the threshold value. Then, we obtain

$$ T(\alpha)+\Delta=\frac{\left(nu_{0}-0.5n\right)^{2}}{0.5n}+\frac{\left(nu_{1}-0.5n\right)^{2}}{0.5n}, $$
(14)

where Δ>0. Thus, the number of bits used should be

$$ n=\frac{T(\alpha)+\Delta}{\frac{\left(u_{0}-0.5\right)^{2}}{0.5}+\frac{\left(u_{1}-0.5\right)^{2}}{0.5}}. $$
(15)

It can be seen that there are two extreme cases in (15), viz. the case of the embedding ratio r→1, and the case of the embedding ratio r→0. In these two extreme cases, it is possible failure to recover the shuffling key.

  1. 1)

    As the embedding ratio r→1, the characteristics u0→0.5,u1→0.5 and (15) would cause that n. Thus, the shuffling key would not be successfully recovered because of an insufficient number of samples.

  2. 2)

    As the embedding ratio r→0, if the message length l is less than n, the shuffling key would not be successfully recovered. In the χ2 testing, the number of samples belonging to each class should satisfy 0.5n≥50, so the number of samples used should satisfy

    $$ n\geq 100. $$
    (16)

    which also means that the message length l should be not smaller than 100 bits.

After shuffling the quantized DCT coefficients, F5 steganography embeds the metadata (matrix encoding parameter w and message length l) in the embedding procedure and extracts the metadata in the extraction procedure. The matrix encoding parameter w is determined by l and L using (8) and the message length l must not exceed L. Table 2 presents the performance results of matrix encoding with different encoding parameters. The value of w must be an integer in the range 1-9. Thus, if the parameter wk extracted by a test shuffling key k does not satisfy this criterion, this key must be false. We can use a quantitative steganalysis algorithm to estimate the embedding ratio l/L, and then obtain the parameter w. If wkw, the tested shuffling key k also must be false. Additionally, if the message length lk extracted by shuffling key k is greater than the number of non-zero AC coefficients, the tested shuffling key k also must be false.

Table 2 Performance of matrix encoding with different parameters

3.3 Description of stego key recovery method

In summary, the stego key of F5 steganography, composed of the shuffling key k0 and the check matrix of matrix encoding Hw, can be recovered as follows (seeing Fig.4).

  1. 1

    Decoding: Decode the given stego JPEG image to obtain the quantized DCT coefficients and count the number of non-zero AC coefficients NAC≠0.

  2. 2

    Estimate metadata and recover check matrix: Estimate the message length l using a quantitative steganalysis algorithm, and then determine the matrix encoding parameter w and the check matrix Hw.

  3. 3

    Count frequencies of 0s and 1s extracted from the non-shuffled DCT coefficients: Extract the bit sequence from the non-shuffled DCT coefficients and count the frequency of 0s in the extracted bit sequence, u0, and the frequency of 1s in the extracted bit sequence, u1.

  4. 4

    Scan the shuffling key space: Examine each possible shuffling key k in the shuffling key space K through the following steps.

    1. (a)

      Shuffle the coefficients obtained in step 1.

    2. (b)

      Extract the matrix encoding parameter wk and the message length lk.

    3. (c)

      If wk=w and lk<NAC≠0, add k to the set of candidate shuffling keys, B; otherwise, examine the next possible shuffling key.

  5. 5

    Examine the set of candidate shuffling keys: If there is only one key in B, i.e., |B|=1, the only key is regarded as the recovered shuffling key \(\hat {k_{0}}\).

  6. 6

    Scan the set of candidate shuffling keys: If |B|>1, compute the number of bit samples needed by the given significance level α and difference Δ using (15), n, and then examine each candidate shuffling key k in B through the following steps.

    1. (a)

      Count the number of 0s and the number of 1s in the first n bits of the bit sequence extracted by the possible shuffling key, i.e., p0(k,n) and p1(k,n).

    2. (b)

      Compute the statistics t(k,n) using (11).

    3. (c)

      If t(k,n)≥T(α), delete k from B.

  7. 7

    Re-examine the set of candidate shuffling keys: If |B|=1, the only key in B is regarded as the recovered shuffling key \(\hat {k_{0}}\). If |B|>1, the key in B with the minimum statistics t(k,n) is regarded as \(\hat {k_{0}}\). If |B|=0, return -1 to denote that the process has failed to recover the shuffling key.

  8. 8

    Return result: Return the recovered check matrix Hw and shuffling key \(\hat {k_{0}}\) as the recovered stego key.

Fig. 4
figure 4

Diagram of the proposed stego key recovery method

4 Results and discussion

4.1 Experimental setup

The F5 steganography software was implemented in Java. When the user inputs a password, the password is stored as a string variable. Because the maximum length of string variable is 216−1=65535 bytes in Java, when the password length is d (1≤d≤65535), the cardinality of the shuffling key space is |K|=28d. To reduce the time cost of experiments, the shuffling key was tested in a space of size 106. The “lena.jpg” image with 512 ×512 pixels and a quality factor of 75 was token as the cover image. Then, experiments were performed on a PC (personal computer) with a dual-core CPU (central processing unit), 3 GB (gigabyte) memory, and main frequency of 2.20 GHz (gigahertz), as follows:

  1. 1)

    Pseudo-randomly select a password from the space {“000000”, “000001”, …, “999999”};

  2. 2)

    Estimate the capacity of the cover “lena.jpg” image;

  3. 3)

    According to the embedding ratio r and the estimated capacity, generate a pseudo-random bit sequence;

  4. 4)

    Use the F5 steganography software to embed the generated pseudo-random bit sequence into the cover “lena.jpg” image with the selected password, and then generate the stego “lena.jpg” image;

  5. 5)

    Use the stego key recovery method to recover the stego key from the generated stego image.

4.2 Experimental results and analysis

In the proposed stego key recovery method, the value of Δ is related to the cardinality of the shuffling key space, |K|, and the embedding ratio r. When the value of r is larger, the number of samples required, n, will be higher and the search speed will be slower. Therefore, the value of Δ should be as small as possible. The experiments were performed with different values of Δ, starting from 0 and increasing in steps of 5, until the shuffling key had been identified or the number of samples exceeded the message length. Table 3 presents the experimental results for different embedding ratios, where “-” denotes that the number of samples exceeded the message length and “*” denotes that the value of Δ makes the number of samples equal to the message length.

Table 3 Experimental results of stego key recovery for F5 steganography

From the results in Table 3, the following conclusions can be drawn.

  1. 1)

    When the matrix encoding parameter w=1, if the embedding ratio r = 0.85, the stego key cannot be recovered successfully; if the embedding ratio r = 0.67, 0.7, 0.75, and 0.8, the stego key can be recovered successfully. This is because a high embedding ratio in F5 steganography makes the histogram of DCT coefficients shrink significantly. Although we can estimate the capacity from (1), as the embedding ratio r→1, the degree of shrinkage results in large errors of the estimated capacity. For example, in the “lena.jpg” image with a quality factor of 75, when the estimated embedding ratio r = 0.85, the actual embedding ratio is about 0.977. When the estimated embedding ratio r = 0.87, the stego image has been embedded fully. When the embedding ratio is close to 1, i.e., r→1,u0→0.5,u1→0.5, the number of samples required, as computed by (15), then exceeds the message length. A lack of available samples results in a failure to recover the stego key.

  2. 2)

    When the matrix encoding parameter w=2, the stego key can be recovered successfully for various embedding ratios.

  3. 3)

    When the matrix encoding parameter w≥3, the stego key cannot be recovered successfully. The matrix encoding attempts to embed information into more coefficients with fewer modifications, and the modification ratio is

$$ D(w)=\frac{1}{W+1}=\frac{1}{2^{w}}. $$
(17)

As the matrix encoding parameter w increases, the modification ratio becomes smaller. This smaller modification ratio implies a smaller difference between the stego image and the cover image, and a smaller difference between the distributions of message bits extracted by the true stego key and the false stego key. Thus, more samples are needed to recover the stego key successfully. If the number of samples required exceeds the message length, the lack of available samples results in a failure to recover the stego key. For example, when w=3, the modification ratio is 1/8 and the maximum embedding ratio is 3/7 ≈0.42. In this case, even when all of the extracted bit samples are utilized, this is still less than the number of samples required. Therefore, the stego key cannot be recovered successfully.

In the following, the proposed method is compared with the stego key recovery method proposed by Xu et al. [30] that is denoted as Xu’s method.

  1. 1)

    Comparison between the principles. Xu’s method fits the distributions of the quantized DCT coefficients in the embedding paths generated by the true stego key and false stego keys, whereas the proposed method utilizes the distribution difference between the message bits extracted by the true stego key and false stego keys. Xu’s method does not compute the number of samples required, whereas the proposed method contains a simple method to compute this number.

  2. 2)

    Comparison between the application scopes. Xu’s method is only effective for F5 steganography without matrix encoding, whereas the proposed method is effective for F5 steganography with matrix encoding. Neither Xu’s method nor the proposed method can successfully recover the stego key when the embedding ratio r→1 or r→0. In the following, we compare these two methods in the cases of large embedding ratios and small embedding ratios.

    In the case of large embedding ratios, as the embedding ratio r close to 1, there should be large error in the estimated capacity. Thus, the embedding ratio r cannot be too large. For example, in the “lena.jpg” image with a quality factor of 75, when the estimated embedding ratio r=0.87, the stego image has been fully embedded. Therefore, only experimental results for embedding ratios of 0.65, 0.7, 0.75, 0.8, and 0.85 are listed in Table 4. The results show that, when the embedding ratio are 0.75 and 0.8, the proposed method can recover the shuffling key successfully, but the Xu’s method fails. When the embedding ratio r = 0.85, both methods fail. This is because the embedding ratio is computed with a large error in the capacity estimation, and the actual embedding ratio is about 0.977. This is very close to 1, causing a very slight difference between the statistical characteristics along the embedding paths generated by the true shuffling key and the false shuffling key. Therefore, there are insufficient samples available to recover the shuffling key.

    Table 4 Experimental results in the cases of large embedding ratios

    In the case of small embedding ratios, the experimental results are presented in Table 5. The results show that, when the embedding ratios are 0.001, 0.002, and 0.003, both methods fail to recover the shuffling key because too few samples are available. Both methods use χ2 statistics, but the proposed method divides the samples into two categories whereas Xu’s method divides the samples into five categories [30]. In the χ2 testing, each category must contain sufficient samples. Thus, Xu’s method is more likely to fail because of less samples in each category. Therefore, when the embedding ratios are 0.004 and 0.005, Xu’s method fails while the proposed method can recover the shuffling key successfully. Hence, the proposed method outperforms Xu’s method [30].

    Table 5 Experimental results in the cases of small embedding ratios
  3. 3)

    Comparison between the time complexities. Both Xu’s method and the proposed method must search the given shuffle key space. Therefore, if the shuffling key space is fixed, the time complexities of the two methods are determined by the number of samples required. A larger number of samples would result in higher time complexity. When the numbers of samples required by the two methods are equivalent, Xu’s method should extract n bits for each possible key in the shuffling key space. Its time complexity is n|K|. But the proposed method in this paper need to extract only 31 bits of metadata to determine the set of candidate shuffling keys B, then extract n bits for each candidate shuffling key. Thus, the time complexity of the proposed method is 31|K|+n|B|. Because the cardinality of B is usually much smaller than that of the shuffling key space, the time complexity of the proposed method is usually lower than that of Xu’s method.

As an example, consider the stego “lena.jpg” image with a quality factor of 75 and embedding ratio of r = 0.66 (matrix encoding parameter w = 2). All possible shuffling keys were used to extract the encoding parameter w and the message length l. When the extracted encoding parameter w=1, 2, …, 9, the corresponding shuffling key was added to the set of candidate shuffling keys, B. Figure 5 shows the numbers of keys by which the encoding parameter w=1, 2, …, or 9 were extracted. The examination of encoding parameter reduced the cardinality of the shuffling key space from 106 to 137533. Because the encoding parameter is related to the embedding ratio, matching the encoding parameter to the embedding ratio further reduced the shuffling key space to 15091 keys when w = 2, which is only 1.51% of the whole shuffling key space. Because the message length cannot be greater than the number of non-zero AC coefficients, by excluding the keys that generate such message lengths, i.e., l>NAC≠0, the cardinality of the shuffling key space was reduced to 60, just 0.006% of the whole shuffling key space. Therefore, the time complexity of the proposed method is significantly lower than that of Xu’s method.

Fig. 5
figure 5

Numbers of shuffling keys by which the encoding parameter w=1, 2, …, or 9 were extracted

5 Conclusions

F5 steganography synthesizes matrix encoding and DCT coefficients shuffling, and takes the check matrix of the matrix encoding and a shuffling key as the stego key. However, previous stego recovery methods only work in the absence of matrix encoding. Therefore, this paper proposes a stego key recovery method to recover the check matrix and shuffling key. Firstly, the check matrix of the matrix encoding is recovered based on the embedding ratio estimated by quantitative steganalysis. The shuffling key is then recovered using a χ2 testing and by examining the extracted metadata. Experimental results demonstrate the effectiveness and superiority of the proposed method over the existing stego key recovery method for F5 steganography—Xu’s method.

However, in STC-based steganography, multiple check submatrices are placed next to each other and shifted down by on row to generate the check matrix. The check submatrices could be generated by a key, and the decoded message bits are controlled by not only the check submatrix and stego bit subsequence in the corresponding positions, but also the check submatrices and stego bit subsequences in previous positions. Therefore, the proposed method can not directly applied to the recovery of the stego key for the STC-based steganography. And we will try to find the new property of the bit sequence decoded from the randomly shuffling coefficients to recognize the correct stego key.

Additionally, in the future work, we will try to locate the stego positions by machine learning and searching for similar images [41, 42] and even consider the generation and operation history of the stego image [4345].