Abstract

The asymptotic performance of systematic rateless codes is first analysed using Gaussian approximation (GA) based on mutual information, which provides more accurate decoding thresholds than the method based on message mean. A modified linear program (LP) algorithm is proposed, where two key parameters are precomputed to efficiently search degree distributions with low overhead and appropriate average degree. The asymptotic analysis and simulation results show that the optimized code outperforms the codes obtained by conventional LP. Furthermore, the effects of outer code on overall code rate and decoding complexity are discussed.

1. Introduction

Rateless codes keep generating output symbols until decoding succeeds; thus, they are more robust than traditional fixed-rate codes on noisy channels. The first practical rateless code is the Luby Transform (LT) code [1], whose extension is the Raptor code, constructed from an inner LT code serially concatenated with an outer high-rate low-density parity-check (LDPC) code [2]. These codes were originally proposed for the binary erasure channel (BEC) and later extended to noisy channels [3, 4], where the well-known sum-product algorithm (SPA) is employed for decoding.

Many asymptotic analysis tools have been employed to analyse and design LT and Raptor codes on binary-input additive white Gaussian noise (BIAWGN) channels, such as Gaussian approximation based on message mean (GA-mean) [5, 6], the extrinsic information transfer (EXIT) chart [7-9], and discretized density evolution (DDE) [10]. Recently, systematic versions of rateless codes have received extensive attention due to several advantages over their nonsystematic counterparts. For example, the original information bits are directly visible in the codewords of systematic codes; nonsystematic rateless codes usually need at least one degree-1 symbol to start decoding [5], whereas systematic codes have many; and systematic codes may have smaller generator/decoding matrices thanks to the systematic part, so lower encoding/decoding complexity may be achieved.

Systematic LT (SLT) and systematic Raptor (SR) codes on the BEC are proposed in [2, 11], respectively. On noisy channels, however, SLT codes are generated in a straightforward manner [12], i.e., the original symbols are transmitted first, followed by LT-coded symbols. If an inner SLT code is serially concatenated with a high-rate outer code, the resulting code can also be referred to as an SR code [13]. SLT codes are analysed and designed with the GA-mean method in [14, 15], but this method may introduce non-negligible errors in the message density, resulting in poor performance of the designed codes [5]. DDE provides accurate asymptotic results, and SR codes have been analysed with it in [13], but DDE has rather high complexity because it is essentially an infinite-dimensional analysis.

The EXIT chart has the same computational complexity as GA-mean since both are one-dimensional analyses, but the former may provide asymptotic results closer to DDE, as has been shown in the analysis of LDPC codes [16] and SR codes [17]. Since the EXIT chart is also based on Gaussian approximation and the variable tracked across decoding iterations is mutual information, we refer to it as Gaussian approximation based on mutual information (GA-MI) in the following. The contributions of this paper are summarized as follows. First, we provide the asymptotic curves of SLT and SR codes on BIAWGN channels using GA-MI, which are very close to those of DDE. Second, we provide a modified linear program (LP) algorithm to design SR codes, where two important optimization parameters are precomputed according to the requirement of the outer code and a lower-bound analysis. The optimized degree distribution has a lower overhead and a more appropriate average degree than those obtained by the conventional LP. Moreover, the trade-off between code rate and decoding complexity is discussed.

2. Encoding and Decoding of Systematic Rateless Codes

Like conventional Raptor codes, SR codes involve two stages of encoding and decoding. First, the source symbols are precoded using a high-rate LDPC code to produce the intermediate symbols . Note the outer code rate . Then, the intermediate symbols are encoded by an inner SLT code to generate output symbols as given by , where the generator matrix , is an identity matrix, and is the generator matrix of size as in a conventional LT code [15]. The rates of the inner SLT code and the SR code are and , respectively.

The output symbols are transmitted on BIAWGN channels with binary phase shift keying modulation modelled as , where and is a zero-mean Gaussian noise with variance . The channel log-likelihood ratio (LLR) is given by . Two-step sequential decoding is considered [13], where decoding of the inner SLT code is followed by decoding of the outer LDPC code. The SPA is implemented in the inner decoder according to the bipartite graph related to , which is constructed using intermediate symbols and output symbols as variable nodes (VNs) and check nodes (CNs). The message update rule between VNs and CNs is given by where and denote the message passed from the th check node to the th variable node and the message passed in the opposite direction at the th decoding iteration, and denotes the set of all neighbours of the th node except the th node. After the maximum iteration , the decision is provided as the initial LLR to the outer decoder.
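The VN/CN exchange described above is the standard sum-product (tanh-rule) update on a single edge. A minimal illustrative sketch follows; the function names and the flat message lists are our own, not the paper's notation:

```python
import math

def vn_to_cn(channel_llr, incoming_cn_msgs, exclude):
    """VN update: channel LLR plus all incoming CN messages except the target edge."""
    return channel_llr + sum(m for j, m in enumerate(incoming_cn_msgs) if j != exclude)

def cn_to_vn(incoming_vn_msgs, exclude):
    """CN update (tanh rule): 2*atanh of the product of tanh(m/2) over all
    incoming edges except the target edge."""
    prod = 1.0
    for j, m in enumerate(incoming_vn_msgs):
        if j != exclude:
            prod *= math.tanh(m / 2.0)
    return 2.0 * math.atanh(prod)
```

Excluding the target edge in both directions keeps the messages extrinsic, which is what the decoding analysis in the following sections assumes.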

3. Asymptotic Analysis Based on Mutual Information

3.1. GA-MI for Inner Code

Using polynomial notations, the CN degree distribution of the inner SLT code is written compactly as , where is the maximum degree. The corresponding edge degree distribution is calculated by , and the average degree of CNs is given by . The average degree of VNs is denoted by , and the variable node and edge degrees are both assumed to be Poisson distributed as and , respectively [7]. They can also be truncated to obtain polynomials as and , where is large enough to ensure . The overhead of the inner SLT code is defined by which is related to the code rates as .
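The standard relations between the node-perspective distribution, the edge-perspective distribution, the average degree, and the truncated Poisson pmf used for VN degrees can be sketched as follows (the dict/list representation and names are ours; the relation omega_d = d*Omega_d / Omega'(1) is the usual one):

```python
import math

def edge_perspective(Omega):
    """Omega[d] = probability of CN degree d (node perspective).
    Returns the edge-perspective distribution omega_d = d*Omega_d/avg
    and the average CN degree avg = sum_d d*Omega_d."""
    avg = sum(d * p for d, p in Omega.items())
    omega = {d: d * p / avg for d, p in Omega.items()}
    return omega, avg

def truncated_poisson(mean, dmax):
    """Truncated Poisson pmf for VN degrees, renormalized to sum to 1."""
    p = [math.exp(-mean) * mean**d / math.factorial(d) for d in range(dmax + 1)]
    s = sum(p)
    return [x / s for x in p]
```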

Assuming the all-zero codeword is transmitted, the initial LLR has the symmetric Gaussian density . The mutual information of is computed by [8] where is a monotonically increasing function and hence has an inverse . Neither nor has a closed-form expression, but both can be closely approximated as suggested in [18].
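The particular approximation of [18] is not reproduced here; a widely used alternative is the piecewise polynomial/exponential fit commonly attributed to ten Brink and co-authors, sketched below. Its coefficients may differ slightly from those in [18]:

```python
import math

def J(sigma):
    """Mutual information of a symmetric Gaussian LLR N(sigma^2/2, sigma^2)."""
    if sigma <= 0.0:
        return 0.0
    if sigma < 1.6363:
        return -0.0421061 * sigma**3 + 0.209252 * sigma**2 - 0.00640081 * sigma
    if sigma < 10.0:
        return 1.0 - math.exp(0.00181491 * sigma**3 - 0.142675 * sigma**2
                              - 0.0822054 * sigma + 0.0549608)
    return 1.0

def J_inv(I):
    """Approximate inverse of J on (0, 1)."""
    if I <= 0.0:
        return 0.0
    if I >= 1.0:
        return 10.0  # saturation guard: J is effectively 1 beyond this point
    if I < 0.3646:
        return 1.09542 * I**2 + 0.214217 * I + 2.33727 * math.sqrt(I)
    return -0.706692 * math.log(0.386013 * (1.0 - I)) + 1.75017 * I
```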

From (2), we know that the output of a VN is the sum of the channel LLR and the incoming CN messages; thus, the mutual information passed from variable nodes at the th iteration is calculated by where is the variance of the initial LLR and is the variance of the messages output by check nodes. The mutual information passed by CNs at the th iteration is computed as for conventional LT codes, given by where and . Combining (4) and (5), we obtain the iterative expression of as a function of , , and :

At the end of iteration , the inner decoder outputs the extrinsic information provided to the outer decoder as and the corresponding error probability is calculated as where the function .
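For a consistent (symmetric) Gaussian LLR, whose mean is half its variance, the hard-decision error probability reduces to Q(sigma/2). A small sketch (function names are ours):

```python
import math

def Qfunc(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def ber_from_llr_variance(sigma2):
    """Error probability of sign decisions on a symmetric Gaussian LLR
    N(sigma2/2, sigma2): P(LLR < 0) = Q(mean/std) = Q(sqrt(sigma2)/2)."""
    if sigma2 <= 0.0:
        return 0.5
    return Qfunc(math.sqrt(sigma2) / 2.0)
```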

To compare the accuracy of the different methods, we plot the asymptotic bit-error rate (BER) of the SLT code using the above GA-MI method and the GA-mean method in Figure 1, where the degree distribution is the optimized result of in [15]. Also presented are the results of DDE and the simulated BER with finite symbol length . The curve of GA-MI is much closer to that of DDE than GA-mean, and the simulated BERs of finite-length codes converge to the GA-MI results as increases. Note that the asymptotic BER curve of GA-mean lies below those of DDE and GA-MI in the waterfall region of overhead. This does not mean that the GA-mean method performs better than the other two; rather, the message density approximated by its mean deviates more from the actual density given by DDE. Therefore, GA-MI provides a more accurate asymptotic performance than the GA-mean method. In addition, the line of critical BER in Figure 1 is explained in the next subsection.

3.2. GA-MI for Outer Code and Critical BER

The extrinsic information output by the inner decoder is taken as the a priori information of the outer decoder, and the asymptotic performance of the outer code can also be obtained by GA-MI. Assuming the message output by the inner decoder is Gaussian distributed [10], the variance of the initial LLR of the outer decoder is calculated by

We denote the edge degree distributions of the variable nodes and check nodes of the outer LDPC code as and . The exchange of mutual information is analogous to that of the SLT code, and the iterative expression is given by

The decoder outputs the error probability after the last iteration as where is the variable node degree distribution.

As shown in [10], the outer code can be successfully decoded once the BER of the inner code falls below a certain level, called the "critical BER." Combining (7)-(11), we can calculate the critical BER, denoted by , i.e., if . The critical BER for a rate-50/51 regular (4,204)-LDPC code is , which is very close to the result of obtained by DDE in [10]. This critical BER is also depicted in Figure 1, and the corresponding overheads obtained by GA-MI and DDE lie between 1.3 and 1.4, which can be taken as the decoding thresholds of SR codes. Furthermore, the employed code requires a wide range of overhead (from 0.9 to 1.4) for its BER to drop to the critical BER; therefore, the performance may be further improved if we can narrow this region.
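The critical-BER idea can be reproduced with a self-contained EXIT recursion for a regular outer LDPC code. The sketch below uses the common ten Brink J-function approximations and our own convergence criterion, so the numbers it produces are illustrative rather than the paper's:

```python
import math

def J(sigma):  # mutual information of a symmetric Gaussian LLR N(sigma^2/2, sigma^2)
    if sigma <= 0.0:
        return 0.0
    if sigma < 1.6363:
        return -0.0421061 * sigma**3 + 0.209252 * sigma**2 - 0.00640081 * sigma
    if sigma < 10.0:
        return 1.0 - math.exp(0.00181491 * sigma**3 - 0.142675 * sigma**2
                              - 0.0822054 * sigma + 0.0549608)
    return 1.0

def J_inv(I):  # approximate inverse of J
    if I <= 0.0:
        return 0.0
    if I >= 1.0:
        return 100.0  # saturation guard
    if I < 0.3646:
        return 1.09542 * I**2 + 0.214217 * I + 2.33727 * math.sqrt(I)
    return -0.706692 * math.log(0.386013 * (1.0 - I)) + 1.75017 * I

def ldpc_converges(sigma_ch, dv, dc, iters=200, target=0.9999):
    """EXIT recursion for a regular (dv, dc) LDPC code whose a priori LLRs are
    symmetric Gaussian with std sigma_ch; True if the VN output approaches 1."""
    Ia = 0.0
    for _ in range(iters):
        Iev = J(math.sqrt((dv - 1) * J_inv(Ia)**2 + sigma_ch**2))
        Iec = 1.0 - J(math.sqrt((dc - 1) * J_inv(1.0 - Iev)**2))
        if Iec <= Ia + 1e-12:
            break  # no further progress: fixed point reached
        Ia = Iec
    Iv = J(math.sqrt(dv * J_inv(Ia)**2 + sigma_ch**2))
    return Iv > target
```

Sweeping sigma_ch downward and recording the largest input BER Q(sigma_ch/2) for which `ldpc_converges` still holds yields an estimate of the critical BER for the chosen (dv, dc) pair.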

4. Efficient Design of Rateless Codes

The outer code is fixed as a high-rate LDPC code, and we concentrate the optimization on the inner SLT code. The classic linear programming algorithm is combined with GA-MI to design SLT codes, as described in the following. where the function is defined in (6) and is an integer indicating the maximum number of sampling points of . The objective of the above design model corresponds to minimizing the overhead . The second and third constraints ensure a valid distribution. The first constraint ensures that the tracked variable, i.e., the mutual information from CNs, converges to a fixed value over the decoding iterations. Note that is the maximum extrinsic information output by CNs, different from the minimum (see Equation (7)), which is the extrinsic information required by the outer decoder for successful decoding.
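The LP of Equation (12) relies on the GA-MI convergence function of Eq (6), which we do not reproduce here. As a structural illustration, the sketch below solves the analogous classic LP for an LT degree distribution on the BEC (Shokrollahi's condition Omega'(x) >= -ln(1-x)/(1+eps)), which has the same shape: a linear objective over the degree distribution with linear convergence constraints at sampled points. All parameter values are illustrative:

```python
import numpy as np
from scipy.optimize import linprog

def design_lt_bec(eps=0.1, dmax=100, npts=50, delta=0.02):
    """Minimize the average CN degree sum_d d*Omega_d subject to the
    convergence constraint Omega'(x) >= -ln(1-x)/(1+eps) on a grid
    x in [0.01, 1-delta], with Omega_d >= 0 and sum_d Omega_d = 1."""
    degrees = np.arange(1, dmax + 1)
    xs = np.linspace(0.01, 1.0 - delta, npts)
    # Rewrite the >= constraint as -Omega'(x_i) <= ln(1-x_i)/(1+eps).
    A_ub = -np.array([[d * x**(d - 1) for d in degrees] for x in xs])
    b_ub = np.log(1.0 - xs) / (1.0 + eps)
    res = linprog(c=degrees.astype(float),
                  A_ub=A_ub, b_ub=b_ub,
                  A_eq=np.ones((1, dmax)), b_eq=[1.0],
                  bounds=[(0.0, None)] * dmax, method="highs")
    return res  # res.x is Omega; res.fun is the minimized average degree
```

In the paper's LP, the rows of the inequality system would instead come from sampling the GA-MI convergence condition, but the solver machinery is the same.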

The selection of the parameters and is very important because they greatly affect the performance of the optimization results. If is too small, the resulting codes cannot reach the critical BER quickly, whereas the average degree , which indicates the decoding complexity to some extent, increases if is too large. If is too small, the optimization may not succeed, while the decoding complexity increases if is too large, considering . Furthermore, and influence each other (see Equation (8)), so it is easy to fall into a "chicken-and-egg" circularity.

The conventional LP method chooses these two parameters separately, by experience or by many attempts [5, 6, 9], which may fail to find the best result or even to converge if the selected parameters are inappropriate. A rough range for these two parameters is given in [7], but no exact result is provided. In [15], a new constraint on the lower bound of the error probability is added, but the results are not suitable for SR codes (see the analysis of Figure 1 in the previous section). Hence, we propose a new approach to efficiently locate the values of these two key parameters, with which the LP search is guaranteed to converge and the resulting codes perform well.

Step 1. Assume is fixed and rewrite the error probability of Equation (8) as a bivariate function given by where the coefficients [7]. Note that is monotonically decreasing in each of its two arguments.

Step 2. Calculate the bounds of the two parameters, namely, and . As in [8], the LLR output by VNs may be perfect () in the final iteration; from (1), we know the output of CNs then equals the channel LLR, a zero-mean Gaussian variable with variance . Therefore, the upper bound of the mutual information output by CNs is Assuming the critical BER is , the required average degree is where is the inverse function of (13) with respect to . Considering the monotonicity of (13), is the lowest possible value of satisfying . However, is not achievable at small overheads; thus, the selected must be larger than .

Step 3. Set the sampled average degree as , where , is the sampling interval, and controls the maximum . Then, obtain the maximum value of mutual information corresponding to as given by where is the inverse function of (13) with respect to .

Step 4. Implement the LP algorithm of Equation (12) with the above and corresponding to search degree distributions.

This method is applied with the channel parameters , , and . From (14) and (15), we have and ; thus, we start with , but the searched degree distribution has a large overhead of . For different , the corresponding and the resulting overheads are both plotted in Figure 2. The overhead decreases quickly to a minimum and then increases slowly as grows. The corresponding (not listed) increases monotonically with , and so does the decoding complexity. Therefore, the best distribution can be chosen as the one with minimum overhead, here with . Changing to other channel parameters and , we still observe a similar phenomenon; thus, the last two steps of the above method can be combined into one: the algorithm stops as soon as the resulting overhead starts to increase, and the previous distribution is taken as the optimized result.
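This combined stopping rule can be written as a small search loop. Here `run_lp` is a placeholder for Steps 3-4 (it would call the LP of Equation (12)), and the stub in the example is a purely hypothetical overhead curve:

```python
def search_min_overhead(run_lp, alpha_min, step=0.5, max_steps=50):
    """Increase the sampled average degree alpha from alpha_min and stop as
    soon as the LP's resulting overhead starts to increase; return the
    previous (best) result as (overhead, distribution, alpha)."""
    best = None
    for s in range(1, max_steps + 1):
        alpha = alpha_min + s * step
        overhead, dist = run_lp(alpha)
        if best is not None and overhead > best[0]:
            break  # overhead started to rise: the previous alpha was the minimum
        best = (overhead, dist, alpha)
    return best

# Example with a hypothetical convex overhead curve minimized at alpha = 3:
stub_lp = lambda a: ((a - 3.0)**2 + 1.0, None)
best = search_min_overhead(stub_lp, alpha_min=1.0, step=0.5)
```

The loop mirrors the behaviour observed in Figure 2: overhead falls quickly to a minimum, then rises slowly, so stopping at the first increase recovers the best distribution.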

The proposed method can also be extended to the case of a variable outer code rate. Setting the outer codes as regular LDPC codes with rates from 0.90 to 0.99, we present the critical BERs and the characteristic parameters of the resulting codes in Table 1. In general, as the outer code rate increases, the critical BER decreases and the overhead of the inner code increases. The overall rate increases with , but the decoding complexity, indicated by , also increases. In practice, one can choose the code that balances rate and decoding complexity according to system requirements.

5. Simulation Results

We choose a regular (4,200)-LDPC code with a rate of as the outer code, and the distributions of the inner code optimized by the above method are shown in Table 2, whose bottom line gives the corresponding overhead threshold. The code of is optimized under , which is also the channel condition used for code optimization in [15]. We denote the distribution of the corresponding code in [15] by . For comparison, we also denote the classic distribution optimized with in [2] by . The asymptotic performance of SR codes with these three distributions is shown in Figure 3.

The optimized distribution has the best overhead threshold, , which is smaller than the of . The corresponding SR code-rate thresholds of and are around 0.478 and 0.415, respectively, equivalent to an increase of about 15% in code rate. Furthermore, the average degree of is , which is smaller than the of , equivalent to a decrease of about 23% in average degree. The code of has the smallest average degree, , but the highest overhead threshold. The reason the optimized code has advantages in both overhead and average degree is that the proposed method finds the optimal design parameters for the LP, with which the searched result reaches the critical BER as quickly as possible with an appropriate average degree. The curve of descends sharply and "exactly" to the intersection with the line of critical BER, while those of have a wide transition region and their error floors start far below the critical BER (which would increase the average degree).

To further verify the advantage of the optimized code, we depict the simulated BER performance of SR codes with these distributions in Figure 4, where the information length is fixed to . Since the degree distribution was originally designed for the nonsystematic Raptor (NR) code, we also provide the simulation result of the NR code with . The optimized SR code indeed outperforms the existing SR codes, which validates the asymptotic results and verifies the effectiveness of our design method. However, the performance of the NR code with and that of the optimized SR code are very close to each other, for the following reason. Under the channel with , these two codes achieve a BER below at a rate close to 1/2, and from [10], we know that the performance of the rate-1/2 NR code with is already approaching the channel capacity, i.e., the room for further improvement is limited. In addition, an obvious performance difference is observed between the NR code and the SR code with the same degree distribution ().

The code of in Table 2 is optimized under , which has an overhead of , corresponding to the overall rate and the bit signal-to-noise ratio . In Figure 5, the SR code of is compared with the rate-1/2 SR and NR codes optimized in [10, 13], whose distributions are denoted by and , respectively. Our optimized SR code outperforms those two codes, although the improvement over the NR code is modest. The code of in Table 2 is optimized under , corresponding to the high symbol signal-to-noise ratio (SNR) . It is also compared with the SR code of and the NR code of in Figure 5. The optimized SR code achieves a BER of when the overhead is , indicating that only a few LT-coded symbols of the inner code are needed for successful decoding. The overheads of the SR code with and the NR code with at a BER of are around 0.36 and 0.30, respectively. These results imply that the optimized SR code may perform better than the NR code at high SNR (or high rate).

6. Conclusions

In this paper, we have investigated the asymptotic performance and degree-distribution design of systematic rateless codes using the GA-MI method, which gives more accurate results than the GA-mean method. Based on the critical BER analysis, the average VN degree and the maximum mutual information in the linear program are analysed and located efficiently, yielding codes with lower overhead and more appropriate decoding complexity than existing codes. The influence of the outer code rate on the overall code rate and decoding complexity is also discussed. In addition, the optimized SR codes are compared with NR codes at different SNRs.

Data Availability

The numerical and simulation data used to support the findings of this study are included within the supplementary information file.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant No. 61501469) and the China Postdoctoral Science Foundation (Grant No. 2015M572782).
