1 Introduction

The Internet of Things (IoT) is developing rapidly in theory and application. With the development of narrow band Internet of Things (NB-IoT) and the accelerating popularity of 5G, the cost of connecting devices to IoT has been reduced, which greatly promotes its development. As a result, more and more devices are connected to the Internet of Things in various fields, from smart terminals to industrial facilities. Devices connected to the Internet of Things continue to generate and transmit large amounts of information. But the availability of wireless communication links to attackers means that data on the Internet of Things are vulnerable to various types of attacks, such as data eavesdropping and data tampering. Therefore, it is not safe to transmit information in plaintext form on the Internet of Things. Cryptography is the most widely used method to ensure the security of data transmission and sharing in the network. Encrypted data transmission on the Internet of Things has become the key to secure communication between devices. RSA public key cryptography has been well researched and widely used since the introduction of Rivest, Shamir and Adleman in 1975. The system relies on Integer Decomposition Problem (IFP), which typically uses a 1024-bit or more key. In 1976, Hellman proposed a public key exchange algorithm based on the Discrete Logarithm Problem (DLP) [1]. In 1986, Miller [2], Koblitz [3] independently proposed the Elliptic Curve Cryptography (ECC). ECC is a kind of encryption technology based on DLP. It uses elliptic curve in finite domain to generate finite Abel group to implement public key cryptographic primitives. Unlike the RSA algorithm, which relies on the sub-exponential time algorithm to solve the integer decomposition problem, the best algorithm for solving the basic mathematical problems of ECC involves the Elliptic Curve Discrete Logarithm Problem (ECDLP). This leads to the infeasibility of solving the ECDLP algorithm, which increases rapidly with the size of the problem and is much higher than integer factorization and discrete logarithm problem. Therefore, ECC requires only smaller keys than public key cryptography (such as RSA and ElGamal), while providing the same level of security. For example, an ECC that provides the same level of security as RSA with a 1024-bit key size requires only 160 keys. Due to the high security of per key length, ECC is widely used for mobile devices and IoT. However, ECC operations are still very complex and costly for devices with poor computing power, limited energy reserves and intensive data transmission in IoT. This poses a challenge for the long-term stable function and real-time data transmission of IoT devices. Therefore, we need to improve the operation of ECC to make it lighter, thereby increasing the efficiency of cryptography and reducing costs.

The operation of the ECC works on a multiplication group over a finite field. The scalar multiplication of an elliptic curve is an operation that adds a point P on the curve k times.

$$Q = kP = P + P + \cdots + P,\;k\;times$$

where \(P\) is a point on an elliptic curve and k is a large positive integer. In any primitive implementations of ECC, scalar multiplication is the main computing operation. The key factor to improve the efficiency of ECC is how to realize fast scalar multiplication. Therefore, many researchers have proposed various studies on accelerated scalar multiplication. Morain et al. [4] proposed the non-adjacent form (NAF), which is a signed form of representation. This form ensures that at least one of any two adjacent terms is zero. Solinas et al. proposed the Joint Sparse Form (JSF) based on NAF [5]. JSF is the best signed binary representation of a pair of integers, which can generate more double-zero bits than NAF. Koblitz [6] and Solinas [7], respectively, proposed an anomalous binary curve on which Frobenius mapping can be used, and an effective scalar representation on that curve—reduced \(\tau\)-adic non-adjacent form (RTNAF). The squaring on an anomalous binary curve is implemented by a displacement, which can be performed in a very short time. Under the RTNAF representation, the scalar multiplication \(\tau P\) is quickly obtained by squaring on the \(x\) and \(y\) coordinates of the point \(P\). Cohen proposed a more efficient hybrid addition operation by combining projective coordinates and affine coordinates into mixed coordinates [8]. The introduction of Jacobian coordinates eliminates expensive inversion operation in scalar multiplication under affine coordinates. In recent years, the method of expressing a large integer \(k\) by double base and multi-base has attracted widespread attention. Dimitrov [9] first proposed the Double-Base Number System (DBNS) and applied it to speed up scalar multiplication, which effectively reduces the number of point additions in scalar multiplication, by taking advantage of the sparseness and the ternary nature of DBNS. But there are some repetitive computations in DBNS. In order to solve this problem, Dimitrov [10] proposed Double-Base Chain (DBC) on the basis of DBNS. DBC performs computations in a nested form, so that the results of each part of the computation will be reused, reducing the occurrence of repeated computations. Mishra extended DBNS to Multi-Base Number Representation (MBNR), breaking the limitation that only two bases can be used to represent scalars, so as to bring higher redundancy in the representation of scalars [11]. But MBNS has some repetitive computations like DBNS.

In this paper, we propose an efficient formula for fast computation of the sevenfold of elliptic curve points over the binary fields, which can be used in DBNS and MBNR to compute the scalar multiplication of elliptic curves. This formula uses division polynomial and multiplexing of intermediate values in affine coordinates to increase the speed of computing the sevenfold point by more than 14%. We also proposed a scalar multiplication algorithm based on the Step Multi-Base Representation (SMBR). This algorithm uses the sevenfold point formula we proposed and replaces the traditional point doubling with the faster point halving. Experimental results show that our scalar multiplication algorithm is more efficient in affine coordinates over binary fields and contributes to reducing the cost of cryptography for devices in the IoT.

The organization of this paper is as follows. In Sect. 2, we briefly introduce the basics of elliptic curves, point halving, double-base chain and multi-base representations. In Sect. 3, we give an efficient formula for computing the sevenfold point of an elliptic curve over binary fields and provide proof of the formula and analysis of the computational cost. In Sect. 4, we propose a scalar multiplication algorithm based on SMBR and give the method of scalar \(k\) conversion to SMBR and the detailed steps of the scalar multiplication algorithm. In Sect. 5, we show experimental results, compare and analyze our algorithm with other research and demonstrate that our method is more efficient. Finally, in Sect. 6, we draw our conclusions.

2 Related work

In this section, we will review the concepts and research status of ECC, point halving, double-base chains and step multi-base representation.

2.1 Elliptic curve cryptography (ECC)

Definition 1

(Elliptic curve cryptography) An elliptic curve \(E\) over a finite field \(GF\) field \(K\) can be defined by the Weierstrass equation

$$\begin{array}{*{20}l} {E:y^{2} + a_{1} xy + a_{3} y = x^{3} + a_{2} x^{2} + a_{4} x + a_{6} } \hfill \\ \end{array}$$
(1)

where \(\begin{array}{*{20}l} {a_{1} , \, a_{2} , \, a_{3} , \, a_{4} , \, a_{6} \in K} \hfill \\ \end{array}\), and \(\Delta \ne 0\),where \(\Delta\) is the discriminant of \(E\).

In practice, adjusting the variables within the admissible range can greatly simplify the Weierstrass Eq. (1).

Over prime fields,\(K = F_{P}\), if the characteristic of \(K\) is not equal to 2 and 3, then Eq. (1) can be simplified to

$$\begin{array}{*{20}l} {y^{2} = x^{3} + ax + b} \hfill \\ \end{array}$$
(2)

where \(a,b \in F_{p}\), \(\begin{array}{*{20}l} {\Delta = 4a^{3} + 27b^{2} \ne 0} \hfill \\ \end{array}\).

Over binary fields, \(K = F_{{{2}^{m} }}\), the elliptic curve \(E\) is called the non-supersingular curve, and Eq. (1) can be rewritten as

$$y^{2} + xy = x^{3} + ax^{2} + b$$
(3)

where \(a,b \in F_{{2^{m} }}\),\(\Delta = b \ne 0\).

The set \(E(K)\) of rational points and the infinity point \(O\) defined on an elliptic curve \(E\) over a field \(K\) form an abelian group under the operation (usually denoted by addition) defined by the law of chord and tangent. If the points on an elliptic curve \(E\) are represented in affine coordinates, such as \(P = (x,y)\) and \(Q = (u,v)\), then both the point addition (\(P + Q\)) and point doubling (2P) require an expensive field inverse operation. We use \([i]\),\([s]\) and \([m]\) to represent the computational cost of one inversion \(I\), one squaring \(S\) and one multiplication \(M\), respectively. In order to facilitate the gauging of the computational cost of inversions, the \([i]/[m]\) ratio is defined according to the ratio of the cost between one inversion and one multiplication. It is generally assumed that \({3} \le [i]/[m] \le 10\) for the binary fields [12], and \([i]/[m] \ge 30\) for the prime fields [13]. In addition, squaring is the least expensive of the three main operations. Over binary fields, squaring is a linear operation with negligible computational cost, and it is generally assumed that \([s] \le 0.1[m]\) [12]. Over prime fields, \([s] = 0.8[m]\) is generally assumed, but in order to prevent side-channel attacks (SCA) from using side-channel atomicity [14], the same multiplier needs to be used to perform squarings and multiplications, then \([s] = [m]\).

2.2 Point halving

The point halving independently proposed by Knuden [15] and Schroeppel [16] is a reverse operation of point doubling. Assume that \(P = (x,y)\) and \(Q = (u,v)\) are two points defined on the elliptic curve \(E\) Over binary field and expressed in affine coordinates, satisfying \({\text{Q}} = 2P\). If we know the affine coordinates of the point \(P\), the coordinates of point \(Q\) can be obtained by point doubling using the following equation:

$$\lambda = x + y/x$$
(4)
$$u = \lambda^{2} + \lambda + a$$
(5)
$$v = x^{2} + u(\lambda + 1)$$
(6)

Point halving is the completely opposite operation. When \(Q = (u,v)\) is known, find \(P = (x,y)\) so that \({\text{Q}} = 2P\), denoted as \({\text{P = }}\frac{{1}}{{2}}Q\). First, we need to solve \(\lambda^{2} + \lambda = u + a\) according to Eq. (5) to get \(\lambda\), then solve \(x^{2} = v + u(\lambda + 1)\) from Eq. (6) to get \(x\) and finally calculate \(y = \lambda x + x^{2}\) based on Eq. (4) to get \(y\). The logic of the point halving is shown in Algorithm 1. The detailed analysis of the solving process and the calculation complexity about point halving are shown in [13].

figure a

Point halving is less expensive to compute than point doubling, and the advantage is more evident when the point \(P\) is unknown in advance and the \([i]/[m]\)-ratio is small [13]. If the doubling required in the usual double-and-add operations is all replaced by point halving, the calculation speed can be accelerated by 50% [16].

2.3 Double-base chain (DBC)

The Double-Base Number System (DBNS) was originally proposed by Dimitrov [9] as a scheme for the representation of positive integers, where each positive integer \(k\) can be expressed as the sum or difference of a number of 2-integers (the product of the powers of two relatively prime positive integers). For example, the positive integer \(k\) represented by the application of {2,3}-integers (the sub-formats are \(2^{b} 3^{t}\)) are

$$k = \sum\limits_{i = 1}^{m} {s_{i} 2^{{b_{i} }} 3^{{t_{i} }} } ,\;{\text{with}}\;s_{i} \in \{ - 1,1\} \;{\text{and}}\;b_{i} ,t_{i} \ge 0$$
(7)

In [9], Dimitrov described the attributes of DBNS and proved that DBNS have a high degree of redundancy. Obviously, there are no unique forms of DBNS representation of an integer. For example, there are 5 different DBNS representations of 10, 72 representations of 50 and 402 representations of 100, where only positive sums (\(s_{i} = 1\),\(i \in [1,m]\)) are considered. The number of representations grows rapidly as the integer k increase.

Subsequently, Dimitrov [10] developed the Double-Base Chain (DBC) and specified that a DBNS can only be transformed into a DBC if the exponent of the base in the DBNS is a decreasing sequence, that is \(k = \sum\nolimits_{i = 1}^{m} {s_{i} 2^{{b_{i} }} 3^{{t_{i} }} }\) and satisfies \(s_{i} \in \{ - 1,1\}\),\(b_{1} \ge b_{2} \ge \cdots \ge b_{m} \ge 0\) and \(t_{1} \ge t_{2} \ge \cdots \ge t_{m} \ge 0\).

Definition 2

(Double-base chain (DBC)) Given a positive integer \(k > 0\), a sequence \((C_{n} )_{n > 0}\) of positive integers, if it satisfies:

$$C_{1} = 1,C_{n + 1} = 2^{u} 3^{v} C_{n} + s,\;{\text{with}}\;s_{i} \in \{ - 1,1\}$$
(8)

for some \(u,v \ge 0\), such that \(\exists m > 0\) satisfies \(C_{m} = k\), the sequence \(C_{m}\) is called a double-base chain of \(k\), and \(m\) is the length of this double-base chain. The length of the double base chain is equal to the number of \(2^{b} 3^{t}\) subitems in DBNS.

Double base chain make all calculated values reusable by restricting the sequence \((b_{i} )\) and \((t_{i} )\) in DBNS to decrease monotonically (i.e.,\(b_{1} \ge b_{2} \ge \cdots \ge b_{m} \ge 0\), \(t_{1} \ge t_{2} \ge \cdots \ge t_{m} \ge 0\)) and by applying recursive calculations. The double-base chain representations are highly redundant and can dramatically reduce the Hamming weights in scalar expansion. Algorithm 2 gives the process for transforming a large integer \(k\) into DBNS that conform to exponential constraints.

figure b

In Algorithm 2, the maximum exponential limit on the base \(x\) is generally set to \(\log_{x} (k)\), i.e., \(b_{\max } < \log_{2} (k) \le n\), \(t_{\max } < \log_{3} (k) \le 0.65n\), where \(n\) is the binary bit size of the positive integer \(k\). With a positive integer of 160-bit size, for example, \(b_{\max } = 160\) and \(t_{\max } = 103\) can be specified.

The ternary/binary method for fast computation of ECC scalar multiplication proposed by Ciet [17] was subsequently applied to double-base chain to decrease the number of inversions and the execution time by efficient point doubling (\(2P\)), point tripling (\(3P\)) and point quadrupling (\({4}P\)).

2.4 Step multi-base representation (SMBR)

In [11], Mishra extends the Double-Base Number System to Multi-Base Number Representation. Given a large positive integer \(k\) and a set \(B = \{ b_{{1}} ,b_{2} , \ldots ,b_{l} \}\) of small positive integers, if there exists an expression for \(k\) which is the sum of the products of the powers of the elements in base \(B\), i.e.,\(k = \sum\nolimits_{j = 1}^{m} {(s_{j} \prod\nolimits_{i = 1}^{l} {b_{i}^{{e_{ij} }} )} }\),with \(s_{j} \in \{ - 1,1\}\) and \(e_{ij} \ge 0\), then the above equation is called the multi-base number representation of the integer \(k\) using base \(B\). The integer \(m\) is the length of the MBNR and \(\left| {\text{B}} \right|\) denotes the number of elements in the set of B. If \(\left| B \right| = 2\), the MBNR is simplified to a DBNS. DBNS already has a high degree of redundancy and a short expression. MBNR has a higher redundancy and shorter expressions than DBNS.

Definition 3

(Step multi-base representation) A step multi-base representation based on the set \(B = \{ b_{{1}} ,b_{2} , \ldots ,b_{l} \}\) is

$$k = \sum\limits_{i = 1}^{m} {s_{i} b_{1}^{{e_{1i} }} b_{2}^{{e_{2i} }} \ldots b_{l}^{{e_{li} }} }\quad \;{\text{with}}\;s_{i} \in \{ - 1,1\}$$
(9)

where \(\forall j \in \{ 1,2, \ldots ,l\}\), the sequence \((e_{j} )\) is monotonously decreasing, then it is called the step multi-base representation of the integer k.

SMBR can be seen as a generalized extension of the DBC. If \(\left| B \right| = 2\), the SMBR is simplified to a DBC. Same as the DBC, all intermediate calculated values can be reused in SMBR. Thus, SMBR has a higher redundancy and can further reduce the Hamming weights in scalar expansion. As the efficient formula for point quintupling was proposed in [11], the SMBR based on \(B = \{ 2,3,5\}\), i.e., \(k = \sum\nolimits_{i = 1}^{m} {s_{i} 2^{{b_{i} }} } 3^{{t_{i} }} 5^{{q_{i} }}\), is widely used and studied [18, 22]. In the scalar multiplication algorithm, the selection of the base of SMBR will determine the computational performance of the algorithm. Replacing the existing base with a more efficient multiple-point formulas can further increase the redundancy of the expression and reduce the computational cost of scalar multiplication algorithm. Therefore, in Chapter 3, we proposed an efficient septuple formulas, and in Sect. 4, we proposed a scalar multiplication algorithm based on the SMBR with \(B = \{ {{1} \mathord{\left/ {\vphantom {{1} {2}}} \right. \kern-\nulldelimiterspace} {2}},3,{7}\}\).

3 Septuple formula design

In this section, we give an efficient septuple formula for elliptic curve points over binary fields, a proof of the formula and an efficiency analysis.

3.1 Point septupling in elliptic curves over binary fields

In order to reduce inversions in multipoint operations, it is common practice in research work to convert points to other coordinate systems (e.g., the Jacobi coordinates). However, over binary fields, where the \([i]/[m]\)-ratio is small, generally \({3} \le [i]/[m] \le 10\), the elliptic curve group arithmetic in affine coordinates already has excellent performance. Therefore, we propose a new point septupling formula in affine coordinates. Let \(P = (x,y)\) be a known point on an elliptic curve shown in Eq. (3) over a binary field. Assume that the sevenfold point of \(P\) is expressed as \({7}P = (u,v)\). Then, we can obtain \(u\) and \(v\) by the following formula:

Let us define a set of polynomials as follows: Then,

$$\begin{gathered} A = x^{4} + x^{3} + a \\ B = x(A + x^{3} ) \\ C = A^{3} + x^{4} B \\ D = A(B^{2} + C) \\ E = A^{6} + x^{4} B(A^{3} + B^{2} ) \\ F = B(A^{2} D + C^{2} ) \\ \end{gathered}$$
$$\left\{ {\begin{array}{*{20}c} {u = x + xD\left( {\frac{xF}{{E^{2} }}} \right)} \\ {v = y + u + \left( {\frac{xF}{{E^{2} }}} \right)\left[ {\frac{CF}{E} + (x^{2} + y)D} \right]} \\ \end{array} } \right.$$
(10)

3.2 Proof of septuple formula over binary fields

For non-supersingular curves over the binary fields, \(K = F_{{{2}^{m} }}\), there are division polynomials as follow:

$$\psi_{1} = 1$$
$$\psi_{2} = x$$
$$\psi_{3} = x^{4} + x^{3} + a$$
$$\psi_{4} = x^{6} + ax^{2} = x^{2} (x^{4} + a)$$

The higher degree division polynomials can be deduced from the following recursive relations:

$$\psi_{2n + 1} = \psi_{n + 2} \psi_{n}^{3} - \psi_{n - 1} \psi_{n + 1}^{3}$$
(11)
$$\psi_{2} \psi_{2n} = \psi_{n + 2} \psi_{n} \psi_{n - 1}^{2} - \psi_{n - 2} \psi_{n} \psi_{n + 1}^{2}$$
(12)

Applying these recursive relations, by sequentially assigning \(n = 2\) in Eq. (11), \(n = 3\) in Eq. (12), \(n = 3\) in Eq. (11) and \(n = {4}\) in Eq. (12), we can obtain:

$$\psi_{{5}} =\psi_{{4}} \psi_{{2}}^{{3}} - \psi_{1} \psi_{3}^{3}$$
$$\psi_{6} = {{\left( {\psi_{5} \psi_{3} \psi_{2}^{2} - \psi_{1} \psi_{3} \psi_{4}^{2} } \right)} \mathord{\left/ {\vphantom {{\left( {\psi_{5} \psi_{3} \psi_{2}^{2} - \psi_{1} \psi_{3} \psi_{4}^{2} } \right)} {\psi_{2} }}} \right. \kern-\nulldelimiterspace} {\psi_{2} }}$$
$$\psi_{7} = \psi_{5} \psi_{3}^{3} - \psi_{2} \psi_{4}^{3}$$
$$\psi_{8} = {{\left( {\psi_{6} \psi_{4} \psi_{3}^{2} - \psi_{2} \psi_{4} \psi_{5}^{2} } \right)} \mathord{\left/ {\vphantom {{\left( {\psi_{6} \psi_{4} \psi_{3}^{2} - \psi_{2} \psi_{4} \psi_{5}^{2} } \right)} {\psi_{2} }}} \right. \kern-\nulldelimiterspace} {\psi_{2} }}$$

Depending on the polynomials \(\psi_{{1}} \psi_{2} \psi_{3} \psi_{4}\) and the recurrence relation (11), (12), for any point \(P = (x,y)\) on a non-supersingular curve, the n-fold of this point can be derived from the formula:

$$[n]P = \left( {x + \frac{{\psi_{n + 1} \psi_{n - 1} }}{{\psi_{n}^{2} }},y + \psi_{2} on + \frac{{\psi_{n + 1}^{2} \psi_{n - 2} }}{{\psi_{2} \psi_{n}^{3} }} + h_{4} \frac{{\psi_{n + 1} \psi_{n - 1} }}{{\psi_{2} \psi_{n}^{2} }}} \right)$$
(13)

where,

$$\psi_{2} on = x + \frac{{\psi_{n + 1} \psi_{n - 1} }}{{\psi_{n}^{2} }}$$

and

$$h_{4} = \left( {x^{2} + y} \right)$$

Assuming that the affine coordinates of \(7P\) is \((u,v)\), we can calculate directly from Eq. (14):

$$u = x + \frac{{\psi_{8} \psi_{6} }}{{\psi_{7}^{2} }}$$
$$v = y + \psi_{2} on + \frac{{\psi_{8}^{2} \psi_{5} }}{{\psi_{2} \psi_{7}^{3} }} + h_{4} \frac{{\psi_{8} \psi_{6} }}{{\psi_{2} \psi_{7}^{2} }} = y + u + \frac{{\psi_{8}^{2} \psi_{5} }}{{\psi_{2} \psi_{7}^{3} }} + \left( {x^{2} + y} \right)\left( {\frac{{\psi_{8} \psi_{6} }}{{\psi_{2} \psi_{7}^{2} }}} \right)$$

However, calculating \((u,v)\) directly using the Formulae above is not the most suitable method. In the process of calculating \(\psi_{{1}} \psi_{2} \ldots\psi_{8}\) and \((u,v)\), many intermediate values are generated. The calculation can be accelerated by transforming the formula forms and multiplexing the intermediate values. We define polynomials: \(A = x^{4} + x^{3} + a\), \(B = x(A + x^{3} )\),\(C = A^{3} + x^{4} B\), \(D = A(B^{2} + C)\), \(E = A^{6} + x^{4} B(A^{3} + B^{2} )\) and \(F = B(A^{2} D + C^{2} )\), then the transformed forms of formulae \(\psi_{{1}} \psi_{2} \ldots\psi_{8}\) and \((u,v)\) are as follows:

$$\psi_{1} = 1$$
$$\psi_{2} = x$$
$$\psi_{3} = x^{4} + x^{3} + a = A$$
$$\psi_{4} = x^{6} + ax^{2} = x^{2} (x^{4} + a) = x^{2} (A + x^{3} ) = x\left[ {x(A + x^{3} )} \right] = xB$$
$$\psi_{{5}} =\psi_{{4}} \psi_{{2}}^{{3}} - \psi_{1} \psi_{3}^{3} = x^{4} B + A^{3} = C$$
$$\psi_{6} = {{\left( {\psi_{5} \psi_{3} \psi_{2}^{2} - \psi_{1} \psi_{3} \psi_{4}^{2} } \right)} \mathord{\left/ {\vphantom {{\left( {\psi_{5} \psi_{3} \psi_{2}^{2} - \psi_{1} \psi_{3} \psi_{4}^{2} } \right)} {\psi_{2} }}} \right. \kern-\nulldelimiterspace} {\psi_{2} }}={{\left( {x^{2} AC - x^{2} AB^{2} } \right)} \mathord{\left/ {\vphantom {{\left( {x^{2} AC - x^{2} AB^{2} } \right)} x}} \right. \kern-\nulldelimiterspace} x} = x\left[ {A(B^{2} + C)} \right] = xD$$
$$\psi_{7} = \psi_{5} \psi_{3}^{3} - \psi_{2} \psi_{4}^{3} = A^{3} C - x^{4} B^{3} = A^{6} + x^{4} B(A^{3} + B^{2} ) = E$$
$$\psi_{8} = {{\left( {\psi_{6} \psi_{4} \psi_{3}^{2} - \psi_{2} \psi_{4} \psi_{5}^{2} } \right)} \mathord{\left/ {\vphantom {{\left( {\psi_{6} \psi_{4} \psi_{3}^{2} - \psi_{2} \psi_{4} \psi_{5}^{2} } \right)} {\psi_{2} }}} \right. \kern-\nulldelimiterspace} {\psi_{2} }} = {{\left( {x^{2} A^{2} BD + x^{2} BC^{2} } \right)} \mathord{\left/ {\vphantom {{\left( {x^{2} A^{2} BD + x^{2} BC^{2} } \right)} x}} \right. \kern-\nulldelimiterspace} x} = x\left[ {B(A^{2} D + C^{2} )} \right] = xF$$

Substituting these values into formula (13) and then transforming, we can get septuple formula (10). The specific transformation process is as follows:

$$u = x + \frac{{\psi_{8} \psi_{6} }}{{\psi_{7}^{2} }} = x + \frac{{x^{2} DF}}{{E^{2} }} = x + xD\left( {\frac{xF}{{E^{2} }}} \right)$$
$$v = y + \psi_{2} on + \frac{{\psi_{8}^{2} \psi_{5} }}{{\psi_{2} \psi_{7}^{3} }} + h_{4} \frac{{\psi_{8} \psi_{6} }}{{\psi_{2} \psi_{7}^{2} }} = y + u + \frac{{xCF^{2} }}{{E^{3} }} + \left( {x^{2} + y} \right)\left( {\frac{xDF}{{E^{2} }}} \right) = y + u + \left( {\frac{xF}{{E^{2} }}} \right)\left[ {\frac{CF}{E} + (x^{2} + y)D} \right]$$

3.3 Cost of septuple formula over binary fields

Given a point \(P = (x,y)\) on a non-supersingular curve (Eq. 3), let us check the sub-expressions and costs required to compute \({7}P\) by applying Eq. (10) without any pre-computation. Table 1 lists the sub-expressions, intermediate values and costs for computing \({7}P\). Next, we analyze the efficiency of the septuple formula. Table 2 lists the costs of various field operations over binary fields. A number of methods for computing \({7}P\) have been proposed in previous research. In [19], the authors proposed several septuple formulae over prime fields, one with \(15[s] + 14[m]\) costs in Jacobian coordinates, and the other with \(9[s] + 13[m]\) costs in Jquartic coordinates. In [20], the author uses the expensive sevenfold point operation over binary fields, which requires three inversion, and the total computational cost is as high as \({3}[i] + {7}[s] + {18}[m]\). Later, in [21], the authors proposed two fast methods for point septupling over binary fields with computational costs of \({2}[i] + {7}[s] + {14}[m]\) (method 1) and \({1}[i] + {6}[s] + {20}[m]\) (method 2), and the break-even ratio between the two methods is \([i]/[m] = 6\).

Table 1 Cost of septuple formula for elliptic curves over binary fields
Table 2 Costs of various field operations for elliptic curves over binary fields in affine coordinates

The costs of septuple formula we proposed is \({1}[i] + {7}[s] + {16}[m]\). Compared with the method in [20], the formula proposed in this paper reduces two inversion and two multiplication, resulting in a speed up of 33% at the ratio of \([i]/[m] = 4\) and 43% at the ratio of \([i]/[m] = 8\), with more speed up at higher \([i]/[m]\)-ratio. Compared with the method 2 in [21], our formula reduces four multiplication and adds one squaring (the cost of squaring can be ignored over binary fields), and the speed is increased by 17% at the ratio of \([i]/[m] = 4\) and by 14% at the ratio of \([i]/[m]{ = 8}\), which is more significant when the \([i]/[m]\)-ratio is smaller. Compared with the method 1 in [21], our formula reduces one inversion and adds two multiplication. The break-even ratio between our formula and method 1 in [21] is \([i]/[m] = 2\). Our formula is faster when \([i]/[m] > {2}\), with up to 17% faster at the ratio of \([i]/[m] = 4\) and 20% faster at the ratio of \([i]/[m] = 8\). Over binary fields, the \([i]/[m]\)-ratio is generally not less than 3 and is commonly assumed to be \([i]/[m] = 8\). Therefore, the septuple formula we proposed is more efficient than the methods in [20, 21].

4 Methods

In order to improve the computation speed of elliptic curve scalar multiplication, we modify the MBNR based on \(B = \{ 2,3,5\}\) proposed in [11] and propose a SMBR based on \(B = \{ {{1} \mathord{\left/ {\vphantom {{1} {2}}} \right. \kern-\nulldelimiterspace} {2}},3,{7}\}\), denoted \(k = \sum\nolimits_{i = 1}^{m} {s_{i} (\frac{1}{2})^{{b_{i} }} 3^{{t_{i} }} 7^{{q_{i} }} }\). We retain the original point tripling, replace the point doubling and quadrupling with faster point halving, replace the point quintupling with the point septupling we proposed in Sect. 3 and restrict the exponential sequences \((b_{i} )\)\((t_{i} )\)\((q_{i} )\) of 1/2, 3 and 5 to decreasing monotonically, respectively. The modified SMBR is defined as follows.

Definition 4

(\(\{ {{1} \mathord{\left/ {\vphantom {{1} {2}}} \right. \kern-\nulldelimiterspace} {2}},3,{7}\}\)-step multi-base representation) A multi-base number representation based on the set of \(B = \{ {{1} \mathord{\left/ {\vphantom {{1} {2}}} \right. \kern-\nulldelimiterspace} {2}},3,{7}\}\).

$$k = \sum\limits_{i = 1}^{m} {s_{i} \left( \frac{1}{2} \right)^{{b_{i} }} 3^{{t_{i} }} 7^{{q_{i} }} } ,\;{\text{with}}\;s_{i} \in \{ - 1,1\}$$
(14)

where the sequence of \((b_{i} )\)\((t_{i} )\)\((q_{i} )\) is each monotonically decreasing. Then, the representation is called a \(\{ {{1} \mathord{\left/ {\vphantom {{1} {2}}} \right. \kern-\nulldelimiterspace} {2}},3,{7}\}\)-step multi-base representation of the integer \(k\).

Due to the inclusion of point halving, the \(\{ {{1} \mathord{\left/ {\vphantom {{1} {2}}} \right. \kern-\nulldelimiterspace} {2}},3,{7}\}\)-SMBR of the large integer \(k\) has to be derived indirectly. Assuming that the size of the binary field is \(p\), firstly we find a large exponent \({2}^{r}\) with base 2 that the value approximating \(p\). Then, we multiply the original scalar \(k\) by \({2}^{r}\) and then model \(p\) and denote the result as \(k^{{\prime}}\), as shown in Eq. (15).

$$k^{{\prime}} { = 2}^{r} k \,\bmod\, p$$
(15)

This allows us to transform finding a \(\{ {{1} \mathord{\left/ {\vphantom {{1} {2}}} \right. \kern-\nulldelimiterspace} {2}},3,{7}\}\)-SMBR of \(k\) into finding a \(\{ {2},3,{7}\}\)-MBNR with exponential restrictions of \(k^{{\prime}}\). Thus, the form of the representation of k can become

$$k = \frac{{k^{{\prime}} }}{{2^{r} }} = \frac{{\sum\limits_{i = 1}^{m} {s_{i} 2^{{b^{{\prime}}_{i} }} 3^{{t_{i} }} 7^{{q_{i} }} } }}{{2^{r} }} = \sum\limits_{i = 1}^{m} {s_{i} \left( \frac{1}{2} \right)^{{(r - b_{i} )}} 3^{{t_{i} }} 7^{{q_{i} }} } \,\bmod\, p,\;{\text{with}}\;k^{{\prime}} { = 2}^{r} k\,\bmod\, p$$
(16)

where \(s_{i} \in \{ - 1,1\}\), \(0 \le b^{{\prime}}_{1} \le b^{{\prime}}_{2} \le \cdots \le b^{{\prime}}_{m}\), \(t_{1} \ge t_{2} \ge \cdots \ge t_{m} \ge 0\) and \(q_{1} \ge q_{2} \ge \cdots \ge q_{m} \ge 0\).

Subsequently, we find MBNR of \(k^{{\prime}}\) with base \(\{ {2},3,{7}\}\) and restrict the exponential sequences of 3 and 7 monotonic decrease, but the exponential sequence of 2 monotonic increases. We obtain this MBNR in an iterative way. First, we find \(n\) such that \(k = 0\;\,\bmod\, \;n\), where the trial order of \(n\) is \(\left\{ {42,36,32,28,27,24,21,18,16,14,12,9,8,7,6,4,3,2} \right\}\). If \(k = 0\,\bmod\, {42}\), then return \({2} \cdot {3} \cdot {7}\left( \frac{k}{42} \right)\). If \(k = 0\,\bmod\, {36}\), then return \({2}^{2} \cdot {3}^{2} \left( \frac{k}{36} \right)\). If \(k = 0\,\bmod\, 32\), then return \({2}^{5} \left( \frac{k}{32} \right)\). And so on, if \(k = 0\;\,\bmod\, \;n\), then return \({2}^{{n_{1} }} \cdot {3}^{{n_{2} }} \cdot {7}^{{n_{3} }} \left( \frac{k}{n} \right)\), where \(2^{{n_{1} }} \cdot 3^{{n_{2} }} \cdot 7^{{n_{3} }} = n\). If no suitable match is found for all trials, we find a power of 2 that is closest to \(k\) denoted as \(k_{c}\) and return the absolute value \(\left| {k - k_{c} } \right|\) of the difference between \(k\) and \(k_{c}\). We chose the power of 2 as the approximation for k because point doubling will become point halving afterwards (and may also constitute half-and-add) with less cost than point tripling and point septupling. As the return value of \(\left| {k - k_{c} } \right|\) becomes smaller and smaller, it can always be approximated in the next round by a lower power of 2. So in this MBNR, the exponents of 2 are keep monotonically decreasing. Therefore, triple-and-add are rarely required in this scalar multiplication. The iterations do not stop until \(k\) is equal to 1 or the power of 2, 3 and 7, which means that for any non-negative integer \(b\), \(t\) and \(q\), \(2^{b} 3^{t} 7^{q}\) can represent a positive integer. The return terms of this iterative algorithm form a MBNR of \(k^{{\prime}}\) and are ordered from the highest exponent of 2 multiplied by the lowest power of 3 and 7 to the lowest exponent of 2 multiplied by the highest power of 3 and 7. Then, we reverse the order of the sub-terms of the MBNR, so that the exponents of 3 and 7 decrease, and the exponents of 2 increase. Finally, by dividing the MBNR by \({2}^{r}\), so that all exponents of 2 are negative, the exponents of 1/2 are monotonically decreasing. From this, we can obtain an \(\{ {{1} \mathord{\left/ {\vphantom {{1} {2}}} \right. \kern-\nulldelimiterspace} {2}},3,{7}\}\)-SMBR of \(k\) as shown in Eq. (16).

Based on this representation, we can propose a scalar multiplication for elliptic curves over binary fields using \(\{ {{1} \mathord{\left/ {\vphantom {{1} {2}}} \right. \kern-\nulldelimiterspace} {2}},3,{7}\}\)-SMBR, as described in Algorithm 3. The number of additions (including half-and-add and triple-and-add) is equal to the number of items in the presentation minus one. Half-and-add is used instead of addition as long as the exponent of 1/2 is not zero. If the exponent of 1/2 is zero, but the exponent of 3 is not zero, then triple-and-add is used instead of addition. Since there is no formula for quadruple-and-add, we can only use the typical addition operation if the exponents of 1/2 and 3 are zero at the same time. A total of \(b_{1}\) point halving (including half-and-add), \(t_{1}\) point tripling (including triple-and-add) and \(t_{1}\) point septupling are required in the execution of Algorithm 3.

figure c

5 Results and discussion

The experiments were performed on the elliptic curve recommended by the National Institute of Standards and Technology (NIST). The experiments are divided into three test groups in total, it including the elliptic curves NIST B-163, NIST B-233 and NIST B-283, and the size of the binary field was selected as 160-bit, 233-bit and 283-bit, respectively. In order to analyze the performance of the scalar multiplication algorithm proposed in this paper more visually, we compare our algorithm with the NAF proposed in [4], the DBC proposed in [10], the MBNR proposed in [11] and the MMBR proposed in [22]. For each test group, 1000 large integer scalar quantities \(k\) are selected at random, scalar multiplication is performed with each algorithm in turn without using any precomputation and pre-storage points, the average of the number of field operations of each algorithm is counted, and the number of inverse, square and multiplication are expressed in terms of \(I\), \(S\) and \(M\), respectively. The experimental environment is: the hardware environment is Intel (R) Core (TM) i7 CPU @ 2.20 GHz, the installed memory is 16 GB, the software environment is LINUX operating system, and the algorithms are implemented in C/C++ with the Multiprecision Integer and Rational Arithmetic C/C++ Library.

In this section, the performance of our proposed algorithm is described in detail. In order to clearly compare the total computational cost of the different algorithms, we select \([i]/[m] = 8\) and ignore \([s]\) over binary fields. The comparison of the total computational cost of different scalar multiplication algorithms is shown in Fig. 1, the cost of all algorithms rises as the length of the scalar \(k\) grows. Since our algorithm and MMBR both use point halving operation, the computational cost is significantly lower than other algorithms. Compared with MMBR, our algorithm has better performance, and the greater the length of the scalar k, the more obvious the advantage of our algorithm. This is because our algorithm uses new septuple formulas we proposed, which is more effective in reducing the number of expensive field inverse operation than the quintuple formulas.

Fig. 1
figure 1

Comparison of the total computational cost of different scalar multiplication algorithms. Compares the performance of our method with the NAF method in [4], the DBC method in [10], the MBNR method in [11] and the MMBR method in [22]

Table 3 shows that detailed data of computational cost of different scalar multiplication algorithms. On the curve NIST B-163 (\(k = 16{0}\) bit), the cost of our algorithm is 35% lower than the NAF, 29% lower than the DBC, 25% lower than the MBNR and 3% lower than the MMBR. On the curve NIST B-233(\(k = {233}\) bit), the cost of our algorithm is 37% lower than the NAF, 30% lower than the DBC, 26% lower than the MBNR and 4% lower than the MMBR. On the curve NIST B-283(\(k = {283}\) bit), the cost of our algorithm is 38% lower than the NAF, 30% lower than the DBC, 27%lower than the MBNR and 4% lower than the MMBR. Based on the above results, we can see that the cost of the algorithm in this paper is significantly reduced compared with NAF, DBC, and MBNR and has a small enhancement compared with MMBR, and the improvement effect becomes more obvious with the increase in the length of the scalar \(k\). As a result, in the case of the same elliptic curve over binary fields and the same field size, the scalar multiplication algorithm we propose is more efficient.

Table 3 Detailed data of computational cost of different scalar multiplication algorithms with \([i]/[m]{ = 8}\)

6 Conclusions

ECC is a crucial method to ensure secure communication between devices in the IoT. Scalar multiplication is one of the major operations in ECC. How to reduce the computational complexity of ECC scalar multiplication is key to maintaining the long-term stable functioning of IoT devices. There has been much research proposing schemes to speed up scalar multiplication. In this paper, we present an efficient formula for point septupling on elliptic curves over binary fields in affine coordinates and a scalar multiplication algorithm based on the step multi-base representation. The septuple formula uses division polynomial and multiplexing of intermediate values in affine coordinates to speed up computations by more than 14%. The scalar multiplication algorithm is based on the SMBR with \(B = \{ {{1} \mathord{\left/ {\vphantom {{1} {2}}} \right. \kern-\nulldelimiterspace} {2}},3,{7}\}\), which drastically reduces the computational cost by using point halving and the point septupling we propose. The experimental results indicate that our method can effectively reduce the cost of scalar multiplication for elliptic curves over the binary fields and contribute to the lightweight of ECC. In addition, the elliptic curve scalar multiplication method studied in this paper is still at the theoretical research stage, and future research needs to further consider the newer iterations of IoT terminal devices.