Design and Analysis of Pattern Matching Algorithms Based on QuRAM Processing

  • Research Article-Computer Engineering and Computer Science
Arabian Journal for Science and Engineering

Abstract

The pattern matching problem has persisted for decades and has become more demanding with the exponential growth of text databases. An effective deterministic classical algorithm is always expected to take at least \({\rm O}\left( N \right)\) time. Quantum computation is capable of performing exponentially many operations in a single execution step, which makes quantum algorithms effective. In general, a quantum pattern matching solution is possible in \({\rm O}\left( {\sqrt N } \right)\) time when its design is based on Grover’s quantum search algorithm. To our knowledge, the available quantum algorithms for single pattern matching have limitations, and no algorithm has been designed for multiple pattern matching. Our main objective is to design quantum algorithms for both single and multiple patterns on a processing architecture based on quantum random access memory \(\left( {QuRAM} \right)\). This gives a significant advantage in processing large text databases efficiently. Our complexity analysis shows that the quantum algorithmic solutions achieve a computational speedup over classical methods. We summarize the emerging use of quantum-based pattern matching algorithms in biological applications. A simulation is also carried out to validate and analyze the performance of the proposed quantum algorithms. Lastly, we argue that our algorithms outperform the existing classical and quantum solutions and are suitable for implementation on a quantum computer.
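
As a rough illustration of the gap between the \({\rm O}\left( N \right)\) and \({\rm O}\left( {\sqrt N } \right)\) bounds quoted above, the following Python sketch compares the worst-case number of positions a classical scan inspects with the standard Grover iteration count \(\lfloor \frac{\pi}{4}\sqrt{N/M} \rfloor\) for \(M\) marked items. The function names are ours, and the sketch treats the text length as the size of the search space, which is a simplifying assumption. It is not the proposed \(QuRAM\)-based algorithm.

```python
import math


def classical_probes(n_positions: int) -> int:
    """Worst-case number of positions a deterministic left-to-right scan inspects."""
    return n_positions


def grover_iterations(n_positions: int, n_solutions: int = 1) -> int:
    """Standard Grover iteration count floor(pi/4 * sqrt(N/M)).

    Assumption: the search space has n_positions entries, of which
    n_solutions are marked. This is textbook Grover search, not the
    paper's circuit.
    """
    if not 0 < n_solutions <= n_positions:
        raise ValueError("need 0 < n_solutions <= n_positions")
    return math.floor(math.pi / 4 * math.sqrt(n_positions / n_solutions))


if __name__ == "__main__":
    # Text sizes taken from the simulation triplet (32, 128, 512).
    for n in (32, 128, 512):
        print(n, classical_probes(n), grover_iterations(n))
```

The oracle cost per iteration is ignored here, so the numbers indicate only how the query counts scale with \(N\).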

References

  1. Tao, T.; Mukherjee, A.: Pattern-Matching in LZW Compressed Files. IEEE Trans. On Computers 54(8), 929–938 (2005)

  2. Das, S.; Kapoor, K.: Weighted approximate parameterized string matching. AKCE International Journal of Graphs and Combinatorics 14, 1–12 (2017)

  3. Hakak, S.I.; Kamsin, A.: Exact String Matching Algorithms: Survey, Issues, and Future Research Directions. IEEE Access 7, 69614–69637 (2019)

  4. Neamatollahi, P.; Hadi, M.; Naghibzadeh, M.: Simple and Efficient Pattern Matching Algorithms for Biological Sequences. IEEE Access 8, 23838–23846 (2020)

  5. Faro, S.; Lecroq, T.: The Exact Online String Matching Problem: A Review of the Most Recent Results. ACM Comput. Surv. 45(2), 1–42 (2013)

  6. Rivals, E.; Salmela, L.; Tarhio, J.: Exact Search Algorithm for Biological Sequences. Algorithms in Computational Molecular Biology: Techniques, Approaches and Applications, John Wiley and Sons, pp. 91–111 (2011).

  7. Zou, D.; Ma, L.; Yu, J.; Zhang, Z.: Biological Databases for Human Research. Genomics Proteomics Bioinformatics 13, 55–63 (2015)

  8. Kalsi, P.; Peltola, H.; Tarhio, J.: Comparison of Exact String Matching Algorithms for Biological Sequences. CCIS Springer 13, 417–426 (2008)

  9. Knuth, D.E.; Morris, J.H.; Pratt, V.R.: Fast pattern matching in strings. SIAM J. Comput. 6, 323–350 (1977)

  10. Boyer, R.S.; Moore, J.S.: A fast string searching algorithm. Commun. ACM 20, 762–772 (1977)

  11. Aho, A.V.; Corasick, M.J.: Efficient string matching: an aid to bibliographic search. Commun. ACM 18(6), 333–340 (1975)

  12. Charalampos, S.; Panagiotis, D.; Konstantinos, G.: Parallel processing of multiple pattern matching algorithms for biological sequences: methods and performance results. In: Yang, N.S. (eds.) Bioinformatics – Computational Biology and Modeling, pp. 161–182. IntechOpen, London (2011)

  13. Lin, C.-H.; Chein, L.-S.: Accelerating Pattern Matching Using a Novel Parallel Algorithm on GPUs. IEEE Trans. On Computers 62(10), 1906–1916 (2013)

  14. Nielsen, M.; Chuang, I.: Quantum Computation and Quantum Information, 10th anniversary edn. Cambridge University Press, Cambridge (2010)

  15. Grover, L.K.: A fast quantum mechanical algorithm for database search. In: Proceedings of ACM STOC 1996, pp. 212–219, ACM (1996)

  16. Ramesh, H.; Vinay, V.: String Matching in O(√n + √m) quantum time. J. Discrete Algorithms 1, 103–110 (2003)

  17. Mateus, P.: A Quantum Algorithm for Closest Pattern Matching. Int. J. of Theoretical Physics 52, 3970–3980 (2003)

  18. Aborot, J.: Quantum Approximate String Matching for Large Alphabets. Theory and Practice of Computation, World Scientific 20, 37–50 (2017)

  19. De Jesus, B.K.A.; Aborot, J.A.; Adorna, H.N.: Solving the Exact Pattern Matching Problem Constrained to Single Occurrence of Pattern P in String S Using Grover’s Quantum Search Algorithm. Theory and Practice of Computation, Springer Tokyo 7, 124–142 (2013)

  20. Montanaro, A.: Quantum Pattern Matching Fast on Average. Algorithmica 77, 16–39 (2017)

  21. Giovannetti, V.; Lloyd, S.; Maccone, L.: Quantum random access memory. Phys. Rev. Lett. 100, 1–4 (2008)

  22. Park, D.K.; Petruccione, F.; Rhee, J.-K.K.: Circuit-Based Quantum Random Access Memory for Classical Data. Sci. Rep. 9(3949), 1–8 (2019)

  23. Lanzagorta, M.; Uhlmann, J.: Quantum Computer Science. Synthesis Lectures on Quantum Computing, E-Book (2008)

  24. Chakrabarty, I.; Khan, S.; Singh, V.: Dynamic Grover Search: Application in Recommendation System and Optimization Problems. Quantum Info. Process 16, 152–172 (2017)

  25. Mandviwalla, A.; Ohshiro, K.; Ji, B.: Implementing Grover’s Algorithm on the IBM Quantum Computers. In: Proceedings of International Conference on Big Data 2018, pp. 2531–2537, IEEE (2018)

  26. Jones, T.; Brown, A.; Bush, I.; Benjamin, S.C.: QuEST and High Performance Simulation of Quantum Computers. Sci. Rep. 9(10736), 1–9 (2019)

  27. Hao, X.; Zhang, F.; Xla, S.; Zhou, Y.: Quantum Algorithms for Learning the Algebraic Normal Form of Quadratic Boolean Functions. Quantum Inf. Process. 19(273), 1–22 (2020)

  28. Bogdanova, Yu.I.; Bogdanova, N.A.; Fastovets, D.V.; Lukichev, V.F.: Representation of Boolean Function in terms of Quantum Computations. In: International Conference on Micro- and Nano-Electronics, pp. 1–19 (2018)

Acknowledgement

The authors are thankful to other researchers working in the same domain for sharing their ideas, and for the \(QuEST\) quantum library, which helped us validate the proposed algorithms.

Author information

Corresponding author

Correspondence to Kapil Kumar Soni.

Appendices

Appendix A

In Table 8, we list key points for understanding the proposed algorithms in context: their advantages, limitations, and preferred biological applications. This summary refers to Analysis Notes 1 to 4 and Application Notes 1 to 4. From the perspective of quantum pattern matching algorithm design, we observed a general limitation: the probability of obtaining the expected search results can decrease as the size of the text database increases. For all multiple pattern algorithms, we noted that performance varies with the pattern lengths (for unequal sized patterns). However, the different possible lengths of text and pattern are considered on the basis of the alphabet size. We therefore summarize the biological applications for which our algorithms are found suitable.

Table 8 \(QuRAM\)-based pattern matching algorithms comparison with application

Appendix B

The experiments are performed using \(QuEST\) simulation to evaluate the average execution time and the utilization of \(RAM\) workspace. Moreover, we recorded a standard \(QASM\) log to observe the number of quantum gates applied to the quantum registers during algorithmic simulation [26]. These details are given in Table 9. The identified results and observed facts are discussed separately here through simulation notes.

Table 9 Experimental outcomes of QuEST-specific simulation

Simulation Note 2. The outcomes noted in Table 9 confirm that the number of solutions is correctly reported at each identified index \(|T_{i} \rangle_{AR} \in |T_{n} \rangle_{QuRAM}\). The expected qubit requirement given in Table 7 matches the qubits observed under \(QuEST\) simulation exactly. To obtain average outcomes of the \(QuEST\) simulation, the experiment was repeated 20 times. For the triplet of text file sizes and the different pattern lengths, we noted the average execution time (in sec) and the workspace utilization of \(RAM\) (in KiB). The average execution time of \(QuEMS\_US\) is lower than that of \(QuEMS\_ES\) because the second pattern is shorter (unequal sized).

Similarly, we observed the same relation between the average execution times of the \(QuEMM\_US\) and \(QuEMM\_ES\) algorithms. We noted that the average execution time of \(QuEMM\) is higher than that of the \(QuEMS\) algorithms. This happens because \(QuEMS\) searches only for a single occurrence of each pattern, whereas \(QuEMM\) searches for all occurrences of multiple pattern strings. However, we observed an exception: the average execution time of the single pattern \(QuESM\) algorithm is comparatively higher than that of \(QuEMS\_US\) and \(QuEMM\_US\). The \(QuEMS\_US\) algorithm searches only for a single occurrence of each of the multiple patterns; therefore, its execution time can be lower.

In contrast, \(QuESM\) searches for all occurrences of a single pattern, so its average execution time may exceed that of the \(QuEMS\_US\) algorithm. The higher average execution time of \(QuESM\) compared with \(QuEMM\_US\) is an exception. We attribute it to a possible random increase in the depth of the Boolean function that was realized for \(QuRAM\) through the \(ANF\). On a quantum machine, the average execution times we evaluated would be considered negligible for small sized text and pattern.
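
To make the three search modes compared in this note concrete, the following classical Python sketch states what each one is expected to return: the first occurrence of each pattern (the task attributed to \(QuEMS\)), all occurrences of each pattern (\(QuEMM\)), and all occurrences of one pattern (\(QuESM\)). The function names and the sample strings are hypothetical, and this is only a classical reference for the expected outputs, not the quantum circuits.

```python
from typing import Dict, List, Sequence


def all_occurrences(text: str, pattern: str) -> List[int]:
    """All start indices of pattern in text, overlaps included (QuESM-style task)."""
    hits = []
    start = text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits


def multi_first(text: str, patterns: Sequence[str]) -> Dict[str, int]:
    """First start index of each pattern, or -1 if absent (QuEMS-style task)."""
    return {p: text.find(p) for p in patterns}


def multi_all(text: str, patterns: Sequence[str]) -> Dict[str, List[int]]:
    """All start indices of each pattern (QuEMM-style task)."""
    return {p: all_occurrences(text, p) for p in patterns}


if __name__ == "__main__":
    text = "ACGTACGTAC"          # hypothetical DNA fragment
    patterns = ["ACG", "TA"]     # hypothetical unequal sized patterns
    print(multi_first(text, patterns))
    print(multi_all(text, patterns))
```

The amount of output grows from `multi_first` to `multi_all`, which mirrors the ordering of execution times reported for \(QuEMS\) and \(QuEMM\).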

Simulation Note 3. In Table 9, the utilization of classical \(RAM\) workspace (in KiB) was observed throughout the execution phase of the simulation program. Memory consumption is noted for the triplet of text file sizes, and a simple correlation is found across all executed algorithms. The memory utilization of \(QuEMS\_US\) and \(QuEMM\_US\) is lower than that of \(QuEMS\_ES\) and \(QuEMM\_ES\) because the second pattern is shorter. However, the \(RAM\) requirement of \(QuEMM\) is higher than that of the \(QuEMS\) algorithm because it searches for all occurrences of multiple pattern strings. Besides, the single pattern \(QuESM\) shows a slightly lower \(RAM\) requirement than all other executions. Overall, the classical \(RAM\) utilization of the implemented algorithms falls within a narrow range.

The memory consumptions of our algorithms are closely similar rather than divergent. This is not an exception; it happens because the \(ANF\) is used to realize the unitary \(U_{LD}\) for \(QuRAM\). Since Hadamard gates are used to place the text indices in superposition and the \(ANF\) builds the superposition of coherently correlated data, we expect a slight variation in the depth of the Boolean functions, as the functional outcome depends on a depth of at most \(O\left( {2^{n} } \right)\). Classical memory consumption is therefore higher, and a consistent amount of memory space \(\left( {RAM} \right)\) is needed throughout the execution of the programs.
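
The dependence on the \(ANF\) can be illustrated with a minimal Python sketch of the standard binary Möbius transform, which turns a truth table of an \(n\)-variable Boolean function into its \(ANF\) coefficients. A function can have up to \(2^{n}\) monomials, which is consistent with the \(O\left( {2^{n} } \right)\) bound mentioned above. The function names and the bit-ordering convention are our assumptions, and the construction of \(U_{LD}\) itself is not reproduced here.

```python
from typing import List, Tuple


def anf_coefficients(truth_table: List[int]) -> List[int]:
    """Binary Moebius transform: truth table -> ANF coefficients over GF(2).

    Convention (assumed): entry i of the table is f(x) with x_j = bit j of i,
    and coefficient i belongs to the monomial of all x_j with bit j of i set.
    """
    size = len(truth_table)
    if size == 0 or size & (size - 1):
        raise ValueError("truth table length must be a power of two")
    coeffs = list(truth_table)
    step = 1
    while step < size:
        for block in range(0, size, 2 * step):
            for i in range(block, block + step):
                coeffs[i + step] ^= coeffs[i]  # XOR-accumulate the lower half
        step *= 2
    return coeffs


def anf_terms(coeffs: List[int]) -> List[Tuple[int, ...]]:
    """Monomials with coefficient 1, each as a tuple of variable indices."""
    n_vars = (len(coeffs) - 1).bit_length()
    return [tuple(j for j in range(n_vars) if (i >> j) & 1)
            for i, c in enumerate(coeffs) if c]


if __name__ == "__main__":
    # OR of two variables: f = x0 XOR x1 XOR x0*x1
    print(anf_terms(anf_coefficients([0, 1, 1, 1])))
```

In a circuit realization, each monomial of degree \(k\) typically maps to one \(C^{k} NOT\) gate, so the number and degree of the monomials drive the gate count and depth.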

Simulation Note 4. We recorded a log of standard quantum assembly \(\left( {QASM} \right)\) instructions, from which we observed the number of quantum gates used during simulation of the quantum circuits. The tuple \(\left( {H, X, R_{z} \left( {\theta = 0} \right), C^{n - 1} Z, C^{k} NOT} \right)\) lists the quantum gates noted during simulation. The rightmost column of Table 9 shows the individual gate counts of the algorithms in the order of this tuple. The gate count is observed for either the single pattern \(P\) or the multiple pattern strings \(P1\) and \(P2\) (equal or unequal sizes). Each quantum gate is counted separately for the triplet of text file sizes (32, 128, 512). As expected, the gate counts increase and decrease in proportion to the sizes of the binary encoded text file and binary encoded pattern(s).

We nevertheless noted some exceptions in the standard \(QASM\) log file: the equal sized multiple pattern algorithms \(QuEMS\_ES\) and \(QuEMM\_ES\) show an increase (approximately double) in the counts of \(R_{z}\) and \(C^{k} NOT\) gates. This was observed between the patterns \(P1\) and \(P2\), and only for the text file size of 512 characters. It is again due to the random increase in circuit depth of the Boolean function that was realized for \(QuRAM\) through the \(ANF\).
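
Gate counts of this kind can be tallied from a recorded log with a few lines of Python. The sketch below assumes an OpenQASM-2-style log with one instruction per line; the sample log and the gate mnemonics in it are hypothetical and are not copied from the \(QuEST\) output used in our experiments.

```python
import re
from collections import Counter

# Leading identifier of an instruction, e.g. "h", "cx" or "Rz" in "Rz(0) q[2];".
_GATE_NAME = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)")
# Statements that are not gate applications.
_NON_GATES = {"OPENQASM", "include", "qreg", "creg", "measure", "barrier", "reset"}


def count_gates(qasm_text: str) -> Counter:
    """Count gate applications per mnemonic in an OpenQASM-2-style log."""
    counts = Counter()
    for raw_line in qasm_text.splitlines():
        line = raw_line.split("//", 1)[0].strip()  # drop comments
        if not line:
            continue
        match = _GATE_NAME.match(line)
        if match and match.group(1) not in _NON_GATES:
            counts[match.group(1)] += 1
    return counts


# Hypothetical log fragment, for illustration only.
SAMPLE = """OPENQASM 2.0;
qreg q[3];
// superpose the index register
h q[0];
h q[1];
x q[2];
Rz(0) q[2];
cx q[0],q[1];
"""

if __name__ == "__main__":
    print(dict(count_gates(SAMPLE)))
```

Running the same tally over the logs of two algorithms for the same text size gives a direct way to spot a doubling in a particular gate type.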

Throughout our discussion, we were concerned with the qubit requirement, because the classical memory requirement grows exponentially with the number of qubits and thus causes an exponential increase in processing time. \(QuEST\) performance also depends on how the algorithm scales with the number of qubits, and it is limited by the configuration of the underlying classical machine. With our machine configuration, quantum simulation is limited to 25 qubits. So far, we have implemented the algorithms \(QuESM\), \(QuEMS\) and \(QuEMM\), as they were found feasible for simulation. However, Table 10 shows that the \(QuAMM\) algorithm needs an implementation of the unitary \(U_{HD}\) for approximate matching. This design has the same qubit complexity as we theoretically noted for the \(APM\) algorithm (Table 4). Due to the large multiplicative constants, a higher number of qubits is required; therefore, simulation of the \(QuAMM\) algorithms is infeasible. For a triplet of file sizes and assumed patterns, we list the expected qubit requirement. The “Expected Qubits” entries of Table 10 show that the qubit requirement exceeds the capability of our machine.
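
The exponential relation between qubits and classical memory can be checked with a short calculation. The Python sketch below assumes a full state-vector simulation that stores \(2^{n}\) complex amplitudes, each as two double-precision reals; the function names are ours, and the actual workspace of a simulator will be larger than this lower bound.

```python
def statevector_bytes(n_qubits: int, bytes_per_real: int = 8) -> int:
    """Memory for 2**n complex amplitudes (real and imaginary part each).

    Assumption: one dense state vector in double precision, with no
    additional buffers or communication workspace.
    """
    if n_qubits < 0:
        raise ValueError("n_qubits must be non-negative")
    return (2 ** n_qubits) * 2 * bytes_per_real


def max_qubits(available_bytes: int, bytes_per_real: int = 8) -> int:
    """Largest n whose state vector fits into available_bytes (0 if none fits)."""
    n = 0
    while statevector_bytes(n + 1, bytes_per_real) <= available_bytes:
        n += 1
    return n


if __name__ == "__main__":
    for n in (20, 25, 30):
        print(n, statevector_bytes(n) / 2 ** 20, "MiB")
```

Under this assumption, every additional qubit doubles the memory, so a requirement a few qubits above the feasible limit already multiplies the needed memory several times over.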

Table 10 Analysis of qubits for \(QuAMM\) algorithms

Appendix C

Our presented algorithms are implemented with the \(QuEST\) library for validation. A genome sequence file of hot pepper (Capsicum annuum) is available at http://plants.ensembl.org/Capsicum_annuum/Info/Index. The genome subset dataset and the \(QuEST\)-specific simulation codes are uploaded to GitHub. All the algorithm codes are publicly available at https://github.com/profkapilsoni/QuQPMA.

Cite this article

Soni, K.K., Malviya, A.K. Design and Analysis of Pattern Matching Algorithms Based on QuRAM Processing. Arab J Sci Eng 46, 3829–3851 (2021). https://doi.org/10.1007/s13369-020-05310-y
