Unifying Parsing and Reflective Printing for Fully Disambiguated Grammars

New Generation Computing

Abstract

Language designers usually need to implement parsers and printers. Despite being two closely related programs, in practice they are often designed separately, and then need to be revised and kept consistent as the language evolves. It would be more convenient if the parser and printer could be unified and developed in a single program, with their consistency guaranteed automatically. Furthermore, in certain scenarios (like showing compiler optimisation results to the programmer), it is desirable to have a more powerful reflective printer that, when an abstract syntax tree corresponding to a piece of program text is modified, can propagate the modification to the program text while preserving layouts, comments, and syntactic sugar. To address these needs, we propose a domain-specific language BiYacc, whose programs denote both a parser and a reflective printer for a fully disambiguated context-free grammar. BiYacc is based on the theory of bidirectional transformations, which helps to guarantee by construction that the generated pairs of parsers and reflective printers are consistent. Handling grammatical ambiguity is particularly challenging: we propose an approach based on generalised parsing and disambiguation filters, which produce all the parse results and (try to) select the only correct one in the parsing direction; the filters are carefully bidirectionalised so that they also work in the printing direction and do not break the consistency between the parsers and reflective printers. We show that BiYacc is capable of facilitating many tasks such as Pombrio and Krishnamurthi’s ‘resugaring’, simple refactoring, and language evolution.
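To make the notion of consistency concrete, the following is a minimal Haskell sketch of a parser paired with a reflective printer for a toy language consisting of a single natural-number literal surrounded by whitespace. It is not BiYacc and not generated by it; the names splitSource, parse, and reflectivePrint are hypothetical and chosen for illustration only. For non-negative t, the pair is intended to satisfy parse (reflectivePrint s t) = Just t, and printing an unchanged AST is intended to return the original text, layout and leading zeros included.

```haskell
import Data.Char (isDigit, isSpace)

-- Hypothetical helper: split a source string into leading layout, a run of
-- digits, and trailing layout. Returns Nothing unless the text is exactly
-- one natural-number literal surrounded by whitespace.
splitSource :: String -> Maybe (String, String, String)
splitSource s =
  let (pre, rest)    = span isSpace s
      (digits, post) = span isDigit rest
  in if not (null digits) && all isSpace post
       then Just (pre, digits, post)
       else Nothing

-- Parsing direction: forget layout and leading zeros.
parse :: String -> Maybe Integer
parse s = do
  (_, digits, _) <- splitSource s
  pure (read digits)

-- Reflective printing direction: put a (possibly updated) AST back into the
-- old source text. The old text is returned unchanged when the value is the
-- same; otherwise only the literal is replaced and the layout is reused.
-- Assumption: t is non-negative, since the toy grammar has no minus sign.
reflectivePrint :: String -> Integer -> String
reflectivePrint s t =
  case splitSource s of
    Just (pre, digits, post)
      | read digits == t -> s
      | otherwise        -> pre ++ show t ++ post
    Nothing -> show t

main :: IO ()
main = do
  let src = "  073 "
  print (parse src)                      -- Just 73
  print (reflectivePrint src 73)         -- "  073 "
  print (reflectivePrint src 8)          -- "  8 "
  print (parse (reflectivePrint src 8))  -- Just 8
```

In this sketch the two directions are written by hand, so their consistency has to be checked separately; the point of a unified description is that such a check would not be needed.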

Notes

  1. We assume basic knowledge about functional programming languages and their notations, in particular Haskell [5, 32]. In Haskell, an argument of function application does not need to be enclosed in (round) parentheses, i.e. we write \(f\,x\) instead of \(f(x)\); type variables are implicitly universally quantified, i.e. \(f \;{:}{:}\; a \rightarrow b \rightarrow a\) is the same as \(f \;{:}{:}\; \forall a\ b.\ a \rightarrow b \rightarrow a\), where \({:}{:}\) means ‘has type’. Additionally, we omit universal quantification for free variables in an equation; for instance, \(\textit{parse}\;(\textit{print}\;s\;t) = t\) is in fact \(\forall s\ t.\ \textit{parse}\;(\textit{print}\;s\;t) = t\).

  2. While single quotation marks are for characters, double quotation marks are for strings. For simplicity, the user can always use double quotation marks.

  3. Primitives are stored in the \(\mathsf {String}\) type because \(\mathsf {String}\) is the most precise representation, one that loses no information. For instance, it retains the leading zeros of an integer such as 073, whereas storing 073 as an \(\mathsf {Integer}\) would lose the leading zero.

  4. For simplicity, we use \(^\sharp \) to annotate type-incorrect CSTs in which fields for layouts (and comments) and unimportant constructors such as Lit are omitted.

  5. The general type for disambiguation filters is \([t] \rightarrow [t]\), which allows comparison among a list of CSTs. However, since in this paper we only consider property filters defined in terms of predicates (on a single tree), it is sufficient to use the simplified type \(t \rightarrow \mathsf {Bool}\). See the section ‘Generalised Parsing, Disambiguation, and Filters’.

  6. This is not a very realistic filter, although it sufficiently demonstrates the use of filters and removes ambiguity in the simplest cases like 1 + 2 * 3. In general, the filter should be complete (Definition 9) so that ambiguity is fully removed from the grammar.

  7. The Yacc-style approach adopts the word precedence [21] while the filter-based approaches tend to use the word priority [24, 50]. We follow the traditions and use either word depending on the context.

  8. Although terminals such as ‘*’ and ‘+’ are uniquely determined by constructors and not explicitly included in the CSTs, there are fields in CSTs for holding the whitespace after them. Thus, Times still has three subtrees. Also, for simplicity, the bi-filter fTimesPlusPrio attempts to repair the whitespace subtree t2 even though the repair can never happen, since t2 cannot match p.

  9. https://www.cs.princeton.edu/~appel/modern/testcases/.

  10. Although they use different implementation techniques, we will not dive into them in our related work. See Matsuda and Wang’s related work for a comparison [34].

  11. An injective production, or a chain production, is one whose right-hand side is a single nonterminal; for instance, \(\texttt {E -> N}\).
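The relationship between the two filter types in Note 5, and the kind of priority filter discussed in Notes 6 and 8, can be sketched in Haskell as follows. This is only an illustration under simplifying assumptions: the Expr type is a hypothetical CST without layout fields, the names timesPlusPrio, liftFilter, and candidates are invented here, and the code covers only the predicate used in the parsing direction, not the repair behaviour of a bi-filter. The candidate list is written by hand and stands in for the output of a generalised parser.

```haskell
-- Hypothetical, simplified CST for the ambiguous grammar
--   E -> E '+' E | E '*' E | number
-- Literals are kept as String so that text such as "073" keeps its zero.
data Expr
  = Lit String
  | Plus Expr Expr
  | Times Expr Expr
  deriving (Eq, Show)

-- A property filter is a predicate on a single tree.
type Filter t = t -> Bool

isPlus :: Expr -> Bool
isPlus (Plus _ _) = True
isPlus _          = False

-- Priority filter: reject any tree in which a Plus node occurs as a direct
-- child of a Times node, so that '*' binds tighter than '+'.
timesPlusPrio :: Filter Expr
timesPlusPrio (Lit _)     = True
timesPlusPrio (Plus l r)  = timesPlusPrio l && timesPlusPrio r
timesPlusPrio (Times l r) =
  not (isPlus l) && not (isPlus r) && timesPlusPrio l && timesPlusPrio r

-- Lifting a predicate to the general filter type [t] -> [t].
liftFilter :: Filter t -> [t] -> [t]
liftFilter = filter

-- Hand-written stand-in for the two parse results of "1 + 2 * 3".
candidates :: [Expr]
candidates =
  [ Plus (Lit "1") (Times (Lit "2") (Lit "3"))
  , Times (Plus (Lit "1") (Lit "2")) (Lit "3")
  ]

main :: IO ()
main = print (liftFilter timesPlusPrio candidates)
-- prints: [Plus (Lit "1") (Times (Lit "2") (Lit "3"))]
```

As Note 6 points out, a single priority rule of this kind is not complete: an input such as 1 + 2 + 3 would still have two surviving trees, so a realistic filter set would also need associativity rules.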

References

  1. Aasa, A.: Precedences in specifications and implementations of programming languages. In: Selected Papers of the Symposium on Programming Language Implementation and Logic Programming, Elsevier Science Publishers B. V., Amsterdam, PLILP ’91, pp. 3–26 (1995). http://dl.acm.org/citation.cfm?id=203429.203431

  2. Afroozeh, A., Izmaylova, A.: Faster, practical GLL parsing. In: Franke, B. (ed.) Compiler Construction, pp. 89–108. Springer, Berlin (2015). https://doi.org/10.1007/978-3-662-46663-6_5


  3. Aho, A.V., Johnson, S.C., Ullman, J.D.: Deterministic parsing of ambiguous grammars. Commun. ACM 18(8), 441–452 (1975)


  4. Appel, A.W.: Modern Compiler Implementation in ML, 1st edn. Cambridge University Press, New York (1998)


  5. Bird, R.: Thinking Functionally with Haskell. Cambridge University Press, Cambridge (2014). https://doi.org/10.1017/CBO9781316092415


  6. Boulton, R.: Syn: a single language for specifying abstract syntax trees, lexical analysis, parsing and pretty-printing. Tech. Rep. Number 390, Computer Laboratory, University of Cambridge (1996)

  7. Brabrand, C., Møller, A., Schwartzbach, M.I.: Dual syntax for XML languages. Inf. Syst. 33(4–5), 385–406 (2008). https://doi.org/10.1016/j.is.2008.01.006


  8. van den Brand, M., Visser, E.: Generation of formatters for context-free languages. ACM Trans. Softw. Eng. Methodol. 5(1), 1–41 (1996). https://doi.org/10.1145/226155.226156


  9. van den Brand, M.G.J., Scheerder, J., Vinju, J.J., Visser, E.: Disambiguation filters for scannerless generalized LR parsers. In: Proceedings of the 11th International Conference on Compiler Construction, Springer, London, UK, CC ’02, pp. 143–158 (2002). https://doi.org/10.1007/3-540-45937-5_12

  10. Cantor, D.G.: On the ambiguity problem of Backus systems. J. ACM 9(4), 477–479 (1962)


  11. Czarnecki, K., Foster, J.N., Hu, Z., Lämmel, R., Schürr, A., Terwilliger, J.F.: Bidirectional transformations: a cross-discipline perspective. In: Proceedings of the 2nd International Conference on Theory and Practice of Model Transformations, Springer, Berlin, ICMT ’09, pp. 260–283 (2009). https://doi.org/10.1007/978-3-642-02408-5_19

  12. Dijkstra, E.W.: Guarded commands, nondeterminacy and formal derivation of programs. Commun. ACM 18(8), 453–457 (1975). https://doi.org/10.1145/360933.360975


  13. Duregård, J., Jansson, P.: Embedded parser generators. In: Proceedings of the 4th ACM Symposium on Haskell, ACM, New York, NY, USA, Haskell ’11, pp. 107–117 (2011). https://doi.org/10.1145/2034675.2034689

  14. Earley, J.: An efficient context-free parsing algorithm. Commun. ACM 13(2), 94–102 (1970). https://doi.org/10.1145/362007.362035


  15. Fischer, S., Hu, Z., Pacheco, H.: The essence of bidirectional programming. Sci. China Inf. Sci. 58(5), 1–21 (2015)


  16. Foster, J.N.: Bidirectional programming languages. PhD thesis, University of Pennsylvania (2009)

  17. Foster, J.N., Greenwald, M.B., Moore, J.T., Pierce, B.C., Schmitt, A.: Combinators for bidirectional tree transformations: a linguistic approach to the view-update problem. ACM Trans. Program. Lang. Syst. 29, 3 (2007). https://doi.org/10.1145/1232420.1232424


  18. Fowler, M., Beck, K.: Refactoring: Improving the Design of Existing Code. Addison-Wesley Professional, Boston (1999)


  19. Gibbons, J., Stevens, P.: International Summer School on Bidirectional Transformations (Oxford, UK, 25–29 July 2016). Lecture Notes in Computer Science, vol. 9715. Springer, Berlin (2018)


  20. Gosling, J., Joy, B., Steele, G.: The Java Language Specification, 3rd edn. (2006). https://docs.oracle.com/javase/specs/

  21. Hirzel, M., Rose, K.H.: Tiger language specification (2013). https://cs.nyu.edu/courses/fall13/CSCI-GA.2130-001/tiger-spec.pdf

  22. Hu, Z., Ko, H.S.: Principles and practice of bidirectional programming in BiGUL. In: Gibbons, J., Stevens, P. (eds.) Bidirectional Transformations: International Summer School, Oxford, UK, July 25–29, 2016, Tutorial Lectures, pp. 100–150. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-79108-1_4


  23. Johnson, S.C.: Yacc: Yet another compiler-compiler. AT&T Bell Laboratories Technical Reports (AT&T Bell Laboratories Murray Hill, New Jersey 07974). p. 32 (1975)

  24. Kernighan, B.W., Ritchie, D.M.: The C Programming Language. Prentice Hall Press, Upper Saddle River (1988)


  25. Kinoshita, D., Nakano, K.: Bidirectional certified programming. In: Eramo, R., Johnson, M. (eds.) Proceedings of the 6th International Workshop on Bidirectional Transformations Co-Located with The European Joint Conferences on Theory and Practice of Software (ETAPS 2017), CEUR Workshop Proceedings, Uppsala, Sweden, vol. 1827, pp. 31–38 (2017)

  26. Klint, P., Visser, E.: Using filters for the disambiguation of context-free grammars. In: Pighizzini, G., Pietro, P.S. (eds) Proceedings of the ASMICS Workshop on Parsing Theory, University of Milan, Italy, Milano, Italy, pp. 1–20 (1994)

  27. Ko, H.S., Hu, Z.: An axiomatic basis for bidirectional programming. Proc. ACM Program. Lang. 2(POPL), 41:1–41:29 (2018). https://doi.org/10.1145/3158129


  28. Ko, H.S., Zan, T., Hu, Z.: BiGUL: a formally verified core language for putback-based bidirectional programming. In: Proceedings of the 2016 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, ACM, New York, NY, USA, PEPM ’16, pp. 61–72 (2016). https://doi.org/10.1145/2847538.2847544

  29. LaLonde, W.R., des Rivieres, J.: Handling operator precedence in arithmetic expressions with tree transformations. ACM Trans. Program. Lang. Syst. 3(1), 83–103 (1981). https://doi.org/10.1145/357121.357127


  30. Lämmel, R., Jones, S.P.: Scrap your boilerplate: a practical design pattern for generic programming. In: Proceedings of the 2003 ACM SIGPLAN International Workshop on Types in Languages Design and Implementation, ACM, New York, NY, USA, TLDI ’03, pp. 26–37 (2003). https://doi.org/10.1145/604174.604179

  31. Lutterkort, D.: Augeas—a configuration API. In: Proceedings of the Ottawa Linux Symposium, Ottawa, Canada, pp. 47–56 (2008)

  32. Macedo, N., Pacheco, H., Cunha, A., Oliveira, J.N.: Composing least-change lenses. Proc. Sec. Int. Workshop Bidirect. Transform. 57, 1–19 (2013). https://doi.org/10.14279/tuj.eceasst.57.868


  33. Marlow, S., Gill, A.: The parser generator for Haskell (2001). https://www.haskell.org/happy/

  34. Marlow, S., et al.: Haskell 2010 language report (2010). https://www.haskell.org/onlinereport/haskell2010/

  35. Martins, P., Saraiva, J., Fernandes, J.P., Van Wyk, E.: Generating attribute grammar-based bidirectional transformations from rewrite rules. In: Proceedings of the ACM SIGPLAN 2014 Workshop on Partial Evaluation and Program Manipulation, ACM, New York, NY, USA, PEPM ’14, pp. 63–70 (2014). https://doi.org/10.1145/2543728.2543745

  36. Matsuda, K., Wang, M.: Embedding invertible languages with binders: a case of the FliPpr language. In: Proceedings of the 11th ACM SIGPLAN International Symposium on Haskell, ACM, New York, NY, USA, Haskell 2018, pp. 158–171 (2018a). https://doi.org/10.1145/3242744.3242758

  37. Matsuda, K., Wang, M.: FliPpr: a system for deriving parsers from pretty-printers. New Gener. Comput. 36(3), 173–202 (2018b). https://doi.org/10.1007/s00354-018-0033-7


  38. Matsuda, K., Mu, S.C., Hu, Z., Takeichi, M.: A grammar-based approach to invertible programs. In: Gordon, A.D. (ed) Proceedings of the 19th European Conference on Programming Languages and Systems, Springer, Berlin, no. 20 in ESOP’10, pp. 448–467 (2010). https://doi.org/10.1007/978-3-642-11957-6_24

  39. Norell, U.: Towards a practical programming language based on dependent type theory. PhD thesis, Chalmers University of Technology (2007)

  40. Pacheco, H., Hu, Z., Fischer, S.: Monadic combinators for “putback” style bidirectional programming. In: Proceedings of the ACM SIGPLAN 2014 Workshop on Partial Evaluation and Program Manipulation, ACM, New York, NY, USA, PEPM ’14, pp. 39–50 (2014a). https://doi.org/10.1145/2543728.2543737

  41. Pacheco, H., Zan, T., Hu, Z.: BiFluX: A bidirectional functional update language for XML. In: Proceedings of the 16th International Symposium on Principles and Practice of Declarative Programming, ACM, New York, NY, USA, PPDP ’14, pp. 147–158 (2014b). https://doi.org/10.1145/2643135.2643141

  42. Pombrio, J., Krishnamurthi, S.: Resugaring: lifting evaluation sequences through syntactic sugar. In: Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, ACM, New York, NY, USA, no. 6 in PLDI ’14, pp. 361–371 (2014). https://doi.org/10.1145/2594291.2594319

  43. Pombrio, J., Krishnamurthi, S.: Hygienic resugaring of compositional desugaring. In: Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming, ACM, New York, NY, USA, no. 13 in ICFP 2015, pp. 75–87 (2015). https://doi.org/10.1145/2784731.2784755

  44. Rendel, T., Ostermann, K.: Invertible syntax descriptions: unifying parsing and pretty printing. In: Proceedings of the Third ACM Haskell Symposium on Haskell, ACM, New York, NY, USA, Haskell ’10, pp. 1–12 (2010). https://doi.org/10.1145/1863523.1863525

  45. Reps, T., Teitelbaum, T.: The synthesizer generator. In: Proceedings of the First ACM SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software Development Environments, ACM, New York, NY, USA, SDE 1, pp. 42–48 (1984). https://doi.org/10.1145/800020.808247

  46. Reps, T., Teitelbaum, T., Demers, A.: Incremental context-dependent analysis for language-based editors. ACM Trans. Program. Lang. Syst. 5(3), 449–477 (1983). https://doi.org/10.1145/2166.357218


  47. Scott, E., Johnstone, A.: GLL parsing. Electron. Notes Theor. Comput. Sci. 253(7), 177–189 (2010). https://doi.org/10.1016/j.entcs.2010.08.041


  48. Scott, E., Johnstone, A., Economopoulos, R.: BRNGLR: a cubic Tomita-style GLR parsing algorithm. Acta Inform. 44(6), 427–461 (2007). https://doi.org/10.1007/s00236-007-0054-z


  49. Sheard, T., Jones, S.P.: Template meta-programming for Haskell. In: Proceedings of the 2002 ACM SIGPLAN Workshop on Haskell, ACM, New York, NY, USA, Haskell ’02, pp. 1–16 (2002). https://doi.org/10.1145/581690.581691

  50. Tomita, M.: An efficient context-free parsing algorithm for natural languages. In: Proceedings of the 9th International Joint Conference on Artificial Intelligence-Volume 2, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, IJCAI’85, pp. 756–764 (1985). http://dl.acm.org/citation.cfm?id=1623611.1623625

  51. Traver, V.J.: On compiler error messages: what they say and what they mean. Adv. Hum. Comput. Interact. 2010, 3:1–3:26 (2010). https://doi.org/10.1155/2010/602570


  52. Visser, E.: A case study in optimizing parsing schemata by disambiguation filters. International Workshop on Parsing Technology (IWPT 1997), pp. 210–224. Massachusetts Institute of Technology, Boston, USA (1997a)

  53. Visser, E.: Syntax definition for language prototyping. PhD thesis, University of Amsterdam (1997b)

  54. Younger, D.H.: Recognition and parsing of context-free languages in time \(n^3\). Inf. Control 10(2), 189–208 (1967)


  55. Zhu, Z., Zhang, Y., Ko, H.S., Martins, P., Saraiva, J., Hu, Z.: Parsing and reflective printing, bidirectionally. In: Proceedings of the 2016 ACM SIGPLAN International Conference on Software Language Engineering, ACM, New York, NY, USA, SLE 2016, pp. 2–14 (2016). https://doi.org/10.1145/2997364.2997369


Acknowledgements

We thank the reviewers and the editor for their selflessness and for the effort they spent reviewing our rather long paper. With their help, the readability of the paper is much improved, especially regarding how several case studies are structured, how theorems for the basic BiYacc and theorems for the extended version handling ambiguous grammars are related, and how look-alike notions are ‘disambiguated’. This work is partially supported by the Japan Society for the Promotion of Science (JSPS) Grant-in-Aid for Scientific Research (S) No. 17H06099; in particular, most of the second author’s contributions were made while he worked at the National Institute of Informatics and was funded by the grant.

Author information

Corresponding author

Correspondence to Zirun Zhu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (zip 17 KB)

Supplementary material 2 (zip 88 KB)

About this article

Cite this article

Zhu, Z., Ko, HS., Zhang, Y. et al. Unifying Parsing and Reflective Printing for Fully Disambiguated Grammars. New Gener. Comput. 38, 423–476 (2020). https://doi.org/10.1007/s00354-019-00082-y

  • DOI: https://doi.org/10.1007/s00354-019-00082-y
