Consistent updating of databases with marked nulls

  • Regular Paper
  • Knowledge and Information Systems

Abstract

This paper revisits the problem of consistency maintenance when insertions or deletions are performed on a valid database containing marked nulls. This problem has resurfaced in real-world linked data and RDF databases, when blank nodes are associated with null values. The paper proposes solutions to the main problems one faces when dealing with updates and constraints, namely update determinism, minimal change and leanness of an RDF graph instance. The update semantics is formally introduced, and the notion of core is used to keep the database as small as possible (i.e., to ensure the leanness of the RDF graph). Our algorithms allow the use of constraints such as tuple-generating dependencies, offering a way of solving many practical problems.


Notes

  1. Informally speaking, a sub-formula \(\gamma \) of a FOL formula \(\alpha \) is a string occurring in \(\alpha \) which is itself a FOL formula.

  2. Recall that our forward operator (Definition 4.1) does not compute atoms which are already in the database instance.

Acknowledgements

We thank our colleagues Carmem Hara, Nádia P. Kozievitch and Flávio Uber, who kindly shared the use of RDF-URBS data, and our student Julien Revaud for his work on the current implementation. We also wish to thank the reviewers for their remarks and suggestions, which led to important improvements of a preliminary version of our work.

Author information

Corresponding author

Correspondence to Mirian Halfeld-Ferrari.

Appendices

Proof of Proposition 6.1

Proposition 6.1

Let \(\Delta = ({\mathfrak {D}},{\mathbb {C}})\) be a database and \({\textsf {iRequest}} \) a finite set of facts. If Algorithm 1 returns \(\Delta '=({\mathfrak {D}}', {\mathbb {C}})\) where \({\mathfrak {D}}' \ne {\mathfrak {D}}\), then \({\mathfrak {D}}'\) satisfies the following statements:

  1. Effectiveness: (a) for every \(\varphi \) in \({\textsf {iRequest}} \), \({\mathfrak {D}}'\) contains an instance of \(\varphi \), (b) \(R\_core({\mathfrak {D}}')={\mathfrak {D}}'\) and (c) \({\mathfrak {D}}' \models {\mathbb {C}}\).

  2. Monotonicity: \({\mathfrak {D}}\sqsubseteq {\mathfrak {D}}'\).

  3. Minimal change: for every \(\varphi \) in \({\mathfrak {D}}'\) not in \({\mathfrak {D}}\) and not isomorphic to an atom in \({\textsf {iRequest}} \), \({\mathfrak {D}}' {\setminus } \{\varphi \} \not \models {\mathbb {C}}\).
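Condition 1(c) relies on checking that an instance satisfies a constraint of the form \(B(\mathbf{X },\mathbf{Y }) \Rightarrow L(\mathbf{X },\mathbf{Z })\). The following is a minimal sketch of such a check, not the authors' implementation. It assumes that atoms are encoded as tuples of strings, that constraint variables start with `?`, and that marked nulls are ordinary terms written `_:N1`; the names `match`, `matches` and `satisfies`, as well as the example predicates, are hypothetical.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Atom = Tuple[str, ...]   # (predicate, term_1, ..., term_n)
Subst = Dict[str, str]   # constraint variable -> term of the instance


def is_var(term: str) -> bool:
    # Assumed convention: constraint variables start with "?".
    return term.startswith("?")


def match(pattern: Atom, fact: Atom, subst: Subst) -> Optional[Subst]:
    """Extend subst so that pattern is mapped onto fact, or return None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    out = dict(subst)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if out.setdefault(p, f) != f:   # clash with an earlier binding
                return None
        elif p != f:                        # constants must match exactly
            return None
    return out


def matches(patterns: List[Atom], db: Set[Atom], subst: Subst) -> Iterator[Subst]:
    """Enumerate the substitutions extending subst that map all patterns into db."""
    if not patterns:
        yield subst
        return
    for fact in db:
        extended = match(patterns[0], fact, subst)
        if extended is not None:
            yield from matches(patterns[1:], db, extended)


def satisfies(db: Set[Atom], body: List[Atom], head: Atom) -> bool:
    """True if every match of the body can be extended to a match of the head."""
    for h in matches(body, db, {}):
        if next(matches([head], db, h), None) is None:
            return False   # h(body) is in db, but no atom of the form h'(head) is
    return True


# Hypothetical constraint: Teaches(x, y) => Dept(x, z), with z existential.
body = [("Teaches", "?x", "?y")]
head = ("Dept", "?x", "?z")
db_ok = {("Teaches", "ann", "db101"), ("Dept", "ann", "_:N1")}
db_bad = {("Teaches", "ann", "db101")}
print(satisfies(db_ok, body, head), satisfies(db_bad, body, head))  # True False
```

In this sketch a null in the head position is enough to satisfy the constraint, which is the situation the proof below describes as the instance containing an atom of the form \(L(\alpha , \_)\).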

Proof

1(a) Let \(\varphi \) be in \({\textsf {iRequest}} \). Then, \(\varphi \) is in \(T^*_\textsf {fw}({\mathfrak {D}}\cup {\textsf {iRequest}})={\mathfrak {D}}_1\) (line 3 of Algorithm 1). Thus, if \(\varphi \) is not in \({\mathfrak {D}}'\), then \(\varphi \) has been instantiated during the computation of the restricted core on line 4 of Algorithm 1. Therefore, in this case, \(R\_core({\mathfrak {D}}_1)={\mathfrak {D}}'\) contains an instance of \(\varphi \).

1(b) The fact that \(R\_core({\mathfrak {D}}')={\mathfrak {D}}'\) is an obvious consequence of Algorithm 1.

1(c) If \({\mathfrak {D}}_1\) contains no null N such that \(\delta (N) \ge \delta _{max}\), then \({\mathfrak {D}}' \models {\mathbb {C}}\) holds because, by Proposition 4.1, \({\mathfrak {D}}_1\) satisfies \({\mathbb {C}}\) and thus, by Lemma 3.1, so does its core, which is equal to its restricted core by Lemma 4.2(1).

Now, assume that \({\mathfrak {D}}_1\) contains at least one null N with \(\delta (N) \ge \delta _{max}\) and that \({\mathfrak {D}}' \not \models {\mathbb {C}}\). This means that there exists \(c:B(\mathbf{X },\mathbf{Y }) \Rightarrow L(\mathbf{X },\mathbf{Z })\) in \({\mathbb {C}}\) such that \({\mathfrak {D}}' \not \models c\). Thus, there exists a homomorphism h such that \(h(body(c)) \subseteq {\mathfrak {D}}'\) and, for any extension \(h'\) of h to the variables occurring in \(\mathbf{Z } \), \(h'(head(c))\) is not in \({\mathfrak {D}}'\). Denoting h(body(c)) by \(B(\alpha , \beta )\), this means that \(B(\alpha , \beta ) \subseteq {\mathfrak {D}}'\) and that \({\mathfrak {D}}'\) contains no atom of the form \(L(\alpha , \_)\). Since \({\mathfrak {D}}'\subseteq {\mathfrak {D}}_1\), we have \(h(body(c)) \subseteq {\mathfrak {D}}_1\). Since \(\delta (N) < \delta _{max}\) for every null N in h(body(c)), \({\mathfrak {D}}_1\) contains an atom of the form \(L(\alpha , \nu )\) where \(\nu \) is made of nulls whose degree is \(\delta _{max}\). Therefore, \(L(\alpha , \nu )\) has been removed during the core processing, meaning that \({\mathfrak {D}}_1\) also contains an instance \(L(\alpha ', \gamma )\) of \(L(\alpha , \nu )\). Denoting now by h the homomorphism computing the restricted core \({\mathfrak {D}}'\), \({\mathfrak {D}}_1\) contains \(h(L(\alpha , \nu ))=L(\alpha ', \gamma )\) along with all atoms in \(B(\alpha , \beta )\) or in \(h(B(\alpha , \beta ))\). The case \(h(\alpha ) \ne \alpha \) is not possible because this implies that \(B(\alpha ,\beta ) \subseteq {\mathfrak {D}}'\) does not hold. Hence, \(h(\alpha ) = \alpha \), and so, \(h(L(\alpha ,\nu ))=L(\alpha , \gamma )\) is in \({\mathfrak {D}}'\), a contradiction to the fact that \({\mathfrak {D}}'\) contains no atom of the form \(L(\alpha , \_)\). Therefore, this part of the proof is complete.

2. In Algorithm 1, on line 3, we have \({\mathfrak {D}}\subseteq T^*_\textsf {fw}({\mathfrak {D}}\cup {\textsf {iRequest}})\), thus \({\mathfrak {D}}\sqsubseteq T^*_\textsf {fw}({\mathfrak {D}}\cup {\textsf {iRequest}})\) holds. As noticed earlier, \( T^*_\textsf {fw}({\mathfrak {D}}\cup {\textsf {iRequest}}) \sqsubseteq R\_core(T^*_\textsf {fw}({\mathfrak {D}}\cup {\textsf {iRequest}}))\), which implies by transitivity that on line 4, \({\mathfrak {D}}\sqsubseteq {\mathfrak {D}}'\) holds, since \({\mathfrak {D}}'=R\_core({\mathfrak {D}}_1)\) with \({\mathfrak {D}}_1=T^*_\textsf {fw}({\mathfrak {D}}\cup {\textsf {iRequest}})\). Therefore, the proof of this part is complete.
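The inclusion \({\mathfrak {D}}\subseteq T^*_\textsf {fw}({\mathfrak {D}}\cup {\textsf {iRequest}})\) reflects the fact that the forward operator only adds atoms, and, as recalled in Note 2, never adds an atom whose head is already matched in the instance. The sketch below illustrates this behaviour under strong simplifying assumptions: it is a generic chase-like fixpoint, not Definition 4.1 itself; the degree bound \(\delta _{max}\) is not modelled and is replaced by a hypothetical `max_rounds` parameter; fresh nulls are named `_:F1`, `_:F2`, ... and the input is assumed not to use that prefix.

```python
import itertools
from typing import Dict, Iterator, List, Optional, Set, Tuple

Atom = Tuple[str, ...]                 # (predicate, term_1, ..., term_n)
Subst = Dict[str, str]                 # constraint variable -> term
Constraint = Tuple[List[Atom], Atom]   # (body atoms, single head atom)


def match(pattern: Atom, fact: Atom, subst: Subst) -> Optional[Subst]:
    """Extend subst so that pattern is mapped onto fact, or return None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    out = dict(subst)
    for p, f in zip(pattern[1:], fact[1:]):
        if p.startswith("?"):               # "?x" denotes a constraint variable
            if out.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return out


def matches(patterns: List[Atom], db: Set[Atom], subst: Subst) -> Iterator[Subst]:
    """Enumerate the substitutions extending subst that map all patterns into db."""
    if not patterns:
        yield subst
        return
    for fact in db:
        extended = match(patterns[0], fact, subst)
        if extended is not None:
            yield from matches(patterns[1:], db, extended)


def forward_closure(db: Set[Atom], constraints: List[Constraint],
                    max_rounds: int = 10) -> Set[Atom]:
    """Apply the constraints until nothing new is produced or max_rounds is hit."""
    result = set(db)
    fresh = itertools.count(1)
    for _ in range(max_rounds):
        new: Set[Atom] = set()
        for body, head in constraints:
            for h in matches(body, result, {}):
                # Skip the step when an atom of the form h'(head) already exists.
                if next(matches([head], result | new, h), None) is not None:
                    continue
                full = dict(h)
                for t in head[1:]:
                    if t.startswith("?") and t not in full:
                        full[t] = f"_:F{next(fresh)}"   # fresh marked null
                new.add((head[0],) + tuple(full.get(t, t) for t in head[1:]))
        if not new:
            break
        result |= new
    return result


# Hypothetical constraint: Teaches(x, y) => Dept(x, z), with z existential.
constraints = [([("Teaches", "?x", "?y")], ("Dept", "?x", "?z"))]
closed = forward_closure({("Teaches", "ann", "db101")}, constraints)
print(sorted(closed))
```

Because the result always contains the input, the first step of the monotonicity argument holds by construction in this sketch.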

3. Let \(\varphi \) be in \({\mathfrak {D}}'\), not in \({\mathfrak {D}}\) and not isomorphic to an atom in \({\textsf {iRequest}} \), and assume that \({\mathfrak {D}}' {\setminus } \{\varphi \} \models {\mathbb {C}}\). Then, \(\varphi \) has been generated during the computation of \({\mathfrak {D}}_1=T^*_\textsf {fw}({\mathfrak {D}}\cup {\textsf {iRequest}})\). Hence, there exists \(c:B(\mathbf{X },\mathbf{Y }) \Rightarrow L(\mathbf{X },\mathbf{Z })\) in \({\mathbb {C}}\) and a homomorphism h such that \(h(c) \subseteq {\mathfrak {D}}_1\) and \(h(head(c))=\varphi \).

Since \({\mathfrak {D}}' {\setminus } \{\varphi \} \models c\), c cannot be full, meaning that c has existentially quantified variables and that \({\mathfrak {D}}' {\setminus } \{\varphi \}\) contains an atom of the form \(L(\alpha , \gamma )\) where \(\alpha =h(\mathbf{X })\) and \(\gamma \ne h(\mathbf{Z })\). Let p be the least integer such that \(h(body(c)) \subseteq T^p_\textsf {fw}({\mathfrak {D}}\cup {\textsf {iRequest}})\) and \(\varphi \not \in T^p_\textsf {fw}({\mathfrak {D}}\cup {\textsf {iRequest}})\). Since \(\varphi \) is in \({\mathfrak {D}}'\), for every null N in \(\varphi \), \(\delta (N) < \delta _{\max }\). Thus, when computing \(T^{p+1}_\textsf {fw}({\mathfrak {D}}\cup {\textsf {iRequest}})\), all nulls occurring in the instantiated body of the corresponding constraint have a degree strictly less than \(\delta _{\max }-1\), meaning that \(\varphi \) is generated as the atom \(L(\alpha , \mathbf{N })\) where \(\mathbf{N } \) is a vector of fresh nulls whose degrees are strictly less than \(\delta _{\max }\). As a consequence, by definition of \(T_\textsf {fw}\), \(L(\alpha , \gamma )\) is not in \(T^p_\textsf {fw}({\mathfrak {D}}\cup {\textsf {iRequest}})\) (otherwise \(L(\alpha , \mathbf{N })\) would not be generated), meaning that \(L(\alpha , \gamma )\) is generated at a subsequent computation step q (i.e., \(q>p+1\)).

Now, let \(h_0\) be the homomorphism defined by \(h_0(\mathbf{N })=\gamma \) and \(h_0(N)=N\) for every N not occurring in \(\mathbf{N } \). We first notice that \(h_0\) is restricted because, since \(L(\alpha , \gamma )\) is in \({\mathfrak {D}}'\) (this is so because \(L(\alpha , \gamma ) \in {\mathfrak {D}}'{\setminus } \{\varphi \}\)), all nulls in \(\gamma \) have a degree strictly less than \(\delta _{\max }\). Denoting by \(Atoms(\mathbf{N })\) the set of atoms in \({\mathfrak {D}}_1\) in which at least one null in \(\mathbf{N } \) occurs, we show that for every A in \(Atoms(\mathbf{N })\), \(h_0(A)\) is also in \({\mathfrak {D}}_1\). Consider now the step m where the first atom A in \(Atoms(\mathbf{N })\) different from \(\varphi \) appears in \(T^m_\textsf {fw}({\mathfrak {D}}\cup {\textsf {iRequest}})\). During the computation, a constraint \(c_A\) is instantiated into \(h_A(c_A)\) so as to contain \(\varphi \) in its body (the only atom so far in which \(\mathbf{N } \) occurs), in order to generate \(A=h_A(head(c_A))\). Hence, as soon as \(h_0(\varphi )\) appears in \(T^q_\textsf {fw}({\mathfrak {D}}\cup {\textsf {iRequest}})\), \(h_0(h_A(c_A))\) applies and generates \(h_0(A)\), showing that \(h_0(A)\) is in \({\mathfrak {D}}_1\). Since this reasoning applies step by step for every A in \(Atoms(\mathbf{N })\), this implies that for every A in \(Atoms(\mathbf{N })\), \(h_0(A)\) is in \({\mathfrak {D}}_1\).

Hence, in the computation of \(R\_core({\mathfrak {D}}_1)\), \(\mathbf{N } \) can be instantiated into \(\gamma \) using \(h_0\) (or one of its extensions), which implies that \(\varphi \) is not in \({\mathfrak {D}}'\), a contradiction to our hypothesis that \(\varphi \) is in \({\mathfrak {D}}'\). Therefore, the proof is complete. \(\square \)
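The last step of this argument, where \(\mathbf{N } \) is instantiated into \(\gamma \) and \(\varphi \) disappears from the core, can be illustrated with a naive core computation. This is only a sketch under stated assumptions: it computes the plain core by brute force and ignores the degree restriction that distinguishes \(R\_core\); nulls are assumed to be written `_:N1`, and the names `extend`, `homs`, `apply_hom` and `core` are hypothetical.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Atom = Tuple[str, ...]   # (predicate, term_1, ..., term_n)
Hom = Dict[str, str]     # null -> term (constants are left unchanged)


def is_null(term: str) -> bool:
    # Assumed convention: marked nulls start with "_:".
    return term.startswith("_:")


def extend(atom: Atom, fact: Atom, h: Hom) -> Optional[Hom]:
    """Extend h so that atom is sent onto fact; constants are kept fixed."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    out = dict(h)
    for a, f in zip(atom[1:], fact[1:]):
        if is_null(a):
            if out.setdefault(a, f) != f:
                return None
        elif a != f:
            return None
    return out


def homs(atoms: List[Atom], target: Set[Atom], h: Hom) -> Iterator[Hom]:
    """Enumerate the homomorphisms extending h that send all atoms into target."""
    if not atoms:
        yield h
        return
    for fact in target:
        h2 = extend(atoms[0], fact, h)
        if h2 is not None:
            yield from homs(atoms[1:], target, h2)


def apply_hom(h: Hom, atom: Atom) -> Atom:
    return (atom[0],) + tuple(h.get(t, t) for t in atom[1:])


def core(db: Set[Atom]) -> Set[Atom]:
    """Shrink db while some homomorphism from db into itself has a smaller image."""
    current = set(db)
    while True:
        atoms = sorted(current)
        for h in homs(atoms, current, {}):
            image = {apply_hom(h, a) for a in atoms}
            if len(image) < len(current):   # image is a proper subset of current
                current = image
                break
        else:
            return current                  # no shrinking homomorphism is left


# Dept(ann, _:N1) plays the role of L(alpha, N); Dept(ann, cs) that of L(alpha, gamma).
d1 = {("Teaches", "ann", "db101"), ("Dept", "ann", "_:N1"), ("Dept", "ann", "cs")}
print(sorted(core(d1)))
```

The null can be folded only if the image of every atom in which it occurs is already present, which corresponds to the requirement above that \(h_0(A)\) be in \({\mathfrak {D}}_1\) for every A in \(Atoms(\mathbf{N })\).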

Proof of Proposition 6.2

Proposition 6.2

Let \(\Delta = ({\mathfrak {D}},{\mathbb {C}})\) be a database and \({\textsf {dRequest}} \) a finite set of instantiated atoms where every null occurs at most once. Algorithm 2 returns \(\Delta '=({\mathfrak {D}}', {\mathbb {C}})\) where \({\mathfrak {D}}'\) satisfies the following statements:

  1. Effectiveness: (a) for every \(\varphi \) in \({\textsf {dRequest}} \), \({\mathfrak {D}}'\) contains no atom isomorphic to \(\varphi \), (b) \(R\_core({\mathfrak {D}}')={\mathfrak {D}}'\), and (c) \({\mathfrak {D}}' \models {\mathbb {C}}\).

  2. Monotonicity: \({\mathfrak {D}}' \sqsubseteq {\mathfrak {D}}\).

Proof

1(a) Let \(\varphi \) be in \({\textsf {dRequest}} \) and suppose that, when running Algorithm 2, an atom \(\varphi _1\) isomorphic to \(\varphi \) belongs to \({\mathfrak {D}}'\). If \(\varphi _1\) is in ToDel on line 1, then for \(\varphi _1\) to belong to \({\mathfrak {D}}'\), it must be that on line 5, \(\varphi _1\) is in \({\mathfrak {D}}_1\) and thus that, on line 7, \(\varphi _1\) is in Back. If \(\varphi _1\) is not in ToDel, then \(\varphi _1\) is not in \({\mathfrak {D}}\), showing that this atom appears in \({\mathfrak {D}}_1\) on line 4, the only step in Algorithm 2 where atoms are added to the instance. In this case, we again have that \(\varphi _1\) is in Back. But then, on line 11, \(\varphi _1\) belongs to \(T^*_\textsf {bw}(Disallowed \cup Back, {\mathfrak {D}}_1)\), showing that \(\varphi _1\) cannot belong to \({\mathfrak {D}}_2\). Since \({\mathfrak {D}}'\) is a subset of \({\mathfrak {D}}_2\), it is not possible that \(\varphi _1\) belongs to \({\mathfrak {D}}'\), a contradiction that completes this part of the proof.

1(b) \(R\_core({\mathfrak {D}}')={\mathfrak {D}}'\) is an obvious consequence of the statements on lines 5 and 12 of Algorithm 2.

1(c) If \({\mathfrak {D}}'\) is output by Algorithm 2 on line 9, then Disallowed is empty, meaning that every null N in \({\mathfrak {D}}_1\) is such that \(\delta (N) < \delta _{max}\). Thus, by Proposition 4.1, we have \(T^*_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel)\models {\mathbb {C}}\), implying that, by Lemma 3.1, \({\mathfrak {D}}' \models {\mathbb {C}}\). Otherwise, \({\mathfrak {D}}'\) is output on line 13, in which case Proposition 4.2 shows that \({\mathfrak {D}}_2 \models {\mathbb {C}}\), and thus, by Lemma 3.1, we have \({\mathfrak {D}}' \models {\mathbb {C}}\), which completes this part of the proof.

2. To prove that \({\mathfrak {D}}' \sqsubseteq {\mathfrak {D}}\), we first show that for \({\mathfrak {D}}_1=R\_core(T^*_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel))\) as computed on line 5 of Algorithm 2, we have \({\mathfrak {D}}_1 \sqsubseteq {\mathfrak {D}}\). To this end, let us first prove by induction that \(T^*_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel) \sqsubseteq {\mathfrak {D}}\). The base case of the induction, that is, \({\mathfrak {D}}{\setminus } ToDel \sqsubseteq {\mathfrak {D}}\), holds because \({\mathfrak {D}}{\setminus } ToDel \subseteq {\mathfrak {D}}\). Then, for a given integer p, assuming that \(T^k_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel) \sqsubseteq {\mathfrak {D}}\) for every \(0 \le k \le p\), we show that \(T^{p+1}_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel) \sqsubseteq {\mathfrak {D}}\). To see this, let \(h^p\) be the homomorphism such that \(h^p(T^p_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel)) \subseteq {\mathfrak {D}}\) and consider the two cases as follows:

  • Assume first that \(T^{p+1}_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel)= T^{p}_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel)\). This means that the least fixed point has been reached, in which case \(h^{p+1}\) is equal to \(h^p\).

  • When \(T^{p+1}_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel)\ne T^{p}_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel)\), for every null N in \(T^{p}_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel)\) let \(h^{p+1}(N)= h^p(N)\). Then, considering a null N in \(T^{p+1}_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel)\) but not in \(T^{p}_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel)\) implies that \(\delta (N)\le \delta _{\max }\) and that there exists c in \({\mathbb {C}}\) and a homomorphism h such that \(h(body(c)) \subseteq T^{p}_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel)\) and N occurs in \(\varphi =h(head(c))\). In this case, we have \(h^p(h(body(c))) \subseteq {\mathfrak {D}}\) because, by our induction hypothesis, \(h^p(T^{p}_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel))\subseteq {\mathfrak {D}}\). Since \({\mathfrak {D}}\models {\mathbb {C}}\), c applies in \({\mathfrak {D}}\), meaning that \({\mathfrak {D}}\) contains an atom \(A=v(\varphi )\) where v is defined over the nulls \(\nu \) in \(\varphi \) by \(v(\nu )=t\) where t is the corresponding term in A. Notice that v is well defined because any null \(\nu \) in \(\varphi \) either occurs in \(T^{p}_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel)\), in which case \(v(\nu )=\nu \), or \(\nu \) is a fresh null such as N, in which case \(v(\nu )\) is uniquely defined. In this case, let \(h^{p+1}(N)\) be v(N). We thus obtain a homomorphism \(h^{p+1}\) such that \(h^{p+1}(T^{p+1}_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel)) \subseteq {\mathfrak {D}}\), showing that \(T^{p+1}_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel) \sqsubseteq {\mathfrak {D}}\).

Therefore, \(T^*_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel) \sqsubseteq {\mathfrak {D}}\) holds. Moreover, since \({\mathfrak {D}}_1=R\_core(T^*_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel))\), \({\mathfrak {D}}_1\sqsubseteq T^*_\textsf {fw}({\mathfrak {D}}{\setminus } ToDel)\) holds, implying \({\mathfrak {D}}_1 \sqsubseteq {\mathfrak {D}}\). Thus, if the test on line 9 succeeds, we trivially have \({\mathfrak {D}}' \sqsubseteq {\mathfrak {D}}\). Otherwise, since \({\mathfrak {D}}' \subseteq {\mathfrak {D}}_2 \subseteq {\mathfrak {D}}_1\), \({\mathfrak {D}}' \sqsubseteq {\mathfrak {D}}\) holds as well, and the proof is complete. \(\square \)
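The induction above establishes each \(\sqsubseteq \) step by exhibiting a homomorphism whose image is included in \({\mathfrak {D}}\). Reading \({\mathfrak {D}}_a \sqsubseteq {\mathfrak {D}}_b\) in that way, as the existence of a homomorphism h with \(h({\mathfrak {D}}_a) \subseteq {\mathfrak {D}}_b\), the relation can be tested on small instances with a backtracking search. The sketch below is an illustration under that reading; the encoding of nulls as `_:N1` and the names `extend` and `below` are assumptions, not part of the paper.

```python
from typing import Dict, List, Optional, Set, Tuple

Atom = Tuple[str, ...]   # (predicate, term_1, ..., term_n)
Hom = Dict[str, str]     # null -> term (constants are left unchanged)


def is_null(term: str) -> bool:
    # Assumed convention: marked nulls start with "_:".
    return term.startswith("_:")


def extend(atom: Atom, fact: Atom, h: Hom) -> Optional[Hom]:
    """Extend h so that atom is sent onto fact; constants are kept fixed."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    out = dict(h)
    for a, f in zip(atom[1:], fact[1:]):
        if is_null(a):
            if out.setdefault(a, f) != f:   # a null must have a single image
                return None
        elif a != f:
            return None
    return out


def below(d_a: Set[Atom], d_b: Set[Atom]) -> bool:
    """True if some homomorphism h satisfies h(d_a) included in d_b."""

    def search(atoms: List[Atom], h: Hom) -> bool:
        if not atoms:
            return True
        for fact in d_b:
            h2 = extend(atoms[0], fact, h)
            if h2 is not None and search(atoms[1:], h2):
                return True
        return False

    return search(sorted(d_a), {})


d = {("Teaches", "ann", "db101"), ("Dept", "ann", "cs")}
print(below({("Dept", "ann", "_:N1")}, d))          # True: _:N1 is mapped to cs
print(below(d, {("Dept", "ann", "_:N1")}))          # False: constants cannot move
```

In particular, any subset of an instance is below it through the identity, which is the base case \({\mathfrak {D}}{\setminus } ToDel \sqsubseteq {\mathfrak {D}}\) of the induction.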

Cite this article

Chabin, J., Halfeld-Ferrari, M. & Laurent, D. Consistent updating of databases with marked nulls. Knowl Inf Syst 62, 1571–1609 (2020). https://doi.org/10.1007/s10115-019-01402-w
