Metro maps for efficient knowledge learning by summarizing massive electronic textbooks

  • Original Paper
International Journal on Document Analysis and Recognition (IJDAR)

Abstract

As the number of textbooks soars, learners can become lost among thousands of books. To provide a concise yet comprehensive picture for learning, we propose a novel framework, MM4Books, that automatically builds metro maps for efficient knowledge learning by summarizing massive collections of electronic textbooks. We represent each book in a digital library as a sequence of chapters and obtain learning objects by clustering semantically similar chapters with an unsupervised clustering method; these learning objects form a learning graph. We then build the metro map by applying an integer linear programming-based technique that selects, from the learning graph, a collection of learning paths that are highly informative and fluent but have low redundancy. To the best of our knowledge, this is the first work to address this task. Experiments show that the proposed approach outperforms the state-of-the-art baseline approaches, and we also implemented a practical MM4Books system to demonstrate that users can benefit from the proposed approach for knowledge learning.
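
The pipeline described in the abstract (chapters, then learning objects, then a learning graph, then selected learning paths) can be illustrated with a toy sketch. This is not the MM4Books implementation: the word-overlap similarity, the greedy clustering, the exhaustive path search (standing in for the integer linear program), and every function name and parameter below are assumptions chosen only to keep the example self-contained. The objective models coverage and redundancy only; fluency is approximated crudely by requiring paths to follow chapter orderings observed in the books.

```python
from itertools import combinations

def tokens(title):
    return set(title.lower().split())

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_chapters(books, threshold=0.5):
    """Greedy clustering (an assumption): a chapter joins the first cluster
    whose seed chapter is similar enough, otherwise it starts a new cluster.
    Each cluster plays the role of one learning object."""
    seeds, labels = [], []
    for book in books:
        book_labels = []
        for chapter in book:
            t = tokens(chapter)
            for cid, seed in enumerate(seeds):
                if jaccard(t, seed) >= threshold:
                    break
            else:  # no existing cluster matched
                cid = len(seeds)
                seeds.append(t)
            book_labels.append(cid)
        labels.append(book_labels)
    return labels, len(seeds)

def learning_graph(labels):
    """Directed edge u -> v when learning object v directly follows u in a
    book; the weight counts how many times that happens."""
    edges = {}
    for book_labels in labels:
        for u, v in zip(book_labels, book_labels[1:]):
            if u != v:
                edges[(u, v)] = edges.get((u, v), 0) + 1
    return edges

def simple_paths(edges, max_len=4):
    """Enumerate simple paths with 2..max_len nodes along graph edges."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
    paths = []
    def extend(path):
        if len(path) >= 2:
            paths.append(tuple(path))
        if len(path) == max_len:
            return
        for nxt in succ.get(path[-1], []):
            if nxt not in path:
                extend(path + [nxt])
    for start in succ:
        extend([start])
    return paths

def select_paths(paths, k=2, redundancy_penalty=1.0):
    """Exhaustive stand-in for the ILP: pick the k paths that maximise
    (distinct nodes covered) - penalty * (repeated node visits)."""
    def score(subset):
        covered = set().union(*map(set, subset))
        total = sum(len(p) for p in subset)
        return len(covered) - redundancy_penalty * (total - len(covered))
    return max(combinations(paths, k), key=score)

# Hypothetical chapter titles, used only for illustration.
books = [
    ["linear regression", "logistic regression", "neural networks"],
    ["linear regression", "decision trees", "neural networks"],
]
labels, n_objects = cluster_chapters(books)
graph = learning_graph(labels)
metro_lines = select_paths(simple_paths(graph), k=2)
print(labels, n_objects, metro_lines)
```

Exhaustive search is only feasible on tiny graphs; at library scale an ILP solver would replace `select_paths`, with binary variables for path selection and constraints on the number of lines.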

Notes

  1. http://books.google.com.

  2. http://en.wikipedia.org/wiki/Million_Book_Project.

  3. http://lucene.apache.org.

  4. http://www-01.ibm.com/software/commerce/optimization/cplexoptimizer/.

  5. http://www.ckcest.zju.edu.cn/kv.

  6. http://www.cadal.zju.edu.cn.

Acknowledgements

This work is supported by the Zhejiang Provincial Natural Science Foundation of China (No. LY17F020015), the Chinese Knowledge Center of Engineering Science and Technology (CKCEST), the Fundamental Research Funds for the Central Universities (No. 2017FZA5016), and MOE-Engineering Research Center of Digital Library.

Author information

Corresponding author

Correspondence to Weiming Lu.

About this article

Cite this article

Lu, W., Ma, P., Yu, J. et al. Metro maps for efficient knowledge learning by summarizing massive electronic textbooks. IJDAR 22, 99–111 (2019). https://doi.org/10.1007/s10032-019-00319-y

  • DOI: https://doi.org/10.1007/s10032-019-00319-y
