Data wrangling practices and collaborative interactions with aggregated data

International Journal of Computer-Supported Collaborative Learning

Abstract

Data visualization technologies are powerful tools for telling evidence-based narratives about oneself and the world. This paper contributes to the literature on data science education by examining the sociotechnical practices of data wrangling: strategies for selecting and managing large, aggregated datasets to produce a model and story. We examined the learning opportunities related to data wrangling practices by investigating youth’s talk-in-interaction while they assembled models and stories about family migration using interactive data visualization tools and large socioeconomic datasets. We first identified ten sociotechnical practices that characterize youth’s interaction with tools and collaboration in data wrangling. We then proposed four categories of activities to describe patterns of learning related to these practices: addressing missing data, understanding data aggregation, exploring the social or historical events that underlie the formation of data patterns, and varying the visual encoding of data for storytelling. Understanding these practices and activities is important for supporting future data science education opportunities that facilitate learning and discussion about scientific and socioeconomic issues. This study also sheds light on how the family migration modeling context positions youth as having agency and authority over the data, and it contributes to the design of CSCL environments that tackle the challenges of data wrangling.

Notes

  1. All participant names are pseudonyms.

  2. Transcript conventions: CAPITALS indicate emphasis; (Observer notes) indicate significant gestures; ? indicates rising intonation; ! indicates exclamation; [ indicates overlapping talk; , or . indicates a pause shorter than half a second; … indicates a pause longer than half a second.

Acknowledgements

This work was supported by the National Science Foundation under grant number 1341882. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Author information

Corresponding author

Correspondence to Shiyan Jiang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jiang, S., Kahn, J. Data wrangling practices and collaborative interactions with aggregated data. Intern. J. Comput.-Support. Collab. Learn 15, 257–281 (2020). https://doi.org/10.1007/s11412-020-09327-1

  • DOI: https://doi.org/10.1007/s11412-020-09327-1
