1 Introduction

The rapid development of information and communication technologies has facilitated the implementation of many digitally based applications. One of these is the chatbot, which has been used in various areas, including marketing, customer service, technical support, education, and training (Smutny & Schreiberova, 2020). The chatbot is natural language processing (NLP) software that uses artificial intelligence. Artificial intelligence refers to systems or machines that perform tasks by imitating human intelligence and that can iteratively improve themselves according to the information they collect (Oracle Turkey, 2014a). Artificial intelligence has sub-branches such as symbolic artificial intelligence, artificial neural networks, natural language processing (language thinking), speech synthesis (artificial speech), speech understanding (speech analysis), expert systems, pattern recognition and genetic algorithms; chatbots and smart assistants are examples of dialogue-based artificial intelligence used in daily life (Reznik, 2009). Chatbots use artificial intelligence to understand questions faster and provide efficient answers, and smart assistants use it to extract critical information from large user-defined data sets to improve timing (Oracle Turkey, 2014a).

Chatbots are computer programs that simulate human conversation. Those that use artificial intelligence make sense of free text entered by the user with natural language processing technology, determine the correct answer and present it to the user (Vikipedi, 2021). Chatbot applications, which are similar to instant messaging, interact with the user on a specific topic or in a specific area through natural conversation, using text and/or voice. Chatbots are attractive tools because they are effective digital assistants that can provide information to their users, answer questions, discuss a specific topic or perform a given task. Digital assistants can learn a user’s preferences over time, make suggestions and even anticipate their needs (Oracle Turkey, 2014a). Today, personal digital assistants such as Apple’s Siri, Amazon’s Alexa, Microsoft’s Cortana, and Google Assistant are among the most well-known voice recognition and artificial intelligence technologies (Smutny & Schreiberova, 2020). There are speech-based and text-based versions of the chatbot application. Text-based chatbots usually follow a set of built-in rules or flows that enable them to answer their users’ questions (Budiu, 2018).

The number of students per teacher is continually increasing in schools, which reduces the time teachers can devote to each student. With chatbots that supplement the teacher, students who cannot ask their teacher questions can find answers at any time of the day. As students interact with the chatbot, they can control their own learning, in line with constructivist learning theory, and progress at their own pace (Demirci & Yavuz, 2009; Gültekin et al., 2007). A chatbot also provides a collaborative environment where students can communicate with each other and ask questions (Singh, 2018).

According to Winkler and Söllner (2018), chatbots have the potential to compensate for inadequate individual support from instructors in several ways, which include enabling students to progress at their own pace, increasing the quality of the learning process and providing individual solutions that can be applied proactively to increase learning outcomes. Chatbots that support students individually are spreading rapidly in the field of education (Arruda et al., 2019; Chen et al., 2020; Haller et al., 2005; Nghi et al., 2019). Moreover, Grudin and Jacques (2019) state that chatbots are an appropriate way for students to review their knowledge of previous topics and can be used to collect feedback about a course. In addition, chatbots have the potential to encourage questions from students who may be anxious or feel inhibited about participating in a regular class session (Verleger & Pembridge, 2018).

During the Covid-19 pandemic, schools have been closed in many countries, and students have continued their education online. Many problems have arisen as a consequence, especially among primary and secondary school students whose self-learning skills have not yet developed sufficiently. In a study conducted with students, teachers, and parents, the participants reported that even though online education made some positive contributions, it also had some deficiencies, namely limited interaction, not being appropriate for individual differences, technical problems, and problems in accessing online courses. Moreover, participants explained that it should be developed and improved in terms of infrastructure, inequality of opportunity, content, and material (Başaran et al., 2020). Chatbots, therefore, can be a good option to increase interaction and motivation, facilitating learning based on individual differences and mitigating loneliness and the inability to socialize.

Today’s students, defined as Generation Z, were born into a digital environment surrounded by computers, computer games, tablets, and smartphones. For students of this generation, digital tools have become an integral part of their lives, and their dexterity with technological devices is therefore better than that of previous generations. Their mental development is faster, and they process information quickly. They want to determine the learning conditions and the time they devote to learning themselves. They want to get everything very quickly and consume it instantly; they like speed and live fast (Altunbay & Bıçak, 2018; Çetin & Karalar, 2016). It is therefore predicted that the interactive nature of chatbot applications will attract the attention of Generation Z in particular. Chatbot applications, which are also a communication channel, can be an effective tool to attract them to the learning-teaching environment and sustain their interest. The use of mobile devices, in particular, to download and use digital applications has further increased the popularity of these applications among users. Mobile devices help students learn by offering flexibility of location and time and by allowing students to use applications continuously.

Although there are many studies concerning the design and development of chatbots in the literature, few have been applied in educational environments and report the results of that application. Studies investigating the effectiveness of chatbots in education include the following. Haller et al. (2005) conducted a study with psychology students to determine whether chatbots could improve student-content interaction in online education, and showed that chatbot technology is a promising teaching and learning tool in distance and online education. A chat robot called ‘Jill Watson’, developed at the Georgia Institute of Technology, was used in a computer science course; students reported being more interested in the course and stated that they wanted to use this robot in other lessons (Lipko, 2016). Similarly, Ho et al. (2018) developed a chatbot application that provided counseling to students while they chose their elective courses; the students stated that they found it more useful than other types of academic counseling. In a study conducted with a small sample group, Kamita et al. (2019) organized two types of training, in the form of chatbots and web courses, to improve the participants’ mental health. The authors state that chatbots are more likely to be effective in increasing user motivation, supporting stress reduction, and guiding self-learning. In a study conducted with computer science students, Arruda et al. (2019) found that the chatbot they developed with goal-oriented requirements modeling was perceived as useful by students, and note that the participants planned to use it in the future. Nghi et al. (2019) explain that students benefited from using a chatbot application designed to teach English prepositions, that the application made learning exciting and fun, and that most students perceived AI chatbot tools as an important part of their learning process.

Chen et al. (2020) determined that the chatbot they developed to teach Chinese significantly increased students’ learning success and, by creating an individualized environment, could lead to better results than could be achieved in the classroom. Abbasi et al. (2019) compared an Android-based chatbot application designed to teach object-oriented programming with traditional search engine applications. They found a high germane load (the load caused by psychological activity that supports learning) in the post-test scores of the participants using the chatbot system, and observed a significant difference in the sub-cognitive layer output of the participants. Lin and Chang (2020) developed a chatbot that helps psychology students write their theses more easily. This research determined that the chatbot had a significant effect on student achievement and that students showed greater interest. In a study conducted by Yin et al. (2020) with undergraduate students, the subject of Digital Systems was presented to the experimental group with the help of a chatbot application. Although the researchers did not find a significant difference between the performance levels of the experimental and control groups, they observed that the intrinsic motivation of the students in the experimental group was higher.

The use of ICT is encouraged in all areas of education, but it has so far been achieved more effectively in science education, one of the areas where the use of technology is continually increasing. Research also shows that technology supports the development of science skills as well as additional skills required of twenty-first century students, such as critical and creative thinking (Jimoyiannis & Komis, 2001). Various technological environments are being tried out for the benefit of science education. In our queries of academic databases with keywords such as “science course”, “science education”, “chatbot”, and “chatbot in education”, we could not find any research on the results obtained when a chatbot is used in science lessons. We found only studies on the design of chatbots that can be used for science lessons (Durall & Kapros, 2020; Matsuura & Ishimura, 2017).

The potential benefits and harms of chatbots in learning and teaching have not yet been fully revealed. This study is important to the field in that it contributes science course activities that use chatbot applications and sheds light on the development of chatbot-supported activities in other disciplines. In the virtual environment, the internet is a giant information trove for students and is important for finding answers to their questions. However, the reliability of information sources obtained from the internet is one of the more important issues discussed today. With the help of chatbots, students can reach reliable information. They can continue their education either in the classroom or outside it with e-learning, access visual and audio material at any time with video and simulation support, ask what they are curious about, and improve their skill at asking questions. It is thought that the use of chatbots is important in science education and that it will consequently contribute to academic success, since it will provide an environment in which students can gain the courage to ask questions and continue to develop this courage. Because of the Covid-19 pandemic, teaching is continuing online, making it difficult to help the students focus on the learning environment. Since chatbots are mobile applications, they allow students to practice. In this study, the effects of a chatbot application on students in a science course are examined.

1.1 The overview of the study

This research aims to develop, use and evaluate a sample chatbot application, delivered through instant messaging programs, within the scope of the “Matter and the changing state of matter” unit of the 5th grade Science course in secondary school. To achieve this goal, the research is structured around the following questions:

  1. Is there a significant difference between the achievements of the experimental group using the chatbot application and the control group students regarding the “Matter and the changing state of matter” unit before and after the intervention of the chatbot application?

  2. What are the opinions of the students in the experimental group about the influence of the chatbot application on learning and motivation in science class?

2 Method

2.1 Research context

In this study, a quasi-experimental design with pre-test, post-test, and control groups was used. Both qualitative and quantitative analyses were used so that the results could be compared and more consistent data obtained. A nested (embedded) design was used, in which qualitative and quantitative data are collected simultaneously or sequentially, can be used to examine different research questions, and are analyzed independently of each other. Neutral assignment was used in the quasi-experimental design. The classes were established by the school administration at the start of the academic year. Ethics committee permission for the study and the other necessary permissions were obtained from the institution to which the school is affiliated.

2.2 Participants

The participants were 41 students studying in the 5A or 5K branches of a state secondary school in the Darıca district of Kocaeli, Turkey, in the 2020-2021 academic year. They were divided into two groups, namely, 5A (n = 20) and 5K (n = 21). The students’ families are in the lower-income group in terms of socio-economic level. The students’ mean performance scores in the science course in the previous year were 𝑥̅ = 81.67 (SD = 22.08) for the experimental group and 𝑥̅ = 83.16 (SD = 7.85) for the control group. The demographic information of the participants in the study is as follows (Table 1):

Table 1 Demographic information of the students participating in the study

2.3 Tools

The student’s success on the ‘Heat and Temperature’ test

This test, developed by Altınok (2011), consists of 21 items on the topic of heat and temperature. Its reliability coefficient was found to be r = 0.87, and the author calculated the average difficulty level as p = 0.45. This value, which lies between 0.30 and 0.49, is accepted as a medium difficulty level (Hasançebi et al., 2020). Students’ success was measured by the number of correct answers they supplied. They had 60 min to complete the test.
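To make the difficulty index concrete: p is conventionally the proportion of test-takers who answer an item correctly. The following is a minimal sketch under that assumption; the function names and the response data are hypothetical and are not taken from Altınok (2011), and only the 0.30-0.49 "medium" band comes from the text above.

```python
def item_difficulty(responses):
    """Proportion of correct answers for one item (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)


def mean_difficulty(items):
    """Average of the item difficulty indices over all items of a test."""
    return sum(item_difficulty(r) for r in items) / len(items)


def is_medium_difficulty(p, low=0.30, high=0.49):
    """True if p falls in the band the text treats as medium difficulty."""
    return low <= p <= high


# Hypothetical responses of five students to three items.
items = [
    [1, 0, 1, 0, 0],  # 2 of 5 correct
    [1, 1, 0, 0, 0],  # 2 of 5 correct
    [1, 1, 1, 0, 0],  # 3 of 5 correct
]

if __name__ == "__main__":
    for number, responses in enumerate(items, start=1):
        print(number, item_difficulty(responses))
    print("mean:", mean_difficulty(items))
```

Under this reading, the reported average of p = 0.45 would mean that an item was answered correctly by slightly fewer than half of the test-takers on average.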

Student interview form

A semi-structured interview form was designed to determine students’ opinions about the chatbot application in the science course. While preparing the interview questions, the related literature was investigated, and a question pool was created (Akpullukçu, 2011; Bağcaz, 2009; Bozkurt et al., 2013). Before the form was utilized, it was checked by a Science Education instructor and a Computer and Information Technologies instructor. Following their advice, the final version of this form was shaped. Interview questions were arranged in a way to question the content and activities of the study. After the experimental study, a virtual focus group meeting was held with all the students in the experimental group simultaneously, the questions in the interview form were asked, and the answers were noted. The questions the students were asked during the interview are as follows:

  1. Was using this chatbot application in the science course helpful for you to learn the topic?

  2. How did you feel while using the chatbot application?

  3. What difficulties, if any, did you encounter while using the chatbot application?

  4. If you wanted to work with the chatbot again, what would you like to add?

  5. What do you like about the implementation of the chatbot?

  6. What did you dislike about these implementations?

  7. What lessons and topics would you like such activities to be applied to?

  8. Would you like to use the chatbot activities during lesson time? Would you prefer to use the chatbot activities at the beginning of the lesson or the end of it?

  9. When did you mostly prefer to use the chatbot application? Why?

  10. Do you think that the chatbot can be a sufficient resource to answer all your questions about the topics of your course?

  11. Could the chatbot alone be a sufficient resource for learning?

The interview notes were subsequently examined separately by the two researchers of this study, and the main codes and sub-codes were determined. Afterwards, the interview results were analyzed, and the percentage of agreement between the two coders was calculated with the reliability formula of Miles and Huberman (1994); the result was 97%. According to Neuendorf (2002), agreement percentages of 80% or greater are considered acceptable (Yürük, 2005). It can therefore be stated that the percentage of agreement between the coders is an acceptable value for this research.

2.4 Procedure

Within the scope of this study, the following steps were followed while designing the chatbot activities. The literature on chatbot applications in science courses and on the teaching of science courses was reviewed. This review showed that students had difficulty learning the concepts in the “Matter and the changing state of matter” unit and had misconceptions about them (Atam, 2006). For this reason, this unit of the curriculum was selected. The curriculum’s expected acquisitions for this unit in the Science course were then examined, and it was decided which questions would be developed for which acquisitions. In line with these acquisitions, 80 intents were prepared for the chatbot.

In addition to text-based intents, students’ visual and auditory learning was supported with videos and simulations accessed through web links. Experts checked the questions of the developed intents; some of them were eliminated, which reduced their number to 71. The chatbot was trained with a series of questions and answers, considering that students could ask the same question in different ways.

Google’s Dialogflow program, which uses intuitive artificial intelligence to analyze many types of input, including voice as well as written text, was used in the development of the chatbot, and the application was named “Fenbotum” (My sciencebot). The Node.js runtime environment was used in this program to process server and client requests. Since artificial intelligence was used in the chatbot, it was designed so that students could usually find the answer to a question by writing just a few words of it rather than the full question. The chatbot was presented to the students through a group created in Telegram, an instant messaging program. In addition to the course content, intents containing 150 short everyday expressions were added so that students could have fun while learning and not get bored (e.g., answers to expressions such as: you are very smart, will you marry me, today is my birthday, you are very boring, who is the boss?).
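The idea of an intent that is recognized from only a few words of a question can be illustrated with a toy matcher. This is a sketch under stated assumptions and not how Dialogflow works internally: the intent names, training phrases, responses, the word-overlap (Jaccard) score, and the threshold are all hypothetical choices made for illustration.

```python
import re

# Hypothetical intents: each has a few training phrases and one response.
INTENTS = {
    "melting": {
        "phrases": ["what is melting", "how does ice melt", "melting point of ice"],
        "response": "Melting is the change of a substance from solid to liquid.",
    },
    "evaporation": {
        "phrases": ["what is evaporation", "why does water evaporate"],
        "response": "Evaporation is the change of a liquid into a gas.",
    },
    "small_talk_birthday": {
        "phrases": ["today is my birthday"],
        "response": "Happy birthday!",
    },
}

FALLBACK = "Sorry, I did not understand. Could you ask in another way?"


def tokenize(text):
    """Lower-case the text and return its set of words."""
    return set(re.findall(r"\w+", text.lower()))


def match_intent(message, intents=INTENTS, threshold=0.3):
    """Return the intent whose training phrase best overlaps the message.

    The score is the Jaccard similarity of the two word sets; None is
    returned when no phrase reaches the (assumed) threshold.
    """
    words = tokenize(message)
    best_name, best_score = None, 0.0
    for name, intent in intents.items():
        for phrase in intent["phrases"]:
            phrase_words = tokenize(phrase)
            union = words | phrase_words
            score = len(words & phrase_words) / len(union) if union else 0.0
            if score > best_score:
                best_name, best_score = name, score
    return best_name if best_score >= threshold else None


def reply(message):
    """Answer with the matched intent's response, or with the fallback."""
    name = match_intent(message)
    return INTENTS[name]["response"] if name else FALLBACK


if __name__ == "__main__":
    print(reply("ice melt"))             # a few words are enough to match
    print(reply("tell me about space"))  # out of scope, so the fallback is used
```

A fallback of this kind is also one plausible way to picture what happens to questions outside the prepared topic, which can only be answered if a matching intent exists.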

Experimental process

Due to the Covid-19 pandemic, online education is offered at all education levels in Turkey. Teachers are continuing to deliver their lessons on online learning and teaching platforms. The sample group to participate in the study was determined by the science teacher who teaches the course. The teacher selected two classes at the 5th-grade level and chose the one with more technological infrastructure as the experimental group. Before starting the study, an achievement test on the topic of heat and temperature developed by Altınok (2011) was administered to both groups. After the teacher and the researchers had informed them about how to use the chatbot, the students in the experimental group started to use the chatbot application.

The “Matter and the Changing State of Matter” unit is described in the secondary school science curriculum as 26 course hours long, covering four weeks. The sub-topics of the changing state of matter (melting, freezing, boiling, condensation, evaporation, sublimation, frosting) were determined as covering 6 h, Distinctive Properties of Matter (melting and freezing point, boiling point) as 6 h, Heat and Temperature (heat, temperature, heat exchange) as 7 h and The Effects of Heat (expansion, shrinkage) as 7 h. The teacher went online with both groups for four weeks and taught the topic. The students in the experimental group used the chatbot application whenever they wanted after learning each sub-topic. Teachers and researchers did not intervene in students’ learning experiences, but did follow them. At the end of the four weeks, the Heat and Temperature achievement test was administered to both groups as a post-test. Additionally, an online focus group meeting was held with the students in the experimental group, with the participation of researchers and teachers. Figure 1 illustrates the daily number of students using the chatbot throughout the four weeks. Figure 2 shows the number of messages, and Fig. 3 shows the students’ most preferred hours to use the chatbot. These figures demonstrate that the chatbot was used almost every day, and students were most active on the chatbot at around 8 pm. In Fig. 4, screenshots of students’ conversations with the chatbot are provided.

Fig. 1 Number of students using the chatbot per day

Fig. 2 Number of daily messages by students using chatbots

Fig. 3 The hours mostly preferred by the students to use the chatbot

Fig. 4 Screenshots of students’ conversations with the chatbot

2.5 Analysis

The SPSS 20 package was used for the data analysis. The Wilcoxon signed-rank test, which examines the significance of the difference between the scores of two related sets of measurements, was used to compare the pre-test and post-test total scores within the experimental and control groups (Büyüköztürk, 2013). Non-parametric statistical methods were preferred because of the small sample size: the number of students in each group was less than 30. To determine whether there was a significant difference between the groups in the pre-test and post-test scores, the Mann Whitney U test was used at the significance level of 0.05 (Büyüköztürk, 2013).
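The two tests answer different questions: the Wilcoxon signed-rank test works on paired pre-test and post-test scores of the same students, while the Mann Whitney U test compares two independent groups. The sketch below computes only the test statistics in plain Python to show the ranking logic; it is not the SPSS procedure used in the study, it does not compute z or p values, and the function names and any scores passed to it are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank(pre, post):
    """Return (sum of positive ranks, sum of negative ranks) for paired scores.

    Pairs with a zero difference are dropped before ranking.
    """
    diffs = [after - before for before, after in zip(pre, post) if after != before]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus


def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two U statistics for independent groups."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


if __name__ == "__main__":
    # Hypothetical pre-test and post-test scores of five students.
    print(wilcoxon_signed_rank([10, 12, 9, 14, 11], [13, 12, 8, 18, 13]))
    # Hypothetical scores of two independent groups.
    print(mann_whitney_u([1, 4, 5], [2, 3, 6]))
```

Because U is bounded by the product of the two group sizes, groups of 20 and 21 students can yield a U of at most 420.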

The opinions of the students on the use of the chatbot were analyzed by two researchers using the content analysis method. In order to ensure the reliability of the data analysis, the data obtained from the study was coded separately by three experts, and Miles and Huberman’s (1994) agreement percentage formula (Reliability = number of agreements / (number of agreements + number of disagreements)) was used to calculate the consistency of the coding. The percentage of agreement was calculated and analyzed separately for each question, and finally, the overall fit was examined. Walther et al. (2013) suggested inter-rater reliability (IRR) as a means to “mitigate interpretative bias” and ensure a “continuous dialogue between researchers to maintain consistency of the coding”. As a result of the calculation, the first agreement percentage was found to be 0.83. The agreement between the raters was then recalculated after each of the items that could not be agreed upon was examined and reviewed by way of discussion. This process continued until a consensus was reached on the scores, at which point the percentage of agreement was 0.97. Miles and Huberman (1994) suggest that an inter-rater reliability (IRR) of 80% agreement between coders on 95% of the codes is sufficient agreement among multiple coders.
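As a minimal sketch of the agreement percentage formula, the function below compares the codes two coders assigned to the same items. The function name and the example codes are hypothetical; they are not the study's data.

```python
def agreement_percentage(codes_a, codes_b):
    """Agreements divided by (agreements + disagreements) for two coders."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must code the same number of items.")
    agreements = sum(a == b for a, b in zip(codes_a, codes_b))
    disagreements = len(codes_a) - agreements
    return agreements / (agreements + disagreements)


# Hypothetical codes assigned by two coders to ten student statements.
coder_1 = ["useful", "fun", "fast", "useful", "fun",
           "fast", "useful", "fun", "fast", "useful"]
coder_2 = ["useful", "fun", "fast", "useful", "fast",
           "fast", "useful", "fun", "fun", "useful"]

if __name__ == "__main__":
    print(agreement_percentage(coder_1, coder_2))
```

In this hypothetical example the coders agree on eight of ten statements, which sits exactly at the 80% level described as acceptable.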

3 Findings

Examples and numbers of the questions the students asked the chatbot most frequently, and of those they never asked, during the experimental process are presented in Table 2. This table shows that the students did not ask enough questions about the sub-topics. The first reason for this might be that the students did not have enough information about the sub-topics of the main topic, the second that their knowledge of how to ask questions is limited, and the third that their ability to make associations between the concepts they have learned and daily life is limited.

Table 2 Examples of the questions students asked the chatbot most frequently and of those they never asked during the experimental process

To determine whether the data in the study showed normal distribution, the Kolmogorov-Smirnov test was applied to both the pre-test and post-test results of the experimental and control groups. The results confirmed that both groups showed normal distribution (pre-test, z = 0.627 and post-test, z = 0.745). The academic achievement scores of the students in the experimental and control groups from the pre-test and post-test were analyzed through the Wilcoxon signed ranks test, the results of which are given in Table 3.
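For readers unfamiliar with the Kolmogorov-Smirnov statistic, the sketch below computes the largest gap D between a sample's empirical distribution and a normal distribution fitted to that sample, together with z = √n · D. Treating the reported z values as √n · D is an assumption about how the statistic was reported; the function names are hypothetical and no study data is used.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


def ks_statistic_normal(sample):
    """Return (D, z) for a one-sample K-S comparison with a fitted normal.

    D is the largest distance between the empirical step function and the
    normal CDF that uses the sample mean and sample standard deviation;
    z is sqrt(n) * D.
    """
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    d = 0.0
    for i, x in enumerate(sorted(sample), start=1):
        f = normal_cdf(x, mean, sd)
        # Compare the CDF with the step function just after and just before x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d, math.sqrt(n) * d


if __name__ == "__main__":
    d, z = ks_statistic_normal([1, 2, 3])
    print(round(d, 4), round(z, 4))
```

When the mean and standard deviation are estimated from the same sample, as here, the standard critical values of the test are only approximate, so a result of this kind is best read as a rough check.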

Table 3 Wilcoxon Ranks Test of the pre-test and post-test scores of the experimental and control groups

According to Table 3, the Wilcoxon signed ranks test performed to determine the significance of the difference between the pre-test and post-test scores of the students in the experimental and control groups in the academic achievement test showed that the difference in the experimental group was significant (z = −2.168, p < .05), but that there was no significant difference in the control group (z = −.751, p > .05). Considering the mean rank and rank totals of the difference scores in the experimental group, it is seen that the difference observed is in favor of the positive ranks, i.e. the post-test score. Despite the four weeks of online learning activities, there was no significant increase in the control group’s achievement. The reasons behind this result might include that online courses could not be carried out in a certain order like face-to-face teaching in the school environment, and that it is unknown whether the home environment is suitable for participation in the online course. Moreover, the student-teacher and student-student interactions are not the same as in the face-to-face education process; some technical problems might be faced during the online connection; the change to students’ daily routines might interfere with concentration and engagement with the content; and access to the required digital resources might be unequal among students. Consequently, the quality of online education for secondary school students should be questioned.

According to the results of the Mann Whitney U test shown in Table 4, conducted to determine the significance of the difference between the pre-test and post-test scores of the experimental and control groups, there was no significant difference between the groups (chatbot group: U = 186.500, p > .05, control group: U = 199.000, p > .05). Although no significant difference was found between the groups, the within-group comparison showed that the chatbot application significantly increased the achievement of the students in the experimental group. This result, obtained from the online education environment, indicates the need to implement the application in face-to-face teaching as well.

Table 4 Mann Whitney U test results of the pre-test and post-test scores of the chatbot group and the control group

Table 5 presents the participants’ opinions about the chatbot application used in the science course; all of them state that this application makes a positive contribution to their learning. The participants explained that the chatbot application is useful, that they can find answers to their questions, that it helps them learn new things, and that it enables them to familiarize themselves with the topic. For example, one of the participants expressed their opinion by saying: “When I asked things, he immediately answered”.

Table 5 Students’ views on the benefit to their learning of using the Chatbot application in science lessons

All of the participants stated that they liked the chatbot application used in the science course because they learned new things thanks to it, it provided new information, it answered their questions immediately, it was fun, it increased their interest in the science course, it could answer their questions, and it was accessible whenever they wanted. For instance, the participants’ statements regarding the chatbot application included: “I had fun”, “I asked the question, and it immediately replied”, and “I liked it very much” (Table 6).

Table 6 Students’ feelings about chatbot application in the science course

Table 7 presents the participants’ opinions about the difficulties they faced while using the chatbot application in the science course. While some of the participants stated that they did not have any problems, others indicated various problems related to the application. The problems participants encountered included late answers to questions, unanswered questions on different topics, different answers to questions asked in quick succession, and the chatbot giving the same answer (introducing itself) to different questions. Students also added that some students deliberately asked the same question many times to earn higher scores, and that the chatbot did not understand some questions, did not answer some questions, and answered some questions in English. A possible reason for the last of these problems is that some video and simulation recordings, which the researchers found appropriate and necessary for the related topic, have no verbal narration and provide only English subtitles.

Table 7 Student views on difficulties encountered while using the Chatbot application

Table 8 presents the students’ expectations for future use of this application. They mainly expect it to be louder, to explain the topic in detail, to summarize the topic after the course, to include test questions and answer keys, to solve questions whenever they need, to provide a structure in which students can ask each other questions and answer them, to have an artificial intelligence developed to answer more questions, and to be able to replace their teachers when their teachers cannot come to the course. One of the participants, who wanted it to be available at all times, shared their opinion as follows: “I think we can use it all the time, it can be prepared for every subject”, and another stated their opinion as “I think it should walk around the school as a robot and answer our questions after the lesson”.

Table 8 Expectations of the students for the next usage

Table 9 contains the participants’ opinions about their likes and dislikes concerning the chatbot activities during the science course. The participants answered the question about what they like about the chatbot application with: we can ask it a large number of questions, it is especially polite, and we receive answers to all of our questions. The number of participants sharing positive comments after receiving only textual or only visual (video and simulation) answers is equal to the number of participants sharing positive comments after receiving both textual and visual answers. A participant who suggested that there should only be textual answers stated the reason for this as: “I can learn better when I only read written things”, while another participant who thought that there should only be visual answers stated: “I am distracted while reading the written ones, so I prefer to watch”.

Table 9 Students’ opinions about their likes and dislikes in chatbot activities

Some of the students stated that they were not satisfied when they could not get answers to questions related to other topics during the chatbot activities and when they received the same answers to different questions. Since the study concerns the topics included in the content of the 5th grade “Matter and changing state of matter” unit, they could only get answers to questions related to that unit. The answers they gave during the interview and the chatbot records likewise showed that the participants could not get an answer because they asked questions outside the scope of the related topic. For instance, one of the participants’ complaints was that “he did not answer my questions about space.”

The students’ views on whether chatbots alone would be a sufficient resource for learning the topics are stated in Table 10. The participants expressed positive opinions, for example that the chatbot provides opportunities for summarizing, listening to, and repeating the topic. However, they think that it will not be sufficient alone to learn the topics because it may not be able to answer all questions and because of the concern that the system may collapse. One of the participants who gave a negative opinion stated that “we may have many questions in our minds, so I think it is not enough”. Regarding the statement that “the system can collapse”, it can be said that students have reservations about whether the chatbot will sufficiently meet their technical needs.

Table 10 Students’ views on whether only chatbot would be a sufficient resource for learning the topics

The students additionally stated that they would like to use chatbot activities in their other courses, such as English, mathematics, and social sciences. The participants who wanted these activities to be applied to the English course were especially interested in learning new vocabulary. For example, a participant expressed their opinion on this issue by saying, “I can ask him about the vocabulary we just learned in the English course, then it can tell me the meanings of the words; we can also learn them before the lesson.” It was also observed that they would need it for visuals of the places mentioned in the social sciences texts, and that it would be used for repetition in the mathematics course. The participant who supported the idea that a chatbot would be necessary in the social sciences course stated that “for example, when I wonder about the places I have never seen, it can show me those places”.

According to these results, it can be concluded that the students perceive the chatbot as a guide to help them learn topics outside of the classroom. One participant’s statement supports this judgment: “I would like to ask my questions and learn their answers immediately during breaks, but I would like my teacher to explain it during the lesson”. Students who want to use it while learning stated that they did not want to use it at the beginning, in the middle, or at the end of the lesson, or when their teachers are not in the course. Students who want to use the chatbot during the lesson stated that they want to use it in the middle or at the end of the lesson, or when their teachers are not in the lesson. It can be said that the participants who hold this view evaluated the chatbot as capable of supplementing the teacher.

When asked when they mostly preferred to use the chatbot application, the students answered that they preferred to use it when they did not have online courses, between online courses (during breaks), at the end of the science courses, or while solving questions. It can be seen that the timing of online courses, rather than a specific time period, determines when the chatbot is used. While expressing their opinion on this question, one of the participants stated that “It would be better to have it in the evening because we always have lessons at other hours”. This supported the prediction regarding the effect of the pandemic in determining the times when this application would be used.

4 Conclusions and discussion

In this study, an attempt was made to develop a sample chatbot application programmed specifically for the “Matter and Changing state of matter” unit of the 5th grade Science course in secondary school, to deliver it through instant messaging programs, and to evaluate it. According to the results of the study, the students used the chatbot application almost every day for four weeks, but they asked only certain questions frequently, asked some questions less often, and did not ask some at all. Furthermore, it was observed that they mostly used the chatbot in the evening hours.

While a significant difference was found between the pre-test and post-test scores of the students in the chatbot application group, none was found in the control group. Despite the 4-week online learning activities, there was no significant increase in the control group’s achievement. One reason for this could be that online education could not be carried out in the same way as face-to-face teaching in the school. For example, it was not known whether the conditions of the learning environment were suitable for participation in online courses, and student-teacher and student-student interaction was not as effective as in the face-to-face education process. According to the results, it can be concluded that the main problems relate to technical reasons such as the dramatic change in students’ daily routines, the differences in the quality of students’ digital devices, and consequently, the nature of online education for secondary school. Muramatsu and Wangmo (2020) found that students experienced stress in online education due to interruptions to their network connection and that, since the vast majority had to use a smartphone for online education, the small screens made it difficult to focus on the lesson. Nevertheless, although there was no significant difference between the experimental and control groups in terms of achievement, the chatbot application significantly affected the achievement of the students in the experimental group. These results also revealed the necessity to investigate the effects of this application in face-to-face teaching.

Since the related literature contained no teaching applications involving chatbots in science courses, the results obtained from this study were discussed by comparing them with the results of chatbot applications used in other courses, e-learning, mobile learning, and augmented reality. According to the literature, a chatbot developed for students to learn languages significantly improved the students’ learning success, and the results were better than those achieved in the classroom thanks to its ability to provide an individualized environment (Chen et al., 2020). In another study conducted to teach programming, a positive significant difference was observed in the germane load and sub-cognitive layer outputs of students using chatbots (Abbasi et al., 2019). On the other hand, Yin et al. (2020) conducted a study with university students that found positive effects of chatbot applications on the intrinsic motivation of students in the experimental group, but did not find much difference between students’ performances. Sáiz-Manzanares et al. (2020) tested the effectiveness of voice assistants in learning; although the voice assistants did not cause a significant difference in learning outcomes, they were found to be a great aid in learning processes because they promoted the development of self-regulated learning, were evaluated as functional by university students for accessing information during the pandemic, and increased students’ satisfaction with their learning.

Moreover, the studies applying chatbot applications to science courses were examined. Cheng et al. (2019) conducted a study with a mobile technology-supported experiential learning application which was designed to improve students’ attitudes, competencies, and learning performance as well as problem-solving skills. According to their findings, this application significantly increased students’ learning achievement, environmental attitudes, and collective competencies. The authors also stated that students who learned with this approach showed higher problem-solving competence than those who learned with the traditional mobile learning approach. In a similar vein, Koç and Ayık (2017) determined that the use of social networks in 6th and 7th-grade science and English courses had a positive effect on students’ academic success. In a recent study, Korkmaz and Kadirhan (2020) stated that science education, blended by using the education information network, contributed to the academic achievements and attitudes of the students. In another study, Ahmed and Parsons (2013) stated that the learning environment, which they created using a mobile device to help high school students during “Abductive science inquiry”, improved the performance and attitudes of the students in the experimental group compared to the control group, and that this improvement was retained.

Technology in the mobile learning industry has played a significant role in enabling students and educators to interact with upcoming learning opportunities, thus giving them a richer learning experience; and, according to mobile learning market research, the tendency towards mobile learning has increased in the international arena (Markets and Markets, 2020a, b). The OECD report on innovation in education states that the use of mobile devices in Science Education is important, particularly in the development of students’ content and procedural knowledge (Vincent-Lancrin et al., 2019). Many studies reveal that science education can be developed using personal computers, smartphones, tablets and different types of educational software (Sykes, 2014; Tavares & Moreira, 2017).

When the students’ opinions about the chatbot were analyzed, it could be summarized that they generally found this activity useful and enjoyable; it helped them learn, it was fun, it increased their curiosity about the science course, they could reach it whenever they wanted, and they wanted to use it in other courses such as English, mathematics and social sciences. In addition, students perceived the chatbot as a guide to help them learn topics outside the classroom. Also, instead of a specific time period, the time of online courses was a determinant of students’ preference to use the chatbot.

Students’ expectations about improving the chatbot can be listed as follows: for it to be louder, to explain the topic, to summarize the topic after the lecture, to include test questions and answer keys, to solve questions when needed, to let students ask and answer each other’s questions, to have artificial intelligence to answer more questions, and to replace their teachers when they cannot come to the class. The participants shared their opinions about the aspects of the chatbot they liked; among these were that they could ask many questions and get answers to all their questions, and that they had the option between written texts and videos. On the other hand, they were not satisfied when they could not get answers to questions about other topics, or when they received the same answers to different questions.

According to the literature, studies investigating the use of chatbots in language teaching have found that they facilitated students’ learning, were useful, created excitement and made learning fun, and that most students perceived the artificial intelligence chatbot tools as an important part of their learning process (Nghi et al., 2019). Furthermore, in the study by Durak and Karaoğlan Yılmaz (2019), students explained that the augmented reality applications created an enjoyable learning environment and made the learning process remarkable and effective. They consequently thought that the augmented reality applications could contribute positively to the success of the course if they also used them in the next course. Fryer and Bovee (2018) indicated that even if students mostly study outside the classroom, teachers can still significantly affect this process. Teachers wishing to use this type of technology for extra practice hours during self-study consider chatbots an opportunity to learn more, rather than a convenient tool to practice anywhere and anytime (Fryer et al., 2019).

This study has shown that the chatbot application designed for science courses positively affected students’ online education learning process; students found the chatbot useful and entertaining and wanted to use it in other lessons. It was a good assistant in learning outside the classroom. It allowed them to repeat the lesson and facilitated learning by being interactive. Since the application enabled students to see each other’s questions and answers, it also encouraged them to complete the missing information they had on the topic. Also, while the teacher interacts with the students at a certain time and in a certain environment, the chatbot continuously provides information to students on the topic thanks to the flexibility of place and time. Anwarulloh and Agustia (2019) also stated that as a result of their study, the students learned interactively with the chatbot. In the present study, the students used the chatbot only outside of the classroom.

Further studies can be suggested in which chatbots are used during lessons, cover more topics, access information on the web through search engines when students need it, and include a voice-over. Suggestions can also be made to measure students’ cognitive learning with a tool other than the achievement test, to determine the effect of long-term use on students’ level of associating science concepts with daily life, and to use the chatbot in different courses. While this study was carried out virtually, similar studies can be done in the classroom environment to investigate its effects in face-to-face teaching after the pandemic.

4.1 Limitation of the study

The present research was conducted with students under online education conditions without any face-to-face communication due to the Covid-19 pandemic. Most of the students are at a low socio-economic level, so they do not have their own devices that they can use to continuously access the internet, and they generally get access through devices belonging to their parents. This situation prevented them from using the application adequately. Besides, it is not known how the students’ learning environment at home affected online learning conditions. The short online course sessions (about 30 min each) reduced student-teacher interaction and limited students’ active participation. This, therefore, resulted in an inability to control students’ learning. In short, the 30 min lessons and the disadvantage of having to use their parents’ devices to access the chatbot application are the main limitations of this study. In addition, because of the pandemic, certain aspects of classroom management were unavailable, which was an obstacle to an optimum learning environment.