The Technology Acceptance Model (TAM; Davis, 1989) served as the underpinning of our research questions and analysis. This model identifies the relationship between various external variables and the perceived usefulness and perceived ease of use of a given technology. In turn, user perceptions may impact their attitude toward using the technology and their behavioral intention to use it. Ultimately, attitudes and behavioral intentions impact users' actual use of the technology. Latent variables of TAM were utilized to explore factors with the potential to impact in-service teachers' use of the skill as a potential professional learning (PL) modality.
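These hypothesized relationships are often written as a set of regression-style structural equations. The formulation below is a generic illustration of TAM, not a model estimated in this study; the coefficients and error terms are placeholders:

```latex
\begin{aligned}
PEOU &= \gamma_{1}\,EV + \varepsilon_{1} \\
PU   &= \gamma_{2}\,EV + \beta_{1}\,PEOU + \varepsilon_{2} \\
A    &= \beta_{2}\,PU + \beta_{3}\,PEOU + \varepsilon_{3} \\
BI   &= \beta_{4}\,A + \beta_{5}\,PU + \varepsilon_{4} \\
Use  &= \beta_{6}\,BI + \varepsilon_{5}
\end{aligned}
```

where EV represents external variables, PEOU perceived ease of use, PU perceived usefulness, A attitude toward use, and BI behavioral intention.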
This usability study examined educators' use of an Amazon Alexa skill the authors developed (Authors, 2020). The skill was originally developed in March 2020 by special education experts. Pinpointing a critical best practice that could be easily implemented, the development team focused the skill on giving effective feedback as outlined in the special education high-leverage practices (HLPs; Council for Exceptional Children & CEEDAR Center, 2015). Based on the HLP literature and informational videos, the team drafted a script explicitly teaching and demonstrating the key elements of effective feedback. This script was then converted into an Alexa skill using skill development software that eliminated the need for direct coding.
The development team included visual supports for the skill (e.g., contents of the main menu, images, and quiz questions; see Figure 1) for users who access the skill through the Alexa app on a smartphone or tablet and for users with screen-equipped Alexa devices (e.g., Echo Show). The Effective Feedback skill offers five sections that participants may choose from via the main menu: (1) overview of effective feedback, (2) definition of effective feedback, (3) components of effective feedback, (4) example of effective feedback, and (5) entire skill.
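Although the skill was authored with no-code software, its main menu behaves like a simple dispatch over these five sections. The sketch below is purely illustrative of that navigation logic; the names (SECTIONS, handle_menu_choice) are hypothetical and not part of the actual skill:

```python
# Illustrative only: the actual skill was built without direct coding.
SECTIONS = {
    1: "overview of effective feedback",
    2: "definition of effective feedback",
    3: "components of effective feedback",
    4: "example of effective feedback",
    5: "entire skill",
}

def handle_menu_choice(choice: int) -> str:
    """Return the spoken prompt for a main-menu selection, or re-prompt."""
    if choice in SECTIONS:
        return f"Starting section {choice}: {SECTIONS[choice]}."
    return "Sorry, please choose a section from one to five."

print(handle_menu_choice(4))  # Starting section 4: example of effective feedback.
```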
The research team recruited special education teachers by using convenience sampling. After Institutional Review Board approval, the team individually asked potential participants if they would be interested in joining the study. Participants were primarily sought from a pool of current and former graduate students from the authors' special education preparation program; however, one participant served as an adjunct in the program. Ultimately, five special education teachers and one behavior interventionist participated in the study.
This usability test included 19 unique tasks embedded in each section of the skill. Participants received a task list directing their actions when navigating through the skill and engaging with the device. Participants were observed and recorded while interacting with the skill and following the task list. While the researchers primarily served as quiet observers, support was offered when participants failed to give a command to which the device was programmed to respond.
Each participant spent approximately 50 minutes to one hour interacting with and exploring the skill based on the provided task list. During each session, one member of the observation team completed a researcher-developed observation template to monitor task completion rates, errors, and other unexpected events.
After completing all of the listed tasks, participants responded to a survey rating the ease of each task on a 7-point Likert scale (1 = very difficult to 7 = very easy). They also responded to a user experience survey about their overall experience using the skill.
The researchers utilized the Voice Usability Scale (VUS; Zwakman et al., 2021) survey, using a 7-point Likert scale from strongly disagree (1) to strongly agree (7), to examine participants' experiences after using the provided skill. A total of 14 items were included in the survey. The reliability of this survey was relatively high (Cronbach's α = .81; Taber, 2018).
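For transparency, Cronbach's alpha can be reproduced from a participants-by-items score matrix with the standard formula; the sketch below uses hypothetical toy ratings, not the study's responses:

```python
import statistics

def cronbach_alpha(scores: list[list[float]]) -> float:
    """Cronbach's alpha for a participants (rows) x items (columns) matrix."""
    k = len(scores[0])  # number of items (14 for the VUS)
    item_vars = [statistics.variance(item) for item in zip(*scores)]
    total_var = statistics.variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical 7-point ratings: three participants x three items.
print(round(cronbach_alpha([[7, 6, 7], [5, 5, 6], [6, 5, 7]]), 2))  # 0.88
```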
After users completed the two required surveys, the research team conducted semi-structured interviews, asking participants about their overall experience with the skill and their attitudes toward and familiarity with AI devices in general. The semi-structured interview questionnaire was categorized into two major themes: experience with the skill and attitudes toward conversational agents (CAs). Participant responses often generated follow-up questions and further discussion. Interviews were relaxed and conversational in nature, allowing participants to openly share their perceptions and shift the conversation to additional, yet related, topics and content as desired. Interviews were recorded over Zoom and later transcribed for analysis.
Quantitative completion rates were calculated as the number of participants who successfully completed each task divided by the total number of participants. Task difficulty and user experience ratings were analyzed using descriptive statistics, including minimum and maximum values, means, and standard deviations.
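In code, these computations reduce to a simple proportion and standard descriptive statistics; the snippet below illustrates both with hypothetical values rather than the study's data:

```python
import statistics

def completion_rate(completed: int, total: int) -> float:
    """Proportion of participants who successfully completed a task."""
    return completed / total

def describe(values: list[float]) -> dict[str, float]:
    """Minimum, maximum, mean, and standard deviation of a set of ratings."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
    }

# Hypothetical: 5 of 6 participants complete a task; 7-point ease ratings.
print(completion_rate(5, 6))         # 0.8333...
print(describe([6, 7, 5, 7, 6, 7]))  # min 5, max 7, mean ~6.33, sd ~0.82
```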
When reviewing and analyzing our qualitative data, we employed multiple coders throughout the process. Seeking interpretive convergence, the first and second authors coded collaboratively (Saldaña, 2021). To align their interpretations of the data, the two engaged in detailed dialogues, discussing their interpretations and unique perspectives. Codes were defined and redefined throughout the coding process, as multiple conversations and analyses led to consensus between the two primary coders (Brinkmann & Kvale, 2018; Sandelowski & Barroso, 2007). To further bolster interrater reliability, a third member of the research team, who did not participate in any of the interviews, recoded the data based on training and codebook definitions. Cohen's (1960) kappa coefficient was calculated for 33% of the data, resulting in an interrater reliability of 0.51 between the initial coding team and the third researcher (agreement percentages: 98.08%–99.83%), which can be interpreted as fair to good agreement (McHugh, 2012).
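The agreement statistic follows the standard formula κ = (p_o − p_e) / (1 − p_e), where p_o is observed agreement and p_e is chance agreement. The sketch below shows the computation for two coders; the code labels and data are hypothetical, not drawn from this study's codebook:

```python
from collections import Counter

def cohens_kappa(coder_a: list[str], coder_b: list[str]) -> float:
    """Cohen's (1960) kappa: chance-corrected agreement between two coders."""
    n = len(coder_a)
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n  # observed agreement
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n)              # chance agreement
              for c in set(coder_a) | set(coder_b))
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codes for eight interview excerpts.
a = ["experience", "attitude", "attitude", "experience",
     "insight", "attitude", "experience", "insight"]
b = ["experience", "attitude", "insight", "experience",
     "insight", "attitude", "experience", "attitude"]
print(round(cohens_kappa(a, b), 2))  # 0.62
```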
Quantitative Data
The participants responded to the task difficulty survey soon after completing the 19 provided tasks. According to the results, participants perceived most of the tasks as easy to complete (M = 5.83–7.00; see Table). However, discrepancies between perceived ease and completion rates were observed. The observer's records showed lower completion rates for invoking the skill (task #1) and completing specific activities in Section 5 (tasks #4 to #7). Also, one participant skipped a task (task #13) described in the task list and moved on to another section instead of returning to the main menu as directed.
Qualitative Data
Qualitative data were organized into two overarching categories with subgroups. The first category, Opinions and Attitudes, included the subgroups Overall Experiences and Skill-Specific Feedback. The second broad category, Applications and Suggestions, included the subcategories Content Application and Insights. Unique comments within each subcategory were tallied, and the percentage of participants making a comment within each subcategory was calculated.
Opinions and Attitudes (Overall Experiences/Skill-Specific Feedback):
Despite participants expressing feelings of unfamiliarity and unease when initially interacting with the skill (83.33%), nearly all participants (83.33%) expressed an overall positive experience after completing the skill, and 66.67% explicitly expressed appreciation for the example provided within the skill. While half (50.00%) of participants commented that they would be interested in using a similar skill in the future, interest in purchasing or utilizing an Alexa device for anything other than music (or other minimally engaging activities) was low. Two-thirds (66.67%) noted drawbacks of the skill, most often related to internet access, repetition within the skill, and the need to physically invoke Alexa while using a mobile device. The majority of participants (83.33%) shared suggestions for improvement, including the provision of external resources to support skill completion, eliminating repeated information, and embedding video or additional text.
Applications and Suggestions (Content Application/Insights):
The majority of participants (83.33%) were self-reflective and could imagine themselves utilizing the skill's content to improve their teaching practices. All participants (100.00%) could envision skills providing professional development to a variety of educational and noneducational groups, such as classroom teachers, paraprofessionals, substitute teachers, supervisors in any profession, parents, and anyone working with children. While all participants could envision using Amazon Alexa as an instructional tool in the K-12 setting, 33.33% additionally envisioned harnessing the device to support instructional practices in higher education. Visions of utilizing the device in the K-12 setting were connected primarily to tools and functions of the device itself, such as timers, music, and access to content, whereas visions of utilizing the device in higher education focused primarily on utilizing Alexa skills to pre-teach or support instruction.
While typically utilized for daily tasks (e.g., listening to music, setting timers, checking the weather), CAs may also serve to provide PL for educational professionals. Participants in this study largely viewed the use of CAs as an innovative PL modality in a positive light. Most participants envisioned themselves seeking additional PL opportunities via the CA device and extended these visions beyond personal PL use to employing CA skills in the classroom to support students, both in K-12 and higher education. This exploratory study investigated factors, such as perceived ease of use, perceived usefulness, attitude, and behavioral intention, that may impact the actual use of CA skills. Future research could further examine the relations among these factors based on the technology acceptance model (TAM; Davis, 1989), in consideration of the findings of this study, to further facilitate the development and adoption of CA skills as an innovative PL modality in the education field.
Ultimately, this research points to the potential of CA skills to provide flexible, personalized PL specific to educational professionals. Given user curiosity and willingness to explore CAs as a means of PL, CA skills widen the pool of PL options, adding an innovative way for educators to grow and develop their skills.
Brinkmann, S., & Kvale, S. (2018). Doing interviews (Qualitative Research Kit, Vol. 2). SAGE.
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46.
Council for Exceptional Children & CEEDAR Center. (2015). High-leverage practices in special education. CEC.
Davis, F. D. (1989). Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Quarterly, 13(3), 319–340.
McHugh, M. L. (2012). Interrater reliability: The kappa statistic. Biochemia Medica, 22(3), 276–282.
Saldaña, J. (2021). The coding manual for qualitative researchers. SAGE.
Sandelowski, M., Barroso, J., & Voils, C. I. (2007). Using qualitative metasummary to synthesize qualitative and quantitative descriptive findings. Research in Nursing & Health, 30(1), 99–111.
Taber, K. S. (2018). The use of Cronbach's alpha when developing and reporting research instruments in science education. Research in Science Education, 48(6), 1273–1296.
Zwakman, D. S., Pal, D., & Arpnikanondt, C. (2021). Usability evaluation of artificial intelligence-based voice assistants: The case of Amazon Alexa. SN Computer Science, 2(1), 1–16.