Voice Principle

From ECT wiki
 
Latest revision as of 19:07, 16 December 2022

Overview

The Voice Principle. Direct YouTube link: https://www.youtube.com/watch?v=UmrejRnex8E

“People learn better when narration is spoken in a human voice rather than in a machine voice” (Mayer, 2014).

Similar to the Personalization Principle, which holds that a conversational speaking style has more impact on learning than a formal one, the Voice Principle states that people learn more deeply when they are spoken to in a friendly, natural human voice than in a synthetic, computer-generated voice. The principle also warns human speakers against sounding like a machine, as the teacher does in the movie “Ferris Bueller’s Day Off”, whose delivery makes the lesson boring and tedious and has a negative effect on learning. A friendly human voice carries emotion: it conveys that someone is speaking directly to you and gives you a sense of belonging and social presence.

The Voice Principle is one of the multimedia principles that help designers plan instructional multimedia materials that actively engage learners. Within the cognitive theory of multimedia learning (CTML), it is proposed to foster generative processing.

Evidence

Table 1

It is widely claimed that the use of synthetic voice in educational contexts and materials impedes comprehension and increases the cognitive load of the learners.

To explore this claim, Mayer (2020) examined five experiments from different studies that compared a machine voice with a human voice (Atkinson et al., 2005; Mayer et al., 2003; Mayer & DaPra, 2012). The results indicate that the natural human voice is much better than the synthetic voice because it is natural and socially appealing to people. The experiments also show that the human voice positively affects retention and transfer scores. In a related example, students watched a 140-second narrated video of lightning formation (Mayer, 2003), with the narration delivered either by a non-conversational speaker with a Russian accent or by a speaker with a standard accent. Students who heard the standard-accent voice scored higher on the subsequent transfer test than those who heard the other voice. This leads to the conclusion that a distracting or unappealing human voice may also harm learning because it weakens the social cues available to the learner.

More research and experiments that support the Voice Principle are listed in Table 1.

Design Implication

'Anyone, Anyone' Ferris Bueller's Day Off. Direct YouTube link: https://www.youtube.com/watch?v=uhiCFdWeQfA

'Anyone, Anyone' from Ferris Bueller's Day Off

The movie 'Ferris Bueller's Day Off' shows how a machine-like human voice can work against learning. When the teacher repeats 'Anyone, Anyone', his voice is unappealing and emotionless, so the students tune out and become bored. They do not interact or engage with the teacher's lecture at all. As a result, students are likely to devote less cognitive effort to a lesson delivered in such a machine-like voice.

Using the Voice Principle in the Animation 'How Corona Virus Affects Your Body?'

'How Corona Virus Affects Your Body?' Direct YouTube link: https://www.youtube.com/watch?v=0yp2zbU4fas

'How Corona Virus Affects Your Body?' is a YouTube video published by Peekaboo Kidz in May 2021. It introduces the pandemic, how COVID-19 spreads, and how it affects the human body to children with little prior knowledge of biology or medicine. Because information about COVID-19 is complex and hard for children to understand, the video uses cartoon animation and sound effects to motivate children and to explain difficult concepts in a simpler, more understandable, and more entertaining way.

In the video, the cartoon character Dr. Binocs (Figure C2) takes the role of an expert and narrator who explains concepts to children in conversational language. The instructional designers apply the Voice Principle by giving Dr. Binocs a natural human voice, which helps children feel more engaged with the learning content, as if they were part of a conversation.

However, although the video applies the Voice Principle well, it neglects some other principles, such as the segmenting principle and the signaling principle. As a result, it fails to hold children's attention for long, and unnecessary content distracts them, which increases their extraneous load.

Two Theoretical Perspectives

Cognitive Load Theory

  • Cognitive Load Theory: human cognitive architecture is divided into three parts: a limited working memory, a limitless long-term memory, and schemas that organize knowledge in long-term memory (Sweller, 2011).

According to cognitive load researchers (Paas & Sweller, 2014), synthetic voices increase extraneous cognitive load and reduce the cognitive capacity available for integrating new information with existing knowledge. An unappealing voice may cause learners to take more time to process the information, which occupies more working memory and thereby raises cognitive load.

Cognitive load theory assumes that the human brain receives instruction through two different channels, verbal and visual, before information processing begins, and that the capacity of each is relatively limited. As a result, a synthetic, machine-generated voice may increase the extraneous cognitive load of learners engaged in multimedia instruction, because without sufficient social cues it comes across as uninteresting and distracting (Wouters et al., 2008).

Social Agency

Learning involves social activity (Bandura, 1969). Social agency theory is a set of ideas that explains how social factors affect multimedia learning (Linek et al., 2010). According to social agency theorists (Atkinson et al., 2005), learners identify a human voice quickly because of its familiarity from social interaction, which results in active learning. In summary, social agency theory holds that using social cues in multimedia learning improves educational quality and increases retention (Dinçer & Doğanay, 2017).

Cues, such as the voice or image of presenters embedded in a multimedia lesson, can serve as social stimuli. The extent to which such cues convey social meaning can vary. For example, a machine-synthesized voice does not carry the same degree of social cues as the human voice (Mayer et al., 2003).

Research on voice style has shown that while an enthusiastic voice prompted higher social ratings, a calm voice led to a higher germane load. Furthermore, the embedded social elements give the impression that multimedia instruction involves social interaction rather than passive lecturing. This may encourage learners to exert the same effort as when interacting with humans.

In terms of gender differences in voice, Linek et al. (2010) found that a female voice was more effective than a male voice at capturing learners' attention and led to higher retention scores. In the social ratings, the female voice was also judged to be more assertive and appealing.

Challenges

It is important to note that, despite the well-studied cognitive impacts, some of these principles have not been thoroughly investigated in experimental settings (Craig & Schroeder, 2019). They were proposed in the early 2000s, when technology was not advanced enough to test each of them. More experiments and research should be conducted with advanced text-to-speech engines to investigate the effectiveness of the Voice Principle's application.

References

“Anyone, anyone” teacher from Ferris Bueller’s Day Off. (2011, December 29). Retrieved November 18, 2019, from https://www.youtube.com/watch?v=uhiCFdWeQfA.

Mayer, R. (2014). The Cambridge Handbook of Multimedia Learning, Second Edition. New York City: Cambridge University Press.

Sweller, J. (1994). Cognitive load theory, learning difficulty, and instructional design. Learning and Instruction, 4, 295-312.

Atkinson, R. K., Mayer, R. E., & Merrill, M. M. (2005). Fostering social agency in multimedia learning: Examining the impact of an animated agent's voice. Contemporary Educational Psychology, 30(1), 117-139. https://doi.org/10.1016/j.cedpsych.2004.07.001

Dinçer, S., & Doğanay, A. (2017). The effects of multiple-pedagogical agents on learners' academic success, motivation, and cognitive load. Computers & Education, 111, 74-100. https://doi.org/10.1016/j.compedu.2017.04.005

Wouters, P., Paas, F., & van Merriënboer, J. J. (2008). How to optimize learning from animated models: A review of guidelines based on cognitive load. Review of Educational Research, 78(3), 645-675. https://doi.org/10.3102/0034654308320320

Bandura, A. (1969). Social-learning theory of identificatory processes. Handbook of socialization theory and research, 213, 262.

Linek, S. B., Gerjets, P., & Scheiter, K. (2010). The speaker/gender effect: does the speaker’s gender matter when presenting auditory text in multimedia messages?. Instructional Science, 38(5), 503-521. https://doi.org/10.1007/s11251-009-9115-8

Paas, F., Renkl, A., & Sweller, J. (2003). Cognitive load theory and instructional design: Recent developments. Educational psychologist, 38(1), 1-4. https://doi.org/10.1207/S15326985EP3801_1

Peekaboo Kidz. (2021, May 18). How Corona Virus Affects Your Body? | COVID-19 | The Dr Binocs Show | Peekaboo Kidz. YouTube. https://www.youtube.com/watch?v=0yp2zbU4fas