The speech recognition technology underlying closed response query implementations is very simple, even in the more sophisticated systems. For any given interaction, the task perplexity is low and the vocabulary size is comparatively small. As a result, these systems tend to be very robust. Recognition accuracy rates in the low to upper 90% range can be expected depending on task definition, vocabulary size, and the degree of non-native disfluency.
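To make the notion of low task perplexity concrete, the following minimal Python sketch compares a closed response task with a much less constrained one. It treats perplexity as 2 raised to the entropy (in bits) of the distribution over what the learner might say next. The function name `perplexity` and both example distributions are illustrative assumptions, not figures from any system discussed here.

```python
import math


def perplexity(probabilities):
    """Return 2 ** H, where H is the entropy in bits of the distribution."""
    entropy = -sum(p * math.log2(p) for p in probabilities if p > 0)
    return 2 ** entropy


# Hypothetical closed response prompt: four equally likely answers.
closed_task = [0.25] * 4

# Hypothetical unconstrained task: 1,000 equally likely next words.
open_task = [1 / 1000] * 1000

print("closed response:", round(perplexity(closed_task), 2))
print("open response:  ", round(perplexity(open_task), 2))
```

Under these assumed distributions the closed task has a perplexity of about 4 and the open task about 1,000: the recognizer faces far fewer competing hypotheses in the closed case, which is one reason such systems tend to be robust.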
FUTURE TRENDS IN VOICE-INTERACTIVE CALL
In the previous sections, we reviewed the current state of speech technology, discussed some of the factors affecting recognition performance, and introduced a number of research prototypes that illustrate the range of speech-enabled CALL applications that are currently technically and pedagogically feasible. With the exception of a few exploratory open response dialog systems, most of these systems are designed to teach and evaluate linguistic form (pronunciation, fluency, vocabulary study, or grammatical structure). This is no coincidence. Formal features can be clearly identified and integrated into a focused task design. This means that robust performance can be expected. Furthermore, mastering linguistic form remains an important component of L2 instruction, despite the emphasis on communication (Holland, 1995). Prolonged, focused practice of a large number of items is still considered an effective means of expanding and reinforcing linguistic competence (Waters, 1994). However, such practice is time consuming. CALL can automate these aspects of language training, thereby freeing up valuable class time that would otherwise be spent on drills.
While such systems are an important step in the right direction, other more complex and ambitious applications are conceivable and no doubt desirable. Imagine a student being able to access the Internet, find the language of his or her choice, and tap into a comprehensive voice-interactive multimedia language program that would provide the equivalent of an entire first year of college instruction. The computer would evaluate the student's proficiency level and design a course of study tailored to his or her needs. Or think of using the same Internet resources and a set of high-level authoring tools to put together a series of virtual encounters surrounding the task of finding an apartment in Berlin. At a minimum, one would hope that natural speech input capacity becomes a routine feature of any CALL application.
To many educators, these may still seem like distant goals, and yet we believe that they are not beyond reach. In what follows, we identify four of the most persistent issues in building speech-enabled language learning applications and suggest how they might be resolved to enable a more widespread commercial implementation of speech technology in CALL.
1. More research is necessary on modeling and predicting multi-turn dialogs.
An intelligent open response language tutor must not only correctly recognize a given speech input but also understand what has been said and evaluate the meaning of the utterance for pragmatic appropriateness. Automatic speech understanding requires Natural Language Processing (NLP) capabilities, a technology for extracting grammatical, semantic, and pragmatic information from written or spoken discourse. NLP has been successfully deployed in expert systems and information retrieval. One of the first voice-interactive dialog systems using NLP was the DARPA-sponsored Air Travel Information System (Pallett, 1995), which enables the user to obtain flight information and make ticket reservations over the telephone. Similar commercial systems have been implemented for automatic retrieval of weather and restaurant information, virtual environments, and telephone auto-attendants. Many of the lessons learned in developing such systems can be valuable for designing CALL applications for practicing conversational skills.
2. More and better training data are needed to support basic research on modeling non-native conversational speech.
One of the most needed resources for developing open response conversational CALL applications is large corpora of transcribed non-native speech data, of both read and conversational speech. Since accents vary depending on the student's first language, either separate databases must be collected for each L1 subgroup, or a representative sample of speakers of different languages must be included in the database. Creating such databases is extremely labor- and cost-intensive: a phone-level transcription of spontaneous conversational data can cost up to one dollar per phone. A number of multilingual conversational databases of telephone speech are publicly available through the Linguistic Data Consortium (LDC), including Switchboard (US English) and CALLHOME (English, Japanese, Spanish, Chinese, Arabic, German). Our own effort in collaboration with Johns Hopkins University (Byrne, Knodt, Khudanpur, & Bernstein, 1998; Knodt, Bernstein, & Todic, 1998) has been to collect and model spontaneous English conversations between Hispanic natives. All of these efforts will improve our understanding of the disfluent speech of language learners and help model this speech type for the purpose of human-machine communication.
DEFINING AND ACQUIRING LITERACY IN THE AGE OF INFORMATION
Moll defined literacy as "a particular way of using language for a variety of purposes, as a sociocultural practice with intellectual significance" (1994, p. 201). While traditional definitions of literacy have focused on reading and writing, the definition of literacy today is more complex. The process of becoming literate today involves more than learning how to use language effectively; rather, the process amplifies and changes both the cognitive and the linguistic functioning of the individual in society. One who is literate knows how to gather, analyze, and use information resources to solve problems and make decisions, as well as how to learn both independently and cooperatively. Ultimately, literate individuals possess a range of skills that enable them to participate fully in all aspects of modern society, from the workforce to the family to the academic community. Indeed, the development of literacy is "a dynamic and ongoing process of perpetual transformation" (Neilsen, 1989, p. 5), whose evolution is influenced by a person's interests, cultures, and experiences. Researchers have viewed literacy as a multifaceted concept for a number of years (Johns, 1997). However, succeeding in a digital, information-oriented society demands multiliteracies, that is, competence in an even more diverse set of functional, academic, critical, and electronic skills.
To be considered multiliterate, students today must acquire a battery of skills that will enable them to take advantage of the diverse modes of communication made possible by new technologies and to participate in global learning communities. Although becoming multiliterate is not an easy task for any student, it is especially difficult for ESL students operating in a second language. In their attempts to become multiliterate, ESL students must acquire linguistic competence in a new language and at the same time develop the cognitive and sociocultural skills necessary to gain access into the social, academic, and workforce environments of the 21st century. They must become functionally literate, able to speak, understand, read, and write English, as well as use English to acquire, articulate and expand their knowledge. They must also become academically literate, able to read and understand interdisciplinary texts, analyze and respond to those texts through various modes of written and oral discourse, and expand their knowledge through sustained and focused research. Further, they must become critically literate, defined here as the ability to evaluate the validity and reliability of informational sources so that they may draw appropriate conclusions from their research efforts. Finally, in our digital age of information, students must become electronically literate, able "to select and use electronic tools for communication, construction, research, and autonomous learning" (Shetzer, 1998).
Helping students develop the range of literacies they need to enter and succeed at various levels of the academic hierarchy and subsequently in the workforce requires a pedagogy that facilitates and hastens linguistic proficiency development, familiarizes students with the requirements and conventions of academic discourse, and supports the use of critical thinking and higher order cognitive processes. A large body of research conducted over the past decade (see, e.g., Benesch, 1988; Brinton, Snow, & Wesche, 1989; Crandall, 1993; Kasper, 1997a, 2000a; Pally, 2000; Snow & Brinton, 1997) has shown that content-based instruction (CBI) is highly effective in helping ESL students develop the literacies they need to be successful in academic and workforce environments.
CONTENT-BASED INSTRUCTION AND LITERACY DEVELOPMENT
CBI develops linguistic competence and functional literacy by exposing ESL learners to interdisciplinary input that consists of both "everyday" communicative and academic language (Cummins, 1981; Mohan, 1990; Spanos, 1989) and that contains a wide range of vocabulary, forms, registers, and pragmatic functions (Snow, Met, & Genesee, 1989; Zuengler & Brinton, 1997). Because content-based pedagogy encourages students to use English to gather, synthesize, evaluate, and articulate interdisciplinary information and knowledge (Pally, 1997), it also allows them to hone academic and critical literacy skills as they practice appropriate patterns of academic discourse (Kasper, 2000b) and become familiar with sociolinguistic conventions relating to audience and purpose (Soter, 1990).
The theoretical foundations supporting a content-based model of ESL instruction derive from cognitive learning theory and second language acquisition (SLA) research. Cognitive learning theory posits that in the process of acquiring literacy skills, students progress through a series of three stages: the cognitive, the associative, and the autonomous (Anderson, 1983a). Progression through these stages is facilitated by scaffolding, which involves providing extensive instructional support during the initial stages of learning and gradually removing this support as students become more proficient at the task (Chamot & O'Malley, 1994). SLA research emphasizes that literacy development can be facilitated by providing multiple opportunities for learners to interact in communicative contexts with authentic, linguistically challenging materials that are relevant to their personal and educational goals (see, e.g., Brinton et al., 1989; Kasper, 2000a; Krashen, 1982; Snow & Brinton, 1997; Snow et al., 1989).
In a 1996 paper published in The Harvard Educational Review, The New London Group (NLG) advocated developing multiliteracies through a pedagogy that involves a complex interaction of four factors which they called Situated Practice, Overt Instruction, Critical Framing, and Transformed Practice. According to the NLG, becoming multiliterate requires critical engagement in relevant tasks, interaction with diverse forms of communication made possible by electronic technologies, and participation in collaborative learning contexts. Warschauer (1999) concurred and stated that a pedagogy of critical inquiry and problem solving that provides the context for "authentic and collaborative projects and analyses" (p. 16) that support and are supported by the use of electronic technologies is necessary for ESL students to acquire the linguistic, social, and technological competencies key to literacy in a digital world.
According to a 1995 report published by the United States Department of Education, "technology is an important enabler for classes organized around complex, authentic tasks" and when "used in support of challenging projects, [technology] can contribute to students' sense ... that they are using real tools for real purposes." Technology use increases students' motivation as it promotes their active engagement with language and content through authentic, challenging tasks that are interdisciplinary in nature (McGrath, 1998). Technology use also encourages students to spend more time on task. As they search for information in a hyperlinked environment, ESL students benefit from increased opportunities to process linguistic and content information. Used as a tool for learning, technology supports a level of task authenticity and complexity that fits well with the interdisciplinary work inherent in content-based instruction and that promotes the acquisition of multiliteracies.
THEORY INTO PRACTICE
These research findings suggest that in our efforts to prepare ESL students for the challenges of the academic and workforce environments of the 21st century, we should adopt a pedagogical model that incorporates information technology as an integral component and that specifically targets the development of the range of literacies deemed necessary for success in a digital, information-oriented society. This paper describes a content-based pedagogy, which I call focus discipline research (Kasper, 1998a), and presents the results of a classroom study conducted to measure the effects of focus discipline research on the development of ESL students' literacy skills.
As described here, focus discipline research puts theory into practice as it incorporates the principles of cognitive learning theory, SLA research, and the four components of the NLG's (1996) pedagogy of multiliteracies. Through pedagogical activities that provide the context for situated practice, overt instruction, critical framing, and transformed practice, focus discipline research promotes ESL students' choice of and responsibility for course content, engages them in extended practice with linguistic structures and interdisciplinary material, and encourages them to become "content experts" in a subject of their own choosing.