Today, children of all ages interact with speech recognition systems but are largely unaware of how they work. Teaching K-12 students to investigate how these systems employ phonological, syntactic, semantic, and cultural knowledge to resolve ambiguities in the audio signal can give them a window into complex AI decision-making and help them appreciate the richness and complexity of human language. We describe a browser-based tool for exploring the Google Web Speech API and a series of experiments that students can carry out to measure what the service knows about language and the types of biases it exhibits. Middle school students taking an introductory AI elective were able to use the tool to explore Google’s knowledge of homophones and its ability to exploit context to disambiguate them. Older students could potentially conduct more comprehensive investigations, which we lay out here. This approach of investigating the power and limitations of speech technology through carefully designed experiments can also be applied to other AI application areas, such as face detection, object recognition, machine translation, or question answering.