{"created":"2023-06-20T13:21:52.296408+00:00","id":2170,"links":{},"metadata":{"_buckets":{"deposit":"3735ca26-8b65-48a4-9668-a101804ee909"},"_deposit":{"created_by":21,"id":"2170","owners":[21],"pid":{"revision_id":0,"type":"depid","value":"2170"},"status":"published"},"_oai":{"id":"oai:ir.soken.ac.jp:00002170","sets":["2:429:19"]},"author_link":["0","0","0"],"item_1_creator_2":{"attribute_name":"著者名","attribute_type":"creator","attribute_value_mlt":[{"creatorNames":[{"creatorName":"Austermann, Anja Nicole"}],"nameIdentifiers":[{}]}]},"item_1_creator_3":{"attribute_name":"フリガナ","attribute_type":"creator","attribute_value_mlt":[{"creatorNames":[{"creatorName":"オースタマン, アーニャ ニコル"}],"nameIdentifiers":[{}]}]},"item_1_date_granted_11":{"attribute_name":"学位授与年月日","attribute_value_mlt":[{"subitem_dategranted":"2010-09-30"}]},"item_1_degree_grantor_5":{"attribute_name":"学位授与機関","attribute_value_mlt":[{"subitem_degreegrantor":[{"subitem_degreegrantor_name":"総合研究大学院大学"}]}]},"item_1_degree_name_6":{"attribute_name":"学位名","attribute_value_mlt":[{"subitem_degreename":"博士(情報学)"}]},"item_1_description_12":{"attribute_name":"要旨","attribute_value_mlt":[{"subitem_description":"Understanding a user's natural interaction is a challenge that needs to be addressed in order to enable novice users to use robots smoothly and intuitively. While using a set of\r\nhard-coded commands to control a robot is usually rather reliable and easy to implement, it is troublesome for the user, because it requires him/her to learn and remember special commands in order to interact with the robot and does not allow the user to use his or her natural interaction style. Understanding natural, unrestricted spoken language and multi-modal user behavior would be desirable but is still an unsolved problem.\r\n\r\nTherefore, this dissertation proposes a domain-specific approach to enable a robot to learn to understand its user's natural way of giving commands and feedback through natural\r\ninteraction in special virtual training tasks. The user teaches the robot to understand his/her individual way of expressing approval, disapproval and a limited number of\r\ncommands using speech, prosody and touch.\r\n\r\nIn order to enable the robot to pro-actively explore how the user gives commands and provoke approving and disapproving reactions, the system uses special training tasks.\r\nDuring the training, the robot cannot actually understand its user. In order to enable the robot to react appropriately anyway, the training tasks are designed in such a way that the robot can anticipate the user's commands and feedback - e.g. by using games which allow the user to judge easily whether a move of the robot was good or bad and give appropriate feedback, so that the robot can accurately guess whether to expect positive or negative feedback and even provoke the feedback it wants to learn by deliberately making good or bad moves.\r\n\r\nIn this work, \"virtual\" training tasks are used to avoid time-consuming walking motion and to enable the robot to access all properties of the task instantly. The task-scene is shown on a screen and the robot visualizes its actions by motion, sounds and its LEDs. A first experiment for learning positive and negative feedback uses easy games, like \"Connect Four\" and \"Pairs\" in which the robot could explore the user's feedback behavior by making good or bad moves. 
In a follow-up study, conducted with a child-sized humanoid robot as well as the pet robot AIBO, this work has been extended to learning simple commands. The experiments used a \"virtual living room\", a simplified living room scene, in which the user can ask the robot to fulfill tasks such as switching on the TV or serving a coffee.\r\n\r\nAfter learning the names of the different objects in the room by pointing at them and asking the user to name them, the robot requests the task server to show a situation that requires a certain action to be performed by the robot, e.g. the light is switched off so that the room is too dark. The user responds to this situation by giving the appropriate command to the robot: \"Hey robot, can you switch the light on?\" or \"It's too dark here!\". By performing correctly or incorrectly, the robot can provoke positive or negative feedback from the user. One of the benefits of \"virtual\" training tasks is that the robot can learn commands that the user cannot teach by demonstration but which seem necessary for a service or entertainment robot, such as showing the battery status, recharging, or shutting down.\r\n\r\nThe robot learns by a two-stage algorithm based on Hidden Markov Models and classical conditioning, which is inspired by associative learning in humans and animals. In the first stage, which corresponds to stimulus encoding in natural learning, unsupervised training of HMMs is used to model the incoming speech and prosody stimuli. Touch stimuli are represented using a simple duration-based model. Unsupervised training of HMMs allows the system to cluster similar perceptions without depending on explicit transcriptions of what the user has said or done, which are not available when learning through natural interaction.\r\n\r\nUtterances and meanings usually cannot be mapped one-to-one, because the same meaning can be expressed by multiple utterances, and the same utterance can have different meanings. This is handled by the associative learning stage, which associates the trained HMMs with meanings and integrates perceptions from different modalities using an implementation of classical conditioning. The meanings are inferred from the robot's situation. For example, if the robot has just requested the task server to show a dirty spot on the carpet, it assumes that the following utterance means clean(carpet), so the system first searches for a match of any of the HMMs associated with the meaning \"carpet\". Then, the remainder of the utterance is used to train an HMM sequence to be associated with the meaning \"to clean\". The positions of the detected parameters are used to insert appropriate placeholders into the recognition grammar. In a first study, based on game-like tasks, the robot learned to discriminate between positive and negative feedback based on speech, prosody and touch with an average accuracy of 95.97%. 
The performance in the more complex command learning task is 84.45% for distinguishing eight commands with 16 possible parameters.","subitem_description_type":"Other"}]},"item_1_description_18":{"attribute_name":"フォーマット","attribute_value_mlt":[{"subitem_description":"application/pdf","subitem_description_type":"Other"}]},"item_1_description_7":{"attribute_name":"学位記番号","attribute_value_mlt":[{"subitem_description":"総研大甲第1384号","subitem_description_type":"Other"}]},"item_1_select_14":{"attribute_name":"所蔵","attribute_value_mlt":[{"subitem_select_item":"有"}]},"item_1_select_8":{"attribute_name":"研究科","attribute_value_mlt":[{"subitem_select_item":"複合科学研究科"}]},"item_1_select_9":{"attribute_name":"専攻","attribute_value_mlt":[{"subitem_select_item":"17 情報学専攻"}]},"item_1_text_10":{"attribute_name":"学位授与年度","attribute_value_mlt":[{"subitem_text_value":"2010"}]},"item_creator":{"attribute_name":"著者","attribute_type":"creator","attribute_value_mlt":[{"creatorNames":[{"creatorName":"AUSTERMANN, Anja Nicole","creatorNameLang":"en"}],"nameIdentifiers":[{}]}]},"item_files":{"attribute_name":"ファイル情報","attribute_type":"file","attribute_value_mlt":[{"accessrole":"open_date","date":[{"dateType":"Available","dateValue":"2016-02-17"}],"displaytype":"simple","filename":"甲1384_要旨.pdf","filesize":[{"value":"288.7 kB"}],"format":"application/pdf","licensetype":"license_11","mimetype":"application/pdf","url":{"label":"要旨・審査要旨","url":"https://ir.soken.ac.jp/record/2170/files/甲1384_要旨.pdf"},"version_id":"d36afd72-b8ae-4aff-a403-7bcee2cf33dd"},{"accessrole":"open_date","date":[{"dateType":"Available","dateValue":"2016-02-17"}],"displaytype":"simple","filename":"甲1384_本文.pdf","filesize":[{"value":"7.5 MB"}],"format":"application/pdf","licensetype":"license_11","mimetype":"application/pdf","url":{"label":"本文","url":"https://ir.soken.ac.jp/record/2170/files/甲1384_本文.pdf"},"version_id":"023c9068-e4ec-4e9e-a533-8169317951e0"}]},"item_language":{"attribute_name":"言語","attribute_value_mlt":[{"subitem_language":"eng"}]},"item_resource_type":{"attribute_name":"資源タイプ","attribute_value_mlt":[{"resourcetype":"thesis","resourceuri":"http://purl.org/coar/resource_type/c_46ec"}]},"item_title":"Learning to Understand Multimodal Commands and Feedback for Human-Robot Interaction","item_titles":{"attribute_name":"タイトル","attribute_value_mlt":[{"subitem_title":"Learning to Understand Multimodal Commands and Feedback for Human-Robot Interaction"},{"subitem_title":"Learning to Understand Multimodal Commands and Feedback for Human-Robot Interaction","subitem_title_language":"en"}]},"item_type_id":"1","owner":"21","path":["19"],"pubdate":{"attribute_name":"公開日","attribute_value":"2011-06-03"},"publish_date":"2011-06-03","publish_status":"0","recid":"2170","relation_version_is_last":true,"title":["Learning to Understand Multimodal Commands and Feedback for Human-Robot Interaction"],"weko_creator_id":"21","weko_shared_id":-1},"updated":"2023-06-20T15:55:00.941566+00:00"}