Faculty Profile
Professor: Digital Media, Graduate School
Katunobu ITOU
- Ph.D. (Computer Science)
Research area:
- Speech Recognition
- Multi-Modal Dialog System
- Speech Interface
Personal Statement
He received the B.E., M.E., and Ph.D. degrees in computer science from Tokyo Institute of Technology in 1988, 1990, and 1993, respectively. From 2003 to 2006, he was an associate professor at the Graduate School of Information Science, Nagoya University. In 2006, he joined the Faculty of Computer and Information Sciences at Hosei University, Japan, as a professor. His current research interest is spoken language processing. He is a member of the Information Processing Society of Japan and the Acoustical Society of Japan.
Teaching Courses
Undergraduate School
- Mathematical Literacy A
- Mathematical Literacy B
- Machine and Assembly Language
- Project A
- Project B
- Speech Processing
- Seminar on Computer Science
- Thesis
Graduate School
- Speech and Language Processing
- Speech and Language Processing I
- Speech and Language Processing II
- IT Factory Seminar III
- Research Seminar III
- Research Course in Computer and Information Sciences
- Master Thesis
- Doctor Dissertation
Laboratory
The target of my work is natural language as a tool in the real world. It covers areas such as speech recognition, multi-modal dialog systems, speech interfaces, and information retrieval.
Natural language has two major functions: an interactional function and a transactional function. The former has been assumed to be covered by spoken language, and the latter by written language.
One of my major research topics is using speech technologies to make spoken language serve the transactional function. For the interactional function, I am exploring the possibilities of prosodic aspects in spoken language interfaces.
I aim to clarify the capabilities of natural language in real-world applications.
Publications
2001
- M. Goto, K. Itou, T. Akiba and S. Hayamizu. Speech Completion: New Speech Interface with On-demand Completion Assistance. HCI-International 2001, (2001/8).
- T. Akiba and K. Itou. A Structured Statistical Language Model conditioned by Arbitrarily Abstracted Grammatical Categories based on GLR parsing. Eurospeech 2001, (2001/9).
- F. Asano, M. Goto, K. Itou and H. Asoh. Real-time Sound Source Localization and Separation System and Its Application to Automatic Speech Recognition. Eurospeech 2001, (2001/9).
- Atsushi Fujii, Katunobu Itou, and Tetsuya Ishikawa. Speech-Driven Text Retrieval: Using Target IR Collections for Statistical Language Model Adaptation in Speech Recognition. ACM SIGIR'01 Workshop on Information Retrieval Techniques for Speech Applications, Sep. 2001.
- Shigeki Sagayama, Satoshi Nakamura, Tsuneo Nitta, Katunobu Itou, Yoichi Yamashita, Shigeo Morishima, Atsushi Yamada. Development of anthropomorphic dialog agent; a plan and development and its significance. IPNMD 2001, Dec. 2001.
- Katunobu Itou, Atsushi Fujii, and Tetsuya Ishikawa. Language Modeling for Multi-Domain Speech-Driven Text Retrieval. IEEE Automatic Speech Recognition and Understanding Workshop, Dec. 2001.
2002
- Akinobu LEE, Tatsuya KAWAHARA, Kazuya TAKEDA, Masato MIMURA, Atsushi YAMADA, Akinori ITO, Katsunobu ITOU, Kiyohiro SHIKANO. Continuous Speech Recognition Consortium --- an Open Repository for CSR Tools and Models ---. Proceedings of International Conference on Language Resources and Evaluation (LREC2002), May 2002.
- Atsushi Fujii, Katunobu Itou, and Tetsuya Ishikawa. Producing a Large-scale Encyclopedic Corpus over the Web. Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC-2002), pp.1737-1740, May. 2002.
2003
- Atsushi Fujii, Katunobu Itou, and Tetsuya Ishikawa. A System for On-demand Video Lectures. AAAI-03 Spring Symposium on Intelligent Multimedia Knowledge Management, Mar. 2003.
- Atsushi Fujii, Katunobu Itou, and Tetsuya Ishikawa. LODEM: A Multilingual Lecture-on-demand System. Proceedings of the 2003 ISCA Workshop on Multilingual Spoken Document Retrieval, pp.13-18, April. 2003.
- Tomoyosi Akiba, Katunobu Itou, and Atsushi Fujii. Adapting Language Models for Frequent Fixed Phrases by Emphasizing N-gram Subsets. Proceedings of the 8th European Conference on Speech Communication and Technology (Eurospeech 2003), pp.1469-1472, Sep. 2003.
- Atsushi Fujii and Katunobu Itou. Building a Test Collection for Speech-Driven Web Retrieval. Proceedings of the 8th European Conference on Speech Communication and Technology (Eurospeech 2003), pp.1153-1156, Sep. 2003.
- Atsushi Fujii, Katunobu Itou, Tomoyosi Akiba, and Tetsuya Ishikawa. A Cross-media Retrieval System for Lecture Videos. Proceedings of the 8th European Conference on Speech Communication and Technology (Eurospeech 2003), pp.1149-1152, Sep. 2003.
- Masataka Goto, Yukihiro Omoto, Katunobu Itou, and Tetsunori Kobayashi: Speech Shift: Direct Speech-Input-Mode Switching through Intentional Control of Voice Pitch, Proceedings of the 8th European Conference on Speech Communication and Technology (Eurospeech 2003), September 2003.
2004
- Tomoyosi Akiba, Atsushi Fujii, and Katunobu Itou. Collecting Spontaneously Spoken Queries for Information Retrieval. Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC-2004), May. 2004.
- Kei IGARASHI, Chiyomi MIYAJIMA, Katsunobu ITOU, Kazuya Takeda, Fumitada ITAKURA and Huseyn Abut. Biometric Identification Using Driving Behavioral Signals. Proc. of ICME 2004, 2004
- T. Kawahara, A. Lee, K. Takeda, K. Itou and K. Shikano. Recent Progress of Open-Source LVCSR Engine Julius and Japanese Model Repository. Proc. of International Conference on Spoken Language Processing, (INTERSPEECH/ICSLP 2004, Cheju Korea), Spec4402p.6, November 2004.
- H. Bann, C. Miyajima, K. Itou, K. Takeda, F. Itakura. Speech recognition using synchronization between speech and finger tapping. Proc. of International Conference on Spoken Language Processing, (INTERSPEECH/ICSLP 2004, Cheju Korea), ThC2501o.1, November 2004.
- H. Fujimura, K. Itou, K. Takeda, F. Itakura. Analysis of In-car speech recognition experiments using a large-scale multi-mode dialogue corpus. Proc. of International Conference on Spoken Language Processing, (INTERSPEECH/ICSLP 2004, Cheju Korea), Spec4001o.2, November 2004.
2005
- Weifeng Li, Katunobu Itou, Kazuya Takeda, and Fumitada Itakura Two-stage Noise Spectra Estimation and Regression based In-car Speech Recognition using Single Distant Microphone. ICASSP 2005, pp.I-533-536, 2005.
- Hiroshi Fujimura, Chiyomi Miyajima, Katsunobu Itou, Kazuya Takeda, and Fumitada Itakura Analysis of a large in-car speech corpus and its application to the multimodel ASR. ICASSP 2005, pp. I-445-I-448, 2005
- Seiichiro Hosoe, Takanori Nishino, Katsunobu Itou, and Kazuya Takeda. Measurement of head-related transfer functions in the proximal region. ForumAcusticum2005, Nr. 4460, 2005.
- Madoka Takimoto, Takanori Nishino, Katsunobu Itou, and Kazuya Takeda. Evaluation of sound localization under condition of covered ears. ForumAcusticum2005, Nr. 5190, 2005.
- Naoya Inoue, Takanori Nishino, Katsunobu Itou, and Kazuya Takeda. HRTF modeling using physical features. ForumAcusticum2005, Nr. 3690, 2005.
- Toshiyuki Kimura, Wataru Mizuno, Takanori Nishino, Katsunobu Itou, and Kazuya Takeda. Sound Field Auralization System in Free Listening Positions Using Wave Field Synthesis and Head Related Transfer Functions. ForumAcusticum2005, Nr. 3690, 2005.
- Takanori Nishino, Fuminori Saito, Katsunobu Itou, and Kazuya Takeda. Modeling of a Room Impulse Response with Cepstrum Analysis. ForumAcusticum2005, 2005.
- Hiroshi Tanaka, Hiroshi Fujimura, Chiyomi Miyajima, Takanori Nishino, Katsunobu Itou and Kazuya Takeda. Data Collection and Evaluation of Speech Recognition for Motorbike Riders. Eurospeech 2005, 4DP2-15, 2005.
- Weifeng Li, Katunobu Itou, Kazuya Takeda and Fumitada Itakura Subjective and Objective Quality Assessment of Regression-enhanced Speech in Real Car Environments. Eurospeech 2005, 4BP3-11, 2005
- Yasunori Ohishi, Masataka Goto, Katunobu Itou, and Kazuya Takeda Discrimination between Singing and Speaking Voices. Eurospeech 2005, 3CO2-6, 2005.
- Kazuya Takeda, Hiroshi Fujimura, Katsunobu Itou, Nobuo Kawaguchi, Shigeki Matsubara and Fumitada Itakura. Construction and Evaluation of a Large In-Car Speech Corpus. IEICE Trans. Inf. and Syst., Vol.E88-D, No.3, pp.553-561, 2005.
- Hiromitsu Ban, Chiyomi Miyajima, Katsunobu Itou, Kazuya Takeda and Fumitada Itakura. Speech Recognition using Finger Tapping Timings. IEICE Trans. Inf. and Syst., vol.E88-D, No.3, pp.667-670, 2005
- Weifeng Li, Tetsuya Shinde, Hiroshi Fujimura, Chiyomi Miyajima, Takanori Nishino, Katunobu Itou, Kazuya Takeda, and Fumitada Itakura, Multiple Regression of Log Spectra for In-car Speech Recognition using Multiple Distributed Microphones IEICE Trans. Inf. and Syst., vol.E88-D, No.3, pp.384-390, 2005
- Naoya Inoue, Toshiyuki Kimura, Takanori Nishino, Katsunobu Itou and Kazuya Takeda. Evaluation of HRTFs estimated using physical features. Acoustical Science and Technology, Acoustical letters, vol. 26, No. 5, pp.453-455, 2005.
2006
- Weifeng Li, Chiyomi MIYAJIMA, Takanori NISHINO, Katunobu ITOU, Kazuya TAKEDA and Fumitada ITAKURA Adaptive Nonlinear Regression for In-car Speech Recognition using Multiple Distributed Microphones. IEICE Trans., Inf. and Syst., vol.E89-D, no.3, 2006
- Toshiyuki Wakita, Koji Ozawa, Chiyomi Miyajima, Katsunobu Ito, Kazuya Takeda Driver Identification Using Driving Behavior Signals. IEICE Trans. Inf.&Syst., vol.E89-D, no.3, 2006