List of publications in fiscal year 2025
Journal papers
- S. Luan, Y. Wakabayashi, T. Toda, "Generalized sound field interpolation for freely spaced microphone arrays in rotation-robust beamforming," Applied Acoustics, Vol. 236, Article 110706, pp. 1-15, Apr. 2025.
- M. Eshghi, T. Toda, "Predicting fundamental frequency patterns in electrolaryngeal speech using automated phoneme extraction," IEEE Access, Vol. 13, pp. 73831-73847, Apr. 2025.
- Y. Ohtani, T. Okamoto, T. Toda, H. Kawai, "Fast neural vocoder with fundamental frequency control using finite impulse response filters," IEEE Transactions on Audio, Speech and Language Processing, Vol. 33, pp. 1893-1906, Apr. 2025.
- D. Ma, Y. Choi, T. Fujimura, F. Li, C. Xie, K. Kobayashi, T. Toda, "Sequence-to-sequence voice conversion-based techniques for electrolaryngeal speech enhancement in noisy and reverberant conditions," APSIPA Transactions on Signal and Information Processing, Vol. 14, No. 1, e8, pp. 1-40, May 2025.
- T. Tsuboi, T. Uematsu, K. Sawada, M. Higuchi, M. Hashida, M. Muto, Y. Ito, T. Ishizaki, S. Kato, D. Nakatsubo, T. Tsugawa, S. Maesawa, Y. Saito, T. Fukushima, D. Tamakoshi, K. Hiraga, M. Suzuki, R. Saito, A. Ramirez-Zamora, M.S. Okun, M. Katsuno, "Determinants of clinical and neurophysiological features in essential tremor and essential tremor plus," Journal of Neural Transmission, Vol. 132, No. 7, pp. 1041-1050, May 2025.
- C. Xie, T. Toda, "An investigation of noisy-to-noisy voice conversion performance in various noisy conditions," APSIPA Transactions on Signal and Information Processing, Vol. 14, No. 1, e10, pp. 1-30, June 2025.
- T. Fujimura, T. Toda, "Analysis and extension of noisy-target training for unsupervised target signal enhancement," APSIPA Transactions on Signal and Information Processing, Vol. 14, No. 1, e12, pp. 1-27, June 2025.
- I. Kuroyanagi, T. Fujimura, K. Takeda, T. Toda, "Improving anomalous sound detection through pseudo-anomalous set selection and pseudo-label utilization under unlabeled conditions," APSIPA Transactions on Signal and Information Processing, Vol. 14, No. 1, e13, pp. 1-28, June 2025.
- J. He, T. Toda, "PMF-CEC: phoneme-augmented multimodal fusion for context-aware ASR error correction with error-specific selective decoding," IEEE Transactions on Audio, Speech and Language Processing, Vol. 33, pp. 2402-2417, June 2025.
- Y. Choi, C. Xie, T. Toda, "Noise and reverberation-controllable voice conversion," IEEE Transactions on Audio, Speech and Language Processing, Vol. 33, pp. 2430-2443, June 2025.
- T. Ogura, T. Okamoto, Y. Ohtani, E. Cooper, T. Toda, H. Kawai, "Phoneme-level duration controllable neural text-to-speech with phoneme embedding skip connection and modified Gaussian duration modeling," IEEE Access, Vol. 13, pp. 118369-118380, July 2025.
- Y. Hashizume, L. Li, A. Miyashita, T. Toda, "Learning separated representations for instrument-based music similarity," APSIPA Transactions on Signal and Information Processing, Vol. 14, No. 1, e16, pp. 1-32, July 2025.
- D. Ma, L.P. Violeta, K. Kobayashi, T. Toda, "Pretraining and fine-tuning techniques for electrolaryngeal speech enhancement based on sequence-to-sequence voice conversion," IEEE Transactions on Audio, Speech and Language Processing, Vol. 33, pp. 3189-3201, July 2025.
- T.B. Tienkamp, T. Rebernik, B.M. Halpern, R.J.J.H. van Son, M. Wieling, M.J.H. Witjes, S.A.H.J. de Visscher, D. Abur, "Associations between acoustic, kinematic, self-reported, and perceptual measures of speech in individuals surgically treated for oral cancer," Journal of Speech, Language, and Hearing Research, Vol. 68, No. 7, pp. 3069-3089, July 2025.
- S. Chen, T. Toda, "QHARMA-GAN: quasi-harmonic neural vocoder based on autoregressive moving average model," IEEE Transactions on Audio, Speech and Language Processing, Vol. 33, pp. 3703-3719, Sep. 2025.
- Y. Yasuda, T. Toda, "Automatic design optimization of preference-based subjective evaluation with online learning in crowdsourcing environment," Computer Speech and Language, Vol. 96, Article 101888, pp. 1-16, Sep. 2025.
- D. Yoshioka, Y. Nakata, Y. Yasuda, T. Toda, "Text- and speech-style control for lecture speech generation focusing on disfluency," APSIPA Transactions on Signal and Information Processing, Vol. 14, No. 1, e26, pp. 1-31, Sep. 2025.
- J. He, X. Shi, C.-H. Hu, J. Mi, X. Li, T. Toda, "M4SER: multimodal, multirepresentation, multitask, and multistrategy learning for speech emotion recognition," IEEE Transactions on Audio, Speech and Language Processing, Vol. 33, pp. 4055-4070, Sep. 2025.
- 西尾 直樹, 小林 和弘, 戸田 智基, "喉頭摘出者における自己音声の再獲得 ~Save the Voice Project~," 日本気管食道科学会会報, Vol. 76, No. 5, pp. 255-263, Oct. 2025.
- T. Imamura, Y. Hashizume, W.-C. Huang, T. Toda, "Music similarity representation learning focusing on individual instruments with source separation and human preference," APSIPA Transactions on Signal and Information Processing, Vol. 14, No. 4, e305, pp. 1-29, Oct. 2025.
- R. Yoneyama, A. Miyashita, R. Yamamoto, T. Toda, "Wavehax: aliasing-free neural waveform synthesis based on 2D convolution and harmonic prior for reliable complex spectrogram estimation," IEEE Transactions on Audio, Speech and Language Processing, Vol. 33, pp. 4454-4470, Oct. 2025.
- T. Komatsu, K. Takeda, T. Toda, "Audio difference learning framework for audio captioning," APSIPA Transactions on Signal and Information Processing, Vol. 14, No. 1, e34, pp. 1-18, Nov. 2025.
- B.M. Halpern, T.B. Tienkamp, T. Rebernik, R.J.J.H. van Son, S.A.H.J. de Visscher, M.J.H. Witjes, D. Abur, T. Toda, "XPPG-PCA: reference-free automatic speech severity evaluation with principal components," IEEE Journal of Selected Topics in Signal Processing, Vol. 19, No. 5, pp. 783-795, Oct. 2025.
- L.P. Violeta, W.-C. Huang, D. Ma, R. Yamamoto, K. Kobayashi, T. Toda, "Resolving domain mismatches in electrolaryngeal speech enhancement with linguistic intermediates," IEEE Journal of Selected Topics in Signal Processing, Vol. 19, No. 5, pp. 827-839, Oct. 2025.
- X. Li, N. Luo, F. Yu, J. Li, K. Li, Y. Li, Z. Zhao, Y. Liu, X. Shi, "Human auditory representation learning for cross-dialect bird species recognition," Ecological Informatics, Vol. 93, Article 103554, pp. 1-20, Dec. 2025.
- H. Yamashita, T. Okamoto, R. Takashima, Y. Ohtani, T. Takiguchi, T. Toda, H. Kawai, "Sequence-to-sequence voice conversion with weighted guided attention," IEEE Access, Vol. 13, pp. 216583-216595, Dec. 2025.
- B.M. Halpern, W.-C. Huang, L.P. Violeta, T. Toda, "Severity-controllable pathological text-to-speech synthesis for clinical applications," IEEE Transactions on Neural Systems and Rehabilitation Engineering, Vol. 34, pp. 573-582, Jan. 2026.
- X. Shi, J. He, X. Li, T. Toda, "A comprehensive study on the effectiveness of ASR representations for noise-robust speech emotion recognition," IEEE Transactions on Audio, Speech and Language Processing, Vol. 34, pp. 707-722, Jan. 2026.
Letters
- N. Nishio, K. Kobayashi, D. Ma, S. Mitani, M. Sone, T. Toda, "A voice conversion system from electrolarynx speech to preoperative patient’s speech for total laryngectomy," OTO Open, Vol. 10, No. 1, Scientific Briefing, 5 pages, Feb. 2026.
International conferences
- Y. Hashizume, T. Toda, "Investigation of perceptual music similarity focusing on each instrumental part," Proc. IEEE ICASSP, 5 pages, Hyderabad, India, Apr. 2025.
- T. Fujimura, I. Kuroyanagi, T. Toda, "Improvements of discriminative feature space training for anomalous sound detection in unlabeled conditions," Proc. IEEE ICASSP, 5 pages, Hyderabad, India, Apr. 2025.
- K. Nishizawa, R. Yamamoto, W.-C. Huang, T. Toda, "Investigating factors related to the naturalness of synthesized unison singing," Proc. IEEE ICASSP, 5 pages, Hyderabad, India, Apr. 2025.
- T. Ogura, T. Okamoto, Y. Ohtani, E. Cooper, T. Toda, H. Kawai, "Mora-level prosody prediction for text-to-speech using Japanese BERT without accentual labels," Proc. IEEE ICASSP, 5 pages, Hyderabad, India, Apr. 2025.
- D. Ma, J. Mi, F. Li, L.P. Violeta, K. Kobayashi, T. Toda, "Improving electrolaryngeal speech enhancement via a representation learning method based on integrated text and speech representations," Proc. IEEE EMBC, 6 pages, Copenhagen, Denmark, July 2025.【3rd Place Award in EMBC 2025 Student Paper Competition(受賞者:Ding Ma)】
- T. Ogura, T. Okamoto, Y. Ohtani, E. Cooper, T. Toda, H. Kawai, "GST-BERT-TTS: prosody prediction without accentual labels for multi-speaker TTS using BERT with global style tokens," Proc. INTERSPEECH, pp. 444-448, Rotterdam, the Netherlands, Aug. 2025.
- X. Shi, X. Li, T. Toda, "Who, When, and What: leveraging the "Three Ws" concept for emotion recognition in conversation," Proc. INTERSPEECH, pp. 1763-1767, Rotterdam, the Netherlands, Aug. 2025.
- W.-C. Huang, E. Cooper, T. Toda, "SHEET: a multi-purpose open-source speech human evaluation estimation toolkit," Proc. INTERSPEECH, pp. 2355-2359, Rotterdam, the Netherlands, Aug. 2025.
- J. He, N. Sawada, K. Miyazaki, T. Toda, "CMT-LLM: context-aware multi-talker ASR utilizing large language models," Proc. INTERSPEECH, pp. 2575-2579, Rotterdam, the Netherlands, Aug. 2025.
- J. He, J. Mi, T. Toda, "GIA-MIC: multimodal emotion recognition with gated interactive attention and modality-invariant learning constraints," Proc. INTERSPEECH, pp. 2695-2699, Rotterdam, the Netherlands, Aug. 2025.
- B. Halpern, T. Tienkamp, T. Rebernik, R. van Son, M. Wieling, D. Abur, T. Toda, "Relationship between objective and subjective perceptual measures of speech in individuals with head and neck cancer," Proc. INTERSPEECH, pp. 3733-3737, Rotterdam, the Netherlands, Aug. 2025.
- X. Shi, X. Li, T. Toda, "Speaker-aware multi-task learning for speech emotion recognition," Proc. INTERSPEECH, pp. 4333-4337, Rotterdam, the Netherlands, Aug. 2025.
- X. Shi, J. Mi, X. Li, T. Toda, "Advancing emotion recognition via ensemble learning: integrating speech, context, and text representations," Proc. INTERSPEECH, pp. 4693-4697, Rotterdam, the Netherlands, Aug. 2025.
- R. Yoneyama, M. Kawamura, R. Terashima, R. Yamamoto, T. Toda, "Comparative analysis of fast and high-fidelity neural vocoders for low-latency streaming synthesis in resource-constrained environments," Proc. INTERSPEECH, pp. 4888-4892, Rotterdam, the Netherlands, Aug. 2025.
- Z. Zhang, W.-C. Huang, X. Wang, X. Miao, J. Yamagishi, "Mitigating language mismatch in SSL-based speaker anonymization," Proc. INTERSPEECH, pp. 5133-5137, Rotterdam, the Netherlands, Aug. 2025.
- C.-H. Hu, Y. Yasuda, A. Yoshimoto, T. Toda, "Unifying listener scoring scales: comparison learning framework for speech quality assessment and continuous speech emotion recognition," Proc. INTERSPEECH, pp. 5428-5432, Rotterdam, the Netherlands, Aug. 2025.
- M. Murata, K. Miyazaki, T. Koriyama, T. Toda, "Eigenvoice synthesis based on model editing for speaker generation," Proc. INTERSPEECH, pp. 5523-5527, Rotterdam, the Netherlands, Aug. 2025.
- Y. Yasuda, J. Yamagishi, T. Toda, "Continual subjective evaluation method of speech by merging sort-based preference tests towards ever-expanding corpus of human ratings," Proc. SSW, pp. 14-20, Leeuwarden, the Netherlands, Aug. 2025.
- L.P. Violeta, W.-C. Huang, T. Toda, "Serenade: a singing style conversion framework based on audio infilling," Proc. EUSIPCO, pp. 411-415, Palermo, Italy, Sep. 2025.
- K. Ogita, R. Yoneyama, W.-C. Huang, T. Toda, "VAE-SiFiGAN: source-filter HiFi-GAN based on variational autoencoder representations with enhanced pitch controllability," Proc. EUSIPCO, pp. 531-535, Palermo, Italy, Sep. 2025.【Finalists of EUSIPCO Best Student Paper Award(対象者:Kenichi Ogita)】
- K. Hattori, W.-C. Huang, K. Takeda, T. Toda, "An evaluation of supervised virtual microphone estimators in reverberant sound fields," Proc. APSIPA ASC, pp. 125-130, Singapore, Oct. 2025.
- H. Miyaji, K. Sawada, W.-C. Huang, T. Toda, "Designing a music difficulty measure for controllable automatic piano rearrangement," Proc. APSIPA ASC, pp. 246-251, Singapore, Oct. 2025.
- K. Sawada, W.-C. Huang, T. Toda, "Hierarchical symbolic music generation with variational autoencoder-based bar-wise feature sequences," Proc. APSIPA ASC, pp. 299-304, Singapore, Oct. 2025.
- M. Kaneko, W.-C. Huang, T. Toda, "Estimating speaker's seating position from monaural speech in a simulated vehicle interior sound field," Proc. APSIPA ASC, pp. 625-629, Singapore, Oct. 2025.
- K. Niwa, K. Kobayashi, T. Toda, "Investigation of the effectiveness of converted speech auditory feedback in low-latency real-time voice conversion," Proc. APSIPA ASC, pp. 753-758, Singapore, Oct. 2025.
- Y. Nakata, D. Yoshioka, W.-C. Huang, T. Toda, "Disfluency disentanglement enhancement in spoken-text-style transfer for spontaneous speech synthesis," Proc. APSIPA ASC, pp. 1098-1103, Singapore, Oct. 2025.
- D. Yoon, T. Toda, "Neural semi-fragile watermarking for proactive deepfake speech detection," Proc. APSIPA ASC, pp. 2092-2097, Singapore, Oct. 2025.
- S. Tang, Z. Liu, L. Chen, K.A. Lee, T. Toda, Z.-H. Ling, "A preliminary study on sectional voice anonymization and detection," Proc. APSIPA ASC, pp. 2229-2234, Singapore, Oct. 2025.
- W.-C. Huang, "Advancing speech quality assessment through scientific challenges and open-source activities," Perspective paper, Proc. APSIPA ASC, pp. 2552-2557, Singapore, Oct. 2025.
- L. Chen, K.A. Lee, Z.-H. Ling, X. Wang, R.K. Das, T. Toda, H. Li, "Speaker privacy and security in the big data era: protection and defense against deepfake," Perspective paper, Proc. APSIPA ASC, pp. 2570-2575, Singapore, Oct. 2025.
- K. Wilkinghoff, T. Fujimura, K. Imoto, J. Le Roux, Z.-H. Tan, T. Toda, "Handling domain shifts for anomalous sound detection: a review of DCASE-related work," Proc. DCASE Workshop, pp. 20-24, Barcelona, Spain, Oct. 2025.
- M. Matsumoto, T. Fujimura, W.-C. Huang, T. Toda, "Adjusting bias in anomaly scores via variance minimization for domain-generalized discriminative anomalous sound detection," Proc. DCASE Workshop, pp. 25-29, Barcelona, Spain, Oct. 2025.
- T. Fujimura, K. Wilkinghoff, K. Imoto, T. Toda, "ASDKit: a toolkit for comprehensive evaluation of anomalous sound detection methods," Proc. DCASE Workshop, pp. 40-44, Barcelona, Spain, Oct. 2025.
- T. Fujimura, I. Kuroyanagi, T. Toda, "Discriminative anomalous sound detection using pseudo labels, target signal enhancement, and ensemble feature extractors," Proc. DCASE Workshop, pp. 180-184, Barcelona, Spain, Oct. 2025.
- K. Mizukami, D. Deguchi, T. Toda, H. Murase, H. Kyutoku, T. Minematsu, "Study on automatic generation of lecture videos based on content analysis of lecture slides," Proc. CELDA, 4 pages, Porto, Portugal, Nov. 2025.
- J. He, N. Sawada, K. Miyazaki, T. Toda, "PARCO: phoneme-augmented robust contextual ASR via contrastive entity disambiguation," Proc. IEEE ASRU, 7 pages, Honolulu, USA, Dec. 2025.
- E. Cooper, T. Okamoto, Y. Ohtani, T. Toda, H. Kawai, "Layer-wise analysis for quality of multilingual synthesized speech," Proc. IEEE ASRU, 7 pages, Honolulu, USA, Dec. 2025.
- W.-C. Huang, H. Wang, C. Liu, Y.-C. Wu, A. Tjandra, W.-N. Hsu, E. Cooper, Y. Qin, T. Toda, "The AudioMOS Challenge 2025," Proc. IEEE ASRU, 8 pages, Challenge paper, Honolulu, USA, Dec. 2025.
- W. Ren, Y.-C. Lin, W.-C. Huang, R.E. Zezario, S.-W. Fu, S.-F. Huang, E. Cooper, H. Wu, H.-Y. Wei, H.-M. Wang, H.-y. Lee, Y. Tsao, "HighRateMOS: sampling-rate aware modeling for speech quality assessment," Proc. IEEE ASRU, 4 pages, Challenge paper, Honolulu, USA, Dec. 2025.
- Y. Ohtani, T. Okamoto, T. Toda, H. Kawai, "Voice factor control using FIR-based fast neural vocoder for speech generation applications," Proc. IEEE ASRU, 4 pages, Demo paper, Honolulu, USA, Dec. 2025.
- J. Shi, B.-H. Su, S. Bharadwaj, Y. Zhao, S.-H. Wang, J. Hang, H. Wang, W. Wang, W. Feng, Y. Tang, N. Topaloglu, S. Arora, J. Tian, W. Chen, H.-j. Shim, W. Zhang, W.-C. Huang, S. Watanabe, "VERSA-v2: a modular and scalable toolkit for speech and audio evaluation with expanded metrics, visualization, and LLM integration," Proc. IEEE ASRU, 4 pages, Demo paper, Honolulu, USA, Dec. 2025.
Tutorials
- W.-C. Huang, E. Cooper, J. Shi, "Automatic quality assessment for speech and beyond," Tutorial, INTERSPEECH, Rotterdam, the Netherlands, Aug. 2025.
Invited talks
- 米山 怜於, "ニューラルボコーダ概説:生成モデルと実用性の観点から," 音学シンポジウム, 招待講演, 東京, June 2025.
- 戸田 智基, "音声研究の知見がニューラルボコーダの発展にもたらす効果," 音学シンポジウム, 招待講演, 東京, June 2025.
- T. Toda, "Recent advances and future directions in voice conversion," Survey Talk, INTERSPEECH, Rotterdam, the Netherlands, Aug. 2025.
- T. Toda, "Personalized speech generation," Perspective Talk, Panel session "Voice Privacy and Security," APSIPA ASC, Singapore, Oct. 2025.
- W.-C. Huang, "Advancing speech quality assessment through scientific challenges and open-source activities," Perspective Talk, Panel session "Neural Speech Assessment and Its Application," APSIPA ASC, Singapore, Oct. 2025.
- W.-C. Huang, "Challenges in self-supervised speech representation-based voice conversion," ASA-ASJ Joint Meeting, Invited Talk, 3aSC1, Honolulu, USA, Dec. 2025.
- T. Toda, "Lessons learned from research in speech signal processing," Symposium on Speech and Behavior Informatics, Honolulu, USA, Dec. 2025.
- W.-C. Huang, "A Taiwanese scholar in Japan: a guide to studying abroad in Japan," Invited Talk, Information Center on Study Abroad, Taipei Public Library, Taiwan, Jan. 2026.
- 戸田 智基, Xiaohan Shi, "音声表情に着目した音声情報処理の進展," 音響学会, 招待講演, 東京, Mar. 2026.
Technical meetings
- 藤村 拓弥, "ICASSP2025における異常音検知の動向," 信学技報, Vol. 125, No. 36, EA2025-1, pp. 1-6, May 2025.
応用音響研究会, オーガナイズドセッション, May 2025.
- 橋爪 優果, "ICASSP2025における音楽情報処理の動向," 信学技報, Vol. 125, No. 36, EA2025-3, pp. 13-17, May 2025.
- 米山 怜於, "ニューラルボコーダ概説:生成モデルと実用性の観点から," 情報処理研報, Vol. 2025-SLP-156, No. 3, 1 page, June 2025.
- 戸田 智基, "音声研究の知見がニューラルボコーダの発展にもたらす効果," 情報処理研報, Vol. 2025-SLP-156, No. 4, 1 page, June 2025.
- 宮司 光梨, 澤田 桂都, ホワン ウェンチン, 戸田 智基, "制御性の高いピアノ自動編曲に向けた楽曲難易度指標の設計," 情報処理研報, Vol. 2025-MUS-143, No. 8, pp. 1-7, June 2025.
- 山下 陽生, 岡本 拓磨, 高島 遼一, 大谷 大和, 滝口 哲也, 戸田 智基, 河井 恒, "重み付きAttentionのアライメント機構を用いた系列変換型声質変換," 情報処理研報, Vol. 2025-SLP-143, No. 75, pp. 1-6, June 2025.【音学シンポジウム2025優秀発表賞(受賞者:山下 陽生)】
- 服部 公宏, ホワン ウェンチン, 武田 一哉, 戸田 智基, "多様なシミュレーション音場における教師あり仮想マイクアレイ信号推定の汎化性能評価," 信学技報, Vol. 125, No. 74, SP2025-20, pp. 107-112, June 2025.
- W.-C. Huang, L.P. Violeta, T. Toda, "JATTS: a comparison-oriented Japanese text-to-speech open-sourced toolkit," 信学技報, Vol. 125, No. 74, SP2025-22, pp. 119-124, June 2025.
- 橋爪 優果, 渡邉 研斗, 中塚 貴之, 佃 洸摂, Tian Cheng, 中野 倫靖, 後藤 真孝, 戸田 智基, "MixQuery: ユーザ選択ステムの集約に基づく楽器音色指向楽曲検索システム," 情報処理研報, Vol. 2026-MUS-145, No. 21, pp. 1-9, Feb. 2026.
- 今村 剛大, 橋爪 優果, ホワン ウェンチン, 戸田 智基, "個別楽器音に着目した楽曲間類似度表現学習におけるテキスト表現による楽器指定," 信学技報, Vol. 125, No. 369, EA2025-92, pp. 114-120, Mar. 2026.
- 荻田 健一, 米山 怜於, ホワン ウェンチン, 戸田 智基, "大規模学習条件下および雑音環境下におけるVAE-SiFiGANの性能評価," 信学技報, Vol. 125, No. 371, SP2025-77, pp. 306-311, Mar. 2026.
- 須藤 克仁, 譚 皓天, 西川 勇太, 加納 保昌, サクティ サクリアニ, 高道 慎之介, 戸田 智基, 中村 哲, "声質変換による原話者音声出力を行う音声から音声への同時翻訳システム," 情報処理研報, Vol. 2026-NL-267, No. 3, pp. 1-8, Mar. 2026.
Meeting presentations
- 中井 淳一, 藤村 拓弥, 高田 将典, 浅野 憲司, 若松 智之, 戸田 智基, "説明性向上マルチモーダルAIによるMOCの潜在的異常見える化," 第24回情報科学技術フォーラム(FIT2025), CH-006, 第3分冊, pp. 21-24, Sep. 2025.
- 小椋 忠志, 岡本 拓磨, 大谷 大和, Erica Cooper, 戸田 智基, 河井 恒, "GST-BERT-TTS:アクセントラベル不要な複数話者日本語TTS," 音講論, 1-1-1, pp. 1115-1118, Sep. 2025.
- 安田 裕介, 井本 桂右, 深山 覚, 戸田 智基, "アクセント制御音声合成と主観比較評価最適化による専門家非依存アクセントアノテーション法," 音講論, 1-1-3, pp. 1123-1126, Sep. 2025.
- Huang Wen-Chin, Wang Hui, Liu Cheng, Wu Yi-Chiao, Tjandra Andros, Hsu Wei-Ning, Cooper Erica, Qin Yong, 戸田 智基, "The AudioMOS Challenge 2025," 音講論, 1-1-16, pp. 1167-1170, Sep. 2025.
- 山下 陽生, 岡本 拓磨, 高島 遼一, 大谷 大和, 滝口 哲也, 戸田 智基, 河井 恒, "系列変換型声質変換モデルのモバイル端末実装," 音講論, 3-Q-21, pp. 1365-1366, Sep. 2025.
- 兼子 政孝, ホワン ウェンチン, 戸田 智基, "模擬車内音場におけるモノラル音声を用いた話者の着座位置推定," 日本法科学技術学会第31回学術集会, D-35, Nov. 2025.
- 鈴木 直樹, 木迫 璃玖, 槫林 優, 大平 茂輝, 戸田 智基, "名古屋大学におけるLMS連携システムを一元管理するためのWebアプリケーションの開発," 大学ICT推進協議会 2025年度年次大会, 3AM2B-3, pp. 616-620, Dec. 2025.
- 木迫 璃玖, 鈴木 直樹, 槫林 優, 大平 茂輝, 戸田 智基, "名古屋大学における受講状況の可視化を支援するLMS連携システムの開発と運用," 大学ICT推進協議会 2025年度年次大会, 3AM2B-4, pp. 621-626, Dec. 2025.
- 槫林 優, 鈴木 直樹, 木迫 璃玖, 大平 茂輝, 戸田 智基, "名古屋大学における課題採点支援LMS連携Webツールの開発," 大学ICT推進協議会 2025年度年次大会, 3AM2B-6, pp. 634-640, Dec. 2025.
- R. Yoneyama, T. Toda, "Why is a sinusoidal signal input effective in time-domain neural vocoders?," ASA-ASJ Joint Meeting, 2aSP21, Dec. 2025.
- K. Ogita, R. Yoneyama, W.-C. Huang, T. Toda, "Robust fundamental frequency control in source-filter neural vocoding via probabilistic latent representations," ASA-ASJ Joint Meeting, 2pSP11, Dec. 2025.
- T. Imamura, Y. Hashizume, W.-C. Huang, T. Toda, "Instrument-wise music similarity representation learning with source separation and human preference," ASA-ASJ Joint Meeting, 5aMU12, Dec. 2025.
- Y. Hashizume, T. Toda, "Investigation of perceptual music similarity based on individual instrumental parts and its correspondence to deep learning features," ASA-ASJ Joint Meeting, 5aMU13, Dec. 2025.
- K. Sawada, W.-C. Huang, T. Toda, "Cascaded symbolic music generation with bar-wise feature sequence modeling," ASA-ASJ Joint Meeting, 5aMU14, Dec. 2025.
- 松本 昌亮, 藤村 拓弥, Wen-Chin Huang, 戸田 智基, "異常スコア分散最小化に基づくバイアス調整を用いたドメイン汎化型識別的異常音検知," 音講論, 3-Q-6, pp. 309-312, Mar. 2026.
- 今村 剛大, 小松 達也, 宗像 北斗, 戸田 智基, "動画内区間検索における関連度値校正のための音響・映像特徴量統合," 音講論, 1-5-14, pp. 943-946, Mar. 2026.
- Wen-Chin Huang, Erica Cooper, 戸田 智基, "自動音声品質評価モデルにおけるマルチデータセット学習の調査," 音講論, 2-5-9, pp. 973-974, Mar. 2026.
- 山下 陽生, 岡本 拓磨, 高島 遼一, 大谷 大和, 滝口 哲也, 戸田 智基, 河井 恒, "系列変換型複数話者声質変換方式の比較," 音講論, 3-5-3, pp. 1001-1004, Mar. 2026.
- 古田 京平, Wen-Chin Huang, 安田 裕介, 戸田 智基, "知識蒸留による因果的な音声潜在特徴抽出と音声変換への適用," 音講論, 3-5-5, pp. 1009-1010, Mar. 2026.
- 小椋 忠志, 岡本 拓磨, 大谷 大和, 戸田 智基, 河井 恒, "fo-BERTを用いた日本語音声合成アクセントラベル推定の検討," 音講論, 1-Q-32, pp. 1077-1078, Mar. 2026.
- 大谷 大和, 岡本 拓磨, 戸田 智基, 河井 恒, "NICT日本語複数話者複数スタイル音声合成コーパスのための知覚表現および話者印象語データセットの構築," 音講論, 3-Q-30, pp. 1135-1136, Mar. 2026.
- Bence Mark Halpern, 戸田 智基, "発話障害者の自発音声を対象とした非参照型明瞭度予測," 音講論, 3-Q-42, pp. 1173-1174, Mar. 2026.
- 戸田 智基, Xiaohan Shi, "音声表情に着目した音声情報処理の進展," 音講論, 2-5-1, pp. 1203-1204, Mar. 2026.
- Minseok Kim, Wen-Chin Huang, 戸田 智基, "ピアノ楽曲の「ジャズらしさ」の知覚評価データ収集と潜在因子の分析," 音講論, 1-7-4, pp. 1219-1220, Mar. 2026.
Other presentations
- 西尾 直樹, 小林 和弘, 戸田 智基, 横井 紗矢香, 向山 宣昭, 和田 明久, 横井 麻衣, 重山 真由, 三谷 壮平, 曾根 三千彦, "電気のコエから自分のコエへ -Save the Voice Project-," 日本気管食道科学会会報, 特集5 パネルディスカッション1:喉頭摘出後のコミュニケーション支援, Vol. 76, No. 2, p. 108, Apr. 2025.
- W.-C. Huang, "Automatic quality assessment for speech and beyond," Talk, Conversational AI Reading Group, Mila/Concordia University, May 2025.
- T. Fujimura, I. Kuroyanagi, T. Toda, "The NU systems for DCASE 2025 Challenge Task 2," Technical report, DCASE Task 2, 5 pages, July 2025.【DCASE 2025 Challenge Task 2 Judges' Award】
- F. Li, F. Shen, D. Ma, J. Zhou, L. Wang, F. Fan, X. Chen, T. Toda, H. Niu, "Mandarin speech reconstruction from neck and facial surface electromyography," Proc. IEEE EMBC, Research poster presentation, 4 pages, Copenhagen, Denmark, July 2025.
- L.P. Violeta, D. Ma, W.-C. Huang, T. Toda, "Pretraining and adaptation techniques for electrolaryngeal speech recognition," EUSIPCO, SPS journal paper presentation, Palermo, Italy, Sep. 2025.
- "名古屋大学「AIP加速課題:発声技能拡張PJ」," CEATEC 2025, 展示, 千葉, Oct. 2025.
- 戸田 智基, "国際チャレンジ活動を通した発声技能拡張基盤の構築," 2025年度AIPプロジェクトシンポジウム ~AI研究が創る未来~, 2025年度AIP加速課題研究成果発表, 東京, Mar. 2026.
Doctoral dissertations
- Shuming Luan, "Generalized sound field interpolation in rotation-robust microphone array signal processing," 情報学研究科知能システム学専攻博士論文, July 2025.
- Shaowen Chen, "Deep speech analysis-modification-synthesis based on quasi-harmonic modeling," 情報学研究科知能システム学専攻博士論文, Dec. 2025.
- Chao Xie, "Noisy-to-noisy voice conversion capable of controlling background noise," 情報学研究科知能システム学専攻博士論文, Mar. 2026.
- Ding Ma, "Training techniques of sequence-to-sequence voice conversion for electrolaryngeal speech enhancement," 情報学研究科知能システム学専攻博士論文, Mar. 2026.
- Jiajun He, "Studies on context-aware speech recognition and multimodal spoken language understanding," 情報学研究科知能システム学専攻博士論文, Mar. 2026.
- Daiki Yoshioka, "Spoken-text processing for spontaneous speech generation," 情報学研究科知能システム学専攻博士論文, Mar. 2026.
- Yuka Hashizume, "Research on part-level music similarity for music retrieval focusing on individual instrumental parts," 情報学研究科知能システム学専攻博士論文, Mar. 2026.
- Reo Yoneyama, "Neural vocoder based on generative adversarial networks considering speech production mechanism," 情報学研究科知能システム学専攻博士論文, Mar. 2026.
- Lester Phillip Violeta, "Domain adaptation techniques for electrolaryngeal speech recognition and enhancement," 情報学研究科知能システム学専攻博士論文, Mar. 2026.
Master's theses
- Alfonso Domingo Agustin, "Knowledge Distillation for Efficient Pose Estimation in Sign Language Translation," 情報学研究科知能システム学専攻修士論文, Mar. 2026.
- 今村 剛大, "個別楽器音に基づく楽曲間類似度表現学習における知覚・知見・知識の導入," 情報学研究科知能システム学専攻修士論文, Mar. 2026.
- 荻田 健一, "音源フィルタ型ニューラルボコーダにおける確率的潜在表現学習," 情報学研究科知能システム学専攻修士論文, Mar. 2026.
- 山内 凌我, "拡散モデルに基づく音声変換におけるストリーミング処理に関する検討," 情報学研究科知能システム学専攻修士論文, Mar. 2026.
- Minseok Kim, "ピアノ楽曲の「ジャズらしさ」に関する知覚的因子分析と推定モデル構築," 情報学研究科知能システム学専攻修士論文, Mar. 2026.
Bachelor's theses
- 石川 雄斗, "深度情報を利用した口唇動画からの音声合成," 令和7年度情報学部コンピュータ科学科卒業論文, Feb. 2026.
- 照屋 颯大, "参照楽器音を用いたテキスト指示による楽器音検索," 令和7年度情報学部コンピュータ科学科卒業論文, Feb. 2026.
- De Pontes Jefferson Makoto, "深層学習および逐次推定処理に基づくオーディオエフェクト設定逆推定," 令和7年度情報学部コンピュータ科学科卒業論文, Feb. 2026.