Deep Learning-based Voice Conversion for Avatar Therapy: DDSP as a baseline approach to overcome the high-latency and black-box deficits of traditional deep learning-based voice conversion techniques
The PhD project by Anders Bargum, a collaboration between Khora and the Multisensory Experience Lab, investigates the recently developed, real-time-capable 'differentiable digital signal processing' (DDSP) framework as a baseline approach to overcome the high-latency and black-box deficits of traditional deep learning-based voice conversion techniques.
The goal is to create controllable and configurable voice conversion software for applications that support AVATAR therapy, a new form of virtual reality treatment in which individuals who hear voices hold a dialogue, over a series of therapeutic sessions, with a digital representation of their presumed persecutor.
In short, the project aims to provide novel techniques for recreating the voices heard by people with schizophrenia.