Recent speech synthesis systems rely on complex neural network architectures to achieve naturalness close to that of human speech. This complexity prevents their direct use in embedded systems, which have limited processing power and memory.
The Speech Group has developed “Vezilka-lite”, a prototype neural speech synthesis system that can run on such systems. It uses a simplified vocoder (the model that generates the voice), which allows the system to work in real time. The result is less natural than with more complex vocoders, but this is a necessary trade-off for deploying the full system on an embedded platform.
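One common way to make "works in real time" measurable is the real-time factor (RTF): the time spent synthesizing divided by the duration of the audio produced. An RTF below 1 means speech is generated faster than it is played back. The sketch below is only an illustration of that metric; the function names and the 22,050 Hz sample rate in the usage lines are assumptions, not details of Vezilka-lite.

```python
def real_time_factor(synthesis_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return synthesis time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    audio_seconds = num_samples / sample_rate  # duration of the generated speech
    return synthesis_seconds / audio_seconds


def is_real_time(rtf: float) -> bool:
    """A synthesizer keeps up with playback when its RTF is below 1."""
    return rtf < 1.0


# Hypothetical measurement: 0.5 s of compute for 1 s of audio at an assumed 22,050 Hz.
rtf = real_time_factor(synthesis_seconds=0.5, num_samples=22050, sample_rate=22050)
print(f"RTF = {rtf:.2f}, real time: {is_real_time(rtf)}")
```

On an embedded target, the synthesis time would be measured on the device itself, since the same model can have a very different RTF on a desktop processor.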
The system was developed for the company Gordian Systems, Skopje, for use in the innovative assistive devices the company manufactures. These devices will be able to speak to their target users, who may be visually impaired or blind. This will directly improve the quality of life of these citizens and increase their inclusion in society.
Here’s what the prototype system sounds like.
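Јас сум систем за синтеза на говор на македонски наменет за работа во реално време на вгнездени системи. (English: I am a speech synthesis system for Macedonian, designed to run in real time on embedded systems.)
Системот е развиен за компанијата Гордиан системи со поддршка од Фондот за иновации и технолошки развој. (English: The system was developed for the company Gordian Systems with support from the Fund for Innovation and Technological Development.)
Асистивните технологии овозможуваат дигитална инклузија на лицата со попреченост. (English: Assistive technologies enable the digital inclusion of people with disabilities.)
Визијата на здружението за асистивна технологија „Отворете ги прозорците“ е свет на рамноправни активни луѓе. (English: The vision of the assistive technology association "Open the Windows" is a world of equal, active people.)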
To train our system, we used some of the publicly available audiobooks published by the Macedonian National Union of the Blind. The development of the system was financed by the Macedonian Fund for Innovation and Technological Development.