The Profivox corpus
The Profivox corpus-based text-to-speech system synthesizes speech with a flexible search for long units. The procedure takes into account that speech is an event of the moment: the sound wave is constantly changing, and even when the same speech sound is uttered twice, the two waveforms are not exactly the same. This variation is part of what gives a voice its personal timbre. Because the technology concatenates long speech units such as words, word sequences, or complete sentences when it generates speech from text, it preserves this timbre: the synthesized speech is of very high quality and the speaker can be recognized. The method is used in applications where impeccable sound quality is a requirement (e.g. weather forecasts). Its computational demand is high, so it requires a fast computer.

The price of this quality is that the method can only be used within a limited topic, and the speech database must be created individually for each topic. The synthesis database is a multi-hour speech corpus. The speaker reads sentences and phrases that are most likely to occur in the limited topic, for example in weather forecast texts. The text to be read must be designed carefully and precisely. The ‘master sentence’ procedure is applied every time a new recording is made in the studio. The speech database is labeled in detail.

Speech synthesis is then performed by a search algorithm that selects the most appropriate waveform elements from the speech database in several steps, using weighting calculations. The longer the selected units, the better the quality. This complex algorithm works in real time. The result is personal, natural-sounding speech.
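The preference for long units in the search step can be illustrated with a minimal Python sketch. It is not the Profivox algorithm, whose steps and weights are not given here. It assumes a toy corpus of transcribed recordings, treats every word sequence found in a recording as a candidate unit, and uses dynamic programming to cover the input text with as few units as possible, so that longer units win. The names `build_unit_index` and `select_units`, the `max_len` limit, and the uniform cost of 1 per unit are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Unit = Tuple[str, ...]


def build_unit_index(recordings: List[str], max_len: int = 8) -> Dict[Unit, int]:
    """Map every word sequence (up to max_len words) to one recording that contains it."""
    index: Dict[Unit, int] = {}
    for rec_id, sentence in enumerate(recordings):
        words = sentence.lower().split()
        for start in range(len(words)):
            for end in range(start + 1, min(len(words), start + max_len) + 1):
                # Keep the first recording found for each word sequence.
                index.setdefault(tuple(words[start:end]), rec_id)
    return index


def select_units(text: str, index: Dict[Unit, int], max_len: int = 8) -> Optional[List[Unit]]:
    """Cover the text with the fewest recorded units; return None if it cannot be covered."""
    words = tuple(text.lower().split())
    n = len(words)
    inf = float("inf")
    best = [inf] * (n + 1)   # best[i]: fewest units needed to cover words[:i]
    back = [0] * (n + 1)     # back[i]: start position of the last unit ending at i
    best[0] = 0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            # Assumed cost model: every unit costs 1, so fewer joins is better.
            if words[start:end] in index and best[start] + 1 < best[end]:
                best[end] = best[start] + 1
                back[end] = start
    if best[n] == inf:
        return None          # some word is outside the recorded topic
    units: List[Unit] = []
    pos = n
    while pos > 0:
        units.append(words[back[pos]:pos])
        pos = back[pos]
    units.reverse()
    return units


recordings = [
    "rain is expected in the north",
    "tomorrow will be sunny in the south",
]
index = build_unit_index(recordings)
print(select_units("tomorrow will be sunny in the north", index))
print(select_units("snow in the mountains", index))
```

The first query is covered by two long units taken from the two recordings, while the second returns `None` because "snow" was never recorded, which mirrors the limited-topic restriction described above. A real system would replace the uniform cost with weighted scores computed from the detailed labels of the speech database.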