BME Department of Telecommunications and Media Informatics, Speech Recognition Laboratory (LSR)

Laboratory leader: Dr. Péter Mihajlik

The laboratory has been researching, developing and teaching automatic speech recognition (ASR) since the end of the 20th century. Practical applications have been a priority from the beginning, starting with the recognition of keywords and arriving, two decades later, at real-time (or even faster) speech-to-text conversion of fluent speech. The beauty and difficulty of ASR is that it requires in-depth knowledge of many topics, such as machine learning, finite state machines, statistics, physical acoustics, phonetics, linguistics, natural and programming languages, scripting languages, and the use of GPUs. It is important to point out that although ASR technology has always been based on statistically driven machine learning, the explosion of "deep learning" in the early 2010s made development skyrocket, and it has not stopped since. It can be said that deep learning and ASR go hand in hand, and that the performance of ASR is sometimes comparable to that of humans, thanks to the large amounts of data available.

URL: https://www.tmit.bme.hu/lsr

Video in Hungarian: https://www.youtube.com/watch?v=zOJTrnP5M04