Research Projects
Impaired speech production poses significant communication hurdles, often impacting career prospects and quality of life. This project envisions (i) the prediction of the effects of clinical treatment of impaired speech, and (ii) better-sounding substitution speech, i.e., electrolarynx (EL) speech, for individuals with otherwise unfavorable prospects.
Objectives: Objective 1 is to predict post-treatment speech audio recordings, i.e., readings of a German standard text. The input data are paired, text-parallel pre-treatment speech audio recordings and electronic patient records (EPRs), i.e., unstructured textual clinical reports containing information about the speech impairment and the treatment plan. Objective 2 is to improve the speech of laryngectomees in...
Gas pipelines and power lines are part of a country's critical civil infrastructure. Fibre-optic monitoring of linear distribution infrastructure is performed using distributed acoustic sensing (DAS). New pipeline monitoring methods are developed and evaluated in field trials.
In the last decade, conversational speech has received a lot of attention among speech scientists. On the one hand, accurate automatic speech recognition (ASR) systems are essential for conversational dialogue systems, as these become more interactional and social rather than solely transactional. On the other hand, linguists study natural conversations because they reveal insights into how speech processing works that go beyond those from controlled experiments. Investigating conversational speech, however, requires not only applying existing methods to new data, but also developing new categories and new modeling techniques and including new knowledge sources.
Hypotheses, research questions and objectives
The three objectives of...
The goal is to develop a UWB-based platform for automotive applications that offers novel capabilities by combining UWB communication, UWB radar technology, and secure machine learning.
Machine learning and further security measures are the key to obtaining more information about the targets present, to detecting and localizing the targets of interest more reliably, and to making these data available only to authorized users. The combination of these technologies is intended to result in a unified concept for component and antenna that supports the following applications:
Access control for the car, inside/outside detection, and detection of vital signs and gestures.
Gesture recognition enables, for example, individualized support...
Through hands-on experiments and age-appropriate explanations, schoolgirls are to be given an enjoyable introduction to technology. Starting from music and sound as a gateway, the path leads via electronic recordings to electrical engineering. Together with a MOOC and a workshop design (see also Sprachworkshop), the results are compiled into a toolbox for educators.
The research team around Lukas and Luisa Luchs. Designed and drawn by Katharina Peter.
The research and study field of Electrical Engineering and Audio Engineering lends itself to the implementation. Tones and music can be explained in workshops as physical phenomena. The recording, storage, and playback of sound provide an entry point into...
With currently available automatic speech recognition (ASR) systems, very good recognition performance can be obtained for read speech (word accuracies of 90 – 100 %), but not for conversational speech (60 – 80 %). Highly accurate ASR systems for conversational speech are especially relevant for conversational dialogue systems, as these are expected to become more conversational, interactional and social rather than transactional. Thus, in recent decades, an increasing number of studies have investigated the differences between these speaking styles in order to find ways to improve ASR performance for conversational speech. One difference between read and conversational speech is that...
Deep representation learning is one of the main factors behind the recent performance boost in many image, signal and speech processing problems. This is particularly true when large amounts of data and almost unlimited computing resources are available, as demonstrated in competitions such as ImageNet. In real-world scenarios, however, the computing infrastructure is often restricted and the computational requirements cannot be met. In this research proposal we suggest several directions for reducing the computational burden, i.e., the number of arithmetic operations, while maintaining the level of recognition performance.
Today, advanced embedded CPUs have reached an architectural feature set...
Many everyday applications depend on successful speech transmission and speech communication, for example smart homes with voice commands, hands-free mobile telephony, and speech recognition by machines. In all these applications it is important to guarantee high performance that is robust to background noise and to reverberation in the room. A pre-processing stage in the form of signal enhancement is needed to remove the undesired background noise sources. Our goal is to develop methods for estimating the desired source signal observed in noise and to tackle new challenges in different speech applications: noise reduction, source...
In this project, which is funded by Higher Education Structural Funds (Hochschulraumstrukturmittel), seven university partners and three other institutions deal with theoretical and practical aspects of the Digital Edition from different perspectives.
At the SPSC Lab we are exploring tools that enable new kinds of analysis of editions containing text and audio. These tools should allow researchers from the humanities to analyse material in ways that were not possible without a digital edition.
Specifically, we currently explore the following topics:
Network analysis of theater plays, comparing e.g. different editions of the same play. Directed information in literature based...
Wireless communication and localization are key components of the envisioned “Internet-of-Things”. However, wireless technologies suffer from physical and man-made impairments, e.g., multipath propagation and interference from competing transmissions, as well as from the effects of temperature variations and other environmental properties. These impair the accuracy, latency, loss, and energy consumption of wireless services. Our key objective is to offer statistical guarantees on the reliability and availability of correct wireless localization and communication by automatically adapting system parameters using models of the transceiver hardware and the environment.
This research is a subproject being conducted in the framework of the DependableThings program,...
People who have lost their larynx, e.g., due to cancer, depend on a substitution voice. The three most common methods (esophageal voice, voice prosthesis, electronic speech aid) sound male, if they sound human at all. Because the vast majority of patients were men for a long time, this issue never came into the focus of research and development. In recent years, however, the number of female patients has increased significantly, so that a variety of voice qualities beyond the existing male norm is of increasing importance for the products of the company partner.
The focus of HumanEVoice...
The aim of MIMIC is to gain further understanding of the mechanisms of psychological and physiological adaptation or maladaptation in extreme or stressful environments through computerized analysis of speech and of the content of spoken and written verbal communication. The project also aims to improve the data collection and analysis methods developed in previous studies and to prove their applicability in an operational environment.
Discriminative learning of Bayesian networks (BNs) for classification tasks is often beneficial compared to generative learning. This is particularly true in case of model mismatch, i.e. when the BN cannot represent the true data distribution. In the past, we developed maximum margin parameter learning for Bayesian network classifiers and Gaussian mixture models. Furthermore, we used the margin objective for approximate and exact structure learning. This research is extended within this proposal. The focus is three-fold: (i) Extension of margin-based parameter learning to a hybrid paradigm merging the advantages of generative and discriminative learning. We aim at extending...
More than 50% of adults in Germany have difficulty fully comprehending information distributed by government authorities and companies. This lack of reading ability excludes people from knowing their rights and from education, and could even put them in danger. Communication in plain language is therefore an important tool for reducing barriers to information comprehension.
We are working to automatically evaluate whether a text is written in a way that is comprehensible for people at different language levels (i.e. B1, A2, A1). In the long run this will give those who translate texts into plain language a tool to check...
The project investigates a localization system for passive RFID tags as part of an intelligent process control system. The real-time tracking of components, tools, and products is a key technology for optimizing workflows, e.g. in flexible manufacturing. REFlex not only covers research on the localization system and the modeling of flexible production environments; ethical and social implications of the new technology (possible tracking of persons) are also studied.
The project ENTRANCE has the goal to investigate signal processing and system design methods that enable the design of flexible and power-efficient radio transmitters.
Motivation and Challenges
A special focus is digital predistortion for WLAN according to 802.11ac, which includes larger bandwidths than 802.11n and requires advanced predistortion structures and learning algorithms. Optimization is used as a general tool for system design, as well as a method for structure and parameter learning of digital predistortion systems. Signal processing techniques are applied to the development and analysis of new measurement methods, which are highly important for the characterization of hardware building...
The Problem
Automatic speech recognition (ASR) systems were originally designed to cope with carefully pronounced speech. Most real-world applications of ASR systems, however, require the recognition of spontaneous, conversational speech (e.g., dialogue systems, voice input aids for physically disabled people, medical dictation systems, etc.). Compared to prepared or read speech, conversational speech contains utterances that might be considered ‘ungrammatical’ and that contain disfluencies, such as “…oh, well, I think ahhm exactly …” The pronunciation of the words may depend, for instance, on the regional background of the speakers, the formality of the situation or the frequency of the word. A highly...
Robustness against reverberation, noise, and interfering audio signals is one of the grand challenges in speech recognition, speech understanding, and audio analysis technology. One avenue to approach this challenge is single-channel audio separation. Recently, factorial hidden Markov models have won the single-channel speech separation and recognition challenge. These models are capable of modeling acoustic scenes with multiple sources interacting over time. While these models reach super-human performance on specific tasks, there are still serious limitations restricting the applicability in many areas.
We aim to generalize these models and enhance their applicability in several aspects: (i) Introduction of discriminative large margin...
The project MINT investigates an RF-based localization and tracking system intended for indoor use. The method to be investigated, previously proposed by our group, exploits information from reflected multipath components, assuming prior knowledge of a floor plan. This approach has been termed “multipath-assisted indoor navigation and tracking (MINT)”. The project evaluates the practical feasibility of the MINT approach.
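The idea of exploiting reflected components with a known floor plan can be illustrated with a small geometric sketch. It assumes the common image-source model for a single-bounce specular reflection: the anchor is mirrored across a wall, and the length of the reflected path equals the straight-line distance from this mirrored ("virtual") anchor to the agent. The function name, the coordinates, and the single-wall layout are hypothetical and are not taken from the project description.

```python
import math


def mirror_across_wall(point, wall_start, wall_end):
    """Mirror a 2-D point across the infinite line through a wall segment."""
    px, py = point
    ax, ay = wall_start
    bx, by = wall_end
    dx, dy = bx - ax, by - ay
    # Projection parameter of the point onto the wall line
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    # Foot of the perpendicular on the wall line
    fx, fy = ax + t * dx, ay + t * dy
    # Mirror image: same distance on the other side of the wall
    return (2 * fx - px, 2 * fy - py)


# Hypothetical layout: a wall along the y-axis, one anchor, one agent
anchor = (1.0, 2.0)
wall = ((0.0, 0.0), (0.0, 5.0))
agent = (3.0, 2.0)

virtual_anchor = mirror_across_wall(anchor, *wall)
# Length of the single-bounce path anchor -> wall -> agent
reflected_path_length = math.dist(virtual_anchor, agent)
```

In this toy layout the reflection point lies at (0, 2), so the reflected path consists of a 1 m leg and a 3 m leg, which matches the distance from the virtual anchor to the agent.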
Overview
The MINT concept allows for an optimal combination of reflected signal components in an indoor environment. Optimality is achieved by a method that can automatically estimate (either in a dedicated training phase or online during the tracking) the uncertainty and reliability...
The aim of this transdisciplinary, basic-research-oriented project is the computer-aided analysis of acoustic signals for the non-invasive diagnosis of thoracic diseases. Acoustic signals are recorded by sensors positioned on the patient's thorax and classified using intelligent analysis methods. Physiological breath sounds are altered in different ways by disease-related changes in the acoustic conditions within the chest cavity and are therefore perceptible as acoustically characteristic signals. The project goal is a reliable computer-aided analysis and classification of these signals.
The project is to develop a method, including a demonstrator, for the reliable and early detection of pneumothorax. This supports low-cost screening of at-risk patients, initial diagnosis in emergency medical services, and...
The project aims to enrich the available psychological knowledge through phonological and content analysis of a variety of recorded speech samples collected at regular intervals from the over-wintering crews at Concordia Antarctic Research Station.
Participants are asked to record a weekly video/voice diary of the most significant events of the past week. At the same time they will also read aloud a short tale widely used for phonological analyses in different languages. Participants are asked to speak and read in French, Italian or English, whichever is most convenient for them. A dedicated laptop computer with head microphone will be used for these...
Since 2009, various studies have been carried out on the interdisciplinary topic of classroom acoustics, with the aim of investigating the influence of room acoustics in classrooms on everyday school life and of revealing the highly diverse interrelations.
BRG Kepler in Graz was won as a partner school for these investigations; not least because of its focus on the natural sciences, it has always shown great interest in the various topics. As a direct benefit of this work, nearly all classrooms were renovated in 2014 and now offer a markedly better room-acoustic situation.
Work carried out so far
Klassenraumakustik, diploma thesis by Maurice Müller, 2009
Akustische Sanierung von Klassenräumen, diploma thesis by Claudia Reithner, 2013
Optimierung des...
Some people, after suffering voice problems over a longer period of time, are confronted with the diagnosis of laryngeal cancer. While at an early stage there is a good chance of healing and of being able to continue one's previous life, sometimes the last resort is to remove the entire larynx. Vocal communication as usual is then no longer possible. The person has to learn to use a substitution voice, which sounds very different from a natural voice. The social stigma that can accompany the medical situation risks leading the person into social isolation. The estimated...
The modeling, measurement, transmission, and processing of information-bearing data and signals are key constituents of any modern technical system. Driven by scalability and reliability considerations, there has recently been a remarkable trend to implement these constituents in a distributed manner. Notable examples of distributed information processing architectures are communication networks, sensor networks, smart grids, traffic telematic systems, and grid computing. The project Signal and Information Processing in Science and Engineering (SISE) aims at making fundamental contributions to some of the most eminent and pressing problems arising in the context of distributed information processing. This ambitious goal requires the development of...
The DIRHA project addresses the development of voice-enabled automated home environments based on distant-speech interaction in different languages. A distributed microphone network is installed in the rooms of a house in order to selectively monitor acoustic and speech activities observable inside any space, and eventually to run a spoken dialogue session with a given user in order to implement a service or to provide access to appliances and other devices. The multi-microphone front-end is based on arrays consisting of analog microphones or Micro Electro-Mechanical Systems (MEMS) digital microphones. The targeted system analyses the given multi-space acoustic scene...
The aim of this project is the demonstration, validation, and evaluation of a wireless multicarrier transmission scheme that employs a novel noncoherent receiver. The receiver supports energy detection of a multiband ultra-wideband (UWB) signal. It is a robust, power-efficient receiver architecture that is capable of collecting energy from the multipath components of the channel response and that offers a scalable, increased data rate. The design and evaluation of a hardware demonstrator for this receiver architecture is the key objective of the project. The central element of the demonstrator is an analog frontend that lowers the requirements for digital signal processing and...
Within the project LOBSTER, a system for analysing escaping groups of people in crisis situations in public buildings and constructions is developed. For the localisation and the analysis of the activities of the escaping groups, the positioning technologies GNSS, WLAN, and MEMS available in common smartphones are used. In case of distress, the determined positions are transmitted to an LBS centre. In the centre these data are used in combination with plant layouts and filtering techniques (particle filter and Kalman filter) to analyse and predict the escape behaviour. The analysis supports the first responders in establishing a significantly...
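To give a feel for the kind of filtering mentioned above, the following is a minimal sketch of a scalar Kalman filter that smooths noisy one-dimensional position fixes. It assumes a simple random-walk motion model; the function name, the variances, and the measurement values are hypothetical and say nothing about the filter design actually used in LOBSTER.

```python
def kalman_1d(measurements, meas_var, process_var, x0=0.0, p0=1e3):
    """Scalar Kalman filter with a random-walk state model (illustrative only).

    measurements: noisy position fixes
    meas_var:     assumed variance of the measurement noise
    process_var:  assumed variance of the position change per step
    x0, p0:       initial estimate and its (large) initial variance
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the position is assumed unchanged, the uncertainty grows
        p = p + process_var
        # Update: blend prediction and measurement according to the gain
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Hypothetical fixes of a person standing still at 5 m along a corridor
fixes = [5.0] * 20
track = kalman_1d(fixes, meas_var=1.0, process_var=0.01)
```

With a large initial variance the first estimate follows the first measurement almost entirely; later estimates change more slowly as the filter's confidence grows. A tracker for moving people would typically extend the state with velocity, which this sketch omits.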
The main idea of the DRAGON project is to research and use new design methodologies and architectural innovations, based on reconfigurability and state-of-the-art digital CMOS technology, in order to break the barriers imposed by the lack of scaling properties of analog components. With this concept, distinct reductions in cost, size and energy consumption for multi-standard cellular handsets can be achieved, while higher demands on data rate can be met. Data rates are increasing every day; therefore, the energy consumption per transmitted or received data bit has to be reduced in order to save energy and avoid thermal problems. Wireless data services...
Graphical models have become the method of choice for representation of uncertainty in machine learning. Two research issues are currently of major interest in the scientific community: First, much work is devoted to finding and analyzing more efficient approximate inference algorithms, e.g., loopy belief propagation, variational methods, sampling methods, concave-convex procedure, loop corrections, et cetera. Second, there has been much interest in learning the parameters and the structure of directed graphical models from data. Basically, there are two main paradigms for learning in the machine learning community: generative and discriminative learning. Generative learning is well explored for directed graphical models,...
The intention of the project is to join research activities in the field of advanced audio processing. The central goal is to strengthen and augment the cooperation between academia and industry. The link between computationally demanding algorithms for audio signal processing and the ability to develop real-time systems is sought after in many innovative application fields that are tackled by the industrial partners, such as professional audio and communication technologies, automotive, and entertainment systems. The expected results can be implemented in systems for in-car communications, dictation and teleconferencing, as well as in professional headphones and loudspeakers, and casino gaming machines.
Today, the accurate and safe determination of position and time information using GNSS has become an essential part of our society. Unfortunately, the more valuable a resource becomes to our civil infrastructure, the more criminals or malicious agents seek to discover and exploit weaknesses in order to disrupt legitimate users or to perpetrate fraud. While the signal authentication necessary to secure the system against such attacks is available for military and government use (depending on the GNSS system), there is no such security function for civilian applications.
The main goal of the proposed SoftGNSSTrusted project is the investigation of new...
The GreenPArk project harnesses digital signal processing to increase the efficiency of RF power amplifiers in mobile radio base stations. To this end, switched-mode amplifiers are investigated using novel digital modulation schemes and signal processing methods. RF power amplifiers in mobile radio base stations that are equipped with intelligent algorithms and novel architectures have an energy-saving potential of more than 21 million kWh per year in Styria alone, which corresponds to the annual electricity consumption of about 4900 three-person households. The goal of the two-year GreenPArk project is to realize such algorithms in order to make this saving potential usable for Styria.
Distributed signals and data will be of central importance for many areas of daily life in the future. Networked sensors and distributed data allow an improved understanding of our world and its sustainable use. To turn these large amounts of data into useful information, groundbreaking scientific insights at the intersection of mathematics, signal and information processing, communications, and scientific computing are required. We will develop new theories, algorithms, and implementations that enable the extraction, compression, transmission, and storage of large distributed data sets. The focus is on distributed architectures that can be designed to be fault-tolerant and scalable. The results of this basic research are in sensor and communication networks, distributed...
The area of passive UHF RFID is mostly a niche application for tracking of small goods. Accurate localization of tagged objects could be beneficial in numerous applications, such as warehouse and point-of-sale portals, salesrooms, or archives. Although there has been considerable research on this issue since 2005, accurate positioning remains elusive. There are two major reasons for this: Severe multipath propagation due to the backscatter nature (degenerate channels) and the portal setup (resembling industrial environments) is the dominant source of errors. In combination with limitations enforced by the design of passive UHF RFID (low-power, low-complexity tags; high throughput of tags...
The main objective of the Action is to combine previously unexploited techniques with new theoretical developments to improve the assessment of voice for as many European languages as possible, while acquiring in parallel data with a view to elaborating better voice production models.
Progress in the clinical assessment and enhancement of voice quality requires the cooperation of speech processing engineers and laryngologists as well as phoniatricians. Specifically, this Action is a joint initiative of speech processing teams and the European Laryngological Research Group (ELRG).
Objectives:
Developing analysis algorithms that impact on speech processing applications and assessment of voice disorders
Making...
This project explores security-enhanced speaker verification and identification systems based on speech signal watermarking. The goal is to detect situations in which played-back speech, synthetically generated speech, a manipulated speech signal, or a hacker imitating the speaker is fooling the biometric system. One issue is to determine whether biometrics (i.e. speaker analysis) and watermarking can coexist while minimizing their mutual effects.
The main objective of the SoftGNSS project is the development of a software-defined Global Positioning System (GPS) receiver whose performance is enhanced by a dual-frequency approach. The combined processing of the L1 and L2C GPS frequencies allows for mitigating measurement errors, for example errors caused by distortions introduced in the ionosphere. Improved receiver accuracy makes GPS beneficial for an even wider range of applications compared with today's performance obtained through a single-frequency approach. The software-defined nature of the system facilitates the adaptation of the receiver to prospective GPS specifications and future...
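Why two frequencies help against ionospheric errors can be shown with the standard ionosphere-free pseudorange combination. The sketch below assumes the usual first-order model, in which the ionospheric group delay in metres is 40.3 · TEC / f², and uses the GPS L1 and L2 carrier frequencies (L2C is broadcast on the L2 carrier). The function names, the range, and the TEC value are illustrative assumptions; the project description does not state which combination the SoftGNSS receiver actually uses.

```python
F_L1 = 1575.42e6  # GPS L1 carrier frequency in Hz
F_L2 = 1227.60e6  # GPS L2 carrier frequency in Hz (carries L2C)


def first_order_iono_delay_m(tec_el_per_m2, freq_hz):
    """First-order ionospheric group delay in metres for a given TEC."""
    return 40.3 * tec_el_per_m2 / freq_hz**2


def ionosphere_free(p1_m, p2_m, f1_hz=F_L1, f2_hz=F_L2):
    """Combine two pseudoranges so that the first-order 1/f^2 term cancels."""
    f1_sq, f2_sq = f1_hz**2, f2_hz**2
    return (f1_sq * p1_m - f2_sq * p2_m) / (f1_sq - f2_sq)


# Hypothetical scenario: 20,000 km geometric range, 50 TECU along the path
true_range_m = 20_000_000.0
tec = 5e17  # electrons per square metre

p1 = true_range_m + first_order_iono_delay_m(tec, F_L1)
p2 = true_range_m + first_order_iono_delay_m(tec, F_L2)
p_if = ionosphere_free(p1, p2)
```

Each single-frequency pseudorange is several metres too long in this scenario, with the larger error on the lower frequency, whereas the combination recovers the geometric range up to floating-point rounding. In practice the combination also amplifies measurement noise, which this noise-free sketch does not show.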
The studies presented here were carried out within the basic project Robust, part of COAST. Robust deals with robust speech pre-processing for speech recognition. The development of a new method for source separation is the task within the speech enhancement module of Robust.
Single-channel source separation using synthetic, a priori known excitation signals and a search for the vocal tract envelope (VTE) information. Source-filter-based single-channel speech separation using multi-pitch tracking.
A notorious challenge for automatic speech recognition is the significant decrease in recognition rates encountered in non-ideal acoustic environments. The presence of background noise or of concurrent speech from speakers other than the target speaker greatly impairs speech recognition performance. A further obtrusive influence is due to varying recording conditions (diverse noise sources, microphone position, etc.). This base project aims at providing defined and stable signal quality for speech as a precondition for robust speech recognition. This includes the suppression of background noise and of speech of interfering speakers, both being a frequent cause of reduced recognition performance. In addition...
Over the last decade, Bayesian networks have become the method of choice for representation of uncertainty in machine learning. Bayesian networks are used in many research areas such as bioinformatics, computer vision, speech recognition, error-correcting coding theory, and artificial intelligence. Currently, the research is focused on two main issues. First, much work is devoted to finding more efficient approximate inference algorithms. Second, there has been much interest in learning the parameters and the structure of Bayesian networks from data. Basically, there are two main paradigms for learning in the machine learning community: generative and discriminative learning. There is a strong...
The complexity of RFID systems has been increasing continuously. New applications are emerging, such as tags extended by arbitrary sensors, the collection of data from low-class tags, communication between tags, and support for Real-Time Localization Systems (RTLS). These new applications require active RFID tags, in which a battery powers the tag. New communication techniques have to be evaluated for these tags to satisfy their requirements. Active RFID tags are currently extremely expensive, so the focus is on simple, low-complexity techniques to reach new market segments. The tags have to operate in highly multipath intensive...
Within this project the use of advanced speech recognition technology for telehealth or telecare applications is evaluated. State-of-the-art video-care systems, e.g. BETAVISTA from Zydacron, connect care service providers, such as hospitals, doctors, nurses and nursing homes, with their patients or clients, enabling daily monitoring and counsel to take place effectively. The communication hardware connects with the patient and their medical devices, retrieves the patient's data from their home, and transfers it to the service provider. A complete solution from the patient’s location via any available network to the service provider is offered. The concepts of the...
The main aim of the project ALSO is to extend speech recognition systems toward speaker- and paragraph-specific parametrization and automatic adaptation (of parameters), so that recognition becomes more accurate, implementation becomes more efficient, and acceptance among users increases. The development of new tools for improved training is an essential part of speech recognition systems based on available data, which depict and describe the field of application exactly. This enables the user to be trained for an existing system and concurrent application, deploying user-friendly and available means. On the one hand, high initial recognition rates and improved...
The speech communication channel for flight control between pilot and tower is used to transmit additional data such as a flight number. The data should be available on the screen of the flight controller to provide additional confidence about the identity of the communication partner. The problem is approached by using watermarking techniques, where data is embedded into a host signal without being perceptible to the receiver.
PROACT has the short-term goal of stirring increased interest and cooperation in the research area of contactless identification technology. The medium-term goal is to establish Graz as a center of excellence in advanced RFID technology and related fields of research. PROACT also aims to augment teaching activity for RFID topics and to attract students to specialize in an RFID-related area. An appropriate number of highly qualified PROACT graduates should find attractive jobs in the local industry. Another goal of PROACT is to strengthen the collaboration of the local industry with academia. Building on existing expertise in this area, PROACT...
The rapid time variation of mobile radio channels is often modeled as a random process with second order moments reflecting vehicle speed, bandwidth and the scattering environment. These statistics typically show that there is little room for prediction of channel properties such as received power or complex taps of the impulse response coefficients, at least when linear predictor structures are considered. We have used mutual information estimation to measure statistical dependencies in sequences of wideband mobile radio channel data and found significant nonlinear dependencies, far exceeding the linear component. Based on these upper limits for the predictability of channel evolution...
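The distinction between linear and nonlinear dependence can be made concrete with a toy example. The sketch below uses a simple plug-in (histogram) estimate of mutual information for discrete samples and compares it with the Pearson correlation on data where one variable is the square of the other. Both function names and the data are hypothetical, and the estimator is a stand-in chosen for brevity: the text does not say which mutual information estimator was applied to the channel measurements.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient: a measure of linear dependence only."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mutual_information_bits(xs, ys):
    """Plug-in mutual information estimate (in bits) for discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x, count_y = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y)
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Toy data: y depends on x deterministically, but not linearly
xs = [-2, -1, 0, 1, 2] * 4
ys = [x * x for x in xs]

rho = pearson(xs, ys)
mi = mutual_information_bits(xs, ys)
```

Here the correlation vanishes because the data are symmetric around zero, so a linear predictor of `ys` from `xs` gains nothing, while the mutual information is clearly positive. For continuous-valued channel data, a binned or nearest-neighbour estimator would be needed instead of this direct count.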
The Christian Doppler Laboratory for Nonlinear Signal Processing addresses fundamental research questions arising from signal processing applications that are challenging due to their nonlinear aspects. We deliver theoretical analyses, develop and optimize new algorithms and, through their implementation, build awareness of their complexity, robustness, accuracy, and power consumption trade-offs. The Christian Doppler Laboratory for Nonlinear Signal Processing plays a leading role in the solution of signal processing problems where conventional methods fail. By entering into industrial partnerships, it thrives on and supports the bidirectional exchange of know-how and people between nonlinear science and the sweeping digital signal processing revolution.
High-frequency fast frequency-hopping systems require frequency synthesizers to provide multi-gigahertz clocks with a band switching time on the order of a few tens of nanoseconds, posing difficult challenges with respect to noise, sidebands, and power dissipation. Conventional phase-locked loop (PLL)-based synthesizers are ill-suited due to their long settling times, which are typically tens of microseconds. Recent research has pushed the development of digital-based low-noise high-frequency synthesizers where the traditional analog forward path is replaced by a digital processing core and the VCO is replaced by a Digitally Controlled Oscillator (DCO). The advantages of such architectures include: friendly implementation in newest...
In recent years, the rapid growth in the number of users in mobile communication networks has led to the development of third-generation standards such as UMTS. The modulation and multiple-access methods were designed for high spectral efficiency. This leads to strong fluctuations of the power envelope transmitted by UMTS base stations and therefore to nonlinear effects caused by the power amplifiers. Because these devices are the most cost-intensive components, it is desirable to operate the amplifiers close to their compression points. The main problem is the pronounced dynamic nonlinear behaviour of the amplifier, combined with fluctuations in the envelope...
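The envelope fluctuation mentioned above is commonly quantified by the peak-to-average power ratio (PAPR). The following minimal sketch compares one spreading code with the chip-wise sum of four orthogonal codes. The length-4 Walsh codes and the all-ones data are illustrative assumptions and do not reproduce an actual UMTS downlink signal.

```python
import math

def papr_db(samples):
    """Peak-to-average power ratio of a real or complex sample sequence, in dB."""
    powers = [abs(s) ** 2 for s in samples]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))

# Length-4 Walsh codes (assumption: stand-in for orthogonal channelization codes).
WALSH4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]

single_user = WALSH4[1]  # constant envelope
multi_user = [sum(code[k] for code in WALSH4) for k in range(4)]  # chip-wise sum

print(f"single code PAPR: {papr_db(single_user):.2f} dB")
print(f"four codes PAPR:  {papr_db(multi_user):.2f} dB")
```

A single code has a constant envelope, whereas the superposition concentrates power in individual chips. The amplifier therefore needs headroom above the average power, or it is driven into compression on the peaks.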
The goal of the project is the digital correction of analog signal processing errors in fast analog-to-digital converters. This digital correction should limit the production costs of fast converters and allow a more flexible adaptation to new technologies. An analog-to-digital converter is a complex system that causes dynamic, nonlinear, and time-variant errors. In order to determine the analog signal processing errors, typical high-speed architectures are investigated. The aim of the investigations is the systematic identification of these architectures and their influence on the ideal signal conversion. Identification is achieved through theoretical descriptions, simulation models and...
The research is concerned with the identification and inversion of weakly nonlinear behaviour occurring in analog integrated circuits for ADSL applications. The nonlinear behaviour induced by the nonlinear characteristics of the analog components of the integrated circuits limits the performance of the overall ADSL data transmission system. Thus, the goal is to compensate for the inherent nonlinearities of the circuit. This nonlinear equalization should be realized in the digital domain, through adaptive nonlinear filters. Nonlinear system identification serves as a starting point for the analysis of the inversion of nonlinear systems. Through the identification of a discrete time nonlinear model...
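To make the idea of digital nonlinear equalization with an adaptive nonlinear filter concrete, here is a minimal sketch. It assumes a memoryless cubic distortion y = x + 0.1·x³, a two-term polynomial corrector, a known training signal, and a normalized LMS update. None of these choices come from the project description; real ADSL front ends also exhibit memory effects that this toy model ignores.

```python
import math

def distort(x, a=0.1):
    """Hypothetical weakly nonlinear analog stage: y = x + a * x**3."""
    return x + a * x ** 3

def mse(w, xs, ys):
    """Mean squared error of the corrector w[0]*y + w[1]*y**3 against x."""
    return sum((w[0] * y + w[1] * y ** 3 - x) ** 2
               for x, y in zip(xs, ys)) / len(xs)

def nlms_train(xs, ys, mu=0.2, eps=1e-9):
    """Adapt the polynomial corrector with normalized LMS, using x as reference."""
    w = [1.0, 0.0]  # start from the identity (no correction)
    for x, y in zip(xs, ys):
        phi = (y, y ** 3)                       # nonlinear regressor
        e = x - (w[0] * phi[0] + w[1] * phi[1]) # a-priori error
        norm = eps + phi[0] ** 2 + phi[1] ** 2
        w[0] += mu * e * phi[0] / norm
        w[1] += mu * e * phi[1] / norm
    return w

# Training signal (assumption): a known sinusoid passed through the distortion.
xs = [math.sin(0.1 * n) for n in range(5000)]
ys = [distort(x) for x in xs]

mse_before = mse([1.0, 0.0], xs, ys)
w = nlms_train(xs, ys)
mse_after = mse(w, xs, ys)
print(f"weights: {w}, MSE before: {mse_before:.2e}, after: {mse_after:.2e}")
```

The corrector is linear in its weights even though it is nonlinear in the signal, which is why a standard LMS-type update can be applied to the regressor (y, y³).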
Speech recording is common practice in the daily work of lawyers, physicians, journalists, and architects, among others. The combination of dictation systems with automatic speech recognition (ASR) is now in demand as the natural way to take over their daily transcription routines. However, in such working environments (e.g. hospital, court of law, street), it is not always possible to record in quiet or noise-free conditions, which makes ASR unreliable. The researchers in oneVoice have developed several novel signal processing-based techniques for analyzing speech with natural intonation. These methods represent the scientific basis of...
In emergency situations, particularly within smoke-filled, partially or completely collapsed large buildings, communications with rescue personnel can be difficult. Safety and co-ordination of the operations are hampered by a lack of knowledge of the location of emergency staff. The project will investigate and demonstrate the use of Ultra-Wideband (UWB) radio to allow the precise location of personnel to be measured and displayed in a control centre and, simultaneously, to improve communications reliability. The feasibility of using UWB to search for survivors in smoke-filled rooms or buried beneath rubble and to generate simple maps will also be investigated.
Ultra-Wideband (UWB) communication is an emerging technology for high-speed data transmission systems that is expected to enable low-cost and low-power devices. Instead of a modulated carrier, streams of ultra-short pulses (< 1 ns) are used for wireless data transmission, yielding signals of huge bandwidth (> 1 GHz) but at very low power densities. In principle, the nature of the signal makes the technology suitable for low-cost implementations in standard CMOS technology. However, before UWB systems can be produced at large scale and low cost, there are numerous open research issues to be solved. Only in recent years, the...
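The link between sub-nanosecond pulses and gigahertz bandwidths follows from the time-bandwidth relation. As a minimal sketch, assume a baseband Gaussian pulse p(t) = exp(-t²/(2σ²)), whose power spectrum is proportional to exp(-4π²σ²f²). The pulse shape, the 0.5 ns width, and the -10 dB bandwidth definition are assumptions chosen for illustration only.

```python
import math

def gaussian_bandwidth_hz(fwhm_s, level_db=10.0):
    """One-sided bandwidth at -level_db of a baseband Gaussian pulse.

    fwhm_s is the full width at half maximum of the pulse amplitude in seconds.
    """
    sigma = fwhm_s / (2 * math.sqrt(2 * math.log(2)))
    # Solve exp(-4 * pi**2 * sigma**2 * f**2) = 10**(-level_db / 10) for f.
    return math.sqrt(level_db / 10 * math.log(10)) / (2 * math.pi * sigma)

bw = gaussian_bandwidth_hz(0.5e-9)
print(f"-10 dB bandwidth of a 0.5 ns Gaussian pulse: {bw / 1e9:.2f} GHz")
```

Halving the pulse width doubles the bandwidth, so pulses well below 1 ns occupy several gigahertz.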
SonEnvir is a research project funded by the Styrian Future Fund (Zukunftsfonds Steiermark) that aims to investigate sonification and its applications in various scientific disciplines. Many scientific fields work with complex, multidimensional data. The usual methods for revealing the inner structure of such data are visualization and statistical analysis. Both approaches are well established but have known drawbacks: visualization is limited by the perceptual weaknesses of vision (poor temporal resolution, only a few dimensions can be displayed), and statistics by the researcher's mathematical understanding of the complexity of the methods and of their significance for the data being analyzed. Sonification is the representation and analysis of data through sound and...
The goal of the proposed research is the development of a new and efficient source coder for speech and audio signals based on the approach of coding in the perceptual domain. In this approach, the signal is transformed into an auditory representation by passing it through a model of the human peripheral auditory system. The auditory representation is quantized and encoded for efficient digital transmission or storage. Upon decoding, the auditory representation is transformed back into the acoustic domain using an inverse of the auditory model. Auditory modeling and research on perceptual-domain coding provide insight into human perception...
The SPARC (Semantic Phonetic Automatic ReConstruction) project aims at automatically reconstructing the original wording of a medical dictation from its formatted, corrected written form and the error-prone output of a speech recogniser. Normally, neither of these two texts alone is sufficient to obtain a literal transcription, since the written report may contain reformulations of the original utterance and the recogniser output may contain misrecognitions. In the SPARC approach, the two resources are combined and a semantic and phonetic analysis is performed on the texts to resolve the mismatches between them. This way, the available large corpora of audio recordings of the...
The SNOW project aims to support nomadic workers in performing maintenance and production tasks. It is developing a multimodal interface that enables workers to interactively access documentation via different input modes such as speech, gestures or handwriting, using mobile devices in the field. At TUG, the speech input and output modalities in particular are strengthened by denoising and enhancement algorithms for speech in harsh acoustic environments.
Multimedia data has a rich and complex structure in terms of inter- and intra-document references and can be an extremely valuable source of information. However, this potential is severely limited unless effective methods for semantic extraction and semantic-based cross-media exploration and retrieval can be devised. Today's leading-edge techniques in this area work well for low-level feature extraction (e.g. colour histograms), focus on narrow aspects of isolated collections of multimedia data, and deal only with single media types. MISTRAL pursues the following lines of radically new research: MISTRAL will extract a large variety of semantically relevant...
The goal of the proposed research is the development of new efficient methods for the identification of the input-output (i/o) behavior of nonlinear dynamical systems. Efficiency in terms of computational complexity should be achieved by exploiting the structural constraints of the nonlinear dynamical system. An accurate description of the i/o behavior of nonlinear dynamical systems such as nonlinear circuits is becoming increasingly relevant for research and industrial applications. The linearization of nonlinear systems through their inverse is one example of an important area of application. To realize such applications it is necessary to be able to map the i/o behavior of...
In the steel industry there is an increasing demand for automatic inspection systems to control the quality of products. Because of the economic pressure on suppliers to the industry, inspecting a few samples from the production lot is insufficient. Especially in the car industry, a complete, reliable, and automatic surface inspection is necessary. Hence, there is a huge demand for vision-based quality control systems in industry. The aim of the research project is to develop sophisticated methods for evaluating the surface quality of steel blocks. This means that irregularities have to be detected reliably. Further, they have to be classified as...
The main objective of this Action is to improve the quality and capabilities of the voice services for telecommunication systems through the development of new nonlinear speech processing techniques. The proposed new mathematical methods are expected to provide advances in generic speech processing functions. Examples of these are: higher quality speech synthesis, more efficient speech coding, improved speech recognition, and improved speaker identification. It is envisaged that the proposed nonlinear processing techniques will significantly facilitate the acceptance of voice interfaces for information systems such as the mobile Internet (by improving synthesis and recognition). Additionally, these techniques are expected to make...
In the steel industry there is an increasing demand for automatic inspection systems to control the quality of products. Through the economic pressure on the supplier to the industry, the inspection of a few samples from the production lot is insufficient. Especially in the car industry, a complete, reliable, and automatic surface inspection is necessary. The aim of the research project is to develop sophisticated methods for evaluating the surface quality of steel products. This means that irregularities have to be detected reliably. Further, they have to be classified as erroneous or as non-problematic. Due to the fact that an...
COMMIT is ftw.’s part of the EUREKA/Medea+ project INCA (Integrated Copper Network Access; Medea+ project proposal number A106). Medea+ is a pan-European cooperation program to promote chip-manufacturing technologies, with some focus on system-on-chip (SOC) development. The INCA project develops chipsets for broadband wireline systems, supporting technologies and methodologies, and prepares for future broadband products. COMMIT is responsible for delivering functionality descriptions of near-future products, researching product enabling key technologies for voice and data transport over IP, algorithms for radio-frequency interference (RFI) rejection, and specifying the technical impact of the unbundling of the local loop. Attention is to be given to...
Loss of water due to leaks in pipes is a significant problem for communities in Austria and Southern Europe. At present, such leaks are localized by noise measurement and auditory assessment. This method requires experienced staff and results in substantial localization errors due to high background noise levels. We are investigating advanced signal processing methods that allow the suppression of background noise in ground microphone measurements. Various time-frequency processing methods are studied in MATLAB, as well as their implementation using C and single-chip DSPs.
The aim of the project is to develop a highly effective acoustic user interface for visually impaired and blind people. To improve usability over commonly used “screen readers”, 3-dimensional sound simulation is employed to simulate surrounding acoustic rooms via headphones. To create those virtual rooms, the Ambisonic approach was chosen and will be implemented on the TI DSP Evaluation Module. The project will result in a test system designed to replace an operating system's desktop using audio cues only. Similar to graphical desktops, the acoustic rooms can contain icons which result in actions if clicked on....
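As a minimal sketch of how a sound icon could be placed in such a virtual room, the code below encodes a mono sample into first-order Ambisonic B-format (W, X, Y, Z). The abstract does not state the Ambisonic order or normalization used in the project; the traditional first-order convention with W scaled by 1/√2 and the function name are assumptions. Rendering to headphones would additionally require a decoding stage that is not shown.

```python
import math

def encode_first_order(sample, azimuth_rad, elevation_rad):
    """Encode a mono sample into traditional first-order B-format (W, X, Y, Z).

    Azimuth is measured counter-clockwise from straight ahead,
    elevation upwards from the horizontal plane.
    """
    w = sample / math.sqrt(2)
    x = sample * math.cos(azimuth_rad) * math.cos(elevation_rad)
    y = sample * math.sin(azimuth_rad) * math.cos(elevation_rad)
    z = sample * math.sin(elevation_rad)
    return (w, x, y, z)

front = encode_first_order(1.0, 0.0, 0.0)          # icon straight ahead
left = encode_first_order(1.0, math.pi / 2, 0.0)   # icon to the left
print("front:", front)
print("left: ", left)
```

Because the encoding is linear, several icons can be placed in the same room by summing their B-format signals channel by channel.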
The research is concerned with the identification and inversion of weakly nonlinear behaviour occurring in analogue integrated circuits for ADSL applications. The nonlinear behaviour induced by the nonlinear characteristics of the analogue components of the integrated circuits limits the performance of the overall ADSL data transmission system. Thus, the goal is to compensate for the inherent nonlinearities of the circuit. This nonlinear equalization should be realized in the digital domain, through adaptive nonlinear filters. Nonlinear system identification serves as a starting point for the analysis of the inversion of nonlinear systems. Through the identification of a discrete time nonlinear model...
The COST 258 Action, titled “The Naturalness of Synthetic Speech”, is concerned with the coordination of basic research at 34 laboratories dealing with text-to-speech synthesis in 17 European countries. COST Action 258 proposes a range of studies that address the core issues of naturalness in synthetic speech in concrete applications. Our contribution addresses prosodic models for the German language, SRELP-based demisyllable synthesis using VieCToS, and nonlinear oscillator models for signal generation.
Due to the increasing use of mobile phones and future integrated devices (GPRS, UMTS terminals), there is a growing need for mobile access to information services. Spoken language interfaces are increasingly important because the mobile devices don’t feature comfortable keyboards for input, and also because the users’ hands are not always free for device operation. One precondition for spoken language interfaces is robust speech recognition which can handle regional variants. The project has already created two spoken language databases which cover the regional variants of the German language as spoken in Austria, recorded over the fixed and mobile telephone networks,...
Anthropomorphic signal processing develops computational models of human communication modalities that emulate the physiological processes of their natural counterparts. Widely known examples are found in articulatory models for speech synthesis and hearing models for recognition. In speech and audio coding, the decoder’s task is to synthesize signals that evoke the same auditory response as the original signal, independent of its source. While a lot is known about human audition and the related neural code, resynthesis of audible waveforms from such code has been achieved only recently. We develop one such auditory model inversion approach and investigate its application to speech...
The anticipated convergence of wireless communications and the internet demands ever-increasing data rates on radio air interfaces for short-range indoor as well as medium- to wide-range outdoor communications systems. This research activity aims at the development of key technologies and know-how for the design of the radio links of future high-speed mobile systems. The characterization of the mobile radio channel is one of the key requirements for a successful system design. Channel measurement techniques and channel models are investigated in this project. On the transmission side, the focus lies on advanced signaling techniques, considering spread spectrum,...