Audio Analysis: Intelligibility Enhancement
Intelligibility Enhancement
Although the use of dedicated filtering hardware and software is widespread in the latter type of work, the net
gain from this equipment, measured in additional words transcribed, is not always
impressive. In fact, a large proportion of the work carried out under this heading is probably primarily
cosmetic; in jurisdictions with a jury system in particular, it is often necessary for all relevant
speech recordings to be played in court. Removing unpleasant noises may facilitate listening for
uninitiated listeners like members of the jury; it may also reduce fatigue and thereby increase productivity
in those who have to transcribe large quantities of speech recorded under forensic real-world conditions.
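One common form of such noise removal is the suppression of steady mains hum with a narrow notch filter. The following is a minimal sketch under stated assumptions, not the procedure of any particular laboratory or product: the 50 Hz hum frequency, the 8 kHz sampling rate, the Q value, and all function names are hypothetical choices made for illustration. The coefficients follow the widely used biquad notch form.

```python
import math


def notch_coefficients(f0, fs, q=5.0):
    """Return biquad notch coefficients, normalised so that a0 = 1.

    f0 is the frequency to remove (Hz), fs the sampling rate (Hz),
    and q controls how narrow the notch is (assumed value, tune as needed).
    """
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0)   # feed-forward terms
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)   # feedback terms a1, a2
    return b, a


def apply_biquad(samples, b, a):
    """Filter a sequence of samples with a direct-form I biquad."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def sine(freq, fs, n, amp=1.0):
    """Generate n samples of a sine wave (synthetic test signal)."""
    return [amp * math.sin(2.0 * math.pi * freq * i / fs) for i in range(n)]


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


if __name__ == "__main__":
    fs, n = 8000, 8000                       # one second at an assumed 8 kHz
    hum = sine(50.0, fs, n, amp=0.5)         # synthetic mains hum
    tone = sine(440.0, fs, n)                # stand-in for wanted signal energy
    b, a = notch_coefficients(50.0, fs)
    cleaned = apply_biquad([h + t for h, t in zip(hum, tone)], b, a)
    # Compare levels on the second half, after the filter has settled.
    print("input RMS :", rms([h + t for h, t in zip(hum, tone)][n // 2:]))
    print("output RMS:", rms(cleaned[n // 2:]))
```

A filter of this kind only makes the recording more comfortable to listen to; it removes the hum but cannot restore speech information that the recording never captured, which is consistent with the modest transcription gains described above.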
The enhancement of clandestine or covert recordings, other than those made by private citizens, is not a
core activity for many forensic laboratories for the simple reason that covert recordings made by police or
other investigative forces will not normally be ruled admissible by a criminal court of law. Information
obtained from such recordings therefore cannot be used for evidential purposes. The extent to which
information obtained from enhanced audio recordings may play a role as an investigative tool and the
efficacy of covert recording are hard to assess because, by definition, these matters do not lend themselves
to public scrutiny.
The public image of this type of activity is strongly shaped by publications like Spycatcher and the
Francis Ford Coppola film The Conversation (1974), in which Gene Hackman plays an audio surveillance
expert who is slowly caving in under the psychological pressure of his job.
To achieve the best results in transcribing questioned utterances in low- to extremely low-quality
recordings, the use of highly competent and educated native speakers of the language variety in question is
strongly recommended. A thorough familiarity with the accent and dialect of the speakers in the
recording, as well as some familiarity with the details of the case, will often enable the analyst to
compensate for the loss of redundancy of linguistic cues that is characteristic of poor-quality recordings.
Disputed Utterances
There are relatively few reports of work undertaken in this area. French provides an illustration of some
of the procedures that may be helpful here. A related issue is the growing demand for speech recognition
systems to meet the need to transcribe enormous quantities of forensic speech recordings. At present, the
vast quantities of recorded speech generated by telephone interception systems are transcribed by
relatively highly paid and trained human listeners. Most commercially available speech-to-text systems
require extensive learning sessions, a (single) co-operative speaker and relatively high-quality recordings
to meet acceptable performance standards and are therefore unsuitable for forensic use. Interestingly, the
Lithuanian Institute of Forensic Examination in Vilnius reports a system called Transcriber, produced by
the Speech Technologies Centre, Russia, which it claims to be using for the automatic conversion of
speech to text.