Using proper microphone placement

Headset, lavalier, handheld, table top, and goose neck microphones

The title of this section lists the various types of microphones that can be used in the field. I have personally found each to have unique advantages but also to pose unique challenges in the field. Rather than try to argue in favor of one particular solution, I would advise you to try to use a technique that will enable you to keep the microphone relatively close to the talker's lips, and, as importantly, at a constant distance (as seen in Figure 1), slightly off to the side of the mouth. The present discussion concerns general microphone types only; for a summary of electrical and acoustic options available, please refer to this section of my website.

Mic placement

Figure 1. Three types of microphone placement typically used in field recording; note the relatively close placement to the talker's lips, slightly off to the side of the mouth

There is a very good argument against introducing too much technological (and other) complexity into an interview situation (see the observer's paradox, for example). While I completely agree with this sentiment, I must also point out that one can make an equally strong argument in favor of technological sophistication. I often argue that those who can strike the right balance between content and technology are most likely to acquire reliable field data.

Microphones

Figure 2. With the right technique, virtually any microphone type should perform admirably, courtesy Audio-Technica

Spectral detail

I frequently use the term "spectral detail" throughout this website. What exactly is "spectral detail" and how important is it? It is a very good question. One might argue, and quite rightfully so, that we can understand speech very well in various adverse acoustic conditions involving excessive noise (e.g., a college cafeteria) and signal degradation (e.g., over the telephone). While the auditory system can do extremely well with a degraded signal, speech analysis software cannot. For analysis and/or synthesis, we need as much spectral detail as possible. The amount of spectral detail can be influenced by a number of factors, such as signal-to-noise ratio, microphone placement, recording levels, recording medium, vocal effort, etc. Just remember one thing: while a given recording may sound good, it may be completely unusable for analysis purposes. Figure 3 shows two spectrograms of the word "job." Both recordings sound perfectly intelligible and would pose no particular challenge for the auditory system. However, spectrographic analysis demonstrates that recording A (left panel) has significantly less spectral detail than recording B (right panel). In recording A, formant tracks are much less clearly defined, the bandwidths appear increased, there is a lot of extraneous noise, and there appears to be a bump in the low frequencies. Analysis software simply would not be able to extract much useful and reliable information from recording A.

Spectral detail

Figure 3. Differences in spectral detail between two recordings of the word "job"

In purely acoustic terms, microphone placement helps control the amount of spectral detail and the signal-to-noise ratio (SNR). Both deteriorate as the distance between the microphone and the talker's lips increases. Figure 4 illustrates the point. It shows three recordings captured at approximately the same recording level (RMS of 74 dB) at distances of 3 cm, 15 cm, and 30 cm from the talker's lips, respectively. You can see a deterioration in SNR from left to right. Subjectively, all three samples sound decent and perfectly intelligible, but their usefulness for acoustic analysis is increasingly compromised by unfavorable SNR as the distance grows.
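To put rough numbers on the effect of distance, here is a minimal sketch based on the textbook inverse-distance law, which assumes a point source in a free field. A talker's mouth at close range and a real room only approximate that, so treat the results as ballpark figures and not as the measurements behind Figure 4. The function names are illustrative only.

```python
import math

def level_change_db(ref_distance_cm, distance_cm):
    """Change in direct-sound level (dB) when the microphone moves from
    ref_distance_cm to distance_cm.
    Assumption: inverse-distance law (point source, free field)."""
    return 20.0 * math.log10(ref_distance_cm / distance_cm)

def snr_db(signal_rms, noise_rms):
    """Signal-to-noise ratio in dB from two RMS amplitudes."""
    return 20.0 * math.log10(signal_rms / noise_rms)

# The three distances discussed in the text, relative to the closest one.
distances_cm = [3, 15, 30]
changes = {d: level_change_db(3, d) for d in distances_cm}

for d, change in changes.items():
    print(f"{d:>2} cm: {change:6.1f} dB relative to 3 cm")
```

Under these assumptions the speech level at the microphone falls by about 14 dB at 15 cm and 20 dB at 30 cm. If the ambient noise stays constant and the recording gain is raised to reach the same RMS level, the noise is raised with it, so the SNR drops by roughly the same amount.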

3 distances

Figure 4. Comparison of SNR at three different recording distances

Quantifying microphone placement

A few years ago, I did a quick comparative study of the reliability of formant analysis as a function of microphone placement. I tested three scenarios: (1) a built-in microphone (with a Marantz recorder), (2) a lavalier microphone (with a MiniDisc recorder), and (3) a headset microphone (with a digital recorder). The type of recorder, though important, is not crucial because each of the three recorder types has adequate frequency response and dynamic range for basic formant analysis. The key to reliability is microphone placement technique. Not surprisingly, the headset produced the most reliable (consistent and accurate) results. Please feel free to download this paper free of charge. It is published on my website only, so you can cite http://bartus.org as publisher: Plichta, B. (2004). Signal acquisition and acoustic analysis of speech. http://bartus.org

Improving signal acquisition through testing

The case above is a rather stark example of how the lack of spectral detail may influence analysis. Let us now consider a much more subtle case. Suppose you are interested (as I often am) in low-frequency information for the analysis of voice quality, nasalization, phonation, etc. Reading spectrographic information alone, as we did above, may not be sufficient. Fortunately, you need to do this type of detailed analysis only once, at the beginning of a project. You need to make sure that your choice of equipment and technique will enable you to have confidence in your data.

To illustrate the point, I generated a waveform consisting of five frequencies of equal amplitude, spanning 100 Hz to 500 Hz at 100-Hz intervals. I played the test signal out of a flat-response, low-frequency reference loudspeaker at a distance of 15 cm from the microphone, at a level of 70 dB SPL. I recorded the sound (at 22,050 Hz/16-bit) with two different microphones: (1) Behringer ECM8000 and (2) Audio-Technica AT831b. Both are really nice, but really different, microphones. The Behringer is an omnidirectional (so-called "measurement") microphone, while the Audio-Technica is a directional lavalier. I like both microphones, but use them for different purposes. Figure 5 shows spectral differences between the two recordings. We expect the FFT to return five equal peaks. Only the Behringer gets it right, while the Audio-Technica shows signal attenuation in the low-frequency range. Performing a customized test helps us quantify the low-end response of these two microphones in a way that informs our research and helps us make educated choices about equipment and methodology. There are probably good engineering reasons why the AT831b was designed to attenuate low frequencies, but in the analysis of nasalization from narrow-band spectra, we "live and die" by these low-frequency harmonics. We want the low end to be as neutral as possible.
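A test signal like the one described above could be sketched as follows. This is not the original script. It is a minimal illustration that assumes a one-second duration (so that each test frequency falls exactly on a 1-Hz analysis bin) and uses a single-bin DFT in place of a full FFT. All function names are made up for the example.

```python
import math

SAMPLE_RATE = 22050                   # Hz, matching the recordings in the text
DURATION_S = 1.0                      # assumption: 1 s gives exact 1-Hz bins
FREQS_HZ = [100, 200, 300, 400, 500]  # five equal-amplitude components
COMPONENT_AMP = 1.0 / len(FREQS_HZ)   # keeps the sum within [-1, 1]

def make_test_signal(freqs, sample_rate, duration_s):
    """Sum of equal-amplitude sine waves, as a list of floats."""
    n = int(sample_rate * duration_s)
    return [
        COMPONENT_AMP * sum(math.sin(2 * math.pi * f * i / sample_rate)
                            for f in freqs)
        for i in range(n)
    ]

def amplitude_at(signal, freq, sample_rate):
    """Amplitude of one frequency component (single-bin DFT)."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * i / sample_rate)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / sample_rate)
             for i, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

signal = make_test_signal(FREQS_HZ, SAMPLE_RATE, DURATION_S)

# Level of each peak in dB relative to the amplitude that was generated.
levels_db = {
    f: 20 * math.log10(amplitude_at(signal, f, SAMPLE_RATE) / COMPONENT_AMP)
    for f in FREQS_HZ
}
for f, level in levels_db.items():
    print(f"{f} Hz: {level:+.2f} dB")
```

For the generated signal itself, all five peaks come out at essentially 0 dB. Running the same analysis on a recording made through a microphone (after loading its samples into `signal`) would show any low-frequency attenuation as peaks that fall below the others.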

Attenuation

Figure 5. Comparison of low-frequency response between the Behringer ECM8000 and Audio-Technica AT831b microphones

By the way, my data show an attenuation even greater than that published in the manufacturer's official frequency response graph (Figure 6). I marked off the attenuated area in red. The dashed line represents the frequency envelope with the roll-off filter ON. My measurements were done with the filter OFF. It would be grossly unfair to allege that Audio-Technica, a very reputable company, published inaccurate data. Rather, I would seek an explanation for the discrepancy in the different measuring techniques. Microphone testing can get really involved really quickly.

I also think of this as a useful exercise in interpreting a frequency response graph, which is available for most professional microphones in production today. You will find frequencies along the x-axis. They span the entire range supported by a given microphone and are typically plotted on a log scale. Along the y-axis, you have response magnitude in dB. The easiest way to think about it is as an input/output system: we record a known signal with the microphone we are testing, say a pure tone at 1,000 Hz and 70 dB SPL, and measure the recorded level of that pure tone. We repeat the test for other frequencies and "connect the dots" on the final graph. Ideally, we should have a perfectly flat I/O response, but that, of course, never happens. We do want, however, no major bumps or dips in the spectrum. Our expectations will also be motivated by our analysis. As I mentioned earlier, for voice quality, phonation, and nasalization, we are especially concerned with low frequencies. For basic formant analysis, we want a flat (or reasonably flat) response below 5,000 Hz. Finally, the entire range of frequencies should be considered, though these days most microphones cover an adequate range. Some exceptions include various telecommunications headsets, built-in laptop microphones, camcorder microphones, digital dictaphones, etc.
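The "connect the dots" procedure can be sketched in code. Since no real recordings are available here, the microphone is replaced by a purely hypothetical stand-in: a simple first-order high-pass filter with an assumed 200-Hz cutoff. It does not model the AT831b or any other real microphone. In practice, you would substitute actual recordings of each test tone for the output of `hypothetical_mic`.

```python
import math

SAMPLE_RATE = 22050  # Hz

def tone(freq, n, sample_rate=SAMPLE_RATE):
    """Unit-amplitude pure tone, n samples long."""
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

def hypothetical_mic(x, cutoff_hz=200.0, sample_rate=SAMPLE_RATE):
    """Stand-in for a microphone with a low-frequency roll-off:
    a first-order high-pass filter. Illustrative assumption only."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    y = [x[0]]
    for i in range(1, len(x)):
        y.append(alpha * (y[-1] + x[i] - x[i - 1]))
    return y

def gain(freq, n=SAMPLE_RATE):
    """Output/input RMS ratio for one test tone (one second long)."""
    src = tone(freq, n)
    out = hypothetical_mic(src)
    skip = n // 10  # drop the first 0.1 s to avoid the filter's start-up transient
    return rms(out[skip:]) / rms(src[skip:])

def response_db(freqs, reference_hz=1000):
    """Response at each frequency in dB relative to the reference tone."""
    ref = gain(reference_hz)
    return {f: 20 * math.log10(gain(f) / ref) for f in freqs}

curve = response_db([100, 200, 500, 1000, 2000, 5000])
for f, level in curve.items():
    print(f"{f:>5} Hz: {level:+.1f} dB re 1 kHz")
```

Each printed value is one "dot" on the response graph. With this stand-in filter, the dots below 1,000 Hz fall progressively further below 0 dB, which is the kind of low-end roll-off the text warns about for nasalization and voice quality work.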

Frequency response

Figure 6. Published frequency response data of the Audio-Technica AT831b microphone; note the attenuated region marked off with red

As I mentioned several times on this website, you can find most microphone data at http://www.microphone-data.com. You will have to register but it is free and you get access to hundreds of microphones (even discontinued ones) and a few very useful articles.