Optimising audio

Understanding what’s occurring when digitising audio can reap massive quality rewards.

The MP3 audio format gained early popularity with its effective balance between audio quality and reduced file size when compared with the huge sizes dictated by its non-compressed relations such as the original WAV. Purists, however, have been quick to point out that although MP3 may be adequate for the average listener, there are still inherent flaws in the lossy format that can only result in a reduction in quality. Yet MP3 has proved popular because it reduces file size by stripping away audio frequencies that fall outside the range of the human ear and would otherwise take up valuable storage space.

There are many established audio applications available that support MP3 export, but before we can consider the best formats and settings for such, we need to have a better understanding of what sound actually is.

The science of sound
Sound is the vibration or variation in air pressure sensed by the ear, and the pattern and rate of these audible variations give sound its unique quality. The range of auditory perception lies between 20 and 20,000 cycles per second (Hz). Air pressure fluctuations outside of this range are imperceptible to the human ear, with those falling below known as infrasonic and those above as ultrasonic. These unique patterns of air pressure variations are known as waveforms, and it is the ability to accurately capture and reproduce these waveforms that results in good sound quality.

The waveform captures the volume, pitch and tone (timbre) so, when played back, such vibrations can be replicated by the speaker cone to give a recognisable emulation of the original audio source. To capture sound at its purest, we need to understand how digitised sound gets converted.

Two values characterise digitised waveforms: sampling rate and bit-depth. A sample is the value and position assigned to a point of an electrical waveform, while the number of samples taken per second is called the sampling rate. Bit-depth refers to the size of the binary numbers assigned to describe the dynamic value of each sample. This might be compared to a movie camera, with sample rate being equivalent to the number of frames per second and bit-depth to the pixel count the sensor is capable of identifying. A higher sample rate will therefore record more frequent audible snapshots, while a higher bit-depth will allow the sound to be positioned more accurately on the final waveform.
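As a rough illustration of how the two settings interact, the sketch below samples a pure tone and rounds each sample to the nearest value the chosen bit-depth can hold. The function name `sample_sine` and the 440Hz test tone are assumptions made for this example, not part of any particular audio package.

```python
import math

def sample_sine(freq_hz, sample_rate, bit_depth, duration_s):
    """Sample a sine wave and quantise each sample to a signed integer.

    Hypothetical helper for illustration only.
    """
    # Largest positive value a signed sample can hold, e.g. 32767 for 16-bit.
    max_value = 2 ** (bit_depth - 1) - 1
    count = round(sample_rate * duration_s)
    samples = []
    for n in range(count):
        t = n / sample_rate                               # time of this sample, in seconds
        amplitude = math.sin(2 * math.pi * freq_hz * t)   # between -1.0 and 1.0
        samples.append(round(amplitude * max_value))      # snap to the nearest level
    return samples

# The same 10ms of a 440Hz tone captured at two different qualities.
coarse = sample_sine(440, 8000, 8, 0.01)     # 8,000 samples per second, 8-bit
fine = sample_sine(440, 44100, 16, 0.01)     # 44,100 samples per second, 16-bit

print(len(coarse), len(fine))   # 80 441
print(2 ** 8, 2 ** 16)          # 256 65536 possible values per sample
```

The higher sample rate yields more snapshots of the same stretch of sound, while the higher bit-depth gives each snapshot far more possible values to land on.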

Generally, the agreed standard for the sample rate is 44,100Hz, the rate used for audio CDs. A sample rate must be at least twice the highest frequency you want to capture, so 44,100Hz comfortably covers the 20,000Hz upper limit of our hearing range. Sampling at reduced rates can be compared to scanning a photograph at lower resolutions: you may recognise the sound, although quality will suffer. For CD-quality storage the bit-depth should be set to 16-bit.
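The arithmetic behind these figures is simple enough to check for yourself. The sketch below works out the highest frequency a given sample rate can represent (half the sample rate) and the storage needed for uncompressed audio. The function names are hypothetical, and the two-channel figure assumes a stereo recording, as on an audio CD.

```python
def highest_frequency(sample_rate_hz):
    """Highest frequency a sample rate can represent (half the rate)."""
    return sample_rate_hz / 2

def uncompressed_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Storage needed for raw, uncompressed audio of the given length."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds

print(highest_frequency(44100))    # 22050.0, just above the 20,000Hz hearing limit

# One minute of CD-quality audio, assuming two channels (stereo).
one_minute = uncompressed_bytes(44100, 16, 2, 60)
print(one_minute)                  # 10584000 bytes, roughly 10MB
```

That figure of roughly 10MB per minute is why compressed formats such as MP3 became so attractive.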

You may choose lower rates for simple recordings such as vocal narration or if your audio is intended for online distribution. But as with images, it’s generally recommended that original recordings should have the highest possible settings from which you can then convert or optimise to the end media.

Audio can be sourced from various media. CD is the simplest, with most audio applications featuring some form of rip feature (take care over the format in which it is saved). Some applications favour MP3, while others prefer WMA, but these are compressed options. For the purest duplication, the humble WAV must be the format of choice.

Converting formats
MusicMatch (www.musicmatch.com) is a flexible tool for saving and converting simple audio clips. It features a recorder that allows various settings across the main formats of MP3, MP3Pro, WMA and WAV. The software also has the advantage of supporting line-in recording, so you can extend your media sources from CD to anything you choose to plug into your soundcard. For example, this might be a microphone for vocal narration in a home movie, or another analogue source such as a record player or tape deck.

With such an option in place, it becomes a routine task to record the output from such devices onto your hard drive, as you digitise your older music collection using the best quality settings for sample rate and bit-depth to most accurately capture the sound. But with such captures comes the problem of traditional analogue sources, in the form of pops and noise that you may want to remove from your recordings.

To counter such unwanted features you’ll need to turn to more dedicated waveform editing software such as the open source Audacity (audacity.sourceforge.net) or Adobe Audition (www.adobe.com). Audacity is the simpler of the two and features an easy-to-use Noise Removal feature, which requires you to specify a few seconds of the unwanted noise (usually found between the tracks of an LP) so that Audacity can recognise the frequency of sound you want to remove. You can then adjust the amount of reduction if you find you’re losing too much top end, which makes this a quick fix to the problem at hand. Audition provides a more accurate method of noise removal. The Click/Pop Eliminator effect can be run both automatically and manually, so you can determine the most appropriate threshold parameters with precision should you wish to apply more or less processing to quieter or louder sections of music. Using such a tool, a more experienced user can run progressive passes where each pass is faster than the previous to gradually shave such unwanted elements for the best results.

Advanced editing
Audition also includes a powerful frequency space editing feature, which is similar to Photoshop’s Clone tool. Using this new feature, you can select and isolate unwanted sounds such as scratches or even coughs from a recording, and allow Audition to process and remove the anomaly by matching the frequencies from the surrounding waveform.

Both Audition and Audacity contain numerous complementary tools and features that will help you achieve the best quality from your audio sources. For example, both include pitch- and time-stretching options that keep the tonality of a sound source reasonably unaffected when you speed it up or shorten its playback time. Such editing tools complement the various mixing controls, so you can balance multiple sound sources to create and edit sound tracks, narration and vocal lines to accompany a video source, for example. And of course, both titles let you output your final mix to various formats, from the regular WAV and MP3 options through to alternative and emerging formats such as Ogg Vorbis, should you prefer different methods of compression for your audio clips.

But as you can see, if you’re serious about your sounds, there’s plenty more potential in converting audio than simply sticking a CD into your computer, waiting for Windows Media Player to tag its details and doing a wholesale copy to WMA.

Adding effects

Digital effects provide another method of adding layers and depth to your original audio. They can be used to minimise or mask the occurrence of artifacts as well as alter the tonal characteristics of individual waveforms to match your other source material.

For example, you may have a musical backing of an ensemble performing in the lofty interior of an old church or hall with plenty of natural reverb but an independently recorded voice over that was captured in an insulated sound booth. By adding a little reverb to the vocal, you’ll find the result becomes much more believable.

Most dedicated audio editing packages will provide you with a range of such commonly used effects, including delay for adding harder echo-like effects, as well as more specialist tools such as pitch bending for correcting a few out-of-tune notes.
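To give a feel for what a delay effect does to the underlying samples, here is a minimal sketch of a single echo: a quieter copy of the signal is mixed back in a fixed time later. The function name `add_echo` and its parameters are hypothetical, and samples are assumed to be floating-point values between -1.0 and 1.0. Commercial delay effects are considerably more elaborate.

```python
def add_echo(samples, sample_rate, delay_s, decay):
    """Mix one delayed, quieter copy of the signal back into itself.

    Hypothetical helper: `decay` is the echo volume relative to the
    original (0.5 means half as loud).
    """
    delay_samples = round(delay_s * sample_rate)
    # Extend the output so the echo of the final samples is not cut off.
    out = list(samples) + [0.0] * delay_samples
    for i, s in enumerate(samples):
        out[i + delay_samples] += s * decay
    return out

# A single click, echoed half a second later at half volume.
# The sample rate of 4 per second is unrealistically low, chosen to keep the output short.
click = [1.0, 0.0, 0.0, 0.0]
print(add_echo(click, 4, 0.5, 0.5))   # [1.0, 0.0, 0.5, 0.0, 0.0, 0.0]
```

Note that mixing in an echo can push the combined signal beyond the original peak level, so levels may need to be reduced afterwards to avoid clipping.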

Equalisation

Another commonly used method for removing artifacts while preserving the original audio clip is to use equalisation or EQ. This provides a more precise method of altering the gain or volume of specific frequencies within a given audio region. For example, analogue recordings might suffer from excessive hiss at the high end of the spectrum; a reduction here might provide satisfactory results. The same technique can be applied to minimise the effects of wind on a microphone or to reduce pronounced S sounds (sibilance) during a voice-over.

Audacity has a dedicated Equalisation option under the Effect menu that lets you make use of various predefined curves. Or you can choose to create your own by adding points along the frequency curve to determine where you may want boosts or reductions within the sound spectrum. Other tools will offer the same features, but in different places.
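The simplest possible form of EQ is a high cut, which turns down everything above a chosen frequency. The sketch below uses a basic one-pole low-pass filter to show the idea. It is a deliberately crude stand-in for the multi-band curves offered by Audacity or Audition, not a description of how either works internally, and the function name `high_cut` and the 4,000Hz cutoff are assumptions for this example.

```python
import math

def high_cut(samples, sample_rate, cutoff_hz):
    """One-pole low-pass filter: gently reduces content above cutoff_hz.

    Hypothetical helper for illustration only.
    """
    # Smoothing coefficient derived from the cutoff frequency.
    alpha = 1 - math.exp(-2 * math.pi * cutoff_hz / sample_rate)
    out = []
    previous = 0.0
    for s in samples:
        # Move part of the way towards each new sample, which smooths
        # out rapid (high frequency) changes more than slow ones.
        previous += alpha * (s - previous)
        out.append(previous)
    return out

# A steady signal passes through almost untouched...
steady = high_cut([1.0] * 2000, 44100, 4000)
# ...while the fastest possible wiggle (22,050Hz) is turned down sharply.
hiss = high_cut([1.0, -1.0] * 1000, 44100, 4000)

print(round(steady[-1], 3))
print(round(max(abs(v) for v in hiss[-100:]), 3))
```

The first figure printed stays close to full level while the second is much lower, which is exactly the trade a hiss reduction makes: less top end in exchange for less noise.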

Normalisation

Normalising your audio is a process that makes full use of the available bit-depth and, when applied across a set of clips, helps provide consistent volume levels. This is an important consideration, particularly if you don’t want to shock your listener with extreme unexpected volume changes. The normalisation process applies one uniform amount of amplification to the whole file, raising it to the point where the loudest peak is just below clipping. This makes the overall sound file louder while allowing the listener to set a comfortable playback level throughout.
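In terms of the samples themselves, peak normalisation amounts to finding the loudest point and scaling everything by the same factor. A minimal sketch follows, assuming floating-point samples between -1.0 and 1.0 where 1.0 is the clipping point. The function name `normalise_peak` and the default target of 0.99 are illustrative choices, not settings taken from any particular application.

```python
def normalise_peak(samples, target_peak=0.99):
    """Scale all samples by one gain so the loudest peak hits target_peak.

    Hypothetical helper for illustration only.
    """
    if not samples:
        return []
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)          # pure silence: nothing to scale
    gain = target_peak / peak         # the same gain is applied to every sample
    return [s * gain for s in samples]

quiet = [0.1, -0.25, 0.2]
louder = normalise_peak(quiet)

print(max(abs(s) for s in quiet))     # 0.25
print(round(max(abs(s) for s in louder), 2))   # 0.99
```

Because every sample receives the same gain, the balance between quiet and loud passages within the clip is left unchanged; only the overall level moves.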

Due to the nature of the process, it’s advisable to leave normalisation until you’ve completed your project. Many applications require you to flatten grouped audio clips into a single file. If you’re getting ready to master an audio CD, using Audition’s Group Waveform Normalize is a great way to make sure that all tracks on the CD have a consistent volume.

Chris Schmidt  
  PC Plus Issue 237 - December 2005