Representing sound in binary

Sound is used by computer games, music programs and other applications on a computer system. How is sound put into a form that a computer can understand and how can sound stored inside a computer be played through speakers?

Recording sound digitally and playing it back
When you speak or play music into a microphone, the microphone takes the sound waves and converts them into a voltage. As the sound waves vary, so the voltage varies. The microphone is connected to a computer's sound card. This card samples the microphone's voltage at intervals. We can show a sample of some sound on a graph. The signal we are passing to the sound card changes smoothly and continuously, and can take any value within its range, so it is known as an 'analogue signal'.


Let's do what the sound card does and sample the voltage. We will start by doing this at 1 second intervals, starting at 0 seconds. When a sample has been taken, we'll convert the reading into a binary number. The list of binary numbers we end up with is a 'digital signal'.
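The sampling step can be sketched in Python. This is only an illustration under stated assumptions: the function voltage_at is a made-up stand-in for the microphone's voltage, and storing each reading in 4 bits (levels 0 to 15) is an assumed choice, not something a real sound card is limited to.

```python
import math

BITS_PER_SAMPLE = 4                    # assumption: 4 bits gives levels 0 to 15
MAX_LEVEL = 2 ** BITS_PER_SAMPLE - 1   # 15

def voltage_at(t):
    """Hypothetical analogue signal: a voltage between 0.0 and 1.0 at time t seconds."""
    return 0.5 + 0.5 * math.sin(t)

def sample_signal(duration, interval):
    """Take a reading every `interval` seconds, starting at 0 seconds."""
    count = int(duration / interval) + 1
    return [voltage_at(i * interval) for i in range(count)]

def to_binary(voltage):
    """Round a 0.0 to 1.0 voltage to the nearest level and write it as 4 binary digits."""
    level = round(voltage * MAX_LEVEL)
    return format(level, "04b")

readings = sample_signal(duration=10, interval=1)
digital = [to_binary(v) for v in readings]
print(" ".join(digital))
```

Sampling from 0 to 10 seconds at 1 second intervals gives 11 readings, each stored as one group of four bits.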


We have now taken some sound and 'digitised' it. Each group of four bits is one sample. Our file looks like this: 0000 1000 0101 1011 1100 0101 1010 1000 1000 1001 0010

We can now save our digital file and play it back whenever we want to. We do that by taking each sample of data and outputting it, via our sound card, to the speakers. The sound card takes the digital signal and does whatever conversion is necessary so the speakers can play the sound. We can 'hear' what this might sound like by plotting our binary data back on the graph. We will plot our data on top of the original sound file so we can compare them.
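Reading the file back can be sketched in a few lines of Python. The sketch assumes, as in the example above, that each group of four bits is one sample and that samples were taken one second apart starting at 0 seconds. The row of stars is just a rough text stand-in for the graph.

```python
digital_file = "0000 1000 0101 1011 1100 0101 1010 1000 1000 1001 0010"

def decode(bits):
    """Turn each 4-bit group back into a whole-number level (0 to 15)."""
    return [int(group, 2) for group in bits.split()]

levels = decode(digital_file)
print(levels)  # [0, 8, 5, 11, 12, 5, 10, 8, 8, 9, 2]

# Rough sideways "graph": one row per sample, one star per level
for second, level in enumerate(levels):
    print(f"{second:2d}s  {'*' * level}")
```

A real sound card would turn each level back into a voltage for the speakers rather than printing it.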


The red line represents the digital file. In an ideal world, the red line should sit directly on top of the blue line, which represents the original sound we recorded. If the red line did sit on the blue line, then the sound that was played back would be the same as the sound that was recorded. However, it doesn't, so the sound won't be exactly the same as the original. The reason for this is to do with our 'sample interval', which is the time between one reading and the next. Taking a reading every second gave us a small file but the quality was poor. If we halved the interval between readings, we would double the file size but hopefully improve the playback quality. Let's take a reading every half a second instead of every second. Here is the data:


And here is the data on top of the original sound file:


This is a big improvement on what we had before, although the file size has now doubled. We could reduce the interval between samples again, perhaps taking a reading every 0.25s, and this would no doubt improve the recording even more, making it even closer to the original. The price, of course, would be an even bigger file size. You may see the phrase 'sample rate' in any recording software you use. This is simply how many readings we take in a second. It's just a different way of looking at the sample interval: an interval of 0.5s is a sample rate of 2 samples per second, and an interval of 0.25s is a sample rate of 4 samples per second.
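The link between sample interval, sample rate and file size can be checked with a short Python sketch. It assumes 4 bits per sample and a 10 second recording, and for simplicity it ignores the one extra reading taken at 0 seconds. The function names are made up for this illustration.

```python
BITS_PER_SAMPLE = 4  # assumption: each reading is stored in 4 bits

def sample_rate(interval_seconds):
    """Samples per second is one divided by the gap between samples."""
    return 1 / interval_seconds

def file_size_bits(duration_seconds, interval_seconds):
    """Approximate size of the recording in bits."""
    samples = round(duration_seconds * sample_rate(interval_seconds))
    return samples * BITS_PER_SAMPLE

for interval in (1, 0.5, 0.25):
    print(interval, sample_rate(interval), file_size_bits(10, interval))
# 1 1.0 40
# 0.5 2.0 80
# 0.25 4.0 160
```

Each time the interval is halved, the sample rate doubles and so does the file size.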

The next time you 'rip' a CD of music, to turn it into a set of MP3 recordings, and you have to select the quality of the sound, you will understand what is going on. You are choosing how much data is kept for each second of music. Ripping software often shows this as a 'bit rate' in kilobits per second (kbps) rather than as a sample rate, but the trade-off is the same one we have just seen. You can get really good, faithful recordings of the original songs, but the file sizes will increase as the quality improves. This is important because an MP3 player or your phone will have a fixed amount of storage space and you don't want it all used up by three songs! The trick is to find a balance between playback quality and file size.
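To get a feel for the numbers involved, here is a rough Python calculation. It uses the standard figures for CD audio (44,100 samples per second, 16 bits per sample, two channels for stereo). The MP3 figure of 128,000 bits per second is an assumed quality setting chosen for illustration, and the three minute song length is made up.

```python
SAMPLE_RATE = 44_100     # samples per second for CD audio
BITS_PER_SAMPLE = 16     # bits in each sample
CHANNELS = 2             # stereo: left and right
MP3_BIT_RATE = 128_000   # bits per second; an assumed quality setting

def raw_size_bytes(seconds):
    """Size of uncompressed CD-quality sound, in bytes."""
    return SAMPLE_RATE * BITS_PER_SAMPLE * CHANNELS * seconds // 8

def mp3_size_bytes(seconds, bit_rate=MP3_BIT_RATE):
    """Approximate size of an MP3 at a fixed bit rate, in bytes."""
    return bit_rate * seconds // 8

song = 3 * 60  # a three minute song, in seconds
print(raw_size_bytes(song))   # 31752000 (about 32 million bytes)
print(mp3_size_bytes(song))   # 2880000  (about 3 million bytes)
```

With these assumptions the raw recording is roughly eleven times the size of the MP3.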

A last note, on making sound files smaller
You are probably starting to realise that it takes a lot of data to store a good quality sound file. The raw sound files themselves can be very large. This can be a problem if you have limited space on a storage device or you want to listen to 'streamed' music over the Internet. Streamed music is where you listen to the music as it is downloaded from the Internet, as opposed to downloading and saving the whole song first and then playing it. As the first few seconds of a streamed song are downloaded, they are put into a special storage area called a 'buffer' and then playing starts. While that part of the song is played, more of the song is being downloaded and buffered. If the file is too large, however, then you can't download it quickly enough for it to be buffered and played. The song stops for a little while, to give your computer the time to download more of the song before playing can resume.
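The buffering idea can be sketched as a small Python simulation. Everything here is a simplification invented for illustration: time moves in whole seconds, download_speed means 'seconds of music downloaded per real second', and playing only starts (or restarts) once start_buffer seconds of music are waiting. Real players work with bytes and changing network speeds.

```python
import math

def simulate_stream(song_seconds, download_speed, start_buffer=3):
    """Return how many times playback has to pause because the buffer ran dry."""
    if download_speed <= 0:
        raise ValueError("download_speed must be positive")
    downloaded = 0.0   # seconds of music fetched so far
    played = 0         # seconds of music played so far
    pauses = 0
    playing = False
    while played < song_seconds:
        # One real second passes: more of the song arrives
        downloaded = min(song_seconds, downloaded + download_speed)
        buffered = math.floor(downloaded) - played
        # Start (or resume) once enough is buffered, or the whole song has arrived
        if not playing and (buffered >= start_buffer or downloaded >= song_seconds):
            playing = True
        if playing:
            if buffered >= 1:
                played += 1          # play one second from the buffer
            else:
                playing = False      # buffer is empty: playback stalls
                pauses += 1
    return pauses

print(simulate_stream(30, download_speed=2.0))  # 0
print(simulate_stream(30, download_speed=0.5))  # more than 0
```

When the song downloads faster than it plays, the buffer never empties. When it downloads at half speed, playback has to stop and wait several times.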

To get around this problem, sound files are usually 'compressed'. That means that the raw sound file is processed by a 'codec' (short for coder / decoder), a program that applies a set of mathematical rules to squash the file. The file becomes much smaller, so it can be downloaded and streamed more quickly. This is exactly the same idea that we met with picture files. Raw picture files are large, but by applying a codec, we can make them much smaller. You may already know quite a few compression codecs, including MP3 and Ogg Vorbis. Files in these formats have usually been converted from a raw sound file format (e.g. WAV, RAW or AIFF) that a sound recording application used. The raw files have all of the quality. Converting them to MP3 makes the file much smaller but loses a little sound quality. However, most people find it hard to notice any drop in quality.
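The trade-off between size and quality can be seen with a toy example in Python. This is not how MP3 or any real codec works (they use far cleverer methods based on what people can and cannot hear). It is a made-up 'codec' that simply throws away the last two bits of every 4-bit sample from the file used earlier in this lesson.

```python
digital_file = "0000 1000 0101 1011 1100 0101 1010 1000 1000 1001 0010"

def compress(bits):
    """Toy 'coder': keep only the first 2 bits of every 4-bit sample."""
    return " ".join(group[:2] for group in bits.split())

def decompress(bits):
    """Toy 'decoder': pad each 2-bit group back out to 4 bits with zeros."""
    return " ".join(group + "00" for group in bits.split())

small = compress(digital_file)
restored = decompress(small)

original_levels = [int(group, 2) for group in digital_file.split()]
restored_levels = [int(group, 2) for group in restored.split()]

print(small)            # 00 10 01 10 11 01 10 10 10 10 00
print(original_levels)  # [0, 8, 5, 11, 12, 5, 10, 8, 8, 9, 2]
print(restored_levels)  # [0, 8, 4, 8, 12, 4, 8, 8, 8, 8, 0]
```

The compressed file is half the size (22 bits instead of 44), but the restored levels are only close to the originals, not identical. Some quality has been lost for good.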

Q1. Is a microphone an input or an output device?
Q2. Are speakers an input or an output device?
Q3. What is a microphone connected to?
Q4. What is meant by an analogue signal?
Q5. What is meant by a digital signal?
Q6. Name three file formats for digital songs, apart from MP3 (you may need to look on the Internet).
Q7. What is meant by the sample interval?
Q8. What is meant by the sample rate?
Q9. What is the relationship between sample interval and sample rate?
Q10. What is meant by 'ripping a music CD'?
Q11. Explain how music is streamed.
Q12. Why are raw sound files converted into MP3 files?
Q13. State two raw, uncompressed sound file formats.
Q14. State two sound codecs.

Extension work
Experiment with some software for ripping a music CD. You can find lots of applications by searching for a phrase such as 'best free software for ripping a CD'. Find the settings for quality and adjust the sample rate (or the bit rate, if that is what the software offers). Record the same song at the lowest quality and the highest quality. Can you hear the difference when you play the songs back? Compare the file sizes of the same songs recorded at different qualities. How different are they? Do different software applications recording at the same quality produce results that are the same? Experiment!