What is sound?

To understand how sound works, it helps to first visualize its physical movement in the real world, and then see how that movement is represented by two-dimensional images in a diagram or chart.

Let's start with this video from NPR showing what sound waves tend to "look" like using separated light beams...

Frequency & Amplitude

So how does that fuzzy, unclear, impossible-to-see property of physical matter become the stereotypical "wave" image we are used to seeing?

You will often hear sound described as a "wave", and indeed many visual representations use a wave with crests and troughs to represent sound, but as you've seen that does not mean that sound moves along a line that zig-zags through space. It's better to think of sound as the compression of the air around you. The visual representations we usually see are really akin to a line graph showing this compression over time.

The vertical axis is the compression strength. The horizontal axis is the time passed.
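As a sketch of this idea, here's how such a line graph could be built in code, using a pure sine tone as the signal (the specific frequency and sample rate are arbitrary choices for illustration):

```python
import math

def sample_tone(freq_hz, duration_s, rate=100):
    """Return (time, amplitude) pairs for a pure sine tone.

    The vertical value is the compression strength and the
    horizontal value is the time passed, exactly like the
    line graph described above.
    """
    points = []
    for i in range(int(duration_s * rate)):
        t = i / rate
        amplitude = math.sin(2 * math.pi * freq_hz * t)
        points.append((t, amplitude))
    return points

# One full cycle of a 1 Hz wave, measured 100 times:
graph = sample_tone(1, 1.0)
print(graph[0])                      # starts at the resting state: (0.0, 0.0)
print(max(a for _, a in graph))      # peak compression is close to 1.0
```

Plotting those pairs with time on the horizontal axis reproduces the familiar wave picture.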


Amplitude: a measure of movement from an equilibrium or resting state.

The word "amplitude" can technically encompass the change of something over a specific time or distance as well as the range of that change, but in audio the time component is better described by frequency. With audio it is usually inferred that the starting position is the complete lack of a signal, so amplitude is left to describe only the intensity of the signal at any given point in time.

Waveform: the shape of a graph that represents the compressions of a signal over time or distance.

As you can see in the diagram to the side, and by listening to the audio clip on the right, the visual "wave" matches the general volume level of the audio clip. When the audio gets louder the wave gets higher to represent that increase. What's important to know is that this is an incredibly simplified overview of the entire sound clip. The term "waveform" will typically refer to only a very small section of the audio clip. If we were to zoom in on the surface of that line we'd see that it's not a smooth line at all.

Looking closer at the line you can see that the wave moves in very small ways on top of the large curve. These changes are so minute that you wouldn't be able to identify them on the larger wave.

You might be wondering why the second visualization of the wave has a mid line that the amplitude rises above but also falls below. Not to get too deep into physics, but remember that every action has an equal and opposite reaction, and "vibrations" are just quick actions and reactions. Eventually they grow quiet and die away, but only after all the energy has been spent.

When you hear a constant tone, as opposed to a more complex song or sound effect, it's only partly because the overall wave doesn't change much. The shape of the surface still dictates the kind of sound you hear. This basic shape can take on several forms: sine, square, triangle, sawtooth, and white noise / fuzz.
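As a rough sketch of how these shapes differ, here's one way each could be generated in code. The exact formulas are one common convention, not the only one:

```python
import math
import random

def wave_sample(shape, phase):
    """Amplitude (-1 to 1) of a basic wave shape at a point in its
    cycle, where phase runs from 0.0 to 1.0 over one full cycle."""
    phase %= 1.0
    if shape == "sine":
        return math.sin(2 * math.pi * phase)
    if shape == "square":        # jumps between the two extremes
        return 1.0 if phase < 0.5 else -1.0
    if shape == "triangle":      # rises and falls in straight lines
        return 4 * abs(phase - 0.5) - 1.0
    if shape == "saw":           # ramps up, then snaps back down
        return 2 * phase - 1.0
    if shape == "noise":         # no repeating cycle at all
        return random.uniform(-1.0, 1.0)
    raise ValueError(shape)

# A crude text picture of one cycle of each repeating shape:
for shape in ("sine", "square", "triangle", "saw"):
    row = "".join("#" if wave_sample(shape, i / 40) > 0 else "." for i in range(40))
    print(f"{shape:8} {row}")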

Below are playable sound clips, alongside the visual representation of a small part of their waveform.

Sawtooth (or just "Saw")
White noise (akin to "Fuzz")
Harmonic Partials: other, less prominent frequencies caused by the natural inaccuracies of an instrument that prevent a pure sine tone. These usually appear as multiples of the base frequency.
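To see how partials at multiples of the base frequency shape a tone, here's a small sketch that mixes extra sine waves into a base sine. This particular recipe, odd multiples at diminishing strength, happens to nudge the shape toward a square wave; the 440 Hz base is just an example value:

```python
import math

def tone_with_partials(base_hz, partials, t):
    """Amplitude at time t of a base sine frequency plus harmonic
    partials, each given as a (multiple, relative_strength) pair."""
    value = math.sin(2 * math.pi * base_hz * t)
    for multiple, strength in partials:
        value += strength * math.sin(2 * math.pi * base_hz * multiple * t)
    return value

# Odd multiples at 1/n strength push a pure sine toward a square wave:
square_ish = [(n, 1 / n) for n in range(3, 20, 2)]

t = 0.1 / 440  # a point partway into one cycle of a 440 Hz tone
print(tone_with_partials(440, [], t))          # the pure sine value
print(tone_with_partials(440, square_ish, t))  # pushed closer to the flat top
```

The more partials an instrument adds, and the stronger they are, the further its waveform drifts from a smooth sine, which is exactly the small-scale texture visible when zooming in on a real recording.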

Knowing that these minute wave changes exist in the surface of the larger waveform is key to understanding the concept of frequencies in sound.

It should also be noted that the typical representations of sound waves, as I've used here, are very idealized. Most tones generated by a DAW have parameters that help shape the wave. A more realistic depiction can be seen when we look at the visualizations of the sine and saw samples you can hear above in sound editing software such as Adobe Audition.

The saw and sine waves of a pure-sounding tone over 1/100th of a second. This particular file contained about 450 sample values in that time.


The number of times a signal repeats over a unit of time is known as the Frequency.

The "Frequency" of a sound always describes the same thing, the recurrence of a signal, but the word can be used in two different contexts.

The first context for "frequency" in the digital realm describes the signal rate of individual instruments, vocals, or effects within the mix itself. Here is where we also find the key to what sound is, and how human beings interpret what we know as "sound". In fact, put aside the idea of "instruments" for now. The origin of a sound does not affect how it is received; all sound is just compression of gas, and all compression of gas is vibration through matter.

Your body alone can notice large vibrations such as trains rumbling by, doors closing, or the road as you ride in a car. With large enough vibrations, an earthquake for instance, you can even visibly count the number of times you swing back and forth. These are examples where you feel the vibration several times a second, but what if a vibration has a cycle of several thousand times a second? Your body wouldn't notice it at all, as the vibration has restarted its cycle before you've even felt the end of the first. And this is exactly what the ear of an animal has evolved to do: it's a specialized tool to detect vibrations on a scale your fleshier bits cannot.

The second is the context of a sound file or project you might be working with, though here it is known as the "Sample rate": how many samples are taken over each second of audio. A "sample" is really just another word for amplitude, described above, though you can interpret it as volume for now.

The higher the sample rate, the higher the possible range of sounds you'll be able to achieve in a project, and the higher the general quality.

Here we have three audio samples. The first is a synthetic instrument which was generated at about 44.1 kHz. The second is the same sound "down sampled" to just 2 kHz. The third runs at only 1000 samples a second. Listen for the difference.

What you should notice, assuming you have a good range of hearing, is that the top audio clip with the highest sample rate reproduces the higher pitches within the strum. Remember, it's not the higher sample rate that's creating those sounds; it's simply allowing those sounds from the original instrument to be recorded. The lower the sample rate, the more high-pitched sounds will be lost, simply because the recording can't hold as many samples as it takes to represent a high-pitched frequency.
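As an illustration of what "down sampling" might look like at its crudest, here's a sketch that simply discards samples. Real resamplers do more work (they filter out the high frequencies first), so treat this as a toy model:

```python
import math

def downsample(samples, factor):
    """Crude down-sampling sketch: keep every Nth amplitude value.

    Anything that needed the discarded samples to be represented,
    i.e. the highest pitches, is simply lost.
    """
    return samples[::factor]

# One second of a 440 Hz tone at a CD-style 44,100 samples a second:
rate = 44100
one_second = [math.sin(2 * math.pi * 440 * i / rate) for i in range(rate)]

# Keeping every 22nd sample leaves roughly 2,000 a second,
# similar to the second clip above:
low = downsample(one_second, 22)
print(len(one_second), len(low))
```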

Please note some of these samples might not play in specific browsers.
The original 44.1 kHz.
2 kHz.
1 kHz.

You should have been able to hear more of the "higher" sounds with the first file. Since the second file only contains 2,000 samples a second, it isn't even capable of reproducing frequencies above half that rate (about 1 kHz).
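That half-the-rate limit can be demonstrated with a small sketch: measuring a tone above the limit produces sample values indistinguishable from a lower tone, a phenomenon known as aliasing. The 2,000 Hz rate here matches the second clip:

```python
import math

def sample(freq_hz, rate, count=8):
    """The first few amplitudes of a sine tone measured `rate` times a second."""
    return [round(math.sin(2 * math.pi * freq_hz * i / rate), 6)
            for i in range(count)]

# At a 2,000 Hz sample rate, a 1,500 Hz tone produces the same
# measurements as a 500 Hz tone (just flipped): the recording simply
# cannot distinguish frequencies above half the rate from lower ones.
print(sample(1500, 2000))
print(sample(500, 2000))
```

This is why lowering the sample rate strips away the high pitches first: they either vanish or fold down into tones that were never in the original.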

In this context a digital frequency will always be constant. The technical specs of most file formats won't even allow for the sampling rate of a sound to change after it has started playback.


The interval between the same note on two scales, where the higher note has double the frequency of the lower, defines an Octave.

A quick note about how frequency translates to musical terminology: a note one octave above another has double its frequency, and a note one octave below has half.

For instance, in the image to the left you can see two notes. The note below would be one octave higher than the note above.
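A small sketch of the doubling rule, using the common A4 = 440 Hz tuning reference:

```python
# Each octave doubles the frequency; each octave down halves it.
a4 = 440.0
for octave_shift in range(-2, 3):
    print(f"A{4 + octave_shift}: {a4 * 2 ** octave_shift:g} Hz")
```

This prints A2 at 110 Hz up through A6 at 1760 Hz, each step exactly double the last.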


You'll often hear the loudness of a sound described in "decibels" (dB). A decibel is one tenth (deci) of a bel, a unit that was originally based purely on perception. It is the ratio of two quantities: a magnitude over a reference level, *usually* the lowest of a discernible range.

Decibels are logarithmic units that describe a ratio between two values of a physical quantity.

Since it is a ratio, it is important to remember that using 100 watts to create your sound should make it only about twice as loud as using 10 watts. (This is exactly why in some areas you may see the words "gain" and "loss" used in place of simply "volume".)
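The ratio can be computed directly. Here's a sketch using the standard power formula, ten times the base-10 logarithm of the ratio; the "twice as loud" reading relies on the common rule of thumb that a 10 dB gain is perceived as roughly a doubling of loudness:

```python
import math

def db(power, reference):
    """Decibels: ten times the base-10 log of a power ratio."""
    return 10 * math.log10(power / reference)

print(db(100, 10))   # 100 W vs 10 W is a 10 dB gain
print(db(10, 100))   # the reverse is a 10 dB loss
print(db(20, 10))    # doubling the power adds only about 3 dB
```

Note how unintuitive the scale is: doubling the power adds about 3 dB, while ten times the power is needed for the 10 dB jump that sounds "twice as loud".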

A chart showing the change in decibel level as the input power increases beyond the initial point at which a signal could be received.


Phase describes the angle of a signal: its position within the period of the signal's wave at any given time.

The effects of a sound's phase are not always immediately apparent. Unlike every other quality discussed on this page, you probably won't even notice it while listening to a single sound. But when combining sounds, especially those that occupy the same frequency range, the consequences quickly become apparent.

Here's a quick exercise you can try. In your chosen DAW make a simple oscillator instrument and input two whole notes at C6 and C7. Duplicate the instrument and place two notes for the new instrument at C5 and C6. When played you should get an obvious three-pitch sound. Pan the first oscillator to the left stereo channel and the second to the right.

At this point, if you were to add a phase inversion (probably applied to one channel) you'd find that the C6 sound, the mid tone of the three, completely drops out.
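The drop-out can be sketched numerically: a tone summed with its phase-inverted copy cancels exactly. This is a toy model of what happens to the shared C6, while the notes that exist in only one channel survive untouched:

```python
import math

rate = 1000  # samples per second; an arbitrary toy value

def tone(freq_hz, inverted=False):
    """One second of a sine tone, optionally phase-inverted (flipped)."""
    sign = -1.0 if inverted else 1.0
    return [sign * math.sin(2 * math.pi * freq_hz * i / rate)
            for i in range(rate)]

# A tone mixed with its inverted copy cancels to total silence:
mix = [a + b for a, b in zip(tone(60), tone(60, inverted=True))]
print(max(abs(v) for v in mix))  # 0.0: the tone drops out entirely
```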

Another example that's easy to re-create: in your chosen DAW create a simple one-note, pure tone in an oscillator. Duplicate the channel. In the channel settings for one of them, change the pitch slightly. In FL Studio, for example, you can use the "pitch" knob, which allows a range of up to 200 cents.

Cents are a unit for measuring pitch. One cent is equal to 1/100th of a semitone, and there are 12 semitones to an octave.
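A small sketch of the arithmetic: since a semitone is a frequency ratio of 2^(1/12), one cent works out to a ratio of 2^(1/1200):

```python
def shift_by_cents(freq_hz, cents):
    """Frequency after shifting a pitch by the given number of cents."""
    return freq_hz * 2 ** (cents / 1200)

print(shift_by_cents(440, 1200))  # +1200 cents = one octave: 880.0 Hz
print(shift_by_cents(440, 200))   # +200 cents = two semitones, ~493.9 Hz
```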

What you should notice is a kind of undulating effect in the sound. The amplitude will waver quickly but noticeably, and the meters will appear to "wobble". This happens because changing the pitch in most programs involves contracting or extending the length of the wave. When done only *slightly*, the two waveforms will be in sync some of the time but out of sync at other times, creating a quick wavering effect in the sound.
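The wobble can be sketched numerically as well: two tones a few hertz apart drift in and out of sync, so the peak level of the mix rises and falls a few times a second. The 440/444 Hz pair here is an arbitrary example:

```python
import math

rate = 8000  # samples per second; a toy value for illustration

def tone(freq_hz):
    """One second of a sine tone."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(rate)]

# Two tones 4 Hz apart drift in and out of phase as they play:
mix = [a + b for a, b in zip(tone(440), tone(444))]

# Peak level over each tenth of a second -- this is roughly what the
# meters would show, swelling and fading as the waves align and clash:
chunk = rate // 10
envelope = [round(max(abs(v) for v in mix[i:i + chunk]), 2)
            for i in range(0, rate, chunk)]
print(envelope)
```

The printed envelope alternates between loud and quieter tenths of a second, which is the wavering you hear.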

This particular example is a form of "masking". This is a term used to describe one tone muting the perception of another, such as a 4 kHz tone masking a softer 3.8 kHz tone while not affecting an even quieter 1 kHz tone (this can be alleviated by good stereo placement / channel mixing). We'll come back to this in the mastering section.

Let's end the page with a few more ways to visualize sound.

The first is an example of a "Ruben's Tube" by Jared Ficklin. This device works by channeling speaker vibrations through a cylinder of gas being burnt off on top...

Here is a music video from Nigel John Stanford which makes use of quite a few different techniques...

Finally, if you'd like a rundown on what happens when these vibrations finally get to your head then there are numerous documentary shorts that can help to explain that. CrashCourse for instance has a video on Hearing and Balance specifically.