Bit depth & Rate

When encoding audio digitally we have to find a way to take the vibrations we hear in the real world and store them as definite on-or-off values: in other words, binary 1s and 0s.

Consider the below visualization of a digital recording.

Each column is considered a sample.

Each sample can be up to 8 units high. This is our bit depth. When interpreted by a playback device, a value of 1 will be silence and a value of 8 will be the loudest possible sound.

From left to right we see the rate. We can't tell how much time is shown here, as that is defined by the file's header (meaning it's stored once at the beginning of the file and not in the main data itself). Let's say the image above shows 1 second.

There are 16 columns so there are 16 samples over one second.

8 bits multiplied by 16 samples a second means we have a total bit rate of 128 bits per second.

What's more, the number of values you can record in 8 bits isn't just "8". Each bit can hold one of two states, and there are 8 of them, so the number of combinations is 2 to the 8th power: 256 possible values for each sample.

Keep in mind that the example above is incredibly simplified. An actual audio file for a professionally mixed song might store 16 bits per sample, totaling 65,536 possible volume levels, and have a sampling rate of 44,100 samples per second just to reach decent audio quality.
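To make the arithmetic concrete, here is a small sketch in Python. The function name and the mono/stereo parameter are just for illustration, not part of any audio format.

    # Rough sketch of raw (uncompressed) audio bit-rate math.
    def bit_rate(bit_depth, sample_rate, channels=1):
        # bits per second = bits per sample x samples per second x channels
        return bit_depth * sample_rate * channels

    print(bit_rate(8, 16))        # the toy example above: 128 bits per second
    print(bit_rate(16, 44_100))   # 16-bit mono at 44,100 Hz: 705,600 bits per second
    print(2 ** 8, 2 ** 16)        # possible levels per sample: 256 and 65536

Doubling the channel count for stereo simply doubles the total.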

CBR vs. VBR

One way of keeping file sizes down is to reduce the sample depth or the sampling rate a file records: not across the entire file, but only where it is needed.

In a song, for instance, you'll want a high bit rate for vocal sections or for instruments that have many subtleties and textures. But what about quieter passages? If a single violin is holding a single note, there is far less detail to capture, and the encoder can get away with a lower bit rate for that stretch.

Furthermore, the silence that probably exists at the beginning or end of a song doesn't really need any depth at all, so why record it at a sample depth of several hundred values?
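As a toy illustration of why variable bit rate helps, here is a back-of-the-envelope comparison for a hypothetical three-minute track. Every number below is made up for the sake of the example.

    # Constant bit rate: the whole 180-second song at 320 kbps.
    cbr_bits = 320_000 * 180

    # Variable bit rate: pretend 30 s of near-silence at 32 kbps
    # and 150 s of dense mix at 256 kbps.
    vbr_bits = 32_000 * 30 + 256_000 * 150

    print(cbr_bits / 8 / 1_000_000)   # ~7.2 MB
    print(vbr_bits / 8 / 1_000_000)   # ~4.9 MB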

File formats

Under construction.

Most compressed file formats are lossy; the question is to what degree. A few, like FLAC, manage to compress without discarding any data at all.

Below is a basic rundown of the more popular file types.

              WAV   MP3   FLAC   Vorbis
Lossless       X           X
Proprietary          X

MIDI

MIDI is a digital standard for encoding notes and their related properties instead of actual sound.

"MIDI" stands for Musical Instrument Digital Interface. It's a musical format that you've probably heard of throughout the years (it was standardized in 1983) because of its ubiquity. It was a standard format for games when the games themselves were made up of only a few kilobytes, is used as a low-space format for cell phones or other electric devices, and retains its use within the music industry as a way for devices to communicate with each other.

MIDI files have a few inherent qualities...

  • They DO NOT contain sound data (individual sample values are not stored).
  • They DO contain information about each note itself, including but not limited to volume, note length, pitch, panning, velocity, and instrument type.
  • Because of this the file size of a MIDI file can be extremely small compared to formats that store thousands of samples per second (see the byte sketch after this list).
  • The instrument sound produced depends entirely on the sounds stored either on the onboard sound card or in a plug-in the MIDI signal is traveling through.
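To give a sense of how little data that is, here is a sketch of the raw bytes behind a single note-on / note-off pair (channel 1, middle C). The variable names are just descriptive labels, but the status bytes and note numbers are part of the MIDI standard.

    # 0x90 = "note on" for channel 1, 0x80 = "note off" for channel 1.
    NOTE_ON, NOTE_OFF = 0x90, 0x80
    MIDDLE_C = 60      # MIDI note number for middle C
    VELOCITY = 100     # roughly how hard the key was struck

    note_on  = bytes([NOTE_ON,  MIDDLE_C, VELOCITY])
    note_off = bytes([NOTE_OFF, MIDDLE_C, 0])

    # Six bytes describe the entire note; one second of 16-bit, 44,100 Hz
    # mono audio needs 88,200 bytes.
    print(len(note_on) + len(note_off))   # 6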

Most DAWs will actually store MIDI as their default format for piano rolls. Whether it's FL Studio or ProTools, unless you've only ever worked with audio clips, you've probably already used MIDI.

It's also important to recognize that there are different MIDI standards, defining not just the file format but also which instruments make up the library. "General" MIDI is one of the most ubiquitous and, for instance, defines instruments ("programs") 1-8 as pianos and 9-16 as chromatic percussion (glockenspiel, vibraphone, and so on).

The Port (number)

Ports are how your software and hardware connect inputs to outputs internally. Setting a controller to port 17 and a generator plug-in to the same port will allow that controller to be played through that generator.

These should NOT be confused with the input / send channels you might find in the Mixer window of a DAW.

Fun fact: in most DAWs this doubles as a way to get two instruments to play from the same piano roll. Set the output port of the channel holding the notes to match the input port of a second channel, and the first should drive the second.

The Patch

The "Patch / program" is basically the instrument selected.

The Bank

If supported by your current MIDI set, banks define subsets (variations) of the various instruments. For instance, in the General MIDI 2 standard patch 04 defines Honky-Tonk Pianos: bank 0 of that patch denotes the normal version, while bank 1 denotes a "wide" style piano.
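Assuming a GM2-compatible synth, the bank and patch are chosen with two Bank Select controller messages followed by a Program Change. Here is a sketch of those raw bytes for the Honky-Tonk example above, on channel 1.

    # Bank Select is controller 0 (MSB) and controller 32 (LSB);
    # 0xB0 = control change on channel 1, 0xC0 = program change on channel 1.
    bank_msb = bytes([0xB0, 0, 121])   # 121 = GM2 melodic instrument bank
    bank_lsb = bytes([0xB0, 32, 1])    # LSB 1 = the "wide" variation
    program  = bytes([0xC0, 3])        # program 4 (Honky-Tonk), counted from 0

    # Sending these three messages in order switches the channel's instrument.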

Tempo, BPM, spacing

There are quite a few terms that describe the speed of a song over a given period of time. The generic term "tempo" does NOT necessarily denote a standardized system of measurement. In classical music, for example, you'll often hear keywords like "allegro" or "presto" that describe the speed of a piece; they can be recognized through experience, but their exact meaning varies between musicians. Nonetheless, within a DAW, "tempo" is typically synonymous with BPM.

More definitive are terms like "BPM", or Beats Per Minute. In the digital realm this literally refers to the count on which many properties of your sound depend, like delays and quantized notes, but not necessarily the position of notes themselves. It's perfectly possible for a note to fall between the "beats" of your project.

The most important thing to realize about "tempo" and "BPM" is that they are not related to your recording quality or sample rate. They often use the same units of measurement, seconds and minutes, but are NOT implicitly tied to one another. You can have a 120 BPM song at 40 or 80 kHz, you can have a 10 BPM song recorded at 20 or 60 kHz, or any other mix of tempo and sample rate.

  • Tempo describes what you hear.
  • Sample rate describes what your computer "hears".
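A small sketch of that independence, using the numbers from above (the helper function is just for illustration):

    # How many samples long is one beat? Tempo sets the seconds per beat,
    # the sample rate sets how finely that time is sliced.
    def samples_per_beat(bpm, sample_rate):
        return (60.0 / bpm) * sample_rate

    print(samples_per_beat(120, 40_000))   # 20000.0
    print(samples_per_beat(120, 80_000))   # 40000.0  -- same tempo, more samples
    print(samples_per_beat(10, 20_000))    # 120000.0 -- slower tempo, same rate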