Psychoacoustic Phenomena: Temporal Perception and Binaural Audio
Joseph Campo
Founder & Engineer

Monday, November 7, 2022

Psychoacoustics, the science of how humans perceive sound, is a cornerstone of audio mixing. It bridges theory and practice, offering insights that audio engineers and mixers can use to shape a listener's experience. This article covers the theoretical underpinnings of psychoacoustics, looks closely at temporal perception and binaural hearing, and surveys the software tools that help engineers apply psychoacoustic principles in a mix.


Theoretical Underpinnings

The theoretical framework of psychoacoustics in mixing revolves around understanding how humans perceive sound and leveraging this perception to create a desired auditory experience. One primary phenomenon in this realm is temporal perception, which encompasses how listeners perceive the timing and sequencing of sounds. This understanding is pivotal as it informs decisions regarding rhythm, tempo, and synchronization—elements foundational to creating a coherent musical piece.


Temporal Perception in Mixing

Temporal perception extends beyond the timing and sequencing of sounds; it also covers echo, reverberation, and related phenomena. An echo is heard when a reflection reaches the listener late enough to register as a distinct repeat of the original sound. Reverberation arises when sound reflects many times off various surfaces before reaching the listener, producing a lingering tail that decays over time. Controlling these temporal phenomena lets mixers create a sense of space and depth in a recording, significantly enhancing the listener's experience.


Echo and Reverberation

Echo is the perception of sound reflections that arrive late enough to be heard as distinct repetitions of the original sound. As a rough guide, reflections delayed by more than about 50 ms are heard as separate echoes, while earlier ones tend to fuse with the direct sound. Reverberation, conversely, is the persistence of sound in a space after the original sound has ceased, created by a multitude of reflections off the surfaces within the room. Understanding and manipulating echo and reverberation in mixing can help create a sense of depth and space, making the sonic environment feel more natural or otherworldly, depending on the creative intent.
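The difference between the two effects can be sketched in a few lines of Python with NumPy. This is an illustration only, not how any particular plugin works: the echo is a single delayed copy, and the reverb is a convolution with exponentially decaying noise, a crude stand-in for a measured room response. The function names, the default delay, the RT60 value, and the wet/dry mix are all assumptions chosen for the example.

```python
import numpy as np

def add_echo(signal, sample_rate, delay_s=0.25, gain=0.5):
    """Mix one delayed, attenuated copy of the signal back in (a single echo)."""
    delay_samples = int(round(delay_s * sample_rate))
    out = np.zeros(len(signal) + delay_samples)
    out[:len(signal)] += signal              # direct sound
    out[delay_samples:] += gain * signal     # one distinct repeat
    return out

def add_reverb(signal, sample_rate, rt60_s=1.2, mix=0.3, seed=0):
    """Convolve with decaying noise (assumed, synthetic impulse response)."""
    rng = np.random.default_rng(seed)
    n = int(round(rt60_s * sample_rate))
    t = np.arange(n) / sample_rate
    # Amplitude envelope that falls by 60 dB over rt60_s seconds.
    envelope = 10.0 ** (-3.0 * t / rt60_s)
    impulse_response = rng.standard_normal(n) * envelope
    impulse_response /= np.sqrt(np.sum(impulse_response ** 2))  # unit energy
    wet = np.convolve(signal, impulse_response)
    dry = np.zeros(len(wet))
    dry[:len(signal)] = signal
    return (1.0 - mix) * dry + mix * wet
```

Feeding a single click through `add_echo` yields two clicks separated by the delay, while `add_reverb` smears the same click into a dense, decaying tail.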


Temporal Masking

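Temporal masking occurs when a sound is made inaudible or less distinct by another, usually louder, sound that occurs shortly before or after it. Masking by a preceding sound is called forward masking (post-masking); masking by a sound that follows is called backward masking (pre-masking), and it acts over a much shorter window. Both can reduce the clarity of individual sounds within a mix. Understanding temporal masking can guide mix engineers in making informed decisions regarding the timing and arrangement of sound events, thereby improving the clarity and impact of a mix.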
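As a rough planning aid, the idea can be sketched as a check on whether a quiet event falls close enough to a loud one to be at risk. The window lengths below (about 20 ms before and 200 ms after the masker) are assumed ballpark values for illustration; real masking also depends on the levels and frequency content of both sounds, which this hypothetical helper ignores.

```python
def masking_risk(masker_time_s, target_time_s,
                 pre_window_s=0.020, post_window_s=0.200):
    """Classify a quiet target's timing relative to a loud masker.

    The window lengths are assumptions, not measured thresholds.
    """
    offset = target_time_s - masker_time_s
    if offset == 0:
        return "simultaneous"
    if -pre_window_s <= offset < 0:
        return "backward"   # target comes just before the masker
    if 0 < offset <= post_window_s:
        return "forward"    # target comes just after the masker
    return "none"

# A ghost note 10 ms before a snare hit, and a hi-hat 50 ms after it.
print(masking_risk(1.000, 0.990))
print(masking_risk(1.000, 1.050))
```

If an important detail lands in either window, nudging its timing or level is one way to keep it audible.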


Transient Shaping

A transient is the initial attack phase of a sound, and it strongly influences the perceived timbre and loudness of that sound. Transient shaping, the process of manipulating the transient response of audio signals, is a powerful tool in mixing. It allows mix engineers to control the punch, clarity, and presence of sounds, ensuring each element sits well and conveys the desired emotional impact.
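One common way to build a transient shaper, sketched here under stated assumptions, is to compare a fast and a slow envelope follower: where the fast envelope runs ahead of the slow one, the signal is in its attack phase and can be boosted or softened. The time constants, the `amount` parameter, and the function names are illustrative choices, not the design of any specific commercial processor.

```python
import numpy as np

def envelope(signal, sample_rate, time_constant_s):
    """One-pole smoothing of the rectified signal."""
    coeff = np.exp(-1.0 / (time_constant_s * sample_rate))
    env = np.zeros(len(signal))
    level = 0.0
    for i, x in enumerate(np.abs(signal)):
        level = coeff * level + (1.0 - coeff) * x
        env[i] = level
    return env

def shape_transients(signal, sample_rate, amount=1.0,
                     fast_s=0.001, slow_s=0.050):
    """Boost (amount > 0) or soften (-1 <= amount < 0) attacks."""
    signal = np.asarray(signal, dtype=float)
    fast = envelope(signal, sample_rate, fast_s)
    slow = envelope(signal, sample_rate, slow_s)
    # The fast envelope exceeds the slow one only while the level is rising.
    attack = np.clip((fast - slow) / (fast + 1e-12), 0.0, 1.0)
    gain = 1.0 + amount * attack
    return signal * gain
```

With `amount=1.0` the first few milliseconds of a drum hit can be lifted by up to 6 dB while the sustained portion is left close to unity gain; a negative `amount` pushes the hit further back in the mix.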


Binaural Hearing in Mixing

Binaural hearing is another significant aspect of psychoacoustics in mixing. It refers to the ability of humans to perceive sound in three dimensions due to the auditory system's capacity to process sounds from two ears. Binaural hearing enables listeners to sense the direction and distance of sound sources, which is fundamental to creating a realistic and immersive sound experience.


Head-Related Transfer Function (HRTF)

The Head-Related Transfer Function (HRTF) is a response that characterizes how an ear receives a sound from a point source in space. It encapsulates the anatomical features of the listener's head, ears, and torso, influencing how sound waves are filtered before reaching the eardrum. In mixing, understanding and leveraging HRTF can assist in creating realistic spatial localization and a three-dimensional sound field, especially in binaural and surround sound mixing.
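In the time domain an HRTF is applied as a pair of head-related impulse responses (HRIRs), one per ear, convolved with the source signal. The sketch below assumes NumPy and uses toy placeholder HRIRs, not measured data; in practice the pair would come from a measured HRTF dataset for the desired source direction. The function name `binauralize` is hypothetical.

```python
import numpy as np

def binauralize(mono, hrir_left, hrir_right):
    """Filter a mono source with a left/right HRIR pair.

    Both HRIRs must have the same length. Returns an (n, 2) stereo array.
    """
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=1)

# Toy placeholder HRIRs (NOT measured): a source on the listener's left
# reaches the left ear first and at full level, the right ear later and quieter.
hrir_left = np.zeros(64)
hrir_left[0] = 1.0
hrir_right = np.zeros(64)
hrir_right[30] = 0.5

mono = np.zeros(256)
mono[0] = 1.0  # a single click
stereo = binauralize(mono, hrir_left, hrir_right)
```

Measured HRIRs also carry the direction-dependent spectral filtering of the outer ear, which these placeholders omit.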


Interaural Time Difference (ITD) and Interaural Level Difference (ILD)

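The Interaural Time Difference (ITD) and Interaural Level Difference (ILD) are pivotal cues for sound localization. ITD is the difference in arrival time of a sound at each ear, while ILD is the difference in sound level at each ear, caused largely by the head shadowing the far ear. ITD cues dominate at low frequencies (below roughly 1.5 kHz), and ILD cues dominate at higher frequencies, where the head blocks shorter wavelengths more effectively. By manipulating these cues in mixing, audio engineers can create a compelling sense of spatial localization and immersion, enhancing the listener's engagement with the auditory experience.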
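To get a feel for the magnitudes involved, ITD can be approximated with Woodworth's spherical-head formula, ITD = (a / c)(θ + sin θ), where a is the head radius, c the speed of sound, and θ the source azimuth. The sketch below uses that formula with an assumed head radius of 8.75 cm, then pans a mono signal by delaying and attenuating the far ear. The ILD model (a frequency-independent level drop that scales with the sine of the azimuth, up to an assumed 10 dB) is a deliberate simplification, and the function names are hypothetical.

```python
import math
import numpy as np

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate ITD in seconds for azimuths between 0 and 90 degrees."""
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

def pan_with_itd_ild(mono, sample_rate, azimuth_deg, max_ild_db=10.0):
    """Positive azimuth places the source to the right (range -90 to 90)."""
    side = abs(azimuth_deg)
    delay = int(round(woodworth_itd(side) * sample_rate))
    ild_db = max_ild_db * math.sin(math.radians(side))  # assumed ILD model
    mono = np.asarray(mono, dtype=float)
    near = np.concatenate([mono, np.zeros(delay)])
    far = np.concatenate([np.zeros(delay), mono]) * 10.0 ** (-ild_db / 20.0)
    left, right = (far, near) if azimuth_deg >= 0 else (near, far)
    return np.stack([left, right], axis=1)

print(f"ITD at 90 degrees: {woodworth_itd(90.0) * 1000:.2f} ms")
```

Under these assumptions the largest ITD, for a source directly to one side, comes out at roughly two thirds of a millisecond.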


Binaural Recording and Mixing Techniques

Binaural recording and mixing techniques strive to create a 3D stereo sound sensation for the listener, mimicking the natural hearing process. Binaural recordings capture sound as the human ears perceive it, with interaural time and level differences intact. In binaural mixing, audio engineers leverage software tools and plugins to replicate or simulate this natural hearing experience, even if the original recordings were not captured binaurally.


Technological Advancements: Software and Tools

Technological advancements have produced software tools for exploring and manipulating psychoacoustic principles in mixing. An editor such as Audacity can generate and edit tones to create binaural beats: when each ear receives a tone of slightly different frequency, the brain perceives a third, pulsing tone whose rate equals the difference between the two frequencies. Additionally, BHeare software leverages the auralization program EASE to recreate room acoustics and simulate binaural mixes in a particular space.
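The binaural beat stimulus itself is simple to construct. As a scripted stand-in for doing this by hand in an audio editor, the following NumPy sketch builds a stereo buffer with one sine tone per ear; the frequencies, level, and function name are assumptions picked for the example.

```python
import numpy as np

def binaural_beat(base_hz=200.0, beat_hz=10.0, duration_s=5.0,
                  sample_rate=44100, level=0.3):
    """Return an (n, 2) array: base tone in the left ear, offset tone in the right."""
    n = int(round(duration_s * sample_rate))
    t = np.arange(n) / sample_rate
    left = level * np.sin(2 * np.pi * base_hz * t)
    right = level * np.sin(2 * np.pi * (base_hz + beat_hz) * t)
    return np.stack([left, right], axis=1)

# 200 Hz in the left ear and 210 Hz in the right: a 10 Hz perceived beat.
stereo = binaural_beat(base_hz=200.0, beat_hz=10.0, duration_s=1.0,
                       sample_rate=8000)
```

The effect depends on each ear hearing only its own tone, so it requires headphones; over loudspeakers the two tones mix acoustically and produce an ordinary beat instead.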

Furthermore, various software tools and plugins like Waves Nx, dearVR pro, and others enable mix engineers to create binaural mixes. These tools often come with HRTF filters, spatialization algorithms, and other features that allow engineers to manipulate a sound’s spatial characteristics, creating a realistic three-dimensional auditory experience.


Ambisonics

Ambisonics is a full-sphere surround sound technique that allows periphonic (including height) sound reproduction. While not strictly binaural, Ambisonics can be decoded to binaural for headphone listening, allowing for an immersive 3D sound experience. Mixing in Ambisonics requires a different approach than traditional stereo or surround mixing, and tools like the Facebook Spatial Workstation or Ambi Head HD plugin enable mix engineers to work in this expansive sound field.
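At first order, placing a mono source in the Ambisonic sound field comes down to weighting it into four channels by direction. The sketch below assumes the traditional B-format convention (channels W, X, Y, Z with W scaled by 1/√2, azimuth measured counterclockwise from straight ahead); other conventions, such as AmbiX, order and normalize the channels differently, so check what a given tool expects. The function name is hypothetical.

```python
import math
import numpy as np

def encode_first_order(mono, azimuth_deg, elevation_deg=0.0):
    """Encode a mono signal into traditional first-order B-format (W, X, Y, Z)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    mono = np.asarray(mono, dtype=float)
    w = mono / math.sqrt(2.0)                 # omnidirectional component
    x = mono * math.cos(az) * math.cos(el)    # front-back
    y = mono * math.sin(az) * math.cos(el)    # left-right
    z = mono * math.sin(el)                   # up-down (height)
    return np.stack([w, x, y, z], axis=1)

# A source 90 degrees to the left, at ear height.
b_format = encode_first_order(np.ones(4), azimuth_deg=90.0)
```

A binaural decoder then turns these four channels into a two-channel headphone signal, typically by rendering them through HRTFs.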


Conclusion

These domains of psychoacoustics offer a fascinating dive into the interplay between the physical properties of sound, the human auditory system, and the emotional response elicited by different auditory experiences. Applying these psychoacoustic principles in mixing not only enhances the technical quality of the mix but also elevates emotive communication and the overall listening experience, making them indispensable tools in the hands of adept mix engineers.