Scott Gershin Takes the Audience on a Sonic Journey

Scott Gershin is an award-winning sound supervisor, sound designer, and mixer who has been a pioneer and leader in the film and gaming community for over three decades.

After studying mixing and music at Berklee College of Music, Gershin was one of the first to use computers to edit and design sound against picture (using an Atari computer to design Honey, I Shrunk the Kids, 1989).

In the mid-1980s, he became the main sound designer at Soundelux, working on award-winning films like Honey, I Shrunk the Kids, Born on the Fourth of July, Glory, and Steel Magnolias during his first two years there. In 1991, he established Soundelux Media Labs, later called Soundelux Design Music Group (DMG). His vision was to create a sound design think tank supporting multiple industries. In addition to his film work, Scott and his team supported sound design for theme parks, commercials, music videos, and industrials (such as Nike) before entering the interactive entertainment industry, expanding their services to include voice-over recording and casting, as well as music composition.

Scott has designed and supervised such films as Guillermo del Toro’s Pinocchio, Maya & the Three, Nightcrawler, Pacific Rim, Hellboy 2, Chronicles of Riddick, Team America, Shrek, Star Trek, Blade II, Flubber, Heat, Braveheart, JFK, Home Alone, Cliffhanger, The Doors, Book of Life, Tarzan, and American Beauty, to name a few.

After 29 years and the sale of Soundelux, Gershin departed to join and found several divisions, including Formosa Interactive, the Sound Lab at Technicolor, and The Sound Lab, a Keywords Studio, contributing to more than 100 movies and a similar number of video games.

He is credited with bringing film-quality sound into the interactive entertainment industry, including working with major game studios such as Riot, Capcom, Square Enix, Platinum, Microsoft, Sony Games, Activision, Tencent, NetEase, EA, Insomniac, and id.

Earlier this year, he received the Lifetime Achievement Award from the Game Audio Network Guild to add to his 13 MPSE Golden Reel Awards (close to 40 nominations), an Emmy (for Maya & the Three), a Lumiere Award (for his VR work), numerous G.A.N.G. and TEC Awards, and a BAFTA nomination for American Beauty.

With thanks to Scott Gershin for taking the time to talk with HPA.

You have described yourself as an ‘audio photographer.’ Can you explain what this means?
Where people use a camera to capture a moment, my job is to collect audio snapshots or samples of the world around us. Anything that makes a sound—from animals to weaponry to vehicles, to banging objects together, to the sounds of people living their lives… A lifetime of collecting and recording sounds that can one day be used on a variety of movies and games to tell stories and create experiences. It’s my paint.

I’m constantly listening to everything everywhere I go, creating a mental database of the sounds of life—how they differ from city to city or country to country… cultural differences, technological differences. Such as how emergency sirens sound different in different parts of the world or something as simple as the sounds of children playing in a schoolyard or the way different professionals communicate, such as traffic controllers or police dispatch operators. Trying to capture the vernacular. To understand the similarities and differences of each area and generation within an area.

There are just so many wonderful sounds. Often, I will capture them and manipulate them to create something else. But one of the most effective uses of sound is silence—to create contrast between chaos and nothingness. A high-contrast black and white picture. Having an audience only hear themselves breathing. The sudden void of nothingness in a brief moment. When used properly, there’s nothing more powerful.

Do you keep a personal library of sounds that you’ve collected?
Yes! I have close to 5 million sounds now. Even if I’m not able to capture a sound, I hear it in my head, so I can reference it later. For instance, I did the FX show Mrs. America, which is set in 1970s New York. Growing up in my early years in NY, I remember the sound of Greenwich Village and New York City. I wanted to recreate what I remembered so vividly that you could even smell it—the accents, the sounds of the city, and the slang and vernacular of that era. I spent a lot of time with the actors in our loop group to capture that sound.

Whether I remember it or research it, I need to be able to capture that realism to be able to do justice to whatever project I’m on. That includes anything from regional dialects to the way people talk when displaying levels of aggression. Or cars. Or weaponry. Or anything. To me, sound is endlessly fascinating. I’m always listening, and I’ve always got the urge to grab clips and samples of life. I’m always listening to sounds and imagining how I can manipulate them.

Can you give us some examples?
I could grab the ‘whoosh’ of an airplane toilet being flushed and turn that into a weapon or a hurricane. With weaponry, I need to capture the essence of a weapon at a volume that people can listen to without blowing their ears out. That’s tricky. At the beginning of my career, I worked with Oliver Stone on JFK and needed to capture the fatal shots from Dealey Plaza. During production, they shot blanks in Dealey Plaza so I could hear the echo. With the technology at the time, I was able to replicate that echo, and every time that SFX was played, it started from a different perspective and a different place in the theatre.

These are all things that you find through experimentation and play. That’s the creative component of the job. It’s using your craft to enhance the project. Bringing the director’s vision to fruition. Figuring out how to make an audience giggle, cry, feel fear or awe, or think. Creating the sounds of 25-story robots and creatures in Pacific Rim by going to Long Beach Harbor and dropping 80-foot cargo containers on top of each other to create the sounds of their footsteps and punches. That was loads of fun.

Anyone wanting to do what you do seems to need the ability to tune their ears into the environment. Can you elaborate on your process?
I don’t hear any better than anyone, but I listen better than most. I refer to something I call your inner ear—the ability to hear and imagine sounds in your head. I’ve been a fan of films, TV, and games my whole life. Having watched and listened for thousands of hours, I can hear shows in my head. It’s like playing music: you need to hear it in your head before playing it. It’s the same when I read a script or watch a rough cut. I’m able to hear and feel the movie in my head before I start, which guides how I approach the upcoming project. In addition, when I’m starting my sound design, I’ll often begin by making the sounds with my voice. The reason I do that is I’m trying to figure out how to make the sound. I’ll also do that with clients to better describe what I’m thinking… Words just don’t work… sometimes I need to vocalize it. It’s extremely helpful if I can imagine what a sound could be like, or remember what it sounds like in real life, to determine whether I’ve already recorded it and it’s in my library or whether I need to record or design it.

The next thing needed is a good understanding of technology. I think of audio tools as paintbrushes that I can use to manipulate pitch, amplify, attenuate, modulate, or otherwise tweak in dozens of different ways. Then I need to figure out what I want this sound to now become, and equally important, when it should be used. I am constantly learning all the newest tools. Did I mention I love technology and those who create it?

On The Chronicles of Riddick (2004), I wanted to approach the design of the spaceships differently, so I had this idea to use guitars. I loosened the strings via a Floyd Rose, then used the whammy (vibrato) bar to tighten them back up while running the signal through a Marshall amp. I used it as an element to create the sound of the spaceship rising.

Another example was when I recorded a submarine. I found that the real sub was pretty boring, so I created the sounds of the sub from stuff at my home. The air conditioner blades became the engine, the rotating jets in my hot tub became the torpedoes (using an underwater mic), and the depth charges were me in a pool with my aluminum canoe. I blasted the canoe with air releases from the tanks underwater and hit it underwater with many objects, recording with an assortment of mics, including a hydrophone.

For Pacific Rim, I brought in Tina Guo, who’s a famous cellist. I had her scraping and manipulating her cello, playing below the bridge and everywhere we could think of—bowing and plucking—while processing it in real time and running it through a guitar modeler to create an element for the Kaiju’s vocals. We also used exotic synths, processors—anything that makes noise and sound.

People may not realize that you’re the voice of a lot of movie and game characters such as Flubber in Flubber, Herbie in Herbie Fully Loaded, Antie in Honey, I Shrunk the Kids, Reapers in Blade II, the Dragon in Shrek, and Dogfish in Guillermo del Toro’s Pinocchio, to name a few.
Even though none of these characters recite any words, they have to portray emotion. I like to use my voice and manipulate it in real time to create something new—not human-sounding. Whether the creature is big, small, or cute, I use my voice to portray a character that can emote without words, creating the emotion of that character so audiences can understand it. The illusion has to be perfect. If you know it’s me, I have failed. It must be a perfect marriage, a perfect blend that transcends my voice to become that character. Then the illusion is successful. I vocalized Herbie in Herbie Fully Loaded. People often tell me they didn’t hear my voice… I ask if they found Herbie endearing and cute. They say yes. I say, it’s just a Volkswagen Bug… cars don’t sound cute. That’s when I know the illusion worked. It’s one of the things I love most in my career, and also the most challenging.

Does your creative process change if you’re doing a video game or a feature film, or are there more similarities than differences?
I find that the creative process in the creation of sounds between video games and movies isn’t actually that much different. Especially when you compare theatrical animation and video games. The design process is very similar.

In a theatrical environment, I’ve got a controlled linear medium in a dark room, played back in numerous formats—Stereo, 5.1, or Atmos. I use those spatial formats to their fullest potential. I’m able to play back material at certain volumes and move it around the room to enhance whatever scenario and scene we’re trying to create. I like to take into account the life of the movie. Where it might play in a theatre for 4–10 weeks, it will live the rest of its life in a smaller playback medium like your home theatre, TV set, or headphones. I try to take into account how the sound will translate to other mediums—how the dialogue is going to be heard and how it’s going to translate against a full mix.

In gaming, I am creating sounds and mixes that will play back from a user-defined perspective. The player controls perspective and panning, and how sounds will mix against other sounds based on gameplay. The audio playback format is similar to TV and movies played in a home environment. In the sound design for film, the sounds are edited in a linear format and will play back exactly the same every time you watch it. In games, the sounds are created and designed as elements that get played back based on how the player interfaces with the game. A keystroke can trigger a sound such as a weapon; a joystick can move the character forward and back, left and right; and the sounds that aren’t controlled by the player are triggered based on how the game is progressing or the decisions the player is making. Those sounds and music are triggered based on numerous factors, so every time you play the game you get a different variation of the whole soundtrack and the mix.
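The event-driven playback described above can be sketched in a few lines of Python. This is a minimal illustration, not code from any real game engine: the class and event names (`SoundBank`, `weapon_fire`, the `.wav` filenames) are hypothetical, and a real audio middleware layer would handle mixing, priorities, and streaming. The point is simply that game sounds live as banks of interchangeable assets triggered by input, so repeated events vary rather than playing back identically as they would in a linear film mix.

```python
import random

class SoundBank:
    """Maps a gameplay event to a pool of sound-asset variations."""

    def __init__(self):
        self.events = {}  # event name -> list of asset names

    def register(self, event, variations):
        self.events[event] = list(variations)

    def trigger(self, event, rng=random):
        # Pick one variation at random so the same event doesn't
        # sound identical every time -- the "different variation
        # of the soundtrack" idea described above.
        return rng.choice(self.events[event])

bank = SoundBank()
bank.register("weapon_fire", ["shot_a.wav", "shot_b.wav", "shot_c.wav"])
bank.register("footstep", ["step_1.wav", "step_2.wav"])

# A keystroke fires the weapon event; joystick movement fires footsteps.
played = [bank.trigger("weapon_fire"), bank.trigger("footstep")]
```

Each call to `trigger` stands in for the engine reacting to a keystroke or joystick input, which is why two playthroughs of the same level produce different mixes.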

But at the basic level, when sound designing a creature, a weapon, or an object, it is very similar in both formats. How it is used is very different. Also, in linear formats (film, TV), I am considered post-production. In gaming, I am considered production because I am making assets for a game that, at a later date, will be integrated into the game.

Does sound for film bleed into games or does sound for games influence the sound of film?
Right now, games and film are very much intertwined creatively. For instance, video game developers grew up on Star Wars, The Matrix, and other action and science fiction classics, and they want that sort of experience in their games. Many people from the game world have a deep appreciation of film sound. Even outside of science fiction or combat games, there are more dramatic games which now inspire great TV—The Last of Us, Arcane, Fallout, and Resident Evil are glowing examples. Where comic books once created generations of IP, I believe games are now taking that space. Both the movie and game industries are in the business of making IP that can be viewed and played on all formats. That’s where, I believe, we’ve been heading. Both mediums are great forms of entertainment.

Is the craft of sound design threatened or enhanced by AI?
Put it this way: we pay a few hundred dollars a year for a subscription to Pro Tools, but that’s nothing compared to the technology needed to do visual effects. If you look at budgets on tentpole projects, they’ll spend $100 million on VFX and $400,000 on sound editorial. It’s the other end of the scale. So if it costs $10 million to train a decent AI tool for sound, you’ll never make a return on your investment. For that reason, I’m not worried that the art of sound will magically disappear with AI.

AI can also play an important role in aspects of our process. For example, if we’re recording dialogue on location in a big noisy city, you’re often plagued by the background noise. There are already technologies that will clean that dialogue up and make it usable, but AI will take this capability to the next level—enhancing syllables, sharpening D’s and S’s so they’re easier to hear. Bringing forward mumbled dialogue will be an option, along with better ways to blend production dialogue and ADR. These are the tools being created now.

I’m not a big believer in AI as the boogeyman coming to replace my job. I believe and hope AI is going to create better toolsets to help us save time and allow for more creative time. But I am speaking from the point of view of sound design. The use of AI to replace actor dialogue or to automate scriptwriting, etc.—that’s another matter entirely. A lot of discussion is being had in these areas. And I support controlling the use of AI to not replace performers and artists, but to create tools to enhance what we’ve been doing—such as file searches, cleanup and restoration, organizing assets, spotting against picture, creation of pan-able objects following pixel clusters, applying EQ profiles on other sound files, “better EQ matching,” etc. It’ll just be better and more useful plugins.

Do you have a personal favorite of all the titles you’ve worked on?
I’m proud of the diversity of work that I’ve done. I haven’t been pigeonholed into a single genre or style. I’ve had the opportunity to work with some of the best creatives in the business—to tell a diverse range of stories, emotions, and experiences. To make an audience laugh, fear, cheer, and think.

There are projects that define and accent periods of my life—my first movie Honey, I Shrunk The Kids; my close relationship and work with Guillermo del Toro; my experiences with Oliver Stone, James Cameron, Jorge Gutierrez, Wolfgang Petersen, Sam Mendes, John Hughes, David Twohy, and so many more. I was very fortunate early in my career to be exposed to filmmakers who wanted to use sound as a style, not just to create realism, but to create emotion.

What is your goal as a sound designer?
The aim is to try to create a feeling, an emotion, or a reaction—whether that’s making little kids giggle or making people feel uncomfortable. In Alive (1993), I made the plane crash extremely loud in the theater because I needed to create a sense of audio deprivation after the crash. The only way to do that is through contrast. For Guillermo del Toro’s Pinocchio, it was using guitar wood and organic elements to bring a wooden puppet to life and allow it to capture the hearts and minds of an audience—to take them on a journey of thought and reflection.

Across all the mediums I work in, you try to get the audience to buy into the sound emanating from that element or character. I don’t create sounds and designs “looking for a movie.” I start with a blank canvas. I take cues from the filmmakers. I read the script. I see the way it’s cut. I see the way the actors are performing within it, and that gives me—as an enthusiast who loves movies—a feeling. How do I contradict or support that feeling? What tricks, tools, and techniques do I have that can enhance that story in ways that don’t break the illusion? You shouldn’t notice the person behind the curtain. Pace, rhythm, and pitch all feed into it.

To me, making a movie or a game is similar to being in a band. Everybody has some part to play, and when it comes together and it locks in, then it’s magical.
