Audible Studios
Original Content
Production Guidelines
2019
Level 1
*Please Note – these are supplemental guidelines and should be adhered to in
conjunction with the Audible Studios Full Production Guidelines of your Country/Market
Segment.
Pre-recording
FOR details see Pre-recording section in accompanying AS Full Production Guidelines 2019
Recording
Audio should be recorded in 16-bit / 44.1kHz mono WAV file format.
o Watch out for and avoid common record mistakes.
o No File Processing. DO NOT compress, gate, or use noise reduction/downward
expansion.
FOR details see:
o Recording section p. 7
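As an illustration of the recording format requirement above, the following is a minimal sketch (not an Audible tool) that checks whether a WAV file is 16-bit / 44.1kHz mono. It uses Python's standard `wave` module; the function name `check_recording_format` is hypothetical. Note that `wave` only reads uncompressed PCM files, so other WAV encodings will raise an error rather than return a report.

```python
import wave

# Values taken from the guideline: 16-bit / 44.1kHz mono WAV.
REQUIRED_CHANNELS = 1          # mono
REQUIRED_SAMPLE_WIDTH = 2      # bytes per sample (2 bytes = 16-bit)
REQUIRED_SAMPLE_RATE = 44100   # Hz


def check_recording_format(path):
    """Return a list of problems; an empty list means the file matches.

    Hypothetical helper for illustration only.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != REQUIRED_CHANNELS:
            problems.append(f"expected mono, found {wav.getnchannels()} channels")
        if wav.getsampwidth() != REQUIRED_SAMPLE_WIDTH:
            problems.append(f"expected 16-bit, found {wav.getsampwidth() * 8}-bit")
        if wav.getframerate() != REQUIRED_SAMPLE_RATE:
            problems.append(f"expected 44100 Hz, found {wav.getframerate()} Hz")
    return problems
```

Running this on each raw file before editing begins would catch a wrong session setting early, when a re-record is still cheap.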
Editing
Once audio has been recorded it should be fully edited.
o The editor’s main objective is pacing.
o Always use Room Tone. Never leave silence or spaces of any kind.
o Requirements to note: file organization/length, spacing, breaths, mouth noise.
o No File Processing. DO volume (gain) adjust relative levels when necessary.
o Samples must be selected and delivered with the final retail-ready masters.
FOR details see Editing section p. 27
Quality Control (QC)
Once audio has been edited, it should be fully QC’d (Quality Control pass).
o All errors must be corrected before final retail-ready masters are delivered.
FOR details see Quality Control section p. 29
Mastering
Once audio has been edited and QC’d, it should be mastered.
o Making audio levels louder and more even throughout a production is vital.
Standard dynamic range required - audio should peak between -6 dB and
-4 dB with an average RMS of -18 dB.
o EQ should be applied during mastering to make audio more pleasing to the ear.
Please ensure mono compatibility when delivering a stereo production with multiple sound
design, music, and sfx elements.
FOR details see Mastering and Delivery section p. 30
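The mastering targets above (peaks between -6 dB and -4 dB, average RMS of -18 dB) can be measured roughly as follows. This is a sketch under stated assumptions: samples are floats normalized to the range -1.0 to 1.0, levels are expressed relative to digital full scale, and all function names are hypothetical. RMS meters differ in convention (some read about 3 dB higher for the same signal), so treat the numbers as a guide and confirm against the meter in your own workstation.

```python
import math

# Targets taken from the guideline.
PEAK_RANGE_DB = (-6.0, -4.0)
RMS_TARGET_DB = -18.0


def peak_dbfs(samples):
    """Highest absolute sample level, in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak) if peak > 0 else float("-inf")


def rms_dbfs(samples):
    """Average (RMS) level over all samples, in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def peak_in_range(samples):
    """True if the peak level falls inside the -6 dB to -4 dB window."""
    low, high = PEAK_RANGE_DB
    return low <= peak_dbfs(samples) <= high


def rms_offset_db(samples):
    """How far the measured RMS sits above (+) or below (-) the -18 dB target."""
    return rms_dbfs(samples) - RMS_TARGET_DB
```

A positive `rms_offset_db` suggests the program is louder on average than the target; a negative value suggests it is quieter.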
File Naming for Final Delivery
Mastered audio should be fully assembled into final retail-ready masters.
FOR details see:
o File Naming for Final Delivery section p. 33
o Title Credits and Assembly through Labeling Files sections in accompanying AS
Full Production Guidelines 2019
Uploading Full Productions
Uploading final retail-ready masters.
FOR details see:
o Uploading Full Production section p. 34
o Packaging and Uploading for Full Productions through Uploading Full
Productions sections in accompanying AS Full Production Guidelines 2019
How to Invoice
FOR details see How to Invoice section p. 35
Table of Contents
Quick Start Guide to Original Full Productions ......................................................................... 2
Table of Contents ......................................................................................................................... 3
Receiving Original Full Production Assignments....................................................................... 5
Stages of Production .................................................................................................................... 6
Recording ...................................................................................................................................... 7
General Signal Flow .................................................................................................................. 7
Equipment.................................................................................................................................. 7
Microphones .......................................................................................................................... 7
Microphone Placement ....................................................................................................... 9
Recording Device ................................................................................................................. 9
Recording Software (Digital Audio Workstation) ............................................................ 10
Headphones ........................................................................................................................ 11
Recording Method – Punch vs. Straight .............................................................................. 12
Common Problems ................................................................................................................. 13
Recording Scenarios .............................................................................................................. 15
Scenario 1: Interview/podcast show in a studio ............................................................. 15
Scenario 2: Scripted drama or comedy recorded in a studio and “on location” .... 17
Compression ............................................................................................................................ 19
SFX ......................................................................................................................................... 19
Scenario 3: Live stage plays recorded in the theater (large or small)......................... 20
Scenario 4: Standup comics/storytellers at clubs ........................................................... 21
Scenario 5: Speakers and/or performers recorded at live events and festivals
(outdoors and indoors) ....................................................................................................... 22
Scenario 6: Podcast-type shows recorded on location (with/without mobile units) 23
Scenario 7: Reality shows recorded on locations (e.g. hospitals, bars, on the street,
in offices) .............................................................................................................................. 24
Scenario 8: Quiz shows recorded live on stage in front of a live audience ............... 25
Editing........................................................................................................................................... 27
Sample Selection for the Audible Product Page ............................................................... 28
Quality Control (QC): A Suggested Approach .................................................................. 29
Mastering and Delivery .............................................................................................................. 30
Mastering ................................................................................................................................. 30
Ensuring Mono Compatibility ................................................................................................ 32
Delivery of All Content to Audible ........................................................................................ 33
Audible Original Intro-Outro Title Credits ............................................................................. 33
File Naming for Final Delivery ................................................................................................ 33
Requesting Audible Studios FTP Credentials ................................................................... 34
UPLOADING FULL PRODUCTIONS .......................................................................................... 34
FileZilla Instructions for Audible Studios ............................................................................. 35
Installation and Setup for PC and Mac............................................................................ 35
How to Invoice ............................................................................................................................ 35
For different country origins, including the US, the accompanying 2019 AS Full
Production guidelines must be reviewed concerning:
o Title Credits-Upsells
o Appendix – Audible Original Intro-Outro Title Credits
o File Assembly – Terminal VS Collection
o Labeling Files
o Appendix – Audible Original’s File Labeling
o Appendix – How to Invoice
If at any time you feel a project may go over the approved rate, before proceeding any
further, you must contact and receive prior approval from Audible Studios for any
additional costs.
NOTE: By default, Original Full Productions are due 30 days after receipt. If this is not
possible, please let the Production Team know ASAP by replying all to the project delivery
email upon receipt.
(See also section ‘In the Event of a Delivery Delay’ in the accompanying AS Full Production
Guidelines 2019)
Stages of Production
For the production of most content, you will pass through these stages:
Recording
In general, the content should be well-paced, easy to understand, and up to the same
professional industry level of quality found in other digitally-delivered audio content.
Focus on record-quality so everything that follows goes more smoothly. You’ll thank
yourself later.
Equipment
Microphones
For the scenarios to be outlined later, you’ll likely use one or a combination of these,
depending on your needs: microphone on a stand, handheld microphone, lavalier
microphone.
Lavalier microphone – Also known as a “lapel mic” or “lav,” it is often used in live sound
reinforcement applications, video interviews, and multicast audio dramas. This
microphone is usually clipped to a collar, tie, or other clothing to be close to and below
a speaker’s mouth. A popular choice for this application is the Countryman B3.
Microphone Placement
Handheld microphones and microphones on stands will usually
work best when placed 1-6” from the speaker’s mouth.
Recording Device
Depending on your situation, your microphone will connect
to one of two things: a handheld recorder or an audio
interface for a computer. Also depending on your situation,
you may connect your recording device to a venue’s
existing mixing board instead of directly to microphones.
Computer with audio interface – If you’re recording with a laptop or desktop computer,
you will need an audio interface to get the microphone signal into the computer. A
popular choice for audio interface is the M-Audio M-Track Two-Channel Portable USB
Audio and MIDI Interface. Applications requiring more than two channels will require a
larger interface.
Each workstation has its merits and quirks, so the choice is yours. Budget will, obviously,
also be a consideration, as well as what type of computer you already have or will get.
Headphones
Generally, digitally-delivered audio content is recorded one of two ways. The linear
way is a “straight record,” sometimes called a “roll record.” You basically hit record and
start speaking. If you make a mistake, you correct yourself as you would in a
conversation by fixing the word on-the-fly while the recording keeps going. For live
events or in-studio conversations/performances, this is what you would do. For scripted
content, this may appear simpler for inexperienced speakers, but it is very labor-
intensive when it comes time to edit.
Punch Record (ideal for scripted content)- Hit record and start speaking. When a
mistake is made, stop, go back to the previous sentence or break, and “punch in” (start
recording again from that spot). It’s handy to set up a couple seconds of pre-roll in your
computer recording software so that you can hear yourself before it starts recording
again. This will allow you to (with a little practice) seamlessly match your pacing and
tone. Be careful not to cut off breaths, as the recording should sound clean, and any
chopped audio will need to be fixed later. One of the main pros of using this method is
that it becomes much quicker to edit since the content is already laid out during
recording. All that should be required is a little cleanup of the audio (removing loud
breaths or noises, tightening up pacing/pauses). One of the cons is that it takes a
reasonable amount of audio production experience to be able to “punch” cleanly and
correctly to achieve the desired result. Pro tip: It is absolutely worth your time to learn
this method.
To see and learn more about punch record, head over to www.acx.com/help
and click on Video Lessons & Resources.
Straight Record (ideal for non-scripted content)- Hit record and let it roll. Appropriate for
live and unscripted content. It makes much more editing work for scripted content,
though. See punch record for that.
Room Tone- Room tone is basically the ambience of a room, the sound of the space
itself without the addition of speaking or action. For studio and other controlled
environment recordings, this will be very quiet and clean sounding. For live events and
other “wild” environments, this may be applause, general crowd noise, street noise,
basically an average sound of what’s happening around you.
Recording room tone is like recording “dead air”: the microphone records the ambience
of the room by itself, with no one speaking.
For the purpose of editing, especially scripted content, at least 30 seconds of room tone
should be recorded at some point of every day of recording, using the same settings
that were used to record the narration or event. This is very important, because a
change in settings will affect the sound of the room tone recording and it won’t match
the rest of your content, rendering it mostly unusable.
How would you do it in a studio? Hit record, leave the room, shut the door, wait
quietly. Listen back at high level to be sure it’s clean. Later, this room tone is
added between sentences, and used to mask noises and adjust pacing in the
edit. Again, use the same levels and settings used in the rest of the recording.
How would you do this in a live setting? You might not need to if you’re doing a
straight record and happen to capture some usable room tone audio during
pauses or at the beginning or end. Be mindful and be listening. If you’re editing a
live recording or show, audio like this can still be very useful and provide some
needed flexibility for the assembly of your content, to smooth out pacing or
make a fake beginning or ending.
Processing While Recording- If you are delivering unfinished content for Audible to edit
and master, please do not apply any processing (compression, gating, limiting, noise
reduction) during the recording stage. That will limit what we are able to do with the
raw audio.
If you are recording in a studio, we recommend not applying any processing during the
recording stage. Leave it for post-production (editing, mastering). And never use
downward expansion; it will suck the life out of your recording.
If you are recording a live event, especially if you are capturing the output of a mixing
board, there will likely already be processing applied for use in the live sound
reinforcement.
Title Credits- Title credits for the tops and tails of content will be provided by you. These
title credits must follow the policies and procedures in the 2019 AS Full Production
Guidelines and be pre-approved by Audible. They are a creative decision to be
worked out on a per-product basis with your Audible point of contact.
Common Problems:
Plosives: Wind from a speaker’s mouth can often hit the mic too hard and cause
a plosive. This is an unwanted pop which can occur on any word, but more often
on ones that start with the letter “P”. Plosives sound bad, and are distracting to
the listener. If a plosive occurs, the line should be re-read (for scripted content),
while making adjustments to the position of the speaker (whenever possible)
and/or force with which the word is read, so that no pop is audible. Note: These
are more likely to occur outside of a studio because live recordings tend to have
more movement.
Audio Distortion: If the audio is too loud, it will distort or
clip. The level meter will usually show this by displaying
red, or indicating that you have gone over the ‘zero’
mark. Often, distortion cannot be heard while playing
back the audio within the software program that is
being used for recording, because there is built in
‘headroom’, which allows you to decrease the
volume of that part in the edit without having to re-
record. But even if distortion is not heard, if the level
hits ‘zero’ (or ‘goes into the red’), it will distort once
we encode the content for Audible’s platforms. To
avoid all distortion in the content, the actual level
meter must be constantly monitored, keeping it
below zero at all times.
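Because clipping may not be audible on playback inside the recording software, a file can also be scanned after the fact. The sketch below looks for runs of consecutive 16-bit samples pinned at the maximum or minimum value. The function name and the `min_run` threshold are assumptions for illustration; this is a safety net, not a substitute for watching the level meter while recording.

```python
def find_clipped_runs(samples, max_value=32767, min_value=-32768, min_run=2):
    """Return (start_index, length) for each run of samples stuck at full scale.

    samples: 16-bit integer sample values.
    min_run: shortest run reported (an assumed heuristic; a single sample
             touching full scale is ignored by default).
    """
    runs = []
    run_start = None
    for index, sample in enumerate(samples):
        at_limit = sample >= max_value or sample <= min_value
        if at_limit and run_start is None:
            run_start = index
        elif not at_limit and run_start is not None:
            if index - run_start >= min_run:
                runs.append((run_start, index - run_start))
            run_start = None
    # Handle a run that continues to the very end of the file.
    if run_start is not None and len(samples) - run_start >= min_run:
        runs.append((run_start, len(samples) - run_start))
    return runs
```

Dividing a reported start index by the sample rate (44100) gives the approximate time in seconds at which to audition the take.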
Believable Environments: For scripted content, special care should be taken to
set the proper room feel called for in the script. With the software plug-ins
available today, this is not hard to do but it is hard to do well and do right. The
human brain is not easily fooled, especially through the ears. Experience and
listening will guide you here.
Continuity: Any part of the content that requires consideration for continuity
should be treated with care, so that it remains consistent throughout. This can
include distinct character voices, accents, or alternate pronunciations. It’s
always a good idea to add markers during recording (or manually note the time)
for anything that may need continuity. This way, it’s easy to go back and see
how it was read the first time. For example, a speaker may give a minor
character a child-like voice in one section, but then forget what it sounded like
when the character recurs later. In another example, a speaker may choose
how to read a word that has two dictionary-accepted pronunciations. Later, if
that word reappears, it needs to be pronounced the exact same way. Markers
(or notation of the record time) make it easier to locate these kinds of things.
Reading Too Fast: The voice performers should read at a pace which can be
followed by any reasonable listener. A diverse group of people listen to audio
content, and many listeners will not be able to follow along if someone is
speaking too fast.
Recording Scenarios
Most scenarios, nowadays, will be recorded with microphones plugged into an audio
interface connected to a laptop. If you’re in a situation where this is not possible but
you do have a handheld recorder, everything laid out in the scenario applies except
that your microphone(s) will plug into the recorder which will just capture the event as a
whole and later you will load the audio into a workstation to actually do anything to it.
Since this is a conversation, you’ll want to do a straight record where you just hit record
and leave it running until the end or a break. The microphones plug into your audio
interface which connects to your laptop. Your software should separately record each
microphone on its own track instead of mixing them together. This gives you more
control over each voice. Have more than two voices? As long as your interface can
support enough channels, just keep adding one microphone per person and record
them separately. Your software will be able to keep up.
Once everything is set up and people are ready, hit record and start talking. When the
conversation is over, you hit stop. Save and backup your recording. Then you can
move on to editing and mastering.
Editing Scenario 1
For a basic conversation setup of host and guest there are only two channels of audio.
You can put one on the left and the other on the right (though that might lead to a
ping-pong effect with the sound), or you can put them both in the middle, or just either
side of center. Either way, there should be a balanced sound. This also applies to the
level of each voice. Care should be taken to balance how loud each person is in
relation to the other. In your workstation, you’ll see the conversation jump back and
forth between tracks as one person speaks and then the other. You’ll want to edit both
at the same time to maintain the overall pacing and timing.
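To make the placement options above concrete, here is a small sketch of a constant-power pan, one common way to position a mono voice in a stereo field. It is an illustration only: the pan law, the function names, and the default positions of -0.2 and +0.2 (slightly either side of center) are assumptions, not values from these guidelines.

```python
import math


def pan_gains(position):
    """Return (left_gain, right_gain) for a constant-power pan.

    position: -1.0 is hard left, 0.0 is center, +1.0 is hard right.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be between -1.0 and 1.0")
    angle = (position + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)


def mix_to_stereo(host, guest, host_pan=-0.2, guest_pan=0.2):
    """Mix two mono tracks into a list of (left, right) sample pairs.

    The default pan positions are assumed values placing each voice
    just either side of center.
    """
    host_left, host_right = pan_gains(host_pan)
    guest_left, guest_right = pan_gains(guest_pan)
    return [
        (h * host_left + g * guest_left, h * host_right + g * guest_right)
        for h, g in zip(host, guest)
    ]
```

Pan positions of -1.0 and +1.0 reproduce the hard left/right layout that can cause the ping-pong effect, while 0.0 for both places the voices in the middle.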
You will likely keep the conversation intact unless you wish to alter the pacing or
content in some way. While keeping the listener in mind you may wish to shorten any
gaps and remove conversational asides. Other than that, attaching a nice beginning
and ending (which could be recorded separately) will round it out nicely.
Breakdown: Multiple voices read from a script, at once or separately, in a studio or “on
location.”
Approach: One microphone per person, plugged into an audio interface connected
to your computer. Depending on your production, microphones will likely be on stands
or worn as lavaliers. If you have multiple voices at once you’ll need to make sure your
interface can handle the quantity and that you have enough microphones.
Recording all at once may be the more straightforward route to take so that the
speakers can play off of each other. Record each microphone separately. No matter
how many voices, you can choose a straight or punch record. If you have a director or
someone who can coordinate all of the action, a punch record may be appropriate
and would decrease the editing effort later on. Otherwise, the more commonly used
straight record would happen and the editor would be tasked with sorting through and
assembling audio. In either scenario, the director would dictate the flow of the
recording session and make sure the script is followed and things move along.
Recording Techniques
When recording multiple voices simultaneously in the same room, it is important to keep
each voice as isolated from the others as possible. Understand the polar
pattern of your microphone and use its rejection points to assist in isolating.
A shotgun-style microphone such as the Sennheiser MKH 416 is an ideal
candidate because of its highly directional polar pattern and off-axis rejection
characteristics.
A three-voice recording setup utilizing Sennheiser MKH 416s, isolation panels, and a
thick rug to reduce room reflections
When recording multiple sources, care should be taken when lines are inevitably
spoken simultaneously. It may be necessary to instruct actors to leave space in their
performance for editing to ensure overlap doesn’t result in a destructive take. When a
situation arises where multiple actors need to be speaking simultaneously, such as walla
recording, you may be able to get away with voices bleeding into various
microphones. However, it is always safer to record each voice independently and then
layer in post-production. You will have to use your best judgement for each situation
but, when in doubt, record individual takes.
Compression
Multicast audio dramas are often performed in a much more dynamic fashion
compared to their solo-narration audiobook counterparts. For this reason, compression
can be a vital tool in controlling some of those dynamics to achieve a more consistent
recording. Please note that heavy, pumping compression should always be avoided.
Moderately fast attack and release times with compression ratios of 4:1 – 6:1 reducing
around 5dB on peaks will yield consistent results. The goal is not audible compression,
but rather consistent control of peak levels.
Compression is optional and should only be used if you have a firm understanding of its
application. If you are not comfortable with this process, record as normal ensuring
ample headroom to allow for wide dynamic range.
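To show how a ratio and a gain-reduction figure relate, the sketch below models the static behavior of a compressor. It assumes a simple hard-knee design and ignores attack and release times entirely; the function names are hypothetical, and real compressor plug-ins may behave differently.

```python
def compressed_level_db(input_db, threshold_db, ratio):
    """Output level of an idealized hard-knee compressor (static curve only)."""
    if ratio < 1.0:
        raise ValueError("ratio must be at least 1")
    if input_db <= threshold_db:
        return input_db  # below threshold the signal passes unchanged
    return threshold_db + (input_db - threshold_db) / ratio


def gain_reduction_db(input_db, threshold_db, ratio):
    """How many dB the compressor takes off a signal at input_db."""
    return input_db - compressed_level_db(input_db, threshold_db, ratio)


def threshold_for_reduction(peak_db, ratio, reduction_db):
    """Threshold that gives reduction_db of gain reduction on a peak at peak_db."""
    if ratio <= 1.0:
        raise ValueError("ratio must be greater than 1")
    overshoot = reduction_db / (1.0 - 1.0 / ratio)
    return peak_db - overshoot
```

For example, at 4:1 a peak must exceed the threshold by roughly 6.7 dB to be reduced by 5 dB, so for peaks around -3 dB the threshold would sit near -9.7 dB in this model.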
Editing Scenario 2
The amount of work for the editor is dependent on how many voices there were and
which method of recording was used (straight or punch). It is now his/her job to follow
the script from top to bottom, finding all the pieces and putting them in their proper
places, while cleaning it up and balancing all the voices. It is often beneficial to group
simultaneously recorded tracks to keep edits consistent across all tracks.
SFX
Adding sound effects may be required for multicast dramas. All fx used MUST be
licensed and approved for use if they are from a pre-recorded library.
Approach: This may be one of the simplest scenarios to record, but that’s because you
may have little control of the recording. Each actor will likely be wearing a lavalier
microphone which connects to a mixing board which the in-house audio mixer uses for
live sound reinforcement. You will need to record the output of that mixing board,
either a stereo mix of all the voices (into a laptop or handheld recorder) or ideally
separate outputs (splits) of each microphone into your capable multi-channel audio
interface connected to your laptop.
Since the production is live you’ll do a straight record from start to finish, stopping only
for an intermission. For editing purposes and production value it would be nice to
record the audience sound as well. It’s likely the mixing board will not be using any
microphones on the audience so you could use a handheld recorder positioned above
the crowd, angled away from the stage, or up high on a stand near the back of the
hall. This will add some depth and ambience to your recording and help fill out
moments of applause.
Editing Scenario 3
The production is live with actors so all the pacing is worked out. There won’t be much
editing except of applause at the top and tail, but there will be lots of balancing of the
voices if you were lucky enough to record each one separately. You’ll do that in your
workstation. If you were only able to record a stereo mix from the mixing board, that
balancing work is done already, for better or worse. There’s little to nothing you can do
about it after the fact.
Since it’s a live show you can do a straight record from start to finish. Using a handheld
recorder aimed at the crowd, or another microphone hanging over the crowd would
be very beneficial to hear the responses of the audience and add some overall depth
and ambience to the recording.
Editing Scenario 4
Since the comic/storyteller dictates the pacing and flow of the show, you will likely not
touch that for the most part. Pacing by the speaker, especially in this scenario, is
integral to the overall success of the performance, so it should remain intact. You may
just want to shorten longer sections of applause, especially at the end of a speaker’s
performance, to keep the recording moving along for the listener.
Scenario 5: Speakers and/or performers recorded at live events and festivals (outdoors
and indoors)
Breakdown: One or more speakers/performers live on a stage, each with a microphone
or speaking at a fixed podium. Recorded with a handheld recorder, or laptop if more
than two microphones at a time.
Since it’s a live show you can do a straight record from start to finish. Using a handheld
recorder aimed at the crowd, or another microphone hanging over the crowd would
be very beneficial to hear the responses of the audience and add some overall depth
and ambience to the recording, especially outdoors where there are no walls to reflect
their sound back at the microphones on stage.
Editing Scenario 5
Since the speakers on stage dictate the pacing and flow of the show, you will likely not
touch that for the most part. You may just want to shorten longer sections of applause,
especially at the end of a speaker’s performance, to keep the recording moving along
for the listener. You’ll also definitely want to shorten gaps in the action, such as between
acts or when a new speaker is coming up to the stage. Remember, since this is just
audio there are no visual distractions to pass the time; it’ll just be dead air.
Approach: There’s a lot of variety possible here, so we’ll discuss a couple of scenarios.
For man-on-the-street interviews, a handheld microphone connected to a handheld
recorder would be appropriate. The “host” would physically move the microphone
from his/her mouth back and forth to the guest as each person takes turns speaking,
like a news reporter might do. For sit-down interviews, perhaps set up at a table at a
convention, see Scenario 1. The only difference is that here you’re in the “wild” as
opposed to a controlled studio environment, so speaking close to the microphones will
help keep background noise from interfering.
No matter the setup, a straight record is the way to go because of the varying
background noise and nature of these conversations.
Editing Scenario 6
Editing will be key to crafting this content, or you can keep the straight record as-is.
Carving out a story and controlling the pacing is all up to the editor. But if the
conversation speaks for itself, there is very little to do with editing.
Scenario 7: Reality shows recorded on locations (e.g. hospitals, bars, on the street, in
offices)
Breakdown: There’s almost an infinite range of possibilities here. Number of speakers will
vary depending on the show, but they will most likely need lavalier microphones. Since
there are multiple channels, recording will be with a multi-channel interface connected
to a laptop.
Approach: For this scenario we’ll assume each speaker will have a lavalier microphone,
and that there are multiple speakers. You’ll need an audio interface for your laptop
which can handle each microphone separately. Since this is a recording of “reality”
you’ll want to do a straight record, starting and stopping whenever it is deemed
appropriate. Recording in this scenario is really just capturing all the sound of what
happened. The real work will be putting it all together into a cohesive story in the edit.
Editing Scenario 7
Editing will be key to crafting this content. Carving out a story and controlling the
pacing and action is all up to the editor. There will be a lot of balancing of voices
besides cutting up the audio.
Approach: The host and contestants are on stage, probably standing, each with either
a microphone on a stand (perhaps at a podium) or wearing a lavalier microphone.
Another microphone or handheld recorder records the audience for the applause and
reactions. Each microphone on stage is recorded separately through the multi-channel
audio interface connected to your laptop. Given the nature of this show you would do
a straight record.
Editing Scenario 8
Since not every speaker will be speaking all the time, the editor can lower the level of
each channel whenever the person is silent. This will clean up the overall sound if done
tastefully. Any editing will really be for content and time, like to shorten gaps in the show
or shorten any lengthy applause to keep the listener from getting bored.
Editing
Once content has been recorded, it should be fully edited and QC’d. Content may be
heavily edited, especially if it’s scripted. Conversely, editing may be quite minimal, such
as for a live show where you would simply fade in the beginning and fade out the end.
All content will at the very least fade in at the beginning and fade out at the end.
Within your content, options and creative decision making will abound.
Live shows will be more straightforward, following the flow of the program. You
may wish to shorten long gaps in the action and long applause sections to keep
it moving forward for the listener.
Interviews may take a bit more editing. Including the above, the structure of the
questions and answers can be reordered or cut down for time/content.
Conversational asides or tangents can be removed to keep the content
focused.
Scripted content may require a lot of editing but the script itself will provide a
guide for most of your decision making since you’ll be seeking to adhere to that
and bring it to life.
Reality shows will likely be the most difficult to edit because, for the most part, the
material recorded will dictate the final result. And there will be a lot of material.
Stories will be crafted and assembled from multiple voices (probably) recorded
over a long period of time (probably).
For more help with editing techniques, head over to www.acx.com/help and
click on Video Lessons & Resources.
Besides any editing information discussed in the recording scenarios above, there are
some general editing guidelines to be aware of.
then, again, at the end) of the editing process, and simply listen to a few minutes
of the content while closing your eyes and asking, “Is this moving too fast? Too
slowly? What pace feels right?”
3. Spacing: For all content, there should be exactly 500ms (0.5 seconds) at the
head of each recording, and exactly 3.5 seconds at the tail. It is also
recommended that there be exactly 2.5 seconds after the speaker announces
the title, if applicable, however, this may be adjusted to whatever pacing sounds
best for the read or show, as long as it remains consistent throughout. (The
spacing at the head and tail of the recording should not be adjusted and must
always be 500ms and 3.5 seconds, respectively.)
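The head and tail spacing above translates directly into sample counts at the 44.1 kHz delivery rate. The following is a minimal sketch rather than an Audible tool: the function names are hypothetical, and it assumes the room tone is available as a list of samples that can be looped to fill the required length.

```python
SAMPLE_RATE = 44_100   # delivery sample rate in Hz
HEAD_SECONDS = 0.5     # required spacing at the head of each recording
TAIL_SECONDS = 3.5     # required spacing at the tail of each recording


def seconds_to_samples(seconds, sample_rate=SAMPLE_RATE):
    """Convert a duration in seconds to a whole number of samples."""
    return round(seconds * sample_rate)


def loop_room_tone(room_tone, length):
    """Repeat a room-tone clip until it fills exactly `length` samples."""
    room_tone = list(room_tone)
    if not room_tone:
        raise ValueError("room tone clip must not be empty")
    repeats = -(-length // len(room_tone))  # ceiling division
    return (room_tone * repeats)[:length]


def add_head_and_tail(body, room_tone):
    """Return head room tone + body + tail room tone (never digital silence)."""
    head = loop_room_tone(room_tone, seconds_to_samples(HEAD_SECONDS))
    tail = loop_room_tone(room_tone, seconds_to_samples(TAIL_SECONDS))
    return head + list(body) + tail
```

At 44.1 kHz the head is 22,050 samples and the tail is 154,350 samples. A plain loop like this can produce audible seams in practice, so a real edit would normally use a long enough room-tone clip or crossfade the joins.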
4. Clicks and Undesirable Mouth Sounds within Words: For in-studio recordings,
remove as many clicks and undesirable mouth sounds from within words and
phrases as time will allow. These will be much more noticeable here than in live
recordings because the background noise level is so low. A baseline production
value should be established at the beginning of the edit that is adhered to
throughout the program. Care should be taken when fixing or removing any
undesirable sounds around or within words so that the end result actually sounds
better.
File Processing: Never use any gates or downward expansion. This is against our
policy unless you have specifically consulted with us before using them on a project.
DO adjust relative levels, when necessary, by simply adjusting the volume level
(gain) of audio to match other parts. If a section is noticeably lower or higher in
volume, you may raise or lower it to match other parts of the content.
Please select a compelling sample that you feel will do a good job of
representing and selling the content.
The sample should be up to 5 minutes long. However, if the total content is under
15 minutes, please keep the sample to 1 minute or less.
If the content is erotic or adult-oriented, please do your best to choose a “clean”
sample. But you can choose what’s most representative, even if it has profanity.
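The sample length rule above can be written as a one-line check. This is only an illustrative sketch; the function name is hypothetical, and it reads "under 15 minutes" as strictly less than 900 seconds.

```python
def max_sample_seconds(total_content_seconds):
    """Return the longest allowed retail sample, in seconds."""
    if total_content_seconds < 15 * 60:
        return 60       # content under 15 minutes: 1 minute or less
    return 5 * 60       # otherwise: up to 5 minutes
```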
During the recording of scripted content, it is likely mistakes will be made and quite
possible they won’t be caught until (or even after) the edit.
When problems arise in the read that cannot be readily repaired through editing, such
as omitted words or phrases, misreads, et cetera, the editor should note the script
location, audio timestamp, and nature of the issue. We would recommend formalizing
any found mistakes into a QC Pack for pickups:
• QC Sheet: All errors should be noted. The QC sheet should be neat and legible.
o Please deliver in both Excel and PDF formats.
• Highlighted script pages: a copy of script pages (not the full, original script sent
to you) containing the requested corrections. The line with the mistake should be
highlighted along with the sentence before it and the sentence after it.
o This marked script should consist of the full script pages for each page
where an error occurs
“01_P12_BOOKTITLE.wav”
“02_P23_BOOKTITLE.wav”
NOTE: Please be sure to provide audio references that are long enough that the
narrator/engineer can use this audio to match tone, volume, emotion, character
voicing, etc.
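The three things the editor is asked to note for each problem (script location, audio timestamp, and nature of the issue) map naturally onto one row of a QC sheet. The sketch below is an assumption about how such a sheet could be drafted, not an Audible format: the `QCEntry` fields and the CSV layout are hypothetical, and the result would still need to be exported to the requested Excel and PDF formats.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields


@dataclass
class QCEntry:
    script_page: int   # script location (page number)
    timestamp: str     # position in the audio file, e.g. "00:12:41"
    issue: str         # nature of the issue, e.g. "misread"
    note: str = ""     # optional detail, such as the correct wording


def qc_sheet_csv(entries):
    """Render QC entries as CSV text: a header row, then one row per error."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([f.name for f in fields(QCEntry)])
    for entry in entries:
        writer.writerow(astuple(entry))
    return buffer.getvalue()
```

For example, `qc_sheet_csv([QCEntry(12, "00:12:41", "misread")])` yields a header line followed by a single row for page 12.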
Once mistakes are noted, they should be re-read and fixed in the original audio files. All
of this must be corrected and fixed before the edit is considered complete and ready
for mastering.
For example, Audible Studios will expect errors like the following to be flagged and
corrected before final retail-ready masters are delivered:
Audio Quality
o Egregious noises that cannot be removed in editing and render audio
unintelligible.
o Distortion that renders audio unintelligible.
o Diction that renders audio unintelligible.
o Plosives that are egregious and render words/sentences unintelligible.
Misreads (if scripted content)
Incorrect Pronunciations
o To help facilitate the recording of corrections, it is recommended that every
flagged pronunciation error on the QC sheet include the correct
pronunciation, either spelled phonetically or as a link to an audio
recording of the correct pronunciation.
Inconsistent Pronunciations
o Consistent pronunciation of terms/names is vital throughout each project
and/or project series.
Inconsistent Character Voicing
o Consistent voicing of Characters is vital throughout each project and/or
project series.
Missing Audio
ETC.
It is vital to make the content levels louder and more even throughout. Often, this
process is achieved by normalization to somewhere around -20 dB, or by
compression/limiting. Compression should be applied with a fast attack and release,
at a ratio of around 3:1. A hard limiter may also be used, and content is EQ’d during this
time to sweeten the sound and make it more pleasing to the ear. Often, muddled low
end and mid-range are cut to make the content sound clearer and smoother.
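To make the 3:1 ratio concrete, the sketch below computes the static output level of a compressor for a given input level. It is an illustration only: the guidelines do not specify a threshold, so `threshold_db` is a caller-supplied assumption, and attack and release behaviour is not modelled.

```python
def compressed_level_db(input_db, threshold_db, ratio=3.0):
    """Static output level of a downward compressor, in dB.

    Levels at or below the threshold pass unchanged; above it, every
    `ratio` dB of input produces only 1 dB of output.
    """
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) / ratio
```

With an assumed threshold of -20 dB, an input at -8 dB is 12 dB over the threshold, so a 3:1 ratio lets through 4 dB of that and the output sits at -16 dB.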
The following is the chain of mastering used in-house at Audible using Sony Sound
Forge. This is here as a reference only, based on our typical recordings. There is no
substitute for listening. Using these exact settings, even in Sound Forge, may distort your
audio. Please use your own judgment and ears to come up with a chain and settings
that sound good for the content you are mastering.
A. Hard limit the audio with a threshold of -6.0 dB, the fastest attack possible, and
release at 500ms.
B. RMS Normalize (NOT peak normalize) the audio at -19 dB.
C. Hard limit the audio with a threshold of -3.8 dB, the fastest attack possible, and
release at 500ms.
D. RMS Normalize (NOT peak normalize) the audio at -18 dB.
E. Hard limit the audio with a threshold of -3.8 dB, the fastest attack possible, and
release at 500ms.
F. Apply EQ as needed.
G. Apply Noise Reduction if applicable to bring the noise floor below -55 dB or
better, without altering the sound of the voice.
a. Please Note, we request that you utilize noise reduction that has the ability
to learn and selectively reduce noise, rather than a downward expander.
The Noise Floor should never in any circumstances be reduced to zero.
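The difference between RMS normalizing and limiting in the chain above can be sketched in a few lines. This is not the Sound Forge implementation: it assumes float samples in the range -1.0 to 1.0, measures RMS against a full-scale value of 1.0 (some tools use a different reference and will report different numbers), and uses plain clipping as a crude stand-in for a hard limiter.

```python
import math


def rms_dbfs(samples):
    """RMS level in dBFS for float samples in the range -1.0 to 1.0."""
    if not samples:
        raise ValueError("no samples to measure")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square)


def rms_normalize(samples, target_dbfs=-18.0):
    """Scale samples so that their RMS level lands on `target_dbfs`."""
    current = rms_dbfs(samples)
    if current == float("-inf"):
        return list(samples)  # pure digital silence: nothing to scale
    gain = 10 ** ((target_dbfs - current) / 20)
    return [s * gain for s in samples]


def hard_clip(samples, ceiling_dbfs=-3.8):
    """Clamp peaks at the ceiling. Unlike a real limiter, this has no
    attack or release behaviour and will distort if driven hard."""
    ceiling = 10 ** (ceiling_dbfs / 20)
    return [max(-ceiling, min(ceiling, s)) for s in samples]
```

RMS normalization applies one gain value to the whole file based on its average energy, which is why the chain follows each normalize step with a limiter to catch the peaks that the gain change pushes too high.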
NOTE: The content levels should be peaking between -6 dB and -4 dB on average. Please
do not over-compress or over-limit the content so that it’s too loud. We do not want
content pinned at 0 dB, as a pop music song might be. -6 dB to -4 dB is loud enough and is
our standard dynamic range and level for original content created for Audible, Amazon,
and iTunes.
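A quick way to sanity-check the peak window described in the note is to measure the highest sample in a file. The sketch below is an assumption-laden helper, not an official check: the function names are hypothetical and it expects float samples where full scale is 1.0.

```python
import math


def peak_dbfs(samples):
    """Highest absolute sample value in dBFS (full scale = 1.0)."""
    peak = max((abs(s) for s in samples), default=0.0)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def peaks_in_target_range(samples, low_db=-6.0, high_db=-4.0):
    """True when the peak level sits inside the -6 dB to -4 dB window."""
    return low_db <= peak_dbfs(samples) <= high_db
```

The in-house chain limits at -3.8 dB, slightly above the -4 dB bound used here, so the upper limit is best treated as approximate. A meter reading does not replace listening.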
If delivering a stereo production with multiple sound design, music, and sfx elements,
please ensure the production is mono compatible. Stereo program material may be
converted to mono based on customer download settings, so please keep these things
in mind:
While there may be some loss in compatibility in any mono fold down, it’s up to the mix
engineer to ensure it isn’t detrimental to the listening experience.
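One rough way to gauge mono compatibility is to fold the mix down and compare its level with the stereo original. The sketch below assumes an equal-weight sum, mono = (L + R) / 2, which may differ from the conversion actually applied on the customer side; the function names are hypothetical.

```python
import math


def fold_down_to_mono(left, right):
    """Sum left and right at equal weight: mono = (L + R) / 2."""
    return [(l + r) / 2 for l, r in zip(left, right)]


def rms(samples):
    """Root-mean-square value of a list of samples (0.0 if empty)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def fold_down_loss_db(left, right):
    """Level change, in dB, of the mono sum relative to the stereo program.

    Values near 0 dB mean little is lost; large negative values point to
    out-of-phase material cancelling in mono.
    """
    stereo_rms = rms(list(left) + list(right))
    mono_rms = rms(fold_down_to_mono(left, right))
    if stereo_rms == 0:
        return 0.0
    if mono_rms == 0:
        return float("-inf")
    return 20 * math.log10(mono_rms / stereo_rms)
```

Identical left and right channels give 0 dB, while a channel summed with its own inverted copy cancels completely. Running this on individual stems or short sections can help locate the element causing a problem, but the final judgment should still be made by listening to the fold-down.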
Please use only the provided Audible Original Intro-Outro Title Credit Template.
If an Audible Original Intro-Outro Title Credit template has not been provided, please
reach out to the Audible Studios team immediately with Post Managers CC’d
(postmanagers@audible.com).
NOTE: Audible Original Intro-Outro Title credits must be pre-approved by Audible. They
are a creative decision to be worked out on a per-product basis with your Audible
point of contact.
For content containing any music or sound effects, please deliver 16 bit 44.1 kHz
stereo .WAV files.
For content containing NO music or sound effects, please deliver 16 bit 44.1 kHz
mono .WAV files.
Your product should be delivered as only one or the other of the above file
formats, not a combination.
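The file format requirements above can be verified before delivery by reading each WAV header. This is a minimal sketch using Python's standard `wave` module; the function names and the `has_music_or_sfx` flag are hypothetical, and it only inspects header fields, not audio quality.

```python
import wave


def describe_wav(path):
    """Read the header of a PCM WAV file and return its basic format."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "bit_depth": wav.getsampwidth() * 8,
            "sample_rate": wav.getframerate(),
        }


def delivery_problems(paths, has_music_or_sfx):
    """List format problems for a set of final master files.

    Stereo is expected when the content has music or sound effects,
    mono otherwise; every file in the product must use the same format.
    """
    expected_channels = 2 if has_music_or_sfx else 1
    problems = []
    for path in paths:
        info = describe_wav(path)
        if info["bit_depth"] != 16:
            problems.append(f"{path}: {info['bit_depth']}-bit, expected 16-bit")
        if info["sample_rate"] != 44_100:
            problems.append(f"{path}: {info['sample_rate']} Hz, expected 44100 Hz")
        if info["channels"] != expected_channels:
            problems.append(
                f"{path}: {info['channels']} channel(s), expected {expected_channels}"
            )
    return problems
```

An empty list means every file matches the expected format. Because the expected channel count is fixed per product, a mix of mono and stereo files is reported as a problem.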
All completed content should be delivered as a non-encrypted zip file. Please do not
use your operating system’s stock zip packaging utility, as these built-in features are
known to be unreliable. We recommend the use of WinZip for Windows computers, and
StuffIt Expander for Mac computers.
Due to specific file labeling requirements for different country origins, please refer to the
accompanying 2019 AS Full Production guidelines. If AS Full Production guidelines have
not been provided, please reach out to the Audible Studios team immediately with
Post Managers CC’d (postmanagers@audible.com).
Final Delivery
Final delivery should be made through your assigned Audible Studios FTP Account.
For full details review the 2019 AS Full Production guidelines. If AS Full Production
guidelines have not been provided, please reach out to the Audible Studios team
immediately with Post Managers CC’d (postmanagers@audible.com).
For delivering files to Audible Studios we must ask you to upload via File Transfer Protocol
(FTP). Our FTP system is for inbound transfers only, but other than that it functions in a
traditional manner.
How to Invoice
NOTE - unless otherwise specified, it is the studio’s responsibility to invoice for all
completed work.