Kelly Maxwell—Transcription, captioning, and subtitling (EAC-BC meeting)

Kelly Maxwell gave us a peek into the fascinating world of captioning and subtitling at April’s EAC-BC meeting. Maxwell, along with Carolyn Vetter Hicks, founded Vancouver-based Line 21 Media Services in 1994 to provide captioning, subtitling, and transcription services for movies, television, and digital media.

Not very many people knew what captioning was in the 1980s and ’90s, Maxwell said. But the Americans with Disabilities Act, passed in 1990, required all televisions distributed in the U.S. to have decoders for closed-captioning built in, and Canada, as a close trading partner, reaped the benefits. Captioning became ubiquitous and is now a CRTC requirement.

Line 21 works with post-production coordinators, the people who see a movie or TV show through editing and colour correction. Captioning is often the last thing that has to be done before these coordinators get paid, so the deadlines are tight. Maxwell and her colleagues may receive a script from the client, in which case they load it into their CaptionMaker software and clean it up, or they may have to do their own transcription using InqScribe, a simple, free transcription program. They aim to transcribe verbatim, and they rely on Google (in the ’90s, they depended on reference librarians) to fact-check and get the correct spelling for everything. Punctuation, too, is very important, and Maxwell uses it to maximize clarity: “People have to understand instantaneously when they see a caption,” she said. “I won’t ever give up the Oxford comma. We’re sticklers for old-fashioned, fairly heavy comma use. It can make a difference to someone understanding on the first pass.” She also edits for reading rate so that people with a range of literacy levels will understand. “Hearing people are the number-one users of captioning,” she said.
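To make the idea of editing for reading rate concrete, here is a minimal sketch of how a caption's rate might be checked. It is not Line 21's method or a feature of any captioning tool: the function names and the 180-words-per-minute ceiling are hypothetical, chosen only for illustration.

```python
# Hypothetical ceiling for illustration only; the talk did not give a figure.
MAX_WPM = 180


def words_per_minute(text, start_s, end_s):
    """Return the reading rate of a caption shown from start_s to end_s (seconds)."""
    duration = end_s - start_s
    if duration <= 0:
        raise ValueError("caption must end after it starts")
    # Count whitespace-separated words and scale to a per-minute rate.
    return len(text.split()) * 60.0 / duration


def too_fast(text, start_s, end_s, limit=MAX_WPM):
    """Flag a caption whose reading rate exceeds the chosen limit."""
    return words_per_minute(text, start_s, end_s) > limit


if __name__ == "__main__":
    caption = "People have to understand instantaneously when they see a caption."
    print(too_fast(caption, 10.0, 12.0))
```

A caption flagged this way could be shortened or given more time on screen; which of those to do is an editorial judgment the code does not make.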

Although HD televisions now accommodate a 40-character line, Line 21 continues to caption in 32-character lines. “Captioners like to think of the lowest common denominator,” Maxwell said. They need to consider all of the people who still have older technology. Her company doesn’t do live captioning, which is done by court reporters taking one-hour shifts and is still characterized by a three-line block of all-caps text rolling on the screen. Today the captioning can pop onto the screen and be positioned to show who’s talking. The timing is done by ear but is also timecoded to the frame. Maxwell and her colleagues format captions into readable chunks—for example, whole clauses—to make them comprehensible. Once the captions have all been input, she watches the program the whole way through to make sure nothing has been missed, including descriptions of sound effects or music.
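As a rough illustration of the 32-character constraint, the sketch below wraps a sentence into lines no wider than 32 characters using Python's standard textwrap module. The two-lines-per-caption grouping and the function name are assumptions made for the example. The wrapping is also purely greedy: it does not break at clause boundaries, which is exactly the part Maxwell and her colleagues do by hand.

```python
import textwrap

MAX_CHARS = 32  # line width mentioned in the talk


def wrap_caption(text, width=MAX_CHARS, max_lines=2):
    """Split text into caption blocks of at most max_lines lines.

    max_lines=2 is an assumption for this sketch, not a figure from the talk.
    Each line is at most `width` characters; breaks are greedy, not clause-aware.
    """
    lines = textwrap.wrap(text, width=width)
    # Group consecutive lines into blocks that would appear on screen together.
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]


if __name__ == "__main__":
    sentence = "People have to understand instantaneously when they see a caption."
    for block in wrap_caption(sentence):
        print("\n".join(block))
        print()
```

A human captioner would then adjust the breaks so that each line holds a readable chunk, such as a whole clause.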

Subtitling is similar to closed captioning, but in this case, “You assume people can hear.” Maxwell first creates a timed transcript in English and relies on the filmmakers to forge relationships with translators they can trust. Knowing the timelines, translators can match up word counts and create a set of subtitles that line up with the original script. Maxwell then swaps in these subtitles for the English ones and, after proofing the video, sends it back to the translators for a final look. How do you proofread in a language you don’t know? “You can actually do a lot of proofing and find a lot of mistakes just by watching the punctuation,” said Maxwell. “You can hear the periods,” she added. “Sometimes they [translators] change or reorder the lines.”

Before the proliferation of digital video, Maxwell told us, they couldn’t do subtitling, which had to be done directly on the film. Today, they have a massive set of tools at their disposal to do their work. “In the early ’90s,” she said, “there were two kinds of captioning.” In contrast, today “we have 80 different delivery formats,” and each broadcaster has its own requirements for formats and sizes. “People ask me if I’m worried about the ubiquity of the tools,” said Maxwell. “No. Just because I have a pencil doesn’t mean I’m a Picasso.”

As for voice-recognition software, such as YouTube’s automatic captioning feature, Maxwell says it just isn’t sophisticated enough and can produce captions riddled with errors. “You do need a human for captioning, I’m afraid.”

Maxwell prides herself on her company’s focus on providing quality captioning. One of her projects was captioning a four-part choral performance of a mass in Latin. According to the CRTC regulations, all she had to do was add musical notes (♪♫), but she wanted to do better. She bought the score and figured out who was singing what.

In another project, she captioned a speech by the Dalai Lama. “Do you change people’s grammar, change people’s words?” The Dalai Lama probably didn’t say some of the articles or some of the verbs (like “to be”) that appear in the final captions, Maxwell said, but captioners sometimes will make quiet changes to clarify meaning without changing the intent of the message.

Captioning involves “a lot, a lot, a lot of googling,” she said, “and a lot of random problem solving.” She’s well practiced in the “micro-discernment of phonemes.” Sometimes when she’s unable to tell what someone has said, all it takes is to get someone else to listen to it and say what they hear. Over the years, Maxwell and her team have developed tricks like these to help them help their clients reach as wide an audience as possible.