Your first question should be: What Are Captions?
Captions are a text representation of spoken words and other important audio information, synchronized with a video.
The National Association of the Deaf defines captioning as "the process of converting the audio content of a television broadcast, webcast, film, video, CD-ROM, DVD, live event, or other productions into text and displaying the text on a screen, monitor, or other visual display system.
The best captions not only display words as the textual equivalent of spoken dialogue or narration, but they also include speaker identification, sound effects, and music description."
There are some important things to understand about different types of captions. Seeing text on a video does not mean the video has a fully accessible caption file. Captions should be a separate text file associated with the video that can be adjusted to viewer preferences.
What captioning quality is required for ADA and Section 508 compliance?
Section 508 is an amendment to the federal Rehabilitation Act of 1973 that broadens the original act's application to include online video content. While Section 508 generally applies only to federal agencies, many states have passed laws that also make it applicable to federally funded organizations, such as colleges, research facilities, and arts institutions.
Quality standards for television captioning set the precedent for online video captioning that aims to improve accessibility. The quality standards for captioning online video include the following:
- Accuracy: Captions must be at least 99% accurate when relaying the speaker's exact words, including correct spelling, punctuation, and grammar, with no paraphrasing.
- Synchronization: Captions must be time synchronized so they align with the words spoken in the video, and must remain visible long enough for the viewer to read (3 to 7 seconds per caption frame).
- Completeness: Videos must include captions from beginning to end.
- Styling and Placement: Font and size should be easy to read, and the placement on the screen should not block important content.
Additional items, such as sound and speaker identification and transcription symbols, are also important in captioning.
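The accuracy and timing standards above can be checked roughly with a short script. The following is a minimal sketch, not an official compliance test: it assumes accuracy is measured as one minus the word error rate against a human-verified reference transcript, and the function names `word_accuracy` and `duration_ok` are hypothetical. Words are compared exactly, so spelling, capitalization, and punctuation differences all count as errors.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 minus the word error rate (assumed definition of accuracy).

    reference  -- human-verified transcript of what was actually said
    hypothesis -- caption text being evaluated
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        return 1.0 if not hyp else 0.0

    # Word-level edit distance computed row by row (dynamic programming).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from captions
                            curr[j - 1] + 1,      # extra word in captions
                            prev[j - 1] + cost))  # wrong word in captions
        prev = curr

    errors = prev[-1]
    return max(0.0, 1.0 - errors / len(ref))


def duration_ok(start_s: float, end_s: float,
                min_s: float = 3.0, max_s: float = 7.0) -> bool:
    """Check that a caption frame stays on screen for 3 to 7 seconds."""
    return min_s <= (end_s - start_s) <= max_s


reference = "The quick brown fox jumps over the lazy dog."
captions = "The quick brown fox jumps over the lazy dock."
print(f"{word_accuracy(reference, captions):.1%}")  # 88.9%
print(duration_ok(12.0, 16.5))                      # True
```

In this example a single wrong word out of nine drops the accuracy to about 89 percent, well below the 99 percent standard, which shows how few errors a short video can tolerate.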
Types of Captions and Transcripts:
Subtitles
Subtitles convey only what is spoken in video and audio, for viewers who are not familiar with or do not understand the language of the video. They are used for language translation and often appear in foreign films or TV shows. Unfortunately, some subtitles don't match the audio voiceovers.
Subtitles assume that the viewer can hear the audio, so they don't include background sounds or speaker changes. They do not contain the non-speech descriptions that are necessary for accessibility, such as music or sound effects. This format does not meet ADA standards. Captioning Services does not use this format.
AI Captions (Machine Auto-Generated)
AI captions are a built-in feature of YouTube, Panopto, and other video hosting services. ISU's Panopto system automatically processes videos after they are uploaded. AI captions use automatic speech recognition (ASR) software to create closed captions on a video. While they can be a helpful starting point, machine captions do not meet ADA standards. Relying only on ASR to caption recorded videos is detrimental to the accuracy of the captions. Like your phone, ASR can auto-correct and make mistakes.
With AI, many conditions have to be met to produce fairly accurate captions:
- High-quality audio with no background noise.
- Few or no grammatical errors and few mispronunciations.
- Clear speech; dialect and accent also affect recognition.
- And most importantly, your distance from the microphone. If you are on the other side of the room, it won't pick you up. Likewise, you don't want to be too close to the microphone either.
If any of these variables are not right, the accuracy of the captions can drop as low as 50 percent and lead to a horrible user experience.
In Zoom meetings and events, the caption feature is commonly called Closed Captioning. It is activated automatically by default, and the captions are created by an AI caption system. It is not actually considered a true closed caption format. Though the system makes a good attempt to replicate most of what closed captions provide, the user does not get all of the non-verbal information, and the text contains errors from ASR interpreting live speech, as well as auto-correct errors.
Closed Captions
Closed captions can be turned on and off according to the viewer's preferences. They provide users with the most accessible experience: the textual equivalent of spoken dialogue or narration, plus speaker identification, sound effects, and music description. Captioning Services provides human editors who review and edit captions so that they meet federal quality standards and achieve high accuracy rates. Captions are also edited so that the viewer can read them comfortably and so that the caption timing matches the audio as it is spoken.
Closed captions are added to a video as a separate text file and provide several benefits for those who access your content:
- The font size and color of the text can be adjusted.
- The captions can be turned on or off by the viewer.
- The caption file can be edited.
- The captions can be searched.
- The captions can be moved on screen.
ISU Captioning Services uses this format for all of its captions.
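To illustrate why a separate caption file is editable and searchable, here is a minimal sketch that reads a small caption file and searches it. It assumes the captions are stored as WebVTT, a common caption file format; the text above does not say which format ISU Captioning Services delivers, and the sample cues, the `Cue` class, and the `parse_vtt` and `search` helpers are all made up for illustration. The parser is simplified and only handles timestamps written as hh:mm:ss.mmm.

```python
import re
from dataclasses import dataclass

# Hypothetical sample file with one sound description and one identified speaker.
SAMPLE_VTT = """WEBVTT

00:00:01.000 --> 00:00:05.000
[upbeat music]

00:00:05.500 --> 00:00:09.000
INSTRUCTOR: Welcome to the course.
"""


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    end: float
    text: str


_TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{3}) --> (\d{2}):(\d{2}):(\d{2})\.(\d{3})"
)


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_vtt(text: str) -> list[Cue]:
    """Parse caption cues from WebVTT text (simplified: hh:mm:ss.mmm only)."""
    cues = []
    for block in text.strip().split("\n\n"):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = _TIMING.match(line)
            if match:
                parts = match.groups()
                caption = "\n".join(lines[i + 1:])
                cues.append(Cue(_seconds(*parts[:4]), _seconds(*parts[4:]), caption))
                break  # blocks without a timing line (the header) are skipped
    return cues


def search(cues: list[Cue], term: str) -> list[Cue]:
    """Return the cues whose text contains the term, ignoring case."""
    term = term.lower()
    return [cue for cue in cues if term in cue.text.lower()]


cues = parse_vtt(SAMPLE_VTT)
for hit in search(cues, "welcome"):
    print(f"{hit.start:.1f}s: {hit.text}")
```

Because the captions live in plain text next to the video, a typo can be corrected by editing one line of the file, and a player or search tool can jump to the moment a word is spoken. Neither is possible once the text has been rendered into the video image.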
Open Captions
Open Captions are processed the same way as Closed Captions. However, Open Captions are "burned" into a video. Video editors like Camtasia or Adobe Premiere can overlay the captions onto the video so that it produces a single file. There are some challenges that come with open captions:
- They are burned into the video and cannot be edited.
- The captions cannot be turned off.
- They cannot be moved.
- A viewer cannot change the font size or text color of the captions.
- The captions are not searchable.
Open captions should only be used if a closed captioning option is not available. Videos processed in this format are usually HIPAA/clinical videos that are hosted on ISU's Box accounts.
Transcripts
Transcripts should not be confused with captions. A transcript is a text file of what is spoken in video or audio media. It may or may not have non-speech descriptions. Transcripts are particularly useful for podcasts or other audio-only content. The advantage of a transcript is that the user can just read it without having to watch the video or listen to the audio.
ISU Captioning Services also provides transcripts in an ADA standard format as needed.
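The relationship between captions and transcripts can be sketched in a few lines: a transcript is roughly the caption text with the timing removed. The example below is an illustration only, not the process ISU Captioning Services uses. The function name `cues_to_transcript` is hypothetical, and it assumes non-speech descriptions are written in square brackets, which is a common convention but not one stated above.

```python
import re


def cues_to_transcript(cue_texts: list[str], keep_non_speech: bool = True) -> str:
    """Join caption cue texts into one plain transcript paragraph."""
    parts = []
    for text in cue_texts:
        if not keep_non_speech:
            # Assumed convention: descriptions such as "[applause]" are in brackets.
            text = re.sub(r"\[[^\]]*\]", "", text)
        text = " ".join(text.split())  # collapse line breaks and extra spaces
        if text:
            parts.append(text)
    return " ".join(parts)


caption_texts = [
    "[upbeat music]",
    "INSTRUCTOR: Welcome to\nthe course.",
    "[applause] Thank you.",
]
print(cues_to_transcript(caption_texts))
print(cues_to_transcript(caption_texts, keep_non_speech=False))
```

The first call keeps the music and applause descriptions, and the second produces a speech-only transcript, which mirrors the point that a transcript may or may not include non-speech information.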
Uh... What about Descriptive Video?
Descriptive video is a form of audio-visual translation used primarily by blind and visually impaired consumers of television and film. Television and movies provide descriptive video when they employ a Secondary Audio Program (SAP), in which the images and actions on screen are described during natural pauses in the dialogue.
Captioning Services does not supply this format. Faculty will need to contact Disability Services for these accommodations.
And What about Sign Language (ASL) Video?
These videos may or may not have audio tracks. Communication is done via sign language. There are many sign languages, for example American Sign Language (ASL). Transcription means rendering speech in the writing system of that language. By definition, sign language can't be captioned; gestural languages don't have a writing system. (Notations have been developed for general human movement and for specific sign languages, but we are unaware of any sign language with a native companion writing system.)
ISU Captioning Services currently does not offer video ASL services for audiences. ISU departments should consult with Disability Services regarding ASL needs.