
How can I add text for non-speech elements (like Music) in subtitles?

This article explains how to include descriptions of non-speech elements, such as music, applause, or ambient sounds, in your subtitle tracks alongside dialogue.

Written by Taylor Jennings
Updated this week

Important Notes

  • Non-speech descriptions (e.g. “[music]”, “[applause]”, “[door opens]”) are not automatically detected. You must manually insert them.

  • Ensure the timing of non-speech text aligns properly with your media.

  • Use square brackets or other conventional notation so these descriptions are clearly distinguished from dialogue.

Step-by-Step Instructions

  1. Upload your media file (audio or video) and let Sonix transcribe the spoken content.

  2. Open the subtitle or transcript editor where you can view and edit individual subtitle segments.

  3. Identify places in the timeline where non-speech elements occur (e.g. background music starts, doors slam, laughter).

  4. Insert subtitle lines at those timestamps and enter descriptive text such as:

    • [music]

    • [applause]

    • [footsteps]

    • [background noise]

  5. Check that the inserted lines do not overlap with spoken lines. If they do, adjust the timing slightly to maintain readability.
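
To make steps 4 and 5 concrete, here is a minimal Python sketch of what an inserted non-speech line looks like in a subtitle file that uses the common SRT layout. It does not use any Sonix API; the `Cue` class, the helper names, and the sample timings are all hypothetical and exist only for illustration. In the editor you perform the same steps by hand.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One subtitle line (hypothetical structure, times in milliseconds)."""
    start_ms: int
    end_ms: int
    text: str


def format_timestamp(ms: int) -> str:
    """Render milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def insert_cue(cues: list[Cue], new_cue: Cue) -> list[Cue]:
    """Return a new list with new_cue added, ordered by start time."""
    return sorted(cues + [new_cue], key=lambda c: c.start_ms)


def overlaps(a: Cue, b: Cue) -> bool:
    """True when the two cues would be on screen at the same time."""
    return a.start_ms < b.end_ms and b.start_ms < a.end_ms


def to_srt(cues: list[Cue]) -> str:
    """Serialize cues as SRT text, numbering them from 1."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        timing = f"{format_timestamp(cue.start_ms)} --> {format_timestamp(cue.end_ms)}"
        blocks.append(f"{index}\n{timing}\n{cue.text}\n")
    return "\n".join(blocks)


# Sample data (made up): two spoken lines with a gap where music plays.
dialogue = [
    Cue(1_000, 3_500, "Welcome back, everyone."),
    Cue(6_000, 8_000, "Let's get started."),
]
music = Cue(3_600, 5_900, "[music]")

cues = insert_cue(dialogue, music)
srt_text = to_srt(cues)
print(srt_text)
```

The `[music]` line sits on its own cue between the two spoken lines, and `overlaps` can be used to confirm it does not share screen time with either of them.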

Common Issues and Troubleshooting

How can I prevent non-speech descriptions from overlapping with dialogue and cluttering the subtitles?

Shift the timing of either the dialogue line or the non-speech line slightly so they don’t overlap. Use small timing offsets to give each line space.
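
As a rough illustration of such an offset, the sketch below moves a non-speech line so that it starts shortly after an overlapping dialogue line ends. The function name `nudge_after`, the `(start_ms, end_ms)` tuple representation, and the 100 ms default gap are assumptions made for this example, not Sonix settings.

```python
def nudge_after(
    dialogue: tuple[int, int],
    sound: tuple[int, int],
    gap_ms: int = 100,
) -> tuple[int, int]:
    """Shift `sound` to begin gap_ms after `dialogue` ends if the two overlap.

    Both arguments are (start_ms, end_ms) pairs. The duration of the
    non-speech line is preserved; non-overlapping input is returned unchanged.
    """
    d_start, d_end = dialogue
    s_start, s_end = sound
    if s_start >= d_end or d_start >= s_end:
        return sound  # no overlap, nothing to do
    new_start = d_end + gap_ms
    return (new_start, new_start + (s_end - s_start))


# Made-up timings: applause begins 300 ms before the speaker finishes.
dialogue_line = (1_000, 3_500)
applause = (3_200, 5_200)
shifted = nudge_after(dialogue_line, applause)
print(shifted)  # (3600, 5600)
```

Shifting a line later can push it into the next dialogue line, so it is worth checking the following segment after each adjustment.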

How do I know where to insert non-speech descriptions?

Listen carefully to the original audio or video and mark moments where there is no dialogue but there is meaningful sound (music, ambient noise, effects). Use the media timeline to pinpoint those moments.

Frequently Asked Questions

Do I have to add non-speech descriptions?

No, it’s optional. But adding them improves accessibility and clarity, especially for viewers who rely on subtitles because they cannot hear the sound cues.

Can I edit non-speech descriptions after exporting?

You can edit them in the subtitle editor at any time before you export. However, once you burn subtitles into the video (if that’s your export method), the text in that exported video is permanent.

Are there best practices for writing non-speech descriptions?

  • Use concise descriptions (e.g. [music fades], [laughter])

  • Stay consistent (same style/notation throughout)

  • Avoid excessive detail; include only what helps understanding

  • Place them on their own subtitle lines, not mixed with dialogue
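
If you review exported subtitle text outside the editor, the practices above can be checked with a short script. This is only a sketch: it assumes square-bracket notation, and the four-word limit and the name `check_description` are arbitrary choices for this example rather than an official rule.

```python
import re

# Matches a line that is exactly one bracketed description, e.g. "[music fades]".
BRACKETED = re.compile(r"^\[([^\[\]]+)\]$")


def check_description(text: str, max_words: int = 4) -> list[str]:
    """Return a list of style problems found in one non-speech subtitle line."""
    problems = []
    match = BRACKETED.match(text.strip())
    if match is None:
        # Covers missing brackets and descriptions mixed with dialogue.
        problems.append("not a single bracketed description on its own line")
        return problems
    if len(match.group(1).split()) > max_words:
        problems.append(f"longer than {max_words} words")
    return problems


print(check_description("[music fades]"))               # []
print(check_description("[laughter] That was funny."))  # mixed with dialogue
```

An empty list means the line passed both checks. If you use a different notation, such as parentheses, the pattern would need to change to match it.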

For further assistance, please contact Sonix support at [email protected] or through the chatbox located in the bottom right corner of our website.
