Creating a Transcription Job
Create and submit a job to transcribe one or more media files to text files in the Speech service.
Before you begin
- Store the media files that you want to transcribe in an Object Storage bucket.
- To compare the Whisper and Oracle ASR models for transcription job creation, see Comparing Whisper and Oracle ASR Models.
Comparing Whisper and Oracle ASR Models
Compare the Whisper model and the Oracle ASR model for creating transcription jobs.
In addition to the native Oracle ASR speech model, Speech supports the Whisper model from OpenAI. Whisper is trained on a large corpus of multilingual data collected from the web, and it supports file-based voice-to-text transcription for over 50 languages. For flexibility and compatibility, it uses the same service endpoints, API, and SDK interfaces as the Oracle ASR model. The Whisper model also uses diarization to label individual speakers in a recording.
Use the following comparison of the Whisper and Oracle ASR models to choose the correct model when creating a transcription job.
Feature | Oracle ASR Model | Whisper Model in OCI Speech |
---|---|---|
Real-time transcription | Supported | Not supported |
Maximum file size | Up to 2 GB | Up to 2 GB |
Word-level timestamps | Supported | Supported |
File formats | AAC, AC3, AMR, AU, FLAC, M4A, MKV, MP3, MP4, OGA, OGG, WAV, WEBM | AAC, AC3, AMR, AU, FLAC, M4A, MKV, MP3, MP4, OGA, OGG, WAV, WEBM |
Multilingual support | English, Spanish, French, German, Italian, Portuguese, and Hindi | Same as Oracle ASR model plus 50 other languages* |
Diarization | Supported | Supported |
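The comparison above can be turned into a small pre-flight check before you create a job. The following is a minimal sketch, not part of the Speech service: the function names and the `oracle_asr` / `whisper` labels are hypothetical, and the language and format sets are copied from the table. Because the full list of additional Whisper languages is not given here, the sketch cannot confirm that Whisper supports a particular language outside the Oracle ASR list.

```python
from pathlib import PurePosixPath

# Values copied from the comparison table above.
ORACLE_ASR_LANGUAGES = {
    "English", "Spanish", "French", "German", "Italian", "Portuguese", "Hindi",
}
SUPPORTED_EXTENSIONS = {
    "aac", "ac3", "amr", "au", "flac", "m4a", "mkv",
    "mp3", "mp4", "oga", "ogg", "wav", "webm",
}


def is_supported_format(object_name: str) -> bool:
    """Return True if the object's file extension is in the table's format list."""
    suffix = PurePosixPath(object_name).suffix.lstrip(".").lower()
    return suffix in SUPPORTED_EXTENSIONS


def choose_model(language: str, real_time: bool = False) -> str:
    """Pick a model label from the table (hypothetical labels, not API values).

    Assumption: when both models fit, this sketch prefers Oracle ASR.
    """
    in_oracle_list = language in ORACLE_ASR_LANGUAGES
    if real_time:
        # Per the table, only Oracle ASR supports real-time transcription.
        if not in_oracle_list:
            raise ValueError(
                f"{language!r} is not in the Oracle ASR language list, "
                "and Whisper does not support real-time transcription"
            )
        return "oracle_asr"
    # Outside the Oracle ASR list, Whisper is the only candidate; check the
    # service documentation to confirm the language is actually supported.
    return "oracle_asr" if in_oracle_list else "whisper"


print(choose_model("Hindi"))
print(choose_model("Japanese"))
print(is_supported_format("calls/interview.MP3"))
```

Preferring Oracle ASR when both models qualify is only a default chosen for this sketch; swap the preference if Whisper suits your recordings better.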
To create a transcription job from the CLI, use the create command with its required parameters:
oci speech transcription-job create [OPTIONS]
Avoid entering confidential information.
For a complete list of flags and variable options for CLI commands, see the CLI Command Reference.
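As an illustration of what the options might look like, the following sketch assembles a `create` invocation as an argument list without running it. It assumes the required parameters are `--compartment-id`, `--input-location`, and `--output-location`, and that the two location parameters take JSON in the shape shown; confirm both against the CLI Command Reference. The helper name `build_create_command` and every value passed to it (OCID, namespace, bucket and object names) are placeholders.

```python
import json
import shlex


def build_create_command(compartment_id, namespace, input_bucket,
                         object_names, output_bucket, output_prefix=""):
    """Return argv for `oci speech transcription-job create` (not executed)."""
    # Assumed JSON shape for an inline list of Object Storage objects.
    input_location = {
        "locationType": "OBJECT_LIST_INLINE_INPUT_LOCATION",
        "objectLocations": [{
            "namespaceName": namespace,
            "bucketName": input_bucket,
            "objectNames": list(object_names),
        }],
    }
    # Assumed JSON shape for the bucket that receives the transcripts.
    output_location = {
        "namespaceName": namespace,
        "bucketName": output_bucket,
        "prefix": output_prefix,
    }
    return [
        "oci", "speech", "transcription-job", "create",
        "--compartment-id", compartment_id,
        "--input-location", json.dumps(input_location),
        "--output-location", json.dumps(output_location),
    ]


# Placeholder values; print a shell-quoted command line for review.
argv = build_create_command(
    "ocid1.compartment.oc1..example", "my-namespace",
    "media-bucket", ["meeting.wav", "interview.mp3"],
    "transcripts-bucket", "jobs/",
)
print(shlex.join(argv))
```

Building the list first and printing it lets you inspect the JSON arguments before running anything; to execute it you could pass `argv` to `subprocess.run`.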
Use the CreateTranscriptionJob operation to create a job. To move an existing job to a different compartment, use the ChangeTranscriptionJobCompartment operation.
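The request bodies for these two operations might be assembled as follows. This is a sketch under assumptions: the field names (`compartmentId`, `inputLocation`, `outputLocation`, `displayName`) should be checked against the Speech API reference, the helper names are hypothetical, and all identifiers are placeholders. It only builds the JSON; signing and sending the request is left to the SDK or HTTP client you use.

```python
import json


def create_job_body(compartment_id, namespace, inputs, output_bucket,
                    output_prefix="", display_name=None):
    """Build an assumed CreateTranscriptionJob body.

    `inputs` maps a bucket name to the list of object names to transcribe,
    so one job can cover media files in more than one bucket.
    """
    body = {
        "compartmentId": compartment_id,
        "inputLocation": {
            "locationType": "OBJECT_LIST_INLINE_INPUT_LOCATION",
            "objectLocations": [
                {
                    "namespaceName": namespace,
                    "bucketName": bucket,
                    "objectNames": list(names),
                }
                for bucket, names in inputs.items()
            ],
        },
        "outputLocation": {
            "namespaceName": namespace,
            "bucketName": output_bucket,
            "prefix": output_prefix,
        },
    }
    if display_name is not None:
        # Avoid entering confidential information in the name.
        body["displayName"] = display_name
    return body


def change_compartment_body(target_compartment_id):
    """Build an assumed ChangeTranscriptionJobCompartment body."""
    return {"compartmentId": target_compartment_id}


# Placeholder values for illustration only.
body = create_job_body(
    "ocid1.compartment.oc1..example", "my-namespace",
    {"media-bucket": ["meeting.wav"]}, "transcripts-bucket",
    display_name="weekly-meeting",
)
print(json.dumps(body, indent=2))
```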