GPT-4o API analyze audio python sample


Here is an example of how to use the GPT-4o API in Python to analyze audio data:

Transcribing and Summarizing Audio

To transcribe and summarize an audio file using the GPT-4o API, you can follow these steps:

```python
from openai import OpenAI

# Assumes version 1.0 or later of the openai Python package.
client = OpenAI(api_key="YOUR_API_KEY")

MODEL = "gpt-4o"  # model used for the summary step

# Step 1: Transcribe the audio
audio_path = "path/to/audio.mp3"
with open(audio_path, "rb") as audio_file:  # closes the file after upload
    transcription = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# Step 2: Summarize the transcription
response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are generating a transcript summary. Create a summary of the provided transcription. Respond in Markdown."},
        {"role": "user", "content": f"The audio transcription is: {transcription.text}"}
    ],
    temperature=0,
)

print(response.choices[0].message.content)
```

Key points:

- Use `client.audio.transcriptions.create()` to transcribe the audio file
- Specify the `"whisper-1"` model for the transcription
- Pass the audio file as a binary stream opened with `open(audio_path, "rb")`
- The transcription text is available in `transcription.text`

- Use `client.chat.completions.create()` with the `gpt-4o` model to generate a summary of the transcription (the chat endpoint is required because the request uses `messages`)
- Provide the transcription text as the user message
- Set the system message to indicate that you want a summary
- Adjust the `temperature` parameter to control how much the summary varies between runs; `0` gives the most consistent output

This example demonstrates how to transcribe an audio file and then generate a summary of the transcription with two API calls. The `whisper-1` model handles the speech-to-text step and GPT-4o handles the language generation, so you can integrate both capabilities into your applications without running any models yourself.

Note that, at the time of writing, the GPT-4o API does not accept audio input directly, so you must transcribe the audio to text first and then pass the text to the model for summarization. Future updates may introduce direct audio processing capabilities.
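
The transcribe-then-summarize flow can be wrapped in a single reusable function. The following is a minimal sketch, not a definitive implementation: the names `summarize_audio`, `build_summary_messages`, and `SUMMARY_SYSTEM_PROMPT` are hypothetical, and the `client` argument is assumed to be an `OpenAI` client from version 1.0 or later of the `openai` package (or any object that exposes the same `audio.transcriptions.create` and `chat.completions.create` methods). Passing the client in as an argument keeps the function easy to test with a stub.

```python
from pathlib import Path

# Hypothetical constant: the same system prompt used in the example above.
SUMMARY_SYSTEM_PROMPT = (
    "You are generating a transcript summary. "
    "Create a summary of the provided transcription. Respond in Markdown."
)


def build_summary_messages(transcript: str) -> list:
    """Build the chat messages that ask the model to summarize a transcript."""
    return [
        {"role": "system", "content": SUMMARY_SYSTEM_PROMPT},
        {"role": "user", "content": f"The audio transcription is: {transcript}"},
    ]


def summarize_audio(client, audio_path, chat_model="gpt-4o",
                    transcription_model="whisper-1", temperature=0):
    """Transcribe an audio file, then return a summary of the transcript.

    Assumption: `client` behaves like `openai.OpenAI` (openai>=1.0).
    """
    # Step 1: speech-to-text. The context manager closes the file afterwards.
    with Path(audio_path).open("rb") as audio_file:
        transcription = client.audio.transcriptions.create(
            model=transcription_model,
            file=audio_file,
        )

    # Step 2: send the transcript text to the chat model for summarization.
    response = client.chat.completions.create(
        model=chat_model,
        messages=build_summary_messages(transcription.text),
        temperature=temperature,
    )
    return response.choices[0].message.content
```

With the `openai` package installed, a call might look like `summarize_audio(OpenAI(api_key="YOUR_API_KEY"), "path/to/audio.mp3")`, which returns the Markdown summary as a string.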
