Speech to text
Learn how to turn audio into text
ChatGPT is a large language model that combines artificial intelligence and natural language processing, and it can interact with users through text, speech, or images. With speech-to-text, spoken input can be converted to text immediately, analyzed, and answered in written form, which greatly improves the efficiency of interaction between ChatGPT and the user.
Introduction
The speech to text API provides two endpoints, transcriptions and translations, based on our state-of-the-art open source large-v2 Whisper model. They can be used to:
- Transcribe audio into whatever language the audio is in.
- Translate and transcribe the audio into English.
File uploads are currently limited to 25 MB, and the following input file types are supported: mp3, mp4, mpeg, mpga, m4a, wav, and webm.
Quickstart
Transcriptions
The transcriptions API takes as input the audio file you want to transcribe and the desired output file format for the transcription of the audio. We currently support multiple input and output file formats.
Python
# Note: you need to be using OpenAI Python v0.27.0 for the code below to work
import openai

# Open the audio file in binary mode and pass it to the Whisper model
audio_file = open("/path/to/file/audio.mp3", "rb")
transcript = openai.Audio.transcribe("whisper-1", audio_file)
cURL
curl --request POST \
--url https://api.openai.com/v1/audio/transcriptions \
--header 'Authorization: Bearer TOKEN' \
--header 'Content-Type: multipart/form-data' \
--form file=@/path/to/file/openai.mp3 \
--form model=whisper-1
By default, the response type will be json with the raw text included.
{
  "text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger.
...
}
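Since the default response is a JSON body, it can be consumed with Python's standard library. The sketch below uses a hard-coded stand-in string for the body; in practice the "text" value comes back from the API:

```python
import json

# A trimmed-down stand-in for the JSON body returned with the default
# response_format; the real "text" value comes from the API.
raw = '{"text": "Imagine the wildest idea that you have ever had."}'

response = json.loads(raw)
print(response["text"])
# → Imagine the wildest idea that you have ever had.
```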
To set additional parameters in a request, you can add more --form lines with the relevant options. For example, if you want to set the output format as text, you would add the following line:
...
--form file=@openai.mp3 \
--form model=whisper-1 \
--form response_format=text
Translations
The translations API takes as input the audio file in any of the supported languages and transcribes, if necessary, the audio into English. This differs from our /Transcriptions endpoint since the output is not in the original input language and is instead translated to English text.
Python
# Note: you need to be using OpenAI Python v0.27.0 for the code below to work
import openai

# The input audio can be in any supported language; the output is English
audio_file = open("/path/to/file/german.mp3", "rb")
transcript = openai.Audio.translate("whisper-1", audio_file)
cURL
curl --request POST \
--url https://api.openai.com/v1/audio/translations \
--header 'Authorization: Bearer TOKEN' \
--header 'Content-Type: multipart/form-data' \
--form file=@/path/to/file/german.mp3 \
--form model=whisper-1
In this case, the inputted audio was German and the outputted text looks like:
Hello, my name is Wolfgang and I come from Germany. Where are you heading today?
We only support translation into English at this time.
Supported languages
We currently support the following languages through both the transcriptions and translations endpoints:
Afrikaans, Arabic, Armenian, Azerbaijani, Belarusian, Bosnian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kannada, Kazakh, Korean, Latvian, Lithuanian, Macedonian, Malay, Marathi, Maori, Nepali, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Tamil, Thai, Turkish, Ukrainian, Urdu, Vietnamese, and Welsh.
While the underlying model was trained on 98 languages, we only list the languages whose word error rate (WER) fell below 50%, an industry-standard benchmark for speech to text model accuracy. The model will return results for languages not listed above, but the quality will be low.
Longer inputs
By default, the Whisper API only supports files that are less than 25 MB. If you have an audio file that is larger than that, you will need to break it up into chunks of 25 MB or less, or use a compressed audio format. To get the best performance, we suggest that you avoid breaking the audio up mid-sentence, as this may cause some context to be lost.
One way to handle this is to use the PyDub open source Python package to split the audio:
from pydub import AudioSegment
song = AudioSegment.from_mp3("good_morning.mp3")
# PyDub handles time in milliseconds
ten_minutes = 10 * 60 * 1000
first_10_minutes = song[:ten_minutes]
first_10_minutes.export("good_morning_10.mp3", format="mp3")
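For files longer than a single chunk, the same slicing can be applied in a loop. Below is a sketch of the boundary arithmetic in plain Python; chunk_ranges is a helper name introduced here, not part of PyDub, and the export step from above is indicated in a comment:

```python
def chunk_ranges(duration_ms, chunk_ms=10 * 60 * 1000):
    """Return (start, end) millisecond pairs covering the whole file."""
    return [(start, min(start + chunk_ms, duration_ms))
            for start in range(0, duration_ms, chunk_ms)]

# A 25-minute file yields two full 10-minute chunks plus a 5-minute tail:
print(chunk_ranges(25 * 60 * 1000))
# → [(0, 600000), (600000, 1200000), (1200000, 1500000)]

# Each (start, end) pair can then be exported with PyDub as above:
#   song[start:end].export(f"chunk_{start}.mp3", format="mp3")
```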
OpenAI makes no guarantees about the usability or security of 3rd party software like PyDub.
Prompting
You can use a prompt to improve the quality of the transcripts generated by the Whisper API. The model will try to match the style of the prompt, so it will be more likely to use capitalization and punctuation if the prompt does too. However, the current prompting system is much more limited than our other language models and only provides limited control over the generated transcript. Here are some examples of how prompting can help in different scenarios:
- Prompts can be very helpful for correcting specific words or acronyms that the model often misrecognizes in the audio. For example, the following prompt improves the transcription of the words DALL·E and GPT-3, which were previously written as "GDP 3" and "DALI":

The transcript is about OpenAI which makes technology like DALL·E, GPT-3, and ChatGPT with the hope of one day building an AGI system that benefits all of humanity
- To preserve the context of a file that was split into segments, you can prompt the model with the transcript of the preceding segment. This will make the transcript more accurate, as the model will use the relevant information from the previous audio. The model will only consider the final 224 tokens of the prompt and ignore anything earlier.
- Sometimes the model might skip punctuation in the transcript. You can avoid this by using a simple prompt that includes punctuation:
Hello, welcome to my lecture.
- The model may also leave out common filler words in the audio. If you want to keep the filler words in your transcript, you can use a prompt that contains them:
Umm, let me think like, hmm… Okay, here's what I'm, like, thinking.
- Some languages can be written in different ways, such as simplified or traditional Chinese. The model might not always use the writing style that you want for your transcript by default. You can improve this by using a prompt in your preferred writing style.
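Putting the segment-context tip into code: below is a sketch of transcribing split chunks in sequence, passing each call the previous transcript as the prompt. The transcribe argument stands in for a wrapper around openai.Audio.transcribe(..., prompt=...); a stub is used here so the control flow can be shown without a network call:

```python
def transcribe_with_context(chunk_files, transcribe):
    """Transcribe chunks in order, carrying context between calls."""
    texts = []
    prompt = ""
    for chunk in chunk_files:
        # Whisper only looks at the final 224 tokens of the prompt,
        # so passing the entire previous transcript is fine.
        text = transcribe(chunk, prompt=prompt)
        texts.append(text)
        prompt = text
    return " ".join(texts)

# With a stub transcriber the wiring can be checked directly:
stub = lambda chunk, prompt: f"[{chunk}|ctx={bool(prompt)}]"
print(transcribe_with_context(["a.mp3", "b.mp3"], stub))
# → [a.mp3|ctx=False] [b.mp3|ctx=True]
```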