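CSDN recently launched a campaign titled "Try Microsoft Azure AI Cognitive Services for 0 yuan, with gifts to give away", and it is still running. I signed up right away, but only found time to actually try the services today.
The campaign asks participants to write a blog post sharing their trial of at least three of these services: speech-to-text, text-to-speech, speech translation, text analytics, text translation, and language understanding.
After trying speech-to-text, text-to-speech, and speech translation, I decided to build a real-time speech translator, and it works really well.
Let's see how it is done. First, go to https://portal.azure.cn/ and sign in.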
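Getting the key
Type "認知服務" (Cognitive Services) into the search box and confirm:
Then create a Speech service:
Enter a name, choose a location, select the free pricing tier, and create a new resource group and select it:
Then click Create. While the resource is being created, the portal shows that deployment is in progress:
When deployment finishes, click Go to resource:
Then click Keys and Endpoint to view the keys and the location/region:
There are two keys, and either one works. Note down the location/region as well, because the program later calls the service using the key and the region.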
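Since the key is a secret, one option is to keep it out of the source file and read it from the environment instead. The following is a minimal sketch of that idea, not part of the original post; the variable names AZURE_SPEECH_KEY and AZURE_SPEECH_REGION and the helper load_speech_credentials are hypothetical names chosen for illustration.

```python
import os


def load_speech_credentials(env=None):
    """Return (key, region) read from environment variables.

    AZURE_SPEECH_KEY and AZURE_SPEECH_REGION are hypothetical names chosen
    for this sketch; they are not defined by Azure or by the original post.
    """
    env = os.environ if env is None else env
    key = env.get("AZURE_SPEECH_KEY", "").strip()
    region = env.get("AZURE_SPEECH_REGION", "").strip()
    if not key or not region:
        raise RuntimeError(
            "Set AZURE_SPEECH_KEY and AZURE_SPEECH_REGION before running")
    return key, region
```

The returned pair could then be used wherever the snippets in this post assign speech_key and service_region.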
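First look at Azure Cognitive Services
Azure Cognitive Services documentation: https://docs.azure.cn/zh-cn/cognitive-services/
Following the documentation, we first install the Azure Speech library for Python: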
pip install azure-cognitiveservices-speech
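To confirm that the install worked before going further, the package version can be looked up from Python. This is a small sketch using only the standard library; the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package="azure-cognitiveservices-speech"):
    """Return the installed version string of package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("azure-cognitiveservices-speech:", installed_version() or "not installed")
```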
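Let's start by trying speech-to-text:
Testing speech-to-text
Documentation: https://docs.azure.cn/zh-cn/cognitive-services/speech-service/get-started-speech-to-text?tabs=windowsinstall&pivots=programming-language-python
After copying the official code, a small modification lets it recognize speech from the microphone: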
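import azure.cognitiveservices.speech as speechsdk

# Key and region copied from the "Keys and Endpoint" page of the Speech resource
speech_key, service_region = "59392xxxxxxxxxx559de", "chinaeast2"
speech_config = speechsdk.SpeechConfig(
    subscription=speech_key, region=service_region, speech_recognition_language="zh-cn")
# With no audio_config given, the recognizer listens on the default microphone
speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)

print("Say: ", end="")
# recognize_once() returns after a single utterance
result = speech_recognizer.recognize_once()
print(result.text)
speech_recognition_language sets the recognition language; here I set it to Chinese.
After running it, I said one sentence into the microphone, and the program recognized what I said accurately:
Say: 微軟人工智能服務非常好用。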
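The snippet above prints result.text without looking at result.reason, so a silent microphone or a bad key would just print an empty line. A hedged sketch of a small check follows; the helper text_from_result and its reasons parameter are hypothetical, and reasons is expected to be speechsdk.ResultReason (it is passed in only so the helper can be tried with stand-in objects, without a microphone).

```python
def text_from_result(result, reasons):
    """Return the recognized text, or None when nothing usable came back.

    reasons is expected to be speechsdk.ResultReason. Passing it in is a
    choice made for this sketch, not something the SDK requires.
    """
    if result.reason == reasons.RecognizedSpeech:
        return result.text
    if result.reason == reasons.NoMatch:
        print("No speech could be recognized:", result.no_match_details)
    elif result.reason == reasons.Canceled:
        print("Recognition canceled:", result.cancellation_details)
    return None


# Intended use with the recognizer from the post:
# text = text_from_result(speech_recognizer.recognize_once(), speechsdk.ResultReason)
```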
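Testing text-to-speech
Documentation: https://docs.azure.cn/zh-cn/cognitive-services/speech-service/get-started-text-to-speech?tabs=script%2Cwindowsinstall&pivots=programming-language-python
With the help of the documentation we could also save the synthesized speech to a file, but here I only demonstrate playing it directly through the speaker: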
from azure.cognitiveservices.speech import SpeechSynthesizer
from azure.cognitiveservices.speech.audio import AudioOutputConfig

# Reuses speechsdk and speech_config from the speech-to-text example
speech_config.speech_synthesis_language = "zh-cn"
# Play the synthesized audio on the default speaker
audio_config = AudioOutputConfig(use_default_speaker=True)
speech_synthesizer = SpeechSynthesizer(
    speech_config=speech_config, audio_config=audio_config)
text_words = "微軟人工智能服務非常好用。"
result = speech_synthesizer.speak_text_async(text_words).get()
# Print the reason only when synthesis did not complete
if result.reason != speechsdk.ResultReason.SynthesizingAudioCompleted:
    print(result.reason)
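For the file-saving variant mentioned before the snippet, one approach is to point AudioOutputConfig at a file name instead of the default speaker. The sketch below is an illustration under assumptions, not the author's code: the helper synthesize_to_file is hypothetical, and its sdk parameter is expected to be the azure.cognitiveservices.speech module (speechsdk in this post), passed in so the function can be exercised with a stand-in.

```python
def synthesize_to_file(sdk, speech_config, text, path):
    """Synthesize text into an audio file at path; return True on success.

    sdk is expected to be the azure.cognitiveservices.speech module
    (imported as speechsdk in the post).
    """
    # Direct the output to a file instead of the default speaker
    audio_config = sdk.audio.AudioOutputConfig(filename=path)
    synthesizer = sdk.SpeechSynthesizer(
        speech_config=speech_config, audio_config=audio_config)
    result = synthesizer.speak_text_async(text).get()
    return result.reason == sdk.ResultReason.SynthesizingAudioCompleted


# Intended use (output.wav is a made-up file name):
# synthesize_to_file(speechsdk, speech_config, "微軟人工智能服務非常好用。", "output.wav")
```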
The synthesized speech sounds very good.
Testing speech translation
Documentation: https://docs.azure.cn/zh-cn/cognitive-services/speech-service/get-started-speech-translation?tabs=script%2Cwindowsinstall&pivots=programming-language-python
In my tests, speech translation covers both speech-to-text and translation:
from_language, to_language = 'zh-cn', 'en'
translation_config = speechsdk.translation.SpeechTranslationConfig(
    subscription=speech_key, region=service_region, speech_recognition_language=from_language)
translation_config.add_target_language(to_language)
recognizer = speechsdk.translation.TranslationRecognizer(
    translation_config=translation_config)


def speakAndTranslation():
    """Recognize one utterance and return (source text, translation)."""
    result = recognizer.recognize_once()
    if result.reason == speechsdk.ResultReason.TranslatedSpeech:
        return result.text, result.translations[to_language]
    elif result.reason == speechsdk.ResultReason.RecognizedSpeech:
        # Speech was recognized but no translation was produced
        return result.text, None
    elif result.reason == speechsdk.ResultReason.NoMatch:
        print(result.no_match_details)
    elif result.reason == speechsdk.ResultReason.Canceled:
        print(result.cancellation_details)
    # Nothing usable was recognized; keep the return shape consistent
    return None, None


speakAndTranslation()
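Running this and speaking one sentence gives:
('大家好才是真的好。', 'Everyone is really good.')
Both the source text and the translation are available from a single call, so the speech translation tool built next uses this interface as well.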
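The result carries the source text in result.text and the translations in result.translations, keyed by target language code. As a sketch of how that shape could be consumed, the hypothetical helper below lists the source text followed by every translation present. The post only adds 'en' as a target; the sketch assumes that any further language added with add_target_language would appear as one more key.

```python
def describe_translation(result):
    """Return display lines: the source text, then one line per target language.

    result.translations is treated as a mapping from language code to
    translated text, as the indexing in the post suggests.
    """
    lines = ["source: " + result.text]
    for language, translated in sorted(result.translations.items()):
        lines.append(f"{language}: {translated}")
    return lines


# Intended use with the recognizer from the post:
# for line in describe_translation(recognizer.recognize_once()):
#     print(line)
```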
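Building the speech translator
The rough logic of the program: listen for one sentence, recognize and translate it, print both, stop if the sentence contains "退出" (quit), and otherwise read the translation aloud.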
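One way to read the control flow is as a small loop with the microphone and the speaker abstracted away. The sketch below is illustrative only: run_translator, listen, speak, quit_word, and history are hypothetical names, and listen is assumed to always return a (text, translation) pair in which either element may be None.

```python
def run_translator(listen, speak, quit_word="退出"):
    """Repeat listen -> print -> speak until the quit word is heard.

    listen() returns (text, translation); either may be None.
    speak(text) reads text aloud.
    Returns the list of (text, translation) pairs that were handled.
    """
    history = []
    while True:
        text, translation = listen()
        print("Say:", text)
        print("Translation:", translation)
        if not text:
            continue  # nothing recognized, listen again
        history.append((text, translation))
        if quit_word in text:
            break
        if translation:
            speak(translation)
    return history
```

Separating the loop from the SDK calls this way makes it possible to try the flow with scripted input before plugging in the real recognizer and synthesizer.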
Full code:
"""
Code by 小小明
CSDN home page: https://blog.csdn.net/as604049322
"""
__author__ = '小小明'
__time__ = '2021/10/30'

import azure.cognitiveservices.speech as speechsdk
from azure.cognitiveservices.speech.audio import AudioOutputConfig

speech_key, service_region = "59xxxxde", "chinaeast2"
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region,
                                       speech_recognition_language="zh-cn")
speech_config.speech_synthesis_language = "zh-cn"
# Play synthesized audio on the default speaker
audio_config = AudioOutputConfig(use_default_speaker=True)
speech_synthesizer = speechsdk.SpeechSynthesizer(
    speech_config=speech_config, audio_config=audio_config)

from_language, to_language = 'zh-cn', 'en'
translation_config = speechsdk.translation.SpeechTranslationConfig(
    subscription=speech_key, region=service_region, speech_recognition_language=from_language)
translation_config.add_target_language(to_language)
recognizer = speechsdk.translation.TranslationRecognizer(
    translation_config=translation_config)


def speakAndTranslation():
    """Recognize one utterance and return (source text, translation)."""
    result = recognizer.recognize_once()
    if result.reason == speechsdk.ResultReason.TranslatedSpeech:
        return result.text, result.translations[to_language]
    elif result.reason == speechsdk.ResultReason.RecognizedSpeech:
        return result.text, None
    elif result.reason == speechsdk.ResultReason.NoMatch:
        print(result.no_match_details)
    elif result.reason == speechsdk.ResultReason.Canceled:
        print(result.cancellation_details)
    # Nothing usable was recognized; keep the return shape consistent
    return None, None


def speak(text_words):
    """Read text_words aloud and report why synthesis was canceled, if it was."""
    result = speech_synthesizer.speak_text_async(text_words).get()
    if result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        print("Synthesis canceled:", cancellation_details.reason)
        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            if cancellation_details.error_details:
                print("Error details:", cancellation_details.error_details)


while True:
    print("Say:", end=" ")
    text, translation_text = speakAndTranslation()
    print(text)
    print("Translation:", translation_text)
    if not text:
        # Nothing was recognized; listen again
        continue
    if "退出" in text:  # saying "退出" (quit) ends the loop
        break
    if translation_text:
        speak(translation_text)
A quick run printed the following:
Say: 我只想進轉過山和大海。
Translation: I just want to go in and out of the mountains and the sea.
Say: 也穿越,人山人海。
Translation: Also through, the sea of people and mountains.
Say: 我曾經目睹這一切全部都隨風飄然。
Translation: I've seen it all blow in the wind.
Say: 轉眼成空。
Translation: It's empty.
Say: 問,世間能有幾多愁?
Translation: Q, how much worry can there be in the world?
Say: 退出。
Translation: quit.
As for the final spoken output, you will have to try it yourself.