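This post builds on "java實現微軟文本轉語音(TTS)經驗總結" (lessons learned implementing Microsoft text-to-speech (TTS) in Java) from ${簡簡單單}'s blog on CSDN. First, thanks to that author for providing the source code. I added detailed comments on top of it so it is easier to understand and use. It took me a whole afternoon of debugging to get the original working, so I have marked the details that cost me time to save you yours. Let's go straight to the code.
First, get a key from Microsoft's site so you can call the service. On the page that shows your key, pay close attention to the region assigned to you. Mine is eastus, which means East US. Remember your region, because it is needed later. The address shown under Endpoint on that page is the one we use to obtain the token.
Next, prepare a few helper classes. Copy all of the code into your own project and don't leave anything out: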
package com.daoversal.util;
public class ByteArray {
private byte[] data;
private int length;
public ByteArray(){
length = 0;
data = new byte[length];
}
public ByteArray(byte[] ba){
data = ba;
length = ba.length;
}
/**
 * Appends `length` bytes of `second`, starting at `offset`, growing the backing array when needed.
 */
public void cat(byte[] second, int offset, int length){
if(this.length + length > data.length) {
int allocatedLength = Math.max(data.length, length);
byte[] allocated = new byte[allocatedLength << 1];
System.arraycopy(data, 0, allocated, 0, this.length);
System.arraycopy(second, offset, allocated, this.length, length);
data = allocated;
}else {
System.arraycopy(second, offset, data, this.length, length);
}
this.length += length;
}
public void cat(byte[] second){
cat(second, 0, second.length);
}
public byte[] getArray(){
if(length == data.length){
return data;
}
byte[] ba = new byte[length];
System.arraycopy(data, 0, ba, 0, this.length);
data = ba;
return ba;
}
public int getLength(){
return length;
}
}
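ByteArray is essentially a growable byte buffer. As a point of comparison, here is a minimal sketch of the same read loop built on the JDK's own `ByteArrayOutputStream`. The class name `StreamBytes` and the method `readAll` are hypothetical and not part of the original post.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of the original post.
class StreamBytes {
    // Reads the stream to the end in 4 KB chunks and returns everything that was read.
    static byte[] readAll(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        try {
            while ((n = in.read(buf, 0, buf.length)) != -1) {
                out.write(buf, 0, n); // the stream grows its internal array as needed
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] source = new byte[10_000];
        for (int i = 0; i < source.length; i++) {
            source[i] = (byte) i;
        }
        byte[] copy = readAll(new ByteArrayInputStream(source));
        System.out.println(copy.length);
    }
}
```

Either approach works for the response body; the custom class above is kept because the rest of the code in this post uses it.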
package com.daoversal.util;
import javax.net.ssl.HttpsURLConnection;
import java.net.URL;
public class HttpsConnection {
public static HttpsURLConnection getHttpsConnection(String connectingUrl) throws Exception {
URL url = new URL(connectingUrl);
return (HttpsURLConnection) url.openConnection();
}
}
package com.daoversal.util;
import lombok.extern.slf4j.Slf4j;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import java.io.StringWriter;
@Slf4j
public class XmlDom {
public static String createDom(String locale, String genderName, String voiceName, String textToSynthesize){
Document doc = null;
Element speak, voice;
try {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = dbf.newDocumentBuilder();
doc = builder.newDocument();
if (doc != null){
speak = doc.createElement("speak");
speak.setAttribute("version", "1.0");
speak.setAttribute("xml:lang", "en-US");
voice = doc.createElement("voice");
voice.setAttribute("xml:lang", locale);
voice.setAttribute("xml:gender", genderName);
voice.setAttribute("name", voiceName);
voice.appendChild(doc.createTextNode(textToSynthesize));
speak.appendChild(voice);
doc.appendChild(speak);
}
} catch (ParserConfigurationException e) {
log.error("Create ssml document failed: {}",e.getMessage());
return null;
}
return transformDom(doc);
}
private static String transformDom(Document doc){
StringWriter writer = new StringWriter();
try {
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer;
transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
transformer.transform(new DOMSource(doc), new StreamResult(writer));
} catch (TransformerException e) {
log.error("Transform ssml document failed: {}",e.getMessage());
return null;
}
return writer.getBuffer().toString().replaceAll("\n|\r", "");
}
}
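To make it easier to see what the request body looks like, here is a rough sketch that builds SSML with the same elements and attributes as XmlDom, using plain string concatenation. `SsmlSketch`, `build` and `escapeXml` are hypothetical names, and the attribute order may differ from what the DOM transformer emits.

```java
// Hypothetical helper, not part of the original post.
class SsmlSketch {
    // Escapes the characters that would otherwise break the XML.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Builds a <speak><voice>text</voice></speak> document like XmlDom.createDom does.
    static String build(String locale, String gender, String voiceName, String text) {
        return "<speak version=\"1.0\" xml:lang=\"en-US\">"
                + "<voice xml:lang=\"" + escapeXml(locale) + "\""
                + " xml:gender=\"" + escapeXml(gender) + "\""
                + " name=\"" + escapeXml(voiceName) + "\">"
                + escapeXml(text)
                + "</voice></speak>";
    }

    public static void main(String[] args) {
        System.out.println(build("zh-CN", "Male", "zh-CN-YunxiNeural", "Tom & Jerry"));
    }
}
```

The DOM version in XmlDom does the escaping for you, which is one reason to prefer it over hand-built strings in real code.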
The next class deserves a closer look. Check your own parameters against this page: Text to speech API reference (REST) - Speech service - Azure AI services | Microsoft Learn
AUDIO_24KHZ_48KBITRATE_MONO_MP3: the audio output format (not a language setting). It is not critical; pick whichever format sounds best to you from the list of output formats on that page.
ACCESS_TOKEN_URI: the token address from the Endpoint field on the key page described at the start of this post. Copy it in as-is.
API_KEY: your own API key.
TTS_SERVICE_URI: this address must match the region assigned to you, otherwise the call fails with an authorization error:
Synthesis tts speech failed Server returned HTTP response code: 401 for URL: https://.........
Mine is East US (eastus), so I use the East US address.
package com.daoversal.util;
public class TtsConst {
/**
 * Audio output format (this one gave me the best results; try the others yourself).
 * Many formats are available; pick the one you need from:
 * https://learn.microsoft.com/en-us/azure/ai-services/speech-service/rest-text-to-speech?tabs=streaming
 */
public static final String AUDIO_24KHZ_48KBITRATE_MONO_MP3 = "audio-24khz-48kbitrate-mono-mp3";
/**
 * Token URL: the address shown under Endpoint on the key page; used to obtain the access token.
 */
public static final String ACCESS_TOKEN_URI = "your token endpoint URL";
/**
 * API key
 */
public static final String API_KEY = "your own key";
/**
 * TTS service URL; it must be the address for the region assigned to you.
 */
public static final String TTS_SERVICE_URI = "https://eastus.tts.speech.microsoft.com/cognitiveservices/v1/";
}
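Since the 401 error comes from a region mismatch, one option is to derive every address from a single region string. The sketch below is only an illustration: `SpeechEndpoints` and its methods are hypothetical, the TTS pattern is copied from TTS_SERVICE_URI above, and the token pattern is an assumption based on Microsoft's REST documentation, so confirm it against the Endpoint shown on your own key page.

```java
import java.net.URI;

// Hypothetical helper, not part of the original post.
class SpeechEndpoints {
    // Assumed token endpoint pattern; verify against your own key page.
    static String tokenUri(String region) {
        return "https://" + region + ".api.cognitive.microsoft.com/sts/v1.0/issueToken";
    }

    // Same shape as TTS_SERVICE_URI in TtsConst.
    static String ttsUri(String region) {
        return "https://" + region + ".tts.speech.microsoft.com/cognitiveservices/v1/";
    }

    // Host part of a URL, so a Host header can always match the service URL.
    static String hostOf(String url) {
        return URI.create(url).getHost();
    }

    public static void main(String[] args) {
        String region = "eastus";
        System.out.println(tokenUri(region));
        System.out.println(ttsUri(region));
        System.out.println(hostOf(ttsUri(region)));
    }
}
```

With this shape, switching regions means changing one string instead of hunting for every hard-coded "eastus".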
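Here is what the parameters mean:
textToSynthesize: the text to synthesize into speech.
locale: the language. Chinese is listed under embedded speech; you can find the language you need on these two pages:
Embedded Speech - Speech service - Azure AI services | Microsoft Learn
Language support - Speech service - Azure AI services | Microsoft Learn
gender: the speaker's gender; Male means a male voice.
voiceName: the name of the voice. Look up the one that matches your language on the Language support page; for Chinese, for example, zh-CN-YunxiNeural, which is the voice used in the test controller further down.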
package com.daoversal.util;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.lang3.StringUtils;
import org.springframework.stereotype.Component;
import javax.annotation.Resource;
import javax.net.ssl.HttpsURLConnection;
import java.io.DataOutputStream;
import java.io.InputStream;
@Slf4j
@Component
public class TtsService {
@Resource
private Authentication authentication;
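/**
 * Synthesizes audio.
 * @param textToSynthesize the text to synthesize (it is spoken as-is, not translated)
 * @param locale the language of the synthesized speech
 * @param gender the speaker's gender
 * @param voiceName the name of the voice
 * @return the audio bytes, or an empty array when the token or SSML body is unavailable
 */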
public byte[] genAudioBytes(String textToSynthesize, String locale, String gender, String voiceName) {
String accessToken = authentication.genAccessToken();
if (StringUtils.isEmpty(accessToken)) {
return new byte[0];
}
try {
HttpsURLConnection webRequest = HttpsConnection.getHttpsConnection(TtsConst.TTS_SERVICE_URI);
webRequest.setRequestProperty("Host", "eastus.tts.speech.microsoft.com");
webRequest.setRequestProperty("Content-Type", "application/ssml+xml");
webRequest.setRequestProperty("X-Microsoft-OutputFormat", TtsConst.AUDIO_24KHZ_48KBITRATE_MONO_MP3);
webRequest.setRequestProperty("Authorization", "Bearer " + accessToken);
webRequest.setRequestProperty("Ocp-Apim-Subscription-Key", TtsConst.API_KEY);
webRequest.setRequestProperty("User-Agent", "Mozilla/5.0");
webRequest.setRequestProperty("Accept", "*/*");
webRequest.setDoInput(true);
webRequest.setDoOutput(true);
webRequest.setConnectTimeout(5000);
webRequest.setReadTimeout(300000);
webRequest.setRequestMethod("POST");
String body = XmlDom.createDom(locale, gender, voiceName, textToSynthesize);
if (StringUtils.isEmpty(body)) {
return new byte[0];
}
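// Encode explicitly as UTF-8; the platform default charset can garble Chinese text
byte[] bytes = body.getBytes(java.nio.charset.StandardCharsets.UTF_8);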
webRequest.setRequestProperty("content-length", String.valueOf(bytes.length));
webRequest.connect();
DataOutputStream dop = new DataOutputStream(webRequest.getOutputStream());
dop.write(bytes);
dop.flush();
dop.close();
InputStream inSt = webRequest.getInputStream();
ByteArray ba = new ByteArray();
int rn2 = 0;
int bufferLength = 4096;
byte[] buf2 = new byte[bufferLength];
while ((rn2 = inSt.read(buf2, 0, bufferLength)) > 0) {
ba.cat(buf2, 0, rn2);
}
inSt.close();
webRequest.disconnect();
return ba.getArray();
} catch (Exception e) {
log.error("Synthesis tts speech failed {}", e.getMessage());
}
// Return an empty array like the other failure paths, so callers never receive null
return new byte[0];
}
}
package com.daoversal.util;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.lang3.StringUtils;
import org.redisson.api.RedissonClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
import javax.net.ssl.HttpsURLConnection;
import java.io.BufferedReader;
import java.io.DataOutputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.concurrent.TimeUnit;
/**
 * Obtains the access token, which every TTS call needs.
 * A token is valid for 10 minutes, but don't wait the full 10 minutes before refreshing it, or you risk using an expired token.
 */
@Component
@Slf4j
public class Authentication {
@Autowired
private RedissonClient redisson;
public String genAccessToken() {
InputStream inSt;
HttpsURLConnection webRequest;
try {
// Try the token cached in Redis first and use it if present; fetch a new one remotely only when none is cached. Adjust this to your own needs.
Object ob = redisson.getBucket("accessToken").get();
String accessToken = ob == null ? null : ob.toString();
if (StringUtils.isEmpty(accessToken)) {
webRequest = HttpsConnection.getHttpsConnection(TtsConst.ACCESS_TOKEN_URI);
webRequest.setDoInput(true);
webRequest.setDoOutput(true);
webRequest.setConnectTimeout(5000);
webRequest.setReadTimeout(5000);
webRequest.setRequestMethod("POST");
byte[] bytes = new byte[0];
webRequest.setRequestProperty("content-length", String.valueOf(bytes.length));
// The API key, obtained from Microsoft's site
webRequest.setRequestProperty("Ocp-Apim-Subscription-Key", TtsConst.API_KEY);
webRequest.connect();
DataOutputStream dop = new DataOutputStream(webRequest.getOutputStream());
dop.write(bytes);
dop.flush();
dop.close();
inSt = webRequest.getInputStream();
InputStreamReader in = new InputStreamReader(inSt);
BufferedReader bufferedReader = new BufferedReader(in);
StringBuilder strBuffer = new StringBuilder();
String line = null;
while ((line = bufferedReader.readLine()) != null) {
strBuffer.append(line);
}
bufferedReader.close();
in.close();
inSt.close();
webRequest.disconnect();
accessToken = strBuffer.toString();
// Got a new token: cache it in Redis with a 5-minute expiry
redisson.getBucket("accessToken").set(accessToken,5L, TimeUnit.MINUTES);
log.info("New tts access token {}", accessToken);
}
return accessToken;
} catch (Exception e) {
log.error("Generate tts access token failed {}", e.getMessage());
}
return null;
}
}
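If you don't have Redis available, the same "cache for 5 minutes, token lives for 10" idea can be sketched with a small in-memory cache. Everything here is hypothetical (`TokenCache`, the `fetcher` supplier that stands in for the real token request, and the injectable `clock` used to make the behaviour testable); it is a simplified illustration, not a replacement for the Redisson version above when several application instances share one token.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical in-memory alternative to the Redis cache, not part of the original post.
class TokenCache {
    private final Supplier<String> fetcher; // stands in for the real token request
    private final long ttlMillis;           // how long a fetched token is reused
    private final LongSupplier clock;       // current time in milliseconds
    private String token;
    private long expiresAt;

    TokenCache(Supplier<String> fetcher, long ttlMillis, LongSupplier clock) {
        this.fetcher = fetcher;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns the cached token, fetching a new one when none is cached or it has expired.
    synchronized String get() {
        long now = clock.getAsLong();
        if (token == null || now >= expiresAt) {
            token = fetcher.get(); // a null result is not cached, so the next call retries
            expiresAt = now + ttlMillis;
        }
        return token;
    }

    public static void main(String[] args) {
        TokenCache cache = new TokenCache(
                () -> "token-" + System.nanoTime(), 5 * 60 * 1000L, System::currentTimeMillis);
        String first = cache.get();
        String second = cache.get();
        System.out.println(first.equals(second)); // second call reuses the cached token
    }
}
```

Keeping the cache lifetime at half the token lifetime leaves a comfortable margin for slow requests and clock differences.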
The last step is calling it. Now you can test:
package com.daoversal.web;
import com.daoversal.util.TtsService;
import io.swagger.annotations.Api;
import io.swagger.annotations.ApiOperation;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import javax.annotation.Resource;
import java.io.*;
/**
 * <p>
 * Test controller for the TTS service
 * </p>
 *
 * @author HayDen
 * @since 03 22 10:44:13
 */
@RestController
@RequestMapping("/test")
@Api(value = "test")
public class TestController {
@Resource
private TtsService testService;
@PostMapping("/ttsService")
@ApiOperation(value = "Call ttsService", httpMethod = "POST" )
public void ttsService(String text) {
// English example: byte[] bte = testService.genAudioBytes(text,"en-US","Female","en-US-JennyNeural");
byte[] bte = testService.genAudioBytes(text,"zh-CN","Male","zh-CN-YunxiNeural");
String value = "hello.mp3";
convertByteArrayToFile(bte,value);
System.out.println("TTS audio written to " + value);
}
/**
 * Writes a byte[] to a file in the configured directory.
 * @param arr the bytes to write
 * @param value the file name
 */
public static void convertByteArrayToFile(byte[] arr,String value) {
try (
BufferedInputStream bis = new BufferedInputStream(new ByteArrayInputStream(arr));
// The directory where the converted file is stored
FileOutputStream fileOutputStream = new FileOutputStream("/Users/recovery/Downloads/"+value);
BufferedOutputStream bos = new BufferedOutputStream(fileOutputStream)
) {
int data;
while ((data = bis.read()) != -1) {
bos.write(data);
}
bos.flush();
} catch (IOException e) {
e.printStackTrace();
}
}
}
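The byte-by-byte copy above works, but the same result can be had with `java.nio.file.Files`. The sketch below is an assumption-laden alternative: `AudioFiles` and `save` are hypothetical names, and rejecting an empty array is my own choice so that a failed synthesis does not silently produce a zero-byte mp3.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original post.
class AudioFiles {
    // Writes the audio bytes to dir/fileName, creating the directory when missing.
    static Path save(byte[] audio, Path dir, String fileName) {
        if (audio == null || audio.length == 0) {
            throw new IllegalArgumentException("no audio data to write");
        }
        try {
            Files.createDirectories(dir);
            return Files.write(dir.resolve(fileName), audio);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("tts-demo");
        Path out = save(new byte[] {1, 2, 3}, dir, "hello.mp3");
        System.out.println(out + " " + Files.size(out));
    }
}
```

Passing the directory as a parameter also removes the hard-coded "/Users/recovery/Downloads/" path, which only exists on the original author's machine.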
One last thing to watch out for: if you choose English (en-US) but the input text is Chinese, the service will not translate it. So make sure the language setting matches your text. If you have questions, leave a comment; when I see it I will gladly explain everything I know.
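One way to avoid that mismatch is to choose the locale and voice from the text itself. The following is a rough heuristic only, assuming you handle just Chinese and English: `VoicePicker` is a hypothetical class, and the two voice names are the ones that appear in the test controller above.

```java
// Hypothetical helper, not part of the original post.
class VoicePicker {
    // True when the text contains at least one Han (Chinese) character.
    static boolean containsHan(String text) {
        return text.codePoints()
                .anyMatch(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN);
    }

    // Returns {locale, voiceName}: Chinese when Han characters are present, English otherwise.
    static String[] pick(String text) {
        return containsHan(text)
                ? new String[] {"zh-CN", "zh-CN-YunxiNeural"}
                : new String[] {"en-US", "en-US-JennyNeural"};
    }

    public static void main(String[] args) {
        String[] zh = pick("\u4f60\u597d"); // the Chinese greeting "ni hao"
        String[] en = pick("hello");
        System.out.println(zh[0] + " " + zh[1]);
        System.out.println(en[0] + " " + en[1]);
    }
}
```

Mixed-language text and other scripts need a more careful rule; this only covers the Chinese-versus-English case described here.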