Voice Cloning TTS

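1. Overview

Voice cloning trains on a set of specified recordings from a speaker to produce a dedicated voiceId (voice resource). The resulting voice closely resembles the real speaker, and AICloudTTSEngine (the cloud synthesis engine) can use it to play speech in that speaker's voice. For a detailed description of voice cloning, see Voice Cloning.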

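2. API Description

Training audio with AICloudVoiceCopyEngine (the voice cloning engine) produces a voiceId (voice resource) and a userId (user ID). The voiceId plays the same role as a built-in voice resource such as zhilingf (the premium sweet female voice Xiaoling). Voice cloning TTS is used slightly differently from online synthesis; the difference lies mainly in how the AICloudTTSIntent start parameters are built, for example: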

...
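// Set the user ID (optional); returned by the voice cloning engine's query-clone-task interface
cloudTTSIntent.setUserId("1000015896");
// Set the voice; returned by the voice cloning engine's query-clone-task interface
cloudTTSIntent.setSpeaker("5cb90a25cb51493e8d3ec93faf45ebbe");
// Set the cloud service address for voice cloning synthesis
cloudTTSIntent.setServer("http://tts.duiopen.com/api/v1/voicecopy/synthesize");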
...

2.1 Initialization

Initialize the engine and implement the AITTSListener callback interface to monitor the engine's synthesis and playback state.

AICloudTTSEngine mEngine = AICloudTTSEngine.createInstance();
mEngine.init(new AICloudTTSConfig.Builder()
                .setUseStopCallback(true) // Whether to call back onSpeechFinish after stop(); defaults to true
                .setUseCache(false)
                .build(),
        new AITTSListener() {
            @Override
            public void onInit(int status) {
                if (status == AIConstant.OPT_SUCCESS) {
                    Log.i(Tag, "Initialization succeeded!");
                } else {
                    Log.i(Tag, "Initialization failed!");
                }
            }

            @Override
            public void onError(String utteranceId, AIError error) {
            }

            @Override
            public void onReady(String utteranceId) {
            }

            @Override
            public void onCompletion(String utteranceId) {
                // Playback finished
            }

            @Override
            public void onProgress(int currentTime, int totalTime, boolean isRefTextTTSFinished) {
                // Playback in progress
            }

            @Override
            public void onSynthesizeStart(String utteranceId) {
                // Worker thread: synthesis started
            }

            @Override
            public void onSynthesizeDataArrived(String utteranceId, byte[] audioData) {
                // Worker thread: MP3 audio data; audioData.length == 0 means synthesis has ended
            }

            @Override
            public void onSynthesizeFinish(String utteranceId) {
                // Worker thread: synthesis finished
            }
        });

Input	Output
AICloudTTSConfig	Initialization, synthesis, and playback state
AITTSListener	Audio data, audio playback
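The comment in onSynthesizeDataArrived above says that a zero-length audioData marks the end of synthesis. A minimal, SDK-independent sketch of how those chunks could be accumulated is shown below. SynthesizedAudioCollector is a hypothetical helper, not part of the SDK, and treating a null chunk like an empty one is an assumption made here for safety.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper (not part of the SDK): buffers the audio chunks that
// onSynthesizeDataArrived delivers until the empty end-of-stream chunk arrives.
class SynthesizedAudioCollector {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private boolean finished = false;

    // Feed one chunk. Returns true once the end-of-stream marker has been seen.
    // Synchronized because the callback is documented as running on a worker thread.
    synchronized boolean onChunk(byte[] audioData) {
        if (audioData == null || audioData.length == 0) {
            // Assumption: a null chunk is handled like the documented empty chunk.
            finished = true;
        } else if (!finished) {
            buffer.write(audioData, 0, audioData.length);
        }
        return finished;
    }

    synchronized boolean isFinished() {
        return finished;
    }

    // Returns a copy of everything collected so far.
    synchronized byte[] toByteArray() {
        return buffer.toByteArray();
    }
}
```

Under these assumptions, the listener would call onChunk(audioData) from onSynthesizeDataArrived and read toByteArray() once it returns true.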

2.2 Start Synthesis

Build the engine start parameters and call the synthesis method:

AICloudTTSIntent cloudTTSIntent = new AICloudTTSIntent();
cloudTTSIntent.setTextType("text");

// Set the user ID (optional); returned by the voice cloning engine's query-clone-task interface
cloudTTSIntent.setUserId("1000015896");
// Set the voice; returned by the voice cloning engine's query-clone-task interface
cloudTTSIntent.setSpeaker("5cb90a25cb51493e8d3ec93faf45ebbe");
// Set the cloud service address for voice cloning synthesis
cloudTTSIntent.setServer("http://tts.duiopen.com/api/v1/voicecopy/synthesize");

cloudTTSIntent.setRealBack(true); // Whether to return audio in real time; defaults to true
cloudTTSIntent.setSpeed(1.0f); // Set the speech rate
cloudTTSIntent.setAudioType(AIConstant.TTS_AUDIO_TYPE_MP3); // Set the synthesized audio type; defaults to mp3
cloudTTSIntent.setMp3Quality(AIConstant.TTS_MP3_QUALITY_HIGH); // Set the cloud synthesis MP3 bitrate; supports low and high, defaults to low
cloudTTSIntent.setSampleRate(24000);
mEngine.synthesize(content.getText().toString(), "1025", cloudTTSIntent);

Input	Output
text	-
utteranceId	-
AICloudTTSIntent	Synthesized audio data, delivered through the callbacks registered at initialization

More start parameters:

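Method	Values	Description	Default
setServer	url	Set the request address; required for emotion synthesis: http://tts.duiopen.com/api/v1/voicecopy/synthesize
setSampleRate(int sampleRate)	16000, 24000	Set the sample rate; must be 24000 for emotion synthesis	24000
setMp3Quality(String mp3Quality)	low, high	Set the audio quality; must be high for emotion synthesis	low
setAudioType(String audioType)	mp3, wav, pcm, wav.alaw, opus	Audio format	mp3
setEmotion(String emotion)	default, happy, sorry, sad	Select the emotion for synthesis	Empty: emotion disabled, ordinary TTS synthesis
setSpeed(float speed)	(0.5,2]	Set the speech rate (0.5 is fast, 2.0 is slow)	1
setPitchChange(String pitchChange)	(-60,60]	Set the pitch	0
setVolume(int volume)	(1,100]	Set the volume (1 is quiet, 100 is loud)	50
setSpeaker(String speaker)	See the speaker list	Set the speaker; corresponds to the voiceId
setSaveAudioPath(String saveAudioPath)	Audio file path	Save the audio	Not saved unless set
setRealBack(boolean realBack)	true, false	Whether to return audio in real time	true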
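The ranges and emotion-synthesis requirements in the table above can be checked on the client before an intent is built. The sketch below is one way to do that and does not depend on the SDK. VoiceCopyParamCheck and its validate method are hypothetical names, the interval bounds are taken literally from the table (the open lower bounds may in practice be inclusive), and pitchChange is passed as an int here even though setPitchChange takes a String.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the SDK): pre-checks values against the
// ranges listed in the parameter table and returns a list of problems found.
class VoiceCopyParamCheck {
    static final String VOICE_COPY_SERVER =
            "http://tts.duiopen.com/api/v1/voicecopy/synthesize";

    static List<String> validate(float speed, int volume, int pitchChange, int sampleRate,
                                 String mp3Quality, String server, String emotion) {
        List<String> problems = new ArrayList<>();

        // Interval bounds are copied as written in the table.
        if (!(speed > 0.5f && speed <= 2.0f)) {
            problems.add("speed must be in (0.5, 2]");
        }
        if (!(volume > 1 && volume <= 100)) {
            problems.add("volume must be in (1, 100]");
        }
        if (!(pitchChange > -60 && pitchChange <= 60)) {
            problems.add("pitchChange must be in (-60, 60]");
        }
        if (sampleRate != 16000 && sampleRate != 24000) {
            problems.add("sampleRate must be 16000 or 24000");
        }
        if (!"low".equals(mp3Quality) && !"high".equals(mp3Quality)) {
            problems.add("mp3Quality must be low or high");
        }

        // An empty or null emotion means ordinary TTS synthesis.
        boolean emotionEnabled = emotion != null && !emotion.isEmpty();
        if (emotionEnabled) {
            if (!VOICE_COPY_SERVER.equals(server)) {
                problems.add("emotion synthesis requires the voicecopy server address");
            }
            if (sampleRate != 24000) {
                problems.add("emotion synthesis requires sampleRate 24000");
            }
            if (!"high".equals(mp3Quality)) {
                problems.add("emotion synthesis requires mp3Quality high");
            }
        }
        return problems;
    }
}
```

An empty list means the values are consistent with the table; the server remains the authority on what it accepts.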

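2.3 Start Playback

Call the speak method to synthesize and play. You need to specify the audio channel (the other parameters are the same as for synthesis):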

if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
    cloudTTSIntent.setAudioAttributes(new AudioAttributes.Builder()
                    .setUsage(AudioAttributes.USAGE_MEDIA)
                    .setContentType(AudioAttributes.CONTENT_TYPE_MUSIC)
                    .build());
} else {
    cloudTTSIntent.setStreamType(AudioManager.STREAM_MUSIC); // Set the audio stream used for playback; defaults to the music stream
}

mEngine.speak(content.getText().toString(), "1025", cloudTTSIntent);

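Developers should note the difference between the speak and synthesize interfaces:

synthesize: synthesis only
speak: after synthesis, the engine creates an internal player and plays the synthesized audio data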

2.4 Stop Synthesis

mEngine.stop();

2.5 Pause Playback

mEngine.pause();

2.6 Resume Playback

mEngine.resume();

2.7 Release the Engine

mEngine.release();

3. Error Codes

errId	errMsg	Description
011006	Product ID and voiceId do not match
011000	Request parameter error
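Both errId values above start with a zero, so they are best handled as strings rather than integers. The sketch below is a small lookup under that assumption. VoiceCopyErrorCodes is a hypothetical helper, not part of the SDK, and how the errId is read from an AIError is not shown because the source does not describe it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of the SDK): maps the documented errId
// strings to their messages. Keys are strings so leading zeros survive.
class VoiceCopyErrorCodes {
    private static final Map<String, String> MESSAGES = new HashMap<>();

    static {
        MESSAGES.put("011006", "Product ID and voiceId do not match");
        MESSAGES.put("011000", "Request parameter error");
    }

    // Returns the documented message, or a fallback for codes not in the table.
    static String describe(String errId) {
        String message = MESSAGES.get(errId);
        return message != null ? message : "Unknown error: " + errId;
    }
}
```

Parsing "011006" as an int would yield 11006, which no longer matches the documented code; keeping the key as a string avoids that.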

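4. FAQ

The network is working and the TTS configuration is correct, but the engine keeps reporting errId:70911, error: network error. Why?

Voice cloning TTS currently uses the HTTP protocol, whereas Android P (Android 9, API level 28) blocks cleartext HTTP traffic by default for security reasons. We recommend creating an xml directory under the res directory and adding a network_security_config.xml file there with the following content: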

<?xml version="1.0" encoding="utf-8"?>
<network-security-config>
    <base-config cleartextTrafficPermitted="true" />
</network-security-config>

Then add the following attribute to the application element in AndroidManifest.xml: android:networkSecurityConfig="@xml/network_security_config"

<application
    android:name="com.aispeech.SpeechApplication"
    android:icon="@mipmap/ic_launcher"
    android:networkSecurityConfig="@xml/network_security_config"
    android:label="AISpeech_sdk_samples">
</application>

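The cloud synthesis engine reports errId:011006, error: product ID and voiceId do not match. Why?

The cloud synthesis interface validates the product ID against the voiceId. Any voiceId that was not trained under the current product triggers error code 011006. Some private-cloud products may skip this check, in which case setting the voiceId alone is enough for synthesis to work.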
