The product demo is already done. The next step is to use an online speech API to convert text to speech, and then integrate that into the demo — so the task is not just calling a speech API, but wiring it into the product. Start with what currently looks like the best option:
Baidu's speech synthesis (TTS) API
Just discovered: it can also generate a temporary audio file and output it as a URL.
One drawback, though: it seems both a title and content are required before it will generate anything.
So first, register a Baidu developer account, then read the official docs in detail:
"Browser cross-origin (CORS)
The synthesis endpoint currently supports cross-origin requests from browsers.
CORS demo: https://github.com/Baidu-AIP/SPEECH-TTS-CORS
The token endpoint does NOT support browser CORS, so you need to obtain the token from your server side, or update it manually every 30 days."
So, get a token by opening this URL in a browser:
https://openapi.baidu.com/oauth/2.0/token?grant_type=client_credentials&client_id=SNjsggdYDNWtnlbKhxsPLcaz&client_secret=47d7c02dxxxxxxxxxxxxxxe7ba
which returns the token:

{
  "access_token": "24.569b3b5b470938a522ce60d2e2ea2506.2592000.1528015602.282335-11192483",
  "session_key": "9mzdDoR4p/oexxx0Yp9VoSgFCFOSGEIA==",
  "scope": "public audio_voice_assistant_get audio_tts_post wise_adapt lebo_resource_base lightservice_public hetu_basic lightcms_map_poi kaidian_kaidian ApsMisTest_Test权限 vis-classify_flower lpq_开放 cop_helloScope ApsMis_fangdi_permission smartapp_snsapi_base",
  "refresh_token": "25.5axxxx5-xxx3",
  "session_secret": "12xxxa",
  "expires_in": 2592000
}
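Since the token endpoint does not support browser CORS, in practice this fetch has to happen server-side. A minimal sketch of parsing the token response and tracking when it expires; the function name and the redacted sample values are my own, not from Baidu's SDK:

```python
import json
import time

def parse_token_response(resp_text):
    """Parse the JSON from Baidu's oauth/2.0/token endpoint and compute
    an absolute expiry timestamp from expires_in (seconds)."""
    resp = json.loads(resp_text)
    return {
        "access_token": resp["access_token"],
        "refresh_token": resp["refresh_token"],
        # expires_in of 2592000 seconds is exactly 30 days
        "expires_at": time.time() + resp["expires_in"],
    }

# sample shaped like the real response (values redacted)
sample = '{"access_token": "24.xxx", "refresh_token": "25.xxx", "expires_in": 2592000}'
token = parse_token_response(sample)
```

Storing `expires_at` lets the server refresh proactively instead of waiting for a 502 from the synthesis call.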
Then call the synthesis URL:
http://tsn.baidu.com/text2audio?lan=zh&ctp=1&cuid=xxx_robot&tok=24.56xxx3&vol=9&per=0&spd=5&pit=5&tex=as+a+book-collector%2c+i+have+the+story+you+just+want+to+listen!
which returns the synthesized mp3. Clearly, the response body here is the mp3 content itself,
not the hoped-for temporary mp3 URL.
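As the later code confirms, the text2audio endpoint signals success vs. failure via the response's Content-Type header: audio bytes on success, a JSON error document on failure. A sketch of how a wrapper might branch on that header (the helper name and return shape are my own):

```python
import json

def classify_tts_response(content_type, body):
    """Classify a text2audio response: Baidu returns audio/mp3 bytes on
    success and an application/json error document on failure.
    Returns (is_ok, mp3_bytes, err_dict)."""
    ct = content_type.lower()
    if ct.startswith("audio/"):
        return True, body, None
    if ct.startswith("application/json"):
        return False, None, json.loads(body)
    return False, None, {"err_msg": "unexpected content-type: %s" % content_type}

ok, mp3, err = classify_tts_response("audio/mp3", b"fake-mp3-bytes")
```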
Whereas the earlier CORS demo could return a temporary mp3 URL.
Searching for the host seen in those URLs:
boscdn bpc baidu.com
it looks like one of Baidu's CDN servers, and there is also a JS API for uploading content and generating a temporary URL — but that seems to be for Baidu's internal use only?
Next steps:
1. Ideally make the Baidu token permanent, or at least valid for a year, rather than the current one month.
2. Ideally expose the generated mp3 as a URL that can be returned to the user.
It feels like the way to do this is to wrap Baidu's API inside my own Flask REST API: offer the outside world one unified endpoint that returns a time-limited mp3 URL -> internally, save the mp3 under /tmp, or put it in redis with an expire time set?
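The /tmp idea can be sketched as follows; the folder name and helper are hypothetical, and the actual cleanup would be done later by a delayed task (as the final code does with celery) or a cron job:

```python
import os
import uuid
import tempfile

# hypothetical folder under the system temp dir for synthesized audio
AUDIO_TMP_DIR = os.path.join(tempfile.gettempdir(), "tts_audio")

def save_temp_mp3(mp3_bytes):
    """Save synthesized mp3 bytes under a random, unguessable name and
    return (filename, full_path); a periodic job would delete old files."""
    os.makedirs(AUDIO_TMP_DIR, exist_ok=True)
    filename = "%s.mp3" % uuid.uuid4()
    full_path = os.path.join(AUDIO_TMP_DIR, filename)
    with open(full_path, "wb") as f:
        f.write(mp3_bytes)
    return filename, full_path

name, path = save_temp_mp3(b"fake mp3 data")
```

The returned filename can then be turned into a URL served by the Flask app itself.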
First, work on the permanent-token question:
As for documentation, I found the relevant pages, including the Python docs.
Next, I wanted to simulate what the Baidu API returns when an access_token is invalid. It occurred to me that I could call it with the previous access_token — the one that should have been invalidated by the refresh_token refresh — and see what comes back.
It turned out the old token still worked.
Fine, then: just mangle the token value to simulate an invalid one.
The response:

{
  "err_detail": "Access token invalid or no longer valid",
  "err_msg": "authentication failed.",
  "err_no": 502,
  "err_subcode": 50004,
  "tts_logid": 1007366076
}
Error code explanations:
Error code | Meaning |
500 | input not supported |
501 | invalid input parameter |
502 | token verification failed |
503 | synthesis backend error |
Searching for Baidu's err_subcode 50004 ("authentication failed") suggests it means "Passport Not Login", i.e. not logged in to a Baidu passport account (listed alongside code 400).
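For later use in the wrapper, the documented top-level codes can go into a small lookup helper (the names here are mine, not from any Baidu SDK):

```python
# Top-level err_no values from http://ai.baidu.com/docs#/TTS-API/top
BAIDU_TTS_ERRORS = {
    500: "input not supported",
    501: "invalid input parameter",
    502: "token verification failed",
    503: "synthesis backend error",
}

def describe_tts_error(err_no):
    """Map a Baidu TTS err_no to a human-readable message."""
    return BAIDU_TTS_ERRORS.get(err_no, "unknown error %d" % err_no)
```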
So, the tentative approach is:
- Wrap Baidu's speech synthesis API with Flask, internally using the Python SDK (installed via pip).
- If the call returns a dict whose err_no is 502, the token is invalid or expired;
  use the refresh_token to obtain a fresh, valid token,
  then retry the call once.
- On success, the mp3 data is returned;
  then work out where to put it and how to generate an externally accessible URL.
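The refresh-on-502 flow above can be sketched with injected callables, so the control flow is testable without hitting the real API; every name here is hypothetical:

```python
BAIDU_ERR_TOKEN_INVALID = 502

def synthesize_with_retry(text, synthesize, refresh_token):
    """Call synthesize(text) -> (is_ok, mp3_bytes, err_no); on a 502
    token error, refresh the token once and retry. The two callables
    are injected so the flow can be exercised with fakes."""
    is_ok, mp3, err_no = synthesize(text)
    if not is_ok and err_no == BAIDU_ERR_TOKEN_INVALID:
        refresh_token()
        is_ok, mp3, err_no = synthesize(text)
    return is_ok, mp3, err_no

# usage sketch with stand-in callables (the real ones would hit Baidu):
state = {"calls": 0, "refreshed": False}

def fake_synthesize(text):
    state["calls"] += 1
    if not state["refreshed"]:
        return False, None, BAIDU_ERR_TOKEN_INVALID  # first call: expired token
    return True, b"mp3-bytes", 0

def fake_refresh():
    state["refreshed"] = True

is_ok, mp3, err_no = synthesize_with_retry("hello", fake_synthesize, fake_refresh)
```

Retrying exactly once avoids looping forever if the refreshed token is also rejected.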
For reference, the CORS demo's code does this:

// for parameter meanings see https://ai.baidu.com/docs#/TTS-API/41ac79a6
audio = btts({
    ...
    onSuccess: function(htmlAudioElement) {
        audio = htmlAudioElement;
        playBtn.innerText = '播放';
    },

So btts directly returns audio as an HTML element? Looking further:

document.body.append(audio);
audio.setAttribute('src', URL.createObjectURL(xhr.response));

It seems this creates something like a local file?
Searching for URL.createObjectURL turned up:
"A File object is a file — for example, each file uploaded through an input type="file" tag is a File object.
A Blob object is binary data — for example, an object created via new Blob() is a Blob; likewise, in XMLHttpRequest, if responseType is set to blob, the response is also a Blob."
So here the response is the mp3's binary data, as a blob; it is passed to createObjectURL, which generates a temporary object URL that can be used for playback.
-> So the API I wrap later could support two modes:
- return the mp3's URL directly
- return the mp3's binary data
with the return type chosen via an input parameter.
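A sketch of what that mode switch might look like on the server side; the helper name, the field names, and the base64 inlining for the "data" mode (a JSON-friendly stand-in for returning raw bytes) are all my own assumptions:

```python
import base64

def build_tts_response(mp3_bytes, audio_url, mode="url"):
    """Shape the API response depending on the requested mode:
    'url' returns a temporary file url; 'data' inlines the mp3 as
    base64 so it can travel inside a JSON body."""
    if mode == "url":
        return {"audioUrl": audio_url}
    elif mode == "data":
        return {"audioData": base64.b64encode(mp3_bytes).decode("ascii")}
    raise ValueError("unknown mode: %s" % mode)
```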
Then on to implementation: in the front-end web page, update the logic that previously only displayed the response text so that it also parses the returned mp3 (temporary file) URL and plays it through an audio player:
var curResponseDict = respJsonObj["data"]["response"];
console.log("curResponseDict=%s", curResponseDict);
var curResponseText = curResponseDict["text"];
console.log("curResponseText=%s", curResponseText);
$('#response_text p').text(curResponseText);
var curResponseAudioUrl = curResponseDict["audioUrl"];
console.log("curResponseAudioUrl=%s", curResponseAudioUrl);
if (curResponseAudioUrl) {
    console.log("now play the response text's audio %s", curResponseAudioUrl);
    var respTextAudioObj = $(".response_text_audio_player audio")[0];
    console.log("respTextAudioObj=%o", respTextAudioObj);
    $(".response_text_audio_player .col-sm-offset-1").text(curResponseText);
    $(".response_text_audio_player audio source").attr("src", curResponseAudioUrl);
    respTextAudioObj.load();
    console.log("has load respTextAudioObj=%o", respTextAudioObj);
    respTextAudioPromise = respTextAudioObj.play();
    // console.log("respTextAudioPromise=%o", respTextAudioPromise);
    if (respTextAudioPromise !== undefined) {
        respTextAudioPromise.then(() => {
            // Auto-play started
            console.log("Auto play audio started, respTextAudioPromise=%o", respTextAudioPromise);
        }).catch(error => {
            // Auto-play was prevented
            // Show a UI element to let the user manually start playback
            console.error("play response text's audio promise error=%o", error);
            // NotAllowedError: The request is not allowed by the user agent or the platform in the current context, possibly because the user denied permission.
        });
    }
}
With that, the returned text's audio can be played.
Then wait about one second before playing the requested (on-demand) file — so that ordering had to be solved first.
I also deliberately went on to improve error handling, displaying error details when something goes wrong; along the way:
[Solved] How to do string concatenation or formatting in js
and:
[Solved] How to get the detailed error info from the error callback of jQuery's ajax GET
[Summary]
In the end, the desired effect was achieved.
Backend:
Flask REST API, plus initialization and calls of the Baidu API:
app.py
from flask import Flask
from flask import jsonify
from flask_restful import Resource, Api, reqparse
import logging
from logging.handlers import RotatingFileHandler
from bson.objectid import ObjectId
from flask import send_file
import os
import io
import re
from urllib.parse import quote
import json
import uuid
from flask_cors import CORS
import requests
from celery import Celery

################################################################################
# Global Definitions
################################################################################

"""
http://ai.baidu.com/docs#/TTS-API/top
500 input not supported
501 invalid input parameter
502 token verification failed
503 synthesis backend error
"""
BAIDU_ERR_NOT_SUPPORT_PARAM = 500
BAIDU_ERR_PARAM_INVALID = 501
BAIDU_ERR_TOKEN_INVALID = 502
BAIDU_ERR_BACKEND_SYNTHESIS_FAILED = 503

################################################################################
# Global Variables
################################################################################

log = None
app = None

"""
{
    "access_token": "24.569bcccccccc11192484",
    "session_key": "9mxxxxxxEIB==",
    "scope": "public audio_voice_assistant_get audio_tts_post wise_adapt lebo_resource_base lightservice_public hetu_basic lightcms_map_poi kaidian_kaidian ApsMisTest_Test权限 vis-classify_flower lpq_开放 cop_helloScope ApsMis_fangdi_permission smartapp_snsapi_base",
    "refresh_token": "25.6acfxxxx2483",
    "session_secret": "121xxxxxfa",
    "expires_in": 2592000
}
"""
gCurBaiduRespDict = {} # get baidu token resp dict
gTempAudioFolder = ""

################################################################################
# Global Function
################################################################################

def generateUUID(prefix = ""):
    generatedUuid4 = uuid.uuid4()
    generatedUuid4Str = str(generatedUuid4)
    newUuid = prefix + generatedUuid4Str
    return newUuid

#----------------------------------------
# Audio Synthesis / TTS
#----------------------------------------

def createAudioTempFolder():
    """create folder to save later temp audio files"""
    global log, gTempAudioFolder
    # init audio temp folder for later store temp audio file
    audioTmpFolder = app.config["AUDIO_TEMP_FOLDER"]
    log.info("audioTmpFolder=%s", audioTmpFolder)
    curFolderAbsPath = os.getcwd() # '/Users/crifan/dev/dev_root/company/xxx/projects/robotDemo/server'
    log.info("curFolderAbsPath=%s", curFolderAbsPath)
    audioTmpFolderFullPath = os.path.join(curFolderAbsPath, audioTmpFolder)
    log.info("audioTmpFolderFullPath=%s", audioTmpFolderFullPath)
    if not os.path.exists(audioTmpFolderFullPath):
        os.makedirs(audioTmpFolderFullPath)
        log.info("++++++ Created tmp audio folder: %s", audioTmpFolderFullPath)
    gTempAudioFolder = audioTmpFolderFullPath
    log.info("gTempAudioFolder=%s", gTempAudioFolder)

def initAudioSynthesis():
    """
    init audio synthesis related: init token
    :return:
    """
    getBaiduToken()
    createAudioTempFolder()

def getBaiduToken():
    """get baidu token"""
    global app, log, gCurBaiduRespDict
    getBaiduTokenUrlTemplate = "https://openapi.baidu.com/oauth/2.0/token?grant_type=client_credentials&client_id=%s&client_secret=%s"
    getBaiduTokenUrl = getBaiduTokenUrlTemplate % (app.config["BAIDU_API_KEY"], app.config["BAIDU_SECRET_KEY"])
    log.info("getBaiduTokenUrl=%s", getBaiduTokenUrl)
    # https://openapi.baidu.com/oauth/2.0/token?grant_type=client_credentials&client_id=xxxz&client_secret=xxxx
    resp = requests.get(getBaiduTokenUrl)
    log.info("resp=%s", resp)
    respJson = resp.json()
    log.info("respJson=%s", respJson)
    #{'access_token': '24.xxx.2592000.1528609320.282335-11192484', 'session_key': 'xx+I/xx+6KwgZmw==', 'scope': 'public audio_voice_assistant_get audio_tts_post wise_adapt lebo_resource_base lightservice_public hetu_basic lightcms_map_poi kaidian_kaidian ApsMisTest_Test权限 vis-classify_flower lpq_开放 cop_helloScope ApsMis_fangdi_permission smartapp_snsapi_base', 'refresh_token': '25.xxx', 'session_secret': 'cxxx6e', 'expires_in': 2592000}
    if resp.status_code == 200:
        gCurBaiduRespDict = respJson
        log.info("get baidu token resp: %s", gCurBaiduRespDict)
    else:
        log.error("error while get baidu token: %s", respJson)
        #{'error': 'invalid_client', 'error_description': 'Client authentication failed'}
        #{'error': 'invalid_client', 'error_description': 'unknown client id'}
        #{'error': 'unsupported_grant_type', 'error_description': 'The authorization grant type is not supported'}

def refreshBaiduToken():
    """refresh baidu token when current token invalid"""
    global app, log, gCurBaiduRespDict
    if gCurBaiduRespDict:
        refreshBaiduTokenUrlTemplate = "https://openapi.baidu.com/oauth/2.0/token?grant_type=refresh_token&refresh_token=%s&client_id=%s&client_secret=%s"
        refreshBaiduTokenUrl = refreshBaiduTokenUrlTemplate % (gCurBaiduRespDict["refresh_token"], app.config["BAIDU_API_KEY"], app.config["BAIDU_SECRET_KEY"])
        log.info("refreshBaiduTokenUrl=%s", refreshBaiduTokenUrl)
        # https://openapi.baidu.com/oauth/2.0/token?grant_type=refresh_token&refresh_token=25.1xxxx.xx.1841379583.282335-11192483&client_id=Sxxxxz&client_secret=47dxxxxa
        resp = requests.get(refreshBaiduTokenUrl)
        log.info("resp=%s", resp)
        respJson = resp.json()
        log.info("respJson=%s", respJson)
        if resp.status_code == 200:
            gCurBaiduRespDict = respJson
            log.info("Ok to refresh baidu token response: %s", gCurBaiduRespDict)
        else:
            log.error("error while refresh baidu token: %s", respJson)
    else:
        log.error("Can't refresh baidu token for previous not get token")

def baiduText2Audio(unicodeText):
    """call baidu text2audio to generate mp3 audio from text"""
    global app, log, gCurBaiduRespDict
    log.info("baiduText2Audio: unicodeText=%s", unicodeText)
    isOk = False
    mp3BinData = None
    errNo = 0
    errMsg = "Unknown error"
    if not gCurBaiduRespDict:
        errMsg = "Need get baidu token before call text2audio"
        return isOk, mp3BinData, errNo, errMsg
    utf8Text = unicodeText.encode("utf-8")
    log.info("utf8Text=%s", utf8Text)
    encodedUtf8Text = quote(unicodeText)
    log.info("encodedUtf8Text=%s", encodedUtf8Text)
    # http://ai.baidu.com/docs#/TTS-API/top
    tex = encodedUtf8Text # text to synthesize, UTF-8 encoded; fewer than 512 Chinese characters or English digits (converted to GBK on Baidu's server, must be less than 1024 bytes)
    tok = gCurBaiduRespDict["access_token"] # developer access_token from the open platform (see the "authentication" section of the docs)
    cuid = app.config["FLASK_APP_NAME"] # unique user id for UV statistics; a MAC address or IMEI is suggested, up to 60 characters
    ctp = 1 # client type, fixed value 1 for web
    lan = "zh" # language, fixed value zh (currently only mixed Chinese/English mode)
    spd = 5 # speed, 0-9, default 5 (medium)
    pit = 5 # pitch, 0-9, default 5 (medium)
    # vol = 5 # volume, 0-9, default 5 (medium)
    vol = 9
    per = 0 # voice: 0 normal female, 1 normal male, 3 emotional (Du Xiaoyao), 4 emotional (Du Yaya); default female
    getBaiduSynthesizedAudioTemplate = "http://tsn.baidu.com/text2audio?lan=%s&ctp=%s&cuid=%s&tok=%s&vol=%s&per=%s&spd=%s&pit=%s&tex=%s"
    getBaiduSynthesizedAudioUrl = getBaiduSynthesizedAudioTemplate % (lan, ctp, cuid, tok, vol, per, spd, pit, tex)
    log.info("getBaiduSynthesizedAudioUrl=%s", getBaiduSynthesizedAudioUrl)
    # http://tsn.baidu.com/text2audio?lan=zh&ctp=1&cuid=RobotQA&tok=24.5f056b15e9d5da63256bac89f64f61b5.2592000.1528609737.282335-11192483&vol=5&per=0&spd=5&pit=5&tex=as%20a%20book-collector%2C%20i%20have%20the%20story%20you%20just%20want%20to%20listen%21
    resp = requests.get(getBaiduSynthesizedAudioUrl)
    log.info("resp=%s", resp)
    respContentType = resp.headers["Content-Type"]
    respContentTypeLowercase = respContentType.lower() # 'audio/mp3'
    log.info("respContentTypeLowercase=%s", respContentTypeLowercase)
    if respContentTypeLowercase == "audio/mp3":
        mp3BinData = resp.content
        log.info("resp content is binary data of mp3, length=%d", len(mp3BinData))
        isOk = True
        errMsg = ""
    elif respContentTypeLowercase == "application/json":
        """
        {'err_detail': 'Invalid params per or lan!', 'err_msg': 'parameter error.', 'err_no': 501, 'err_subcode': 50000, 'tts_logid': 642798357}
        {'err_detail': 'Invalid params per&pdt!', 'err_msg': 'parameter error.', 'err_no': 501, 'err_subcode': 50000, 'tts_logid': 1675521246}
        {'err_detail': 'Access token invalid or no longer valid', 'err_msg': 'authentication failed.', 'err_no': 502, 'err_subcode': 50004, 'tts_logid': 4221215043}
        """
        log.info("resp content is json -> occur error")
        isOk = False
        respDict = resp.json()
        log.info("respDict=%s", respDict)
        errNo = respDict["err_no"]
        errMsg = respDict["err_msg"] + " " + respDict["err_detail"]
    else:
        isOk = False
        errMsg = "Unexpected response content-type: %s" % respContentTypeLowercase
    return isOk, mp3BinData, errNo, errMsg

def doAudioSynthesis(unicodeText):
    """
    do audio synthesis from unicode text
    if failed for token invalid/expired, will refresh token to do one more retry
    """
    global app, log, gCurBaiduRespDict
    isOk = False
    audioBinData = None
    errMsg = ""
    # # for debug
    # gCurBaiduRespDict["access_token"] = "99.569b3b5b470938a522ce60d2e2ea2506.2592000.1528015602.282335-11192483"
    log.info("doAudioSynthesis: unicodeText=%s", unicodeText)
    isOk, audioBinData, errNo, errMsg = baiduText2Audio(unicodeText)
    log.info("isOk=%s, errNo=%d, errMsg=%s", isOk, errNo, errMsg)
    if isOk:
        errMsg = ""
        log.info("got synthesized audio binary data length=%d", len(audioBinData))
    else:
        if errNo == BAIDU_ERR_TOKEN_INVALID:
            log.warning("Token invalid -> refresh token")
            refreshBaiduToken()
            isOk, audioBinData, errNo, errMsg = baiduText2Audio(unicodeText)
            log.info("after refresh token: isOk=%s, errNo=%s, errMsg=%s", isOk, errNo, errMsg)
        else:
            log.warning("try synthesized audio occur error: errNo=%d, errMsg=%s", errNo, errMsg)
            audioBinData = None
    log.info("return isOk=%s, errMsg=%s", isOk, errMsg)
    if audioBinData:
        log.info("audio binary bytes=%d", len(audioBinData))
    return isOk, audioBinData, errMsg

def testAudioSynthesis():
    global app, log, gTempAudioFolder
    testInputUnicodeText = u"as a book-collector, i have the story you just want to listen!"
    isOk, audioBinData, errMsg = doAudioSynthesis(testInputUnicodeText)
    if isOk:
        audioBinDataLen = len(audioBinData)
        log.info("Now will save audio binary data %d bytes to file", audioBinDataLen)
        # 1. save mp3 binary data into tmp file
        newUuid = generateUUID()
        log.info("newUuid=%s", newUuid)
        tempFilename = newUuid + ".mp3"
        log.info("tempFilename=%s", tempFilename)
        if not gTempAudioFolder:
            createAudioTempFolder()
        tempAudioFullname = os.path.join(gTempAudioFolder, tempFilename) # '/Users/crifan/dev/dev_root/company/xxx/projects/robotDemo/server/tmp/audio/2aba73d1-f8d0-4302-9dd3-d1dbfad44458.mp3'
        log.info("tempAudioFullname=%s", tempAudioFullname)
        with open(tempAudioFullname, 'wb') as tmpAudioFp:
            log.info("tmpAudioFp=%s", tmpAudioFp)
            tmpAudioFp.write(audioBinData)
        log.info("Done to write audio data into file of %d bytes", audioBinDataLen)
        # 2. use celery to delay delete tmp file
    else:
        log.warning("Fail to get synthesis audio for errMsg=%s", errMsg)

#----------------------------------------
# Flask API
#----------------------------------------

def sendFile(fileBytes, contentType, outputFilename):
    """Flask API use this to send out file (to browser, browser can directly download file)"""
    return send_file(
        io.BytesIO(fileBytes),
        # io.BytesIO(fileObj.read()),
        mimetype=contentType,
        as_attachment=True,
        attachment_filename=outputFilename
    )

################################################################################
# Global Init App
################################################################################

app = Flask(__name__)
CORS(app)
# app.config.from_object('config.DevelopmentConfig')
app.config.from_object('config.ProductionConfig')

logFormatterStr = app.config["LOG_FORMAT"]
logFormatter = logging.Formatter(logFormatterStr)
fileHandler = RotatingFileHandler(
    app.config['LOG_FILE_FILENAME'],
    maxBytes=app.config["LOF_FILE_MAX_BYTES"],
    backupCount=app.config["LOF_FILE_BACKUP_COUNT"],
    encoding="UTF-8")
fileHandler.setLevel(logging.DEBUG)
fileHandler.setFormatter(logFormatter)
app.logger.addHandler(fileHandler)
app.logger.setLevel(logging.DEBUG) # set root log level
log = app.logger
log.info("app=%s", app)
# log.debug("app.config=%s", app.config)

api = Api(app)
log.info("api=%s", api)

celeryApp = Celery(app.name, broker=app.config['CELERY_BROKER_URL'])
celeryApp.conf.update(app.config)
log.info("celeryApp=%s", celeryApp)

aiContext = Context()
log.info("aiContext=%s", aiContext)

initAudioSynthesis()
# testAudioSynthesis()

...

#----------------------------------------
# Celery tasks
#----------------------------------------

# @celeryApp.task()
@celeryApp.task
# @celeryApp.task(name=app.config["CELERY_TASK_NAME"] + ".deleteTmpAudioFile")
def deleteTmpAudioFile(filename):
    """
    delete tmp audio file from filename
    eg: 98fc7c46-7aa0-4dd7-aa9d-89fdf516abd6.mp3
    """
    global log
    log.info("deleteTmpAudioFile: filename=%s", filename)
    audioTmpFolder = app.config["AUDIO_TEMP_FOLDER"]
    # audioTmpFolder = "tmp/audio"
    log.info("audioTmpFolder=%s", audioTmpFolder)
    curFolderAbsPath = os.getcwd() # '/Users/crifan/dev/dev_root/company/xxx/projects/robotDemo/server'
    log.info("curFolderAbsPath=%s", curFolderAbsPath)
    audioTmpFolderFullPath = os.path.join(curFolderAbsPath, audioTmpFolder)
    log.info("audioTmpFolderFullPath=%s", audioTmpFolderFullPath)
    tempAudioFullname = os.path.join(audioTmpFolderFullPath, filename) # '/Users/crifan/dev/dev_root/company/xxx/projects/robotDemo/server/tmp/audio/2aba73d1-f8d0-4302-9dd3-d1dbfad44458.mp3'
    if os.path.isfile(tempAudioFullname):
        os.remove(tempAudioFullname)
        log.info("Ok to delete file %s", tempAudioFullname)
    else:
        log.warning("No need to remove for not exist file %s", tempAudioFullname)

# log.info("deleteTmpAudioFile=%s", deleteTmpAudioFile)
# log.info("deleteTmpAudioFile.name=%s", deleteTmpAudioFile.name)
# log.info("celeryApp.tasks=%s", celeryApp.tasks)

#----------------------------------------
# Rest API
#----------------------------------------

class RobotQaAPI(Resource):
    def processResponse(self, respDict):
        """
        process response dict before return
        generate audio for response text part
        """
        global log, gTempAudioFolder
        tmpAudioUrl = ""
        unicodeText = respDict["data"]["response"]["text"]
        log.info("unicodeText=%s", unicodeText)
        if not unicodeText:
            log.info("No response text to do audio synthesis")
            return jsonify(respDict)
        isOk, audioBinData, errMsg = doAudioSynthesis(unicodeText)
        if isOk:
            audioBinDataLen = len(audioBinData)
            log.info("audioBinDataLen=%s", audioBinDataLen)
            # 1. save mp3 binary data into tmp file
            newUuid = generateUUID()
            log.info("newUuid=%s", newUuid)
            tempFilename = newUuid + ".mp3"
            log.info("tempFilename=%s", tempFilename)
            if not gTempAudioFolder:
                createAudioTempFolder()
            tempAudioFullname = os.path.join(gTempAudioFolder, tempFilename)
            log.info("tempAudioFullname=%s", tempAudioFullname) # 'xxx/tmp/audio/2aba73d1-f8d0-4302-9dd3-d1dbfad44458.mp3'
            with open(tempAudioFullname, 'wb') as tmpAudioFp:
                log.info("tmpAudioFp=%s", tmpAudioFp)
                tmpAudioFp.write(audioBinData)
            log.info("Saved %d bytes data into temp audio file %s", audioBinDataLen, tempAudioFullname)
            # 2. use celery to delay delete tmp file
            delayTimeToDelete = app.config["CELERY_DELETE_TMP_AUDIO_FILE_DELAY"]
            deleteTmpAudioFile.apply_async([tempFilename], countdown=delayTimeToDelete)
            log.info("Delay %s seconds to delete %s", delayTimeToDelete, tempFilename)
            # generate temp audio file url: /tmp/audio
            tmpAudioUrl = "http://%s:%d/tmp/audio/%s" % (
                app.config["FILE_URL_HOST"],
                app.config["FLASK_PORT"],
                tempFilename)
            log.info("tmpAudioUrl=%s", tmpAudioUrl)
            respDict["data"]["response"]["audioUrl"] = tmpAudioUrl
        else:
            log.warning("Fail to get synthesis audio for errMsg=%s", errMsg)
        log.info("respDict=%s", respDict)
        return jsonify(respDict)

    def get(self):
        respDict = {
            "code": 200,
            "message": "generate response ok",
            "data": {
                "input": "",
                "response": {
                    "text": "",
                    "audioUrl": ""
                },
                "control": "",
                "audio": {}
            }
        }
        parser = reqparse.RequestParser()
        # i want to hear the story of Baby Sister Says No
        parser.add_argument('input', type=str, help="input words")
        log.info("parser=%s", parser)
        parsedArgs = parser.parse_args()
        # log.info("parsedArgs=%s", parsedArgs)
        if not parsedArgs:
            respDict["data"]["response"]["text"] = "Can not recognize input"
            return self.processResponse(respDict)
        inputStr = parsedArgs["input"]
        log.info("inputStr=%s", inputStr)
        if not inputStr:
            respDict["data"]["response"]["text"] = "Can not recognize parameter input"
            return self.processResponse(respDict)
        respDict["data"]["input"] = inputStr
        aiResult = QueryAnalyse(inputStr, aiContext)
        log.info("aiResult=%s", aiResult)
        if aiResult["response"]:
            respDict["data"]["response"]["text"] = aiResult["response"]
        if aiResult["control"]:
            respDict["data"]["control"] = aiResult["control"]
        log.info('respDict["data"]=%s', respDict["data"])
        audioFileIdStr = aiResult["mediaId"]
        log.info("audioFileIdStr=%s", audioFileIdStr)
        if audioFileIdStr:
            audioFileObjectId = ObjectId(audioFileIdStr)
            log.info("audioFileObjectId=%s", audioFileObjectId)
            if fsCollection.exists(audioFileObjectId):
                audioFileObj = fsCollection.get(audioFileObjectId)
                log.info("audioFileObj=%s", audioFileObj)
                encodedFilename = quote(audioFileObj.filename)
                log.info("encodedFilename=%s", encodedFilename)
                respDict["data"]["audio"] = {
                    "contentType": audioFileObj.contentType,
                    "name": audioFileObj.filename,
                    "size": audioFileObj.length,
                    "url": "http://%s:%d/files/%s/%s" % (
                        app.config["FILE_URL_HOST"],
                        app.config["FLASK_PORT"],
                        audioFileObj._id,
                        encodedFilename)
                }
                log.info("respDict=%s", respDict)
                return self.processResponse(respDict)
            else:
                log.info("Can not find file from id %s", audioFileIdStr)
                respDict["data"]["audio"] = {}
                return self.processResponse(respDict)
        else:
            log.info("Not response file id")
            respDict["data"]["audio"] = {}
            return self.processResponse(respDict)

class GridfsAPI(Resource):
    def get(self, fileId, fileName=None):
        log.info("fileId=%s, fileName=%s", fileId, fileName)
        fileIdObj = ObjectId(fileId)
        log.info("fileIdObj=%s", fileIdObj)
        if not fsCollection.exists({"_id": fileIdObj}):
            respDict = {
                "code": 404,
                "message": "Can not find file from object id %s" % (fileId),
                "data": {}
            }
            return jsonify(respDict)
        fileObj = fsCollection.get(fileIdObj)
        log.info("fileObj=%s, filename=%s, chunkSize=%s, length=%s, contentType=%s",
                 fileObj, fileObj.filename, fileObj.chunk_size, fileObj.length, fileObj.content_type)
        log.info("lengthInMB=%.2f MB", float(fileObj.length / (1024 * 1024)))
        fileBytes = fileObj.read()
        log.info("len(fileBytes)=%s", len(fileBytes))
        outputFilename = fileObj.filename
        if fileName:
            outputFilename = fileName
        log.info("outputFilename=%s", outputFilename)
        return sendFile(fileBytes, fileObj.content_type, outputFilename)

class TmpAudioAPI(Resource):
    def get(self, filename=None):
        global gTempAudioFolder
        log.info("TmpAudioAPI: filename=%s", filename)
        tmpAudioFullPath = os.path.join(gTempAudioFolder, filename)
        log.info("tmpAudioFullPath=%s", tmpAudioFullPath)
        if not os.path.isfile(tmpAudioFullPath):
            log.warning("Not exists file %s", tmpAudioFullPath)
            respDict = {
                "code": 404,
                "message": "Can not find temp audio file %s" % filename,
                "data": {}
            }
            return jsonify(respDict)
        fileSize = os.path.getsize(tmpAudioFullPath)
        log.info("fileSize=%s", fileSize)
        with open(tmpAudioFullPath, "rb") as tmpAudioFp:
            fileBytes = tmpAudioFp.read()
        log.info("read out fileBytes length=%s", len(fileBytes))
        outputFilename = filename
        # contentType = "audio/mp3" # chrome use this
        contentType = "audio/mpeg" # most common and compatible
        return sendFile(fileBytes, contentType, outputFilename)

api.add_resource(PlaySongAPI, '/playsong', endpoint='playsong')
api.add_resource(RobotQaAPI, '/qa', endpoint='qa')
api.add_resource(GridfsAPI, '/files/<fileId>', '/files/<fileId>/<fileName>', endpoint='gridfs')
api.add_resource(TmpAudioAPI, '/tmp/audio/<filename>', endpoint='TmpAudio')

if __name__ == "__main__":
    app.run(
        host=app.config["FLASK_HOST"],
        port=app.config["FLASK_PORT"],
        debug=app.config["DEBUG"]
    )
config.py
class BaseConfig(object):
    DEBUG = False

    FLASK_PORT = 3xxxx
    # FLASK_HOST = "127.0.0.1"
    # FLASK_HOST = "localhost"
    # Note:
    # 1. to allow external access this server
    # 2. make sure here gunicorn parameter "bind" is same with here !!!
    FLASK_HOST = "0.0.0.0"

    # Flask app name
    FLASK_APP_NAME = "RobotQA"

    # Log File
    LOG_FILE_FILENAME = "logs/" + FLASK_APP_NAME + ".log"
    LOG_FORMAT = "[%(asctime)s %(levelname)s %(filename)s:%(lineno)d %(funcName)s] %(message)s"
    LOF_FILE_MAX_BYTES = 2*1024*1024
    LOF_FILE_BACKUP_COUNT = 10

    # host used in returned file urls
    # FILE_URL_HOST = FLASK_HOST
    FILE_URL_HOST = "127.0.0.1"

    # Audio Synthesis / TTS
    # BAIDU_APP_ID = "1xxx3"
    BAIDU_API_KEY = "Sxxxxz"
    BAIDU_SECRET_KEY = "4xxxxxa"
    AUDIO_TEMP_FOLDER = "tmp/audio"

    # CELERY_TASK_NAME = "Celery_" + FLASK_APP_NAME
    # CELERY_BROKER_URL = "redis://localhost"
    CELERY_BROKER_URL = "redis://localhost:6379/0"
    # CELERY_RESULT_BACKEND = "redis://localhost:6379/0" # currently not using results
    CELERY_DELETE_TMP_AUDIO_FILE_DELAY = 60 * 2 # two minutes

class DevelopmentConfig(BaseConfig):
    # DEBUG = True
    # for local dev, need access remote mongodb
    MONGODB_HOST = "47.xx.xx.xx"
    FILE_URL_HOST = "127.0.0.1"

class ProductionConfig(BaseConfig):
    FILE_URL_HOST = "47.xx.xx.xx"
Frontend:
main.html
<!doctype html>
<html lang="en">
<head>
    <!-- Required meta tags -->
    <meta charset="utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <!-- <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"> -->

    <!-- Bootstrap CSS -->
    <link rel="stylesheet" href="css/bootstrap-3.3.1/bootstrap.css">
    <!-- <link rel="stylesheet" href="css/highlightjs_default.css"> -->
    <link rel="stylesheet" href="css/highlight_atom-one-dark.css">
    <!-- <link rel="stylesheet" href="css/highlight_monokai-sublime.css"> -->
    <link rel="stylesheet" href="css/bootstrap3_player.css">
    <link rel="stylesheet" href="css/main.css">

    <title>xxx英语智能机器人演示</title>

    <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
    <!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
    <!--[if lt IE 9]>
        <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
        <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
    <![endif]-->
</head>
<body>
    <div class="logo text-center">
        <img class="mb-4" src="img/logo_transparent_183x160.png" alt="xxx Logo" width="72" height="72">
    </div>
    <h2>xxx英语智能机器人</h2>
    <h4>xxx Bot for Kids</h4>

    <div class="panel panel-primary">
        <div class="panel-heading">
            <h3 class="panel-title">Input</h3>
        </div>
        <div class="panel-body">
            <ul class="list-group">
                <li class="list-group-item">
                    <h3 class="panel-title">Input Example</h3>
                    <ul>
                        <li>i want to hear the story of apple</li>
                        <li>say story apple</li>
                        <li>say apple</li>
                        <li>next episode</li>
                        <li>next</li>
                        <li>i want you stop reading</li>
                        <li>stop reading</li>
                        <li>please go on</li>
                        <li>go on</li>
                    </ul>
                </li>
                <li class="list-group-item">
                    <!--
                    <form>
                        <div class="form-group input_request">
                            <input id="inputRequest" type="text" class="form-control" placeholder="请输入您要说的话" value="i want to hear the story of apple">
                        </div>
                        <div class="form-group">
                            <button id="submitInput" type="submit" class="btn btn-primary btn-lg col-sm-3 btn-block">提交</button>
                            <button id="clearInput" class="btn btn-secondary btn-lg col-sm-3" type="button">清除</button>
                            <button id="clearInput" class="btn btn-info btn-lg col-sm-3 btn-block" type="button">清除</button>
                        </div>
                    </form>
                    -->
                    <div class="row">
                        <div class="col-lg-12">
                            <div class="input-group">
                                <input id="inputRequest" type="text" class="form-control" placeholder="请输入您要说的话" value="say apple">
                                <span class="input-group-btn">
                                    <button id="submitInput" type="submit" class="btn btn-primary">提交</button>
                                </span>
                            </div><!-- /input-group -->
                        </div><!-- /.col-lg-6 -->
                    </div>
                </li>
            </ul>
        </div>
    </div>

    <!--
    <div class="input_example bg-light box-shadow">
        <h5>Input Example:</h5>
        <ul>
            <li>i want to hear the story of apple</li>
            <li>next episode</li>
            <li>i want you stop reading</li>
            <li>please go on</li>
        </ul>
    </div>
    -->

    <div class="panel panel-success">
        <div class="panel-heading">
            <h3 class="panel-title">Output</h3>
        </div>
        <div class="panel-body">
            <div id="response_text" class="alert alert-success" role="alert">
                <p>here will output response text</p>
                <div class="response_text_audio_player">
                    <audio controls data-info-att="response text's audio">
                        <source src="" type="audio/mpeg" />
                    </audio>
                </div>
            </div>
            <div class="audio_player col-md-12 col-xs-12">
                <audio controls data-info-att="">
                    <source src="" type="" />
                </audio>
            </div>
            <!--
            <div id="audio_play_prevented" class="alert alert-warning alert-dismissible col-md-12 col-xs-12">
                <button type="button" class="close" data-dismiss="alert">x</button>
                <strong>Notice:</strong> Auto play prevented, please manually click above play button to play
            </div>
            -->
            <!--
            <div id="response_json" class="bg-light box-shadow">
                <pre><code class="json">here will output response</code></pre>
            </div>
            -->
            <!--
            <pre id="response_json">
                <code class="json">here will output response</code>
            </pre>
            -->
            <div id="response_json">
                <code class="json">here will output response</code>
            </div>
        </div>
    </div>

    <!-- Optional JavaScript -->
    <!-- jQuery first, then Popper.js, then Bootstrap JS -->
    <!-- <script src="js/jquery-3.3.1.js"></script> -->
    <!-- <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script> -->
    <script src="js/jquery/1.11.1/jquery-1.11.1.js"></script>
    <!-- <script src="https://code.jquery.com/jquery-3.3.1.slim.min.js" integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo" crossorigin="anonymous"></script> -->
    <!-- <script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.0/umd/popper.min.js" integrity="sha384-cs/chFZiN24E4KMATLdqdvsezGxaGsi4hLGOzlXwp5UZB1LY//20VyM2taTB4QvJ" crossorigin="anonymous"></script> -->
    <script src="js/popper-1.14.0/popper.min.js"></script>
    <!-- <script src="js/bootstrap.js"></script> -->
    <script src="js/bootstrap-3.3.1/bootstrap.min.js"></script>
    <script src="js/highlight.js"></script>
    <script src="js/bootstrap3_player.js"></script>
    <script src="js/main.js"></script>
</body>
</html>
main.css
.logo {
    padding: 10px 2%;
}

h2 {
    text-align: center;
    margin-top: 10px;
    margin-bottom: 10px;
}

h4 {
    text-align: center;
    margin-top: 0px;
    margin-bottom: 20px;
}

form {
    text-align: center;
}

.form-group {
    /*padding-left: 1%;*/
    /*padding-right: 1%;*/
}

.input_example {
    /*padding: 1px 1%;*/
}

#response_json {
    /*width: 96%;*/
    height: 380px;
    border-radius: 10px;
    padding-top: 20px;
    /*padding-left: 1%;*/
    /*padding-right: 1%;*/
}

#response_text {
    text-align: center !important;
    font-size: 14px;
    /* padding-left: 4%; padding-right: 4%; */
}

/*pre {*/
    /*padding-left: 2%;*/
    /*padding-right: 2%;*/
/*}*/

.audio_player {
    margin-top: 10px;
    margin-bottom: 5px;
    text-align: center;
    padding-left: 0 !important;
    padding-right: 0 !important;
}

.response_text_audio_player {
    /* visibility: hidden; */
    width: 100%;
    /* height: 1px !important; */
    height: 100px;
}

/* #audio_play_prevented { display: none; } */
main.js
if (!String.format) {
    String.format = function(format) {
        var args = Array.prototype.slice.call(arguments, 1);
        return format.replace(/{(\d+)}/g, function(match, number) {
            return typeof args[number] != 'undefined' ? args[number] : match;
        });
    };
}

$(document).ready(function() {
    $('[data-toggle="tooltip"]').tooltip();

    // after receiving the response json, re-run syntax highlighting on it
    function updateHighlight() {
        console.log("updateHighlight");
        $('pre code').each(function(i, block) {
            hljs.highlightBlock(block);
        });
    }

    updateHighlight();

    $("#submitInput").click(function(event) {
        event.preventDefault();
        ajaxSubmitInput();
    });

    function ajaxSubmitInput() {
        console.log("ajaxSubmitInput");

        var inputRequest = $("#inputRequest").val();
        console.log("inputRequest=%s", inputRequest);
        var encodedInputRequest = encodeURIComponent(inputRequest);
        console.log("encodedInputRequest=%s", encodedInputRequest);

        // var qaUrl = "http://127.0.0.1:32851/qa";
        var qaUrl = "http://xxx:32851/qa";
        console.log("qaUrl=%s", qaUrl);
        var fullQaUrl = qaUrl + "?input=" + encodedInputRequest;
        console.log("fullQaUrl=%s", fullQaUrl);

        $.ajax({
            type: "GET",
            url: fullQaUrl,
            success: function(respJsonObj) {
                console.log("respJsonObj=%o", respJsonObj);
                // var respJsonStr = JSON.stringify(respJsonObj);
                // var beautifiedRespJsonStr = JSON.stringify(respJsonObj, null, '\t');
                var beautifiedRespJsonStr = JSON.stringify(respJsonObj, null, 2);
                console.log("beautifiedRespJsonStr=%s", beautifiedRespJsonStr);

                var prevOutputValue = $('#response_json').text();
                console.log("prevOutputValue=%o", prevOutputValue);
                var afterOutputValue = $('#response_json').html('<pre><code class="json">' + beautifiedRespJsonStr + '</code></pre>');
                console.log("afterOutputValue=%o", afterOutputValue);
                updateHighlight();

                var curResponseDict = respJsonObj["data"]["response"];
                console.log("curResponseDict=%s", curResponseDict);
                var curResponseText = curResponseDict["text"];
                console.log("curResponseText=%s", curResponseText);
                $('#response_text p').text(curResponseText);

                var curResponseAudioUrl = curResponseDict["audioUrl"];
                console.log("curResponseAudioUrl=%s", curResponseAudioUrl);
                if (curResponseAudioUrl) {
                    console.log("now play the response text's audio %s", curResponseAudioUrl);
                    var respTextAudioObj = $(".response_text_audio_player audio")[0];
                    console.log("respTextAudioObj=%o", respTextAudioObj);
                    $(".response_text_audio_player .col-sm-offset-1").text(curResponseText);
                    $(".response_text_audio_player audio source").attr("src", curResponseAudioUrl);
                    respTextAudioObj.load();
                    console.log("has load respTextAudioObj=%o", respTextAudioObj);

                    // after the response text's audio finishes, handle the main audio player
                    respTextAudioObj.onended = function() {
                        console.log("play response text's audio ended");

                        var dataControl = respJsonObj["data"]["control"];
                        console.log("dataControl=%o", dataControl);
                        var audioElt = $(".audio_player audio");
                        console.log("audioElt=%o", audioElt);
                        var audioObject = audioElt[0];
                        console.log("audioObject=%o", audioObject);

                        var playAudioPromise = undefined;
                        if (dataControl === "stop") {
                            // audioObject.stop();
                            audioObject.pause();
                            console.log("has pause audioObject=%o", audioObject);
                        } else if (dataControl === "continue") {
                            playAudioPromise = audioObject.play();
                        }

                        if (respJsonObj["data"]["audio"]) {
                            var audioDict = respJsonObj["data"]["audio"];
                            console.log("audioDict=%o", audioDict);
                            var audioName = audioDict["name"];
                            console.log("audioName=%o", audioName);
                            var audioSize = audioDict["size"];
                            console.log("audioSize=%o", audioSize);
                            var audioType = audioDict["contentType"];
                            console.log("audioType=%o", audioType);
                            var audioUrl = audioDict["url"];
                            console.log("audioUrl=%o", audioUrl);
                            var isAudioEmpty = (!audioName && !audioSize && !audioType && !audioUrl);
                            console.log("isAudioEmpty=%o", isAudioEmpty);
                            if (isAudioEmpty) {
                                // var pauseAudioResult = audioObject.pause();
                                // console.log("pauseAudioResult=%o", pauseAudioResult);
                                audioElt.attr("data-info-att", "");
                                // $(".col-sm-offset-1").text("");
                            } else {
                                if (audioName) {
                                    audioElt.attr("data-info-att", audioName);
                                    $(".audio_player .col-sm-offset-1").text(audioName);
                                }
                                if (audioType) {
                                    $(".audio_player audio source").attr("type", audioType);
                                }
                                if (audioUrl) {
                                    $(".audio_player audio source").attr("src", audioUrl);
                                    audioObject.load();
                                    console.log("has load audioObject=%o", audioObject);
                                }
                                console.log("dataControl=%s,audioUrl=%s", dataControl, audioUrl);
                                if (((dataControl === "") || (dataControl === "next")) && audioUrl) {
                                    playAudioPromise = audioObject.play();
                                }
                            }
                        } else {
                            console.log("empty respJsonObj['data']['audio']=%o", respJsonObj["data"]["audio"]);
                        }

                        if (playAudioPromise !== undefined) {
                            playAudioPromise.then(() => {
                                // Auto-play started
                                console.log("Auto play audio started, playAudioPromise=%o", playAudioPromise);
                                // for debug
                                // showAudioPlayPreventedNotice();
                            }).catch(error => {
                                // Auto-play was prevented: show a UI element so the user can start playback manually
                                showAudioPlayPreventedNotice();
                                console.error("play audio promise error=%o", error);
                                // NotAllowedError: The request is not allowed by the user agent or the platform in the current context, possibly because the user denied permission.
                            });
                        }
                    };

                    var respTextAudioPromise = respTextAudioObj.play();
                    // console.log("respTextAudioPromise=%o", respTextAudioPromise);
                    if (respTextAudioPromise !== undefined) {
                        respTextAudioPromise.then(() => {
                            // Auto-play started
                            console.log("Auto play audio started, respTextAudioPromise=%o", respTextAudioPromise);
                        }).catch(error => {
                            // Auto-play was prevented: the user must start playback manually
                            console.error("play response text's audio promise error=%o", error);
                            // NotAllowedError: The request is not allowed by the user agent or the platform in the current context, possibly because the user denied permission.
                        });
                    }
                }
            },
            error: function(jqXHR, textStatus, errorThrown) {
                console.error("jqXHR=%o, textStatus=%s, errorThrown=%s", jqXHR, textStatus, errorThrown);
                var errDetail = String.format("status={0}\n\tstatusText={1}\n\tresponseText={2}", jqXHR.status, jqXHR.statusText, jqXHR.responseText);
                var errStr = String.format("GET: {0}\nERROR:\t{1}", fullQaUrl, errDetail);
                // $('#response_text p').text(errStr);
                var responseError = $('#response_json').html('<pre><code class="html">' + errStr + '</code></pre>');
                console.log("responseError=%o", responseError);
                updateHighlight();
            }
        });
    }

    function showAudioPlayPreventedNotice() {
        console.log("showAudioPlayPreventedNotice");
        var curAudioPlayPreventedNoticeEltHtml = $("#audio_play_prevented").html();
        console.log("curAudioPlayPreventedNoticeEltHtml=%o", curAudioPlayPreventedNoticeEltHtml);
        if (curAudioPlayPreventedNoticeEltHtml !== undefined) {
            console.log("audio play prevented notice already exists, so not inserting it again");
        } else {
            var audioPlayPreventedNoticeHtml = '<div id="audio_play_prevented" class="alert alert-warning alert-dismissible col-md-12 col-xs-12"><button type="button" class="close" data-dismiss="alert">x</button><strong>Notice:</strong> Auto play was prevented, please manually click the play button above to play</div>';
            console.log("audioPlayPreventedNoticeHtml=%o", audioPlayPreventedNoticeHtml);
            $(".audio_player").append(audioPlayPreventedNoticeHtml);
        }
    }

    $("#clearInput").click(function(event) {
        // event.preventDefault();
        console.log("event=%o", event);
        $('#inputRequest').val("");
        $('#response_json').html('<pre><code class="json">here will output response</code></pre>');
        updateHighlight();
    });
});
Result:
After clicking Submit, the backend generates a temporary mp3 file and returns its URL to the frontend, which can load and play it normally:
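The backend flow described above (wrap the Baidu TTS call behind our own Flask REST API, save the synthesized mp3 to /tmp, hand the frontend a short-lived URL) can be sketched roughly as below. This is a minimal stdlib-only sketch under stated assumptions, not the demo's actual code: the helper names (`save_temp_mp3`, `purge_expired`), the `/tmp/tts_cache` directory, the `/audio/` URL prefix, and the 300-second TTL are all hypothetical, and the real service would first fetch the mp3 bytes from Baidu's `text2audio` endpoint.

```python
import os
import time
import uuid

# Hypothetical cache directory and lifetime for generated mp3 files
CACHE_DIR = "/tmp/tts_cache"
TTL_SECONDS = 300


def save_temp_mp3(mp3_bytes):
    """Save synthesized mp3 bytes under a random name and return the
    relative URL the frontend would put into the <audio> source."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    name = uuid.uuid4().hex + ".mp3"
    with open(os.path.join(CACHE_DIR, name), "wb") as f:
        f.write(mp3_bytes)
    # a Flask route would serve files under CACHE_DIR at this path
    return "/audio/" + name


def purge_expired(now=None):
    """Delete cached mp3 files older than TTL_SECONDS; return the
    names of the files that were removed."""
    now = time.time() if now is None else now
    removed = []
    if not os.path.isdir(CACHE_DIR):
        return removed
    for name in os.listdir(CACHE_DIR):
        path = os.path.join(CACHE_DIR, name)
        if now - os.path.getmtime(path) > TTL_SECONDS:
            os.remove(path)
            removed.append(name)
    return removed
```

In the real service, a Flask route such as `GET /audio/<name>` could serve the file with `send_from_directory(CACHE_DIR, name)`, and `purge_expired` could run periodically; storing the bytes in Redis with an `EXPIRE` on the key, as considered earlier, is the equivalent alternative.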
Please credit the source when reposting: 在路上 » [Solved] Use a suitable online speech synthesis API to convert text to speech