[Solved] Converting text to speech with a suitable online speech synthesis API

API | crifan | 1552 views | 0 comments
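I had already built the product demo:
[Solved] Deploying a local front-end page to an online Flask environment
The next step is to call an online speech API to convert text to speech,
and then integrate it into the product demo.
So the speech API not only has to work on its own, it also has to be wired into the demo.
[Research] Good-quality online web speech synthesis APIs for English that are usable in China
I started with the one that looked best among those found so far:
Baidu's speech synthesis API
I had also just found:
Online text-to-speech | Free speech generation - Baidu Broadcast Open Platform
It can also generate a temporary audio file and return it as a URL.
It has one drawback, though:
it seems that both a title and a body are required before it will generate anything.
So first, register a Baidu developer account:
[Record] Registering a Baidu developer account and creating an application
Then:
Baidu Speech - Documentation Center
Read the official documentation in detail:
Baidu Speech Synthesis - Developer Documentation
"Browser cross-origin access
The synthesis endpoint currently supports cross-origin requests from browsers.
Cross-origin demo: https://github.com/Baidu-AIP/SPEECH-TTS-CORS
The token endpoint does not support cross-origin requests from browsers, so you need to obtain the token on the server side or enter an updated one manually every 30 days."
Baidu-AIP/SPEECH-TTS-CORS: Baidu Speech synthesis cross-origin demo and support library
SPEECH-TTS-CORS/baidu_tts_cors.js at master · Baidu-AIP/SPEECH-TTS-CORS
Example call flow
First, get a token
by opening this URL in a browser: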
https://openapi.baidu.com/oauth/2.0/token?grant_type=client_credentials&client_id=SNjsggdYDNWtnlbKhxsPLcaz&client_secret=47d7c02dxxxxxxxxxxxxxxe7ba
This returned the token:
{
    "access_token": "24.569b3b5b470938a522ce60d2e2ea2506.2592000.1528015602.282335-11192483",
    "session_key": "9mzdDoR4p/oexxx0Yp9VoSgFCFOSGEIA==",
    "scope": "public audio_voice_assistant_get audio_tts_post wise_adapt lebo_resource_base lightservice_public hetu_basic lightcms_map_poi kaidian_kaidian ApsMisTest_Test权限 vis-classify_flower lpq_开放 cop_helloScope ApsMis_fangdi_permission smartapp_snsapi_base",
    "refresh_token": "25.5axxxx5-xxx3",
    "session_secret": "12xxxa",
    "expires_in": 2592000
}
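The token request above can also be assembled in code. The following is a minimal sketch, not the post's actual implementation: `build_token_url` and `token_lifetime_days` are hypothetical helper names, and only the endpoint and query parameters are taken from the URL shown above. It also makes it easy to see that `expires_in` of 2592000 seconds is 30 days.

```python
from urllib.parse import urlencode

# Endpoint taken from the URL shown above.
TOKEN_ENDPOINT = "https://openapi.baidu.com/oauth/2.0/token"


def build_token_url(api_key, secret_key):
    """Build the client_credentials token URL (hypothetical helper)."""
    query = urlencode({
        "grant_type": "client_credentials",
        "client_id": api_key,
        "client_secret": secret_key,
    })
    return TOKEN_ENDPOINT + "?" + query


def token_lifetime_days(token_response):
    """Convert the expires_in field (seconds) of the token response into days."""
    return token_response["expires_in"] / (24 * 60 * 60)
```

Fetching the URL (for example with `requests.get`) and parsing the JSON body is left out here so that the sketch stays free of network access.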
Then call this URL:
http://tsn.baidu.com/text2audio?lan=zh&ctp=1&cuid=xxx_robot&tok=24.56xxx3&vol=9&per=0&spd=5&pit=5&tex=as+a+book-collector%2c+i+have+the+story+you+just+want+to+listen!
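The request URL above can be built programmatically so that the text is always URL-encoded correctly. This is only a sketch: `build_tts_url` is a hypothetical helper, the default values mirror the ones in the URL above, and the parameter names are the ones that appear in that URL.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the URL shown above.
TTS_ENDPOINT = "http://tsn.baidu.com/text2audio"


def build_tts_url(text, token, cuid, vol=9, per=0, spd=5, pit=5):
    """Build a text2audio request URL (hypothetical helper)."""
    params = {
        "lan": "zh",   # language
        "ctp": 1,      # client type
        "cuid": cuid,  # caller-chosen user id
        "tok": token,  # access_token from the token request
        "vol": vol,
        "per": per,
        "spd": spd,
        "pit": pit,
        "tex": text,   # text to synthesize
    }
    # quote_via=quote encodes spaces as %20; urlencode's default would use "+".
    return TTS_ENDPOINT + "?" + urlencode(params, quote_via=quote)
```

Either `+` or `%20` is a valid encoding of a space in a query string; the sketch uses `%20`.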
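The response is the synthesized MP3.
Clearly, this endpoint returns the MP3 content directly,
rather than the temporary MP3 URL I was hoping for.
The earlier tool:
http://developer.baidu.com/vcast
can return a temporary MP3 URL,
for example: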
http://boscdn.bpc.baidu.com/v1/developer/df29f25f-6003-4e9f-9f80-15be4babf831.mp3
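I searched for:
boscdn bpc baidu.com
boscdn.bpc.baidu.com domain information – Reason Core Security Labs
ecomfe/edpx-bos: BOS extension for edp
It looks like one of Baidu's CDN servers.
There is also a JS interface for uploading content and generating a temporary URL,
but it seems to be for Baidu's internal use only.
What I need next:
1. Ideally, make the Baidu token permanent, or at least valid for a year, instead of the current one month.
2. Ideally, turn the generated MP3 file into a URL that can be returned to the user.
This probably means wrapping Baidu's API in my own Flask REST API, exposing a single unified endpoint that returns an MP3 URL with a limited lifetime. Internally, the MP3 could be saved to /tmp, or to Redis with an expire time.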
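The "save to a temp folder with a time limit" idea can be sketched with the standard library alone. This is an illustration under assumptions, not the approach the post ends up using (the final code schedules deletion with Celery): `TempAudioStore` and its method names are hypothetical, and expiry is checked by comparing file modification times against a TTL.

```python
import os
import tempfile
import time
import uuid


class TempAudioStore:
    """Hypothetical helper: keeps MP3 files in a folder and purges old ones."""

    def __init__(self, folder=None, ttl_seconds=120):
        # Default location is an assumption; any writable folder works.
        self.folder = folder or os.path.join(tempfile.gettempdir(), "tts_audio")
        self.ttl_seconds = ttl_seconds
        os.makedirs(self.folder, exist_ok=True)

    def save(self, mp3_bytes):
        """Write the bytes to a uniquely named file and return the file name."""
        filename = str(uuid.uuid4()) + ".mp3"
        with open(os.path.join(self.folder, filename), "wb") as fp:
            fp.write(mp3_bytes)
        return filename

    def purge_expired(self, now=None):
        """Delete files older than ttl_seconds; return the removed names."""
        now = time.time() if now is None else now
        removed = []
        for name in os.listdir(self.folder):
            path = os.path.join(self.folder, name)
            if os.path.isfile(path) and now - os.path.getmtime(path) > self.ttl_seconds:
                os.remove(path)
                removed.append(name)
        return removed
```

Calling `purge_expired()` periodically (or before each save) is enough for a demo; a task queue or Redis key expiry gives tighter control over when each file disappears.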
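First, the permanent token question:
[Unsolvable] Getting a permanent or long-lived (for example, one-year) Baidu access token
As for documentation, from:
SPEECH-TTS-CORS/demo.html at master · Baidu-AIP/SPEECH-TTS-CORS
I found:
Baidu Speech Synthesis REST API
and also the Python documentation:
Python SDK documentation
Next, I wanted to see what the Baidu API returns when the access_token is no longer valid.
It occurred to me that I could call the API with the previous access_token, which should have become invalid after being refreshed with the refresh_token, and see what comes back.
It turned out the previous token still worked.
So instead I altered the token value arbitrarily to simulate an invalid token.
That returned: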
{
    "err_detail": "Access token invalid or no longer valid",
    "err_msg": "authentication failed.",
    "err_no": 502,
    "err_subcode": 50004,
    "tts_logid": 1007366076
}
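REST API documentation
Error code explanations

Error code    Meaning
500           Unsupported input
501           Invalid input parameters
502           Token verification failed
503           Synthesis backend error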
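The error table can be turned into a small lookup for logging and for deciding when a token refresh is worthwhile. This is a sketch only: `BAIDU_TTS_ERRORS`, `describe_tts_error`, and `is_token_error` are hypothetical names, and the field names (`err_no`, `err_detail`) are the ones seen in the error response above.

```python
# Error codes and meanings as listed in the table above.
BAIDU_TTS_ERRORS = {
    500: "unsupported input",
    501: "invalid input parameters",
    502: "token verification failed",
    503: "synthesis backend error",
}


def describe_tts_error(resp_dict):
    """Return a one-line, human-readable summary of an error response."""
    err_no = resp_dict.get("err_no")
    meaning = BAIDU_TTS_ERRORS.get(err_no, "unknown error")
    detail = resp_dict.get("err_detail", "")
    return "%s (%s): %s" % (err_no, meaning, detail)


def is_token_error(resp_dict):
    """True when the response indicates an invalid or expired token."""
    return resp_dict.get("err_no") == 502
```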
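I then searched for:
Baidu err_subcode 50004
Baidu authentication failed 50004
and found this entry on the Baidu Cloud Push product introduction page:
50004 | Passport Not Login | Not logged in to a Baidu passport account | 400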
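So the working plan for now is:
Wrap Baidu's speech synthesis API with Flask,
using the Python SDK internally (installed with pip).
If the call returns a dict whose err_no is 502, the token is invalid or expired,
so use the refresh_token to obtain a valid token
and retry once.
If everything is fine, the call returns the MP3 data.
Then work out how to handle it: where to store it, and how to generate a URL that can be accessed directly from outside.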
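The plan above (call, detect err_no 502, refresh, retry once) can be sketched independently of any HTTP code. The names here are hypothetical: `synthesize_with_retry` takes two caller-supplied callables, and it assumes `synthesize(text)` returns `bytes` on success and an error `dict` on failure, which matches the plan's "if it returns a dict" check.

```python
TOKEN_INVALID = 502  # err_no for "token verification failed"


def synthesize_with_retry(text, synthesize, refresh_token):
    """Call synthesize(text); on a token error, refresh and retry exactly once.

    Returns (audio_bytes, error_message). audio_bytes is None on failure.
    """
    result = synthesize(text)
    if isinstance(result, dict) and result.get("err_no") == TOKEN_INVALID:
        refresh_token()
        result = synthesize(text)  # single retry, no loop
    if isinstance(result, dict):
        return None, result.get("err_msg", "unknown error")
    return result, ""
```

Limiting the retry to one attempt avoids looping forever when the refresh itself keeps failing.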
For this, I looked at:
SPEECH-TTS-CORS/demo.html at master · Baidu-AIP/SPEECH-TTS-CORS
which contains:
        // For parameter meanings, see https://ai.baidu.com/docs#/TTS-API/41ac79a6
        audio = btts({
            // ...
            onSuccess: function(htmlAudioElement) {
 
                audio = htmlAudioElement;
                playBtn.innerText = '播放';
            },
So btts hands back the audio HTML element directly?
Looking at:
SPEECH-TTS-CORS/baidu_tts_cors.js at master · Baidu-AIP/SPEECH-TTS-CORS
I found:
document.body.append(audio);
audio.setAttribute('src', URL.createObjectURL(xhr.response));
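Does this create a local file?
I searched for:
URL.createObjectURL
URL.createObjectURL() – Web API reference | MDN
Blob – Web API reference | MDN
URL.createObjectURL and URL.revokeObjectURL – 流浪猫の窝 – cnblogs
"A File object is a file. For example, when I upload files with an input type="file" tag, each file in it is a File object.
A Blob object is binary data. For example, an object created with new Blob() is a Blob object. Likewise, in XMLHttpRequest, if responseType is set to blob, the returned value is also a Blob object."
So here the MP3 binary data comes back as a Blob and is passed to createObjectURL, which generates a temporary object URL that can be used for playback.
-> So the API I wrap later could support two options:
  • return the MP3's URL directly
  • return the MP3's binary data as a file
with the return type selected by an input parameter.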
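The two return options could be selected with a single request parameter. The sketch below is hypothetical throughout: `build_tts_result`, the `return_type` values `"url"` and `"binary"`, and the `save_and_get_url` callback are illustrative names, not part of the code shown later in this post (which only returns a URL).

```python
def build_tts_result(mp3_bytes, return_type, save_and_get_url):
    """Shape the API result according to a hypothetical return_type parameter.

    save_and_get_url is a caller-supplied function that stores the bytes
    and returns a URL pointing at the stored file.
    """
    if return_type == "url":
        return {"type": "url", "audioUrl": save_and_get_url(mp3_bytes)}
    if return_type == "binary":
        return {"type": "binary", "contentType": "audio/mpeg", "body": mp3_bytes}
    raise ValueError("unsupported return type: %r" % (return_type,))
```

In a Flask view, the `"binary"` branch would typically be sent back as a file response and the `"url"` branch as JSON.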
Next:
[Solved] Wrapping Baidu's speech synthesis in Flask on the backend to expose a REST API
Then:
[Solved] Saving temporary files with a configurable lifetime in Flask
After that,
in the front-end web page, update the code that previously only displayed the text so that it parses the returned (temporary) MP3 URL and plays it with the audio player:
var curResponseDict = respJsonObj["data"]["response"];
console.log("curResponseDict=%s", curResponseDict);
 
var curResponseText = curResponseDict["text"];
console.log("curResponseText=%s", curResponseText);
$('#response_text p').text(curResponseText);
 
var curResponseAudioUrl = curResponseDict["audioUrl"];
console.log("curResponseAudioUrl=%s", curResponseAudioUrl);
if (curResponseAudioUrl) {
    console.log("now play the response text's audio %s", curResponseAudioUrl);
 
    var respTextAudioObj = $(".response_text_audio_player audio")[0];
    console.log("respTextAudioObj=%o", respTextAudioObj);
 
    $(".response_text_audio_player .col-sm-offset-1").text(curResponseText);
 
    $(".response_text_audio_player audio source").attr("src", curResponseAudioUrl);
    respTextAudioObj.load();
    console.log("has load respTextAudioObj=%o", respTextAudioObj);
 
    var respTextAudioPromise = respTextAudioObj.play();
    // console.log("respTextAudioPromise=%o", respTextAudioPromise);
    if (respTextAudioPromise !== undefined) {
        respTextAudioPromise.then(() => {
            // Auto-play started
            console.log("Auto play audio started, respTextAudioPromise=%o", respTextAudioPromise);
        }).catch(error => {
            // Auto-play was prevented
            // Show a UI element to let the user manually start playback
 
            console.error("play response text's audio promise error=%o", error);
            //NotAllowedError: The request is not allowed by the user agent or the platform in the current context, possibly because the user denied permission.
        });
    }
}
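The audio for the returned text now plays.
Then, after a pause of about one second, the requested file should be played.
So first I had to solve:
[Solved] Triggering another action with JS or jQuery after an audio file finishes playing in an HTML page
I then went on to polish things so that an error message is shown when something fails, which involved:
[Solved] String concatenation and formatting in JS
and:
[Solved] Getting detailed error information from a failed jQuery ajax GET in JS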
[Summary]
In the end I got the result I wanted.
Backend:
The Flask REST API, plus initialization of and calls to the Baidu API:
app.py
from flask import Flask
from flask import jsonify
from flask_restful import Resource, Api, reqparse
import logging
from logging.handlers import RotatingFileHandler
from bson.objectid import ObjectId
from flask import send_file
import os
import io
import re
from urllib.parse import quote
import json
import uuid
from flask_cors import CORS
import requests
 
from celery import Celery
 
################################################################################
# Global Definitions
################################################################################
"""
    500    Unsupported input
    501    Invalid input parameters
    502    Token verification failed
    503    Synthesis backend error
"""
BAIDU_ERR_NOT_SUPPORT_PARAM = 500
BAIDU_ERR_PARAM_INVALID = 501
BAIDU_ERR_TOKEN_INVALID = 502
BAIDU_ERR_BACKEND_SYNTHESIS_FAILED = 503
 
################################################################################
# Global Variables
################################################################################
log = None
app = None
"""
{
    "access_token": "24.569bcccccccc11192484",
    "session_key": "9mxxxxxxEIB==",
    "scope": "public audio_voice_assistant_get audio_tts_post wise_adapt lebo_resource_base lightservice_public hetu_basic lightcms_map_poi kaidian_kaidian ApsMisTest_Test权限 vis-classify_flower lpq_开放 cop_helloScope ApsMis_fangdi_permission smartapp_snsapi_base",
    "refresh_token": "25.6acfxxxx2483",
    "session_secret": "121xxxxxfa",
    "expires_in": 2592000
}
"""
gCurBaiduRespDict = {} # get baidu token resp dict
gTempAudioFolder = ""
 
################################################################################
# Global Function
################################################################################
 
def generateUUID(prefix = ""):
    generatedUuid4 = uuid.uuid4()
    generatedUuid4Str = str(generatedUuid4)
    newUuid = prefix + generatedUuid4Str
    return newUuid
 
#----------------------------------------
# Audio Synthesis / TTS
#----------------------------------------
 
def createAudioTempFolder():
    """create folder to save later temp audio files"""
    global log, gTempAudioFolder
 
    # init audio temp folder for later store temp audio file
    audioTmpFolder = app.config["AUDIO_TEMP_FOLDER"]
    log.info("audioTmpFolder=%s", audioTmpFolder)
    curFolderAbsPath = os.getcwd() #'/Users/crifan/dev/dev_root/company/xxx/projects/robotDemo/server'
    log.info("curFolderAbsPath=%s", curFolderAbsPath)
    audioTmpFolderFullPath = os.path.join(curFolderAbsPath, audioTmpFolder)
    log.info("audioTmpFolderFullPath=%s", audioTmpFolderFullPath)
    if not os.path.exists(audioTmpFolderFullPath):
        os.makedirs(audioTmpFolderFullPath)
        log.info("++++++ Created tmp audio folder: %s", audioTmpFolderFullPath)
 
    gTempAudioFolder = audioTmpFolderFullPath
    log.info("gTempAudioFolder=%s", gTempAudioFolder)
 
def initAudioSynthesis():
    """
    init audio synthesis related:
        init token
    :return:
    """
 
    getBaiduToken()
    createAudioTempFolder()
 
 
def getBaiduToken():
    """get baidu token"""
    global app, log, gCurBaiduRespDict
 
    getBaiduTokenUrlTemplate = "https://openapi.baidu.com/oauth/2.0/token?grant_type=client_credentials&client_id=%s&client_secret=%s"
    getBaiduTokenUrl = getBaiduTokenUrlTemplate % (app.config["BAIDU_API_KEY"], app.config["BAIDU_SECRET_KEY"])
    log.info("getBaiduTokenUrl=%s", getBaiduTokenUrl) # https://openapi.baidu.com/oauth/2.0/token?grant_type=client_credentials&client_id=xxxz&client_secret=xxxx
    resp = requests.get(getBaiduTokenUrl)
    log.info("resp=%s", resp)
    respJson = resp.json()
    log.info("respJson=%s", respJson) #{'access_token': '24.xxx.2592000.1528609320.282335-11192484', 'session_key': 'xx+I/xx+6KwgZmw==', 'scope': 'public audio_voice_assistant_get audio_tts_post wise_adapt lebo_resource_base lightservice_public hetu_basic lightcms_map_poi kaidian_kaidian ApsMisTest_Test权限 vis-classify_flower lpq_开放 cop_helloScope ApsMis_fangdi_permission smartapp_snsapi_base', 'refresh_token': '25.xxx', 'session_secret': 'cxxx6e', 'expires_in': 2592000}
    if resp.status_code == 200:
        gCurBaiduRespDict = respJson
        log.info("get baidu token resp: %s", gCurBaiduRespDict)
    else:
        log.error("error while get baidu token: %s", respJson)
        #{'error': 'invalid_client', 'error_description': 'Client authentication failed'}
        #{'error': 'invalid_client', 'error_description': 'unknown client id'}
        #{'error': 'unsupported_grant_type', 'error_description': 'The authorization grant type is not supported'}
 
def refreshBaiduToken():
    """refresh baidu token when current token invalid"""
    global app, log, gCurBaiduRespDict
    if gCurBaiduRespDict:
        refreshBaiduTokenUrlTemplate = "https://openapi.baidu.com/oauth/2.0/token?grant_type=refresh_token&refresh_token=%s&client_id=%s&client_secret=%s"
        refreshBaiduTokenUrl = refreshBaiduTokenUrlTemplate % (gCurBaiduRespDict["refresh_token"], app.config["BAIDU_API_KEY"], app.config["BAIDU_SECRET_KEY"])
        log.info("refreshBaiduTokenUrl=%s", refreshBaiduTokenUrl) # https://openapi.baidu.com/oauth/2.0/token?grant_type=refresh_token&refresh_token=25.1xxxx.xx.1841379583.282335-11192483&client_id=Sxxxxz&client_secret=47dxxxxa
        resp = requests.get(refreshBaiduTokenUrl)
        log.info("resp=%s", resp)
        respJson = resp.json()
        log.info("respJson=%s", respJson)
        if resp.status_code == 200:
            gCurBaiduRespDict = respJson
            log.info("Ok to refresh baidu token response: %s", gCurBaiduRespDict)
        else:
            log.error("error while refresh baidu token: %s", respJson)
    else:
        log.error("Can't refresh baidu token for previous not get token")
 
 
def baiduText2Audio(unicodeText):
    """call baidu text2audio to generate mp3 audio from text"""
    global app, log, gCurBaiduRespDict
    log.info("baiduText2Audio: unicodeText=%s", unicodeText)
 
    isOk = False
    mp3BinData = None
    errNo = 0
    errMsg = "Unknown error"
 
    if not gCurBaiduRespDict:
        errMsg = "Need get baidu token before call text2audio"
        return isOk, mp3BinData, errNo, errMsg
 
    utf8Text = unicodeText.encode("utf-8")
    log.info("utf8Text=%s", utf8Text)
    encodedUtf8Text = quote(unicodeText)
    log.info("encodedUtf8Text=%s", encodedUtf8Text)
 
    # http://ai.baidu.com/docs#/TTS-API/top
    tex = encodedUtf8Text # text to synthesize, UTF-8 encoded; fewer than 512 Chinese characters or English letters/digits (Baidu converts the text to GBK on its servers, where it must be under 1024 bytes)
    tok = gCurBaiduRespDict["access_token"] # developer access_token obtained from the open platform (see the "authentication mechanism" section of the docs)
    cuid = app.config["FLASK_APP_NAME"] # unique user ID, used to distinguish users and compute UV; a MAC address or IMEI is recommended, at most 60 characters
    ctp = 1 # client type; fixed value 1 for web
    lan = "zh" # language; fixed value zh, since only the mixed Chinese/English mode exists
    spd = 5 # speed, 0-9, default 5 (medium)
    pit = 5 # pitch, 0-9, default 5 (medium)
    # vol = 5 # volume, 0-9, default 5 (medium)
    vol = 9
    per = 0 # voice: 0 = standard female, 1 = standard male, 3 = emotional synthesis (Du Xiaoyao), 4 = emotional synthesis (Du Yaya); default is standard female
    getBaiduSynthesizedAudioTemplate = "http://tsn.baidu.com/text2audio?lan=%s&ctp=%s&cuid=%s&tok=%s&vol=%s&per=%s&spd=%s&pit=%s&tex=%s"
    getBaiduSynthesizedAudioUrl = getBaiduSynthesizedAudioTemplate % (lan, ctp, cuid, tok, vol, per, spd, pit, tex)
    log.info("getBaiduSynthesizedAudioUrl=%s", getBaiduSynthesizedAudioUrl) # http://tsn.baidu.com/text2audio?lan=zh&ctp=1&cuid=RobotQA&tok=24.5f056b15e9d5da63256bac89f64f61b5.2592000.1528609737.282335-11192483&vol=5&per=0&spd=5&pit=5&tex=as%20a%20book-collector%2C%20i%20have%20the%20story%20you%20just%20want%20to%20listen%21
    resp = requests.get(getBaiduSynthesizedAudioUrl)
    log.info("resp=%s", resp)
    respContentType = resp.headers["Content-Type"]
    respContentTypeLowercase = respContentType.lower() #'audio/mp3'
    log.info("respContentTypeLowercase=%s", respContentTypeLowercase)
    if respContentTypeLowercase == "audio/mp3":
        mp3BinData = resp.content
        log.info("resp content is binary data of mp3, length=%d", len(mp3BinData))
        isOk = True
        errMsg = ""
    elif respContentTypeLowercase == "application/json":
        """                        
            {
              'err_detail': 'Invalid params per or lan!',
              'err_msg': 'parameter error.',
              'err_no': 501,
              'err_subcode': 50000,
              'tts_logid': 642798357
            }
 
            {
              'err_detail': 'Invalid params per&pdt!',
              'err_msg': 'parameter error.',
              'err_no': 501,
              'err_subcode': 50000,
              'tts_logid': 1675521246
            }
 
            {
              'err_detail': 'Access token invalid or no longer valid',
              'err_msg': 'authentication failed.',
              'err_no': 502,
              'err_subcode': 50004,
              'tts_logid': 4221215043
            }
        """
        log.info("resp content is json -> occur error")
 
        isOk = False
        respDict = resp.json()
        log.info("respDict=%s", respDict)
        errNo = respDict["err_no"]
        errMsg = respDict["err_msg"] + " " + respDict["err_detail"]
    else:
        isOk = False
        errMsg = "Unexpected response content-type: %s" % respContentTypeLowercase
 
    return isOk, mp3BinData, errNo, errMsg
 
def doAudioSynthesis(unicodeText):
    """
        do audio synthesis from unicode text
        if failed for token invalid/expired, will refresh token to do one more retry
    """
    global app, log, gCurBaiduRespDict
    isOk = False
    audioBinData = None
    errMsg = ""
 
    # # for debug
    # gCurBaiduRespDict["access_token"] = "99.569b3b5b470938a522ce60d2e2ea2506.2592000.1528015602.282335-11192483"
 
    log.info("doAudioSynthesis: unicodeText=%s", unicodeText)
    isOk, audioBinData, errNo, errMsg = baiduText2Audio(unicodeText)
    log.info("isOk=%s, errNo=%d, errMsg=%s", isOk, errNo, errMsg)
 
    if isOk:
        errMsg = ""
        log.info("got synthesized audio binary data length=%d", len(audioBinData))
    else:
        if errNo == BAIDU_ERR_TOKEN_INVALID:
            log.warning("Token invalid -> refresh token")
            refreshBaiduToken()
 
            isOk, audioBinData, errNo, errMsg = baiduText2Audio(unicodeText)
            log.info("after refresh token: isOk=%s, errNo=%s, errMsg=%s", isOk, errNo, errMsg)
        else:
            log.warning("try synthesized audio occur error: errNo=%d, errMsg=%s", errNo, errMsg)
            audioBinData = None
 
    log.info("return isOk=%s, errMsg=%s", isOk, errMsg)
    if audioBinData:
        log.info("audio binary bytes=%d", len(audioBinData))
    return isOk, audioBinData, errMsg
 
 
def testAudioSynthesis():
    global app, log, gTempAudioFolder
 
    testInputUnicodeText = u"as a book-collector, i have the story you just want to listen!"
    isOk, audioBinData, errMsg = doAudioSynthesis(testInputUnicodeText)
    if isOk:
        audioBinDataLen = len(audioBinData)
        log.info("Now will save audio binary data %d bytes to file", audioBinDataLen)
 
        # 1. save mp3 binary data into tmp file
        newUuid = generateUUID()
        log.info("newUuid=%s", newUuid)
        tempFilename = newUuid + ".mp3"
        log.info("tempFilename=%s", tempFilename)
        if not gTempAudioFolder:
            createAudioTempFolder()
        tempAudioFullname = os.path.join(gTempAudioFolder, tempFilename) #'/Users/crifan/dev/dev_root/company/xxx/projects/robotDemo/server/tmp/audio/2aba73d1-f8d0-4302-9dd3-d1dbfad44458.mp3'
        log.info("tempAudioFullname=%s", tempAudioFullname)
 
        with open(tempAudioFullname, 'wb') as tmpAudioFp:
            log.info("tmpAudioFp=%s", tmpAudioFp)
            tmpAudioFp.write(audioBinData)
            tmpAudioFp.close()
            log.info("Done to write audio data into file of %d bytes", audioBinDataLen)
 
        # 2. use celery to delay delete tmp file
    else:
        log.warning("Fail to get synthesis audio for errMsg=%s", errMsg)
 
 
#----------------------------------------
# Flask API
#----------------------------------------
def sendFile(fileBytes, contentType, outputFilename):
    """Flask API use this to send out file (to browser, browser can directly download file)"""
    return send_file(
        io.BytesIO(fileBytes),
        # io.BytesIO(fileObj.read()),
        mimetype=contentType,
        as_attachment=True,
        attachment_filename=outputFilename
    )
 
################################################################################
# Global Init App
################################################################################
app = Flask(__name__)
CORS(app)
# app.config.from_object('config.DevelopmentConfig')
app.config.from_object('config.ProductionConfig')
 
logFormatterStr = app.config["LOG_FORMAT"]
logFormatter = logging.Formatter(logFormatterStr)
 
fileHandler = RotatingFileHandler(
    app.config['LOG_FILE_FILENAME'],
    maxBytes=app.config["LOF_FILE_MAX_BYTES"],
    backupCount=app.config["LOF_FILE_BACKUP_COUNT"],
    encoding="UTF-8")
fileHandler.setLevel(logging.DEBUG)
fileHandler.setFormatter(logFormatter)
app.logger.addHandler(fileHandler)
 
 
app.logger.setLevel(logging.DEBUG) # set root log level
 
log = app.logger
log.info("app=%s", app)
# log.debug("app.config=%s", app.config)
 
api = Api(app)
log.info("api=%s", api)
 
celeryApp = Celery(app.name, broker=app.config['CELERY_BROKER_URL'])
celeryApp.conf.update(app.config)
log.info("celeryApp=%s", celeryApp)
 
aiContext = Context()
log.info("aiContext=%s", aiContext)
 
initAudioSynthesis()
# testAudioSynthesis()
 
...
 
#----------------------------------------
# Celery tasks
#----------------------------------------
 
# @celeryApp.task()
@celeryApp.task
# @celeryApp.task(name=app.config["CELERY_TASK_NAME"] + ".deleteTmpAudioFile")
def deleteTmpAudioFile(filename):
    """
        delete tmp audio file from filename
            eg: 98fc7c46-7aa0-4dd7-aa9d-89fdf516abd6.mp3
    """
    global log
 
    log.info("deleteTmpAudioFile: filename=%s", filename)
 
    audioTmpFolder = app.config["AUDIO_TEMP_FOLDER"]
    # audioTmpFolder = "tmp/audio"
    log.info("audioTmpFolder=%s", audioTmpFolder)
    curFolderAbsPath = os.getcwd() #'/Users/crifan/dev/dev_root/company/xxx/projects/robotDemo/server'
    log.info("curFolderAbsPath=%s", curFolderAbsPath)
    audioTmpFolderFullPath = os.path.join(curFolderAbsPath, audioTmpFolder)
    log.info("audioTmpFolderFullPath=%s", audioTmpFolderFullPath)
    tempAudioFullname = os.path.join(audioTmpFolderFullPath, filename)
    #'/Users/crifan/dev/dev_root/company/xxx/projects/robotDemo/server/tmp/audio/2aba73d1-f8d0-4302-9dd3-d1dbfad44458.mp3'
    if os.path.isfile(tempAudioFullname):
        os.remove(tempAudioFullname)
        log.info("Ok to delete file %s", tempAudioFullname)
    else:
        log.warning("No need to remove for not exist file %s", tempAudioFullname)
 
# log.info("deleteTmpAudioFile=%s", deleteTmpAudioFile)
# log.info("deleteTmpAudioFile.name=%s", deleteTmpAudioFile.name)
# log.info("celeryApp.tasks=%s", celeryApp.tasks)
 
#----------------------------------------
# Rest API
#----------------------------------------
 
class RobotQaAPI(Resource):
 
    def processResponse(self, respDict):
        """
            process response dict before return
                generate audio for response text part
        """
        global log, gTempAudioFolder
 
        tmpAudioUrl = ""
 
        unicodeText = respDict["data"]["response"]["text"]
        log.info("unicodeText=%s", unicodeText)
 
        if not unicodeText:
            log.info("No response text to do audio synthesis")
            return jsonify(respDict)
 
        isOk, audioBinData, errMsg = doAudioSynthesis(unicodeText)
        if isOk:
            audioBinDataLen = len(audioBinData)
            log.info("audioBinDataLen=%s", audioBinDataLen)
 
            # 1. save mp3 binary data into tmp file
            newUuid = generateUUID()
            log.info("newUuid=%s", newUuid)
            tempFilename = newUuid + ".mp3"
            log.info("tempFilename=%s", tempFilename)
            if not gTempAudioFolder:
                createAudioTempFolder()
            tempAudioFullname = os.path.join(gTempAudioFolder, tempFilename)
            log.info("tempAudioFullname=%s", tempAudioFullname) # 'xxx/tmp/audio/2aba73d1-f8d0-4302-9dd3-d1dbfad44458.mp3'
 
            with open(tempAudioFullname, 'wb') as tmpAudioFp:
                log.info("tmpAudioFp=%s", tmpAudioFp)
                tmpAudioFp.write(audioBinData)
                tmpAudioFp.close()
                log.info("Saved %d bytes data into temp audio file %s", audioBinDataLen, tempAudioFullname)
 
            # 2. use celery to delay delete tmp file
            delayTimeToDelete = app.config["CELERY_DELETE_TMP_AUDIO_FILE_DELAY"]
            deleteTmpAudioFile.apply_async([tempFilename], countdown=delayTimeToDelete)
            log.info("Delay %s seconds to delete %s", delayTimeToDelete, tempFilename)
 
            # generate temp audio file url
            # /tmp/audio
            tmpAudioUrl = "http://%s:%d/tmp/audio/%s" % (
                app.config["FILE_URL_HOST"],
                app.config["FLASK_PORT"],
                tempFilename)
            log.info("tmpAudioUrl=%s", tmpAudioUrl)
            respDict["data"]["response"]["audioUrl"] = tmpAudioUrl
        else:
            log.warning("Fail to get synthesis audio for errMsg=%s", errMsg)
 
        log.info("respDict=%s", respDict)
        return jsonify(respDict)
 
    def get(self):
        respDict = {
            "code": 200,
            "message": "generate response ok",
            "data": {
                "input": "",
                "response": {
                    "text": "",
                    "audioUrl": ""
                },
                "control": "",
                "audio": {}
            }
        }
 
        parser = reqparse.RequestParser()
        # i want to hear the story of Baby Sister Says No
        parser.add_argument('input', type=str, help="input words")
        log.info("parser=%s", parser)
 
        parsedArgs = parser.parse_args()  #
        log.info("parsedArgs=%s", parsedArgs)
        if not parsedArgs:
            respDict["data"]["response"]["text"] = "Can not recognize input"
            return self.processResponse(respDict)
 
        inputStr = parsedArgs["input"]
        log.info("inputStr=%s", inputStr)
 
        if not inputStr:
            respDict["data"]["response"]["text"] = "Can not recognize parameter input"
            return self.processResponse(respDict)
 
        respDict["data"]["input"] = inputStr
 
        aiResult = QueryAnalyse(inputStr, aiContext)
        log.info("aiResult=%s", aiResult)
 
        if aiResult["response"]:
            respDict["data"]["response"]["text"] = aiResult["response"]
        if aiResult["control"]:
            respDict["data"]["control"] = aiResult["control"]
        log.info('respDict["data"]=%s', respDict["data"])
 
        audioFileIdStr = aiResult["mediaId"]
        log.info("audioFileIdStr=%s", audioFileIdStr)
 
        if audioFileIdStr:
            audioFileObjectId = ObjectId(audioFileIdStr)
            log.info("audioFileObjectId=%s", audioFileObjectId)
 
            if fsCollection.exists(audioFileObjectId):
                audioFileObj = fsCollection.get(audioFileObjectId)
                log.info("audioFileObj=%s", audioFileObj)
                encodedFilename = quote(audioFileObj.filename)
                log.info("encodedFilename=%s", encodedFilename)
 
                respDict["data"]["audio"] = {
                    "contentType": audioFileObj.contentType,
                    "name": audioFileObj.filename,
                    "size": audioFileObj.length,
                    "url": "http://%s:%d/files/%s/%s" %
                           (app.config["FILE_URL_HOST"],
                            app.config["FLASK_PORT"],
                            audioFileObj._id,
                            encodedFilename)
                }
                log.info("respDict=%s", respDict)
                return self.processResponse(respDict)
            else:
                log.info("Can not find file from id %s", audioFileIdStr)
                respDict["data"]["audio"] = {}
                return self.processResponse(respDict)
        else:
            log.info("Not response file id")
            respDict["data"]["audio"] = {}
            return self.processResponse(respDict)
 
 
class GridfsAPI(Resource):
 
    def get(self, fileId, fileName=None):
        log.info("fileId=%s, file_name=%s", fileId, fileName)
 
        fileIdObj = ObjectId(fileId)
        log.info("fileIdObj=%s", fileIdObj)
        if not fsCollection.exists({"_id": fileIdObj}):
            respDict = {
                "code": 404,
                "message": "Can not find file from object id %s" % (fileId),
                "data": {}
            }
            return jsonify(respDict)
 
        fileObj = fsCollection.get(fileIdObj)
        log.info("fileObj=%s, filename=%s, chunkSize=%s, length=%s, contentType=%s",
                 fileObj, fileObj.filename, fileObj.chunk_size, fileObj.length, fileObj.content_type)
        log.info("lengthInMB=%.2f MB", float(fileObj.length / (1024 * 1024)))
 
        fileBytes = fileObj.read()
        log.info("len(fileBytes)=%s", len(fileBytes))
 
        outputFilename = fileObj.filename
        if fileName:
            outputFilename = fileName
        log.info("outputFilename=%s", outputFilename)
 
        return sendFile(fileBytes, fileObj.content_type, outputFilename)
 
 
class TmpAudioAPI(Resource):
 
    def get(self, filename=None):
        global gTempAudioFolder
 
        log.info("TmpAudioAPI: filename=%s", filename)
 
        tmpAudioFullPath = os.path.join(gTempAudioFolder, filename)
        log.info("tmpAudioFullPath=%s", tmpAudioFullPath)
 
        if not os.path.isfile(tmpAudioFullPath):
            log.warning("Not exists file %s", tmpAudioFullPath)
            respDict = {
                "code": 404,
                "message": "Can not find temp audio file %s" % filename,
                "data": {}
            }
            return jsonify(respDict)
 
        fileSize = os.path.getsize(tmpAudioFullPath)
        log.info("fileSize=%s", fileSize)
 
        with open(tmpAudioFullPath, "rb") as tmpAudioFp:
            fileBytes = tmpAudioFp.read()
            log.info("read out fileBytes length=%s", len(fileBytes))
 
            outputFilename = filename
            # contentType = "audio/mp3" # chrome use this
            contentType = "audio/mpeg" # most common and compatible
 
            return sendFile(fileBytes, contentType, outputFilename)
 
 
api.add_resource(PlaySongAPI, '/playsong', endpoint='playsong')
api.add_resource(RobotQaAPI, '/qa', endpoint='qa')
api.add_resource(GridfsAPI, '/files/<fileId>', '/files/<fileId>/<fileName>', endpoint='gridfs')
api.add_resource(TmpAudioAPI, '/tmp/audio/<filename>', endpoint='TmpAudio')
 
if __name__ == "__main__":
    app.run(
        host=app.config["FLASK_HOST"],
        port=app.config["FLASK_PORT"],
        debug=app.config["DEBUG"]
    )
config.py
class BaseConfig(object):
    DEBUG = False
 
    FLASK_PORT = 3xxxx
    # FLASK_HOST = "127.0.0.1"
    # FLASK_HOST = "localhost"
    # Note:
    # 1. to allow external access this server
    # 2. make sure here gunicorn parameter "bind" is same with here !!!
    FLASK_HOST = "0.0.0.0"
 
    # Flask app name
    FLASK_APP_NAME = "RobotQA"
 
    # Log File
    LOG_FILE_FILENAME = "logs/" + FLASK_APP_NAME + ".log"
    LOG_FORMAT = "[%(asctime)s %(levelname)s %(filename)s:%(lineno)d %(funcName)s] %(message)s"
    LOF_FILE_MAX_BYTES = 2*1024*1024
    LOF_FILE_BACKUP_COUNT = 10
 
    # host used in the returned file URLs
    # FILE_URL_HOST = FLASK_HOST
    FILE_URL_HOST = "127.0.0.1"
 
    # Audio Synthesis / TTS
    # BAIDU_APP_ID = "1xxx3"
    BAIDU_API_KEY = "Sxxxxz"
    BAIDU_SECRET_KEY = "4xxxxxa"
 
    AUDIO_TEMP_FOLDER = "tmp/audio"
 
    # CELERY_TASK_NAME = "Celery_" + FLASK_APP_NAME
    # CELERY_BROKER_URL = "redis://localhost"
    CELERY_BROKER_URL = "redis://localhost:6379/0"
    # CELERY_RESULT_BACKEND = "redis://localhost:6379/0" # results are currently not used
    CELERY_DELETE_TMP_AUDIO_FILE_DELAY = 60 * 2 # two minutes
 
class DevelopmentConfig(BaseConfig):
    # DEBUG = True
    # for local dev, need access remote mongodb
    MONGODB_HOST = "47.xx.xx.xx"
    FILE_URL_HOST = "127.0.0.1"
 
 
class ProductionConfig(BaseConfig):
    FILE_URL_HOST = "47.xx.xx.xx"
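The `CELERY_DELETE_TMP_AUDIO_FILE_DELAY` setting suggests that each temporary audio file is deleted about two minutes after it is created. The sketch below illustrates the delayed-delete idea with `threading.Timer` from the standard library as a stand-in for a Celery task, so it runs without Redis or a worker. `schedule_delete` is a hypothetical name, and this is not the Celery code used by the app.

```python
import os
import tempfile
import threading


def schedule_delete(file_path, delay_seconds):
    """Delete file_path after delay_seconds; returns the started timer."""

    def _delete():
        # The file may already be gone; that is not an error here.
        try:
            os.remove(file_path)
        except FileNotFoundError:
            pass

    timer = threading.Timer(delay_seconds, _delete)
    timer.daemon = True  # do not keep the process alive just for cleanup
    timer.start()
    return timer


if __name__ == "__main__":
    fd, path = tempfile.mkstemp(suffix=".mp3")
    os.close(fd)
    # The config above uses 60 * 2 seconds; a short delay keeps the demo quick.
    timer = schedule_delete(path, 0.1)
    timer.join()
    print(os.path.exists(path))
```

A timer lives only as long as the process, so a pending deletion is lost on restart. A broker-backed task queue such as Celery avoids that, which is presumably why the config points at Redis.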
Front end:
main.html
<!doctype html>
<html lang="en">
  <head>
    <!-- Required meta tags -->
    <meta charset="utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <!-- <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"> -->
 
    <!-- Bootstrap CSS -->
    <link rel="stylesheet" href="css/bootstrap-3.3.1/bootstrap.css">
    <!-- <link rel="stylesheet" href="css/highlightjs_default.css"> -->
    <link rel="stylesheet" href="css/highlight_atom-one-dark.css">
    <!-- <link rel="stylesheet" href="css/highlight_monokai-sublime.css"> -->
    <link rel="stylesheet" href="css/bootstrap3_player.css">
    <link rel="stylesheet" href="css/main.css">
 
    <title>xxx English Smart Robot Demo</title>
 
    <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
    <!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
    <!--[if lt IE 9]>
      <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
      <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
    <![endif]-->
  </head>
  <body>
    <div class="logo text-center">
        <img class="mb-4" src="img/logo_transparent_183x160.png" alt="xxx Logo" width="72" height="72">
    </div>
 
    <h2>xxx English Smart Robot</h2>
    <h4>xxx Bot for Kids</h4>
 
    <div class="panel panel-primary">
      <div class="panel-heading">
        <h3 class="panel-title">Input</h3>
      </div>
      <div class="panel-body">
          <ul class="list-group">
            <li class="list-group-item">
                <h3 class="panel-title">Input Example</h3>
                <ul>
                    <li>i want to hear the story of apple</li>
                    <li>say story apple</li>
                    <li>say apple</li>
                    <li>next episode</li>
                    <li>next</li>
                    <li>i want you stop reading</li>
                    <li>stop reading</li>
                    <li>please go on</li>
                    <li>go on</li>
                </ul>
            </li>
 
            <li class="list-group-item">
                <!--
                <form>
                  <div class="form-group input_request">
                    <input id="inputRequest" type="text" class="form-control" placeholder="Please enter what you want to say" value="i want to hear the story of apple">
                  </div>
                  <div class="form-group">
                    <button id="submitInput" type="submit" class="btn btn-primary btn-lg col-sm-3 btn-block">Submit</button>
                    <button id="clearInput" class="btn btn-secondary btn-lg col-sm-3" type="button">Clear</button>
                    <button id="clearInput" class="btn btn-info btn-lg col-sm-3 btn-block" type="button">Clear</button>
                  </div>
                </form>
                 -->
 
                <div class="row">
                  <div class="col-lg-12">
                    <div class="input-group">
                      <input id="inputRequest" type="text" class="form-control" placeholder="Please enter what you want to say" value="say apple">
                      <span class="input-group-btn">
                        <button id="submitInput" type="submit" class="btn btn-primary">Submit</button>
                      </span>
                    </div><!-- /input-group -->
                  </div><!-- /.col-lg-12 -->
                </div>
            </li>
          </ul>
 
      </div>
    </div>
 
<!--
  <div class="input_example bg-light box-shadow">
    <h5>Input Example:</h5>
    <ul>
        <li>i want to hear the story of apple</li>
        <li>next episode</li>
        <li>i want you stop reading</li>
        <li>please go on</li>
    </ul>
  </div>
-->
 
    <div class="panel panel-success">
      <div class="panel-heading">
        <h3 class="panel-title">Output</h3>
      </div>
      <div class="panel-body">
 
        <div id="response_text" class="alert alert-success" role="alert">
          <p>here will output response text</p>
 
          <div class="response_text_audio_player">
            <audio
                controls
                data-info-att="response text's audio">
                <source src="" type="audio/mpeg" />
            </audio>
          </div>
        </div>
 
        <div class="audio_player col-md-12 col-xs-12">
            <audio
                controls
                data-info-att="">
                <source src="" type="" />
            </audio>
        </div>
 
        <!--
        <div id="audio_play_prevented" class="alert alert-warning alert-dismissible col-md-12 col-xs-12">
          <button type = "button" class="close" data-dismiss = "alert">x</button>
          <strong>Notice:</strong> Auto play prevented, please manually click above play button to play
        </div>
        -->
 
    <!--
        <div id="response_json" class="bg-light box-shadow">
            <pre><code class="json">here will output response</code></pre>
        </div>
    -->
 
    <!--
        <pre id="response_json">
            <code class="json">here will output response</code>
        </pre>
    -->
 
        <div id="response_json">
          <code class="json">here will output response</code>
        </div>
 
      </div>
    </div>
 
 
    <!-- Optional JavaScript -->
    <!-- jQuery first, then Popper.js, then Bootstrap JS -->
    <!-- <script src="js/jquery-3.3.1.js"></script> -->
    <!-- <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script> -->
    <script src="js/jquery/1.11.1/jquery-1.11.1.js"></script>
    <!-- <script src="https://code.jquery.com/jquery-3.3.1.slim.min.js" integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo" crossorigin="anonymous"></script> -->
    <!-- <script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.0/umd/popper.min.js" integrity="sha384-cs/chFZiN24E4KMATLdqdvsezGxaGsi4hLGOzlXwp5UZB1LY//20VyM2taTB4QvJ" crossorigin="anonymous"></script> -->
    <script src="js/popper-1.14.0/popper.min.js"></script>
    <!-- <script src="js/bootstrap.js"></script> -->
    <script src="js/bootstrap-3.3.1/bootstrap.min.js"></script>
    <script src="js/highlight.js"></script>
    <script src="js/bootstrap3_player.js"></script>
    <script src="js/main.js"></script>
  </body>
</html>
main.css
.logo{
    padding: 10px 2%;
}
 
h2{
    text-align: center;
    margin-top: 10px;
    margin-bottom: 10px;
}
 
h4{
    text-align: center;
    margin-top: 0px;
    margin-bottom: 20px;
}
 
form {
    text-align: center;
}
 
.form-group {
    /*padding-left: 1%;*/
    /*padding-right: 1%;*/
}
 
.input_example {
    /*padding: 1px 1%;*/
}
 
#response_json {
    /*width: 96%;*/
    height: 380px;
    border-radius: 10px;
    padding-top: 20px;
 
    /*padding-left: 1%;*/
    /*padding-right: 1%;*/
}
 
#response_text {
    text-align: center !important;
    font-size: 14px;
 
    /* padding-left: 4%;
    padding-right: 4%; */
}
 
/*pre {*/
    /*padding-left: 2%;*/
    /*padding-right: 2%;*/
/*}*/
 
.audio_player {
    margin-top: 10px;
    margin-bottom: 5px;
    text-align: center;
 
    padding-left: 0 !important;
    padding-right: 0 !important;
}
 
.response_text_audio_player{
    /* visibility: hidden; */
 
    width: 100%;
    /* height: 1px !important; */
    height: 100px;
}
 
/* #audio_play_prevented {
    display: none;
} */
main.js
if (!String.format) {
    String.format = function(format) {
      var args = Array.prototype.slice.call(arguments, 1);
      return format.replace(/{(\d+)}/g, function(match, number) {
        return typeof args[number] != 'undefined'
          ? args[number]
          : match
        ;
      });
    };
}
 
 
$(document).ready(function(){
 
    $('[data-toggle="tooltip"]').tooltip();
 
    // when got response json, update to highlight it
    function updateHighlight() {
        console.log("updateHighlight");
 
        $('pre code').each(function(i, block) {
            hljs.highlightBlock(block);
        });
    }
 
    updateHighlight();
 
    $("#submitInput").click(function(event){
        event.preventDefault();
 
        ajaxSubmitInput();
    });
     
    function ajaxSubmitInput() {
        console.log("ajaxSubmitInput");
 
        var inputRequest = $("#inputRequest").val();
        console.log("inputRequest=%s", inputRequest);
        var encodedInputRequest = encodeURIComponent(inputRequest)
        console.log("encodedInputRequest=%s", encodedInputRequest);
 
        // var qaUrl = "http://127.0.0.1:32851/qa";
        var qaUrl = "http://xxx:32851/qa";
        console.log("qaUrl=%s", qaUrl);
        var fullQaUrl = qaUrl + "?input=" + encodedInputRequest
        console.log("fullQaUrl=%s", fullQaUrl);
 
        $.ajax({
            type : "GET",
            url : fullQaUrl,
            success: function(respJsonObj){
                console.log("respJsonObj=%o", respJsonObj);
 
                // var respnJsonStr = JSON.stringify(respJsonObj);
                //var beautifiedJespnJsonStr = JSON.stringify(respJsonObj, null, '\t');
                var beautifiedJespnJsonStr = JSON.stringify(respJsonObj, null, 2);
                console.log("beautifiedJespnJsonStr=%s", beautifiedJespnJsonStr);
                var prevOutputValue = $('#response_json').text();
                console.log("prevOutputValue=%o", prevOutputValue);
                var afterOutputValue = $('#response_json').html('<pre><code class="json">' + beautifiedJespnJsonStr + "</code></pre>");
                console.log("afterOutputValue=%o", afterOutputValue);
                updateHighlight();
 
                var curResponseDict = respJsonObj["data"]["response"];
                console.log("curResponseDict=%o", curResponseDict);
 
                var curResponseText = curResponseDict["text"];
                console.log("curResponseText=%s", curResponseText);
                $('#response_text p').text(curResponseText);
 
                var curResponseAudioUrl = curResponseDict["audioUrl"];
                console.log("curResponseAudioUrl=%s", curResponseAudioUrl);
                if (curResponseAudioUrl) {
                    console.log("now play the response text's audio %s", curResponseAudioUrl);
 
                    var respTextAudioObj = $(".response_text_audio_player audio")[0];
                    console.log("respTextAudioObj=%o", respTextAudioObj);
 
                    $(".response_text_audio_player .col-sm-offset-1").text(curResponseText);
 
                    $(".response_text_audio_player audio source").attr("src", curResponseAudioUrl);
                    respTextAudioObj.load();
                    console.log("has load respTextAudioObj=%o", respTextAudioObj);
 
                    respTextAudioObj.onended = function() {
                        console.log("play response text's audio ended");
 
                        var dataControl = respJsonObj["data"]["control"];
                        console.log("dataControl=%o", dataControl);
 
                        var audioElt = $(".audio_player audio");
                        console.log("audioElt=%o", audioElt);
                        var audioObject = audioElt[0];
                        console.log("audioObject=%o", audioObject);
 
                        var playAudioPromise = undefined;
 
                        if (dataControl === "stop") {
                            //audioObject.stop();
                            audioObject.pause();
                            console.log("has pause audioObject=%o", audioObject);
                        } else if (dataControl === "continue") {
                            // // audioObject.load();
                            // audioObject.play();
                            // // audioObject.continue();
                            // console.log("has load and play audioObject=%o", audioObject);
 
                            playAudioPromise = audioObject.play();
                        }
 
                        if (respJsonObj["data"]["audio"]) {
                            var audioDict = respJsonObj["data"]["audio"];
                            console.log("audioDict=%o", audioDict);
 
                            var audioName = audioDict["name"];
                            console.log("audioName=%o", audioName);
                            var audioSize = audioDict["size"];
                            console.log("audioSize=%o", audioSize);
                            var audioType = audioDict["contentType"];
                            console.log("audioType=%o", audioType);
                            var audioUrl = audioDict["url"];
                            console.log("audioUrl=%o", audioUrl);
 
                            var isAudioEmpty = (!audioName && !audioSize && !audioType && !audioUrl)
                            console.log("isAudioEmpty=%o", isAudioEmpty);
 
                            if (isAudioEmpty) {
                                // var pauseAudioResult = audioObject.pause();
                                // console.log("pauseAudioResult=%o", pauseAudioResult);
 
                                // audioElt.attr("data-info-att", "");
                                // $(".col-sm-offset-1").text("");
                            } else {
                                if (audioName) {
                                    audioElt.attr("data-info-att", audioName);
                                    $(".audio_player .col-sm-offset-1").text(audioName);
                                }
 
                                if (audioType) {
                                    $(".audio_player audio source").attr("type", audioType);
                                }
 
                                if (audioUrl) {
                                    $(".audio_player audio source").attr("src", audioUrl);
 
                                    audioObject.load();
                                    console.log("has load audioObject=%o", audioObject);
                                }
 
                                console.log("dataControl=%s,audioUrl=%s", dataControl, audioUrl);
 
                                if ((dataControl === "") && audioUrl) {
                                    playAudioPromise = audioObject.play();
                                } else if ((dataControl === "next") && (audioUrl)) {
                                    playAudioPromise = audioObject.play();
                                }
                            }
                        } else {
                            console.log("empty respJsonObj['data']['audio']=%o", respJsonObj["data"]["audio"]);
                        }
 
                        if (playAudioPromise !== undefined) {
                            playAudioPromise.then(() => {
                                // Auto-play started
                                console.log("Auto play audio started, playAudioPromise=%o", playAudioPromise);
 
                                //for debug
                                // showAudioPlayPreventedNotice();
                            }).catch(error => {
                                // Auto-play was prevented
                                // Show a UI element to let the user manually start playback
                                showAudioPlayPreventedNotice();
 
                                console.error("play audio promise error=%o", error);
                                //NotAllowedError: The request is not allowed by the user agent or the platform in the current context, possibly because the user denied permission.
                            });
                        }
                    }
 
                    var respTextAudioPromise = respTextAudioObj.play();
                    // console.log("respTextAudioPromise=%o", respTextAudioPromise);
                    if (respTextAudioPromise !== undefined) {
                        respTextAudioPromise.then(() => {
                            // Auto-play started
                            console.log("Auto play audio started, respTextAudioPromise=%o", respTextAudioPromise);
                        }).catch(error => {
                            // Auto-play was prevented
                            // Show a UI element to let the user manually start playback
 
                            console.error("play response text's audio promise error=%o", error);
                            //NotAllowedError: The request is not allowed by the user agent or the platform in the current context, possibly because the user denied permission.
                        });
                    }
                }
 
            },
            error : function(jqXHR, textStatus, errorThrown) {
                console.error("jqXHR=%o, textStatus=%s, errorThrown=%s", jqXHR, textStatus, errorThrown);
                var errDetail = String.format("status={0}\n\tstatusText={1}\n\tresponseText={2}", jqXHR.status, jqXHR.statusText, jqXHR.responseText);
                var errStr = String.format("GET: {0}\nERROR:\t{1}", fullQaUrl,  errDetail);
                // $('#response_text p').text(errStr);
                var responseError = $('#response_json').html('<pre><code class="html">' + errStr + "</code></pre>");
                console.log("responseError=%o", responseError);
                updateHighlight();
            }
        });
    }
 
    function showAudioPlayPreventedNotice(){
        console.log("showAudioPlayPreventedNotice");
        // var prevDisplayValue = $("#audio_play_prevented").css("display");
        // console.log("prevDisplayValue=%o", prevDisplayValue);
        // $("#audio_play_prevented").css({"display":"block"});
 
        var curAudioPlayPreventedNoticeEltHtml = $("#audio_play_prevented").html();
        console.log("curAudioPlayPreventedNoticeEltHtml=%o", curAudioPlayPreventedNoticeEltHtml);
        if (curAudioPlayPreventedNoticeEltHtml !== undefined) {
            console.log("already exist audio play prevented notice, so not insert again");
        } else {
            var audioPlayPreventedNoticeHtml = '<div id="audio_play_prevented" class="alert alert-warning alert-dismissible col-md-12 col-xs-12"><button type = "button" class="close" data-dismiss = "alert">x</button><strong>Notice:</strong> Auto play prevented, please manually click above play button to play</div>';
            console.log("audioPlayPreventedNoticeHtml=%o", audioPlayPreventedNoticeHtml);
            $(".audio_player").append(audioPlayPreventedNoticeHtml);    
        }
    }
 
    $("#clearInput").click(function(event){
        // event.preventDefault();
        console.log("event=%o", event);
        $('#inputRequest').val("");
 
        $('#response_json').html('<pre><code class="json">here will output response</code></pre>');
        updateHighlight();
    });
});
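Result:
After clicking the submit button, the back end generates a temporary mp3 file and returns it to the front end, which loads and plays it normally.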
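For reference, main.js reads `data.response.text`, `data.response.audioUrl`, `data.control` and `data.audio` from the `/qa` response. Here is a rough sketch of how the back end could assemble such a payload, pointing at the `/tmp/audio/<filename>` route. Both function names, the `"ok"` message, the success code and the file name `example.mp3` are assumptions made for illustration. The host and port values come from the config and main.js above.

```python
import json
from urllib.parse import quote


def build_tmp_audio_url(host, port, filename):
    """Build a URL for the '/tmp/audio/<filename>' route (hypothetical helper)."""
    return "http://%s:%d/tmp/audio/%s" % (host, port, quote(filename))


def build_qa_response(text, audio_url, control="", audio=None):
    """Assemble the fields main.js reads; 'code' and 'message' values are assumed."""
    return {
        "code": 200,
        "message": "ok",
        "data": {
            "response": {"text": text, "audioUrl": audio_url},
            "control": control,  # "", "stop", "continue" or "next" in main.js
            "audio": audio or {},  # name / size / contentType / url when present
        },
    }


if __name__ == "__main__":
    url = build_tmp_audio_url("127.0.0.1", 32851, "example.mp3")
    payload = build_qa_response("Here is the story of apple.", url)
    print(json.dumps(payload, indent=2))
```

Keeping `audioUrl` as a plain URL lets the front end set it directly as the `<source>` `src`, as main.js does.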

Please credit the source when reposting: 在路上 » [Solved] Using a suitable online speech synthesis API to convert text to speech
