Using iFlytek Speech Recognition: How Do the Frontend and Backend Interact?

The frontend uses uni-app; the backend uses Spring Boot.

Speech recognition uses iFlytek's short-utterance recognition service. Individual developers get a free trial period.

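A pitfall I ran into: I was compiling uni-app to a WeChat Mini Program, and the recorded audio files simply would not pass iFlytek recognition. After searching online, I found that FFmpeg can convert the audio into a format that is recognized. I do this conversion on the backend; the frontend only uploads the recording. The FFmpeg command is: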

ffmpeg -y -i <source file> -acodec pcm_s16le -f s16le -ac 1 -ar 16000 <output file>
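
To make the flags easier to read, here is a minimal sketch that builds the same command as a Java argument list, with one comment per option. The class name FfmpegCommand and the method build are hypothetical names chosen for illustration; they are not part of the project code.

```java
import java.util.Arrays;
import java.util.List;

class FfmpegCommand {
    /** Builds the argument list that converts an input file to 16 kHz mono 16-bit raw PCM. */
    static List<String> build(String inputPath, String outputPath) {
        return Arrays.asList(
                "ffmpeg",
                "-y",                   // overwrite the output file without asking
                "-i", inputPath,        // input file (format is detected automatically)
                "-acodec", "pcm_s16le", // signed 16-bit little-endian PCM samples
                "-f", "s16le",          // raw PCM output, no container header
                "-ac", "1",             // one channel (mono)
                "-ar", "16000",         // 16 kHz sample rate
                outputPath);
    }

    public static void main(String[] args) {
        // Print the command line that would be executed
        System.out.println(String.join(" ", build("input.mp3", "output.pcm")));
    }
}
```

Passing the arguments as a list (rather than one concatenated string) means file paths containing spaces need no extra quoting when the list is handed to ProcessBuilder.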

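Frontend (the key is the options setting. Several formats are available; I first used pcm, which worked when running locally but failed on a real device, so I switched back to mp3 and force the conversion with FFmpeg on the backend):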

			// Press-and-hold handler: start recording
			longpressBtn() {
				this.longPress = '2';
				this.countdown(60); // countdown
				clearInterval(init) // clear the timer
				recorderManager.onStop((res) => {
					this.tempFilePath = res.tempFilePath;
					this.recordingTimer(this.time);
				})
				const options = {
					sampleRate: 16000, // sample rate; valid values 8000/16000/44100
					numberOfChannels: 1, // number of channels; valid values 1/2
					encodeBitRate: 96000, // encoding bit rate
					format: 'mp3',
				}
				this.recordingTimer();
				// Listen for the recording start event (registered before start() so it is not missed)
				recorderManager.onStart((res) => {})
				recorderManager.start(options);
			},
			// Release handler: stop recording and upload
			touchendBtn() {
				this.longPress = '1';
				// onStop only registers a callback and returns nothing, so there is nothing to await
				recorderManager.onStop((res) => {
					this.tempFilePath = res.tempFilePath
					console.log(this.tempFilePath)
					this.uploadVoice(this.tempFilePath)
				})
				this.recordingTimer(this.time)
				recorderManager.stop()
			},
			// Upload the recording
			uploadVoice(tempFilePath) {
				const token = getToken()
				if (tempFilePath != '') {
					uni.uploadFile({
						url: this.endSideUrl1 + "/chat/voice", // your server's upload endpoint
						header: {
							"token": token,
							'Content-Type': 'multipart/form-data; charset=UTF-8'
						},
						filePath: tempFilePath,
						name: 'audioFile', // must match the backend parameter name
						success: (res) => {
							// Application-specific logic follows
							const question = JSON.parse(res.data).data
							console.log(question)
							if (!question) { // empty or missing result
								uni.showToast({
									title: "Didn't catch that. Please say it again",
									icon: 'none'
								})
							} else {
								this.qaList.push({
									content: question,
									className: "chatpdfRow chatpdfAsk"
								})
								this.scrollToBottom();
								// Streamed answer
								this.send(`{user:${token},message:${question}}`);
								let item = ({
									content: "",
									className: "chatpdfRow"
								})
								this.qaList.push(item);
								this.scrollToBottom();
								this.tempItem = item;
							}
						},
						fail: (e) => {
							console.error("Voice upload failed", e)
						}
					});
				}
			},

Backend code:

The recognizer is wrapped in a utility class:

package com.farm.util;

import com.iflytek.cloud.speech.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class IatTool {

    private static final Logger LOGGER = LoggerFactory.getLogger(IatTool.class);
    private StringBuilder curRet;
    private SpeechRecognizer recognizer;
    private CountDownLatch latch;

    public IatTool(String appId) {
        LOGGER.info("------Speech Utility init iat------");
        SpeechUtility.createUtility(SpeechConstant.APPID + "=" + appId);
    }

    // synchronized: curRet and latch are shared fields, so concurrent calls must not interleave
    public synchronized String RecognizePcmfileByte(byte[] buffer) {
        curRet = new StringBuilder();
        latch = new CountDownLatch(1); // count of 1: released when the final result arrives

        try {
            if (recognizer == null) {
                recognizer = SpeechRecognizer.createRecognizer();
                recognizer.setParameter(SpeechConstant.AUDIO_SOURCE, "-1");
                recognizer.setParameter(SpeechConstant.RESULT_TYPE, "plain");
            }

            recognizer.startListening(recListener);
            if (buffer == null || buffer.length == 0) {
                LOGGER.error("no audio available!");
                recognizer.cancel();
            } else {
                recognizer.writeAudio(buffer, 0, buffer.length);
                recognizer.stopListening();
                if (!latch.await(5000, TimeUnit.MILLISECONDS)) { // stop waiting after 5000 ms
                    LOGGER.warn("Recognition timed out.");
                }
                return curRet.toString();
            }

        } catch (Exception e) {
            LOGGER.error("Recognition failed", e);
        }
        return null;
    }

    private RecognizerListener recListener = new RecognizerListener() {
        @Override
        public void onBeginOfSpeech() {
            LOGGER.info("onBeginOfSpeech enter");
        }
        @Override
        public void onEndOfSpeech() {
            LOGGER.info("onEndOfSpeech enter");
        }

        @Override
        public void onResult(RecognizerResult results, boolean isLast) {
            LOGGER.info("onResult enter");

            String text = results.getResultString();
            curRet.append(text);

            if (isLast) {
                latch.countDown(); // wake the waiting thread
            }
        }

        @Override
        public void onVolumeChanged(int volume) {
            LOGGER.info("onVolumeChanged volume=" + volume);
        }

        @Override
        public void onError(SpeechError error) {
            LOGGER.error("onError enter");
            if (null != error) {
                LOGGER.error("onError Code: " + error.getErrorCode() + "," + error.getErrorDescription(true));
            }
            latch.countDown(); // release the waiting thread instead of letting it run into the timeout
        }

        @Override
        public void onEvent(int eventType, int arg1, int agr2, String msg) {
            LOGGER.info("onEvent enter");
        }
    };
}
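
The core trick in the class above is turning a callback-based API into a blocking call with CountDownLatch. The following is a self-contained sketch of that pattern only; it does not use the iFlytek SDK. The class BlockingCallback, its method await, and the thread that stands in for the recognizer are all hypothetical, written purely for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

class BlockingCallback {
    /**
     * Starts an asynchronous task and blocks until it reports a result.
     * Returns null if nothing arrives within timeoutMs or the thread is interrupted.
     */
    static String await(Consumer<Consumer<String>> asyncTask, long timeoutMs) {
        CountDownLatch latch = new CountDownLatch(1);
        AtomicReference<String> result = new AtomicReference<>();
        // Hand the task a callback that stores the result and releases the latch
        asyncTask.accept(text -> {
            result.set(text);
            latch.countDown();
        });
        try {
            if (!latch.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                return null; // timed out
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
        return result.get();
    }

    public static void main(String[] args) {
        // Stand-in for the SDK: delivers a result from another thread after 100 ms
        String text = await(callback -> new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException ignored) {
            }
            callback.accept("hello");
        }).start(), 5000);
        System.out.println(text);
    }
}
```

In this sketch the latch and the result holder are local variables, so each call has its own state. That is one way to avoid the cross-request interference that shared fields can cause when a single tool instance serves several requests at once.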

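Next, write the upload endpoint that the frontend calls. Its main job is to save the uploaded file: the format conversion is done by an external tool, which as far as I can tell cannot work on a stream, so I save the file to disk first. The Java code then invokes FFmpeg on the command line to convert the file and writes the result into the same directory. The converted file's extension could be changed to .pcm; I kept the uploaded file's .mp3 suffix and recognition still works (with -f s16le, FFmpeg writes raw PCM regardless of the extension). Once recognition finishes, both files are deleted to free the disk space.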

    // Fragment of the controller class. It needs java.io.File, java.io.IOException,
    // java.nio.file.Files, java.util.UUID, and Spring's MultipartFile and @PostMapping.
    IatTool iatTool = new IatTool("your own APPID here");

    @PostMapping(value = "/voice", produces = "application/json; charset=UTF-8")
    public R<String> RecognizePcmfileByte(MultipartFile audioFile) {
        String originalFilename = audioFile.getOriginalFilename();
        // Extract the file extension (empty if the name is missing or has no dot)
        int dot = originalFilename == null ? -1 : originalFilename.lastIndexOf('.');
        String suffix = dot >= 0 ? originalFilename.substring(dot) : "";
        // Create the working directory if it does not exist yet
        File dir = new File(pcmPath);
        if (!dir.exists()) {
            dir.mkdirs();
        }
        // Random UUID file names avoid collisions between concurrent uploads
        File oldFile = new File(pcmPath + UUID.randomUUID() + suffix);
        File newFile = new File(pcmPath + UUID.randomUUID() + suffix);
        try {
            // Save the upload to disk so that FFmpeg can read it
            audioFile.transferTo(oldFile);
            String[] command = {
                    "ffmpeg", "-loglevel", "quiet", "-y", "-i", oldFile.getPath(),
                    "-acodec", "pcm_s16le", "-f", "s16le", "-ac", "1", "-ar", "16000",
                    newFile.getPath()
            };
            // Run FFmpeg and wait for it to finish
            int exitCode = new ProcessBuilder(command).start().waitFor();
            if (exitCode != 0) {
                log.error("FFmpeg exited with code {}", exitCode);
                return R.error("Audio conversion failed");
            }
            // Read the whole converted file; a single InputStream.read() may return fewer bytes
            byte[] bytes = Files.readAllBytes(newFile.toPath());
            String question = iatTool.RecognizePcmfileByte(bytes);
            return R.success(question);
        } catch (IOException e) {
            log.error("Voice recognition failed", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            log.error("Interrupted while waiting for FFmpeg", e);
        } finally {
            // Delete both files, on success or failure, to free disk space
            if (oldFile.delete())
                log.info("Uploaded file deleted");
            if (newFile.delete())
                log.info("Converted file deleted");
        }
        return R.error("Unknown network error");
    }
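
One thing the endpoint above does not guard against is an FFmpeg process that hangs, which would block the request thread indefinitely. The following is a minimal sketch of running an external command with a time limit, assuming Java 8 or later. The class ProcessRunner, the method run, and the 10-second limit in main are hypothetical choices for illustration, not part of the original project.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

class ProcessRunner {
    /** Returns true only if the command starts, finishes within the timeout, and exits with code 0. */
    static boolean run(List<String> command, long timeoutSeconds) {
        Process process;
        try {
            // inheritIO: the child writes to this JVM's stdout/stderr, so it cannot block on a full pipe
            process = new ProcessBuilder(command).inheritIO().start();
        } catch (IOException e) {
            return false; // for example, the executable is not on the PATH
        }
        try {
            if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                process.destroyForcibly(); // still running after the timeout: kill it
                return false;
            }
            return process.exitValue() == 0;
        } catch (InterruptedException e) {
            process.destroyForcibly();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        boolean ok = run(Arrays.asList("ffmpeg", "-version"), 10);
        System.out.println(ok ? "ffmpeg is available" : "ffmpeg failed or is not installed");
    }
}
```

A suitable timeout depends on the longest recording the frontend allows; the value would need tuning for a real deployment.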
