Problem with Speech Dispatcher + Piper reading Chinese text

I'm currently using speech-dispatcher + Piper (via Pied, which is distributed as a Flatpak) as the system speech synthesizer.

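However, I've run into a problem: speech-dispatcher (using the Chinese voice model zh_CN-huayan-medium.onnx) skips the Chinese characters when reading Chinese text and only picks up letters and digits.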

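After some searching I did find a fix: I added GenericLanguage "zh" "zh-CN" "utf-8" to ~/.config/speech-dispatcher/modules/piper.conf and restarted the speech-dispatcher service. After that, running spd-say "一二三四" in a terminal emulator produces normal speech, and with the GNOME screen reader enabled, the Chinese parts of the system UI are read correctly as well.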

But when speech-dispatcher is used from the Flatpak versions of Firefox and Foliate, the same problem still occurs: only letters and digits are recognized.

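It looks like the Flatpak versions of Firefox and Foliate use the system speech-dispatcher service through speech-dispatcher.socket, and judging by the service status, the service has already read the configuration in ~/.config/speech-dispatcher/modules/piper.conf.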

I'm not sure which step is going wrong. I initially suspected the Flatpak sandbox, but Firefox installed via pacman has exactly the same problem.
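
One way to narrow this down from a terminal might be to speak the same mixed phrase under two explicit language tags. The sketch below relies on spd-say's -l/--language and -w/--wait options; the test phrase and the helper name say_as are made up for illustration, and the idea that the failing clients request a non-Chinese language is only a guess, not something confirmed here.

```shell
#!/bin/sh
# Mixed phrase: if only "abc" and "123" are heard, the Chinese characters were dropped.
PHRASE="abc 一二三四 123"

# say_as LANG: speak $PHRASE with an explicit language tag (hypothetical helper).
say_as() {
    if command -v spd-say >/dev/null 2>&1; then
        # -l sets the language for this message, -w waits until it has been spoken
        spd-say -w -l "$1" "$PHRASE"
    else
        echo "spd-say not found; would run: spd-say -w -l $1 \"$PHRASE\""
    fi
}

say_as zh-CN   # same language tag that GenericLanguage declares in piper.conf
say_as en      # assumption: stands in for a client that does not request Chinese
```

If the second call drops the Chinese characters while the first reads them, the language tag sent by the client would be a plausible suspect rather than the sandbox.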


Relevant information:

~/.config/speech-dispatcher/speechd.conf

###
### THIS CONFIG WAS GENERATED BY PIED
###

SymbolsPreproc "char"
SymbolsPreprocFile "gender-neutral.dic"
SymbolsPreprocFile "font-variants.dic"
SymbolsPreprocFile "symbols.dic"
SymbolsPreprocFile "emojis.dic"
SymbolsPreprocFile "orca.dic"
SymbolsPreprocFile "orca-chars.dic"
AddModule "piper" "sd_generic" "piper.conf"
DefaultModule "piper"
LanguageDefaultModule "zh"  "piper"
LanguageDefaultModule "zh-CN"  "piper"
Include "clients/*.conf"

~/.config/speech-dispatcher/modules/piper.conf

GenericExecuteSynth "if command -v sox > /dev/null; then\
        PROCESS=\'sox -r 22050 -c 1 -b 16 -e signed-integer -t raw - -t wav - tempo $RATE pitch $PITCH norm\';\
        if command -v paplay > /dev/null; then\
            OUTPUT=\'$PLAY_COMMAND\';\
        else\
            OUTPUT=\'aplay\';\
        fi;\
    elif command -v paplay > /dev/null; then\
        PROCESS=\'cat\'; OUTPUT=\'$PLAY_COMMAND --raw --channels 1 --rate 22050\';\
    else\
        PROCESS=\'cat\'; OUTPUT=\'aplay -t raw -c 1 -r 22050 -f S16_LE\';\
    fi;\
    echo \'$DATA\' | /home/shaan/.var/app/com.mikeasoft.pied/data/pied/piper/piper --model /home/shaan/.var/app/com.mikeasoft.pied/data/pied/models/zh_CN-huayan-medium.onnx -s 0 --output_raw | $PROCESS | $OUTPUT;"
GenericLanguage "zh" "zh-CN" "utf-8"
GenericRateAdd 1
GenericPitchAdd 1
GenericVolumeAdd 1
GenericRateMultiply 1
GenericPitchMultiply 1000
AddVoice "zh_CN" "MALE1" "Piper"

systemctl --user status speech-dispatcher.socket

● speech-dispatcher.socket - Speech Dispatcher Socket
     Loaded: loaded (/usr/lib/systemd/user/speech-dispatcher.socket; enabled; preset: enabled)
     Active: active (running) since Fri 2025-11-14 12:32:34 CST; 10h ago
 Invocation: 9a4d8fe80e744c909544d5aca963b655
   Triggers: ● speech-dispatcher.service
     Listen: /run/user/1000/speech-dispatcher/speechd.sock (Stream)

11月 14 12:32:34 cyan systemd[5306]: Listening on Speech Dispatcher Socket.

systemctl --user status speech-dispatcher.service

● speech-dispatcher.service - Common interface to speech synthesizers
     Loaded: loaded (/usr/lib/systemd/user/speech-dispatcher.service; static)
     Active: active (running) since Fri 2025-11-14 22:43:09 CST; 21min ago
 Invocation: e2546061208f4ba9a3ec5bfb55944829
TriggeredBy: ● speech-dispatcher.socket
   Main PID: 429325 (speech-dispatch)
      Tasks: 12 (limit: 37210)
     Memory: 7.9M (peak: 148.2M)
        CPU: 18.391s
     CGroup: /user.slice/user-1000.slice/user@1000.service/app.slice/speech-dispatcher.service
             ├─429325 /usr/bin/speech-dispatcher -s -t 0
             ├─429345 /usr/lib/speech-dispatcher/speech-dispatcher-modules/sd_generic /home/shaan/.config/speech-dispatcher/modules/piper.conf
             └─429349 /usr/lib/speech-dispatcher/speech-dispatcher-modules/sd_dummy /home/shaan/.config/speech-dispatcher/modules/

11月 14 22:43:09 cyan systemd[5306]: Stopped Common interface to speech synthesizers.
11月 14 22:43:09 cyan systemd[5306]: speech-dispatcher.service: Consumed 24.601s CPU time, 213.9M memory peak.
11月 14 22:43:09 cyan systemd[5306]: Started Common interface to speech synthesizers.
11月 14 22:43:10 cyan speech-dispatcher[429325]: [Fri Nov 14 22:43:10 2025 : 12693] speechd: Speech Dispatcher 0.12.1 starting
11月 14 22:43:10 cyan speech-dispatcher[429325]: [Fri Nov 14 22:43:10 2025 : 12800] speechd:  Configuration has been read from "/home/shaan/.config/speech-dispatcher/speechd.conf"

Solved: adding a DefaultLanguage zh-CN entry to ~/.config/speech-dispatcher/speechd.conf fixed it. And yet the documentation clearly says:

# The Default language with which to speak
# Note that the spd-say client in particular always sets the language to its
# current locale language, so this particular client will never pick this
# configuration.
(That is, the spd-say client always sets its language to the current locale's language, so it never uses the setting configured here.)

My environment already has LANG=zh_CN.UTF-8, so why do I still have to set the language to Chinese by hand in the config file? (annoyed)
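
For reference, the change amounts to one extra line in the Pied-generated speechd.conf. This is a minimal sketch: the comment wording is mine, the placement among the other top-level options is an assumption, and the explanation that Firefox and Foliate do not send a language of their own is a guess based on the quoted note about spd-say.

```
# ~/.config/speech-dispatcher/speechd.conf
# Language used when a client does not set one itself (assumed reason for the fix)
DefaultLanguage zh-CN
```

The service has to be restarted afterwards, for example with systemctl --user restart speech-dispatcher.service. Whether Pied overwrites this file when it regenerates its config has not been checked here.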

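When reading English, zh_CN-huayan-medium.onnx spells everything out letter by letter with pinyin-style pronunciation, lol. The intonation sounds more natural than espeak-NG, but it's still a bit odd.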