Text-to-speech from a script

Question

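I currently use Wine and the Windows application TTSApp.exe for text-to-speech.

It is a GUI application and works well with SAPI 5 voices. With a few clicks I can select a text file and convert it to a WAV file.

However, I would like to do this differently.

I want to write a command-line script for the conversion and run it like this: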

wine ttsUtil.exe text.txt -voice=nick -output=speech.wav

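Is this possible under Wine? My favorite voice only runs on Windows, so I have to use Wine. I want to use ttsUtil.exe (the name does not matter) instead of the GUI TTSApp.exe.

I really need to automate this task, because I do not have time to click through the conversion for every small text file.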

Klaatu von Schlacker · Answer

If you are saying that the command you have works, then all you need to do is automate it, and you have a few options.

If you have a directory of files that you want to convert, you could do this:

#!/bin/bash
# Convert every file in the directory given as the first argument.
ARG=$1
for i in "${ARG}"/* ; do
    wine ttsUtil.exe "${i}" -voice=nick -output="${i}".wav
done

Save that as a file (call it ttsconvert.sh, perhaps) and make it executable:

chmod +x ttsconvert.sh

Now you can run the script, giving it the path to the directory of files you want to convert:

./ttsconvert.sh ~/path/to/stash/of/files
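
The loop above converts every file in the directory, so a second run would also pick up the .wav files from the first. Here is a rough sketch of a more selective variant, assuming the inputs end in .txt. ttsUtil.exe and its -voice and -output options are the asker's placeholder, not a real tool, and the names TTS_CMD, wav_name and convert_dir are made up for this sketch:

```shell
#!/bin/bash
# Sketch only: ttsUtil.exe and its options are the hypothetical tool from the
# question. TTS_CMD can be overridden (for example with "echo") for a dry run.
TTS_CMD=${TTS_CMD:-"wine ttsUtil.exe"}

# Print the output name for an input file: chapter1.txt -> chapter1.wav
wav_name() {
    printf '%s\n' "${1%.txt}.wav"
}

# Convert every .txt file in directory $1, skipping files already converted.
convert_dir() {
    local dir="$1" txt wav
    for txt in "$dir"/*.txt; do
        [ -e "$txt" ] || continue      # the glob matched nothing
        wav=$(wav_name "$txt")
        [ -e "$wav" ] && continue      # a .wav already exists, skip it
        # TTS_CMD is left unquoted on purpose so it splits into words
        $TTS_CMD "$txt" -voice=nick -output="$wav"
    done
}

# Do the work only when a directory argument is given.
if [ "$#" -gt 0 ]; then
    convert_dir "$1"
fi
```

Running it as TTS_CMD=echo ./ttsconvert.sh ~/path/to/stash/of/files prints the commands that would be run without starting Wine.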

If you need to do this per file, you can create your own launcher with a .desktop file.

For example, create a file called ttsconvert.desktop:

[Desktop Entry]
Version=1.0
Type=Application
Name=TTSConvert
Exec=wine ttsUtil.exe %f -voice=nick -output=speech.wav
Icon=multimedia-volume-control
MimeType=text/plain;

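As root, place this file in /usr/share/applications. You will then be able to open a text file with your new converter through the Open With option in the right-click menu. It will not notify you that it is working. You could write a more sophisticated script that uses GUI notifications, but that goes beyond this answer.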
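
As one possible take on the more sophisticated script mentioned above, here is a rough sketch of a wrapper that the Exec line could call instead of calling wine directly. It writes the .wav next to the input file and reports the result with notify-send when that is installed (on Ubuntu it comes with libnotify-bin). ttsUtil.exe and its options are still the hypothetical tool from the question, and the names TTS_CMD, notify and convert_one are made up for this sketch:

```shell
#!/bin/bash
# Sketch only: ttsUtil.exe and its options are the hypothetical tool from the
# question. TTS_CMD can be overridden for testing.
TTS_CMD=${TTS_CMD:-"wine ttsUtil.exe"}

# Show a desktop notification if notify-send is installed, else use stderr.
notify() {
    if command -v notify-send >/dev/null 2>&1; then
        notify-send "TTSConvert" "$1"
    else
        printf 'TTSConvert: %s\n' "$1" >&2
    fi
}

# Convert one text file, writing the .wav next to it, and report the result.
convert_one() {
    local txt="$1"
    local wav="${txt%.txt}.wav"
    # TTS_CMD is left unquoted on purpose so it splits into words
    if $TTS_CMD "$txt" -voice=nick -output="$wav"; then
        notify "Finished: $wav"
    else
        notify "Conversion failed: $txt"
        return 1
    fi
}

# Do the work only when a file argument is given.
if [ "$#" -gt 0 ]; then
    convert_one "$1"
fi
```

If this were saved as, say, /usr/local/bin/ttsconvert-one.sh (a hypothetical path), the launcher line would become Exec=/usr/local/bin/ttsconvert-one.sh %f.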

Mircea Vutcovici · Answer

I would try a SAPI 5 command-line utility such as: http://www.nirsoft.net/articles/speak_from_command_line.html

Also try: http://jampal.sourceforge.net/ptts.html

user147573 · Answer

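This is a kludge, but it has worked reliably. It scripts TTSApp.exe inside a Xephyr window by simulating mouse and keyboard input.

Install the Ubuntu packages: xserver-xephyr metacity xdotool libav-tools

If you want something other than the default speaking rate, add this at the top of each text file: <prosody rate="medium"><prosody rate="+36%"> and the matching closing tags at the end: </prosody></prosody> (many more XML options are described in chapters 8 and 9 of the AT&T Natural Voices System Developer's Guide).

To select the voice you want, replace the i in key --delay 100 i in the script with the key that TTSApp.exe needs.

If you want the source file deleted after a successful conversion, uncomment the unlink() at the end of the script.

Run the script in a manner such as: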
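
Adding those tags by hand gets tedious with many files. Below is a minimal sketch of a helper that wraps a file's text in the same nested tags and prints the result. The function name wrap_prosody is made up for this sketch, the default rate is just the example value above, and it does not escape characters such as & or < that may be special in XML input:

```shell
#!/bin/bash
# Wrap the text of file $1 in nested <prosody> tags and print it to stdout.
# $2 is the inner rate; "+36%" is only the example value from the answer.
wrap_prosody() {
    local file="$1" rate="${2:-+36%}"
    printf '<prosody rate="medium"><prosody rate="%s">\n' "$rate"
    cat -- "$file"
    printf '\n</prosody></prosody>\n'
}

# Do the work only when a file argument is given.
if [ "$#" -gt 0 ]; then
    wrap_prosody "$@"
fi
```

For example, if it is saved as wrap_prosody.sh, then ./wrap_prosody.sh chapter1.txt "+20%" > chapter1-fast.txt writes a tagged copy and leaves the original untouched.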

find . -name 'chapter*.txt' -print0 |xargs -0 txt2ogg

Here is the Perl script txt2ogg (don't forget to chmod +x it):

#!/usr/bin/perl -CS -w
#
use strict;
use warnings;
use utf8;
#
my $homeDir=$ENV{HOME};
$homeDir .= '/' if(substr($homeDir,length($homeDir)-1,1) ne '/');
my $oldDir = `pwd`;
chomp($oldDir);
$oldDir .= '/' if(substr($oldDir,length($oldDir)-1,1) ne '/');
chdir($homeDir) or die($!);
system( q(Xephyr :4 -screen 600x480 >/dev/null 2>/dev/null &) ); # using the user's display works until you try to get other work done or the screensaver starts
system( q(DISPLAY=:4 metacity >/dev/null 2>/dev/null &) ); # xdotool needs a window manager
foreach(@ARGV) {
  s|^\./||;
  my $thisArg = $_;
  my $ttsIn = $oldDir.$thisArg; # make path absolute
  (my $ttsOut = $ttsIn) =~ s|\.[^\./]*$||; # strip file extension
  $ttsOut .= '.ogg';
  my $attempt = 0;
  my $errorCodes = ""; # list of codes for recoverable errors
  my $closeDialogCmd = q(export DISPLAY=:4; xdotool search --name "File Saved" windowactivate --sync %@ key space 2>/dev/null);
  my $ExitCmd = q(export DISPLAY=:4; xdotool search --name "SAPI5 TTSAPP" windowactivate --sync %@ windowkill 2>/dev/null);
  while(1) {
    print("\n$thisArg ... ");
    unlink("ttsin");
    unlink("ttsout.wav");
    unlink("ttsout.ogg");
    symlink($ttsIn,"ttsin") or die($!);
    # xdotool is sometimes too fast, even with --delay 100, so BackSpace makes sure the full name gets entered
    my $stallLimit = 10;
    my $seconds = 0;
    my $priorWavSize = 0;
    my $stalledTime = 0;
    my $wavSize = 0;
    # start TTSApp.exe in the background
    system( q(DISPLAY=:4 wine "C:\Program Files\eSpeak\TTSApp.exe" 2>/dev/null >/dev/null &) );
    # in TTSApp.exe, enable XML; select proper voice; open "ttsin"; and save as "ttsout.wav"
    system( q(export DISPLAY=:4; xdotool search --sync --name "SAPI5 TTSAPP" windowactivate --sync %@ mousemove --window %@ 36 339 click 1 mousemove --window %@ 426 233 click 1 key --delay 100 i mousemove --window %@ 500 37 click 1 key --delay 100 BackSpace BackSpace t t s i n Return mousemove --window %@ 500 288 click 1 key --delay 100 BackSpace BackSpace t t s o u t Return 2>/dev/null >/dev/null) );
    while(1) { # wait for "File Saved" dialog
      sleep(2);
      $seconds += 2;
      # check if "File Saved" dialog exists yet
      last if(system( q(export DISPLAY=:4; xdotool search --name "File Saved" >/dev/null) ) == 0);
      my $wavSizeCmd = q(stat --printf '%s' ttsout.wav 2>/dev/null);
      $wavSize = `$wavSizeCmd`;
      $wavSize = 0 if(!defined($wavSize) or length($wavSize) == 0);
      if($wavSize <= $priorWavSize) {
        $stalledTime += 2;
        if($stalledTime >= $stallLimit) {
          $errorCodes .= " 282"; # TTSApp.exe not responding
          if(system($ExitCmd) != 0) { # kill TTSApp.exe and try again
            $errorCodes .= " 443"; # TTSApp.exe still not responding
            sleep(2);
            system($ExitCmd);
          }
          last;
        }
      } else {
        $stalledTime = 0;
      }
      $priorWavSize = $wavSize;
      print("\n$thisArg ...$wavSize bytes");
    }
    if(($stalledTime < $stallLimit)) { # above loop not stalled
      if($wavSize == 11639) {
        $errorCodes .= " 639"; # size of .wav is exactly the size for "Enter text you wish spoken here" in the default voice
      } else {
        last; # success
      }
    }
    if($attempt++ >= 5) {
      die("unable to process file with TTSApp.exe");
    }
  }
  # close "File Saved" dialog and exit TTSApp.exe
  if(system($closeDialogCmd) != 0) {
    $errorCodes .= " 934"; # closing dialog failed
    sleep(2);
    if(system($closeDialogCmd) != 0) {
      $errorCodes .= " 818"; # closing dialog failed again
      sleep(2);
      system($closeDialogCmd);
    }
  }
  if(system($ExitCmd) != 0) {
    $errorCodes .= " 245"; # closing TTSApp.exe failed
    sleep(2);
    if(system($ExitCmd) != 0) {
      $errorCodes .= " 871"; # closing TTSApp.exe failed again
      sleep(2);
      system($ExitCmd);
    }
  }
  print("\n$thisArg ... converting to .ogg ");
  # -qscale 0 (24Kbps) has noticeable whisper-like overtones and 1 (30Kbps) and 2 (35Kbps) are quite close, so I decided on -qscale 1
  system('cat ttsout.wav |avconv -i pipe:0 -codec:a libvorbis -qscale 1 ttsout.ogg 2>/dev/null >/dev/null') == 0 or die($!);
  unlink("ttsin");
  unlink("ttsout.wav");
  rename("ttsout.ogg",$ttsOut) or die($!);
  if(length($errorCodes) == 0) {
    print("\n$thisArg ... done \n");
  } else {
    print("\n$thisArg ... done (recovered from: $errorCodes) \n");
  }
  #unlink($ttsIn); # delete original only after .ogg is in place
}

YoMismo · Answer

Have you seen this? It is a command-line program that appears to run on Windows, so it should be easy to launch from a batch script.

moriab · Answer

My recommendation is to drop Wine and use the Linux program pico2wave.

On Ubuntu 14.04, pico2wave is part of libttspico-utils.

The command looks like this:

pico2wave --wave=test.wav "$(cat filename.txt)"
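
To convert several files at once, the same command can go in a loop. This is a minimal sketch, assuming the inputs end in .txt and are short enough to pass as a single argument. The names PICO_CMD and pico_convert are made up for this sketch:

```shell
#!/bin/bash
# PICO_CMD can be overridden (for example with "echo") for a dry run.
PICO_CMD=${PICO_CMD:-pico2wave}

# Convert each given text file to a .wav file with the same base name.
pico_convert() {
    local txt
    for txt in "$@"; do
        $PICO_CMD --wave="${txt%.txt}.wav" "$(cat -- "$txt")" || return 1
    done
}

# Do the work only when file arguments are given.
if [ "$#" -gt 0 ]; then
    pico_convert "$@"
fi
```

If it is saved as pico-batch.sh (a hypothetical name), ./pico-batch.sh chapter*.txt writes chapter1.wav, chapter2.wav and so on next to the text files.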