How to pipe multiple readable streams, from multiple API requests, to a single writable stream?

Question

- Desired Behaviour
- Actual Behaviour
- What I've Tried
- Steps To Reproduce
- Research

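Desired Behaviour

Pipe multiple readable streams, received from multiple API requests, to a single writable stream.

The API responses are from ibm-watson's textToSpeech.synthesize() method.

Multiple requests are required because the service has a _5KB_ limit on text input.

A string of _18KB_, for example, therefore needs four requests to complete.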
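The request count follows from the limit: ceil(18 / 5) = 4. Below is a minimal sketch of how the text could be split before the requests are made. It assumes the limit means 5 × 1024 bytes of UTF-8 and that splitting on whitespace is acceptable; `splitIntoChunks` and `MAX_BYTES` are hypothetical names, not part of ibm-watson.

```javascript
// Assumption: the service limit is 5 KB measured in UTF-8 bytes.
const MAX_BYTES = 5 * 1024;

// Greedily pack whitespace-separated words into chunks that each stay
// within maxBytes. Words are re-joined with a single space.
function splitIntoChunks(text, maxBytes) {
  const chunks = [];
  let current = '';
  for (const word of text.split(/\s+/).filter(Boolean)) {
    if (Buffer.byteLength(word, 'utf8') > maxBytes) {
      throw new Error('a single word exceeds the byte limit');
    }
    const candidate = current === '' ? word : current + ' ' + word;
    if (Buffer.byteLength(candidate, 'utf8') > maxBytes) {
      chunks.push(current); // current chunk is full, start a new one
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current !== '') {
    chunks.push(current);
  }
  return chunks;
}

// Usage: roughly 18 KB of text ends up in four chunks.
const demoText = Array.from({ length: 3600 }, () => 'word').join(' ');
const demoChunks = splitIntoChunks(demoText, MAX_BYTES);
console.log(demoChunks.length); // 4
```

Measuring with Buffer.byteLength rather than string length matters when the text contains multi-byte characters.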

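Actual Behaviour

The file produced by the writable stream is incomplete and garbled.

The application seems to "hang".

When I try to open the incomplete _.mp3_ file in an audio player, it says the file is corrupted.

Opening and closing the file seems to increase its size, as if opening the file somehow prompts more data to flow into it.

The undesirable behaviour is more apparent with larger inputs, e.g. four strings of 4000 bytes or less.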

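What I've Tried

I've tried several ways of piping the readable streams to either a single writable stream or multiple writable streams, using the npm packages combined-stream, combined-stream2, multistream and archiver, and they all result in incomplete files. My last attempt doesn't use any packages and is shown in the _Steps To Reproduce_ section below.

I am therefore questioning each part of my application logic:

01. What is the response type of a watson text to speech API request?

According to the text to speech docs, the API response type is:

_Response type: NodeJS.ReadableStream|FileObject|Buffer_

I am confused that the response type can be one of three possible things.

In all my attempts, I have been assuming it is a _readable stream_.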
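One way to stop depending on that assumption is to normalise whatever comes back before piping it. This is only a sketch: `toReadable` is a hypothetical helper, it covers the Buffer and stream cases, and it deliberately rejects anything else (including the FileObject case, whose shape I have not checked).

```javascript
const { Readable } = require('stream');

// Hypothetical helper: accept a Buffer or a readable stream and always
// return something that can be piped.
function toReadable(audio) {
  if (Buffer.isBuffer(audio)) {
    // Wrap the buffer in an array so it is emitted as one chunk.
    return Readable.from([audio]);
  }
  if (audio && typeof audio.pipe === 'function') {
    return audio; // already a readable stream, use it unchanged
  }
  throw new TypeError('unsupported response type');
}

// Usage: a Buffer response now exposes pipe() like a stream does.
console.log(typeof toReadable(Buffer.from('abc')).pipe); // function
```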

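02. Can I make multiple API requests in a map function?

03. Can I wrap each request within a promise() and resolve the response?

04. Can I assign the resulting array to a promises variable?

05. Can I declare var audio_files = await Promise.all(promises)?

06. After this declaration, are all responses "finished"?

07. How do I correctly pipe each response to a writable stream?

08. How do I detect when all pipes have finished, so I can send the file back to the client?

For questions 2 to 6, I am assuming the answer is "yes".

I think my failures relate to questions 7 and 8.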
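On question 06 specifically, a small experiment suggests the answer may be "no": a promise that resolves with a stream settles when the stream object is handed over, not when its data has been read. The sketch below uses `fakeRequest`, a hypothetical stand-in for the API call, so it shows general stream behaviour rather than anything specific to ibm-watson.

```javascript
const { Readable } = require('stream');

// Hypothetical stand-in for an API call that resolves with a readable
// stream. In the question this role is played by synthesize().
function fakeRequest(text) {
  return Promise.resolve(Readable.from([Buffer.from(text)]));
}

async function countEndedStreams() {
  let ended = 0;
  const streams = await Promise.all(['one', 'two'].map(fakeRequest));
  streams.forEach((stream) => stream.once('end', () => { ended += 1; }));

  // Give pending callbacks a chance to run. Nothing is consuming the
  // streams yet, so none of them should have emitted 'end'.
  await new Promise((resolve) => setImmediate(resolve));
  const endedBefore = ended;

  // Read every stream to completion; 'end' fires only after that.
  for (const stream of streams) {
    for await (const chunk of stream) { void chunk; }
  }
  return { endedBefore, endedAfter: ended };
}

countEndedStreams().then((result) => console.log(result));
```

If that holds for the real responses too, the piping step still has to wait for each stream's own 'end' event after Promise.all has resolved.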

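Steps To Reproduce

You can test this code with an array of four randomly generated text strings with respective sizes of _3975_, _3863_, _3974_ and _3629_ bytes. A Pastebin of that array is linked in the code below.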

    // route handler
    app.route("/api/:api_version/tts")
      .get(api_tts_get);

    // route handler middleware
    const api_tts_get = async (req, res) => {

      var query_parameters = req.query;

      var file_name = query_parameters.file_name;
      var text_string_array = text_string_array; // eg: https://Pastebin.com/raw/JkK8ehwV

      var absolute_path = path.join(__dirname, "/src/temp_audio/", file_name);
      var relative_path = path.join("./src/temp_audio/", file_name); // path relative to server root

      // for each string in an array, send it to the watson api
      var promises = text_string_array.map(text_string => {

        return new Promise((resolve, reject) => {

          // credentials
          var textToSpeech = new TextToSpeechV1({
            iam_apikey: iam_apikey,
            url: tts_service_url
          });

          // params
          var synthesizeParams = {
            text: text_string,
            accept: 'audio/mp3',
            voice: 'en-US_AllisonV3Voice'
          };

          // make request
          textToSpeech.synthesize(synthesizeParams, (err, audio) => {
            if (err) {
              console.log("synthesize - an error occurred: ");
              return reject(err);
            }
            resolve(audio);
          });

        });
      });

      try {
        // wait for all responses
        var audio_files = await Promise.all(promises);
        var audio_files_length = audio_files.length;

        var write_stream = fs.createWriteStream(`${relative_path}.mp3`);

        audio_files.forEach((audio, index) => {

          // if this is the last value in the array,
          // pipe it to write_stream,
          // when finished, the readable stream will emit 'end'
          // then the .end() method will be called on write_stream
          // which will trigger the 'finished' event on the write_stream
          if (index == audio_files_length - 1) {
            audio.pipe(write_stream);
          }
          // if not the last value in the array,
          // pipe to write_stream and leave open
          else {
            audio.pipe(write_stream, { end: false });
          }

        });

        write_stream.on('finish', function() {

          // download the file (using absolute_path)
          res.download(`${absolute_path}.mp3`, (err) => {
            if (err) {
              console.log(err);
            }
            // delete the file (using relative_path)
            fs.unlink(`${relative_path}.mp3`, (err) => {
              if (err) {
                console.log(err);
              }
            });
          });

        });

      } catch (err) {
        console.log("there was an error getting tts");
        console.log(err);
      }

    }

The official example shows:

    textToSpeech.synthesize(synthesizeParams)
      .then(audio => {
        audio.pipe(fs.createWriteStream('hello_world.mp3'));
      })
      .catch(err => {
        console.log('error:', err);
      });

which seems to work fine for single requests, but not for multiple requests, as far as I can tell.

Research

Concerning readable and writable streams, readable stream modes (flowing and paused), the 'data', 'end', 'drain' and 'finish' events, pipe(), fs.createReadStream() and fs.createWriteStream():

Almost all Node.js applications, no matter how simple, use streams in some manner...

    const http = require('http');

    const server = http.createServer((req, res) => {
      // `req` is an http.IncomingMessage, which is a Readable Stream
      // `res` is an http.ServerResponse, which is a Writable Stream

      let body = '';
      // get the data as utf8 strings.
      // if an encoding is not set, Buffer objects will be received.
      req.setEncoding('utf8');

      // readable streams emit 'data' events once a listener is added
      req.on('data', (chunk) => {
        body += chunk;
      });

      // the 'end' event indicates that the entire body has been received
      req.on('end', () => {
        try {
          const data = JSON.parse(body);
          // write back something interesting to the user:
          res.write(typeof data);
          res.end();
        } catch (er) {
          // uh oh! bad json!
          res.statusCode = 400;
          return res.end(`error: ${er.message}`);
        }
      });
    });

https://nodejs.org/api/stream.html#stream_api_for_stream_consumers

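Readable streams have two main modes that affect the way we can consume them...they can be either in the paused mode or in the flowing mode. All readable streams start in the paused mode by default, but they can easily be switched to flowing and back to paused when needed...just adding a data event handler switches a paused stream into flowing mode, and removing the data event handler switches the stream back to paused mode.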

https://www.freecodecamp.org/news/node-js-streams-everything-you-need-to-know-c9141306be9
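The mode switches can be observed through the stream's readableFlowing property. This is a small sketch using a dummy stream; `flowingStates` is a hypothetical name. One caveat on the quote above: the Node.js documentation says that removing 'data' handlers does not automatically pause a stream, so the sketch pauses explicitly with pause().

```javascript
const { Readable } = require('stream');

// Record readableFlowing at three points in a stream's life.
function flowingStates() {
  const stream = new Readable({ read() {} }); // dummy source, never pushes data
  const states = [stream.readableFlowing];    // no consumer attached yet
  stream.on('data', () => {});                // attaching a 'data' handler starts the flow
  states.push(stream.readableFlowing);
  stream.pause();                             // explicit switch back to paused
  states.push(stream.readableFlowing);
  return states;
}

console.log(flowingStates()); // [ null, true, false ]
```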

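Here's a list of the important events and functions that can be used with readable and writable streams.

The most important events on a readable stream are:

The data event, which is emitted whenever the stream passes a chunk of data to the consumer. The end event, which is emitted when there is no more data to be consumed from the stream.

The most important events on a writable stream are:

The drain event, which is a signal that the writable stream can receive more data. The finish event, which is emitted when all data has been flushed to the underlying system.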

https://www.freecodecamp.org/news/node-js-streams-everything-you-need-to-know-c9141306be9
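To see three of those events in sequence, here is a minimal sketch that pipes a two-chunk in-memory source into a writable that discards its input. `recordEvents` is a hypothetical name. The 'drain' event does not appear because the destination never fills its buffer in this setup.

```javascript
const { Readable, Writable } = require('stream');

// Pipe a tiny source into a sink and record the order of events.
function recordEvents() {
  return new Promise((resolve, reject) => {
    const events = [];
    const source = Readable.from([Buffer.from('a'), Buffer.from('b')]);
    const dest = new Writable({
      write(chunk, encoding, callback) { callback(); } // discard the data
    });

    source.on('data', () => events.push('data'));
    source.on('end', () => events.push('end'));
    dest.on('finish', () => {
      events.push('finish');
      resolve(events);
    });
    source.on('error', reject);
    dest.on('error', reject);

    source.pipe(dest);
  });
}

recordEvents().then((events) => console.log(events.join(' -> ')));
// data -> data -> end -> finish
```

The point relevant to the question is that 'finish' on the destination comes after 'end' on the source, so 'finish' is the event to wait for before sending a file.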

.pipe() takes care of listening for 'data' and 'end' events from the fs.createReadStream().

https://github.com/substack/stream-handbook#why-you-should-use-streams

.pipe() is just a function that takes a readable source stream src and hooks the output to a destination writable stream dst.

https://github.com/substack/stream-handbook#pipe

The return value of the pipe() method is the destination stream.

https://flaviocopes.com/nodejs-streams/#pipe

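By default, stream.end() is called on the destination Writable stream when the source Readable stream emits _'end'_, so that the destination is no longer writable. To disable this default behaviour, the end option can be passed as false, causing the destination stream to remain open.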

https://nodejs.org/api/stream.html#stream_readable_pipe_destination_options
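The effect of the end option can be checked with the destination's writableEnded property. This is a sketch with in-memory streams; `destinationEndedAfterPipe` is a hypothetical name, and writableEnded requires a reasonably recent Node.js version (12.9 or later, as far as I know).

```javascript
const { Readable, Writable } = require('stream');

// Pipe a short source into a fresh destination and report whether
// end() had been called on the destination once the source finished.
function destinationEndedAfterPipe(pipeOptions) {
  return new Promise((resolve) => {
    const source = Readable.from([Buffer.from('x')]);
    const dest = new Writable({
      write(chunk, encoding, callback) { callback(); }
    });
    source.pipe(dest, pipeOptions);
    // Check on a later tick, after pipe's own 'end' handling has run.
    source.on('end', () => {
      setImmediate(() => resolve(dest.writableEnded));
    });
  });
}

destinationEndedAfterPipe(undefined)
  .then((ended) => console.log('default:', ended));
destinationEndedAfterPipe({ end: false })
  .then((ended) => console.log('end: false:', ended));
```

With the default options the destination is ended by the first source that finishes, which is why only the last source in a sequence should be allowed to end it.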

The _'finish'_ event is emitted after the stream.end() method has been called, and all data has been flushed to the underlying system.

    const writer = getWritableStreamSomehow();
    for (let i = 0; i < 100; i++) {
      writer.write(`hello, #${i}!\n`);
    }
    writer.end('This is the end\n');
    writer.on('finish', () => {
      console.log('All writes are now complete.');
    });

https://nodejs.org/api/stream.html#stream_event_finish

If you're trying to read multiple files and pipe them to a writable stream, you have to pipe each one to the writable stream and pass _end: false_ when doing it, because by default a readable stream ends the writable stream when there is no more data to be read. Here's an example:

    var ws = fs.createWriteStream('output.pdf');

    fs.createReadStream('pdf-sample1.pdf').pipe(ws, { end: false });
    fs.createReadStream('pdf-sample2.pdf').pipe(ws, { end: false });
    fs.createReadStream('pdf-sample3.pdf').pipe(ws);

https://stackoverflow.com/a/30916248

You want to add the second read into an event listener for the first read to finish...

    var a = fs.createReadStream('a');
    var b = fs.createReadStream('b');
    var c = fs.createWriteStream('c');
    a.pipe(c, {end:false});
    a.on('end', function() {
      b.pipe(c);
    });

https://stackoverflow.com/a/28033554
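The same idea can be generalised to any number of sources by waiting for each source's 'end' before piping the next one. This is a minimal sketch with in-memory streams, not a tested fix for the Watson case. `pipeSequentially` and `demoConcat` are hypothetical names, and events.once needs Node.js 11.13 or later.

```javascript
const { Readable, Writable } = require('stream');
const { once } = require('events');

// Pipe each source into dest one at a time, keeping dest open between
// sources, then end dest and wait until everything is flushed.
async function pipeSequentially(sources, dest) {
  for (const source of sources) {
    source.pipe(dest, { end: false });
    await once(source, 'end'); // rejects if the source emits 'error'
  }
  const finished = once(dest, 'finish'); // listen before calling end()
  dest.end();
  await finished;
}

// Usage with three in-memory sources and a collecting destination.
function demoConcat() {
  const received = [];
  const dest = new Writable({
    write(chunk, encoding, callback) {
      received.push(chunk);
      callback();
    }
  });
  const sources = ['aaa', 'bbb', 'ccc'].map((text) => Readable.from([Buffer.from(text)]));
  return pipeSequentially(sources, dest)
    .then(() => Buffer.concat(received).toString());
}

demoConcat().then((text) => console.log(text)); // aaabbbccc
```

Piping all sources at once with end: false lets their chunks interleave in the destination, which would be consistent with the garbled output described above. Whether byte-concatenated MP3 segments play cleanly is a separate matter that this sketch does not address.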

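A Brief History of Node Streams - part one and two.

Related Google search:

how to pipe multiple readable streams to a single writable stream? nodejs

Questions covering the same or a similar topic, without authoritative answers (or possibly "outdated"):

How to pipe multiple ReadableStreams to a single WriteStream?

Piping to same Writable stream twice via different Readable stream

Pipe multiple files to one response

Creating a Node.js stream from two piped streams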

Hamid Raza Noori · Answer

WebRTC would be a good option for the above problem. Once the file has been generated, the client can listen to it.

https://www.npmjs.com/package/simple-peer

user1063287 · Answer

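Here are two solutions.

Solution 01

- uses _Bluebird.mapSeries_
- writes the individual responses to temp files
- puts them in a zip file (using archiver)
- sends the zip file back to the client to save
- deletes the temp files

It uses _Bluebird.mapSeries_ from BM's answer, but instead of just mapping over the responses, the requests and responses are handled within the map function. It also resolves the promises on the writable stream's finish event, rather than the readable stream's end event. Bluebird is helpful in that it pauses iteration within the map function until the response has been received and handled, and then moves on to the next iteration.

Given that the Bluebird map function produces clean audio files, rather than zipping the files you could use a solution like the one in Terry Lennox's answer to combine multiple audio files into one audio file. My first attempt at that solution, using Bluebird and _fluent-ffmpeg_, produced a single file, but it was of slightly lower quality. No doubt this could be tweaked in the ffmpeg settings, but I didn't have time to do that.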

    // route handler
    app.route("/api/:api_version/tts")
      .get(api_tts_get);

    // route handler middleware
    const api_tts_get = async (req, res) => {

      var query_parameters = req.query;

      var file_name = query_parameters.file_name;
      // text_string_array is assumed to be defined elsewhere
      var text_chunk_array = text_string_array; // eg: https://Pastebin.com/raw/JkK8ehwV

      var absolute_path = path.join(__dirname, "/src/temp_audio/", file_name);
      var relative_path = path.join("./src/temp_audio/", file_name); // path relative to server root

      // set up archiver
      var archive = archiver('zip', {
        zlib: { level: 9 } // sets the compression level
      });
      var zip_write_stream = fs.createWriteStream(`${relative_path}.zip`);
      archive.pipe(zip_write_stream);

      await Bluebird.mapSeries(text_chunk_array, async function(text_chunk, index) {

        // check if last value of array
        const isLastIndex = index === text_chunk_array.length - 1;

        return new Promise((resolve, reject) => {

          var textToSpeech = new TextToSpeechV1({
            iam_apikey: iam_apikey,
            url: tts_service_url
          });

          var synthesizeParams = {
            text: text_chunk,
            accept: 'audio/mp3',
            voice: 'en-US_AllisonV3Voice'
          };

          textToSpeech.synthesize(synthesizeParams, (err, audio) => {
            if (err) {
              console.log("synthesize - an error occurred: ");
              return reject(err);
            }

            // write individual files to disk
            var file_name = `${relative_path}_${index}.mp3`;
            var write_stream = fs.createWriteStream(`${file_name}`);
            audio.pipe(write_stream);

            // on finish event of individual file write
            write_stream.on('finish', function() {

              // add file to archive
              archive.file(file_name, { name: `audio_${index}.mp3` });

              // if not the last value of the array
              if (isLastIndex === false) {
                resolve();
              }
              // if the last value of the array
              else if (isLastIndex === true) {
                resolve();

                // when zip file has finished writing,
                // send it back to client, and delete temp files from server
                zip_write_stream.on('close', function() {

                  // download the zip file (using absolute_path)
                  res.download(`${absolute_path}.zip`, (err) => {
                    if (err) {
                      console.log(err);
                    }

                    // delete each audio file (using relative_path)
                    for (let i = 0; i < text_chunk_array.length; i++) {
                      fs.unlink(`${relative_path}_${i}.mp3`, (err) => {
                        if (err) {
                          console.log(err);
                        }
                        console.log(`AUDIO FILE ${i} REMOVED!`);
                      });
                    }

                    // delete the zip file
                    fs.unlink(`${relative_path}.zip`, (err) => {
                      if (err) {
                        console.log(err);
                      }
                      console.log(`ZIP FILE REMOVED!`);
                    });

                  });

                });

                // from archiver readme examples
                archive.on('warning', function(err) {
                  if (err.code === 'ENOENT') {
                    // log warning
                  } else {
                    // throw error
                    throw err;
                  }
                });

                // from archiver readme examples
                archive.on('error', function(err) {
                  throw err;
                });

                // from archiver readme examples
                archive.finalize();
              }
            });
          });
        });
      });
    }
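To illustrate what "pausing iteration" buys here, the sketch below compares a hand-written series map with Array.prototype.map plus Promise.all. The `mapSeries` function is a plain-JavaScript approximation written for this example, not Bluebird's implementation, and the delays are hypothetical stand-ins for request times.

```javascript
// Approximation of a series map: the next item is not started until
// the promise returned for the previous item has resolved.
async function mapSeries(items, mapper) {
  const results = [];
  for (const [index, item] of items.entries()) {
    results.push(await mapper(item, index));
  }
  return results;
}

const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function compareOrdering() {
  const seriesLog = [];
  await mapSeries([30, 10], async (ms, index) => {
    seriesLog.push(`start ${index}`);
    await delay(ms);
    seriesLog.push(`end ${index}`);
  });

  const parallelLog = [];
  await Promise.all([30, 10].map(async (ms, index) => {
    parallelLog.push(`start ${index}`);
    await delay(ms);
    parallelLog.push(`end ${index}`);
  }));

  return { seriesLog, parallelLog };
}

compareOrdering().then((logs) => console.log(logs));
```

In the series version each item starts only after the previous one has ended. With map and Promise.all every item starts immediately, and the shorter delay finishes first.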

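Solution 02

I was keen to find a solution that didn't use a library to "pause" within the map() iteration, so I:

- swapped the map() function for a for...of loop
- used await before the API call, rather than wrapping it in a promise, and
- instead of using return new Promise() to contain the response handling, used await new Promise() (see this answer)

This last change, as if by magic, paused the loop until the archive.file() and audio.pipe(writestream) operations had completed. I'd like to better understand how that works.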

    // route handler
    app.route("/api/:api_version/tts")
      .get(api_tts_get);

    // route handler middleware
    const api_tts_get = async (req, res) => {

      var query_parameters = req.query;

      var file_name = query_parameters.file_name;
      // text_string_array is assumed to be defined elsewhere
      var text_chunk_array = text_string_array; // eg: https://Pastebin.com/raw/JkK8ehwV

      var absolute_path = path.join(__dirname, "/src/temp_audio/", file_name);
      var relative_path = path.join("./src/temp_audio/", file_name); // path relative to server root

      // set up archiver
      var archive = archiver('zip', {
        zlib: { level: 9 } // sets the compression level
      });
      var zip_write_stream = fs.createWriteStream(`${relative_path}.zip`);
      archive.pipe(zip_write_stream);

      for (const [index, text_chunk] of text_chunk_array.entries()) {

        // check if last value of array
        const isLastIndex = index === text_chunk_array.length - 1;

        var textToSpeech = new TextToSpeechV1({
          iam_apikey: iam_apikey,
          url: tts_service_url
        });

        var synthesizeParams = {
          text: text_chunk,
          accept: 'audio/mp3',
          voice: 'en-US_AllisonV3Voice'
        };

        try {

          var audio_readable_stream = await textToSpeech.synthesize(synthesizeParams);

          await new Promise(function(resolve, reject) {

            // write individual files to disk
            var file_name = `${relative_path}_${index}.mp3`;
            var write_stream = fs.createWriteStream(`${file_name}`);
            audio_readable_stream.pipe(write_stream);

            // on finish event of individual file write
            write_stream.on('finish', function() {

              // add file to archive
              archive.file(file_name, { name: `audio_${index}.mp3` });

              // if not the last value of the array
              if (isLastIndex === false) {
                resolve();
              }
              // if the last value of the array
              else if (isLastIndex === true) {
                resolve();

                // when zip file has finished writing,
                // send it back to client, and delete temp files from server
                zip_write_stream.on('close', function() {

                  // download the zip file (using absolute_path)
                  res.download(`${absolute_path}.zip`, (err) => {
                    if (err) {
                      console.log(err);
                    }

                    // delete each audio file (using relative_path)
                    for (let i = 0; i < text_chunk_array.length; i++) {
                      fs.unlink(`${relative_path}_${i}.mp3`, (err) => {
                        if (err) {
                          console.log(err);
                        }
                        console.log(`AUDIO FILE ${i} REMOVED!`);
                      });
                    }

                    // delete the zip file
                    fs.unlink(`${relative_path}.zip`, (err) => {
                      if (err) {
                        console.log(err);
                      }
                      console.log(`ZIP FILE REMOVED!`);
                    });

                  });

                });

                // from archiver readme examples
                archive.on('warning', function(err) {
                  if (err.code === 'ENOENT') {
                    // log warning
                  } else {
                    // throw error
                    throw err;
                  }
                });

                // from archiver readme examples
                archive.on('error', function(err) {
                  throw err;
                });

                // from archiver readme examples
                archive.finalize();
              }
            });
          });

        } catch (err) {
          console.log("oh dear, there was an error: ");
          console.log(err);
        }
      }
    }
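On why await new Promise() pauses the loop, as far as I understand it: await suspends the surrounding async function until the promise settles, and this promise only resolves inside the 'finish' handler. A for...of loop is part of that same async function, so the next iteration cannot begin until then. A callback passed to map() or forEach() is a separate function, so awaiting inside it does not hold up the loop that calls it. A reduced sketch with in-memory streams, where `pipeAndWait` and `writeAll` are hypothetical names:

```javascript
const { Readable, Writable } = require('stream');

// Resolve only once the writable has flushed everything it was given.
function pipeAndWait(readable, writable) {
  return new Promise((resolve, reject) => {
    readable.on('error', reject);
    writable.on('error', reject);
    writable.on('finish', resolve);
    readable.pipe(writable);
  });
}

async function writeAll(texts) {
  const log = [];
  for (const [index, text] of texts.entries()) {
    log.push(`request ${index}`);
    const dest = new Writable({
      // Simulate a slow destination so the wait is observable.
      write(chunk, encoding, callback) { setTimeout(callback, 5); }
    });
    // The loop body stops here until 'finish' has fired for this item.
    await pipeAndWait(Readable.from([Buffer.from(text)]), dest);
    log.push(`written ${index}`);
  }
  return log;
}

writeAll(['first', 'second']).then((log) => console.log(log.join(', ')));
// request 0, written 0, request 1, written 1
```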

Learning Experiences

Other issues that arose during this process are documented below.

Long requests time out when using node (and the request is resent)...

    // solution
    req.connection.setTimeout( 1000 * 60 * 10 ); // ten minutes

See: https://github.com/expressjs/express/issues/2512
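A possible alternative to setting the timeout on each request's socket is to set it once on the server. The sketch below uses a plain http server, and `createServerWithTimeout` is a hypothetical helper. With Express, the object returned by app.listen() is an http.Server, so setTimeout should be callable on it in the same way.

```javascript
const http = require('http');

const TEN_MINUTES_MS = 1000 * 60 * 10; // 600000 milliseconds

// Hypothetical helper: create a server whose sockets all share one
// inactivity timeout, instead of setting it per request.
function createServerWithTimeout(handler, timeoutMs) {
  const server = http.createServer(handler);
  server.setTimeout(timeoutMs); // stored on server.timeout
  return server;
}

// Usage: the server is created but not started here.
const demoServer = createServerWithTimeout((req, res) => res.end('ok'), TEN_MINUTES_MS);
console.log(demoServer.timeout); // 600000
```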

400 errors caused by node's max header size of 8KB (the query string is included in the header size)...

    // solution (although probably not recommended - better to get text_string_array from server, rather than client)
    node --max-http-header-size 80000 app.js

See: https://github.com/nodejs/node/issues/24692