Parse output of spawned node.js child process line by line

Question

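I have a PhantomJS/CasperJS script that I run from within a node.js script using child_process.spawn(). Since CasperJS doesn't support require()ing modules, I'm trying to print commands from CasperJS to stdout and then read them in my node.js script using spawn.stdout.on('data', function(data) {}); in order to do things like add objects to redis/mongoose (convoluted, yes, but it seems more straightforward than setting up a web service for this...). The CasperJS script executes a series of commands and creates, say, 20 screenshots which need to be added to my database.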

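However, I can't figure out how to break the data variable (a Buffer?) into lines... I've tried converting it to a string and then doing a replace, and I've tried spawn.stdout.setEncoding('utf8'); but nothing seems to work...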

Here is what I have right now:

    var spawn = require('child_process').spawn;

    var bin = "casperjs"
    //googlelinks.js is the example given at http://casperjs.org/#quickstart
    var args = ['scripts/googlelinks.js'];
    var cspr = spawn(bin, args);

    //cspr.stdout.setEncoding('utf8');
    cspr.stdout.on('data', function (data) {
        var buff = new Buffer(data);
        console.log("foo: " + buff.toString('utf8'));
    });

    cspr.stderr.on('data', function (data) {
        data += '';
        console.log(data.replace("\n", "\nstderr: "));
    });

    cspr.on('exit', function (code) {
        console.log('child process exited with code ' + code);
        process.exit(code);
    });

https://Gist.github.com/2131204

maerics · Accepted Answer

Try this:

    cspr.stdout.setEncoding('utf8');
    cspr.stdout.on('data', function(data) {
      // The capturing group in the regex keeps each line terminator
      // as its own entry in the resulting array.
      var str = data.toString(), lines = str.split(/(\r?\n)/g);
      for (var i=0; i<lines.length; i++) {
        // Process the line, noting it might be incomplete.
      }
    });

Note that the 'data' event does not necessarily break evenly between lines of output, so a single line might span multiple data events.
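To make that caveat concrete, here is a minimal sketch with two made-up chunks (the chunk contents are assumptions for illustration, not real CasperJS output). It compares splitting each chunk on its own with splitting the joined data.

```javascript
// Two hypothetical chunks as stdout might deliver them:
// the line "bar" is cut in half at the chunk boundary.
const chunks = [Buffer.from('foo\nba'), Buffer.from('r\nbaz\n')];

// Splitting each chunk separately treats the two halves as separate lines.
const perChunk = chunks.flatMap((chunk) => chunk.toString('utf8').split(/\r?\n/));

// Joining the chunks before splitting recovers the lines the child wrote.
const joined = Buffer.concat(chunks).toString('utf8').split(/\r?\n/);

console.log(perChunk); // contains 'ba' and 'r' as two entries
console.log(joined);   // contains 'bar' as one entry
```

In a live stream you cannot wait for all the data, so the usual workaround is to keep the trailing partial line and prepend it to the next chunk.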

Sam Day · Answer

I've actually written a Node library for exactly this purpose. It's called stream-splitter and you can find it on Github: samcday/stream-splitter.

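The library provides a special Stream that you can pipe your casper stdout into, along with a delimiter (in your case, \n), and it emits a neat token event for each line it has split out of the input Stream. The internal implementation is very simple and delegates most of the magic to substack/node-buffers, which means there are no unnecessary Buffer allocations/copies.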

mako · Answer

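Adding to maerics' answer, which does not properly handle the case where only part of a line arrives in a data dump (it gives you the first part and the second part of the line individually, as two separate lines).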

    var _breakOffFirstLine = /\r?\n/
    function filterStdoutDataDumpsToTextLines(callback){
        //returns a function that takes chunks of stdout data, aggregates it, and passes lines one by one through to callback, all as soon as it gets them.
        var acc = ''
        return function(data){
            //if there was a partial, unended line in the previous dump, it is completed by the first section of this one.
            var splitted = (acc + data.toString()).split(_breakOffFirstLine)
            var inTactLines = splitted.slice(0, splitted.length-1)
            //if there is a partial, unended line in this dump, store it to be completed by the next (we assume there will be a terminating newline at some point. This is, generally, a safe assumption.)
            acc = splitted[splitted.length-1]
            for(var i=0; i<inTactLines.length; ++i){
                callback(inTactLines[i])
            }
        }
    }

Usage:

    //cspr is the spawned child process from the question, not Node's global process object
    cspr.stdout.on('data', filterStdoutDataDumpsToTextLines(function(line){
        //each time this inner function is called, you will be getting a single, complete line of the stdout ^^
    }) )
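If the child might exit without a final newline, the buffered remainder would never be delivered. The following is a rough sketch of the same accumulate-and-split idea with an explicit flush step. The helper name createLineBuffer and the sample "screenshot" messages are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: buffers chunks and calls onLine once per complete line.
function createLineBuffer(onLine) {
  let pending = ''; // partial line carried over from earlier chunks
  return {
    push(chunk) {
      const parts = (pending + chunk.toString()).split(/\r?\n/);
      pending = parts.pop(); // last piece is unterminated (or '')
      parts.forEach((line) => onLine(line));
    },
    flush() {
      // Deliver whatever is left when the stream ends without a newline.
      if (pending !== '') {
        onLine(pending);
        pending = '';
      }
    },
  };
}

// Simulated chunks; the messages are made up for this example.
const seen = [];
const lineBuffer = createLineBuffer((line) => seen.push(line));
lineBuffer.push('screenshot 1 sa');
lineBuffer.push('ved\nscreenshot 2 saved\nscreensh');
lineBuffer.push('ot 3 saved');
lineBuffer.flush();
console.log(seen);
```

With a real child process, push would be attached to the stdout 'data' event and flush to its 'end' event.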

nyctef · Answer

I found a nicer way to do this with just pure node, and it seems to work well:

    const childProcess = require('child_process');
    const readline = require('readline');

    // bin and args are defined as in the question
    const cspr = childProcess.spawn(bin, args);

    const rl = readline.createInterface({ input: cspr.stdout });
    rl.on('line', line => { /* handle line here */ });
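For a version that runs without CasperJS installed, here is a small sketch of the readline approach. It assumes the current Node binary as a stand-in child that prints three lines. The function name readLines is hypothetical.

```javascript
const { spawn } = require('child_process');
const readline = require('readline');

// Hypothetical helper: collects every stdout line of a command and
// resolves once the readline interface closes (the child's stdout ended).
function readLines(bin, args) {
  return new Promise((resolve, reject) => {
    const child = spawn(bin, args);
    const lines = [];
    child.on('error', reject); // e.g. the binary could not be started
    const rl = readline.createInterface({ input: child.stdout });
    rl.on('line', (line) => lines.push(line));
    rl.on('close', () => resolve(lines));
  });
}

// Stand-in for the CasperJS script: Node itself printing three lines.
const demo = readLines(process.execPath, ['-e', 'console.log("one\\ntwo\\nthree")']);
demo.then((lines) => console.log(lines));
```

readline does the buffering of partial lines internally, so the 'line' handler only sees complete lines.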

Rick · Answer

You can give this a try. It ignores empty lines and bare newline entries.

    cspr.stdout.on('data', (data) => {
        data = data.toString().split(/(\r?\n)/g);
        data.forEach((item, index) => {
            if (data[index] !== '\n' && data[index] !== '') {
                console.log(data[index]);
            }
        });
    });

Julio · Answer

Old, but still useful...

I have made a custom stream Transform subclass for this purpose.

See https://stackoverflow.com/a/59400367/4861714
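For readers who want a rough idea of what such a Transform subclass can look like, here is a minimal sketch. It is not the code from the linked answer. The class name LineSplitter and the sample input are assumptions made for illustration.

```javascript
const { Transform, PassThrough } = require('stream');
const { StringDecoder } = require('string_decoder');

// Hypothetical Transform: takes raw chunks in, emits one string per line.
class LineSplitter extends Transform {
  constructor() {
    super({ readableObjectMode: true }); // so empty lines are emitted too
    this.decoder = new StringDecoder('utf8'); // copes with split multi-byte chars
    this.tail = ''; // unterminated remainder of the previous chunk
  }

  _transform(chunk, encoding, callback) {
    const parts = (this.tail + this.decoder.write(chunk)).split(/\r?\n/);
    this.tail = parts.pop();
    for (const line of parts) this.push(line);
    callback();
  }

  _flush(callback) {
    // Emit a final line that had no trailing newline.
    const rest = this.tail + this.decoder.end();
    if (rest !== '') this.push(rest);
    callback();
  }
}

// Simulated child stdout; a real caller would pipe cspr.stdout instead.
const source = new PassThrough();
const splitter = source.pipe(new LineSplitter());
const collected = [];
splitter.on('data', (line) => collected.push(line));
const done = new Promise((resolve) => splitter.on('end', () => resolve(collected)));

source.write('first li');
source.write('ne\nsecond line\r\nthi');
source.end('rd');
done.then((lines) => console.log(lines));
```

Because it is an ordinary stream, it can be dropped into a pipe chain between the child's stdout and whatever consumes the lines.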