Count occurrences of a character per line/field on Unix

Question

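Given a file with data like this (i.e. a stores.dat file):

    sid|storeNo|latitude|longitude
    2tt|1|-28.0372000t0|153.42921670
    9|2t|-33tt.85t09t0000|15t1.03274200

What is the command that returns the number of occurrences of the 't' character per line?

For example, it would return:

    count   lineNum
    4       1
    3       2
    6       3

Also, to do the same count per field, what is the command that returns the following results?

For example, given column 2 and the character 't' as input:

    count   lineNum
    1       1
    0       2
    1       3

For example, given column 3 and the character 't' as input:

    count   lineNum
    2       1
    1       2
    4       3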

jaypal singh · Accepted Answer

To count occurrences of a character per line, you can do:

    awk -F'|' 'BEGIN{print "count", "lineNum"}{print gsub(/t/,"") "\t" NR}' file
    count lineNum
    4       1
    3       2
    6       3

To count occurrences of a character per field/column, you can do:

Column 2:

    awk -F'|' -v fld=2 'BEGIN{print "count", "lineNum"}{print gsub(/t/,"",$fld) "\t" NR}' file
    count lineNum
    1       1
    0       2
    1       3

Column 3:

    awk -F'|' -v fld=3 'BEGIN{print "count", "lineNum"}{print gsub(/t/,"",$fld) "\t" NR}' file
    count lineNum
    2       1
    1       2
    4       3

The return value of gsub() is the number of substitutions made, so we use it to print the count.
NR holds the line number, so we use it to print the line number.
To print the occurrences in a particular field, we create a variable fld and set it to the number of the field we want the counts from.
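
The gsub() idea above can be wrapped in a small helper so that both the character and the field are parameters. This is only a sketch: the function name count_char is hypothetical, field 0 is used to mean the whole line (awk's $0), and the character is assumed to be a plain literal, since gsub() would interpret it as a regular expression otherwise.

```shell
# Hypothetical helper: count_char CHAR [FIELD]
# Reads '|'-separated text on stdin. FIELD defaults to 0, i.e. the whole line.
# Assumption: CHAR is a single literal character, not a regex metacharacter.
count_char() {
    awk -F'|' -v ch="$1" -v fld="${2:-0}" '
        BEGIN { print "count\tlineNum" }
        {
            s = $fld                        # work on a copy of the field
            print gsub(ch, "", s) "\t" NR   # gsub returns the number of substitutions
        }'
}

# Sample data from the question.
sample='sid|storeNo|latitude|longitude
2tt|1|-28.0372000t0|153.42921670
9|2t|-33tt.85t09t0000|15t1.03274200'

printf '%s\n' "$sample" | count_char t      # whole line
printf '%s\n' "$sample" | count_char t 3    # column 3 only
```

Working on a copy (s) rather than on $fld itself leaves the record unmodified, which matters only if the script is later extended to print the original line as well.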

Gabriel Burt · Answer

    grep -n -o "t" stores.dat | sort -n | uniq -c | cut -d : -f 1

gives almost exactly the output you want:

      4 1
      3 2
      6 3

Thanks to @raghav-bhushan for the grep -o hint. The -n flag includes the line number as well.
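
For the per-column case, the same pipeline can be fed from cut. The sketch below assumes a grep that supports -o (GNU and BSD grep do) and uses the sample data from the question. Note that grep -o prints nothing for a line without a match, so such lines are missing from the result instead of being reported with a count of 0.

```shell
# Sample data from the question.
sample='sid|storeNo|latitude|longitude
2tt|1|-28.0372000t0|153.42921670
9|2t|-33tt.85t09t0000|15t1.03274200'

# Isolate column 2, print one "lineNo:t" line per match, then count per line number.
col2_counts=$(printf '%s\n' "$sample" |
    cut -d '|' -f 2 |
    grep -n -o 't' |
    sort -n |
    uniq -c |
    cut -d : -f 1)

# Each output row is "<count> <lineNum>"; line 2 has no 't' in column 2,
# so it does not appear at all.
printf '%s\n' "$col2_counts"
```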

artm · Answer

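To count occurrences of a character per line:

    $ awk -F 't' '{print NF-1, NR}' input.txt
    4 1
    3 2
    6 3

This sets the field separator to the character that needs to be counted, then uses the fact that the number of fields is one greater than the number of separators.

To count occurrences in a particular column, cut out that column first:

    $ cut -d '|' -f 2 input.txt | awk -F 't' '{print NF-1, NR}'
    1 1
    0 2
    1 3

    $ cut -d '|' -f 3 input.txt | awk -F 't' '{print NF-1, NR}'
    2 1
    1 2
    4 3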
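
Two edge cases of the separator trick may be worth guarding against. An empty line has NF equal to 0, so NF-1 would print -1. Also, the gawk manual notes that in compatibility mode -Ft is taken to mean a tab, so setting FS inside the program is the more defensive choice when the character happens to be t. The sketch below handles both; the function name count_by_fs is hypothetical.

```shell
# Hypothetical helper: count_by_fs CHAR
# Counts CHAR per input line by using it as the awk field separator.
# Assumption: CHAR is a single literal character other than a space.
count_by_fs() {
    awk -v sep="$1" '
        BEGIN { FS = sep }            # set FS here instead of relying on -F
        {
            n = (NF ? NF - 1 : 0)     # an empty line has NF == 0
            print n, NR
        }'
}

# Sample data from the question.
sample='sid|storeNo|latitude|longitude
2tt|1|-28.0372000t0|153.42921670
9|2t|-33tt.85t09t0000|15t1.03274200'

printf '%s\n' "$sample" | cut -d '|' -f 3 | count_by_fs t
```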

jfg956 · Answer

No need for awk or perl, only bash and standard Unix utilities:

    cat file | tr -c -d "t\n" | cat -n |
      { echo "count   lineNum"
        while read num data; do
          test ${#data} -gt 0 && printf "%4d   %5d\n" ${#data} $num
        done; }

And for a particular column:

    cut -d "|" -f 2 file | tr -c -d "t\n" | cat -n |
      { echo -e "count   lineNum"
        while read num data; do
          test ${#data} -gt 0 && printf "%4d   %5d\n" ${#data} $num
        done; }

We can even avoid tr and the cats:

    echo "count   lineNum"
    num=1
    while read data; do
      new_data=${data//t/}
      count=$((${#data}-${#new_data}))
      test $count -gt 0 && printf "%4d   %5d\n" $count $num
      num=$(($num+1))
    done < file

And even the cut:

    echo "count   lineNum"
    num=1; OLD_IFS=$IFS; IFS="|"
    while read -a array_data; do
      data=${array_data[1]}
      new_data=${data//t/}
      count=$((${#data}-${#new_data}))
      test $count -gt 0 && printf "%4d   %5d\n" $count $num
      num=$(($num+1))
    done < file
    IFS=$OLD_IFS

Birei · Answer

One possible solution using perl:

Content of script.pl:

    use warnings;
    use strict;

    ## Check arguments:
    ## 1.- Input file
    ## 2.- Char to search.
    ## 3.- (Optional) field to search. If blank, zero or bigger than number
    ##     of columns, default to search char in all the line.
    (@ARGV == 2 || @ARGV == 3) or die qq(Usage: perl $0 input-file char [column]\n);

    my ($char,$column);

    ## Get values or arguments.
    if ( @ARGV == 3 ) {
            ($char, $column) = splice @ARGV, -2;
    } else {
            $char = pop @ARGV;
            $column = 0;
    }

    ## Check that $char must be a non-white space character and $column
    ## only accept numbers.
    die qq[Bad input\n] if $char !~ m/^\S$/ or $column !~ m/^\d+$/;

    print qq[count\tlineNum\n];

    while ( <> ) {
            ## Remove last '\n'
            chomp;

            ## Get fields.
            my @f = split /\|/;

            ## If column is a valid one, select it to the search.
            if ( $column > 0 and $column <= scalar @f ) {
                    $_ = $f[ $column - 1];
            }

            ## Count.
            my $count = eval qq[tr/$char/$char/];

            ## Print result.
            printf qq[%d\t%d\n], $count, $.;
    }

The script accepts three parameters:

1. Input file
2. Char to search
3. Column to search: if the column is not a valid number, it searches the whole line.

Running the script without arguments:

    perl script.pl
    Usage: perl script.pl input-file char [column]

With arguments, and the resulting output:

Here 0 is a bad column, so it searches the whole line.

    perl script.pl stores.dat 't' 0
    count   lineNum
    4       1
    3       2
    6       3

Here it searches in column 1.

    perl script.pl stores.dat 't' 1
    count   lineNum
    0       1
    2       2
    0       3

Here it searches in column 3.

    perl script.pl stores.dat 't' 3
    count   lineNum
    2       1
    1       2
    4       3

'th' is not a single character.

    perl script.pl stores.dat 'th' 3
    Bad input

vulcan · Answer

awk '{gsub("[^t]",""); print length($0),NR;}' stores.dat

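The call to gsub() deletes everything in the line that is not a t, then the length of what remains and the current line number are printed.

Want to do it just for column 2?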

awk 'BEGIN{FS="|"} {gsub("[^t]","",$2); print NR,length($2);}' stores.dat
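
Both variants can be folded into one helper that takes the character and the column as parameters and prints the count first, then the line number, in both cases. This is a sketch under assumptions: the name count_kept is hypothetical, column 0 stands for the whole line, and the character is assumed to be safe inside a bracket expression (so not ], ^, - or a backslash).

```shell
# Hypothetical helper: count_kept CHAR [COLUMN]
# Deletes every character except CHAR from the chosen field and prints
# the length of what is left, followed by the line number.
count_kept() {
    awk -F'|' -v ch="$1" -v col="${2:-0}" '
        {
            s = $col                       # copy of the field (or of the line if col is 0)
            gsub("[^" ch "]", "", s)       # strip everything that is not CHAR
            print length(s), NR
        }'
}

# Sample data from the question.
sample='sid|storeNo|latitude|longitude
2tt|1|-28.0372000t0|153.42921670
9|2t|-33tt.85t09t0000|15t1.03274200'

printf '%s\n' "$sample" | count_kept t      # whole line
printf '%s\n' "$sample" | count_kept t 2    # column 2 only
```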

Haven Holmes · Answer

    $ cat -n test.txt
         1  test 1
         2  you want
         3  void
         4  you don't want
         5  ttttttttttt
         6  t t t t t t
    $ awk '{n=split($0,c,"t")-1;if (n!=0) print n,NR}' test.txt
    2 1
    1 2
    2 4
    11 5
    6 6

Steve Thorn · Answer

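    perl -e 'while(<>) { $count = tr/t//; print "$count ".++$x."\n"; }' stores.dat

Another perl answer, yay! The tr/t// operator returns the number of times the translation occurred on that line, in other words the number of times tr found the character 't'. ++$x maintains the line number count.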

Cole Tierney · Answer

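You could also split the line or field on 't' and check the length of the resulting array minus 1. Set the col variable to 0 for the whole line, or to 1 through 3 for a column:

    awk -F'|' -v col=0 -v OFS=$'\t' 'BEGIN { print "count", "lineNum" }{ split($col, a, "t"); print length(a) - 1, NR }' stores.dat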
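
A variant of the same idea, shown as a sketch: split() itself returns the number of pieces, so the count can be taken from its return value. This avoids calling length() on an array, which some awk implementations may not support, and it avoids the bash-only $'\t' quoting because awk expands the \t escape in a -v assignment. The function name count_by_split is hypothetical.

```shell
# Hypothetical helper: count_by_split [COLUMN]
# Counts 't' per line in the given '|'-separated column (0 means the whole line).
count_by_split() {
    awk -F'|' -v col="${1:-0}" -v OFS='\t' '
        BEGIN { print "count", "lineNum" }
        {
            n = split($col, parts, "t")    # number of pieces around each t
            count = (n ? n - 1 : 0)        # an empty field yields 0 pieces
            print count, NR
        }'
}

# Sample data from the question.
sample='sid|storeNo|latitude|longitude
2tt|1|-28.0372000t0|153.42921670
9|2t|-33tt.85t09t0000|15t1.03274200'

printf '%s\n' "$sample" | count_by_split 0    # whole line
printf '%s\n' "$sample" | count_by_split 2    # column 2 only
```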

Jelena · Answer

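    cat stores.dat | awk 'BEGIN {FS = "|"}; {print $1}' | awk 'BEGIN {FS = "t"}; {print NF-1}'

Here $1 is the column you want to count in. The second awk splits that column on 't', so NF-1 is the number of occurrences.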