Linux command or script to count duplicate lines in a text file?

Question

Given a text file with the following contents:

red Apple
green Apple
green Apple
orange
orange
orange

Is there a Linux command or script I can use to get the following result?

1 red Apple
2 green Apple
3 orange

borrible · Accepted Answer

Pipe the file through sort (to bring identical lines next to each other), then through uniq -c to get the counts:

sort filename | uniq -c

To get the list sorted by frequency, most frequent first:

sort filename | uniq -c | sort -nr
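As a rough sketch of how this pipeline behaves on the sample from the question: the helper name count_lines and the temporary demo file are made up for illustration, and sed is added only to strip the padding that uniq -c prints before each count. Sorting by count in ascending order (sort -n rather than sort -nr) happens to reproduce the order asked for in the question.

```shell
#!/usr/bin/env bash
# count_lines FILE (hypothetical helper): print "COUNT LINE" for each
# distinct line of FILE, lowest count first.
count_lines() {
    sort "$1" |           # bring identical lines next to each other
        uniq -c |         # collapse each run and prefix it with its count
        sort -n |         # order by the numeric count, ascending
        sed 's/^ *//'     # drop the padding uniq -c puts before the count
}

# Demo data: the sample from the question, written to a temporary file.
demo=$(mktemp)
printf '%s\n' 'red Apple' 'green Apple' 'green Apple' orange orange orange > "$demo"
count_lines "$demo"
rm -f "$demo"
# prints:
# 1 red Apple
# 2 green Apple
# 3 orange
```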

Jaberino · Answer

Almost the same as borrible's answer, but if you add the d parameter to uniq, only the duplicated lines are shown.

sort filename | uniq -cd | sort -nr
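To illustrate the effect of the d parameter, here is a small sketch run on the sample from the question. The function name duplicates_only is hypothetical, and the sed step is an addition that only strips the leading padding from the counts.

```shell
#!/usr/bin/env bash
# duplicates_only (hypothetical helper): read lines on stdin and print
# "COUNT LINE" only for lines that occur more than once, most frequent first.
duplicates_only() {
    sort | uniq -cd | sort -nr | sed 's/^ *//'
}

printf '%s\n' 'red Apple' 'green Apple' 'green Apple' orange orange orange |
    duplicates_only
# prints:
# 3 orange
# 2 green Apple
```

The line "red Apple" occurs only once, so it is dropped from the output.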

mhyfritz · Answer

uniq -c file

If the file is not already sorted:

sort file | uniq -c
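The reason sorting matters is that uniq only compares adjacent lines. A minimal sketch, using a made-up three-line sample (unsorted_sample is a hypothetical helper, not part of the answer):

```shell
#!/usr/bin/env bash
# Hypothetical unsorted sample: the two "green Apple" lines are not adjacent.
unsorted_sample() {
    printf '%s\n' 'green Apple' orange 'green Apple'
}

echo 'without sort:'
unsorted_sample | uniq -c          # three separate runs, each counted once
echo 'with sort:'
unsorted_sample | sort | uniq -c   # "green Apple" is now counted twice
```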

Rahul · Answer

Try this:

cat myfile.txt | sort | uniq

user unknown · Answer

Can you live with an alphabetically ordered list:

echo "red Apple
green Apple
green Apple
orange
orange
orange" | sort -u

？

green Apple
orange
red Apple

or

sort -u FILE

-u stands for unique, and uniqueness is only reached via sorting.

A solution which preserves the order:

echo "red Apple
green Apple
green Apple
orange
orange
orange" | { old=""; while IFS= read -r line; do if [[ "$line" != "$old" ]]; then echo "$line"; old=$line; fi; done; }
red Apple
green Apple
orange

And, with a file:

cat file | {
  old=""
  while IFS= read -r line
  do
    if [[ "$line" != "$old" ]]
    then
      echo "$line"
      old=$line
    fi
  done
}

The last two only remove duplicates that immediately follow one another, which fits your example.

echo "red Apple
green Apple
lila banana
green Apple
" ...

will print two apples, separated by a banana.
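If non-adjacent duplicates should be dropped as well while the original order is kept, one common alternative (not part of this answer) is a small awk filter. This is only a sketch; the function name dedupe_keep_order is hypothetical.

```shell
#!/usr/bin/env bash
# dedupe_keep_order (hypothetical helper): print each distinct line once,
# in order of first appearance, whether or not the duplicates are adjacent.
dedupe_keep_order() {
    # seen[$0]++ evaluates to 0 the first time a line is read,
    # so the negated expression is true only for first occurrences.
    awk '!seen[$0]++'
}

printf '%s\n' 'red Apple' 'green Apple' 'lila banana' 'green Apple' |
    dedupe_keep_order
# prints:
# red Apple
# green Apple
# lila banana
```

Unlike sort -u, this keeps every distinct line in memory, which may matter for very large inputs.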

pajton · Answer

cat <filename> | sort | uniq -c

Chris Eberle · Answer

To get just a count:

$> egrep -o '\w+' fruits.txt | sort | uniq -c
      3 Apple
      2 green
      1 oragen
      2 orange
      1 red

To get a sorted count:

$> egrep -o '\w+' fruits.txt | sort | uniq -c | sort -nk1
      1 oragen
      1 red
      2 green
      2 orange
      3 Apple

EDIT

Ah, this wasn't meant to be split along word boundaries, my mistake. Here's the command to use for full lines:

$> cat fruits.txt | sort | uniq -c | sort -nk1
      1 oragen
      1 red Apple
      2 green Apple
      2 orange
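The full-line pipeline sorts its input, so the original line order is lost before counting. If the counts should come out in order of first appearance instead, a single-pass awk script is one possible alternative. This is a sketch under that assumption, and count_in_order is a hypothetical name.

```shell
#!/usr/bin/env bash
# count_in_order (hypothetical helper): count whole lines from stdin and
# print "COUNT LINE" in the order each line first appeared.
count_in_order() {
    awk '
        !($0 in count) { order[++n] = $0 }   # remember each new line once
        { count[$0]++ }                      # tally every occurrence
        END { for (i = 1; i <= n; i++) print count[order[i]], order[i] }
    '
}

printf '%s\n' 'red Apple' 'green Apple' 'green Apple' orange orange orange |
    count_in_order
# prints:
# 1 red Apple
# 2 green Apple
# 3 orange
```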