Sort a file by groups of lines

Question

Given a file with contents like the following:

FirstSection
    Unique first line in first section
    Unique second line in first section

SecondSection
    Unique first line in second section
    Unique second line in second section

...

NthSection
    Unique first line in Nth section
    Unique second line in Nth section

Is it possible to use UNIX commands (sort, awk, etc.) to sort the file alphabetically by the first, unindented line of each three-line group, while keeping the indented lines under their existing group?

JJoao · Accepted Answer

With Perl you could do something along these lines:

slurp the file (perl -0n)
split the input before each unindented line: split(/^(?=\S)/m)
sort and print

perl -0ne 'print sort split(/^(?=\S)/m)' ex

seshoumara · Answer

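The first sed puts each section on a single line, using the text <EOL> as the delimiter between the lines of a section. The sections are then sorted, and the second sed turns each <EOL> back into a newline.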

sed -r ':r;$!{N;br};s:\n([[:blank:]])(\1*):<EOL>\1\2:g' file|sort|sed -r '/^$/d;:l;G;s:(.*)<EOL>(.*)(\n):\1\3\2:;tl;$s:\n$::'

I did not pick a single character as the delimiter, because the input file might contain that character; I used the text <EOL> instead.

Output: to reproduce the style of the input file, I added an extra newline after every section except the last.

FirstSection
    Unique first line in first section
    Unique second line in first section

NthSection
    Unique first line in Nth section
    Unique second line in Nth section

SecondSection
    Unique first line in second section
    Unique second line in second section
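To make the two sed stages easier to follow, here is a sketch that runs them separately on a made-up sample (the temporary file and its contents are assumptions). It assumes GNU sed, since the commands rely on `-r` and on `\n` inside the patterns.

```shell
# Hypothetical sample file; the contents are made up for illustration.
sample=$(mktemp)
printf '%s\n' \
  'SecondSection' '    line b1' '    line b2' '' \
  'FirstSection'  '    line a1' '    line a2' > "$sample"

# Stage 1 (GNU sed assumed): slurp the whole file, then replace every newline
# that is followed by an indented line with the literal text <EOL>.
joined=$(sed -r ':r;$!{N;br};s:\n([[:blank:]])(\1*):<EOL>\1\2:g' "$sample")
printf '%s\n' "$joined"
# SecondSection<EOL>    line b1<EOL>    line b2
#
# FirstSection<EOL>    line a1<EOL>    line a2

# Stage 2: sort the one-line sections, delete the empty lines, and expand
# each <EOL> back into a newline; every section but the last keeps one
# trailing newline, which shows up as a blank line between sections.
sorted=$(printf '%s\n' "$joined" | sort |
  sed -r '/^$/d;:l;G;s:(.*)<EOL>(.*)(\n):\1\3\2:;tl;$s:\n$::')
printf '%s\n' "$sorted"
rm -f "$sample"
```

Because each section is a single line while it passes through sort, any sort option (for example `-r` or `-f`) can be applied at that point without touching the sed parts.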

αғsнιη · Answer

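With awk, you can keep every group in an associative array, using the blank line between groups as the boundary. Then run asort() (a GNU awk extension) on the array and print the sorted groups with a for loop.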

awk '/^$/{ ++group; next } { saving[group] = (saving[group] == "" ? $0 : saving[group] RS $0) } END { n = asort(saving); for (i = 1; i <= n; i++) print saving[i] }' infile

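Note: the PROCINFO["sorted_in"] element (also specific to GNU awk) sets the order in which a for (... in ...) loop visits an array, so you can use it to choose the kind of sort you need. For example, PROCINFO["sorted_in"]="@val_str_desc" orders the array by value (val), compared as strings (str), in descending (desc) order.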
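A minimal sketch of that note, assuming GNU awk is installed as `gawk`: the groups are collected exactly as in the command above, but no asort() call is made, and the for-in loop is left to visit the values in descending string order. The function name `sort_groups_desc` is hypothetical.

```shell
# Hypothetical helper; requires GNU awk (PROCINFO["sorted_in"] is a gawk feature).
sort_groups_desc() {
  gawk '
    # Visit array elements by value, compared as strings, descending.
    BEGIN { PROCINFO["sorted_in"] = "@val_str_desc" }
    # A blank line starts the next group.
    /^$/ { ++group; next }
    # Append the current line to its group, joined with newlines.
    { saving[group] = (saving[group] == "" ? $0 : saving[group] RS $0) }
    # The traversal order is now controlled by PROCINFO["sorted_in"].
    END { for (group in saving) print saving[group] }
  ' "$@"
}
```

Calling `sort_groups_desc infile` prints the groups from Z to A; replacing the setting with "@val_str_asc" gives the ascending order instead.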

Testing the asort() command on an input file like this:

BFirstSection
    Unique first line in first section
    Unique second line in first section

DSecondSection
    Unique first line in second section
    Unique second line in second section

Aanothersection...
    ...
    ...

CfourthSection
    Unique first line in Nth section
    Unique second line in Nth section

It produces the following output:

Aanothersection...
    ...
    ...
BFirstSection
    Unique first line in first section
    Unique second line in first section
CfourthSection
    Unique first line in Nth section
    Unique second line in Nth section
DSecondSection
    Unique first line in second section
    Unique second line in second section
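asort() and PROCINFO are GNU awk extensions. Where only a POSIX awk is available, the same collect-sort-print idea can be sketched with paragraph mode (RS = "") and a hand-written insertion sort. This swaps in a different sorting technique from the answer, and the function name `sort_groups_posix` is hypothetical.

```shell
# Hypothetical helper; uses only POSIX awk features.
sort_groups_posix() {
  awk '
    BEGIN { RS = "" }            # paragraph mode: blank lines separate records
    { group[NR] = $0 }           # keep each whole group, indented lines included
    END {
      # Insertion sort; appending "" forces a string comparison.
      for (i = 2; i <= NR; i++) {
        held = group[i]
        for (j = i - 1; j >= 1 && (group[j] "") > (held ""); j--)
          group[j + 1] = group[j]
        group[j + 1] = held
      }
      # Print the groups with one blank line between them.
      for (i = 1; i <= NR; i++)
        printf "%s%s", group[i], (i < NR ? "\n\n" : "\n")
    }
  ' "$@"
}
```

Unlike the asort() command, this version keeps the blank line between groups in its output; run it as `sort_groups_posix infile`.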