How do I get a specific line from a file?

Question

I want to extract one exact line from a very large file. For example, to get line 8000 I would like something like this:

command -line 8000 > output_line_8000.txt

gniourf_gniourf · Accepted Answer

There are already answers using perl and awk. Here is a sed answer:

sed -n '8000{p;q}' file

The advantage of the q command is that sed quits as soon as line 8000 has been read (~~unlike the other perl and awk methods~~ (those were changed after some shared creativity, haha)).
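To use the same idea with an arbitrary line number, the sed script can be built in double quotes so the shell substitutes the number. This is a minimal sketch; the function name `nth_line_sed` is a hypothetical helper, not part of the answer.

```shell
# Hypothetical helper: print line N ($1) of FILE ($2), then stop reading.
nth_line_sed() {
    # Double quotes let the shell expand $1 before sed sees the script;
    # p prints the addressed line and q quits immediately afterwards.
    sed -n "${1}{p;q;}" "$2"
}

# Usage, following the question's example:
#   nth_line_sed 8000 file > output_line_8000.txt
```

If the file has fewer than N lines, sed reads to the end and prints nothing.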

A pure Bash possibility (bash ≥ 4):

mapfile -s 7999 -n 1 ary < file
printf '%s' "${ary[0]}"

This reads the content of file into the array ary (one line per element), but skips the first 7999 lines (-s 7999) and reads only one line (-n 1).
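The mapfile approach can be wrapped in a small function that takes the line number as a parameter. This is only a sketch, assuming bash ≥ 4; the name `nth_line_mapfile` is hypothetical. It adds `-t`, which the answer does not use, so that the trailing newline is stripped and printf can add exactly one back.

```shell
# Hypothetical helper (bash >= 4): print line N ($1) of FILE ($2) using only builtins.
nth_line_mapfile() {
    local n=$1 file=$2 ary=()
    # -s skips the first N-1 lines, -n 1 reads a single line,
    # -t removes that line's trailing newline.
    mapfile -t -s "$((n - 1))" -n 1 ary < "$file"
    # An empty array means the file has fewer than N lines.
    [[ ${#ary[@]} -gt 0 ]] || return 1
    printf '%s\n' "${ary[0]}"
}

# Usage:
#   nth_line_mapfile 8000 file > output_line_8000.txt
```

Returning a non-zero status for a missing line is a design choice of this sketch, so that callers can tell an empty line apart from a line that does not exist.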

terdon · Answer

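It being a Saturday, I had nothing better to do, so I tested some of these for speed. It turns out that the sed, gawk and perl approaches are basically equivalent. The head & tail one is the slowest but, surprisingly, the fastest by an order of magnitude is the pure bash one.

Here are my tests: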

$ for i in {1..5000000}; do echo "This is line $i" >>file; done

The above creates a file with 5 million lines that occupies about 100 MB.
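The shell loop above reopens and appends to the file once per line, so it can take a while. A quicker way to produce the same content is to let a single awk process write every line. This is a sketch; `make_test_file` is a hypothetical helper name.

```shell
# Hypothetical helper: write COUNT ($1) lines of the form "This is line N" to FILE ($2).
make_test_file() {
    # One awk process generates all lines; the file is opened only once.
    awk -v n="$1" 'BEGIN { for (i = 1; i <= n; i++) print "This is line " i }' > "$2"
}

# Usage, matching the test file used here:
#   make_test_file 5000000 file
```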

$ for cmd in "sed -n '8000{p;q}' file" \
            "perl -ne 'print && exit if $. == 8000' file" \
            "awk 'FNR==8000 {print;exit}' file" \
            "head -n 8000 file | tail -n 1" \
            "mapfile -s 7999 -n 1 ary < file; printf '%s' \"${ary[0]}\"" \
            "tail -n 8001 file | head -n 1"; do
    echo "$cmd"
    for i in {1..100}; do
        (time eval "$cmd") 2>&1 | grep -oP 'real.*?m\K[\d\.]+'
    done | awk '{k+=$1}END{print k/100}'
  done
sed -n '8000{p;q}' file
0.04502
perl -ne 'print && exit if $. == 8000' file
0.04698
awk 'FNR==8000 {print;exit}' file
0.04647
head -n 8000 file | tail -n 1
0.06842
mapfile -s 7999 -n 1 ary < file; printf '%s' "This is line 8000
"
0.00137
tail -n 8001 file | head -n 1
0.0033
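Before comparing timings, it can help to confirm that every command prints the same line. The sketch below runs four of the approaches on the same file and labels each result; `compare_methods` is a hypothetical helper. Note that `tail -n 8001 file | head -n 1` counts from the end of the file, so the sketch uses `tail -n +N` instead, which counts from the start.

```shell
# Hypothetical helper: show what each approach returns for line N ($1) of FILE ($2),
# one "label: output" pair per line, so mismatches are easy to spot.
compare_methods() {
    printf 'sed: %s\n'       "$(sed -n "${1}{p;q;}" "$2")"
    printf 'awk: %s\n'       "$(awk -v n="$1" 'FNR == n { print; exit }' "$2")"
    printf 'head|tail: %s\n' "$(head -n "$1" "$2" | tail -n 1)"
    printf 'tail|head: %s\n' "$(tail -n "+$1" "$2" | head -n 1)"
}

# Usage:
#   compare_methods 8000 file
```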

cuonglm · Answer

You can do it in many ways.

Using perl:

perl -nle 'print && exit if $. == 8000' file

Using awk:

awk 'FNR==8000 {print;exit}' file

Or you can use tail and head, so that reading stops shortly after line 8000 instead of continuing to the end of the file:

tail -n +8000 file | head -n 1

masegaloeh · Answer

Another version with tail and head:

head -n 8000 file | tail -n 1

devnull · Answer

You can use sed:

sed -n '8000p;' filename

If the file is big, it is better to quit early:

sed -n '8000p;8001q' filename

Similarly, you can use awk or perl to avoid reading the whole file:

awk 'NR==8000{print;exit}' filename

perl -ne 'print if $.==8000; last if $.==8000' filename
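When the line number comes from a variable, awk's `-v` option passes it in without any quoting tricks inside the program. A minimal sketch, with `nth_line_awk` as a hypothetical helper name:

```shell
# Hypothetical helper: print line N ($1) of FILE ($2) and exit right after it.
nth_line_awk() {
    # -v n=... makes the shell value available as the awk variable n;
    # exit stops awk from reading the rest of the file.
    awk -v n="$1" 'NR == n { print; exit }' "$2"
}

# Usage:
#   nth_line_awk 8000 filename > output_line_8000.txt
```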

mbsingh · Answer

How about this?

$ cat -n filename | grep -E "^[[:space:]]*8000[[:space:]]"

Example

$ cat -n /etc/abrt/plugins/CCpp.conf | grep -E "^[ 	]+16"
    16	#DebuginfoLocation = /var/cache/abrt-di
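The pattern matters with this approach: without a leading anchor and a trailing delimiter, a pattern for 8000 can also match line 80000 or the same digits inside a line's text. The sketch below anchors both sides of the number that `cat -n` prints; `nth_line_catn` is a hypothetical helper, and unlike the other approaches this one always reads the whole file.

```shell
# Hypothetical helper: print line N ($1) of FILE ($2), prefixed with its number.
nth_line_catn() {
    # cat -n emits optional padding, the line number, a tab, then the text.
    # ^ and the trailing [[:space:]] make sure only the exact number matches.
    cat -n "$2" | grep -E "^[[:space:]]*${1}[[:space:]]"
}

# Usage:
#   nth_line_catn 8000 filename
```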