Hexdump of a string, starting a new output row at each newline?

Question

Say I have a multiline string whose entries are short. If I try to hexdump it, I get something like this:

    echo "something
    is
    being
    written
    here" | hexdump -C
    #00000000  73 6f 6d 65 74 68 69 6e  67 0a 69 73 0a 62 65 69  |something.is.bei|
    #00000010  6e 67 0a 77 72 69 74 74  65 6e 0a 68 65 72 65 0a  |ng.written.here.|
    #00000020

Most hexdump programs, hexdump included, work as a 2D matrix (you can define how many bytes/columns go on each row). So in this case the whole output is packed into two rows of dump.

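Is there a program that dumps the same way but also starts a new output row whenever it encounters a newline (0x0a, or possibly another character or a sequence of characters)? In this case I would imagine output like this: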

    00000000  73 6f 6d 65 74 68 69 6e  67 0a                    |something.|
    0000000a  69 73 0a                                          |is.|
    0000000d  62 65 69 6e 67 0a                                 |being.|
    00000013  77 72 69 74 74 65 6e 0a                           |written.|
    0000001b  68 65 72 65 0a                                    |here.|
    00000020

Janis · Answer

Here is one possibility: a compact solution that uses the ability of read to limit the number of characters it reads.

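    c=0                                  # running byte offset
    while IFS= read -n16 -r line         # at most 16 characters, stopping at a newline
    do
        len=${#line}
        # a short read means a newline ended the chunk: put it back
        ((len<16)) && { ((len++)) ; line+=$'\n' ;}
        printf "%08x " $c
        for ((i=0; i<len; i++))
        do
            printf " %02x" "'${line:i:1}"    # leading ' makes printf use the character code
        done
        printf " %*s %s\n" $((50-3*len)) "" "'${line//[^[:print:]]/.}'"
        ((c+=len))
    done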

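To see what the loop relies on, here is a minimal bash sketch of how read -n splits its input. The helper name chunk_lines is made up for illustration and is not part of the answer.

```shell
# chunk_lines N: hypothetical helper that prints every chunk returned by
# `read -n N`, wrapped in brackets so that empty chunks are visible.
chunk_lines() {
    local n=$1 chunk
    while IFS= read -r -n "$n" chunk; do
        printf '[%s]\n' "$chunk"
    done
}

# Long lines are cut after 4 characters; short ones stop at the newline.
printf 'abcdefghij\nxy\n' | chunk_lines 4

# A line of exactly 4 characters returns a full chunk first and then an
# empty chunk for the newline that was left unread.
printf 'abcd\n' | chunk_lines 4
```

A chunk shorter than the limit therefore always means that a newline was consumed, which is why the loop appends one whenever fewer than 16 characters come back.
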
mxmlnkn · Answer

I needed this to compare two files with a diff tool while still being able to see which non-printable characters differ.

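This function adds a _-n_ option to hexdump. If _-n_ is given as the first argument, the output is split at newlines; otherwise the normal hexdump is called. Compared to @Janis's answer, this is not a complete rewrite of hexdump: the real hexdump is still called with any other parameters that were given. To preserve the offsets, the input is fed to hexdump line by line using head and the _-s_ (skip) option. The function works both when a file is specified and when the input is piped in. Unlike hexdump, however, it does not work with multiple files.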

I wanted this to be a simpler and shorter alternative answer, but guarding against all these edge cases in the input actually made it longer.

    hexdump() {
        # introduces artificial line breaks in hexdump output at newline characters
        # might be useful for comparing files linewise, but still be able to
        # see the differences in non-printable characters utilizing hexdump
        # first argument must be -n else normal hexdump will be used
        local isTmpFile=0
        if [ "$1" != '-n' ]; then command hexdump "$@"; else
            if [ -p /dev/stdin ]; then
                local file="$( mktemp )" args=( "${@:2}" )
                isTmpFile=1
                cat > "$file" # save pipe to temporary file
            else
                local file="${@: -1}" args=( "${@:2:$#-2}" )
            fi
            # sed doesn't seem to work on file descriptors for some very weird reason,
            # the linelength will always be zero, so check for that, too ...
            local readfile="$( readlink -- "$file" )"
            if [ -n "$readfile" ]; then
                # e.g. readlink might return pipe:[123456]
                if [ "${readfile::1}" != '/' ]; then
                    readfile="$( mktemp )"
                    isTmpFile=1
                    cat "$file" > "$readfile"
                    file="$readfile"
                else
                    file="$readfile"
                fi
            fi
            # we can't use read here else \x00 in the file gets ignored.
            # Plus read will ignore the last line if it does not have a \n!
            # Unfortunately using sed '<linenumber>p' prints an additional \n
            # on the last line, if it wasn't there, but I guess still better than
            # ignoring it ...
            local linelength offset nBytes="$( cat "$file" | wc -c )" line=1
            for (( offset = 0; offset < nBytes; )); do
                linelength=$( sed -n "$line{p;q}" -- "$file" | wc -c )
                (( ++line ))
                head -c $(( offset + $linelength )) -- "$file" |
                command hexdump -s $offset "${args[@]}" | sed '$d'
                (( offset += $linelength ))
            done
            # Hexdump displays a last empty line by default showing the
            # file size, but we delete this line in the loop using sed
            # Now insert this last empty line by letting hexdump skip all input
            head -c $offset -- "$file" | command hexdump -s $offset "${args[@]}"
            if [ "$isTmpFile" -eq 1 ]; then rm "$file"; fi
        fi
    }

You can try it with _echo -e "test\nbbb\nomg\n" | hexdump -n -C_, which prints:

    00000000  74 65 73 74 0a                                    |test.|
    00000005  62 62 62 0a                                       |bbb.|
    00000009  6f 6d 67 0a                                       |omg.|
    0000000d  0a                                                |.|
    0000000e
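
The offsets in such output stay correct because each line is dumped through head and the skip option instead of being cut out of the file. A minimal sketch of that one step follows; the helper name dump_slice and its arguments are made up for illustration, and a hexdump that supports -C and -s on piped input is assumed.

```shell
# dump_slice FILE START LEN: hypothetical helper that dumps LEN bytes
# starting at byte START while keeping the real file offset on the left.
dump_slice() {
    local file=$1 start=$2 len=$3
    # head cuts the input after the wanted range, -s skips everything
    # before it, and sed drops hexdump's final offset-only row.
    head -c "$(( start + len ))" -- "$file" |
        command hexdump -s "$start" -C |
        sed '$d'
}

tmp=$(mktemp)
printf 'test\nbbb\nomg\n' > "$tmp"
dump_slice "$tmp" 5 4    # second line of the file: bytes 5 to 8
rm -- "$tmp"
```

The row printed for this range starts at offset 00000005, just as it does in the full dump.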

As a bonus, here is my hexdiff function:

    hexdiff() {
        # compares two files linewise in their hexadecimal representation
        # create temporary files, because else the two 'hexdump -n' calls
        # get executed multiple times alternatingly when using named pipes:
        # colordiff <( hexdump -n -C "${@: -2:1}" ) <( hexdump -n -C "${@: -1:1}" )
        local a="$( mktemp )" b="$( mktemp )"
        hexdump -n -C "${@: -2:1}" | sed -r 's|^[0-9a-f]+[ \t]*||;' > "$a"
        hexdump -n -C "${@: -1:1}" | sed -r 's|^[0-9a-f]+[ \t]*||;' > "$b"
        colordiff "$a" "$b"
        rm "$a" "$b"
    }

For example, testing it with

    hexdiff <( printf "test\nbbb\x00 \nomg\nbar" ) <( printf "test\nbbb\nomg\nfoo" )

prints:

    2c2
    < 62 62 62 11 20 0a                                 |bbb. .|
    ---
    > 62 62 62 0a                                       |bbb.|
    4,5c4,5
    < 62 61 72                                          |bar|
    < 00000012
    ---
    > 0c 6f 6f                                          |.oo|
    > 00000010
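
The sed call in hexdiff removes the offset column before diffing. Without that step, a single inserted byte would shift every later offset and make all following rows differ. A small sketch of just that step, using a hypothetical helper name strip_offsets:

```shell
# strip_offsets: hypothetical helper that removes the leading offset column
# (hex digits plus the whitespace after them) from each dump row.
strip_offsets() {
    sed -E 's/^[0-9a-f]+[[:space:]]*//'
}

# The same bytes dumped at two different offsets compare equal afterwards.
a=$(printf '%s\n' '00000005  62 62 62 0a  |bbb.|' | strip_offsets)
b=$(printf '%s\n' '00000007  62 62 62 0a  |bbb.|' | strip_offsets)
[ "$a" = "$b" ] && echo "rows match once offsets are removed"
```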

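Edit: OK, this function is not suitable for larger files, say 8 MB. Tools like comparehex or dhex are not good enough either, because they ignore newlines and therefore cannot line up the differences well. Using a combination of od and sed is much faster: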

    hexlinedump() {
        local nChars=$1 file=$2
        paste -d$'\n' -- <(
            od -w$( cat -- "$file" | wc -c ) -tx1 -v -An -- "$file" |
            sed 's| 0a| 0a\n|g' |
            sed -r 's|(.{'"$(( 3*nChars ))"'})|\1\n|g' |
            sed '/^ *$/d'
        ) <(
            # need to delete empty lines, because 0a might be at the end of a char
            # boundary, so that not only 0a, but also the character limit introduces
            # a line break
            sed -r 's|(.{'"$nChars"'})|\1\n|g' -- "$file" | sed -r 's|(.)| \1 |g'
        )
    }
    hexdiff() {
        colordiff <( hexlinedump 16 "${@: -2:1}" ) <( hexlinedump 16 "${@: -1:1}" )
    }
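
The core of this faster variant can be tried on its own. The sketch below keeps only the od step and the split at 0a. It assumes GNU od (for -w) and GNU sed (for \n in the replacement), plus a non-empty input file; the name hex_lines is made up for illustration.

```shell
# hex_lines FILE: hypothetical helper that prints the bytes of FILE in hex,
# one output row per input line.
hex_lines() {
    local file=$1
    # -w<size> makes od print the whole file as a single row of " xx" pairs;
    # the first sed breaks that row after every 0a, the second drops the
    # empty row left behind when the file ends in a newline.
    od -An -tx1 -v -w"$(wc -c < "$file")" -- "$file" |
        sed 's/ 0a/ 0a\n/g' |
        sed '/^ *$/d'
}

tmp=$(mktemp)
printf 'ab\ncd\n' > "$tmp"
hex_lines "$tmp"
rm -- "$tmp"
```

Every byte is printed as a space followed by two hex digits, so the pattern " 0a" can only match a whole newline byte and never the tail of one byte joined to the head of the next.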

mikeserv · Answer

Well, there's printf...

    hex_split()(
        unset c dump slice rad pend
        _get(){ dd bs=1024 count=1; echo .; } 2>/dev/null
        _buf() case $((${#dump}>0)):$((${#slice}>0)) in
            (0:*) dump=$(_get); dump=${dump%.}
                  [ -n "$dump" ] || [ -n "$slice" ];;
            (*:0) [ "${#dump}" -lt 16 ] &&
                      slice=${dump:-$slice} dump= && return
                  slice=${dump%"${dump#$q}"} dump=${dump#$q};;esac
        _out(){
            printf "%08x%02.0s" "$rad" "$((rad+=$#/2))"
            printf "%02x %.0s" "$@"
            printf "%-$(((16-($#/2))*3))s"
            printf "%.0s%.1s" '' ' ' '' \| "$@" '' \| '' "$nl"
        };  q=$(printf %016s|tr \  \?) ; IFS=\  nl='
'
        rad=0 c=0 split=${split:-$nl} slice="$*"; set --
        while   [ -n "$slice" ] || _buf || ! ${1:+"_out"} "$@" &&
                c=${slice%"${slice#?}"} slice=${slice#?}
        do      set "$@" "'$c" "${c#[![:print:]]}."
                case $#$c in (32*|*$split) _out "$@"; set --;;esac
        done
    )

You can pass it stdin, arguments, or both. So...

    echo "something
    is
    being
    written
    here" | hex_split something else besides

...the above prints...

    00000000  73 6f 6d 65 74 68 69 6e 67 20 65 6c 73 65 20 62  |something else b|
    00000010  65 73 69 64 65 73 00 73 6f 6d 65 74 68 69 6e 67  |esides.something|
    00000020  0a                                               |.|
    00000021  69 73 0a                                         |is.|
    00000024  62 65 69 6e 67 0a                                |being.|
    0000002a  77 72 69 74 74 65 6e 0a                          |written.|
    00000032  68 65 72 65 0a                                   |here.|

You can change the default split character like so...

    split=${somechar} hex_split
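
This override works because the function reads the variable through a default expansion and the assignment is written as a prefix to the call. A minimal sketch of the same idiom, with a made-up function show_split standing in for hex_split:

```shell
# show_split: hypothetical stand-in that reports which separator it would
# use, falling back to a default when split is unset or empty.
show_split() {
    local sep=${split:-newline}
    printf 'splitting on: %s\n' "$sep"
}

show_split             # no override: the default is used
split=, show_split     # prefix assignment: applies to this call
```

Putting the assignment in front of the command means nothing has to be exported or reset afterwards in bash, which makes it a convenient way to pass a one-off option to a shell function.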