How to extract logs between two timestamps

Question

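I want to extract all logs between two timestamps. Some lines may not have a timestamp, but I want those lines as well. In short, I want every line that falls between the two timestamps. My log structure looks like this: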

    [2014-04-07 23:59:58] CheckForCallAction [ERROR] Exception caught in +CheckForCallAction :: null
    --Checking user--
    Post
    [2014-04-08 00:00:03] MobileAppRequestFilter [DEBUG] Action requested checkforcall

Suppose I want to extract everything between 2014-04-07 23:00 and 2014-04-08 02:00.

Note that the start or end timestamp may not itself appear in the log, but I still want every line between the two.

maxschlepzig · Answer

You can use awk for this:

    $ awk -F'[]]|[[]' \
      '$0 ~ /^\[/ && $2 >= "2014-04-07 23:00" { p=1 }
       $0 ~ /^\[/ && $2 >= "2014-04-08 02:00" { p=0 }
                                            p { print $0 }' log

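Where:

_-F_ specifies the characters _[_ and _]_ as field separators, using a regular expression
_$0_ refers to the complete line
_$2_ refers to the date field
_p_ is used as a boolean variable that guards the actual printing
_$0 ~ /regex/_ is true if the regex matches _$0_
_>=_ compares strings lexicographically (equivalent to e.g. strcmp()); this orders the timestamps chronologically because the format is zero-padded and runs from year down to second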
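To see the flag logic at work, here is a minimal sketch that wraps the same awk program in a shell function and runs it on a small sample. The names `extract_range` and `sample_log` and the extra sample lines are assumptions made for illustration; they are not part of the original answer.

```shell
#!/bin/sh
# extract_range START END  (hypothetical helper)
# Reads a log on stdin and prints the lines from the first timestamp >= START
# up to, but not including, the first timestamp >= END.
extract_range() {
    awk -F'[]]|[[]' -v start="$1" -v end="$2" '
        $0 ~ /^\[/ && $2 >= start { p = 1 }   # switch printing on
        $0 ~ /^\[/ && $2 >= end   { p = 0 }   # switch printing off
        p { print $0 }                        # untimestamped lines inherit p
    '
}

# Made-up sample input: one entry before the range, one after it.
sample_log() {
    cat <<'EOF'
[2014-04-07 22:59:59] Before [INFO] too early
continuation of the too early entry
[2014-04-07 23:59:58] CheckForCallAction [ERROR] Exception caught
--Checking user--
Post
[2014-04-08 00:00:03] MobileAppRequestFilter [DEBUG] Action requested checkforcall
[2014-04-08 02:00:00] After [INFO] too late
EOF
}

sample_log | extract_range "2014-04-07 23:00" "2014-04-08 02:00"
```

Neither boundary timestamp occurs in the sample, yet the four lines from `23:59:58` through `00:00:03` are printed, including the two lines that carry no timestamp.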

Variations

The command line above implements right-open time interval matching. To get closed interval semantics, just increment the right-hand (end) date, e.g.:

    $ awk -F'[]]|[[]' \
      '$0 ~ /^\[/ && $2 >= "2014-04-07 23:00"    { p=1 }
       $0 ~ /^\[/ && $2 >= "2014-04-08 02:00:01" { p=0 }
                                               p { print $0 }' log
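The difference between the two interval semantics only shows up for a line stamped exactly on the end boundary. The following sketch makes that visible; the helper names `between` and `boundary_log` and the three sample lines are made up for illustration.

```shell
#!/bin/sh
# between START END  (hypothetical helper around the same awk program)
between() {
    awk -F'[]]|[[]' -v start="$1" -v end="$2" '
        $0 ~ /^\[/ && $2 >= start { p = 1 }
        $0 ~ /^\[/ && $2 >= end   { p = 0 }
        p { print $0 }
    '
}

# Made-up input with one entry exactly at 02:00:00.
boundary_log() {
    cat <<'EOF'
[2014-04-08 01:59:59] A [INFO] just inside
[2014-04-08 02:00:00] B [INFO] exactly on the end boundary
[2014-04-08 02:00:01] C [INFO] just outside
EOF
}

echo "right-open:"   # end "02:00" sorts before "02:00:00", so B is dropped
boundary_log | between "2014-04-07 23:00" "2014-04-08 02:00"

echo "closed:"       # end "02:00:01" keeps B and drops C
boundary_log | between "2014-04-07 23:00" "2014-04-08 02:00:01"
```

With the right-open end only entry A is printed; with the incremented end, A and B are printed.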

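If you want to match timestamps in another format, you have to modify the _$0 ~ /^\[/_ sub-expression. Note that this sub-expression is what keeps lines without a timestamp out of the print on/off logic.

For example, for a timestamp format like _YYYY-MM-DD HH24:MI:SS_ (without the _[]_ brackets) you could modify the command like this: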

    $ awk \
      '$0 ~ /^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-2][0-9]:[0-5][0-9]:[0-5][0-9]/ {
          if ($1" "$2 >= "2014-04-07 23:00")    p=1;
          if ($1" "$2 >= "2014-04-08 02:00:01") p=0;
       }
       p { print $0 }' log

(Note that the field separator has changed as well: it is now the default, i.e. blank/non-blank transitions.)

cpugeniusmv · Answer

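Check out dategrep at https://github.com/mdom/dategrep

Description:

dategrep searches the named input files for lines matching a date range and prints them to stdout.

If dategrep works on a seekable file, it can do a binary search to find the first and last line to print, which is pretty efficient. dategrep can also read from stdin if one of the filename arguments is just a hyphen, but in that case it has to parse every single line, which is slower.

Usage examples: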

    dategrep --start "12:00" --end "12:15" --format "%b %d %H:%M:%S" syslog
    dategrep --end "12:15" --format "%b %d %H:%M:%S" syslog
    dategrep --last-minutes 5 --format "%b %d %H:%M:%S" syslog
    dategrep --last-minutes 5 --format rsyslog syslog
    cat syslog | dategrep --end "12:15" -

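Note, however, that the following limitation may make it unsuitable for your exact question:

At the moment dategrep will die as soon as it finds a line that is not parsable. In a future version this will be configurable.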

Bratchley · Answer

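One alternative to awk or a non-standard tool is to use GNU grep for its context options. GNU grep lets you specify the number of lines to print after a positive match with -A and the number of preceding lines to print with -B. For example: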

    [davisja5@xxxxxxlp01 ~]$ cat test.txt
    Ignore this line, please.
    This one too while you're at it...
    [2014-04-07 23:59:58] CheckForCallAction [ERROR] Exception caught in +CheckForCallAction :: null
    --Checking user--
    Post
    [2014-04-08 00:00:03] MobileAppRequestFilter [DEBUG] Action requested checkforcall
    we don't want these lines.
    [davisja5@xxxxxxlp01 ~]$ egrep "^\[2014-04-07 23:59:58\]" test.txt -A 10000 | egrep "^\[2014-04-08 00:00:03\]" -B 10000
    [2014-04-07 23:59:58] CheckForCallAction [ERROR] Exception caught in +CheckForCallAction :: null
    --Checking user--
    Post
    [2014-04-08 00:00:03] MobileAppRequestFilter [DEBUG] Action requested checkforcall

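The first egrep prints the line that matches your starting pattern plus the 10,000 lines that follow it, so the output starts where you want and (hopefully) runs to the end. The second egrep in the pipeline then prints only the line with the ending delimiter and the 10,000 lines before it. The end result is output that starts where you want and does not go past the point where you told it to stop.

10,000 is just a number I came up with. Feel free to change it to a million if you think your output is going to be too long.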
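If guessing a context size feels fragile, one possible alternative (a different technique, not part of this answer) is to look up the line numbers of the two markers with `grep -n` and print exactly that range with `sed`. The helper name `print_between` is hypothetical, and the sketch assumes the end marker occurs after the start marker.

```shell
#!/bin/sh
# print_between FILE START_REGEX END_REGEX  (hypothetical helper)
# Prints FILE from the first line matching START_REGEX through the first
# line matching END_REGEX. Assumes the end marker comes after the start.
print_between() {
    first=$(grep -n -e "$2" "$1" | head -n 1 | cut -d: -f1)
    last=$(grep -n -e "$3" "$1" | head -n 1 | cut -d: -f1)
    # Give up if either marker is missing from the file.
    [ -n "$first" ] && [ -n "$last" ] || return 1
    sed -n "${first},${last}p" "$1"
}

# Demo on a throwaway copy of the sample file shown above.
demo=$(mktemp)
cat > "$demo" <<'EOF'
Ignore this line, please.
This one too while you're at it...
[2014-04-07 23:59:58] CheckForCallAction [ERROR] Exception caught in +CheckForCallAction :: null
--Checking user--
Post
[2014-04-08 00:00:03] MobileAppRequestFilter [DEBUG] Action requested checkforcall
we don't want these lines.
EOF
print_between "$demo" '^\[2014-04-07 23:59:58]' '^\[2014-04-08 00:00:03]'
rm -f "$demo"
```

Like the egrep pipeline, this only works when both timestamps actually occur in the file.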

UnX · Answer

Using sed:

    #!/bin/bash

    E_BADARGS=23

    if [ $# -ne "3" ]
    then
      echo "Usage: `basename $0` \"<start_date>\" \"<end_date>\" file"
      echo "NOTE:Make sure to put dates in between double quotes"
      exit $E_BADARGS
    fi

    isDatePresent(){
      #check if given date exists in file.
      local date=$1
      local file=$2
      grep -q "$date" "$file"
      return $?
    }

    convertToEpoch(){
      #converts to Epoch time
      local _date=$1
      local Epoch_date=`date --date="$_date" +%s`
      echo $Epoch_date
    }

    convertFromEpoch(){
      #converts to date/time format from Epoch
      local Epoch_date=$1
      local _date=`date --date="@$Epoch_date" +"%F %T"`
      echo $_date
    }

    getDates(){
      # collects all dates at beginning of lines in a file, converts them to Epoch and returns a sequence of numbers
      local file="$1"
      local state="$2"
      local i=0
      local date_array=( )
      # NOTE: inside [[ ]], -eq compares arithmetically, so "S" and "E" both
      # evaluate to 0 and this first branch is always taken. findneighbours
      # relies on the ascending order that results.
      if [[ "$state" -eq "S" ]];then
        datelist=`cat "$file" | sed -r -e "s/^\[([^\[]+)\].*/\1/" | egrep "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}"`
      elif [[ "$state" -eq "E" ]];then
        datelist=`tac "$file" | sed -r -e "s/^\[([^\[]+)\].*/\1/" | egrep "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}"`
      else
        echo "Something went wrong while getting dates..." 1>&2
        exit 500
      fi

      while read _date
      do
        Epoch_date=`convertToEpoch "$_date"`
        date_array[$i]=$Epoch_date
        #echo "$_date" "$Epoch_date" 1>&2
        (( i++ ))
      done<<<"$datelist"
      echo ${date_array[@]}
    }

    findneighbours(){
      # search next best date if date is not in the file using recursivity
      IFS="$old_IFS"
      local elt=$1
      shift
      local state="$1"
      shift
      local -a array=( "$@" )

      index_pivot=`expr ${#array[@]} / 2`
      echo "#array="${#array[@]} ";array="${array[@]} ";index_pivot="$index_pivot 1>&2
      if [ "$index_pivot" -eq 1 -a ${#array[@]} -eq 2 ];then
        if [ "$state" == "E" ];then
          echo ${array[0]}
        elif [ "$state" == "S" ];then
          echo ${array[(( ${#array[@]} - 1 ))]}
        else
          echo "State" $state "undefined" 1>&2
          exit 100
        fi
      else
        echo "elt with index_pivot="$index_pivot":"${array[$index_pivot]} 1>&2
        if [ $elt -lt ${array[$index_pivot]} ];then
          echo "elt is smaller than pivot" 1>&2
          array=( ${array[@]:0:(($index_pivot + 1)) } )
        else
          echo "elt is bigger than pivot" 1>&2
          array=( ${array[@]:$index_pivot:(( ${#array[@]} - 1 ))} )
        fi
        findneighbours "$elt" "$state" "${array[@]}"
      fi
    }

    findFirstDate(){
      local file="$1"
      echo "Looking for first date in file" 1>&2
      while read line
      do
        echo "$line" | egrep -q "^\[[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\]" &>/dev/null
        if [ "$?" -eq "0" ]
        then
          #echo "line=" "$line" 1>&2
          firstdate=`echo "$line" | sed -r -e "s/^\[([^\[]+)\].*/\1/"`
          echo "$firstdate"
          break
        else
          echo $? 1>&2
        fi
      done< <( cat "$file" )
    }

    findLastDate(){
      local file="$1"
      echo "Looking for last date in file" 1>&2
      while read line
      do
        echo "$line" | egrep -q "^\[[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\]" &>/dev/null
        if [ "$?" -eq "0" ]
        then
          #echo "line=" "$line" 1>&2
          lastdate=`echo "$line" | sed -r -e "s/^\[([^\[]+)\].*/\1/"`
          echo "$lastdate"
          break
        else
          echo $? 1>&2
        fi
      done< <( tac "$file" )
    }

    findBestDate(){
      IFS="$old_IFS"
      local initdate="$1"
      local file="$2"
      local state="$3"
      local first_elts="$4"
      local last_elts="$5"
      local date_array=( )
      local initdate_Epoch=`convertToEpoch "$initdate"`

      if [[ $initdate_Epoch -lt $first_elt ]];then
        echo `convertFromEpoch "$first_elt"`
      elif [[ $initdate_Epoch -gt $last_elt ]];then
        echo `convertFromEpoch "$last_elt"`
      else
        date_array=( `getDates "$file" "$state"` )
        echo "date_array="${date_array[@]} 1>&2
        #first_elt=${date_array[0]}
        #last_elt=${date_array[(( ${#date_array[@]} - 1 ))]}
        echo `convertFromEpoch $(findneighbours "$initdate_Epoch" "$state" "${date_array[@]}")`
      fi
    }

    main(){
      init_date_start="$1"
      init_date_end="$2"
      filename="$3"
      echo "problem start.." 1>&2
      date_array=( "$init_date_start","$init_date_end" )
      flag_array=( 0 0 )
      i=0
      #echo "$IFS" | cat -vte
      old_IFS="$IFS"
      #changing separator to avoid whitespace issue in date/time format
      IFS=,
      for _date in ${date_array[@]}
      do
        #IFS="$old_IFS"
        #echo "$IFS" | cat -vte
        if isDatePresent "$_date" "$filename";then
          if [ "$i" -eq 0 ];then
            echo "Starting date exists" 1>&2
            #echo "date_start=""$_date" 1>&2
            date_start="$_date"
          else
            echo "Ending date exists" 1>&2
            #echo "date_end=""$_date" 1>&2
            date_end="$_date"
          fi
        else
          if [ "$i" -eq 0 ];then
            echo "start date $_date not found" 1>&2
          else
            echo "end date $_date not found" 1>&2
          fi
          flag_array[$i]=1
        fi
        #IFS=,
        (( i++ ))
      done

      IFS="$old_IFS"
      if [ ${flag_array[0]} -eq 1 -o ${flag_array[1]} -eq 1 ];then
        first_elt=`convertToEpoch "$(findFirstDate "$filename")"`
        last_elt=`convertToEpoch "$(findLastDate "$filename")"`
        border_dates_array=( "$first_elt","$last_elt" )
        #echo "first_elt=" $first_elt "last_elt=" $last_elt 1>&2
        i=0
        IFS=,
        for _date in ${date_array[@]}
        do
          if [ $i -eq 0 -a ${flag_array[$i]} -eq 1 ];then
            date_start=`findBestDate "$_date" "$filename" "S" "${border_dates_array[@]}"`
          elif [ $i -eq 1 -a ${flag_array[$i]} -eq 1 ];then
            date_end=`findBestDate "$_date" "$filename" "E" "${border_dates_array[@]}"`
          fi
          (( i++ ))
        done
      fi

      sed -r -n "/^\[${date_start}\]/,/^\[${date_end}\]/p" "$filename"
    }

    main "$1" "$2" "$3"

Copy this into a file. Debugging output is sent to stderr, so if you don't want to see it, just append "2>/dev/null" to the command.
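Once both boundary timestamps are known to exist in the file, the script's last step is a plain sed range print. Here is a minimal sketch of just that step. The sample file contents are made up, and it uses a basic regular expression without `-r` so that it does not depend on GNU sed.

```shell
#!/bin/sh
# Hypothetical sample file; both boundary timestamps occur in it.
log=$(mktemp)
cat > "$log" <<'EOF'
[2014-04-07 22:00:00] Early [INFO] not wanted
[2014-04-07 23:59:58] CheckForCallAction [ERROR] Exception caught
--Checking user--
Post
[2014-04-08 00:00:03] MobileAppRequestFilter [DEBUG] Action requested checkforcall
[2014-04-08 03:00:00] Late [INFO] not wanted
EOF

date_start="2014-04-07 23:59:58"
date_end="2014-04-08 00:00:03"

# Print from the first line stamped date_start through the first later
# line stamped date_end, inclusive.
result=$(sed -n "/^\[${date_start}]/,/^\[${date_end}]/p" "$log")
printf '%s\n' "$result"
rm -f "$log"
```

The rest of the script exists to choose usable values for `date_start` and `date_end` when the requested timestamps are not literally present in the log.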