Shell script to remove double quotes inside a column value

Question

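I have an input text file with 10 columns. While processing it, I get data like the following in one of the middle columns, and I need the column value converted as shown below.

Input column value: "This is my new program: "Hello World""

Required column value: "This is my new program: Hello World"

Please help me with a Unix shell script or command. Thank you very much in advance.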

Jesus A. Sanchez · Answer

A very simple option, if you want to remove all double quotes, is to use sed as @Dani suggested.

    $ echo "This is my program \"Hello World\"" | sed 's/"//g'
    This is my program Hello World

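However, if you want to remove only the inner quotes, I suggest removing all quotes and then adding one at the beginning and one at the end, as follows.

Suppose you have a file sample.txt with the following content: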

    $ cat sample.txt
    "This is the "First" Line"
    "This is the "Second" Line"
    "This is the "Third" Line"

Then, to remove only the inner quotes, I suggest the following:

    $ cat sample.txt | sed 's/"//g' | sed 's/^/"/' | sed 's/$/"/'
    "This is the First Line"
    "This is the Second Line"
    "This is the Third Line"

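Explanation:

sed 's/"//g' removes all double quotes on each line

sed 's/^/"/' adds a double quote at the beginning of each line

sed 's/$/"/' adds a double quote at the end of each line

sed 's/|/"|"/g' adds a quote before and after each pipe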
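The first three substitutions can also run in a single sed process by passing several -e expressions. This is only a minimal sketch: the function name requote_line is hypothetical, and it assumes each input line holds exactly one quoted value.

```shell
#!/bin/sh
# requote_line (hypothetical name): reads lines on stdin, removes every
# double quote, then wraps each whole line in one pair of double quotes.
requote_line() {
    sed -e 's/"//g' -e 's/^/"/' -e 's/$/"/'
}

# Example: the inner quotes around First disappear, the outer pair is restored.
printf '%s\n' '"This is the "First" Line"' | requote_line
```

The result is the same as the three-stage pipeline; the only difference is that one sed process does all the work.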

Hope it helps.

EDIT: following the comment about the pipe separator, the command needs a small change.

Let sample.txt be:

    $ cat sample.txt
    "This is the "First" column"|"This is the "Second" column"|"This is the "Third" column"

Then, adding a replace command for the pipe gives the final solution.

    $ cat sample.txt | sed 's/"//g' | sed 's/^/"/' | sed 's/$/"/' | sed 's/|/"|"/g'
    "This is the First column"|"This is the Second column"|"This is the Third column"
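As a sketch of what this pipeline does and where it stops being enough, the same four substitutions are wrapped below in a hypothetical function, requote_fields. Because every quote is removed and every pipe is then surrounded by quotes, a field that was never quoted (a number, for instance) comes out quoted as well.

```shell
#!/bin/sh
# requote_fields (hypothetical name): strip all double quotes, then put one
# pair of quotes around every pipe-separated field of each line.
requote_fields() {
    sed -e 's/"//g' -e 's/|/"|"/g' -e 's/^/"/' -e 's/$/"/'
}

# All-text input: only the inner quotes are lost.
printf '%s\n' '"This is the "First" column"|"This is the "Second" column"' | requote_fields

# Mixed input: the unquoted field 12345 also gets quoted.
printf '%s\n' '"text "a" here"|12345' | requote_fields
```

If unquoted columns must stay unquoted, each field has to be examined on its own rather than rewriting the whole line at once.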

Script option

Using this sample.txt file:

    $ cat sample.txt
    "This is the "first" column"|12345|"This is the "second" column"|67890|"This is the "third" column"

And this script:

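    #!/bin/ksh
    counter=1
    column="initialized"
    result=""
    # Walk the columns one by one until cut returns an empty field.
    while [[ "$column" != "" ]]
    do
        # eval re-parses the field as shell words, which drops its double
        # quotes. Do not use this on untrusted input.
        eval "column=$(cat sample.txt | cut -d"|" -f$counter)"
        # text is set only when the field contains a double quote.
        eval "text=$(cat sample.txt | cut -d"|" -f$counter | grep '"')"
        if [[ "$column" = "$text" && -n "$column" ]]
        then
            # Quoted field: mark where the outer quotes go back.
            if [[ "$result" = "" ]]
            then
                result="_2quotehere_${column}_2quotehere_"
            else
                result="${result}|_2quotehere_${column}_2quotehere_"
            fi
        else
            # Unquoted field: copy it unchanged.
            if [[ -n "$column" ]]
            then
                if [[ "$result" = "" ]]
                then
                    result="${column}"
                else
                    result="${result}|${column}"
                fi
            fi
        fi
        # Turn the placeholders into real double quotes.
        echo $result | sed 's/_2quotehere_/"/g' > output.txt
        (( counter+=1 ))
    done
    cat output.txt
    exit 0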

You will get this:

    $ ./process.sh
    "This is the first column"|12345|"This is the second column"|67890|"This is the third column"
    $ cat output.txt
    "This is the first column"|12345|"This is the second column"|67890|"This is the third column"

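I hope this is the processing you need.

Let me know!

FINAL EDIT

This script processes the input line you provided, repeated several times. The only restriction is that all 20 columns must be on the same line.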

    #!/bin/ksh
    rm output.txt > /dev/null 2>&1
    column="initialized"
    result=""
    lineCounter=1
    while read line
    do
        print "LINE $lineCounter: $line"
        counter=1
        # Each line is expected to hold exactly 20 pipe-separated columns.
        while [[ ${counter} -le 20 ]]
        do
            eval 'column=$(print ${line} | cut -d"|" -f$counter)'
            # text is non-empty only when the column contains a double quote.
            eval 'text=$(print ${line} | cut -d"|" -f$counter | grep \")'
            print "LINE ${lineCounter} COLUMN ${counter}: $column"
            if [[ "$column" = "$text" && -n ${column} ]]
            then
                # Quoted column: strip every quote and mark the outer pair.
                if [[ "$result" = "" ]]
                then
                    result="_2quotehere_$(echo ${column} | sed 's/\"//g')_2quotehere_"
                else
                    result="${result}|_2quotehere_$( echo ${column} | sed 's/\"//g')_2quotehere_"
                fi
            else
                # Unquoted column: copy it unchanged.
                if [[ "$result" = "" ]]
                then
                    result=${column}
                else
                    result="${result}|${column}"
                fi
            fi
            (( counter+=1 ))
        done
        (( lineCounter+=1 ))
        # Turn the placeholders into real double quotes and append the line.
        echo -e $result | sed 's/_2quotehere_/"/g' >> output.txt
        result=""
    done < input.txt
    print "OUTPUT CONTENTS:"
    cat output.txt
    exit 0

From here, you should be able to make it work for your particular case.
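For comparison, the same per-field rule (a field that contains a double quote loses all its quotes and gets one outer pair, any other field is copied unchanged) can be sketched without eval and without a fixed column count. This is not the script above, only an illustration in plain POSIX sh; the names requote_field and process_line are hypothetical.

```shell
#!/bin/sh
# requote_field (hypothetical helper): if the field contains a double quote,
# remove all quotes and wrap the field in one pair; otherwise print it as is.
requote_field() {
    case $1 in
        *\"*) printf '"%s"' "$(printf '%s' "$1" | tr -d '"')" ;;
        *)    printf '%s' "$1" ;;
    esac
}

# process_line (hypothetical helper): split one line on "|" with parameter
# expansion and rebuild it field by field.
process_line() {
    rest=$1
    out=
    while :; do
        case $rest in
            *'|'*)
                field=${rest%%|*}      # text before the first pipe
                rest=${rest#*|}        # text after the first pipe
                out="$out$(requote_field "$field")|"
                ;;
            *)
                out="$out$(requote_field "$rest")"
                break
                ;;
        esac
    done
    printf '%s\n' "$out"
}

# Example: read lines from stdin and print the cleaned version of each.
printf '%s\n' '"This is the "first" column"|12345|"This is the "second" column"' |
while IFS= read -r line; do
    process_line "$line"
done
```

Since the data is never passed through eval, characters such as $ or backticks inside a field are treated as plain text.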

user79743 · Answer

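The simplest criterion for deciding whether to edit a field is "does it contain a letter".
Fields made up only of digits (and a few symbols such as ".", "," and "-") should be left as they are.
The following simple awk script does the job: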

    #!/bin/bash
    awk -v FS='|' -v OFS='|' '{
        for ( i=1; i<=NF; i++) {
            if ( $i ~ /[a-zA-Z]/ ) {
                gsub(/["]/,"",$i); $i="\"" $i "\""   # Remove dquotes, add them back.
            }
        }
    }1' input.txt >output.txt
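To see what the "contains a letter" criterion implies, here is a small sketch that runs the same awk logic on stdin instead of input.txt. The wrapper name quote_text_fields and the sample lines are made up for illustration.

```shell
#!/bin/sh
# quote_text_fields (hypothetical name): same awk logic as above, but reading
# stdin and writing stdout so it can be tried on sample lines.
quote_text_fields() {
    awk -v FS='|' -v OFS='|' '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /[a-zA-Z]/) {
                gsub(/"/, "", $i)      # drop every double quote in the field
                $i = "\"" $i "\""      # add one pair back around the field
            }
        }
    }1'
}

# Text fields are requoted, numeric fields pass through unchanged.
printf '%s\n' '"This is the "first" column"|12345|"This is the "second" column"|67890' | quote_text_fields

# Edge cases: unquoted text gains quotes, and a quoted number keeps its quotes.
printf '%s\n' 'plain text|"42"' | quote_text_fields
```

The second example shows that the test looks only at the characters in the field, not at whether the field was quoted in the first place.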