Convert a tab-delimited txt file to a csv file using Python

Question

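I want to convert a simple tab-delimited text file into a csv file. If I read the txt file into a string and call string.split('\n'), I get a list in which each item is a string with '\t' between the columns. I thought I could just replace each '\t' with a comma, but I can't get the strings inside the list to be treated as strings so that string.replace works on them. Here is the start of my code, which still needs a way to parse the tab '\t':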

    import csv
    import sys

    txt_file = r"mytxt.txt"
    csv_file = r"mycsv.csv"

    in_txt = open(txt_file, "r")
    out_csv = csv.writer(open(csv_file, 'wb'))

    file_string = in_txt.read()

    file_list = file_string.split('\n')

    # each row is still a single string with '\t' between the columns
    for row in file_list:
        out_csv.writerow(row)

agf · Accepted Answer

csv supports tab-delimited files. Supply the delimiter argument to reader:

    import csv

    txt_file = r"mytxt.txt"
    csv_file = r"mycsv.csv"

    # use 'with' if the program isn't going to immediately terminate
    # so you don't leave files open
    # the 'b' is necessary on Windows
    # it prevents \x1a, Ctrl-z, from ending the stream prematurely
    # and also stops Python converting to / from different line terminators
    # On other platforms, it has no effect
    in_txt = csv.reader(open(txt_file, "rb"), delimiter='\t')
    out_csv = csv.writer(open(csv_file, 'wb'))

    out_csv.writerows(in_txt)
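The code above is written for Python 2, where the csv module expects files opened in binary mode. As a minimal sketch of the same idea for Python 3 (an assumption about the reader's environment, not part of the original answer), the files are opened in text mode with newline='', which is what the Python 3 csv documentation asks for. The function name tsv_to_csv is hypothetical.

```python
import csv


def tsv_to_csv(txt_path, csv_path):
    """Copy a tab-delimited file to a comma-delimited one (Python 3)."""
    # newline='' hands line-ending handling to the csv module; it plays
    # the role that the 'rb'/'wb' modes play in the Python 2 code.
    with open(txt_path, newline='') as in_file, \
            open(csv_path, 'w', newline='') as out_file:
        reader = csv.reader(in_file, delimiter='\t')
        writer = csv.writer(out_file)
        # Each row is a list of fields; the writer adds quoting where needed.
        writer.writerows(reader)
```

With the file names from the question, the call would be tsv_to_csv("mytxt.txt", "mycsv.csv"). Both files use the platform's default text encoding here; pass an explicit encoding to open() if the data needs one.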

John Machin · Answer

Why you should always use 'rb' mode when reading files with the csv module:

    Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.

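What's in the sample file: any old rubbish, including control characters, obtained by extracting blobs or whatever from a database, or by injudicious use of the CHAR function in Excel formulas, or ...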

    >>> open('demo.txt', 'rb').read()
    'h1\t"h2a\nh2b"\th3\r\nx1\t"x2a\r\nx2b"\tx3\r\ny1\ty2a\x1ay2b\ty3\r\n'

Python follows CP/M, MS-DOS, and Windows when it reads files in text mode: \r\n is recognized as the line separator and is served up as \n, and \x1a, aka Ctrl-Z, is recognized as an END-OF-FILE marker.

    >>> open('demo.txt', 'r').read()
    'h1\t"h2a\nh2b"\th3\nx1\t"x2a\nx2b"\tx3\ny1\ty2a' # WHOOPS

csv with a file opened with 'rb' works as expected:

    >>> import csv
    >>> list(csv.reader(open('demo.txt', 'rb'), delimiter='\t'))
    [['h1', 'h2a\nh2b', 'h3'], ['x1', 'x2a\r\nx2b', 'x3'], ['y1', 'y2a\x1ay2b', 'y3']]

but text mode doesn't:

    >>> list(csv.reader(open('demo.txt', 'r'), delimiter='\t'))
    [['h1', 'h2a\nh2b', 'h3'], ['x1', 'x2a\nx2b', 'x3'], ['y1', 'y2a']]
    >>>
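The session above is Python 2 on Windows. In Python 3 the csv module reads text rather than bytes, so 'rb' is not an option there; the csv documentation recommends opening the file with newline='' instead. The following is a rough sketch of that variant, not part of the original answer. DEMO_BYTES is a reconstruction of the demo.txt contents (the \r\n line endings are an assumption), read_demo is a hypothetical helper, and latin-1 is chosen only so that every byte decodes.

```python
import csv
import os
import tempfile

# Reconstruction of the demo.txt contents; \r\n line endings are assumed.
DEMO_BYTES = (b'h1\t"h2a\nh2b"\th3\r\n'
              b'x1\t"x2a\r\nx2b"\tx3\r\n'
              b'y1\ty2a\x1ay2b\ty3\r\n')


def read_demo():
    """Write DEMO_BYTES to a temporary file and parse it with csv.reader."""
    fd, path = tempfile.mkstemp(suffix='.txt')
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(DEMO_BYTES)
        # newline='' switches off newline translation, so the csv module
        # sees the \n and \r\n characters exactly as stored in the file.
        with open(path, newline='', encoding='latin-1') as f:
            return list(csv.reader(f, delimiter='\t'))
    finally:
        os.remove(path)


rows = read_demo()
for row in rows:
    print(row)
```

Opened this way, the line breaks inside the quoted fields reach csv.reader unchanged, which is the property the 'rb' mode provides in the Python 2 session.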

iun1x · Answer

This is how I do it:

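    import csv

    txtfile = "mytxt.txt"  # input path (file names taken from the question)
    csvfile = "mycsv.csv"  # output path

    with open(txtfile, 'r') as infile, open(csvfile, 'w') as outfile:
        stripped = (line.strip() for line in infile)
        # split on tabs, since the input is tab-delimited
        lines = (line.split("\t") for line in stripped if line)
        writer = csv.writer(outfile)
        writer.writerows(lines)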
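One reason to hand the split fields to csv.writer, rather than replace each tab with a comma as the question first considered, is that a field may itself contain a comma. The sketch below illustrates the difference on an in-memory string; the helper tabs_to_csv_text and the sample data are made up for illustration.

```python
import csv
import io


def tabs_to_csv_text(tab_text):
    """Split each non-empty line on tabs and write the fields with csv.writer."""
    out = io.StringIO()
    writer = csv.writer(out)
    for line in tab_text.splitlines():
        if line:
            writer.writerow(line.split('\t'))
    return out.getvalue()


# Two columns; the second field of the data row contains a comma.
sample = 'name\tnote\nAda\tlikes tea, coffee\n'

# Plain replacement turns the comma inside the field into a column break.
naive = sample.replace('\t', ',')

# csv.writer quotes the field, so it stays a single column.
quoted = tabs_to_csv_text(sample)

print(list(csv.reader(io.StringIO(naive))))
print(list(csv.reader(io.StringIO(quoted, newline=''))))
```

Reading both results back with csv.reader shows the naive version gaining an extra column in the data row, while the quoted version keeps two columns.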