How to read a single character at a time from a file in Python?

Question

Can anyone tell me how I can do this?

jchl · Accepted Answer

    with open(filename) as f:
        while True:
            c = f.read(1)
            if not c:
                print "End of file"
                break
            print "Read a character:", c
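The loop above uses Python 2 `print` statements. As a minimal sketch of the same idea in Python 3, assuming a UTF-8 text file: the helper name `read_chars` and the `encoding` parameter are illustrative and not part of the answer. In text mode, `f.read(1)` returns one decoded character, which may occupy more than one byte on disk.

```python
def read_chars(path, encoding="utf-8"):
    """Yield the contents of `path` one character at a time (hypothetical helper)."""
    with open(path, encoding=encoding) as f:
        while True:
            c = f.read(1)  # one decoded character, or '' at end of file
            if not c:
                break
            yield c
```

Usage would look like `for c in read_chars("example.txt"): print("Read a character:", c)`, where `example.txt` is a placeholder file name.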

Raj · Answer

First, open a file:

    with open("filename") as fileobj:
        for line in fileobj:
            for ch in line:
                print ch

Escualo · Answer

I like the accepted answer: it is straightforward and will get the job done. I would also like to offer an alternative implementation:

    def chunks(filename, buffer_size=4096):
        """Reads `filename` in chunks of `buffer_size` bytes and yields each chunk
        until no more characters can be read; the last chunk will most likely have
        less than `buffer_size` bytes.

        :param str filename: Path to the file
        :param int buffer_size: Buffer size, in bytes (default is 4096)
        :return: Yields chunks of `buffer_size` size until exhausting the file
        :rtype: str

        """
        with open(filename, "rb") as fp:
            chunk = fp.read(buffer_size)
            while chunk:
                yield chunk
                chunk = fp.read(buffer_size)

    def chars(filename, buffersize=4096):
        """Yields the contents of file `filename` character-by-character. Warning:
        will only work for encodings where one character is encoded as one byte.

        :param str filename: Path to the file
        :param int buffersize: Buffer size for the underlying chunks,
        in bytes (default is 4096)
        :return: Yields the contents of `filename` character-by-character.
        :rtype: char

        """
        for chunk in chunks(filename, buffersize):
            for char in chunk:
                yield char

    def main(buffersize, filenames):
        """Reads several files character by character and redirects their contents
        to `/dev/null`.

        """
        for filename in filenames:
            with open("/dev/null", "wb") as fp:
                for char in chars(filename, buffersize):
                    fp.write(char)

    if __name__ == "__main__":
        # Try reading several files varying the buffer size
        import sys
        buffersize = int(sys.argv[1])
        filenames = sys.argv[2:]
        sys.exit(main(buffersize, filenames))

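The code I suggest is essentially the same idea as the accepted answer: read a given number of bytes from the file. The difference is that it first reads a good-sized chunk of data (4096 is a good default for X86, but you may want to try 1024 or 8192, or any multiple of your page size), and then yields the characters in that chunk one by one.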
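As an aside on the one-byte-per-character warning in the `chars` docstring: the sketch below shows one way the chunked approach might be extended to multi-byte encodings in Python 3. It is not part of the answer. The `encoding` parameter is an assumption added for illustration, and the decoding relies on the standard library's `codecs.getincrementaldecoder`, which buffers an incomplete byte sequence when a character is split across two chunks.

```python
import codecs


def chars(filename, buffersize=4096, encoding="utf-8"):
    """Yield decoded characters while reading `filename` in binary chunks.

    Sketch only: `encoding` is an illustrative parameter, not from the answer.
    """
    # The incremental decoder remembers trailing bytes that do not yet form
    # a complete character and prepends them to the next chunk.
    decoder = codecs.getincrementaldecoder(encoding)()
    with open(filename, "rb") as fp:
        while True:
            chunk = fp.read(buffersize)
            if not chunk:
                break
            for ch in decoder.decode(chunk):
                yield ch
        # Flush the decoder; with the default strict error handling this
        # raises UnicodeDecodeError if the file ends in the middle of a character.
        for ch in decoder.decode(b"", final=True):
            yield ch
```

With this variant the buffer size still counts bytes, so a chunk may yield fewer characters than `buffersize`.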

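My chunked version may be faster for larger files. Take, for example, the entire text of War and Peace, by Tolstoy. These are my timing results (Mac Book Pro running OS X 10.7.4; so.py is the name I gave to the code I pasted):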

    $ time python so.py 1 2600.txt.utf-8
    python so.py 1 2600.txt.utf-8  3.79s user 0.01s system 99% cpu 3.808 total
    $ time python so.py 4096 2600.txt.utf-8
    python so.py 4096 2600.txt.utf-8  1.31s user 0.01s system 99% cpu 1.318 total

Now: do not take the buffer size of 4096 as a universal truth. Look at the results I get for different sizes (buffer size in bytes vs. wall time in seconds):

    Buffer size (bytes)    Wall time (s)
    2                      2.726
    4                      1.948
    8                      1.693
    16                     1.534
    32                     1.525
    64                     1.398
    128                    1.432
    256                    1.377
    512                    1.347
    1024                   1.442
    2048                   1.316
    4096                   1.318

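As you can see, the gains show up early on (and my timings are likely very inaccurate). The buffer size is a trade-off between performance and memory. The default of 4096 is a reasonable choice but, as always, measure first.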
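To act on "measure first", a comparison like the one above could be scripted rather than run by hand with `time`. This is a rough Python 3 sketch, not the script that produced the numbers in this answer: the function name `time_buffer_sizes` and the default sizes are made up for illustration, and a single run per size is as noisy as the timings above.

```python
import time


def time_buffer_sizes(filename, sizes=(1, 64, 4096)):
    """Return {buffer_size: seconds} for visiting every byte of `filename`.

    Hypothetical helper: one run per size, so treat the numbers as indicative only.
    """
    results = {}
    for size in sizes:
        start = time.perf_counter()
        count = 0
        with open(filename, "rb") as fp:
            while True:
                chunk = fp.read(size)
                if not chunk:
                    break
                for _ in chunk:  # touch each byte, like a per-character loop
                    count += 1
        results[size] = time.perf_counter() - start
    return results
```

Repeating each measurement several times and keeping the minimum would give steadier figures.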

Mattias Nilsson · Answer

Python itself can help you with this, in interactive mode:

    >>> help(file.read)
    Help on method_descriptor:

    read(...)
        read([size]) -> read at most size bytes, returned as a string.

        If the size argument is negative or omitted, read until EOF is reached.
        Notice that when in non-blocking mode, less data than what was requested
        may be returned, even if no size parameter was given.

joaquin · Answer

Just:

    myfile = open(filename)
    onecaracter = myfile.read(1)

Michael Kropat · Answer

I learned a new idiom for this today while watching Raymond Hettinger's Transforming Code into Beautiful, Idiomatic Python:

    import functools

    with open(filename) as f:
        f_read_ch = functools.partial(f.read, 1)
        for ch in iter(f_read_ch, ''):
            print 'Read a character:', repr(ch)
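The idiom relies on the two-argument form of `iter`, which calls the given function repeatedly until it returns the sentinel value. One detail worth a sketch: the sentinel has to match what `read` returns at end of file, so a file opened in binary mode under Python 3 needs `b""` rather than `''`. The helper name `iter_bytes` below is hypothetical.

```python
from functools import partial


def iter_bytes(path):
    """Yield the file's contents as one-byte `bytes` objects (hypothetical helper)."""
    with open(path, "rb") as f:
        # f.read(1) returns b"" at end of file, which stops the iterator.
        for b in iter(partial(f.read, 1), b""):
            yield b
```

Using `''` as the sentinel in binary mode would never match `b""`, and the loop would not terminate.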

Johan Kotlinski · Answer

You should try f.read(1), which is definitely correct and the right thing to do.

David Sykes · Answer

Just read a single character:

f.read(1)

user1489833 · Answer

    f = open('hi.txt', 'w')
    f.write('0123456789abcdef')
    f.close()
    f = open('hi.txt', 'r')  # reopen the same file that was just written
    f.seek(12)
    print f.read(1)  # This will read just "c"

pambda · Answer

As a supplement: if you are reading a file that contains a vvvvery long line, which might exhaust your memory, consider reading it into a buffer and then yielding each character:

    def read_char(inputfile, buffersize=10240):
        with open(inputfile, 'r') as f:
            while True:
                buf = f.read(buffersize)
                if not buf:
                    break
                for char in buf:
                    yield char
            yield ''  # yield an empty string at end of file (also covers an empty file)

    if __name__ == "__main__":
        for char in read_char('./very_large_file.txt'):
            process(char)  # process() stands in for your own handling

ParagAb · Answer

    # reading out the file at once in a list and then printing one-by-one
    f = open('file.txt')
    for i in list(f.read()):
        print(i)

Pro Q · Answer

This will also work:

    with open("filename") as fileObj:
        for line in fileObj:
            for ch in line:
                print(ch)

It goes through every line in the file and every character in every line.
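The nested loop can also be flattened into a single stream of characters. The following is a small Python 3 sketch using `itertools.chain.from_iterable`. The helper name `chars_by_line` and the `encoding` parameter are assumptions for illustration. Like the loop above, it holds one full line in memory at a time, so it is not suited to files with extremely long lines.

```python
from itertools import chain


def chars_by_line(path, encoding="utf-8"):
    """Yield every character of every line in `path` (hypothetical helper)."""
    with open(path, encoding=encoding) as f:
        # Iterating the file yields lines; iterating each line yields characters.
        yield from chain.from_iterable(f)
```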