How to make Python 3 print() output UTF-8

Question

How can I make Python 3 (3.1) print("Some text") write to stdout in UTF-8, or how can I output raw bytes?

Test.py

    import sys

    TestText = "Test - āĀēĒčČ..šŠūŪžŽ"  # this is UTF-8
    TestText2 = b"Test2 - \xc4\x81\xc4\x80\xc4\x93\xc4\x92\xc4\x8d\xc4\x8c..\xc5\xa1\xc5\xa0\xc5\xab\xc5\xaa\xc5\xbe\xc5\xbd"  # just bytes

    print(sys.getdefaultencoding())
    print(sys.stdout.encoding)
    print(TestText)
    print(TestText.encode("utf8"))
    print(TestText.encode("cp1252", "replace"))
    print(TestText2)

Output (in CP1257; I have replaced the non-ASCII characters with their byte values, written as [x00]):

    utf-8
    cp1257
    Test - [xE2][xC2][xE7][xC7][xE8][xC8]..[xF0][xD0][xFB][xDB][xFE][xDE]
    b'Test - \xc4\x81\xc4\x80\xc4\x93\xc4\x92\xc4\x8d\xc4\x8c..\xc5\xa1\xc5\xa0\xc5\xab\xc5\xaa\xc5\xbe\xc5\xbd'
    b'Test - ??????..\x9a\x8a??\x9e\x8e'
    b'Test2 - \xc4\x81\xc4\x80\xc4\x93\xc4\x92\xc4\x8d\xc4\x8c..\xc5\xa1\xc5\xa0\xc5\xab\xc5\xaa\xc5\xbe\xc5\xbd'

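print is just too smart... :D There is no point in passing encoded text to print, since it only ever shows the representation of the bytes rather than the bytes themselves. It also seems impossible to output raw bytes at all, because print always encodes its output with sys.stdout.encoding.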

For example, print(chr(255)) throws an error:

    Traceback (most recent call last):
      File "Test.py", line 1, in <module>
        print(chr(255));
      File "H:\Python31\lib\encodings\cp1257.py", line 19, in encode
        return codecs.charmap_encode(input,self.errors,encoding_table)[0]
    UnicodeEncodeError: 'charmap' codec can't encode character '\xff' in position 0: character maps to <undefined>

By the way, print(TestText == TestText2.decode("utf8")) returns False, although the printed output looks the same.

How does Python 3 determine sys.stdout.encoding, and how can I change it?

I have written a printRAW() function that works fine (it actually encodes the output as UTF-8, so it is not really raw...):

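    def printRAW(*Text):
        # Wrap file descriptor 1 (stdout) in a UTF-8 text stream;
        # closefd=False keeps the descriptor open after close().
        RAWOut = open(1, 'w', encoding='utf8', closefd=False)
        print(*Text, file=RAWOut)
        RAWOut.flush()
        RAWOut.close()

    printRAW("Cool", TestText)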

Output (now printed in UTF-8):

    Cool Test - āĀēĒčČ..šŠūŪžŽ

printRAW(chr(252)) nicely prints ü (in UTF-8: [xC3][xBC]) without errors :)

Now I am looking for a better solution, if there is one...

Mark Tolonen · Accepted Answer

First, a correction:

    TestText = "Test - āĀēĒčČ..šŠūŪžŽ"  # this is NOT UTF-8... it is a Unicode string in Python 3.X.
    TestText2 = TestText.encode('utf8')  # THIS is "just bytes" in UTF-8.
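The distinction in this correction can be checked directly. The following is a minimal sketch; the names `text`, `data`, `round_trip_ok`, `question_bytes` and `same_prefix` are illustrative only. It also suggests why the comparison in the question returned False: the byte literal there begins with "Test2", not "Test".

```python
text = "Test - āĀēĒčČ..šŠūŪžŽ"   # str: a sequence of Unicode code points
data = text.encode("utf-8")        # bytes: the UTF-8 encoding of that text

# Decoding with the same codec recovers an equal string.
round_trip_ok = data.decode("utf-8") == text

# A shortened version of the byte literal from the question. It starts
# with "Test2", so it decodes to a different string even though the
# accented letters that follow are the same.
question_bytes = b"Test2 - \xc4\x81\xc4\x80"
same_prefix = question_bytes.decode("utf-8").startswith("Test - ")
```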

Now, to send UTF-8 to stdout regardless of the console's encoding, use the right tool for the job:

    import sys
    sys.stdout.buffer.write(TestText2)

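sys.stdout.buffer is the underlying binary interface to stdout: it accepts bytes and writes them without any text-encoding step.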
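As a rough sketch of how this could be wrapped for reuse, the hypothetical helper `write_utf8` below (not part of the answer) encodes a string and writes the bytes to a binary stream, defaulting to `sys.stdout.buffer`. The demonstration writes to an in-memory `io.BytesIO` so the result does not depend on any console.

```python
import io
import sys

def write_utf8(text, binary_stream=None):
    """Encode text as UTF-8 and write the bytes to a binary stream."""
    if binary_stream is None:
        # Flush the text layer first so earlier print() output stays in order.
        sys.stdout.flush()
        binary_stream = sys.stdout.buffer
    binary_stream.write(text.encode("utf-8"))
    binary_stream.flush()

# Demonstration against an in-memory binary stream instead of the console.
demo = io.BytesIO()
write_utf8("ü\n", demo)
captured = demo.getvalue()
```

Calling `write_utf8("ü\n")` with no second argument would send the same bytes to the real stdout.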

zwol · Answer

This is the best I could dig out of the manual, and it is a bit of a dirty hack:

    utf8stdout = open(1, 'w', encoding='utf-8', closefd=False)  # fd 1 is stdout
    print(whatever, file=utf8stdout)

It seems like file objects ought to have a method for changing their encoding, but AFAICT there isn't one.
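For readers on newer interpreters: Python 3.7 and later (well after the 3.1 used in the question) provide `io.TextIOWrapper.reconfigure()`, which changes the encoding of an existing text stream. The sketch below assumes Python 3.7+ and exercises it on an in-memory stream so that it behaves the same everywhere; the names `raw`, `stream` and `captured` are illustrative. Setting the `PYTHONIOENCODING` environment variable before starting the interpreter is another option.

```python
import io

# Stand-in for a console stream that starts out as cp1257.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding="cp1257", newline="\n")

# Switch the text layer to UTF-8 (requires Python 3.7 or later).
stream.reconfigure(encoding="utf-8")

# chr(255) cannot be encoded in cp1257, but UTF-8 handles it.
print(chr(255), file=stream)
stream.flush()
captured = raw.getvalue()

# On a real console the equivalent call would be:
#     sys.stdout.reconfigure(encoding="utf-8")
```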

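If you write to utf8stdout and then write to sys.stdout without first calling utf8stdout.flush(), or vice versa, bad things may happen: the two objects buffer independently, so output can appear out of order.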
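One way to keep the two streams from interleaving badly is to flush on both sides of every write. The sketch below is an illustration under that assumption, not the answerer's code: the hypothetical helper `print_utf8` flushes `sys.stdout` before writing to descriptor 1 and relies on the `with` block to flush its own wrapper. The `fd` parameter exists only so the demonstration can target a temporary file.

```python
import os
import sys
import tempfile

def print_utf8(*args, fd=1):
    """print() to a file descriptor as UTF-8, leaving the descriptor open."""
    if fd == 1:
        # Push out anything sys.stdout has buffered before sharing the fd.
        sys.stdout.flush()
    with open(fd, "w", encoding="utf-8", closefd=False) as out:
        print(*args, file=out)
    # Leaving the with block flushes and closes the wrapper only;
    # closefd=False keeps the underlying descriptor usable.

# Demonstration on a temporary file rather than the real stdout.
fd, path = tempfile.mkstemp()
try:
    print_utf8("first", fd=fd)
    print_utf8("ü", fd=fd)
finally:
    os.close(fd)

with open(path, "rb") as f:
    captured = f.read()
os.remove(path)
```

Because each call flushes before returning, the two lines reach the file in the order they were written.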