Remove a substring using Python

Question

I have already extracted some information from a forum. This is the raw string I have now:

string = 'i think mabe 124 + <font color="black"><font face="Times New Roman">but I don\'t have a big experience it just how I see it in my eyes <font color="green"><font face="Arial">fun stuff'

The parts I do not want are the substrings "<font color="black"><font face="Times New Roman">" and "<font color="green"><font face="Arial">". I want to keep every other part of the string. The result should look like this:

resultString = "i think mabe 124 + but I don't have a big experience it just how I see it in my eyes fun stuff"

How can I do this? I actually used Beautiful Soup to extract the string above from the forum. At this point I would prefer to use a regular expression to remove these parts.

juliomalegria · Accepted Answer

import re
re.sub('<.*?>', '', string)
"i think mabe 124 + but I don't have a big experience it just how I see it in my eyes fun stuff"

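The re.sub function takes a regular expression and replaces every match in the string with the second parameter. In this case, we search for all tags ('<.*?>') and replace them with nothing ('').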

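The ? is used in re for a non-greedy search: it makes .* match as few characters as possible, so each match ends at the first > that follows the <.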
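To see why the ? matters, here is a minimal sketch comparing the non-greedy pattern with its greedy counterpart. The string `sample` and the variable names are made up for illustration (a shortened version of the string in the question), not part of the original answer.

```python
import re

# Hypothetical sample input, shortened from the string in the question.
sample = 'a <font color="black"><font face="Arial">b <font color="green">c'

# Non-greedy: each match stops at the first '>' it can reach,
# so every tag is removed individually.
non_greedy = re.sub(r'<.*?>', '', sample)

# Greedy: a single match runs from the first '<' to the last '>',
# so the text between the tags is removed as well.
greedy = re.sub(r'<.*>', '', sample)

print(repr(non_greedy))  # 'a b c'
print(repr(greedy))      # 'a c'
```

With the greedy pattern the "b " between the tags is lost, which is why the answer uses '<.*?>'.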

More about the re module.

Abhijit · Answer

>>> import re
>>> st = " i think mabe 124 + <font color=\"black\"><font face=\"Times New Roman\">but I don't have a big experience it just how I see it in my eyes <font color=\"green\"><font face=\"Arial\">fun stuff"
>>> re.sub("<.*?>","",st)
" i think mabe 124 + but I don't have a big experience it just how I see it in my eyes fun stuff"