How to find all occurrences of a substring?

Question

Python has string.find() and string.rfind() to get the index of a substring in a string.

I'm wondering whether there is something like string.find_all() that returns all found indexes (not only the first from the beginning or the first from the end).

For example:

string = "test test test test"
print(string.find('test'))   # 0
print(string.rfind('test'))  # 15
# this is the goal
print(string.find_all('test'))  # [0, 5, 10, 15]

marcog · Accepted Answer

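There is no simple built-in string function that does what you're looking for, but you can use the more powerful regular expressions:

import re
[m.start() for m in re.finditer('test', 'test test test test')]
# [0, 5, 10, 15]

If you want to find overlapping matches, a lookahead will do that:

[m.start() for m in re.finditer('(?=tt)', 'ttt')]
# [0, 1]

If you want a reverse find-all without overlaps, you can combine positive and negative lookahead into an expression like this:

search = 'tt'
[m.start() for m in re.finditer('(?=%s)(?!.{1,%d}%s)' % (search, len(search)-1, search), 'ttt')]
# [1]

re.finditer returns an iterator, so you can change the [] in the above to () to get a generator instead of a list, which is more efficient if you only iterate through the results once.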
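
A minimal sketch of the generator variant, wrapped in a hypothetical helper named iter_starts (it is not part of the re module). It assumes the search term should be matched literally, so it passes the term through re.escape; drop that call if regex syntax in the term is intended.

```python
import re

def iter_starts(sub, text):
    """Lazily yield the start index of each non-overlapping match of sub in text."""
    # re.escape makes characters such as '.' or '*' match literally.
    # This is an assumption about intent; the snippets above pass the
    # pattern to re.finditer unescaped.
    return (m.start() for m in re.finditer(re.escape(sub), text))

# Nothing is searched until the generator is consumed.
for pos in iter_starts('test', 'test test test test'):
    print(pos)
```

Because the result is a generator, it can be consumed only once; wrap it in list() if the positions are needed again.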

Karl Knechtel · Answer

>>> help(str.find)
Help on method_descriptor:
find(...)
    S.find(sub [,start [,end]]) -> int

Thus, we can build it ourselves:

def find_all(a_str, sub):
    start = 0
    while True:
        start = a_str.find(sub, start)
        if start == -1:
            return
        yield start
        start += len(sub)  # use start += 1 to find overlapping matches
list(find_all('spam spam spam spam', 'spam'))  # [0, 5, 10, 15]

No temporary strings or regexes required.
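
The code comment above mentions switching to start += 1 for overlapping matches. A sketch that exposes that choice as a parameter follows; the name find_all_positions and the overlapping flag are illustrative, and the guard against an empty sub is an added assumption (without it, an empty sub would never advance the search position).

```python
def find_all_positions(a_str, sub, overlapping=False):
    """Yield every index at which sub occurs in a_str."""
    if not sub:
        # Assumption: treat an empty search string as an error rather than
        # looping forever on a zero-length step.
        raise ValueError("sub must be non-empty")
    # Step past the whole match, or by one character to allow overlaps.
    step = 1 if overlapping else len(sub)
    start = a_str.find(sub)
    while start != -1:
        yield start
        start = a_str.find(sub, start + step)

print(list(find_all_positions('ttt', 'tt')))                    # [0]
print(list(find_all_positions('ttt', 'tt', overlapping=True)))  # [0, 1]
```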

thkala · Answer

Here's a (very inefficient) way to get all (i.e. even overlapping) matches:

>>> string = "test test test test"
>>> [i for i in range(len(string)) if string.startswith('test', i)]
[0, 5, 10, 15]

AkiRoss · Answer

Again, an old thread, but here's my solution using a generator and plain str.find.

def findall(p, s):
    '''Yields all the positions of the pattern p in the string s.'''
    i = s.find(p)
    while i != -1:
        yield i
        i = s.find(p, i+1)

Example

x = 'banananassantana'
[(i, x[i:i+2]) for i in findall('na', x)]

returns

[(2, 'na'), (4, 'na'), (6, 'na'), (14, 'na')]

Chinmay Kanchi · Answer

You can use re.finditer() for non-overlapping matches.

>>> import re
>>> aString = 'this is a string where the substring "is" is repeated several times'
>>> print([(a.start(), a.end()) for a in list(re.finditer('is', aString))])
[(2, 4), (5, 7), (38, 40), (42, 44)]

But it won't find overlapping matches:

In [1]: aString = "ababa"
In [2]: print([(a.start(), a.end()) for a in list(re.finditer('aba', aString))])
Output: [(0, 3)]
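
For the overlapping case, one option is a zero-width lookahead, which lets the scan resume one character after each match start. A small sketch: because the lookahead consumes nothing, m.end() would equal m.start(), so the end index is computed from the pattern length (valid here only because the pattern is a literal string).

```python
import re

a_string = "ababa"
pattern = "aba"

# (?=...) matches at a position without consuming characters, so matches
# that share characters are all reported.
spans = [(m.start(), m.start() + len(pattern))
         for m in re.finditer("(?=%s)" % re.escape(pattern), a_string)]
print(spans)  # [(0, 3), (2, 5)]
```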

Cody Piersall · Answer

Come, let us recurse together.

def locations_of_substring(string, substring):
    """Return a list of locations of a substring."""
    substring_length = len(substring)
    def recurse(locations_found, start):
        location = string.find(substring, start)
        if location != -1:
            return recurse(locations_found + [location], location + substring_length)
        else:
            return locations_found
    return recurse([], 0)
print(locations_of_substring('this is a test for finding this and this', 'this'))
# prints [0, 27, 36]

No need for regular expressions this way.

jstaab · Answer

If you're just looking for a single character, this would work:

from functools import reduce  # reduce is no longer a builtin in Python 3
string = "dooobiedoobiedoobie"
match = 'o'
reduce(lambda count, char: count + 1 if char == match else count, string, 0)
# produces 7

Also,

string = "test test test test"
match = "test"
len(string.split(match)) - 1
# produces 4

My hunch is that neither of these (especially #2) is terribly performant.
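
Both snippets above produce a count rather than positions. If the indexes of a single character are what is wanted, a sketch using enumerate might look like this (the variable name positions is illustrative):

```python
string = "dooobiedoobiedoobie"
match = 'o'

# enumerate pairs each character with its index; keep the indexes that match.
positions = [i for i, char in enumerate(string) if char == match]
print(positions)       # [1, 2, 3, 8, 9, 14, 15]
print(len(positions))  # 7, the same count the reduce version produces
```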

Thurines · Answer

This is an old thread, but I got interested and wanted to share my solution.

def find_all(a_string, sub):
    result = []
    k = 0
    while k < len(a_string):
        k = a_string.find(sub, k)
        if k == -1:
            return result
        else:
            result.append(k)
            k += 1  # change to k += len(sub) to not search overlapping results
    return result

It should return a list of the positions where the substring was found. Please comment if you see an error or room for improvement.

Andrew H · Answer

This thread is a little old, but this worked for me:

numberString = "onetwothreefourfivesixseveneightninefiveten"
testString = "five"
marker = 0
while marker < len(numberString):
    try:
        # index() raises ValueError once no further match exists
        marker = numberString.index(testString, marker)
        print(marker)
        marker += 1
    except ValueError:
        print("String not found")
        marker = len(numberString)

Bruno Vermeulen · Answer

This does the trick for me using re.finditer

import re
text = 'This is sample text to test if this Pythonic '\
       'program can serve as an indexing platform for '\
       'finding words in a paragraph. It can give '\
       'values as to where the Word is located with the '\
       'different examples as stated'
# find all occurrences of the word 'as' in the above text
find_the_Word = re.finditer('as', text)
for match in find_the_Word:
    print('start {}, end {}, search string \'{}\''.
          format(match.start(), match.end(), match.group()))

Harsha Biyani · Answer

You can try:

>>> string = "test test test test"
>>> for index, value in enumerate(string):
...     if string[index:index+(len("test"))] == "test":
...         print(index)
...
0
5
10
15

naveen raja · Answer

The solutions provided by others are all based on the available method find() or other available methods.

What is the core basic algorithm to find all the occurrences of a substring in a string?

def find_all(string, substring):
    """
    Function: Returning all the index of substring in a string
    Arguments: String and the search string
    Return: Returning a list
    """
    length = len(substring)
    c = 0
    indexes = []
    while c < len(string):
        if string[c:c+length] == substring:
            indexes.append(c)
        c = c + 1
    return indexes

You can also inherit the str class into a new class and use the function below.

class newstr(str):
    def find_all(string, substring):
        """
        Function: Returning all the index of substring in a string
        Arguments: String and the search string
        Return: Returning a list
        """
        length = len(substring)
        c = 0
        indexes = []
        while c < len(string):
            if string[c:c+length] == substring:
                indexes.append(c)
            c = c + 1
        return indexes

Calling the method

newstr.find_all('Do you find this answer helpful? then upvote this!', 'this')
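
Since find_all is defined on the subclass, it can also be called on an instance, where the string itself is bound to the first parameter. A sketch of that usage, repeating the class in condensed form so the block runs on its own; naming the first parameter self and the sample sentence are choices made for this illustration.

```python
class newstr(str):
    def find_all(self, substring):
        """Return a list with the index of every (possibly overlapping) occurrence."""
        length = len(substring)
        # Compare the slice starting at each position against the substring.
        return [c for c in range(len(self)) if self[c:c + length] == substring]

sentence = newstr('Do you find this answer helpful? then upvote this!')
print(sentence.find_all('this'))  # [12, 45]
```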

RaySaraiva · Answer

You can simply use:

string.count('test')!

https://www.programiz.com/python-programming/methods/string/count

Cheers!
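
Note that str.count reports how many non-overlapping occurrences there are, not where they are. A short sketch of what it returns:

```python
string = "test test test test"
total = string.count('test')
print(total)  # 4: a count, not a list of indexes

# Occurrences that share characters are counted only once.
overlapping_total = 'ttt'.count('tt')
print(overlapping_total)  # 1
```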

Uri Goren · Answer

If you're looking for a large number of keywords in a document, use flashtext.

from flashtext import KeywordProcessor
words = ['test', 'exam', 'quiz']
txt = 'this is a test'
kwp = KeywordProcessor()
kwp.add_keywords_from_list(words)
result = kwp.extract_keywords(txt, span_info=True)

Flashtext runs faster than regex on large lists of search words.

BONTHA SREEVIDHYA · Answer

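By slicing, we find all the possible combinations and append them to a list, then use the count function to find the number of times the substring occurs.

s = input()
n = len(s)
l = []
f = input()
print(s[0])
for i in range(0, n):
    for j in range(1, n+1):
        l.append(s[i:j])
if f in l:
    print(l.count(f))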