How do I count the number of words in a sentence, ignoring numbers, punctuation, and whitespace?

Question

How would I go about counting the words in a sentence? I'm using Python.

For example, I might have the string:

string = "I am having a very Nice 23!@$ day. "

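That should be 7 words. I'm having trouble with the random amount of spaces before and after each word, as well as with cases where numbers or symbols are involved.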

arshajii · Answer

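str.split() without any arguments splits on runs of whitespace characters.

>>> s = 'I am having a very Nice day.'
>>> len(s.split())
7

From the linked documentation:

If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.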
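As a quick illustration of the behaviour quoted above, the sketch below compares split() with and without an explicit separator. The `padded` sample string is made up for this example. Note that on a string like the one in the question, split() on its own still counts the token 23!@$ as a word.

```python
# Hypothetical sample with leading, trailing, and repeated spaces.
padded = "  I   am  here "

# No argument: runs of whitespace collapse and no empty strings appear.
no_sep = padded.split()

# Explicit single-space separator: every space splits, so empty strings appear.
with_sep = padded.split(" ")

print(no_sep)                    # ['I', 'am', 'here']
print(len(no_sep))               # 3
print(len(with_sep) > len(no_sep))  # True

# A string shaped like the one in the question: "23!@$" is still a token.
question = "I  am having   a very  Nice 23!@$  day. "
print(len(question.split()))     # 8
```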

karthikr · Answer

You can use re.findall():

import re
line = " I am having a very Nice day."
count = len(re.findall(r'\w+', line))
print(count)

boon kwee · Answer

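import string
s = "I am having a very Nice 23!@$ day. "
sum([i.strip(string.punctuation).isalpha() for i in s.split()])

The statement above goes through each chunk of text, strips the punctuation from its ends, and then checks whether the chunk is really an alphabetic string.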
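The same idea can be spelled out step by step. This is only a sketch of the one-liner above; the function name `count_alpha_words` is made up for illustration.

```python
import string

def count_alpha_words(text):
    """Count whitespace-separated chunks that are purely alphabetic
    once leading and trailing punctuation is stripped."""
    count = 0
    for chunk in text.split():
        # strip() only trims the ends: "day." -> "day", "23!@$" -> "23"
        core = chunk.strip(string.punctuation)
        if core.isalpha():
            count += 1
    return count

s = "I am having a very Nice 23!@$ day. "
print(count_alpha_words(s))  # 7
```

Because strip() only trims the ends of each chunk, a word with punctuation inside it, such as don't, is not counted: the inner apostrophe makes isalpha() return False.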

Aliyar · Answer

This is a simple word counter using regex. The script runs in a loop that you can terminate by entering "Done" when you are finished.

# Word counter using regex
import re
while True:
    string = input("Enter the string: ")
    if string == "Done":  # command to terminate the loop
        break
    count = len(re.findall("[a-zA-Z_]+", string))
    print(count)
print("Terminated")

JadedTuna · Answer

Here is my version. I noticed that you want the output to be 7, which means you don't want to count special characters or numbers. So here is the regex pattern:

re.findall("[a-zA-Z_]+", string)

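[a-zA-Z_] matches any single character between a-z (lowercase) or A-Z (uppercase), as well as the underscore. The trailing + makes the pattern match one or more of those characters in a row.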
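To see what the character class changes in practice, this small sketch runs two patterns against the string from the question: the letters-only class described above, and \w+, which also accepts digits.

```python
import re

text = "I am having a very Nice 23!@$ day. "

# Letters and underscore only: "23" is skipped.
letters = re.findall(r"[a-zA-Z_]+", text)

# \w also matches digits, so "23" is counted as a word.
word_chars = re.findall(r"\w+", text)

print(len(letters))     # 7
print(len(word_chars))  # 8
```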

About spaces: if you want to remove all the extra spaces, just do:

string = string.rstrip().lstrip()  # Remove all extra spaces at the start and at the end of the string
while "  " in string:  # While there are 2 spaces between words in our string...
    string = string.replace("  ", " ")  # ... replace them by one space!
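The whitespace clean-up described above can also be written with split() and join(). The sketch below is an alternative, not the answer's own method, and the `messy` sample string is made up. It checks that both approaches give the same result on that sample.

```python
# Hypothetical sample with extra spaces everywhere.
messy = "   I  am   having a    nice day  "

# Loop approach: trim the ends, then collapse double spaces until none remain.
cleaned = messy.strip()
while "  " in cleaned:
    cleaned = cleaned.replace("  ", " ")

# split/join approach: split on runs of whitespace, rejoin with single spaces.
joined = " ".join(messy.split())

print(cleaned)            # I am having a nice day
print(cleaned == joined)  # True
```

One difference worth knowing: split() with no arguments also treats tabs and newlines as separators, whereas the loop only collapses space characters.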

Anto · Answer

Why not use a simple loop to count the occurrences of spaces?

txt = "Just an example here move along"
count = 1
for i in txt:
    if i == " ":
        count += 1
print(count)
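Counting spaces matches the word count only when words are separated by exactly one space and there is no leading or trailing space. The sketch below wraps the loop in a hypothetical helper, `count_by_spaces`, and compares it with split() on a clean string and on a made-up padded one.

```python
def count_by_spaces(txt):
    """Return the number of space characters plus one."""
    count = 1
    for ch in txt:
        if ch == " ":
            count += 1
    return count

single = "Just an example here move along"
padded = " Just  an example "  # hypothetical input with extra spaces

print(count_by_spaces(single), len(single.split()))  # 6 6
print(count_by_spaces(padded), len(padded.split()))  # 6 3
```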

Darrell White · Answer

def wordCount(mystring):
    tempcount = 0
    count = 1
    try:
        for character in mystring:
            if character == " ":
                tempcount += 1
                # Only the first space in a run starts a new word
                if tempcount == 1:
                    count += 1
                else:
                    tempcount += 1
            else:
                tempcount = 0
        return count
    except Exception:
        error = "Not a string"
        return error
mystring = "I am having a very Nice 23!@$ day."
print(wordCount(mystring))

The output is 8.

Adam · Answer

import string
sentence = "I am having a very Nice 23!@$ day. "
# Remove all punctuation
sentence = sentence.translate(str.maketrans('', '', string.punctuation))
# Remove all digits
sentence = ''.join([char for char in sentence if not char.isdigit()])
count = 0
# Count every transition from a non-space character to a space
for index in range(len(sentence) - 1):
    if sentence[index + 1].isspace() and not sentence[index].isspace():
        count += 1
print(count)