How can I check if a string represents an int, without using try/except?

Question

Is there any way to tell whether a string represents an integer (e.g., '3' or '-17', but not '3.14' or 'asfasfas') without using a try/except mechanism?

    is_int('3.14') == False
    is_int('-7') == True

Triptych · Accepted Answer

If you're really just annoyed at using try/excepts all over the place, just write a helper function:

    def RepresentsInt(s):
        try:
            int(s)
            return True
        except ValueError:
            return False

    >>> print(RepresentsInt("+123"))
    True
    >>> print(RepresentsInt("10.0"))
    False

It's going to be way more code to exactly cover all the strings that Python considers integers. I say just be Pythonic on this one.
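To give a sense of why an exact hand-written check would take more code, here is a small sketch of strings that int() does and does not accept in Python 3. The sample values and the helper name represents_int are illustrative assumptions, not part of the answer, and the underscore form assumes Python 3.6 or later.

```python
def represents_int(s):
    """Same idea as RepresentsInt above; the name is illustrative."""
    try:
        int(s)
        return True
    except ValueError:
        return False

# int() tolerates surrounding whitespace, a sign, digit-group underscores,
# and non-ASCII decimal digits (the last sample is ARABIC-INDIC DIGIT THREE).
accepted = {s: represents_int(s) for s in [" 42 ", "+7", "-0", "1_000", "\u0663"]}

# It rejects empty text, misplaced underscores, decimals, and base prefixes
# (when no explicit base is given).
rejected = {s: represents_int(s) for s in ["", "1__0", "_1", "1.0", "0x1f"]}

print(accepted)
print(rejected)
```

A regex or isdigit-based check would have to reproduce each of these rules to agree with int() on every input.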

SilentGhost · Answer

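With positive integers you could use .isdigit:

    >>> '16'.isdigit()
    True

It doesn't work with negative integers, though. You could try the following:

    >>> s = '-17'
    >>> s.startswith('-') and s[1:].isdigit()
    True

It won't work with the '16.0' format, which is similar to int casting in this sense.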

Edit:

    def check_int(s):
        if s[0] in ('-', '+'):
            return s[1:].isdigit()
        return s.isdigit()
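Two edge cases are worth keeping in mind with check_int: s[0] raises IndexError on an empty string, and isdigit() is True for some characters that int() rejects, such as the superscript '²'. The following is a minimal sketch of a stricter variant, not the answer's own code. The name check_int_safe is hypothetical, and str.isascii assumes Python 3.7 or later.

```python
def check_int_safe(s):
    """Hypothetical variant of check_int: optional sign, then ASCII digits only."""
    # s[:1] is '' for an empty string, so no IndexError is raised.
    body = s[1:] if s[:1] in ("-", "+") else s
    # isdigit() alone is True for characters such as '²' that int() rejects,
    # so the body is also required to be plain ASCII.
    return body.isascii() and body.isdigit()

for sample in ["-17", "+3", "16", "", "+", "16.0", "\u00b2"]:
    print(repr(sample), check_int_safe(sample))
```

Restricting to ASCII is a design choice: it also rejects non-ASCII decimal digits that int() would accept.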

Shavais · Answer

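You know, I've found (and I've tested this over and over) that try/except does not perform all that well, for whatever reason. I frequently try several ways of doing things, and I don't think I've ever found a method that uses try/except to perform the best of those tested. In fact, it seems to me those methods have usually come out close to the worst, if not the worst. Not in every case, but in many cases. I know a lot of people say it's the "Pythonic" way, but that's one area where I part ways with them. To me, it's neither very performant nor very elegant, so I tend to use it only for error trapping and reporting.

I was going to gripe that PHP, Perl, Ruby, C, and even the shell have simple functions for testing a string for integer-hood, but that assumption did not hold up when I checked. Apparently this lack is a common sickness.

Here's a quick and dirty edit of Bruno's post: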

    import sys, time, re

    # Compiled once at module level and shared by isInt_re2
    g_intRegex = re.compile(r"^([+-]?[1-9]\d*|0)$")

    testvals = [
        # integers
        0, 1, -1, 1.0, -1.0,
        '0', '0.', '0.0', '1', '-1', '+1', '1.0', '-1.0', '+1.0', '06',
        # non-integers
        'abc 123',
        1.1, -1.1, '1.1', '-1.1', '+1.1',
        '1.1.1', '1.1.0', '1.0.1', '1.0.0',
        '1.0.', '1..0', '1..',
        '0.0.', '0..0', '0..',
        'one', object(), (1,2,3), [1,2,3], {'one':'two'},
        # with spaces
        ' 0 ', ' 0.', ' .0', '.01 '
    ]

    def isInt_try(v):
        # int() raises TypeError for non-numeric objects and ValueError for bad strings
        try:
            i = int(v)
        except (TypeError, ValueError):
            return False
        return True

    def isInt_str(v):
        v = str(v).strip()
        return v=='0' or (v if v.find('..') > -1 else v.lstrip('-+').rstrip('0').rstrip('.')).isdigit()

    def isInt_re(v):
        # Compiles lazily and caches the pattern as a function attribute
        import re
        if not hasattr(isInt_re, 'intRegex'):
            isInt_re.intRegex = re.compile(r"^([+-]?[1-9]\d*|0)$")
        return isInt_re.intRegex.match(str(v).strip()) is not None

    def isInt_re2(v):
        return g_intRegex.match(str(v).strip()) is not None

    def check_int(s):
        s = str(s)
        if s[0] in ('-', '+'):
            return s[1:].isdigit()
        return s.isdigit()

    def timeFunc(func, times):
        # Wall-clock time for running func over every test value, `times` times
        t1 = time.time()
        for n in range(times):
            for v in testvals:
                r = func(v)
        t2 = time.time()
        return t2 - t1

    def testFuncs(funcs):
        # Print a header row of function names, then one row per test value
        for func in funcs:
            sys.stdout.write("\t%s\t|" % func.__name__)
        print()
        for v in testvals:
            if type(v) == type(''):
                sys.stdout.write("'%s'" % v)
            else:
                sys.stdout.write("%s" % str(v))
            for func in funcs:
                sys.stdout.write("\t\t%s\t|" % func(v))
            sys.stdout.write("\n")

    if __name__ == '__main__':
        print()
        print("tests..")
        testFuncs((isInt_try, isInt_str, isInt_re, isInt_re2, check_int))
        print()

        print("timings..")
        print("isInt_try:   %6.4f" % timeFunc(isInt_try, 10000))
        print("isInt_str:   %6.4f" % timeFunc(isInt_str, 10000))
        print("isInt_re:    %6.4f" % timeFunc(isInt_re, 10000))
        print("isInt_re2:   %6.4f" % timeFunc(isInt_re2, 10000))
        print("check_int:   %6.4f" % timeFunc(check_int, 10000))

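Here are the performance comparison results:

    timings..
    isInt_try:   0.6426
    isInt_str:   0.7382
    isInt_re:    1.1156
    isInt_re2:   0.5344
    check_int:   0.3452

A C method could scan the string once through and be done. A C method that scans the string once through would be the right thing to do, I think.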

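EDIT:

I've updated the code above to work in Python 3.5, to include the check_int function from the currently most voted-up answer, and to use the current most popular regex that I can find for testing for integer-hood. This regex rejects strings like 'abc 123'. I've added 'abc 123' as a test value.

It is very interesting to me to note, at this point, that none of the functions tested, including the try method, the popular check_int function, and the most popular regex for testing for integer-hood, return the correct answers for all of the test values (well, depending on what you think the correct answers are; see the test results below).

The built-in int() function silently truncates the fractional part of a floating point number and returns the integer part before the decimal point, unless the floating point number is first converted to a string.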
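The difference between passing a float and passing its string form can be seen in a few lines. This is a minimal sketch; the helper name parses_as_int is an assumption made for illustration.

```python
# Passing a float: int() truncates toward zero without complaint.
truncated = [int(1.9), int(-1.9), int(1.0)]

def parses_as_int(text):
    """Illustrative helper: True if int() accepts the given text."""
    try:
        int(text)
        return True
    except ValueError:
        return False

# Passing the same values as text: anything with a decimal point is rejected.
as_text = [parses_as_int(str(1.9)), parses_as_int(str(1.0)), parses_as_int("1")]

print(truncated)
print(as_text)
```

This is why isInt_try reports True for the float 1.1 but False for the string '1.1' in the results table.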

The check_int() function returns False for values like 0.0 and 1.0 (which technically are integers) and returns True for values like '06'.

Here are the current (Python 3.5) test results:

    value            | isInt_try | isInt_str | isInt_re | isInt_re2 | check_int
    0                | True      | True      | True     | True      | True
    1                | True      | True      | True     | True      | True
    -1               | True      | True      | True     | True      | True
    1.0              | True      | True      | False    | False     | False
    -1.0             | True      | True      | False    | False     | False
    '0'              | True      | True      | True     | True      | True
    '0.'             | False     | True      | False    | False     | False
    '0.0'            | False     | True      | False    | False     | False
    '1'              | True      | True      | True     | True      | True
    '-1'             | True      | True      | True     | True      | True
    '+1'             | True      | True      | True     | True      | True
    '1.0'            | False     | True      | False    | False     | False
    '-1.0'           | False     | True      | False    | False     | False
    '+1.0'           | False     | True      | False    | False     | False
    '06'             | True      | True      | False    | False     | True
    'abc 123'        | False     | False     | False    | False     | False
    1.1              | True      | False     | False    | False     | False
    -1.1             | True      | False     | False    | False     | False
    '1.1'            | False     | False     | False    | False     | False
    '-1.1'           | False     | False     | False    | False     | False
    '+1.1'           | False     | False     | False    | False     | False
    '1.1.1'          | False     | False     | False    | False     | False
    '1.1.0'          | False     | False     | False    | False     | False
    '1.0.1'          | False     | False     | False    | False     | False
    '1.0.0'          | False     | False     | False    | False     | False
    '1.0.'           | False     | False     | False    | False     | False
    '1..0'           | False     | False     | False    | False     | False
    '1..'            | False     | False     | False    | False     | False
    '0.0.'           | False     | False     | False    | False     | False
    '0..0'           | False     | False     | False    | False     | False
    '0..'            | False     | False     | False    | False     | False
    'one'            | False     | False     | False    | False     | False
    <obj..>          | False     | False     | False    | False     | False
    (1, 2, 3)        | False     | False     | False    | False     | False
    [1, 2, 3]        | False     | False     | False    | False     | False
    {'one': 'two'}   | False     | False     | False    | False     | False
    ' 0 '            | True      | True      | True     | True      | False
    ' 0.'            | False     | True      | False    | False     | False
    ' .0'            | False     | False     | False    | False     | False
    '.01 '           | False     | False     | False    | False     | False

Just now I tried adding this function:

    def isInt_float(s):
        try:
            return float(str(s)).is_integer()
        except:
            return False

It performs almost as well as check_int (0.3486), and it returns True for values like 1.0, 0.0, +1.0, 0., .0, and so on. But it also returns True for '06', so pick your poison, I guess.

Greg Hewgill · Answer

Use a regular expression:

    import re

    def RepresentsInt(s):
        return re.match(r"[-+]?\d+$", s) is not None

If you must accept decimal fractions as well:

    def RepresentsInt(s):
        return re.match(r"[-+]?\d+(\.0*)?$", s) is not None

For improved performance if you're doing this often, compile the regular expression only once using re.compile().
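The compile-once advice can be sketched as follows. This is one possible arrangement rather than the answer's own code: the names INT_PATTERN and represents_int are illustrative, and re.fullmatch assumes Python 3.4 or later. It uses fullmatch instead of a trailing "$" because "$" also matches just before a final newline, so the re.match form accepts "12\n". Note that re.match already anchors at the start of the string.

```python
import re

# Compiled once at import time and reused on every call.
INT_PATTERN = re.compile(r"[-+]?\d+")

def represents_int(s):
    # fullmatch requires the whole string to match, so no "^" or "$" is needed.
    return INT_PATTERN.fullmatch(s) is not None

for sample in ["3", "-17", "+4", "3.14", "asfasfas", "12\n"]:
    print(repr(sample), represents_int(sample))
```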

Bruno Bronosky · Answer

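A proper RegEx solution would combine the ideas of Greg Hewgill and Nowell, but not use a global variable. You can accomplish this by attaching an attribute to the method. Also, I know that it is frowned upon to put imports in a method, but what I'm going for is a "lazy module" effect like http://peak.telecommunity.com/DevCenter/Importing#lazy-imports

Edit: My favorite technique so far is to use exclusively methods of the String object.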

    #!/usr/bin/env python

    # Uses exclusively methods of the String object
    def isInteger(i):
        i = str(i)
        return i=='0' or (i if i.find('..') > -1 else i.lstrip('-+').rstrip('0').rstrip('.')).isdigit()

    # Uses re module for regex
    def isIntegre(i):
        import re
        if not hasattr(isIntegre, '_re'):
            print("I compile only once. Remove this line when you are confident in that.")
            isIntegre._re = re.compile(r"[-+]?\d+(\.0*)?$")
        return isIntegre._re.match(str(i)) is not None

    # When executed directly run Unit Tests
    if __name__ == '__main__':
        for obj in [
                    # integers
                    0, 1, -1, 1.0, -1.0,
                    '0', '0.', '0.0', '1', '-1', '+1', '1.0', '-1.0', '+1.0',
                    # non-integers
                    1.1, -1.1, '1.1', '-1.1', '+1.1',
                    '1.1.1', '1.1.0', '1.0.1', '1.0.0',
                    '1.0.', '1..0', '1..',
                    '0.0.', '0..0', '0..',
                    'one', object(), (1,2,3), [1,2,3], {'one':'two'}
                ]:
            # Notice the integre uses 're' (intended to be humorous)
            integer = ('an integer' if isInteger(obj) else 'NOT an integer')
            integre = ('an integre' if isIntegre(obj) else 'NOT an integre')
            # Make strings look like strings in the output
            if isinstance(obj, str):
                obj = ("'%s'" % (obj,))
            print("%30s is %14s is %14s" % (obj, integer, integre))

And for the less adventurous members of the class, here is the output:

    I compile only once. Remove this line when you are confident in that.
    0 is an integer is an integre
    1 is an integer is an integre
    -1 is an integer is an integre
    1.0 is an integer is an integre
    -1.0 is an integer is an integre
    '0' is an integer is an integre
    '0.' is an integer is an integre
    '0.0' is an integer is an integre
    '1' is an integer is an integre
    '-1' is an integer is an integre
    '+1' is an integer is an integre
    '1.0' is an integer is an integre
    '-1.0' is an integer is an integre
    '+1.0' is an integer is an integre
    1.1 is NOT an integer is NOT an integre
    -1.1 is NOT an integer is NOT an integre
    '1.1' is NOT an integer is NOT an integre
    '-1.1' is NOT an integer is NOT an integre
    '+1.1' is NOT an integer is NOT an integre
    '1.1.1' is NOT an integer is NOT an integre
    '1.1.0' is NOT an integer is NOT an integre
    '1.0.1' is NOT an integer is NOT an integre
    '1.0.0' is NOT an integer is NOT an integre
    '1.0.' is NOT an integer is NOT an integre
    '1..0' is NOT an integer is NOT an integre
    '1..' is NOT an integer is NOT an integre
    '0.0.' is NOT an integer is NOT an integre
    '0..0' is NOT an integer is NOT an integre
    '0..' is NOT an integer is NOT an integre
    'one' is NOT an integer is NOT an integre
    <object object at 0x103b7d0a0> is NOT an integer is NOT an integre
    (1, 2, 3) is NOT an integer is NOT an integre
    [1, 2, 3] is NOT an integer is NOT an integre
    {'one': 'two'} is NOT an integer is NOT an integre

Catbuilts · Answer

str.isdigit() should do the trick.

Examples:

    str.isdigit("23")    ## True
    str.isdigit("abc")   ## False
    str.isdigit("23.4")  ## False

Nowell · Answer

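Greg Hewgill's approach was missing a few components: the leading "^" to match only at the start of the string, and compiling the re beforehand. But this approach will allow you to avoid a try: except:

    import re

    INT_RE = re.compile(r"^[-]?\d+$")

    def RepresentsInt(s):
        return INT_RE.match(str(s)) is not None

I would be interested to know why you are trying to avoid try: except?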

alkos333 · Answer

    >>> "+7".lstrip("-+").isdigit()
    True
    >>> "-7".lstrip("-+").isdigit()
    True
    >>> "7".lstrip("-+").isdigit()
    True
    >>> "13.4".lstrip("-+").isdigit()
    False

So your function would be:

    def is_int(val):
        digits = val.lstrip("-+")
        # Allow at most one leading sign; indexing val[1] would raise IndexError on "7".
        return len(val) - len(digits) <= 1 and digits.isdigit()

Vladyslav Savchenko · Answer

I think

    s.startswith('-') and s[1:].isdigit()

would be better rewritten as:

    s.replace('-', '').isdigit()

because s[1:] also creates a new string.

But a much better solution is

    s.lstrip('+-').isdigit()
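The three expressions are not interchangeable, which a side-by-side run makes visible. This is a minimal sketch; the wrapper names v_startswith, v_replace, and v_lstrip and the sample inputs are assumptions chosen for illustration.

```python
def v_startswith(s):
    return s.startswith('-') and s[1:].isdigit()

def v_replace(s):
    # Removes every '-', including ones in the middle of the string.
    return s.replace('-', '').isdigit()

def v_lstrip(s):
    # Strips any run of leading '+' and '-' characters.
    return s.lstrip('+-').isdigit()

samples = ["-17", "17", "+17", "1-7", "--17"]
results = {s: (v_startswith(s), v_replace(s), v_lstrip(s)) for s in samples}

for s, row in results.items():
    print(repr(s), row)
```

The startswith form accepts only negative numbers, the replace form accepts a '-' anywhere in the string, and the lstrip form accepts repeated leading signs, so the choice depends on which inputs should count as integers.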

brw59 · Answer

I really liked Shavais' post, but I added one more test case (and the built-in isdigit() function):

    def isInt_loop(v):
        v = str(v).strip()
        # swapping '0123456789' for '9876543210' makes nominal difference (might have because '1' is toward the beginning of the string)
        numbers = '0123456789'
        for i in v:
            if i not in numbers:
                return False
        return True

    def isInt_Digit(v):
        v = str(v).strip()
        return v.isdigit()

and it consistently beats the times of the rest by a significant margin:

    timings..
    isInt_try:   0.4628
    isInt_str:   0.3556
    isInt_re:    0.4889
    isInt_re2:   0.2726
    isInt_loop:   0.1842
    isInt_Digit:   0.1577

using normal Python 2.7:

    $ python --version
    Python 2.7.10

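The two test cases I added (isInt_loop and isInt_Digit) pass exactly the same test cases (they both accept only unsigned integers), but I thought people could be more clever in modifying the string implementation (isInt_loop) than the built-in isdigit() function, so I included it even though there is a slight difference in execution time. (Both methods beat everything else by a lot, but they don't handle the extra characters: "./+/-".)

Also, I found it interesting to note that the regex (the isInt_re2 method) beat the string comparison in the same test that Shavais performed in 2012 (it is currently 2018). Maybe the regex libraries have been improved?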
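Timings like these shift between interpreter versions and machines, so it can help to re-run them locally. Below is a small sketch using the standard timeit module instead of the time.time() loop from Shavais' script. The function names, sample values, and loop counts are assumptions made for illustration, and no particular result is implied.

```python
import timeit

def is_int_try(v):
    try:
        int(v)
        return True
    except (TypeError, ValueError):
        return False

def is_int_digit(v):
    return str(v).strip().isdigit()

values = ["0", "-1", "1.0", "abc 123", " 0 "]

for func in (is_int_try, is_int_digit):
    # repeat() returns one total per run; the minimum is the least noisy figure.
    best = min(timeit.repeat(lambda: [func(v) for v in values],
                             number=2000, repeat=3))
    print("%-14s %.4f" % (func.__name__, best))
```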

Xenlyte · Answer

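In my opinion, this is probably the most straightforward and Pythonic way to approach it. I didn't see this solution, and it's basically the same as the regex one, but without the regex.

    def is_int(test):
        import string
        return not (set(test) - set(string.digits))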
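One detail of the set-difference form is that an empty string yields an empty set, so is_int('') comes out True. A minimal sketch of the same idea with an explicit non-empty check follows. The names DIGITS and is_unsigned_int are hypothetical, and like the original it accepts unsigned ASCII digits only.

```python
import string

DIGITS = set(string.digits)

def is_unsigned_int(text):
    # Non-empty, and every character is one of '0'..'9'.
    return bool(text) and set(text) <= DIGITS

for sample in ["123", "", "-1", "12a"]:
    print(repr(sample), is_unsigned_int(sample))
```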

Reut Sharabani · Answer

Here is a function that parses without raising errors. It handles the obvious cases and returns None on failure (it handles up to 2000 '-/+' signs by default on CPython!):

    #!/usr/bin/env python

    def get_int(number):
        splits = number.split('.')
        if len(splits) > 2:
            # too many splits
            return None
        if len(splits) == 2 and splits[1]:
            # handle decimal part recursively :-)
            if get_int(splits[1]) != 0:
                return None

        int_part = splits[0].lstrip("+")
        if int_part.startswith('-'):
            # handle minus sign recursively :-)
            inner = get_int(int_part[1:])
            return None if inner is None else inner * -1
        # return None (rather than False) on failure so "is not None" checks work
        return int(int_part) if int_part.isdigit() else None

Some tests:

    tests = ["0", "0.0", "0.1", "1", "1.1", "1.0", "-1", "-1.1", "-1.0", "-0", "--0", "---3", '.3', '--3.', "+13", "+-1.00", "--+123", "-0.000"]

    for t in tests:
        print("get_int(%s) = %s" % (t, get_int(str(t))))

Results:

    get_int(0) = 0
    get_int(0.0) = 0
    get_int(0.1) = None
    get_int(1) = 1
    get_int(1.1) = None
    get_int(1.0) = 1
    get_int(-1) = -1
    get_int(-1.1) = None
    get_int(-1.0) = -1
    get_int(-0) = 0
    get_int(--0) = 0
    get_int(---3) = -3
    get_int(.3) = None
    get_int(--3.) = 3
    get_int(+13) = 13
    get_int(+-1.00) = -1
    get_int(--+123) = 123
    get_int(-0.000) = 0

For your needs you can use:

    def int_predicate(number):
        return get_int(number) is not None

agomcas · Answer

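I have one possibility that doesn't use int at all, and should not raise an exception unless the string does not represent a number:

    float(number)==float(number)//1

It should work for any kind of string that float accepts: positive, negative, engineering notation...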
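Wrapped in a function, the expression can be tried on a few inputs. This is a sketch under the assumption that a ValueError for non-numeric text is acceptable, as the answer implies. The name is_whole_number and the sample strings are illustrative.

```python
def is_whole_number(text):
    """Illustrative wrapper around the expression above."""
    value = float(text)          # raises ValueError for non-numeric text
    return value == value // 1   # True when there is no fractional part

for sample in ["3", "-17", "1e3", "3.14", "-1.5", "1e-3", "nan"]:
    print(repr(sample), is_whole_number(sample))
```

Because the comparison happens after conversion to float, very large or very precise inputs are subject to floating point rounding.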

Carlos Vega · Answer

I guess the question is related to speed, since try/except has a time penalty:

test data

First, I created a list of 200 strings: 100 failing strings and 100 numeric strings.

    import numpy as np
    from random import shuffle

    numbers = [u'+1'] * 100
    nonumbers = [u'1abc'] * 100
    testlist = numbers + nonumbers
    shuffle(testlist)
    testlist = np.array(testlist)

numpy solution (only works with arrays and unicode)

np.core.defchararray.isnumeric can also work with single unicode strings, as in np.core.defchararray.isnumeric(u'+12'), but it returns an array. So it's a good solution if you have to do thousands of conversions and have missing data or non-numeric data.

    import numpy as np
    %timeit np.core.defchararray.isnumeric(testlist)
    10000 loops, best of 3: 27.9 µs per loop # 200 numbers per loop

try/except

    def check_num(s):
        try:
            int(s)
            return True
        except:
            return False

    def check_list(l):
        return [check_num(e) for e in l]

    %timeit check_list(testlist)
    1000 loops, best of 3: 217 µs per loop # 200 numbers per loop

It seems that the numpy solution is much faster.
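Before relying on the speed difference, it is worth checking that the two approaches answer the same question. As far as I know, numpy's isnumeric follows the semantics of Python's str.isnumeric for each element, so the sketch below uses the built-in method only, with no numpy dependency. The helper name parses_as_int and the sample values are assumptions for illustration.

```python
def parses_as_int(s):
    """Illustrative helper: True if int() accepts the string."""
    try:
        int(s)
        return True
    except ValueError:
        return False

# The last sample is VULGAR FRACTION ONE HALF.
samples = ["+1", "-1", "1", "1abc", "\u00bd"]
comparison = {s: (s.isnumeric(), parses_as_int(s)) for s in samples}

for s, (numeric, as_int) in comparison.items():
    print(repr(s), "isnumeric:", numeric, "int():", as_int)
```

Signed strings such as '+1' are not numeric in this sense, and some numeric characters are not valid integers, so the two benchmarks above may be classifying the test data differently.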