Managing Tweepy API search

Question

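Please forgive me if this is a gross repeat of a question already answered elsewhere, but I am lost on how to use the tweepy API search function. Is there any documentation on how to search for tweets using the api.search() function?

Is there any way to control features such as the number of tweets returned, the result type, and so on?

For some reason the results seem to max out at 100.

The code snippet I use is as follows: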

searched_tweets = self.api.search(q=query,rpp=100,count=1000)

gumption · Answer

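I originally worked out a solution based on Yuva Raj's suggestion to use additional parameters of GET search/tweets: the max_id parameter, set from the id of the last tweet returned in each iteration of a loop that also checks for the occurrence of a TweepError.

However, I discovered there is a far simpler way to solve the problem using tweepy.Cursor (see the tweepy Cursor tutorial for more on using Cursor).

The following code fetches the 1000 most recent mentions of 'python'.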

import tweepy

# assuming Twitter_authentication.py contains each of the 4 oauth elements (1 per line)
from Twitter_authentication import API_KEY, API_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET

auth = tweepy.OAuthHandler(API_KEY, API_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)

query = 'python'
max_tweets = 1000
searched_tweets = [status for status in tweepy.Cursor(api.search, q=query).items(max_tweets)]

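Update: in response to Andre Petre's comment about potential memory consumption issues with tweepy.Cursor, I'll include my original solution. It replaces the single-statement list comprehension used above to compute searched_tweets with the following: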

searched_tweets = []
last_id = -1
while len(searched_tweets) < max_tweets:
    count = max_tweets - len(searched_tweets)
    try:
        new_tweets = api.search(q=query, count=count, max_id=str(last_id - 1))
        if not new_tweets:
            break
        searched_tweets.extend(new_tweets)
        last_id = new_tweets[-1].id
    except tweepy.TweepError as e:
        # depending on TweepError.code, one may want to retry or wait
        # to keep things simple, we will give up on an error
        break
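The comment in the loop above notes that one may want to retry or wait on an error rather than give up. Here is a minimal sketch of that idea, written independently of tweepy. The names call_with_retry, retry_on and the sleep hook are hypothetical, and which exceptions are worth retrying is an assumption you would adapt (for example to tweepy.TweepError).

```python
import time


def call_with_retry(func, retries=3, base_delay=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Call func(); on a retryable error, wait and try again, doubling the delay each time.

    Hypothetical helper: in real use retry_on might be (tweepy.TweepError,).
    """
    delay = base_delay
    for attempt in range(retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: let the caller see the error
            sleep(delay)
            delay *= 2
```

In the loop above, the search call could then be wrapped as call_with_retry(lambda: api.search(q=query, count=count, max_id=str(last_id - 1)), retry_on=(tweepy.TweepError,)).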

Yuva Raj · Answer

There's a problem in your code. According to the Twitter documentation for GET search/tweets, the count parameter is:

The number of tweets to return per page, up to a maximum of 100. Defaults to 15. This was formerly the "rpp" parameter in the old Search API.
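Because each request returns at most 100 tweets, collecting more than that takes several requests. A quick sketch of the arithmetic follows. The helper name requests_needed is made up for illustration, and it assumes that a count above 100 is simply treated as 100, which matches what the question observed.

```python
import math

MAX_PER_REQUEST = 100  # documented per-page maximum for `count`


def requests_needed(total_tweets, per_request=MAX_PER_REQUEST):
    """Return how many search requests are needed to collect total_tweets."""
    # Assumption: values above the maximum behave as if they were the maximum.
    per_request = min(per_request, MAX_PER_REQUEST)
    return math.ceil(total_tweets / per_request)


print(requests_needed(1000))      # 10 requests at count=100
print(requests_needed(1000, 15))  # 67 requests at the default of 15
```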

Your code should be:

import tweepy

CONSUMER_KEY = '....'
CONSUMER_SECRET = '....'
ACCESS_KEY = '....'
ACCESS_SECRET = '....'

auth = tweepy.auth.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)

search_results = api.search(q="hello", count=100)

for i in search_results:
    # Do Whatever You need to print here
    print(i.text)

Lucas · Answer

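The other questions are old, and the API has changed a lot.

The easy way is with Cursor (see the Cursor tutorial). pages() yields pages, each a list of results. You can limit how many pages it returns: .pages(5) returns only 5 pages.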

for page in tweepy.Cursor(api.search, q='python', count=100, tweet_mode='extended').pages():
    # process status here
    process_page(page)

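Here q is the query, count is how many tweets each request brings back (100 is the maximum per request), and tweet_mode='extended' asks for the full text. Without it, the text is truncated to 140 characters. More info here. Retweets are still truncated, as confirmed by jaycech3n.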
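With tweet_mode='extended' the untruncated text arrives in a full_text field, and for a retweet the complete text sits on the nested retweeted_status object. Here is a small sketch of reading it from the raw JSON dict (status._json in tweepy). The helper name get_full_text and the sample payload are illustrative assumptions, not part of tweepy.

```python
def get_full_text(status_json):
    """Return the untruncated text from a tweet's JSON dict.

    Assumes the request used tweet_mode='extended'; falls back to 'text'
    when 'full_text' is absent (e.g. a payload fetched without extended mode).
    """
    # A retweet's own text is cut off, so read the original tweet instead.
    source = status_json.get("retweeted_status", status_json)
    return source.get("full_text") or source.get("text", "")


# Made-up payload shaped like a retweet returned in extended mode.
retweet = {
    "full_text": "RT @someone: a long tweet that gets cut off\u2026",
    "retweeted_status": {"full_text": "a long tweet that gets cut off here in the retweet"},
}
print(get_full_text(retweet))
```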

If you don't want to use tweepy.Cursor, you need to pass max_id to fetch the next chunk. See the reference for more info.

last_id = None
result = True
while result:
    result = api.search(q='python', count=100, tweet_mode='extended', max_id=last_id)
    if not result:
        # no more tweets: stop before indexing into an empty result
        break
    process_result(result)
    # we subtract one to not have the same again.
    last_id = result[-1]._json['id'] - 1
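The max_id loop can also be written as a generator, so tweets are handled one at a time instead of being collected in a list first. This is a sketch only: iter_search and fake_search are hypothetical names, and fake_search stands in for api.search so that the example runs without credentials.

```python
from types import SimpleNamespace


def iter_search(search, query, max_tweets, page_size=100):
    """Yield up to max_tweets results, paging backwards with max_id."""
    max_id = None
    yielded = 0
    while yielded < max_tweets:
        kwargs = {"q": query, "count": min(page_size, max_tweets - yielded)}
        if max_id is not None:
            kwargs["max_id"] = max_id
        page = search(**kwargs)
        if not page:
            return  # no older tweets left
        for status in page:
            yield status
            yielded += 1
        # max_id is inclusive, so step just below the oldest tweet seen.
        max_id = page[-1].id - 1


def fake_search(q, count, max_id=None):
    """Stand-in for api.search: serves ids 250 down to 1, newest first."""
    newest = 250 if max_id is None else min(max_id, 250)
    oldest = max(newest - count + 1, 1)
    return [SimpleNamespace(id=i) for i in range(newest, oldest - 1, -1)]


ids = [s.id for s in iter_search(fake_search, "python", max_tweets=1000)]
print(len(ids), ids[0], ids[-1])
```

With a real client, the first argument would be api.search, and extra keyword arguments such as tweet_mode could be threaded through in the same way.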

Ritesh Soni · Answer

You can search for tweets containing a specific string as shown below:

# count is capped at 100 per request, so larger values have no extra effect
tweets = api.search('Artificial Intelligence', count=100)