python / beautifulsoup: find all <a href> with a specific anchor text

Question

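I am trying to use Beautiful Soup to parse HTML and find every href whose link has a specific anchor text:

<a href="http://example.com">TEXT</a>
<a href="http://example.com/link">TEXT</a>
<a href="http://example.com/page">TEXT</a>

All the links I'm looking for have exactly the same anchor text, in this case TEXT. I am not trying to find the word TEXT itself; I want to use the word TEXT to find all the different hrefs.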

Edit:

To clarify, I'm looking for something similar to using a class to pick out the links:

<a href="http://example.com" class="visible">TEXT</a>
<a href="http://example.com/link" class="visible">TEXT</a>
<a href="http://example.com/page" class="visible">TEXT</a>

and then

findAll('a', 'visible')

except that the HTML I'm parsing has no classes, only the anchor text, which is always the same.
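
For reference, the class-based lookup described in the edit can be sketched as below. This is only an illustration, assuming Python 3 and the `html.parser` backend; the fourth link without a class is a hypothetical addition to show what gets filtered out.

```python
from bs4 import BeautifulSoup

# Sample markup from the question, plus one hypothetical link without a class.
html = """
<a href="http://example.com" class="visible">TEXT</a>
<a href="http://example.com/link" class="visible">TEXT</a>
<a href="http://example.com/page" class="visible">TEXT</a>
<a href="http://example.com/other">TEXT</a>
"""

soup = BeautifulSoup(html, "html.parser")

# A plain string as the second positional argument is treated as a CSS class filter.
class_hrefs = [a["href"] for a in soup.find_all("a", "visible")]
print(class_hrefs)
```

The question is how to get the same kind of filter when the only thing the links share is their anchor text.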

RocketDonkey · Accepted Answer

Something like this, maybe?

In [39]: from bs4 import BeautifulSoup

In [40]: s = """\
   ....: <a href="http://example.com">TEXT</a>
   ....: <a href="http://example.com/link">TEXT</a>
   ....: <a href="http://example.com/page">TEXT</a>
   ....: <a href="http://dontmatchme.com/page">WRONGTEXT</a>"""

In [41]: soup = BeautifulSoup(s, 'html.parser')

In [42]: for link in soup.findAll('a', href=True, text='TEXT'):
   ....:     print(link['href'])
   ....:
http://example.com
http://example.com/link
http://example.com/page
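
As a standalone script, the same idea might look like the sketch below. It assumes Python 3 and the `html.parser` backend, and it uses `string=`, which is the name Beautiful Soup 4.4.0 and later give to the `text=` argument. The helper `hrefs_with_anchor_text`, the link padded with spaces, and the `<a>` without an href are hypothetical additions, included to show a whitespace-tolerant variant for markup that is less tidy than the sample.

```python
from bs4 import BeautifulSoup

# Sample markup from the answer, plus two hypothetical edge cases:
# anchor text padded with spaces, and an <a> with no href at all.
html = """
<a href="http://example.com">TEXT</a>
<a href="http://example.com/link">TEXT</a>
<a href="http://example.com/page"> TEXT </a>
<a href="http://dontmatchme.com/page">WRONGTEXT</a>
<a>TEXT</a>
"""

soup = BeautifulSoup(html, "html.parser")

# Direct equivalent of the answer: tags named "a" that have an href
# and whose string matches "TEXT".
exact = [a["href"] for a in soup.find_all("a", href=True, string="TEXT")]


def hrefs_with_anchor_text(markup, anchor_text):
    """Return hrefs of links whose visible text, stripped, equals anchor_text."""
    tree = BeautifulSoup(markup, "html.parser")
    return [
        a["href"]
        for a in tree.find_all("a", href=True)
        if a.get_text(strip=True) == anchor_text
    ]


tolerant = hrefs_with_anchor_text(html, "TEXT")
print(exact)
print(tolerant)
```

Comparing `get_text(strip=True)` instead of passing `string=` is a design choice for pages where the anchor text carries stray whitespace or is wrapped in nested tags; for markup as clean as the sample, the one-line `find_all` call is enough.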