How can I schedule automatic downloads of the Morning Edition podcast?

Question

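I would like to download the Morning Edition podcast automatically every day. I don't own any Apple products. I downloaded and installed flareget, but I can't figure out how to do this with it. I'm not tied to that tool. I'm a longtime Firefox user, although I'm currently test-driving Chrome.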

The program's URL is: http://www.npr.org/programs/morning-edition/

The RSS address is: http://www.npr.org/rss/rss.php?id=

The problem is that the RSS feed contains links to the individual story web pages rather than links to the mp3 files.

<rss xmlns:npr="http://www.npr.org/rss/" xmlns:nprml="http://api.npr.org/nprml" xmlns:iTunes="http://www.iTunes.com/dtds/podcast-1.0.dtd" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <channel>
    <title>Morning Edition : NPR</title>
    <link>http://www.npr.org/templates/story/story.php?storyId=3</link>
    <description>Morning Edition gives its audience news, analysis, commentary, and coverage of arts and sports. Stories are told through conversation as well as full reports. It's up-to-the-minute news that prepares listeners for the day ahead.</description>
    <language>en</language>
    <copyright>Copyright 2015 NPR - For Personal Use Only</copyright>
    <generator>NPR API RSS Generator 0.94</generator>
    <lastBuildDate>Fri, 06 Nov 2015 12:45:00 -0500</lastBuildDate>
    <image>
      <url>http://media.npr.org/images/podcasts/primary/npr_generic_image_300.jpg?s=200</url>
      <title>Morning Edition</title>
      <link>http://www.npr.org/templates/story/story.php?storyId=3</link>
    </image>
    <item>
      <title>Russian Airliner Crash Update</title>
      <description>The latest information on the Russian airliner that crashed in Egypt. All 224 people on board were killed.</description>
      <pubDate>Fri, 06 Nov 2015 12:45:00 -0500</pubDate>
      <link>http://www.npr.org/2015/11/06/455019224/russian-airliner-crash-update?utm_medium=RSS&utm_campaign=morningedition</link>
      <guid>http://www.npr.org/2015/11/06/455019224/russian-airliner-crash-update?utm_medium=RSS&utm_campaign=morningedition</guid>
      <content:encoded><![CDATA[ <p>The latest information on the Russian airliner that crashed in Egypt. All 224 people on board were killed.</p> ]]></content:encoded>
      <dc:creator>Corey Flintoff</dc:creator>
    </item>
    ...
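As a first step, the story page links can be pulled out of the feed. The following is a minimal sketch using only the Python standard library; the function name `story_links` is hypothetical, and it assumes the feed is served as well-formed RSS 2.0 XML (with `&` escaped as `&amp;` inside URLs).

```python
import xml.etree.ElementTree as ET


def story_links(rss_text):
    """Return the story page URLs from every <item><link> in an RSS 2.0 feed.

    Assumes rss_text is well-formed XML. The channel-level and image-level
    <link> elements are skipped because only <item> elements are visited.
    """
    root = ET.fromstring(rss_text)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link.strip():
            links.append(link.strip())
    return links
```

Each URL returned this way is a story web page, not an audio file, so the pages still have to be fetched and searched for the mp3 link.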

When I open http://www.npr.org/2015/11/06/455019224/russian-airliner-crash-update?utm_medium=RSS&utm_campaign=morningedition in my browser, the page contains a link to the story's mp3 file: http://pd.npr.org/anon.npr-mp3/npr/me/2015/11/20151106_me_egypt_plane_crash_probe_russia.mp3?dl=1

I can see an easily identifiable pattern, but I don't know which tool to use or how to use it.

The audio file URLs for all stories start with:

http://pd.npr.org/anon.npr-mp3/npr/me/

Then comes a folder for the year:

http://pd.npr.org/anon.npr-mp3/npr/me/2015

and one for the month:

http://pd.npr.org/anon.npr-mp3/npr/me/2015/11

All of the mp3s for today's show are named:

yyyymmdd_me*.mp3

The trailing ?dl=1 does not seem to be necessary.
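The pattern described above can be written down as a small check. This is only a sketch: the helper names (`month_folder`, `filename_prefix`, `is_show_mp3`) are hypothetical, and the URL layout is inferred from the single example in the question, so it may not hold for every story.

```python
from datetime import date

# Common prefix of the audio URLs, as observed in the question.
BASE = "http://pd.npr.org/anon.npr-mp3/npr/me/"


def month_folder(day):
    """Folder for the given date, e.g. .../me/2015/11/"""
    return f"{BASE}{day:%Y}/{day:%m}/"


def filename_prefix(day):
    """Start of the file name, i.e. the yyyymmdd_me part of yyyymmdd_me*.mp3"""
    return f"{day:%Y%m%d}_me"


def is_show_mp3(url, day):
    """True if url matches the observed pattern for the show on that date."""
    path = url.split("?", 1)[0]  # drop an optional query such as ?dl=1
    return path.startswith(month_folder(day) + filename_prefix(day)) and path.endswith(".mp3")
```

For example, `is_show_mp3(url, date.today())` could be used to keep only today's files from a list of candidate URLs.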

cas · Answer

You need to write a web robot that navigates the site until it finds the .mp3 URLs you want, and then downloads exactly those URLs.
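The core of such a robot might look like the sketch below. It uses only the Python standard library rather than any of the libraries recommended in this answer. The names `Mp3LinkParser`, `find_mp3_links`, and `download` are hypothetical, and the code assumes the mp3 link appears on the story page as an ordinary `<a href="...">` attribute; if the page builds the link with JavaScript, this approach will not find it.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urlsplit
from urllib.request import urlopen


class Mp3LinkParser(HTMLParser):
    """Collect the href of every <a> tag whose URL path ends in .mp3."""

    def __init__(self):
        super().__init__()
        self.mp3_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # urlsplit().path ignores a query string such as ?dl=1
            if name == "href" and value and urlsplit(value).path.endswith(".mp3"):
                self.mp3_urls.append(value)


def find_mp3_links(html_text):
    """Return all .mp3 link targets found in the given HTML text."""
    parser = Mp3LinkParser()
    parser.feed(html_text)
    return parser.mp3_urls


def download(url, folder="."):
    """Save url into folder under its own file name and return the local path."""
    filename = os.path.basename(urlsplit(url).path)
    target = os.path.join(folder, filename)
    with urlopen(url) as response, open(target, "wb") as out:
        out.write(response.read())
    return target
```

A full script would fetch each story page, pass its HTML to `find_mp3_links`, and call `download` on the results. Running it once a day is then a job for the operating system's scheduler.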

For Perl, the obvious solution is to use the libwww-perl package (a.k.a. LWP).

For Python, I'd recommend the mechanize or scrapy libraries.

These Python libraries are packaged for Debian and Ubuntu as python-mechanize and python-scrapy, so you can just install the packages (otherwise, use pip install or follow the instructions on the projects' websites).

Similar libraries exist for other languages.