各子ノードの個別の親ノードデータを取得するxpath

Question

Google SpreadsheetsにはIMPORTXML関数があり、 http://services.tvrage.com/feeds/episode_list.phpのAPIに基づいて、TVエピソードのリストを維持するために使用しようとしています。？sid = 3183 。

そのXMLからDate | Season No | Episode No | Titleのテーブルを取得しようとしています。

XMLは、季節ごとの階層的なネストエピソードです。

Xpathを使用して、日付、エピソード、タイトルの列を取得できます。

日付-/Show/Episodelist/Season/episode/airdate
エピソード番号-/Show/Episodelist/Season/episode/seasonnum
タイトル-/Show/Episodelist/Season/episode/title

ただし、no親ノードへのSeason属性である、各エピソードエントリのシーズン番号を抽出するための正しいxpathを見つけることができないようです。

私が試してみました：

/Show/Episodelist/Season/episode/../@no
/Show/Episodelist/Season/episode/parent::Season/@no

これらの両方は、季節の明確なリストを取得することになります、すなわち：

Date Season Episode Title 01-01-2001 1 1 foo 02-01-2001 2 2 bar 03-01-2001 3 3 baz 04-01-2001 4 fee 05-01-2001 5 fob 01-03-2002 1 bix 02-03-2002 2 buz 03-03-2002 3 fez 04-03-2002 4 baj

...私が探しているのは：

Date Season Episode Title 01-01-2001 1 1 foo 02-01-2001 1 2 bar 03-01-2001 1 3 baz 04-01-2001 1 4 fee 05-01-2001 1 5 fob 01-03-2002 2 1 bix 02-03-2002 2 2 buz 03-03-2002 2 3 fez 04-03-2002 2 4 baj

IMPORTXMLルーチンがインポートされたデータに対して個別の操作を行わないようにするための正しいxpathは何ですか？

Vidar S. Ramdal · Accepted Answer

私のコメントで述べたように、=importXMLとXPathでこれを行うのは難しいようです。

しかし、 XmlService of Google Apps Script を使用してプログラムで実行すると、簡単に見えます。

私は次のことを書きました：

function parseTvRageXml(url) { var result = []; var xml = UrlFetchApp.fetch(url).getContentText(); var document = XmlService.parse(xml); var root = document.getRootElement(); var show = document.getRootElement(); var episodeList = show.getChild("Episodelist"); var seasons = episodeList.getChildren("Season"); for (var i = 0; i < seasons.length; i++) { var season = seasons[i]; var seasonNum = season.getAttribute("no").getValue(); var episodes = season.getChildren("episode"); for (var j = 0; j < episodes.length; j++) { var episode = episodes[j]; var resultRow = []; resultRow.Push(seasonNum); resultRow.Push(episode.getChild("epnum").getText()); resultRow.Push(episode.getChild("seasonnum").getText()); resultRow.Push(episode.getChild("airdate").getText()); resultRow.Push(episode.getChild("link").getText()); resultRow.Push(episode.getChild("title").getText()); result.Push(resultRow); } } return result; }

実際には、指定されたURLからコンテンツを取得し、XMLとして解析し、要素を反復処理して、必要な要素の値を抽出し、2次元配列（result）に格納して返します。

これを使用するには、スプレッドシートにスクリプトをインストールする必要があります。 ツールメニュー→スクリプトエディターをクリックし、上記のコードを貼り付けます。スクリプトを保存します。

出力するセルに入力します

=parseTvRageXml("http://services.tvrage.com/feeds/episode_list.php?sid=3183")

結果は、次の列を含むエピソードのリストになります。

Season number | epnum | seasonnum | airdate | link | title

スプレッドシートの例を設定しました。これを示すために、気軽にコピーしてください。

Aurielle Perlmann · Answer

Importxmlおよび/またはimportfeedを使用できるようになりました。

これにより、自動的にスタックされます：

=IMPORTXML("http://www.tvrage.com/the-layover/episode_list/all","//h1 |//*[@size='4']| //*[@class='b'][1]/tr | //*[@class='b'][2]/tr")