HTML Agility Pack: Parsing an href tag

Question

How can I effectively parse the href attribute value from this:

<tr>
  <td rowspan="1" colspan="1">7</td>
  <td rowspan="1" colspan="1">
    <a class="undMe" href="/ice/player.htm?id=8475179" rel="skaterLinkData" shape="rect">D. Kulikov</a>
  </td>
  <td rowspan="1" colspan="1">D</td>
  <td rowspan="1" colspan="1">0</td>
  <td rowspan="1" colspan="1">0</td>
  <td rowspan="1" colspan="1">0</td>
  [...]

I'm interested in getting the player ID, which is 8475179 here. This is the code I have so far:

// Iterate all rows (players)
for (int i = 1; i < rows.Count; ++i)
{
    HtmlNodeCollection cols = rows[i].SelectNodes(".//td");

    // new player
    Dim_Player player = new Dim_Player();

    // Iterate all columns in this row
    for (int j = 1; j < 6; ++j)
    {
        switch (j)
        {
            case 1:
                player.Name = cols[j].InnerText;
                player.Player_id = Int32.Parse(/* this is where I want to parse the href value */);
                break;
            case 2:
                player.Position = cols[j].InnerText;
                break;
            case 3:
                stats.Goals = Int32.Parse(cols[j].InnerText);
                break;
            case 4:
                stats.Assists = Int32.Parse(cols[j].InnerText);
                break;
            case 5:
                stats.Points = Int32.Parse(cols[j].InnerText);
                break;
        }
    }
}

BrokenGlass · Accepted Answer

Based on your example, this worked for me:

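using System;
using System.Linq;
using HtmlAgilityPack;

HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.Load("test.html");

// Find the first <a> element whose class attribute is exactly "undMe"
var link = htmlDoc.DocumentNode
                  .Descendants("a")
                  .First(x => x.Attributes["class"] != null
                           && x.Attributes["class"].Value == "undMe");

// href is "/ice/player.htm?id=8475179"; the ID is the part after '='
string hrefValue = link.Attributes["href"].Value;
long playerId = Convert.ToInt64(hrefValue.Split('=')[1]);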

For real-world use you would need to add error checking and so on.
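
As a rough illustration of what that error checking might look like, here is a minimal sketch that only covers the string-parsing step. The `HrefParsing` class and `TryGetPlayerId` method are hypothetical names made up for this example, not part of HTML Agility Pack, and the sketch assumes the ID always arrives as a query parameter named `id`.

```csharp
using System;

static class HrefParsing
{
    // Hypothetical helper (not part of HTML Agility Pack): returns true and
    // sets playerId when href carries a numeric "id" query parameter.
    public static bool TryGetPlayerId(string href, out long playerId)
    {
        playerId = 0;
        if (string.IsNullOrEmpty(href)) return false;

        // Everything after the first '?' is the query string.
        int q = href.IndexOf('?');
        if (q < 0) return false;
        string query = href.Substring(q + 1);

        // Drop a trailing "#fragment", if any.
        int hash = query.IndexOf('#');
        if (hash >= 0) query = query.Substring(0, hash);

        foreach (string pair in query.Split('&'))
        {
            string[] kv = pair.Split(new[] { '=' }, 2);
            if (kv.Length == 2 && kv[0] == "id")
                return long.TryParse(kv[1], out playerId);
        }
        return false;
    }
}

static class Program
{
    static void Main()
    {
        if (HrefParsing.TryGetPlayerId("/ice/player.htm?id=8475179", out long id))
            Console.WriteLine(id); // prints 8475179
        else
            Console.WriteLine("no player id found");
    }
}
```

Unlike `Split('=')[1]`, this does not throw when the href has no `=` and does not pick up the wrong value when other query parameters come before `id`. The caller decides what to do when it returns false.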

csharptest.net · Answer

Use an XPath expression to find it:

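// Requires: using System; using System.Text.RegularExpressions; using HtmlAgilityPack;
// Note: SelectNodes returns null when nothing matches, so guard against that in real code.
foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@class='undMe']"))
{
    HtmlAttribute att = link.Attributes["href"];
    // Match the digits that follow "?id=" or "&id=" and run up to '&', '#', or the end of the string
    Console.WriteLine(new Regex(@"(?<=[\?&]id=)\d+(?=\&|\#|$)").Match(att.Value).Value);
}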
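
To make it clearer what that regular expression accepts and rejects, here is a small standalone sketch that runs the same pattern against plain strings, without HTML Agility Pack. The `IdRegexDemo` class, the `ExtractId` method, and the sample hrefs other than the first are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class IdRegexDemo
{
    // Same pattern as above: digits preceded by "?id=" or "&id=" and
    // followed by '&', '#', or the end of the string.
    static readonly Regex IdPattern = new Regex(@"(?<=[\?&]id=)\d+(?=\&|\#|$)");

    // Hypothetical helper: returns the matched digits, or null when there is no match.
    public static string ExtractId(string href)
    {
        Match m = IdPattern.Match(href);
        return m.Success ? m.Value : null;
    }

    static void Main()
    {
        string[] samples =
        {
            "/ice/player.htm?id=8475179",                // matches 8475179
            "/ice/player.htm?team=FLA&id=8475179#stats", // matches 8475179
            "/ice/player.htm?pid=8475179",               // no match: "pid" is not "id"
            "/ice/player.htm?id=84x"                     // no match: value is not all digits
        };

        foreach (string href in samples)
            Console.WriteLine(href + " -> " + (ExtractId(href) ?? "(no match)"));
    }
}
```

Checking `Match.Success` before using the value avoids silently passing an empty string to a later `Int32.Parse` call.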