How to get the XPath from an XmlNode instance

Question

Could someone supply some code that gets the XPath of a System.Xml.XmlNode instance?

Thanks!

Jon Skeet · Accepted Answer

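Okay, I couldn't resist having a go at it. It only works for attributes and elements, but hey... what can you expect in 15 minutes :) Likewise, there may well be a cleaner way of doing it.

Including the index on every element (particularly the root one) is superfluous, but it's easier than trying to work out whether there would otherwise be any ambiguity.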

    using System;
    using System.Text;
    using System.Xml;

    class Test
    {
        static void Main()
        {
            string xml = @"
    <root>
        <foo />
        <foo>
            <bar attr='value'/>
            <bar other='va' />
        </foo>
        <foo><bar /></foo>
    </root>";
            XmlDocument doc = new XmlDocument();
            doc.LoadXml(xml);
            XmlNode node = doc.SelectSingleNode("//@attr");
            Console.WriteLine(FindXPath(node));
            Console.WriteLine(doc.SelectSingleNode(FindXPath(node)) == node);
        }

        static string FindXPath(XmlNode node)
        {
            StringBuilder builder = new StringBuilder();
            while (node != null)
            {
                switch (node.NodeType)
                {
                    case XmlNodeType.Attribute:
                        builder.Insert(0, "/@" + node.Name);
                        node = ((XmlAttribute) node).OwnerElement;
                        break;
                    case XmlNodeType.Element:
                        int index = FindElementIndex((XmlElement) node);
                        builder.Insert(0, "/" + node.Name + "[" + index + "]");
                        node = node.ParentNode;
                        break;
                    case XmlNodeType.Document:
                        return builder.ToString();
                    default:
                        throw new ArgumentException("Only elements and attributes are supported");
                }
            }
            throw new ArgumentException("Node was not in a document");
        }

        static int FindElementIndex(XmlElement element)
        {
            XmlNode parentNode = element.ParentNode;
            if (parentNode is XmlDocument)
            {
                return 1;
            }
            XmlElement parent = (XmlElement) parentNode;
            int index = 1;
            foreach (XmlNode candidate in parent.ChildNodes)
            {
                if (candidate is XmlElement && candidate.Name == element.Name)
                {
                    if (candidate == element)
                    {
                        return index;
                    }
                    index++;
                }
            }
            throw new ArgumentException("Couldn't find element within parent");
        }
    }

Robert Rossney · Answer

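Jon's correct that there are any number of XPath expressions that will yield the same node in an instance document. The simplest way to build an expression that unambiguously yields a specific node is a chain of node tests that use the node's position in the predicate, e.g.: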

    /node()[1]/node()[2]/node()[6]/node()[1]/node()[2]

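Obviously, this expression doesn't use element names, but if all you're trying to do is locate a node within a document, you don't need its name. It also can't be used to find attributes (attributes are not children of their element and have no position; you can only find them by name), but it will find all other node types.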

To construct this expression, you need to write a method that returns a node's position among its parent's child nodes, because XmlNode doesn't expose that as a property:

    static int GetNodePosition(XmlNode child)
    {
        for (int i = 0; i < child.ParentNode.ChildNodes.Count; i++)
        {
            if (child.ParentNode.ChildNodes[i] == child)
            {
                // tricksy XPath, not starting its positions at 0 like a normal language
                return i + 1;
            }
        }
        throw new InvalidOperationException("Child node somehow not found in its parent's ChildNodes property.");
    }

(There's probably a more elegant way to do this using LINQ, since XmlNodeList implements IEnumerable, but I'm going with what I know here.)
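For what it's worth, a LINQ variant of the position lookup might look like the sketch below. The class and method names (`NodePositionSketch`, `GetNodePositionLinq`) are made up for illustration and are not part of the answer above. It counts the siblings that come before the node and adds one.

```csharp
using System;
using System.Linq;
using System.Xml;

static class NodePositionSketch
{
    // Hypothetical LINQ variant of GetNodePosition: returns the 1-based
    // position of child among its parent's child nodes.
    public static int GetNodePositionLinq(XmlNode child)
    {
        if (child.ParentNode == null)
            throw new InvalidOperationException("Node has no parent.");

        // XmlNodeList only implements the non-generic IEnumerable,
        // so Cast<XmlNode>() is needed before other LINQ operators.
        int nodesBefore = child.ParentNode.ChildNodes
            .Cast<XmlNode>()
            .TakeWhile(n => !ReferenceEquals(n, child))
            .Count();

        return nodesBefore + 1; // XPath positions start at 1
    }
}
```

Whether this is more elegant than the plain loop is a matter of taste; it does the same linear scan.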

Then you can write a recursive method like this:

    static string GetXPathToNode(XmlNode node)
    {
        if (node.NodeType == XmlNodeType.Attribute)
        {
            // attributes have an OwnerElement, not a ParentNode; also they have
            // to be matched by name, not found by position
            return String.Format(
                "{0}/@{1}",
                GetXPathToNode(((XmlAttribute)node).OwnerElement),
                node.Name
                );
        }
        if (node.ParentNode == null)
        {
            // the only node with no parent is the root node, which has no path
            return "";
        }
        // the path to a node is the path to its parent, plus "/node()[n]", where
        // n is its position among its siblings.
        return String.Format(
            "{0}/node()[{1}]",
            GetXPathToNode(node.ParentNode),
            GetNodePosition(node)
            );
    }

As you can see, I hacked in a way for it to find attributes as well.

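Jon slipped in with his version while I was writing mine. There's something about his code that I'm going to grumble about, and I apologize in advance if it sounds like I'm ragging on Jon. (I'm not. I'm pretty sure the list of things Jon has to learn from me is exceedingly short.) But I think the point I'm about to make is one worth thinking about for anyone who works with XML.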

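I suspect that Jon's solution emerged from something I see a lot of developers do: thinking of XML documents as trees of elements and attributes. I think this largely comes from developers whose primary use of XML is as a serialization format, because all the XML they're used to is structured this way. You can spot these developers because they use the terms "node" and "element" interchangeably. This leads them to come up with solutions that treat all other node types as special cases. (I was one of these guys myself for a very long time.)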

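This feels like a simplifying assumption while you're making it. But it's not. It makes problems harder and code more complex. It leads you to bypass the pieces of XML technology (like the node() function in XPath) that are specifically designed to treat all node types generically.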

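There's a red flag in Jon's code that would make me query it in a code review even if I didn't know what the requirements were, and that's GetElementsByTagName. Whenever I see that method in use, the question that leaps to mind is always "why does it have to be an element?" And the answer is very often "oh, does this code need to handle text nodes too?"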
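To make the "all node types" point concrete, here is a minimal sketch that builds a node()[n] chain iteratively and checks that it round-trips for a comment and a text node. `NodeChainSketch`, `PathTo` and `Demo` are illustrative names, not code from the answer. It assumes a document with no XML declaration or DOCTYPE and no adjacent text/CDATA nodes, since DOM child positions and XPath positions may not line up in those cases.

```csharp
using System;
using System.Xml;

static class NodeChainSketch
{
    // Compact iterative variant of the node()[n] chain, for illustration only.
    // Not meant for attributes, which have no ParentNode.
    public static string PathTo(XmlNode node)
    {
        string path = "";
        while (node.ParentNode != null)
        {
            // Count this node plus every sibling before it, whatever its type.
            int position = 1;
            for (XmlNode s = node.PreviousSibling; s != null; s = s.PreviousSibling)
                position++;

            path = "/node()[" + position + "]" + path;
            node = node.ParentNode;
        }
        return path;
    }

    public static void Demo()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<root><!-- note --><item>first<b/>second</item></root>");

        XmlNode comment = doc.DocumentElement.FirstChild;        // the comment node
        XmlNode text = doc.DocumentElement.LastChild.LastChild;  // the text node "second"

        foreach (XmlNode n in new[] { comment, text })
        {
            string path = PathTo(n);
            // Prints the path and whether it selects the very same node again.
            Console.WriteLine(path + " -> " + (doc.SelectSingleNode(path) == n));
        }
    }
}
```

No special cases are needed for comments or text here; only attributes would need separate handling.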

Roemer · Answer

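Old post, I know, but the version I liked best (the one with names) was flawed: when a parent node has child nodes with different names, it stopped counting the index as soon as it found the first non-matching node name.

Here is my fixed version: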

    /// <summary>
    /// Gets the X-Path to a given Node
    /// </summary>
    /// <param name="node">The Node to get the X-Path from</param>
    /// <returns>The X-Path of the Node</returns>
    public string GetXPathToNode(XmlNode node)
    {
        if (node.NodeType == XmlNodeType.Attribute)
        {
            // attributes have an OwnerElement, not a ParentNode; also they have
            // to be matched by name, not found by position
            return String.Format("{0}/@{1}", GetXPathToNode(((XmlAttribute)node).OwnerElement), node.Name);
        }
        if (node.ParentNode == null)
        {
            // the only node with no parent is the root node, which has no path
            return "";
        }

        // Get the Index
        int indexInParent = 1;
        XmlNode siblingNode = node.PreviousSibling;
        // Loop thru all Siblings
        while (siblingNode != null)
        {
            // Increase the Index if the Sibling has the same Name
            if (siblingNode.Name == node.Name)
            {
                indexInParent++;
            }
            siblingNode = siblingNode.PreviousSibling;
        }

        // the path to a node is the path to its parent, plus "/name[n]", where
        // n is its position among its siblings with the same name.
        return String.Format("{0}/{1}[{2}]", GetXPathToNode(node.ParentNode), node.Name, indexInParent);
    }
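The flaw described in this answer is easy to reproduce with interleaved siblings. The sketch below isolates the two counting strategies in hypothetical helpers (`IndexStoppingEarly`, `IndexCountingAll`) and runs them on the second `a` in `<r><a/><b/><a/></r>`.

```csharp
using System;
using System.Xml;

static class SiblingIndexSketch
{
    // Stops at the first preceding sibling with a different name (the flawed behaviour).
    public static int IndexStoppingEarly(XmlNode node)
    {
        int index = 1;
        XmlNode current = node;
        while (current.PreviousSibling != null && current.PreviousSibling.Name == current.Name)
        {
            index++;
            current = current.PreviousSibling;
        }
        return index;
    }

    // Walks all preceding siblings and counts those with the same name.
    public static int IndexCountingAll(XmlNode node)
    {
        int index = 1;
        for (XmlNode s = node.PreviousSibling; s != null; s = s.PreviousSibling)
        {
            if (s.Name == node.Name) index++;
        }
        return index;
    }

    public static void Demo()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<r><a/><b/><a/></r>");
        XmlNode secondA = doc.DocumentElement.LastChild;

        Console.WriteLine(IndexStoppingEarly(secondA)); // prints 1, so the path would point at the first <a/>
        Console.WriteLine(IndexCountingAll(secondA));   // prints 2
    }
}
```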

rugg · Answer

Here's a simple method that I've used and that worked for me.

    static string GetXpath(XmlNode node)
    {
        if (node.Name == "#document")
            return String.Empty;
        return GetXpath(node.SelectSingleNode("..")) + "/" + (node.NodeType == XmlNodeType.Attribute ? "@" : String.Empty) + node.Name;
    }

James Randle · Answer

My 10p worth is a hybrid of Robert's and Corey's answers. I can only claim credit for actually typing the extra lines of code.

    private static string GetXPathToNode(XmlNode node)
    {
        if (node.NodeType == XmlNodeType.Attribute)
        {
            // attributes have an OwnerElement, not a ParentNode; also they have
            // to be matched by name, not found by position
            return String.Format(
                "{0}/@{1}",
                GetXPathToNode(((XmlAttribute)node).OwnerElement),
                node.Name
                );
        }
        if (node.ParentNode == null)
        {
            // the only node with no parent is the root node, which has no path
            return "";
        }
        //get the index
        int iIndex = 1;
        XmlNode xnIndex = node;
        while (xnIndex.PreviousSibling != null)
        {
            iIndex++;
            xnIndex = xnIndex.PreviousSibling;
        }
        // the path to a node is the path to its parent, plus "/node()[n]", where
        // n is its position among its siblings.
        return String.Format(
            "{0}/node()[{1}]",
            GetXPathToNode(node.ParentNode),
            iIndex
            );
    }

René Endress · Answer

If you do it this way, you get a path with the names of the nodes and, where there are nodes with the same name, their positions, like this: "/Service[1]/System[1]/Group[1]/Folder[2]/File[2]"

    public string GetXPathToNode(XmlNode node)
    {
        if (node.NodeType == XmlNodeType.Attribute)
        {
            // attributes have an OwnerElement, not a ParentNode; also they have
            // to be matched by name, not found by position
            return String.Format("{0}/@{1}", GetXPathToNode(((XmlAttribute)node).OwnerElement), node.Name);
        }
        if (node.ParentNode == null)
        {
            // the only node with no parent is the root node, which has no path
            return "";
        }

        //get the index
        // (counting stops at the first preceding sibling whose name differs)
        int iIndex = 1;
        XmlNode xnIndex = node;
        while (xnIndex.PreviousSibling != null && xnIndex.PreviousSibling.Name == xnIndex.Name)
        {
            iIndex++;
            xnIndex = xnIndex.PreviousSibling;
        }

        // the path to a node is the path to its parent, plus "/name[n]", where
        // n is the index computed above.
        return String.Format("{0}/{1}[{2}]", GetXPathToNode(node.ParentNode), node.Name, iIndex);
    }

Jon Skeet · Answer

There's no such thing as "the" XPath for a node. For any given node there may well be many XPath expressions that match it.
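As a small illustration of that point, the sketch below checks that several different expressions select the same `bar` element in a tiny document. The class name `ManyPathsSketch` and the sample XML are assumptions made up for this example.

```csharp
using System;
using System.Xml;

static class ManyPathsSketch
{
    // Returns true if every alternative expression selects the same node
    // as the positional path /root/foo[2]/bar[1].
    public static bool AllSelectSameNode()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<root><foo/><foo><bar attr='value'/><bar other='va'/></foo></root>");

        XmlNode target = doc.SelectSingleNode("/root/foo[2]/bar[1]");
        string[] alternatives =
        {
            "//bar[@attr='value']",                            // by attribute value
            "/*/*[2]/*[1]",                                    // by position only
            "//@attr/..",                                      // parent of the attribute
            "/root/foo[bar]/bar[not(preceding-sibling::bar)]"  // by structure
        };

        foreach (string xpath in alternatives)
        {
            if (doc.SelectSingleNode(xpath) != target) return false;
        }
        return target != null;
    }
}
```

Which of these counts as "the" path depends entirely on what the path will be used for later.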

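You can probably work up the tree to build an expression that will match it, taking into account the index of particular elements and so on, but it's not going to be terribly nice code.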

Why do you need this? There may be a better solution.

Sandy · Answer

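I wrote some VBA for Excel to do this for a work project. It outputs tuples of an XPath and the associated text from an element or attribute. The purpose was to let business analysts identify and map some XML. I appreciate that this is a C# forum, but I thought it might be of interest.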

    Sub Parse2(oSh As Long, inode As IXMLDOMNode, Optional iXstring As String = "", Optional indexes)

        Dim chnode As IXMLDOMNode
        Dim attr As IXMLDOMAttribute
        Dim oXString As String
        Dim chld As Long
        Dim idx As Variant
        Dim addindex As Boolean
        chld = 0
        idx = 0
        addindex = False

        'determine the node type:
        Select Case inode.NodeType

            Case NODE_ELEMENT
                If inode.ParentNode.NodeType = NODE_DOCUMENT Then
                    'This gets the root node name but ignores all the namespace attributes
                    oXString = iXstring & "//" & fp(inode.nodename)
                Else
                    'Need to deal with indexing. Where an element has siblings with the same nodeName,
                    'it needs to be indexed using [index], e.g swapstreams or schedules
                    For Each chnode In inode.ParentNode.ChildNodes
                        If chnode.NodeType = NODE_ELEMENT And chnode.nodename = inode.nodename Then chld = chld + 1
                    Next chnode

                    If chld > 1 Then '//inode has siblings of the same nodeName, so needs to be indexed
                        'Lookup the index from the indexes array
                        idx = getIndex(inode.nodename, indexes)
                        addindex = True
                    Else
                    End If

                    'build the XString
                    oXString = iXstring & "/" & fp(inode.nodename)
                    If addindex Then oXString = oXString & "[" & idx & "]"

                    'If type is element then check for attributes
                    For Each attr In inode.Attributes
                        'If the element has attributes then extract the data pair
                        'XString + Element.Name, @Attribute.Name=Attribute.Value
                        Call oSheet(oSh, oXString & "/@" & attr.Name, attr.Value)
                    Next attr
                End If

            Case NODE_TEXT
                'build the XString
                oXString = iXstring
                Call oSheet(oSh, oXString, inode.NodeValue)

            Case NODE_ATTRIBUTE
                'Do nothing
            Case NODE_CDATA_SECTION
                'Do nothing
            Case NODE_COMMENT
                'Do nothing
            Case NODE_DOCUMENT
                'Do nothing
            Case NODE_DOCUMENT_FRAGMENT
                'Do nothing
            Case NODE_DOCUMENT_TYPE
                'Do nothing
            Case NODE_ENTITY
                'Do nothing
            Case NODE_ENTITY_REFERENCE
                'Do nothing
            Case NODE_INVALID
                'do nothing
            Case NODE_NOTATION
                'do nothing
            Case NODE_PROCESSING_INSTRUCTION
                'do nothing
        End Select

        'Now call Parse2 on each of inode's children.
        If inode.HasChildNodes Then
            For Each chnode In inode.ChildNodes
                Call Parse2(oSh, chnode, oXString, indexes)
            Next chnode
            Set chnode = Nothing
        Else
        End If

    End Sub

The counting of elements is managed with:

    Function getIndex(tag As Variant, indexes) As Variant
        'Function to get the latest index for an xml tag from the indexes array
        'indexes array is passed from one parser function to the next up and down the tree

        Dim i As Integer
        Dim n As Integer

        If IsArrayEmpty(indexes) Then
            ReDim indexes(1, 0)
            indexes(0, 0) = "Tag"
            indexes(1, 0) = "Index"
        Else
        End If

        For i = 0 To UBound(indexes, 2)
            If indexes(0, i) = tag Then
                'tag found, increment and return the index then exit
                'also destroy all recorded tag names BELOW that level
                indexes(1, i) = indexes(1, i) + 1
                getIndex = indexes(1, i)
                ReDim Preserve indexes(1, i) 'should keep all tags up to i but remove all below it
                Exit Function
            Else
            End If
        Next i

        'tag not found so add the tag with index 1 at the end of the array
        n = UBound(indexes, 2)
        ReDim Preserve indexes(1, n + 1)
        indexes(0, n + 1) = tag
        indexes(1, n + 1) = 1
        getIndex = 1

    End Function

cjbarth · Answer

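I found that none of the above worked with XDocument, so I wrote my own code to support XDocument, using recursion. I think this code handles multiple identical nodes better than some of the other code here, because it first tries to go as deep into the XML path as it can and then backs up to build only what is needed. So if you have /home/white/bob and /home/white/mike and you want to create /home/white/bob/garage, the code will know how to create that. However, I didn't want to mess with predicates or wildcards, so I explicitly disallowed them; it would be easy to add support for them, though.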

    ' Requires: Imports System.Text.RegularExpressions, System.Xml.Linq, System.Xml.XPath

    Private Sub NodeItterate(XDoc As XElement, XPath As String)
        'get the deepest path
        Dim nodes As IEnumerable(Of XElement)

        nodes = XDoc.XPathSelectElements(XPath)

        'if it doesn't exist, try the next shallow path
        If nodes.Count = 0 Then
            NodeItterate(XDoc, XPath.Substring(0, XPath.LastIndexOf("/")))
            'by this time all the required parent elements will have been constructed
            Dim ParentPath As String = XPath.Substring(0, XPath.LastIndexOf("/"))
            Dim ParentNode As XElement = XDoc.XPathSelectElement(ParentPath)
            Dim NewElementName As String = XPath.Substring(XPath.LastIndexOf("/") + 1, XPath.Length - XPath.LastIndexOf("/") - 1)
            ParentNode.Add(New XElement(NewElementName))
        End If

        'if we find there are more than 1 elements at the deepest path we have access to, we can't proceed
        If nodes.Count > 1 Then
            Throw New ArgumentOutOfRangeException("There are too many paths that match your expression.")
        End If

        'if there is just one element, we can proceed
        If nodes.Count = 1 Then
            'just proceed
        End If
    End Sub

    Public Sub CreateXPath(ByVal XDoc As XElement, ByVal XPath As String)
        If XPath.Contains("//") Or XPath.Contains("*") Or XPath.Contains(".") Then
            Throw New ArgumentException("Can't create a path based on searches, wildcards, or relative paths.")
        End If

        'character class: any one of these characters suggests a predicate
        If Regex.IsMatch(XPath, "[()@='<>\|]") Then
            Throw New ArgumentException("Can't create a path based on predicates.")
        End If

        'we will process this recursively.
        NodeItterate(XDoc, XPath)
    End Sub
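For the original question (getting a path rather than creating one) applied to LINQ to XML, a rough C# sketch could look like the following. `XElementPathSketch`, `GetXPath` and `RoundTrips` are hypothetical names, and the sketch assumes the document uses no XML namespaces, since it emits local names only.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

static class XElementPathSketch
{
    // Hypothetical helper: builds a name[index] path for an XElement.
    // Assumes no namespaces are in play (only LocalName is written out).
    public static string GetXPath(XElement element)
    {
        var steps = element.AncestorsAndSelf()
            .Reverse() // AncestorsAndSelf starts at the element; we want root first
            .Select(e =>
            {
                // 1-based index among preceding siblings with the same name
                int index = e.ElementsBeforeSelf(e.Name).Count() + 1;
                return "/" + e.Name.LocalName + "[" + index + "]";
            });
        return string.Concat(steps);
    }

    // Checks that the generated path selects the same element again.
    public static bool RoundTrips(XDocument doc, XElement element)
    {
        return doc.XPathSelectElement(GetXPath(element)) == element;
    }
}
```

Supporting namespaces would need an XmlNamespaceManager and prefixed names, which this sketch deliberately leaves out.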

Plasmabubble · Answer

How about using class extensions? ;) My version (built on others' work) uses the syntax name[index]... The loop that gets the element index lives outside, in an independent routine (also a class extension).

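Just paste the following into any utility class (or the main Program class); the class has to be static for the extension methods to compile.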

    static public int GetRank( this XmlNode node )
    {
        // return 0 if unique, else return position 1...n in siblings with same name
        try
        {
            if( node is XmlElement )
            {
                int rank = 1;
                bool alone = true, found = false;

                foreach( XmlNode n in node.ParentNode.ChildNodes )
                    if( n.Name == node.Name ) // sibling with same name
                    {
                        if( n.Equals(node) )
                        {
                            if( ! alone ) return rank; // no need to continue
                            found = true;
                        }
                        else
                        {
                            if( found ) return rank; // no need to continue
                            alone = false;
                            rank++;
                        }
                    }
            }
        }
        catch{}
        return 0;
    }

    static public string GetXPath( this XmlNode node )
    {
        try
        {
            if( node is XmlAttribute )
                return String.Format( "{0}/@{1}", (node as XmlAttribute).OwnerElement.GetXPath(), node.Name );

            if( node is XmlText || node is XmlCDataSection )
                return node.ParentNode.GetXPath();

            if( node.ParentNode == null ) // the only node with no parent is the root node, which has no path
                return "";

            int rank = node.GetRank();
            if( rank == 0 ) return String.Format( "{0}/{1}", node.ParentNode.GetXPath(), node.Name );
            else return String.Format( "{0}/{1}[{2}]", node.ParentNode.GetXPath(), node.Name, rank );
        }
        catch{}
        return "";
    }

Andrei · Answer

Another solution to your problem could be to "mark" the XmlNodes you will want to identify later with a custom attribute:

    var id = _currentNode.OwnerDocument.CreateAttribute("some_id");
    id.Value = Guid.NewGuid().ToString();
    _currentNode.Attributes.Append(id);

You can store that value in a Dictionary, for example, and later identify the node with an XPath query:

    // id is an XmlAttribute, so pass its Value into the query
    newOrOldDocument.SelectSingleNode(string.Format("//*[contains(@some_id,'{0}')]", id.Value));

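I know this is not a direct answer to your question, but it can help if the reason you want to know the XPath of a node is to have a way of "reaching" the node later, after you have lost the reference to it in code.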

This also overcomes the problem of elements being added to or moved within the document, which can mess up the XPath (or the indexes, as suggested in other answers).
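A quick sketch of that last point, under made-up names (`MarkerSketch`, the `some_id` attribute as above, and a two-item sample document): after a new sibling is inserted, a positional path no longer selects the original node, while the marker query still does.

```csharp
using System;
using System.Xml;

static class MarkerSketch
{
    // Returns true if the positional path breaks after an insert
    // while the marker-based query still finds the original node.
    public static bool MarkerSurvivesInsert()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<root><item/><item/></root>");
        XmlElement target = (XmlElement)doc.DocumentElement.LastChild;

        // Two ways of remembering the node: by position and by marker attribute.
        string positional = "/root[1]/item[2]";
        string id = Guid.NewGuid().ToString();
        target.SetAttribute("some_id", id);
        string byMarker = "//*[@some_id='" + id + "']";

        // Insert a new first item; target is now the third item.
        doc.DocumentElement.PrependChild(doc.CreateElement("item"));

        bool positionalStillHits = doc.SelectSingleNode(positional) == target;
        bool markerStillHits = doc.SelectSingleNode(byMarker) == target;
        return !positionalStillHits && markerStillHits;
    }
}
```

The trade-off is that the document itself gets modified, which may not be acceptable if it is later saved or validated against a schema.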

Art · Answer

I had to do this recently. Only elements needed to be considered. This is what I came up with:

    // requires: using System.Collections.Generic; using System.Xml;
    private string GetPath(XmlElement el)
    {
        List<string> pathList = new List<string>();
        XmlNode node = el;
        while (node is XmlElement)
        {
            pathList.Add(node.Name);
            node = node.ParentNode;
        }
        pathList.Reverse();
        string[] nodeNames = pathList.ToArray();
        return String.Join("/", nodeNames);
    }

Corey Fournier · Answer

This is even easier:

    ''' <summary>
    ''' Gets the full XPath of a single node.
    ''' </summary>
    ''' <param name="node"></param>
    ''' <returns></returns>
    ''' <remarks></remarks>
    Private Function GetXPath(ByVal node As Xml.XmlNode) As String
        Dim temp As String
        Dim sibling As Xml.XmlNode
        Dim previousSiblings As Integer = 1

        'I dont want to know that it was a generic document
        If node.Name = "#document" Then Return ""

        'Prime it
        sibling = node.PreviousSibling
        'Perculate up getting the count of all of this node's sibling before it.
        While sibling IsNot Nothing
            'Only count if the sibling has the same name as this node
            If sibling.Name = node.Name Then
                previousSiblings += 1
            End If
            sibling = sibling.PreviousSibling
        End While

        'Mark this node's index, if it has one
        ' Also mark the index to 1 or the default if it does have a sibling just no previous.
        temp = node.Name + IIf(previousSiblings > 0 OrElse node.NextSibling IsNot Nothing, "[" + previousSiblings.ToString() + "]", "").ToString()

        If node.ParentNode IsNot Nothing Then
            Return GetXPath(node.ParentNode) + "/" + temp
        End If

        Return temp
    End Function

Mabrouk MAHDHI · Answer

    public static string GetFullPath(this XmlNode node)
    {
        if (node.ParentNode == null)
        {
            return "";
        }
        else
        {
            // joins the names of the node's ancestors with backslashes
            return $"{GetFullPath(node.ParentNode)}\\{node.ParentNode.Name}";
        }
    }