<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "">
<html xmlns="" lang="en" xml:lang="en">
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8" />
<link rel="stylesheet" href=".resources/report.css" type="text/css" />
<link rel="shortcut icon" href=".resources/report.gif" type="image/gif" />
<title>Unified coverage</title>
<script type="text/javascript" src=".resources/sort.js"></script>
<body onload="initialSort(['breadcrumb', 'coveragetable'])">
doc = new XmlSlurper( /* false, false, false */ ).parse("index.html")
[Fatal Error] index.html:1:48: DOCTYPE is disallowed when the feature "" set to true.
DOCTYPE is disallowed when the feature "" set to true.
doc = new XmlSlurper(false, false, true).parse("index.html")
[Fatal Error] index.html:1:148: External DTD: Failed to read external DTD 'xhtml1-strict.dtd', because 'http' access is not allowed due to restriction set by the accessExternalDTD property.
External DTD: Failed to read external DTD 'xhtml1-strict.dtd', because 'http' access is not allowed due to restriction set by the accessExternalDTD property.
doc = new XmlSlurper(false, true, true).parse("index.html")
[Fatal Error] index.html:1:148: External DTD: Failed to read external DTD 'xhtml1-strict.dtd', because 'http' access is not allowed due to restriction set by the accessExternalDTD property.
External DTD: Failed to read external DTD 'xhtml1-strict.dtd', because 'http' access is not allowed due to restriction set by the accessExternalDTD property.
doc = new XmlSlurper(true, true, true).parse("index.html")
External DTD: Failed to read external DTD 'xhtml1-strict.dtd', because 'http' access is not allowed due to restriction set by the accessExternalDTD property.
doc = new XmlSlurper(true, false, true).parse("index.html")
External DTD: Failed to read external DTD 'xhtml1-strict.dtd', because 'http' access is not allowed due to restriction set by the accessExternalDTD property.
def parser = new XmlSlurper()
parser.setFeature("", false)
parser.setFeature("", false)
def doc = parser.parse("index.html")
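The feature names in the two `setFeature` calls above did not survive in this post. As a minimal sketch, assuming they were the standard Xerces features `disallow-doctype-decl` and `nonvalidating/load-external-dtd` (an assumption based on the error messages, not confirmed by the post), the approach might look like this. The inline XHTML string, including its DOCTYPE identifiers and namespace, is a hypothetical stand-in for `index.html`.

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x remove this import (groovy.util.XmlSlurper is a default import)

// Hypothetical stand-in for index.html. The DOCTYPE identifiers and the
// namespace are the standard W3C XHTML 1.0 Strict values, assumed here.
def xhtml = '''<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head><title>Unified coverage</title></head>
<body><p>ok</p></body>
</html>'''

def parser = new XmlSlurper()
// Assumed feature: allow a DOCTYPE declaration instead of failing on it.
parser.setFeature("http://apache.org/xml/features/disallow-doctype-decl", false)
// Assumed feature: do not fetch the external DTD when not validating.
parser.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false)

def doc = parser.parseText(xhtml)
assert doc.head.title.text() == 'Unified coverage'
```

Because the DTD is never loaded, entities it would define (for example `&nbsp;`) stay undefined, so a document that uses them would still fail to parse with this setup.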
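Even if your HTML happens to be well-formed XML, a more general solution for parsing HTML is to use a true HTML parser. I have used the TagSoup parser in the past, and it handles real-world HTML very well.
TagSoup provides a SAX parser that can stand in for the default javax.xml.parsers.SAXParser, and you can pass it to the XmlSlurper constructor: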
import org.ccil.cowan.tagsoup.Parser
def doc = new XmlSlurper(new Parser()).parse("index.html")
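To show why a real HTML parser helps, here is a small sketch that feeds TagSoup markup a strict XML parser would reject (unclosed `<p>` elements and a bare `<br>`). The `@Grab` coordinates and the inline HTML string are assumptions for illustration and are not part of the original answer.

```groovy
// Assumed Maven coordinates for TagSoup; adjust to whatever your build uses.
@Grab('org.ccil.cowan.tagsoup:tagsoup:1.2.1')
import org.ccil.cowan.tagsoup.Parser
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x remove this import

// Hypothetical input: not well-formed XML, but typical real-world HTML.
def html = '<html><head><title>Unified coverage</title></head>' +
           '<body><p>first<p>second<br></body>'

// TagSoup's Parser is a SAX XMLReader, so XmlSlurper accepts it directly.
def doc = new XmlSlurper(new Parser()).parseText(html)

println doc.head.title.text()
println doc.body.p.size()
```

No DOCTYPE or DTD features need to be configured in this variant, since TagSoup does not try to fetch or validate against a DTD.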