{"id":74727,"date":"2022-11-08T08:00:56","date_gmt":"2022-11-08T07:00:56","guid":{"rendered":"https:\/\/drafts.code-maze.com\/?p=74727"},"modified":"2022-12-13T18:41:00","modified_gmt":"2022-12-13T17:41:00","slug":"html-agility-pack-csharp","status":"publish","type":"post","link":"https:\/\/code-maze.com\/html-agility-pack-csharp\/","title":{"rendered":"How to Use HTML Agility Pack in C#"},"content":{"rendered":"<p>In this article, we&#8217;re going to learn how to use HTML Agility Pack in C# and review some examples of its most important features.<\/p>\n<div style=\"padding: 20px; border-left: 5px #dc2323 solid; display: block; margin-bottom: 20px; box-shadow: 1px 1px 5px 0px lightgrey;\">To download the source code for this article, you can visit our <a href=\"https:\/\/github.com\/CodeMazeBlog\/CodeMazeGuides\/tree\/main\/dotnet-client-libraries\/HowToUseHtmlAgilityPack\" target=\"_blank\" rel=\"nofollow noopener\">GitHub repository<\/a>.<\/div>\n<p>Let&#8217;s start.<\/p>\n<h2>What Is HTML Agility Pack and How to Use It<\/h2>\n<p>HTML Agility Pack is a <strong>tool to read, write and update HTML documents<\/strong>. 
It is <strong>commonly used for web scraping, which is the process of programmatically extracting information from public websites<\/strong>.<\/p>\n<p>To start using HTML Agility Pack, we can install it using NuGet Package Manager:<\/p>\n<p><code class=\"EnlighterJSRAW\" data-enlighter-language=\"powershell\">Install-Package HtmlAgilityPack<\/code><\/p>\n<p>Once done, we can easily parse an HTML string:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"csharp\" data-enlighter-highlight=\"9-12\">var html = @\"&lt;!DOCTYPE html&gt;\r\n            &lt;html&gt;\r\n            &lt;body&gt;\r\n                &lt;h1&gt;Learn To Code in C#&lt;\/h1&gt;\r\n                &lt;p&gt;Programming is really &lt;i&gt;easy&lt;\/i&gt;.&lt;\/p&gt;\r\n            &lt;\/body&gt;\r\n            &lt;\/html&gt;\";\r\n\r\nvar dom = new HtmlDocument();\r\ndom.LoadHtml(html);\r\n\r\nvar documentHeader = dom.DocumentNode.SelectSingleNode(\"\/\/h1\");\r\n\r\nAssert.Equal(\"Learn To Code in C#\", documentHeader.InnerHtml);<\/pre>\n<p>Here, we parse a string containing some basic HTML to get an <code>HtmlDocument<\/code> object.<\/p>\n<p>The <code>HtmlDocument<\/code> object exposes a <code>DocumentNode<\/code> property that represents the root node of the document. We call <code>SelectSingleNode()<\/code> on it to query the document model for the first <code>h1<\/code> tag. Finally, we read the content of the <code>h1<\/code> tag through the <code>InnerHtml<\/code> property.<\/p>\n<h2>Parsing HTML With HTML Agility Pack<\/h2>\n<p>While parsing HTML documents from strings is simple, sometimes we will need to obtain our HTML from other sources.<\/p>\n<h3>Parsing HTML From a Local File<\/h3>\n<p><strong>We can easily load HTML from files located on a local hard drive<\/strong>. 
To demonstrate that, let&#8217;s first create an HTML file and save it with the name <code>test.html<\/code>:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"html\">&lt;!DOCTYPE html&gt;\r\n&lt;html&gt;\r\n&lt;body&gt;\r\n    &lt;h1&gt;Learn To Code in C#&lt;\/h1&gt;\r\n    &lt;p&gt;Programming is really &lt;i&gt;easy&lt;\/i&gt;.&lt;\/p&gt;\r\n    &lt;h2&gt;HTML Agility Pack&lt;\/h2&gt;\r\n    &lt;p id='second'&gt;HTML Agility Pack is a popular web scraping tool.&lt;\/p&gt;\r\n    &lt;p&gt;Features:&lt;\/p&gt;\r\n    &lt;ul&gt;\r\n        &lt;li&gt;Parser&lt;\/li&gt;\r\n        &lt;li&gt;Selectors&lt;\/li&gt;\r\n        &lt;li&gt;DOM management&lt;\/li&gt;\r\n    &lt;\/ul&gt;\r\n&lt;\/body&gt;\r\n&lt;\/html&gt;<\/pre>\n<p>Then, we can instantiate a new <code>HtmlDocument<\/code> object and use its <code>Load()<\/code> method to parse the content of our HTML file:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"csharp\" data-enlighter-highlight=\"3-6\">var path = @\"test.html\";\r\n\r\nvar doc = new HtmlDocument();\r\ndoc.Load(path);\r\n\r\nvar htmlHeader = doc.DocumentNode.SelectSingleNode(\"\/\/h2\");\r\n\r\nAssert.Equal(\"HTML Agility Pack\", htmlHeader.InnerHtml);<\/pre>\n<p>Once loaded, we can query the document contents using the <code>DocumentNode.SelectSingleNode()<\/code> method. In this case, we retrieve the second-level header text via the <code>InnerHtml<\/code> property of the <code>h2<\/code> tag in the document.<\/p>\n<h3>Parsing HTML From the Internet<\/h3>\n<p>Let&#8217;s say our goal is to get HTML from a public website. 
To parse content straight from a URL, we need to use an instance of the <code>HtmlWeb<\/code> class instead of <code>HtmlDocument<\/code>:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"csharp\" data-enlighter-highlight=\"3-6\">var url = @\"https:\/\/code-maze.com\/\";\r\n\r\nHtmlWeb web = new HtmlWeb();\r\nvar htmlDoc = web.Load(url);\r\n\r\nvar node = htmlDoc.DocumentNode.SelectSingleNode(\"\/\/head\/title\");\r\n\r\nAssert.Equal(\"Code Maze - C#, .NET and Web Development Tutorials\", node.InnerHtml);<\/pre>\n<p>Once we parse the content by calling the <code>Load()<\/code> method of the <code>HtmlWeb<\/code> instance with the site&#8217;s URL, we can use the methods we already know to access the content. In this case, we are selecting the <code>title<\/code> tag inside the <code>head<\/code> section of the document.<\/p>\n<h3>Parsing HTML From a Browser Using Selenium<\/h3>\n<p>Often, websites use client-side code such as JavaScript to render HTML elements dynamically. This becomes a problem when we parse HTML fetched directly from a remote website: the dynamic content is unavailable to our program because the client-side code hasn&#8217;t been executed.<\/p>\n<p><strong>If we need to parse dynamically rendered HTML content, we can use a browser automation tool like Selenium WebDriver<\/strong>. This works because we will be using an actual browser to retrieve the HTML page. A real browser like Chrome is capable of executing any client-side code present on the page, thus generating all the dynamic content.<\/p>\n<p>We can easily find <a href=\"https:\/\/code-maze.com\/selenium-aspnet-core-ui-tests\/\" target=\"_blank\" rel=\"noopener\">resources to learn how to work with Selenium WebDriver<\/a> to load a remote website. 
Once done, we can use the content loaded in the driver&#8217;s <code>PageSource<\/code> property:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"csharp\" data-enlighter-highlight=\"9\">var options = new ChromeOptions();\r\noptions.AddArguments(\"headless\");\r\n\r\nusing (var driver = new ChromeDriver(options))\r\n{\r\n    driver.Navigate().GoToUrl(\"https:\/\/code-maze.com\/\");\r\n\r\n    var doc = new HtmlDocument();\r\n    doc.LoadHtml(driver.PageSource);\r\n\r\n    var node = doc.DocumentNode.SelectSingleNode(\"\/\/head\/title\");\r\n\r\n    Assert.Equal(\"Code Maze - C#, .NET and Web Development Tutorials\", node.InnerHtml);\r\n}<\/pre>\n<h2>Structure of HtmlDocument<\/h2>\n<p><strong>Inside an<\/strong> <code>HtmlDocument<\/code> <strong>instance, there&#8217;s a tree of<\/strong> <code>HtmlNode<\/code> <strong>elements with a single root node<\/strong>. The root node can be accessed through the <code>DocumentNode<\/code> property.<\/p>\n<p>Each node has a <code>Name<\/code> property that matches the HTML tag it represents, such as <code>body<\/code> or <code>h2<\/code>. Content that is not an HTML tag is also represented by nodes, and their names start with a <code>#<\/code>. 
Examples of this are <code>#document<\/code>, <code>#comment<\/code> or <code>#text<\/code>:<\/p>\n<p><a href=\"https:\/\/code-maze.com\/wp-content\/uploads\/2022\/11\/HTMLnodes.drawio.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-76552\" src=\"https:\/\/code-maze.com\/wp-content\/uploads\/2022\/11\/HTMLnodes.drawio.png\" alt=\"html nodes\" width=\"641\" height=\"521\" srcset=\"https:\/\/code-maze.com\/wp-content\/uploads\/2022\/11\/HTMLnodes.drawio.png 641w, https:\/\/code-maze.com\/wp-content\/uploads\/2022\/11\/HTMLnodes.drawio-300x244.png 300w\" sizes=\"auto, (max-width: 641px) 100vw, 641px\" \/><\/a><\/p>\n<p>Each <code>HtmlNode<\/code> exposes the <code>SelectSingleNode()<\/code> and <code>SelectNodes()<\/code> methods to query the entire tree using XPath expressions.<\/p>\n<p><code>SelectSingleNode()<\/code> returns the first <code>HtmlNode<\/code> that matches the XPath expression, along with all its descendants. If no node matches, it returns <code>null<\/code>.<\/p>\n<p><code>SelectNodes()<\/code> returns an <code>HtmlNodeCollection<\/code> object containing all nodes that match the XPath expression, each with its descendants. 
Note that it also returns <code>null<\/code>, not an empty collection, when nothing matches.<\/p>\n<p>We will often use the <code>HtmlNode<\/code> properties <code>InnerHtml<\/code>, <code>InnerText<\/code>, and <code>OuterHtml<\/code> to access the node&#8217;s content.<\/p>\n<p>Finally, we can access neighboring nodes with the <code>ChildNodes<\/code>, <code>FirstChild<\/code>, and <code>ParentNode<\/code> properties, among others.<\/p>\n<h2>Using Selectors<\/h2>\n<p>Putting all this into practice, we can select all nodes of a specific name regardless of their position in the document tree using <code>\/\/<\/code>:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"csharp\" data-enlighter-highlight=\"4\">var doc = new HtmlDocument();\r\ndoc.Load(\"test.html\");\r\n\r\nvar nodes = 
doc.DocumentNode.SelectNodes(\"\/\/li\");\r\n\r\nAssert.Equal(\"Parser\", nodes[0].InnerHtml);\r\nAssert.Equal(\"Selectors\", nodes[1].InnerHtml);\r\nAssert.Equal(\"DOM management\", nodes[2].InnerHtml);<\/pre>\n<p>Here, we select all the <code>li<\/code> elements in the HTML file we used in a previous example without having to specify the exact path to the elements.<\/p>\n<p>Alternatively, we can use an expression to select a node by explicitly defining its position in the hierarchy using <code>\/<\/code>:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"csharp\">var node = doc.DocumentNode.SelectSingleNode(\"\/html\/body\/h2\");\r\n\r\nAssert.Equal(\"HTML Agility Pack\", node.InnerHtml);<\/pre>\n<p>To select nodes relative to the current node, we can use the dot (<code>.<\/code>) expression:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"csharp\" data-enlighter-highlight=\"2\">var body = doc.DocumentNode.SelectSingleNode(\"\/html\/body\");\r\nvar listItems = body.SelectNodes(\".\/ul\/li\");\r\n\r\nAssert.Equal(3, listItems.Count);<\/pre>\n<h3>Attribute Selectors<\/h3>\n<p><strong>We can also select nodes based on their attributes<\/strong>, such as <code>class<\/code> or <code>id<\/code>. 
This is done using square bracket syntax:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"csharp\">var node = doc.DocumentNode.SelectSingleNode(\"\/\/p[@id='second']\");\r\n\r\nAssert.Equal(\"HTML Agility Pack is a popular web scraping tool.\", node.InnerHtml);<\/pre>\n<h3>Collections<\/h3>\n<p><strong>XPath expressions can select specific items in a collection by their one-based position<\/strong> or by using functions like <code>position()<\/code> or <code>last()<\/code>:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"csharp\">var firstParagraph = doc.DocumentNode.SelectSingleNode(\"\/\/p[1]\");\r\nvar lastParagraph = doc.DocumentNode.SelectSingleNode(\"\/\/p[last()]\");\r\n\r\nAssert.Equal(\"Programming is really &lt;i&gt;easy&lt;\/i&gt;.\", firstParagraph.InnerHtml);\r\nAssert.Equal(\"Features:\", lastParagraph.InnerHtml);<\/pre>\n<h2>HTML Manipulation<\/h2>\n<p>Once we have an <code>HtmlDocument<\/code> object, <strong>we can change the structure of the underlying HTML<\/strong> using a collection of methods that work with document nodes. 
<strong>We can manipulate a document by adding and removing nodes as well as changing their content or even their attributes<\/strong>:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"csharp\" data-enlighter-highlight=\"6\">var dom = new HtmlDocument();\r\ndom.Load(\"test.html\");\r\n\r\nvar list = dom.DocumentNode.SelectSingleNode(\"\/\/ul\");\r\n\r\nlist.ChildNodes.Add(HtmlNode.CreateNode(\"&lt;li&gt;Added dynamically&lt;\/li&gt;\"));\r\n\r\nAssert.Equal(@\"&lt;ul&gt;\r\n                  &lt;li&gt;Parser&lt;\/li&gt;\r\n                  &lt;li&gt;Selectors&lt;\/li&gt;\r\n                  &lt;li&gt;DOM management&lt;\/li&gt;\r\n                  &lt;li&gt;Added dynamically&lt;\/li&gt;&lt;\/ul&gt;\", list.OuterHtml);<\/pre>\n<p style=\"text-align: left;\">Here we select a node in our <code>HtmlDocument<\/code> corresponding to the unordered list <code>ul<\/code> that originally contains three list items. Then, we add a newly created <code>HtmlNode<\/code> to the <code>ChildNodes<\/code> collection property of the selected node. 
Once done, we can inspect the <code>OuterHtml<\/code> property of the <code>ul<\/code> node and see how the new list item node has been added to the document.<\/p>\n<p>Similarly, <strong>we can remove HTML nodes from a document<\/strong>:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"csharp\" data-enlighter-highlight=\"3\">var list = dom.DocumentNode.SelectSingleNode(\"\/\/ul\");\r\n\r\nlist.RemoveChild(list.SelectNodes(\"li\").First());\r\n\r\nAssert.Equal(@\"&lt;ul&gt;\r\n    \r\n                 &lt;li&gt;Selectors&lt;\/li&gt;\r\n                 &lt;li&gt;DOM management&lt;\/li&gt;\r\n              &lt;\/ul&gt;\", list.OuterHtml);<\/pre>\n<p>In this case, starting from the same unordered list, we remove the first list item by calling the <code>RemoveChild()<\/code> method on the previously selected <code>HtmlNode<\/code>.<\/p>\n<p>Likewise, <strong>we can alter existing nodes<\/strong> using properties exposed by the <code>HtmlNode<\/code> object:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"csharp\" data-enlighter-highlight=\"5,6\">var list = dom.DocumentNode.SelectSingleNode(\"\/\/ul\");\r\n\r\nforeach (var node in list.ChildNodes.Where(x =&gt; x.Name == \"li\"))\r\n{\r\n    node.FirstChild.InnerHtml = \"List Item Text\";\r\n    node.Attributes.Append(\"class\", \"list-item\");\r\n}\r\n\r\nAssert.Equal(@\"&lt;ul&gt;\r\n    &lt;li class=\"\"list-item\"\"&gt;List Item Text&lt;\/li&gt;\r\n    &lt;li class=\"\"list-item\"\"&gt;List Item Text&lt;\/li&gt;\r\n    &lt;li class=\"\"list-item\"\"&gt;List Item Text&lt;\/li&gt;\r\n&lt;\/ul&gt;\", list.OuterHtml);<\/pre>\n<p>Starting with the same unordered list, we replace the inner text of each item in the list and append a <code>class<\/code> attribute using <code>Attributes.Append()<\/code>.<\/p>\n<h2>Writing Out HTML<\/h2>\n<p>Often, we need to write HTML to a file after working with it. 
We can use the <code>Save()<\/code> method of the <code>HtmlDocument<\/code> class to do so. This method will <strong>save all the nodes in the document to a file, including all the changes we have made using the manipulation API<\/strong>:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"csharp\" data-enlighter-highlight=\"4,5\">var dom = new HtmlDocument();\r\ndom.Load(\"test.html\");\r\n\r\nusing var textWriter = File.CreateText(\"test_out.html\");\r\ndom.Save(textWriter);<\/pre>\n<p>Equally important is writing <strong>out only part of a document<\/strong>, usually the nodes under a specific known node. The <code>HtmlNode<\/code> class exposes the <code>WriteTo()<\/code> method, which writes the current node along with all its descendants, and the <code>WriteContentTo()<\/code> method, which outputs only its children:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"csharp\" data-enlighter-highlight=\"3,8\">using (var textWriter = File.CreateText(\"list.html\"))\r\n{\r\n    list.WriteTo(textWriter);\r\n}\r\n\r\nusing (var textWriter = File.CreateText(\"items_only.html\"))\r\n{\r\n    list.WriteContentTo(textWriter);\r\n}\r\n\r\nAssert.Equal(\r\n@\"&lt;ul&gt;\r\n    &lt;li&gt;Parser&lt;\/li&gt;\r\n    &lt;li&gt;Selectors&lt;\/li&gt;\r\n    &lt;li&gt;DOM management&lt;\/li&gt;\r\n&lt;\/ul&gt;\", File.ReadAllText(\"list.html\"));\r\n\r\nAssert.Equal(\r\n@\"\r\n    &lt;li&gt;Parser&lt;\/li&gt;\r\n    &lt;li&gt;Selectors&lt;\/li&gt;\r\n    &lt;li&gt;DOM management&lt;\/li&gt;\r\n\", File.ReadAllText(\"items_only.html\"));<\/pre>\n<h2>Traversing the DOM<\/h2>\n<p>There are several properties and methods that allow us to conveniently navigate the tree of nodes that make up the document.<\/p>\n<p><code>HtmlNode<\/code>&#8217;s properties <code>ParentNode<\/code>, <code>ChildNodes<\/code>, <code>NextSibling<\/code>, and others let us access neighboring nodes in the document&#8217;s hierarchy. 
<strong>We can use these properties to traverse the node tree one node at a time<\/strong>. To traverse the entire document, it may be a good idea to use <a href=\"https:\/\/code-maze.com\/csharp-basics-recursion\/\" target=\"_blank\" rel=\"noopener\">recursion<\/a>:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"csharp\" data-enlighter-highlight=\"9\">var toc = new List&lt;HtmlNode&gt;();\r\nvar headerTags = new string[] { \"h1\", \"h2\", \"h3\", \"h4\", \"h5\", \"h6\" };\r\n\r\nvoid VisitNodesRecursively(HtmlNode node)\r\n{\r\n    if (headerTags.Contains(node.Name))\r\n        toc.Add(node);\r\n\r\n    foreach (var child in node.ChildNodes)\r\n        VisitNodesRecursively(child);\r\n}\r\n\r\nVisitNodesRecursively(dom.DocumentNode);\r\n\r\n\/\/ extracted nodes:\r\n\/\/ h1 -&gt; Learn To Code in C#\r\n\/\/ h2 -&gt; HTML Agility Pack<\/pre>\n<p>Here, we traverse all nodes in document order and save all the headers we find along the way in the <code>toc<\/code> collection to build a table of contents for the document. We use the <code>ChildNodes<\/code> property to recursively process all nodes.<\/p>\n<p>On the other hand, methods like <code>Descendants()<\/code>, <code>DescendantsAndSelf()<\/code>, <code>Ancestors()<\/code>, and <code>AncestorsAndSelf()<\/code> <strong>return a flat list of nodes relative to the node we call the method on<\/strong>:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"csharp\">var groups = dom.DocumentNode.DescendantsAndSelf()\r\n    .Where(n =&gt; !n.Name.StartsWith(\"#\"))\r\n    .GroupBy(n =&gt; n.Name);\r\n            \r\nforeach (var group in groups)\r\n    Console.WriteLine($\"Tag '{group.Key}' found {group.Count()} times.\");<\/pre>\n<p>Here, we get the root node and all its descendants, skip the non-tag nodes whose names start with <code>#<\/code>, and group the rest by tag name. Finally, we count the occurrences of each tag used in the document. 
If we apply this to the example HTML that we&#8217;ve used before, the output should look like this:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"raw\">Tag 'html' found 1 times.\r\nTag 'body' found 1 times.\r\nTag 'h1' found 1 times.\r\nTag 'p' found 3 times.\r\nTag 'i' found 1 times.\r\nTag 'h2' found 1 times.\r\nTag 'ul' found 1 times.\r\nTag 'li' found 3 times.<\/pre>\n<h2>Third-Party Libraries<\/h2>\n<p><strong>There are some packages that, despite being external to HTML Agility Pack, work on top of it to provide additional features.<\/strong><\/p>\n<p><a href=\"https:\/\/github.com\/atifaziz\/Hazz\" target=\"_blank\" rel=\"nofollow noopener\">Hazz<\/a> adds W3C-style CSS selectors as an alternative to the XPath syntax that comes bundled with HTML Agility Pack. These are jQuery-style selectors that we may prefer or know better than XPath.<\/p>\n<p><a href=\"https:\/\/github.com\/rflechner\/ScrapySharp\" target=\"_blank\" rel=\"nofollow noopener\">ScrapySharp<\/a> and <a href=\"https:\/\/github.com\/dotnetcore\/DotnetSpider\" target=\"_blank\" rel=\"nofollow noopener\">DotnetSpider<\/a> are higher-level web scraping frameworks that use HTML Agility Pack as their core HTML parsing engine.<\/p>\n<h2>Conclusion<\/h2>\n<p>In this article, we&#8217;ve learned what HTML Agility Pack is and how to work with it. 
We&#8217;ve also learned how to parse HTML from various sources and how to correctly parse websites that use client code to render dynamic content.<\/p>\n<p>Then, we talked about the structure of an HTML document, how to use selectors to query it, and how to read and manipulate the elements in an HTML document.<\/p>\n<p>Finally, we&#8217;ve seen some examples of how to traverse the entire document tree and learned about third-party libraries that work with HTML Agility Pack.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this article, we&#8217;re going to learn how to use HTML Agility Pack in C# and review some examples of its most important features. Let&#8217;s start. What Is HTML Agility Pack and How to Use It HTML Agility Pack is a tool to read, write and update HTML documents. It is commonly used for web [&hellip;]<\/p>\n","protected":false},"author":6,"featured_media":62189,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_et_pb_use_builder":"","_et_pb_old_content":"","_et_gb_content_width":"","footnotes":""},"categories":[12],"tags":[10,944,1480,1481],"class_list":["post-74727","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-csharp","tag-net","tag-net-core-client-libraries","tag-html","tag-html-agility-pack","et-has-post-format-content","et_post_format-et-post-format-standard"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v24.7 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>How to Use HTML Agility Pack in C# - Code Maze<\/title>\n<meta name=\"description\" content=\"HTML Agility Pack is a popular third-party library to read, write and update HTML documents programmatically in C#.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/code-maze.com\/html-agility-pack-csharp\/\" \/>\n<meta 
property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How to Use HTML Agility Pack in C# - Code Maze\" \/>\n<meta property=\"og:description\" content=\"HTML Agility Pack is a popular third-party library to read, write and update HTML documents programmatically in C#.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/code-maze.com\/html-agility-pack-csharp\/\" \/>\n<meta property=\"og:site_name\" content=\"Code Maze\" \/>\n<meta property=\"article:published_time\" content=\"2022-11-08T07:00:56+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-12-13T17:41:00+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/code-maze.com\/wp-content\/uploads\/2021\/12\/social-csharp.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1100\" \/>\n\t<meta property=\"og:image:height\" content=\"620\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Code Maze\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@https:\/\/twitter.com\/CodeMazeBlog\" \/>\n<meta name=\"twitter:site\" content=\"@CodeMazeBlog\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Code Maze\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"9 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":[\"Article\",\"BlogPosting\"],\"@id\":\"https:\/\/code-maze.com\/html-agility-pack-csharp\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/code-maze.com\/html-agility-pack-csharp\/\"},\"author\":{\"name\":\"Code Maze\",\"@id\":\"https:\/\/code-maze.com\/#\/schema\/person\/09d29b223012c8e94a68ba62861d0b04\"},\"headline\":\"How to Use HTML Agility Pack in C#\",\"datePublished\":\"2022-11-08T07:00:56+00:00\",\"dateModified\":\"2022-12-13T17:41:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/code-maze.com\/html-agility-pack-csharp\/\"},\"wordCount\":1371,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/code-maze.com\/#organization\"},\"image\":{\"@id\":\"https:\/\/code-maze.com\/html-agility-pack-csharp\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/code-maze.com\/wp-content\/uploads\/2021\/12\/social-csharp.png\",\"keywords\":[\".NET\",\".NET Core client libraries\",\"HTML\",\"HTML Agility Pack\"],\"articleSection\":[\"C#\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/code-maze.com\/html-agility-pack-csharp\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/code-maze.com\/html-agility-pack-csharp\/\",\"url\":\"https:\/\/code-maze.com\/html-agility-pack-csharp\/\",\"name\":\"How to Use HTML Agility Pack in C# - Code 
Maze\",\"isPartOf\":{\"@id\":\"https:\/\/code-maze.com\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/code-maze.com\/html-agility-pack-csharp\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/code-maze.com\/html-agility-pack-csharp\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/code-maze.com\/wp-content\/uploads\/2021\/12\/social-csharp.png\",\"datePublished\":\"2022-11-08T07:00:56+00:00\",\"dateModified\":\"2022-12-13T17:41:00+00:00\",\"description\":\"HTML Agility Pack is a popular third-party library to read, write and update HTML documents programmatically in C#.\",\"breadcrumb\":{\"@id\":\"https:\/\/code-maze.com\/html-agility-pack-csharp\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/code-maze.com\/html-agility-pack-csharp\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/code-maze.com\/html-agility-pack-csharp\/#primaryimage\",\"url\":\"https:\/\/code-maze.com\/wp-content\/uploads\/2021\/12\/social-csharp.png\",\"contentUrl\":\"https:\/\/code-maze.com\/wp-content\/uploads\/2021\/12\/social-csharp.png\",\"width\":1100,\"height\":620,\"caption\":\"C# Development\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/code-maze.com\/html-agility-pack-csharp\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/code-maze.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"How to Use HTML Agility Pack in C#\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/code-maze.com\/#website\",\"url\":\"https:\/\/code-maze.com\/\",\"name\":\"Code Maze\",\"description\":\"Learn. Code. 
Succeed.\",\"publisher\":{\"@id\":\"https:\/\/code-maze.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/code-maze.com\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/code-maze.com\/#organization\",\"name\":\"Code Maze\",\"url\":\"https:\/\/code-maze.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/code-maze.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/code-maze.com\/wp-content\/uploads\/2020\/01\/Code-Maze-Only-Logo-Transparent-HRez.png\",\"contentUrl\":\"https:\/\/code-maze.com\/wp-content\/uploads\/2020\/01\/Code-Maze-Only-Logo-Transparent-HRez.png\",\"width\":3511,\"height\":3510,\"caption\":\"Code Maze\"},\"image\":{\"@id\":\"https:\/\/code-maze.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/CodeMazeBlog\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/code-maze.com\/#\/schema\/person\/09d29b223012c8e94a68ba62861d0b04\",\"name\":\"Code Maze\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/code-maze.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/code-maze.com\/wp-content\/uploads\/2020\/01\/Code-Maze-Only-Logo-Transparent-HRez-150x150.png\",\"contentUrl\":\"https:\/\/code-maze.com\/wp-content\/uploads\/2020\/01\/Code-Maze-Only-Logo-Transparent-HRez-150x150.png\",\"caption\":\"Code Maze\"},\"description\":\"This is the standard author on the site. 
Most articles are published by individual authors, with their profiles, but when several authors have contributed, we publish collectively as a part of this profile.\",\"sameAs\":[\"https:\/\/www.linkedin.com\/company\/codemaze\/\",\"https:\/\/x.com\/https:\/\/twitter.com\/CodeMazeBlog\"],\"url\":\"https:\/\/code-maze.com\/author\/codemazecontributor\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"How to Use HTML Agility Pack in C# - Code Maze","description":"HTML Agility Pack is a popular third-party library to read, write and update HTML documents programmatically in C#.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/code-maze.com\/html-agility-pack-csharp\/","og_locale":"en_US","og_type":"article","og_title":"How to Use HTML Agility Pack in C# - Code Maze","og_description":"HTML Agility Pack is a popular third-party library to read, write and update HTML documents programmatically in C#.","og_url":"https:\/\/code-maze.com\/html-agility-pack-csharp\/","og_site_name":"Code Maze","article_published_time":"2022-11-08T07:00:56+00:00","article_modified_time":"2022-12-13T17:41:00+00:00","og_image":[{"width":1100,"height":620,"url":"https:\/\/code-maze.com\/wp-content\/uploads\/2021\/12\/social-csharp.png","type":"image\/png"}],"author":"Code Maze","twitter_card":"summary_large_image","twitter_creator":"@https:\/\/twitter.com\/CodeMazeBlog","twitter_site":"@CodeMazeBlog","twitter_misc":{"Written by":"Code Maze","Est. 
reading time":"9 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":["Article","BlogPosting"],"@id":"https:\/\/code-maze.com\/html-agility-pack-csharp\/#article","isPartOf":{"@id":"https:\/\/code-maze.com\/html-agility-pack-csharp\/"},"author":{"name":"Code Maze","@id":"https:\/\/code-maze.com\/#\/schema\/person\/09d29b223012c8e94a68ba62861d0b04"},"headline":"How to Use HTML Agility Pack in C#","datePublished":"2022-11-08T07:00:56+00:00","dateModified":"2022-12-13T17:41:00+00:00","mainEntityOfPage":{"@id":"https:\/\/code-maze.com\/html-agility-pack-csharp\/"},"wordCount":1371,"commentCount":0,"publisher":{"@id":"https:\/\/code-maze.com\/#organization"},"image":{"@id":"https:\/\/code-maze.com\/html-agility-pack-csharp\/#primaryimage"},"thumbnailUrl":"https:\/\/code-maze.com\/wp-content\/uploads\/2021\/12\/social-csharp.png","keywords":[".NET",".NET Core client libraries","HTML","HTML Agility Pack"],"articleSection":["C#"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/code-maze.com\/html-agility-pack-csharp\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/code-maze.com\/html-agility-pack-csharp\/","url":"https:\/\/code-maze.com\/html-agility-pack-csharp\/","name":"How to Use HTML Agility Pack in C# - Code Maze","isPartOf":{"@id":"https:\/\/code-maze.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/code-maze.com\/html-agility-pack-csharp\/#primaryimage"},"image":{"@id":"https:\/\/code-maze.com\/html-agility-pack-csharp\/#primaryimage"},"thumbnailUrl":"https:\/\/code-maze.com\/wp-content\/uploads\/2021\/12\/social-csharp.png","datePublished":"2022-11-08T07:00:56+00:00","dateModified":"2022-12-13T17:41:00+00:00","description":"HTML Agility Pack is a popular third-party library to read, write and update HTML documents programmatically in 
C#.","breadcrumb":{"@id":"https:\/\/code-maze.com\/html-agility-pack-csharp\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/code-maze.com\/html-agility-pack-csharp\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/code-maze.com\/html-agility-pack-csharp\/#primaryimage","url":"https:\/\/code-maze.com\/wp-content\/uploads\/2021\/12\/social-csharp.png","contentUrl":"https:\/\/code-maze.com\/wp-content\/uploads\/2021\/12\/social-csharp.png","width":1100,"height":620,"caption":"C# Development"},{"@type":"BreadcrumbList","@id":"https:\/\/code-maze.com\/html-agility-pack-csharp\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/code-maze.com\/"},{"@type":"ListItem","position":2,"name":"How to Use HTML Agility Pack in C#"}]},{"@type":"WebSite","@id":"https:\/\/code-maze.com\/#website","url":"https:\/\/code-maze.com\/","name":"Code Maze","description":"Learn. Code. Succeed.","publisher":{"@id":"https:\/\/code-maze.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/code-maze.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/code-maze.com\/#organization","name":"Code Maze","url":"https:\/\/code-maze.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/code-maze.com\/#\/schema\/logo\/image\/","url":"https:\/\/code-maze.com\/wp-content\/uploads\/2020\/01\/Code-Maze-Only-Logo-Transparent-HRez.png","contentUrl":"https:\/\/code-maze.com\/wp-content\/uploads\/2020\/01\/Code-Maze-Only-Logo-Transparent-HRez.png","width":3511,"height":3510,"caption":"Code 
Maze"},"image":{"@id":"https:\/\/code-maze.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/CodeMazeBlog"]},{"@type":"Person","@id":"https:\/\/code-maze.com\/#\/schema\/person\/09d29b223012c8e94a68ba62861d0b04","name":"Code Maze","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/code-maze.com\/#\/schema\/person\/image\/","url":"https:\/\/code-maze.com\/wp-content\/uploads\/2020\/01\/Code-Maze-Only-Logo-Transparent-HRez-150x150.png","contentUrl":"https:\/\/code-maze.com\/wp-content\/uploads\/2020\/01\/Code-Maze-Only-Logo-Transparent-HRez-150x150.png","caption":"Code Maze"},"description":"This is the standard author on the site. Most articles are published by individual authors, with their profiles, but when several authors have contributed, we publish collectively as a part of this profile.","sameAs":["https:\/\/www.linkedin.com\/company\/codemaze\/","https:\/\/x.com\/https:\/\/twitter.com\/CodeMazeBlog"],"url":"https:\/\/code-maze.com\/author\/codemazecontributor\/"}]}},"_links":{"self":[{"href":"https:\/\/code-maze.com\/wp-json\/wp\/v2\/posts\/74727","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/code-maze.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/code-maze.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/code-maze.com\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/code-maze.com\/wp-json\/wp\/v2\/comments?post=74727"}],"version-history":[{"count":5,"href":"https:\/\/code-maze.com\/wp-json\/wp\/v2\/posts\/74727\/revisions"}],"predecessor-version":[{"id":76805,"href":"https:\/\/code-maze.com\/wp-json\/wp\/v2\/posts\/74727\/revisions\/76805"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/code-maze.com\/wp-json\/wp\/v2\/media\/62189"}],"wp:attachment":[{"href":"https:\/\/code-maze.com\/wp-json\/wp\/v2\/media?parent=74727"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/code-maze.com\/wp-json\/wp\/v2\/ca
tegories?post=74727"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/code-maze.com\/wp-json\/wp\/v2\/tags?post=74727"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}