{"id":17350,"date":"2013-09-17T13:00:07","date_gmt":"2013-09-17T10:00:07","guid":{"rendered":"http:\/\/www.javacodegeeks.com\/?p=17350"},"modified":"2013-09-17T09:40:17","modified_gmt":"2013-09-17T06:40:17","slug":"clojure-all-things-regex","status":"publish","type":"post","link":"https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html","title":{"rendered":"Clojure: All things regex"},"content":{"rendered":"<p>I\u2019ve been doing some <a href=\"http:\/\/www.markhneedham.com\/blog\/2013\/08\/26\/clojureenlive-screen-scraping-a-html-file-from-disk\/\">scraping of web pages recently using Clojure and Enlive<\/a> and as part of that I\u2019ve had to write regular expressions to extract the data I\u2019m interested in.<\/p>\n<p>On my travels I\u2019ve come across a few different functions and I\u2019m never sure which is the right one to use, so I thought I\u2019d document what I\u2019ve tried for future me.<\/p>\n<h2>Check if regex matches<\/h2>\n<p>The first regex I wrote was while scraping the <a href=\"http:\/\/www.rsssf.com\/ec\/ec200203det.html\">Champions League results<\/a> from the Rec.Sport.Soccer Statistics Foundation, where I wanted to determine which spans contained the match result and which didn\u2019t.<\/p>\n<p>A matching line would look like this:<\/p>\n<pre class=\" brush:java\">Real Madrid-Juventus Turijn 2 - 1<\/pre>\n<p>And a non-matching one like this:<\/p>\n<pre class=\" brush:java\">53\u2019Nedved 0-1, 66'Xavi Hern\u00e1ndez 1-1, 114\u2019Zalayeta 1-2<\/pre>\n<p>I wrote the following regex to detect match results:<\/p>\n<pre class=\" brush:java\">[a-zA-Z\\s]+-[a-zA-Z\\s]+ [0-9][\\s]?.[\\s]?[0-9]<\/pre>\n<p>I then wrote the following function, built on <cite><a href=\"http:\/\/clojuredocs.org\/clojure_core\/clojure.core\/re-matches\">re-matches<\/a><\/cite>, which returns true or false depending on the input:<\/p>\n<pre class=\" brush:java\">(defn recognise-match? [row]\r\n  (not (clojure.string\/blank? 
(re-matches #\"[a-zA-Z\\s]+-[a-zA-Z\\s]+ [0-9][\\s]?.[\\s]?[0-9]\" row))))<\/pre>\n<pre class=\" brush:java\">&gt; (recognise-match? \"Real Madrid-Juventus Turijn 2 - 1\")\r\ntrue\r\n&gt; (recognise-match? \"53\u2019Nedved 0-1, 66'Xavi Hern\u00e1ndez 1-1, 114\u2019Zalayeta 1-2\")\r\nfalse<\/pre>\n<p><cite>re-matches<\/cite> only returns matches if the whole string matches the pattern which means if we had a line with some spurious text after the score it wouldn\u2019t match:<\/p>\n<pre class=\" brush:java\">&gt; (recognise-match? \"Real Madrid-Juventus Turijn 2 - 1 abc\")\r\nfalse<\/pre>\n<p>If we don\u2019t mind that and we just want some part of the string to match our pattern then we can use <cite><a href=\"http:\/\/clojuredocs.org\/clojure_core\/clojure.core\/re-find\">re-find<\/a><\/cite> instead:<\/p>\n<pre class=\" brush:java\">(defn recognise-match? [row]\r\n  (not (clojure.string\/blank? (re-find #\"[a-zA-Z\\s]+-[a-zA-Z\\s]+ [0-9][\\s]?.[\\s]?[0-9]\" row))))<\/pre>\n<pre class=\" brush:java\">&gt; (recognise-match? 
\"Real Madrid-Juventus Turijn 2 - 1 abc\")\r\ntrue<\/pre>\n<h2>Extract capture groups<\/h2>\n<p>The next thing I wanted to do was to capture the teams and the score of the match, which I initially did using <cite><a href=\"http:\/\/clojuredocs.org\/clojure_core\/clojure.core\/re-seq\">re-seq<\/a><\/cite>:<\/p>\n<pre class=\" brush:java\">&gt; (first (re-seq #\"([a-zA-Z\\s]+)-([a-zA-Z\\s]+) ([0-9])[\\s]?.[\\s]?([0-9])\" \"FC Valencia-Internazionale Milaan 2 - 1\"))\r\n[\"FC Valencia-Internazionale Milaan 2 - 1\" \"FC Valencia\" \"Internazionale Milaan\" \"2\" \"1\"]<\/pre>\n<p>I then extracted the various parts like so:<\/p>\n<pre class=\" brush:java\">&gt; (def result (first (re-seq #\"([a-zA-Z\\s]+)-([a-zA-Z\\s]+) ([0-9])[\\s]?.[\\s]?([0-9])\" \"FC Valencia-Internazionale Milaan 2 - 1\")))\r\n\r\n&gt; result\r\n[\"FC Valencia-Internazionale Milaan 2 - 1\" \"FC Valencia\" \"Internazionale Milaan\" \"2\" \"1\"]\r\n\r\n&gt; (nth result 1)\r\n\"FC Valencia\"\r\n\r\n&gt; (nth result 2)\r\n\"Internazionale Milaan\"<\/pre>\n<p><cite>re-seq<\/cite> returns a lazy sequence of the successive matches of the regex. Each element is a string if the pattern has no capture groups; otherwise it is a vector containing the whole match followed by each of the capture groups.<\/p>\n<p>For example, if we now match only runs of letters or whitespace and drop the rest of the pattern from above, we\u2019d get the following results:<\/p>\n<pre class=\" brush:java\">&gt; (re-seq #\"([a-zA-Z\\s]+)\" \"FC Valencia-Internazionale Milaan 2 - 1\")\r\n([\"FC Valencia\" \"FC Valencia\"] [\"Internazionale Milaan \" \"Internazionale Milaan \"] [\" \" \" \"] [\" \" \" \"])\r\n\r\n&gt; (re-seq #\"[a-zA-Z\\s]+\" \"FC Valencia-Internazionale Milaan 2 - 1\")\r\n(\"FC Valencia\" \"Internazionale Milaan \" \" \" \" \")<\/pre>\n<p>In our case <cite>re-find<\/cite> or <cite>re-matches<\/cite> actually makes more sense since we only want to match the pattern once. 
If there are further matches after this, they aren\u2019t included in the results, e.g.<\/p>\n<pre class=\" brush:java\">&gt; (re-find #\"[a-zA-Z\\s]+\" \"FC Valencia-Internazionale Milaan 2 - 1\")\r\n\"FC Valencia\"\r\n\r\n&gt; (re-matches #\"[a-zA-Z\\s]*\" \"FC Valencia-Internazionale Milaan 2 - 1\")\r\nnil<\/pre>\n<p><cite>re-matches<\/cite> returns nil here because the string contains characters that the pattern doesn\u2019t allow, namely the hyphens and the digits.<\/p>\n<p>If we tie that in with our capture groups we end up with the following:<\/p>\n<pre class=\" brush:java\">&gt; (def result \r\n    (re-find #\"([a-zA-Z\\s]+)-([a-zA-Z\\s]+) ([0-9])[\\s]?.[\\s]?([0-9])\" \"FC Valencia-Internazionale Milaan 2 - 1\"))\r\n\r\n&gt; result\r\n[\"FC Valencia-Internazionale Milaan 2 - 1\" \"FC Valencia\" \"Internazionale Milaan\" \"2\" \"1\"]\r\n\r\n&gt; (nth result 1)\r\n\"FC Valencia\"\r\n\r\n&gt; (nth result 2)\r\n\"Internazionale Milaan\"<\/pre>\n<p>I also came across the <cite><a href=\"http:\/\/clojuredocs.org\/clojure_core\/clojure.core\/re-pattern\">re-pattern<\/a><\/cite> function, which builds a pattern from an ordinary string and so provides a more verbose way of creating a pattern and then evaluating it with <cite>re-find<\/cite>:<\/p>\n<pre class=\" brush:java\">&gt; (re-find (re-pattern \"([a-zA-Z\\\\s]+)-([a-zA-Z\\\\s]+) ([0-9])[\\\\s]?.[\\\\s]?([0-9])\") \"FC Valencia-Internazionale Milaan 2 - 1\")\r\n[\"FC Valencia-Internazionale Milaan 2 - 1\" \"FC Valencia\" \"Internazionale Milaan\" \"2\" \"1\"]<\/pre>\n<p>One difference here is that I had to escape the backslash in the special sequence \u2018\\s\u2019, because the argument is an ordinary string literal rather than a regex literal; otherwise I was getting the following exception:<\/p>\n<pre class=\" brush:java\">RuntimeException Unsupported escape character: \\s  clojure.lang.Util.runtimeException (Util.java:170)<\/pre>\n<p>I wanted to play around with <cite><a href=\"http:\/\/clojuredocs.org\/clojure_core\/clojure.core\/re-groups\">re-groups<\/a><\/cite> as well but that seemed to throw an exception reasonably frequently when I 
expected it to work.<\/p>\n<p>The last function I looked at was <cite><a href=\"http:\/\/clojuredocs.org\/clojure_core\/clojure.core\/re-matcher\">re-matcher<\/a><\/cite>, which takes a pattern and a string and returns a java.util.regex.Matcher. Passing that matcher to <cite>re-find<\/cite> gives the same result as calling <cite>re-find<\/cite> with the \u2018#\"\"\u2019 pattern and the string directly:<\/p>\n<pre class=\" brush:java\">&gt; (re-find (re-matcher #\"([a-zA-Z\\s]+)-([a-zA-Z\\s]+) ([0-9])[\\s]?.[\\s]?([0-9])\" \"FC Valencia-Internazionale Milaan 2 - 1\"))\r\n[\"FC Valencia-Internazionale Milaan 2 - 1\" \"FC Valencia\" \"Internazionale Milaan\" \"2\" \"1\"]<\/pre>\n<h2>In summary<\/h2>\n<p>So in summary, I think most use cases are covered by <cite>re-find<\/cite> and <cite>re-matches<\/cite>, and maybe <cite>re-seq<\/cite> on special occasions. I couldn\u2019t see where I\u2019d use the other functions but I\u2019m happy to be proved wrong.<\/p>\n<div style=\"border: 1px solid #D8D8D8; background: #FAFAFA; width: 100%; padding-left: 5px;\"><b><i>Reference: <\/i><\/b><a href=\"http:\/\/www.markhneedham.com\/blog\/2013\/09\/14\/clojure-all-things-regex\/\">Clojure: All things regex<\/a> from our <a href=\"http:\/\/www.javacodegeeks.com\/jcg\">JCG partner<\/a> Mark Needham at the <a href=\"http:\/\/www.markhneedham.com\/blog\/\">Mark Needham Blog<\/a>.<\/div>\n","protected":false},"excerpt":{"rendered":"<p>I\u2019ve been doing some scraping of web pages recently using Clojure and Enlive and as part of that I\u2019ve had to write regular expressions to extract the data I\u2019m interested in. 
On my travels I\u2019ve come across a few different functions and I\u2019m never sure which is the right one to use so I thought &hellip;<\/p>\n","protected":false},"author":134,"featured_media":93,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[22],"tags":[],"class_list":["post-17350","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-clojure"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.5 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Clojure: All things regex<\/title>\n<meta name=\"description\" content=\"I\u2019ve been doing some scrapping of web pages recently using Clojure and Enlive and as part of that I\u2019ve had to write regular expressions to extract the\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Clojure: All things regex\" \/>\n<meta property=\"og:description\" content=\"I\u2019ve been doing some scrapping of web pages recently using Clojure and Enlive and as part of that I\u2019ve had to write regular expressions to extract the\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html\" \/>\n<meta property=\"og:site_name\" content=\"Java Code Geeks\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/javacodegeeks\" \/>\n<meta property=\"article:published_time\" content=\"2013-09-17T10:00:07+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/clojure-logo.jpg\" \/>\n\t<meta property=\"og:image:width\" 
content=\"150\" \/>\n\t<meta property=\"og:image:height\" content=\"150\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Mark Needham\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@javacodegeeks\" \/>\n<meta name=\"twitter:site\" content=\"@javacodegeeks\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Mark Needham\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2013\\\/09\\\/clojure-all-things-regex.html#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2013\\\/09\\\/clojure-all-things-regex.html\"},\"author\":{\"name\":\"Mark Needham\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#\\\/schema\\\/person\\\/fdb35381baa5059768d6788cfb685313\"},\"headline\":\"Clojure: All things 
regex\",\"datePublished\":\"2013-09-17T10:00:07+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2013\\\/09\\\/clojure-all-things-regex.html\"},\"wordCount\":532,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2013\\\/09\\\/clojure-all-things-regex.html#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2012\\\/10\\\/clojure-logo.jpg\",\"articleSection\":[\"Clojure\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.javacodegeeks.com\\\/2013\\\/09\\\/clojure-all-things-regex.html#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2013\\\/09\\\/clojure-all-things-regex.html\",\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2013\\\/09\\\/clojure-all-things-regex.html\",\"name\":\"Clojure: All things regex\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2013\\\/09\\\/clojure-all-things-regex.html#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2013\\\/09\\\/clojure-all-things-regex.html#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2012\\\/10\\\/clojure-logo.jpg\",\"datePublished\":\"2013-09-17T10:00:07+00:00\",\"description\":\"I\u2019ve been doing some scrapping of web pages recently using Clojure and Enlive and as part of that I\u2019ve had to write regular expressions to extract 
the\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2013\\\/09\\\/clojure-all-things-regex.html#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.javacodegeeks.com\\\/2013\\\/09\\\/clojure-all-things-regex.html\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2013\\\/09\\\/clojure-all-things-regex.html#primaryimage\",\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2012\\\/10\\\/clojure-logo.jpg\",\"contentUrl\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2012\\\/10\\\/clojure-logo.jpg\",\"width\":150,\"height\":150},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2013\\\/09\\\/clojure-all-things-regex.html#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.javacodegeeks.com\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"JVM Languages\",\"item\":\"https:\\\/\\\/www.javacodegeeks.com\\\/category\\\/jvm-languages\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Clojure\",\"item\":\"https:\\\/\\\/www.javacodegeeks.com\\\/category\\\/jvm-languages\\\/clojure\"},{\"@type\":\"ListItem\",\"position\":4,\"name\":\"Clojure: All things regex\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#website\",\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/\",\"name\":\"Java Code Geeks\",\"description\":\"Java Developers Resource 
Center\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#organization\"},\"alternateName\":\"JCG\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.javacodegeeks.com\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#organization\",\"name\":\"Exelixis Media P.C.\",\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2022\\\/06\\\/exelixis-logo.png\",\"contentUrl\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2022\\\/06\\\/exelixis-logo.png\",\"width\":864,\"height\":246,\"caption\":\"Exelixis Media P.C.\"},\"image\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/javacodegeeks\",\"https:\\\/\\\/x.com\\\/javacodegeeks\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#\\\/schema\\\/person\\\/fdb35381baa5059768d6788cfb685313\",\"name\":\"Mark Needham\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/5489baed26ce2d932bf951ecfb47afe80bec45d3648c23521d87c83b8f1c3ea9?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/5489baed26ce2d932bf951ecfb47afe80bec45d3648c23521d87c83b8f1c3ea9?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/5489baed26ce2d932bf951ecfb47afe80bec45d3648c23521d87c83b8f1c3ea9?s=96&d=mm&r=g\",\"caption\":\"Mark 
Needham\"},\"sameAs\":[\"http:\\\/\\\/www.markhneedham.com\\\/blog\\\/\"],\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/author\\\/Mark-Needham\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Clojure: All things regex","description":"I\u2019ve been doing some scrapping of web pages recently using Clojure and Enlive and as part of that I\u2019ve had to write regular expressions to extract the","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html","og_locale":"en_US","og_type":"article","og_title":"Clojure: All things regex","og_description":"I\u2019ve been doing some scrapping of web pages recently using Clojure and Enlive and as part of that I\u2019ve had to write regular expressions to extract the","og_url":"https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html","og_site_name":"Java Code Geeks","article_publisher":"https:\/\/www.facebook.com\/javacodegeeks","article_published_time":"2013-09-17T10:00:07+00:00","og_image":[{"width":150,"height":150,"url":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/clojure-logo.jpg","type":"image\/jpeg"}],"author":"Mark Needham","twitter_card":"summary_large_image","twitter_creator":"@javacodegeeks","twitter_site":"@javacodegeeks","twitter_misc":{"Written by":"Mark Needham","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html#article","isPartOf":{"@id":"https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html"},"author":{"name":"Mark Needham","@id":"https:\/\/www.javacodegeeks.com\/#\/schema\/person\/fdb35381baa5059768d6788cfb685313"},"headline":"Clojure: All things regex","datePublished":"2013-09-17T10:00:07+00:00","mainEntityOfPage":{"@id":"https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html"},"wordCount":532,"commentCount":0,"publisher":{"@id":"https:\/\/www.javacodegeeks.com\/#organization"},"image":{"@id":"https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html#primaryimage"},"thumbnailUrl":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/clojure-logo.jpg","articleSection":["Clojure"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html","url":"https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html","name":"Clojure: All things regex","isPartOf":{"@id":"https:\/\/www.javacodegeeks.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html#primaryimage"},"image":{"@id":"https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html#primaryimage"},"thumbnailUrl":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/clojure-logo.jpg","datePublished":"2013-09-17T10:00:07+00:00","description":"I\u2019ve been doing some scrapping of web pages recently using Clojure and Enlive and as part of that I\u2019ve had to write regular expressions to extract 
the","breadcrumb":{"@id":"https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html#primaryimage","url":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/clojure-logo.jpg","contentUrl":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/clojure-logo.jpg","width":150,"height":150},{"@type":"BreadcrumbList","@id":"https:\/\/www.javacodegeeks.com\/2013\/09\/clojure-all-things-regex.html#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.javacodegeeks.com\/"},{"@type":"ListItem","position":2,"name":"JVM Languages","item":"https:\/\/www.javacodegeeks.com\/category\/jvm-languages"},{"@type":"ListItem","position":3,"name":"Clojure","item":"https:\/\/www.javacodegeeks.com\/category\/jvm-languages\/clojure"},{"@type":"ListItem","position":4,"name":"Clojure: All things regex"}]},{"@type":"WebSite","@id":"https:\/\/www.javacodegeeks.com\/#website","url":"https:\/\/www.javacodegeeks.com\/","name":"Java Code Geeks","description":"Java Developers Resource Center","publisher":{"@id":"https:\/\/www.javacodegeeks.com\/#organization"},"alternateName":"JCG","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.javacodegeeks.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.javacodegeeks.com\/#organization","name":"Exelixis Media 
P.C.","url":"https:\/\/www.javacodegeeks.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.javacodegeeks.com\/#\/schema\/logo\/image\/","url":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2022\/06\/exelixis-logo.png","contentUrl":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2022\/06\/exelixis-logo.png","width":864,"height":246,"caption":"Exelixis Media P.C."},"image":{"@id":"https:\/\/www.javacodegeeks.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/javacodegeeks","https:\/\/x.com\/javacodegeeks"]},{"@type":"Person","@id":"https:\/\/www.javacodegeeks.com\/#\/schema\/person\/fdb35381baa5059768d6788cfb685313","name":"Mark Needham","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/5489baed26ce2d932bf951ecfb47afe80bec45d3648c23521d87c83b8f1c3ea9?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/5489baed26ce2d932bf951ecfb47afe80bec45d3648c23521d87c83b8f1c3ea9?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/5489baed26ce2d932bf951ecfb47afe80bec45d3648c23521d87c83b8f1c3ea9?s=96&d=mm&r=g","caption":"Mark 
Needham"},"sameAs":["http:\/\/www.markhneedham.com\/blog\/"],"url":"https:\/\/www.javacodegeeks.com\/author\/Mark-Needham"}]}},"_links":{"self":[{"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/posts\/17350","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/users\/134"}],"replies":[{"embeddable":true,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/comments?post=17350"}],"version-history":[{"count":0,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/posts\/17350\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/media\/93"}],"wp:attachment":[{"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/media?parent=17350"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/categories?post=17350"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/tags?post=17350"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}