{"id":3143,"date":"2021-11-29T11:07:10","date_gmt":"2021-11-29T11:07:10","guid":{"rendered":"https:\/\/www.pythontutorial.net\/?page_id=3143"},"modified":"2023-06-01T01:24:13","modified_gmt":"2023-06-01T01:24:13","slug":"python-regex-word-boundary","status":"publish","type":"page","link":"https:\/\/www.pythontutorial.net\/python-regex\/python-regex-word-boundary\/","title":{"rendered":"Python Regex Word Boundary"},"content":{"rendered":"\n<p><strong>Summary<\/strong>: in this tutorial, you&#8217;ll learn how to construct regular expressions that match word boundary positions in a string.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id='introduction-to-the-python-regex-word-boundary'>Introduction to the Python regex word boundary <a href=\"#introduction-to-the-python-regex-word-boundary\" class=\"anchor\" id=\"introduction-to-the-python-regex-word-boundary\" title=\"Anchor for Introduction to the Python regex word boundary\">#<\/a><\/h2>\n\n\n\n<p>A string has the following positions that qualify as word boundaries:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Before the first character in the string if the first character is a word character (<code>\\w<\/code>).<\/li>\n\n\n\n<li>Between two characters in the string if one of them is a word character (<code>\\w<\/code>) and the other is not (<code>\\W<\/code>, the inverse of the word character set <code>\\w<\/code>), in either order.<\/li>\n\n\n\n<li>After the last character in the string if the last character is a word character (<code>\\w<\/code>).<\/li>\n<\/ol>\n\n\n\n<p>The following picture shows the word boundary positions in the string <code>\"PYTHON 3!\"<\/code>:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" src=\"https:\/\/www.pythontutorial.net\/wp-content\/uploads\/2021\/11\/python-regex-word-boundary.svg\" alt=\"python regex word boundary\" class=\"wp-image-3146\"\/><\/figure>\n\n\n\n<p>In this example, the <code>\"PYTHON 3!\"<\/code> string has four word boundary 
positions:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Before the letter P (criterion #1)<\/li>\n\n\n\n<li>After the letter N (criterion #2)<\/li>\n\n\n\n<li>Before the digit 3 (criterion #2)<\/li>\n\n\n\n<li>After the digit 3 (criterion #2)<\/li>\n<\/ul>\n\n\n\n<p><a href=\"https:\/\/www.pythontutorial.net\/python-regex\/python-regular-expressions\/\">Regular expressions<\/a> use <code>\\b<\/code> to represent a word boundary. For example, you can match <code>word<\/code> as a whole word with the following pattern:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-1\" data-shcb-language-name=\"JavaScript\" data-shcb-language-slug=\"javascript\"><span><code class=\"hljs language-javascript\">r<span class=\"hljs-string\">'\\bword\\b'<\/span><\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-1\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">JavaScript<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">javascript<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>The following example searches for <code>Python<\/code> in a string without a word boundary:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-2\" data-shcb-language-name=\"JavaScript\" data-shcb-language-slug=\"javascript\"><span><code class=\"hljs language-javascript\"><span class=\"hljs-keyword\">import<\/span> re\n\ns = <span class=\"hljs-string\">'CPython is the implementation of Python in C'<\/span>\nmatches = re.finditer(<span class=\"hljs-string\">'Python'<\/span>, s)\n<span class=\"hljs-keyword\">for<\/span> match <span class=\"hljs-keyword\">in<\/span> matches:\n    print(match.group())<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-2\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">JavaScript<\/span> <span class=\"shcb-language__paren\">(<\/span><span 
class=\"shcb-language__slug\">javascript<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>It returns two matches: one inside the word <code>CPython<\/code> and another for the word <code>Python<\/code>.<\/p>\n\n\n<pre class=\"wp-block-code\"><span><code class=\"hljs\">Python\nPython<\/code><\/span><\/pre>\n\n\n<p>However, if you use the word boundary <code>\\b<\/code>, the program returns one match:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-3\" data-shcb-language-name=\"JavaScript\" data-shcb-language-slug=\"javascript\"><span><code class=\"hljs language-javascript\"><span class=\"hljs-keyword\">import<\/span> re\n\ns = <span class=\"hljs-string\">'CPython is the implementation of Python in C'<\/span>\nmatches = re.finditer(r<span class=\"hljs-string\">'\\bPython\\b'<\/span>, s)\n<span class=\"hljs-keyword\">for<\/span> match <span class=\"hljs-keyword\">in<\/span> matches:\n    print(match)\n<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-3\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">JavaScript<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">javascript<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>Output:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-4\" data-shcb-language-name=\"HTML, XML\" data-shcb-language-slug=\"xml\"><span><code class=\"hljs language-xml\"><span class=\"hljs-tag\">&lt;<span class=\"hljs-name\">re.Match<\/span> <span class=\"hljs-attr\">object<\/span>; <span class=\"hljs-attr\">span<\/span>=<span class=\"hljs-string\">(33,<\/span> <span class=\"hljs-attr\">39<\/span>), <span class=\"hljs-attr\">match<\/span>=<span class=\"hljs-string\">'Python'<\/span>&gt;<\/span><\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-4\"><span class=\"shcb-language__label\">Code language:<\/span> <span 
class=\"shcb-language__name\">HTML, XML<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">xml<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>In this example, the <code>'\\bPython\\b'<\/code> pattern matches the whole word <code>Python<\/code> in the string <code>'CPython is the implementation of Python in C'<\/code>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id='summary'>Summary <a href=\"#summary\" class=\"anchor\" id=\"summary\" title=\"Anchor for Summary\">#<\/a><\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The <code>\\b<\/code> represents a word boundary in a string.<\/li>\n\n\n\n<li>Use the <code>r'\\bword\\b'<\/code> pattern to match the whole <code>word<\/code><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>In this tutorial, you&#8217;ll learn how to construct regular expressions that match word boundary positions in a 
string.<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":3122,"menu_order":3,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-3143","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/www.pythontutorial.net\/wp-json\/wp\/v2\/pages\/3143","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.pythontutorial.net\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.pythontutorial.net\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.pythontutorial.net\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.pythontutorial.net\/wp-json\/wp\/v2\/comments?post=3143"}],"version-history":[{"count":0,"href":"https:\/\/www.pythontutorial.net\/wp-json\/wp\/v2\/pages\/3143\/revisions"}],"up":[{"embeddable":true,"href":"https:\/\/www.pythontutorial.net\/wp-json\/wp\/v2\/pages\/3122"}],"wp:attachment":[{"href":"https:\/\/www.pythontutorial.net\/wp-json\/wp\/v2\/media?parent=3143"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}