{"id":3198,"date":"2021-12-02T15:27:43","date_gmt":"2021-12-02T15:27:43","guid":{"rendered":"https:\/\/www.pythontutorial.net\/?page_id=3198"},"modified":"2022-02-18T07:34:10","modified_gmt":"2022-02-18T07:34:10","slug":"python-regex-alternation","status":"publish","type":"page","link":"https:\/\/www.pythontutorial.net\/python-regex\/python-regex-alternation\/","title":{"rendered":"Python Regex Alternation"},"content":{"rendered":"\n<p><strong>Summary<\/strong>: in this tutorial, you&#8217;ll learn about Python regex alternation, which behaves like the &#8220;OR&#8221; operator in regular expressions.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"introduction-to-the-python-regex-alternation\" id='introduction-to-the-python-regex-alternation'>Introduction to the Python regex alternation <a href=\"#introduction-to-the-python-regex-alternation\" class=\"anchor\" id=\"introduction-to-the-python-regex-alternation\" title=\"Anchor for Introduction to the Python regex alternation\">#<\/a><\/h2>\n\n\n\n<p>To represent an alternation in <a href=\"https:\/\/www.pythontutorial.net\/python-regex\/python-regular-expressions\/\">regular expressions<\/a>, you use the pipe operator (<code>|<\/code>). The pipe operator is called the <strong>alternation<\/strong>. 
It is like the <code>or<\/code> operator in Python.<\/p>\n\n\n\n<p>The following regular expression uses an alternation to match either the literal string <code>simple<\/code> or <code>complex<\/code>:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-1\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-string\">'simple|complex'<\/span><\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-1\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>For example, the following program uses the above regular expression to find every occurrence of the literal string <code>simple<\/code> or <code>complex<\/code>:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-2\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-keyword\">import<\/span> re\n\ns = <span class=\"hljs-string\">'simple is better than complex'<\/span>\npattern = <span class=\"hljs-string\">r'simple|complex'<\/span>\n\nmatches = re.findall(pattern, s)\nprint(matches)<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-2\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>Output:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-3\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\">&#91;<span class=\"hljs-string\">'simple'<\/span>, <span 
class=\"hljs-string\">'complex'<\/span>]<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-3\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h2 class=\"wp-block-heading\" id=\"python-regex-alternation-examples\" id='python-regex-alternation-examples'>Python regex alternation examples <a href=\"#python-regex-alternation-examples\" class=\"anchor\" id=\"python-regex-alternation-examples\" title=\"Anchor for Python regex alternation examples\">#<\/a><\/h2>\n\n\n\n<p>Let&#8217;s take more examples of using the regex alternation.  <\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"1-use-python-regex-alternation-for-matching-time-in-hh-mm-format\" id='1-use-python-regex-alternation-for-matching-time-in-hhmm-format'>1) Use Python regex alternation for matching time in hh:mm format <a href=\"#1-use-python-regex-alternation-for-matching-time-in-hhmm-format\" class=\"anchor\" id=\"1-use-python-regex-alternation-for-matching-time-in-hhmm-format\" title=\"Anchor for 1) Use Python regex alternation for matching time in hh:mm format\">#<\/a><\/h3>\n\n\n\n<p>To match a time string in the <code>hh:mm<\/code> format, you can combine the <code>\\d<\/code> <a href=\"https:\/\/www.pythontutorial.net\/python-regex\/python-regex-character-set\/\">character set<\/a> with the <a href=\"https:\/\/www.pythontutorial.net\/python-regex\/python-regex-quantifiers\/\">quantifiers<\/a> <code>{}<\/code>:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-4\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-string\">'\\d{2}:\\d{2}'<\/span><\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-4\"><span class=\"shcb-language__label\">Code 
language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>In this pattern:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><code>\\d{2}<\/code> matches two digits.<\/li><li><code>:<\/code> matches the colon character.<\/li><li><code>\\d{2}<\/code> matches two digits.<\/li><\/ul>\n\n\n\n<p>However, the rule <code>\\d{2}<\/code> also matches a number that is not a valid hour or minute, such as <code>99<\/code>.<\/p>\n\n\n\n<p>To fix this, you can use the regex alternation.<\/p>\n\n\n\n<p>If the valid hour ranges from <code>00<\/code> to <code>23<\/code>, you can use the following pattern to match the hour part:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-5\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\">&#91;<span class=\"hljs-number\">01<\/span>]\\d|<span class=\"hljs-number\">2<\/span>&#91;<span class=\"hljs-number\">0<\/span><span class=\"hljs-number\">-3<\/span>]<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-5\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>In this pattern:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><code>[01]<\/code> matches a single digit, 0 or 1.<\/li><li><code>\\d<\/code> matches a single digit from 0 to 9.<\/li><li><code>[01]\\d<\/code> matches two digits from 00 to 19.<\/li><li><code>2<\/code> matches the digit 2.<\/li><li><code>[0-3]<\/code> matches a single digit from 0 to 3.<\/li><li><code>2[0-3]<\/code> matches the two digits 20, 21, 22, or 23.<\/li><\/ul>\n\n\n\n<p>Therefore, the pattern <code>[01]\\d|2[0-3]<\/code> matches two 
digits from 00 to 23.<\/p>\n\n\n\n<p>Because the valid minute ranges from <code>00<\/code> to <code>59<\/code>, you can use the following pattern to match it:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-6\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\">&#91;<span class=\"hljs-number\">0<\/span><span class=\"hljs-number\">-5<\/span>]\\d<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-6\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>The following regular expression combines the two rules above to match the time in the <code>hh:mm<\/code> format:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-7\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-string\">'&#91;01]\\d|2&#91;0-3]:&#91;0-5]\\d'<\/span><\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-7\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>However, this regular expression will not work as expected. 
For example:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-8\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-keyword\">import<\/span> re\n\ns = <span class=\"hljs-string\">'09:30 30:61 22:30 25:99'<\/span>\npattern = <span class=\"hljs-string\">r'&#91;01]\\d|2&#91;0-3]:&#91;0-5]\\d'<\/span>\n\nmatches = re.finditer(pattern, s)\n<span class=\"hljs-keyword\">for<\/span> match <span class=\"hljs-keyword\">in<\/span> matches:\n    print(match.group())<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-8\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>Output:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-9\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-number\">09<\/span>\n<span class=\"hljs-number\">22<\/span>:<span class=\"hljs-number\">30<\/span><\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-9\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>In this example, the regex engine treats the pattern <code>[01]\\d|2[0-3]:[0-5]\\d<\/code> as two alternatives, because the <code>|<\/code> operator has the lowest precedence in a pattern, so everything on each side of it forms one alternative:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-10\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\">&#91;<span class=\"hljs-number\">01<\/span>]\\d\nOR\n<span class=\"hljs-number\">2<\/span>&#91;<span 
class=\"hljs-number\">0<\/span><span class=\"hljs-number\">-3<\/span>]:&#91;<span class=\"hljs-number\">0<\/span><span class=\"hljs-number\">-5<\/span>]\\d<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-10\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>To fix it, wrap the alternation in parentheses so that it applies only to the hour part, not to the whole expression:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-11\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\">(&#91;<span class=\"hljs-number\">01<\/span>]\\d|<span class=\"hljs-number\">2<\/span>&#91;<span class=\"hljs-number\">0<\/span><span class=\"hljs-number\">-3<\/span>]):&#91;<span class=\"hljs-number\">0<\/span><span class=\"hljs-number\">-5<\/span>]\\d<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-11\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>Now, the program works as expected:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-12\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-keyword\">import<\/span> re\n\ns = <span class=\"hljs-string\">'09:30 30:61 22:30 25:99'<\/span>\npattern = <span class=\"hljs-string\">r'(&#91;01]\\d|2&#91;0-3]):&#91;0-5]\\d'<\/span>\n\nmatches = re.finditer(pattern, s)\n<span class=\"hljs-keyword\">for<\/span> match <span class=\"hljs-keyword\">in<\/span> 
matches:\n    print(match.group())<\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-12\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<p>Output:<\/p>\n\n\n<pre class=\"wp-block-code\" aria-describedby=\"shcb-language-13\" data-shcb-language-name=\"Python\" data-shcb-language-slug=\"python\"><span><code class=\"hljs language-python\"><span class=\"hljs-number\">09<\/span>:<span class=\"hljs-number\">30<\/span>\n<span class=\"hljs-number\">22<\/span>:<span class=\"hljs-number\">30<\/span><\/code><\/span><small class=\"shcb-language\" id=\"shcb-language-13\"><span class=\"shcb-language__label\">Code language:<\/span> <span class=\"shcb-language__name\">Python<\/span> <span class=\"shcb-language__paren\">(<\/span><span class=\"shcb-language__slug\">python<\/span><span class=\"shcb-language__paren\">)<\/span><\/small><\/pre>\n\n\n<h2 class=\"wp-block-heading\" id=\"summary\" id='summary'>Summary <a href=\"#summary\" class=\"anchor\" id=\"summary\" title=\"Anchor for Summary\">#<\/a><\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>The regex alternation <code>X | Y<\/code> matches either <code>X<\/code> or <code>Y<\/code>.<\/li><li>The regex alternation is like an OR operator in regular expressions.<\/li><li>Place the alternation part inside parentheses <code>()<\/code> to express that only that part is alternated.<\/li><\/ul>\n","protected":false},"excerpt":{"rendered":"<p>In this 
tutorial, you&#8217;ll learn about Python regex alternation, which behaves like the &#8220;OR&#8221; operator in the regular expression<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":3122,"menu_order":10,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-3198","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/www.pythontutorial.net\/wp-json\/wp\/v2\/pages\/3198","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.pythontutorial.net\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.pythontutorial.net\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.pythontutorial.net\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.pythontutorial.net\/wp-json\/wp\/v2\/comments?post=3198"}],"version-history":[{"count":0,"href":"https:\/\/www.pythontutorial.net\/wp-json\/wp\/v2\/pages\/3198\/revisions"}],"up":[{"embeddable":true,"href":"https:\/\/www.pythontutorial.net\/wp-json\/wp\/v2\/pages\/3122"}],"wp:attachment":[{"href":"https:\/\/www.pythontutorial.net\/wp-json\/wp\/v2\/media?parent=3198"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}