Bad regexp in Chinese search #2544

Hi @enhao!

`sphinx/search/zh.py` contains this:

``` #!python
    latin1_letters = re.compile(r'\w+(?u)[\u0000-\u00ff]')
```

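However, the `re` module has only supported the `\uXXXX` escape since Python 3.3. Because the pattern is a raw string, earlier versions receive a literal backslash followed by `u`, treat that unknown escape as a plain `u`, and the regexp becomes equivalent to: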

``` #!python
r'\w+(?u)[0-u]'
```

which is definitely not what you wanted.
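
To illustrate what that degraded character class accepts, here is a small sketch. The name `degraded_class` and the sample characters are only illustrative; the class `[0-u]` is written out by hand so the snippet behaves the same on any Python 3 version.

```python
import re

# What Python < 3.3 effectively saw for the character class:
# the unknown escape '\u' degrades to a literal 'u', leaving the range 0-u.
degraded_class = re.compile(r'[0-u]')


def in_degraded_class(ch):
    """Return True if the single character ch is inside [0-u]."""
    return degraded_class.fullmatch(ch) is not None


# The range runs from '0' (0x30) to 'u' (0x75): digits, some punctuation,
# uppercase letters, and only part of the lowercase alphabet.
for ch in '0:Au':
    print(repr(ch), in_degraded_class(ch))

# Characters outside that range, including ordinary letters such as 'v'
# and 'z' and the Latin-1 letter U+00E9, are rejected.
for ch in ' vz\u00e9':
    print(repr(ch), in_degraded_class(ch))
```

So under the old interpretation the class has nothing to do with Latin-1: it is just a slice of ASCII.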

But even in Python 3.3+, the regexp looks dubious to me. The variable name is `latin1_letters`, but the regexp matches a sequence of one or more word characters (not necessarily Latin-1) followed by a single Latin-1 character (not necessarily a letter).
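
A quick sketch of that mismatch, assuming Python 3.4 or later. The inline `(?u)` is left out here because it is redundant for `str` patterns in Python 3, and since Python 3.11 a global inline flag that is not at the start of the pattern is an error. `latin1_letter_run` is only my guess at what might have been intended, not a proposed patch; it also ignores the three Latin-1 letters U+00AA, U+00B5 and U+00BA.

```python
import re

# The pattern from zh.py, minus the redundant mid-pattern (?u) flag.
current = re.compile(r'\w+[\u0000-\u00ff]')

# Hypothetical alternative: runs of ASCII letters plus the accented
# letters of the Latin-1 block (skipping U+00D7 and U+00F7, which are signs).
latin1_letter_run = re.compile(
    r'[A-Za-z\u00c0-\u00d6\u00d8-\u00f6\u00f8-\u00ff]+')

# \w+ happily consumes the Chinese characters before the final 'a'.
print(current.fullmatch('\u4e2d\u6587a') is not None)

# The final character only has to be in U+0000..U+00FF, so '!' is accepted.
print(current.fullmatch('ab!') is not None)

# A single Latin-1 letter is rejected: the pattern needs at least two characters.
print(current.fullmatch('a') is not None)

# The alternative picks out only the Latin-1 letter runs.
print(latin1_letter_run.findall('\u4e2d\u6587 caf\u00e9 ab! a'))
```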

