-
-
Notifications
You must be signed in to change notification settings - Fork 117
Introspect if table is FTS4 or FTS5 #196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
This is surprisingly difficult. I need to parse the CREATE VIRTUAL TABLE "global-power-plants_fts" USING FTS5 ("name", content="global-power-plants") The problem is I need to be able to handle various different quoting formats for the table name ( |
This is blocking landing |
I could use a parsing library like https://parsy.readthedocs.io/en/latest/tutorial.html for this - or I'd rather not add a new dependency for this though so I'm going to see if I can get something that's good-enough just using a regular expression. |
|
|
I confirmed all three of these are valid syntax for creating tables:
|
Good news: I don't think I have to deal with
|
This matches a single quote, then 0+ not-single-quotes, then 0+ (either 0+ not-single quotes or a double single quote), then a single quote. Unrolling the loop technique described here: http://www.softec.lu/site/RegularExpressions/UnrollingTheLoop |
I'm going to try charref = re.compile(r"""
&[#] # Start of a numeric entity reference
(
0[0-7]+ # Octal form
| [0-9]+ # Decimal form
| x[0-9a-fA-F]+ # Hexadecimal form
)
; # Trailing semicolon
""", re.VERBOSE) |
I think this might be it: create_virtual_table_re = re.compile(r"""
\s*CREATE\s+VIRTUAL\s+TABLE\s+ # CREATE VIRTUAL TABLE
(
'(?P<squoted_table>[^']*(?:''[^']*)*)' | # single quoted name
"(?P<dquoted_table>[^"]*(?:""[^"]*)*)" | # double quoted name
`(?P<backtick_table>[^`]+)` | # `backtick` quoted name
\[(?P<squarequoted_table>[^\]]+)\] # [...] quoted name
)
\s+(IF\s+NOT\s+EXISTS\s+)? # IF NOT EXISTS (optional)
USING\s+(?P<using>\w+)
""", re.VERBOSE | re.IGNORECASE) |
Next step: lots of unit tests. |
I'm missing the case where a table has no quotes around it at all - So I need to know how to create a regex for a SQLite identifier. https://www.sqlite.org/draft/tokenreq.html seems to be the only available documentation for that.
|
More from that document, describing
|
"any other character larger than u007f." Need to figure that out! |
I think that's |
So... ALPHABETIC: NUMERIC: |
And an identifier is "ALPHABETIC character and continue with zero or more ALPHANUMERIC characters and/or "$" (u0024) characters" So...
|
The finished monster: _virtual_table_using_re = re.compile(r"""
^ # Start of string
\s*CREATE\s+VIRTUAL\s+TABLE\s+ # CREATE VIRTUAL TABLE
(
'(?P<squoted_table>[^']*(?:''[^']*)*)' | # single quoted name
"(?P<dquoted_table>[^"]*(?:""[^"]*)*)" | # double quoted name
`(?P<backtick_table>[^`]+)` | # `backtick` quoted name
\[(?P<squarequoted_table>[^\]]+)\] | # [...] quoted name
(?P<identifier> # SQLite non-quoted identifier
[A-Za-z_\u0080-\uffff] # \u0080-\uffff = "any character larger than u007f"
[A-Za-z_\u0080-\uffff0-9\$]* # zero-or-more alphanemuric or $
)
)
\s+(IF\s+NOT\s+EXISTS\s+)? # IF NOT EXISTS (optional)
USING\s+(?P<using>\w+) # e.g. USING FTS5
""", re.VERBOSE | re.IGNORECASE) |
Originally posted by @simonw in #192 (comment)
The text was updated successfully, but these errors were encountered: