Skip to content

Conversation

@oldlaptop
Copy link
Contributor

@oldlaptop oldlaptop commented Jun 19, 2021

DB4S uses its own lexer and parser to process the database's schema for display in the Database Structure tab. This lexer does not permit dollar signs ($) in unquoted identifiers. SQLite's tokenizer, however, permits (and therefore real SQLite databases may have) dollar signs in unquoted identifiers, for compatibility with certain other database systems that also permit these non-standard identifiers. (I tripped over this issue while playing with databases exported from Oracle, which often uses dollar signs in identifiers for special "system" tables and columns.) See:

https://www.sqlite.org/cgi/src/file?name=src/tokenize.c&ci=c5e2de1d24d34589&ln=162-165
https://www.sqlite.org/cgi/src/file?name=src/global.c&ci=fd8b68a474226e3b&ln=113-116

My opinion from the peanut gallery is that it might be better to use SQLite's tokenizer and parser (which could be included inline or something) so as to absolutely assure precise, bug-for-bug compatibility with at least some version of SQLite, but that's not a five-minute drive-by patch, and adding one character to one regex is. This follows what appears to be the practice in this codebase of committing flex's output after making any change to the source; if there's something undesirable about my copy of flex's output, I can leave that out.

The behavior without this patch is that tables with unquoted names containing dollar signs, or which contain unquoted column names containing dollar signs, appear in the Database Structure tab to have no columns, but otherwise seem to behave normally. Test case (as an sqlite3(1) .dump of a database with offending names):

PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE foo(bar$);
INSERT INTO foo VALUES(1);
INSERT INTO foo VALUES(2);
INSERT INTO foo VALUES(3);
INSERT INTO foo VALUES(4);
CREATE TABLE baz$(a);
INSERT INTO "baz$" VALUES('a');
INSERT INTO "baz$" VALUES('b');
INSERT INTO "baz$" VALUES('c');
INSERT INTO "baz$" VALUES('d');
COMMIT;

@mgrojo mgrojo requested a review from MKleusberg June 19, 2021 22:32
@justinclift
Copy link
Member

justinclift commented Jun 20, 2021

Awesome, thanks for this PR @oldlaptop!

The best person to review this is @MKleusberg, whom Manuel has already requested review from. Hopefully he'll get to it in the next few days. 😄

@MKleusberg
Copy link
Member

Thank you very much, @oldlaptop! This is a very good catch 👍 I'll add a test case too in a second.

My opinion from the peanut gallery is that it might be better to use SQLite's tokenizer and parser (which could be included inline or something) so as to absolutely assure precise, bug-for-bug compatibility with at least some version of SQLite, but that's not a five-minute drive-by patch, and adding one character to one regex is

In theory that would be very desirable. I've actually tried this before developing the current parser but it turned out that we would have to make so many adjustments to the parser code that the amount of work would be similar to developing a parser from scratch. The reason is that the parser actions need to be completely different and at the same time we cannot just replace their function bodies because sometimes we're interested in different information than SQLite is when it parses a statement and calls its functions. Also we're only a small developer team and keeping up with SQLite's changes might be tricky and is potentially more time consuming than a little fix to the lexer or parser from time to time. But that's just my thinking from a couple of years ago. Maybe one day we'll actually switch to using the original parser code, for example when we also want to parse SELECT statements and similar in the future.

@MKleusberg MKleusberg merged commit c585372 into sqlitebrowser:master Jul 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants