Skip to content

CLI to export/import any database to/from SQLite#2496

Merged
Alkarex merged 25 commits intoFreshRSS:devfrom
Alkarex:export_sqlite
Sep 15, 2019
Merged

CLI to export/import any database to/from SQLite#2496
Alkarex merged 25 commits intoFreshRSS:devfrom
Alkarex:export_sqlite

Conversation

@Alkarex
Copy link
Copy Markdown
Member

@Alkarex Alkarex commented Aug 18, 2019

Require PHP 5.5+ #2495
If we move to PHP 5.5+, this is a PR to celebrate that with a new feature 😃
I wanted to have this ability to move from one system to another for a good while, including changing database type, since our current export/import features were not quite good enough, by not taking all entries. This is a full database copy to/from SQLite. This allows e.g. MySQL → SQLite → PostgreSQL.
It takes advantage of yield without which such a massive data stream would be quite less elegant to write (and I consider using it to our big SELECT as well to avoid big memory copies like we have at the moment).
I found amusing to accompany a powerful newish operator with one of the oldest powerful operators (goto) but I can refactor if you are allergic to that part 😆

@Alkarex Alkarex added this to the 1.15.0 milestone Aug 18, 2019
@Alkarex Alkarex added the Work in progress 🚧 Wait before merging label Aug 18, 2019
@Alkarex
Copy link
Copy Markdown
Member Author

Alkarex commented Aug 18, 2019

Hum, it looks like I have a memory problem somewhere when testing on a tiny server with a very big DB.

@Frenzie
Copy link
Copy Markdown
Member

Frenzie commented Aug 18, 2019

Huh, that's very interesting. Also just to switch servers it might be easier to handle than MySQL export → MySQL import.

This done/exit routine looks like a good use of goto to me. It's just using it to go backward (like a while/for loop) that's confusing really.

@Alkarex
Copy link
Copy Markdown
Member Author

Alkarex commented Aug 18, 2019

Tested with success using a MySQL database of 340k articles on an Atom server with 2GB RAM, resulting in a 675MB SQLite file 😃

@Alkarex Alkarex removed the Work in progress 🚧 Wait before merging label Aug 18, 2019
@Alkarex
Copy link
Copy Markdown
Member Author

Alkarex commented Aug 18, 2019

To import from SQLite, which would only work properly if the destination DB is empty, it would be nice to provide a command to clear the destination database. But I do not want to make it too easy to avoid user mistakes. Any suggestion? Maybe as a distinct command?

./cli/db-truncate.php --user alice
./cli/import-sqlite-for-user.php --user alice--filename /tmp/alice.sqlite

@Frenzie
Copy link
Copy Markdown
Member

Frenzie commented Aug 18, 2019

Perhaps just a sufficiently dangerous sounding flag like --force-overwrite?

@Alkarex Alkarex added the Work in progress 🚧 Wait before merging label Aug 22, 2019
@Alkarex Alkarex removed the Work in progress 🚧 Wait before merging label Sep 11, 2019
@Alkarex
Copy link
Copy Markdown
Member Author

Alkarex commented Sep 11, 2019

I have made a bit of refactoring to make it easier to work with more than one user / more than one database at a time (fewer static objects).
Ready for more testing.

More uniform logic for the 3 databases.
Fix wrong DROP TABLE for SQLite.
@Alkarex
Copy link
Copy Markdown
Member Author

Alkarex commented Sep 12, 2019

I was not so happy of the special case for SQLite. I found the problem with PDO (which gets buggy if we delete the initial .sqlite database) and now all three databases should have the same behaviour.

@Alkarex Alkarex added the Work in progress 🚧 Wait before merging label Sep 12, 2019
@Alkarex
Copy link
Copy Markdown
Member Author

Alkarex commented Sep 12, 2019

Damn, I have forgotten to drop some indexes for PostgreSQL. Coming back to it a bit later.
Edit: nothing to do with indexes, there was just a bug in the deletion of existing data. Fixed

@Alkarex Alkarex removed the Work in progress 🚧 Wait before merging label Sep 12, 2019
@Alkarex
Copy link
Copy Markdown
Member Author

Alkarex commented Sep 12, 2019

Ok, NOW I think it is ready for testing :-)
I have successfully exported and imported with MySQL, SQLite, and PostgreSQL (testing with a ~700MB database)

@Alkarex
Copy link
Copy Markdown
Member Author

Alkarex commented Sep 12, 2019

Now also tested with MariaDB :-)

@marienfressinaud
Copy link
Copy Markdown
Member

This is very cool! I tried with my personal database: a 70Mo SQLite file (pretty small compared to yours ^^) to pgsql and it worked just fine :) I'll take a look to the code later

./cli/user-info.php -h --user username
# -h is to use a human-readable format
# --user can be a username, or '*' to loop on all users
# Returns: 1) a * iff the user is admin, 2) the name of the user,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Returns: 1) a * iff the user is admin, 2) the name of the user,
# Returns: 1) a * if the user is admin, 2) the name of the user,

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VALUES(:url, 1, :name, :website, :description, 86400);'
);

define('SQL_DROP_TABLES', 'DROP TABLE IF EXISTS `entrytag`, `tag`, `entrytmp`, `entry`, `feed`, `category`');
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why you changed that…

Also it implies that we maintain two versions of the same code in UserDAO.::deleteUser which is more difficult to maintain and to understand.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because SQLite does not support dropping multiple tables at once. This code never worked.
But in my new PR, I have made all these calls uniform for the 3 databases.

}
}

public function select() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[naming, opinion] When I read select, I expect the method to take an id and to return only one result. I think list or all is more understandable. I think you chose select to be close to the SQL syntax but I'm not sure the result is obvious in a development context. My comment applies to all the *DAO->select methods of course.

} else {
return true;
}
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a big part and I have to admit I was very sceptical when I saw the size of the method and the different calls to goto. In the end, it is pretty straightforward to read so I give you some insights but feel free to not consider them :)

  • I would rather create two distinct method import_from_sqlite($filename, $clearFirst = false) and export_to_sqlite($filename) to don't have to manipulate a $mode parameter and its related switches
  • it implies a bit of factoring so I would have created import methods for each model (for instance import_category($daoFrom, $daoTo)) which would return false if an error occurs
  • your gotos are nothing more than function calls but with a hidden $error parameter (goto done; vs return done($error);). While I understand the fun of using a pretty old notation next to the yield, I'm not sure that we gain in readability.

Copy link
Copy Markdown
Member Author

@Alkarex Alkarex Sep 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed the goto (there has been so much bashing during one generation, that other readers might have the same reaction), but before it dies:

  • In this case, it is indeed a bit like calling a function, but better, because more lightweight (function calls are more expensive), better scoped, and with the code located at a more natural place in the flow.
  • In a language that supports nested functions (e.g. C#, JavaScript, Pascal), this is what I would have used, but PHP does not support them, so now there is one more function with a scoping that is too large...

@Alkarex
Copy link
Copy Markdown
Member Author

Alkarex commented Sep 14, 2019

Tested again after the changes.

@marienfressinaud
Copy link
Copy Markdown
Member

Tested again, it still works fine except that I just realize that, for some reason, the feed visibility value (priority column) isn't correctly imported. I tried from SQLite to pgsql and my "category visibility" feeds became "main stream visibility" feeds

@Alkarex
Copy link
Copy Markdown
Member Author

Alkarex commented Sep 15, 2019

Fix in progress!

@Alkarex
Copy link
Copy Markdown
Member Author

Alkarex commented Sep 15, 2019

@marienfressinaud Should be fixed in 2a9bfdc

Copy link
Copy Markdown
Member

@marienfressinaud marienfressinaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good for me!

@Alkarex Alkarex merged commit c76a318 into FreshRSS:dev Sep 15, 2019
@Alkarex
Copy link
Copy Markdown
Member Author

Alkarex commented Sep 15, 2019

Ok :-) Merging, with more testing in the follow-up PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants