-
-
Notifications
You must be signed in to change notification settings - Fork 117
Options for how r.parsedate()
should handle invalid dates
#416
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Good call-out: right now the It would be better if this was configurable by the user (and properly documented) - options could include "set null if date is invalid" and "leave the value as it is if invalid" in addition to throwing an error. |
r.parsedate()
should handle invalid dates
Python's
Imitating this might be the way to go. |
Generating a test database using a pattern from https://www.geekytidbits.com/date-range-table-sqlite/
After running that:
Then to make one of them invalid:
|
Then I ran this to convert
|
I confirmed that if it fails for any value ALL values are left alone, since it runs in a transaction. Here's the code that does that: sqlite-utils/sqlite_utils/db.py Lines 2523 to 2526 in 4338136
|
I think the options here should be:
These need to be indicated by parameters to the Some design options:
|
I'm going to try the |
This is quite nice:
|
Prototype: diff --git a/sqlite_utils/cli.py b/sqlite_utils/cli.py
index 8255b56..0a3693e 100644
--- a/sqlite_utils/cli.py
+++ b/sqlite_utils/cli.py
@@ -2583,7 +2583,11 @@ def _generate_convert_help():
"""
).strip()
recipe_names = [
- n for n in dir(recipes) if not n.startswith("_") and n not in ("json", "parser")
+ n
+ for n in dir(recipes)
+ if not n.startswith("_")
+ and n not in ("json", "parser")
+ and callable(getattr(recipes, n))
]
for name in recipe_names:
fn = getattr(recipes, name)
diff --git a/sqlite_utils/recipes.py b/sqlite_utils/recipes.py
index 6918661..569c30d 100644
--- a/sqlite_utils/recipes.py
+++ b/sqlite_utils/recipes.py
@@ -1,17 +1,38 @@
from dateutil import parser
import json
+IGNORE = object()
+SET_NULL = object()
-def parsedate(value, dayfirst=False, yearfirst=False):
+
+def parsedate(value, dayfirst=False, yearfirst=False, errors=None):
"Parse a date and convert it to ISO date format: yyyy-mm-dd"
- return (
- parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst).date().isoformat()
- )
+ try:
+ return (
+ parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst)
+ .date()
+ .isoformat()
+ )
+ except parser.ParserError:
+ if errors is IGNORE:
+ return value
+ elif errors is SET_NULL:
+ return None
+ else:
+ raise
-def parsedatetime(value, dayfirst=False, yearfirst=False):
+def parsedatetime(value, dayfirst=False, yearfirst=False, errors=None):
"Parse a datetime and convert it to ISO datetime format: yyyy-mm-ddTHH:MM:SS"
- return parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst).isoformat()
+ try:
+ return parser.parse(value, dayfirst=dayfirst, yearfirst=yearfirst).isoformat()
+ except parser.ParserError:
+ if errors is IGNORE:
+ return value
+ elif errors is SET_NULL:
+ return None
+ else:
+ raise
def jsonsplit(value, delimiter=",", type=str): |
Needs tests and documentation. |
Thanks for addressing this @simonw! However, I just reinstalled sqlite-utils 3.26.1 and get an
I definitely have some invalid data in the db. Happy to send a copy if it's helpful. |
Exceptions are normal expected behavior when typecasting an invalid format. However, r.parsedate() is really just re-formatting strings and keeping the type as text. So it may be better to print-and-pass on exception so the user can see a complete list of invalid values -- while also allowing for the parser to reformat the remaining valid values.
In this case, I had just one variation of an invalid date: ' / / '. But theoretically there could be many values that would have to be fixed one at a time with the current exception handling.
The text was updated successfully, but these errors were encountered: