attic to borg one time converter #231
Conversation
for now, just in the test suite, but will be migrated to a separate command
|
obviously, tests fail because i rely on attic being present there. not sure how i can fix that, but for now, i prefer to continue working with tests than with my real repo, thank you very much. :p |
046549c to 34c9aee
34c9aee to 9ab1e19
the unit tests themselves still use attic to generate an attic repository for testing, but the converter code should now be standalone
Current coverage is
|
|
On 2015-09-30 23:40:11, Codecov wrote:
this stuff is really noisy... |
|
this is pretty much ready now. the only thing that is not done here is the cache conversion, for which i am somewhat too lazy right now - but the existing code should be a good example of how it can be done.

i have only tested this with the unit tests, so i have no idea if it really works. i'm in the process of making a copy of my attic repo to try it out on real data (~460GB repo with daily snapshots since december 2014, so around 280 snapshots), which will in itself take a few hours - so i can't confirm until tomorrow that any of this really works. but the converter should be pretty fast: it is O(n) where n is the number of segments, and only a small write is done for each segment.

the converter also assumes some compatibility between borg and attic. for example, it assumes it can load an attic repository and list the segments as if it was a borg repo (which actually works right now). reimplementing this so that it still works if borg changes too much shouldn't be much of a problem. in fact, i did that for the keys discovery, for example.

oh, and regarding unit testing - i don't quite get it: most of the code i wrote is unit-tested, in fact, it's how i wrote it... maybe someone familiar with codecov can explain to me what i did wrong? |
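A minimal sketch of the "small write per segment" idea, assuming the conversion amounts to overwriting a fixed-length magic header at the start of each segment file. The magic values `ATTICSEG` and `BORG_SEG` and the function names are assumptions for illustration and should be checked against the attic and borg sources.

```python
# Assumed magic values; both are 8 bytes, so the rewrite never shifts data.
ATTIC_MAGIC = b'ATTICSEG'
BORG_MAGIC = b'BORG_SEG'


def convert_segment(path):
    """Rewrite the header of one segment file in place.

    Returns True if the file was converted, False if it did not start
    with the attic magic (for example, because it was already converted).
    """
    with open(path, 'r+b') as fd:
        if fd.read(len(ATTIC_MAGIC)) != ATTIC_MAGIC:
            return False
        fd.seek(0)
        fd.write(BORG_MAGIC)
    return True


def convert_segments(paths):
    """Visit every segment once, so the cost is O(n) in the segment count."""
    return sum(1 for path in paths if convert_segment(path))
```

Because only the first few bytes of each file are touched, the run time depends on the number of segments rather than on the size of the repository.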
|
The convert test is skipped on CI due to not having attic in a tox env |
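As an illustration of why a skipped test shows up as missing coverage, here is one way a test can be skipped when attic is not importable. This is a sketch using only the standard library; the class name is hypothetical and the thread does not show how the actual test is skipped.

```python
import importlib.util
import unittest

# True only when the `attic` package can be imported in this environment.
HAVE_ATTIC = importlib.util.find_spec('attic') is not None


@unittest.skipUnless(HAVE_ATTIC, 'attic is not installed')
class ConversionTestCase(unittest.TestCase):  # hypothetical test class
    def test_needs_attic(self):
        import attic  # only reached when attic is available
        self.assertIsNotNone(attic)
```

When the test is skipped, none of the converter code it exercises is executed, so a coverage tool reports those lines as uncovered.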
|
sorry, i'm not familiar with tox - what does that mean? |
this way we don't depend on attic for regular build, but we can still see proper test coverage
|
you would also need to change the travis matrix from the normal envs to the envs with attic as a dependency. given modern tools i might have been wrong in suggesting to use a factor, since it can just be installed in tox in a normal env. @ThomasWaldmann any opinion on that? |
|
@RonnyPfannschmidt What are the pros and cons of the methods? |
borg/archiver.py
please use """triple-double-quoted""" docstrings
done. but why? just above that, single-single quotes are used...
that's what i feared. is there any point in converting the existing cache then? it does take a bit of time that we could skip (a few seconds) to copy the ~1GB cache file from attic here... |
|
currently, there is not much point. BUT, if we realize the faster ideas for cache resync, we would need it (again). so maybe keep it for now. |
|
doing some testing of this code right now. the progress indication has some issue: Then it ended. As it is not showing 100%, users might be confused whether it really did all it needed to do. |
|
hehehe... oops! i forgot to multiply by 100. :) |
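A small sketch of the percentage calculation with the factor of 100 in place. The function name and the handling of an empty job are assumptions, not the code from this pull request.

```python
def percent_done(done, total):
    """Return progress as a percentage between 0 and 100."""
    if total == 0:
        return 100.0  # nothing to do counts as finished
    return 100.0 * done / total
```

Reporting after the last item as well as before each one ensures the final value shown is 100.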
|
try again now? |
|
percentage works now. just trying to convert the same src repo twice does not work, because cache files then already exist at the target directory. maybe it should ask for permission to clear an existing cache. |
we separate the conversion and the copy in order to be able to copy arbitrary files from attic without converting them. this allows us to copy the config file cleanly without attempting to rewrite its magic number
|
@ThomasWaldmann what do you mean it "does not work"? does it crash? it should just produce a warning and move on, normally. i have rewired the cache copy mechanism as well, but i'd appreciate if you could test it as well... have any idea on how to integrate some cache generation in the unit tests? i couldn't figure it out looking at |
|
I killed the attic2b repo from my first attempt and made a new copy from attic using cp -a attic attic2b. Then: It stumbles over the cache as that is already there for this repoid. |
|
well, it's just a safety check: if you mistakenly run borg convert over an already existing borg repository that is not related to attic, you definitely don't want to overwrite those cache files! the solution here for you is to simply flush and the conversion still works, it's just the cache that is not being copied: you could simply remove the files and rerun the conversion again. |
|
yes, just seen the "config" handling, sorry. |
if there's no attic cache, it's no use checking for individual files this also makes the code a little clearer also added comments
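The safety check discussed above can be sketched as follows, assuming the cache is a directory of plain files. The file names `config`, `files` and `chunks` and both function names are assumptions for illustration.

```python
import os
import shutil


def copy_cache_file(src_dir, dst_dir, name):
    """Copy one cache file unless the source is missing or the target exists.

    Returns the destination path when a copy was made, otherwise None.
    """
    src = os.path.join(src_dir, name)
    dst = os.path.join(dst_dir, name)
    if not os.path.exists(src):
        return None  # nothing to copy
    if os.path.exists(dst):
        # Never overwrite: the target may belong to an unrelated repository.
        print('warning: %s already exists, not overwriting' % dst)
        return None
    os.makedirs(dst_dir, exist_ok=True)
    shutil.copy(src, dst)
    return dst


def copy_cache(src_dir, dst_dir, names=('config', 'files', 'chunks')):
    """Skip all per-file checks when there is no source cache at all."""
    if not os.path.isdir(src_dir):
        return []
    copied = (copy_cache_file(src_dir, dst_dir, name) for name in names)
    return [path for path in copied if path is not None]
```

Running it a second time against the same target only produces warnings, which matches the behaviour described in the thread: the conversion proceeds and the existing cache is left alone.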
|
just reshuffled and commented the code to make that clearer. |
|
@tw how did you create the attic2b repo in the first place? i am wondering if there wouldn't be an easy way to create an attic repo with a cache without replicating all of the maybe it would be useful to have generic routines to create a bunch of files in the test suite... |
convert is too generic for the Attic conversion: we may have other converters, from other, more foreign systems that will require different options and different upgrade mechanisms that convert could never cover appropriately. we are more likely to use an approach similar to "git fast-import" instead here, and have the conversion tools be external tools that feed standard data into borg during conversion.

upgrade seems like a more natural fit: Attic could be considered like a pre-historic version of Borg that requires invasive changes for borg to be able to use the repository. we may require such changes in the future of borg as well: if we make backwards-incompatible changes to the repository layout or data format, it is possible that we require such changes to be performed on the repository before it is usable again. instead of scattering those conversions all over the code, we should simply have assertions that check the layout is correct and point the user to upgrade if it is not.

upgrade should eventually automatically detect the repository format or version and perform appropriate conversions. Attic is only the first one. we still need to implement an adequate API for auto-detection and upgrade, only the seeds of that are present for now.

of course, changes to the upgrade command should be thoroughly documented in the release notes and an eventual upgrade manual.
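One way the "assert the layout and point the user to upgrade" idea could look, as a sketch only. The exception class, the function name and the magic values are hypothetical; the commit message itself says the real auto-detection API is not implemented yet.

```python
class UpgradeRequired(Exception):  # hypothetical exception
    """Raised when a repository is in an older, known format."""


CURRENT_MAGIC = b'BORG_SEG'                # assumed current segment magic
KNOWN_OLD_MAGICS = {b'ATTICSEG': 'attic'}  # assumed older formats


def check_segment_format(path):
    """Raise if the segment at `path` is not in the current format."""
    with open(path, 'rb') as fd:
        magic = fd.read(len(CURRENT_MAGIC))
    if magic == CURRENT_MAGIC:
        return
    origin = KNOWN_OLD_MAGICS.get(magic)
    if origin is not None:
        # Known old format: tell the user what to run instead of converting here.
        raise UpgradeRequired(
            '%s looks like an %s segment, please run the upgrade command'
            % (path, origin))
    raise ValueError('%s has an unrecognized header' % path)
```

Keeping the check separate from the conversion means regular commands only detect an old layout and the upgrade command is the single place that changes it.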
this makes it clear how to start from scratch, in case the chunk cache failed to be copied and so on.
|
renamed convert command to upgrade, as described in the last commit. there will be some more work to be done for the upgrade command to be useful for other upgrades in the future, mostly internal API changes, but that can wait until after this is merged. i have ran and all is good. i may try to convert my main repo again now that #235 has a workaround, but that shouldn't keep this from happening, as copying the attic repo takes several hours here... |
it seems the file cache does *not* have the ATTIC magic header (nor does it have one in borg), so we don't need to edit the file - we just copy it like a regular file.

while i'm here, simplify the cache conversion loop: it's no use splitting the copy and the editing since the latter is so fast, just do everything in one loop, which makes it much easier to read.
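The single-loop approach described in this commit message might look roughly like the sketch below: each cache file is copied, and only files that carry a magic header are edited afterwards. The file names, the index magic values `ATTICIDX` and `BORG_IDX`, and the function name are assumptions for illustration.

```python
import os
import shutil

# Assumed index magic values; the thread only says that a magic header exists.
ATTIC_IDX_MAGIC = b'ATTICIDX'
BORG_IDX_MAGIC = b'BORG_IDX'

# (file name, whether the header needs rewriting); names are assumptions.
CACHE_FILES = [('chunks', True), ('files', False)]


def convert_cache(attic_cache, borg_cache):
    """Copy each cache file and fix its header in the same pass."""
    os.makedirs(borg_cache, exist_ok=True)
    for name, has_magic in CACHE_FILES:
        src = os.path.join(attic_cache, name)
        dst = os.path.join(borg_cache, name)
        if not os.path.exists(src):
            continue
        shutil.copy(src, dst)
        if has_magic:
            # Edit the copy, never the original attic cache.
            with open(dst, 'r+b') as fd:
                if fd.read(len(ATTIC_IDX_MAGIC)) == ATTIC_IDX_MAGIC:
                    fd.seek(0)
                    fd.write(BORG_IDX_MAGIC)
```

This sketch omits the overwrite check on purpose to keep the loop short; a real implementation would refuse to replace an existing target file.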
attic to borg one time converter
|
whoohoo! thanks! |
for now, just in the test suite, but will be migrated to a separate command.
currently converts segments, which may be enough for unencrypted repositories. will require a cache rebuild.
to be continued.
see #21.
update: converter seems to work, testing would be appreciated.