@SeeSpotRun
Collaborator

Addresses #462

New option --hash-uniques means that even unique files get fully hashed. Their checksums are included in the json report by default. If --xattr-write is also specified, the checksums are written to the files' extended attributes as well.
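
To illustrate the idea only (this is a minimal Python sketch, not rmlint's C implementation): every file is hashed to the end even if nothing else could match it, and the digest can optionally be stored in an extended attribute. The function names, the choice of SHA-256, and the attribute key `user.example.sha256` are assumptions made for the example; they are not the names or the digest rmlint uses.

```python
import hashlib
import os


def full_checksum(path, chunk_size=1 << 20):
    """Hash the whole file, whether or not any other file could match it."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_all(paths):
    """Return {path: hex digest} for every path, unique files included."""
    return {path: full_checksum(path) for path in paths}


def store_checksum(path, hexdigest, key="user.example.sha256"):
    """Try to store the digest in an extended attribute (hypothetical key).

    Returns True on success and False if the platform or filesystem
    does not support extended attributes.
    """
    if not hasattr(os, "setxattr"):  # os.setxattr is only available on Linux
        return False
    try:
        os.setxattr(path, key, hexdigest.encode("ascii"))
    except OSError:
        return False
    return True
```

The resulting dictionary plays the role of the report: every file has a complete checksum that a later run could compare against.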

@SeeSpotRun SeeSpotRun mentioned this pull request Mar 18, 2021
@SeeSpotRun
Collaborator Author

@sahib I wonder whether --write-unfinished can / should be deprecated with this in place? Unfinished checksums seem a bit flaky. They're only really useful if another same-length file has a different unfinished checksum after exactly the same number of bytes. I can't see where we store the number of bytes hashed in the json file for --replay, so I'm not sure how it's really supposed to work.
While --hash-uniques adds more overhead on the first run, it's arguably more useful and robust than partial checksums.
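
A small Python sketch of why the byte count matters (illustrative only; `PartialChecksum`, `hash_prefix` and `compare` are hypothetical names, and SHA-256 is just a stand-in digest): a partial checksum can only rule out a match when both files were hashed over exactly the same number of bytes.

```python
import hashlib
from typing import NamedTuple


class PartialChecksum(NamedTuple):
    digest: str
    bytes_hashed: int  # without this, two partial digests cannot be compared


def hash_prefix(data: bytes, bytes_hashed: int) -> PartialChecksum:
    """Hash only the first `bytes_hashed` bytes of `data`."""
    prefix = data[:bytes_hashed]
    return PartialChecksum(hashlib.sha256(prefix).hexdigest(), len(prefix))


def compare(a: PartialChecksum, b: PartialChecksum) -> str:
    """Say what two partial checksums of same-length files tell us."""
    if a.bytes_hashed != b.bytes_hashed:
        # Digests over different prefix lengths differ even for identical files.
        return "incomparable"
    if a.digest != b.digest:
        return "different"  # the files are definitely not duplicates
    return "same-so-far"  # still undecided; the remaining bytes may differ


file_a = b"A" * 100
file_b = b"A" * 50 + b"B" * 50  # same length, differs in the second half

print(compare(hash_prefix(file_a, 50), hash_prefix(file_b, 50)))
print(compare(hash_prefix(file_a, 100), hash_prefix(file_b, 100)))
print(compare(hash_prefix(file_a, 50), hash_prefix(file_a, 100)))
```

The last call compares two prefixes of the very same file and still cannot conclude anything, which is the weakness of storing unfinished checksums without the number of bytes hashed.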

@sahib
Owner

sahib commented Mar 19, 2021

I can't find the ticket right now, but there were some issues with --write-unfinished anyway (at least in that most users assumed it does what --hash-uniques does), especially since it got implicitly enabled with the xattr feature. I would vote to remove it altogether.

@SeeSpotRun
Collaborator Author

OK, I have removed it and also added --hash-unmatched, which is like --hash-uniques but only hashes files that have one or more size twins. This will be much more efficient in most use cases.
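
The difference between the two options can be sketched in Python roughly as follows (an illustration of the selection rule described above, not the actual rmlint code; `select_for_full_hash` and its `mode` strings are hypothetical):

```python
from collections import defaultdict


def select_for_full_hash(sizes, mode):
    """Pick the files that end up fully hashed.

    sizes: mapping of path -> file size in bytes
    mode:  "uniques"   -> every file, like --hash-uniques
           "unmatched" -> only files with at least one size twin,
                          like --hash-unmatched
    """
    if mode == "uniques":
        return sorted(sizes)
    if mode == "unmatched":
        by_size = defaultdict(list)
        for path, size in sizes.items():
            by_size[size].append(path)
        # A file whose size is unique cannot have a duplicate, so skip it.
        return sorted(
            path
            for group in by_size.values()
            if len(group) > 1
            for path in group
        )
    raise ValueError(f"unknown mode: {mode!r}")


sizes = {"a": 10, "b": 10, "c": 20, "d": 30, "e": 30, "f": 30}
print(select_for_full_hash(sizes, "uniques"))    # ['a', 'b', 'c', 'd', 'e', 'f']
print(select_for_full_hash(sizes, "unmatched"))  # ['a', 'b', 'd', 'e', 'f']
```

Here "c" is the only file without a size twin, so it is the only one that the second mode leaves unhashed.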

@SeeSpotRun SeeSpotRun merged commit b276cd2 into sahib:develop Mar 20, 2021
@SeeSpotRun SeeSpotRun deleted the hash_uniques branch March 20, 2021 22:05
@SeeSpotRun SeeSpotRun mentioned this pull request Mar 23, 2021