Skip to content

Feat: (pseudo-) unique hash for dataset #384

@MerlinSmiles

Description

@MerlinSmiles

I'd like to implement a hash for each dataset, that is unique enough for our general applications, and short enough for humans to read :)

Reasons for that:

  • easily identify a dataset by its hash so that it is independent of folder structure and things like that
  • add that (human readable) hash to i.e. a plot or my logbook so that I am always sure that I know where the data was exactly
  • eventually in the far future have a way that I in my logbook have a link qc://notebook/S6N09W65ZL which loads a data-analysis notebook

If we think that is useful we'd need to make sure we can generate smart hashes, where it is unlikely that we have a double version.
We could:

import random
import string
N = 10
hash = ''.join(random.SystemRandom().choice(string.ascii_lowercase + string.digits) for _ in range(N))

I have no idea how unique it is though...

we can of course make a really really long string but thats hard to put in a plot title or something like that.

Any ideas/comments about that?

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions