Validate Blake2 Uniqueness prior to saving #2490

@dstufft

For some reason, people are sometimes uploading files whose contents already exist under a different filename. I'm not sure whether there is a valid use case for this. If there is, we need to remove the UNIQUE constraint on the blake2 hash; if there isn't, we should validate uniqueness before we attempt to save the row, rather than letting the INSERT fail with an IntegrityError.


https://sentry.io/python-software-foundation/warehouse-production/issues/147880375/

IntegrityError: duplicate key value violates unique constraint "release_files_blake2_256_digest_key"
DETAIL:  Key (blake2_256_digest)=(1d31e5fd783d7ce96471b6d66ba3dd59e3e0722811ecbf5c19173a5194be514c) already exists.

  File "sqlalchemy/engine/base.py", line 1182, in _execute_context
    context)
  File "sqlalchemy/engine/default.py", line 470, in do_execute
    cursor.execute(statement, parameters)

IntegrityError: (psycopg2.IntegrityError) duplicate key value violates unique constraint "release_files_blake2_256_digest_key"
DETAIL:  Key (blake2_256_digest)=(1d31e5fd783d7ce96471b6d66ba3dd59e3e0722811ecbf5c19173a5194be514c) already exists.
 [SQL: 'INSERT INTO release_files (name, version, python_version, requires_python, packagetype, comment_text, filename, path, size, has_signature, md5_digest, sha256_digest, blake2_256_digest) VALUES (%(name)s, %(version)s, %(python_version)s, %(requires_python)s, %(packagetype)s, %(comment_text)s, %(filename)s, %(path)s, %(size)s, %(has_signature)s, %(md5_digest)s, %(sha256_digest)s, %(blake2_256_digest)s) RETURNING release_files.id'] [parameters: {'name': 'treelite', 'version': '0.1a1', 'python_version': 'cp27', 'requires_python': None, 'packagetype': 'bdist_wheel', 'comment_text': '', 'filename': 'treelite-0.1a3-cp27-cp27m-macosx_10_6_intel.whl', 'path': '1d/31/e5fd783d7ce96471b6d66ba3dd59e3e0722811ecbf5c19173a5194be514c/treelite-0.1a3-cp27-cp27m-macosx_10_6_intel.whl', 'size': 1574670, 'has_signature': False, 'md5_digest': '32229792ce7968ecd29f9b099cb69644', 'sha256_digest': 'eea762a38b556c0a1f76640be5fcea73eb897d67dc5006dbe590162fbdc5bdea', 'blake2_256_digest': '1d31e5fd783d7ce96471b6d66ba3dd59e3e0722811ecbf5c19173a5194be514c'}]
(44 additional frame(s) were not displayed)
...
  File "warehouse/config.py", line 85, in require_https_tween
    return handler(request)
  File "warehouse/static.py", line 78, in whitenoise_tween
    return handler(request)
  File "warehouse/utils/compression.py", line 93, in compression_tween
    response = handler(request)
  File "warehouse/raven.py", line 41, in raven_tween
    return handler(request)
  File "warehouse/cache/http.py", line 69, in conditional_http_tween
    response = handler(request)

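
To make the "validate before saving" option concrete, here is a rough sketch of what such a check could look like. It is not Warehouse's actual upload code: it uses an in-memory SQLite table as a stand-in for `release_files`, and the names `save_file`, `DuplicateFileError`, and `blake2_256_hexdigest` are hypothetical. It also assumes the digest is BLAKE2b with a 32-byte output, which matches the 64 hex characters in the error above.

```python
import hashlib
import sqlite3


class DuplicateFileError(Exception):
    """Hypothetical error: the same content was already uploaded."""


def blake2_256_hexdigest(content: bytes) -> str:
    # Assumption: blake2_256 means BLAKE2b with a 32-byte (256-bit) digest.
    return hashlib.blake2b(content, digest_size=32).hexdigest()


def create_schema(conn: sqlite3.Connection) -> None:
    # Simplified stand-in for release_files; only the UNIQUE digest matters here.
    conn.execute(
        "CREATE TABLE release_files ("
        " id INTEGER PRIMARY KEY,"
        " filename TEXT NOT NULL,"
        " blake2_256_digest TEXT NOT NULL UNIQUE)"
    )


def save_file(conn: sqlite3.Connection, filename: str, content: bytes) -> str:
    digest = blake2_256_hexdigest(content)

    # Validate first, so the uploader gets a clear error instead of a 500.
    row = conn.execute(
        "SELECT filename FROM release_files WHERE blake2_256_digest = ?",
        (digest,),
    ).fetchone()
    if row is not None:
        raise DuplicateFileError(
            f"{filename!r} has the same contents as existing file {row[0]!r}"
        )

    # Two concurrent uploads can both pass the check above, so the UNIQUE
    # constraint stays as a backstop and its failure is translated as well.
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO release_files (filename, blake2_256_digest)"
                " VALUES (?, ?)",
                (filename, digest),
            )
    except sqlite3.IntegrityError as exc:
        raise DuplicateFileError(f"{filename!r} duplicates existing content") from exc
    return digest


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    save_file(conn, "example-0.1-py3-none-any.whl", b"wheel bytes")
    try:
        # Same bytes under a different filename, as in the report above.
        save_file(conn, "example-0.2-py3-none-any.whl", b"wheel bytes")
    except DuplicateFileError as exc:
        print("rejected:", exc)
```

The pre-check alone would not be enough under concurrent uploads, which is why the sketch still catches the constraint violation. If it turns out there is a valid use case for identical content under several filenames, none of this applies and the constraint itself would have to go.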
