@tomasr8 tomasr8 commented May 5, 2023

Allows users to export all of their user data stored in Indico. The export is a single zip file containing a YAML file, data.yml, with all the data.
Additionally, if the user has uploaded any files to Indico (attachments, abstracts, papers, registration file fields, etc.), these are also included in the zip file in separate subfolders.

Overview of the added functionality

A new tab is added to the user profile (/user/data-export):

[image]

You can select which categories you want to export. Once the request is submitted, you will be notified via email when the
export is finished (successfully or not). You can also see the progress on the Data export tab.

You can download the data using the link provided in the email or by going back to your user profile and downloading it from there:

[image]

To simplify things, only one request can exist at a time. Requesting a new export deletes the old one if it has finished; while an export is still running, you can't request another one.

When the export is processing:
[image]

When it has failed (this hopefully should not happen, but at least we log failures):
[image]

When the export expires (by default, exports are cleaned up after 1 month):
[image]

When the export exceeds the maximum (configurable) size:
[image]
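
As a rough illustration of the expiry and size-limit cases above, here is a minimal sketch. It is not the code from this PR: the names `is_expired`, `exceeds_limit`, `MAX_AGE` and `MAX_SIZE` are hypothetical, "1 month" is approximated as 30 days, and the 10 GiB limit is a placeholder value only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "1 month" is approximated as 30 days for this sketch.
MAX_AGE = timedelta(days=30)
# Hypothetical placeholder; the real limit is configurable.
MAX_SIZE = 10 * 1024**3


def is_expired(requested_dt, now=None, max_age=MAX_AGE):
    """Return True if the export is older than the retention period."""
    now = now or datetime.now(timezone.utc)
    return now - requested_dt > max_age


def exceeds_limit(file_sizes, max_size=MAX_SIZE):
    """Return True if the combined size of the files is above the limit."""
    return sum(file_sizes) > max_size
```

Checking the total size before writing the archive would let the export fail early instead of producing a zip that is too large to keep.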

Technical overview

A new DB model, DataExportRequest, was added. This object describes the export request: state, requested_dt, selected options, etc.
It has a one-to-one relationship with User. You can get the current export request via user.data_export_request.
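
To make the one-request-per-user rule concrete, here is a minimal sketch using plain dataclasses instead of the actual SQLAlchemy models. Only `DataExportRequest`, `state`, `requested_dt` and `user.data_export_request` come from this description; the `State` values, the `selected_options` field name and the `request_export` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class State(Enum):
    # Hypothetical state names; the real model may use different ones.
    running = 'running'
    success = 'success'
    failed = 'failed'


@dataclass
class DataExportRequest:
    selected_options: list
    state: State = State.running
    requested_dt: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class User:
    name: str
    # One-to-one: a user has at most one export request at a time.
    data_export_request: Optional[DataExportRequest] = None


def request_export(user, options):
    """Create a new request, replacing a finished one (hypothetical helper)."""
    current = user.data_export_request
    if current is not None and current.state is State.running:
        raise RuntimeError('an export is already running')
    # Assigning a new request drops the previous (finished) one.
    user.data_export_request = DataExportRequest(selected_options=list(options))
    return user.data_export_request
```

In the real model, the one-to-one relationship means that assigning a new request replaces the old row instead of accumulating a history of requests.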

There are two new endpoints, one to render the Data export tab (RHUserDataExport) and one for the API (RHUserDataExportAPI) which is used to create new requests and check the state.

The data that ends up in data.yml is serialized using UserDataExportSchema. Files written to the zip archive have
unique names.
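
One way to guarantee unique names inside the archive is sketched below. This is an assumption about the approach, not the scheme used in this PR: the `unique_name` and `build_archive` helpers and the `-1`, `-2` counter suffix are hypothetical, and the YAML content is passed in as an already-serialized string.

```python
import io
import zipfile
from pathlib import PurePosixPath


def unique_name(name, used):
    """Return `name`, or a suffixed variant that is not yet in `used`."""
    path = PurePosixPath(name)
    candidate = name
    counter = 1
    while candidate in used:
        # e.g. attachments/slides.pdf -> attachments/slides-1.pdf
        candidate = str(path.with_name(f'{path.stem}-{counter}{path.suffix}'))
        counter += 1
    used.add(candidate)
    return candidate


def build_archive(yaml_text, files):
    """Write data.yml plus (name, bytes) pairs into an in-memory zip."""
    buffer = io.BytesIO()
    used = {'data.yml'}
    with zipfile.ZipFile(buffer, 'w', zipfile.ZIP_DEFLATED) as archive:
        archive.writestr('data.yml', yaml_text)
        for name, content in files:
            archive.writestr(unique_name(name, used), content)
    return buffer.getvalue()
```

Two attachments that share a filename then end up as separate entries in the same subfolder instead of one silently shadowing the other.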

The export itself runs as a Celery task, since it is expected to take a long time (it needs to run a lot of queries).

I added lots of new DB fixtures which I needed to test this PR. The schema tests use snapshots from modules/users/tests.

TODO:

  • Export editables
  • Export subcontributions
  • Do we want to export deleted items? Technically we still have the data for them. (Decided: yes)
  • Tests!
  • File size limit
  • Option to include/not include files

@ThiefMaster ThiefMaster added this to the v3.3 milestone May 5, 2023
@tomasr8 tomasr8 force-pushed the user-data-export branch 9 times, most recently from 27d2819 to 555878e Compare May 17, 2023 14:12
@tomasr8 tomasr8 force-pushed the user-data-export branch 2 times, most recently from d7f558b to 81acae5 Compare May 30, 2023 14:42
@tomasr8 tomasr8 marked this pull request as ready for review June 2, 2023 07:47
@tomasr8 tomasr8 changed the title WIP: export all user data Export all user data Jun 2, 2023
@tomasr8 tomasr8 self-assigned this Jun 7, 2023


def test_contribution_export_schema(snapshot, db, dummy_user, dummy_contribution, dummy_event_person):
    from indico.modules.users.schemas import ContributionExportSchema
Member

importing this on the top level doesn't work? i wouldn't expect circular import issues in here :o

Member Author

nope :/ I'll try to dig into why later (it's the standard sqlalchemy error)

@ThiefMaster ThiefMaster force-pushed the user-data-export branch 2 times, most recently from 804ef0a to b8204fd Compare September 6, 2023 11:59
@ThiefMaster ThiefMaster merged commit d43dac7 into indico:master Dec 4, 2023
@ThiefMaster ThiefMaster deleted the user-data-export branch December 4, 2023 19:29