django.db.utils.OperationalError: (1114, "The table 'huey_monitor_taskprogressmodel' is full") #46

formacube · 2021-08-21T11:59:36Z

I had a bad surprise this morning while checking on the progress of a process which had it's first sub-process running for over than 24h.

There was an error: django.db.utils.OperationalError: (1114, "The table 'huey_monitor_taskprogressmodel' is full")

Then I realized that for each tiny progress report, a record is created for TaskProgressModel

Could you please tell me what is the benefit of creating a new TaskProgressModel (even two when you want to report progress on both sub-task and maintask) each time you report a progress vs. incrementing directly the progress_count field of an existing TaskProgressModel ?

That would mean that in general, we would have only one TaskProgressModel for each TaskModel.

Thanks in advance for your feedback

The text was updated successfully, but these errors were encountered:

jedie · 2021-08-25T07:01:34Z

Thats a good point. The initial idea was to to avoid database locking mechanisms. If it runs in parallel. But I think that should be rethought.

formacube · 2021-08-25T08:17:53Z

Since 3 days I'm using this minimal rewrite (to keep maximum compatibility with your existing code):


    def update(self, n=1):
        """
        Create a TaskProgressModel instance to main and sub tasks
        to store the progress information.
        """
        self.total_progress += n

        now = timezone.now()
        ids = [self.task.id]
        objects = [
            TaskProgressModel.objects.get_or_create(
                task_id=self.task.id)[0]
            # TaskProgressModel(
            #     task_id=self.task.id,
            #     progress_count=n,
            #     create_dt=now
            # )
        ]

        if self.parent_task_id:
            # Store information for main task, too:
            ids.append(self.parent_task_id)
            
            if self.cumulate_w_parent_progress:
                objects.append(
                    TaskProgressModel.objects.get_or_create(
                    task_id=self.parent_task_id)[0]
                )

        # TaskProgressModel.objects.bulk_create(objects)
        for obj in objects:
            if obj.progress_count:
                obj.progress_count += n
            else:
                obj.progress_count = n
            obj.save()

        # Update the last change date times:
        TaskModel.objects.filter(task_id__in=ids).update(
            update_dt=now
        )

formacube · 2021-09-30T06:55:08Z

Can we move on with this PR (#44)?
It is also straightforward.

As I don't really understand what you would expect through your comment

Expand the README is fine... On the other hand, this has a general problem: it is not guaranteed that the code works, because it is untested. Actually, it is better to add example code and refer to it. Because they can be added to the tests.

Maybe we should enhance https://github.com/boxine/django-huey-monitor/blob/master/huey_monitor_tests/test_app/tasks.py and add more DocStrings there and make more links from README to it?

I cannot do anything further unless you give me more precise instructions in a not technical language that I can understand (see my answer from 1 month ago).

Kind regards

jedie · 2022-01-03T09:22:07Z

@formacube Maybe you want to share your opinion here: #57 That will fix this issues.

Remove the `TaskProgressModel` model that was used to store process information. But this has some disadvantages: * Every `process_info.update()` call creates one `TaskProgressModel` instance and this model was never cleaned. So if many items was processed, the table may be get full. So this PR fixed #46 * If many small `process_info.update()` calls happens, then we get a high database load Refactor that completely by using the Django cache to store progress information. If a task has been finished: Transfer the information from cache to database.

jedie · 2022-01-03T19:18:04Z

will be fixed by: #58

Remove the `TaskProgressModel` model that was used to store process information. But this has some disadvantages: * Every `process_info.update()` call creates one `TaskProgressModel` instance and this model was never cleaned. So if many items was processed, the table may be get full. So this PR fixed #46 * If many small `process_info.update()` calls happens, then we get a high database load Refactor that completely by using the Django cache to store progress information. If a task has been finished: Transfer the information from cache to database.

Skrattoune · 2022-01-27T14:18:36Z

Hi Jedie,
We need to reopen this.
The fix for this issue is not yet been released
I have submitted a PR #67 while we are waiting for PR #58 to be ready and availlable

jedie · 2022-01-28T07:21:28Z

#58 is on my TODO, but it will take some time to finish this PR. Hopefully next week...

I will merge your idea from #67

Don't create a new TaskProgressModel instances for every `ProcessInfo.update()` call and increment a existing TaskProgressModel instances. So we will not flood the database ;) Based on #67

Fix #46 By increment existing TaskProgressModel instances

jedie · 2022-01-28T08:30:18Z

fix releases as v0.4.5 -> https://pypi.org/project/django-huey-monitor/0.4.5/

Remove the `TaskProgressModel` model that was used to store process information. But this has some disadvantages: * Every `process_info.update()` call creates one `TaskProgressModel` instance and this model was never cleaned. So if many items was processed, the table may be get full. So this PR fixed #46 * If many small `process_info.update()` calls happens, then we get a high database load Refactor that completely by using the Django cache to store progress information. If a task has been finished: Transfer the information from cache to database.

jedie mentioned this issue Jan 3, 2022

Use cache backend to store tqdm process information #58

Closed

2 tasks

jedie closed this as completed Jan 3, 2022

Skrattoune added a commit to Skrattoune/django-huey-monitor that referenced this issue Jan 27, 2022

boxine#46 Prevent ProcessInfo.update to overflow DB

1ad7fb8

Skrattoune mentioned this issue Jan 27, 2022

#46 Prevent ProcessInfo.update to overflow DB #67

Closed

phihag added a commit that referenced this issue Jan 28, 2022

Merge pull request #68 from boxine/merge#67

c73031c

Fix #46 By increment existing TaskProgressModel instances

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

django.db.utils.OperationalError: (1114, "The table 'huey_monitor_taskprogressmodel' is full") #46

django.db.utils.OperationalError: (1114, "The table 'huey_monitor_taskprogressmodel' is full") #46

formacube commented Aug 21, 2021

jedie commented Aug 25, 2021

formacube commented Aug 25, 2021

formacube commented Sep 30, 2021

jedie commented Jan 3, 2022

jedie commented Jan 3, 2022

Skrattoune commented Jan 27, 2022

jedie commented Jan 28, 2022

jedie commented Jan 28, 2022

django.db.utils.OperationalError: (1114, "The table 'huey_monitor_taskprogressmodel' is full") #46

django.db.utils.OperationalError: (1114, "The table 'huey_monitor_taskprogressmodel' is full") #46

Comments

formacube commented Aug 21, 2021

jedie commented Aug 25, 2021

formacube commented Aug 25, 2021

formacube commented Sep 30, 2021

jedie commented Jan 3, 2022

jedie commented Jan 3, 2022

Skrattoune commented Jan 27, 2022

jedie commented Jan 28, 2022

jedie commented Jan 28, 2022