Remove ratelimits for migrating repositories #896

Open
opened 2023-01-16 01:13:20 +01:00 by thatonecalculator · 19 comments
https://codeberg.org/forgejo/forgejo/issues/248#issuecomment-772890
Member

I think you can use Forgejo's dump-repo command to export archives of the units individually. I have not tested if importing them individually works, however.

Member

So, I got it working, but you cannot just import each archive individually. Unfortunately, the dump-repo command always clones the git repository, even though cloning is by far the easiest part to do externally, so perhaps the command could be adapted to skip that step. Other than that, exporting the units individually gives you import archives that are each good for a fresh repo containing that one unit. Merging these archives is fairly easy: combine the respective yaml files from the dumps and adjust the repo.yml file to reflect which units are present in the merged dump. I also had to use the git dump from the pull requests archive specifically; the clone from the issues archive didn't contain the refs for the pull request for some reason.

You can see what the repository looks like when it's all reassembled and imported here: https://forgejo.crystal.vg:3000/crystal/test-import

This was done by using the method described above to export my test-forgejo-dump-repo repository.

Owner

Rate limits should not apply on our side for incoming migrations; they are imposed per instance by external providers (e.g. GitHub rate-limits our IP because so many pushes, pulls, migrations, API requests etc. originate from our side).

I'll have a look to see why it fails.

Owner

@thatonecalculator I don't see any error messages that are related to this migration. Can you please share the exact timestamp if available, maybe for the next migration?

Owner

Ah sorry, I think I only now understand your issue. You want to migrate the data out of Codeberg, right?

Hmm, from a computing perspective it's obvious that, given our load issues, there is no way to guarantee that migrations to elsewhere are performant. But please retry now; the performance should be much better.

Regarding rate limits, there is indeed a limit. Any software consuming the API should just wait and retry; this should also be true for Forgejo (and I think it does, at least it seems to do so for GitHub / GitLab).

If the issue persists, please let me know and we'll work something out.
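
To illustrate the "wait and retry" behaviour, here is a minimal Python sketch. It assumes the rate limiter answers with HTTP 429 and possibly a `Retry-After` header given in seconds; neither is confirmed for Codeberg in this thread. The `fetch_with_retry` helper and the `(status, headers, body)` response shape are hypothetical names for the example, not Forgejo's actual migration code.

```python
import time
from typing import Callable, Mapping, Tuple

# A response is modelled as (status_code, headers, body). This shape is an
# assumption made for the sketch, not any particular HTTP library's API.
Response = Tuple[int, Mapping[str, str], bytes]


def fetch_with_retry(
    fetch: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call fetch() until it stops returning HTTP 429 or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = fetch()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts - 1:
            break  # no point in sleeping after the final attempt
        # Prefer the server's Retry-After hint (whole seconds) when present,
        # otherwise back off exponentially: base_delay, 2x, 4x, ...
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)
        else:
            delay = base_delay * 2 ** attempt
        sleep(delay)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

The `sleep` parameter is injectable so the waiting logic can be exercised without actually pausing; a real client would pass a `fetch` callable that performs the HTTP request.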


Still persists, just tried again ~10 minutes ago.

Owner

It would be helpful to receive an exact log that shows how rate limiting kicks in, or whether rate limiting kicks in at all or there is a different kind of issue.

Owner

Please see forgejo/forgejo#398 - the migration is ongoing, but the number of API calls is just insane.

This migration is currently causing 54.5% of all requests to Codeberg. The idea of rate limiting is obviously to prevent a single actor from creating so much load. I'm very sorry that you ran into this, but I hope we can work something out to let you move out more seamlessly.

Thank you for your patience.

Owner

@Gusted @crystal Do either of you happen to recall whether (and how) it is possible to dump repo archives via some admin command? I think I read about this. It's obviously not suited for day-to-day use as it doesn't allow self-service, but it could be an option to help this project move, given that they know the operator of their target instance.

Owner

Sorry for the ping, I just read it was mentioned before 🙈. If the migration doesn't complete next time I'm online, I'll try to dig through the docs.

Owner

gitea dump-repo would be the command, but I'm pretty sure that still uses the API...

Member

Yes, as mentioned in my previous comment on this issue (https://codeberg.org/Codeberg/Community/issues/896#issuecomment-772926), it is possible to individually export each unit from a repo using the API with the dump-repo command, but the git repository will be cloned each time. It is also possible to reassemble the various dumps into a single dump that can be imported, but it requires a lot of delicate effort.

EDIT: If the dump-repo command targeted the API on localhost from the container running Forgejo, perhaps it could bypass the rate limits and allow the entire repository to be exported by Codeberg administrators for this project. The rate limit is imposed by HAProxy, right?

Owner

The currently running migration (10+ hours or so) is also bypassing the API rate limits and working locally (cross-LXC-container). So that doesn't make a difference. Unfortunate.

Owner

@thatonecalculator Overnight, the migration has hung our staging instance 😞. My ideas for how to work around this prior to optimizing migrations in the software are exhausted now. We can try the dump-repo thing, but if it just uses the API, chances aren't high that it helps ...


Since I really only need open issues, I wonder if there would be a way to do a migration with just the 300 or so open ones instead of the 5000+ closed ones...
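
For illustration, a rough Python sketch of what fetching only the open issues could look like. As far as I know, the Gitea/Forgejo issue listing endpoint accepts `state`, `type`, `page` and `limit` query parameters, but treat that as an assumption to check against the API docs of the instance. The helper names (`open_issues_url`, `iter_open_issues`, `http_get_json`) are made up for this example, and authentication is left out.

```python
import json
from typing import Callable, Iterator
from urllib.parse import urlencode
from urllib.request import urlopen


def open_issues_url(base: str, owner: str, repo: str, page: int, limit: int = 50) -> str:
    """Build the listing URL for one page of open issues (pull requests excluded)."""
    query = urlencode({"state": "open", "type": "issues", "page": page, "limit": limit})
    return f"{base}/api/v1/repos/{owner}/{repo}/issues?{query}"


def http_get_json(url: str) -> list:
    """Default fetcher: unauthenticated GET that decodes a JSON array."""
    with urlopen(url) as resp:
        return json.load(resp)


def iter_open_issues(
    base: str,
    owner: str,
    repo: str,
    get_json: Callable[[str], list] = http_get_json,
    limit: int = 50,
) -> Iterator[dict]:
    """Yield open issues page by page until the server returns an empty page."""
    page = 1
    while True:
        batch = get_json(open_issues_url(base, owner, repo, page, limit))
        if not batch:
            return
        yield from batch
        page += 1
```

The loop stops on an empty page rather than on a short one, because a server may cap the page size below the requested `limit`. With roughly 300 open issues and 50 per page, the listing itself would take about seven requests; comments and reactions would still need their own requests.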

Owner

I fear this is out of scope for Codeberg, but in theory you could provide a patch for either the migration code or the dump-repo part that filters for this (could be easy). It would be no big deal to deploy the custom binary to our staging instance and run the migration there.

Otherwise, my main problem is that the dump-repo command does not display any output. It's running, but it's hard to know what it is doing. Can the verbosity be toggled? I had no luck yet.

Owner

The migration failed in the end, apparently due to some availability issue (could have been an update):

...migrations/common.go:19:WarnAndNotice() [W] Unable to load comment reactions during migrating issue #8251 for comment 574152 in migration from gitea server http://localhost:3000 calckey/calckey. Error: Get "http://localhost:3000/api/v1/repos/calckey/calckey/issues/comments/574152/reactions": dial tcp [::1]:3000: connect: connection refused                                                                                                                
...migrations/common.go:19:WarnAndNotice() [W] Unable to load comment reactions during migrating issue #8251 for comment 574153 in migration from gitea server http://localhost:3000 calckey/calckey. Error: Get "http://localhost:3000/api/v1/repos/calckey/calckey/issues/comments/574153/reactions": dial tcp [::1]:3000: connect: connection refused                                                                                                                
...migrations/common.go:19:WarnAndNotice() [W] Unable to load comment reactions during migrating issue #8251 for comment 574154 in migration from gitea server http://localhost:3000 calckey/calckey. Error: Get "http://localhost:3000/api/v1/repos/calckey/calckey/issues/comments/574154/reactions": dial tcp [::1]:3000: connect: connection refused                                                                                                                
cmd/dump_repo.go:185:runDumpRepository() [F] Failed to dump repository: error while listing comments for issue #8251. Error: Get "http://localhost:3000/api/v1/repos/calckey/calckey/issues/8251/comments?limit=50&page=14": read tcp [::1]:56900->[::1]:3000: read: connection reset by peer
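
Judging only from the log above, the dump lists comments in pages (`limit=50`) and then requests reactions once per comment. A back-of-the-envelope Python sketch of how that adds up, under those assumptions; the `estimate_requests` helper and the comment counts below are hypothetical, and other request types (labels, milestones, issue reactions, pull requests) are ignored, so the real total would be higher.

```python
import math
from typing import Iterable


def estimate_requests(comment_counts: Iterable[int], page_size: int = 50) -> int:
    """Rough request count for comments: paged listings plus one reactions call each."""
    total = 0
    for n in comment_counts:
        # Assume at least one listing request per issue, even with no comments.
        total += max(1, math.ceil(n / page_size))
        # Assume one reactions request per comment, as the log lines suggest.
        total += n
    return total


# Hypothetical example: three issues with 0, 10 and 120 comments.
print(estimate_requests([0, 10, 120]))  # → 135
```

The per-comment term dominates, which is why a repository with thousands of issues can generate so many API calls even when each individual request is cheap.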

Any update at all?

Owner

Cross-linking: forgejo/forgejo#398
