@DmitryFrolovTri
Contributor

@DmitryFrolovTri DmitryFrolovTri commented Nov 15, 2022

The goal of this PR is to define a repository size limit using a "no-worsen" approach (allow operations that do not increase the size on disk, or that eventually decrease it). I think org- and user-level restrictions could come later.
Thanks to @sapk as this is a continuation of his work that was started in #7833
Would address (except for LFS): #3658

**Screenshots:**

Global limit (screenshot: admin_screen)

Individual repo size limit for admin (screenshot: repo_options_screen)

app.ini with settings:

;; Specify a global repository size limit in bytes to apply for each repository. -1 - Disabled, 0 - Enabled with no limit
;; If a repository has its own limit set in the repository settings UI it will override the global setting
;; Standard units of measurements for size can be used like B, KB, KiB, ... , EB, EiB, etc or if not provided - bytes
;; This is experimental and subject to change
;REPO_SIZE_LIMIT = -1

;; Specify a global LFS size limit in bytes to apply for each repository. -1 - Disabled, 0 - Enabled with no limit
;; If a repository has its own limit set in the repository settings UI it will override the global setting
;; Standard units of measurements for size can be used like B, KB, KiB, ... , EB, EiB, etc or if not provided - bytes
;; This is experimental and subject to change
;LFS_SIZE_LIMIT = -1
;;
;; If true, LFS size will be included in the repository size calculation.
;; Which means even if LFS_SIZE_LIMIT is not set (-1) pushes will be rejected if LFS size + repository size exceeds REPO_SIZE_LIMIT
;; This is experimental and subject to change
;LFS_SIZE_IN_REPO_SIZE = false
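
As a rough illustration of how values such as `-1`, `0`, `500` or `10 MiB` from the settings above could be turned into bytes, here is a minimal sketch. The function name `parseSizeLimit` and the unit table are hypothetical and cover only a subset of the units mentioned; this is not the PR's `GetFileSize` implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// unitFactors is a hypothetical subset of the accepted units (decimal and binary).
var unitFactors = map[string]int64{
	"B":  1,
	"KB": 1000, "MB": 1000 * 1000, "GB": 1000 * 1000 * 1000,
	"KIB": 1 << 10, "MIB": 1 << 20, "GIB": 1 << 30,
}

// parseSizeLimit converts a setting value into bytes.
// "-1" is passed through unchanged because the config uses it as "disabled".
func parseSizeLimit(s string) (int64, error) {
	s = strings.TrimSpace(s)
	if s == "-1" {
		return -1, nil
	}
	// Split the leading number from an optional unit suffix.
	i := strings.IndexFunc(s, func(r rune) bool {
		return (r < '0' || r > '9') && r != '.'
	})
	num, unit := s, "B" // no suffix means plain bytes
	if i >= 0 {
		num, unit = s[:i], strings.ToUpper(strings.TrimSpace(s[i:]))
	}
	factor, ok := unitFactors[unit]
	if !ok {
		return 0, fmt.Errorf("unknown size unit %q", unit)
	}
	n, err := strconv.ParseFloat(num, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid size %q: %w", s, err)
	}
	return int64(n * float64(factor)), nil
}

func main() {
	for _, v := range []string{"-1", "0", "500", "10 MiB", "1.5KB"} {
		n, err := parseSizeLimit(v)
		fmt.Println(v, "->", n, err)
	}
}
```

Keeping `-1` as a special case in the parser mirrors the config comments, although the review discussion further down questions whether that belongs in a utility function.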

How a push is currently rejected when the limit is reached (screenshot: web_reject)

TODO: (1)

  • Calculate push size (sapk still had some corner cases to test, mostly to avoid blocking deletions and some force pushes)
  • Edit max repo size
  • Enforce repo size
  • Fix tests
  • Add the ability to turn the feature on or off in the config (repository section), default FALSE: a boolean parameter that enables size checking. If it is enabled, the size limit is enforced; there is a field to edit the size limit in the repo settings UI, and that limit is taken into account. If it is disabled, the repo size limit is ignored. We can leave the size limit field in the repo settings UI and allow editing it; however, the size limit value should be grayed out (to show that it is not in effect).
    The config parameter: ENABLE_SIZE_LIMIT = true/false
  • Add the ability to have a global repo limit per installation that is enforced unless an individual repo size limit is present (repository section in config). DEFAULT: 0. This is the global default size limit for any repo where an individual size limit is not defined: if 0, there is no limit; if >0, the limit is set, so a repo with a 0/undefined size limit uses this configuration limit instead. Config parameter: REPO_SIZE_LIMIT = XXXX (should accept bytes, K, M, and G, maybe T; e.g. 1000 = 1000 bytes, 1K = 1 kilobyte, 1M = 1 megabyte).
  • UI to enable/disable and set the global repository limit (Note: the setting gets reset on every Gitea restart to what it was in the config)
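
The precedence described in this TODO list (feature switch, then per-repo limit, then global default) could be sketched roughly as follows. The function `effectiveLimit` and its parameters are hypothetical names for illustration; it assumes that a per-repo value of 0 means "not set", as the list suggests.

```go
package main

import "fmt"

// effectiveLimit returns the limit in bytes that applies to one repository
// and whether any limit is enforced at all.
//   enabled: the ENABLE_SIZE_LIMIT switch from the list above
//   global:  REPO_SIZE_LIMIT in bytes, 0 meaning "no global limit"
//   repo:    the per-repository limit, 0 meaning "not set" (assumption)
func effectiveLimit(enabled bool, global, repo int64) (limit int64, enforced bool) {
	if !enabled {
		return 0, false // feature off: per-repo values are ignored
	}
	if repo > 0 {
		return repo, true // an individual limit overrides the global one
	}
	if global > 0 {
		return global, true // fall back to the installation-wide default
	}
	return 0, false // enabled, but nothing to enforce
}

func main() {
	fmt.Println(effectiveLimit(true, 1<<30, 0))     // global default applies
	fmt.Println(effectiveLimit(true, 1<<30, 1<<20)) // per-repo override applies
	fmt.Println(effectiveLimit(false, 1<<30, 1<<20))
}
```

The app.ini excerpt in this description appears to fold the on/off switch into the value `-1` instead of a separate boolean; the same precedence would apply with `global < 0` taking the place of `!enabled`.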

TOFIX: (2)

  • UI operations trigger a 500 error when the repo is over the limit; present a correct message instead
Deleting a file from the UI triggers a 500 when the repo is over the limit. -> TODO: catch this specific error.
2019/08/16 05:23:58 ...uters/repo/editor.go:432:DeleteFilePost() [E] DeleteRepoFile: git push: remote: Gitea: new repo size is over limitation 10000
To /home/sapk/go/src/code.gitea.io/gitea/data/repositories/sapk/test.git
 ! [remote rejected] d9629b41f9c58da756cf806aabf5811b1ff45b50 -> master (pre-receive hook declined)
error: failed to push some refs to '/home/sapk/go/src/code.gitea.io/gitea/data/repositories/sapk/test.git'
Creating a branch from the UI triggers a 500 when the repo is over the limit. -> TODO: catch this specific error.
2019/08/16 05:28:42 ...uters/repo/branch.go:287:CreateBranch() [E] CreateNewBranch: Push: exit status 1 - remote: Gitea: new repo size is over limitation 10000
To /home/sapk/go/src/code.gitea.io/gitea/data/repositories/sapk/test.git
 ! [remote rejected] test -> test (pre-receive hook declined)
error: failed to push some refs to '/home/sapk/go/src/code.gitea.io/gitea/data/repositories/sapk/test.git'

Note: If size control is enabled and a push triggers a size check (its size would breach the limit), the push is accepted only if the total size of unreferenced objects (removed size) is greater than or equal to the total size of the objects newly added by the push. Since Git controls when unreferenced objects are purged, and this is not fast, the condition can last for a while. An instance administrator can speed it up with the following steps:

  1. From the data folder of the specific repository on the Gitea server, <path_to_gitea_server_folder>/data/gitea-repositories/<user>/<repository>, run:
git reflog expire --expire-unreachable=all --all
git gc --prune=now
  2. Execute Git GC from the UI.
    This frees all unreferenced objects and updates the repository size in the UI. On the next push (if the push size is smaller than the limit), adding new objects will be allowed.
  3. LFS is more complicated: I raised a request ("The LFS maintenance scripts after orphaned objects removal do not update repo.LFSSize after operations", #36169) so that repo.LFSSize is actually updated after LFS garbage collection / LFS storage doctor.
  • Review the need to test doDeleteCommitAndPush type of test - Not Needed - Reverted
  • Prevent the upload as well if it would breach the size limit on the repo (server.go). Here it can't be "no-worsen": it fails if the new size is over the limit:
$ git lfs push origin main --all
batch response: LFS size 1.7 GiB would exceed limit 1.0 MiB
  • Enforce repo size with LFS added
  • Adequate error messages to the user on LFS operations (if they fail due to size)
  • Add tests for LFS sizes
  • Add an LFS-specific repo size limit configuration option and a UI to edit it
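
The two acceptance rules described above (the "no-worsen" rule for Git pushes and the strict rule for LFS uploads) could be sketched like this. Both function names are hypothetical, and the sketch assumes "would breach the limit" means current size plus added size exceeds the limit; the PR's actual check may differ.

```go
package main

import "fmt"

// pushAllowed sketches the "no-worsen" rule for a Git push. All sizes are in bytes.
// A limit <= 0 is treated as "no limit" (assumption).
func pushAllowed(current, added, removed, limit int64) bool {
	if limit <= 0 || current+added <= limit {
		return true // the push does not breach the limit
	}
	// Over the limit: accept only if the push frees at least as much as it adds.
	return removed >= added
}

// lfsUploadAllowed sketches the strict LFS rule: no "no-worsen" exception,
// the upload fails whenever the new total would exceed the limit.
func lfsUploadAllowed(currentLFS, upload, limit int64) bool {
	return limit <= 0 || currentLFS+upload <= limit
}

func main() {
	const limit = 10_000
	fmt.Println(pushAllowed(9_000, 500, 0, limit))     // stays under the limit
	fmt.Println(pushAllowed(9_900, 500, 0, limit))     // grows past the limit
	fmt.Println(pushAllowed(12_000, 500, 2_000, limit)) // over, but shrinking
	fmt.Println(lfsUploadAllowed(9_900, 500, limit))
}
```

Because removed objects stay on disk until Git prunes them, a repository can remain over its limit for a while even after a shrinking push is accepted, which is what the maintenance steps in the note address.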

NEXT PR (4)

  • Prevent migrating (mirroring) repositories that would overflow the limit / extend the size checking logic into the code for repository mirrors. We shouldn't mirror if that would breach the limit at repo/user/org level
  • Develop a go-git variant for size calculation
  • Add a test to confirm that objects already existing in the store do not count as new in both server.go and hook_pre_receive.go
  • Not agreed yet: update Git GC logic to allow for faster space release

EVEN further PR (5)

  • Add a per user/organisation account size limit that can be set by an administrator (the global per-repo limit is used everywhere unless a per-repo limit is set; if an action would cross such an account limit, the action should be denied)
  • Implement an organization/user level size restriction
  • Add a hard limit on repository size that would cancel any operation (config option with percentage of overage, special message). No commit can be accepted in such a case. This is because the current limit is "soft", i.e. it does allow the used space to increase, although it already prevents a push that would move the repository over the limit.

Related: #3658 #7833

@techknowlogick techknowlogick mentioned this pull request Nov 15, 2022
5 tasks
@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Nov 15, 2022
@lafriks
Member

lafriks commented Nov 15, 2022

Does it count also LFS? (currently repo size is git repo size + LFS object size)

Imho we should finally split to have two repo sizes in db (git repo size and LFS object size) and both should have different limits settable.

@DmitryFrolovTri
Contributor Author

@lafriks it does count the LFS now and the repo limit applies to both, which I think is correct.

@KN4CK3R
Member

KN4CK3R commented Nov 15, 2022

LFS size is not counted. And LFS is tricky because we may end up with LFS files uploaded and the pointer push is denied because of the limit or (worse) the pointer push is allowed and the file upload fails afterwards because of the limit.

@DmitryFrolovTri
Contributor Author

Ok will look into this further. Thank you!

@lafriks
Member

lafriks commented Nov 15, 2022

Yes that's why my suggestion was for those to be different limits and splitting also how sizes are saved in database for repo

@lunny lunny added the type/enhancement An improvement of existing functionality label Nov 16, 2022
@DmitryFrolovTri
Contributor Author

Hi @techknowlogick @kdumontnu I have fixed the opencollective link to make it easy to donate. I've moved the funds I was able to gather 5$ - there :) Here is the new link for this activity: https://opencollective.com/oss-code-ge/projects/gitea-limit-repository-size

@morevnaproject

Hi @DmitryFrolovTri
We are interested in this feature and contributed to your campaign.
We will spread the word on our social media. Keep it up! 💪

@techknowlogick
Member

@DmitryFrolovTri I've heard back from OC, and because you are not using them as a fiscal sponsor that's why I couldn't make the transfer. So I will just mention it here that upon completion of this PR/issue we will pay out $500 from our collective for the bounty (this amount was chosen to limit tax burden to whoever gets paid out), and @sapk for your work so far, if you are interested, we can pay you for your work so far (reach out to me via email and I can get you sorted).

@DmitryFrolovTri
Contributor Author

@techknowlogick I've since then re-registered and the link above is with the opencollective (OC host) now.

@DmitryFrolovTri
Contributor Author

DmitryFrolovTri commented Dec 18, 2025

@lunny @techknowlogick @lafriks please take a look. I have updated the task with new screenshots and everything done so far. For now I have decided not to do a major refactor to adapt to Forgejo or similar logic, but we definitely need to go there. Potentially start with limits stored in another table and go from there. Open to suggestions, but knowing how long I sat on this, I would prefer to refactor later.

The feature works with the current Gitea.

Let's decide whether we are good to let it in as experimental or whether you want some changes. Please note that there are UI screens and that those variables are saved from the UI.

So I would love to ship this somehow so I can go on with other things. Please review.

Member

@silverwind silverwind left a comment

Let's get it in.

func AddSizeLimitOnRepo(x *xorm.Engine) error {
type Repository struct {
ID int64 `xorm:"pk autoincr"`
SizeLimit int64 `xorm:"NOT NULL DEFAULT 0"`

Member

More and more columns are being added to this table. Would it be better to have another table to store these limits?

Contributor

The size limit is related to the repository. It would make sense to make it a separate struct if it could be reused, eg. for the global limits. Otherwise what would be the point?

Member

I recommend a new table to store them.

Contributor

What is the reason for this recommendation, though?

The repository table has tens of columns by now, and all of them are applicable to all repositories.

It seems that some of these values are inherent and some calculated, e.g. the number of stars or the size need to be updated by some bookkeeping.

It might be useful to split out the calculated values, but the values added here are an inherent part of the repository configuration.

If this had been followed from the start and every couple of repository configuration options were in a separate table, how many tables would there be by now?

Member

As mentioned above:

  • The repository table has already accumulated an increasing number of columns. At the same time, we maintain several dedicated tables for configuration data—such as repo_unit, system_setting, and user_setting. The proposed column is not a core attribute of a repository; it exists purely as a constraint or limitation. Given this, there is no strong justification for further bloating the repository table.
  • Most repositories will inherit the global configuration without any overrides. Storing an additional column for every repository in this case would largely be wasted space.
  • Adding a new column to an existing, heavily used repository table is more time-consuming and risky than creating a new table during Gitea’s startup or migration process.
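
One way to picture the separate-table proposal is a sparse override table: only repositories with their own limit get a row, and every other repository inherits the global value. The sketch below is hypothetical (the struct name, its fields and the `limitFor` helper are not part of the PR); a map stands in for the database table.

```go
package main

import "fmt"

// RepoSizeLimit is a hypothetical row in a dedicated limits table.
// Only repositories that override the global setting would have a row.
type RepoSizeLimit struct {
	RepoID    int64 `xorm:"pk"`
	SizeLimit int64 `xorm:"NOT NULL DEFAULT 0"`
}

// limitFor returns the repository's own limit if an override row exists,
// otherwise the global limit. The map stands in for a table lookup.
func limitFor(overrides map[int64]RepoSizeLimit, repoID, global int64) int64 {
	if row, ok := overrides[repoID]; ok {
		return row.SizeLimit
	}
	return global
}

func main() {
	overrides := map[int64]RepoSizeLimit{
		42: {RepoID: 42, SizeLimit: 5 << 20},
	}
	fmt.Println(limitFor(overrides, 42, 1<<30)) // own limit
	fmt.Println(limitFor(overrides, 7, 1<<30))  // inherits the global limit
}
```

The trade-off raised in the thread still applies: this costs an extra lookup per size check, while a column on the repository table keeps the value next to the rest of the repository configuration.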


// FileSize calculates the file size and generate user-friendly string.
func FileSize(s int64) string {
if s == -1 {

Member

Why is -1 possible here, and why do we need to handle it?

Contributor Author

@DmitryFrolovTri DmitryFrolovTri Dec 19, 2025

@lunny It is used to display values in the UI. -1 is used in the config to represent a feature toggle. I was asked to remove the master switch from the config.

Member

Should this logic be placed outside of this package? Since this is a tool/utility package, it shouldn’t be coupled with business logic.

// Get FileSize bytes value from String.
func GetFileSize(s string) (int64, error) {
s = strings.TrimSpace(s)
if s == "-1" {

Member

The same as above

)

// CountObjects returns the results of git count-objects on the repoPath
func CountObjects(ctx context.Context, repoPath string) (*CountObject, error) {

Member

It's better to move all the functions to modules/gitrepo and used gitrepo.Repository instead of repoPath.
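
For context on what a function like `CountObjects` has to deal with, here is a minimal sketch that parses the text printed by `git count-objects -v`, where `size`, `size-pack` and `size-garbage` are reported in KiB. The type and function names are hypothetical and are not the PR's implementation; running Git and locating the repository are left out.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// objectCounts holds the size fields of `git count-objects -v`, all in KiB.
type objectCounts struct {
	SizeKiB        int64 // loose objects
	SizePackKiB    int64 // packfiles
	SizeGarbageKiB int64 // garbage files
}

// TotalBytes converts the summed KiB values to bytes.
func (c objectCounts) TotalBytes() int64 {
	return (c.SizeKiB + c.SizePackKiB + c.SizeGarbageKiB) * 1024
}

// parseCountObjects reads "key: value" lines and keeps the size fields.
// Other lines (count, in-pack, alternate paths, ...) are skipped.
func parseCountObjects(out string) (objectCounts, error) {
	var c objectCounts
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		key, val, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue
		}
		var dst *int64
		switch strings.TrimSpace(key) {
		case "size":
			dst = &c.SizeKiB
		case "size-pack":
			dst = &c.SizePackKiB
		case "size-garbage":
			dst = &c.SizeGarbageKiB
		default:
			continue
		}
		n, err := strconv.ParseInt(strings.TrimSpace(val), 10, 64)
		if err != nil {
			return c, fmt.Errorf("parse %s: %w", strings.TrimSpace(key), err)
		}
		*dst = n
	}
	return c, sc.Err()
}

func main() {
	sample := "count: 3\nsize: 12\nin-pack: 100\npacks: 1\nsize-pack: 2048\nprune-packable: 0\ngarbage: 0\nsize-garbage: 4\n"
	c, err := parseCountObjects(sample)
	fmt.Println(c.TotalBytes(), err)
}
```

Keeping the parser separate from the code that runs Git makes it testable without a repository, whichever package the function ends up in.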

return
}

err = setting.SaveGlobalRepositorySetting(repoSizeLimit, lfsSizeLimit, form.LFSSizeInRepoSize)

Member

I don't think this will be allowed. It will cause many problems. Many users use HA mode.

Contributor Author

@DmitryFrolovTri DmitryFrolovTri Dec 19, 2025

@lunny Please advise what your view is on this:

  1. Keep it as it is done currently, remove the save, and make this a runtime-only configuration
  2. Modify the UI in some way (which?), remove the save, and make this a runtime-only configuration
  3. Make this a read-only/display-only configuration.

Member

At the moment, there are two possible approaches in the codebase:

  • 1 Define it in the configuration file and do not allow it to be modified through the UI.
  • 2 Store it in the database and make it configurable via the UI.

@lunny
Member

lunny commented Dec 18, 2025

opencollective.com/oss-code-ge/projects/gitea-limit-repository-size

Thank you for bringing this up. I’ve been reviewing the PR, and unfortunately, from my perspective, there is still a fair amount of work needed to make this feature more robust.

@hramrach
Contributor

hramrach commented Dec 19, 2025

  • UI operations trigger 500 error when repo is over - present a correct message instead

That's correct, file deletion increases repo size.

Edit: I see, the problem is not that deletion is rejected but that the error message is not correct.


Labels

docs-update-needed The document needs to be updated synchronously lgtm/blocked A maintainer has reservations with the PR and thus it cannot be merged modifies/api This PR adds API routes or modifies them modifies/go Pull requests that update Go code modifies/migrations modifies/templates This PR modifies the template files modifies/translation type/docs This PR mainly updates/creates documentation type/feature Completely new functionality. Can only be merged if feature freeze is not active.
