feat: opt-in Swarm.ResourceMgr (go-libp2p v0.18)#8680
|
Do you have plans to publish metrics when a RM req is rejected due to exceeded scope limits? That would be very useful operationally to help know when to scale gateway fleets. |
|
The resource manager has a tracer integration (see https://github.com/libp2p/go-libp2p-resource-manager/blob/master/trace.go), so it would be possible to hook this up with Prometheus in one way or the other. |
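As a rough illustration of the kind of metric being asked for, the sketch below counts rejected reservations per scope and resource kind. It is only a sketch under stated assumptions: `blockedCounter`, `observeBlocked`, and the scope/resource labels are hypothetical names, it keeps counts in memory instead of exporting them to Prometheus, and it does not reproduce the actual tracer interface in go-libp2p-resource-manager.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// blockedCounter is a hypothetical in-memory metrics sink. A real integration
// would export these counts through a Prometheus collector instead.
type blockedCounter struct {
	mu     sync.Mutex
	counts map[string]uint64
}

func newBlockedCounter() *blockedCounter {
	return &blockedCounter{counts: make(map[string]uint64)}
}

// observeBlocked records one rejected reservation for a scope and resource kind.
func (c *blockedCounter) observeBlocked(scope, resource string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[scope+"/"+resource]++
}

// count returns how many rejections were recorded for a scope and resource kind.
func (c *blockedCounter) count(scope, resource string) uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[scope+"/"+resource]
}

// report returns one "label count" line per label, sorted for stable output.
func (c *blockedCounter) report() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	lines := make([]string, 0, len(c.counts))
	for label, n := range c.counts {
		lines = append(lines, fmt.Sprintf("%s %d", label, n))
	}
	sort.Strings(lines)
	return lines
}

func main() {
	c := newBlockedCounter()
	c.observeBlocked("system", "memory")
	c.observeBlocked("system", "memory")
	c.observeBlocked("transient", "stream")
	for _, line := range c.report() {
		fmt.Println(line)
	}
}
```

Keying the counter by scope and resource kind is what would let an operator tell "the system scope ran out of memory" apart from "one peer opened too many streams" when deciding whether to scale a fleet.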
|
Travis asked for the same thing, so I will add metrics support to the
resource manager.
|
|
"exchange files" → "cat file" interop test always gets stuck after Quick way to reproduce/debug is to run specific test with: Given this PR adds resource manager, I guess some limit is hit? Update: set |
Revealed connection being dropped by go-libp2p-resource-manager: |
There's no connection being dropped. The resource manager is blocking a memory reservation with priority 128, which is probably a yamux stream window increase: https://github.com/libp2p/go-yamux/blob/cd22a37b789cf6e1fe4f2c2cff6da5df7a49dfac/stream.go#L229. This won't cause any problems. Do we have logs from the other node? |
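To make the "prio 128" remark concrete, here is a small model of a priority-gated memory reservation. It assumes the rule that a reservation is refused when it would push usage above `(1+prio)/256` of the scope limit, which is how I understand the resource manager's documented behavior; `canReserve` and the numbers in `main` are hypothetical and not taken from the logs in this thread.

```go
package main

import "fmt"

// canReserve models a priority-gated reservation (an assumption, see above):
// the request is allowed only if the resulting usage stays at or below
// (1+prio)/256 of the scope's memory limit.
func canReserve(used, size, limit int64, prio uint8) bool {
	threshold := (1 + int64(prio)) * limit / 256
	return used+size <= threshold
}

func main() {
	const limit = 256 << 20  // hypothetical 256 MiB scope limit
	used := int64(120 << 20) // memory already reserved in the scope
	grow := int64(16 << 20)  // requested window increase

	// Priority 128 may only use about half of the limit, so this is refused.
	fmt.Println(canReserve(used, grow, limit, 128))
	// Priority 255 may use the whole limit, so the same request is allowed.
	fmt.Println(canReserve(used, grow, limit, 255))
}
```

Under this model a refused priority-128 request leaves existing reservations untouched, which fits the point above: the window simply does not grow, and no connection is dropped.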
|
These logs I decided to remove 👉 With this setup you can tweak

Node A (source)

One terminal:
$ export IPFS_PATH=/tmp/node-a
$ ipfs daemon --init --init-profile server,randomports,test

Second terminal:
$ ipfs id | jq '.Addresses[0]'
{addrA}
$ ipfs swarm connect {addrB}
$ head -c 8M /dev/urandom | ipfs add -Q
{cid-OK}
$ head -c 65M /dev/urandom | ipfs add -Q
{cid-FAIL}

Node B (destination)

One terminal:
$ export IPFS_PATH=/tmp/node-b
$ ipfs daemon --init --init-profile server,randomports,test

Second terminal:
$ ipfs id | jq '.Addresses[0]'
{addrB}
$ ipfs swarm connect {addrA}
$ ipfs cat {cid-OK} > /tmp/8M # finished instantly
$ ipfs cat {cid-FAIL} > /tmp/65M # stops in the middle (around 20-50%)
30.50 MiB / 65.00 MiB [==================================>--------------------------------------] 46.92% |
|
@marten-seemann @lidel I'm not totally sure, but am wondering if we're running into a combination of go-bitswap not gracefully handling the lack of streams and go-bitswap using more streams than it should (ipfs/boxo#80). |
|
Or yamux is not properly handling the other side's refusal to increase the
window; that is also a possibility.
|
|
2022-03-03 conversation:
|
* add metrics for the resource manager
* export protocol and service name in Prometheus metrics
* fix: expose rcmgr metrics only when enabled
Co-authored-by: Marcin Rataj <[email protected]>
This includes CI fix for go-ipfs-http-client
This file defines implicit limit defaults used when Swarm.ResourceMgr.Enabled is true. We keep a vendored copy to ensure go-ipfs is not impacted when go-libp2p decides to change defaults in any of the future releases.
Cleans up the way we initialize defaults and adds a fix for the case when the connection manager runs with high limits. It also hides `Swarm.ResourceMgr.Limits` until we have a better understanding of what syntax makes sense.
|
Applied changes based on feedback from stewards sync:
Before this is merged, we should make a decision if we are ok with
|
* update go-libp2p to v0.18.0
* initialize the resource manager
* add resource manager stats/limit commands
* load limit file when building resource manager
* log absent limit file
* write rcmgr to file when IPFS_DEBUG_RCMGR is set
* fix: mark swarm limit|stats as experimental
* feat(cfg): opt-in Swarm.ResourceMgr
This ensures we can safely test the resource manager without impacting
default behavior.
- Resource manager is disabled by default
- Default for Swarm.ResourceMgr.Enabled is false for now
- Swarm.ResourceMgr.Limits allows user to tweak limits per specific
scope in a way that is persisted across restarts
- 'ipfs swarm limit system' outputs human-readable json
- 'ipfs swarm limit system new-limits.json' sets new runtime limits
(but does not change Swarm.ResourceMgr.Limits in the config)
Conventions to make libp2p devs' life easier:
- 'IPFS_RCMGR=1 ipfs daemon' overrides the config and enables resource manager
- 'limit.json' overrides implicit defaults from libp2p (if present)
* docs(config): small tweaks
* fix: skip libp2p.ResourceManager if disabled
This ensures 'ipfs swarm limit|stats' work only when enabled.
* fix: use NullResourceManager when disabled
This reverts commit b19f7c9eca4cee4187f8cba3389dc2c930258512.
after clarification feedback from
ipfs/kubo#8680 (comment)
* style: rename IPFS_RCMGR to LIBP2P_RCMGR
preexisting libp2p toggles use LIBP2P_ prefix
* test: Swarm.ResourceMgr
* fix: location of opt-in limit.json and rcmgr.json.gz
Places these files inside of IPFS_PATH
* Update docs/config.md
* feat: expose rcmgr metrics when enabled (#8785)
* add metrics for the resource manager
* export protocol and service name in Prometheus metrics
* fix: expose rcmgr metrics only when enabled
Co-authored-by: Marcin Rataj <[email protected]>
* refactor: rcmgr_metrics.go
* refactor: rcmgr_defaults.go
This file defines implicit limit defaults used when Swarm.ResourceMgr.Enabled
is true. We keep a vendored copy to ensure go-ipfs is not impacted when
go-libp2p decides to change defaults in any of the future releases.
* refactor: adjustedDefaultLimits
Cleans up the way we initialize defaults and adds a fix for the case
when the connection manager runs with high limits.
It also hides `Swarm.ResourceMgr.Limits` until we have a better
understanding of what syntax makes sense.
* chore: cleanup after a review
* fix: restore go-ipld-prime v0.14.2
* fix: restore go-ds-flatfs v0.5.1
Co-authored-by: Lucas Molas <[email protected]>
Co-authored-by: Marcin Rataj <[email protected]>
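The opt-in rules in the commit message above (disabled by default, a config flag, and an environment variable that overrides the config) can be sketched roughly as follows. This is an illustration, not the code from this PR: `rcmgrEnabled` is a hypothetical helper, the environment value is passed in as a plain string (a caller would read it with `os.Getenv("LIBP2P_RCMGR")`), and treating `0`/`false` as "force off" is an assumption, since the commit message only says that `1` enables the resource manager.

```go
package main

import "fmt"

// rcmgrEnabled decides whether a real resource manager should be built.
// cfgEnabled is nil when Swarm.ResourceMgr.Enabled is not set in the config.
// envValue is the value of the LIBP2P_RCMGR environment variable ("" if unset).
func rcmgrEnabled(cfgEnabled *bool, envValue string) bool {
	enabled := false // the resource manager is disabled by default
	if cfgEnabled != nil {
		enabled = *cfgEnabled
	}
	// The environment variable overrides the config when present.
	switch envValue {
	case "1", "true":
		enabled = true
	case "0", "false": // assumption: an explicit off switch
		enabled = false
	}
	return enabled
}

func main() {
	on := true
	fmt.Println(rcmgrEnabled(nil, ""))  // nothing set: stays disabled
	fmt.Println(rcmgrEnabled(&on, ""))  // enabled via config
	fmt.Println(rcmgrEnabled(nil, "1")) // enabled via environment
	fmt.Println(rcmgrEnabled(&on, "0")) // environment overrides config
}
```

When the result is false, the commit message indicates that a NullResourceManager is used, so `ipfs swarm limit|stats` only work when the feature is enabled.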
This commit was moved from ipfs/kubo@514411b
Part of #8761
- ipfs swarm limit --help
- ipfs swarm stats --help
- Swarm.ResourceMgr
- Swarm.ResourceMgr.Enabled is a flag, disabled by default
Closes #8722
Closes #1482 🧙‍♂️