High CPU on ATC due to lock acquisition from db #4200

@ebilling

Bug Report

We have seen higher-than-expected CPU usage on the Concourse web application for a long time and believe it is the same issue raised in #2346. After running a trace and reading carefully through the code, however, I believe the cause is that the lock manager has too few open connections to the DB when using TLS connections (see Additional Context for the trace, where about half of the sampled CPU time is spent in lib/pq.DialOpen, almost all of it verifying TLS certificates).

I tracked this down to the separate connection pool used for the lock: its setup in RunCommand.constructLockConn allows only 1 max idle and 1 max open connection. I am going to try raising the number of allowed open and idle connections to see whether this has a positive impact.
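
For illustration, here is a minimal sketch of what raising those limits looks like with Go's standard database/sql pool settings. This is not the actual constructLockConn code: applyLockPoolLimits, newStubDB, and stubConnector are hypothetical names, the stub stands in for a real Postgres connector so the example runs without a database, and the value 4 is only an example, not a recommendation.

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
)

// stubConnector is a hypothetical stand-in for a real Postgres connector
// (such as one backed by github.com/lib/pq). It never dials anything.
type stubConnector struct{}

func (stubConnector) Connect(context.Context) (driver.Conn, error) {
	return nil, errors.New("stub connector: no database available")
}

func (stubConnector) Driver() driver.Driver { return stubDriver{} }

type stubDriver struct{}

func (stubDriver) Open(string) (driver.Conn, error) {
	return nil, errors.New("stub driver: no database available")
}

// newStubDB returns a pool handle; sql.OpenDB does not connect eagerly.
func newStubDB() *sql.DB { return sql.OpenDB(stubConnector{}) }

// applyLockPoolLimits is a hypothetical helper that sets the pool limits
// the issue describes as being fixed at 1 for the lock connection.
func applyLockPoolLimits(db *sql.DB, maxOpen, maxIdle int) {
	db.SetMaxOpenConns(maxOpen)
	db.SetMaxIdleConns(maxIdle)
}

func main() {
	db := newStubDB()
	defer db.Close()

	// The limits described in the issue: one open, one idle connection.
	applyLockPoolLimits(db, 1, 1)
	fmt.Println("max open connections:", db.Stats().MaxOpenConnections)

	// An illustrative higher limit to experiment with.
	applyLockPoolLimits(db, 4, 4)
	fmt.Println("max open connections:", db.Stats().MaxOpenConnections)
}
```

Keeping the idle limit equal to the open limit lets the pool hold on to connections it has already established, which matters here because each new TLS connection pays for certificate verification again.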

Additional Context

(pprof) top 20
Showing nodes accounting for 0, 0% of 24s total
Dropped 666 nodes (cum <= 0.12s)
Showing top 20 nodes out of 298
flat flat% sum% cum cum%
0 0% 0% 17.01s 70.88% github.com/tedsuo/ifrit.(*process).run
0 0% 0% 14.51s 60.46% github.com/concourse/concourse/atc/scheduler.(*Runner).Run
0 0% 0% 14.49s 60.38% github.com/concourse/concourse/atc/scheduler.(*Runner).tick
0 0% 0% 12.22s 50.92% github.com/cenkalti/backoff.Retry
0 0% 0% 12.22s 50.92% github.com/cenkalti/backoff.RetryNotify
0 0% 0% 12.21s 50.88% database/sql.(*DB).conn
0 0% 0% 12.20s 50.83% database/sql.dsnConnector.Connect
0 0% 0% 12.20s 50.83% github.com/concourse/concourse/atc/db.(*connectionRetryingDriver).Open
0 0% 0% 12.20s 50.83% github.com/concourse/concourse/atc/db.(*connectionRetryingDriver).Open.func1
0 0% 0% 12.20s 50.83% github.com/lib/pq.DialOpen
0 0% 0% 12.17s 50.71% github.com/lib/pq.(*conn).ssl
0 0% 0% 12.07s 50.29% github.com/lib/pq.ssl.func1
0 0% 0% 12.07s 50.29% github.com/lib/pq.sslVerifyCertificateAuthority
0 0% 0% 10.30s 42.92% github.com/concourse/concourse/atc/db.(*db).Exec
0 0% 0% 10.30s 42.92% github.com/concourse/concourse/atc/db.(*pipeline).AcquireSchedulingLock
0 0% 0% 10.30s 42.92% github.com/concourse/concourse/atc/metric.(*countingConn).Exec
0 0% 0% 10.25s 42.71% database/sql.(*DB).Exec
0 0% 0% 10.25s 42.71% database/sql.(*DB).ExecContext
0 0% 0% 10.25s 42.71% database/sql.(*DB).exec
0 0% 0% 8.47s 35.29% crypto/x509.(*Certificate).CheckSignature

Version Info

  • Concourse version: 4.2.2; the same issue exists in 4.5.0
