This repository was archived by the owner on Jan 30, 2020. It is now read-only.

Conversation

@dongsupark

The functional tests seem to be limited to 40 tests in total. Once the number of tests exceeds 40, no more cluster members can be created, and the test run hangs mysteriously. To avoid that, let's reduce the number of tests, mainly the dynamic metadata tests and the unit action tests. This just squashes similar tests together.

Since go test by default enforces a 600-second timeout on the whole test run, we should also reduce the total running time. So let's reduce the number of units tested in the unit state tests, and reduce the maximum number of block attempts in the metadata template test. This will fix the failure on semaphoreci, which suddenly appeared after the dynamic metadata tests were added. This should make the tests work for now. In the long run, we need to be able to tune the timeout.

@dongsupark dongsupark merged commit 493f224 into coreos:master Nov 11, 2016
@dongsupark dongsupark deleted the dongsu/fix-hanging-fxtests branch November 11, 2016 23:27
dongsupark pushed a commit that referenced this pull request Nov 11, 2016
functional: reduce number of tests and total running time
dongsupark pushed a commit to endocode/fleet that referenced this pull request Nov 15, 2016
As described in coreos#1704, the functional
tests hit an upper limit of ~40 tests in total. The actual cause is
DBUS connections remaining open even after the nspawn containers
have been successfully terminated. That's why the number
of unix sockets grows up to 256.

  $ netstat -nap | grep /var/run/dbus/system_bus_socket | wc -l
  256

From that moment on, the functional tests hang mysteriously. Sometimes
users see errors like "The maximum number of active connections
for UID 0 has been reached."

The reason is that no DBUS connection was ever closed. The more
tests we add, the more stale DBUS connections accumulate. This bug has
existed since the beginning.

Fix it by adding conn.Close() after running systemd commands.
@jonboulle
Contributor

can this be reverted after #1706?

@dongsupark
Author

@jonboulle Sure, I'll revert it.

dongsupark pushed a commit to endocode/fleet that referenced this pull request Nov 21, 2016
This reverts commit 0f84672,
and reverts commit f81c3b2.
a.k.a reverts coreos#1704