flaky agentmgr.test.js tests in Travis

The tests in `agentmgr.test.js` have become very flaky as of Dec 2020, even though there have been few changes and few Travis job runs since the 1.3.0 release in July. This is possibly due to newer [faster hardware in Travis](https://blog.travis-ci.com/2020-09-11-arm-on-aws) or Docker-related changes. A re-run can make the job pass, but it usually takes several attempts.

A similar flaky test was fixed in https://github.com/apache/openwhisk-wskdebug/pull/82.

Example failed job: https://travis-ci.com/github/apache/openwhisk-wskdebug/jobs/464430797

Test failures:

```
  1) agentmgr
       should use non-concurrent agent if openwhisk does not support concurrency:
     Error: Timeout of 30000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/home/travis/build/apache/openwhisk-wskdebug/test/agentmgr.test.js)
  

  2) agentmgr
       should handle if the agent was left around from a previous run:
     Error: (HTTP code 500) server error - driver failed programming external connectivity on endpoint wskdebug-myaction-1608750742209 (70bb0576af18dc960ee9ed18f704100d9baffe0f994f783f49330e4bd99a36c7): Bind for 0.0.0.0:46747 failed: port is already allocated 
      at /home/travis/build/apache/openwhisk-wskdebug/node_modules/docker-modem/lib/modem.js:296:17
      at getCause (node_modules/docker-modem/lib/modem.js:326:7)
      at Modem.buildPayload (node_modules/docker-modem/lib/modem.js:295:5)
      at IncomingMessage.<anonymous> (node_modules/docker-modem/lib/modem.js:270:14)
      at endReadableNT (_stream_readable.js:1145:12)
      at process._tickCallback (internal/process/next_tick.js:63:19)

  3) agentmgr
       should remove backup action if --cleanup is set:
     Error: Unexpected error while polling agent for activation.
      at AgentMgr.waitForActivations (src/agentmgr.js:65:181)
      at process._tickCallback (internal/process/next_tick.js:68:7)
```

It seems the `should use non-concurrent agent if openwhisk does not support concurrency` test fails to start or shut down the Docker container and times out waiting for it. That failure then appears to cascade: the subsequent tests fail as well, for example because the port is still allocated to the leftover container.
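
One possible way to limit the cascade, sketched here under assumptions rather than taken from the code base: bound `start()` and `stop()` with their own timeouts and always attempt `stop()`, so a hung test is less likely to leave a container behind for the next one. `withTimeout` and `withDebugger` are hypothetical helpers, and the sketch assumes that calling `stop()` after a failed `start()` is safe.

```javascript
// Hypothetical helper, not part of wskdebug: reject if `promise` does not
// settle within `ms` milliseconds.
function withTimeout(promise, ms, label) {
    let timer;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(
            () => reject(new Error(`${label} timed out after ${ms}ms`)),
            ms
        );
    });
    // Whichever settles first wins; clear the timer either way so it
    // cannot keep the process alive.
    return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical wrapper: run a test body against a started debugger and
// always attempt stop(), even if start() or the body fails.
// Note: if stop() itself fails, its error replaces the original one.
async function withDebugger(dbgr, body, { stepTimeout = 10000 } = {}) {
    try {
        await withTimeout(dbgr.start(), stepTimeout, "start");
        return await body(dbgr);
    } finally {
        await withTimeout(dbgr.stop(), stepTimeout, "stop");
    }
}
```

With a step timeout below mocha's 30000ms limit, the failure message would at least say whether `start` or `stop` was the step that hung.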

https://github.com/apache/openwhisk-wskdebug/blob/fa5b0b48fa7e3bad922b94ee80e5ff6f70800ded/test/agentmgr.test.js#L124-L145


	it("should use non-concurrent agent if openwhisk does not support concurrency", async function() {
	    const action = "myaction";
	    const code = `const main = () => ({ msg: 'CORRECT' });`;

	    mockActivationDbAgent(action, code);

	    const argv = {
	        port: test.port,
	        action: "myaction"
	    };

	    const dbgr = new Debugger(argv);
	    await dbgr.start();
	    dbgr.run();

	    // wait a bit
	    await test.sleep(500);

	    await dbgr.stop();

	    test.assertAllNocksInvoked();
	});
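
The fixed `test.sleep(500)` in this test could be one source of timing sensitivity, though that is a guess and not a confirmed cause. A minimal sketch of the alternative is to poll for a condition with an upper bound instead of sleeping for a fixed time. `waitUntil` is a hypothetical helper, not something the test utilities currently provide.

```javascript
// Hypothetical helper: poll `condition` until it returns a truthy value,
// or throw once `timeout` milliseconds have elapsed.
async function waitUntil(condition, { timeout = 5000, interval = 50 } = {}) {
    const deadline = Date.now() + timeout;
    for (;;) {
        const result = await condition();
        if (result) {
            return result;
        }
        if (Date.now() >= deadline) {
            throw new Error(`condition not met within ${timeout}ms`);
        }
        // Back off briefly before checking again.
        await new Promise((resolve) => setTimeout(resolve, interval));
    }
}
```

Assuming the mocks are nock interceptors, as `assertAllNocksInvoked` suggests, the sleep might become `await waitUntil(() => nock.isDone())`, which returns as soon as all mocked requests were made and fails with a clear message otherwise.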
