Fix a race condition in manage runner

The runner was not connecting the client event listener to the event
bus. As a result, any return events fired between the `client.run_job()`
call and the point at which `client.get_cli_event_returns()` began
polling for events would be lost.
Before `get_cli_event_returns` (via LocalClient's `get_iter_returns`)
gets around to polling for events, it first checks whether the job
cache has a record of the jid it is querying. When the job_cache uses a
custom returner with even a small amount of latency, return events from
minions can slip past before the jid lookup completes, so
`get_iter_returns` never yields them and the manage runner assumes
those minions did not respond.

Connecting to the event bus before we run the test.ping ensures that we
do not miss any of these events.
Author: Erik Johnson
Date:   2017-12-13 02:30:19 -06:00
Parent: 1032ca3290
Commit: 3ec004bd2e

@@ -36,6 +36,7 @@ log = logging.getLogger(__name__)
 def _ping(tgt, expr_form, timeout, gather_job_timeout):
     client = salt.client.get_local_client(__opts__['conf_file'])
+    client.event.connect_pub(timeout=timeout)
     pub_data = client.run_job(tgt, 'test.ping', (), expr_form, '', timeout, '')
     if not pub_data:
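
For context, a minimal sketch of the corrected ordering shown in the hunk above,
together with the polling step it protects. The `_ping_sketch` name and the
return-collection loop are illustrative additions, not the runner's exact code
(the real `_ping` passes further arguments such as gather_job_timeout), and
`__opts__` is the dict Salt's loader injects into runner modules.

import salt.client


def _ping_sketch(tgt, expr_form, timeout):
    # __opts__ is injected by Salt's loader when this runs as a runner module.
    client = salt.client.get_local_client(__opts__['conf_file'])

    # Subscribe to the event bus *before* publishing the job, so return events
    # fired while the job-cache lookup is still in flight are buffered by the
    # subscriber instead of being dropped.
    client.event.connect_pub(timeout=timeout)

    pub_data = client.run_job(tgt, 'test.ping', (), expr_form, '', timeout, '')
    if not pub_data:
        return []

    # Returns that arrived between run_job() and this point are already queued
    # on the connected subscriber, so get_cli_event_returns still sees them.
    returned = set()
    for ret in client.get_cli_event_returns(pub_data['jid'],
                                            pub_data['minions'],
                                            timeout):
        if ret:
            returned.update(ret)
    return sorted(returned)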