Fix a race condition in manage runner

The runner was not connecting the client event listener to the event
bus. As a result, any return events fired between the `client.run_job()`
call and the point at which `client.get_cli_event_returns()` began
polling for events would be lost.
Before `get_cli_event_returns` (via LocalClient's `get_iter_returns`)
gets around to polling for events, it first checks whether the job
cache has a record of the jid it is querying. When the job_cache uses a
custom returner with even a small amount of latency, return events from
minions can slip past before the jid lookup completes, so
`get_iter_returns` never yields them and the manage runner assumes
those minions did not respond.

Connecting to the event bus before we run the test.ping ensures that we
do not miss any of these events.
Author: Erik Johnson
Date:   2017-12-13 02:30:19 -06:00
Parent: 1032ca3290
Commit: 3ec004bd2e

@@ -36,6 +36,7 @@ log = logging.getLogger(__name__)
 def _ping(tgt, expr_form, timeout, gather_job_timeout):
     client = salt.client.get_local_client(__opts__['conf_file'])
+    client.event.connect_pub(timeout=timeout)
     pub_data = client.run_job(tgt, 'test.ping', (), expr_form, '', timeout, '')
     if not pub_data:
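
For context, a minimal sketch of the corrected ordering shown in the hunk above,
together with the polling step it protects. The `_ping_sketch` name and the
return-collection loop are illustrative additions, not the runner's exact code
(the real `_ping` passes further arguments such as gather_job_timeout), and
`__opts__` is the dict Salt's loader injects into runner modules.

import salt.client


def _ping_sketch(tgt, expr_form, timeout):
    # __opts__ is injected by Salt's loader when this runs as a runner module.
    client = salt.client.get_local_client(__opts__['conf_file'])

    # Subscribe to the event bus *before* publishing the job, so return events
    # fired while the job-cache lookup is still in flight are buffered by the
    # subscriber instead of being dropped.
    client.event.connect_pub(timeout=timeout)

    pub_data = client.run_job(tgt, 'test.ping', (), expr_form, '', timeout, '')
    if not pub_data:
        return []

    # Returns that arrived between run_job() and this point are already queued
    # on the connected subscriber, so get_cli_event_returns still sees them.
    returned = set()
    for ret in client.get_cli_event_returns(pub_data['jid'],
                                            pub_data['minions'],
                                            timeout):
        if ret:
            returned.update(ret)
    return sorted(returned)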