Skip to content

[RFC] Improve init script: specifically manage salt configurations rather than arbitrary salt processes#32025

Closed
plastikos wants to merge 41 commits intosaltstack:developfrom
plastikos:improvement-init_script
Closed

[RFC] Improve init script: specifically manage salt configurations rather than arbitrary salt processes#32025
plastikos wants to merge 41 commits intosaltstack:developfrom
plastikos:improvement-init_script

Conversation

@plastikos
Copy link
Copy Markdown
Contributor

@plastikos plastikos commented Mar 21, 2016

What does this PR do?

Prevents SysV init script from killing non-salt processes.

What issues does this PR fix or reference?

None

Previous Behavior

Current SysV init script kills unrelated processes.

New Behavior

Manage a specific salt minion configuration rather than anything that seems to be a salt minion.

Tests written?

  • Yes, but not working . . . yet.
  • No

Unfortunately SysV init scripts tend to rummage through PIDs filtering for
appropriate processes to manage. Unfortunately the filters are usually weak
and don't account for similar processes run by other users, PIDs of dead
processes being re-used for completely different executables, etc.. These
weaknesses can result in killing unrelated processes with potentially serious
results.

These improvements to the SysV init script is a complete rewrite with the
following improvements:

  • Specifically manage individual salt configurations rather than looking for
    salt minion-like processes.
  • Obtain salt minion information from the salt configuration - use the
    information to manage the specifically configured process.
  • Drop all of the platform-specific helper functions that allow the
    previously-mentioned weaknesses.
    • Unfortunately this means that the output information may not match the
      specific platform (this could easily be corrected).
  • Now can manage multiple salt processes started by different users
    • Unfortunately starts/stops/restarts as a group and is unable to manage
      them both as a group or as individual processes (this could easily be
      corrected)

…han arbitrary salt processes

Unfortunately SysV init scripts tend to rummage through PIDs filtering for
appropriate processes to manage.  Unfortunately the filters are usually weak
and don't account for similar processes run by other users, PIDs of dead
processes being re-used for completely different executables, etc..  These
weaknesses can result in killing unrelated processes with potentially serious
results.

These improvements to the SysV init script is a complete rewrite with the
following improvements:

  * Specifically manage individual salt configurations rather than looking for
    salt minion-like processes.
  * Obtain salt minion information from the salt configuration - use the
    information to manage the specifically configured process.
  * Drop all of the platform-specific helper functions that allow the
    previously-mentioned weaknesses.

    + Unfortunately this means that the output information may not match the
      specific platform (this could easily be corrected).

  * Now can manage multiple salt processes started by different users

    + Unfortunately starts/stops/restarts as a group and is unable to manage
      them both as a group or as individual processes (this could easily be
      corrected)
@plastikos plastikos changed the title Improve init script: specifically manage salt configurations rather than arbitrary salt processes [RFC] Improve init script: specifically manage salt configurations rather than arbitrary salt processes Mar 22, 2016
@plastikos
Copy link
Copy Markdown
Contributor Author

Don't merge this yet - I need to investigate a couple things and write an integration test.

@cachedout
Copy link
Copy Markdown
Contributor

Looks good to me so far. Let me know when you're ready to try and get this in.

Thayne Harbaugh added 3 commits March 25, 2016 14:25
* Correct algorithm parsing netsat output which can have variable fields
  depending on how many program args it can fit.
* Use sed to parse rather than awk
* Send stderr to /dev/null for netstat and salt-call when DEBUG is not set.
  * Allow installation paths to be overridded for integration tests
  * Detect if netstat accepts "--notrim" option
  * Remove some unused cruft
  * Explicitly use "local" for local variables in functions
  * Drop bashism string translation
  * Correct management of timeout in start() and stop()
  * Add "kill -KILL" as a final attempt in stop()
  * Use exitstatus values from LSB
  * Put status() in function
  * Add try-restart from LSB
* Requires new salt-testing:salttesting.case:ShellTestCase.run_prog()
* Only runs on Linux (although it could be used on other POSIX platforms

:NOTE: MinionTest.test_linux_initscript() unfortunately kills the initial salt
minion started for the test suite.  While it leaves a new minion running it
may not be handled properly by the test suite shut-down.
@plastikos
Copy link
Copy Markdown
Contributor Author

@cachedout, this is getting close to being ready. One thing that I could use help with is how to handle the minions that need to be started/stopped/restarted by the integration test. Unfortunately it doesn't work well with the already-spun-up minion and it's unclear to me the best way to get integration.TestDaemon.transplant_configs() to create a new config+run-root directory structure just for this. I have some ideas, but they will require rewrite of TestDaemon and other classes in integration.

@plastikos
Copy link
Copy Markdown
Contributor Author

@cachedout, I need to figure out a good way of restarting the test minion for this test as well as testing minion.kill and minion.restart commands.

@plastikos
Copy link
Copy Markdown
Contributor Author

Once this is cleaned up and pulled I'll do the same thing for salt-master, salt-syndic and salt-api.

@cachedout
Copy link
Copy Markdown
Contributor

@plastikos Sounds good. Please keep me posted.

@cachedout
Copy link
Copy Markdown
Contributor

@jtand is your man for that. @jtand can you please contact @plastikos privately and get him set up with one?

@plastikos plastikos mentioned this pull request Apr 12, 2016
2 tasks
@justinta
Copy link
Copy Markdown

@plastikos if this fails again, let me know and I can get you a test VM.

@plastikos
Copy link
Copy Markdown
Contributor Author

Yes, please!

@justinta
Copy link
Copy Markdown

@plastikos Ok, How do you want me to send you the login info?

@plastikos
Copy link
Copy Markdown
Contributor Author

@jtand, I just sent you an email.

@cachedout
Copy link
Copy Markdown
Contributor

@jtand Thanks very much.

@plastikos Keep me posted. Thanks!

@plastikos
Copy link
Copy Markdown
Contributor Author

@cachedout , I've been sending emails to @jtand. At this point I am unable to reproduce the failures of the automated testing even in the VM @jtrand provided running the same test. The failure logged in the automated testing is quite odd and indicates that the started salt-minion is blocked somewhere in start up. @jtrand is looking into some things and I'm picking at some things too.

The overall status is that I'm stumped for the moment and call shenanigans on the automated testing. Yes, that's my technical diagnosis.

@plastikos
Copy link
Copy Markdown
Contributor Author

@cachedout, I don't think that the VM is running (all of) the code from my branch - although I'm not certain what it's running. I point out the line in the salt-pr-rs-cent7-n log where the salt-minion log is dumped:

STDERR:LOG_FILE:
STDERR:+ tail -n 20 /tmp/salt-testdaemon-6gFOVm/minion/var/log/salt/minion
STDERR:2016-04-14 00:01:00,533 [salt.utils.verify][WARNING ][18239] Insecure logging configuration detected! Sensitive data may be logged.
STDERR:2016-04-14 00:01:00,533 [salt.cli.daemons ][INFO    ][18239] Setting up the Salt Minion "minion"
STDERR:2016-04-14 00:01:00,657 [salt.cli.daemons                         ][INFO    ][18239] The salt minion is shut down
STDERR:2016-04-14 00:01:00,659 [salt.cli.daemons                         ][INFO    ][18245] The salt minion is shut down
STDERR:2016-04-14 00:01:00,660 [salt.utils.process                       ][DEBUG   ][18247] Created pidfile: /tmp/salt-testdaemon-6gFOVm/minion/var/run/salt-minion.pid
STDERR:2016-04-14 00:01:00,660 [salt.config                              ][DEBUG   ][18247] Reading configuration from /tmp/salt-testdaemon-6gFOVm/minion/etc/salt/minion

Notice the last line: Reading configuration . . .. There is only one location in the code that has that - especially when narrowed to the salt.config module: _read_conf_file(). I changed that log statement to be the following:

log.debug('TLH: Reading configuration from {0}'.format(path))

Notice the TLH: prefix? This change was made in d84011c which the test output shows everywhere for being checked out, yet somehow the debug output is missing the TLH:.

Clearly the tests are running a portion of my code from my improvement-init_script branch, but this particular log statement just doesn't match up.

I'm puzzled.

@plastikos
Copy link
Copy Markdown
Contributor Author

@cachedout, I think I know what's going on. I might have a fix . . ..

@plastikos
Copy link
Copy Markdown
Contributor Author

@cachedout, I fixed it. It looks like the test suite adds /testing to sys.path, but it is not added to the environment in PYTHONPATH. This means that anything that fork+execs a python process is suspect and ends up using installed Salt modules rather than Salt modules from the test source tree.

I'm kicking myself that it took me this long to figure out.

I'll see about cleaning this up, ripping out the breadcrumbs and unnecessary debugging, quash some commits and get it re-submitted.

@plastikos
Copy link
Copy Markdown
Contributor Author

@cachedout, how many places are subprocesses created that don't have PYTHONPATH specifically set and might run incorrect code?

  • test/jenkins.py Lines 22-44
  • salttesting/helpers.py:ensure_in_syspath()

I think the above should be changed to add os.putenv('PYTHONPATH', ':'.join(sys.path))

Thoughts?

@cachedout
Copy link
Copy Markdown
Contributor

@plastikos I'm fine with those changes if @s0undt3ch is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants