Multi-process Docker Service

Kubernetes has Pods; plain Docker services do not. This poses a problem when multiple processes should have access to the same resources, particularly volumes. While the solution below doesn’t solve all the problems Pods do, it does make it possible for two processes to access the same volume.

Here, Supervisor is used to control multiple processes within the same container, and we’re going to set it up to ensure two things:

  • the output of all processes goes to stdout
  • when a program dies, it is restarted

Install requirements

By default, Supervisor writes its logs to files and has no built-in way to forward program output to stdout. The supervisor-stdout package fixes this shortcoming by listening for log events and writing them to stdout.

Install both these packages with the command:

pip install supervisor supervisor-stdout

Base configuration

Since we want this to run within a Docker container, Supervisor should not fork into the background. Start the file /etc/supervisor/supervisord.conf with:

[supervisord]
nodaemon=true

Next, set up the event listener. The following block listens for PROCESS_LOG events:

[eventlistener:stdout]
priority = 1
command = supervisor_stdout
buffer_size = 100
events = PROCESS_LOG
result_handler = supervisor_stdout:event_handler

Finally, add your program blocks. Here, we’ll verify that stdout and stderr are redirected properly with two separate programs:

[program:test_stderr]
priority=10
command=/bin/bash -c 'while true; do date 1>&2; sleep 10; done'
startsecs=10
exitcodes=0
stdout_events_enabled = true
stderr_events_enabled = true


[program:test_stdout]
priority=10
command=/bin/bash -c 'while true; do date; sleep 10; done'
startsecs=10
exitcodes=0
stdout_events_enabled = true
stderr_events_enabled = true

Copy one of these blocks and change command to whatever your particular application needs to run.

Output will look something like this:

root@24b74c2e5f51:/# supervisord -c /etc/supervisor/supervisord.conf
2017-11-06 14:02:28,996 CRIT Supervisor running as root (no user in config file)
2017-11-06 14:02:28,999 INFO supervisord started with pid 207
2017-11-06 14:02:30,003 INFO spawned: 'stdout' with pid 210
2017-11-06 14:02:30,006 INFO spawned: 'test_stdout' with pid 211
2017-11-06 14:02:30,011 INFO spawned: 'test_stderr' with pid 213
2017-11-06 14:02:31,093 INFO success: stdout entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
test_stderr stderr | Mon Nov  6 14:02:30 UTC 2017
test_stdout stdout | Mon Nov  6 14:02:30 UTC 2017
2017-11-06 14:02:40,021 INFO success: test_stdout entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2017-11-06 14:02:40,021 INFO success: test_stderr entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
test_stderr stderr | Mon Nov  6 14:02:40 UTC 2017
test_stdout stdout | Mon Nov  6 14:02:40 UTC 2017

Docker CMD

Instead of calling your app in CMD, launch Supervisor:

CMD ["/usr/local/bin/supervisord", "-c", "/etc/supervisor/supervisord.conf"]

Automate publishing NPM packages

The trickiest part of setting up automated NPM package publishing is authorization. Below, I’m going to automate publishing to a private nexus3 repository.

Once you log in, npm caches an authentication token for the registry. So, the first thing to do is log in locally:

npm adduser --registry=http://nexus3.internal/repository/npm

Extract the token and tweak .npmrc

Now, open up ~/.npmrc, copy the auth token somewhere safe, and replace it in the file with ${NPM_TOKEN}. The file should end up looking like this:

$ cat ~/.npmrc
//nexus3.internal/repository/npm/:_authToken=${NPM_TOKEN}

Environment

Finally, add the NPM_TOKEN environment variable:

$ export NPM_TOKEN=<paste token here>

Publish

npm publish --registry http://nexus3.internal/repository/npm/
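
In CI, the same mechanism works without an interactive login; the runner only needs NPM_TOKEN in its environment before the publish step. A minimal sketch of such a step, assuming the CI system exposes the secret under the hypothetical name CI_NPM_TOKEN:

# hypothetical CI publish step
export NPM_TOKEN="$CI_NPM_TOKEN"
npm publish --registry http://nexus3.internal/repository/npm/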

Specifying registry in package.json

It’s also possible to specify the registry in package.json, so that it does not have to be added to the command line all the time:

{
  [...]
  "publishConfig": {
    "registry": "http://nexus3.internal/repository/npm/"
  }
}

References

https://remysharp.com/2015/10/26/using-travis-with-private-npm-deps

Encrypting files with SSH keys

Chicken and egg: need to securely send a colleague VPN connection info before they’re on the VPN. We all have SSH keys!

Found a nice GitHub Gist on this topic. It boils down to the following.

Convert a public SSH key to PKCS8

$ ssh-keygen -e -f /path/to/pubkey -m PKCS8 > /path/to/pubkey.pkcs8

Generate a random key

$ openssl rand -out key 192

Use random key to encrypt a file

$ openssl aes-256-cbc -in secret.txt -out secret.txt.enc -pass file:key

Encrypt random key with PKCS8 SSH key

$ openssl rsautl -encrypt -pubin -inkey /path/to/pubkey.pkcs8 -in key -out key.enc

Glob up both files

$ tar -zcvf secret.tgz *.enc

Decrypting

$ tar -xzvf secret.tgz
$ openssl rsautl -decrypt -inkey ~/.ssh/id_rsa -in key.enc -out key
$ openssl aes-256-cbc -d -in secret.txt.enc -out secret.txt -pass file:key
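
For repeated use, the encryption side can be wrapped in a small script. A sketch, assuming it is saved as encrypt-for.sh (a made-up name), taking the recipient’s public key and the secret file as arguments:

#!/usr/bin/env bash
# encrypt-for.sh -- hypothetical wrapper around the steps above
# usage: ./encrypt-for.sh /path/to/pubkey secret.txt
set -euo pipefail
pub="$1"
file="$2"

# convert the recipient's SSH public key to PKCS8
ssh-keygen -e -f "$pub" -m PKCS8 > pubkey.pkcs8

# generate a random symmetric key, then encrypt the file with it
openssl rand -out key 192
openssl aes-256-cbc -in "$file" -out "$file.enc" -pass file:key

# encrypt the symmetric key with the recipient's public key
openssl rsautl -encrypt -pubin -inkey pubkey.pkcs8 -in key -out key.enc

# bundle the encrypted pieces; only *.enc leaves the machine
tar -zcvf secret.tgz "$file.enc" key.enc
rm -f key pubkey.pkcs8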

Flexible Python logging

Here’s a template for a very flexible logging configuration:

#!/usr/bin/env python
"""
Program description
"""
import logging
import os
import sys

from logging.config import dictConfig

logger = None

LOG_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,

    "formatters": {
        "simple": {
            "format": "%(asctime)s %(name)s:%(lineno)d %(levelname)s %(message)s"
        },
    },

    # this configuration applies to all loggers that are not listed in `loggers` below
    "root": {
        "handlers": [
            "console"
        ],
        "level": "INFO",
    },

    # configure logging for specific loggers
    "loggers": {
        # this logger is for logging within this file
        "__main__": {
            "level": "DEBUG",
            "propagate": False,
            "handlers": [
                "console"
            ],
        },
        
        # only log warnings in package foo
        "foo": {
            "level": "WARN",
            "propagate": False,
            "handlers": [
                "console"
            ],
        },
    },

    "handlers": {
        # send logs to the console on stdout
        "console": {
            "stream": "ext://sys.stdout",
            "class": "logging.StreamHandler",
            "level": "DEBUG",
            "formatter": "simple"
        },
        
        # this handler throws stuff away
        "drop": {
            "class": "logging.NullHandler",
            "level": "DEBUG"
        }
    },
}

SCRIPT_NAME = os.path.basename(sys.argv[0])


if __name__ == '__main__':
    dictConfig(LOG_CONFIG)

    logger = logging.getLogger(__name__)

    # your stuff here

Now, any time your code needs to log something, request a logger:

logger = logging.getLogger(__name__)

When the line above is used in the module foo/bar.py, only warnings and errors will be logged to stdout, per the foo entry above.

The same line in the module something/else.py will log info and above, per the root entry.

Finally, anything using the logger in the script itself will log debug and above, per the __main__ entry.

Scheduling jobs on Mac OS

On Mac OS, launchd is the process manager. It can run background processes for the system as well as for each individual user. User-specific jobs are kept in ~/Library/LaunchAgents/.

Running a job on an interval

Here is a template for running a job on an interval; the example below runs the given program every minute. Copy and paste the following into a file named com.example.JobLabel.plist in that directory:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
  <dict>
    <key>Label</key>
    <string>com.example.JobLabel</string>
    <key>RunAtLoad</key>
    <true/>
    <key>StartInterval</key>
    <integer>60</integer>
    <key>StandardOutPath</key>
    <string>/path/to/logfile.log</string>
    <key>ProgramArguments</key>
    <array>
      <string>/path/to/bin/program</string>
      <string>any</string>
      <string>additional</string>
      <string>args</string>
    </array>
  </dict>
</plist>

The Label string does not have to match the filename, but keeping the two in sync will make your life a lot easier.
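
Before loading the job, the XML can be sanity-checked with plutil:

$ plutil -lint ~/Library/LaunchAgents/com.example.JobLabel.plist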

Additionally, there are two ways to specify the program to run:

  • The Program key
  • The ProgramArguments key

After debugging something stupid (specifying both Program and ProgramArguments in the same file), I think it’s best to use ProgramArguments exclusively and pretend the other one does not exist. When the program takes no arguments, simply have a single <string> in the <array>.
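
With the plist in place, load the job with launchctl; unload it the same way when you no longer want it:

$ launchctl load ~/Library/LaunchAgents/com.example.JobLabel.plist
$ launchctl unload ~/Library/LaunchAgents/com.example.JobLabel.plist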

Kubernetes on AWS

The AWS cluster configuration script that ships with Kubernetes works pretty much flawlessly, with two exceptions. First, make sure you have curl installed on the system you are bootstrapping the cluster from. Second, make sure the awscli package is recent.

The Debian AWS AMI does not come with curl installed, and its awscli package is an old version (aws-cli/1.4.2 Python/3.4.2 Linux/3.16.0-4-amd64). After running pip install --upgrade awscli you should see a version at least as recent as aws-cli/1.10.24 Python/2.7.9 Linux/3.16.0-4-amd64 botocore/1.4.15.
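
In short, on a stock Debian AMI (as root, or with sudo):

$ apt-get install -y curl
$ pip install --upgrade awscli
$ aws --version
aws-cli/1.10.24 Python/2.7.9 Linux/3.16.0-4-amd64 botocore/1.4.15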

Good times.