Multi-process Docker Service

Kubernetes has Pods; standard Docker Services do not. This poses a problem when multiple processes should have access to the same resources, particularly volumes. While the solution below doesn’t solve all problems Pods do, it does make it possible for two processes to access the same volume.

Here, Supervisor is used to control multiple processes within the same container, and we're going to set it up to ensure a few things:

  • the output of all processes goes to stdout
  • when a program dies, it is restarted

Install requirements

By default, Supervisor writes logs to files and doesn't support logging to stdout. The supervisor-stdout package fixes this shortcoming by listening for log events and writing them to stdout.

Install both these packages with the command:

pip install supervisor supervisor-stdout

Base configuration

Since we want this to run within a Docker container, Supervisor should not fork into the background. Start the file /etc/supervisor/supervisord.conf with:

[supervisord]
nodaemon=true

Next, set up the event listener. The following block will listen for PROCESS_LOG events:

[eventlistener:stdout]
priority = 1
command = supervisor_stdout
buffer_size = 100
events = PROCESS_LOG
result_handler = supervisor_stdout:event_handler

Finally, add your program blocks. Here, we’ll verify that stdout and stderr are redirected properly with two separate programs:

[program:test_stderr]
priority=10
command=/bin/bash -c 'while [ 1 ]; do date 1>&2; sleep 10; done;'
startsecs=10
exitcodes=0
stdout_events_enabled = true
stderr_events_enabled = true


[program:test_stdout]
priority=10
command=/bin/bash -c 'while [ 1 ]; do date; sleep 10; done;'
startsecs=10
exitcodes=0
stdout_events_enabled = true
stderr_events_enabled = true

Copy one of those blocks and change the command to run your particular application.
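
For example, a block for a hypothetical web application might look like this (the program name, path, and flags are placeholders, not part of the original setup):

[program:app]
priority=10
; placeholder command -- point this at your actual application
command=/usr/local/bin/my-app --port 8080
startsecs=10
stdout_events_enabled = true
stderr_events_enabled = true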

Output will look something like this:

root@24b74c2e5f51:/# supervisord -c /etc/supervisor/supervisord.conf
2017-11-06 14:02:28,996 CRIT Supervisor running as root (no user in config file)
2017-11-06 14:02:28,999 INFO supervisord started with pid 207
2017-11-06 14:02:30,003 INFO spawned: 'stdout' with pid 210
2017-11-06 14:02:30,006 INFO spawned: 'test_stdout' with pid 211
2017-11-06 14:02:30,011 INFO spawned: 'test_stderr' with pid 213
2017-11-06 14:02:31,093 INFO success: stdout entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
test_stderr stderr | Mon Nov  6 14:02:30 UTC 2017
test_stdout stdout | Mon Nov  6 14:02:30 UTC 2017
2017-11-06 14:02:40,021 INFO success: test_stdout entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2017-11-06 14:02:40,021 INFO success: test_stderr entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
test_stderr stderr | Mon Nov  6 14:02:40 UTC 2017
test_stdout stdout | Mon Nov  6 14:02:40 UTC 2017

Docker CMD

Instead of calling your app in CMD, launch Supervisor:

CMD ["/usr/local/bin/supervisord", "-c", "/etc/supervisor/supervisord.conf"]
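
Putting it all together, a minimal Dockerfile sketch (the python:2.7 base image is an assumption; any image with pip will do):

# base image is an assumption; any image with pip works
FROM python:2.7
RUN pip install supervisor supervisor-stdout
COPY supervisord.conf /etc/supervisor/supervisord.conf
CMD ["/usr/local/bin/supervisord", "-c", "/etc/supervisor/supervisord.conf"]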

Automate publishing NPM packages

The trickiest part of setting up automated NPM package publishing is authorization. Below, I'm going to automate publishing to a private Nexus 3 repository.

NPM caches a token for authentication once you've logged in to a registry. So, the first thing to do is log in locally:

npm adduser --registry=http://nexus3.internal/repository/npm

Extract the token and tweak .npmrc

Now, open up ~/.npmrc and copy the auth token somewhere safe, then replace it in the file with ${NPM_TOKEN}. The file should end up looking like this:

$ cat ~/.npmrc
//nexus3.internal/repository/npm/:_authToken=${NPM_TOKEN}

Environment

Finally, add the NPM_TOKEN environment variable:

$ export NPM_TOKEN=<paste token here>

Publish

npm publish --registry http://nexus3.internal/repository/npm/
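
In CI this is the same command; the only difference is that NPM_TOKEN comes out of the build's secret store. A sketch, where CI_NPM_TOKEN is a hypothetical secret name:

$ export NPM_TOKEN="${CI_NPM_TOKEN}"   # CI_NPM_TOKEN is a hypothetical secret
$ npm publish --registry http://nexus3.internal/repository/npm/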

Specifying registry in package.json

It’s also possible to specify the registry in package.json, so that it does not have to be added to the command line all the time:

{
  [...]
  "publishConfig": {
    "registry": "http://nexus3.internal/repository/npm/"
  }
}
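
With publishConfig in place, a plain npm publish will pick up the registry on its own.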

References

https://remysharp.com/2015/10/26/using-travis-with-private-npm-deps

Encrypting files with SSH keys

Chicken and egg: need to securely send a colleague VPN connection info before they’re on the VPN. We all have SSH keys!

Found this nice GitHub Gist on this topic. It boils down to the following.

Converting a public SSH key to PKCS8

$ ssh-keygen -e -f /path/to/pubkey -m PKCS8 > /path/to/pubkey.pkcs8

Generate a random key

$ openssl rand -out key 192

Use random key to encrypt a file

$ openssl aes-256-cbc -in secret.txt -out secret.txt.enc -pass file:key

Encrypt random key with PKCS8 SSH key

$ openssl rsautl -encrypt -pubin -inkey /path/to/pubkey.pkcs8 -in key -out key.enc

Glob up both files

$ tar -zcvf secret.tgz *.enc

Decrypting

$ tar -xzvf secret.tgz
$ openssl rsautl -decrypt -inkey ~/.ssh/id_rsa -in key.enc -out key
$ openssl aes-256-cbc -d -in secret.txt.enc -out secret.txt -pass file:key
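
The encryption side wraps up nicely into a script. A sketch (encrypt_for.sh is a hypothetical helper, not part of the gist):

#!/usr/bin/env bash
# encrypt_for.sh -- hypothetical helper wrapping the steps above
# usage: ./encrypt_for.sh /path/to/pubkey secret.txt
set -e
pubkey="$1"
secret="$2"
ssh-keygen -e -f "$pubkey" -m PKCS8 > pubkey.pkcs8
openssl rand -out key 192
openssl aes-256-cbc -in "$secret" -out "$secret.enc" -pass file:key
openssl rsautl -encrypt -pubin -inkey pubkey.pkcs8 -in key -out key.enc
tar -zcvf secret.tgz "$secret.enc" key.enc
rm key pubkey.pkcs8  # clean up the plaintext key material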

Flexible Python logging

Here’s a template for a very flexible logging configuration:

#!/usr/bin/env python
"""
Program description
"""
import logging
import os
import sys

from logging.config import dictConfig

logger = None

LOG_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,

    "formatters": {
        "simple": {
            "format": "%(asctime)s %(name)s:%(lineno)d %(levelname)s %(message)s"
        },
    },

    # this configuration applies to all loggers that are not listed in `loggers` below
    "root": {
        "handlers": [
            "console"
        ],
        "level": "INFO",
    },

    # configure logging for specific loggers
    "loggers": {
        # this logger is for logging within this file
        "__main__": {
            "level": "DEBUG",
            "propagate": False,
            "handlers": [
                "console"
            ],
        },
        
        # only log warnings in package foo
        "foo": {
            "level": "WARN",
            "propagate": False,
            "handlers": [
                "console"
            ],
        },
    },

    "handlers": {
        # send logs to the console on stdout
        "console": {
            "stream": "ext://sys.stdout",
            "class": "logging.StreamHandler",
            "level": "DEBUG",
            "formatter": "simple"
        },
        
        # this handler throws stuff away
        "drop": {
            "class": "logging.NullHandler",
            "level": "DEBUG"
        }
    },
}

SCRIPT_NAME = os.path.basename(sys.argv[0])


if __name__ == '__main__':
    dictConfig(LOG_CONFIG)

    logger = logging.getLogger(__name__)

    # your stuff here

Now, any time your code needs to log something, request a logger:

logger = logging.getLogger(__name__)

When the line above is used in the module foo/bar.py, only warnings and above will be logged to stdout per the foo entry above.

The same line in the module something/else.py will log info and above per the root logger.

Finally, anything using the logger in the script itself will log debug and above per the __main__ entry.
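
For example, in a hypothetical module foo/bar.py (the names are only for illustration):

# foo/bar.py -- hypothetical module for illustration
import logging

# __name__ is "foo.bar", so this logger inherits the "foo" config above
logger = logging.getLogger(__name__)


def do_work():
    logger.debug("dropped: the foo logger is set to WARN")
    logger.warning("written to stdout by the console handler")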

Scheduling jobs on Mac OS

On Mac OS launchd is the process manager. It can run background processes for the system as well as for each individual user. User-specific jobs are kept in ~/Library/LaunchAgents/.

Running a job on an interval

Here is a template for running a job on an interval; the example below runs the given program every minute. Copy and paste the following into a file named com.example.JobLabel.plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
  <dict>
    <key>Label</key>
    <string>com.example.JobLabel</string>
    <key>RunAtLoad</key>
    <true/>
    <key>StartInterval</key>
    <integer>60</integer>
    <key>StandardOutPath</key>
    <string>/Users/berto/local/var/log/spotify_watcher.log</string>
    <key>ProgramArguments</key>
    <array>
      <string>/path/to/bin/program</string>
      <string>any</string>
      <string>additional</string>
      <string>args</string>
    </array>
  </dict>
</plist>

It's not required, but keeping the Label string in sync with the filename will make your life a lot easier.

Additionally, there are two ways to specify the program to run:

  • The Program key
  • The ProgramArguments key

After debugging something stupid (specifying both Program and ProgramArguments in the same file), I think it's best to use ProgramArguments exclusively and pretend the other one does not exist. When the program takes no arguments, simply have a single <string> in the <array>.
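
Once the plist is saved, load it with launchctl (and unload it to stop the schedule):

$ launchctl load -w ~/Library/LaunchAgents/com.example.JobLabel.plist
$ launchctl unload ~/Library/LaunchAgents/com.example.JobLabel.plist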

Kubernetes on AWS

The AWS cluster configuration script that ships with Kubernetes works pretty much flawlessly, with two exceptions. First, make sure you have curl installed on the system you are bootstrapping the cluster from. And, second, make sure the awscli package is recent.

The Debian AWS AMI does not come with curl installed, and its awscli package is an old version (aws-cli/1.4.2 Python/3.4.2 Linux/3.16.0-4-amd64). After running pip install --upgrade awscli you should see a version at least as recent as: aws-cli/1.10.24 Python/2.7.9 Linux/3.16.0-4-amd64 botocore/1.4.15.
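
On a fresh Debian AMI, that boils down to:

$ sudo apt-get install -y curl
$ pip install --upgrade awscli
$ aws --version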

Good times.

Custom Nginx error pages on upstream responses

Digital Ocean has a great post on setting up error pages when nginx encounters an error. However, that approach does not work when a request is successfully passed to an upstream server and the upstream responds with a 50x error. To process the upstream response, proxy_intercept_errors on; must be added to the configuration (thank you Stack Overflow!).

The server block looks like this:

server {
        listen 80 default_server;
        listen [::]:80 default_server ipv6only=on;

        [...]

        proxy_intercept_errors on;
        error_page 500 502 503 504 /custom_50x.html;
        location = /custom_50x.html {
                root /usr/share/nginx/html;
                internal;
        }

        [...]
}
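
To verify, check and reload nginx, stop the upstream, and request any proxied path; the custom page should come back instead of the stock 502. (The commands below assume a systemd host.)

$ sudo nginx -t
$ sudo systemctl reload nginx
$ curl -i http://localhost/some/proxied/path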

InfluxDB on Raspberry Pi

I found a blog post by Aymerick describing how to build InfluxDB on a Raspberry Pi. Here’s what I did to get it working.

Install prerequisites

$ sudo apt-get install -y bison ruby2.1 ruby-dev build-essential
$ sudo gem2.1 install fpm

Install gvm

This installs gvm for the current user:

$ bash < <(curl -s -S -L https://raw.githubusercontent.com/moovweb/gvm/master/binscripts/gvm-installer)

Setup Go

$ gvm install go1.4.3
$ gvm use go1.4.3 --default

Create an influxdb package set

$ gvm pkgset create influxdb
$ gvm pkgset use influxdb

Build InfluxDB

$ go get github.com/sparrc/gdm
$ go get github.com/influxdata/influxdb
$ cd ~/.gvm/pkgsets/go1.4.3/influxdb/src/github.com/influxdata/influxdb
$ gdm restore
$ go clean ./...
$ go install ./...

The ./package.sh command did not work for me, so I settled for the influxd and influx binaries in ~/.gvm/pkgsets/go1.4.3/influxdb/bin.
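
From there, the binaries can be copied onto the PATH and the daemon started by hand; a sketch, assuming the 0.x-era influxd CLI:

$ sudo cp ~/.gvm/pkgsets/go1.4.3/influxdb/bin/influx ~/.gvm/pkgsets/go1.4.3/influxdb/bin/influxd /usr/local/bin/
$ influxd config > influxdb.conf    # print the default config
$ influxd -config influxdb.conf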

Mac + Docker Toolbox + HAProxy

Working with Docker Toolbox on a Mac is great. It makes it very easy to work with Docker containers and to set up interconnected services with Docker Compose. However, since the services are running in a virtual machine, accessing them from a machine other than the Mac itself is not possible out of the box.

HAProxy

Installing and running HAProxy on the Mac will proxy traffic from external hosts to the service running in the VM. I installed haproxy via homebrew with the command brew install haproxy.

With HAProxy installed, I set it up to proxy HTTPS traffic over to the VM. In my case, the VM was 192.168.99.100 and my Mac’s IP was 192.168.1.200. I saved the configuration below at /usr/local/etc/haproxy/haproxy.cfg.

global
  log  127.0.0.1  local0
  log  127.0.0.1  local1 notice
  maxconn  4096
  chroot   /usr/local/share/haproxy
  uid  99
  gid  99


defaults
  log   global
  mode  tcp
  option  dontlognull
  retries  3
  option  redispatch
  option  http-server-close
  maxconn  2000
  timeout connect  5000
  timeout client  50000
  timeout server  50000


frontend www_fe
  bind 192.168.1.200:443

  use_backend www


backend www
  timeout server 30s
  server www1 192.168.99.100:443
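
Before wiring this into launchd, the file can be sanity-checked with HAProxy's check mode:

$ haproxy -c -f /usr/local/etc/haproxy/haproxy.cfg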

Launch daemon

With haproxy configured, I placed the launchd configuration at /Library/LaunchDaemons/com.zymbit.haproxy.plist and ran the command sudo launchctl load -w /Library/LaunchDaemons/com.zymbit.haproxy.plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
 <key>Label</key>
 <string>com.zymbit.haproxy</string>
 <key>ProgramArguments</key>
 <array>
   <string>/usr/local/bin/haproxy</string>
   <string>-db</string>
   <string>-f</string>
   <string>/usr/local/etc/haproxy/haproxy.cfg</string>
 </array>
</dict>
</plist>

Starting and stopping launchd

Now, whenever I want external access to the Docker service I run the command launchctl start com.zymbit.haproxy, and when I'm done, launchctl stop com.zymbit.haproxy.