CyberSpy

Rantings from a guy with way too much free time

Gogo

Here we gogo!

Years ago, I wrote an article that I thought might be worth re-posting (and revising) here on my blog. For a while I got into programming in Go, and in the early going (pun alert!) there were a lot of idioms that a noob didn’t understand well. One of them was using channels, goroutines, and signals together. Taken separately, each is easy enough to understand, but combining them can cause some confusion. In this article, I show how to properly set up a signal handler and a channel to terminate goroutines when the user interrupts the program with a signal such as Ctrl-C or kill(1).

Let’s take Control

I read a great article by Adam Presley summarizing the interplay between goroutines, channel communication, Unix signals, and proper termination. In it, he describes how to start a goroutine, handle an interrupt, and then tell the goroutine through a channel that the program wants to terminate.

In a non-trivial program, you may have more than one goroutine. You need to notify every running goroutine and wait for all of them to finish so that your program terminates gracefully and predictably. Here’s a contrived example that demonstrates one way to do it.

One point to note: in this example I don’t try to make the goroutines quit in any particular order. In fact, as implemented here, there is no deterministic way of knowing the order. (How might you implement the code so that you could deterministically know the order of goroutine termination?)
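One possible answer to that question, sketched under my own assumptions rather than taken from the original program: give each goroutine its own stop channel, and wait for it to confirm on a shared done channel before stopping the next one. The names worker and stopInOrder are hypothetical.

```go
package main

import "fmt"

// worker blocks until its private stop channel is closed, then reports
// its id on done. A real worker would check stop between units of work.
func worker(id int, stop <-chan struct{}, done chan<- int) {
	<-stop
	done <- id
}

// stopInOrder starts n workers, then stops them one at a time in id
// order, waiting for each to confirm before signalling the next.
func stopInOrder(n int) []int {
	stops := make([]chan struct{}, n)
	done := make(chan int)
	for i := range stops {
		stops[i] = make(chan struct{})
		go worker(i, stops[i], done)
	}

	order := make([]int, 0, n)
	for _, stop := range stops {
		close(stop)                   // only this worker can wake up now
		order = append(order, <-done) // wait for it before moving on
	}
	return order
}

func main() {
	fmt.Println(stopInOrder(5)) // prints [0 1 2 3 4]
}
```

The price of a known order is that shutdown becomes sequential: each goroutine has to finish before the next one is even told to stop.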

To begin, we’ll fix an arbitrary number of goroutines up front.


const (
        maxGoRoutines = 50
)

Next, we’ll start maxGoRoutines goroutines and later wait for all of them to complete by using a sync.WaitGroup, so we set its counter to maxGoRoutines up front:

waitGroup := &sync.WaitGroup{}
waitGroup.Add(maxGoRoutines)
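To see the WaitGroup on its own, here is a minimal sketch, separate from the signal handling. The function runAll and its counter are hypothetical names used only for illustration. Note that Add is called before the goroutines start, so Wait cannot return early.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runAll starts n goroutines, waits for all of them, and returns how
// many ran to completion.
func runAll(n int) int64 {
	var finished int64
	var wg sync.WaitGroup

	wg.Add(n) // set the counter before launching any goroutine
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done() // decrement the counter on the way out
			atomic.AddInt64(&finished, 1)
		}()
	}

	wg.Wait() // blocks until the counter reaches zero
	return atomic.LoadInt64(&finished)
}

func main() {
	fmt.Println(runAll(50)) // prints 50
}
```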

Now let’s define a simple goroutine. We’ll pass in a channel that tells the goroutine when to quit, a waitGroup that lets it signal that it has exited, and an identifier to distinguish between goroutines, which makes our demo look cool!

go func(shutdownChannel chan bool, waitGroup *sync.WaitGroup, id int) {
	log.Println("Starting work goroutine...")
	defer waitGroup.Done()

	for {
		/*
		 * Listen on channels for message.
		 */
		select {
		case <-shutdownChannel:
			log.Printf("Received shutdown on goroutine %d\n", id)
			return

		default:
		}

		// Do some hard work here!
	}
}(shutdownChannel, waitGroup, i)

Once we’ve launched the goroutines, we wait for the program to be told to terminate. We’ve established a signal handler to let us know when SIGINT or SIGTERM is received by adding the following lines:

// Buffered: the signal package never blocks when delivering to the channel.
quitChannel := make(chan os.Signal, 1)
signal.Notify(quitChannel, syscall.SIGINT, syscall.SIGTERM)

Next, we block on the quitChannel until a signal arrives. Once it does, we send a boolean true on shutdownChannel. Notice that we have to send as many messages on this channel as we have goroutines, because each message is received by exactly one goroutine. Otherwise, some goroutines would keep running, waitGroup.Wait() would never return, and the program would not terminate.
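Counting messages is one way to reach every goroutine. A common alternative, sketched here as a variation rather than as what the program in this post does, is to close the channel: a receive on a closed channel returns immediately, so a single close wakes every listener no matter how many there are. The function stopAll is a hypothetical name.

```go
package main

import (
	"fmt"
	"sync"
)

// stopAll starts n goroutines that block until shutdown is closed and
// returns how many of them observed the shutdown.
func stopAll(n int) int {
	shutdown := make(chan struct{})
	var wg sync.WaitGroup
	var mu sync.Mutex
	stopped := 0

	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			<-shutdown // returns for every goroutine once the channel is closed
			mu.Lock()
			stopped++
			mu.Unlock()
		}()
	}

	close(shutdown) // one call, no need to know how many goroutines exist
	wg.Wait()
	return stopped
}

func main() {
	fmt.Println(stopAll(50)) // prints 50
}
```

With this approach the shutdown code no longer has to stay in sync with the number of goroutines that were started.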

Finally, we wait for the waitGroup to complete. After every goroutine has called its deferred waitGroup.Done(), waitGroup.Wait() unblocks and we can exit successfully!

        waitGroup.Wait()
        log.Println("Done.")

Here’s the whole thing from soup to nuts!

package main

import (
	"fmt"
	"log"
	"os"
	"os/signal"
	"runtime"
	"sync"
	"syscall"
	"time"
)

const (
	maxGoRoutines = 50
	DELAY         = 1000 // simulate a delay of work being done inside of each go-routine
)

func main() {

	// Optional: since Go 1.5, GOMAXPROCS already defaults to the number of CPUs.
	runtime.GOMAXPROCS(int(float64(runtime.NumCPU()) * 1.25))

	log.Println("Starting application...")

	/*
	 * When SIGINT or SIGTERM is caught write to the quitChannel
	 */
	// Buffered: the signal package never blocks when delivering to the channel.
	quitChannel := make(chan os.Signal, 1)
	signal.Notify(quitChannel, syscall.SIGINT, syscall.SIGTERM)

	shutdownChannel := make(chan bool)
	waitGroup := &sync.WaitGroup{}

	waitGroup.Add(maxGoRoutines)

	/*
	 * Create the goroutines that do imaginary work
	 */
	for i := 0; i < maxGoRoutines; i++ {
		go func(shutdownChannel chan bool, waitGroup *sync.WaitGroup, id int) {
			log.Println("Starting work goroutine...")
			defer waitGroup.Done()

			for {
				/*
				 * Listen on channels for message.
				 */
				select {
				case <-shutdownChannel:
					log.Printf("Received shutdown on goroutine %d\n", id)
					return

				default:
					time.Sleep(time.Millisecond * DELAY)
				}

				// Do some hard work here!
				fmt.Printf("[%d] working...\n", id)
			}
		}(shutdownChannel, waitGroup, i)
	}

	/*
	 * Wait until we get the quit message
	 */
	<-quitChannel

	log.Println("Received quit. Sending shutdown and waiting on goroutines...")

	for i := 0; i < maxGoRoutines; i++ {
		shutdownChannel <- true
	}

	/*
	 * Block until wait group counter gets to zero
	 */
	waitGroup.Wait()
	log.Println("Done.")
}

Without code to trap signals, we run the risk of terminating our program non-deterministically and potentially leaving data in a corrupt state. It’s good practice in any production system to reduce the likelihood of such an invalid state by adding code similar to the demonstration above.
