A basic overview of concurrency, in Go

Posted on May 27, 2021

Gophers

This article aims to explain Concurrency in a very thorough way. Buckle up, this might be a long one.

According to Wikipedia, in computer science, concurrency is the ability of different parts or units of a program, algorithm, or problem to be executed out-of-order or in partial order, without affecting the final outcome. This allows for parallel execution of the concurrent units, which can significantly improve the overall speed of the execution in multi-processor and multi-core systems.

That is a lot of technical talk. Put simply, it means making progress on multiple tasks over the same period of time, though not necessarily at the same instant. Two or more tasks can start, run, and complete in overlapping time periods without ever running at exactly the same moment. It is important to note that concurrency is often confused with another important concept in computer science: parallelism.

Parallelism occurs when tasks literally run at the same instant, for example on a multi-core processor, where each task can run on its own core. Two parallel tasks both keep working continuously, and neither has to pause so the other can proceed.

To put this in a relatable context, let’s think of having to do two chores within a limited time frame. One of these chores is to do laundry using a washer and dryer, while the other is to clean your space. A good way to go about it with the time constraint in mind would be to load the washer, get started with cleaning, and then periodically check on the washer to see when the laundry is ready to be moved into the dryer. Here, both tasks make progress over the same period, even though you only attend to one of them at any given moment. This can be likened to concurrency.

Another way to go about these tasks would be to delegate them to two different people, one solely in charge of the laundry and the other solely in charge of the cleaning. This way, both tasks literally run simultaneously and somewhat independently, which is similar to parallelism.

According to Rob Pike, concurrency is about dealing with lots of things at once, while parallelism is about doing lots of things at once. Concurrency is mainly about structure, while parallelism is about execution.
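As a small illustration of how this distinction shows up in Go (the helper name parallelSlots is made up for this sketch): a program can start any number of goroutines, which is the structure side, while the runtime setting GOMAXPROCS caps how many of them can execute at the same instant, which is the execution side.

```go
package main

import (
	"fmt"
	"runtime"
)

// parallelSlots is a hypothetical helper. It reports how many goroutines
// the Go runtime may execute at the same instant. Calling GOMAXPROCS with
// 0 only queries the current setting and does not change it.
func parallelSlots() int {
	return runtime.GOMAXPROCS(0)
}

func main() {
	fmt.Println("logical CPUs:", runtime.NumCPU())
	fmt.Println("goroutines that can run in parallel:", parallelSlots())
}
```

The numbers printed depend on the machine. A program with a hundred goroutines is still concurrent when this value is 1; it just is not parallel.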

How does Go handle Concurrency?

To fully understand how Go handles concurrency, we need to understand a few building blocks that the Go developers designed for efficient execution:

1. Goroutines: A goroutine is a function that runs concurrently with the function that started it. It is similar to a thread, but it is scheduled by the Go runtime rather than directly by the operating system, and the caller does not block while it runs. Goroutines are cheap and lightweight.

Goroutines are created by adding the go keyword right before a function call. Let’s take a look at some sample code:


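package main

import (
    "fmt"
    "time"
)

// printI prints 0..n-1, one number per second.
func printI(n int) {
    for i := 0; i < n; i++ {
        time.Sleep(time.Second)
        fmt.Println(i)
    }
}

func main() {
    fmt.Println("This doesn't allow our goroutines a chance")

    // Both calls return immediately; main then returns and the program exits.
    go printI(10)
    go printI(10)
}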

The main function completes, thereby not giving the goroutines a chance to run.

Here we have created two goroutines that run the same function, so we would expect to see the numbers 0 to 9 printed twice (not necessarily in sequence, since the two goroutines run concurrently). However, if we run this in our terminal, we get a response that looks like this:

:~/Desktop/Projects/concurrency$ go run main.go
This doesn't allow our goroutines a chance
:~/Desktop/Projects/concurrency$

Output 1

Notice that only the fmt.Println() call in the main function gets to complete. This is because the main function returns before the goroutines even get a chance to run, and when main returns, the program exits without waiting for any other goroutines. There is a “hack” that gives our goroutines enough time to run before the main function completes:


package main

import (
    "fmt"
    "time"
)

// printI prints 0..n-1, one number per second.
func printI(n int) {
    for i := 0; i < n; i++ {
        time.Sleep(time.Second)
        fmt.Println(i)
    }
}

func main() {
    fmt.Println("Now this allows our goroutines a chance")

    go printI(10)
    go printI(10)

    // Block main until a line of input arrives, so it does not return right away.
    var input string
    fmt.Scanln(&input)
}

Our goroutines now have some breathing space!

The significant difference here is the fmt.Scanln(&input) call, which waits for a line of input from the command line. This stalls the main function and gives our goroutines enough time to finish, as long as we do not press Enter before they are done. Now we get an output that looks like this:

:~/Desktop/Projects/concurrency$ go run main.go
Now this allows our goroutines a chance
0
0
1
1
2
2
3
3
4
4
5
5
6
6
7
7
8
8
9
9

Output 2

In real-world use cases of goroutines, this “hack” wouldn’t hold up, which leads us to a more reliable way of handling this.

2. WaitGroup: A WaitGroup, or sync.WaitGroup (because it is part of the sync package), does what its name suggests: it waits for a group of goroutines to finish executing. To better understand this, let’s look at three of the methods that WaitGroup exports:

Add(int): Add(int) adds the integer passed in as an argument to the WaitGroup counter.
Wait(): Wait() blocks the calling function (here, main) until all the goroutines in the WaitGroup have finished. It essentially blocks until the WaitGroup counter is zero.
Done(): Done() decrements the WaitGroup counter by one.

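package main

import (
    "fmt"
    "sync"
    "time"
)

// printI prints 0..n-1, one number per second, then signals completion.
func printI(waitgroup *sync.WaitGroup, n int) {
    for i := 0; i < n; i++ {
        time.Sleep(time.Second)
        fmt.Println(i)
    }

    // Tell the WaitGroup that this goroutine has finished.
    waitgroup.Done()
}

func main() {
    fmt.Println("Begins Executing")

    var waitgroup sync.WaitGroup
    waitgroup.Add(2) // two goroutines to wait for
    go printI(&waitgroup, 10)
    go printI(&waitgroup, 10)

    // Block until both goroutines have called Done().
    waitgroup.Wait()

    fmt.Println("Done Executing.")
}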

WaitGroup at work.

Taking a close look at the code snippet above, it does the same thing as the previous ones, but it handles blocking more reliably. I’ll explain. The printI function now takes a pointer to a WaitGroup as an extra argument, and it calls the Done() method once its loop has finished, so that the WaitGroup counter is decremented by one on behalf of that goroutine.

Moving to the main() function, we declare a variable of type sync.WaitGroup before starting our goroutines, which lets us call the WaitGroup methods. We then call the Add() method to increase the WaitGroup counter by two, because we are about to start two goroutines. After that, we start the goroutines and pass each one a pointer to the WaitGroup variable, so that the Done() call inside printI() acts on the same counter.

The next step is to call the Wait() method, which holds off the rest of the main() function until the WaitGroup counter drops to zero, meaning that both goroutines have run to completion.

Running this code, the output looks like this:

:~/Desktop/Projects/concurrency$ go run main.go
Begins Executing
0
0
1
1
2
2
3
3
4
4
5
5
6
6
7
7
8
8
9
9
Done Executing.
:~/Desktop/Projects/concurrency$

WaitGroup output

Now the program waits for both goroutines to finish and exits only once the main function is done executing, with no strange hacks employed.
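A common variation, sketched here as an illustration rather than as the approach used above, is to call Add(1) once per goroutine just before starting it and to schedule Done() with defer, so the counter is decremented however the goroutine returns. The helper name squareAll and its inputs are made up for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// squareAll is a hypothetical helper: it squares each input in its own
// goroutine and waits for all of them before returning the results.
func squareAll(nums []int) []int {
	results := make([]int, len(nums))
	var wg sync.WaitGroup

	for i, n := range nums {
		wg.Add(1) // one increment per goroutine, before it starts
		go func(i, n int) {
			defer wg.Done()    // runs when this goroutine returns
			results[i] = n * n // each goroutine writes only its own slot
		}(i, n)
	}

	wg.Wait() // block until the counter is back to zero
	return results
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3, 4})) // prints [1 4 9 16]
}
```

Because every goroutine writes to a different index of results, no extra locking is needed in this sketch; goroutines sharing a single variable would need additional synchronization.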

3. Channels: Think of channels as pipelines for sending and receiving data. Channels provide a way for one goroutine to send structured data to another. They are also an effective way to wait for an output (essentially, data) before ending a program.
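As a minimal sketch of that idea (the function names produce and sumFrom are assumptions made for this example), one goroutine can send values into a channel while the main goroutine receives them. Receiving blocks until data arrives, so main naturally waits for the output before the program ends.

```go
package main

import "fmt"

// produce is a hypothetical helper: it sends 0..n-1 on the channel and
// then closes it so the receiver knows no more values are coming.
func produce(n int, out chan<- int) {
	for i := 0; i < n; i++ {
		out <- i
	}
	close(out)
}

// sumFrom receives values until the channel is closed and returns their sum.
func sumFrom(in <-chan int) int {
	total := 0
	for v := range in {
		total += v
	}
	return total
}

func main() {
	numbers := make(chan int) // unbuffered: each send waits for a receive
	go produce(5, numbers)
	fmt.Println(sumFrom(numbers)) // prints 10
}
```

No WaitGroup or input hack is needed here, because the range loop over the channel only ends once the sending goroutine has closed it.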