“There’s been something wrong with our Go server for the past 2 years…”, my colleague says.
My mind shuts off in anticipation of what’s to come.
“We’ve had several devs investigate this issue, from our backend lead to members of other backend teams, but still no luck.”
What could this be?
“It’s a memory leak and it causes constant crashes. Our users get kicked off of the websockets each time and they can’t view any content for some time. Could you check this out?”
I accepted the challenge, committing to solving not only this leak, but all other Go service leaks.
Global State
My initial hunch was to check for global state across Go modules. To my surprise, this was a dead end. There has to be something growing across requests…
What did my co-worker mention about the service that involves opening and closing resources?
Websockets. Those damn websockets.
Each websocket connection opened up a new time.Ticker which is used in part to send out heartbeats to each connection.
Sounds fine, right?
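One little problem: once you no longer need a time.Ticker, you MUST call the .Stop() method on it. In Go versions before 1.23, a ticker that is never stopped is never garbage collected. Go 1.23 and later can collect a ticker once nothing references it, but calling .Stop() is still the safe habit.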
It’d be ideal if Go would reject programs at compile time if a struct is missing a respective method call to close a resource.
With my tweak, memory grows at a slower rate, but restarts still happen at least once a day.
I’m not done just yet…
Slices
Array slices seem pretty harmless. You create them, index them, create sub-slices and they work well.
The service also kept appending to a long-lived slice whose elements were pointers to older structs that were no longer needed. As long as the slice still referenced them, the garbage collector could not free them.
Once I fixed this, the memory leaks were gone and restarts were a thing of the past.
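To make the slice problem concrete, here is a small sketch of one way to drop stale entries so the garbage collector can reclaim them. The event type and the dropOldest helper are hypothetical names, not the service's actual code, and the real fix may have looked different. The idea is that shortening a slice is not enough on its own: the backing array still holds the old pointers until those slots are overwritten or set to nil.

```go
package main

import "fmt"

// event is a hypothetical payload type standing in for the service's structs.
type event struct {
	id      int
	payload [1024]byte
}

// dropOldest removes the first n elements of events and returns the shortened
// slice. The vacated tail slots are set to nil so the backing array no longer
// references the removed structs.
func dropOldest(events []*event, n int) []*event {
	if n > len(events) {
		n = len(events)
	}
	kept := copy(events, events[n:]) // shift the survivors to the front
	for i := kept; i < len(events); i++ {
		events[i] = nil // clear leftover pointers in the tail
	}
	return events[:kept]
}

func main() {
	var events []*event
	for i := 0; i < 5; i++ {
		events = append(events, &event{id: i})
	}

	events = dropOldest(events, 3)
	fmt.Println(len(events), events[0].id) // prints "2 3"
}
```

A plain events = events[n:] would also shrink the slice, but the dropped structs would stay reachable through the shared backing array until it is reallocated.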
pprof: the last hope
As an honorable mention, pprof is a tool that lets us inspect memory profiles of our Go programs. I didn’t need to use it this time, but it can really come in handy when you’re not sure where to look.
We can spin up our web server locally, hit endpoints or other seams of our program to the real world and see how and where memory is being allocated.
TLDR: Find Go memory leaks fast
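Check for global state across Go modules
Check for immortal time.Tickers that are never stopped with .Stop()
Check for long-lived slices that keep growing or still hold pointers to structs you no longer need
When in doubt, pull out pprof and test all endpoints and seams with the outside world to investigate the memory profile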