It has been two months since we upgraded our entire infrastructure.
In 2018, two years after our last major upgrade, we noticed that some of our software packages, tools and even the operating systems on both of our servers would reach EOL (End-Of-Life) in spring 2019. This led to months of planning and discussion in a joint working group, mainly so that the next upgrade would provide long-term stability and improved workflows for our developers. Back in 2016, upgrading both servers was quick because our backups contained just one year of data (August 2015 - August 2016). But after 3 years and 3 months (08/16 - 11/19), you can imagine how many backups and how much data needed to be taken care of.
Meanwhile, members of the joint upgrade taskforce suggested that we also upgrade our development workflows by using the latest PHP version, supporting Docker and moving our VCS (Version Control System) from SVN to git. Not only would this speed up development, it would also help us reduce bugs and offer new, modern services.
The Limelight SCM Application
And this is where the upgrade journey started in April 2019.
Choosing a git hosting provider was a no-brainer, and we went with GitHub.com. Our HoS (Head of Security) Faustie set up a GitHub organization and I started working on a source control management application, Limelight SCM, that would provide continuous deployment (CD) for our web services. Doctor Internet then migrated these services to git and we started using them in production.
To help our developers upgrade their PHP code to support PHP 7.x+, we added a tool to our SCM that performs sanity and compatibility checks.
Our SCM application is unique in that it offers:
- Namespace Management (folder routing to subdomains)
- Branch Management (production, development, feature, fix)
- Deployment Statuses / Reports
- Dealing with unstaged changes
- Support for external tools (Composer, PHP7 compat, ...)
- Protection of deployments (.htaccess, webserver, ...)
- Integrated namespace blacklist
- Permission management for deployments
- Adapters for extending/configuring the SCM
For instance, this is our .scmconfig adapter configuration:
Before we could start upgrading, we had to make careful backups.
Those backups contain the databases, game servers, TeamSpeak and other system utilities.
The Beta Server
Until Nov 11th, the beta server lived in its own SVN repository, and a couple of changes had to be made to support git. Instead of working on two repositories (Live and Beta), we decided that with git we wanted just one main repository with separate branches:
Master Branch = the production-ready code (live server)
Development Branch = the beta server
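As a rough illustration of this branch layout, a deployment tool could resolve a pushed git ref to the server it belongs to. This is only a sketch: the `BRANCH_TARGETS` table and the `deployment_target` function are hypothetical names, not part of Limelight SCM.

```python
from typing import Optional

# Hypothetical mapping of branch name to server, following the layout above.
BRANCH_TARGETS = {
    "master": "live",
    "development": "beta",
}


def deployment_target(ref: str) -> Optional[str]:
    """Return the server a ref deploys to, or None if it is not deployed."""
    prefix = "refs/heads/"
    # Accept both full refs ("refs/heads/master") and plain branch names.
    branch = ref[len(prefix):] if ref.startswith(prefix) else ref
    return BRANCH_TARGETS.get(branch)
```

Feature and fix branches are not in the table, so they resolve to `None` and would simply not be deployed in this sketch.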
Goodbye Beta Server... for now
Those changes ultimately resulted in archiving the beta server to its own git repository, as we were unable to simply merge the beta code into the new development branch of the main repository. The two repositories had different commit histories.
Hello again, Beta Server!
Instead, the new beta server was created from the development branch of the main repository, and we started backporting code from the beta repository to the development branch of the main repository.
Upgrading the DS1 server
Now we were able to start upgrading our server DS1 (EU), which is primarily responsible for running the game servers and their databases. The installation went quickly until we ran into some difficulties when bringing up our secondary NIC (Network Interface Controller).
After booting into rescue mode and investigating, we quickly found the problem. Our secondary NIC was using an identifier that was not yet configured on the freshly installed OS.
So we configured it, brought the secondary NIC up and it worked just fine.
Game servers back online!
After retrieving our backups, installing the database server and additional dev services, we focused on getting the game servers running.
This included testing and tackling some issues on the beta server.
Yes, we spent the whole day and night (from Nov 11th, 11 am to Nov 12th, 7 am) upgrading the DS1 server.
Configuring our new development environment
We kept monitoring the game servers until we started setting up our dev environment. For that we configured GitHub's CI/CD tooling, and just as we were ready to test it, the whole thing suddenly became inoperative. Luckily, many other companies and developers reported the same issue, so GitHub was under pressure to fix it.
Four hours later, our CD pipelines (deployments) started deploying code.
On Nov 13th at 2 am, our developers were able to use our new development environment and push updates.
By 1 pm, Doctor had implemented a proper git configuration (.gitignore, code permissions and so on) and added his gamemode linter to our automated workflows.
At that time the linter reported over 8200 warnings... not bad, actually.
This will help us reduce error-prone code in the future.
Commit Dispatcher & DB Processor
Now it was time to somehow get our commit messages into the changelogs database, on which our changelogs service relies. To keep things organized and in one place, I decided to rewrite my old changelogger Node.js application, this time as a Bash script.
Meanwhile Temar enrolled for some lessons in linting.
Later I realized that performing database insertions from a Bash script was not safe enough, so I had to choose another route. I decided to write a PHP CLI (Command-Line Interface) script and split the application into two parts: the Commit Dispatcher and the DBI (Database Insertion Processor).
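The post does not say exactly what made the Bash approach unsafe. A common problem is that commit messages end up pasted straight into an SQL string, where quotes can break or hijack the statement. The sketch below shows the two-part split with bound parameters instead. It is written in Python rather than PHP, uses an in-memory SQLite database as a stand-in, and the `changelogs` table layout, the `sha|author|message` input format and all function names are assumptions made for illustration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Commit:
    sha: str
    author: str
    message: str


def dispatch(raw_line: str) -> Commit:
    """Dispatcher step: split one 'sha|author|message' line into fields."""
    # maxsplit=2 keeps any further '|' characters inside the message.
    sha, author, message = raw_line.split("|", 2)
    return Commit(sha=sha, author=author, message=message)


def insert_commit(conn: sqlite3.Connection, commit: Commit) -> None:
    """Processor step: insert using bound parameters, never string formatting."""
    conn.execute(
        "INSERT INTO changelogs (sha, author, message) VALUES (?, ?, ?)",
        (commit.sha, commit.author, commit.message),
    )
    conn.commit()


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE changelogs (sha TEXT PRIMARY KEY, author TEXT, message TEXT)"
    )
    return conn
```

Because the message is passed as a parameter, a commit message containing quotes or SQL keywords is stored as plain text rather than executed.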
Forums -> GitHub issues integration
At the same time, Doctor was working on migrating our gamemode issue boards to GitHub. To make things even prettier, he proposed writing an integration that would link our forum bug reports directly to our GitHub issue boards, making it easier for developers to work on fixing bugs. He started working on a bbcode-markdown-converter to convert forum tags (bold, italic, etc.) to Markdown format.
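To give an idea of what such a conversion involves, here is a minimal sketch that handles a few simple paired tags with regular expressions. It is not Doctor's converter: the rule table and the `bbcode_to_markdown` function are hypothetical, and real forum markup (lists, quotes, unbalanced tags) needs more care than this.

```python
import re

_FLAGS = re.IGNORECASE | re.DOTALL

# (pattern, replacement) pairs for a handful of simple paired tags.
_RULES = [
    (re.compile(r"\[b\](.*?)\[/b\]", _FLAGS), r"**\1**"),
    (re.compile(r"\[i\](.*?)\[/i\]", _FLAGS), r"*\1*"),
    (re.compile(r"\[url=(.*?)\](.*?)\[/url\]", _FLAGS), r"[\2](\1)"),
]


def bbcode_to_markdown(text: str) -> str:
    """Apply each rule in order and return the converted text."""
    for pattern, replacement in _RULES:
        text = pattern.sub(replacement, text)
    return text
```

For example, `bbcode_to_markdown("[b]Crash[/b] when [i]spawning[/i]")` yields `**Crash** when *spawning*`.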
One hour later he presented the first results.
Bambo lives in the future
Bambo and Temar were among the first devs to publish an update using our new dev environment. It was not long before Night noticed a bug in the changelogs viewer/in-game loading screen:
With git we started using more accurate time tracking that includes timezones.
Our old SVN timestamps were one hour behind the timezone settings of our DS1 server. Along with Doctor's addition of timezone support to the changelogs viewer, we now use the deployment's execution timestamp + 3600 seconds (1 hour) to get the most accurate time for the changelogs.
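The adjustment itself is simple arithmetic. A minimal sketch, assuming the deployment time is available as a timezone-aware value (the `changelog_timestamp` function and the constant name are hypothetical):

```python
from datetime import datetime, timedelta, timezone

# The fixed one-hour offset described above.
CHANGELOG_OFFSET = timedelta(seconds=3600)


def changelog_timestamp(deployed_at: datetime) -> datetime:
    """Return the deployment time shifted forward by one hour."""
    if deployed_at.tzinfo is None:
        # Naive timestamps are what caused confusion in the first place.
        raise ValueError("deployment timestamp must be timezone-aware")
    return deployed_at + CHANGELOG_OFFSET
```

A deployment executed at 02:00 UTC would be recorded in the changelogs as 03:00 UTC.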
Upgrading the VPS1 server
Now that the DS1 upgrade was completed, we focused on getting it done on VPS1 - our web-server.
On Nov 15th around 1 pm, we created some last-minute backups and started the procedure. The setup was done 3 hours later, and this is where the real journey started.
Switching to the latest PHP 7 version was much harder than expected.
Even though we had been using our SCM compatibility checker for months, many services did not survive the switch.
It took us 7 hours (until 11 pm) to manually find and fix these services.
By Nov 16th at 2 am we had finished upgrading and testing our forums and started backporting our ACP (Admin Control Panel) and Gamepanel to support our latest PHP version.
By 4:30 am the nightmare, er, fun was over and we brought the entire web-server back online.
During the day, Doctor completed his Forums -> GitHub integration while also fixing the Limelight API.
Since then we...
- fixed some broken deployments
- improved our development environment
- added documentation scripts to our CI/CD workflows
- brought back our in-game update announcer on Dec 5th
- published the bbcode-markdown-converter as an open source project
- published the DB processor as an open source project
- and more...
Special thanks to everyone involved, and to you for sticking with us!
Discord Channel: https://discord.gg/4vxZhdg