Creating and operating IT infrastructure is problematic for any number of reasons. What can you do if you don’t want to entrust the stability and reproducible, reliable operation of your system to an 'irreplaceable' administrator? Don't worry, there is a solution. It's true that you still need to know how to program, but that's not exactly an unrealistic expectation in this profession.
But what exactly is IT infrastructure anyway? According to Wikipedia, it is the "set of information technology components that are the foundation of an IT service". Of course, you might then ask for the definition of an IT service... The purpose of this article, however, is not to split hairs but to explore the term in the title, so let us simply accept that infrastructure is everything that stands 'behind', or 'at the service of', the system in question. In the conventional sense, this includes the hardware, software, networking and any other tools needed to develop and operate the service.
It is important to clarify this, because the subject of our examination, 'programmable infrastructure', is a little harder to grasp than the simple word 'infrastructure'. Now that we know what infrastructure is, we can discuss what it means for something to be programmable. A program is a set of data and instructions that a computer can execute to solve a particular problem. If something is programmable, it means we are able to – or at least have the opportunity to – write a program that solves that problem. Notice what this implies: if infrastructure is programmable, then infrastructure itself must be a problem worth solving. Anyone who has ever been involved in installing or operating a computer system knows this to be true. Creating and operating infrastructure is always a challenge. To ease this task, a technique was created whose name forms the basis of this article.
In the traditional approach, you take a computer, prepare it by installing various basic pieces of software, and then finally place (install) the system in this nest. So what's the problem? It might sound fine, and yet the more you think about it, the more you realise that there is more than one issue with this process. You need a person who knows what they are doing. Such a person is expensive, especially if they're good at what they do. And one person is not enough: what happens if that one person is sick or – for whatever reason – unable to work? So you end up with several people, and the viability of your system depends on their knowledge. Knowledge that lives in their heads...
'Documentation!' shouts the experienced IT manager at this point. However, we all know that how well documentation is maintained declines in step with the momentum of the project. And its momentum will decline. The beginning sees the creation of all those lovely documents worthy of a dissertation. Then updates are delayed, until finally the whole thing becomes rather murky, and only that one administrator knows in which directory of which server the magic spells for restarting the system are hidden. Not to mention that sometimes you need to replicate a system exactly: to test or hunt for bugs, to reinstall after – heaven forbid – a machine failure, or to train new users in a non-live environment so as not to disrupt live operations. Once again you are left relying on dubious documentation and the irreplaceable system administrator, hoping they can reproduce, from their head (or from the documentation), a replica close enough to the live environment.
Unfortunately, people are weak and make mistakes. We don't always remember everything, we never perform the same task twice in exactly the same way, and the documentation is in most cases incomplete (because people are weak and make mistakes!), so your attempts to replicate a system will meet with limited success. You will never have two identical systems – however nice that would be. After a reinstallation, the new server is never quite the same as the old one. Your software, running on this subtly different infrastructure, is rightly put out and refuses to perform as expected, yet no one can tell you exactly why. You search for the cause for days, sometimes weeks, burning plenty of time (money) on debugging, and even once the problem is found and fixed, you still can't sleep soundly at night, since nothing guarantees you won't have the same issue next time. A concern...
'Backup!' continues the IT manager, experienced in ploughing through such storms and confident of his bonus. So you make a backup and then restore it into another environment to reproduce the failing system for testing, training, or any other reason. A nice thought, almost romantic, though not particularly effective. Both backup and restoration are usually time-consuming, resource-intensive processes and, as such, are designed for recovering from a disaster – not for serving as the basis of automated tests that verify the correctness of new developments several times a day. And I have deliberately left the issues of load-dependent scaling and resource optimization out of the story, even though the topic touches on them as well.
So what can you do?
Nowadays, in the agile cloud age, quite a lot. How much more beautiful would life be if we created and maintained our systems and their infrastructure not by hand, but with declarative programs that double as documentation: stored in a version manager, audited, reused, automated? Much more! Goodbye, Irreplaceable Administrator; hello, DevOps Infrastructure Programmer! The IT sector recognized this potential long ago, and with the rise of cloud services, most infrastructure services are now available through programmable interfaces. What's more, this is common not only among the large cloud providers (AWS, Azure, GCP, ...) but also in on-premises virtualization solutions (OpenStack, VMware, ...).
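To make this concrete, here is a minimal sketch of what such a declarative program can look like, using Pulumi's Python SDK as one example of a programmable infrastructure interface (it assumes a configured Pulumi project with the pulumi_aws provider; the resource name, AMI ID and tags are illustrative placeholders, not values from any real system):

```python
# A minimal declarative infrastructure sketch using Pulumi's Python SDK.
# Instead of clicking through a console, we describe the desired state;
# running `pulumi up` compares this description with reality and creates,
# updates, or deletes resources until the two match.
import pulumi
import pulumi_aws as aws

# Declare the desired state: one small virtual machine.
# The AMI ID is a placeholder; in practice you would look up a real image.
server = aws.ec2.Instance(
    "app-server",
    ami="ami-0123456789abcdef0",  # placeholder image ID
    instance_type="t3.micro",
    tags={"Environment": "test"},
)

# Export the address so that other programs (or humans) can find the machine.
pulumi.export("public_ip", server.public_ip)
```

The point is not this particular tool but the workflow it enables: the file above can live in Git, be reviewed like any other code, and be run again tomorrow – or against a second, identical test environment – with the same result.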
Sure, you need to be able to program for it, but in IT that's not such an unusual requirement.
You still create systems, only now by writing and executing programs. It is as if you entrusted the bricklayer's job to a 3D printer driven directly by the blueprint. The bricklayer no longer lays bricks but operates the printer, and the architect no longer draws in ink on paper but works in ArchiCAD. A brave new world...