Managing the complexity of software, infrastructure and computer systems

I woke up around 3am the other morning, thinking about the way that software systems impact the infrastructure at work. In a previous life this point of view would have eluded me. Why? Because I was a software developer plugged into a project which didn’t include its associated operations and system infrastructure.
Six years ago, my dissertation focussed on the quality of service (routing) of a peer-to-peer network. A fundamental part of this involved “taking a really good look” at your environment. Winding forward to today, the same task should (I feel) be part of any project which will impact systems.
Systems are (often) big and complex, with “bits” hidden away behind web services, sockets, or manual processes. Should the project need to know about every single system component? No, of course not. But the project should have an interest in gaining visibility of how it will impact adjacent systems: for example, the number of requests/responses “put through” a web service.
I say project, as the choice of which web service to use (for specific functionality) should not “be solely up to the developer writing the code” (it often is precisely that individual).
Visibility may involve “the owner” of the web service stating how many transactions can be “put through”, or the owners of the “system behind the web service” answering how many transactions/queries/widgets can be processed. Any system should have an operating tolerance - exceeding it is not usually a good thing.
Returning to our developer - have they asked the questions in the previous paragraph? If they have an interest in “systems”, they will have tried, but may have hit a wall. The owner of the “back-end” system, the project manager (scope), or the design authority/architect may have stonewalled this request for information.
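The question our developer should be asking can be sketched in a few lines of Python. This is only an illustration: the `ServiceCapacity` type, the "orders-service" name and every number below are made up, and a real figure would have to come from the owner of the service.

```python
from dataclasses import dataclass


@dataclass
class ServiceCapacity:
    """Throughput figures for one downstream service (all values hypothetical)."""
    name: str
    max_tps: float      # tolerance stated by the service owner, transactions/second
    current_tps: float  # load the service already carries today


def headroom_after(service: ServiceCapacity, added_tps: float) -> float:
    """Spare capacity (in tps) left once the project's extra load is added."""
    return service.max_tps - (service.current_tps + added_tps)


def exceeds_tolerance(service: ServiceCapacity, added_tps: float) -> bool:
    """True when the extra load would push the service past its stated limit."""
    return headroom_after(service, added_tps) < 0


# Hypothetical service: owner says 200 tps is the limit, 150 tps is already in use.
orders = ServiceCapacity(name="orders-service", max_tps=200.0, current_tps=150.0)

print(headroom_after(orders, 30.0))     # 20.0 tps to spare
print(exceeds_tolerance(orders, 80.0))  # True: 230 tps against a 200 tps limit
```

The arithmetic is trivial; the hard part is getting somebody to state `max_tps` and `current_tps` in the first place.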
If you are designing a car, you are going to (in theory) know the tolerance of its electrical system, and know how much you can “push it” before you overload it (too many gadgets needing too much juice). If car manufacturers can do this, why can technology projects not do the same?
I love fast, lightweight, distributed systems, with a high throughput for whatever is using them. Watching a proxy / log / request-response traversing the system is cool. Efficiency is so important. Putting up with bloated systems (especially as you watch the logs grind) fills me with desperation.
Perfection is not being sought, rather a pragmatic view on how a project will impact the technology real estate which is already “in place”. One thing you should not be doing is increasing long-term technical debt (the complexity, “and the management”, of software / system / infrastructure). If you are doing this, because you are introducing new technology “as well as” keeping existing systems, there should be provision for additional resource (operations, the owners of the systems “which you talk to”, and anyone / anything which keeps the lights on). Capacity and “known throughput” should help in managing this provision.
Has there been an assessment of how long you are going to “have to manage” the technical debt (system complexity), and is it part of your project plan (being noticed, and acted upon)? The acronym KISS (keep it simple ……) does not exist for nothing. Or is the project focus “just” on the delivery of specific features, on time?
This system complexity could be managed by technologists, project managers, business sponsors….. or you could just blame the developers for writing code which smells….. :O/

Author | Miles Davenport

Career programmer, who designs, assembles, fixes, and supports customers, software and systems.