The author is Chief Scientist at graph database platform leader Neo4j and Visiting Professor at Newcastle University.
IT has become a significant and fast-growing contributor to the greenhouse gas (GHG) emissions driving global warming. There are understandable reasons for this. The modern world runs on an enormous amount of digital infrastructure, and its power consumption is growing at roughly 2% annually (albeit on a slowly greening grid).
There are also less defensible reasons behind this trend. We have grown far too profligate in our reliance on scalable cloud technologies and have forgotten some basic computer science.
Fortunately, there is hope for rebalancing through a bit of rethinking and the adoption of more efficient software architectures in the cloud. By slimming the need for so many servers, we can make a real dent in electricity consumption. And given that enterprise IT accounts for around 1.5% of the planet’s total energy consumption (even discounting wasteful systems like Bitcoin), the savings are very noticeable for both a company’s bottom line and the planet.
It’s been estimated that in 2020, data centers were responsible for producing around 300 million metric tons of CO2, equivalent to 0.6% of total GHG emissions or 0.9% of energy-related GHG emissions. So if we want to address this issue, a logical place to start is by asking, “Why do we need so many data centers in the first place?”
The cloud has solved many problems but created new ones.
The answer is the cloud, an approach that moves IT provision away from a company’s own hardware and computer rooms to vast, centralized server farms. As a user, you rent computing resources as you need them. The problem is that because this transition radically reduced the cost of delivering software applications and storing corporate data, many CIOs started to treat these services as effectively infinite. With the cloud, any workload can be processed at scale in a way that wasn’t possible in an on-premises data center.
During my early days as a programmer, the situation was very different. We were well aware of the limitations of data storage and available RAM for our code, so working within the constraints of these systems became a mark of professional pride. Our concept of elegant code didn’t involve instructing the computer to navigate through countless paths to find a solution. Instead, our aim was to write code that frugally applied an algorithm in an efficient manner. This enabled it to get to the answer within a reasonable timeframe without overwhelming the system’s resources.
Times have changed. Spinning up 1,000 servers and running Apache Spark on them to solve a problem isn’t uncommon. In some cases, it’s the right thing to do, but in others, we’re using brute force on a problem that might be better expressed as a carefully thought-out, single-threaded operation running on a laptop.
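To make that concrete, here is a minimal sketch of the kind of single-pass, single-threaded job I mean. The file name and column names are hypothetical; the point is that a streaming aggregation like this keeps only a small dictionary of running totals in memory and comfortably handles files far larger than RAM on a single core, with no cluster to provision or power.

```python
import csv
from collections import defaultdict

def total_revenue_by_region(path):
    """Stream a large CSV once, keeping only running totals in memory.

    One pass on one core handles files far larger than RAM, with no
    cluster to provision. The file layout here is purely illustrative.
    """
    totals = defaultdict(float)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            totals[row["region"]] += float(row["revenue"])
    return dict(totals)

if __name__ == "__main__":
    # "sales.csv" and its columns are placeholders for illustration only.
    print(total_revenue_by_region("sales.csv"))
```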
To me, that has always felt very wasteful in terms of resources. While big data processing is undoubtedly a powerful tool, employing it indiscriminately as a battering ram for every problem is inefficient. And every server still consumes power, whether it sits on premises or in a Google data center.
Why default to IT brute force?
As a CIO, you need to select the right tool for the job. It’s worth acknowledging that there are IT problems, even at the highest enterprise level, that can be solved more efficiently on a basic laptop with the right software than by relying on vast cloud infrastructure. That also bypasses the complexity of managing and maintaining a 1,000-server environment: the hardware purchasing, the power, the day-to-day oversight, and the greenhouse gas emissions all of it entails.
I suspect a correction to the overuse of the cloud is coming anyway, independent of environmental concerns. CIOs are receiving ever larger bills from hyperscale cloud providers, which is prompting them to realize the cloud is not a “free” solution. Cloud providers use their economies of scale to offer discounts that make CIOs’ spend easier to justify. However, this discounting often encourages a vicious cycle in which the default response to every problem becomes deploying a 1,000-server solution.
But there are lightweight, far less computationally expensive alternatives for processing data. This isn’t about relying on some groundbreaking new super chip; it’s about exploring data engines that tackle problems in smarter, less brute-force ways.
A Thousand Times Smaller
One example is the experience that Adobe, a Neo4j customer, had while developing Behance, a specialized social media application for the creative industry. The team attempted two different architectural approaches, each requiring dozens of servers and datasets of 20 to 50 terabytes, and neither delivered the desired functionality. A fresh approach using a native graph database improved efficiency and dramatically reduced the infrastructure required: just three servers operating on a dataset of only 30 to 50 gigabytes, roughly a thousand times smaller than the system they had been grappling with.
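The article doesn’t spell out Behance’s data model, but to illustrate the style of workload a native graph database handles well, here is a hypothetical sketch using Neo4j’s official Python driver. The connection details, labels, and relationship types are illustrative assumptions, not Adobe’s actual schema; the point is that a feed-style question becomes a short traversal from one node rather than a large distributed join.

```python
from neo4j import GraphDatabase

# Connection details and the User/FOLLOWS/POSTED schema are illustrative
# assumptions only, not Behance's actual model.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

FEED_QUERY = """
MATCH (me:User {id: $user_id})-[:FOLLOWS]->(:User)-[:POSTED]->(p:Project)
RETURN p.title AS title, p.published_at AS published_at
ORDER BY p.published_at DESC
LIMIT 25
"""

def activity_feed(user_id):
    # The traversal starts at one user node and touches only that user's
    # neighborhood, so it stays cheap no matter how large the full graph is.
    with driver.session() as session:
        return [record.data() for record in session.run(FEED_QUERY, user_id=user_id)]

if __name__ == "__main__":
    print(activity_feed("user-123"))
```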
Using more efficient (and therefore greener) software architectures is not just something that can be done with databases. For example, the LMAX Group, a London-based financial technology company operating multiple institutional execution venues for electronic foreign exchange (FX) and cryptocurrency trading, has built a high-performance trading platform on these principles. The typical way to build such an exchange is with a highly concurrent design, but LMAX instead streamlined the critical parts of the system so that its platform runs at lightning-fast speed on a single thread. Most systems engineers today would have reached for a lot of clever (but more computationally expensive) concurrency patterns. In this case, as in many others, specialized and focused easily wins over generic.
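LMAX’s actual platform is built in Java around its open-source Disruptor ring buffer, which I won’t reproduce here. The toy sketch below simply illustrates the underlying principle in Python: a single thread owns all of the state and drains an inbound queue sequentially, so the business logic on the hot path needs no locks at all.

```python
import queue
import threading

class Exchange:
    """Toy single-threaded processing core: one thread owns all state,
    so the business logic needs no locks. This is only an illustration
    of the principle, not LMAX's implementation."""

    def __init__(self):
        self.inbox = queue.Queue()   # other threads only enqueue events here
        self.balances = {}           # owned exclusively by the processing thread

    def submit(self, event):
        self.inbox.put(event)

    def run(self):
        while True:
            event = self.inbox.get()
            if event is None:        # sentinel: shut down cleanly
                break
            account, amount = event
            self.balances[account] = self.balances.get(account, 0) + amount

if __name__ == "__main__":
    ex = Exchange()
    worker = threading.Thread(target=ex.run)
    worker.start()
    for evt in [("alice", 100), ("bob", 50), ("alice", -30)]:
        ex.submit(evt)
    ex.submit(None)
    worker.join()
    print(ex.balances)   # {'alice': 70, 'bob': 50}
```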
In Conclusion
Am I saying that any single technology solution solves all your problems? No. What I am saying is that the only companies in the world that can afford to solve all their IT problems with gargantuan data processing and computing resources are the web hyperscalers.
Most of us aren’t Google. One way to help both your bottom line and the planet is to apply some basic computer science. Your CFO will thank you, and so will the environment.