Maya is VP Product at Helios, a dev-first observability platform that helps dev and ops teams improve the reliability of distributed apps.
As an engineering leader, facing the challenge of scaling microservices is worth a moment of appreciation. It means you’re part of a team building a product that is gaining traction and users. Great job!
However, approaching this task without a firm strategy might turn your entrepreneurial triumphs into debris. The inherent complexities of a distributed system are magnified as it scales, creating a massive challenge in terms of observability and control.
Industry leaders like Amazon and Netflix have mastered the art of managing large-scale microservices. With over 700 microservices, Amazon processes an impressive 18 orders per second. Netflix, managing the content preferences of more than 200 million subscribers, operates nearly 1,000 microservices.
While there’s no one cookie-cutter solution that works for all teams, products and markets, there’s a lot to be learned from the shared industry experience when scaling microservices. Let’s have a look at five pillars for architecting growth that can serve as a navigational guide for the journey ahead.
1. System Architecture
Scaling your system architecture is a fascinating engineering challenge. There are two main approaches to scaling microservices: vertical scaling, which adds more resources (CPU, memory, storage) to existing servers, and horizontal scaling, which distributes the workload across additional servers or service instances.
YouTube, for instance, implemented a horizontal scaling strategy in response to massive user growth. YouTube developed Vitess, an open-source database clustering system for horizontal scaling of MySQL. Other notable companies, such as Slack and HubSpot, have since adopted it in their tech stacks.
Alternatively, the trigger for scaling can be serving a single large customer. For this use case, vertical scaling might be more appropriate, simpler and more cost-effective.
Understanding the scalability route, whether it’s vertical or horizontal, sets the stage for more nuanced architectural considerations like service dependencies and shared resources such as databases.
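To make the horizontal route concrete, here is a minimal sketch of spreading records across database shards by hashing a key. It is an illustration only and not how Vitess assigns rows to shards; the shard names and the `shard_for` function are hypothetical.

```python
import hashlib

# Hypothetical shard names; a real deployment would map these to database hosts.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for(key: str, shards: list[str] = SHARDS) -> str:
    """Pick a shard deterministically from a stable hash of the key."""
    # hashlib is used instead of the built-in hash(), which is
    # randomized per process for strings and so is not stable across restarts.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    for user_id in ["user-1", "user-2", "user-3"]:
        print(user_id, "->", shard_for(user_id))
```

One trade-off of this simple modulo scheme is that changing the number of shards remaps most keys, which is why production systems often prefer range-based or consistent-hashing approaches.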
Regardless of the selected approach, as the number of microservices increases, managing service discovery becomes crucial. Having a centralized, single source of truth service map, as well as an up-to-date API catalog, is critical so all developers on your team can efficiently locate and communicate with different microservices.
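As a rough sketch of what a single source of truth for service locations can look like, the following in-memory registry lets instances register themselves and lets callers look up live addresses. The `ServiceRegistry` class, its TTL-based expiry and the example addresses are assumptions made for illustration; real systems typically rely on a dedicated discovery service rather than code like this.

```python
import time
from typing import Optional


class ServiceRegistry:
    """In-memory registry: service name -> {address: last heartbeat time}."""

    def __init__(self, ttl_seconds: float = 30.0) -> None:
        self.ttl_seconds = ttl_seconds
        self._services: dict[str, dict[str, float]] = {}

    def register(self, service: str, address: str, now: Optional[float] = None) -> None:
        """Record an instance; calling it again acts as a heartbeat."""
        now = time.monotonic() if now is None else now
        self._services.setdefault(service, {})[address] = now

    def lookup(self, service: str, now: Optional[float] = None) -> list[str]:
        """Return addresses whose last heartbeat is within the TTL."""
        now = time.monotonic() if now is None else now
        instances = self._services.get(service, {})
        return sorted(
            addr for addr, seen in instances.items()
            if now - seen <= self.ttl_seconds
        )


if __name__ == "__main__":
    registry = ServiceRegistry(ttl_seconds=30)
    registry.register("orders", "10.0.0.1:8080")
    registry.register("orders", "10.0.0.2:8080")
    print(registry.lookup("orders"))
```

Expiring instances that stop sending heartbeats keeps the map current without requiring every service to deregister cleanly on shutdown.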
2. Observability
As systems scale, their increasing complexity can obscure end-to-end visibility, creating hurdles for various teams across the company (engineering, support, product, CS, etc.).
Powered by the emerging industry standard OpenTelemetry, distributed tracing has become the key to microservices observability. It provides insight into the journey and context of each request as it crosses the microservices within a system, giving teams a centralized view of the entire application flow in addition to traditional performance metrics.
By selecting the right distributed tracing tool, your teams can:
• Set up effective monitoring and trace-based alerting mechanisms.
• Keep MTTR short, despite the changing architecture and complexities.
• Perform bottleneck analysis.
The key to applying observability effectively is to make sure all components and services are instrumented, across the entire tech stack of the distributed application.
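To illustrate the kind of bottleneck analysis that trace data makes possible, here is a minimal sketch that finds the span with the largest self time in a single trace. It does not use the OpenTelemetry API; the `Span` record, the service names and the timings are invented for the example, and it assumes child spans run one after another rather than in parallel.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the trace
    service: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def self_times(spans: list[Span]) -> dict[str, int]:
    """Self time = a span's duration minus its direct children's durations."""
    child_total: dict[str, int] = {}
    for span in spans:
        if span.parent_id is not None:
            child_total[span.parent_id] = (
                child_total.get(span.parent_id, 0) + span.duration_ms
            )
    return {
        s.span_id: max(0, s.duration_ms - child_total.get(s.span_id, 0))
        for s in spans
    }


def bottleneck(spans: list[Span]) -> Span:
    """Return the span that spent the most time in its own work."""
    times = self_times(spans)
    return max(spans, key=lambda s: times[s.span_id])


# Hypothetical trace: gateway -> orders -> (inventory, then payments)
TRACE = [
    Span("a", None, "api-gateway", 0, 200),
    Span("b", "a", "orders", 10, 190),
    Span("c", "b", "inventory", 20, 50),
    Span("d", "b", "payments", 50, 180),
]

if __name__ == "__main__":
    slowest = bottleneck(TRACE)
    print(slowest.service, self_times(TRACE)[slowest.span_id], "ms")
```

The analysis only works if every hop reports a span, which is why full instrumentation across the stack matters: an uninstrumented service shows up as unexplained self time in its caller.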
3. Cost
Scaling microservices, particularly in a cloud environment, requires careful financial analysis to avoid unforeseen expenses. These can range from the cost of specific cloud resources, such as compute power and storage, to potential data transfer charges and the expenses associated with using high-level managed services.
This topic has become such a widespread challenge that there’s a dedicated market segment of companies trying to provide cloud cost optimization solutions.
To efficiently manage costs, consider approaches like:
• Right-Sizing: Matching your resources to the actual needs of your workloads.
• Utilizing Reserved Or Spot Instances: Committing to a specific instance type over a longer duration (reserved), or running interruption-tolerant workloads on discounted spare capacity (spot), for reduced costs.
• Implementing Auto-Scaling Strategies: Dynamically adjusting resources based on demand, like Netflix’s Scryer.
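As a minimal sketch of the auto-scaling idea, the function below applies a proportional rule: scale the replica count by the ratio of observed to target utilization, then clamp it to configured bounds. This mirrors the basic formula documented for the Kubernetes Horizontal Pod Autoscaler and is reactive, unlike a predictive engine such as Scryer. The function name, defaults and integer-percentage inputs are assumptions for the example.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization: int,  # observed average, in percent
    target_utilization: int,   # desired average, in percent
    min_replicas: int = 1,
    max_replicas: int = 20,
) -> int:
    """Proportional scaling: replicas grow with the utilization ratio."""
    if current_replicas < 1 or target_utilization <= 0:
        raise ValueError("need at least one replica and a positive target")
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    # Clamp so a traffic spike cannot trigger unbounded spend,
    # and a quiet period cannot scale the service to zero.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    print(desired_replicas(4, 90, 60))  # overloaded: scale out
    print(desired_replicas(4, 30, 60))  # underused: scale in
```

The upper bound doubles as a cost control: it caps how much capacity, and therefore spend, an unexpected load pattern can provision automatically.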
Moreover, proactive resource usage tracking and billing monitoring are crucial. Utilizing cost management platforms for regular monitoring can forestall unexpected expenses and boost resource efficiency.
4. Team Topology
Scaling transcends technical boundaries and should also drive organizational transformation. The organization architecture, just like the system architecture, is a dynamic organism that is supposed to serve the business as best as possible.
Alongside the technological challenge of scaling microservices, team topology might need an update too in order to enable sustainable growth. Teams should be cross-functional, consisting of developers, testers, DevOps engineers, product, design, data and other vital roles. This diversity enhances collective problem-solving and shortens repair time. It also lays the groundwork for scaling the teams efficiently.
Organizational structure could be hierarchical or take a more agile, bottom-up approach where developers fully own their microservices. Amazon’s “two-pizza teams” concept is an excellent example of an agile approach. Each team should be small enough to be fed with two pizzas, with full autonomy over their microservices. The selection of approaches is driven by organizational DNA.
Part of scaling also shifts the focus to fostering a culture of open communication and collaboration across teams. Streamlining internal communication becomes crucial to make sure all teams move together in the desired direction.
5. Governance
Lastly, governance is a critical aspect when scaling microservices. New processes and infrastructure need to be put in place to ensure that the organization maintains control, compliance and consistency while allowing for the flexibility and autonomy of individual microservices teams.
As you expand globally and serve additional territories, you may also need practices in place to ensure compliance with varying privacy requirements and regulations.
Governance considerations should include service guidelines, service lifecycle management, API specifications, change management, security, privacy, compliance, and training and education.
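Some of these guidelines can be enforced automatically. The following sketch checks a service's metadata against a small set of rules, for example that every service declares an owner, an API specification and a lifecycle stage. The field names, allowed lifecycle values and the `check_manifest` function are hypothetical; each organization would define its own.

```python
# Hypothetical governance rules; adjust to your organization's guidelines.
REQUIRED_FIELDS = ("name", "owner", "api_spec", "lifecycle")
ALLOWED_LIFECYCLES = {"experimental", "production", "deprecated"}


def check_manifest(manifest: dict) -> list[str]:
    """Return a list of guideline violations; an empty list means compliant."""
    problems = [
        f"missing field: {field}"
        for field in REQUIRED_FIELDS
        if not manifest.get(field)
    ]
    lifecycle = manifest.get("lifecycle")
    if lifecycle and lifecycle not in ALLOWED_LIFECYCLES:
        problems.append(f"unknown lifecycle: {lifecycle}")
    return problems


if __name__ == "__main__":
    orders = {
        "name": "orders",
        "owner": "team-checkout",
        "api_spec": "openapi/orders.yaml",
        "lifecycle": "production",
    }
    print(check_manifest(orders))
    print(check_manifest({"name": "legacy-billing", "lifecycle": "beta"}))
```

Running a check like this in a build pipeline keeps consistency without a central team reviewing every service by hand, which preserves the autonomy of individual teams.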
Change Is The Sign Of Success
Scaling microservices to support growth is a sign that your business is poised for the next level. Drawing from industry pioneers can help you architect a resilient, scalable system.
“The need to change our systems to deal with scale isn’t a sign of failure. It is a sign of success,” writes Sam Newman in his book Building Microservices: Designing Fine-Grained Systems.
As the business grows, with its product gaining traction and its user base increasing, the engineering organization should scale along with it.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.