Test mechanisms for best practices in cloud design
Best Clouds
An architecture review is a good way to determine whether best practices are being followed in cloud design and, if not, which risks your design or approach carries. As a cloud provider, Amazon Web Services (AWS) has developed data-based practices that are reflected in the structure and questions of the Well-Architected Framework Review. Since the framework was introduced in 2015, AWS has formalized matching test mechanisms. A questionnaire helps you apply this uniform compendium of best practices and lets you explore how well an architecture aligns with best practices for the cloud. Additionally, the framework provides detailed guidance on how to eliminate any vulnerabilities you discover.
The goal of the Well-Architected Framework is to achieve a good application design in the cloud that is appropriate for the application's purpose. One of its design principles, for example, is to develop environments in a data-driven way. The aim is to measure the influence of architecture decisions by collecting metrics for the change process and to base further development of the architecture on those facts. Each new version of the architecture provides new data points on which organizations can build a continuous evolution process. These data points can then be used in a targeted way to implement improvements.
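The data-driven principle described above can be illustrated with a minimal Python sketch. It assumes that a team records a few metrics for each architecture version and compares them after a change. The class name, the metric names, and all figures are hypothetical and chosen only for illustration; they are not part of the framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionMetrics:
    """Metrics collected for one architecture version (hypothetical fields)."""
    version: str
    p95_latency_ms: float
    monthly_cost_usd: float


def relative_change(before: float, after: float) -> float:
    """Return the change from before to after as a fraction of before."""
    return (after - before) / before


def compare(old: VersionMetrics, new: VersionMetrics) -> dict[str, float]:
    """Compare two versions metric by metric."""
    return {
        "p95_latency_ms": relative_change(old.p95_latency_ms, new.p95_latency_ms),
        "monthly_cost_usd": relative_change(old.monthly_cost_usd, new.monthly_cost_usd),
    }


# Made-up example figures, not measurements.
v1 = VersionMetrics("v1", p95_latency_ms=400.0, monthly_cost_usd=1000.0)
v2 = VersionMetrics("v2", p95_latency_ms=300.0, monthly_cost_usd=1100.0)

changes = compare(v1, v2)
for name, change in changes.items():
    print(f"{name}: {change:+.0%}")
```

In this invented example, the new version lowers latency at a higher monthly cost. Whether that trade-off is acceptable remains a business decision, but the comparison turns it into a discussion about facts.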
Six Pillars
The AWS Well-Architected Framework is based on six pillars that look at different aspects:
- Operational excellence
- Security
- Reliability
- Performance efficiency
- Cost optimization
- Sustainability
Operational excellence relates to the design and monitoring of the systems provided. The goals are to generate genuine added value for the business and to optimize processes and procedures continuously. Important aspects include automating changes, handling disruptions to operations efficiently, and defining standards for managing day-to-day operations. Effective team organization and the promotion of innovation build on these aspects.
Security relates to the security of information and systems. Key areas include confidentiality and data integrity; rights management, including defining and managing individual permissions; protecting systems; and establishing controls to detect security incidents.
Reliability focuses on ensuring that the workload performs its intended function correctly and consistently at the right time. A fail-safe workload ideally recovers quickly from outages to meet business requirements. Important aspects include a distributed system design, recovery planning, and change handling.
Performance efficiency revolves around the efficient use of IT and computing resources. Key issues include selecting the right resources according to the workload, monitoring performance, and making informed decisions on maintaining efficiency as the business grows.
Cost optimization aims to avoid unnecessary costs. The key to success is to establish cost visibility, define budgets, and regularly analyze expenses and optimize accordingly.
Sustainability is concerned with minimizing the effect that using cloud workloads has on the environment. The key issues include a shared responsibility model for sustainability, understanding environmental impacts, and making the best possible use of existing resources to reduce the effect on the environment.
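To make the six pillars tangible, the following Python sketch shows one possible way to record review findings per pillar and count the high-risk items. The `Finding` structure, the three risk levels, and the example questions are assumptions made for this illustration; they do not reproduce the official questionnaire or any AWS API.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Pillar(Enum):
    OPERATIONAL_EXCELLENCE = "Operational excellence"
    SECURITY = "Security"
    RELIABILITY = "Reliability"
    PERFORMANCE_EFFICIENCY = "Performance efficiency"
    COST_OPTIMIZATION = "Cost optimization"
    SUSTAINABILITY = "Sustainability"


class Risk(Enum):
    # Assumed risk levels for this sketch.
    NONE = "none"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class Finding:
    pillar: Pillar
    question: str  # hypothetical wording, not an official question
    risk: Risk


def high_risk_by_pillar(findings: list[Finding]) -> dict[Pillar, int]:
    """Count high-risk findings for every pillar, including pillars with none."""
    counts = Counter(f.pillar for f in findings if f.risk is Risk.HIGH)
    return {pillar: counts[pillar] for pillar in Pillar}


# Invented findings for illustration only.
findings = [
    Finding(Pillar.SECURITY, "How are permissions managed?", Risk.HIGH),
    Finding(Pillar.SECURITY, "How are security incidents detected?", Risk.HIGH),
    Finding(Pillar.RELIABILITY, "How is recovery planned?", Risk.MEDIUM),
    Finding(Pillar.COST_OPTIMIZATION, "How are budgets defined?", Risk.HIGH),
]

summary = high_risk_by_pillar(findings)
for pillar, count in summary.items():
    print(f"{pillar.value}: {count} high-risk item(s)")
```

Listing every pillar, even those without findings, keeps the summary comparable from one review to the next.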
An Ideal Review
One cornerstone of a successful Well-Architected Framework Review is defining from the outset a concrete scope with a common understanding for the workload to be considered. Making this scope explicit ensures that you can involve the required knowledge holders in the review while marking clear boundaries to other systems. This effort significantly reduces the likelihood of delays during and after the review because the critical workload information can be provided in the course of the meeting. Also, the volume of components to be considered will not grow.
Inviting the business and technical experts to the review meeting is also important because it is the only way to consider all aspects of the workload and to define improvements that result from the discussions held. Although it is essential that the participants cover all areas addressed by the Well-Architected Framework Review, it is equally important to limit the number of participants: A group that is too large can cause important discussions to get out of hand and make the conclusions of the review difficult to implement in a reasonable amount of time. Less is often more.
Discussions in the review meeting are intentional and valuable. Often, such a meeting is one of the few occasions that unites all relevant views. However, it is important to avoid slipping into unproductive and detail-obsessed discussions. Active moderation will help everyone work toward a common goal. Experienced individuals, such as cloud architects who are not directly involved in the workload and can therefore guide others through the review in an unbiased manner, are often the best choice for moderators.
Make a note of any arguments and additional issues that occur and take them up at a later time. If it is not possible to agree on a common answer to a question, you need to record that fact, too. Often discussions of this type during the review reveal information gaps that can be closed afterward.
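The note-taking practice described above could be supported by a simple review log. The following Python sketch is one possible structure, not a prescribed tool; the class, its method names, and the example entries are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewLog:
    """Minimal log for a review meeting (hypothetical structure)."""
    answers: dict[str, str] = field(default_factory=dict)
    parked_topics: list[str] = field(default_factory=list)
    unresolved: list[str] = field(default_factory=list)

    def answer(self, question: str, response: str) -> None:
        """Record the answer the group agreed on."""
        self.answers[question] = response

    def park(self, topic: str) -> None:
        """Note a side issue to take up after the meeting."""
        self.parked_topics.append(topic)

    def mark_unresolved(self, question: str) -> None:
        """Record that no common answer was reached."""
        self.unresolved.append(question)

    def follow_ups(self) -> list[str]:
        """Return everything that needs attention after the review."""
        return self.parked_topics + self.unresolved


# Invented entries for illustration only.
log = ReviewLog()
log.answer("How is recovery planned?", "Documented, tested twice a year")
log.park("Naming convention for log groups")
log.mark_unresolved("How are permissions managed?")

for item in log.follow_ups():
    print("Follow up:", item)
```

Keeping unresolved questions separate from parked topics makes the information gaps visible so that they can be closed after the review.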
Concerns from the Team
Before the review takes place, you should define the purpose for which the review is being carried out. The motivations can vary. For example, the team responsible for the workload might want to analyze for itself what vulnerabilities the current workload architecture has. Weak points can often be analyzed quickly in the process. Even in this case, though, you need to make sure nobody involved is afraid to admit mistakes, is tempted to defend their work, or tries to present it in a better light. The easiest way to achieve this is to clarify in advance that the results of the review will remain only within the circle of those directly involved.
Another motivation could be to document the vulnerabilities found and the overhead required to eliminate them, so that additional resources, such as staff or time, can be requested. If this purpose is communicated clearly, the results are likely to be good. However, it can be very difficult when a review is requested externally (e.g., by management) or is motivated by the desire to provide data points for an external audit. You need to put a great deal of effort into creating an environment of trust, in which problems can be addressed openly, despite it being a potentially uncomfortable situation. The moderator of such a review needs to question the responses even more critically to achieve the goal of obtaining realistic results from the review.
One important aspect is to set the participants' mindset. An identified shortcoming, even if it is a high-risk item, does not mean a failure but, instead, an opportunity to make an improvement. It is better to clarify this point in the review itself than after the risk has materialized and possibly caused a loss of production.