Cloud Governance: Concepts & Best Practices
Cloud governance is about mitigating risk and ensuring smooth cloudops for enterprises. It achieves this by setting out the rules, policies, and procedures to be followed by your organization when running services in the cloud.
It is likely that your organization already has an IT governance framework for its on-premises systems that sets out the strategic direction of these resources, considering business risks and justifying expenditure based on the business-derived value. A 2022 Foundry Cloud Computing Study of 850 IT decision makers identified their top cloud challenges as:
- Controlling cloud costs (36 percent)
- Data privacy and security challenges (35 percent)
- Lack of cloud security skills/expertise (34 percent)
For many enterprises, existing IT governance practices are unlikely to address the unique cloud challenges above. This article will review three core cloud governance themes to help enterprises develop their cloud governance strategy.
Summary of key cloud governance concepts
Adopting cloud technologies should address deficiencies in your current IT operations. You might look at cloud technologies to replace legacy on-premises hardware, improve the accessibility of your applications, or spur innovation within your organization. Whatever the reason, your cloud governance strategy should align with your cloud adoption objectives and help to ensure smooth cloud operations. To achieve this end, we will explore the following three themes related to cloud governance.
Concept | Description |
---|---|
Continuous improvement | Cloud and data governance must be established, but it is not “set once and forget”: it needs to be reviewed and updated, and adapted to the reduced timescales of the cloud. |
Responsibility and accountability | The responsibilities of your organizational units will change, and the “shared responsibility model” shows how the changes affect IT teams. Adopting cloud technology introduces new risks to the business. Deployment templates provide consistent resource configurations, including consistent tagging and naming conventions. |
Maintaining oversight | Measuring compliance requires visibility and captured metrics. Dashboards help to maintain oversight, and unified dashboards can provide a holistic view across (hybrid-)cloud environments. Integrate with a commercial IT Service Management (ITSM) tool and a Cloud Security Posture Management (CSPM) tool that support both on-premises and cloud-based systems. |
Continuous improvement with cloud governance
You should tailor cloud governance to your organization’s specific needs, whilst also following established best practices. When developing your cloud governance strategy, you will need to consider:
- Regulations that apply to your data and your compliance obligations
- The data that you will store in the cloud and how sensitive it is
- Security risks associated with cloud services and how to prevent them
- Costs related to using cloud services and how to optimize them
- Your responsibilities vs. the cloud provider responsibilities
When addressing these points, key questions to ask are:
- How will performance be measured? For example, if you are targeting a specific compliance standard, how will you track and benchmark this?
- How will you respond? For example, what action will you take if a cloud cost forecast exceeds your tolerated thresholds?
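The "how will you respond?" question can be made concrete by writing the response down as a rule. The sketch below is only an illustration: the `CostPolicy` class, the 90 percent warning ratio, and the action names are assumptions rather than part of any specific tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostPolicy:
    """Hypothetical policy: a monthly budget plus two escalation ratios."""
    monthly_budget: float
    warn_ratio: float = 0.9  # assumed: warn once the forecast reaches 90% of budget
    act_ratio: float = 1.0   # assumed: escalate once the forecast reaches the budget


def respond_to_forecast(forecast: float, policy: CostPolicy) -> str:
    """Map a cost forecast to one of three illustrative responses."""
    ratio = forecast / policy.monthly_budget
    if ratio >= policy.act_ratio:
        return "escalate"
    if ratio >= policy.warn_ratio:
        return "notify-owner"
    return "no-action"


policy = CostPolicy(monthly_budget=10_000.0)
for forecast in (7_500.0, 9_400.0, 12_000.0):
    print(forecast, respond_to_forecast(forecast, policy))
```

Agreeing on the thresholds and actions in advance matters more than the code itself; the rule simply records the decision.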
It is also advisable to introduce your cloud governance framework incrementally, starting with the aspects of cloud governance that help your organization achieve its primary objectives. For example, if cost savings are your main objective, focus on cost management early in the process.
Cloud technology continues developing, and most providers make major annual innovation announcements. Continuous review of the suitability of your cloud governance strategy will help your organization remain agile.
Cloud governance models and standards
While not specific to cloud governance, the following governance models and standards are still relevant and emphasize that governance is more about people and processes than specific technologies. Existing governance models and frameworks, including your own, are a useful foundation, and your cloud governance should build on what is already available.
Name | Description |
---|---|
COBIT (Control Objectives for Information and Related Technologies) | Internationally recognized IT governance control framework that helps organizations meet business challenges in regulatory compliance, risk management, and aligning IT strategy with organizational goals. |
ISO/IEC 38500:2015 | International standard for corporate governance of information technology. Designed to assist organizations in understanding and fulfilling their legal and regulatory obligations. |
ITIL (Information Technology Infrastructure Library) | A series of practices in IT Service Management (ITSM) for aligning operations and services. |
Cloud governance responsibility and accountability
Responsibility and accountability are core cloud governance concepts. The sections below examine the factors enterprises need to consider related to responsibility and accountability in the cloud.
Shared responsibility model
Adopting cloud-based services changes the responsibilities within your organization and introduces additional risks. The deployment model determines which organization (the enterprise or the vendor) is responsible for a specific domain. The four standard deployment models are:
- On-prem (on-premises) – IT infrastructure hardware and software applications are hosted on-site.
- IaaS (Infrastructure-as-a-Service) – IT infrastructure, like computing, storage, and network, delivered on a pay-as-you-go basis. For example, Amazon Web Services (AWS) EC2 delivers virtual servers, and you are responsible for controlling network access to the servers, patching the operating system, and configuring your application.
- PaaS (Platform-as-a-Service) – PaaS includes IT infrastructure, like IaaS, and also provides databases, development tools, and integrations. Azure Functions is an example of PaaS that provides a serverless platform for running single-task code without you needing to manage the underlying infrastructure. Your primary responsibility is developing and configuring your application.
- SaaS (Software-as-a-Service) – SaaS provides an entire application stack. An example is Google Workspace, where a user can log in and start using the application. Your responsibility is limited to user access management.
The table below illustrates the responsibility split between you and the cloud provider for the four main deployment strategies.
The table shows the “shared responsibility model” for four different deployment strategies. (Source)
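One way to reason about the split is to record the customer-side duties per deployment model and compare them. The sketch below uses only the duties named in the list above. The on-premises entry and the assumption that user access management stays with you in every model are illustrative, and the duty names themselves are hypothetical labels.

```python
# Customer-side duties per deployment model, taken from the descriptions above.
# Assumptions: on-premises keeps everything (including hardware), and user
# access management remains a customer duty in every model.
CUSTOMER_DUTIES = {
    "on-prem": {
        "hardware",
        "network access",
        "operating system patching",
        "application configuration",
        "user access management",
    },
    "iaas": {
        "network access",
        "operating system patching",
        "application configuration",
        "user access management",
    },
    "paas": {"application configuration", "user access management"},
    "saas": {"user access management"},
}


def ceded_to_provider(current: str, target: str) -> set[str]:
    """Duties that move to the provider when changing deployment model."""
    return CUSTOMER_DUTIES[current] - CUSTOMER_DUTIES[target]


print(sorted(ceded_to_provider("iaas", "paas")))
```

A comparison like this can help when deciding whether a workload on IaaS is a candidate for a managed service.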
From a cloud security perspective, the UK National Cyber Security Centre (NCSC) Cloud security guidance recommends ceding as much security responsibility to your trusted cloud provider as possible. Ceding this responsibility requires your organization to trust the provider, and your decision should be informed by your business objectives and risk appetite. The table below summarizes the NCSC recommendations.
Deployment strategy | Recommendation |
---|---|
IaaS | Identify the components in the architecture that could easily be replaced with managed services. |
PaaS | If you are building or hosting an application, use PaaS over IaaS. |
SaaS | If you can find a service that meets your needs, then SaaS is preferred. You can benefit from the provider working at scale. |
Table summarizing the main recommendations from the UK NCSC Cloud security guidance (Source)
You need to evaluate which of these deployment strategies is acceptable to your organization based on your cloud adoption objectives and your organization’s risk tolerance. Then align your cloud governance accordingly.
Resource tagging
Maintaining oversight of your operations requires tagging of cloud resources. A global tagging and naming convention policy should be defined as part of your cloud governance.
Your tagging policy should define the required vs. recommended tags and be compatible with the constraints imposed by all the cloud providers you are using. For example, Google Cloud has a tag key short name that is limited to 63 characters, whereas AWS allows tag keys of 128 characters. In the article, Multi-Cloud Tagging Strategies For the Win, four tags are strongly recommended.
Tag | Description |
---|---|
Application ID | This will help identify the components of an application that may be spread across many services and cloud providers. |
Owner | This will help to identify which department is responsible for this resource and help maintain cost oversight. |
Environment | Use this to identify whether a resource is in production, development, or something else. This can help to automate and prioritize operational tasks. |
Risk | A risk-level tag can help you identify which system contains sensitive customer data or confidential business information. |
Table summarizing the main tags recommended for your cloud resources. (Source)
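A tagging policy is easier to enforce when it can be checked automatically. The following is a minimal sketch, assuming the four recommended tags are required and are spelled as shown (the key spellings and the `validate_tags` helper are hypothetical). The key-length limits are the ones quoted above for AWS and Google Cloud.

```python
# Assumed spellings for the four recommended tags.
REQUIRED_TAGS = {"application-id", "owner", "environment", "risk"}

# Tag key length limits quoted in the text above.
MAX_KEY_LENGTH = {"aws": 128, "gcp": 63}


def validate_tags(tags: dict[str, str], providers: list[str]) -> list[str]:
    """Return a list of policy violations; an empty list means compliant."""
    problems = []
    for key in sorted(REQUIRED_TAGS - set(tags)):
        problems.append(f"missing required tag: {key}")
    # A key must fit the strictest limit among the providers in use.
    limit = min(MAX_KEY_LENGTH[p] for p in providers)
    for key in sorted(tags):
        if len(key) > limit:
            problems.append(f"tag key longer than {limit} characters: {key}")
    return problems


print(validate_tags({"owner": "payments-team"}, ["aws", "gcp"]))
```

Checking against the strictest limit keeps one tag set portable across all the providers you use.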
Automating the deployment of your cloud resources using orchestration templates will help to apply consistent tags. Consistent tagging is the foundation for reliably monitoring resources and costs.
Expertise and automation
In the introduction, a “lack of cloud security skills/expertise” was identified as a major challenge faced by organizations adopting cloud-based operations. Your cloud governance strategy should address any deficiencies in your organization's expertise.
Commercial cloud governance tools often provide orchestration templates to help your organization follow established best practices. Automation by using Infrastructure as Code and pre-defined templates for Identity and Access Management (IAM) can help give your development teams the scope to be nimble, but also provide the reassurance that well-defined security practices are being adhered to.
Beyond security expertise, it may be that awareness of cloud costs is lacking within your development teams. The article, Managing Clouds from the Ground Up: Cost Engineering at Spotify, shows how Spotify empowered its engineering teams to take action to optimize cloud costs. If your organization is implementing a cloud governance tool, it should be able to provide insights, like costs, across your organization. By utilizing an Owner tag on resources, costs can be easily attributed, and teams can track their optimizations over time. This will help foster a culture of financial accountability.
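As a rough illustration of attributing costs by Owner tag, the sketch below sums billing line items per owner. The line-item structure (`cost` and `tags` fields) and the `owner` key spelling are assumptions; real billing exports differ by provider.

```python
from collections import defaultdict


def costs_by_owner(line_items: list[dict]) -> dict[str, float]:
    """Sum costs per Owner tag; untagged items are grouped separately."""
    totals: dict[str, float] = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get("owner", "untagged")
        totals[owner] += item["cost"]
    return dict(totals)


# Hypothetical billing line items.
items = [
    {"cost": 120.0, "tags": {"owner": "payments"}},
    {"cost": 30.5, "tags": {"owner": "payments"}},
    {"cost": 75.0, "tags": {"owner": "search"}},
    {"cost": 12.0, "tags": {}},
]
print(costs_by_owner(items))
```

Reporting the "untagged" bucket explicitly shows how much spend cannot yet be attributed to a team.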
Similarly, the Environment tag could be used to determine appropriate action if excess costs are flagged. For example, it may be possible to automatically de-provision resources tagged for development. This is an example of your cloud governance dealing with the “how will you respond?” question.
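A sketch of that selection step might look as follows. The resource structure, the `environment` key, and the `development` value are assumptions, and the function only selects candidates: the actual de-provisioning call depends on your provider and tooling.

```python
def resources_to_deprovision(resources: list[dict], flagged_ids: set[str]) -> list[str]:
    """Return ids of cost-flagged resources that are tagged as development."""
    selected = []
    for resource in resources:
        env = resource.get("tags", {}).get("environment", "").lower()
        if resource["id"] in flagged_ids and env == "development":
            selected.append(resource["id"])
    return selected


# Hypothetical inventory and a set of ids flagged for excess cost.
inventory = [
    {"id": "vm-1", "tags": {"environment": "Development"}},
    {"id": "vm-2", "tags": {"environment": "production"}},
    {"id": "vm-3", "tags": {}},
    {"id": "vm-4", "tags": {"environment": "development"}},
]
print(resources_to_deprovision(inventory, {"vm-1", "vm-2", "vm-3"}))
```

Resources without an Environment tag are deliberately left alone here, since an untagged resource could be in production.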
Maintaining oversight with cloud governance
Visibility and data-driven decision making are cornerstones of effective cloud governance. The sections below explain how capturing the right metrics enables cloud governance and explore four practical scenarios enterprises may encounter.
Capturing metrics
Measuring operational performance and maintaining oversight of your cloud operations requires capturing metrics. The major cloud providers offer tools that help monitor your cloud environment, and we summarize these below.
Cloud Provider | Service name |
---|---|
AWS | AWS CloudTrail, Amazon CloudWatch |
Microsoft Azure | Azure Monitor, Application Insights |
Google Cloud | Operations Suite |
Table summarizing the main cloud provider tools for monitoring your cloud environment.
These tools integrate well within their respective cloud provider offerings. However, using multiple tools in multi-cloud environments can become time-consuming, and obtaining holistic insights can involve complex integrations. In multi-cloud and hybrid-cloud environments, it is advisable to use a commercial IT Service Management (ITSM) tool and a Cloud Security Posture Management (CSPM) tool to gain a holistic understanding of your cloud governance and security. These tools can aggregate metrics from multiple environments into one portal, where dashboards can be used to graph trends.
This also means that alerts and notifications can be created based on the aggregated metrics rather than having to define alerts separately. More sophisticated tools can also use machine learning to spot anomalies from a baseline that establishes normal operations. This can direct your operations team more efficiently.
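The idea of flagging deviations from a baseline can be illustrated with a simple statistical check. This is not machine learning and not how any particular tool works: it is a z-score sketch, with the threshold of three standard deviations chosen as an assumption.

```python
from statistics import mean, stdev


def is_anomalous(baseline: list[float], value: float, threshold: float = 3.0) -> bool:
    """Flag a value that lies more than `threshold` standard deviations from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any change at all is a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical daily request counts (in thousands) under normal operations.
baseline = [100, 102, 98, 101, 99, 100, 103, 97]
print(is_anomalous(baseline, 104))
print(is_anomalous(baseline, 110))
```

Commercial tools typically account for seasonality and trends as well, which a fixed baseline like this one does not.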
Practical CloudOps scenarios
In this section, we will consider different scenarios that can affect cloudops. This can be a useful aid for thinking about your own cloud governance rules, policies, and procedures.
Security flaw discovered
In 2021, a security researcher reported a vulnerability in the Microsoft Azure Cosmos DB Jupyter Notebook feature. The vulnerability could allow a user to access another customer’s resource. Microsoft handled their response well, and customers using this feature were notified with a recommendation to regenerate their primary read-write keys.
Key questions for enterprises to ask:
- How would you identify whether your teams have been using the Azure Cosmos DB Jupyter Notebook feature?
- How would you replace this service?
- What process would be followed to mitigate the risk by regenerating access keys?
Misconfigured resources
Your cloud governance IT Service Management (ITSM) Tool has alerted you to misconfigured tags.
Key questions for enterprises to ask:
- What action should you take?
- Do you have a clear policy that resources without specific tags should be deleted automatically?
- Is your organization comfortable with providing a grace period to the resource owner?
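If you do allow a grace period, the policy can be expressed as a small rule. The sketch below assumes a seven-day grace period and two hypothetical action names. It only decides what should happen and leaves notification and deletion to your tooling.

```python
from datetime import datetime, timedelta, timezone


def tag_violation_action(
    first_flagged: datetime,
    now: datetime,
    grace: timedelta = timedelta(days=7),  # assumed grace period
) -> str:
    """Notify the owner during the grace period, then flag the resource for deletion."""
    if now - first_flagged < grace:
        return "notify-owner"
    return "flag-for-deletion"


flagged_at = datetime(2023, 1, 1, tzinfo=timezone.utc)
print(tag_violation_action(flagged_at, datetime(2023, 1, 3, tzinfo=timezone.utc)))
print(tag_violation_action(flagged_at, datetime(2023, 1, 9, tzinfo=timezone.utc)))
```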
Anomalous activity detected
You have deployed a commercial cloud governance tool that uses machine learning to establish a baseline of your normal cloud operations activity. It has triggered an alert due to anomalous activity.
Key questions for enterprises to ask:
- Does it matter more to you whether this is a security or cost alert?
- How will you prioritize internal vs. external security alerts?
- Can you identify which resources the alert is related to and will this change the response?
- How can you avoid future false positives if your investigation finds that this alert was a false positive?
New cloud service becomes available
Cloud providers continue to innovate, and new services are continuously introduced. For examples from 2022, see What’s New with AWS in 2022 and Microsoft Ignite 2022.
Key questions for enterprises to ask:
- How will your cloud governance framework help your development teams evaluate these innovations?
- What process do they need to follow to deploy new services?
Platform | Provisioning Automation | Security Management | Cost Management | Regulatory Compliance | Powered by Artificial Intelligence | Native Hybrid Cloud Support |
---|---|---|---|---|---|---|
Azure Native Tools | ✔ | ✔ | ✔ | | | |
CoreStack | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
Recap of key cloud governance concepts
Well-defined cloud governance should address the security, compliance, financial, and operational best practices your organization should adopt to ensure smooth cloud operations. Incrementally build upon existing governance frameworks and start by addressing the aspects of cloud governance most important to achieving your cloud adoption objectives.
In multi-cloud and hybrid-cloud environments, third-party cloud governance platforms can help you maintain oversight by aggregating metrics and implementing machine learning to identify anomalies and automate responses. By capturing metrics, you can benchmark your organization and, where possible, implement automated responses to alerts. A holistic understanding of your cloud governance performance will help develop a culture of accountability throughout your organization.
After all, going from on-premises to cloud-based environments is a mindset change as much as a technology change. Cloud governance is not a linear process. It should be reviewed and maintained based on data-driven feedback from your operational resources and your teams' experience.