Cost Management
Our goal is to improve cost visibility and provide cost savings recommendations for the platform and its core services. To achieve this, CG DevX reference architecture incorporates cloud and in-cluster resource labeling/tagging, and integrates with real-time cost visibility and insights tools. By default, the free version of KubeCost is installed with the platform and can be replaced with OpenCost or other tools from our integration catalog.
To properly manage costs, users of CG DevX should have a full-scale cost management (visibility, analysis, optimization, governance) FinOps solution.
CG DevX reference architecture cannot and should not replace your primary cost management solution.
Cost Visibility & Attribution
As with many practices in operations, implementing a labeling/tagging strategy is a process of iteration and improvement. CG DevX comes with a minimalistic set of labels/tags as part of an integrated labeling strategy. The main goal is to address immediate needs like lineage tracking and cost attribution, and allow users to grow and evolve the strategy.
Cost Attribution
Map spending data to the business to enable setback/chargeback by:
- Creating a meaningful taxonomy of costs
- Providing an organizational breakdown (by division, business unit, management hierarchy)
- Offering a functional breakdown (by workload)
- Classifying by cost center, service, or other unit (map to available taxonomy)
Secure Operations
By using shared ownership and consumer information associated with resources, you can quickly track all affected parties in case of a security incident. It helps to isolate, identify, and notify the right person quicker, reducing potential impact.
Such data could include:
- Workload owner name
- Operations team / Team accountable for day-to-day operations
- Business criticality / Business impact of the resource or supported workload
- Disaster recovery / Business criticality of the workload or service
- End date of the project / Date when the workload is scheduled for retirement
Rightsizing and Cost Recommendations
Containerization helps solve the rightsizing problem by making workloads smaller and easier to properly distribute across multiple nodes. However, it is still challenging to rightsize and attribute workloads correctly. In the K8s world, rightsizing should be done on multiple levels.
Pod/Container Rightsizing
Ensure that your containers request an appropriate amount of resources. Asking for too little means your application cannot perform, while too much means inefficiency. Often, there is a buffer between the requests or limits configured for a container and what it really needs. When this buffer is larger than necessary, there is an opportunity for cost savings. KubeCost identifies and highlights such opportunities. It works with Vertical Pod Autoscaler (VPA) and Horizontal Pod Autoscaler (HPA), where VPA helps adjust the requests and limits configuration based on actual usage, and HPA scales out and in.
Node Rightsizing
Node rightsizing involves choosing the right worker node type for the cluster and bin packing. KubeCost provides instance type recommendations, but in practice, this can be challenging. CG DevX allows you to use multiple node types per node group and multiple node groups to improve resource utilization. To enable additional node groups, update your hosting module IaC configuration. Please check this section for more details.
CG DevX comes with limits set based on a "demo scenario," prioritizing cost minimization and running core services with a low number of active workloads. On the K8s cluster level, Cluster Autoscaler is turned on by default. These limits should be adjusted for heavy use. For user workloads, we recommend starting with reasonable limits and adjusting them after some time (usually ≤ 1 week) to understand load patterns.
Labels/Tags Recommendations
Considerations
- Labels/tags should not be used as a configuration management database.
- Values that are not easily human-readable can have a human-readable name/description added to another label/tag.
- Labels/tags should be (semi-) static.
- Namespaces are used to help organize data specific to a domain or organization. Namespace should be either dash
-or dot-separated.. Azure doesn't support colon-separator:. - Kebab-case or camelCase is used for keys. Note: some AWS resources may have issues with kebab-case.
- The format for multi-value is the following:
- Single-Value:
name=value - Multi-Value:
name=value1:value2 - Multi-Attribute:
name=attribute1=value1/attribute2=value2 - Combination:
name=attribute1=value1:value2/attribute2=value1:value3
- Single-Value:
- When date-time data is stored in labels/tags, the UTC ISO 8061 standard is used.
Cloud Resources Labels / Tags
Below are suggested labels with examples:
| Use Case | Tag | Rationale | Allowed Values (Listed or value prefix/suffix) |
|---|---|---|---|
| Cost Allocation | cg-devx.cost-allocation.workload-id | Track cost vs value generated by each line of business | analytics/landing-page |
| Cost Allocation | cg-devx.cost-allocation.business-unit-id | Monitor costs by business unit | architecture/devops/finance |
| Cost Allocation | cg-devx.cost-allocation.cost-center | Monitor costs by cost center | 123-* |
| Cost Allocation | cg-devx.cost-allocation.owner | Which budget holder is responsible for this workload | marketing/retail-support |
| Data Classification | cg-devx.data.classification | Classify data for compliance and governance | public/private/confidential, restricted |
| Security | cg-devx.security.application-owner | IT Application Owner | NameA/NameB |
| Security | cg-devx.security.operations-team | Team that is accountable for day-to-day operations | TeamA/TeamB |
| Security | cg-devx.security.business-criticality | Business criticality of the application, workload, or service | High/Medium/Low |
Business criticality values
| Business criticality | Description |
|---|---|
| high | Exploitation causes serious public image damage and financial loss with long-term business impact |
| medium | Applications connected to the internet that process financial or other private information |
| low | Typically internal applications with non-critical business impact |
K8s Cluster Resources Labels
| Use Case | Label | Rationale | Examples |
|---|---|---|---|
| Cost Allocation | cg-devx.cost-allocation.environment | Defines SDLC stage | dev/test/stg/prod |
| Cost Allocation | cg-devx.cost-allocation.wl-name | The name of the workload | workload1/workload2/workload3 |
| Cost Allocation | cg-devx.cost-allocation.service | The name of the service in the workload | service1/service2/service3 |
| Cost Allocation | cg-devx.cost-allocation.version | The current version of the application (e.g., a SemVer 1.0, revision hash, etc.) | 5.7.21 |
| Cost Allocation | cg-devx.cost-allocation.component | The component within the architecture | frontend, backend, database |
| OperationalExcellence | cg-devx.operations.deployed-by | The controller/user who created this resource | hpotter, ggranger, rweasley |
| Security | cg-devx.security.business-criticality | Business criticality of the application, workload, or service | High, Medium, Low |
CG DevX default labels
CG DevX applies the following labels by default
Cloud Resources Labels / Tags
| Tag / Label | Value | Rationale |
|---|---|---|
| cg-devx.cost-allocation.cost-center | platform | Indicates cost center |
| cg-devx.metadata.owner | platform-admin | Indicates team owning the resource |
| cg-devx.metadata.cluster-name | Cluster name provided as input parameter |
K8s Cluster Resources Labels
| Label | Value | Rationale |
|---|---|---|
| cg-devx.cost-allocation.workload | Workload name. Applied to workload resources | |
| cg-devx.cost-allocation.environment | dev,sta,prod | Workload environment. Applied to workload resources only |
| cg-devx.cost-allocation.cost-center | platform,development | Indicates cost center |
| cg-devx.metadata.owner | platform-admin/workload-admin | Indicates team owning the resource |
| cg-devx.metadata.service | Indicates service, workload only | |
| cg-devx.metadata.chart-version | Indicates version of Helm chart used to deploy service, core-services only | |
| cg-devx.metadata.version | Indicates version of application deployed, core-services only |