Release Notes
6.16.0 (2025-05-29)
🚨 Upgrade Notices
- Promtail has been deprecated, disabled by default, and replaced by Alloy.
  - If the user has any custom configurations of Promtail, they can utilize the Grafana Alloy migration utility to assist with the migration from Promtail to Alloy.
    - Follow the appropriate installation instructions to install Grafana Alloy.
    - With an active connection to the cluster containing a Promtail configuration, run the following command, or otherwise obtain the configuration from promtail.yaml:
      kubectl get secret -n monitoring promtail-promtail -o yaml
    - Take the decoded contents of the promtail.yaml key and save locally:
      alloy convert --source-format=promtail --output=alloy.yaml promtail.yaml
    - Update any custom values.yaml references to .alloy.
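The kubectl get secret output shown in the steps above is base64-encoded under the Secret's data field. As a minimal sketch, assuming the configuration lives under the promtail.yaml key as described and that kubectl, base64, and alloy are available on the PATH, the extraction could be scripted as follows. The helper names decode_secret_value and fetch_promtail_config are illustrative only and are not part of SmoothGlue.

```shell
# Decode a base64-encoded Secret value read from stdin.
# Note: very old macOS releases may need "base64 -D" instead of "-d".
decode_secret_value() {
  base64 -d
}

# Print the decoded Promtail configuration stored in the cluster.
# Assumes the namespace, Secret name, and key shown in the steps above.
fetch_promtail_config() {
  kubectl get secret -n monitoring promtail-promtail \
    -o jsonpath='{.data.promtail\.yaml}' | decode_secret_value
}

# Usage (run against the cluster that holds the Promtail configuration):
#   fetch_promtail_config > promtail.yaml
#   alloy convert --source-format=promtail --output=alloy.yaml promtail.yaml
```

Review the converted file before deploying it, since the migration utility is described only as an aid to the migration.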
- The new variable, create_cloudwatch_log_group, defaults to true. To opt out of importing and modifying the log group default retention time, set create_cloudwatch_log_group = false.
- ❗ For existing build clusters, the AWS CloudWatch Log Group (implicitly created by AWS) for each of the (typically 8) RDS instances (shown below) must be imported before running the Terraform IaC. Otherwise, the IaC upgrade will fail.
- If opting in, enter the following import commands for existing build clusters before configuring log retention and running the Terraform IaC:
terragrunt --terragrunt-working-dir modules/jira import 'module.rds.aws_cloudwatch_log_group.this["postgresql"]' /aws/rds/cluster/build-'dbIdentifier'-jira/postgresql
terragrunt --terragrunt-working-dir modules/confluence import 'module.rds.aws_cloudwatch_log_group.this["postgresql"]' /aws/rds/cluster/build-'dbIdentifier'-confluence/postgresql
terragrunt --terragrunt-working-dir modules/console import 'module.rds.aws_cloudwatch_log_group.this["postgresql"]' /aws/rds/cluster/build-'dbIdentifier'-console/postgresql
terragrunt --terragrunt-working-dir modules/mattermost import 'module.rds.aws_cloudwatch_log_group.this["postgresql"]' /aws/rds/cluster/build-'dbIdentifier'-mattermost/postgresql
terragrunt --terragrunt-working-dir modules/keycloak import 'module.rds.aws_cloudwatch_log_group.this["postgresql"]' /aws/rds/cluster/build-'dbIdentifier'-keycloak/postgresql
terragrunt --terragrunt-working-dir modules/sonarqube import 'module.rds.aws_cloudwatch_log_group.this["postgresql"]' /aws/rds/cluster/build-'dbIdentifier'-sonarqube/postgresql
# NOTE: The commands below are a bit different than those above.
terragrunt --terragrunt-working-dir modules/gitlab import 'module.rds[0].module.db_instance.aws_cloudwatch_log_group.this["postgresql"]' /aws/rds/instance/build-'dbIdentifier'-gitlab/postgresql
terragrunt --terragrunt-working-dir modules/nexus import 'module.rds[0].aws_cloudwatch_log_group.this["postgresql"]' /aws/rds/cluster/build-'dbIdentifier'-nexus/postgresql
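Six of the import commands above differ only in the application name, so they can be generated rather than typed. The following is a sketch only: print_rds_log_group_imports is a hypothetical helper, and its argument stands in for the 'dbIdentifier' placeholder, whose real value depends on your environment. It prints the commands without running them so they can be reviewed first.

```shell
# Print the import command for each application whose RDS module uses the
# common resource address. GitLab and Nexus are left out on purpose because
# their resource addresses differ (see the NOTE in the command list above).
# $1: value that replaces the 'dbIdentifier' placeholder (hypothetical argument).
print_rds_log_group_imports() {
  db_identifier="$1"
  for app in jira confluence console mattermost keycloak sonarqube; do
    printf "terragrunt --terragrunt-working-dir modules/%s import 'module.rds.aws_cloudwatch_log_group.this[\"postgresql\"]' /aws/rds/cluster/build-%s-%s/postgresql\n" \
      "$app" "$db_identifier" "$app"
  done
}

# Usage: review the printed commands, then run them from the IaC repository.
#   print_rds_log_group_imports mydbid
```

The GitLab and Nexus imports still need to be entered using the two separate commands listed above.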
- The Auto SSO feature is now disabled by default. The following config can be added to the zarf-config.yaml file to enable the feature:
  package:
    deploy:
      set:
        AUTO_SSO_ENABLED: "true"
📦 SmoothGlue Features
- Promtail has been deprecated, disabled by default, and replaced by Alloy.
- Kubernetes v1.31.x is officially supported and is the default version used to test SmoothGlue on EKS/RKE2. Additional testing is performed for Kubernetes v1.32.x using internal single-node instances on K3s.
- SmoothGlue also now supports new native Istio Helm charts in preparation for the required migration off of Istio Operator. To test the automation and new charts before they are required, or to read about the pre-migration steps and migration concerns, please see the Istio Migration documentation.
- IaC version tracking has been added. The version of the IaC used to deploy objects is saved as the commit_ref_name variable in the outputs and shows up on AWS objects as the tag sg:automation:commit-ref-name.
- A new Terragrunt variable has been added in the HCL to allow setting apply_immediately for RDS modules. Setting this to true applies Terraform changes to RDS instances during the IaC apply instead of at the scheduled maintenance time; the default remains false.
- Auto SSO Feature:
- This feature is now disabled by default. Please see the linked documentation below on how to enable and configure the feature.
- SmoothGlue Run environments are now supported and new documentation on how to enable/configure the feature is now available.
- The CloudWatch Log Group retention policy capability has been added for RDS/Aurora.
  - Two new variables, cloudwatch_log_group_retention_in_days and create_cloudwatch_log_group, have been added to application module inputs within the build.hcl file to configure the Aurora/RDS log group retention time, improving log retrieval times, overall system performance, and cost savings. (See upgrade notices for existing build clusters.)
  - The variable create_cloudwatch_log_group defaults to true. For existing clusters, see Upgrade Notices. For new clusters, use cloudwatch_log_group_retention_in_days to set retention days per database as seen below:
locals {
  gitlab_inputs = {
    # Adjust as needed, default is 0 days (logs never expire). Valid value for X is one of:
    # [0 1 3 5 7 14 30 60 90 120 150 180 365 400 545 731 1096 1827 2192 2557 2922 3288 3653]
    cloudwatch_log_group_retention_in_days = X
    # Set to false to avoid importing the cloudwatch_log_group for existing build clusters
    # create_cloudwatch_log_group = false
  }
  # Repeat for each module's input
  jira_inputs = {
    cloudwatch_log_group_retention_in_days = X
  }
  confluence_inputs = {...}
  keycloak_inputs = {...}
  mattermost_inputs = {...}
  sonarqube_inputs = {...}
  console_inputs = {...}
  nexus_inputs = {...}
}
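Because only the retention values listed in the comment above are accepted, it can help to check a candidate value before editing build.hcl. The following is a minimal sketch; is_valid_retention_days and VALID_RETENTION_DAYS are hypothetical names introduced here for illustration, and the list is copied from the comment above.

```shell
# Retention values accepted for the log group, as listed in the HCL comment above.
VALID_RETENTION_DAYS="0 1 3 5 7 14 30 60 90 120 150 180 365 400 545 731 1096 1827 2192 2557 2922 3288 3653"

# Return success when $1 is one of the accepted retention values.
is_valid_retention_days() {
  for allowed in $VALID_RETENTION_DAYS; do
    if [ "$1" = "$allowed" ]; then
      return 0
    fi
  done
  return 1
}

# Usage:
#   is_valid_retention_days 90 && echo "90 is accepted"
#   is_valid_retention_days 45 || echo "45 is not accepted"
```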
- Grafana can be enabled with High Availability (HA). This creates multiple Grafana pods managed by an HPA with pod distribution rules and a backing RDS instance. See our docs for more information.
  - To enable Grafana HA, you must turn on the Grafana module, which can be completed by:
    - Setting locals.grafana_inputs.high_availability to true
    - Setting modules.grafana to true
  - NOTE: Due to how Grafana pods need to communicate with each other to deconflict and de-duplicate, all database IaC features and in-cluster variables must be enabled at once at the Terragrunt level.
  - Grafana HA can be enabled on existing clusters, and maintainers should be aware of the following chart additions when enabling:
---
grafana:
  values:
    headlessService: true
    autoscaling:
      enabled: true
      minReplicas: 2
      maxReplicas: 5
      targetCPU: "60"
      targetMemory: ""
    podDisruptionBudget:
      apiVersion: "policy/v1"
      minAvailable: 1
    grafana.ini:
      database:
        type: postgres
        host: "${db_host}:${db_port}"
        name: grafana
        user: "${db_name}"
      alerting:
        enabled: false
      unified_alerting:
        enabled: true
        ha_peers: monitoring-grafana-headless:9094
        ha_listen_address: ${POD_IP}:9094
        ha_advertise_address: ${POD_IP}:9094
        rule_version_record_limit: "5"
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          - topologyKey: "kubernetes.io/hostname"
            labelSelector:
              matchLabels:
                dont-schedule-with: grafana
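If the autoscaling or podDisruptionBudget values above are overridden, one relationship is worth keeping in mind: a PodDisruptionBudget whose minAvailable is not smaller than the number of running replicas blocks voluntary evictions such as node drains. The check below is a hypothetical helper for reasoning about override values, not something SmoothGlue ships.

```shell
# Return success when at least one pod can be evicted voluntarily, that is,
# when the minimum replica count is larger than the PDB's minAvailable.
# $1: autoscaling.minReplicas   $2: podDisruptionBudget.minAvailable
pdb_allows_eviction() {
  min_replicas="$1"
  min_available="$2"
  [ "$min_replicas" -gt "$min_available" ]
}

# Usage with the chart additions listed above (minReplicas 2, minAvailable 1):
#   pdb_allows_eviction 2 1 && echo "node drains can proceed"
```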
⏩ Upgraded Packages
- This release of SmoothGlue Enterprise v6.16.0 includes Big Bang Version 2.53.1. For more details on the features and updates included in Big Bang Version 2.53.1, please refer to the Big Bang Release Notes.
- Confluence: chart bump (2.0.0-bb.0), version bump confluence-node-lts:9.2.4
- Jira: chart bump (2.0.0-bb.2)
🐞 Bug Fixes
- SmoothGlue now disables the KubeControllerManagerDown and KubeSchedulerDown Prometheus alerts by default for EKS since those components are located on the control plane, which is managed by AWS and is not accessible from the cluster.
🌐 Compatibility
- The packages for this release were built using Zarf v0.54.0.
- The packages were tested across the following Kubernetes distributions:
  - RKE2: v1.31.8+rke2r1
  - K3s: v1.32.4+k3s1
  - EKS: v1.31.7
- The following AMI versions were used for testing:
  - RKE2 AMI: smoothglue-rke2-v1.31.8-rke2r1-rocky-8-base-v1.1.1-stig-2025-05-19T08-22-30Z
  - EKS AMI: smoothglue-eks-1.31.7-rocky-8-base-v1.1.1-stig-2025-05-19T08-22-32Z
  - Base AMI: base-Rocky-8-EC2-LVM-v1.1.1-stig-2025-05-19T0702
🔗 Helpful Links
- Refer to the SmoothGlue documentation for additional guidance.
- For details on the Big Bang release, see the Big Bang Release Notes.
6.15.0 (2025-05-14)
🚨 Upgrade Notices
- Zarf is being updated to v0.54.0, which is now the minimum supported version. Trying to use an older Zarf version will result in Zarf registry failures on EKS clusters if using IRSA and S3 bucket backing for the registry (which are the default settings).
- Ensure that you are using the new Zarf init config from the IaC. You will also need to run the Zarf init steps again to update Zarf. See the "Initializing Zarf on SmoothGlue" section in the "SmoothGlue Build/Run Deploy Guide" for more details. The Zarf registry will be unusable from the time the IaC is run until Zarf is re-initialized.
  ZARF_CONFIG=infra-iac/outputs/zarf-init-config.yaml zarf init --components git-server --architecture=amd64
- The Nexus Repository Manager APIs changed, impacting the blobstorage and repo jobs. We have temporarily modified the IaC values to disable these jobs. Ensure the IaC values are applied, or the Nexus Repository Manager upgrade will fail.
- You may need to restart the Keycloak pods to ensure that all pods are using the updated Keycloak theme bundled with this SmoothGlue release:
  kubectl rollout restart statefulset -n keycloak keycloak
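Since v0.54.0 is now the minimum supported Zarf version, a quick check of the local binary before re-initializing may save a failed run. This is a sketch under the assumption that the version string has the form v0.54.0, which is how Zarf versions are written in these notes; zarf_version_ok is a hypothetical helper name.

```shell
# Return success when the given version is at least 0.54.
# $1: a version string such as v0.54.0 (a leading "v" is optional).
zarf_version_ok() {
  v="${1#v}"            # drop the leading "v" if present
  major="${v%%.*}"      # text before the first dot
  rest="${v#*.}"        # text after the first dot
  minor="${rest%%.*}"   # text before the second dot
  [ "$major" -gt 0 ] || [ "$minor" -ge 54 ]
}

# Usage (assumes "zarf version" prints only the version string):
#   zarf_version_ok "$(zarf version)" || echo "Upgrade Zarf before running zarf init"
```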
📦 SmoothGlue Features
- This release updates the styling and branding of the SmoothGlue Keycloak theme. Notably, the theme now includes a configurable Terms of Use banner, which can be configured on a per-realm basis. This feature is not enabled by default; to configure the Terms of Use banner, follow these steps:
- Log into Keycloak's admin console and select the realm you wish to modify.
- Navigate to "Realm Settings" -> "Localization" -> "Realm Overrides".
- Click the "Add Translation" button to add a translation with the key "termsText". The value should be the text of your Terms of Use banner. This field supports HTML tags.
- IAM Roles for Service Accounts (IRSA) is now enabled by default for EKS cluster nodes to access the Zarf registry when it is backed by an S3 bucket (itself a default setting). This moves away from the IMDSv1-based S3 bucket policy, which was a less secure access method. Existing clusters will transition seamlessly from IMDSv1 to IRSA-based Zarf registry access without requiring any user intervention. Zarf has been updated to v0.54.0 to support this.
⏩ Upgraded Packages
- This release of SmoothGlue Enterprise v6.15.0 includes Big Bang Version 2.52.0. For more details on the features and updates included in Big Bang Version 2.52.0, please refer to the Big Bang Release Notes.
- Upgrades the following Big Bang third-party apps:
- Cert-Manager: 1.17.2
- Confluence chart: 1.22.7-bb.1
- Jira chart: 2.0.0-bb.0
🐞 Bug Fixes
- Updated the Keycloak theme, which resolves a JavaScript error on the login page and restyles the "update password" page, which previously had white text on a white background for the password input field.
❗ Known Issues
- You may encounter a scenario following the upgrade or installation where istio-proxy fails to communicate properly with the istiod service. You may observe an error similar to the following:
  2025-05-12T16:58:56.878375Z warn ca ca request failed, starting attempt 4 in 804.379312ms
  2025-05-12T16:58:57.683232Z error citadelclient failed to sign CSR: create certificate: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing:
  - To work around this issue, restart the istiod Deployment.
- The following only applies to the initial deployment of the SmoothGlue IaC. No action is required for updates to already deployed clusters. Due to an upstream issue in the EKS module when deploying a cluster using an AL2023 AMI, the System Integrator will need to manually generate and set a Zarf registry pull password. The following config can be added to the env.hcl file:
  locals {
    cluster_inputs = {
      zarf_registry_pull_password = "securepassword123"
    }
  }
- When using a network load balancer (NLB) with the preserve_client_ip option enabled, the default routing rules for EKS nodes prevent nodes from accessing platform services hosted on the same node, which can cause failures when logging into Keycloak, particularly on clusters with fewer nodes.
  - More specifically, the default routing rules for nodes do not route traffic to the VPC router for traffic within the node’s local subnet, since these addresses should theoretically be reachable directly by the node. However, when using the preserve_client_ip option, the VPC router rewrites the source IP for traffic; when the node attempts to talk to the NLB, the traffic is rewritten so that it appears to come from the node itself, and the return traffic cannot be routed correctly back to the NLB.
  - The following options are potential workarounds:
    - Disabling the preserve_client_ip option on the NLB will resolve the issue at the cost of losing source attribution for incoming traffic.
    - Removing the local subnet route on nodes will resolve the issue at the cost of increasing the amount and cost of traffic being routed through the VPC router.
    - Increasing the node count for the cluster will reduce the likelihood of the issue because it will become less likely for any given traffic to be routed back to the original node.
- The Big Bang Istio Helm chart has a bug that prevents Istio Gateway deployments from properly being upgraded. During an upgrade, Istio Gateway deployments may get stuck as a result and will need manual intervention to complete the upgrade. To validate the issue in the cluster, check the health of the istiooperators.install.istio.io resource as follows:
  kubectl get istiooperators.install.istio.io -n istio-system
  If it is in Error status, delete all Istio Gateway deployments in the istio-system namespace to allow the Istio Operator to finish reconciling the upgrade and report a Healthy status. The deployments will be recreated automatically by the Istio Operator. For example:
  kubectl delete deployment.apps/admin-ingressgateway -n istio-system
  kubectl delete deployment.apps/passthrough-ingressgateway -n istio-system
  kubectl delete deployment.apps/public-ingressgateway -n istio-system
  Note: Deleting the deployments will entail some brief but non-zero downtime.
  https://repo1.dso.mil/big-bang/product/packages/istio-controlplane/-/issues/253 has been opened to track this issue, which was introduced in SmoothGlue 6.7 (Big Bang 2.44).
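The AL2023 known issue above asks the System Integrator to generate a Zarf registry pull password manually, and the value shown in env.hcl is only a placeholder. One possible way to produce a random value is sketched below; generate_pull_password is a hypothetical helper, and it assumes /dev/urandom, head, od, and tr are available. Any password policy your organization enforces takes precedence over this format.

```shell
# Print a random lowercase hexadecimal string.
# $1: number of random bytes to read (default 16, which yields 32 characters).
generate_pull_password() {
  bytes="${1:-16}"
  # od prints the bytes as hex pairs; tr strips the spaces and newlines.
  head -c "$bytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# Usage: copy the printed value into zarf_registry_pull_password in env.hcl.
#   generate_pull_password
```

Treat the generated value as a secret and store it according to your usual secret-handling practice.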
🌐 Compatibility
- The packages for this release were built using Zarf v0.54.0.
- The packages were tested across the following Kubernetes distributions:
  - RKE2: v1.30.12+rke2r1
  - K3s: v1.32.3+k3s1
  - EKS: v1.30.9-eks-5d632ec
🔗 Helpful Links
- Refer to the SmoothGlue documentation for additional guidance.
6.14.1 (2025-05-07)
Package Bug Fixes
- add missing k8s-sidecar image (b2fc23e)
6.14.0 (2025-05-01)
🚨 Upgrade Notices
- Updated Console's default config to change the default Keycloak host config from login.<domain> to keycloak.<domain>. SmoothGlue Build environments use the keycloak.<domain> host by default when deploying Keycloak. If deploying Keycloak under login.<domain>, the following can be added to the bigbang-values.yaml to configure Console to use the overwritten host for Keycloak:
  packages:
    console:
      values:
        keycloak:
          host: login.<domain>
- This upgrade includes a major version update to Keycloak. The full migration guide for Keycloak 26.0.0 is located here. Specific changes of note:
  - If you are currently setting the KC_PROXY environment variable to edge using the .addons.keycloak.values.secrets.env value, note that this option has been removed. It has been replaced by the KC_PROXY_HEADERS option, which should be set automatically if the value .addons.keycloak.values.proxy.enabled is set to true.
    addons:
      keycloak:
        values:
          proxy:
            enabled: true
- Keycloak may fail to upgrade.
  - To resolve this, reconcile the Helm release and then delete the pods within the keycloak namespace.
    flux reconcile hr -n bigbang keycloak --with-source --force
    kubectl delete pods -n keycloak --all
  - Alternatively, scale the Keycloak pod replica count to 1 by editing the Horizontal Pod Autoscaler before starting the upgrade.
    kubectl patch hpa -n keycloak keycloak -p "{\"spec\":{\"minReplicas\":1,\"maxReplicas\":1}}"
    Then scale it back up once the upgrade is done, for example:
    kubectl patch hpa -n keycloak keycloak -p "{\"spec\":{\"minReplicas\":2,\"maxReplicas\":5}}"
- The SmoothGlue EKS IaC's zarf-config output now includes Zarf variables for AWS and a variable for CLUSTER_NAME. When using the auto-SSO feature, please ensure the variables in the outputted zarf-config are merged with customer-managed zarf-config files to ensure Keycloak client names are configured properly.
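The Horizontal Pod Autoscaler workaround for the Keycloak upgrade described above runs the same patch twice with different replica counts. As a sketch, the two calls could share one small function; set_keycloak_hpa_replicas and the KUBECTL override are hypothetical conveniences introduced here, and the replica counts to restore should match whatever your cluster used before the upgrade.

```shell
# Command used to talk to the cluster; set KUBECTL=echo for a dry run that
# only prints the arguments instead of patching anything.
KUBECTL="${KUBECTL:-kubectl}"

# Patch the Keycloak HPA so that it keeps between $1 and $2 replicas.
set_keycloak_hpa_replicas() {
  "$KUBECTL" patch hpa -n keycloak keycloak \
    -p "{\"spec\":{\"minReplicas\":$1,\"maxReplicas\":$2}}"
}

# Usage:
#   set_keycloak_hpa_replicas 1 1   # before starting the upgrade
#   set_keycloak_hpa_replicas 2 5   # after the upgrade (example values from above)
```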
📦 SmoothGlue Features
- SmoothGlue Automated SSO:
  - SmoothGlue now automatically configures the _structusureAdmins group with the appropriate Keycloak permissions to manage the smoothglue realm.
  - SmoothGlue now automatically configures cluster-level prefixes onto Keycloak clients managed by SmoothGlue. This change will enable future work to allow a SmoothGlue Run's SSO clients to be managed by SmoothGlue. Some applications may need to be manually restarted to pick up the new SSO client names.
  - SmoothGlue now automates the creation of the Keycloak objects for SonarQube. There are still some manual steps required by System Integrators to enable SSO for SonarQube. Please see updated documentation.
  - SmoothGlue will automatically configure a Keycloak client and automatically configure the application for:
    - GitLab
    - Mattermost
    - Console
  - SmoothGlue will automatically configure a Keycloak client for:
    - Confluence
    - Jira
- This release updates the styling and branding of the SmoothGlue Keycloak theme. Notably, the theme now includes a configurable terms of use banner which can be configured on a per-realm basis. This feature is not enabled by default; to configure the terms of use banner, follow these steps:
- Log into Keycloak's admin console and select the realm you wish to modify.
- Navigate to "Realm Settings" -> "Localization" -> "Realm Overrides".
- Click the "Add Translation" button to add a translation with the key "termsText". The value should be the text of your Terms of Use banner. This field supports HTML tags.
⏩ Upgraded Packages
- This release of SmoothGlue Enterprise v6.14.0 includes Big Bang Version 2.51.0. For more details on the features and updates included in Big Bang Version 2.51.0, please refer to the Big Bang release notes.
- Kiali stays pinned to v2.6.0, the version from the earlier Big Bang 2.50.0 release, due to a failure seen in the latest version.
🐞 Bug Fixes
- Updated the Docusaurus and Tailwind config to support dark mode by default. Updated the global footer to add a label for the legal links.
❗ Known Issues
- The following only applies to the initial deployment of the SmoothGlue IaC. No action is required for updates to already deployed clusters. Due to an upstream issue in the EKS module when deploying a cluster using an AL2023 AMI, the System Integrator will need to manually generate and set a Zarf registry pull password. The following config can be added to the env.hcl file:
  locals {
    cluster_inputs = {
      zarf_registry_pull_password = "securepassword123"
    }
  }
- When using a network load balancer (NLB) with the preserve_client_ip option enabled, the default routing rules for EKS nodes prevent nodes from accessing platform services hosted on the same node, which can cause failures when logging into Keycloak, particularly on clusters with fewer nodes.
  - More specifically, the default routing rules for nodes do not route traffic to the VPC router for traffic within the node’s local subnet, since these addresses should theoretically be reachable directly by the node. However, when using the preserve_client_ip option, the VPC router rewrites the source IP for traffic; when the node attempts to talk to the NLB, the traffic is rewritten so that it appears to come from the node itself, and the return traffic cannot be routed correctly back to the NLB.
  - The following options are potential workarounds:
    - Disabling the preserve_client_ip option on the NLB will resolve the issue at the cost of losing source attribution for incoming traffic.
    - Removing the local subnet route on nodes will resolve the issue at the cost of increasing the amount and cost of traffic being routed through the VPC router.
    - Increasing the node count for the cluster will reduce the likelihood of the issue because it will become less likely for any given traffic to be routed back to the original node.
- The Big Bang Istio Helm chart has a bug that prevents Istio Gateway deployments from properly being upgraded. During an upgrade, Istio Gateway deployments may get stuck as a result and will need manual intervention to complete the upgrade. To validate the issue in the cluster, check the health of the istiooperators.install.istio.io resource as follows:
  kubectl get istiooperators.install.istio.io -n istio-system
  If it is in Error status, delete all Istio Gateway deployments in the istio-system namespace to allow the Istio Operator to finish reconciling the upgrade and report a Healthy status. The deployments will be recreated automatically by the Istio Operator. For example:
  kubectl delete deployment.apps/admin-ingressgateway -n istio-system
  kubectl delete deployment.apps/passthrough-ingressgateway -n istio-system
  kubectl delete deployment.apps/public-ingressgateway -n istio-system
  Note: Deleting the deployments will entail some brief but non-zero downtime.
  https://repo1.dso.mil/big-bang/product/packages/istio-controlplane/-/issues/253 has been opened to track this issue, which was introduced in SmoothGlue 6.7 (Big Bang 2.44).
🌐 Compatibility
- The packages for this release were built using Zarf v0.46.0.
- The packages were tested across the following Kubernetes distributions:
- RKE2: v1.30.11+rke2r1
- K3s: v1.32.3+k3s1
- EKS: v1.30.9-eks-5d632ec
🔗 Helpful Links
- Refer to the SmoothGlue documentation for additional guidance.
- For details on the Big Bang release, see the Big Bang Release Notes.
6.13.0 (2025-04-16)
📦 SmoothGlue Features
- A new optional variable cloudwatch_log_group_retention_in_days has been added to the env.hcl files to configure the EKS cluster log group retention time. It can be configured as shown below.
  locals {
    cluster_inputs = {
      # Adjust as needed, default is 90 days. Valid value for X is one of:
      # [0 1 3 5 7 14 30 60 90 120 150 180 365 400 545 731 1096 1827 2192 2557 2922 3288 3653]
      cloudwatch_log_group_retention_in_days = X
    }
  }
- The SmoothGlue IaC no longer applies tags to the IAM policies it generates when compatibility_mode is set to true. This change is to conform to limitations on high-side deployments.
  - NOTE: On an existing cluster, you may need to delete the allow_kms and allow_cluster_autoscaler IAM policies to allow their recreation.
⏩ Upgraded Packages
- This release of SmoothGlue Enterprise v6.13.0 includes Big Bang Version 2.50.0. For more details on the features and updates included in Big Bang Version 2.50.0, please refer to the Big Bang release notes.
- Upgrades the following Big Bang third-party apps:
- Confluence LTS: 9.2.3
- Jira LTS: 10.3.5
- Nexus IQ Server: 1.189.0-01
🐞 Bug Fixes
- Fixed an issue that prevented user data scripts from running on AL2023 AMIs.
- Fixed an issue that overwrote default values from SmoothGlue with customer overrides. This prevented SmoothGlue default values from being viewable at runtime. Overrides provided by customers are still overlaid on top of SmoothGlue default values, so this is a purely cosmetic change.
❗ Known Issues
- The following only applies to the initial deployment of the SmoothGlue IaC. No action is required for updates to already deployed clusters. Due to an upstream issue in the EKS module when deploying a cluster using an AL2023 AMI, the System Integrator will need to manually generate and set a Zarf registry pull password. The following config can be added to the env.hcl file:
  locals {
    cluster_inputs = {
      zarf_registry_pull_password = "securepassword123"
    }
  }
- When using a network load balancer (NLB) with the preserve_client_ip option enabled, the default routing rules for EKS nodes prevent nodes from accessing platform services hosted on the same node, which can cause failures when logging into Keycloak, particularly on clusters with fewer nodes.
  - More specifically, the default routing rules for nodes do not route traffic to the VPC router for traffic within the node’s local subnet, since these addresses should theoretically be reachable directly by the node. However, when using the preserve_client_ip option, the VPC router rewrites the source IP for traffic; when the node attempts to talk to the NLB, the traffic is rewritten so that it appears to come from the node itself, and the return traffic cannot be routed correctly back to the NLB.
  - The following options are potential workarounds:
    - Disabling the preserve_client_ip option on the NLB will resolve the issue at the cost of losing source attribution for incoming traffic.
    - Removing the local subnet route on nodes will resolve the issue at the cost of increasing the amount and cost of traffic being routed through the VPC router.
    - Increasing the node count for the cluster will reduce the likelihood of the issue because it will become less likely for any given traffic to be routed back to the original node.
- The Big Bang Istio Helm chart has a bug that prevents Istio Gateway deployments from properly being upgraded. During an upgrade, Istio Gateway deployments may get stuck as a result and will need manual intervention to complete the upgrade. To validate the issue in the cluster, check the health of the istiooperators.install.istio.io resource as follows:
  kubectl get istiooperators.install.istio.io -n istio-system
  If it is in Error status, delete all Istio Gateway deployments in the istio-system namespace to allow the Istio Operator to finish reconciling the upgrade and report a Healthy status. The deployments will be recreated automatically by the Istio Operator. For example:
  kubectl delete deployment.apps/admin-ingressgateway -n istio-system
  kubectl delete deployment.apps/passthrough-ingressgateway -n istio-system
  kubectl delete deployment.apps/public-ingressgateway -n istio-system
  Note: Deleting the deployments will entail some brief but non-zero downtime.
  https://repo1.dso.mil/big-bang/product/packages/istio-controlplane/-/issues/253 has been opened to track this issue, which was introduced in SmoothGlue 6.7 (Big Bang 2.44).
🌐 Compatibility
- The packages for this release were built using Zarf v0.46.0.
- The packages were tested across the following Kubernetes distributions:
  - RKE2: v1.29.8+rke2r1
  - K3s: v1.32.3+k3s1
  - EKS: v1.30.9-eks-5d632ec
🔗 Helpful Links
- Refer to the SmoothGlue documentation for additional guidance.
6.12.0 (2025-04-02)
🚨 Upgrade Notices
- Kyverno
  - A new Kyverno Policy has been added which mutates pod specs to drop ALL capabilities in all containers if not already done. This policy works in tandem with the require-drop-all-capabilities policy to make it easier for SREs to securely deploy workloads to their clusters without having to explicitly modify the securityContext of each of the pod's containers to be compliant.
  - If Big Bang consumers are currently excluding certain workloads from the require-drop-all-capabilities policy due to incompatibilities with that policy, those exclusions should also be added to this new policy, add-default-capability-drop, to avoid workload interruption.
⏩ Upgraded Packages
- This release of SmoothGlue Enterprise v6.12.0 includes Big Bang Version 2.49.0. For more details on the features and updates included in Big Bang Version 2.49.0, please refer to the Big Bang Release Notes.
- console: updated image to 54183
- nexus-iq: chart upgraded to 188
🌐 Compatibility
- The packages for this release were built using Zarf v0.46.0.
- The packages were tested across the following Kubernetes distributions:
  - RKE2: v1.30.9-rke2r1
  - K3s: v1.30.9+k3s1
  - EKS: v1.30.8
- The following AMI versions were used for testing:
  - RKE2 AMI: smoothglue-rke2-v1.30.9-rke2r1-rocky-8-base-v1.1.1-stig-2025-02-17T09-24-30Z
  - EKS AMI: smoothglue-eks-1.30.8-rocky-8-base-v1.1.1-stig-2025-02-10T09-21-19Z
  - Base AMI: base-Rocky-8-EC2-LVM-v1.1.1-stig-2025-02-10T0802
🔗 Helpful Links
- Refer to the SmoothGlue documentation for additional guidance.
- For details on the Big Bang release, see the Big Bang Release Notes.
6.11.0 (2025-03-19)
🚨 Upgrade Notices
- Flux has been upgraded to 2.5.1. Platform Operators should update their local Flux binary to a compatible version.
📦 SmoothGlue Features
- On RKE2-based clusters, the preserve_client_ips IaC variable is now set to false by default in order to allow pods to communicate internally using external DNS names. See the known issues section for more information.
  Note: This means that the client IP in any logs will appear to be the load balancer itself. To get the true client IP, you will need to set up monitoring on the NLB.
- Added documentation for manually rotating RDS/Aurora database passwords as a good security practice.
⏩ Upgraded Packages
- This release of SmoothGlue Enterprise v6.11.0 includes Big Bang Version 2.48.0. For more details on the features and updates included in Big Bang Version 2.48.0, please refer to the Big Bang Release Notes.
- Update Gitlab to 17.9.2 (applied Critical Patch)
- Update Jira to LTS 10.3.4 (addresses CVE-2024-38819)
- Update JSM to 10.3.4 (addresses CVE-2024-38819)
🐞 Bug Fixes
- Fixed an issue with the SmoothGlue automated SSO feature for ArgoCD. SmoothGlue Admins should now be correctly given admin privileges in ArgoCD.
❗ Known Issues
- When using a network load balancer (NLB) with the `preserve_client_ip` option enabled, the default routing rules for EKS nodes prevent nodes from accessing platform services hosted on the same node, which can cause failures when logging into Keycloak, particularly on clusters with fewer nodes.
  - More specifically, the default routing rules for nodes do not route traffic to the VPC router for traffic within the node’s local subnet, since these addresses should theoretically be reachable directly by the node. However, when using the `preserve_client_ip` option, the VPC router rewrites the source IP for traffic; when the node attempts to talk to the NLB, the traffic is rewritten so that it appears to come from the node itself, and the return traffic is not able to be routed correctly back to the NLB.
  - The following options are potential workarounds:
    - Disabling the `preserve_client_ip` option on the NLB will resolve the issue at the cost of losing source attribution for incoming traffic.
    - Removing the local subnet route on nodes will resolve the issue at the cost of increasing the amount and cost of traffic being routed through the VPC router.
    - Increasing the node count for the cluster will reduce the likelihood of the issue because it will become less likely for any given traffic to be routed back to the original node.
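As a rough illustration of the first workaround above, the sketch below uses the AWS CLI to inspect and then disable the `preserve_client_ip.enabled` attribute on a single NLB target group. The `TARGET_GROUP_ARN` variable and the two function names are hypothetical placeholders, not part of SmoothGlue. A change made this way sits outside the IaC and would likely be reverted by a later Terraform run, so treat it as a diagnostic aid rather than the supported configuration path.

```shell
# Hypothetical placeholder: set this to the ARN of the NLB target group to inspect.
TARGET_GROUP_ARN="${TARGET_GROUP_ARN:-}"

# Print the current value ("true" or "false") of the preserve-client-IP attribute.
show_preserve_client_ip() {
  aws elbv2 describe-target-group-attributes \
    --target-group-arn "$1" \
    --query "Attributes[?Key=='preserve_client_ip.enabled'].Value" \
    --output text
}

# Turn the attribute off for one target group.
disable_preserve_client_ip() {
  aws elbv2 modify-target-group-attributes \
    --target-group-arn "$1" \
    --attributes Key=preserve_client_ip.enabled,Value=false
}

# Only act when an ARN was supplied, so sourcing this file has no side effects.
if [ -n "$TARGET_GROUP_ARN" ]; then
  echo "preserve_client_ip.enabled is currently: $(show_preserve_client_ip "$TARGET_GROUP_ARN")"
  disable_preserve_client_ip "$TARGET_GROUP_ARN"
fi
```

An NLB typically has one target group per listener port, so the same check would need to be repeated for each target group attached to the load balancer.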
🌐 Compatibility
- The packages for this release were built using Zarf v0.46.0.
- The packages were tested across the following Kubernetes distributions:
  - RKE2: `v1.30.9-rke2r1`
  - K3s: `v1.30.9+k3s1`
  - EKS: `v1.30.8`
- The following AMI versions were used for testing:
  - RKE2 AMI: `smoothglue-rke2-v1.30.9-rke2r1-rocky-8-base-v1.1.1-stig-2025-02-17T09-24-30Z`
  - EKS AMI: `smoothglue-eks-1.30.8-rocky-8-base-v1.1.1-stig-2025-02-10T09-21-19Z`
  - Base AMI: `base-Rocky-8-EC2-LVM-v1.1.1-stig-2025-02-10T0802`
🔗 Helpful Links
- Refer to the SmoothGlue documentation for additional guidance.
- For details on the Big Bang release, see the Big Bang Release Notes.
6.10.0 (2025-03-04)
📦 SmoothGlue Features
- This release adds optional basic support for Amazon Linux 2023 (AL2023) AMIs in EKS cluster IaC. To use AL2023, add the following to the `cluster_inputs` section of your `env.hcl` file:

  ```hcl
  locals {
    cluster_inputs = {
      ami_id           = "ami-0123456789abcdef0" # replace with actual AMI ID
      default_ami_type = "AL2023_x86_64_STANDARD"
    }
  }
  ```

  - For more details, refer to How to Create an EKS Cluster with Amazon Linux 2023.
- This release adds optional support in IaC for provisioning GitLab's database using an RDS Multi-AZ cluster rather than a single database instance. A single-instance RDS deployment supports only one warm standby instance, whereas an RDS Multi-AZ cluster provides a cluster of three instances. There is no automatic migration path from a single RDS instance to a Multi-AZ cluster, so we recommend enabling the Multi-AZ cluster during the initial cluster provision if possible. To migrate an existing cluster, you will need to perform a database export and import manually.
  - Note that there are instance class limitations when using a Multi-AZ RDS cluster; see the AWS documentation for more information.
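For the AL2023 item above, the `ami_id` value has to be filled in with a real AMI ID. One possible way to find a candidate, assuming you want the AWS-published EKS-optimized AL2023 image rather than a custom or hardened one, is to read the public SSM parameter that AWS maintains for it. In this sketch the function names and the `K8S_VERSION`, `AWS_REGION`, and `RUN_LOOKUP` variables are hypothetical, and the default version `1.30` is taken from the EKS version listed under Compatibility.

```shell
# Assumed defaults; override through the environment as needed.
K8S_VERSION="${K8S_VERSION:-1.30}"
AWS_REGION="${AWS_REGION:-us-east-1}"

# Build the SSM parameter name for the recommended EKS-optimized AL2023 AMI
# (x86_64, standard variant) for a given Kubernetes minor version.
al2023_ami_parameter() {
  echo "/aws/service/eks/optimized-ami/$1/amazon-linux-2023/x86_64/standard/recommended/image_id"
}

# Resolve that parameter to an AMI ID in the given region.
lookup_al2023_ami() {
  aws ssm get-parameter \
    --name "$(al2023_ami_parameter "$1")" \
    --region "$2" \
    --query "Parameter.Value" \
    --output text
}

# RUN_LOOKUP is a hypothetical switch so that sourcing this file makes no API calls.
if [ "${RUN_LOOKUP:-0}" = "1" ]; then
  lookup_al2023_ami "$K8S_VERSION" "$AWS_REGION"
fi
```

AMI IDs are region-specific, so the lookup should be run against the same region as the cluster, and the returned ID pasted into `ami_id` alongside `default_ami_type = "AL2023_x86_64_STANDARD"`.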
⏩ Upgraded Packages
- This release of SmoothGlue Enterprise v6.10.0 includes Big Bang Version 2.47.0. For more details on the features and updates included in Big Bang Version 2.47.0, please refer to the Big Bang Release Notes.
- Update Confluence to LTS 9.2.1 (Helm chart version 1.22.5-bb.0).
- Update Jira to LTS 10.3.3 (Helm chart version 1.22.5-bb.1).
🪲 Bug Fixes
- Refactor IaC compatibility mode toggle to correctly disable NLB stickiness in ISO regions.
- Exclude the `aws-ebs-csi-driver` namespace from the `generate-networkpolicy-imds` Kyverno ClusterPolicy so that RKE2 clusters can provision EBS PVCs correctly.
- Exclude the following namespaces from the `require-istio-on-namespaces` Kyverno ClusterPolicy so that users may enforce the policy:
  - `cluster-autoscaler`
  - `crossplane-system`
  - `kyverno`
  - `structsure-system`
- Extend HelmRelease install timeout for GitLab to 15 minutes.
❗️ Known Issues
- When using a network load balancer (NLB) with the `preserve_client_ip` option enabled, the default routing rules for EKS and RKE2 nodes prevent nodes from accessing platform services hosted on the same node, which can cause failures when logging into Keycloak, particularly on clusters with fewer nodes.
  - More specifically, the default routing rules for nodes do not route traffic to the VPC router for traffic within the node's local subnet, since these addresses should theoretically be reachable directly by the node. However, when using the `preserve_client_ip` option, the VPC router rewrites the source IP for traffic; when the node attempts to talk to the NLB, the traffic is rewritten so that it appears to come from the node itself, and the return traffic is not able to be routed correctly back to the NLB.
  - We are currently working on an Istio-level fix which should prevent VirtualService traffic within the cluster from ever leaving the cluster. Until that fix is available, the following options are potential workarounds:
    - Disabling the `preserve_client_ip` option on the NLB will resolve the issue at the cost of losing source attribution for incoming traffic.
    - Removing the local subnet route on nodes will resolve the issue at the cost of increasing the amount and cost of traffic being routed through the VPC router.
    - Increasing the node count for the cluster will reduce the likelihood of the issue because it will become less likely for any given traffic to be routed back to the original node.
🌐 Compatibility
- The packages for this release were built using Zarf v0.46.0.
- The packages were tested across the following Kubernetes distributions:
  - RKE2: `v1.30.9-rke2r1`
  - K3s: `v1.30.9+k3s1`
  - EKS: `v1.30.8`
- The following AMI versions were used for testing:
  - RKE2 AMI: `smoothglue-rke2-v1.30.9-rke2r1-rocky-8-base-v1.1.1-stig-2025-02-17T09-24-30Z`
  - EKS AMI: `smoothglue-eks-1.30.8-rocky-8-base-v1.1.1-stig-2025-02-10T09-21-19Z`
  - Base AMI: `base-Rocky-8-EC2-LVM-v1.1.1-stig-2025-02-10T0802`
🔗 Helpful Links
- Refer to the SmoothGlue documentation for additional guidance.
- For details on the Big Bang release, see the Big Bang Release Notes.