Upgrade Kubeflow Operators
This page describes the manual actions that may be required after upgrading the Kubeflow operators.
For installation steps, see Install Kubeflow Operators.
Migrating from v1.x (Cluster Plugin) to v26.3.0 (OLM Operator)
Starting in v26.3.0 (Alauda AI v2.3), Kubeflow components ship as OLM Helm Operators (kfbase-operator, kfp-operator, kubeflow-trainer-operator) instead of Cluster Plugins (kfbase, kfp, kftraining, kubeflow-trainer). There is no in-place upgrade path between the two form factors — the Cluster Plugin install descriptor (ModuleInfo) and the OLM Subscription are mutually incompatible.
Recommended migration procedure
1. Back up user data:
   - Snapshot Notebook PVCs in every user namespace.
   - Export Profile CRs and any custom RoleBinding/AuthorizationPolicy you created for Kubeflow users.
   - Export your KFP pipelines, experiments, and scheduled runs via the kfp CLI.
   - Export any TrainingRuntime and active TrainJob CRs.
2. Uninstall the v1.x Cluster Plugin installs from the AC UI (Cluster Plugins): remove kubeflow-trainer, kfp, kftraining (if present), and kfbase, in that order. The matching ModuleInfo/ModuleConfig resources are removed automatically.
   Warning: Do not proceed until the backups in step 1 are confirmed. If uninstalling removes the kubeflow.org CRDs, all Profile CRs (and the user namespaces / Notebook PVCs they own) may be cascade-deleted. Verify whether your user namespaces and PVCs survive the uninstall before relying on the restore step below.
3. Install the v26.3.0 operator set from Administrator > MarketPlace > OperatorHub:
   - kfbase-operator first (other operators depend on the base components).
   - kfp-operator if you need Kubeflow Pipelines (amd64 clusters only).
   - kubeflow-trainer-operator if you need Trainer v2.
4. Create the matching CR instances (KubeflowBase, KubeflowPipelines, KubeflowTrainer) and reuse the configuration from your v1.x install. The chart values previously set in the Cluster Plugin install form are now exposed through the operator's CSV specDescriptors; most field names are unchanged.
5. Restore user data: If the Profile CRs and their PVCs were preserved through the uninstall, they reattach automatically through the Notebook controller reconcile. If they were removed, re-apply the Profile CRs you exported in step 1 and restore the PVCs from your snapshots first. In all cases, re-import KFP pipelines and TrainingRuntimes.
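The CR exports in the backup step can be sketched with plain kubectl. This is a minimal sketch rather than an official procedure: the function name, the output directory, and the fully qualified resource names (profiles.kubeflow.org, trainingruntimes.trainer.kubeflow.org, trainjobs.trainer.kubeflow.org) are assumptions that should be checked against `kubectl api-resources` on your cluster. PVC snapshots and the KFP export are not covered here.

```shell
#!/bin/sh
# Sketch: dump selected Kubeflow CRs to YAML files before uninstalling v1.x.
# ASSUMPTION: the resource names below match the CRDs installed on your cluster.

# Hypothetical list of resources to export (edit to match your cluster).
BACKUP_RESOURCES="profiles.kubeflow.org trainingruntimes.trainer.kubeflow.org trainjobs.trainer.kubeflow.org"

# export_crs DIR: write one <resource>.yaml file per resource into DIR.
# The body runs in a subshell so that "exit" only leaves the function.
export_crs() (
  dir="$1"
  mkdir -p "$dir" || exit 1
  for res in $BACKUP_RESOURCES; do
    # -A covers all namespaces; it is harmless for cluster-scoped kinds.
    kubectl get "$res" -A -o yaml > "$dir/$res.yaml" || exit 1
    echo "exported $res -> $dir/$res.yaml"
  done
)

# Usage (not run automatically):
#   export_crs ./kubeflow-backup
```

Exported manifests include server-populated fields such as status and resourceVersion, so they may need cleaning before being re-applied.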
What changed between v1.x and v26.3.0
- Form factor: Cluster Plugin → OLM Helm Operator.
- Install descriptor: ModuleInfo → OLM Subscription + ClusterServiceVersion + operator-owned CR.
- Trainer: kftraining (Training Operator v1, deprecated) is removed; replaced by kubeflow-trainer-operator (Trainer v2).
- Upstream alignment: all charts re-pinned to kubeflow/manifests 26.03.
- Architecture: kfp-operator is now amd64-only; kfbase-operator and kubeflow-trainer-operator remain amd64 + arm64.
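Because kfp-operator is amd64-only, it may be worth checking which architectures the cluster nodes report before installing it. The sketch below reads the architecture from each node's status; the function names are hypothetical, and it does not account for taints or node selectors.

```shell
#!/bin/sh
# Sketch: check whether every node in the cluster reports amd64.

# node_arches: print the distinct architectures reported by cluster nodes.
node_arches() {
  kubectl get nodes --no-headers \
    -o custom-columns=ARCH:.status.nodeInfo.architecture \
    | sort -u
}

# all_amd64: succeed only if amd64 is the sole architecture read from stdin.
all_amd64() {
  arches="$(grep -v '^$')"
  [ "$arches" = "amd64" ]
}

# Usage (not run automatically):
#   node_arches | all_amd64 && echo "all nodes are amd64"
```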
Upgrade Notes for kfbase (v1.x history)
Upgrading from v1.10.13 or earlier
Versions up to v1.10.13 expose the Kubeflow dashboard through NodePort. After the upgrade, the
recommended access method is through the gateway endpoint instead.
After the upgrade:
- Check the kubeflowDomain field in the kfbase plugin configuration to get <your-kubeflow-domain>.
- Run kubectl -n istio-system get gateway kubeflow-external-gateway to get the gateway IP address.
- Update DNS resolution, or your local hosts file, so that <your-kubeflow-domain> resolves to the gateway IP address.
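For the hosts-file variant, the steps above can be sketched as follows. The function names and the kubeflow.example.com placeholder are hypothetical, and gateway_ip assumes the gateway resource exposes its address under .status.addresses (as Gateway API resources do). If that field is empty on your cluster, read the address from the plain kubectl get output instead.

```shell
#!/bin/sh
# Sketch: build a hosts-file line mapping the Kubeflow domain to the gateway IP.

# gateway_ip: print the first address from the gateway status.
# ASSUMPTION: the resource exposes .status.addresses (Gateway API style).
gateway_ip() {
  kubectl -n istio-system get gateway kubeflow-external-gateway \
    -o jsonpath='{.status.addresses[0].value}'
}

# hosts_entry DOMAIN IP: print "IP DOMAIN", or fail if either value is empty.
hosts_entry() {
  [ -n "$1" ] && [ -n "$2" ] || return 1
  printf '%s %s\n' "$2" "$1"
}

# Usage (not run automatically; kubeflow.example.com is a placeholder):
#   hosts_entry kubeflow.example.com "$(gateway_ip)" | sudo tee -a /etc/hosts
```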
If you still need to use NodePort, manually change the istio-system/kubeflow-istio-ingressgateway
service to type NodePort, then get the assigned port for 443 by running:
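One way to do both with standard kubectl calls is sketched below. The function names are hypothetical and the commands may differ from those intended for this step; the sketch also assumes nothing reconciles the service back to its previous type.

```shell
#!/bin/sh
# Sketch of the NodePort fallback using standard kubectl calls.
SVC_NS="istio-system"
SVC_NAME="kubeflow-istio-ingressgateway"

# use_nodeport: switch the ingress gateway service to type NodePort.
use_nodeport() {
  kubectl -n "$SVC_NS" patch service "$SVC_NAME" \
    -p '{"spec":{"type":"NodePort"}}'
}

# https_nodeport: print the node port assigned to service port 443.
https_nodeport() {
  kubectl -n "$SVC_NS" get service "$SVC_NAME" \
    -o jsonpath='{.spec.ports[?(@.port==443)].nodePort}'
}

# Usage (not run automatically):
#   use_nodeport && https_nodeport
```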
You can then access the dashboard through:
Upgrading from v1.10.9 or earlier
Before the upgrade, set a default storage class in your cluster. It is used as the value of the pgStorageClass parameter in the kfbase plugin configuration. If no default storage class is set, the upgrade may fail because required parameters are missing. These parameters were introduced in v1.10.10.
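A minimal sketch for checking and setting the default storage class is shown below. It uses the standard storageclass.kubernetes.io/is-default-class annotation; the function names and the my-sc placeholder are hypothetical.

```shell
#!/bin/sh
# Sketch: inspect and set the cluster's default StorageClass.

# list_storageclasses: the default class is listed with "(default)" after its name.
list_storageclasses() {
  kubectl get storageclass
}

# set_default_storageclass NAME: mark NAME as default via the standard annotation.
set_default_storageclass() {
  [ -n "$1" ] || { echo "usage: set_default_storageclass NAME" >&2; return 1; }
  kubectl patch storageclass "$1" \
    -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
}

# Usage (not run automatically; my-sc is a placeholder):
#   list_storageclasses
#   set_default_storageclass my-sc
```

If another class already carries the annotation, set it to "false" there first so that only one class is marked as default.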