EMF On-Prem Upgrade Guide#

Upgrade Path: EMF On-Prem v3.0 → v3.1

Document Version: 1.0

Overview#

This document provides step-by-step instructions to upgrade On-Prem Edge Manageability Framework (EMF) from version 3.0 to 3.1.

Important Notes#

Warning

DISRUPTIVE UPGRADE WARNING: This upgrade requires edge node re-onboarding due to an architecture change (RKE2 → K3s). Plan for edge node service downtime and for manual data backup/restore procedures on the edge nodes.

Prerequisites#

System Requirements#

  • Current EMF On-Prem installation version 3.0

  • Root/sudo privileges on orchestrator node

  • PostgreSQL service running and accessible

  • Sufficient disk space for backups (approximately 200 GB or more)

  • Docker Hub credentials, in case image pull rate limits are hit
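The prerequisite checks above can be scripted. The following is a hedged pre-flight sketch: the mount point and threshold are assumptions, so adjust them to where your backups will actually be written.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; adjust paths and thresholds to your site.

REQUIRED_GB=200   # matches the ~200 GB guidance above

# Pure helper: succeeds when available space meets the requirement.
enough_disk() { [ "$1" -ge "$2" ]; }   # usage: enough_disk <avail_gb> <required_gb>

avail_gb=$(df --output=avail -BG / 2>/dev/null | tail -1 | tr -dc '0-9')
if [ -n "$avail_gb" ] && enough_disk "$avail_gb" "$REQUIRED_GB"; then
  echo "Disk space OK: ${avail_gb}G available"
else
  echo "WARNING: less than ${REQUIRED_GB}G available on /" >&2
fi

# PostgreSQL must be running and accepting connections.
if command -v pg_isready >/dev/null 2>&1; then
  pg_isready || echo "WARNING: PostgreSQL is not accepting connections" >&2
fi

# Confirm the currently installed on-prem packages (expect 3.0.x versions).
dpkg -l 2>/dev/null | grep onprem- || echo "NOTE: no onprem- packages found" >&2
```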

Pre-Upgrade Checklist#

Upgrade Procedure#

Step 1: Copy Latest OnPrem Upgrade Script#

On the node where the orchestrator is deployed, copy the latest upgrade script:

# Go to the home directory
cd
# Copy the installer scripts from the repository checkout
cp edge-manageability-framework/on-prem-installers/onprem/*.sh ~/
# Make the upgrade script executable
chmod +x onprem_upgrade.sh

Step 2: Open Two Terminals#

You will need two terminals for this upgrade process:

  • Terminal 1: To run the upgrade script

  • Terminal 2: To update proxy and load balancer configurations when prompted

Step 3: Terminal 1 - Set Environment Variables#

In Terminal 1, set the required environment variables:

# get LB IP
kubectl get svc argocd-server -n argocd
kubectl get svc traefik -n orch-gateway
kubectl get svc ingress-nginx-controller -n orch-boots

# Set Environment

export RELEASE_SERVICE_URL=registry-rs.edgeorchestration.intel.com
export ORCH_INSTALLER_PROFILE=onprem
export CLUSTER_DOMAIN=cluster.onprem
export GITEA_IMAGE_REGISTRY='docker.io'
export DOCKER_USERNAME=<docker-username>
export DOCKER_PASSWORD=<docker-password>
export ARGO_IP=<ARGO_LoadBalancer_IP>
export TRAEFIK_IP=<TRAEFIK_LoadBalancer_IP>
export NGINX_IP=<NGINX_LoadBalancer_IP>

Note: If Docker pull rate limits are hit, set your Docker login credentials as the DOCKER_USERNAME and DOCKER_PASSWORD environment variables shown above.

# Unset PROCEED to allow manual confirmation
unset PROCEED

# Set deployment version (replace with your actual version tag)
export DEPLOY_VERSION=v3.1.0-rc1
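As an alternative to copying the three LoadBalancer IPs by hand from the `kubectl get svc` output, they can be derived directly from the service status. This is a hypothetical convenience using kubectl's standard jsonpath output; double-check the printed values before proceeding.

```shell
# Hypothetical helper: read a service's LoadBalancer IP via jsonpath.
lb_ip() {  # usage: lb_ip <service-name> <namespace>
  kubectl get svc "$1" -n "$2" -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

export ARGO_IP="$(lb_ip argocd-server argocd)"
export TRAEFIK_IP="$(lb_ip traefik orch-gateway)"
export NGINX_IP="$(lb_ip ingress-nginx-controller orch-boots)"

# Verify the three values are non-empty and correct before continuing.
echo "ARGO_IP=$ARGO_IP TRAEFIK_IP=$TRAEFIK_IP NGINX_IP=$NGINX_IP"
```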

Step 4: Terminal 1 - Run OnPrem Upgrade Script#

In Terminal 1, execute the upgrade script:

./onprem_upgrade.sh

The script will:

  • Validate current installation

  • Check PostgreSQL status

  • Download packages and artifacts

  • Eventually prompt for confirmation:

Ready to proceed with installation? (yes/no)
  • DO NOT enter “yes” yet - proceed to Step 5 first

Step 5: Terminal 2 - Update Configuration#

Before confirming in Terminal 1, open Terminal 2 and update configurations:

  1. Update proxy settings (if applicable):

    File: repo_archives/tmp/edge-manageability-framework/orch-configs/profiles/proxy-none.yaml
    
    argo:
     proxy:
       httpProxy: ""
       httpsProxy: ""
       noProxy: ""
       enHttpProxy: ""
       enHttpsProxy: ""
       enFtpProxy: ""
       enSocksProxy: ""
       enNoProxy: ""
    

    Note: Update the proxy settings according to your network configuration.

  2. Verify load balancer IP configuration:

    # Check current LoadBalancer IPs
    kubectl get svc argocd-server -n argocd
    kubectl get svc traefik -n orch-gateway
    kubectl get svc ingress-nginx-controller -n orch-boots
    
    # Verify that the LB IP configuration is updated
    nano repo_archives/tmp/edge-manageability-framework/orch-configs/clusters/onprem.yaml
    
  3. Ensure all configurations are correct
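Before switching back to Terminal 1, the configuration check can be made less error-prone with a small script. This is a hypothetical helper, assuming the LB IP environment variables from Step 3 are set in the same shell; it only greps for the literal IPs in onprem.yaml.

```shell
# Hypothetical sanity check: confirm the LoadBalancer IPs exported in Step 3
# actually appear in the cluster config before answering "yes" in Terminal 1.
CFG=repo_archives/tmp/edge-manageability-framework/orch-configs/clusters/onprem.yaml

check_ip_in_cfg() {  # usage: check_ip_in_cfg <ip> <file>; succeeds if found
  [ -n "$1" ] && grep -q "$1" "$2" 2>/dev/null
}

for ip in "$ARGO_IP" "$TRAEFIK_IP" "$NGINX_IP"; do
  if check_ip_in_cfg "$ip" "$CFG"; then
    echo "OK: $ip present in onprem.yaml"
  else
    echo "CHECK: '$ip' not found in onprem.yaml" >&2
  fi
done
```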

Step 6: Terminal 1 - Confirm and Continue#

Once proxy and load balancer configurations are updated in Terminal 2, switch back to Terminal 1 and enter:

yes

The upgrade will then proceed automatically through all components.

Step 7: Monitor Upgrade Progress#

The upgrade process includes:

  • OS Configuration upgrade

  • Gitea upgrade

  • ArgoCD upgrade

  • Edge Orchestrator upgrade

  • Unseal Vault

Post-Upgrade Verification#

System Health Check#

# Verify package versions
dpkg -l | grep onprem-

# Check cluster status
kubectl get nodes
kubectl get pods -A

# Verify ArgoCD applications
kubectl get applications -A

Service Validation#

  • Watch ArgoCD applications until they are in ‘Healthy’ state
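Watching the applications can be scripted rather than done by eye. The following hedged sketch polls until every ArgoCD Application reports Healthy and Synced; the field paths follow the standard Argo CD status schema, and the attempt count and interval are assumptions to tune for your site.

```shell
#!/usr/bin/env bash
# Hedged sketch: poll ArgoCD Applications until all are Healthy and Synced.

wait_for_apps() {  # usage: wait_for_apps [attempts] [interval_seconds]
  local attempts=${1:-60} interval=${2:-10} total not_ready
  for _ in $(seq 1 "$attempts"); do
    total=$(kubectl get applications -A --no-headers 2>/dev/null | wc -l)
    if [ "$total" -eq 0 ]; then
      echo "No ArgoCD applications found yet..."
      sleep "$interval"; continue
    fi
    not_ready=$(kubectl get applications -A \
      -o jsonpath='{range .items[*]}{.status.health.status}{" "}{.status.sync.status}{"\n"}{end}' \
      | grep -cv '^Healthy Synced$')
    if [ "${not_ready:-1}" -eq 0 ]; then
      echo "All ArgoCD applications are Healthy and Synced"
      return 0
    fi
    echo "Waiting: ${not_ready} application(s) not yet Healthy/Synced..."
    sleep "$interval"
  done
  echo "Timed out waiting for applications to become Healthy" >&2
  return 1
}

# Only run against a reachable cluster.
if kubectl cluster-info >/dev/null 2>&1; then
  wait_for_apps
fi
```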

Web UI Access Verification#

After a successful EMF upgrade, verify that you can access the web UI with the same project, user, and credentials used before the upgrade.

ArgoCD#

  • Username: admin

  • Retrieve argocd password:

    kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d
    

Gitea#

  • Retrieve Gitea username:

    kubectl get secret gitea-cred -n gitea -o jsonpath="{.data.username}" | base64 -d
    
  • Reset Gitea password

    # Get Gitea pod name
    GITEA_POD=$(kubectl get pods -n gitea -l app=gitea -o jsonpath='{.items[0].metadata.name}')
    
    # Reset password (replace 'test12345' with your desired password)
    kubectl exec -n gitea $GITEA_POD -- \
      bash -c 'export GITEAPASSWORD=test12345 && gitea admin user change-password --username gitea_admin --password $GITEAPASSWORD'
    
  • Login to Gitea web UI:

    kubectl -n gitea port-forward svc/gitea-http 3000:443 --address 0.0.0.0
    # Then open https://localhost:3000 in your browser and use the above credentials.
    

Troubleshooting#

Symptom: Sometimes the infra-managers application in ArgoCD may show as Not Healthy or Out of Sync. This can impact dependent components or cluster state.

Resolution Steps:

  1. Delete the infra-managers application from ArgoCD and resync the root application (root-app).

During the onprem_upgrade, if Vault appears sealed or becomes unavailable, manual intervention may be required.

Symptom:

  • Vault Unseal Problem

    Vault pod status shows sealed, causing issues with secret access or platform services. If you see the following Vault waiting output after running the on-prem upgrade script, a further manual Vault unseal is required:

    Deleting Vault pod: vault-0 in namespace: orch-platform
    pod "vault-0" deleted
    Waiting for pod 'vault-0' in namespace 'orch-platform' to be in Running state...
    
  • Check Vault status

    kubectl get pod -A | grep vault-0
    kubectl -n orch-platform exec -i vault-0 -- vault status
    
  • Vault Unseal Procedure

    # Run the Vault unseal script
    source ./vault_unseal.sh
    vault_unseal
    

Open Issues#

  • API Gateway does not reflect API changes from v1 to v2 automatically. Workaround: manually delete the nexus-api-gw pod to recover the API changes.

  • After upgrade, both RKE2 and K3s cluster templates are labeled as default. Workaround: manually delete all old RKE2-based cluster templates from the 3.0 release.

  • Deployment package extensions are not updated after upgrade. Workaround: manually delete the app-orch-tenant-controller pod.

Automation Script for Workarounds#

To simplify post-upgrade recovery, the following script should be executed as part of the upgrade validation steps:

Script Name: after_upgrade_restart.sh

Purpose: Automates the following workaround actions:

  • Restarts the nexus-api-gw pod to reflect API changes from v1 to v2

  • Deletes outdated RKE2-based cluster templates from the 3.0 release

  • Restarts the app-orch-tenant-controller pod to trigger deployment extension updates

Note

Run the script after the on-prem upgrade using:

./after_upgrade_restart.sh
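For orientation, the pod-restart portion of such a script might look like the sketch below. This is a hypothetical reconstruction, not the shipped script: the namespaces are assumptions, so verify them with `kubectl get pods -A` first, and the RKE2 template cleanup is left to the web UI or cluster API.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the pod-restart workarounds; NOT the shipped script.
# The namespaces below are assumptions -- verify with 'kubectl get pods -A'.

NS_API_GW=${NS_API_GW:-orch-infra}   # assumed namespace of nexus-api-gw
NS_TENANT=${NS_TENANT:-orch-app}     # assumed namespace of app-orch-tenant-controller

restart_pod() {  # usage: restart_pod <namespace> <name-substring>
  local pod
  pod=$(kubectl get pods -n "$1" -o name 2>/dev/null | grep "$2" | head -1)
  if [ -n "$pod" ]; then
    kubectl delete -n "$1" "$pod"
  else
    echo "No pod matching '$2' found in namespace '$1'" >&2
  fi
}

# Workaround: restart nexus-api-gw so the gateway reflects v1 -> v2 API changes.
restart_pod "$NS_API_GW" nexus-api-gw

# Workaround: restart app-orch-tenant-controller to refresh deployment extensions.
restart_pod "$NS_TENANT" app-orch-tenant-controller
```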

Additional known issues:

  • EdgeNode local SSH connection error

  • RPS pod Postgres DB query failure

  • Host filter in the UI is not functioning correctly

  • Docker rate limit encountered despite using valid credentials

Post-Upgrade Steps: Edge Node Onboarding Process#

After a successful upgrade, follow the edge node (EN) onboarding process as outlined in the official documentation: Set Up Edge Infrastructure – Intel Open Edge Platform.