On-Prem Upgrade Guide#

Upgrade Path: EMF On-Prem v3.1.3 → v2025.2.0

Document Version: 1.0

Overview#

This document provides step-by-step instructions to upgrade On-Prem Edge Manageability Framework (EMF) from version v3.1.3 to v2025.2.0.

Prerequisites#

System Requirements#

  • Current EMF On-Prem installation version 3.1.3 or later

  • Root/sudo privileges on orchestrator node

  • PostgreSQL service running and accessible

  • Sufficient disk space for backups (~200GB minimum)

  • Docker Hub credentials (if pull rate limit is reached)

Pre-Upgrade Checklist#

  • [ ] Back up critical application data from edge nodes

  • [ ] Document current edge node configurations

  • [ ] Ensure network connectivity between orchestrator and edge nodes

Upgrade Procedure#

Step 1: Download the Latest On-Prem Upgrade Script#

Note

EMF is released on a weekly basis. To use a weekly build, refer to the latest weekly tag available here. In the script below, replace v2025.2.0 with the appropriate weekly tag. Weekly tags follow the format: v2025.2.0-nYYYYMMDD.
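
For example, pinning a hypothetical weekly build changes only the version assignment in the script that follows (the date portion is a placeholder, not a real tag):

ORCH_VERSION='v2025.2.0-nYYYYMMDD'   # substitute an actual weekly build date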

  1. Create the script file on the Edge Orchestrator node using the following command:

cat <<'EOF' > access_script.sh
#!/usr/bin/env bash

set -o errexit
set -o nounset
set -o pipefail

REGISTRY_URL='registry-rs.edgeorchestration.intel.com'
RS_PATH='edge-orch/common/files/on-prem'
ORAS_VERSION='1.1.0'
ORCH_VERSION='v2025.2.0'

# Install oras if not already installed
if ! command -v oras &> /dev/null; then
   echo "Oras not found. Installing..."
   # Download the specified version of oras
   curl -LO "https://github.com/oras-project/oras/releases/download/v${ORAS_VERSION}/oras_${ORAS_VERSION}_linux_amd64.tar.gz"
   # Create a temporary directory for oras installation
   mkdir -p oras-install/
   # Extract the downloaded tarball into the temporary directory
   tar -zxf oras_${ORAS_VERSION}_*.tar.gz -C oras-install/
   # Move the oras binary to a directory in the system PATH
   sudo mv oras-install/oras /usr/local/bin/
   # Clean up the downloaded files and temporary directory
   rm -rf oras_${ORAS_VERSION}_*.tar.gz oras-install/
else
   echo "Oras is already installed."
fi

# Pull the specified artifact from the registry
oras pull -v "${REGISTRY_URL}/${RS_PATH}:${ORCH_VERSION}"

# Make all shell scripts in the current directory executable
chmod +x *.sh
EOF
  2. Make the script executable.

    chmod +x access_script.sh
    
  3. Run the script on the Edge Orchestrator node.

    ./access_script.sh
    

    The script does the following:

    • Installs the oras tool

    • Downloads the scripts to install and uninstall Edge Orchestrator

Configure Upgrade Environment#

The upgrade uses an onprem.env file for configuration. This file contains all environment variables used by the on-premise upgrade scripts and must be properly configured before running the upgrade.

Important

The onprem.env file is located in the same directory as the upgrade scripts (downloaded via access_script.sh). You must edit this file and set the required values before proceeding with the upgrade. If you re-run the upgrade script, ensure the onprem.env file is still correctly configured. Runtime arguments take precedence over the environment variables set in onprem.env.

Configuration Workflow#

  1. Download the installer scripts using access_script.sh (see previous section)

  2. Locate the onprem.env file in the downloaded directory

  3. Edit onprem.env with your deployment-specific values

  4. Run ./onprem_upgrade.sh to begin upgrade

The onprem.env file contains several configuration sections described below.

Core Deployment Configuration#

Core Environment Variables (Required)#

Variable                Description                                     Default Value
----------------------  ----------------------------------------------  ----------------------------------------
RELEASE_SERVICE_URL     Registry where packages and images are hosted   registry-rs.edgeorchestration.intel.com
DEPLOY_VERSION          Version of Edge Orchestrator to deploy          v2025.2.0
ORCH_INSTALLER_PROFILE  Deployment profile for Edge Orchestrator        onprem

Authentication & Security#

Docker Hub Credentials (Required)#

Variable         Description                              Default Value
---------------  ---------------------------------------  -------------
DOCKER_USERNAME  Docker Hub username for pulling images   (empty)
DOCKER_PASSWORD  Docker Hub password or access token      (empty)

Network Configuration#

Network Variables (Required)#

Variable        Description                                 Default Value
--------------  ------------------------------------------  --------------
CLUSTER_DOMAIN  Cluster domain name for internal services   cluster.onprem
ARGO_IP         MetalLB IP address for ArgoCD               (empty)
TRAEFIK_IP      MetalLB IP address for Traefik              (empty)
NGINX_IP        MetalLB IP address for NGINX                (empty)

Container Registry Configuration#

Registry Variables#

Variable              Description                                 Default Value
--------------------  ------------------------------------------  -------------
GITEA_IMAGE_REGISTRY  Image registry for Gitea container images   docker.io

SRE and SMTP Configuration#

SRE Configuration#

Variable      Description                             Default Value
------------  --------------------------------------  -----------------------------------------------------------------
SRE_USERNAME  Site Reliability Engineering username   sre
SRE_PASSWORD  SRE password                            123
SRE_DEST_URL  SRE exporter destination URL            http://sre-exporter-destination.cluster.onprem:8428/api/v1/write

SMTP Configuration for Email Notifications#

Variable       Description                    Default Value
-------------  -----------------------------  ----------------------
SMTP_ADDRESS   SMTP server address            smtp.serveraddress.com
SMTP_PORT      SMTP server port               587
SMTP_HEADER    Email sender information       foo bar <foo@bar.com>
SMTP_USERNAME  SMTP authentication username   uSeR
SMTP_PASSWORD  SMTP authentication password   T@123sfD

Advanced Configuration#

Advanced Variables#

Variable    Description                          Default Value
----------  -----------------------------------  ------------------------
KUBECONFIG  Kubernetes configuration file path   /home/$USER/.kube/config

OXM Network Configuration#

OXM PXE Server Variables#

Variable               Description             Default Value
---------------------  ----------------------  -------------
OXM_PXE_SERVER_INT     PXE server interface    (empty)
OXM_PXE_SERVER_IP      PXE server IP address   (empty)
OXM_PXE_SERVER_SUBNET  PXE server subnet       (empty)

Proxy Configuration#

Proxy Variables#

Variable               Description                           Default Value
---------------------  ------------------------------------  -------------
ENABLE_EXPLICIT_PROXY  Enable explicit proxy configuration   false
ORCH_HTTP_PROXY        HTTP proxy for Orchestrator           (empty)
ORCH_HTTPS_PROXY       HTTPS proxy for Orchestrator          (empty)
ORCH_NO_PROXY          No-proxy list for Orchestrator        (empty)
EN_HTTP_PROXY          HTTP proxy for Edge Nodes             (empty)
EN_HTTPS_PROXY         HTTPS proxy for Edge Nodes            (empty)
EN_FTP_PROXY           FTP proxy for Edge Nodes              (empty)
EN_SOCKS_PROXY         SOCKS proxy for Edge Nodes            (empty)
EN_NO_PROXY            No-proxy list for Edge Nodes          (empty)
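
The variables above map one-to-one onto shell-style assignments in onprem.env. The following is a minimal illustrative excerpt, not the literal file shipped with the scripts; the Docker credentials and MetalLB addresses are placeholders you must replace with your own values:

# onprem.env - illustrative excerpt (placeholders; adjust to your deployment)
RELEASE_SERVICE_URL=registry-rs.edgeorchestration.intel.com
DEPLOY_VERSION=v2025.2.0
ORCH_INSTALLER_PROFILE=onprem

DOCKER_USERNAME=your-docker-username    # placeholder
DOCKER_PASSWORD=your-docker-token       # placeholder

CLUSTER_DOMAIN=cluster.onprem
ARGO_IP=192.0.2.10       # placeholder MetalLB IP for ArgoCD
TRAEFIK_IP=192.0.2.11    # placeholder MetalLB IP for Traefik
NGINX_IP=192.0.2.12      # placeholder MetalLB IP for NGINX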

Step 2: Open Two Terminals#

You will need two terminals for this upgrade process:

  • Terminal 1: To run the upgrade script

  • Terminal 2: To update proxy and load balancer configurations, if needed

Step 3: Terminal 1 - Set Environment Variables#

In Terminal 1, set all required environment variables.

Option A: Update the onprem.env file directly

Edit the onprem.env file with all required values:

nano onprem.env

Update the following sections:

  • CORE DEPLOYMENT CONFIGURATION: RELEASE_SERVICE_URL, DEPLOY_VERSION, ORCH_INSTALLER_PROFILE

  • AUTHENTICATION & SECURITY: DOCKER_USERNAME, DOCKER_PASSWORD

  • NETWORK CONFIGURATION: CLUSTER_DOMAIN, ARGO_IP, TRAEFIK_IP, NGINX_IP

  • CONTAINER REGISTRY: GITEA_IMAGE_REGISTRY

  • PROXY CONFIGURATION (if applicable): ENABLE_EXPLICIT_PROXY, ORCH_HTTP_PROXY, ORCH_HTTPS_PROXY, ORCH_NO_PROXY, EN_HTTP_PROXY, EN_HTTPS_PROXY, EN_FTP_PROXY, EN_SOCKS_PROXY, EN_NO_PROXY, GIT_PROXY

  • SRE and SMTP Configuration: all SRE_* variables (SRE_USERNAME, SRE_PASSWORD, SRE_DEST_URL) and all SMTP_* variables (SMTP_ADDRESS, SMTP_PORT, SMTP_HEADER, SMTP_USERNAME, SMTP_PASSWORD)

Important: Ensure ALL variables in onprem.env are correctly set according to your environment. Some default values are provided, but you must update them to match your deployment:

# Get Load Balancer IPs
kubectl get svc argocd-server -n argocd
kubectl get svc traefik -n orch-gateway
kubectl get svc ingress-nginx-controller -n orch-boots

# Set deployment version (replace with your actual upgrade version tag)
export DEPLOY_VERSION=v2025.2.0

# Set non-interactive mode to true to skip prompts
export PROCEED=true
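
If you prefer to capture the LoadBalancer IPs programmatically rather than copying them by hand, a jsonpath query against each service works. This is a sketch using the service names shown above and the variable names from onprem.env; verify the service names in your cluster first:

# Sketch: read the external IPs assigned by MetalLB into the onprem.env variable names
export ARGO_IP=$(kubectl get svc argocd-server -n argocd -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export TRAEFIK_IP=$(kubectl get svc traefik -n orch-gateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export NGINX_IP=$(kubectl get svc ingress-nginx-controller -n orch-boots -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "ARGO_IP=$ARGO_IP TRAEFIK_IP=$TRAEFIK_IP NGINX_IP=$NGINX_IP"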

Step 4: Terminal 1 - Run On-Prem Upgrade Script#

In Terminal 1, execute the upgrade script:

./onprem_upgrade.sh

The script will:

  • Validate current installation

  • Check PostgreSQL status

  • Download packages and artifacts

  • Prompt for confirmation:

Ready to proceed with installation? (yes/no)

DO NOT enter “yes” yet - proceed to Step 5 first

Step 5: Terminal 2 - Update Configuration#

Before confirming in Terminal 1, open Terminal 2 and update configurations:

  1. Verify proxy configuration (if applicable):

    # File: repo_archives/tmp/edge-manageability-framework/orch-configs/clusters/onprem.yaml
    
    argo:
      proxy:
        httpProxy: ""
        httpsProxy: ""
        noProxy: ""
        enHttpProxy: ""
        enHttpsProxy: ""
        enFtpProxy: ""
        enSocksProxy: ""
        enNoProxy: ""
    

    Note

    Update the proxy settings according to your network configuration, if needed.

  2. Verify load balancer IP configuration:

    # Check current LoadBalancer IPs
    kubectl get svc argocd-server -n argocd
    kubectl get svc traefik -n orch-gateway
    kubectl get svc ingress-nginx-controller -n orch-boots
    
    # Verify LB IP configurations are updated
    nano repo_archives/tmp/edge-manageability-framework/orch-configs/clusters/onprem.yaml
    
  3. Ensure all configurations are correct.

Step 6: Terminal 1 - Confirm and Continue#

If interactive mode is enabled, the script waits for user confirmation. Once the proxy and load balancer configurations are updated in Terminal 2, switch back to Terminal 1 and enter:

yes

The upgrade will then proceed automatically through all components.

Step 7: Monitor Upgrade Progress#

The upgrade process includes:

  • RKE2 upgrade to version 1.34.1

  • OS Configuration upgrade

  • Gitea upgrade

  • ArgoCD upgrade

  • Edge Orchestrator upgrade

  • PostgreSQL migration

  • Vault unseal

Post-Upgrade Verification#

Check the console output from the script. The last line should read:

Upgrade completed! Wait for ArgoCD applications to be in 'Synced' and 'Healthy' state

System Health Check#

# Verify package versions
dpkg -l | grep onprem-

# Check cluster status
kubectl get nodes
kubectl get pods -A

# Verify ArgoCD applications
kubectl get applications -A

Service Validation#

  • Watch the ArgoCD applications until they are all in the Synced and Healthy state; a command-line check is sketched below.
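
A minimal sketch, assuming the default printer columns of the Argo CD Application CRD (SYNC STATUS and HEALTH STATUS):

# List any applications that are not yet Synced and Healthy (no output means all are ready)
kubectl get applications -A --no-headers | grep -vE 'Synced[[:space:]]+Healthy' \
  || echo "All applications are Synced and Healthy"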

Web UI Access Verification#

After a successful EMF upgrade, verify that you can access the web UI with the same project, user, and credentials used before the upgrade.

ArgoCD#

  • Username: admin

  • Retrieve the ArgoCD admin password:

    kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d
    

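The ArgoCD UI is typically served at the MetalLB address reserved for it (ARGO_IP in onprem.env). If that address is not reachable from your workstation, a port-forward is a reasonable fallback; a sketch assuming the standard argocd-server service name:

# Fallback access to the ArgoCD UI via port-forward
kubectl -n argocd port-forward svc/argocd-server 8080:443
# Then open https://localhost:8080 and log in as 'admin' with the password retrieved above
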
Gitea#

  • Retrieve Gitea username:

    kubectl get secret gitea-cred -n gitea -o jsonpath="{.data.username}" | base64 -d
    
  • Reset the Gitea password:

    # Get Gitea pod name
    GITEA_POD=$(kubectl get pods -n gitea -l app=gitea -o jsonpath='{.items[0].metadata.name}')
    
    # Reset password (replace 'test12345' with your desired password)
    kubectl exec -n gitea $GITEA_POD -- \
      bash -c 'export GITEAPASSWORD=test12345 && gitea admin user change-password --username gitea_admin --password $GITEAPASSWORD'
    
  • Log in to the Gitea web UI:

    kubectl -n gitea port-forward svc/gitea-http 3000:443 --address 0.0.0.0
    # Then open https://localhost:3000 in your browser and use the above credentials.
    

Troubleshooting#

Issue#1: Root-App Sync and Certificate Refresh After Upgrade#

Symptoms:

  • Some applications show as OutOfSync, Degraded, or Missing

  • external-secrets and copy-ca* pods remain in an OutOfSync, Missing, or Processing state

Resolution:

  1. After running onprem_upgrade.sh, wait 5–10 minutes for root-app and dependent applications to sync.

  2. Run the resync script:

    ./after_upgrade_restart.sh
    

    This script:

    • Continuously syncs applications

    • Performs root-app sync

    • Restarts tls-boots and dkam pods

  3. If applications still fail to sync:

    • Log in to ArgoCD UI

    • Delete error-state CRDs/jobs

    • Re-sync root-app and rerun the ./after_upgrade_restart.sh script

    Note

    If external-secrets and copy-ca* specific pods remain in problematic state for an extended period, first delete the associated Jobs and CRDs. If the issue persists, delete the affected applications from the ArgoCD UI and then resync the root-app.

  4. After running ./after_upgrade_restart.sh successfully, and once all root-apps are in a Synced and Healthy state, wait approximately 5 minutes to allow DKAM to fetch all dependent applications. Verify that the signed_ipxe.efi image is downloaded using the freshly downloaded Full_server.crt, or monitor until signed_ipxe.efi is available.

  5. Download the latest certificates:

    # Delete both files before downloading
    rm -rf Full_server.crt signed_ipxe.efi
    export CLUSTER_DOMAIN=cluster.onprem
    wget https://tinkerbell-nginx.$CLUSTER_DOMAIN/tink-stack/keys/Full_server.crt --no-check-certificate --no-proxy -q -O Full_server.crt
    wget --ca-certificate=Full_server.crt https://tinkerbell-nginx.$CLUSTER_DOMAIN/tink-stack/signed_ipxe.efi -q -O signed_ipxe.efi
    

    Once the above steps are successful, the orchestrator (Orch) is ready for onboarding new Edge Nodes (EN).

Issue#2: Handling Gitea Pod Crashes During Upgrade#

Symptoms:

The onprem_upgrade.sh script may occasionally fail with the following error:

Error: UPGRADE FAILED: context deadline exceeded
dpkg: error processing package onprem-gitea-installer
E: Sub-process /usr/bin/dpkg returned an error code (1)

Resolution:

  1. Check Gitea pod status:

    kubectl get pod -n gitea
    
  2. Restart dependent pods in order:

    kubectl delete pod gitea-postgresql-0 -n gitea
    kubectl delete pod gitea-<pod-id> -n gitea
    

    Note

    Replace <pod-id> with the actual Gitea pod ID from the output of the previous command.

  3. After the Gitea pod restarts successfully, re-run the upgrade script:

    ./onprem_upgrade.sh
    

Issue#3: Unsupported Workflow for Pre-Upgrade Onboarded Edge Nodes#

Issue:

If an Edge Node (EN) was onboarded before the EMF upgrade but its cluster installation was not completed, running the cluster installation after the upgrade with the latest cluster template will not work: the EN still uses the old OS profiles and pre-upgrade settings.

Resolution:

To continue successfully after the upgrade, choose one of the following options:

Option 1: De-authorize and Re-Onboard the EN

  1. De-authorize the existing EN from the orchestrator

  2. Re-onboard the EN to ensure it gets the correct post-upgrade templates and configurations

Option 2: Update the OS Profile Using Day-2 Upgrade Process

  1. Update the EN to the latest available OS profile using the day-2 upgrade process

  2. After the OS profile upgrade is complete, proceed with cluster installation

Issue#4: Kyverno Pod in ImagePullBackOff State#

Issue:

After the upgrade, the Kyverno pod may be stuck in an ImagePullBackOff state due to image pull errors.

Resolution:

To resolve this issue, run the following commands to clean up and reset the Kyverno clean-reports job:

# Delete the stuck job and its pods in the background; the deletes block on finalizers
kubectl delete job kyverno-clean-reports -n kyverno &
kubectl delete pods -l job-name="kyverno-clean-reports" -n kyverno &
# Clear the job's finalizers so the pending deletions can complete
kubectl patch job kyverno-clean-reports -n kyverno --type=merge -p='{"metadata":{"finalizers":[]}}'
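
After the cleanup, it is worth confirming that the job is gone and that the Kyverno pods return to a Running state; for example:

# The clean-reports job should no longer be listed, and pods should recover
kubectl get jobs -n kyverno
kubectl get pods -n kyverno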