On-Prem Upgrade Guide#
Upgrade Path: EMF On-Prem v3.1.3 → v2025.2.0
Document Version: 1.0
Overview#
This document provides step-by-step instructions to upgrade On-Prem Edge Manageability Framework (EMF) from version v3.1.3 to v2025.2.0.
Prerequisites#
System Requirements#
Current EMF On-Prem installation version 3.1.3 or later
Root/sudo privileges on orchestrator node
PostgreSQL service running and accessible
Sufficient disk space for backups (~200GB minimum)
Docker Hub credentials (if pull rate limit is reached)
Pre-Upgrade Checklist#
[ ] Back up critical application data from edge nodes
[ ] Document current edge node configurations
[ ] Ensure network connectivity between orchestrator and edge nodes
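The ~200 GB disk-space prerequisite can be verified with a short script before starting. This is an illustrative sketch, not part of the upgrade tooling; the mount point and threshold are assumptions based on the prerequisites above:

```shell
# Illustrative pre-flight check for the ~200 GB backup-space prerequisite.
check_disk_space() {
  local mount="${1:-/}" required_gb="${2:-200}"
  local avail_gb
  # df -BG reports whole gigabytes; strip everything but the digits
  avail_gb=$(df --output=avail -BG "$mount" | tail -1 | tr -dc '0-9')
  if [ "$avail_gb" -ge "$required_gb" ]; then
    echo "OK: ${avail_gb}G available on $mount"
  else
    echo "INSUFFICIENT: ${avail_gb}G available on $mount, ${required_gb}G required"
  fi
}

check_disk_space / 200
```

Run it on the orchestrator node before taking backups; adjust the mount point to wherever your backups will be written.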
Upgrade Procedure#
Step 1: Download the Latest On-Prem Upgrade Script#
Note
EMF is released on a weekly basis. To use a weekly build, refer to the latest available weekly tag. In the script below, replace v2025.2.0 with the appropriate weekly tag. Weekly tags follow the format: v2025.2.0-nYYYYMMDD.
Create the script file on the Edge Orchestrator node using the following command:
cat <<'EOF' > access_script.sh
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail

REGISTRY_URL='registry-rs.edgeorchestration.intel.com'
RS_PATH='edge-orch/common/files/on-prem'
ORAS_VERSION='1.1.0'
ORCH_VERSION='v2025.2.0'

# Install oras if not already installed
if ! command -v oras &> /dev/null; then
    echo "Oras not found. Installing..."
    # Download the specified version of oras
    curl -LO "https://github.com/oras-project/oras/releases/download/v${ORAS_VERSION}/oras_${ORAS_VERSION}_linux_amd64.tar.gz"
    # Create a temporary directory for oras installation
    mkdir -p oras-install/
    # Extract the downloaded tarball into the temporary directory
    tar -zxf oras_${ORAS_VERSION}_*.tar.gz -C oras-install/
    # Move the oras binary to a directory in the system PATH
    sudo mv oras-install/oras /usr/local/bin/
    # Clean up the downloaded files and temporary directory
    rm -rf oras_${ORAS_VERSION}_*.tar.gz oras-install/
else
    echo "Oras is already installed."
fi

# Pull the specified artifact from the registry
oras pull -v "${REGISTRY_URL}/${RS_PATH}:${ORCH_VERSION}"

# Make all shell scripts in the current directory executable
chmod +x *.sh
EOF
Make the script executable.
chmod +x access_script.sh
Run the script on the Edge Orchestrator node.
./access_script.sh
The script does the following:
Installs the oras tool
Downloads the scripts to install and uninstall Edge Orchestrator
Configure Upgrade Environment#
The upgrade uses an onprem.env file for configuration. This file contains all
environment variables used by the on-premise upgrade scripts and must be properly
configured before running the upgrade.
Important
The onprem.env file is located in the same directory as the upgrade scripts
(downloaded via access_script.sh). You must edit this file and set the required
values before proceeding with the upgrade.
If you re-run the upgrade script, ensure the onprem.env file is correctly configured.
Runtime arguments take precedence over the environment variables set in onprem.env.
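The precedence rule can be illustrated with bash's default-expansion operator. This is a sketch with made-up values, not the actual script logic:

```shell
# Hypothetical illustration of precedence: a runtime argument, when set,
# overrides the value sourced from onprem.env.
env_file_value="v2025.1.0"   # pretend this was read from onprem.env
runtime_value="v2025.2.0"    # pretend this was passed on the command line

DEPLOY_VERSION="${runtime_value:-$env_file_value}"
echo "$DEPLOY_VERSION"       # prints v2025.2.0

runtime_value=""             # no runtime argument given this time
DEPLOY_VERSION="${runtime_value:-$env_file_value}"
echo "$DEPLOY_VERSION"       # prints v2025.1.0
```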
Configuration Workflow#
Download the installer scripts using access_script.sh (see previous section)
Locate the onprem.env file in the downloaded directory
Edit onprem.env with your deployment-specific values
Run ./onprem_upgrade.sh to begin the upgrade
The onprem.env file contains several configuration sections described below.
Core Deployment Configuration#
| Variable | Description | Default Value |
|---|---|---|
| RELEASE_SERVICE_URL | Registry where packages and images are hosted | registry-rs.edgeorchestration.intel.com |
| DEPLOY_VERSION | Version of Edge Orchestrator to deploy | v2025.2.0 |
| ORCH_INSTALLER_PROFILE | Deployment profile for Edge Orchestrator | (see onprem.env) |
Authentication & Security#
| Variable | Description | Default Value |
|---|---|---|
| DOCKER_USERNAME | Docker Hub username for pulling images | (empty) |
| DOCKER_PASSWORD | Docker Hub password or access token | (empty) |
Network Configuration#
| Variable | Description | Default Value |
|---|---|---|
| CLUSTER_DOMAIN | Cluster domain name for internal services | cluster.onprem |
| ARGO_IP | MetalLB IP address for ArgoCD | (empty) |
| TRAEFIK_IP | MetalLB IP address for Traefik | (empty) |
| NGINX_IP | MetalLB IP address for NGINX | (empty) |
Container Registry Configuration#
| Variable | Description | Default Value |
|---|---|---|
| GITEA_IMAGE_REGISTRY | Image registry for Gitea container images | (see onprem.env) |
SRE and SMTP Configuration#
| Variable | Description | Default Value |
|---|---|---|
| SRE_USERNAME | Site Reliability Engineering username | (see onprem.env) |
| SRE_PASSWORD | SRE password | (see onprem.env) |
| SRE_DEST_URL | SRE exporter destination URL | (see onprem.env) |

| Variable | Description | Default Value |
|---|---|---|
| SMTP_ADDRESS | SMTP server address | (see onprem.env) |
| SMTP_PORT | SMTP server port | (see onprem.env) |
| SMTP_HEADER | Email sender information | (see onprem.env) |
| SMTP_USERNAME | SMTP authentication username | (see onprem.env) |
| SMTP_PASSWORD | SMTP authentication password | (see onprem.env) |
Advanced Configuration#
| Variable | Description | Default Value |
|---|---|---|
| KUBECONFIG | Kubernetes configuration file path | (see onprem.env) |
OXM Network Configuration#
| Variable | Description | Default Value |
|---|---|---|
|  | PXE server interface | (empty) |
|  | PXE server IP address | (empty) |
|  | PXE server subnet | (empty) |
Proxy Configuration#
| Variable | Description | Default Value |
|---|---|---|
| ENABLE_EXPLICIT_PROXY | Enable explicit proxy configuration | (see onprem.env) |
| ORCH_HTTP_PROXY | HTTP proxy for Orchestrator | (empty) |
| ORCH_HTTPS_PROXY | HTTPS proxy for Orchestrator | (empty) |
| ORCH_NO_PROXY | No proxy list for Orchestrator | (empty) |
| EN_HTTP_PROXY | HTTP proxy for Edge Nodes | (empty) |
| EN_HTTPS_PROXY | HTTPS proxy for Edge Nodes | (empty) |
| EN_FTP_PROXY | FTP proxy for Edge Nodes | (empty) |
| EN_SOCKS_PROXY | SOCKS proxy for Edge Nodes | (empty) |
| EN_NO_PROXY | No proxy list for Edge Nodes | (empty) |
Step 2: Open Two Terminals#
You will need two terminals for this upgrade process:
Terminal 1: To run the upgrade script
Terminal 2: To update proxy and load balancer configurations, if needed
Step 3: Terminal 1 - Set Environment Variables#
In Terminal 1, set all required environment variables. You can either:
Option A: Update onprem.env file directly
Edit the onprem.env file with all required values:
nano onprem.env
Update the following sections:
CORE DEPLOYMENT CONFIGURATION:
- RELEASE_SERVICE_URL
- DEPLOY_VERSION
- ORCH_INSTALLER_PROFILE

AUTHENTICATION & SECURITY:
- DOCKER_USERNAME
- DOCKER_PASSWORD

NETWORK CONFIGURATION:
- CLUSTER_DOMAIN
- ARGO_IP, TRAEFIK_IP, NGINX_IP

CONTAINER REGISTRY:
- GITEA_IMAGE_REGISTRY

PROXY CONFIGURATION (if applicable):
- ENABLE_EXPLICIT_PROXY
- ORCH_HTTP_PROXY, ORCH_HTTPS_PROXY, ORCH_NO_PROXY
- EN_HTTP_PROXY, EN_HTTPS_PROXY, EN_FTP_PROXY, EN_SOCKS_PROXY, EN_NO_PROXY
- GIT_PROXY

SRE AND SMTP CONFIGURATION:
- All SRE_* variables (SRE_USERNAME, SRE_PASSWORD, SRE_DEST_URL)
- All SMTP_* variables (SMTP_ADDRESS, SMTP_PORT, SMTP_HEADER, SMTP_USERNAME, SMTP_PASSWORD)
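For orientation, a minimal onprem.env fragment might look like the following. All values here are illustrative placeholders, not shipped defaults; substitute your own deployment values:

```shell
# --- illustrative placeholders only; edit to match your deployment ---
RELEASE_SERVICE_URL=registry-rs.edgeorchestration.intel.com
DEPLOY_VERSION=v2025.2.0
ORCH_INSTALLER_PROFILE=<your-profile>
DOCKER_USERNAME=<your-docker-username>
DOCKER_PASSWORD=<your-docker-token>
CLUSTER_DOMAIN=cluster.onprem
ARGO_IP=<argocd-loadbalancer-ip>
TRAEFIK_IP=<traefik-loadbalancer-ip>
NGINX_IP=<nginx-loadbalancer-ip>
```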
Important: Ensure ALL variables in onprem.env are correctly set according to your environment. Some default values are provided, but you must update them to match your deployment:
# Get Load Balancer IPs
kubectl get svc argocd-server -n argocd
kubectl get svc traefik -n orch-gateway
kubectl get svc ingress-nginx-controller -n orch-boots
# Set deployment version (replace with your actual upgrade version tag)
export DEPLOY_VERSION=v2025.2.0
# Set non-interactive mode to true to skip prompts
export PROCEED=true
Step 4: Terminal 1 - Run On-Prem Upgrade Script#
In Terminal 1, execute the upgrade script:
./onprem_upgrade.sh
The script will:
Validate current installation
Check PostgreSQL status
Download packages and artifacts
Prompt for confirmation:
Ready to proceed with installation? (yes/no)
DO NOT enter “yes” yet - proceed to Step 5 first
Step 5: Terminal 2 - Update Configuration#
Before confirming in Terminal 1, open Terminal 2 and update configurations:
Verify proxy configuration (if applicable):
# File: repo_archives/tmp/edge-manageability-framework/orch-configs/clusters/onprem.yaml
argo:
  proxy:
    httpProxy: ""
    httpsProxy: ""
    noProxy: ""
    enHttpProxy: ""
    enHttpsProxy: ""
    enFtpProxy: ""
    enSocksProxy: ""
    enNoProxy: ""
Note
Update the proxy settings according to your network configuration, if needed.
Verify load balancer IP configuration:
# Check current LoadBalancer IPs
kubectl get svc argocd-server -n argocd
kubectl get svc traefik -n orch-gateway
kubectl get svc ingress-nginx-controller -n orch-boots

# Verify LB IP configurations are updated
nano repo_archives/tmp/edge-manageability-framework/orch-configs/clusters/onprem.yaml
Ensure all configurations are correct.
Step 6: Terminal 1 - Confirm and Continue#
If interactive mode is enabled, the script waits for user confirmation. Once the proxy and load balancer configurations are updated in Terminal 2, switch back to Terminal 1 and enter:
yes
The upgrade will then proceed automatically through all components.
Step 7: Monitor Upgrade Progress#
The upgrade process includes:
RKE2 upgrade to version 1.34.1
OS configuration upgrade
Gitea upgrade
ArgoCD upgrade
Edge Orchestrator upgrade
PostgreSQL migration
Vault unseal
Post-Upgrade Verification#
Check the console output from the script. The last line should read:
Upgrade completed! Wait for ArgoCD applications to be in 'Synced' and 'Healthy' state
System Health Check#
# Verify package versions
dpkg -l | grep onprem-
# Check cluster status
kubectl get nodes
kubectl get pods -A
# Verify ArgoCD applications
kubectl get applications -A
Service Validation#
Watch ArgoCD applications until they are in Synced and Healthy state.
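If you prefer a scripted check over watching the list by hand, a small helper can count applications that are not yet Synced and Healthy. This is an illustrative sketch: the column positions assume the default `kubectl get applications -A` output, and the sample text below is made up:

```shell
# Count Applications not yet Synced/Healthy (fields 3 and 4 are assumed to be
# the SYNC STATUS and HEALTH STATUS columns of the default kubectl output).
count_unsynced() {
  awk 'NR>1 && !($3=="Synced" && $4=="Healthy") {n++} END {print n+0}'
}

# Example with made-up output; on a live cluster, pipe kubectl into the helper:
#   kubectl get applications -A | count_unsynced
sample='NAMESPACE  NAME      SYNC STATUS  HEALTH STATUS
argocd     root-app  Synced       Healthy
argocd     dkam      OutOfSync    Progressing'
printf '%s\n' "$sample" | count_unsynced   # prints 1
```

When the helper prints 0, all applications have converged.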
Web UI Access Verification#
After a successful EMF upgrade, verify that you can access the web UI with the same project/user/credentials used before the upgrade.
ArgoCD#
Username: admin
Retrieve argocd password:
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d
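The secret stores the password base64-encoded, which is why the command above pipes through `base64 -d`. An illustrative round-trip with a made-up value (not a real credential) shows what that step does:

```shell
# Made-up value for illustration; the real secret comes from kubectl above.
encoded=$(printf '%s' 'example-password' | base64)
echo "$encoded"                       # prints ZXhhbXBsZS1wYXNzd29yZA==
printf '%s' "$encoded" | base64 -d    # prints example-password
```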
Gitea#
Retrieve Gitea username:
kubectl get secret gitea-cred -n gitea -o jsonpath="{.data.username}" | base64 -d
Reset Gitea password
# Get Gitea pod name
GITEA_POD=$(kubectl get pods -n gitea -l app=gitea -o jsonpath='{.items[0].metadata.name}')
# Reset password (replace 'test12345' with your desired password)
kubectl exec -n gitea $GITEA_POD -- \
  bash -c 'export GITEAPASSWORD=test12345 && gitea admin user change-password --username gitea_admin --password $GITEAPASSWORD'
Login to Gitea web UI:
kubectl -n gitea port-forward svc/gitea-http 3000:443 --address 0.0.0.0
# Then open https://localhost:3000 in your browser and use the above credentials.
Troubleshooting#
Issue#1: Root-App Sync and Certificate Refresh After Upgrade#
Symptoms:
Some applications show as OutOfSync, Degraded, or Missing
external-secrets and copy-ca* specific pods remain in OutOfSync, Missing, or Processing state
Resolution:
After running onprem_upgrade.sh, wait 5–10 minutes for root-app and dependent applications to sync.
Run the resync script:
./after_upgrade_restart.sh
This script:
Continuously syncs applications
Performs root-app sync
Restarts tls-boots and dkam pods
If applications still fail to sync:
Log in to ArgoCD UI
Delete error-state CRDs/jobs
Re-sync root-app and rerun the ./after_upgrade_restart.sh script
Note
If external-secrets and copy-ca* specific pods remain in problematic state for an extended period, first delete the associated Jobs and CRDs. If the issue persists, delete the affected applications from the ArgoCD UI and then resync the root-app.
After running ./after_upgrade_restart.sh successfully, and once all root-apps are in a Synced and Healthy state, wait approximately 5 minutes to allow DKAM to fetch all dependent applications.
Verify that the signed_ipxe.efi image is downloaded using the freshly downloaded Full_server.crt, or monitor until signed_ipxe.efi is available.
Download the latest certificates:
# Delete both files before downloading
rm -rf Full_server.crt signed_ipxe.efi
export CLUSTER_DOMAIN=cluster.onprem
wget https://tinkerbell-nginx.$CLUSTER_DOMAIN/tink-stack/keys/Full_server.crt --no-check-certificate --no-proxy -q -O Full_server.crt
wget --ca-certificate=Full_server.crt https://tinkerbell-nginx.$CLUSTER_DOMAIN/tink-stack/signed_ipxe.efi -q -O signed_ipxe.efi
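A quick sanity check after the downloads can confirm both files arrived. This is an illustrative sketch: it only verifies that the files exist and are non-empty, not that the certificate is valid:

```shell
# Verify both downloaded files exist and are non-empty.
for f in Full_server.crt signed_ipxe.efi; do
  if [ -s "$f" ]; then
    echo "$f: OK"
  else
    echo "$f: missing or empty"
  fi
done
```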
Once the above steps are successful, the orchestrator (Orch) is ready for onboarding new Edge Nodes (EN).
Issue#2: Handling Gitea Pod Crashes During Upgrade#
Symptoms:
Sometimes onprem_upgrade.sh may fail with the following error:
Error: UPGRADE FAILED: context deadline exceeded
dpkg: error processing package onprem-gitea-installer
E: Sub-process /usr/bin/dpkg returned an error code (1)
Resolution:
Check Gitea pod status:
kubectl get pod -n gitea
Restart dependent pods in order:
kubectl delete pod gitea-postgresql-0 -n gitea
kubectl delete pod gitea-<pod-id> -n gitea
Note
Replace <pod-id> with the actual Gitea pod ID from the output of the previous command.
After the Gitea pod restarts successfully, re-run the upgrade script:
./onprem_upgrade.sh
Issue#3: Unsupported Workflow for Pre-Upgrade Onboarded Edge Nodes#
Issue:
If an Edge Node (EN) was onboarded before the EMF upgrade but the cluster installation was not completed, running the cluster installation after the upgrade using the latest cluster template will not work. This fails because the EN still uses old OS profiles and pre-upgrade settings.
Resolution:
To continue successfully after the upgrade, choose one of the following options:
Option 1: De-authorize and Re-Onboard the EN
De-authorize the existing EN from the orchestrator
Re-onboard the EN to ensure it gets the correct post-upgrade templates and configurations
Option 2: Update the OS Profile Using Day-2 Upgrade Process
Update the EN to the latest available OS profile using the day-2 upgrade process
After the OS profile upgrade is complete, proceed with cluster installation
Issue#4: Kyverno Pod in ImagePullBackOff State#
Issue:
After the upgrade, the Kyverno pod may be stuck in an ImagePullBackOff state due to image pull errors.
Resolution:
To resolve this issue, run the following commands to clean up and reset the Kyverno clean-reports job:
# Delete the job and its pods in the background (they may hang on finalizers)
kubectl delete job kyverno-clean-reports -n kyverno &
kubectl delete pods -l job-name="kyverno-clean-reports" -n kyverno &
# Remove the finalizers so the background deletions can complete
kubectl patch job kyverno-clean-reports -n kyverno --type=merge -p='{"metadata":{"finalizers":[]}}'