How to use NPU for inference#

Docker deployment#

Follow steps 1 and 2 of the Get Started guide if you have not already done so.

Volume mount NPU config#

In the `volumes` section of the `compose.yml` file, comment out the CPU and GPU volume mounts and uncomment the NPU volume mount, as shown below:

    volumes:
      # - "./src/dlstreamer-pipeline-server/configs/filter-pipeline/config.cpu.json:/home/pipeline-server/config.json"
      # - "./src/dlstreamer-pipeline-server/configs/filter-pipeline/config.gpu.json:/home/pipeline-server/config.json"
      - "./src/dlstreamer-pipeline-server/configs/filter-pipeline/config.npu.json:/home/pipeline-server/config.json"
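If you switch between CPU, GPU, and NPU often, this comment/uncomment edit can be scripted. Below is a minimal sketch of such a helper; the function `select_device_config` is a hypothetical convenience, not part of the product, and it only assumes the volume-mount lines look like the ones shown above.

```python
import re

def select_device_config(compose_text: str, device: str) -> str:
    """Leave only the chosen device's config.<device>.json mount uncommented.

    `device` is one of "cpu", "gpu", or "npu". This is a hypothetical
    helper operating on the volume-mount lines shown above.
    """
    lines = []
    for line in compose_text.splitlines():
        m = re.search(r'config\.(cpu|gpu|npu)\.json', line)
        if m:
            # Strip any existing comment marker, preserving indentation.
            line = re.sub(r'^(\s*)#\s*', r'\1', line)
            if m.group(1) != device:
                # Re-comment every mount that is not the chosen device.
                indent = re.match(r'\s*', line).group(0)
                line = f"{indent}# {line.lstrip()}"
        lines.append(line)
    return "\n".join(lines)
```

Running it over the `volumes` section with `device="npu"` reproduces the edit shown above.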

Start and run the application#

After making the above changes to the Docker Compose file, continue from step 3 through the end of the section in the Get Started guide.

Helm deployment#

Follow step 1 in this document if you have not already done so.

Update values.yaml#

In the `values.yaml` file, change the value of the `pipeline` key under the `dlstreamerpipelineserver` section as shown below:

    dlstreamerpipelineserver:
      # key: dlstreamerpipelineserver.repository
      repository:
        # key: dlstreamerpipelineserver.repository.image
        image: docker.io/intel/dlstreamer-pipeline-server
        # key: dlstreamerpipelineserver.repository.tag
        tag: 2025.2.0-ubuntu24
      # key: dlstreamerpipelineserver.replicas
      replicas: 1
      # key: dlstreamerpipelineserver.nodeSelector
      nodeSelector: {}
      # key: dlstreamerpipelineserver.pipeline
      pipeline: config.npu.json       # Changed value from config.cpu.json to config.npu.json
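This one-line change can also be applied programmatically. The sketch below uses a plain-text substitution so it needs no YAML library; the function `set_pipeline_config` is a hypothetical helper and assumes the `pipeline` key appears once in the file, as in the snippet above.

```python
import re

def set_pipeline_config(values_text: str, config_name: str) -> str:
    """Replace the value of the `pipeline` key in a values.yaml text.

    e.g. config.cpu.json -> config.npu.json. A simple text substitution
    (hypothetical helper); for complex charts prefer a YAML-aware tool.
    """
    return re.sub(r'(^\s*pipeline:\s*)\S+',
                  r'\g<1>' + config_name,
                  values_text, flags=re.MULTILINE)
```

Trailing comments on the `pipeline` line are left untouched, since only the first non-whitespace token after the key is replaced.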

Start the application#

After making the above changes to the `values.yaml` file, continue from step 2 in the Helm Deployment Guide.