Coding Agent (Preview)
The DL Streamer Coding Agent is an AI-powered assistant that helps you build DL Streamer video analytics applications quickly from natural language descriptions. It generates complete, ready-to-run projects — including model download scripts, application code, and documentation.
Note: This feature is in Preview stage. Please share your feedback to help us improve it.
Prerequisites
Development Machine
You need a development machine with one of the following coding environments installed:
Visual Studio Code with the GitHub Copilot Chat extension
Note: The Coding Agent communicates with AI coding models in the cloud, which typically requires a paid service or subscription. For best results, select a reasoning AI model.
Target Machine
You need a target machine to run and debug the generated code. The target machine can be the same as the development machine or a remote system. Refer to the System Requirements page for the list of supported hardware platforms.
Important: In the Preview release, only Linux target machines are supported.
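As a minimal sketch of a preflight check for the Linux-only restriction, the following snippet could be run on the target machine before starting. The function name `is_supported_target` is hypothetical and is not part of DL Streamer or the coding agent.

```python
import platform


def is_supported_target(system_name: str) -> bool:
    """Return True when the OS name matches the Preview requirement (Linux only).

    Hypothetical helper for illustration; not part of DL Streamer.
    """
    return system_name.strip().lower() == "linux"


if __name__ == "__main__":
    # platform.system() returns names such as "Linux", "Windows", or "Darwin".
    name = platform.system()
    status = "supported" if is_supported_target(name) else "not supported in Preview"
    print(f"Target OS {name!r}: {status}")
```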
Procedure
1. Install the DL Streamer coding agent skill
# Install npm to get the `npx skills` CLI
sudo apt-get update && sudo apt-get install npm -y
# Install the skill globally across agents. When prompted, choose the recommended Symlink option to keep one copy of the skills at ~/.agents/skills and symbolically link to it from the other agents
npx skills add open-edge-platform/dlstreamer --skill dlstreamer-coding-agent -g -a claude-code -a cursor -a github-copilot
# To update to the latest skills from the `main` branch
npx skills update -g dlstreamer-coding-agent
Note: In general, the `npx skills add` command template for installing a skill is: `npx skills add open-edge-platform/dlstreamer@<branch_ref|commit-hash> -s <skill-name> -a <agent1> -a <agent2>`. For more details on the `npx skills` CLI, refer to https://github.com/vercel-labs/skills
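If the install command needs to be scripted, for example to pin a branch or commit across several machines, the template above could be assembled programmatically. The sketch below only builds the argument list and does not run it. The helper name `build_skills_add_command` is hypothetical, and the flags are taken from the commands shown in this section rather than from the `npx skills` reference.

```python
import shlex

REPO = "open-edge-platform/dlstreamer"


def build_skills_add_command(skill, agents, ref=None, global_install=False):
    """Assemble an `npx skills add` argument list from the template in this section.

    Hypothetical helper: `ref` is a branch name or commit hash appended as `@<ref>`.
    """
    source = f"{REPO}@{ref}" if ref else REPO
    cmd = ["npx", "skills", "add", source, "-s", skill]
    if global_install:
        cmd.append("-g")
    for agent in agents:
        cmd += ["-a", agent]
    return cmd


# Print the command line instead of executing it, so it can be reviewed first.
print(shlex.join(build_skills_add_command(
    "dlstreamer-coding-agent", ["claude-code", "cursor"], ref="main")))
```

The returned list can be passed to `subprocess.run` as-is, which avoids shell quoting problems.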
2. Open the Chat Window and Enter Your Prompt
Open the AI chat window in your coding environment and enter a prompt describing the application you want to build.
Develop a Python application that implements a license plate recognition pipeline optimized for Intel® Core™ Ultra Series 3 Processors:
- Read input video from a file (https://github.com/open-edge-platform/edge-ai-resources/raw/main/videos/ParkingVideo.mp4) but also allow remote IP cameras
- Run YOLOv11 (https://huggingface.co/morsetechlab/yolov11-license-plate-detection) for object detection and PaddleOCR (https://huggingface.co/PaddlePaddle/PP-OCRv5_server_rec) model for character recognition
- Output license plate text for each detected object as a JSON file
- Annotate the video stream and store it as an output video file
Save source code in the license_plate_recognition directory.
For more user prompts to try with the DL Streamer Coding Agent, see the list available at link
Note: If the `dlstreamer-coding-agent` skill installed at `~/.agents/skills` is not auto-discovered by your coding agent, invoke it manually in the AI chat window: `/dlstreamer-coding-agent [user prompt]`
3. What the Coding Agent Does
Once the prompt is submitted, the DL Streamer Coding Agent automatically performs the following steps:
Sets up the target machine: installs DL Streamer or pulls the DL Streamer Docker container on the target system.
Downloads AI models: fetches the required AI models and converts them to OpenVINO™ IR format on the target machine.
Builds the DL Streamer application: generates the complete application source code, including the main script, model export scripts, configuration files, and a README.md with setup and run instructions.
Runs, debugs, and validates: executes the generated application, debugs any issues, and validates that the output matches expectations.
If specific information is missing, the agent may ask clarifying questions and/or provide suggestions based on the existing DL Streamer sample applications.
The agent may ask for permission to access remote web pages, create files in your project, or execute commands on the target machine. The entire flow — from prompt to running application — typically takes 10–20 minutes, depending on network speed and application complexity.
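To give a feel for what the generated application might contain for the license plate prompt, here is a minimal sketch that only assembles a GStreamer pipeline description as a string. It is not the agent's actual output. The function name and file paths are hypothetical. It also assumes that the OCR model can be run through `gvaclassify` and that `x264enc` is available on the target, neither of which is confirmed here.

```python
from pathlib import Path


def build_lpr_pipeline(source, detect_model, ocr_model, json_out, video_out, device="GPU"):
    """Return a gst-launch style description: detect, recognize, publish JSON, save video.

    Hypothetical sketch; element choices and properties are assumptions to adapt.
    """
    # Accept a URI (for example rtsp:// for an IP camera) or a local file path.
    uri = source if "://" in source else Path(source).resolve().as_uri()
    elements = [
        f'urisourcebin uri="{uri}"',
        "decodebin3",
        f'gvadetect model="{detect_model}" device={device}',     # plate detection
        "queue",
        f'gvaclassify model="{ocr_model}" device={device}',      # assumed OCR stage
        "queue",
        "gvametaconvert",                                        # metadata to JSON
        f'gvametapublish file-format=json-lines file-path="{json_out}"',
        "gvawatermark",                                          # draw annotations
        "videoconvert",
        "x264enc",                                               # assumed encoder
        "h264parse",
        "mp4mux",
        f'filesink location="{video_out}"',
    ]
    return " ! ".join(elements)


print(build_lpr_pipeline("rtsp://camera.local/stream", "models/detector.xml",
                         "models/ocr.xml", "results/plates.jsonl", "results/output.mp4"))
```

On a machine with DL Streamer installed, a string like this could be passed to `Gst.parse_launch` from the GStreamer Python bindings. The code the agent generates may be structured quite differently.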
The final output is a complete project directory containing:
<project_name>/
├── <app_name>.py # Main application script
├── export_models.py # Model download and export script
├── requirements.txt # Python dependencies
├── export_requirements.txt # Dependencies for model export
├── README.md # Setup and run instructions
├── plugins/ # Custom GStreamer elements (if needed)
│ └── python/
├── config/ # Configuration files (if needed)
├── models/ # Downloaded and converted models
├── videos/ # Cached input videos
└── results/ # Output files (JSON, annotated video, etc.)
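A quick way to confirm that a generated project matches this layout is to check for the files that are always listed. The sketch below is a hypothetical helper, not something the agent ships. It treats directories marked "if needed" as optional and accepts any Python file other than `export_models.py` as the main application script.

```python
from pathlib import Path

# Files that appear unconditionally in the layout above.
EXPECTED_FILES = (
    "export_models.py",
    "requirements.txt",
    "export_requirements.txt",
    "README.md",
)


def missing_project_files(project_dir):
    """Return the expected entries that are absent from project_dir."""
    root = Path(project_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    # The main script is named after the application, so look for any other .py file.
    app_scripts = [p for p in root.glob("*.py") if p.name != "export_models.py"]
    if not app_scripts:
        missing.append("<app_name>.py")
    return missing
```

For example, `missing_project_files("license_plate_recognition")` returns an empty list when every expected file is present.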
Video walkthrough: The following video illustrates how a license plate recognition application can be created in less than 10 minutes.