# Getting Started Guide - Metro Vision AI SDK ## Overview The Metro Vision AI SDK provides a comprehensive development environment for computer vision applications using Intel's optimized tools and frameworks. This guide demonstrates the installation process and provides a practical object detection implementation using DLStreamer and OpenVINO. ## Learning Objectives Upon completion of this guide, you will be able to: - Install and configure the Metro Vision AI SDK - Execute object detection inference on video content - Understand the basic pipeline architecture for computer vision workflows ## System Requirements Verify that your development environment meets the following specifications: - Operating System: Ubuntu 24.04 LTS or Ubuntu 22.04 LTS - Memory: Minimum 8GB RAM - Storage: 20GB available disk space - Network: Active internet connection for package downloads ## Installation Process Execute the automated installation script to configure the complete development environment: ```bash curl https://raw.githubusercontent.com/open-edge-platform/edge-ai-suites/refs/heads/main/metro-ai-suite/metro-sdk-manager/scripts/metro-vision-ai-sdk.sh | bash ``` ![Metro Vision AI SDK Installation](images/metro-vision-ai-sdk-install.png) The installation process configures the following components: - Docker containerization platform - Intel DLStreamer video analytics framework - OpenVINO inference optimization toolkit - Pre-trained model repositories and sample implementations ## Object Detection Implementation This section demonstrates a complete object detection workflow using the installed components. ### Step 1: Create Working Directory Create a dedicated working directory for the tutorial: ```bash mkdir -p ~/metro/metro-vision-get-started-tutorial cd ~/metro/metro-vision-get-started-tutorial ``` ### Step 2: Download Sample Video and model Download the required video sample and pre-trained model components: ```bash wget -O sample.mp4 https://github.com/intel-iot-devkit/sample-videos/raw/master/person-bicycle-car-detection.mp4 mkdir -p models/intel/pedestrian-and-vehicle-detector-adas-0001/FP32/ wget -O "models/intel/pedestrian-and-vehicle-detector-adas-0001/FP32/pedestrian-and-vehicle-detector-adas-0001.xml" "https://storage.openvinotoolkit.org/repositories/open_model_zoo/2023.0/models_bin/1/pedestrian-and-vehicle-detector-adas-0001/FP32/pedestrian-and-vehicle-detector-adas-0001.xml?raw=true" wget -O "models/intel/pedestrian-and-vehicle-detector-adas-0001/FP32/pedestrian-and-vehicle-detector-adas-0001.bin" "https://storage.openvinotoolkit.org/repositories/open_model_zoo/2023.0/models_bin/1/pedestrian-and-vehicle-detector-adas-0001/FP32/pedestrian-and-vehicle-detector-adas-0001.bin?raw=true" wget -O "models/intel/pedestrian-and-vehicle-detector-adas-0001/pedestrian-and-vehicle-detector-adas-0001.json" "https://raw.githubusercontent.com/dlstreamer/dlstreamer/refs/heads/master/samples/gstreamer/model_proc/intel/pedestrian-and-vehicle-detector-adas-0001.json" ``` ### Step 3: Pipeline Execution Execute the object detection pipeline using the configured assets. ```bash # Allow X11 connections from local clients xhost + # Start the container docker run --rm -it --name dlstreamer \ -v $PWD:/data \ -e DISPLAY=$DISPLAY \ -v /tmp/.X11-unix:/tmp/.X11-unix \ intel/dlstreamer ``` ```bash # Run the GStreamer pipeline inside the running container gst-launch-1.0 filesrc location=/data/sample.mp4 ! \ decodebin ! videoconvert ! \ gvadetect model=/data/models/intel/pedestrian-and-vehicle-detector-adas-0001/FP32/pedestrian-and-vehicle-detector-adas-0001.xml \ model-proc=/data/models/intel/pedestrian-and-vehicle-detector-adas-0001/pedestrian-and-vehicle-detector-adas-0001.json \ device=CPU ! \ gvawatermark ! videoconvert ! autovideosink ``` ```bash # To exit the container exit ``` ![Object Detection](images/object-detection.png) ### Pipeline Architecture Analysis The executed command implements a processing pipeline with the following stages: 1. **File Input**: Reads the specified video file from the filesystem 2. **Video Decoding**: Converts encoded video stream to raw frame data 3. **Object Detection**: Applies the pre-trained model to identify pedestrians and vehicles 4. **Visualization**: Renders bounding boxes and classification labels on detected objects 5. **Display Output**: Presents the annotated video stream through the system display The resulting output displays the original video content with overlaid detection annotations indicating identified objects and their classification confidence levels. ## Technology Framework Overview ### DLStreamer Framework DLStreamer provides a comprehensive video analytics framework built on GStreamer technology. Key capabilities include: - Multi-format video input support (files, network streams, camera devices) - Real-time inference execution on video frame sequences - Flexible output rendering and storage options - Modular pipeline architecture for custom workflow development ### OpenVINO Optimization Toolkit OpenVINO delivers cross-platform inference optimization for Intel hardware architectures. Core features include: - Model format standardization through Intermediate Representation (IR) - Hardware-specific performance optimization - Extensive pre-trained model repository - Development tools for model conversion and validation ## Next Steps Continue your learning journey with these hands-on tutorials: ### [Tutorial 1: OpenVINO Model Benchmark](./tutorial-1.md) Learn to benchmark AI model performance across different Intel hardware (CPU, GPU, NPU) and understand optimization techniques for production deployments. ### [Tutorial 2: Multi-Stream Video Processing](./tutorial-2.md) Build scalable video analytics solutions by processing multiple video streams simultaneously with hardware acceleration. ### [Tutorial 3: Real-Time Object Detection](./tutorial-3.md) Develop a complete object detection application with interactive controls, performance monitoring, and production-ready features. ### [Tutorial 4: Advanced Video Analytics Pipelines](./tutorial-4.md) Create sophisticated video analytics using IntelĀ® DL Streamer framework, including human pose estimation and multi-model integration. ### [Tutorial 5: Profiling](./tutorial-5.md) Profiling and monitoring performance of Metro Vision AI workloads using command-line tools. ## Additional Resources ### Technical Documentation - [DLStreamer](http://docs.openedgeplatform.intel.com/dev/edge-ai-libraries/dl-streamer/index.html) \- Comprehensive documentation for Intel's GStreamer-based video analytics framework - [DLStreamer Pipeline Server](https://docs.openedgeplatform.intel.com/edge-ai-libraries/dlstreamer-pipeline-server/main/user-guide/Overview.html) \- RESTful microservice architecture documentation for scalable video analytics deployment - [OpenVINO](https://docs.openvino.ai/2025/get-started.html) \- Complete reference for Intel's cross-platform inference optimization toolkit - [OpenVINO Model Server](https://docs.openvino.ai/2025/model-server/ovms_what_is_openvino_model_server.html) \- Model serving infrastructure documentation for production deployments - [Edge AI Libraries](https://docs.openedgeplatform.intel.com/dev/ai-libraries.html) \- Comprehensive development toolkit documentation and API references - [Edge AI Suites](https://docs.openedgeplatform.intel.com/dev/ai-suite-metro.html) \- Complete application suite documentation with implementation examples ### Support Channels - [GitHub Issues](https://github.com/open-edge-platform/edge-ai-suites/issues) \- Technical issue tracking and community support :::{toctree} :hidden: tutorial-1 tutorial-2 tutorial-3 tutorial-4 tutorial-5 :::