# Mapping Service This Docker container provides a Flask REST API interface for 3D reconstruction with build-time model selection, enabling generation of meshes and camera parameters from captured frames. Each container is built with one of two state-of-the-art models: - **MapAnything**: Universal Feed-Forward Metric 3D Reconstruction - **VGGT**: Visual Geometry Grounded Transformer for sparse view reconstruction ## Features - **Flask** based REST API with JSON responses - **Build-Time Model Selection**: Single model per container, no dependency conflicts - **Multi-image Input**: Process multiple images simultaneously - **GLB Output**: Generate 3D models in GLB format - **Camera Data**: Extract camera poses and intrinsics - **Image Enhancement**: Automatic CLAHE preprocessing for improved contrast - **Containerized**: Model-specific containers for clean deployment ## SceneScape Integration The following diagram shows the dataflow between the Intel® SceneScape Web UI, database, MQTT broker, and the Mapping Service. > **Note:** The diagram is currently best viewed in light color mode. ```mermaid sequenceDiagram SceneScape Web UI ->>+Database: "Query camera info" SceneScape Web UI ->>+MQTT Broker: "Get latest frame for each camera" SceneScape Web UI ->>+Mapping Service: "REST API call to /reconstruction endpoint with camera frames" Mapping Service ->>+SceneScape Web UI: "Output: GLB & Camera Poses" SceneScape Web UI ->>+Database: "Update scene map & camera poses" ``` ## API Endpoints ### Health Check ```bash GET /health ``` Returns service status and model availability. ### List Models ```bash GET /models ``` Returns information about the model in this container and its status. ### 3D Reconstruction ```bash POST /reconstruction ``` Perform 3D reconstruction from images and/or video. #### Request Format **Multipart Form Data (Required)** The API accepts `Content-Type: multipart/form-data` to upload image and/or video files: ```bash POST /reconstruction Content-Type: multipart/form-data Form fields: - images: Image files (can specify multiple) - video: Video file (optional) - output_format: "glb" or "json" (default: "glb") - mesh_type: "mesh" or "pointcloud" (default: "mesh") - use_keyframes: "true" or "false" (for video, default: true) ``` **Notes:** - You can provide images only, video only, or both together - All inputs are processed as individual frames - The API only accepts multipart/form-data format with actual file uploads - JSON payloads with base64-encoded images are NOT supported - `model_type` is no longer needed - the model is determined at build time #### Response Format ```json { "success": true, "model": "mapanything", // indicates which model was used "glb_data": "base64_encoded_glb_file", "camera_poses": [ { "rotation": [0, 0, 0, 0], // quaternion rotation [x, y, z, w] "translation": [0, 0, 0] // 3D translation vector [x, y, z] } ], "intrinsics": [ [ [0, 0, 0], [0, 0, 0], [0, 0, 1] ] // 3x3 intrinsics matrix [[fx, 0, cx], [0, fy, cy], [0, 0, 1]] ], "processing_time": 15.23, "message": "Success message" } ``` ## Building and Running Check out [How to Build from Source](./build-from-source.md) for instructions on building the service from source and running it. ## Using the API ### Example with Python Client ```python import base64 import requests # Encode images to base64 def encode_image(image_path): with open(image_path, "rb") as f: return base64.b64encode(f.read()).decode('utf-8') # Prepare request payload = { "images": [ {"data": encode_image("image1.jpg"), "filename": "image1.jpg"}, {"data": encode_image("image2.jpg"), "filename": "image2.jpg"} ], "output_format": "glb" } # Send request response = requests.post("https://localhost:8444/reconstruction", json=payload) result = response.json() if result["success"]: # Save GLB file glb_data = base64.b64decode(result["glb_data"]) with open("output.glb", "wb") as f: f.write(glb_data) print(f"Model used: {result['model']}") print(f"Processing time: {result['processing_time']:.2f}s") print(f"Camera poses: {len(result['camera_poses'])}") ``` ### Using the Included Client ```bash # Check API health (model-agnostic) python client_example.py --health-check --insecure # Specify output type python client_example.py --images image1.jpg image2.jpg --mesh-type mesh --output mesh.glb --insecure python client_example.py --images image1.jpg image2.jpg --mesh-type pointcloud --output points.glb --insecure ``` ### Using curl ```bash # Health check curl https://localhost:8444/health --insecure # List models curl https://localhost:8444/models --insecure # Reconstruction with images (using multipart/form-data - recommended) curl -X POST "https://localhost:8444/reconstruction" \ -F "images=@image1.jpg" \ -F "images=@image2.jpg" \ -F "output_format=glb" \ -F "mesh_type=mesh" \ --insecure # Reconstruction with video curl -X POST "https://localhost:8444/reconstruction" \ -F "video=@video.mp4" \ -F "output_format=glb" \ -F "mesh_type=mesh" \ -F "use_keyframes=true" \ --insecure # Reconstruction with both images and video curl -X POST "https://localhost:8444/reconstruction" \ -F "images=@image1.jpg" \ -F "images=@image2.jpg" \ -F "video=@video.mp4" \ -F "output_format=glb" \ -F "mesh_type=mesh" \ --insecure # Save GLB output to file (requires jq for JSON parsing) curl -X POST "https://localhost:8444/reconstruction" \ -F "images=@image1.jpg" \ -F "images=@image2.jpg" \ -F "output_format=glb" \ -F "mesh_type=mesh" \ --insecure | jq -r '.glb_data' | base64 -d > output.glb ``` ## Model Comparison | Feature | MapAnything | VGGT | | --------------------- | --------------------- | ------------------------------------------------------------------------ | | **License** | Apache 2.0 | [VGGT License](https://github.com/3d-scene-recon/vggt/blob/main/LICENSE) | | **Input** | Multiple images | Multiple images/video frames | | **Strength** | Metric reconstruction | Sparse view reconstruction | | **Speed** | Fast | Moderate | | **Memory** | Lower | Higher | | **Quality** | High for dense views | High for sparse views | | **Native Output** | Watertight mesh | Point cloud | | **Supported Outputs** | Mesh, Point cloud | Point cloud, Mesh | ## Development ### Adding Custom Models To add support for additional models: 1. Create a new model class following the `ReconstructionModel` interface 2. Create a model-specific service file (e.g., `mymodel_service.py`) 3. Add model installation steps to the Dockerfile 4. Update the Makefile to support the new model type 5. Add build-time model selection logic ## Minimum Hardware Requirements - **CPU**: 12th Gen or newer Intel® Core™ processors (i5 or higher), or 2nd Gen or newer Intel® Xeon® processors - **RAM**: - MapAnything: 8GB minimum (4GB for model + overhead) - VGGT: 16GB minimum (8GB for model + overhead, more for high resolution images) - **Storage**: 12GB free space for Docker images and models ## Performance Notes - **First Run**: Initial model download may take several minutes - **Memory Requirements**: - MapAnything: ~4GB RAM - VGGT: ~8GB RAM (more for high resolution) - **Processing Time**: Varies by image count and resolution ## Best Practices - **Image Preprocessing**: All input images automatically undergo Contrast Limited Adaptive Histogram Equalization (CLAHE) to enhance contrast and improve reconstruction quality, particularly for low-contrast or unevenly-lit scenes. - **VGGT** pointcloud output scale is orders of magnitude smaller than the actual scene. The scale of the output mesh generated by **Map Anything** is closer to the actual scene than **VGGT**. - The output mesh generated by **VGGT** version of the service has several issues currently. All of these issues will be addressed in the next Intel® SceneScape release: - It is not aligned with the original point cloud - The resolution of the texture is not sharp. - Pointcloud to mesh conversion takes many multiples of time taken by inference that generates the pointcloud. - The service has not been tested with cameras which have distortion. Expect the reconstruction to perform poorly if your cameras show visual distortion. - The reconstruction does not distinguish between static and dynamic objects. If the camera frames contain objects like persons, vehicles etc., the reconstruction will include those objects as well. For best results, call the service when the camera frames do not contain objects that should not be included in the mesh. ## Supporting Resources - [Build from Source](./build-from-source.md): Build the service from source and run it. - [API Reference](./api-docs/mapping-api.yaml): Comprehensive reference for the Mapping service REST API endpoints. :::{toctree} :hidden: ./build-from-source.md :::