Contents
I. Pipeline and toolchain
II. Hardware overview
III. GPU video codec frameworks
IV. VPI build and usage example
V. jetson_multimedia_api build and usage example
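I. Pipeline and toolchain
II. Hardware overview
III. GPU video codec frameworks
- Jetson devices do not currently support the VPF framework; I cover installing and using VPF on an x86 PC in the next section.
- The GPU codec frameworks currently supported on Jetson are VPI and jetson_multimedia_api.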
#1. Host
agx@ubuntu:~$ ls /usr/src/jetson_multimedia_api/
argus data include LEGAL LICENSE Makefile README samples tools
agx@ubuntu:~$ ls /opt/
containerd/ genymobile/ nvidia/ ota_package/ todesk/
agx@ubuntu:~$ ls /opt/nvidia/vpi2/
bin doc etc include lib lib64 samples share

#2. Docker
agx@ubuntu:~$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
nvcr.io/nvidia/l4t-pytorch r35.2.1-pth2.0-py3 853b58c1dce6 2 years ago 11.7GB
agx@ubuntu:~$ docker exec -it nvpy bash
root@7666a2ca87d3:/# ls /usr/src/jetson_multimedia_api/
LEGAL LICENSE Makefile README argus data include samples tools
root@7666a2ca87d3:/# ls /opt/nvidia/vpi2/
bin doc etc include lib lib64 samples share
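The same presence check can be scripted, which is handy when comparing the host with a container. The following is a minimal sketch, assuming the install locations shown in the listings above; the function name `find_codec_frameworks` is hypothetical and not part of any NVIDIA tool.

```python
import os

# Install locations taken from the listings above (they may differ on other JetPack releases).
DEFAULT_LOCATIONS = {
    "jetson_multimedia_api": "/usr/src/jetson_multimedia_api",
    "vpi2": "/opt/nvidia/vpi2",
}


def find_codec_frameworks(locations=DEFAULT_LOCATIONS):
    """Return {name: bool} telling whether each framework directory exists."""
    return {name: os.path.isdir(path) for name, path in locations.items()}


if __name__ == "__main__":
    for name, present in find_codec_frameworks().items():
        print(f"{name}: {'found' if present else 'missing'}")
```

Running the script once on the host and once inside the container should report both directories as found if the setup matches the one shown here.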
IV. VPI build and usage example
1. Run output
root@7666a2ca87d3:/opt/nvidia/vpi2# cd s
samples/ share/
root@7666a2ca87d3:/opt/nvidia/vpi2# cd samples/
01-convolve_2d/ 03-harris_corners/ 05-benchmark/ 07-fft/ 09-tnr/ 11-fisheye/ 13-optflow_dense/ 15-image_view/ 17-template_matching/ assets/
02-stereo_disparity/ 04-rescale/ 06-klt_tracker/ 08-cross_aarch64_l4t/ 10-perspwarp/ 12-optflow_lk/ 14-background_subtractor/ 16-vpi_pytorch/ 18-orb_feature_detector/ tutorial_blur/
root@7666a2ca87d3:/opt/nvidia/vpi2# cd samples/01-convolve_2d/
root@7666a2ca87d3:/opt/nvidia/vpi2/samples/01-convolve_2d# python3 main.py --backend=cuda --input "/opt/nvidia/vpi2/share/backgrounds/NVIDIA_icon.png"
root@7666a2ca87d3:/opt/nvidia/vpi2/samples/01-convolve_2d#
2. Source code
import sys
import vpi
import numpy as np
from PIL import Image
from argparse import ArgumentParser

# Parse command line arguments
parser = ArgumentParser()
parser.add_argument('--backend', choices=['cpu','cuda','pva'], default="cuda",
                    help='Backend to be used for processing')
parser.add_argument('--input', default="/opt/nvidia/vpi2/share/backgrounds/NVIDIA_icon.png",
                    help='Image to be used as input')
args = parser.parse_args()

if args.backend == 'cpu':
    backend = vpi.Backend.CPU
elif args.backend == 'cuda':
    backend = vpi.Backend.CUDA
else:
    assert args.backend == 'pva'
    backend = vpi.Backend.PVA

# Load input into a vpi.Image
try:
    input = vpi.asimage(np.asarray(Image.open(args.input)))
except IOError:
    sys.exit("Input file not found")
except:
    sys.exit("Error with input file")

# Convert it to grayscale
input = input.convert(vpi.Format.U8, backend=vpi.Backend.CUDA)

# Define a simple edge detection kernel
kernel = [[ 1, 0, -1],
          [ 0, 0,  0],
          [-1, 0,  1]]

# Using the chosen backend,
with backend:
    # Run input through the convolution filter
    output = input.convolution(kernel, border=vpi.Border.ZERO)

# Save result to disk
Image.fromarray(output.cpu()).save('edges_python'+str(sys.version_info[0])+'_'+args.backend+'.png')
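To see what the 3x3 kernel does without a Jetson at hand, the filter can be reproduced on a tiny image in plain Python. This is only an illustrative sketch, not VPI's implementation: the function name `convolve2d_zero_border` is hypothetical, out-of-image pixels are treated as zero to mirror `vpi.Border.ZERO`, and the way VPI rounds or saturates results to U8 is not modeled. Because this kernel is unchanged when rotated by 180 degrees, convolution and correlation give the same result for it.

```python
def convolve2d_zero_border(image, kernel):
    """Naive 2D convolution on nested lists; pixels outside the image count as 0."""
    rows, cols = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            acc = 0
            for m in range(kh):
                for n in range(kw):
                    # Source pixel addressed by this kernel tap (true convolution indexing).
                    sy, sx = y - (m - ph), x - (n - pw)
                    if 0 <= sy < rows and 0 <= sx < cols:
                        acc += kernel[m][n] * image[sy][sx]
            out[y][x] = acc
    return out


# Same edge detection kernel as in the sample above.
KERNEL = [[1, 0, -1],
          [0, 0, 0],
          [-1, 0, 1]]

if __name__ == "__main__":
    flat = [[10] * 5 for _ in range(5)]
    for row in convolve2d_zero_border(flat, KERNEL):
        print(row)
```

On a uniform image the interior response is zero, since the kernel weights sum to zero; only the pixels touching the zero border respond. That is the behavior expected from an edge detector.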
3. Result (the example above applies a single convolution filter)
V. jetson_multimedia_api build and usage example
1. CUDA H.264 encoding (bug warning: it compiles, but OSD does not work; the next two experiments were run directly on the Jetson desktop, where they work)
root@ubuntu:/usr/src/jetson_multimedia_api/samples/03_video_cuda_enc# make clean
root@ubuntu:/usr/src/jetson_multimedia_api/samples/03_video_cuda_enc# make
Compiling: video_cuda_enc_csvparser.cpp
Compiling: video_cuda_enc_main.cpp
make[1]: Entering directory '/usr/src/jetson_multimedia_api/samples/common/classes'
Compiling: NvElementProfiler.cpp
Compiling: NvElement.cpp
Compiling: NvApplicationProfiler.cpp
Compiling: NvVideoDecoder.cpp
Compiling: NvJpegEncoder.cpp
Compiling: NvBuffer.cpp
Compiling: NvLogging.cpp
Compiling: NvEglRenderer.cpp
Compiling: NvUtils.cpp
Compiling: NvDrmRenderer.cpp
Compiling: NvJpegDecoder.cpp
Compiling: NvVideoEncoder.cpp
Compiling: NvV4l2ElementPlane.cpp
Compiling: NvBufSurface.cpp
Compiling: NvV4l2Element.cpp
make[1]: Leaving directory '/usr/src/jetson_multimedia_api/samples/common/classes'
make[1]: Entering directory '/usr/src/jetson_multimedia_api/samples/common/algorithm/cuda'
Compiling: NvAnalysis.cu
Compiling: NvCudaProc.cpp
make[1]: Leaving directory '/usr/src/jetson_multimedia_api/samples/common/algorithm/cuda'
Linking: video_cuda_enc
root@ubuntu:/usr/src/jetson_multimedia_api/samples/03_video_cuda_enc# ./video_cuda_enc ../../data/Video/sample_outdoor_car_1080p_10fps.yuv 1920 1080 H264 test.h264
Segmentation fault (core dumped)
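The cause of this segmentation fault is not diagnosed here. One inexpensive thing to rule out is a mismatch between the raw input file and the width and height passed on the command line. The sketch below assumes the input is planar YUV 4:2:0 (I420) with even dimensions, which is an assumption about the sample data; both helper names are hypothetical, and the check does not explain or fix the crash.

```python
import os


def i420_frame_bytes(width, height):
    """Bytes per I420 frame: one full-size Y plane plus two quarter-size chroma planes."""
    return width * height * 3 // 2


def check_raw_yuv(path, width, height):
    """Return (whole_frames, leftover_bytes) for a raw I420 file of the given size."""
    return divmod(os.path.getsize(path), i420_frame_bytes(width, height))


if __name__ == "__main__":
    # Path and dimensions taken from the encoder command above.
    path = "../../data/Video/sample_outdoor_car_1080p_10fps.yuv"
    if os.path.exists(path):
        frames, leftover = check_raw_yuv(path, 1920, 1080)
        print(f"{frames} whole frames, {leftover} leftover bytes")
    else:
        print("input file not found")
```

A non-zero leftover would suggest that the file is truncated or that the format or dimensions differ from what was passed to the encoder.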
2. CUDA H.264 decoding
root@ubuntu:/usr/src/jetson_multimedia_api/samples/02_video_dec_cuda# ./video_dec_cuda ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264
Opening in BLOCKING MODE
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
Starting decoder capture loop thread
Input file read complete
Video Resolution: 1920x1080
[INFO] (NvEglRenderer.cpp:110) <renderer0> Setting Screen width 1920 height 1080
Query and set capture successful
Exiting decoder capture loop thread
App run was successful
3. CUDA H.264 decoding + TensorRT object detection:
GPU detection and result caching
root@ubuntu:/usr/src/jetson_multimedia_api/samples/02_video_dec_cuda# cd ../04_video_dec_trt/
root@ubuntu:/usr/src/jetson_multimedia_api/samples/04_video_dec_trt# ./video_dec_trt 2 ../../data/Video/sample_outdoor_car_1080p_10fps.h264 ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264 --trt-onnxmodel ../../data/Model/resnet10/resnet10_dynamic_batch.onnx --trt-mode 0
set onnx modefile: ../../data/Model/resnet10/resnet10_dynamic_batch.onnx
Using cached TRT model
Deserialization required 13048 microseconds.
Total per-runner device persistent memory is 5632
Total per-runner host persistent memory is 45440
Allocated activation device memory of size 22138880
Opening in BLOCKING MODE
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
Starting decoder capture loop thread
Input file read complete
Video Resolution: 1920x1080
Opening in BLOCKING MODE
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
Resolution change successful
Starting decoder capture loop thread
Input file read complete
Video Resolution: 1920x1080
Resolution change successful
Time elapsed:1 ms per frame in past 100 frames
Time elapsed:1 ms per frame in past 100 frames
Time elapsed:1 ms per frame in past 100 frames
Time elapsed:1 ms per frame in past 100 frames
Time elapsed:1 ms per frame in past 100 frames
Time elapsed:1 ms per frame in past 100 frames
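The repeated "Time elapsed" lines are easier to compare across runs when reduced to a single figure. The following sketch parses exactly the line format printed above and computes a frame-weighted mean; the function name `summarize_timing` is hypothetical. Since the log reports whole milliseconds, the result is only as precise as that rounding.

```python
import re

# Matches lines such as "Time elapsed:1 ms per frame in past 100 frames".
LINE_RE = re.compile(r"Time elapsed:(\d+) ms per frame in past (\d+) frames")


def summarize_timing(log_text):
    """Return (total_frames, weighted_mean_ms); the mean is None if no line matches."""
    samples = [(int(ms), int(n)) for ms, n in LINE_RE.findall(log_text)]
    total = sum(n for _, n in samples)
    if total == 0:
        return 0, None
    mean = sum(ms * n for ms, n in samples) / total
    return total, mean


if __name__ == "__main__":
    log = "\n".join(["Time elapsed:1 ms per frame in past 100 frames"] * 6)
    print(summarize_timing(log))
```

Redirecting the sample's output to a file and passing its contents to `summarize_timing` gives one number per run.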
CUDA H.264 video decoding + OSD
root@ubuntu:/usr/src/jetson_multimedia_api/samples/02_video_dec_cuda# ./video_dec_cuda ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264 --bbox-file result0.txt
ctx.osd_file_path:result0.txt
Opening in BLOCKING MODE
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
Starting decoder capture loop thread
Input file read complete
Video Resolution: 1920x1080
[INFO] (NvEglRenderer.cpp:110) <renderer0> Setting Screen width 1920 height 1080
Query and set capture successful
Exiting decoder capture loop thread
App run was successful
root@ubuntu:/usr/src/jetson_multimedia_api/samples/02_video_dec_cuda# ls
Makefile resuilt.txt result0.txt result1.txt result.txt videodec_csvparser.cpp videodec_csvparser.o video_dec_cuda videodec.h videodec_main.cpp videodec_main.o
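The two `video_dec_cuda` invocations in this part differ only in the `--bbox-file` option. A small wrapper makes that explicit when scripting several runs. This is a sketch that uses only the arguments visible in the commands above; the function name `video_dec_cuda_cmd` is hypothetical.

```python
import shlex


def video_dec_cuda_cmd(video, codec="H264", bbox_file=None, binary="./video_dec_cuda"):
    """Build the argument list for video_dec_cuda, with OSD boxes if bbox_file is given."""
    argv = [binary, video, codec]
    if bbox_file is not None:
        argv += ["--bbox-file", bbox_file]
    return argv


if __name__ == "__main__":
    video = "../../data/Video/sample_outdoor_car_1080p_10fps.h264"
    # Print both variants; pass the list to subprocess.run(argv, check=True) to execute one.
    print(shlex.join(video_dec_cuda_cmd(video, bbox_file="result0.txt")))
    print(shlex.join(video_dec_cuda_cmd(video)))
```

Passing a list rather than a single shell string avoids quoting problems if a path contains spaces.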
root@ubuntu:/usr/src/jetson_multimedia_api/samples/02_video_dec_cuda# ./video_dec_cuda ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264
Opening in BLOCKING MODE
NvMMLiteOpen : Block : BlockType = 261
NVMEDIA: Reading vendor.tegra.display-size : status: 6
NvMMLiteBlockCreate : Block : BlockType = 261
Starting decoder capture loop thread
Input file read complete
Video Resolution: 1920x1080
[INFO] (NvEglRenderer.cpp:110) <renderer0> Setting Screen width 1920 height 1080
Query and set capture successful