愛智 EdgerOS In Depth: How the AI Image Engines Enable AI Vision Development

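1. Introduction

AI vision aims to let computers use cameras in place of the human eye to recognize and track targets, and then carry out more complex image processing. Academic research in this area has existed for a long time, but computer vision only received formal attention and development in the late 1970s, when computers became powerful enough to handle data as large as images.
Today AI vision is everywhere in daily life, from everyday QR codes to face recognition to more specialized pathology analysis, and it reaches into far more fields than we might imagine. Even so, building an AI vision application of your own remains very complex for a developer outside the field: the most basic step, training the algorithms, already consumes a great deal of time and effort.
EdgerOS ships with several built-in AI engines for different purposes, so developers can get started with AI vision quickly and shorten the development cycle considerably. Engines can be combined as needed to implement the desired business logic. This article introduces two AI engines commonly used in EdgerOS.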

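2. FaceNN

FaceNN is the AI engine EdgerOS provides for face recognition. It can locate faces in a video stream or an image, and it can analyze facial features to derive attributes of the person such as age, gender, and emotion.
The FaceNN engine is packaged in the "facenn" module and can be imported as follows: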

const facenn = require('facenn');

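The FaceNN engine exposes a minimal interface, which lets developers implement face-related AI processing quickly and greatly lowers the learning cost.
First, the format of the image to be recognized must be specified. The FaceNN engine currently supports the following formats: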

Type	Description
facenn.PIX_FMT_RGB24	RGB24 pixel format
facenn.PIX_FMT_BGR2RGB24	BGR24 to RGB24 pixel format
facenn.PIX_FMT_GRAY2RGB24	Grayscale to RGB24 pixel format
facenn.PIX_FMT_RGBA2RGB24	RGBA to RGB24 pixel format

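facenn.detect(videoBuf, attribute[, quick])
- videoBuf {Buffer} image buffer
- attribute {Object} image attributes
- - width {Integer} image width
- - height {Integer} image height
- - pixelFormat {Integer} pixel format
- quick {Boolean} whether to enable quick mode
Returns (one entry per detected face):
- score {Number} face coverage score
- x0 {Integer} x coordinate of the top-left corner
- y0 {Integer} y coordinate of the top-left corner
- x1 {Integer} x coordinate of the bottom-right corner
- y1 {Integer} y coordinate of the bottom-right corner
- area {Number} Area, non-quick mode only
- regreCoord {Array} RegreCoord, non-quick mode only
- landmark {Array} Landmark, non-quick mode only
facenn.detect finds how many faces a frame of image data contains and where each face is located in the image.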
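Each face is reported by its two corner coordinates, so the rectangle size and the most prominent face have to be derived by the caller. The following is a minimal sketch, assuming the result is an array of objects with the fields listed above; the helper names boxSize and largestFace and the sample values are hypothetical and not part of the facenn module:

```javascript
// Hypothetical helpers for results shaped like the facenn.detect output
// described above: { score, x0, y0, x1, y1 }.

// Width and height of a face rectangle given its two corners.
function boxSize(face) {
  return { width: face.x1 - face.x0, height: face.y1 - face.y0 };
}

// Pick the face with the largest rectangle, or null when none was detected.
function largestFace(faces) {
  let best = null;
  let bestArea = -1;
  for (const face of faces) {
    const { width, height } = boxSize(face);
    const area = width * height;
    if (area > bestArea) {
      best = face;
      bestArea = area;
    }
  }
  return best;
}

// Made-up sample data standing in for a real detection result.
const sampleFaces = [
  { score: 0.98, x0: 10, y0: 20, x1: 110, y1: 140 },
  { score: 0.91, x0: 200, y0: 40, x1: 240, y1: 90 },
];
console.log(boxSize(sampleFaces[0]));
console.log(largestFace(sampleFaces) === sampleFaces[0]);
```

Selecting the largest rectangle is only one possible policy; an application could just as well prefer the highest score.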
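facenn.feature(videoBuf, attribute, faceInfo[, extra])
- videoBuf {Buffer} image buffer
- attribute {Object} image attributes
- - width {Integer} image width
- - height {Integer} image height
- - pixelFormat {Integer} pixel format
- faceInfo {Object} information for one face, as returned by facenn.detect
- extra {Object} additional face attributes to extract, default: undefined
Returns:
- keys {Array} face keys
- male {Boolean} gender, must be selected in extra
- age {Integer} age, must be selected in extra
- emotion {String} emotion, must be selected in extra; one of: angry, disgust, fear, happy, sad, surprise, neutral
- live {Number} liveness score, must be selected in extra
facenn.feature extracts detailed information about a face, such as gender, emotion, and age.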
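Because the optional attributes only appear when they are selected in extra, code that consumes the result should tolerate missing fields. A small sketch, assuming that unselected fields are simply undefined (this is an assumption, not documented behaviour); describeFace is a hypothetical helper name:

```javascript
// Hypothetical helper: turn a facenn.feature-style result into a one-line
// summary. Fields that were not requested via `extra` are assumed to be
// undefined and are skipped.
function describeFace(feature) {
  const parts = [];
  if (typeof feature.male === 'boolean') {
    parts.push(feature.male ? 'male' : 'female');
  }
  if (typeof feature.age === 'number') {
    parts.push(`age ${feature.age}`);
  }
  if (typeof feature.emotion === 'string') {
    parts.push(`emotion ${feature.emotion}`);
  }
  if (typeof feature.live === 'number') {
    parts.push(`live ${feature.live.toFixed(2)}`);
  }
  return parts.length > 0 ? parts.join(', ') : 'no extra attributes';
}

// Made-up results standing in for real facenn.feature output.
console.log(describeFace({ keys: [], male: false, age: 21, emotion: 'neutral', live: 0.5 }));
console.log(describeFace({ keys: [] }));
```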
facenn.compare(faceKeys1, faceKeys2)
- faceKeys1 {Object} face keys 1
- faceKeys2 {Object} face keys 2
Returns:
- similarity value, 0.0 ~ 1.0
facenn.compare computes the similarity between two sets of face keys.
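In practice a similarity value is usually compared against a threshold and against several enrolled faces. The sketch below illustrates that idea under stated assumptions: bestMatch is a hypothetical helper, the 0.6 threshold is arbitrary rather than a documented value, and the comparison function is injected so the example runs without the facenn module (on EdgerOS one would presumably pass facenn.compare):

```javascript
// Hypothetical helper: find the enrolled entry whose face keys are most
// similar to the probe, ignoring entries below the threshold.
// `compare(a, b)` must return a similarity number; higher means more alike.
function bestMatch(probeKeys, enrolled, compare, threshold = 0.6) {
  let best = null;
  for (const entry of enrolled) {
    const similarity = compare(probeKeys, entry.keys);
    if (similarity >= threshold && (best === null || similarity > best.similarity)) {
      best = { name: entry.name, similarity };
    }
  }
  return best; // null when nobody reaches the threshold
}

// Stand-in similarity for demonstration only; it is NOT how FaceNN works.
const fakeCompare = (a, b) => 1 - Math.min(1, Math.abs(a[0] - b[0]));

// Made-up enrolled data with one-element "keys".
const enrolled = [
  { name: 'alice', keys: [0.75] },
  { name: 'bob', keys: [0.5] },
  { name: 'carol', keys: [3] },
];
console.log(bestMatch([0.5], enrolled, fakeCompare));
console.log(bestMatch([10], enrolled, fakeCompare));
```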
Next, let's try the FaceNN engine on the following two images and read out their feature information:

[Image: the two sample face photos, image1.png and image2.png]

const imagecodec = require('imagecodec'); // image decoding module
const facenn = require('facenn');

function facennHandel(imagePath, imagePath2) {
  // Decode the first image to RGB and describe its layout for FaceNN.
  const image1 = imagecodec.decode(imagePath, { components: imagecodec.COMPONENTS_RGB });
  const imageInfo1 = imagecodec.info(imagePath);
  const videoAttrFacenn = { width: imageInfo1.width, height: imageInfo1.height, pixelFormat: facenn.PIX_FMT_RGB24 };
  // Detect faces, then extract features (with all extras) for the first face.
  const faceInfos = facenn.detect(image1.buffer, videoAttrFacenn);
  const facennFeature = facenn.feature(image1.buffer, videoAttrFacenn, faceInfos[0], { male: true, age: true, emotion: true, live: true });
  console.log(`image1.png  male:${facennFeature.male} age:${facennFeature.age} emotion:${facennFeature.emotion} live:${facennFeature.live}`);

  // Same steps for the second image.
  const image2 = imagecodec.decode(imagePath2, { components: imagecodec.COMPONENTS_RGB });
  const imageInfo2 = imagecodec.info(imagePath2);
  const videoAttrFacenn2 = { width: imageInfo2.width, height: imageInfo2.height, pixelFormat: facenn.PIX_FMT_RGB24 };
  const faceInfos2 = facenn.detect(image2.buffer, videoAttrFacenn2);
  const facennFeature2 = facenn.feature(image2.buffer, videoAttrFacenn2, faceInfos2[0], { male: true, age: true, emotion: true, live: true });
  console.log(`image2.png  male:${facennFeature2.male} age:${facennFeature2.age} emotion:${facennFeature2.emotion} live:${facennFeature2.live}`);

  // Compare the face keys of the two images.
  const compareNum = facenn.compare(facennFeature.keys, facennFeature2.keys);
  console.log(compareNum);
}

facennHandel('/image/image1.png', '/image/image2.png');
// Output:
// [JSRE-CON]image1.png  male:false age:21 emotion:neutral live:0.9843575954437256
// [JSRE-CON]image2.png  male:true age:58 emotion:sad live:0.33667701482772827
// [JSRE-CON]-0.1453045904636383

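3. ThingNN

ThingNN is the EdgerOS engine for detecting objects. It can pick out specific objects in a video stream or an image and mark where each object is located in the image.
The ThingNN engine is packaged in the "thingnn" module and can be imported as follows: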

const thingnn = require('thingnn');

As with FaceNN, the format of the image to be recognized must be specified. The ThingNN engine currently supports the following formats:

Type	Description
thingnn.PIX_FMT_RGB24	RGB24 pixel format
thingnn.PIX_FMT_BGR2RGB24	BGR24 to RGB24 pixel format
thingnn.PIX_FMT_GRAY2RGB24	Grayscale to RGB24 pixel format
thingnn.PIX_FMT_RGBA2RGB24	RGBA to RGB24 pixel format

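Next, let's look at the interfaces ThingNN provides:
thingnn.detect(videoBuf, attribute)
- videoBuf {Buffer} image buffer
- attribute {Object} image attributes
- - width {Integer} image width
- - height {Integer} image height
- - pixelFormat {Integer} pixel format
Returns (one entry per detected object):
- className {String} class name of the detected object
- prob {Number} detection probability
- x0 {Integer} x coordinate of the top-left corner
- y0 {Integer} y coordinate of the top-left corner
- x1 {Integer} x coordinate of the bottom-right corner
- y1 {Integer} y coordinate of the bottom-right corner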
The ThingNN module can currently recognize the following classes:

background, aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, diningtable, dog, horse, motorbike, person, pottedplant, sheep, sofa, train, tvmonitor

thingnn.detect returns the class of each object in the image and its position within the image.
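An application is often interested in only a few of these classes. The following sketch filters and counts detections; it assumes each result carries a className string and a numeric prob, and the helper names, the 0.5 cut-off, and the sample data are all hypothetical:

```javascript
// Hypothetical helpers for results shaped like the thingnn.detect output:
// { className, prob, x0, y0, x1, y1 }.

// Keep only detections of the wanted classes with at least minProb.
function filterThings(things, wantedClasses, minProb) {
  const wanted = new Set(wantedClasses);
  return things.filter((t) => wanted.has(t.className) && t.prob >= minProb);
}

// Count how many objects of each class were detected.
function countByClass(things) {
  const counts = {};
  for (const t of things) {
    counts[t.className] = (counts[t.className] || 0) + 1;
  }
  return counts;
}

// Made-up sample data standing in for a real detection result.
const sampleThings = [
  { className: 'dog', prob: 0.9, x0: 5, y0: 5, x1: 60, y1: 80 },
  { className: 'person', prob: 0.8, x0: 70, y0: 0, x1: 120, y1: 150 },
  { className: 'dog', prob: 0.3, x0: 130, y0: 40, x1: 160, y1: 70 },
];
console.log(filterThings(sampleThings, ['dog', 'cat'], 0.5));
console.log(countByClass(sampleThings));
```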
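thingnn.identify(videoBuf, attribute, thingInfo)
- videoBuf {Buffer} image buffer
- attribute {Object} image attributes
- - width {Integer} image width
- - height {Integer} image height
- - pixelFormat {Integer} pixel format
- thingInfo {Object} object information, as returned by thingnn.detect
Returns: the specific name of the object. thingnn.identify gives the specific type name for a given thingInfo.
The following image is used as a demonstration: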

[Image: the sample photo dog.png]

const imagecodec = require('imagecodec'); // image decoding module
const thingnn = require('thingnn');

function licplatennHandel(imagePath) {
  const imageInfo = imagecodec.info(imagePath);
  // Decode to RGB so the buffer matches PIX_FMT_RGB24.
  const imageBuf = imagecodec.decode(imagePath, { components: imagecodec.COMPONENTS_RGB }).buffer;
  const videoAttrThingnn = { width: imageInfo.width, height: imageInfo.height, pixelFormat: thingnn.PIX_FMT_RGB24 };
  // Detect objects, then ask for the specific name of each one.
  const thingInfos = thingnn.detect(imageBuf, videoAttrThingnn);
  thingInfos.forEach((thingInfo, index) => {
    const thingName = thingnn.identify(imageBuf, videoAttrThingnn, thingInfo);
    console.log(index, thingInfo.className, thingName);
  });
}

licplatennHandel('/image/dog.png');
// Output:
// [JSRE-CON]0 dog Labrador retriever

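4. ImageCodec

Used on its own, the FaceNN module processes face information from a video stream. Now suppose the scenario is a smart door lock: face information is first enrolled to register authorized users, then the lock's camera captures a video stream, detects faces, and checks them against the enrolled data, opening the lock if verification passes. During enrollment, several face photos must be converted into stream data that the FaceNN module can parse, and the ImageCodec module is well suited to this job.
The ImageCodec module provides encoding and decoding methods for several image formats, including PNG, JPG, BMP, TGA, and HDR. Let's look at how to process image data with ImageCodec.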

const imagecodec = require('imagecodec')

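① Distinguishing images with an alpha channel

When decoding, PNG images that carry an alpha channel need to be handled separately. The decode method of the ImageCodec module accepts an optional second argument:
- imagecodec.decode(path[, opt]):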

const image = imagecodec.decode('./test.png', {components: imagecodec.COMPONENTS_RGB_ALPHA})

The components option of opt can take the following values to handle different image formats:

Definition	Value	Description
imagecodec.COMPONENTS_DEFAULT	0	Use the image's default
imagecodec.COMPONENTS_GREY	1	Single-byte grayscale image
imagecodec.COMPONENTS_GREY_ALPHA	2	Grayscale image with an alpha channel
imagecodec.COMPONENTS_RGB	3	Three-byte RGB image
imagecodec.COMPONENTS_RGB_ALPHA	4	RGB image with an alpha channel

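How can the format of an image be determined? A file's type is not really decided by its extension. Instead, there is something called a magic number: the content of the first byte or first few bytes of a file of a given type, which can be used to determine what kind of image file was passed in: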

const fs = require('fs');
const imagecodec = require('imagecodec');

const picture = fs.readFile('./human.jpg'); // raw file contents
let type = '';
// Build a hex string from the first 4 bytes, zero-padding each byte to two digits.
const arr = (new Uint8Array(picture)).subarray(0, 4);
const headerString = arr.reduce((acc, cur) => acc + (cur < 16 ? '0' : '') + cur.toString(16), '');
switch (headerString) {
  case '89504e47':
    type = 'png';
    break;
  case '47494638':
    type = 'gif';
    break;
  case 'ffd8ffe0':
  case 'ffd8ffe1':
  case 'ffd8ffe2':
    type = 'jpg';
    break;
  default:
    console.log('[mime-type] not png/gif/jpg.');
    break;
}

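The first 4 bytes of the image file are enough to determine its type, so they are extracted and checked. Photos taken and uploaded by users are usually JPG or PNG, so here it is only necessary to determine whether the image carries an alpha channel.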
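The same check can also be written by comparing bytes directly instead of building a hex string, which makes it easy to match JPEG files on their common three-byte prefix (FF D8 FF) regardless of the fourth byte. A minimal sketch; the function name sniffImageType and the extra BMP entry are illustrative additions, not part of the original code:

```javascript
// Byte signatures of a few common image formats.
const SIGNATURES = [
  { type: 'png', bytes: [0x89, 0x50, 0x4e, 0x47] },
  { type: 'gif', bytes: [0x47, 0x49, 0x46, 0x38] }, // "GIF8"
  { type: 'jpg', bytes: [0xff, 0xd8, 0xff] },
  { type: 'bmp', bytes: [0x42, 0x4d] }, // "BM"
];

// Return the detected type, or '' when no signature matches.
// `data` may be a Buffer, a typed array, or a plain array of byte values.
function sniffImageType(data) {
  const head = new Uint8Array(data).subarray(0, 4);
  for (const { type, bytes } of SIGNATURES) {
    if (bytes.length <= head.length && bytes.every((b, i) => head[i] === b)) {
      return type;
    }
  }
  return '';
}

// Hand-written headers standing in for real files.
console.log(sniffImageType([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a]));
console.log(sniffImageType([0xff, 0xd8, 0xff, 0xdb]));
console.log(sniffImageType([0x00, 0x01, 0x02, 0x03]));
```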

② Decoding the image file with decode

Once the image type is known, the file can be decoded with the decode method:

const bitmap = imagecodec.decode(picture, {
  components: type === 'png' ? imagecodec.COMPONENTS_RGB_ALPHA : imagecodec.COMPONENTS_RGB
});

The bitmap returned by decode is an image pixel object with four properties: width, height, components, and buffer, which is exactly what FaceNN needs.
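Since the bitmap already records its component count, the attribute object for FaceNN can be derived from it rather than from the file type. The sketch below shows one way to do that. It rests on several assumptions: the buffer is tightly packed with one byte per component, a grayscale bitmap maps to PIX_FMT_GRAY2RGB24, and the constants object is injected (on EdgerOS this would be the facenn module) so the example runs anywhere. attributeFromBitmap is a hypothetical helper name:

```javascript
// Hypothetical helper: build the attribute object for facenn.detect from a
// decoded bitmap of the form { width, height, components, buffer }.
// `formats` holds the PIX_FMT_* constants.
function attributeFromBitmap(bitmap, formats) {
  const byComponents = {
    1: formats.PIX_FMT_GRAY2RGB24,
    3: formats.PIX_FMT_RGB24,
    4: formats.PIX_FMT_RGBA2RGB24,
  };
  const pixelFormat = byComponents[bitmap.components];
  if (pixelFormat === undefined) {
    throw new Error(`unsupported component count: ${bitmap.components}`);
  }
  // Assumed layout: width * height pixels, one byte per component.
  const expected = bitmap.width * bitmap.height * bitmap.components;
  if (bitmap.buffer.length !== expected) {
    throw new Error(`buffer has ${bitmap.buffer.length} bytes, expected ${expected}`);
  }
  return { width: bitmap.width, height: bitmap.height, pixelFormat };
}

// Placeholder constant values for demonstration; the real values come from facenn.
const mockFormats = { PIX_FMT_RGB24: 'RGB24', PIX_FMT_GRAY2RGB24: 'GRAY2RGB24', PIX_FMT_RGBA2RGB24: 'RGBA2RGB24' };
const mockBitmap = { width: 2, height: 2, components: 4, buffer: new Uint8Array(16) };
console.log(attributeFromBitmap(mockBitmap, mockFormats));
```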

③ Extracting face information from the image

This step is essentially the same as the FaceNN detection shown earlier:

const facenn = require('facenn');

const faces = facenn.detect(bitmap.buffer, {
  width: bitmap.width,
  height: bitmap.height,
  pixelFormat: type === 'png' ? facenn.PIX_FMT_RGBA2RGB24 : facenn.PIX_FMT_RGB24
}, true);

faces now holds the face information produced by detection, which completes the task of extracting face information from an image.

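④ Packaging as a module

This functionality has been packaged as a JSRE package and published to the npm registry. It can be installed and used as follows: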

npm install @edgeros/ofii

const fs = require('fs');
const getFaceFeature = require('@edgeros/ofii');
const imageBuffer = fs.readFile('./human.png');
const keys = getFaceFeature(imageBuffer);
// Returns [] if no face is detected