Implementing Location Fingerprinting (kNN)

Basic Principle

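Location fingerprinting can be treated as a classification or regression problem in which the feature is an RSS vector and the label is a position. A supervised machine learning method trains, from data, a model that maps features to labels. kNN is a very simple supervised learning algorithm that can be used for either classification or regression.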

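Given an online RSS vector s, compute its distance (for example, the Euclidean distance) to each RSS vector {s1, s2, ..., sM} in the fingerprint database, and select the k nearest location fingerprints. A fingerprint is the pairing of one RSS vector with one position.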
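
The search step described above can be sketched in a few lines of NumPy. This is only an illustration: the function name k_nearest_fingerprints and the toy RSS values are assumptions made for this example, not part of the original code or data set.

```python
import numpy as np

def k_nearest_fingerprints(online_rss, fingerprint_rss, k):
    """Return indices and distances of the k fingerprints closest to online_rss."""
    # Euclidean distance between the online vector and every stored RSS vector
    distances = np.sqrt(np.sum((fingerprint_rss - online_rss) ** 2, axis=1))
    # Indices of the k smallest distances, nearest first
    nearest = np.argsort(distances)[:k]
    return nearest, distances[nearest]

# Toy fingerprint database (hypothetical): 4 fingerprints, 3 access points, RSS in dBm
toy_rss = np.array([[-40.0, -70.0, -80.0],
                    [-45.0, -65.0, -82.0],
                    [-80.0, -50.0, -60.0],
                    [-75.0, -55.0, -58.0]])
toy_online = np.array([-42.0, -68.0, -81.0])

idx, dist = k_nearest_fingerprints(toy_online, toy_rss, k=2)
print(idx, dist)
```

Sorting all M distances is the simplest approach and is fine for a sketch; library implementations typically avoid a full sort.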

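For kNN regression, the labels are the x and y coordinates, which can be used in numerical computation: the position coordinates of the k fingerprints are averaged, and the mean is the positioning result.
For kNN classification, the positioning area is divided into a 1 m × 1 m grid. Each cell is treated as a class and is represented by its cell index. The cell indices of the k fingerprints are counted as votes, and the cell with the most votes is the positioning result.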
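
To make the difference between the two variants concrete, here is a small sketch that applies both rules to the same three neighbors. The positions are hypothetical, and indexing cells by flooring the coordinates is an assumption of this sketch (the article's own code further down uses rounding instead).

```python
from collections import Counter
import numpy as np

# Hypothetical positions (in metres) of the k = 3 nearest fingerprints
neighbor_positions = np.array([[1.2, 4.1], [1.8, 4.4], [2.6, 4.3]])

# kNN regression: the estimate is the mean of the neighbors' coordinates
regression_estimate = neighbor_positions.mean(axis=0)

# kNN classification: map each position to a 1 m x 1 m cell, then take a majority vote
neighbor_cells = [tuple(int(c) for c in np.floor(p)) for p in neighbor_positions]
winning_cell, votes = Counter(neighbor_cells).most_common(1)[0]

print(regression_estimate, winning_cell, votes)
```

Regression returns a continuous position, while classification can only return one of the cells present in the training data, so its resolution is limited by the grid size.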

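kNN is a lazy learning method: the procedure above does not need to "learn" from the training data, and at positioning time it simply searches the training data directly. In some toolkits, the kNN training step builds a kd-tree (a data structure), which speeds up the search during online prediction.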
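
As a rough illustration of that point, scikit-learn's NearestNeighbors lets you choose the search strategy through its algorithm parameter. The sketch below uses synthetic data (an assumption, not the article's data set) and checks that a brute-force search and a kd-tree search return the same neighbor distances; only the search cost differs.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
toy_rss = rng.uniform(-90, -30, size=(200, 6))   # synthetic fingerprint RSS vectors
toy_query = rng.uniform(-90, -30, size=(5, 6))   # synthetic online RSS vectors

# "Fitting" only stores the data and, for kd_tree, builds the search structure
brute = NearestNeighbors(n_neighbors=3, algorithm='brute').fit(toy_rss)
kd = NearestNeighbors(n_neighbors=3, algorithm='kd_tree').fit(toy_rss)

brute_dist, brute_idx = brute.kneighbors(toy_query)
kd_dist, kd_idx = kd.kneighbors(toy_query)

print(np.allclose(brute_dist, kd_dist))
```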

Implementation

GitHub repository, including MATLAB and Python versions
Description of the data source: http://www.cnblogs.com/rubbninja/p/6118430.html

Loading the Data

# Load the data
import numpy as np
import scipy.io as scio
offline_data = scio.loadmat('offline_data_random.mat')
online_data = scio.loadmat('online_data.mat')
offline_location, offline_rss = offline_data['offline_location'], offline_data['offline_rss']
trace, rss = online_data['trace'][0:1000, :], online_data['rss'][0:1000, :]
del offline_data
del online_data

# Positioning accuracy: mean Euclidean distance between predicted and true positions
def accuracy(predictions, labels):
    return np.mean(np.sqrt(np.sum((predictions - labels)**2, 1)))
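
Despite its name, accuracy() returns a mean positioning error, so smaller is better. The worked example below repeats the same computation under a different, hypothetical name (mean_position_error) on two made-up points. It assumes coordinates are in centimetres, which is what the division by 100 before printing metres elsewhere in this article suggests.

```python
import numpy as np

def mean_position_error(predictions, labels):
    # Same computation as accuracy(): mean Euclidean distance per row
    return np.mean(np.sqrt(np.sum((predictions - labels) ** 2, axis=1)))

# Two made-up predicted positions and their ground truth, in centimetres
toy_pred = np.array([[0.0, 0.0], [100.0, 100.0]])
toy_true = np.array([[300.0, 400.0], [100.0, 200.0]])

error_cm = mean_position_error(toy_pred, toy_true)  # (500 + 100) / 2 = 300 cm
print(error_cm / 100, "m")
```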

kNN Regression

# kNN regression
from sklearn import neighbors
knn_reg = neighbors.KNeighborsRegressor(40, weights='uniform', metric='euclidean')
predictions = knn_reg.fit(offline_rss, offline_location).predict(rss)
acc = accuracy(predictions, trace)
print("accuracy: ", acc/100, "m")

accuracy:  2.24421479398 m

kNN Classification

# kNN classification: convert coordinates to grid cell labels, then convert the predicted labels back to coordinates
labels = np.round(offline_location[:, 0]/100.0) * 100 + np.round(offline_location[:, 1]/100.0)
from sklearn import neighbors
knn_cls = neighbors.KNeighborsClassifier(n_neighbors=40, weights='uniform', metric='euclidean')
predict_labels = knn_cls.fit(offline_rss, labels).predict(rss)
x = np.floor(predict_labels/100.0)
y = predict_labels - x * 100
predictions = np.column_stack((x, y)) * 100
acc = accuracy(predictions, trace)
print("accuracy: ", acc/100, 'm')

accuracy:  2.73213398632 m
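
The label encoding used in the classification code packs the two grid indices into one number: the x index goes into the hundreds and the y index into the remainder. The round trip below traces this on two hypothetical positions (in centimetres, an assumption consistent with the division by 100). Note that the decoding is only unambiguous while the y index stays below 100.

```python
import numpy as np

# Hypothetical positions in centimetres
toy_location = np.array([[130.0, 460.0], [1980.0, 20.0]])

# Encode: nearest grid index along x becomes the hundreds part, y the remainder
gx = np.round(toy_location[:, 0] / 100.0)
gy = np.round(toy_location[:, 1] / 100.0)
toy_labels = gx * 100 + gy

# Decode: recover the two indices and convert back to centimetres
dx = np.floor(toy_labels / 100.0)
dy = toy_labels - dx * 100
decoded = np.column_stack((dx, dy)) * 100

print(toy_labels, decoded)
```

The decoded positions are cell centers on the 1 m grid, so even a correct classification carries a quantization error of up to about 0.7 m, which may partly explain why the classification error is larger than the regression error here.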

Analysis of the Positioning Algorithm

Adding Data Preprocessing and Cross-Validation

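# Preprocessing: standardize the data (the RSS data is already fairly well behaved, so skipping
# preprocessing should not matter much, and feature selection and the like is not needed either)
# Note: the snippets that follow still fit on the raw offline_rss and rss, not on X_train and X_test
from sklearn.preprocessing import StandardScaler
standard_scaler = StandardScaler().fit(offline_rss)
X_train = standard_scaler.transform(offline_rss)
Y_train = offline_location
X_test = standard_scaler.transform(rss)
Y_test = trace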
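
For readers unfamiliar with StandardScaler, the sketch below shows what the transform amounts to: each feature column is shifted by the training mean and divided by the training standard deviation (population form, ddof=0), and the same training statistics are reused for the test data. The toy arrays are made up for this example.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up RSS values for two access points
toy_train = np.array([[-40.0, -70.0], [-50.0, -60.0], [-60.0, -80.0]])
toy_test = np.array([[-45.0, -65.0]])

scaler = StandardScaler().fit(toy_train)

# Manual equivalent: statistics come from the training data only
mean, std = toy_train.mean(axis=0), toy_train.std(axis=0)  # population std (ddof=0)
manual_test = (toy_test - mean) / std

print(scaler.transform(toy_test), manual_test)
```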

# Cross-validation, used in kNN to choose the best hyperparameter k
from sklearn.model_selection import GridSearchCV
from sklearn import neighbors
parameters = {'n_neighbors': list(range(1, 50))}
knn_reg = neighbors.KNeighborsRegressor(weights='uniform', metric='euclidean')
clf = GridSearchCV(knn_reg, parameters)
clf.fit(offline_rss, offline_location)
scores = clf.cv_results_['mean_test_score']
k = np.argmax(scores) + 1  # k with the highest score; index 0 corresponds to k = 1
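
A point that is easy to miss: np.argmax returns a position in the score array, not the value of k itself, so it has to be mapped back through the parameter list. GridSearchCV also exposes the selected value directly as best_params_. The sketch below demonstrates both on synthetic data; the toy data and the cv=5 setting are assumptions of this example. For a regressor, the default score is R², so higher is better.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
toy_X = rng.uniform(0, 10, size=(120, 2))
toy_y = toy_X + rng.normal(0, 0.5, size=toy_X.shape)  # noisy 2-D targets

grid = {'n_neighbors': list(range(1, 11))}
search = GridSearchCV(KNeighborsRegressor(), grid, cv=5).fit(toy_X, toy_y)

scores = search.cv_results_['mean_test_score']
# Map the argmax position back to the actual parameter value
best_from_scores = grid['n_neighbors'][int(np.argmax(scores))]
best_k = search.best_params_['n_neighbors']

print(best_from_scores, best_k)
```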

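# Plot the score as a function of the hyperparameter k
import matplotlib.pyplot as plt
# IPython/Jupyter magic; remove the next line when running as a plain script
%matplotlib inline
plt.plot(range(1, scores.shape[0] + 1), scores, '-o', linewidth=2.0)
plt.xlabel("k")
plt.ylabel("score")
plt.grid(True)
plt.show()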

[Figure: score versus k]

# kNN regression with the best k
knn_reg = neighbors.KNeighborsRegressor(n_neighbors=k, weights='uniform', metric='euclidean')
predictions = knn_reg.fit(offline_rss, offline_location).predict(rss)
acc = accuracy(predictions, trace)
print("accuracy: ", acc/100, "m")

accuracy:  2.22455511073 m

# Training set size versus accuracy
k = 29
data_num = range(100, 30000, 300)
acc = []
for i in data_num:
    knn_reg = neighbors.KNeighborsRegressor(n_neighbors=k, weights='uniform', metric='euclidean')
    predictions = knn_reg.fit(offline_rss[:i, :], offline_location[:i, :]).predict(rss)
    acc.append(accuracy(predictions, trace) / 100)

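# Plot accuracy as a function of training set size
import matplotlib.pyplot as plt
# IPython/Jupyter magic; remove the next line when running as a plain script
%matplotlib inline
plt.plot(data_num, acc, '-o', linewidth=2.0)
plt.xlabel("data number")
plt.ylabel("accuracy (m)")
plt.grid(True)
plt.show()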

[Figure: accuracy (m) versus data number]

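Author: rubbninja
Source: http://www.cnblogs.com/rubbninja/
About the author: My main research areas are currently machine learning and wireless positioning. Discussion and corrections are welcome!
Copyright notice: Copyright of this article is shared by the author and cnblogs (博客園). Please credit the source when reposting.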
