A Commute in Data: The comma2k19 Dataset
https://arxiv.org/pdf/1812.05752v1
Abstract: comma.ai presents comma2k19, a dataset of over 33 hours of commuting on California’s 280 highway. It consists of 2019 segments, each 1 minute long, recorded on a 20 km section of highway between San Jose and San Francisco, California. The dataset was collected using comma EONs, which have sensors similar to those of any modern smartphone, including a road-facing camera, phone GPS, thermometers and a 9-axis IMU. Additionally, the EON captures raw GNSS measurements and all CAN data sent by the car with a comma grey panda. Laika, an open-source GNSS processing library, is also introduced here. Laika produces 40% more accurate positions than the GNSS module used to collect the raw data. This dataset includes pose (position + orientation) estimates of the recording camera in a global reference frame. These poses were computed with a tightly coupled INS/GNSS/Vision optimizer that relies on data processed by Laika. comma2k19 is ideal for development and validation of tightly coupled GNSS algorithms and mapping algorithms that work with commodity sensors.
I. INTRODUCTION
“Quality over quantity”, or that’s what they say anyway, but is this true in the world of data? The reality is that collecting data with high-end sensors is expensive because dedicated hardware is needed, and this quickly becomes infeasible for larger datasets. Affordable sensors, on the other hand, are ubiquitous and already continuously logging data on billions of devices around the world. The world is a noisy place; some trends require big data to become obvious. To find such trends, algorithms need to be developed to deal with huge amounts of less than perfect data. It is this core idea that motivates comma.ai’s strategy to collect data with scalability as a priority.
The dataset released here, comma2k19, contains data collected by an EON and a grey panda during 2019 minutes of driving sampled from a Californian commute (Figure 1). There are logs of a road-facing camera, a 9-axis IMU, the vehicle’s transmitted CAN messages and raw GNSS measurements. This makes the dataset uniquely valuable for the development of mapping algorithms that require dense data and can use raw GNSS data.
Note: how is the GNSS data actually used?
Conventionally this is done by fusing global position fixes from a GNSS module with other sensors [1], [2], [3], [4]. However, these methods use a pre-computed navigation solution from the GNSS module, i.e. they are loosely coupled. A better approach is to integrate the raw GNSS measurements directly into the mapping optimizer/filter; this is called tight coupling [5], [6], [7]. A tightly coupled GNSS/INS/Vision fusion algorithm is likely the state-of-the-art global pose estimator for a commodity sensor package. The comma2k19 dataset is ideal to develop and validate such an algorithm.
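To make the distinction concrete, here is a minimal sketch of what a tightly coupled update can look like: a single raw pseudorange is used directly as the measurement in an extended Kalman filter, instead of a finished position fix. The four-element state, the 5 m noise value and the function name are illustrative assumptions, not the filter used in the paper.

```python
import numpy as np

def pseudorange_update(x, P, sat_pos, pseudorange, sigma=5.0):
    """One EKF measurement update using a single raw pseudorange.

    x: state [x, y, z, clock_bias], all in meters (ECEF position plus
       receiver clock bias expressed as a distance). Hypothetical layout.
    P: 4x4 state covariance.
    sigma: assumed pseudorange noise standard deviation in meters.
    """
    los = x[:3] - sat_pos                      # vector from satellite to receiver
    rng = np.linalg.norm(los)
    predicted = rng + x[3]                     # h(x): geometric range + clock bias
    H = np.append(los / rng, 1.0).reshape(1, 4)  # Jacobian of h with respect to x
    S = H @ P @ H.T + sigma**2                 # innovation covariance (1x1)
    K = P @ H.T / S                            # Kalman gain (4x1)
    x_new = x + (K * (pseudorange - predicted)).ravel()
    P_new = (np.eye(4) - K @ H) @ P
    return x_new, P_new
```

A loosely coupled filter would instead take the module's position fix as a three-dimensional measurement, which hides the per-satellite geometry and noise from the optimizer.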
We also introduce Laika, an open-source GNSS processing library that was developed and validated using data from the comma2k19 dataset. Laika produces significantly more accurate position fixes than reported by the u-blox M8 GNSS module used for raw data collection.
II. RELATED DATASETS
There are several driving datasets in the literature such as KITTI [8], Cityscapes [9], RoboCar [10], ApolloScapes [11], Berkeley Deep Drive [12], including our previous public dataset [13]. Most of these datasets focus on high quality sensors such as LIDAR or high level computer vision annotations such as semantic segmentation, object detection and imitation learning.
On the other hand, the dataset presented here focuses on consumer grade sensors for reproducibility and scalability. Additionally, all the data collected in this dataset is concentrated in a very small area; this high density ensures repeated observations of the same location across a variety of conditions. Combined with the raw GNSS logs, this makes the dataset better suited to the development of high performance localization and mapping algorithms intended to run on commodity hardware.
III. SAMPLE CHOICE
The data was collected with the EON’s standard logging infrastructure. This specific highway was chosen because it is representative of the commute of millions of Americans who drive similar urban roads across the country every day. Data was only selected from this small portion of road to ensure that it is sufficiently dense for mapping-related experiments. An interesting challenge of this dataset is that the vision data we collected is quite different from other datasets, in that there are fewer good features to track [14] in the video. This makes it particularly interesting for testing vision algorithms that need to work in common highway driving scenarios.
IV. SENSOR SETUP
A. Vehicles
Data was logged on two different setups: a 2016 Honda Civic Touring and a 2017 Toyota RAV4 Platinum.
B. CAN messages
All of the vehicle’s CAN messages are received and logged. Radar, steering angle and wheel speed readings have been parsed in this dataset.
C. Camera data
The road-facing camera data was logged with a Sony IMX298 camera sensor. Video is captured at 20Hz and compressed with H.264. This configuration appears intended to balance the level of detail in the data against storage efficiency.
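As a quick consistency check, the nominal size of the dataset follows from the 2019 one-minute segments and the 20Hz camera rate. This is a back-of-the-envelope sketch that assumes every segment is exactly 60 seconds long and no frames are dropped; the constant names are illustrative.

```python
SEGMENTS = 2019          # one-minute segments in the dataset
SEGMENT_SECONDS = 60     # assumed nominal segment length
CAMERA_HZ = 20           # camera frame rate

total_hours = SEGMENTS * SEGMENT_SECONDS / 3600
total_frames = SEGMENTS * SEGMENT_SECONDS * CAMERA_HZ

print(f"{total_hours:.2f} hours, {total_frames:,} frames")
# prints "33.65 hours, 2,422,800 frames"
```

Both numbers agree with the paper's statements of "over 33 hours" of driving and "over 2 million images".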
D. Raw GNSS
The grey panda contains a u-blox M8 chip connected to a Tallysman TW4721 antenna. Raw data and u-blox’s navigation fix are logged at 10Hz. The raw data includes the Doppler shifts, pseudoranges and carrier phases on the L1 channel for GLONASS and GPS. On the Civic the antenna was mounted inside the car under the windshield; on the RAV4 the antenna was mounted on the roof, resulting in a signal about 15dB stronger.
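To put the 15dB figure in perspective, and assuming it refers to a ratio of received signal power (the paper does not say which quantity was measured), the roof-mounted antenna receives roughly thirty times the power of the one behind the windshield:

```latex
\frac{P_{\text{roof}}}{P_{\text{inside}}} = 10^{15/10} \approx 31.6
```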
E. Other Sensors
Gyro and accelerometer data was collected with an LSM6DS3 at 100Hz and magnetometer data with an AK09911 at 10Hz. The EON also has an integrated WGR7640 GNSS receiver that logs raw GNSS measurements at 1Hz in the same format as the u-blox module. However, at least partly due to the bad antenna, the quality of the WGR7640 data is much lower.
V. LAIKA
Laika is an open source GNSS processing library developed with comma2k19. Laika is similar to projects like [15] and [16], with a strong focus on simplicity, readability and straightforward integration with other optimizers. Laika can be used to compute location fixes from the raw GNSS data that can be significantly more accurate than the live fix computed by the GNSS module used for data collection.
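For readers unfamiliar with how a fix is obtained from raw measurements, the following is a generic textbook sketch of an iterative least-squares position solve from pseudoranges. It is not Laika's API or implementation: the function name is hypothetical, and it assumes satellite positions are already known and that the pseudoranges have been corrected for satellite clock and atmospheric delays.

```python
import numpy as np

def solve_position(sat_positions, pseudoranges, iterations=10):
    """Gauss-Newton fix from corrected pseudoranges.

    sat_positions: (N, 3) satellite ECEF positions in meters, N >= 4.
    pseudoranges:  (N,) measured ranges in meters.
    Returns [x, y, z, clock_bias], all in meters.
    """
    sat_positions = np.asarray(sat_positions, dtype=float)
    pseudoranges = np.asarray(pseudoranges, dtype=float)
    x = np.zeros(4)                                   # start at the Earth's center
    for _ in range(iterations):
        diff = x[:3] - sat_positions                  # (N, 3) satellite-to-receiver vectors
        ranges = np.linalg.norm(diff, axis=1)
        residuals = pseudoranges - (ranges + x[3])    # measured minus predicted
        # Jacobian: unit line-of-sight vectors plus a column of ones for the clock bias
        H = np.hstack([diff / ranges[:, None], np.ones((len(ranges), 1))])
        dx, *_ = np.linalg.lstsq(H, residuals, rcond=None)
        x += dx
    return x
```

A filter-based approach, such as the Kalman filter mentioned in this section, additionally carries the state and its uncertainty from one epoch to the next rather than solving each epoch independently.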
To compute the fixes, raw measurements from the dataset are processed with Laika and then fed into a Kalman filter or another optimizer that estimates positions. To demonstrate the efficacy of Laika we used a simple Kalman filter that only takes GNSS data as input. A lack of ground truth can make it difficult to judge GNSS algorithms, since the true position of the receiver is never known. However, assuming the height of the road is constant within a small area, we can estimate the altitude accuracy of a position fix by checking the variation of estimated road height over small sections (5 m x 5 m) of road. This requires many passes through the same section of road to be reliable; luckily the high density data from this dataset is more than sufficient. Figure 3 shows the altitude error distribution for positions computed with Laika and the positions reported by the u-blox module. Overall the positioning error was reduced by 40%.
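Figure 3: Altitude error distributions for Laika and for the live u-blox baseline algorithm in two scenarios: antenna mounted on the roof (left) and antenna mounted inside the car (right). The distributions compare the accuracy of Laika's processing with the u-blox module's live fix under both mounting conditions.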
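The road-height consistency check described above can be sketched as follows. The paper does not specify how the 5 m x 5 m cells are defined, so this sketch assumes positions have already been converted to a local planar east/north frame in meters; the function name and the `min_passes` threshold are illustrative.

```python
from collections import defaultdict

import numpy as np

def altitude_spread(east, north, altitude, cell_size=5.0, min_passes=3):
    """Estimate vertical error from repeated passes over the same road cells.

    Fixes are binned into square cells; within each cell the road height is
    assumed constant, so deviations from the cell mean are treated as error.
    Returns the RMS deviation in meters over all sufficiently visited cells.
    """
    cells = defaultdict(list)
    for e, n, a in zip(east, north, altitude):
        key = (int(np.floor(e / cell_size)), int(np.floor(n / cell_size)))
        cells[key].append(a)

    deviations = []
    for heights in cells.values():
        if len(heights) >= min_passes:
            heights = np.asarray(heights, dtype=float)
            deviations.extend(heights - heights.mean())
    if not deviations:
        raise ValueError("no cell has enough passes")
    # Deviations from the sample mean slightly underestimate the true spread.
    return float(np.sqrt(np.mean(np.square(deviations))))
```

Running this once on the fixes from each source (for example the module's live fix and a post-processed fix) gives two numbers that can be compared directly, which is the spirit of the comparison in Figure 3.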
VI. GLOBAL POSES
In addition to the raw sensor data, the logs also contain best estimates for global pose (position + orientation) calculated by Mesh3D, comma.ai’s internal post-processing infrastructure that relies on data processed by Laika. They were computed with a tightly coupled GNSS/INS/Vision optimizer, where raw GNSS measurements and ORB [17] features were fed into a Multi-State Constraint Kalman Filter (MSCKF) [18], [19]. Figure 4 shows a snapshot of the resulting 3D path and lane estimates projected into the camera frame.
Note: how can the planned path be projected onto the image data?
The global position in comma2k19 is given in the ECEF [20] frame in meters, and the orientation is given as the quaternion that is needed to rotate from the ECEF frame into the local frame, where the local frame is defined as [forward; right; down] in accordance with NED (North East Down) [21] conventions.
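Given these conventions, projecting a path into the image comes down to expressing the path points in the camera's local frame and applying a pinhole model. The sketch below rests on several assumptions that should be checked against the dataset's own tooling: the quaternion is ordered (w, x, y, z), its rotation matrix maps local-frame vectors into ECEF (so the transpose is used to go from ECEF to local), the camera is aligned with the local frame, and the intrinsics (`focal`, `cx`, `cy`) are placeholder values rather than calibrated ones.

```python
import numpy as np

def quat_to_rot(q):
    """Rotation matrix for a quaternion given as (w, x, y, z)."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])

def project_path(path_ecef, cam_pos_ecef, cam_quat,
                 focal=910.0, cx=582.0, cy=437.0):
    """Project ECEF path points into pixel coordinates of one camera pose.

    Returns an (M, 2) array of (u, v) pixels for points in front of the camera.
    """
    ecef_from_local = quat_to_rot(np.asarray(cam_quat, dtype=float))
    offsets = np.asarray(path_ecef, dtype=float) - np.asarray(cam_pos_ecef, dtype=float)
    # Row-vector form of local = ecef_from_local.T @ offset
    local = offsets @ ecef_from_local
    forward, right, down = local[:, 0], local[:, 1], local[:, 2]
    visible = forward > 1.0            # keep points at least 1 m ahead
    u = focal * right[visible] / forward[visible] + cx
    v = focal * down[visible] / forward[visible] + cy
    return np.column_stack([u, v])
```

If the stored quaternion turns out to have the opposite sense, the only change needed is to use `ecef_from_local.T` in place of `ecef_from_local` in the matrix product.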
To estimate the Root Mean Squared Error (RMSE) of the vertical component of position, we used the same technique as in Section V. By using the observed DOP [22] of each fix we can get a reliable estimate of horizontal errors too. To estimate the accuracy of the provided orientation, we took the Jacobian, $J_{\Delta\theta_i} = \partial R_i / \partial \Delta\theta_i$, of the re-projection error $R$ for the $i$-th observed ORB feature, with respect to orientation errors, $\Delta\theta$. We can then create linear equations to estimate the orientation error by using the Jacobian to linearize around $\Delta\theta_i = 0$. The high level equations used to calculate the RMSE of the orientation, $\hat{\theta}$, are given in (1).
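Equation (1) itself is not reproduced in this text, so what follows is not the paper's formula. It is only a hedged sketch of the general idea: with the reprojection error linearized as R_i ≈ J_i Δθ, the orientation error of a frame can be estimated by least squares over its features, and an RMSE can then be taken over frames. The function names, array shapes and the choice to solve one Δθ per frame are assumptions.

```python
import numpy as np

def orientation_error(jacobians, residuals):
    """Least-squares orientation error for one frame.

    jacobians: (N, 2, 3) Jacobians of each feature's reprojection error
               with respect to a small rotation (radians).
    residuals: (N, 2) measured reprojection errors in pixels.
    Returns the 3-vector dtheta minimizing sum ||R_i - J_i @ dtheta||^2.
    """
    J = np.asarray(jacobians, dtype=float).reshape(-1, 3)   # stack to (2N, 3)
    r = np.asarray(residuals, dtype=float).reshape(-1)      # stack to (2N,)
    dtheta, *_ = np.linalg.lstsq(J, r, rcond=None)
    return dtheta

def orientation_rmse(per_frame_dtheta):
    """Per-axis RMSE in radians over a set of per-frame orientation errors."""
    d = np.asarray(per_frame_dtheta, dtype=float)
    return np.sqrt(np.mean(d ** 2, axis=0))
```

Because feature-detection noise is absorbed into the residuals, an estimate of this kind tends to overstate the true orientation error, which matches the upper-bound argument made in the next paragraph.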
Since most of the measured reprojection error, R, is due to noise in the ORB feature detection, it is fair to assume that (1) is an upper bound of the true orientation errors in our estimates. In Table I we show both estimated position and orientation errors.
Some applications require even more accurate poses than provided above. One can use vision to fine tune the pose estimates with a simple Expectation-Maximization algorithm. First, average the ECEF positions of the matching ORB features across image/pose pairs from different drives; this reduces the error in ORB feature localization. After that, infer the corrected poses by relocalizing the frames against the averaged ORB features. An example of a single iteration of this type of correction is shown in Figure 5.
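The two alternating steps can be illustrated with a deliberately simplified toy model. It assumes each drive's triangulated feature positions are shifted by an unknown per-drive translation, and it "relocalizes" by estimating only that translation; a real implementation would re-solve the full pose of each frame by minimizing reprojection error. The function name and array layout are illustrative.

```python
import numpy as np

def refine_once(observations):
    """One averaging/relocalization iteration on matched features.

    observations: (D, K, 3) ECEF positions of K matched features as
                  triangulated from each of D drives.
    Returns (landmarks, corrections):
      landmarks:   (K, 3) feature positions averaged across drives.
      corrections: (D, 3) translation to add to each drive's positions.
    """
    obs = np.asarray(observations, dtype=float)
    # Step 1: average each feature across drives to reduce localization error.
    landmarks = obs.mean(axis=0)
    # Step 2: translation-only relocalization of each drive against the average.
    corrections = (landmarks - obs).mean(axis=1)
    return landmarks, corrections

# Usage: three drives observing four features, each drive with its own offset.
rng = np.random.default_rng(1)
true_points = rng.normal(scale=50.0, size=(4, 3))
offsets = np.array([[1.0, 0.0, -2.0], [-0.5, 0.3, 1.0], [0.2, -0.8, 0.4]])
obs = true_points[None, :, :] + offsets[:, None, :]
landmarks, corrections = refine_once(obs)
corrected = obs + corrections[:, None, :]
```

In this noise-free toy case all drives agree after one iteration, and the remaining error is the mean of the per-drive offsets, which is why more drives over the same road help.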
VII. CONCLUSION
We presented comma2k19, a state-of-the-art dataset for developing and validating tightly coupled GNSS algorithms, fused pose estimators and mapping algorithms that are intended to work with commodity sensors. Using comma2k19 we built and open sourced Laika, a raw GNSS processing library that reduced positioning errors by 40% compared to the baseline algorithm shipped with the u-blox sensor used for data collection. comma2k19 also includes camera poses in a global reference frame for the over 2 million images provided. We believe the most interesting future research direction using comma2k19 and Laika is developing novel vision and sensor fusion based mapping algorithms for HD maps on highways with sparse features to track.
ACKNOWLEDGEMENT
We’d like to thank Eddie Samuels, Nicholas McCoy, George Hotz, Greg Hogan, Viviane Ford and Willem Melching for setting up the hardware and infrastructure that enabled this research.