
KITTI 3D Object Detection Dataset (Label Format Parsing, Point Cloud to Image, Point Cloud to BEV)


This article explains how to understand and use the KITTI dataset for 3D object detection: the basics of the dataset, how to download it, how to parse the label format, visualizing 3D boxes, projecting point clouds onto images, drawing BEV (bird's-eye-view) maps, and more, with implementation code throughout.

Contents

1. Visualizing KITTI 3D boxes

2. The KITTI 3D dataset

3. Downloading the dataset

4. Label format

5. Calibration parameters explained

6. Point cloud --> image projection

7. Image --> point cloud projection

8. Visualizing 2D and 3D results on the image

9. Point cloud 3D results --> BEV results (coordinate transforms)

10. Drawing the BEV map

11. Drawing 2D boxes on the BEV map

12. Complete project code


1. Visualizing KITTI 3D boxes

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

2. The KITTI 3D dataset

An overview of the KITTI 3D dataset:

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

The entire KITTI dataset was collected in Karlsruhe, Germany, over roughly six hours of driving. The data released on the KITTI website is about 25% of everything collected, with segments related to the test set removed. By scene it is divided into five categories: road, city, residential, campus, and pedestrian.

Sensor configuration:

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

Sensor mounting positions:

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

3. Downloading the dataset

The KITTI Vision Benchmark Suite (cvlibs.net)

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

Downloading requires a registered account (alternatively, the data can be fetched from a Baidu Netdisk mirror). The file formats are as follows:

Images: xxx.png

Point clouds: xxx.bin (point clouds are stored as binary .bin files)

Calibration: xxx.txt (each file contains the projection matrices of all cameras, the rectification matrix, the LiDAR-to-camera transform, and the IMU-to-LiDAR transform)

Labels: xxx.txt (each line contains the class, truncation, occlusion, observation angle, the 2D box's top-left and bottom-right pixel coordinates, the 3D object dimensions (height, width, length), the 3D object center (x, y, z) in camera coordinates, and the yaw angle rotation_y; a confidence score is appended only in detection result files)

4. Label format

Example label: Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 1.89 0.48 1.20 1.84 1.47 8.41 0.01

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

At this point, this video may also help:

An introduction to Nuscenes, KITTI, and other open-source BEV datasets

5. Calibration parameters explained

Next, let's look at the calibration parameters:

P0-P3: the 3×4 projection matrices of the four cameras; indices 0-3 correspond to the left grayscale, right grayscale, left color, and right color camera respectively.

R0_rect: the 3×3 rectifying rotation matrix of the reference camera (Cam 0); although it is often loosely described as a distortion correction, it is a pure rotation that aligns all cameras to a common rectified frame.

Tr_velo_to_cam: the 3×4 transform from the LiDAR coordinate system to the Cam 0 (reference camera) coordinate system.

Tr_imu_to_velo: the 3×4 transform from the IMU coordinate system to the LiDAR coordinate system.

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

6. Point cloud --> image projection

Given a point cloud, how do we project it onto the image? At heart this is a coordinate-system transformation, and the pipeline runs as follows (see the sketch below):

  1. Start from a point (x, y, z) in the LiDAR coordinate system.
  2. Transform from the LiDAR frame to the camera frame using the Tr_velo_to_cam matrix from the calibration file, giving camera coordinates (x1, y1, z1).
  3. Apply the rectification rotation R0_rect from the calibration file, giving rectified camera coordinates (x2, y2, z2).
  4. Project from the rectified camera frame to the image plane using the camera projection matrix (P2 for the left color image used here), giving pixel coordinates (u, v).

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

An example result:

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

Interface code:

'''
Project the point cloud onto the image
'''
def show_lidar_on_image(pc_velo, img, calib, img_width, img_height):
    ''' Project LiDAR points to image '''
    imgfov_pc_velo, pts_2d, fov_inds = get_lidar_in_image_fov(pc_velo,
        calib, 0, 0, img_width, img_height, True)
    imgfov_pts_2d = pts_2d[fov_inds,:]
    imgfov_pc_rect = calib.project_velo_to_rect(imgfov_pc_velo)

    import matplotlib.pyplot as plt
    cmap = plt.cm.get_cmap('hsv', 256)
    cmap = np.array([cmap(i) for i in range(256)])[:,:3]*255

    for i in range(imgfov_pts_2d.shape[0]):
        depth = imgfov_pc_rect[i,2]
        # color by depth; clamp the index so very near points cannot run past the 256-entry colormap
        color = cmap[min(int(640.0/depth), 255), :]
        cv2.circle(img, (int(np.round(imgfov_pts_2d[i,0])),
            int(np.round(imgfov_pts_2d[i,1]))),
            2, color=tuple(color), thickness=-1)
    Image.fromarray(img).save('save_output/lidar_on_image.png')
    Image.fromarray(img).show()
    return img

Core code:

'''
Filter the point cloud, keeping only points that fall inside the image field of view
'''
def get_lidar_in_image_fov(pc_velo, calib, xmin, ymin, xmax, ymax,
                           return_more=False, clip_distance=2.0):
    ''' Filter lidar points, keep those in image FOV '''
    pts_2d = calib.project_velo_to_image(pc_velo)
    fov_inds = (pts_2d[:,0]<xmax) & (pts_2d[:,0]>=xmin) & \
        (pts_2d[:,1]<ymax) & (pts_2d[:,1]>=ymin)
    # also drop points closer than clip_distance in front of the sensor
    fov_inds = fov_inds & (pc_velo[:,0]>clip_distance)
    imgfov_pc_velo = pc_velo[fov_inds,:]
    if return_more:
        return imgfov_pc_velo, pts_2d, fov_inds
    else:
        return imgfov_pc_velo

7. Image --> point cloud projection

Given RGB image pixels, how do we map them back into the point cloud? This is again a coordinate-system transformation, the exact inverse of the one above (see the sketch after the list):

  1. Start from pixel coordinates (u, v) in the image coordinate system, together with a depth value (a pixel alone only defines a ray).
  2. Back-project from the image to the rectified camera frame using the inverse of the projection matrix (the camera intrinsics), giving camera coordinates (x, y, z).
  3. Undo the rectification with the inverse of the R0_rect matrix, giving reference camera coordinates (x1, y1, z1).
  4. Transform from the camera frame back to the LiDAR frame with the inverse of the Tr_velo_to_cam matrix, giving LiDAR coordinates (x2, y2, z2).
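A minimal sketch of the inverse chain (it mirrors Calibration.project_image_to_velo in the full code below):

import numpy as np

def pixels_to_velo(uv_depth, calib):
    ''' uv_depth: nx3 array of (u, v, depth). Returns nx3 LiDAR points (minimal sketch). '''
    # step 2: image -> rectified camera coordinates, inverting the pinhole projection
    x = (uv_depth[:, 0] - calib.c_u) * uv_depth[:, 2] / calib.f_u + calib.b_x
    y = (uv_depth[:, 1] - calib.c_v) * uv_depth[:, 2] / calib.f_v + calib.b_y
    pts_rect = np.stack([x, y, uv_depth[:, 2]], axis=1)
    # step 3: undo the rectification with the inverse of R0_rect
    pts_ref = pts_rect @ np.linalg.inv(calib.R0).T
    # step 4: camera -> LiDAR with the inverse rigid transform (C2V)
    pts_ref_h = np.hstack([pts_ref, np.ones((pts_ref.shape[0], 1))])
    return pts_ref_h @ calib.C2V.T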

8. Visualizing 2D and 3D results on the image

First, the 2D box results:

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

And the 3D box results:

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

Interface code:

'''
Draw 2D and 3D boxes on the image
'''
def show_image_with_boxes(img, objects, calib, show3d=True):
    img1 = np.copy(img) # for 2d bbox
    img2 = np.copy(img) # for 3d bbox
    for obj in objects:
        if obj.type=='DontCare':continue
        cv2.rectangle(img1, (int(obj.xmin),int(obj.ymin)), (int(obj.xmax),int(obj.ymax)), (0,255,0), 2) # draw the 2D box
        box3d_pts_2d, box3d_pts_3d = utils.compute_box_3d(obj, calib.P) # 3D box in image coords (8x2) and camera coords (8x3)
        if box3d_pts_2d is not None: # compute_box_3d returns None for objects behind the camera
            img2 = utils.draw_projected_box3d(img2, box3d_pts_2d) # draw the 3D box on the image
    if show3d:
        Image.fromarray(img2).save('save_output/image_with_3Dboxes.png')
        Image.fromarray(img2).show()
    else:
        Image.fromarray(img1).save('save_output/image_with_2Dboxes.png')
        Image.fromarray(img1).show()

Core code:

def compute_box_3d(obj, P):
    '''
    Compute the projection of an object's 3D bounding box onto the image plane.
    Input:  obj is one parsed object label, P is the camera projection matrix.
    Output: corners_2d, the 8 box corners projected onto the image (8x2);
            corners_3d, the 8 box corners in camera coordinates (8x3).
    '''
    # rotation about the Y axis that turns the box from the object frame into the camera frame;
    # obj.ry is the object's yaw angle
    R = roty(obj.ry)

    # the object's actual length, width, and height
    l = obj.l
    w = obj.w
    h = obj.h

    # the 8 corners of the 3D box relative to the object center; these define the box shape
    x_corners = [l/2,l/2,-l/2,-l/2,l/2,l/2,-l/2,-l/2]
    y_corners = [0,0,0,0,-h,-h,-h,-h]
    z_corners = [w/2,-w/2,-w/2,w/2,w/2,-w/2,-w/2,w/2]

    # 1. rotate the corners from the object frame into the camera frame
    corners_3d = np.dot(R, np.vstack([x_corners,y_corners,z_corners]))
    # translate the box to the object's location
    corners_3d[0,:] = corners_3d[0,:] + obj.t[0]
    corners_3d[1,:] = corners_3d[1,:] + obj.t[1]
    corners_3d[2,:] = corners_3d[2,:] + obj.t[2]

    # 2. only objects in front of the camera are drawn: if any corner has depth (Z)
    # below 0.1 the object is behind the camera and corners_2d is returned as None
    if np.any(corners_3d[2,:]<0.1):
        corners_2d = None
        return corners_2d, np.transpose(corners_3d)

    # 3. project the camera-frame corners onto the image plane to get their 2D pixel coordinates
    corners_2d = project_to_image(np.transpose(corners_3d), P)
    return corners_2d, np.transpose(corners_3d)


def draw_projected_box3d(image, qs, color=(0,60,255), thickness=2):
    ''' Draw a 3d bounding box on the image.
        qs: (8,2) array of the box's 8 projected vertices, in the following order:
            1 -------- 0
           /|         /|
          2 -------- 3 .
          | |        | |
          . 5 -------- 4
          |/         |/
          6 -------- 7
    '''
    qs = qs.astype(np.int32) # cast the vertices to integers so they can be drawn

    # four iterations; each draws one edge of the first face, the parallel edge of the
    # opposite face, and one vertical edge connecting them
    for k in range(0,4):
        # Ref: http://docs.enthought.com/mayavi/mayavi/auto/mlab_helper_functions.html

        # edge of the first face (vertices 0-3)
        i,j = k,(k+1)%4
        cv2.line(image, (qs[i,0],qs[i,1]), (qs[j,0],qs[j,1]), color, thickness)

        # parallel edge of the opposite face (vertices 4-7)
        i,j = k+4,(k+1)%4 + 4
        cv2.line(image, (qs[i,0],qs[i,1]), (qs[j,0],qs[j,1]), color, thickness)

        # vertical edge connecting the two faces
        i,j = k,k+4
        cv2.line(image, (qs[i,0],qs[i,1]), (qs[j,0],qs[j,1]), color, thickness)
    return image
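A short usage sketch tying the two together (assuming obj is a single Object3d label and calib a Calibration instance, both defined in the full project code below):

corners_2d, corners_3d = compute_box_3d(obj, calib.P)  # project the label's 3D box
if corners_2d is not None:                             # None when the object is behind the camera
    img = draw_projected_box3d(img, corners_2d)        # draw the 12 box edges on the image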

9. Point cloud 3D results --> BEV results (coordinate transforms)

The approach:

  1. Read the point cloud. It is stored as an n×4 array, where n is the number of points in the file and the 4 values are (x, y, z, intensity), i.e. the point's 3D coordinates plus its reflectance.
  2. Keep only the first two columns, giving the planar coordinates (x, y).
  3. Plot the (x, y) points as a scatter plot (see the sketch below).

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

The resulting BEV map looks like this:

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

10. Drawing the BEV map

Example BEV image:

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

Core code:


'''
Visualize the BEV (bird's-eye-view) map
'''
def show_lidar_topview(pc_velo, objects, calib):
    # 1 - set the BEV extent
    side_range = (-30, 30)  # lateral (left/right) range in meters
    fwd_range = (0, 80)  # forward range in meters

    x_points = pc_velo[:, 0]
    y_points = pc_velo[:, 1]
    z_points = pc_velo[:, 2]

    # 2 - keep only the points inside that region
    f_filt = np.logical_and(x_points > fwd_range[0], x_points < fwd_range[1])
    s_filt = np.logical_and(y_points > side_range[0], y_points < side_range[1])
    mask = np.logical_and(f_filt, s_filt)
    indices = np.argwhere(mask).flatten()
    x_points = x_points[indices]
    y_points = y_points[indices]
    z_points = z_points[indices]

    # how many meters each BEV pixel represents
    res = 0.1
    # 3-1 map LiDAR coordinates to BEV pixel coordinates
    x_img = (-y_points / res).astype(np.int32)
    y_img = (-x_points / res).astype(np.int32)
    # 3-2 shift the origin so all pixel coordinates are non-negative
    x_img -= int(np.floor(side_range[0]) / res)
    y_img += int(np.floor(fwd_range[1]) / res)
    print(x_img.min(), x_img.max(), y_img.min(), y_img.max())

    # 4 - fill the pixel values: map the point heights (Z) to intensities
    height_range = (-3, 1.0)
    pixel_value = np.clip(a=z_points, a_max=height_range[1], a_min=height_range[0])

    def scale_to_255(a, min, max, dtype=np.uint8):
        return ((a - min) / float(max - min) * 255).astype(dtype)

    pixel_value = scale_to_255(pixel_value, height_range[0], height_range[1])

    # create the image array
    x_max = 1 + int((side_range[1] - side_range[0]) / res)
    y_max = 1 + int((fwd_range[1] - fwd_range[0]) / res)
    im = np.zeros([y_max, x_max], dtype=np.uint8)
    im[y_img, x_img] = pixel_value

    im2 = Image.fromarray(im)
    im2.save('save_output/BEV.png')
    im2.show()
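As a worked example of the pixel mapping (in exact arithmetic): with res = 0.1 m, a LiDAR point at (x = 20 m, y = 5 m) maps to x_img = -5/0.1 + 30/0.1 = 250 and y_img = -20/0.1 + 80/0.1 = 600, so forward distance (x) grows toward the top of the image and leftward offset (positive y) toward the left edge.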

11. Drawing 2D boxes on the BEV map

Boxes drawn in the BEV view; the visualization:

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

Interface code:

'''
Project the 3D label boxes onto the BEV map
'''
def show_lidar_topview_with_boxes(img, objects, calib):
    def bbox3d(obj):
        box3d_pts_2d, box3d_pts_3d = utils.compute_box_3d(obj, calib.P) # 3D box in image coords and camera coords
        box3d_pts_3d_velo = calib.project_rect_to_velo(box3d_pts_3d) # transform the box from camera coords to LiDAR coords
        return box3d_pts_3d_velo # nx3 points

    boxes3d = [bbox3d(obj) for obj in objects if obj.type == "Car"]
    gt = np.array(boxes3d)
    im2 = utils.draw_box3d_label_on_bev(img, gt, scores=None, thickness=1) # take the LiDAR-frame 3D corners, keep x and y, and draw them on the BEV plane
    im2 = Image.fromarray(im2)
    im2.save('save_output/BEV with boxes.png')
    im2.show()

Core code:

# BEV map parameters
side_range = (-30, 30)  # lateral (left/right) range in meters
fwd_range = (0, 80)  # forward range in meters
res = 0.1  # resolution: 0.1 m per pixel

def compute_box_3d(obj, P):
    '''
    Compute the projection of an object's 3D bounding box onto the image plane.
    Input:  obj is one parsed object label, P is the camera projection matrix.
    Output: corners_2d, the 8 box corners projected onto the image (8x2);
            corners_3d, the 8 box corners in camera coordinates (8x3).
    '''
    # rotation about the Y axis that turns the box from the object frame into the camera frame;
    # obj.ry is the object's yaw angle
    R = roty(obj.ry)

    # the object's actual length, width, and height
    l = obj.l
    w = obj.w
    h = obj.h

    # the 8 corners of the 3D box relative to the object center; these define the box shape
    x_corners = [l/2,l/2,-l/2,-l/2,l/2,l/2,-l/2,-l/2]
    y_corners = [0,0,0,0,-h,-h,-h,-h]
    z_corners = [w/2,-w/2,-w/2,w/2,w/2,-w/2,-w/2,w/2]

    # 1. rotate the corners from the object frame into the camera frame
    corners_3d = np.dot(R, np.vstack([x_corners,y_corners,z_corners]))
    # translate the box to the object's location
    corners_3d[0,:] = corners_3d[0,:] + obj.t[0]
    corners_3d[1,:] = corners_3d[1,:] + obj.t[1]
    corners_3d[2,:] = corners_3d[2,:] + obj.t[2]

    # 2. only objects in front of the camera are drawn: if any corner has depth (Z)
    # below 0.1 the object is behind the camera and corners_2d is returned as None
    if np.any(corners_3d[2,:]<0.1):
        corners_2d = None
        return corners_2d, np.transpose(corners_3d)

    # 3. project the camera-frame corners onto the image plane to get their 2D pixel coordinates
    corners_2d = project_to_image(np.transpose(corners_3d), P)
    return corners_2d, np.transpose(corners_3d)
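Note that this helper only produces the box corners; the actual BEV drawing is done by draw_box3d_label_on_bev and lidar_to_top_coords in kitti_util.py (listed in full in the next section), which take each box's four bottom corners in LiDAR coordinates, map them to BEV pixels with the same res/side_range/fwd_range parameters, and connect them with cv2.line.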



12. Complete project code

Project layout:

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

kitti_vis_main.py (main entry point)


from __future__ import print_function

import os
import sys
import cv2
import os.path
from PIL import Image
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
ROOT_DIR = os.path.dirname(BASE_DIR)
sys.path.append(BASE_DIR)
sys.path.append(os.path.join(ROOT_DIR, 'mayavi'))
from kitti_object import *


def visualization():
    dataset = kitti_object(os.path.join(ROOT_DIR, 'Kitti_3D_Vis/dataset/object'))   # Linux-style path
    data_idx = 10               # which sample to visualize

    # 1 - load the labels
    objects = dataset.get_label_objects(data_idx)
    print("There are %d objects." % len(objects))

    # 2 - load the image
    img = dataset.get_image(data_idx)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img_height, img_width, img_channel = img.shape

    # 3 - load the point cloud
    pc_velo = dataset.get_lidar(data_idx)[:,0:3] # (x, y, z)

    # 4 - load the calibration
    calib = dataset.get_calibration(data_idx)

    # 5 - show the raw image
    print(' ------------ show raw image -------- ')
    Image.fromarray(img).show()

    # 6 - draw 2D boxes on the image
    print(' ------------ show image with 2D bounding box -------- ')
    show_image_with_boxes(img, objects, calib, False)

    # 7 - draw 3D boxes on the image
    print(' ------------ show image with 3D bounding box ------- ')
    show_image_with_boxes(img, objects, calib, True)

    # 8 - project the point cloud onto the image
    print(' ----------- LiDAR points projected to image plane -- ')
    show_lidar_on_image(pc_velo, img, calib, img_width, img_height)

    # 9 - draw the BEV map
    print('------------------ BEV of LiDAR points -----------------------------')
    show_lidar_topview(pc_velo, objects, calib)

    # 10 - draw 2D boxes on the BEV map
    print('--------------- BEV of LiDAR points with boxes ---------------------')
    img1 = cv2.imread('save_output/BEV.png')
    img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)
    show_lidar_topview_with_boxes(img1, objects, calib)
    
    
if __name__=='__main__':
    visualization()

kitti_util.py

from __future__ import print_function

import numpy as np
import cv2
from PIL import Image
import os

# BEV map parameters
side_range = (-30, 30)  # lateral (left/right) range in meters
fwd_range = (0, 80)  # forward range in meters
res = 0.1  # resolution: 0.1 m per pixel

def compute_box_3d(obj, P):
    '''
    Compute the projection of an object's 3D bounding box onto the image plane.
    Input:  obj is one parsed object label, P is the camera projection matrix.
    Output: corners_2d, the 8 box corners projected onto the image (8x2);
            corners_3d, the 8 box corners in camera coordinates (8x3).
    '''
    # rotation about the Y axis that turns the box from the object frame into the camera frame;
    # obj.ry is the object's yaw angle
    R = roty(obj.ry)

    # the object's actual length, width, and height
    l = obj.l
    w = obj.w
    h = obj.h

    # the 8 corners of the 3D box relative to the object center; these define the box shape
    x_corners = [l/2,l/2,-l/2,-l/2,l/2,l/2,-l/2,-l/2]
    y_corners = [0,0,0,0,-h,-h,-h,-h]
    z_corners = [w/2,-w/2,-w/2,w/2,w/2,-w/2,-w/2,w/2]

    # 1. rotate the corners from the object frame into the camera frame
    corners_3d = np.dot(R, np.vstack([x_corners,y_corners,z_corners]))
    # translate the box to the object's location
    corners_3d[0,:] = corners_3d[0,:] + obj.t[0]
    corners_3d[1,:] = corners_3d[1,:] + obj.t[1]
    corners_3d[2,:] = corners_3d[2,:] + obj.t[2]

    # 2. only objects in front of the camera are drawn: if any corner has depth (Z)
    # below 0.1 the object is behind the camera and corners_2d is returned as None
    if np.any(corners_3d[2,:]<0.1):
        corners_2d = None
        return corners_2d, np.transpose(corners_3d)

    # 3. project the camera-frame corners onto the image plane to get their 2D pixel coordinates
    corners_2d = project_to_image(np.transpose(corners_3d), P)
    return corners_2d, np.transpose(corners_3d)


def project_to_image(pts_3d, P):
    '''
    Project camera-frame 3D points onto the image plane to get their 2D pixel coordinates.
    Input:  pts_3d, an nx3 matrix of 3D points (one per row); P, the 3x4 camera projection matrix.
    Output: an nx2 matrix of the corresponding 2D image points.

      P(3x4) dot pts_3d_extended(4xn) = projected_pts_2d(3xn)  => normalize projected_pts_2d(2xn)
      <=> pts_3d_extended(nx4) dot P'(4x3) = projected_pts_2d(nx3)   => normalize projected_pts_2d(nx2)
    '''
    n = pts_3d.shape[0] # number of 3D points
    pts_3d_extend = np.hstack((pts_3d, np.ones((n,1)))) # extend each point to homogeneous coordinates by appending a 1, giving an nx4 matrix
    pts_2d = np.dot(pts_3d_extend, np.transpose(P)) # multiply by P to get an nx3 matrix of projected coordinates [x, y, z] per point
    pts_2d[:,0] /= pts_2d[:,2] # divide x by the depth z to get the image x coordinate
    pts_2d[:,1] /= pts_2d[:,2] # divide y by the depth z to get the image y coordinate
    return pts_2d[:,0:2] # return the nx2 matrix of image coordinates



def draw_projected_box3d(image, qs, color=(0,60,255), thickness=2):
    ''' Draw a 3d bounding box on the image.
        qs: (8,2) array of the box's 8 projected vertices, in the following order:
            1 -------- 0
           /|         /|
          2 -------- 3 .
          | |        | |
          . 5 -------- 4
          |/         |/
          6 -------- 7
    '''
    qs = qs.astype(np.int32) # cast the vertices to integers so they can be drawn

    # four iterations; each draws one edge of the first face, the parallel edge of the
    # opposite face, and one vertical edge connecting them
    for k in range(0,4):
        # Ref: http://docs.enthought.com/mayavi/mayavi/auto/mlab_helper_functions.html

        # edge of the first face (vertices 0-3)
        i,j = k,(k+1)%4
        cv2.line(image, (qs[i,0],qs[i,1]), (qs[j,0],qs[j,1]), color, thickness)

        # parallel edge of the opposite face (vertices 4-7)
        i,j = k+4,(k+1)%4 + 4
        cv2.line(image, (qs[i,0],qs[i,1]), (qs[j,0],qs[j,1]), color, thickness)

        # vertical edge connecting the two faces
        i,j = k,k+4
        cv2.line(image, (qs[i,0],qs[i,1]), (qs[j,0],qs[j,1]), color, thickness)
    return image

def draw_box3d_label_on_bev(image, boxes3d, thickness=1, scores=None):
    ''' Draw ground-truth boxes on the BEV image, color-coded by forward distance. '''
    img = image.copy()
    for b in boxes3d:
        # the four bottom-face corners of the box in LiDAR coordinates
        xs = [b[k, 0] for k in range(4)]
        ys = [b[k, 1] for k in range(4)]
        if all(x < 30 for x in xs):
            color = (0, 255, 0)  # green: near boxes
        elif all(x < 50 for x in xs):
            color = (255, 0, 0)  # red: mid-range boxes
        else:
            color = (0, 0, 255)  # blue: far boxes
        corners = [lidar_to_top_coords(x, y) for x, y in zip(xs, ys)]
        for k in range(4):
            u0, v0 = corners[k]
            u1, v1 = corners[(k + 1) % 4]
            cv2.line(img, (u0, v0), (u1, v1), color, thickness, cv2.LINE_AA)
    return img

def draw_box3d_predict_on_bev(image, boxes3d, thickness=1, scores=None):
    ''' Draw predicted boxes on the BEV image in white. '''
    img = image.copy()
    color = (255, 255, 255)  # white
    for b in boxes3d:
        # the four bottom-face corners of the box in LiDAR coordinates
        corners = [lidar_to_top_coords(b[k, 0], b[k, 1]) for k in range(4)]
        for k in range(4):
            u0, v0 = corners[k]
            u1, v1 = corners[(k + 1) % 4]
            cv2.line(img, (u0, v0), (u1, v1), color, thickness, cv2.LINE_AA)
    return img

def lidar_to_top_coords(x, y, z=None):
    ''' Map LiDAR (x, y) in meters to BEV pixel coordinates (u, v). '''
    u = int(-y / res) - int(np.floor(side_range[0]) / res)  # column, shifted so it is non-negative
    v = int(-x / res) + int(np.floor(fwd_range[1]) / res)   # row, shifted so forward distance grows upward
    return u, v


# Parse the label data
class Object3d(object):
    ''' 3d object label '''
    def __init__(self, label_file_line):
        data = label_file_line.split(' ')
        data[1:] = [float(x) for x in data[1:]]

        # extract label, truncation, occlusion
        self.type = data[0] # 'Car', 'Pedestrian', ...
        self.truncation = data[1] # truncated pixel ratio [0..1]
        self.occlusion = int(data[2]) # 0=visible, 1=partly occluded, 2=fully occluded, 3=unknown
        self.alpha = data[3] # object observation angle [-pi..pi]

        # extract 2d bounding box in 0-based coordinates
        self.xmin = data[4] # left
        self.ymin = data[5] # top
        self.xmax = data[6] # right
        self.ymax = data[7] # bottom
        self.box2d = np.array([self.xmin,self.ymin,self.xmax,self.ymax])
        
        # extract 3d bounding box information
        self.h = data[8] # box height
        self.w = data[9] # box width
        self.l = data[10] # box length (in meters)
        self.t = (data[11],data[12],data[13]) # location (x,y,z) in camera coord.
        self.ry = data[14] # yaw angle (around Y-axis in camera coordinates) [-pi..pi]

    def print_object(self):
        print('Type, truncation, occlusion, alpha: %s, %d, %d, %f' % \
            (self.type, self.truncation, self.occlusion, self.alpha))
        print('2d bbox (x0,y0,x1,y1): %f, %f, %f, %f' % \
            (self.xmin, self.ymin, self.xmax, self.ymax))
        print('3d bbox h,w,l: %f, %f, %f' % \
            (self.h, self.w, self.l))
        print('3d bbox location, ry: (%f, %f, %f), %f' % \
            (self.t[0],self.t[1],self.t[2],self.ry))


class Calibration(object):
    ''' Calibration matrices and utils
        3d XYZ in <label>.txt are in rect camera coord.
        2d box xy are in image2 coord
        Points in <lidar>.bin are in Velodyne coord.

        y_image2 = P^2_rect * x_rect
        y_image2 = P^2_rect * R0_rect * Tr_velo_to_cam * x_velo
        x_ref = Tr_velo_to_cam * x_velo
        x_rect = R0_rect * x_ref

        P^2_rect = [f^2_u,  0,      c^2_u,  -f^2_u b^2_x;
                    0,      f^2_v,  c^2_v,  -f^2_v b^2_y;
                    0,      0,      1,      0]
                 = K * [1|t]

        image2 coord:
         ----> x-axis (u)
        |
        |
        v y-axis (v)

        velodyne coord:
        front x, left y, up z

        rect/ref camera coord:
        right x, down y, front z

        Ref (KITTI paper): http://www.cvlibs.net/publications/Geiger2013IJRR.pdf

        TODO(rqi): do matrix multiplication only once for each projection.
    '''
    def __init__(self, calib_filepath, from_video=False):
        if from_video:
            calibs = self.read_calib_from_video(calib_filepath)
        else:
            calibs = self.read_calib_file(calib_filepath)
        # Projection matrix from rect camera coord to image2 coord
        self.P = calibs['P2'] 
        self.P = np.reshape(self.P, [3,4])
        # Rigid transform from Velodyne coord to reference camera coord
        self.V2C = calibs['Tr_velo_to_cam']
        self.V2C = np.reshape(self.V2C, [3,4])
        self.C2V = inverse_rigid_trans(self.V2C)
        # Rotation from reference camera coord to rect camera coord
        self.R0 = calibs['R0_rect']
        self.R0 = np.reshape(self.R0,[3,3])

        # Camera intrinsics and extrinsics
        self.c_u = self.P[0,2]
        self.c_v = self.P[1,2]
        self.f_u = self.P[0,0]
        self.f_v = self.P[1,1]
        self.b_x = self.P[0,3]/(-self.f_u) # relative 
        self.b_y = self.P[1,3]/(-self.f_v)

    def read_calib_file(self, filepath):
        ''' Read in a calibration file and parse into a dictionary.'''
        data = {}
        with open(filepath, 'r') as f:
            for line in f.readlines():
                line = line.rstrip()
                if len(line)==0: continue
                key, value = line.split(':', 1)
                # The only non-float values in these files are dates, which
                # we don't care about anyway
                try:
                    data[key] = np.array([float(x) for x in value.split()])
                except ValueError:
                    pass

        return data
    
    def read_calib_from_video(self, calib_root_dir):
        ''' Read calibration for camera 2 from video calib files.
            there are calib_cam_to_cam and calib_velo_to_cam under the calib_root_dir
        '''
        data = {}
        cam2cam = self.read_calib_file(os.path.join(calib_root_dir, 'calib_cam_to_cam.txt'))
        velo2cam = self.read_calib_file(os.path.join(calib_root_dir, 'calib_velo_to_cam.txt'))
        Tr_velo_to_cam = np.zeros((3,4))
        Tr_velo_to_cam[0:3,0:3] = np.reshape(velo2cam['R'], [3,3])
        Tr_velo_to_cam[:,3] = velo2cam['T']
        data['Tr_velo_to_cam'] = np.reshape(Tr_velo_to_cam, [12])
        data['R0_rect'] = cam2cam['R_rect_00']
        data['P2'] = cam2cam['P_rect_02']
        return data

    def cart2hom(self, pts_3d):
        ''' Input: nx3 points in Cartesian
            Output: nx4 points in Homogeneous by appending 1
        '''
        n = pts_3d.shape[0]
        pts_3d_hom = np.hstack((pts_3d, np.ones((n,1))))
        return pts_3d_hom
 
    # =========================== 
    # ------- 3d to 3d ---------- 
    # =========================== 
    def project_velo_to_ref(self, pts_3d_velo):
        pts_3d_velo = self.cart2hom(pts_3d_velo) # nx4
        return np.dot(pts_3d_velo, np.transpose(self.V2C))

    def project_ref_to_velo(self, pts_3d_ref):
        pts_3d_ref = self.cart2hom(pts_3d_ref) # nx4
        return np.dot(pts_3d_ref, np.transpose(self.C2V))

    def project_rect_to_ref(self, pts_3d_rect):
        ''' Input and Output are nx3 points '''
        return np.transpose(np.dot(np.linalg.inv(self.R0), np.transpose(pts_3d_rect)))
    
    def project_ref_to_rect(self, pts_3d_ref):
        ''' Input and Output are nx3 points '''
        return np.transpose(np.dot(self.R0, np.transpose(pts_3d_ref)))
 
    def project_rect_to_velo(self, pts_3d_rect):
        ''' Input: nx3 points in rect camera coord.
            Output: nx3 points in velodyne coord.
        ''' 
        pts_3d_ref = self.project_rect_to_ref(pts_3d_rect)
        return self.project_ref_to_velo(pts_3d_ref)

    def project_velo_to_rect(self, pts_3d_velo):
        pts_3d_ref = self.project_velo_to_ref(pts_3d_velo)
        return self.project_ref_to_rect(pts_3d_ref)
    
    def corners3d_to_img_boxes(self, corners3d):
        """
        :param corners3d: (N, 8, 3) corners in rect coordinate
        :return: boxes: (None, 4) [x1, y1, x2, y2] in rgb coordinate
        :return: boxes_corner: (None, 8) [xi, yi] in rgb coordinate
        """
        sample_num = corners3d.shape[0]
        corners3d_hom = np.concatenate((corners3d, np.ones((sample_num, 8, 1))), axis=2)  # (N, 8, 4)

        img_pts = np.matmul(corners3d_hom, self.P.T)  # (N, 8, 3)

        x, y = img_pts[:, :, 0] / img_pts[:, :, 2], img_pts[:, :, 1] / img_pts[:, :, 2]
        x1, y1 = np.min(x, axis=1), np.min(y, axis=1)
        x2, y2 = np.max(x, axis=1), np.max(y, axis=1)

        boxes = np.concatenate((x1.reshape(-1, 1), y1.reshape(-1, 1), x2.reshape(-1, 1), y2.reshape(-1, 1)), axis=1)
        boxes_corner = np.concatenate((x.reshape(-1, 8, 1), y.reshape(-1, 8, 1)), axis=2)

        return boxes, boxes_corner


    # =========================== 
    # ------- 3d to 2d ---------- 
    # =========================== 
    def project_rect_to_image(self, pts_3d_rect):
        ''' Input: nx3 points in rect camera coord.
            Output: nx2 points in image2 coord.
        '''
        pts_3d_rect = self.cart2hom(pts_3d_rect)
        pts_2d = np.dot(pts_3d_rect, np.transpose(self.P)) # nx3
        pts_2d[:,0] /= pts_2d[:,2]
        pts_2d[:,1] /= pts_2d[:,2]
        return pts_2d[:,0:2]
    
    def project_velo_to_image(self, pts_3d_velo):
        ''' Input: nx3 points in velodyne coord.
            Output: nx2 points in image2 coord.
        '''
        pts_3d_rect = self.project_velo_to_rect(pts_3d_velo)
        return self.project_rect_to_image(pts_3d_rect)

    # =========================== 
    # ------- 2d to 3d ---------- 
    # =========================== 
    def project_image_to_rect(self, uv_depth):
        ''' Input: nx3 first two channels are uv, 3rd channel
                   is depth in rect camera coord.
            Output: nx3 points in rect camera coord.
        '''
        n = uv_depth.shape[0]
        x = ((uv_depth[:,0]-self.c_u)*uv_depth[:,2])/self.f_u + self.b_x
        y = ((uv_depth[:,1]-self.c_v)*uv_depth[:,2])/self.f_v + self.b_y
        pts_3d_rect = np.zeros((n,3))
        pts_3d_rect[:,0] = x
        pts_3d_rect[:,1] = y
        pts_3d_rect[:,2] = uv_depth[:,2]
        return pts_3d_rect

    def project_image_to_velo(self, uv_depth):
        pts_3d_rect = self.project_image_to_rect(uv_depth)
        return self.project_rect_to_velo(pts_3d_rect)

 
def rotx(t):
    ''' 3D Rotation about the x-axis. '''
    c = np.cos(t)
    s = np.sin(t)
    return np.array([[1,  0,  0],
                     [0,  c, -s],
                     [0,  s,  c]])


def roty(t):
    ''' Rotation about the y-axis. '''
    c = np.cos(t)
    s = np.sin(t)
    return np.array([[c,  0,  s],
                     [0,  1,  0],
                     [-s, 0,  c]])


def rotz(t):
    ''' Rotation about the z-axis. '''
    c = np.cos(t)
    s = np.sin(t)
    return np.array([[c, -s,  0],
                     [s,  c,  0],
                     [0,  0,  1]])


def transform_from_rot_trans(R, t):
    ''' Transformation matrix from rotation matrix and translation vector. '''
    R = R.reshape(3, 3)
    t = t.reshape(3, 1)
    return np.vstack((np.hstack([R, t]), [0, 0, 0, 1]))


def inverse_rigid_trans(Tr):
    ''' Invert a rigid body transform matrix (3x4 as [R|t])
        [R'|-R't; 0|1]
    '''
    inv_Tr = np.zeros_like(Tr) # 3x4
    inv_Tr[0:3,0:3] = np.transpose(Tr[0:3,0:3])
    inv_Tr[0:3,3] = np.dot(-np.transpose(Tr[0:3,0:3]), Tr[0:3,3])
    return inv_Tr

def read_label(label_filename):
    lines = [line.rstrip() for line in open(label_filename)]
    objects = [Object3d(line) for line in lines]
    return objects

def load_image(img_filename):
    return cv2.imread(img_filename)

def load_velo_scan(velo_filename):
    scan = np.fromfile(velo_filename, dtype=np.float32)
    scan = scan.reshape((-1, 4))
    return scan

kitti_object.py



from __future__ import print_function

import os
import sys
import cv2
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
ROOT_DIR = os.path.dirname(BASE_DIR)
sys.path.append(os.path.join(ROOT_DIR, 'mayavi'))
import kitti_util as utils


'''
Draw 2D and 3D boxes on the image
'''
def show_image_with_boxes(img, objects, calib, show3d=True):
    img1 = np.copy(img) # for 2d bbox
    img2 = np.copy(img) # for 3d bbox
    for obj in objects:
        if obj.type=='DontCare':continue
        cv2.rectangle(img1, (int(obj.xmin),int(obj.ymin)), (int(obj.xmax),int(obj.ymax)), (0,255,0), 2) # draw the 2D box
        box3d_pts_2d, box3d_pts_3d = utils.compute_box_3d(obj, calib.P) # 3D box in image coords (8x2) and camera coords (8x3)
        if box3d_pts_2d is not None: # compute_box_3d returns None for objects behind the camera
            img2 = utils.draw_projected_box3d(img2, box3d_pts_2d) # draw the 3D box on the image
    if show3d:
        Image.fromarray(img2).save('save_output/image_with_3Dboxes.png')
        Image.fromarray(img2).show()
    else:
        Image.fromarray(img1).save('save_output/image_with_2Dboxes.png')
        Image.fromarray(img1).show()


'''
Visualize the BEV (bird's-eye-view) map
'''
def show_lidar_topview(pc_velo, objects, calib):
    # 1 - set the BEV extent
    side_range = (-30, 30)  # lateral (left/right) range in meters
    fwd_range = (0, 80)  # forward range in meters

    x_points = pc_velo[:, 0]
    y_points = pc_velo[:, 1]
    z_points = pc_velo[:, 2]

    # 2 - keep only the points inside that region
    f_filt = np.logical_and(x_points > fwd_range[0], x_points < fwd_range[1])
    s_filt = np.logical_and(y_points > side_range[0], y_points < side_range[1])
    mask = np.logical_and(f_filt, s_filt)
    indices = np.argwhere(mask).flatten()
    x_points = x_points[indices]
    y_points = y_points[indices]
    z_points = z_points[indices]

    # how many meters each BEV pixel represents
    res = 0.1
    # 3-1 map LiDAR coordinates to BEV pixel coordinates
    x_img = (-y_points / res).astype(np.int32)
    y_img = (-x_points / res).astype(np.int32)
    # 3-2 shift the origin so all pixel coordinates are non-negative
    x_img -= int(np.floor(side_range[0]) / res)
    y_img += int(np.floor(fwd_range[1]) / res)
    print(x_img.min(), x_img.max(), y_img.min(), y_img.max())

    # 4 - fill the pixel values: map the point heights (Z) to intensities
    height_range = (-3, 1.0)
    pixel_value = np.clip(a=z_points, a_max=height_range[1], a_min=height_range[0])

    def scale_to_255(a, min, max, dtype=np.uint8):
        return ((a - min) / float(max - min) * 255).astype(dtype)

    pixel_value = scale_to_255(pixel_value, height_range[0], height_range[1])

    # create the image array
    x_max = 1 + int((side_range[1] - side_range[0]) / res)
    y_max = 1 + int((fwd_range[1] - fwd_range[0]) / res)
    im = np.zeros([y_max, x_max], dtype=np.uint8)
    im[y_img, x_img] = pixel_value

    im2 = Image.fromarray(im)
    im2.save('save_output/BEV.png')
    im2.show()


'''
Project the 3D label boxes onto the BEV map
'''
def show_lidar_topview_with_boxes(img, objects, calib):
    def bbox3d(obj):
        box3d_pts_2d, box3d_pts_3d = utils.compute_box_3d(obj, calib.P) # 3D box in image coords and camera coords
        box3d_pts_3d_velo = calib.project_rect_to_velo(box3d_pts_3d) # transform the box from camera coords to LiDAR coords
        return box3d_pts_3d_velo # nx3 points

    boxes3d = [bbox3d(obj) for obj in objects if obj.type == "Car"]
    gt = np.array(boxes3d)
    im2 = utils.draw_box3d_label_on_bev(img, gt, scores=None, thickness=1) # take the LiDAR-frame 3D corners, keep x and y, and draw them on the BEV plane
    im2 = Image.fromarray(im2)
    im2.save('save_output/BEV with boxes.png')
    im2.show()


'''
Project the point cloud onto the image
'''
def show_lidar_on_image(pc_velo, img, calib, img_width, img_height):
    ''' Project LiDAR points to image '''
    imgfov_pc_velo, pts_2d, fov_inds = get_lidar_in_image_fov(pc_velo,
        calib, 0, 0, img_width, img_height, True)
    imgfov_pts_2d = pts_2d[fov_inds,:]
    imgfov_pc_rect = calib.project_velo_to_rect(imgfov_pc_velo)

    import matplotlib.pyplot as plt
    cmap = plt.cm.get_cmap('hsv', 256)
    cmap = np.array([cmap(i) for i in range(256)])[:,:3]*255

    for i in range(imgfov_pts_2d.shape[0]):
        depth = imgfov_pc_rect[i,2]
        # color by depth; clamp the index so very near points cannot run past the 256-entry colormap
        color = cmap[min(int(640.0/depth), 255), :]
        cv2.circle(img, (int(np.round(imgfov_pts_2d[i,0])),
            int(np.round(imgfov_pts_2d[i,1]))),
            2, color=tuple(color), thickness=-1)
    Image.fromarray(img).save('save_output/lidar_on_image.png')
    Image.fromarray(img).show()
    return img


'''
Filter the point cloud, keeping only points that fall inside the image field of view
'''
def get_lidar_in_image_fov(pc_velo, calib, xmin, ymin, xmax, ymax,
                           return_more=False, clip_distance=2.0):
    ''' Filter lidar points, keep those in image FOV '''
    pts_2d = calib.project_velo_to_image(pc_velo)
    fov_inds = (pts_2d[:,0]<xmax) & (pts_2d[:,0]>=xmin) & \
        (pts_2d[:,1]<ymax) & (pts_2d[:,1]>=ymin)
    # also drop points closer than clip_distance in front of the sensor
    fov_inds = fov_inds & (pc_velo[:,0]>clip_distance)
    imgfov_pc_velo = pc_velo[fov_inds,:]
    if return_more:
        return imgfov_pc_velo, pts_2d, fov_inds
    else:
        return imgfov_pc_velo

'''
KITTI dataset loader
'''
class kitti_object(object):
    '''Load and parse object data into a usable format.'''
    
    def __init__(self, root_dir, split='training'):
        '''root_dir contains training and testing folders'''
        self.root_dir = root_dir
        self.split = split
        self.split_dir = os.path.join(root_dir, split)

        if split == 'training':
            self.num_samples = 7481
        elif split == 'testing':
            self.num_samples = 7518
        else:
            print('Unknown split: %s' % (split))
            exit(-1)

        self.image_dir = os.path.join(self.split_dir, 'image_2')
        self.calib_dir = os.path.join(self.split_dir, 'calib')
        self.lidar_dir = os.path.join(self.split_dir, 'velodyne')
        self.label_dir = os.path.join(self.split_dir, 'label_2')

    def __len__(self):
        return self.num_samples

    def get_image(self, idx):
        assert(idx<self.num_samples) 
        img_filename = os.path.join(self.image_dir, '%06d.png'%(idx))
        return utils.load_image(img_filename)

    def get_lidar(self, idx): 
        assert(idx<self.num_samples) 
        lidar_filename = os.path.join(self.lidar_dir, '%06d.bin'%(idx))
        return utils.load_velo_scan(lidar_filename)

    def get_calibration(self, idx):
        assert(idx<self.num_samples) 
        calib_filename = os.path.join(self.calib_dir, '%06d.txt'%(idx))
        return utils.Calibration(calib_filename)

    def get_label_objects(self, idx):
        assert(idx<self.num_samples and self.split=='training') 
        label_filename = os.path.join(self.label_dir, '%06d.txt'%(idx))
        return utils.read_label(label_filename)
    
    def get_depth_map(self, idx):
        pass

    def get_top_down(self, idx):
        pass
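To run the demo (a sketch; the layout is inferred from the loader above: the dataset root passed to kitti_object must contain training/image_2, training/calib, training/velodyne, and training/label_2, and the save_output/ directory for the results must exist beforehand):

mkdir save_output
python kitti_vis_main.py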

After running kitti_vis_main.py, five result images are saved to save_output/:

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

3D目標(biāo)檢測數(shù)據(jù)集 KITTI(標(biāo)簽格式解析、點(diǎn)云轉(zhuǎn)圖像、點(diǎn)云轉(zhuǎn)BEV),3D目標(biāo)檢測,人工智能,3D目標(biāo)jianc,數(shù)據(jù)集,KITTI,坐標(biāo)轉(zhuǎn)換

Future posts will cover more 3D datasets such as Nuscenes and Waymo.
