百度Apollo视觉算法解析-视觉车道线检测后处理

今天好好吃饭了吗

已于 2023-06-11 22:27:02 修改

阅读量2.2k

点赞数 1

CC 4.0 BY-SA版权

文章标签：算法 Apollo感知算法

于 2023-06-11 11:59:08 首次发布

本文链接：https://blog.csdn.net/qq_32376851/article/details/131151826

文章介绍了自动驾驶中车道线检测的过程，包括模型推理得到点坐标，通过softmax和阈值处理得到前景点标签，使用线扫描的连通域分析进行车道线聚类，DisjointSet数据结构处理集合合并，以及3D处理中的点转换和线性拟合。该流程涉及图像处理、计算机视觉和传感器融合技术。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

1. Process2D：模型推理和后处理得到点坐标

模型预测的是每个点的车道线的四种类型的conf（表示车道线4种位置关系属性），后处理首先做softmax，然后过threshold，得到前景点的标签。

bool DenselineLanePostprocessor::LocateLanelinePointSet(
    const CameraFrame* frame) {}  // 入口

得到车道线前景mask，分类四类

CalLaneMap(output_data, lane_map_width_, lane_map_height_, &lane_map_);

其中

for (int x = 0; x < width; ++x) {
	//遍历获取点的类型max index
	score_channel[0] = output_data[channel0_pos + x];
	score_channel[1] = output_data[channel1_pos + x];
	score_channel[2] = output_data[channel2_pos + x];
	score_channel[3] = output_data[channel3_pos + x];
	// Utilize softmax to get the probability
	float sum_score = 0.0f;
	for (int i = 0; i < 4; i++) {
	score_channel[i] = static_cast<float>(exp(score_channel[i]));
	sum_score += score_channel[i];
	}
	for (int i = 0; i < 4; i++) {
	score_channel[i] /= sum_score;
	}
	// 1: ego-lane; 2: adj-left lane; 3: adj-right lane
	//  find the score with max lane map
	int max_channel_idx = 0;

	//下面代码获取保留lane的点到左右点的距离
	int pixel_pos = row_start + x;
	(*lane_map)[pixel_pos] = 1;
	
	float dist_left = sigmoid(output_data[channel5_pos + x]);
	float dist_right = sigmoid(output_data[channel6_pos + x]);
	lane_output_[pixel_pos] = dist_left;
	lane_output_[out_dim + pixel_pos] = dist_right;
	lane_output_[out_dim * 2 + pixel_pos] = max_score;

下一步进行点的联通域分析，同种类型的车道线聚类到一起

bool flag =
      FindCC(lane_map_, lane_map_width_, lane_map_height_, roi, &lane_ccs_);

这里的实现算法是线扫描的方式，基本原理如下:

它基于扫描线的思想，通过对像素进行遍历和分析，将具有相同特征或连通关系的像素归为同一个连通区域（Connected Component）。

算法原理如下：

初始化一个空的 Connected Component 列表 cc。
遍历图像中的每个像素，并按照顺序从左到右、从上到下进行扫描。
对于当前扫描到的像素，判断其是否为前景像素（具有特定特征）。如果是前景像素，则进行下一步处理；如果是背景像素，则继续扫描下一个像素。
对于前景像素，判断其与相邻像素的关系，主要包括左侧像素和上方像素。根据连通关系的定义，如果左侧像素和上方像素都是背景像素，则当前像素属于一个新的连通区域，创建一个新的 Connected Component，并将当前像素添加到该连通区域中。
如果左侧像素和上方像素中至少有一个属于前景像素，那么当前像素属于与之相连的连通区域。在 cc 列表中查找与左侧像素和上方像素相连的连通区域，并将当前像素合并到该连通区域中。
继续扫描下一个像素，重复步骤 4 和步骤 5，直到遍历完所有像素。
完成图像扫描后，cc 列表中存储了所有的连通区域及其像素。
扫描线算法基于像素的连通关系进行区域分割，适用于处理图像中连通区域较少的情况。它具有简单、高效的特点，并且可以对连通区域进行实时检测和分析。在计算机视觉和图像处理领域，扫描线算法常被用于目标检测、图像分割、边缘提取等应用中。

其中的重点在于**在遍历过程中查询像素点的前背景属性，以及联通属性，遇到新的标签就新增联通域，当联通域不产生冲突时，合并像素到指定集合，产生冲突的时候，进行已有的冲突集合的合并。**如下

if (left_val == 0 && up_val == 0) {
          // current pixel is foreground and has no connected neighbors
          frame_label[cur_idx] = labels.Add();
          root_map.push_back(-1);
        } else if (left_val != 0 && up_val == 0) {
          // current pixel is foreground and has left neighbor connected
          frame_label[cur_idx] = frame_label[left_idx];
        } else if (left_val == 0 && up_val != 0) {
          // current pixel is foreground and has up neighbor connect
          frame_label[cur_idx] = frame_label[up_idx];
        } else {
          // current pixel is foreground
          // and is connected to left and up neighbors
          frame_label[cur_idx] = (frame_label[left_idx] > frame_label[up_idx])
                                     ? frame_label[up_idx]
                                     : frame_label[left_idx];
          labels.Unite(frame_label[left_idx], frame_label[up_idx]);  //集合合并
        }

这里的labels是DisjointSet的数据结构，Disjoint Set 是一种用于处理集合合并与查找问题的数据结构。在该数据结构中，每个元素都被分配到一个集合中，并且每个集合都有一个代表元素（也称为根元素）。用一个total-pixels长度的vector记录所有集合的跟节点值，理论上最多有num_total_pixels多个集合，如果多个点属于同一个集合，可以给他们都赋值为代表同一个集合的最小的标记值也就是这个root值，root用来标记此集合。那么当不同的集合，需要合并的时候，首先查找他们是否属于同一个集合，也就是找到root值做对比，如果相同那么不用执行操作，如果不同就将大的root设置为小的root值，从而也就实现了集合的合并。后续再做一次遍历，把所有属于同一个root值的点放到一个vector中，也就拿到了最终的集合结果。

下面是apollo中disjointset实现中的集合合并的逻辑，

// point the x and y to smaller root of the two
void DisjointSet::Unite(int x, int y) {
  if (x == y) {
    return;
  }
  int x_root = Find(x);
  int y_root = Find(y);
  if (x_root == y_root) {
    return;
  } else if (x_root < y_root) {
    disjoint_array_[y_root] = x_root;
  } else {
    disjoint_array_[x_root] = y_root;
  }
  --subset_num_;
}

得到前景的车道线集合后，根据前景车道线点数量的筛选（少于阈值点数的车道线删掉），然后排序。

然后进行车道线laneindex的判定，（判定是egolane还是其他lane）

//  4. Classify the lane_ccs_ type
  std::vector<LaneType> ccs_pos_type(select_lane_ccs_.size(),
                                     LaneType::UNKNOWN_LANE);
  flag = ClassifyLaneCCsPosTypeInImage(select_lane_ccs_, &ccs_pos_type);
  if (!flag) {
    return false;
  }

根据得到的laneindex结果，得到最终的车道线lanleine点

//  5. get the lane line points
  std::vector<std::vector<LanePointInfo>> lane_map_group_point_set(4);
  InferPointSetFromLaneCenter(select_lane_ccs_, ccs_pos_type,
                              &lane_map_group_point_set);

  //  6. convert to the original image
  Convert2OriginalCoord(lane_map_group_point_set, &image_group_point_set_);
  return true;

online相机标定

这里五年没有更新代码了，不打算看了。

Process3D：3D处理和线拟合操作

首先转换点到车体坐标系

利用内参+相机标定得到的height和pitch信息转换点到车体坐标。

bool ImagePoint2Camera(const base::Point2DF& img_point, float pitch_angle,
                       float camera_ground_height,
                       const Eigen::Matrix3f& intrinsic_params_inverse,
                       Eigen::Vector3d* camera_point) {
  Eigen::MatrixXf pt_m(3, 1);
  pt_m << img_point.x, img_point.y, 1;
  const Eigen::MatrixXf& org_camera_point = intrinsic_params_inverse * pt_m;
  //
  float cos_pitch = static_cast<float>(cos(pitch_angle));
  float sin_pitch = static_cast<float>(sin(pitch_angle));
  Eigen::Matrix3f pitch_matrix;
  pitch_matrix << 1, 0, 0, 0, cos_pitch, sin_pitch, 0, -sin_pitch, cos_pitch;
  const Eigen::MatrixXf& rotate_point = pitch_matrix * org_camera_point;
  if (fabs(rotate_point(1, 0)) < lane_eps_value) {
    return false;
  }
  float scale = camera_ground_height / rotate_point(1, 0);
  (*camera_point)(0) = scale * org_camera_point(0, 0);
  (*camera_point)(1) = scale * org_camera_point(1, 0);
  (*camera_point)(2) = scale * org_camera_point(2, 0);
  return true;
}