2020-Dynamic ORB SLAM [Reading Seminar] [Open Source]

[toc]

# 2020-Dynamic ORB SLAM

**About**:

- Paper: [Dynamic ORB SLAM](https://github.com/bijustin/YOLO-DynaSLAM/blob/master/dynamic-orb-slam.pdf) (W2020 EECS 568: Mobile Robotics)
- Demo video (YouTube): https://www.youtube.com/watch?v=sazkFVVjITQ
- FAST Dynamic ORB SLAM: https://github.com/bijustin/Fast-Dynamic-ORB-SLAM
- YOLO Dynamic ORB SLAM: https://github.com/bijustin/YOLO-DynaSLAM

**Problem definition:** Simultaneous Localization and Mapping (SLAM) is often a necessity for mobile robots. Many state-of-the-art SLAM systems of the past decade show impressive performance, but they often make **the assumption that the environment is static**. This tends to be a naïve assumption: it does not hold for robots that must operate among humans, such as autonomous vehicles.

**Abstract**: This work presents **two** visual SLAM systems built on ORB SLAM2 that are more robust to dynamic environments while operating in real time:

- one using deep learning: YOLO Dynamic ORB SLAM, which uses YOLOv3 to segment out objects that may be dynamic;
- one without deep learning: FAST Dynamic ORB SLAM, which combines optical flow and epipolar constraints to prevent dynamic objects from impacting the SLAM system, without the need for special hardware such as a GPU.

**Idea**: Filter out dynamic ORB features before they are processed by the local map tracking module. This is done through a combination of estimating the fundamental matrix between frames, enforcing epipolar constraints, and outlier detection on optical flow magnitudes.

**Disadvantage**: Unfortunately, it is difficult to filter slowly moving features. No quantitative comparison (ATE/RPE) is given; the discussion section of the paper covers this in detail.

**YOLO Dynamic ORB SLAM**:

YOLO Dynamic ORB_SLAM is a visual SLAM system that is robust in dynamic scenarios for the RGB-D configuration.

rgbd_tum_yolo.cc:

```cpp
yolov3::yolov3Segment* yolo;
if (argc == 6 || argc == 7)
{
    cout << "Loading Yolov3 net. This could take a while..." << endl;
    yolo = new yolov3::yolov3Segment();
    cout << "Yolov3 net loaded!" << endl;
}
...
if (argc == 6 || argc == 7)
    mask = yolo->Segmentation(imRGB);
```

yolo.h:

```cpp
class yolov3Segment
{
private:
    float confThreshold = 0.5; // Confidence threshold
    float nmsThreshold = 0.4;  // Non-maximum suppression threshold
    int inpWidth = 640;        // Width of network's input image
    int inpHeight = 480;       // Height of network's input image
    vector<string> classes;
    Net net;

public:
    yolov3Segment();
    ~yolov3Segment();
    cv::Mat Segmentation(cv::Mat &image);
    // Remove the bounding boxes with low confidence using non-maxima suppression
    cv::Mat postprocess(Mat& frame, const vector<Mat>& out);
    // Get the names of the output layers
    vector<String> getOutputsNames(const Net& net);
};
```
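`Segmentation` returns a CV_8U mask that is 1 on presumably static pixels and 0 inside the (dilated) boxes of potentially dynamic classes. Below is a minimal sketch of how such a mask could be consumed to drop ORB keypoints on dynamic objects; `filterKeypointsByMask` is a hypothetical helper for illustration, not part of the repo:

```cpp
#include <opencv2/core.hpp>
#include <vector>

// Hypothetical helper (not in YOLO-DynaSLAM): keep only keypoints whose
// pixel is marked static (mask == 1) in the segmentation mask.
std::vector<cv::KeyPoint> filterKeypointsByMask(const std::vector<cv::KeyPoint> &kps,
                                                const cv::Mat &mask) // CV_8U, 1 = static
{
    std::vector<cv::KeyPoint> kept;
    kept.reserve(kps.size());
    for (const cv::KeyPoint &kp : kps) {
        int x = cvRound(kp.pt.x), y = cvRound(kp.pt.y);
        if (x >= 0 && x < mask.cols && y >= 0 && y < mask.rows &&
            mask.at<uchar>(y, x) == 1)
            kept.push_back(kp); // keypoint lies on a presumably static pixel
    }
    return kept;
}
```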
yolo.cc:

```cpp
yolov3Segment::yolov3Segment()
{
    // Load names of classes
    string classesFile = "src/yolo/coco.names";
    ifstream ifs(classesFile.c_str());
    string line;
    while (getline(ifs, line)) classes.push_back(line);

    // Give the configuration and weight files for the model
    String modelConfiguration = "src/yolo/yolov3.cfg";
    String modelWeights = "src/yolo/yolov3.weights";

    // Load the network
    net = readNetFromDarknet(modelConfiguration, modelWeights);
    net.setPreferableBackend(DNN_BACKEND_OPENCV);
    net.setPreferableTarget(DNN_TARGET_CPU);
}

cv::Mat yolov3Segment::Segmentation(cv::Mat &image)
{
    cv::Mat blob;
    // Create a 4D blob from a frame.
    blobFromImage(image, blob, 1/255.0, cvSize(this->inpWidth, this->inpHeight), Scalar(0,0,0), true, false);

    // Sets the input to the network
    this->net.setInput(blob);

    // Runs the forward pass to get output of the output layers
    vector<Mat> outs;
    this->net.forward(outs, this->getOutputsNames(this->net));

    // Dilate the detection mask so a margin around each box is masked as well
    int dilation_size = 15;
    cv::Mat kernel = getStructuringElement(cv::MORPH_ELLIPSE,
                                           cv::Size(2*dilation_size + 1, 2*dilation_size + 1),
                                           cv::Point(dilation_size, dilation_size));
    cv::Mat mask = cv::Mat::ones(480, 640, CV_8U);
    cv::Mat maskyolo = this->postprocess(image, outs);
    cv::Mat maskyolodil = maskyolo.clone();
    cv::dilate(maskyolo, maskyolodil, kernel);
    mask = mask - maskyolodil; // 1 = static pixel, 0 = potentially dynamic
    return mask;
}

// Remove the bounding boxes with low confidence using non-maxima suppression
cv::Mat yolov3Segment::postprocess(Mat& frame, const vector<Mat>& outs)
{
    vector<int> classIds;
    vector<float> confidences;
    vector<Rect> boxes;

    for (size_t i = 0; i < outs.size(); ++i)
    {
        // Scan through all the bounding boxes output from the network and keep only the
        // ones with high confidence scores. Assign the box's class label as the class
        // with the highest score for the box.
        float* data = (float*)outs[i].data;
        for (int j = 0; j < outs[i].rows; ++j, data += outs[i].cols)
        {
            Mat scores = outs[i].row(j).colRange(5, outs[i].cols);
            Point classIdPoint;
            double confidence;
            // Get the value and location of the maximum score
            minMaxLoc(scores, 0, &confidence, 0, &classIdPoint);
            if (confidence > this->confThreshold)
            {
                int centerX = (int)(data[0] * frame.cols);
                int centerY = (int)(data[1] * frame.rows);
                int width  = (int)(data[2] * frame.cols);
                int height = (int)(data[3] * frame.rows);
                int left = centerX - width / 2;
                int top  = centerY - height / 2;
                classIds.push_back(classIdPoint.x);
                confidences.push_back((float)confidence);
                boxes.push_back(Rect(left, top, width, height));
            }
        }
    }

    cv::Mat mask = cv::Mat::zeros(480, 640, CV_8U);

    // Perform non maximum suppression to eliminate redundant overlapping boxes with
    // lower confidences
    vector<int> indices;
    NMSBoxes(boxes, confidences, this->confThreshold, this->nmsThreshold, indices);
    for (size_t i = 0; i < indices.size(); ++i)
    {
        int idx = indices[i];
        Rect box = boxes[idx];
        string c = this->classes[classIds[idx]];
        // Only classes that are likely to move are masked out
        if (c == "person" || c == "car" || c == "bicycle" || c == "motorcycle" || c == "bus" || c == "truck")
        {
            // Mark the central half of the box width as dynamic
            for (int x = max(0, box.x + box.width / 4); x < box.x + 3*box.width/4 && x < 640; ++x)
                for (int y = max(0, box.y); y < box.y + box.height && y < 480; ++y)
                    mask.at<uchar>(y, x) = 1;
        }
    }
    return mask;
}

// Get the names of the output layers
vector<String> yolov3Segment::getOutputsNames(const Net& net)
{
    static vector<String> names;
    if (names.empty())
    {
        // Get the indices of the output layers, i.e. the layers with unconnected outputs
        vector<int> outLayers = this->net.getUnconnectedOutLayers();
        // Get the names of all the layers in the network
        vector<String> layersNames = this->net.getLayerNames();
        // Get the names of the output layers in names
        names.resize(outLayers.size());
        for (size_t i = 0; i < outLayers.size(); ++i)
            names[i] = layersNames[outLayers[i] - 1];
    }
    return names;
}
```
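A minimal usage sketch for the segmenter (the frame path and working directory are assumptions; the constructor expects the coco.names / yolov3.cfg / yolov3.weights layout shown above):

```cpp
#include <opencv2/imgcodecs.hpp>
#include "yolo.h" // assumes the repo's yolo.h is on the include path

int main()
{
    yolov3::yolov3Segment segmenter;              // loads the Darknet model files
    cv::Mat frame = cv::imread("rgb/0.png");      // hypothetical RGB frame path
    cv::Mat mask = segmenter.Segmentation(frame); // CV_8U: 1 = static, 0 = dynamic
    // mask can now be handed to the tracker so that ORB features on dynamic
    // objects are ignored, as in rgbd_tum_yolo.cc above.
    return 0;
}
```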
**Fast-Dynamic-ORB-SLAM**:

FAST Dynamic ORB-SLAM2 is a near real-time SLAM library built on ORB-SLAM2 by Mur-Artal et al. It attempts to mitigate the error introduced by dynamic objects in the SLAM system for RGB-D cameras.

Tracking.cc:

```cpp
mCurrentFrame = Frame(mImGray, imDepth, timestamp, mpORBextractorLeft, mpORBVocabulary, mK, mDistCoef, mbf, mThDepth);
dynpointfinder.findOutliers(&mCurrentFrame, mImGray, scaleFactors, nlevels);
mCurrentFrame.finishFrame(mImGray, imDepth, mK);
```

Frame.cc:

```cpp
void Frame::finishFrame(const cv::Mat &imGray, const cv::Mat &imDepth, cv::Mat &K)
{
    // initmvKeys has already been filtered by DynPointFinder at this point
    ExtractORBDesp(0, imGray);

    N = mvKeys.size();
    if (mvKeys.empty())
        return;

    UndistortKeyPoints();
    ComputeStereoFromRGBD(imDepth);

    mvpMapPoints = vector<MapPoint*>(N, static_cast<MapPoint*>(NULL));
    mvbOutlier = vector<bool>(N, false);

    // This is done only for the first Frame (or after a change in the calibration)
    if (mbInitialComputations)
    {
        ComputeImageBounds(imGray);

        mfGridElementWidthInv = static_cast<float>(FRAME_GRID_COLS) / static_cast<float>(mnMaxX - mnMinX);
        mfGridElementHeightInv = static_cast<float>(FRAME_GRID_ROWS) / static_cast<float>(mnMaxY - mnMinY);

        fx = K.at<float>(0, 0);
        fy = K.at<float>(1, 1);
        cx = K.at<float>(0, 2);
        cy = K.at<float>(1, 2);
        invfx = 1.0f / fx;
        invfy = 1.0f / fy;

        mbInitialComputations = false;
    }

    mb = mbf / fx;

    AssignFeaturesToGrid();
}
```

DynPointFinder.h:

```cpp
class DynPointFinder
{
    Frame prevFrame;
    cv::Mat prevImage;
    bool first = true;
    vector<cv::Point2f> prevMatches;
    std::vector<int> mvIniMatches;
    Initializer* poseEstimator;
    std::vector<cv::KeyPoint> prevKeypoints;

public:
    bool findOutliers(Frame *pFrame, cv::Mat &imGray, vector<float> &scaleFactors, int nlevels);
    bool findOutliers2(Frame *pFrame, cv::Mat &imGray, vector<float> &scaleFactors, int nlevels);
    cv::Mat match(vector<cv::KeyPoint> &Keypoints1, cv::Mat imgray1);
};
```

DynPointFinder.cc. Extracts ORB features, matches them against the previous frame, then computes and returns the fundamental matrix:

```cpp
cv::Mat DynPointFinder::match(vector<cv::KeyPoint> &Keypoints1, cv::Mat imgray1)
{
    cv::Mat descriptors1, descriptors2;
    cv::Ptr<cv::ORB> detector = cv::ORB::create(1000);
    prevKeypoints.clear();
    detector->detect(prevImage, prevKeypoints);
    detector->compute(imgray1, Keypoints1, descriptors1);
    detector->compute(prevImage, prevKeypoints, descriptors2);

    cv::Ptr<cv::DescriptorMatcher> matcher =
        cv::DescriptorMatcher::create(cv::DescriptorMatcher::BRUTEFORCE_HAMMINGLUT);
    std::vector<cv::DMatch> matches1;
    matcher->match(descriptors1, descriptors2, matches1);

    std::vector<cv::Point2f> points1, points2;
    cv::Mat F;
    for (auto iter = matches1.begin(); iter != matches1.end(); iter++)
    {
        float x = Keypoints1[iter->queryIdx].pt.x;
        float y = Keypoints1[iter->queryIdx].pt.y;
        points1.push_back(cv::Point2f(x, y));
        x = prevKeypoints[iter->trainIdx].pt.x;
        y = prevKeypoints[iter->trainIdx].pt.y;
        points2.push_back(cv::Point2f(x, y));
    }

    // Compute F matrix using RANSAC
    std::vector<uchar> inliers(points1.size(), 0);
    if (points1.size() > 0 && points2.size() > 0)
    {
        F = cv::findFundamentalMat(cv::Mat(points1), cv::Mat(points2),
                                   inliers, cv::FM_RANSAC, 0.7, .90);
    }

    vector<cv::KeyPoint> inliermatches;
    for (int i = 0; i < inliers.size(); i++)
    {
        if (inliers[i]) // it is a valid match
            inliermatches.push_back(Keypoints1[i]);
    }

    prevKeypoints = Keypoints1;
    // return the found fundamental matrix
    return F;
}
```

Next, compute the epipolar constraint and determine outliers from each feature point's distance to its epipolar line; 2.5 appears to be the distance threshold. A flag of 0 marks an outlier, and 1 marks a likely static point.
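For reference, the quantity computed in the loop below is the standard point-to-epipolar-line distance. With $\mathbf{p} = (u, v, 1)^T$ a feature in one frame and $\mathbf{p}' = (u', v', 1)^T$ its correspondence in the other, the epipolar line is $\mathbf{l} = F\mathbf{p} = (a, b, c)^T$ and

$$
d(\mathbf{p}', \mathbf{l}) = \frac{|a\,u' + b\,v' + c|}{\sqrt{a^2 + b^2}}
$$

Features with $d > 2.5$ pixels violate the static-scene epipolar constraint and are flagged as dynamic (status 0).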
```cpp
vector<int> cal_epipolar_constraints(vector<cv::Point2f> prepoints, vector<cv::Point2f> postpoints, cv::Mat F)
{
    vector<int> dis_idx;
    for (int i = 0; i < prepoints.size(); i++)
    {
        // Epipolar line l = F * p = (a, b, c)^T
        double a = F.at<double>(0, 0)*prepoints[i].x + F.at<double>(0, 1)*prepoints[i].y + F.at<double>(0, 2);
        double b = F.at<double>(1, 0)*prepoints[i].x + F.at<double>(1, 1)*prepoints[i].y + F.at<double>(1, 2);
        double c = F.at<double>(2, 0)*prepoints[i].x + F.at<double>(2, 1)*prepoints[i].y + F.at<double>(2, 2);
        // Distance of the matched point from the epipolar line
        double distance = fabs(a*postpoints[i].x + b*postpoints[i].y + c) / sqrt(a*a + b*b);
        if (distance > 2.5)
            dis_idx.push_back(0); // outlier: too far from the epipolar line
        else
            dis_idx.push_back(1); // likely static point
    }
    return dis_idx;
}
```

Outliers are then removed according to the 0/1 status flags above:

```cpp
// removes keypoints based on a status vector (1 for keep, 0 for remove);
// status is indexed over the concatenation of all pyramid levels
void removePoints(vector<vector<cv::KeyPoint>> &initmvKeys, vector<int> &status, int nlevels)
{
    int startindex = 0;
    int endindex = 0;
    int removedpoints = 0;
    for (int level = 0; level < nlevels; ++level)
    {
        vector<cv::KeyPoint> &mkeypoints = initmvKeys[level];
        int nkeypointsLevel = (int)mkeypoints.size();
        if (nkeypointsLevel == 0)
            continue;
        endindex = mkeypoints.size() - 1;
        // Iterate backwards so erase() does not shift indices not yet processed
        for (int i = endindex; i >= 0; i--)
        {
            if (status[i + startindex] == 0)
            {
                mkeypoints.erase(mkeypoints.begin() + i);
                removedpoints++;
            }
        }
        startindex += endindex + 1;
    }
}
```
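A small self-contained sketch (synthetic data, not from the repo) showing how the flattened status vector maps onto the per-level keypoint pyramid:

```cpp
#include <opencv2/core.hpp>
#include <iostream>
#include <vector>
using namespace std;

// Assumes removePoints() from DynPointFinder.cc above is linked in.
void removePoints(vector<vector<cv::KeyPoint>> &initmvKeys, vector<int> &status, int nlevels);

int main()
{
    // Two pyramid levels with 3 and 2 keypoints respectively
    vector<vector<cv::KeyPoint>> pyramid(2);
    for (int i = 0; i < 3; ++i) pyramid[0].emplace_back(cv::Point2f((float)i, 0.f), 1.f);
    for (int i = 0; i < 2; ++i) pyramid[1].emplace_back(cv::Point2f((float)i, 1.f), 1.f);

    // status covers level 0 first, then level 1:
    // drop the 2nd keypoint of level 0 and the 1st keypoint of level 1
    vector<int> status = {1, 0, 1,  0, 1};
    removePoints(pyramid, status, 2);

    cout << pyramid[0].size() << " " << pyramid[1].size() << endl; // prints "2 1"
    return 0;
}
```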
The main function of the DynPointFinder class:

- compute optical flow with calcOpticalFlowPyrLK; status is the output status vector (of unsigned chars): each element is set to 1 if the flow for the corresponding feature has been found, otherwise 0;
- compute the epipolar constraints with cal_epipolar_constraints;
- remove outliers with removePoints.

```cpp
bool DynPointFinder::findOutliers(Frame *pFrame, cv::Mat &imGray, vector<float> &scaleFactors, int nlevels)
{
    if (first)
    {
        // On the first frame there is nothing to compare against yet
        prevFrame = Frame(*pFrame);
        imGray.copyTo(prevImage);
        cv::Ptr<cv::ORB> detector = cv::ORB::create(1000);
        vector<cv::KeyPoint> kps;
        detector->detect(prevImage, kps);
        prevKeypoints = kps;
        first = false;
        return false;
    }

    // get ORBSLAM's keypoints, flattened across pyramid levels
    float scale;
    vector<cv::KeyPoint> keypoints;
    vector<cv::Point2f> points;
    for (int level = 0; level < nlevels; ++level)
    {
        vector<cv::KeyPoint> &mkeypoints = pFrame->initmvKeys[level];
        if (level != 0)
            scale = scaleFactors[level];
        else
            scale = 1;
        for (auto keypoint = mkeypoints.begin(); keypoint != mkeypoints.end(); keypoint++)
        {
            keypoints.push_back(*keypoint);
            points.push_back(keypoint->pt);
        }
    }

    cv::Ptr<cv::ORB> detector = cv::ORB::create(1000);
    vector<cv::KeyPoint> currKeypoints;
    detector->detect(imGray, currKeypoints);
    vector<int> state(keypoints.size(), 1);

    cv::Mat F = match(currKeypoints, imGray);
    if (F.empty())
    {
        prevFrame = Frame(*pFrame);
        imGray.copyTo(prevImage);
        return false;
    }

    // Track the keypoints from the current image into the previous one
    cv::Size winSize(15, 15);
    cv::TermCriteria termcrit(CV_TERMCRIT_ITER | CV_TERMCRIT_EPS, 10, 0.03);
    vector<uchar> status;
    vector<float> err;
    vector<cv::Point2f> outpts;
    cv::calcOpticalFlowPyrLK(imGray, prevImage, points, outpts, status, err, winSize, 2, termcrit, 0, 0.001);

    vector<cv::Point2f> p01op, p02op;
    int numFailed = 0;
    vector<int> indices;
    for (int i = 0; i < status.size(); i++)
    {
        if (status[i] == 1)
        {
            p01op.push_back(points[i]);
            p02op.push_back(outpts[i]);
            indices.push_back(i);
        }
        else
        {
            // Flow could not be found: mark the feature for removal
            state[i] = 0;
            numFailed++;
        }
    }

    // Mean and standard deviation of the flow magnitudes
    float avgmag = 0;
    vector<float> mag;
    for (int i = 0; i < p01op.size(); i++)
    {
        float diffx = p01op[i].x - p02op[i].x;
        float diffy = p01op[i].y - p02op[i].y;
        mag.push_back(sqrt(diffx*diffx + diffy*diffy));
        avgmag += mag.back();
    }
    avgmag = avgmag / p01op.size();
    float stdev = 0;
    for (int i = 0; i < mag.size(); i++)
        stdev += (mag[i] - avgmag) * (mag[i] - avgmag);
    stdev = sqrt(stdev / mag.size());

    // Epipolar test: 0 = outlier, 1 = likely static
    vector<int> idx1 = cal_epipolar_constraints(p01op, p02op, F);
    vector<cv::Point2f> final1, final2;
    for (int i = 0; i < p01op.size(); i++)
    {
        // A feature is dynamic if it fails the epipolar test or its flow
        // magnitude is an outlier with respect to the frame statistics
        if ((idx1[i] == 0 || (abs(mag[i] - avgmag) > 1 && stdev < 3)) ||
            (abs(mag[i] - avgmag) > 2*stdev))
        {
            state[indices[i]] = 0;
        }
        else
        {
            final1.push_back(p01op[i]);
            final2.push_back(p02op[i]);
        }
    }

    removePoints(pFrame->initmvKeys, state, nlevels);
    prevFrame = Frame(*pFrame);
    imGray.copyTo(prevImage);
    return true; // missing in the original listing: the function is declared bool
}
```

Initializer.cc computes the fundamental matrix via 8-point RANSAC:

```cpp
void Initializer::getRelativePose(const Frame &CurrentFrame, const vector<int> &vMatches12,
                                  cv::Mat &F, vector<cv::Point3f> &vP3D, vector<bool> &vbTriangulated)
{
    // Fill structures with current keypoints and matches with reference frame
    // Reference Frame: 1, Current Frame: 2
    mvKeys2 = CurrentFrame.mvKeysUn;

    mvMatches12.clear();
    mvMatches12.reserve(mvKeys2.size());
    mvbMatched1.resize(mvKeys1.size());
    for (size_t i = 0, iend = vMatches12.size(); i < iend; i++)
    {
        if (vMatches12[i] >= 0)
        {
            mvMatches12.push_back(make_pair(i, vMatches12[i]));
            mvbMatched1[i] = true;
        }
        else
            mvbMatched1[i] = false;
    }

    const int N = mvMatches12.size();

    // Indices for minimum set selection
    vector<size_t> vAllIndices;
    vAllIndices.reserve(N);
    vector<size_t> vAvailableIndices;
    for (int i = 0; i < N; i++)
        vAllIndices.push_back(i);

    // Generate sets of 8 points for each RANSAC iteration
    mvSets = vector<vector<size_t>>(mMaxIterations, vector<size_t>(8, 0));
    DUtils::Random::SeedRandOnce(0);
    for (int it = 0; it < mMaxIterations; it++)
    {
        vAvailableIndices = vAllIndices;
        // Select a minimum set of 8 correspondences without replacement
        for (size_t j = 0; j < 8; j++)
        {
            int randi = DUtils::Random::RandomInt(0, vAvailableIndices.size() - 1);
            int idx = vAvailableIndices[randi];
            mvSets[it][j] = idx;
            vAvailableIndices[randi] = vAvailableIndices.back();
            vAvailableIndices.pop_back();
        }
    }

    vector<bool> vbMatchesInliersH, vbMatchesInliersF;
    float SF;
    cv::Mat fund;
    FindFundamental(vbMatchesInliersF, SF, fund);
    fund.copyTo(F);
}
```
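To make the flow-magnitude test in findOutliers concrete, here is a standalone sketch (hypothetical code, not from the repo) of the mean/standard-deviation outlier rule on its own:

```cpp
#include <opencv2/core.hpp>
#include <cmath>
#include <vector>

// Hypothetical standalone version of the magnitude test used in findOutliers:
// a feature is flagged dynamic if its optical-flow magnitude deviates from the
// frame mean by more than two standard deviations.
std::vector<int> magnitudeOutliers(const std::vector<cv::Point2f> &p0,
                                   const std::vector<cv::Point2f> &p1)
{
    std::vector<float> mag(p0.size());
    float mean = 0.f;
    for (size_t i = 0; i < p0.size(); ++i)
    {
        float dx = p1[i].x - p0[i].x, dy = p1[i].y - p0[i].y;
        mag[i] = std::sqrt(dx*dx + dy*dy); // per-feature flow magnitude
        mean += mag[i];
    }
    mean /= mag.size();

    float var = 0.f;
    for (float m : mag) var += (m - mean) * (m - mean);
    float stdev = std::sqrt(var / mag.size());

    std::vector<int> status(mag.size(), 1); // 1 = keep, 0 = dynamic outlier
    for (size_t i = 0; i < mag.size(); ++i)
        if (std::fabs(mag[i] - mean) > 2.f * stdev)
            status[i] = 0;
    return status;
}
```

This is the cheap, learning-free complement to the epipolar test: a rigid camera motion induces broadly consistent flow magnitudes over the static background, so strong deviations suggest independently moving objects.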
