Visual Servoing Platform  version 3.2.0
Tutorial: Homography estimation from points

Introduction

This tutorial shows how to estimate a homography from points. Two cases are considered in the following sections: first, homography estimation from four matched points; then, robust homography estimation (Ransac or M-estimator) from a set of matched points that may contain mismatches.

Homography estimation

Note that all the material (source code) described in this section is part of ViSP source code and can be downloaded using the following command:

$ svn export https://github.com/lagadic/visp.git/trunk/tutorial/computer-vision

Let us consider the following source code, also available in tutorial-homography-from-points.cpp. To summarize, we do the following:

  • define four 3D points oP in an object frame,
  • define the homogeneous transformation bMo from a frame b to the object frame o,
  • define a transformation aMb between frames a and b, from which aMo = aMb * bMo is deduced,
  • compute the coordinates of the four 3D points in the image planes a and b. These are the matched coordinates (xa, ya) and (xb, yb) that are then used to estimate the homography using either vpHomography::DLT() or vpHomography::HLM().
#include <visp3/vision/vpHomography.h>
#include <visp3/core/vpMeterPixelConversion.h>

int main()
{
  double L = 0.1;
  std::vector<vpPoint> oP;
  oP.push_back(vpPoint(-L, -L, 0));
  oP.push_back(vpPoint(2 * L, -L, 0));
  oP.push_back(vpPoint(L, 3 * L, 0));
  oP.push_back(vpPoint(-L, 4 * L, 0));
  vpHomogeneousMatrix bMo(0.1, 0, 1, 0, vpMath::rad(15), 0);
  vpHomogeneousMatrix aMb(0.2, -0.1, 0.1, vpMath::rad(-3), vpMath::rad(20), vpMath::rad(5));
  vpHomogeneousMatrix aMo = aMb * bMo;
  std::vector<vpPoint> aP(4), bP(4);
  std::vector<double> xa(4), ya(4), xb(4), yb(4);
  for (unsigned int i = 0; i < 4; i++) {
    oP[i].project(aMo);
    xa[i] = oP[i].get_x();
    ya[i] = oP[i].get_y();
    oP[i].project(bMo);
    xb[i] = oP[i].get_x();
    yb[i] = oP[i].get_y();
  }
  vpHomography aHb;
  vpHomography::DLT(xb, yb, xa, ya, aHb, true);
  std::cout << "Estimated homography using DLT:\n" << aHb / aHb[2][2] << std::endl;
  vpHomography::HLM(xb, yb, xa, ya, true, aHb);
  std::cout << "Estimated homography using HLM:\n" << aHb / aHb[2][2] << std::endl;
  vpRotationMatrix aRb;
  vpTranslationVector atb;
  vpColVector n; // normal to the plane containing the points
  aHb.computeDisplacement(aRb, atb, n);
  std::cout << "\nEstimated displacement:" << std::endl;
  std::cout << " atb: " << atb.t() << std::endl;
  vpThetaUVector atub;
  atub.buildFrom(aRb);
  std::cout << " athetaub: ";
  for (unsigned int i = 0; i < 3; i++)
    std::cout << vpMath::deg(atub[i]) << " ";
  std::cout << std::endl;
  std::cout << " n: " << n.t() << std::endl;
  vpCameraParameters cam; // intrinsics used for the meter <-> pixel conversions
  vpImagePoint iPa, iPb;
  vpMeterPixelConversion::convertPoint(cam, xb[3], yb[3], iPb);
  vpMeterPixelConversion::convertPoint(cam, xa[3], ya[3], iPa);
  std::cout << "Ground truth: Point 3 in pixels in frame b: " << iPb << std::endl;
  std::cout << "Ground truth: Point 3 in pixels in frame a: " << iPa << std::endl;
  // Project the position in pixel of point 3 from the homography
  std::cout << "Estimation from homography: Point 3 in pixels in frame a: " << vpHomography::project(cam, aHb, iPb)
            << std::endl;
}

Now we give a line-by-line explanation of the code:

First we have to include the header of the vpHomography class.

#include <visp3/vision/vpHomography.h>

In the main() function we first define the 3D coordinates of 4 points located in the plane Z=0:

double L = 0.1;
std::vector<vpPoint> oP;
oP.push_back(vpPoint(-L, -L, 0));
oP.push_back(vpPoint(2 * L, -L, 0));
oP.push_back(vpPoint(L, 3 * L, 0));
oP.push_back(vpPoint(-L, 4 * L, 0));

Then we define the homogeneous transformations between frames a, b and object frame o:

vpHomogeneousMatrix bMo(0.1, 0, 1, 0, vpMath::rad(15), 0);
vpHomogeneousMatrix aMb(0.2, -0.1, 0.1, vpMath::rad(-3), vpMath::rad(20), vpMath::rad(5));
vpHomogeneousMatrix aMo = aMb * bMo;

From these transformations we compute the coordinates of the points in the image planes a and b. Each point has coordinates (xa,ya) in frame a and (xb,yb) in frame b:

std::vector<vpPoint> aP(4), bP(4);
std::vector<double> xa(4), ya(4), xb(4), yb(4);
for (unsigned int i = 0; i < 4; i++) {
  oP[i].project(aMo);
  xa[i] = oP[i].get_x();
  ya[i] = oP[i].get_y();
  oP[i].project(bMo);
  xb[i] = oP[i].get_x();
  yb[i] = oP[i].get_y();
}
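Under the hood, vpPoint::project() changes the frame of the point and applies the perspective division. As an illustration only (plain C++, no ViSP; projectPoint is a hypothetical helper, not a ViSP function), the computation amounts to:

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat4 = std::array<std::array<double, 4>, 4>; // homogeneous transformation

// Express the 3D point oP (given in the object frame) in the camera frame
// using cMo, then apply the perspective division x = X/Z, y = Y/Z to get
// the normalized image-plane coordinates.
static void projectPoint(const Mat4 &cMo, const std::array<double, 3> &oP,
                         double &x, double &y)
{
  double cP[3];
  for (int i = 0; i < 3; ++i)
    cP[i] = cMo[i][0] * oP[0] + cMo[i][1] * oP[1] + cMo[i][2] * oP[2] + cMo[i][3];
  x = cP[0] / cP[2];
  y = cP[1] / cP[2];
}
```

The coordinates (x, y) obtained this way are expressed in meters in the image plane, which is the representation that DLT() and HLM() expect as input.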

We now have four pairs of matched coordinates that are used to estimate a homography between image planes a and b. Two methods are available: the DLT (Direct Linear Transform) algorithm and the HLM algorithm.

vpHomography::DLT(xb, yb, xa, ya, aHb, true);
std::cout << "Estimated homography using DLT:\n" << aHb / aHb[2][2] << std::endl;
vpHomography::HLM(xb, yb, xa, ya, true, aHb);
std::cout << "Estimated homography using HLM:\n" << aHb / aHb[2][2] << std::endl;
Note
vpHomography::HLM() can also handle points that are not coplanar; the isplanar flag indicates whether the points lie in a plane.
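To give an idea of what DLT does internally, here is a simplified plain-C++ sketch (no ViSP) for exactly four correspondences: fixing h22 = 1 turns each correspondence into two linear equations, yielding an 8x8 system solved here by Gauss-Jordan elimination. This illustrates the principle only; ViSP's actual implementation handles n >= 4 points and solves the homogeneous system with an SVD.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <utility>

// Estimate the homography H (with h22 fixed to 1) mapping (xb, yb) -> (xa, ya)
// from exactly 4 point correspondences. Each pair gives two equations:
//   h00*xb + h01*yb + h02 - xa*(h20*xb + h21*yb) = xa
//   h10*xb + h11*yb + h12 - ya*(h20*xb + h21*yb) = ya
static std::array<double, 9> dlt4(const std::array<double, 4> &xb,
                                  const std::array<double, 4> &yb,
                                  const std::array<double, 4> &xa,
                                  const std::array<double, 4> &ya)
{
  double A[8][9] = {}; // augmented 8x9 system [A | rhs]
  for (int i = 0; i < 4; ++i) {
    double r0[9] = {xb[i], yb[i], 1, 0, 0, 0, -xa[i] * xb[i], -xa[i] * yb[i], xa[i]};
    double r1[9] = {0, 0, 0, xb[i], yb[i], 1, -ya[i] * xb[i], -ya[i] * yb[i], ya[i]};
    for (int j = 0; j < 9; ++j) {
      A[2 * i][j] = r0[j];
      A[2 * i + 1][j] = r1[j];
    }
  }
  // Gauss-Jordan elimination with partial pivoting
  for (int c = 0; c < 8; ++c) {
    int p = c;
    for (int r = c + 1; r < 8; ++r)
      if (std::fabs(A[r][c]) > std::fabs(A[p][c]))
        p = r;
    for (int j = 0; j < 9; ++j)
      std::swap(A[c][j], A[p][j]);
    for (int r = 0; r < 8; ++r) {
      if (r == c)
        continue;
      double f = A[r][c] / A[c][c];
      for (int j = c; j < 9; ++j)
        A[r][j] -= f * A[c][j];
    }
  }
  std::array<double, 9> h;
  for (int c = 0; c < 8; ++c)
    h[c] = A[c][8] / A[c][c];
  h[8] = 1.0; // h22 fixed to 1
  return h;
}
```

With noise-free correspondences from four points in general position, dlt4() recovers the generating homography up to numerical precision.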

Once the homography is estimated, the vpHomography class allows extracting the 3D rotation aRb, the translation atb and the normal n to the plane, between frames a and b:

aHb.computeDisplacement(aRb, atb, n);

Just for fun, we print the values of this transformation using:

std::cout << "\nEstimated displacement:" << std::endl;
std::cout << " atb: " << atb.t() << std::endl;
atub.buildFrom(aRb);
std::cout << " athetaub: ";
for (unsigned int i = 0; i < 3; i++)
  std::cout << vpMath::deg(atub[i]) << " ";
std::cout << std::endl;
std::cout << " n: " << n.t() << std::endl;

This code leads to the following output:

Estimated displacement:
atb: 0.2016519874 -0.1008259937 0.1008259937
athetaub: -3 20 5
n: 0.2588190451 -1.124100812e-14 0.9659258263

where we can see that the values of atb and athetaub are the ones specified at the beginning of the source code during the aMb initialization.

Next, we show how to retrieve the coordinates in pixels of a point (here point [3]) in the corresponding images using the camera parameters:

vpImagePoint iPa, iPb;
vpMeterPixelConversion::convertPoint(cam, xb[3], yb[3], iPb);
vpMeterPixelConversion::convertPoint(cam, xa[3], ya[3], iPa);
std::cout << "Ground truth: Point 3 in pixels in frame b: " << iPb << std::endl;
std::cout << "Ground truth: Point 3 in pixels in frame a: " << iPa << std::endl;

At the end, we show how to project a point with pixel coordinates from image b to image a using the homography:

// Project the position in pixel of point 3 from the homography
std::cout << "Estimation from homography: Point 3 in pixels in frame a: " << vpHomography::project(cam, aHb, iPb)
<< std::endl;

This last part of the code produces the following output:

Ground truth: Point 3 in pixels in frame b: 377.9450564, 193.9928711
Ground truth: Point 3 in pixels in frame a: 353.8501593, 486.1851856
Estimation from homography: Point 3 in pixels in frame a: 353.8501593, 486.1851856
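The conversion chain behind vpHomography::project() can be sketched without ViSP: the pixel in image b is first converted to normalized coordinates using the intrinsics (px, py, u0, v0), the 3x3 homography is applied with a perspective division, and the result is converted back to pixels. The helper names below are hypothetical, not ViSP API:

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Cam { double px, py, u0, v0; }; // pinhole intrinsics, no distortion

// Map a pixel (ub, vb) of image b to image a through a normalized homography aHb
// (stored row-major as 9 doubles).
static void projectPixel(const Cam &cam, const std::array<double, 9> &aHb,
                         double ub, double vb, double &ua, double &va)
{
  // pixel -> meter (normalized coordinates)
  double xb = (ub - cam.u0) / cam.px;
  double yb = (vb - cam.v0) / cam.py;
  // apply the homography, then perspective division
  double w  = aHb[6] * xb + aHb[7] * yb + aHb[8];
  double xa = (aHb[0] * xb + aHb[1] * yb + aHb[2]) / w;
  double ya = (aHb[3] * xb + aHb[4] * yb + aHb[5]) / w;
  // meter -> pixel
  ua = cam.u0 + xa * cam.px;
  va = cam.v0 + ya * cam.py;
}
```

For instance, with a pure translation homography, a shift of 0.1 in normalized coordinates moves the pixel by 0.1 * px pixels along u.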

Ransac or robust homography estimation

Note that all the material (source code) described in this section is part of ViSP source code and can be downloaded using the following command:

$ svn export https://github.com/lagadic/visp.git/trunk/tutorial/detection/matching

This section follows the Tutorial: Keypoints matching. It explains how to exploit pairs of matched points obtained with a keypoint detector (ORB in the code below) in order to estimate a homography that makes it possible to reject mismatched pairs. The homography is then used to track a postcard from its initial position in the reference image.

Let us consider the following source code also available in tutorial-matching-keypoint-homography.cpp.

#include <visp3/core/vpPixelMeterConversion.h>
#include <visp3/gui/vpDisplayOpenCV.h>
#include <visp3/io/vpVideoReader.h>
#include <visp3/vision/vpHomography.h>
#include <visp3/vision/vpKeyPoint.h>

int main(int argc, const char **argv)
{
#if (VISP_HAVE_OPENCV_VERSION >= 0x020101)
  int method = 0;
  if (argc > 1)
    method = atoi(argv[1]);
  if (method == 0)
    std::cout << "Uses Ransac to estimate the homography" << std::endl;
  else
    std::cout << "Uses a robust scheme to estimate the homography" << std::endl;
  vpImage<unsigned char> I;
  vpVideoReader reader;
  reader.setFileName("video-postcard.mpeg");
  reader.acquire(I);
  const std::string detectorName = "ORB";
  const std::string extractorName = "ORB";
  // Hamming distance must be used with ORB
  const std::string matcherName = "BruteForce-Hamming";
  vpKeyPoint::vpFilterMatchingType filterType = vpKeyPoint::ratioDistanceThreshold;
  vpKeyPoint keypoint(detectorName, extractorName, matcherName, filterType);
  keypoint.buildReference(I);
  vpImage<unsigned char> Idisp;
  Idisp.resize(I.getHeight(), 2 * I.getWidth());
  Idisp.insert(I, vpImagePoint(0, 0));
  Idisp.insert(I, vpImagePoint(0, I.getWidth()));
  vpDisplayOpenCV d(Idisp, 0, 0, "Homography from matched keypoints");
  vpDisplay::display(Idisp);
  vpImagePoint corner_ref[4];
  corner_ref[0].set_ij(115, 64);
  corner_ref[1].set_ij(83, 253);
  corner_ref[2].set_ij(282, 307);
  corner_ref[3].set_ij(330, 72);
  for (unsigned int i = 0; i < 4; i++) {
    vpDisplay::displayCross(Idisp, corner_ref[i], 12, vpColor::red);
  }
  vpDisplay::flush(Idisp);
  vpCameraParameters cam(840, 840, I.getWidth() / 2, I.getHeight() / 2);
  vpHomography curHref;
  while (!reader.end()) {
    reader.acquire(I);
    Idisp.insert(I, vpImagePoint(0, I.getWidth()));
    vpDisplay::display(Idisp);
    unsigned int nbMatch = keypoint.matchPoint(I);
    std::cout << "Nb matches: " << nbMatch << std::endl;
    std::vector<vpImagePoint> iPref(nbMatch),
        iPcur(nbMatch); // Coordinates in pixels (for display only)
    std::vector<double> mPref_x(nbMatch), mPref_y(nbMatch);
    std::vector<double> mPcur_x(nbMatch), mPcur_y(nbMatch);
    std::vector<bool> inliers(nbMatch);
    for (unsigned int i = 0; i < nbMatch; i++) {
      keypoint.getMatchedPoints(i, iPref[i], iPcur[i]);
      vpPixelMeterConversion::convertPoint(cam, iPref[i], mPref_x[i], mPref_y[i]);
      vpPixelMeterConversion::convertPoint(cam, iPcur[i], mPcur_x[i], mPcur_y[i]);
    }
    try {
      double residual;
      if (method == 0)
        vpHomography::ransac(mPref_x, mPref_y, mPcur_x, mPcur_y, curHref, inliers, residual,
                             (unsigned int)(mPref_x.size() * 0.25), 2.0 / cam.get_px(), true);
      else
        vpHomography::robust(mPref_x, mPref_y, mPcur_x, mPcur_y, curHref, inliers, residual, 0.4, 4, true);
    } catch (...) {
      std::cout << "Cannot compute homography from matches..." << std::endl;
    }
    vpImagePoint corner_cur[4];
    for (int i = 0; i < 4; i++) {
      corner_cur[i] = vpHomography::project(cam, curHref, corner_ref[i]);
    }
    vpImagePoint offset(0, I.getWidth());
    for (int i = 0; i < 4; i++) {
      vpDisplay::displayLine(Idisp, corner_cur[i] + offset, corner_cur[(i + 1) % 4] + offset, vpColor::blue, 3);
    }
    for (unsigned int i = 0; i < nbMatch; i++) {
      if (inliers[i] == true)
        vpDisplay::displayLine(Idisp, iPref[i], iPcur[i] + offset, vpColor::green);
      else
        vpDisplay::displayLine(Idisp, iPref[i], iPcur[i] + offset, vpColor::red);
    }
    vpDisplay::flush(Idisp);
    if (vpDisplay::getClick(Idisp, false))
      break;
  }
#else
  (void)argc;
  (void)argv;
#endif
  return 0;
}

The command line allows using either the Ransac algorithm or a robust M-estimator approach:

% ./tutorial-matching-keypoint-homography 0 // to run Ransac
% ./tutorial-matching-keypoint-homography 1 // to run the robust M-estimator

Hereafter is the resulting video. The left image is the reference image. The right images correspond to the successive images of the input video. The endpoints of the green lines are the well-matched points used in the homography estimation, while the red lines join pairs of matched points rejected by the robust estimator.

Now, let us explain the new lines that were introduced to estimate the homography.

First we parse the command line arguments to be able to use either Ransac or the robust M-estimator later on:

int method = 0;
if (argc > 1)
method = atoi(argv[1]);
if (method == 0)
std::cout << "Uses Ransac to estimate the homography" << std::endl;
else
std::cout << "Uses a robust scheme to estimate the homography" << std::endl;

We also initialize the pixel coordinates of the postcard corners in the reference image. These coordinates were originally obtained from an initial user click; to simplify the code, we set them directly:

vpImagePoint corner_ref[4];
corner_ref[0].set_ij(115, 64);
corner_ref[1].set_ij(83, 253);
corner_ref[2].set_ij(282, 307);
corner_ref[3].set_ij(330, 72);

Using these coordinates, we display red crosses at the postcard corners:

for (unsigned int i = 0; i < 4; i++) {
  vpDisplay::displayCross(Idisp, corner_ref[i], 12, vpColor::red);
}

We also need to define, at least roughly, the parameters of our camera:

vpCameraParameters cam(840, 840, I.getWidth() / 2, I.getHeight() / 2);

For each new image, once points are matched using:

unsigned int nbMatch = keypoint.matchPoint(I);
std::cout << "Nb matches: " << nbMatch << std::endl;

We then allocate the containers used for the homography estimation. The coordinates of the points in the reference image are stored in (mPref_x, mPref_y), while the corresponding coordinates in the current image are stored in (mPcur_x, mPcur_y). We also allocate inliers, a vector of booleans indicating whether each pair of matched points is an inlier or an outlier:

std::vector<double> mPref_x(nbMatch), mPref_y(nbMatch);
std::vector<double> mPcur_x(nbMatch), mPcur_y(nbMatch);
std::vector<bool> inliers(nbMatch);

To estimate the homography, we first need to convert the points from pixels to meters:

vpPixelMeterConversion::convertPoint(cam, iPref[i], mPref_x[i], mPref_y[i]);
vpPixelMeterConversion::convertPoint(cam, iPcur[i], mPcur_x[i], mPcur_y[i]);
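For a pinhole camera without distortion, this pixel-to-meter conversion is simply the inverse of the intrinsic mapping. A minimal sketch (plain C++, no ViSP; pixelToMeter is a hypothetical helper name):

```cpp
#include <cassert>
#include <cmath>

// Inverse pinhole mapping: pixel (u, v) -> normalized coordinates (x, y).
// (u0, v0) is the principal point, (px, py) the focal length in pixels.
static void pixelToMeter(double u, double v, double px, double py,
                         double u0, double v0, double &x, double &y)
{
  x = (u - u0) / px;
  y = (v - v0) / py;
}
```

With the camera defined above (px = 840), a pixel lying one focal length to the right of the principal point maps to x = 1.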

We can now estimate the homography using either Ransac or the robust M-estimator approach:

try {
  double residual;
  if (method == 0)
    vpHomography::ransac(mPref_x, mPref_y, mPcur_x, mPcur_y, curHref, inliers, residual,
                         (unsigned int)(mPref_x.size() * 0.25), 2.0 / cam.get_px(), true);
  else
    vpHomography::robust(mPref_x, mPref_y, mPcur_x, mPcur_y, curHref, inliers, residual, 0.4, 4, true);
} catch (...) {
  std::cout << "Cannot compute homography from matches..." << std::endl;
}

For Ransac, we consider that at least 25 percent of the points should be inliers (mPref_x.size() * 0.25) to reach a consensus, and that a pair of points is declared an inlier if its reprojection error is lower than 2 pixels (2.0 / cam.get_px(), expressed in meters).
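The Ransac principle used here can be sketched generically. The illustrative plain-C++ code below (no ViSP, hypothetical names) fits a 2D line y = a*x + b: repeatedly draw a minimal sample, fit a candidate model, count the points whose residual is below the threshold, and keep the model with the largest support, succeeding only if it reaches the required consensus.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <vector>

struct Line { double a, b; }; // y = a*x + b

// Minimal RANSAC: returns true if a model supported by at least nbConsensus
// points (residual < threshold) was found; inliers flags the supporting points.
static bool ransacLine(const std::vector<double> &x, const std::vector<double> &y,
                       Line &best, std::vector<bool> &inliers,
                       unsigned int nbConsensus, double threshold,
                       unsigned int maxIter = 200)
{
  unsigned int n = static_cast<unsigned int>(x.size()), bestCount = 0;
  for (unsigned int it = 0; it < maxIter; ++it) {
    // minimal sample: two distinct points with different abscissas
    unsigned int i = std::rand() % n, j = std::rand() % n;
    if (i == j || x[i] == x[j])
      continue;
    Line m;
    m.a = (y[j] - y[i]) / (x[j] - x[i]);
    m.b = y[i] - m.a * x[i];
    // count the support of this candidate model
    unsigned int count = 0;
    std::vector<bool> in(n, false);
    for (unsigned int k = 0; k < n; ++k) {
      if (std::fabs(y[k] - (m.a * x[k] + m.b)) < threshold) {
        in[k] = true;
        ++count;
      }
    }
    if (count > bestCount) {
      bestCount = count;
      best = m;
      inliers = in;
    }
  }
  return bestCount >= nbConsensus;
}
```

vpHomography::ransac() follows the same scheme, with a 4-point minimal sample and the homography reprojection error as residual.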

Then using the homography, we project the coordinates of the postcard corners in the current image:

vpImagePoint corner_cur[4];
for (int i = 0; i < 4; i++) {
  corner_cur[i] = vpHomography::project(cam, curHref, corner_ref[i]);
}

We use these coordinates to draw blue lines around the postcard:

vpImagePoint offset(0, I.getWidth());
for (int i = 0; i < 4; i++) {
  vpDisplay::displayLine(Idisp, corner_cur[i] + offset, corner_cur[(i + 1) % 4] + offset, vpColor::blue, 3);
}

Since the homography estimation updates the status of each pair of matched points as inlier or outlier, we can draw a green line between matched points when they are inliers, and a red line when they are outliers:

for (unsigned int i = 0; i < nbMatch; i++) {
  if (inliers[i] == true)
    vpDisplay::displayLine(Idisp, iPref[i], iPcur[i] + offset, vpColor::green);
  else
    vpDisplay::displayLine(Idisp, iPref[i], iPcur[i] + offset, vpColor::red);
}