The Problem
Given a collection of points in 3D space, we're trying to find the plane that is the closest to those points.
minimize( sum ( distance²(point, plane), point), plane)
If you're anything like me, you're probably wondering why we sum the squares of the distances.
The reasoning is a little backwards, but basically it's so that we can solve the problem using the method of Linear Least Squares. Squaring the distances turns our approximation problem into a so-called "quadratic form", which we know from theory we can solve exactly and efficiently using linear algebra.
There are other ways we could have phrased the problem. One of my favorites involves using Robust Statistics to ignore outliers. Another involves using radial basis functions to try to trap the plane within a region - this works great when we're not sure what type of function best approximates the points. Yet another approach involves perturbing and clustering the points to re-weight the problem.
Something I find fascinating is that most of those methods eventually require a linear least squares solver.
[Update 2015-11: I have a suspicion that finding the 'maximum likelihood estimator' for this particular problem doesn't require the use of LLS - can any kind reader confirm or deny?]
So let's start by forming the linear least squares approximation, and then later we can see if we still need to extend it. (Of course, if you'd like to know more, why not leave a comment in the comments section down below?)
Linear Least Squares
Let's start by defining our plane in implicit form:
C = Center of Plane
N = Plane's normal
(X - C).N = 0, for any point X on the plane
Then for an arbitrary point P, we can write it in 'plane' co-ordinates like so :
P = C + μN + pN⟂
Here μ is the signed distance from the point to the plane, N⟂ is a 2-by-3 matrix whose rows span the plane (the directions perpendicular to the plane's normal), and p is a 2-vector of coefficients.
We are trying to minimize the following :
E = sum( μ², points)
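To make that concrete, here's a minimal sketch of the energy we're minimizing, assuming a Vector3 class with a Dot method (an assumption - the post doesn't show one) and a unit-length normal N:

def PlaneEnergy(points, C, N):
    # sum of squared (signed) distances from each point to the plane (C, N)
    E = 0.0
    for p in points:
        mu = (p - C).Dot(N)  # signed distance from p to the plane
        E += mu * mu
    return E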
With a little math, we can show that
C = sum ( P, points ) / count ( points )
With a lot more math, we can show that N is the eigenvector associated with the smallest eigenvalue of the following matrix :
M = sum over points [ (Px-Cx).(Px-Cx), (Px-Cx).(Py-Cy), (Px-Cx).(Pz-Cz),
                      (Py-Cy).(Px-Cx), (Py-Cy).(Py-Cy), (Py-Cy).(Pz-Cz),
                      (Pz-Cz).(Px-Cx), (Pz-Cz).(Py-Cy), (Pz-Cz).(Pz-Cz) ]
Finding the Center
Let's find C first - that's the Center of the plane. In the mathematical theory, it doesn't really make sense to talk about the center of an infinite plane - any point on the plane, paired with any multiple of the plane's normal, describes the same plane. Indeed, it's tantalizingly seductive to describe a plane by its normal, N, and its distance to the origin, C.N.
But for those of us fortunate enough to be trapped inside a computer, we must face the realities of floating point numbers and round off error. For this reason, I always try to represent a plane using a Center and a Normal, until I've exhausted all other available optimizations.
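For reference, hopping between the two representations is a one-liner each way - a quick sketch, assuming a Vector3 class with a Dot method and a unit-length N:

# the plane's distance to the origin
dist = C.Dot(N)
# recover a 'center': the point on the plane closest to the origin
C = N * dist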
def FindCenter(points):
    sum = Vector3(0, 0, 0)
    for p in points:
        sum += p
    return sum / len(points)
Finding the Normal
Now let's find the normal. You'll see this type of computation fairly often when dealing with quadratic forms: lots of squared terms on the diagonal, and lots of cross terms off the diagonal.
As it's a symmetric matrix, we only need to compute the diagonal and half of the off-diagonal terms : X.{X,Y,Z} Y.{Y,Z} Z.{Z}
def FindNormal(points):
    sumxx = sumxy = sumxz = 0
    sumyy = sumyz = 0
    sumzz = 0
    center = FindCenter(points)
    for p in points:
        dx = p.X - center.X
        dy = p.Y - center.Y
        dz = p.Z - center.Z
        sumxx += dx*dx
        sumxy += dx*dy
        sumxz += dx*dz
        sumyy += dy*dy
        sumyz += dy*dz
        sumzz += dz*dz
    symmetricM = Matrix33(sumxx, sumxy, sumxz,
                          sumxy, sumyy, sumyz,
                          sumxz, sumyz, sumzz)
    return FindEigenvectorAssociatedWithSmallestEigenvalue(symmetricM)
Note that we've had to make two passes through the collection of points - once to find the center, and a second pass to form the matrix. In theory, we can compute both at the same time using a single pass through the points collection, as sketched below. In practice, the round-off error will obliterate any precision in our result. Still, it's useful to know it's possible if you're stuck on a geometry shader and only have one chance to see the incoming vertices.
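For the curious, here's what that single-pass version might look like - a sketch using the same hypothetical Vector3/Matrix33 helpers as above. The centered sums are recovered at the end as sum(x*y) - n*mean(x)*mean(y), and that subtraction of two large, nearly-equal numbers is exactly where the precision goes to die:

def FindNormalSinglePass(points):
    n = len(points)
    sx = sy = sz = 0.0
    sxx = sxy = sxz = syy = syz = szz = 0.0
    for p in points:
        sx += p.X; sy += p.Y; sz += p.Z
        sxx += p.X*p.X; sxy += p.X*p.Y; sxz += p.X*p.Z
        syy += p.Y*p.Y; syz += p.Y*p.Z
        szz += p.Z*p.Z
    cx, cy, cz = sx/n, sy/n, sz/n
    # recover the centered second moments from the raw sums
    symmetricM = Matrix33(sxx - n*cx*cx, sxy - n*cx*cy, sxz - n*cx*cz,
                          sxy - n*cx*cy, syy - n*cy*cy, syz - n*cy*cz,
                          sxz - n*cx*cz, syz - n*cy*cz, szz - n*cz*cz)
    return FindEigenvectorAssociatedWithSmallestEigenvalue(symmetricM)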
The smallest Eigenvalue
So now we just need to find an eigenvector associated with the smallest eigenvalue of a matrix.
If you recall, for a (square) matrix M, an eigenvalue, λ, and an eigenvector, v are given by:
Mv = λv
That is, the matrix M, operating on the vector v, simply scales the vector by λ.
Our 3x3 symmetric matrix will have 1, 2 or 3 distinct (real) eigenvalues (see appendix!), and we need to find the smallest one. But watch what happens if we multiply both sides of Mv = λv by M⁻¹ and divide by λ:
M⁻¹v = λ⁻¹v
We've just turned the smallest eigenvalue of M into the largest eigenvalue of M⁻¹!
In code :
def FindEigenvectorAssociatedWithSmallestEigenvalue(m):
    det = m.GetDeterminant()
    if det == 0:
        return m.GetNullSpace()
    mInverse = m.GetInverse()
    return FindEigenvectorAssociatedWithLargestEigenvalue(mInverse)
The largest Eigenvalue
Computing eigenvalues exactly can be a little tricky to code correctly, but luckily forming an approximation to the eigenvector associated with the largest eigenvalue is really easy: each multiplication by m amplifies the component of v along that eigenvector more than any other, so repeatedly multiplying and re-normalizing converges towards it:
def FindEigenvectorAssociatedWithLargestEigenvalue(m):
    v = Vector3(1, 1, 1)
    for _ in range(10000):
        v = (m * v).GetNormalized()
    return v
Success!
So there we have it, linear least squares in two passes! All that's left to do is write the code!
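In Python, gluing the sketches above together looks something like this (points being whatever collection of Vector3s you happen to have lying around):

center = FindCenter(points)
normal = FindNormal(points)
# each input point p now has signed distance (p - center).Dot(normal) to the fitted plane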
Appendix - Linear Least Squares plane for a Point Cloud in C++
The following 3 functions for linear least squares are hereby licensed under CC0.

// From http://missingbytes.blogspot.com/2012/06/fitting-plane-to-point-cloud.html
float FindLargestEntry(const Matrix33 &m){
    float result=0.0f;
    for(int i=0;i<3;i++){
        for(int j=0;j<3;j++){
            float entry=fabs(m.GetElement(i,j));
            result=std::max(entry,result);
        }
    }
    return result;
}
// From http://missingbytes.blogspot.com/2012/06/fitting-plane-to-point-cloud.html
// note: This function will perform badly if the largest eigenvalue is complex
Vector3 FindEigenVectorAssociatedWithLargestEigenValue(const Matrix33 &m){
    // pre-condition the matrix: rescale by the largest entry so the repeated
    // squaring below can't overflow, then square three times (m^8) so the
    // largest eigenvalue dominates sooner and the iteration converges faster
    float scale=FindLargestEntry(m);
    Matrix33 mc=m*(1.0f/scale);
    mc=mc*mc;
    mc=mc*mc;
    mc=mc*mc;
    Vector3 v(1,1,1);
    Vector3 lastV=v;
    for(int i=0;i<100;i++){
        v=(mc*v).GetUnit();
        if(DistanceSquared(v,lastV)<1e-16f){
            break;
        }
        lastV=v;
    }
    return v;
}
// From http://missingbytes.blogspot.com/2012/06/fitting-plane-to-point-cloud.html
void FindLLSQPlane(Vector3 *points,int count,Vector3 *destCenter,Vector3 *destNormal){
    assert(count>0);
    Vector3 sum(0,0,0);
    for(int i=0;i<count;i++){
        sum+=points[i];
    }
    Vector3 center=sum*(1.0f/count);
    if(destCenter){
        *destCenter=center;
    }
    if(!destNormal){
        return;
    }
    float sumXX=0.0f,sumXY=0.0f,sumXZ=0.0f;
    float sumYY=0.0f,sumYZ=0.0f;
    float sumZZ=0.0f;
    for(int i=0;i<count;i++){
        float diffX=points[i].X-center.X;
        float diffY=points[i].Y-center.Y;
        float diffZ=points[i].Z-center.Z;
        sumXX+=diffX*diffX;
        sumXY+=diffX*diffY;
        sumXZ+=diffX*diffZ;
        sumYY+=diffY*diffY;
        sumYZ+=diffY*diffZ;
        sumZZ+=diffZ*diffZ;
    }
    Matrix33 m(sumXX,sumXY,sumXZ,
               sumXY,sumYY,sumYZ,
               sumXZ,sumYZ,sumZZ);
    float det=m.GetDeterminant();
    if(det==0.0f){
        m.GetNullSpace(destNormal,NULL,NULL);
        return;
    }
    Matrix33 mInverse=m.GetInverse();
    *destNormal=FindEigenVectorAssociatedWithLargestEigenValue(mInverse);
}
Appendix - Citation Needed
Our 3x3 symmetric matrix is Hermitian, therefore all its eigenvalues are real and its Jordan Normal form looks like this:
[ λ1, 0, 0,
0, λ2, 0,
0, 0, λ3 ]
Assume w.l.o.g. that λ1 >= λ2 >= λ3.
If λ1 == λ2 == λ3, then we have only one distinct eigenvalue.
If λ1 > λ2 > λ3, then we have three distinct eigenvalues.
The only remaining cases, λ1 == λ2 > λ3 and λ1 > λ2 == λ3, both have two distinct eigenvalues.
Thanks for the great post Chris! This is a fascinating solution. I've never been able to really visualize a use for eigenvectors before. So this is actually pretty exciting for me.
I'm trying to gain some intuition about your "matrix of squared distances" (if I can call it that). I understand that it encodes the "directionality" and distance of the point field relative to the centroid/plane center.
So thinking in those terms, I can see how the basis vectors would grow outwards towards areas with a high density of distant points. That all makes sense.
I can see how the eigenvectors for this matrix would be perpendicular to the average plane of the points, and thus be related to the normal of the plane. So that's all good.
I do a get a bit lost in the details though. Can you speak to the pre-conditioning that you do to the matrix? It looks like you normalize it, and then square it a few times. Is this to keep it well away from floating point errors? If so, I'm having trouble understanding why this would have that effect...
Also, I'm unable to follow the inner loop inside FindEigenVectorWithBlahBlah(). You are successively transforming the unit X vector by the matrix until it converges on a constant value. So I understand that a vector that is unchanged by a transformation IS an eigenvector. I just don't understand how this function is guaranteed to find it. Why do you start with the X-axis? That seems arbitrary (I'm guessing it is). Can you point to some literature that explains this method of finding an eigenvector?
I've already learned a lot from this article. Thanks so much for putting it together!
Awesome!
I find it useful to think of the matrix as a quadratic form.
Just for simplicity, set C = [0, 0, 0]^T
Then by horribly abusing notation, M looks something like M = P * P^T
...and symmetry follows automatically.
So now the energy associated with a probe vector looks something like:
E(probe) = probe . M . probe^T <- this is the quadratic form, with M symmetric
In essence, M is a non-uniform scaling matrix, and the eigenvectors are the basis vectors along which it scales.
It's like 'M' picks out the natural basis for the point cloud.
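If it helps, here's a tiny sketch of that energy in code (assuming the post's Vector3/Matrix33 classes support Dot and matrix-times-vector, which the post doesn't show): a unit-length probe measures the spread of the centered points along its own direction, and the unit probe that minimizes it is the plane normal.

def SpreadAlong(m, probe):
    # probe . M . probe - the sum of squared projections of the
    # centered points onto 'probe'
    return probe.Dot(m * probe)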
As for the FindEigenVectorWithBlahBlah(), I'm using a power iteration: http://en.wikipedia.org/wiki/Power_iteration
Actually, that would make a great topic for next week's post!
Today I learned about the power iteration method. It finds the eigenvector whose eigenvalue has the greatest magnitude, which in this case is exactly what we're looking for! Very neat.
If you're taking requests, I'd love to see more articles exploring numerical optimization methods like this one. There's lots of math tutorials on the net, but I find it far easier to absorb when it's broken down into pseudocode as you have done here.
This Matrix33 class referred to in your code sample, is that available somewhere as well?
My apologies, Thomas, I've not yet published the source code for Vector3/Matrix33.
You could try the Ogre3D matrix class : http://www.ogre3d.org/docs/api/1.9/classOgre_1_1Matrix3.html
Alternatively, contact me privately and I'm sure we can come to an arrangement.
A nice piece of code and valuable answer to a question I was facing.
However, the cloud of points cannot be a set of points lying on a plane whose normal is oriented along one of the main axes. In such a situation, the matrix is singular.
Awesome!
Actually, I don't think a singular matrix is necessarily a problem - it simply corresponds to one (or more) of the eigenvalues being zero, which would actually accelerate the convergence.
Why not try it and see?
Now that you mention it, I think I can see a related bug in the code; if you find a good fix I'd love to improve it.
I'm having trouble implementing the "GetNullSpace" method. I can't find this in the matrix implementation of any library (Eigen, Ogre, Qt, IPP). What is it supposed to do?
Also, in which case is it used? What does it mean if the determinant is 0? One of my unit tests inputs only 3 coordinates and it ends up in that case. Will the determinant be 0 whenever all the points lie on the exact same plane? If it's the only case, I guess I can simply do a cross product between (v2-v1) and (v1-v0), with v0, v1 and v2 being 3 arbitrary coordinates in the input set.
Hi Samuel, and apologies for the late response.
So the nullspace of a 3x3 matrix, M, is the space of all vectors v where M.v = 0
For a given 3x3 matrix, there are four different possibilities.
case (1): v = {0,0,0} and nothing else.
case (2): v is any multiple of a single vector, e.g. {1, 2, 3}, {2, 4, 6}, {3, 6, 9}, etc.
case (3): v is any sum of multiples of two vectors. So for example if M is [{1,0,0},{0,0,0},{0,0,0}] then we can choose any y and z to make a vector {0,y,z}, and then M.{0,y,z} = 0
case (4): M is the null matrix [{0,0,0},{0,0,0},{0,0,0}], and multiplying it by *any* vector gives zero.
Remember that the nullspace of a matrix M is the set of all v such that M.v = 0? Well it turns out that if the determinant of the matrix is zero, then we must be in case (2), (3) or (4). And if the determinant is non-zero, then we must be in case (1).
So the procedure outlined above to find the plane only works if the matrix is in case (1). If we're in case (2), (3) or (4), then we can use any element of the nullspace as the normal vector for our plane.
To actually find the nullspace in a computer program, you can use "Gaussian Elimination", but a note of caution: the algorithm isn't very natural to express in computer programming languages, which is why it isn't often included in the public API.
(In a computer program, it's more natural to use the Singular Value Decomposition, and choose the singular vectors whose singular values are zero... but there the math doesn't flow as naturally.)
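For what it's worth, here's a short sketch of that SVD route using NumPy (not the Matrix33 class from the post): the right singular vector with the smallest singular value of the centered points is the fitted normal, and it lies in the nullspace of M when the points are exactly coplanar.

import numpy as np

def PlaneNormalViaSVD(points):
    # points: an (n, 3) array of positions
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    # rows of vt are the right singular vectors, sorted by decreasing singular value
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[-1]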