## Removig Points That Are Close to Each Other, From a Point Matrix

### Removig Points That Are Close to Each Other, From a Point Matrix

Hi All,

I have a matrix of points in x,y coordinate, and I need to eliminate
the points that are closer than a threshold value to each other. i.e
if my matrix is like:
A= [2,2; 2,2.1; 5,5; 5.4,5; 10,4; 13,7] (each row representing a
point) and my threshold value is .5, then I need to eliminate ((2,2)
or (2,2.1)) and ((5,5) or (5.4,5)), so that no point in the matrix are
closer than .5 to each other.
The code I wrote for this problem is an unefficient and slow one and I
would be grateful for any idea for a better performance.

Here is my code:

%points matrix contains point coordinates
pointDist = pdist(points);
iPoint = 1;
i=0;
while (i+1<=size(points,1)-1)
i = i + 1;
j = 1;
while (j<=size(points,1)-i)
if (pointDist(iPoint)<threshold)
points(i+j,:) = [];
pointDist = pdist(points);
i = 0;
iPoint = 1;
break;
end
iPoint = iPoint + 1;
j = j+1;
end
end

### Removig Points That Are Close to Each Other, From a Point Matrix

There's no reason to call pdist inside the loop. When
you call it the first time, you've already calculated the distance
between all pairs that exist in your set of points.

That's the main reason this loop takes so much time.

Instead of removing your points one at a time, I would
use the loop to build up a set of indexes corresponding
to the rows you want to remove, based on your precalculated
pointDist. Then use that list at the end to remove all the
rows at once.

For instance, suppose "remove_rows" is a list of
rows to remove. Then

points(remove_rows, :) = [];

will remove them all at once.

- Randy

### Removig Points That Are Close to Each Other, From a Point Matrix

I wonder: what you want to remove of you have a cluster of
points that are close to each other?

P1=(0,0)
P2=(0,0.4)
P3=(0.0.8)
...

- Do you keep P1 and P3? or
- Do you keep only one of the three, e.g. P1?

Otherwise, you can use delaunay triangulation, and check
only on distances between two vertexes of the same triangle.
This would accelerate the calculation.

And finally avoid to use loop in MATLAB if you could.

Bruno

### Removig Points That Are Close to Each Other, From a Point Matrix

Thanks a lot to Randy and Bruno for quick responses.

Randy, the method you talked about was the first method I tried to
use, but since I couldn't find an algorithm to do that, I used that
simple but inefficient method in my first message, just to achieve a
result. "Building up a set of indexes corresponding to the rows I want
to remove" as you said, is my main problem, and I wonder if anybody
knows a built-in function that would help with solving my problem.

Bruno, the goal is to have a cluster, with no point in the cluster,
closer than the threshold value to each other, so in your example any
2 of the 3 points must be removed.
I will work on delaunay triangulation, and post the result if I can
find a solution, thanks for the suggestion.

### Removig Points That Are Close to Each Other, From a Point Matrix

You have a set of pairs. There is a distance associated
with each pair. You search for all the pairs whose distance
is less than that value.

Where does the difficulty arise?

Perhaps the difficulty is in this clustering part of the
algorithm?

Other than that, I can't see why you abandoned your
first approach. You have a vector of distances, and each
element of that vector corresponds to a particular (i,j).
What stumbling blocks did you run into in an algorithm
that said:

for each distance
if distance < 0.5, mark something for removal
end

- Randy

### Removig Points That Are Close to Each Other, From a Point Matrix

OK, the problem is very easily fixed. The problem was, it was not very
easy to work on the pointDist vector. But when I turn that vector into
a square matrix with squareform function, close points are easily
removed by the following function:

function [a] = distance(a)

aDist = squareform(pdist(a));
i = 1;
while (i<size(aDist,1))
j=i+1;
while(j<=size(aDist,1))
if(aDist(i,j)<.5)
a(j,:)=[];
aDist(:,j) = [];
aDist(j,:) = [];
end
j = j+1;
end
i=i+1;
end