The overlap of two chromosomes can be resolved by the four-point method provided that from these four points, a maximum area quadrilateral can be defined .
Previously that kind of quadrilateral was found from the 24 permutations of four points:
Maximal quadrilateral overlaid on the overlap of two chromosomes |
Previously that kind of quadrilateral was found from the 24 permutations of four points:
4!=24 quadrilaterals formed by 4 points chosen randomly. |
In these 24 permutations, some of them are derived from the others by circular permutations; these circular permutations define a necklace. The area of quadrilaterals obtained by circular permutation of a necklace, is invariant.
The maximal area quadrilateral can be found in the set of the 6 necklaces.
The maximal area quadrilateral can be found in the set of the 6 necklaces.
To find a necklace, let's start from an ordered list of four points :
A, B, C, D
and swap two elements as (A,B), we get a necklace:
B, A, C, D
From four randomly chosen point, the following python function yields the six possible necklaces :
For example:
When run in ipython cell on sagemathcloud, the unique_necklaces function applied on four number took:
Applied on four characters 'a', 'b', 'c', 'd' , the function yields if the original 4-tuple is included (remove # in line 10):
Then a maximal area quadrilateral can be found beyond the six necklaces. As the number of quadrilaterals explored is smaller, the new implementation is faster than the previous one based on brute force:
points=[(random.randint(0,20),random.randint(0,20)) for i in range(0,4)]
%timeit maxAreaQuad(*points)
%timeit maxAreaQuad_brutforce(*points)
It's funny to note that the area of the maximal area quadrilaterals seems to follow an asymetric distribution:
For example:
When run in ipython cell on sagemathcloud, the unique_necklaces function applied on four number took:
%timeit unique_necklaces(1,2,3,4)
100000 loops, best of 3: 15.2 µs per loop
print unique_necklaces('a','b','c','d')
[['a', 'b', 'c', 'd'],
['b', 'a', 'c', 'd'],
['c', 'b', 'a', 'd'],
['d', 'b', 'c', 'a'],
['a', 'c', 'b', 'd'],
['a', 'd', 'c', 'b'],
['a', 'b', 'd', 'c']]
points=[(random.randint(0,20),random.randint(0,20)) for i in range(0,4)]
%timeit maxAreaQuad(*points)
%timeit maxAreaQuad_brutforce(*points)
It's funny to note that the area of the maximal area quadrilaterals seems to follow an asymetric distribution: