Python: optimizing pairwise overlaps between intervals
I have a lot of intervals (about 5k to 10k). Each element has a start and an end position, such as (203, 405). The coordinates of the intervals are stored in a list.

I want to determine the coordinates and length of the overlapping part of each pair of intervals. It can be done as follows:

    # A small list for clarity; the real length is usually around 5000
    cList = ((20, 54), (25, 48), (67, 133), (90, 152), (140, 211), (190, 230))
    for i, c1 in enumerate(cList[:-1]):  # linear pairwise combination
        for c2 in cList[i + 1:]:
            left = max(c1[0], c2[0])
            right = min(c1[1], c2[1])
            overlap = right - left
            if overlap > 0:
                print("left: %s, right: %s, length: %s" % (left, right, overlap))

Resulting in:

    left: 25, right: 48, length: 23
    left: 90, right: 133, length: 43
    left: 140, right: 152, length: 12
    left: 190, right: 211, length: 21

As can be seen, it works. But because it can take some time (20 seconds), my question is: how do I optimize it? I tried to cut the inner loop short when the start position of the second interval exceeds the end position of the first:

    if c1[1] < c2[0]:
        break

This reduces the running time, but the resulting number of overlaps is almost three times lower than before, so the result is definitely not valid. This is caused by elements that are much longer than the preceding elements. I'm sure there is some mathematical trick to solve this problem.

Given a function overlap(c1, c2) that returns the (left, right, length) tuple of the overlap, or None if the two intervals do not overlap, the algorithm you described can be written as:

    for i, c1 in enumerate(cList[:-1]):
        for c2 in cList[i + 1:]:
            o = overlap(c1, c2)
            if o is not None:
                print("left: %s, right: %s, length: %s" % o)

If you sort the elements, you can "short-circuit" as soon as you reach a non-overlapping segment, because you know that everything further down the list will be even further "away":

    l = sorted(cList)
    for i, c1 in enumerate(l[:-1]):
        for c2 in l[i + 1:]:
            o = overlap(c1, c2)
            if o is None:
                break
            print("left: %s, right: %s, length: %s" % o)

Of course, if your input is already sorted (as it seems), you can skip that step.

Note that, in general, instead of the double for loop you can use a much clearer pairwise-combination helper (itertools.combinations fits this description); it guarantees the same kind of ordering. Unfortunately, it is not suitable for the optimized version of the algorithm, but the version you wrote can be written with it.

Finally, if you ever want to perform fast generic operations on intervals, you can also consider using a dedicated data structure (an interval tree, for example). Implementations exist.
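The snippets above call overlap without showing its definition. Below is a minimal, self-contained sketch. It assumes overlap returns a (left, right, length) tuple for a positive-length overlap and None otherwise, which is inferred from how the result is tested and printed. The names all_overlaps_bruteforce and all_overlaps_sorted are hypothetical, introduced only for illustration. Using itertools.combinations for the unoptimized loop is likewise an assumption about which helper is meant.

```python
from itertools import combinations


def overlap(r1, r2):
    """Return (left, right, length) of the overlap, or None if there is none.

    Assumed helper: its definition is not shown in the post.
    """
    left = max(r1[0], r2[0])
    right = min(r1[1], r2[1])
    length = right - left
    return (left, right, length) if length > 0 else None


def all_overlaps_bruteforce(intervals):
    """Check every pair once; combinations() yields pairs in list order."""
    results = []
    for c1, c2 in combinations(intervals, 2):
        o = overlap(c1, c2)
        if o is not None:
            results.append(o)
    return results


def all_overlaps_sorted(intervals):
    """Sort by start, then stop scanning once c2 starts at or after c1's end."""
    ordered = sorted(intervals)
    results = []
    for i, c1 in enumerate(ordered[:-1]):
        for c2 in ordered[i + 1:]:
            if c2[0] >= c1[1]:
                # Later intervals start even further right, so none can overlap c1.
                break
            o = overlap(c1, c2)
            if o is not None:
                results.append(o)
    return results


cList = ((20, 54), (25, 48), (67, 133), (90, 152), (140, 211), (190, 230))
for left, right, length in all_overlaps_sorted(cList):
    print("left: %s, right: %s, length: %s" % (left, right, length))
```

On the sample list both functions find the same four overlaps as the output listed in the question. The sorted version is still quadratic in the worst case (for example, when every interval overlaps every other), but it skips most pairs when overlaps are sparse.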