Find In Python Combinations Of Mutually Exclusive Sets From A List's Elements
Solution 1:
I'd use a generator:
import itertools

def comb(seq):
    # go up to len(seq) inclusive so the full list is tested as well
    for n in range(1, len(seq) + 1):
        for c in itertools.combinations(seq, n):  # all combinations of length n
            if len(set.union(*map(set, c))) == sum(len(s) for s in c):  # pairwise disjoint?
                yield list(c)

for c in comb([[1, 2, 3], [3, 6, 8], [4, 9], [6, 11]]):
    print(c)
This produces:
[[1, 2, 3]]
[[3, 6, 8]]
[[4, 9]]
[[6, 11]]
[[1, 2, 3], [4, 9]]
[[1, 2, 3], [6, 11]]
[[3, 6, 8], [4, 9]]
[[4, 9], [6, 11]]
[[1, 2, 3], [4, 9], [6, 11]]
If you need to store the results in a single list:
print(list(comb([[1, 2, 3], [3, 6, 8], [4, 9], [6, 11]])))
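The disjointness test inside comb compares the size of the union with the sum of the individual sizes. Here is a minimal sketch of that check on its own. The helper name is_pairwise_disjoint is hypothetical and not part of the answer. It converts every list to a set first, on the assumption that a repeated element inside a single list should not count as an overlap:

```python
def is_pairwise_disjoint(seqs):
    """Return True if no element appears in more than one of the sequences."""
    sets = [set(s) for s in seqs]
    if not sets:
        return True  # an empty collection is trivially disjoint
    # If any element were shared, the union would be smaller than the summed sizes.
    return len(set.union(*sets)) == sum(len(s) for s in sets)


print(is_pairwise_disjoint([[1, 2, 3], [4, 9]]))     # True
print(is_pairwise_disjoint([[1, 2, 3], [3, 6, 8]]))  # False
```

Measuring the lengths of the sets rather than the lists is a design choice: with list lengths, an input such as [1, 1] would be rejected even when it overlaps with nothing else.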
Solution 2:
The following is a recursive generator:
def comb(input, lst = [], lset = set()):
    if lst:
        yield lst
    for i, el in enumerate(input):
        if lset.isdisjoint(el):
            for out in comb(input[i+1:], lst + [el], lset | set(el)):
                yield out

for c in comb([[1, 2, 3], [3, 6, 8], [4, 9], [6, 11]]):
    print(c)
This is likely to be much more efficient than the other solutions when many of the sets share elements, because a partial combination is only extended while it remains disjoint. In the worst case, of course, it still has to iterate over all 2**n elements of the powerset.
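To make the pruning visible, the sketch below counts how many candidates each approach looks at on the example input. The function names brute_force and pruned, and the counters, are illustrative assumptions and not part of either answer. pruned follows the same idea as the recursive generator but collects its results in a list.

```python
from itertools import combinations


def brute_force(seqs):
    """Test every non-empty combination; return (results, combinations examined)."""
    results, examined = [], 0
    for n in range(1, len(seqs) + 1):
        for c in combinations(seqs, n):
            examined += 1
            sets = [set(s) for s in c]
            if len(set.union(*sets)) == sum(len(s) for s in sets):
                results.append(list(c))
    return results, examined


def pruned(seqs):
    """Extend only combinations that are still disjoint; return (results, extensions tried)."""
    results, tried = [], 0

    def walk(rest, chosen, used):
        nonlocal tried
        for i, el in enumerate(rest):
            tried += 1
            if used.isdisjoint(el):
                results.append(chosen + [el])
                walk(rest[i + 1:], chosen + [el], used | set(el))

    walk(seqs, [], set())
    return results, tried


data = [[1, 2, 3], [3, 6, 8], [4, 9], [6, 11]]
brute_results, examined = brute_force(data)
pruned_results, tried = pruned(data)
print(examined, tried)                                  # 15 12
print(sorted(brute_results) == sorted(pruned_results))  # True
```

On an input this small the difference is minor. The gap grows as more sets overlap, since every rejected extension removes a whole subtree of combinations.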
Solution 3:
The method used in the program below resembles a couple of the previous answers in that it excludes non-disjoint sets and therefore usually does not test all combinations. It differs from them by greedily excluding all the sets it can, as early as it can. This allows it to run several times faster than NPE's solution. Here is a time comparison of the two methods, using input data with 200, 400, ..., 1000 sets of size 6 whose elements are in the range 0 to 19 (topnum = 20):
Set size =   6,  Number max =  20   NPE method
  0.042s  Sizes: [200, 1534, 67]
  0.281s  Sizes: [400, 6257, 618]
  0.890s  Sizes: [600, 13908, 2043]
  2.097s  Sizes: [800, 24589, 4620]
  4.387s  Sizes: [1000, 39035, 9689]
Set size =   6,  Number max =  20   jwpat7 method
  0.041s  Sizes: [200, 1534, 67]
  0.077s  Sizes: [400, 6257, 618]
  0.167s  Sizes: [600, 13908, 2043]
  0.330s  Sizes: [800, 24589, 4620]
  0.590s  Sizes: [1000, 39035, 9689]
In the above data, the left column shows execution time in seconds. The lists of numbers show how many single, double, or triple unions occurred. Constants in the program specify data set sizes and characteristics.
#!/usr/bin/python
from random import sample, seed
import time

nsets,  ndelta,  ncount,  setsize  = 200, 200, 5, 6
topnum, ranSeed, shoSets, shoUnion = 20, 1234, 0, 0
seed(ranSeed)
print('Set size = {:3d},  Number max = {:3d}'.format(setsize, topnum))

for casenumber in range(ncount):
    t0 = time.time()
    sets, sizes, ssum = [], [0]*nsets, [0]*(nsets+1)
    for i in range(nsets):
        sets.append(set(sample(range(topnum), setsize)))

    if shoSets:
        print('sets = {},  setSize = {},  top# = {},  seed = {}'.format(
            nsets, setsize, topnum, ranSeed))
        print('Sets:')
        for s in sets:
            print(s)

    # Method by jwpat7
    def accrue(u, bset, csets):
        for i, c in enumerate(csets):
            y = u + [c]
            yield y
            boc = bset|c
            ts = [s for s in csets[i+1:] if boc.isdisjoint(s)]
            for v in accrue (y, boc, ts):
                yield v

    # Method by NPE
    def comb(input, lst = [], lset = set()):
        if lst:
            yield lst
        for i, el in enumerate(input):
            if lset.isdisjoint(el):
                for out in comb(input[i+1:], lst + [el], lset | set(el)):
                    yield out

    # Uncomment one of the following 2 lines to select method
    #for u in comb (sets):
    for u in accrue ([], set(), sets):
        sizes[len(u)-1] += 1
        if shoUnion:
            print(u)

    t1 = time.time()
    # suffix sums, so that trailing all-zero sizes are dropped from the report
    for t in range(nsets-1, -1, -1):
        ssum[t] = sizes[t] + ssum[t+1]
    print('{:7.3f}s  Sizes: {}'.format(t1-t0, [s for (s,t) in zip(sizes, ssum) if t>0]))
    nsets += ndelta
Edit: In function accrue, the arguments (u, bset, csets) are used as follows:
• u = list of sets in current union of sets
• bset = "big set" = flat value of u = elements already used
• csets = candidate sets = list of sets eligible to be included
Note that if the first line of accrue is replaced by
def accrue(csets, u=[], bset=set()):
and the seventh line by
    for v in accrue (ts, y, boc):
(i.e., if the parameters are re-ordered and defaults are given for u and bset), then accrue can be invoked via list(accrue(listofsets)) to produce its list of compatible unions.
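Putting those two replacements together gives the following standalone sketch. The sample data is taken from the question. The name listofsets comes from the note above, and converting the inner lists to sets is an assumption required by the bset | c step:

```python
def accrue(csets, u=[], bset=set()):
    # The defaults are never mutated (only u + [c] and bset | c build new
    # objects), so sharing them between calls is harmless here.
    for i, c in enumerate(csets):
        y = u + [c]
        yield y
        boc = bset | c
        # keep only the later candidates that are still compatible
        ts = [s for s in csets[i + 1:] if boc.isdisjoint(s)]
        for v in accrue(ts, y, boc):
            yield v


listofsets = [set(s) for s in [[1, 2, 3], [3, 6, 8], [4, 9], [6, 11]]]
unions = list(accrue(listofsets))
for union in unions:
    print(union)
```

Because the candidate list is filtered before each recursive call, every list that accrue receives already contains only sets that can legally be added.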
Regarding the "ValueError: zero length field name in format" error mentioned in a comment as occurring when using Python 2.6, try the following.
# change:
print("Set size = {:3d}, Number max = {:3d}".format(setsize, topnum))
# to:
print("Set size = {0:3d}, Number max = {1:3d}".format(setsize, topnum))
Similar changes (adding appropriate field numbers) may be needed in the other format strings in the program. Note that the "What's New in Python 2.6" page says "Support for the str.format() method has been backported to Python 2.6". While it does not say whether field names or numbers are required, it does not show examples without them. By contrast, either way works in 2.7.3.
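As a quick check that the explicit field numbers do not change the output, here is a small sketch for an interpreter that accepts both spellings (2.7 or later). The variable names auto_numbered and explicit are illustrative only:

```python
setsize, topnum = 6, 20

# Auto-numbered fields: rejected by Python 2.6, accepted from 2.7 on.
auto_numbered = "Set size = {:3d}, Number max = {:3d}".format(setsize, topnum)

# Explicit field numbers: accepted by 2.6 as well.
explicit = "Set size = {0:3d}, Number max = {1:3d}".format(setsize, topnum)

print(explicit)
print(auto_numbered == explicit)  # True
```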
Solution 4:
Using itertools.combinations, set.intersection and a for-else loop:
from itertools import combinations

lis = [[1, 2, 3], [3, 6, 8], [4, 9], [6, 11]]

def func(lis):
    for i in range(1, len(lis) + 1):
        for x in combinations(lis, i):
            s = set(x[0])
            for y in x[1:]:
                if len(s & set(y)) != 0:
                    break          # overlap found: reject this combination
                else:
                    s.update(y)
            else:
                yield x            # inner loop finished without a break

for item in func(lis):
    print(item)
output:
([1, 2, 3],)
([3, 6, 8],)
([4, 9],)
([6, 11],)
([1, 2, 3], [4, 9])
([1, 2, 3], [6, 11])
([3, 6, 8], [4, 9])
([4, 9], [6, 11])
([1, 2, 3], [4, 9], [6, 11])
Solution 5:
Similar to NPE's solution, but it's without recursion and it returns a list:
def disjoint_combinations(seqs):
    disjoint = []
    for seq in seqs:
        disjoint.extend([(each + [seq], items.union(seq))
                         for each, items in disjoint
                         if items.isdisjoint(seq)])
        disjoint.append(([seq], set(seq)))
    return [each for each, _ in disjoint]

for each in disjoint_combinations([[1, 2, 3], [3, 6, 8], [4, 9], [6, 11]]):
    print(each)
Result:
[[1, 2, 3]]
[[3, 6, 8]]
[[1, 2, 3], [4, 9]]
[[3, 6, 8], [4, 9]]
[[4, 9]]
[[1, 2, 3], [6, 11]]
[[1, 2, 3], [4, 9], [6, 11]]
[[4, 9], [6, 11]]
[[6, 11]]