with open('', 'r') as in_file, open('', 'w') as out_file:
    s = set()  # set for fast O(1) amortized lookup
    for line in in_file:
        if line in s: continue  # skip duplicate
        s.add(line)
        out_file.write(line)
There is a bug in this code. When the duplicate lines are lined up one after another, the last duplicate is not removed along with the others, so two copies are left in the output. (The bug only shows up occasionally, and an all-night debugging session will sort it out, just kidding.)
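One likely cause of this symptom, offered as an assumption rather than a confirmed diagnosis: if the last line of the input file has no trailing newline, it is read as 'x' while earlier copies are read as 'x\n', so the two strings do not compare equal and both are written. Below is a minimal sketch of a variant that compares lines with the newline stripped. The function name `dedupe_lines` and the in-memory sample data are hypothetical and used only for illustration.

```python
import io


def dedupe_lines(in_file, out_file):
    """Copy in_file to out_file, keeping only the first copy of each line."""
    seen = set()  # set for fast O(1) amortized lookup
    for line in in_file:
        # Compare lines without their newline, so an unterminated final
        # line still matches earlier copies that end in '\n'.
        key = line.rstrip('\n')
        if key in seen:
            continue  # skip duplicate
        seen.add(key)
        out_file.write(key + '\n')  # always write exactly one newline


# Hypothetical sample input: the final "b" has no trailing newline.
src = io.StringIO("a\nb\na\nb")
dst = io.StringIO()
dedupe_lines(src, dst)
result = dst.getvalue()
```

With real files, the same function can be called inside the original `with open(...) as in_file, open(...) as out_file:` block. Note that this sketch always ends the output with a newline, even when the input did not.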