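import csv

import pandas as pd

# Read every row of the CSV file into a list of lists.
rows = []
with open('', 'r') as read:
    reader = csv.reader(read)
    for row in reader:
        rows.append(row)

# Build a DataFrame, then drop rows whose value in column 3 repeats.
df = pd.DataFrame(rows)
df.drop_duplicates(subset=3, inplace=True)

# Write the deduplicated rows back out.
df.to_csv('')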
The script first reads the rows into a list and converts the list to a DataFrame, then drops the duplicate rows and writes the result out.
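The same steps can be tried without any files. The following is a minimal sketch, assuming a small made-up CSV with four columns; the sample rows and the use of `io.StringIO` in place of the input and output files are illustrative assumptions, not part of the original script.

```python
import csv
import io

import pandas as pd

# Hypothetical sample data standing in for the input file. The first and
# third rows share the same value ("k1") in the fourth column (label 3).
sample = io.StringIO("a,1,x,k1\nb,2,y,k2\nc,3,z,k1\n")

# Read the rows into a list and convert the list to a DataFrame.
rows = list(csv.reader(sample))
df = pd.DataFrame(rows)

# Keep only the first row for each distinct value in column 3.
deduped = df.drop_duplicates(subset=3)

# Write the result to an in-memory buffer instead of a file.
out = io.StringIO()
deduped.to_csv(out, index=False, header=False)
result = out.getvalue()
print(result)
```

Here `index=False` and `header=False` keep the output in the same shape as the input; the original script omits them, so its output also contains the row index and the integer column labels.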
subset : column label or sequence of labels, optional
Only consider certain columns for identifying duplicates, by
default use all of the columns
keep : {'first', 'last', False}, default 'first'
- first : Drop duplicates except for the first occurrence.
- last : Drop duplicates except for the last occurrence.
- False : Drop all duplicates.
inplace : boolean, default False
Whether to drop duplicates in place or to return a copy
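To make the parameters concrete, here is a small sketch on a made-up DataFrame; the column names `key` and `value` and the data are assumptions chosen for illustration only.

```python
import pandas as pd

# Hypothetical data: the value "a" appears twice in the "key" column.
df = pd.DataFrame({"key": ["a", "b", "a", "c"], "value": [1, 2, 3, 4]})

# subset="key": only the "key" column is compared when looking for duplicates.
first = df.drop_duplicates(subset="key", keep="first")  # keeps the first "a"
last = df.drop_duplicates(subset="key", keep="last")    # keeps the last "a"
none = df.drop_duplicates(subset="key", keep=False)     # drops every "a"

first_idx = first.index.tolist()
last_idx = last.index.tolist()
none_idx = none.index.tolist()

# With inplace=True the call modifies df itself and returns None.
ret = df.drop_duplicates(subset="key", inplace=True)
```

With the default `inplace=False`, the original DataFrame is left unchanged and the deduplicated copy must be assigned to a variable, as in the first three calls.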