How do I remove duplicate lines in Python and write the result to another file?
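I want to remove the duplicate lines from 1.txt and write the result to 2.txt. I have tried many approaches, but the duplicates are never removed completely.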
1.txt:
xh:['0001'],fenshu:['63'1]
xh:['0005'],fenshu:['63'5]
xh:['0002'],fenshu:['63'2]
xh:['0003'],fenshu:['63'3]
xh:['0004'],fenshu:['63'4]
xh:['0005'],fenshu:['63'5]
xh:['0002'],fenshu:['63'2]
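The problem is not described precisely.
It depends on whether the duplicate lines are consecutive or not.
If they are consecutive, you can cache a single line and compare each following line against it. If they are not, you have to cache every distinct line and compare each newly read line against all of the cached ones. The code for the consecutive case:
# coding=utf-8
# Drop consecutive duplicates: only the previously read line is cached.
# "w" truncates output.txt, so rerunning does not append to old results.
with open("input.txt", "r") as fin, open("output.txt", "w") as fout:
    bufferedline = None
    for line in fin:
        if line != bufferedline:
            fout.write(line)
            bufferedline = line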
If the duplicate lines are not consecutive, the code is:
# coding=utf-8
# Drop all duplicates: every distinct line seen so far is cached.
with open("input.txt", "r") as fin, open("output.txt", "w") as fout:
    bufferedlines = []
    for line in fin:
        # Linear scan of the cache; only unseen lines are kept and written.
        if line not in bufferedlines:
            bufferedlines.append(line)
            fout.write(line)
Use a regular expression, buddy.
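The answer does not say how a regular expression would be applied, so the following is only one possible reading, not necessarily what the poster had in mind. It uses a backreference to collapse runs of identical adjacent lines. The names `ADJACENT_DUPES` and `collapse_adjacent_duplicates` are made up for this sketch. Note that it only handles duplicates that sit next to each other, so on its own it would not clean the sample 1.txt, where the repeated lines are separated by other lines.

```python
import re

# A line, followed by one or more exact copies of itself on the next lines.
# re.MULTILINE makes ^ and $ match at every line boundary.
ADJACENT_DUPES = re.compile(r"^(.*)(?:\r?\n\1)+$", re.MULTILINE)


def collapse_adjacent_duplicates(text):
    """Replace each run of identical adjacent lines with a single copy."""
    return ADJACENT_DUPES.sub(r"\1", text)
```

For example, `collapse_adjacent_duplicates("a\na\nb\na\n")` gives `"a\nb\na\n"`: the adjacent pair is merged, but the final `a` survives because it is not next to the others.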
# Remember every line already written; set membership tests are fast.
lines_seen = set()
with open("1.txt", "r") as infile, open("2.txt", "w") as outfile:
    for line in infile:
        if line not in lines_seen:
            outfile.write(line)
            lines_seen.add(line)
Follow-up question:
Heh, that didn't work. The output was:
xh:['0001'],fenshu:['63'1]
xh:['0005'],fenshu:['63'5]
xh:['0002'],fenshu:['63'2]
xh:['0003'],fenshu:['63'3]
xh:['0004'],fenshu:['63'4]
xh:['0002'],fenshu:['63'2]
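Follow-up answer:
Your last line is not actually identical to the earlier one. Every earlier line ends with "\n", but the last line of the file does not. Because "\n" is displayed as a line break, the two lines only look the same.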
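One way to avoid the trailing-newline mismatch described above is to compare lines without their line terminators. This is a minimal sketch, assuming it is acceptable to write every kept line back with a uniform "\n". The helper names `dedupe_lines` and `dedupe_file` are hypothetical and not part of the original answer.

```python
def dedupe_lines(lines):
    """Yield each distinct line once, in first-seen order."""
    seen = set()
    for line in lines:
        # Compare the content only, so "abc\n" and a final "abc" count as equal.
        key = line.rstrip("\r\n")
        if key not in seen:
            seen.add(key)
            yield key + "\n"  # every kept line gets the same terminator


def dedupe_file(src_path, dst_path):
    """Copy src_path to dst_path, dropping repeated lines."""
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        dst.writelines(dedupe_lines(src))
```

Calling `dedupe_file("1.txt", "2.txt")` would then treat the last line of 1.txt as a duplicate even though it has no trailing newline.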
This answer was accepted by the asker.
# The built-in set replaces sets.Set (the sets module does not exist in Python 3).
s = set(["""xh:['0001'],fenshu:['63'1]""", """xh:['0005'],fenshu:['63'5]""", """xh:['0002'],fenshu:['63'2]""", """xh:['0003'],fenshu:['63'3]""", """xh:['0004'],fenshu:['63'4]""", """xh:['0005'],fenshu:['63'5]""", """xh:['0002'],fenshu:['63'2]"""])
for line in s:
    print(line)
Result (a set is unordered, so the order may differ from run to run):
xh:['0001'],fenshu:['63'1]
xh:['0005'],fenshu:['63'5]
xh:['0003'],fenshu:['63'3]
xh:['0002'],fenshu:['63'2]
xh:['0004'],fenshu:['63'4]
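The result above comes out in a different order from the input, because a set does not keep insertion order. If the original order matters, one option is `dict.fromkeys`, which keeps the first occurrence of each key in insertion order (guaranteed since Python 3.7). This is a sketch that swaps in a different technique from the answer's; `unique_in_order` is a name invented here.

```python
def unique_in_order(items):
    """Return the distinct items in first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(items))


records = [
    "xh:['0001'],fenshu:['63'1]",
    "xh:['0005'],fenshu:['63'5]",
    "xh:['0002'],fenshu:['63'2]",
    "xh:['0003'],fenshu:['63'3]",
    "xh:['0004'],fenshu:['63'4]",
    "xh:['0005'],fenshu:['63'5]",
    "xh:['0002'],fenshu:['63'2]",
]

for line in unique_in_order(records):
    print(line)
```

With the sample data this prints the records for 0001, 0005, 0002, 0003 and 0004, in that order, which matches their first appearance in 1.txt.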
Follow-up question:
If 1.txt is very large, won't the computer choke?
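Follow-up answer:
Try:
# Python 3 version: built-in set and open() replace sets.Set and file().
# Note: readlines() still loads the whole file into memory, and the set does not keep line order.
with open("1.txt", "r") as fin, open("2.txt", "w") as fout:
    fout.writelines(set(fin.readlines()))
If the file is extremely large, write it in C. C makes it easy to manage memory and to control the reading and writing of large files, and it is faster than Python.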
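Before switching languages, it may be worth trying a streaming approach in Python that never holds the whole file in memory. The sketch below reads one line at a time and remembers only a fixed-size digest of each distinct line instead of the line itself. It rests on several assumptions: the number of distinct lines still fits in memory as 16-byte digests, the lines are noticeably longer than a digest (otherwise little is saved), and the tiny chance of two different lines sharing a digest is acceptable. `dedupe_large_file` is a hypothetical helper name.

```python
import hashlib


def dedupe_large_file(src_path, dst_path):
    """Stream src_path to dst_path, keeping the first copy of each line."""
    seen_digests = set()
    # Binary mode avoids decoding costs and encoding errors on large inputs.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for raw in src:  # the file object yields one line at a time
            content = raw.rstrip(b"\r\n")
            # Store a 16-byte digest per distinct line, not the line itself.
            digest = hashlib.blake2b(content, digest_size=16).digest()
            if digest not in seen_digests:
                seen_digests.add(digest)
                dst.write(content + b"\n")
```

Usage would be `dedupe_large_file("1.txt", "2.txt")`. Memory use grows with the number of distinct lines rather than with the file size, and the output keeps the input order.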