Encoding error – xlsxwriter – Python

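I am trying to split lines from a file and put them into an Excel file (xlsx). According to PSPad, the file's encoding is 'cp1250'. So, to get the correct characters in the xlsx file, I decode each line from cp1250: line = line.decode("cp1250")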

The problem is that about 3,000 of the 12,000 lines return this error:

 'charmap' codec can't decode byte 0x81 in position 25: character maps to <undefined> 

So the next thing I tried was decoding as "UTF-8", and I don't know why, but it works better. Only 330 lines return an error:

 'utf8' codec can't decode byte 0x8e in position 0: invalid start byte 

Do you have any idea what I am doing wrong?

Edit: the error mostly occurs when the line contains 'Ž' or 'Š'.
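A quick byte-level check may help explain both messages. The sketch below uses only the standard codecs; `can_decode` is a hypothetical helper introduced for illustration. It suggests that 'Ž' and 'Š' stored as cp1250 are single bytes that UTF-8 rejects, while byte 0x81, which cp1250 leaves unassigned, turns up inside ordinary UTF-8 sequences. That would be consistent with a file containing a mix of both encodings, though this is only a guess from the two error messages.

```python
# -*- coding: utf-8 -*-
# In cp1250 the two letters are single bytes in the 0x80-0x8F range.
s_caron = u"\u0160".encode("cp1250")  # Š
z_caron = u"\u017d".encode("cp1250")  # Ž


def can_decode(raw, encoding):
    """Return True if the raw bytes decode cleanly with the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# A lone byte in the 0x80-0xBF range is a UTF-8 continuation byte,
# so it cannot start a character ("invalid start byte").
cp1250_bytes_fail_as_utf8 = not can_decode(z_caron, "utf-8")

# Byte 0x81 is unassigned in cp1250 but occurs inside UTF-8 sequences,
# for example U+00C1 (Á) is encoded as the two bytes C3 81.
a_acute_utf8 = u"\u00c1".encode("utf-8")
utf8_bytes_fail_as_cp1250 = not can_decode(a_acute_utf8, "cp1250")
```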

Here is the code (at the top of the .py file I have put "# -*- coding: utf-8 -*-"):

    import xlsxwriter

    def toXls(file):
        workbook = xlsxwriter.Workbook(file)
        worksheet = workbook.add_worksheet()
        a = 0  # last worksheet row written
        with open("filtrovane.txt") as f:
            x = 0  # number of lines that raised an error
            for line in f:
                try:
                    # It should be "cp1250" according to the PSPad editor
                    line = line[:-1].decode("utf-8")
                    # line = line.encode("ISO 8859-2")
                    splitted = line.split("::")
                    if len(splitted) == 7:
                        try:
                            a = a + 1
                            worksheet.write(a, 0, splitted[0])
                            worksheet.write(a, 1, splitted[1])
                            worksheet.write(a, 2, splitted[2])
                            worksheet.write(a, 3, splitted[3])
                            worksheet.write(a, 4, splitted[4])
                            worksheet.write(a, 5, splitted[5])
                            worksheet.write(a, 6, splitted[6])
                        except Exception as e:
                            # a is an int and e an exception, so convert both before joining
                            print "!!" + line + " " + str(a) + " " + str(e)
                except Exception as e:
                    print e
                    x = x + 1
        print x
        workbook.close()

The XlsxWriter docs / repo contain two examples that show how to read UTF-8 and Shift JIS files and convert them to xlsx files.

The same approach should work for cp1250.
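If the file turns out to mix encodings, which the two error messages hint at but do not prove, one option is to read it in binary mode and decode each line with a fallback. The following is only a sketch under that assumption, not the method used in the XlsxWriter examples. `decode_line`, `read_rows` and `to_xlsx` are hypothetical names, and the "::" separator and the 7-field check are taken from the question's code.

```python
import io


def decode_line(raw, encodings=("utf-8", "cp1250")):
    """Try each encoding in order and return the first clean decode."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Nothing decoded cleanly: keep the line, marking the bad bytes.
    return raw.decode(encodings[-1], "replace")


def read_rows(path, sep=u"::", width=7):
    """Read the file as bytes and return the lines that split into `width` fields."""
    rows = []
    with io.open(path, "rb") as f:
        for raw in f:
            fields = decode_line(raw.rstrip(b"\r\n")).split(sep)
            if len(fields) == width:
                rows.append(fields)
    return rows


def to_xlsx(rows, out_path):
    """Write each row of text fields to one worksheet row."""
    # Third-party import kept local so the parsing helpers work without it.
    import xlsxwriter

    workbook = xlsxwriter.Workbook(out_path)
    worksheet = workbook.add_worksheet()
    for row_index, fields in enumerate(rows):
        worksheet.write_row(row_index, 0, fields)
    workbook.close()
```

A call such as `to_xlsx(read_rows("filtrovane.txt"), "out.xlsx")` would then write the rows. UTF-8 is tried first because cp1250 text containing accented letters is rarely valid UTF-8, so a wrong match is unlikely. If the file is known to use a single encoding, opening it with `io.open(path, encoding="cp1250")` is simpler.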