Exporting a pandas DataFrame to Excel raises TypeError

I am trying to export a DataFrame from pandas to Excel like this:

    writer = pd.io.excel.ExcelWriter(args.out_file, engine='xlsxwriter', options={'constant_memory': True})
    summary_data.to_excel(writer, sheet_name='summary', na_rep='NA', index=False)

But I get this message:

    "cannot convert the series to {0}".format(str(converter)))
    TypeError: cannot convert the series to <type 'float'>

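As far as I can tell there is nothing wrong with my DataFrame, so I am a bit puzzled by this error message. The export works when the DataFrame contains fewer than 1000 rows, but as soon as it gets bigger, this error occurs.

Any ideas?

Thanks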

Update: summary_data.info()

    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 2176 entries, 0 to 2175
    Data columns (total 27 columns):
    chrom                                    2176 non-null object
    coord                                    2176 non-null int64
    ref_base                                 2176 non-null object
    var_base                                 2176 non-null object
    normal_ref_counts                        2176 non-null int64
    normal_var_counts                        2176 non-null int64
    VOA867-A1_S43_merged_ref_counts          2176 non-null object
    VOA867-A1_S43_merged_var_counts          2176 non-null object
    VOA867-A1_S43_merged_somatic_status      2176 non-null object
    VOA867-E02_S73_merged_ref_counts         2176 non-null object
    VOA867-E02_S73_merged_var_counts         2176 non-null object
    VOA867-E02_S73_merged_somatic_status     2176 non-null object
    VOA867-F03_S76_merged_ref_counts         2176 non-null object
    VOA867-F03_S76_merged_var_counts         2176 non-null object
    VOA867-F03_S76_merged_somatic_status     2176 non-null object
    VOA867-F04_S75_merged_ref_counts         2176 non-null object
    VOA867-F04_S75_merged_var_counts         2176 non-null object
    VOA867-F04_S75_merged_somatic_status     2176 non-null object
    VOA867-F09_S74_merged_ref_counts         2176 non-null object
    VOA867-F09_S74_merged_var_counts         2176 non-null object
    VOA867-F09_S74_merged_somatic_status     2176 non-null object
    VOA867-T_S41_merged_ref_counts           2176 non-null object
    VOA867-T_S41_merged_var_counts           2176 non-null object
    VOA867-T_S41_merged_somatic_status       2176 non-null object
    VOA867xeno_S18_merged_ref_counts         2176 non-null object
    VOA867xeno_S18_merged_var_counts         2176 non-null object
    VOA867xeno_S18_merged_somatic_status     2176 non-null object
    dtypes: int64(3), object(24)None

Here is the function that generates it:

    def get_summary_data(data, normal_sample):
        summary_data = []

        for index, normal_row in data[normal_sample].iterrows():
            out_row = {'chrom': index[0],
                       'coord': index[1],
                       'ref_base': normal_row['ref_base'],
                       'var_base': normal_row['var_base'],
                       'normal_ref_counts': normal_row['ref_counts'],
                       'normal_var_counts': normal_row['var_counts'],
                       }

            normal_variant_status = normal_row['variant_status']
            normal_depth = out_row['normal_ref_counts'] + out_row['normal_var_counts']

            if normal_depth > 0:
                normal_var_freq = out_row['normal_var_counts'] / normal_depth
            else:
                normal_var_freq = 0

            for sample in data:
                if sample == normal_sample:
                    continue

                sample_row = data[sample].ix[[index]]
                out_row['{0}_ref_counts'.format(sample)] = sample_row['ref_counts']
                out_row['{0}_var_counts'.format(sample)] = sample_row['var_counts']

                sample_variant_status = str(sample_row['variant_status'].iget(0))
                sample_somatic_status = call_somatic_status(normal_variant_status,
                                                            sample_variant_status,
                                                            normal_var_freq,
                                                            args.min_normal_germline_var_freq)
                out_row['{0}_somatic_status'.format(sample)] = sample_somatic_status

            summary_data.append(out_row)

        columns = ['chrom', 'coord', 'ref_base', 'var_base', 'normal_ref_counts', 'normal_var_counts']

        for sample in data:
            if sample == normal_sample:
                continue

            columns.append('{0}_ref_counts'.format(sample))
            columns.append('{0}_var_counts'.format(sample))
            columns.append('{0}_somatic_status'.format(sample))

        summary_data = pd.DataFrame(summary_data, columns=columns)

        return summary_data

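The counts should be ints, but I can see that they are treated as strings (object dtype) here, possibly because they are extracted from another DataFrame?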
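One plausible explanation, which the thread does not confirm, is that the per-sample cells hold whole Series rather than numbers: selecting with a list of labels returns a DataFrame, so a column taken from it is a Series. The sketch below uses a made-up table with a flat string index and a repeated label (both are assumptions, as is the use of `.loc` in place of the older `.ix`) to show the difference and the resulting float conversion failure.

```python
import pandas as pd

# Hypothetical per-sample table; the flat string index with a repeated
# label is an assumption made to keep the sketch short.
sample = pd.DataFrame(
    {"ref_counts": [10, 7, 3], "var_counts": [1, 0, 2]},
    index=["chr1:100", "chr1:200", "chr1:200"],
)

# A list of labels returns a DataFrame, so a column of it is a Series.
cell = sample.loc[["chr1:200"]]["ref_counts"]

# A scalar label that occurs once returns a plain number instead.
scalar = sample.loc["chr1:100", "ref_counts"]

# Converting a multi-element Series to float fails with a TypeError.
try:
    float(cell)
    error = None
except TypeError as exc:
    error = exc
```

If this is what happens, it would also fit the observation that small inputs work: the failure would only appear once some lookup matches more than one row.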

As far as I can tell, .to_excel only accepts columns of type object. A quick way around the problem is to force all columns to object type before writing:

    summary_data = summary_data.astype(object)

Then you can write it without crashing:

    summary_data.to_excel(writer, sheet_name='summary', na_rep='NA', index=False)
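To see what the `astype(object)` cast actually changes, here is a small sketch on a made-up two-column frame (the data is an assumption; only the column names echo the question). It only relabels the dtype and leaves the values as they are.

```python
import pandas as pd

# Made-up two-column frame; the column names echo the question.
df = pd.DataFrame({"coord": [100, 200], "chrom": ["chr1", "chr2"]})

as_object = df.astype(object)

# The integer column is relabelled as object; its values are untouched.
dtype_before = str(df["coord"].dtype)
dtype_after = str(as_object["coord"].dtype)
values_unchanged = as_object["coord"].tolist() == df["coord"].tolist()
```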

There is some munging to do here, because in some cases I had to copy the columns as object type. Weird. Another option is to simply drop the offending columns.
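A further possibility, not proposed in the thread and offered only as a sketch, is to make sure every cell is a plain scalar before the DataFrame is built. The helper `first_scalar` and the sample rows below are hypothetical. Taking the first element silently discards any extra matches, which may or may not be what you want.

```python
import pandas as pd


def first_scalar(value):
    """Hypothetical helper: unwrap a Series to its first element."""
    if isinstance(value, pd.Series):
        # Assumption: the first match is the one to keep; empty becomes None.
        return value.iloc[0] if len(value) else None
    return value


# Made-up rows shaped like the dicts built in get_summary_data.
rows = [
    {"chrom": "chr1", "ref_counts": pd.Series([10])},
    {"chrom": "chr2", "ref_counts": pd.Series([7, 3])},
]

# Unwrap every cell, then build the frame from plain values.
clean_rows = [{key: first_scalar(val) for key, val in row.items()} for row in rows]
summary = pd.DataFrame(clean_rows, columns=["chrom", "ref_counts"])
```

With scalars in every cell, the count column is inferred as an integer column instead of object.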