Updating an Excel spreadsheet with Python

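I am tracking a fairly large inventory database for a variety of instruments, and I need a better way to update the inventory system. The system consists of many spreadsheets, essentially one per instrument. The main way I have organized things is by instrument and part number. So far I have a script, using the pandas package, that takes a spreadsheet and a master file covering two categories, part number and instrument reference, and updates the master file by dropping duplicates. For example, if I have four 5-ohm resistors and that count is updated to seven 5-ohm resistors, I run the program and the master is updated with the new value of 7.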

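What I need to do now is remove omitted entries entirely. In other words, if I go from four 5-ohm resistors to zero, meaning there is no entry at all, I need a way to edit the master file and remove that entry completely. I would also like a way to update the master from x files supplied by the user, rather than one at a time. I am not sure I am good enough at Python or pandas to do this, so I am asking on Stack Overflow!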

Any comments or suggestions are appreciated! Here is the program so far:

    import os
    import subprocess
    import sys

    import pandas as pd


    # CSV IMPORT DEFINED FUNCTION
    def csvImport(ftype, fpath):
        try:
            if ftype == 1:
                masterdata = pd.read_csv(fpath)
                return masterdata
            if ftype == 2:
                updateddata = pd.read_csv(fpath)
                # Record which file each updated row came from
                updateddata['originfile'] = os.path.basename(fpath)
                return updateddata
        except Exception as e:
            print("\nUnable to import CSV file. Error {}".format(e))
            sys.exit(1)


    # EXCEL IMPORT DEFINED FUNCTION
    def xlImport(ftype, fpath):
        try:
            if ftype == 1:
                masterdata = pd.read_excel(fpath, sheet_name=0)
                return masterdata
            if ftype == 2:
                updateddata = pd.read_excel(fpath, sheet_name=0)
                # Same column name as in csvImport (was misspelled 'orginfile')
                updateddata['originfile'] = os.path.basename(fpath)
                return updateddata
        except Exception as e:
            print("\nUnable to import Excel file. Error {}".format(e))
            sys.exit(1)


    # MASTER FILE USER INPUT DEFINED FUNCTION
    def masterfile():
        while True:
            path = input("Enter the path to the master file: ")
            if path.endswith(".csv"):
                return csvImport(1, path)
            elif path.endswith(".xlsx"):
                return xlImport(1, path)
            else:
                print("\nPlease enter a proper CSV or xlsx file.")


    # UPDATED FILE USER INPUT DEFINED FUNCTION
    def updatefile():
        while True:
            path = input("\nEnter the path to the updated file: ")
            if path.endswith(".csv"):
                return csvImport(2, path)
            elif path.endswith(".xlsx"):
                return xlImport(2, path)
            else:
                print("\nPlease enter a proper CSV or xlsx file.")


    # CALLING OPENING FUNCTIONS
    masterdata = masterfile()
    updateddata = updatefile()

    # CONCATENATING DATA FRAMES
    combineddata = pd.concat([updateddata, masterdata])

    # REMOVING DUPLICATES
    finaldata = combineddata.drop_duplicates(['Item'])

    # SETTING FINAL PATH BY USER INPUT
    while True:
        # Raw string so the backslashes in the example path are not escapes
        final = input("\nWhere do you want the file, and what do you want to name it? "
                      r"(eg, C:\path_to_file\name_of_file.xlsx): ")
        if final.endswith(".xlsx"):
            break
        else:
            print("\nPlease enter a proper Excel file in xlsx format.")

    # OUTPUTTING DATA FRAME TO FILE
    finaldata.to_excel(final)
    print("\nSuccessfully outputted appended data frame to Excel!")

    # OPENING OUTPUTTED FILE
    # (NOTE: PYTHON STILL RUNS UNTIL SPREADSHEET IS CLOSED)
    subprocess.call(final, shell=True)
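To make the update step in the script concrete, here is a minimal sketch of the concat-then-deduplicate idea on made-up data. The `Quantity` column, the sample rows, and the helper name `apply_update` are assumptions for illustration; only the `Item` column appears in the script.

```python
import pandas as pd


def apply_update(master, update, key="Item"):
    """Return the master rows, replaced by update rows that share the same key."""
    # The update goes first: drop_duplicates keeps the first occurrence
    # of each key by default, so updated rows win over master rows.
    combined = pd.concat([update, master], ignore_index=True)
    return combined.drop_duplicates(subset=[key]).reset_index(drop=True)


# Hypothetical sample data (the 'Quantity' column is an assumption).
master = pd.DataFrame({"Item": ["5 ohm resistor", "10 uF capacitor"],
                       "Quantity": [4, 12]})
update = pd.DataFrame({"Item": ["5 ohm resistor"], "Quantity": [7]})

result = apply_update(master, update)
print(result)
```

The order of the frames passed to `pd.concat` matters here: with the master first, the old count would be kept instead of the new one.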

An idea for your first question; this behaves like a SQL SELECT statement:

    nozeros_finaldata = finaldata[finaldata['ColumnName'] != 0]

Replace 'ColumnName' with the name of the column whose value went from 4 to 0; the expression returns a new DataFrame. Then write it out with nozeros_finaldata.to_excel(final).
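As a small illustration of that filter, the sketch below runs it on made-up data. The `Quantity` column name and the helper `drop_zero_rows` are assumptions, not names from the question.

```python
import pandas as pd


def drop_zero_rows(df, column):
    """Return a new DataFrame without the rows where df[column] equals 0."""
    # Boolean mask: True for rows to keep. Rows with a missing value (NaN)
    # in the column compare as != 0, so they are kept as well.
    mask = df[column] != 0
    return df[mask].reset_index(drop=True)


# Hypothetical data: the resistor count has been updated to zero.
finaldata = pd.DataFrame({"Item": ["5 ohm resistor", "10 uF capacitor"],
                          "Quantity": [0, 12]})

nozeros_finaldata = drop_zero_rows(finaldata, "Quantity")
print(nozeros_finaldata)
```

This only helps when the update file still lists the part with a count of 0; a part that is missing from the update file altogether is not touched by this filter.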

For the second question: you can use a while loop and ask the user whether there are more files.

    # CALLING OPENING FUNCTIONS
    more_files = True
    while more_files:
        masterdata = masterfile()
        updateddata = updatefile()

        # CONCATENATING DATA FRAMES
        combineddata = pd.concat([updateddata, masterdata])

        # REMOVING DUPLICATES
        finaldata = combineddata.drop_duplicates(['Item'])
        # Drop rows that have no 'originfile' value
        finaldata = finaldata.dropna(subset=['originfile'])

        # SETTING FINAL PATH BY USER INPUT
        while True:
            # Raw string so the backslashes in the example path are not escapes
            final = input("\nWhere do you want the file, and what do you want to name it? "
                          r"(eg, C:\path_to_file\name_of_file.xlsx): ")
            if final.endswith(".xlsx"):
                break
            else:
                print("\nPlease enter a proper Excel file in xlsx format.")

        # OUTPUTTING DATA FRAME TO FILE
        finaldata.to_excel(final)
        print("\nSuccessfully outputted appended data frame to Excel!")

        # OPENING OUTPUTTED FILE
        # (NOTE: PYTHON STILL RUNS UNTIL SPREADSHEET IS CLOSED)
        subprocess.call(final, shell=True)

        user_run_again = input("\nWould you like to run another file? ")
        if user_run_again == "Yes":
            more_files = True
        else:
            more_files = False
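The `dropna` call on `originfile` in the loop above is the part that removes entries missing from the update file. Here is a minimal sketch of its effect, assuming the master file does not itself carry an `originfile` column; the sample rows, the `Quantity` column, and the file name are made up.

```python
import pandas as pd

# Hypothetical data; the master has no 'originfile' column here.
master = pd.DataFrame({"Item": ["5 ohm resistor", "10 uF capacitor"],
                       "Quantity": [4, 12]})
# The update file no longer lists the resistor at all.
update = pd.DataFrame({"Item": ["10 uF capacitor"], "Quantity": [15]})
update["originfile"] = "scope_parts.xlsx"  # hypothetical file name

combined = pd.concat([update, master], ignore_index=True)
deduped = combined.drop_duplicates(subset=["Item"])

# Rows that came only from the master have NaN in 'originfile',
# so dropping NaN there removes every part absent from the update file.
kept = deduped.dropna(subset=["originfile"]).reset_index(drop=True)
print(kept)
```

Two caveats under these assumptions: the filter drops every master row that is not in the update file, including parts that belong to other instruments, and it stops removing anything once the master has been saved with a populated `originfile` column from an earlier run.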

You may want to add some validation or exception handling around the final "run another file?" prompt; I just wanted to offer some ideas.
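One possible shape for that handling is sketched below. The helper name `ask_yes_no` and its `input_func` parameter are hypothetical; the parameter exists only so the prompt can be exercised without a keyboard.

```python
def ask_yes_no(prompt, input_func=input):
    """Ask until the user answers yes or no; return True for yes."""
    while True:
        try:
            answer = input_func(prompt).strip().lower()
        except EOFError:
            # Input stream closed (e.g. Ctrl-D): treat it as "no".
            return False
        if answer in ("y", "yes"):
            return True
        if answer in ("n", "no"):
            return False
        print("Please answer yes or no.")


# Scripted answers stand in for a real user in this demonstration.
scripted = iter(["maybe", "Yes"])
more_files = ask_yes_no("\nWould you like to run another file? ",
                        input_func=lambda prompt: next(scripted))
print(more_files)
```

In the loop, `more_files = ask_yes_no("\nWould you like to run another file? ")` could then replace the final prompt and the if/else that follows it.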