How do I create an XML tag in Excel?

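I am new to XML files, so I need some help here, please. I need to import an XML file into Excel, but I cannot get the formatting right. I need whatever is inside each "P" tag on one row, with the attribute values (the n values) as the column headings. The "D" tag appears multiple times in the file I am using, and its id can go up to 31. All of the tags are inside a "B" tag. There is sensitive information in the file, so I have had to replace the real text with placeholder text, sorry. I hope this all makes sense; any information or a pointer in the right direction would be appreciated. Thanks.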

    <D id="1">
      <V n="stuff1">stuff1</V>
      <V n="stuff2">stuff2</V>
      <V n="stuff3">stuff3</V>
      <P id="stuff11">
        <V n="stuff111">stuff111</V>
        <V n="stuff112">stuff112</V>
        <V n="stuff113">stuff113</V>
        <V n="stuff114">stuff114</V>
        <V n="stuff115">stuff115</V>
        <V n="stuff116">stuff116</V>
      </P>
    </D>
    <D id="2">
      <V n="stuff1">stuff1</V>
      <V n="stuff2">stuff2</V>
      <V n="stuff3">stuff3</V>
      <P id="stuff21">
        <V n="stuff111">stuff211</V>
        <V n="stuff112">stuff212</V>
        <V n="stuff113">stuff213</V>
        <V n="stuff114">stuff214</V>
        <V n="stuff115">stuff215</V>
        <V n="stuff116">stuff216</V>
      </P>
    </D>

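Unfortunately, I have found before that Excel imports the element values rather than the attributes on those elements. For example, <V n="stuff111">stuff211</V> imports stuff211 rather than stuff111. I think this is just a limitation of the functionality in Excel. Does the import have to be done in Excel, or could you use a programming language such as Python? I have already written a procedure that pulls specific element and attribute values out of an XML file; if you need it, I would be happy to dig it out and share it tomorrow.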
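As a rough illustration of the Python route, here is a minimal sketch, assuming Python 3 and the standard-library xml.etree.ElementTree and csv modules. It turns each "P" element into one row and uses the n attribute values as the column headings. The sample is a trimmed copy of the data in the question; the <root> wrapper and the function names p_rows and to_csv are assumptions made for this example, not part of the original file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Trimmed copy of the sample from the question. The <root> wrapper is an
# assumption: an XML document needs a single top-level element to parse.
SAMPLE = """<root>
<D id="1">
  <V n="stuff1">stuff1</V>
  <P id="stuff11">
    <V n="stuff111">stuff111</V>
    <V n="stuff112">stuff112</V>
  </P>
</D>
<D id="2">
  <V n="stuff1">stuff1</V>
  <P id="stuff21">
    <V n="stuff111">stuff211</V>
    <V n="stuff112">stuff212</V>
  </P>
</D>
</root>"""


def p_rows(xml_text):
    """Return (headers, rows): one dict per <P>, keyed by each V's n attribute."""
    root = ET.fromstring(xml_text)
    headers, rows = [], []
    for p in root.iter("P"):          # every <P>, however deeply nested
        row = {}
        for v in p.findall("V"):      # only the <V> children of this <P>
            name = v.get("n")
            if name not in headers:   # keep first-seen column order
                headers.append(name)
            row[name] = v.text or ""
        rows.append(row)
    return headers, rows


def to_csv(headers, rows):
    """Render the rows as CSV text with the n values as the header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=headers, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    heads, data = p_rows(SAMPLE)
    print(to_csv(heads, data))
```

The CSV text can be written to a file and opened in Excel. Columns are taken from the n attributes in the order they are first seen, so a "P" element that lacks one of them simply gets an empty cell.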

UPDATE

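Here is a Python script I wrote previously to strip data out of an XML file into a CSV file. Please note that I have not set it up to get all attributes and elements, because my original purpose was only to pull specific data out of the file. You will need to edit the search_items global list so that it holds the items you want to search for.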
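To show what a search_items entry of type 'xml_attribute' does, here is a small sketch of the same idea in Python 3: find the search text in a line, then return whatever sits between the next pair of double quotes. The function name attribute_value and the sample line (including its values) are hypothetical, made up for illustration; only the search texts policyno= and transid= come from the script's example list.

```python
def attribute_value(line, search_text, quote='"'):
    """Return the text between the first pair of quotes after search_text.

    Returns None when the search text or a closing quote is missing.
    Matching is case sensitive, as in the script.
    """
    start = line.find(search_text)
    if start == -1:
        return None
    opening = line.find(quote, start + 1)   # quote that opens the value
    if opening == -1:
        return None
    begin = opening + len(quote)
    end = line.find(quote, begin)           # quote that closes the value
    if end == -1:
        return None
    return line[begin:end]


if __name__ == "__main__":
    # Hypothetical input line, invented for this example.
    sample_line = '<policy policyno="AB123" transid="42" />'
    print(attribute_value(sample_line, 'policyno='))
    print(attribute_value(sample_line, 'transid='))
```

Because this is plain string scanning rather than XML parsing, it relies on each attribute and its value being on the same line of the file.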

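You can call the script from the command line with a single argument giving the path to the XML file, or you can run it with no argument, in which case you will be prompted to select a file. Please let me know if you have any questions: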

    #!/usr/bin/python
    # Python 2 script (print statements, Tkinter, tkFileDialog).

    # Change Ideas
    # ------------
    # Add option to get all xml elements / attributes
    # Add support for json?

    import csv, sys, os, Tkinter, tkFileDialog as fd, traceback

    # stop the Tk root window from opening, as it is only needed for the file dialog
    root = Tkinter.Tk()
    root.withdraw()

    # globals
    debug_on = False
    #get_all_elements = False
    #get_all_attributes = False

    # search items to be defined each time you run to identify values to search for.
    # each item should have a search text, a type and optionally a heading eg
    # search_items = ['exact_search_text', 'item_type', 'column_heading(optional)']
    # note: search items are case sensitive.
    #
    ############################## EXAMPLE ##############################
    search_items = [
        ['policyno=', 'xml_attribute', 'Policy No' ],
        ['transid=', 'xml_attribute', 'Trans ID' ],
        ['policyPremium=', 'xml_attribute', 'Pol Prem' ],
        ['outstandingBalance=', 'xml_attribute', 'Balance' ],
        ['APRCharge=', 'xml_attribute', 'APR Chrg' ],
        ['PayByAnnualDD=', 'xml_attribute', 'Annual DD' ],
        ['PayByDD=', 'xml_attribute', 'Pay by DD' ],
        ['mtaDebitAmount=', 'xml_attribute', 'MTA Amt' ],
        ['paymentMethod=', 'xml_attribute', 'Pmt Meth' ],
        ['ddFirstPaymentAmount=', 'xml_attribute', '1st Amt' ],
        ['ddRemainingPaymentsAmount=', 'xml_attribute', 'Other Amt' ],
        ['ddNumberOfPaymentsRemaining=', 'xml_attribute', 'Instl Rem' ],
    ]

    item_types = ['xml_attribute', 'xml_element']

    def get_heads():
        """ Returns one column heading per search item. """
        heads = []
        for i in search_items:
            if len(i) > 2 and i[2]:
                heads.append(i[2])
            else:
                heads.append(i[0]) # use search text as no heading is given
        return heads

    def write_csv_file(path, heads, data):
        """
        Writes data to file, use None for heads param if no headers required.
        """
        with open(path, 'wb') as fileout:
            writer = csv.writer(fileout)
            if heads:
                writer.writerow(heads)
            for row in data:
                try:
                    writer.writerow(row)
                except Exception:
                    print '...row failed in write to file:', row
                    exc_type, exc_value, exc_traceback = sys.exc_info()
                    lines = traceback.format_exception(exc_type, exc_value, exc_traceback)
                    for line in lines:
                        print '!!', line
        print 'Data written to:', path, '\n'

    def find_val_in_line(item, item_type, line):
        if item_type.lower() == 'xml_element':
            print 'Testing still in progress for xml elements, please check output carefully'
            b1, b2 = ">", "<"
            tmp = line.find(item) # find the starting point of the element value
            x = line.find(b1, tmp+1) + len(b1) # find next boundary after item
            y = line.find(b2, x) # find subsequent boundary to mark end of element value
            return line[x:y]
        elif item_type.lower() == 'xml_attribute':
            b = '"'
            tmp = line.find(item) # find the starting point of the attribute
            x = line.find(b, tmp+1) + len(b) # find next boundary after item
            y = line.find(b, x) # find subsequent boundary to mark end of attribute
            return line[x:y] # return value between start and end boundaries
        else:
            print 'This program does not currently support type:', item_type
            print 'Returning null'
            return None

    def find_vals_in_file(file_path):
        with open(file_path, "r") as f:
            buf = f.readlines()
        data, row = [], []
        found_in_row, pos = 0, 0
        l = len(search_items)
        if debug_on: print '\nsearch_items set to:\n' + str(search_items) + '\n'
        # loop through the lines in the file...
        for line in buf:
            if debug_on: print '\n..line set to:\n ' + line
            # loop through search items on each line...
            for i in search_items:
                if debug_on: print '...item set to:\n ' + i[0]
                # if the search item is found in the line...
                if i[0] in line:
                    val = find_val_in_line(i[0], i[1], line)
                    # only count as another item found if not already in that row
                    try:
                        # do not increment count if this works as item already exists
                        row[pos] = val
                        if debug_on: print '.....repeat item found:- ' + i[0] + ':' + val
                    except IndexError:
                        found_in_row += 1 # Index does not exist, count as new
                        row.append(val)
                        if debug_on: print '....item found, row set to:\n ' + str(row)
                    # if we have a full row then add row to data and start row again...
                    if found_in_row == l:
                        if debug_on: print '......row complete, appending to data\n'
                        data.append(row)
                        row, found_in_row = [], 0
                pos += 1 # start at 0 and increment 1 at end of each search item
            pos = 0
        return data

    def main():
        path, matches = None, []
        os.chdir(os.getenv('userprofile')) # Windows user profile directory
        # check cmd line args provided...
        if len(sys.argv) > 1:
            path = sys.argv[1]
        else:
            while not path:
                try:
                    print 'Please select a file to be parsed...'
                    path = fd.askopenfilename()
                except Exception:
                    print 'Error selecting file, please try again.'
        # search for string in each file...
        try:
            matches = find_vals_in_file(path)
        except Exception:
            exc_type, exc_value, exc_traceback = sys.exc_info()
            lines = traceback.format_exception(exc_type, exc_value, exc_traceback)
            print "An error occurred checking file:", path
            print ''.join('!! ' + line for line in lines)
        # write output to file...
        heads = get_heads() # needed for the debug output even if nothing matched
        if len(matches) == 0:
            print "No matches were found in the files reviewed."
        else:
            output_path = os.path.join(os.getcwd(),'tmp_matches.csv')
            write_csv_file(output_path, heads, matches)
            print "Please rename the file if you wish to keep it as it will be overwritten..."
            print "\nOpening file..."
            os.startfile(output_path) # Windows only
        if debug_on:
            print "\nWriting output to screen\n", "-"*24
            print heads
            for row in matches:
                print row

    if __name__ == '__main__':
        main()

Hope this works for you. So far I have only tested it on a few different XML files, but it has worked OK for me.