Excel formula to strip HTML

I want to strip all of the HTML from the left and right of a text value:

I have this:
<option value="41">GECommonUI</option>

I want to get this:
GECommonUI

Set a reference to the Microsoft Forms 2.0 Object Library so that you can use the DataObject object.

    Public Function StripHTML(sInput As String) As String
        Dim rTemp As Range
        Dim oData As DataObject

        Set oData = New DataObject
        oData.SetText "<html><style>br{mso-data-placement:same-cell;}</style>" & sInput & "</html>"
        oData.PutInClipboard

        Set rTemp = Workbooks.Add.Worksheets(1).Range("a1")
        rTemp.Parent.PasteSpecial "Unicode Text"

        StripHTML = rTemp.Text
        rTemp.Parent.Parent.Close False

        Set rTemp = Nothing
        Set oData = Nothing
    End Function

For more information, see http://www.dailydoseofexcel.com/archives/2005/02/23/html-in-cells-ii/

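You can download an Excel file with a working user-defined function, =stripHTML(), from the page below. The file includes a working example and instructions, and it should solve your problem. Note that you have to enable macros in that Excel document for the function to work.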

http://jfrancisconsulting.com/how-to-strip-html-tags-in-excel/

How is my function different?

Most of the functions I found do a global find and replace that removes every HTML tag enclosed in angle brackets.

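However, that also removes certain line breaks, so the final result looks like one jumbled block of text. I modified an existing function to insert line breaks so that the final result keeps its proper line breaks.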

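Second, a global HTML replace only removes the tags inside <>, so all of the HTML special-character entities are left behind. For example, instead of an ampersand you are left with the HTML version, &amp;. To fix this, I inserted 87 Replace lines that convert the HTML symbols and characters.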

    Function StripHTML(cell As Range) As String
        Dim RegEx As Object
        Set RegEx = CreateObject("vbscript.regexp")

        Dim sInput As String
        Dim sOut As String
        sInput = cell.Text

        'VBA does not interpret \x escapes, so these two lines match the literal text
        sInput = Replace(sInput, "\x0D\x0A", Chr(10))
        sInput = Replace(sInput, "\x00", Chr(10))

        'replace HTML breaks and end of paragraphs with line breaks
        '(vbTextCompare so that lower-case tags such as <br> are matched too)
        sInput = Replace(sInput, "</P>", Chr(10) & Chr(10), , , vbTextCompare)
        sInput = Replace(sInput, "<BR>", Chr(10), , , vbTextCompare)

        'replace bullets with dashes
        sInput = Replace(sInput, "<li>", "-", , , vbTextCompare)

        'add back all of the special characters
        sInput = Replace(sInput, "&ndash;", "–")
        sInput = Replace(sInput, "&mdash;", "—")
        sInput = Replace(sInput, "&iexcl;", "¡")
        sInput = Replace(sInput, "&iquest;", "¿")
        sInput = Replace(sInput, "&quot;", """")
        'curly double quotes are written with ChrW so the editor cannot mistake them for delimiters
        sInput = Replace(sInput, "&ldquo;", ChrW(8220))
        sInput = Replace(sInput, "&rdquo;", ChrW(8221))
        'the entity name on the next line is missing in the source; an empty find string leaves sInput unchanged
        sInput = Replace(sInput, "", "'")
        sInput = Replace(sInput, "&lsquo;", "'")
        sInput = Replace(sInput, "&rsquo;", "'")
        sInput = Replace(sInput, "&laquo;", "«")
        sInput = Replace(sInput, "&raquo;", "»")
        sInput = Replace(sInput, "&nbsp;", " ")
        sInput = Replace(sInput, "&amp;", "&")
        sInput = Replace(sInput, "&cent;", "¢")
        sInput = Replace(sInput, "&copy;", "©")
        sInput = Replace(sInput, "&divide;", "÷")
        sInput = Replace(sInput, "&gt;", ">")
        sInput = Replace(sInput, "&lt;", "<")
        sInput = Replace(sInput, "&micro;", "µ")
        sInput = Replace(sInput, "&middot;", "·")
        sInput = Replace(sInput, "&para;", "¶")
        sInput = Replace(sInput, "&plusmn;", "±")
        sInput = Replace(sInput, "&euro;", "€")
        sInput = Replace(sInput, "&pound;", "£")
        sInput = Replace(sInput, "&reg;", "®")
        sInput = Replace(sInput, "&sect;", "§")
        sInput = Replace(sInput, "&trade;", "™")
        sInput = Replace(sInput, "&yen;", "¥")
        sInput = Replace(sInput, "&aacute;", "á")
        sInput = Replace(sInput, "&Aacute;", "Á")
        sInput = Replace(sInput, "&agrave;", "à")
        sInput = Replace(sInput, "&Agrave;", "À")
        sInput = Replace(sInput, "&acirc;", "â")
        sInput = Replace(sInput, "&Acirc;", "Â")
        sInput = Replace(sInput, "&aring;", "å")
        sInput = Replace(sInput, "&Aring;", "Å")
        sInput = Replace(sInput, "&atilde;", "ã")
        sInput = Replace(sInput, "&Atilde;", "Ã")
        sInput = Replace(sInput, "&auml;", "ä")
        sInput = Replace(sInput, "&Auml;", "Ä")
        sInput = Replace(sInput, "&aelig;", "æ")
        sInput = Replace(sInput, "&AElig;", "Æ")
        sInput = Replace(sInput, "&ccedil;", "ç")
        sInput = Replace(sInput, "&Ccedil;", "Ç")
        sInput = Replace(sInput, "&eacute;", "é")
        sInput = Replace(sInput, "&Eacute;", "É")
        sInput = Replace(sInput, "&egrave;", "è")
        sInput = Replace(sInput, "&Egrave;", "È")
        sInput = Replace(sInput, "&ecirc;", "ê")
        sInput = Replace(sInput, "&Ecirc;", "Ê")
        sInput = Replace(sInput, "&euml;", "ë")
        sInput = Replace(sInput, "&Euml;", "Ë")
        sInput = Replace(sInput, "&iacute;", "í")
        sInput = Replace(sInput, "&Iacute;", "Í")
        sInput = Replace(sInput, "&igrave;", "ì")
        sInput = Replace(sInput, "&Igrave;", "Ì")
        sInput = Replace(sInput, "&icirc;", "î")
        sInput = Replace(sInput, "&Icirc;", "Î")
        sInput = Replace(sInput, "&iuml;", "ï")
        sInput = Replace(sInput, "&Iuml;", "Ï")
        sInput = Replace(sInput, "&ntilde;", "ñ")
        sInput = Replace(sInput, "&Ntilde;", "Ñ")
        sInput = Replace(sInput, "&oacute;", "ó")
        sInput = Replace(sInput, "&Oacute;", "Ó")
        sInput = Replace(sInput, "&ograve;", "ò")
        sInput = Replace(sInput, "&Ograve;", "Ò")
        sInput = Replace(sInput, "&ocirc;", "ô")
        sInput = Replace(sInput, "&Ocirc;", "Ô")
        sInput = Replace(sInput, "&oslash;", "ø")
        sInput = Replace(sInput, "&Oslash;", "Ø")
        sInput = Replace(sInput, "&otilde;", "õ")
        sInput = Replace(sInput, "&Otilde;", "Õ")
        sInput = Replace(sInput, "&ouml;", "ö")
        sInput = Replace(sInput, "&Ouml;", "Ö")
        sInput = Replace(sInput, "&szlig;", "ß")
        sInput = Replace(sInput, "&uacute;", "ú")
        sInput = Replace(sInput, "&Uacute;", "Ú")
        sInput = Replace(sInput, "&ugrave;", "ù")
        sInput = Replace(sInput, "&Ugrave;", "Ù")
        sInput = Replace(sInput, "&ucirc;", "û")
        sInput = Replace(sInput, "&Ucirc;", "Û")
        sInput = Replace(sInput, "&uuml;", "ü")
        sInput = Replace(sInput, "&Uuml;", "Ü")
        sInput = Replace(sInput, "&yuml;", "ÿ")
        'the entity names on the next two lines are also missing in the source (no-ops as written)
        sInput = Replace(sInput, "", "´")
        sInput = Replace(sInput, "", "`")

        'replace all the remaining HTML tags
        With RegEx
            .Global = True
            .IgnoreCase = True
            .MultiLine = True
            .Pattern = "<[^>]+>" 'Regular expression for HTML tags.
        End With

        sOut = RegEx.Replace(sInput, "")
        StripHTML = sOut
        Set RegEx = Nothing
    End Function

In Excel, select all of the cells.
Open Find and Replace (Ctrl+H).
Find what: <*>
Replace with: nothing (leave the box empty).
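
The same wildcard replace can also be scripted. This is only a rough VBA sketch: it assumes the HTML is on the active sheet, and the macro name StripTagsOnActiveSheet is made up for illustration. It changes the cells in place, so try it on a copy first.

```vba
Sub StripTagsOnActiveSheet()
    ' Hypothetical macro name; mirrors the manual Find and Replace steps.
    ' "<*>" relies on Excel's * wildcard, and LookAt:=xlPart lets the
    ' pattern match part of a cell rather than the whole cell contents.
    ActiveSheet.Cells.Replace What:="<*>", Replacement:="", _
        LookAt:=xlPart, MatchCase:=False
End Sub
```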

Or use a Perl regular expression:

    $line = "This is some text with HTML and words";
    $line =~ s/<(.*?)>//gi;

You could try VBScript's regular expression support.

http://www.regular-expressions.info/vbscriptexample.html
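
For the simple case in the question, a regex-only function may be enough. The following is a minimal sketch, not code from the linked page: the function name StripTags is hypothetical, and it uses late binding (CreateObject) so that no library reference has to be set.

```vba
' Minimal sketch: strip tags with the VBScript regular expression engine.
' StripTags is a hypothetical name chosen for this example.
Public Function StripTags(ByVal sInput As String) As String
    Dim oRegEx As Object

    ' Late binding, so no reference to the VBScript RegExp library is needed
    Set oRegEx = CreateObject("VBScript.RegExp")
    With oRegEx
        .Global = True          ' replace every match, not just the first
        .IgnoreCase = True
        .Pattern = "<[^>]+>"    ' "<", one or more characters other than ">", then ">"
    End With

    StripTags = oRegEx.Replace(sInput, "")
End Function
```

Placed in a standard module, it can be called from a cell as =StripTags(A1). For the value in the question, <option value="41">GECommonUI</option>, it returns GECommonUI. It does not decode entities such as &amp;.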

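Guys, this just needs a simple set of string functions. I'm not up to speed on Excel's string functions, so it's a pain for me to work out the exact formula…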

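In a programming language this would be something like mid(start, length), pulling the value out of the full HTML string. The trick is the length: it is the position where the closing tag starts, minus the position where the opening tag ends, minus one.

In Excel terms, assuming the HTML is in cell A1 and contains a single element, that comes out as:

    =MID(A1, FIND(">", A1) + 1, FIND("</", A1) - FIND(">", A1) - 1)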
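
The same start-and-length idea can be written as a VBA function with InStr and Mid. This is a sketch under the assumption that the string holds a single element like the one in the question; the name TextBetweenTags is hypothetical.

```vba
' Sketch: return the text between the first ">" and the next "<".
' TextBetweenTags is a hypothetical name chosen for this example.
Public Function TextBetweenTags(ByVal sInput As String) As String
    Dim lOpenEnd As Long     ' position of the ">" that ends the opening tag
    Dim lCloseStart As Long  ' position of the "<" that starts the closing tag

    lOpenEnd = InStr(1, sInput, ">")
    If lOpenEnd = 0 Then
        ' No tag found: return the input unchanged
        TextBetweenTags = sInput
        Exit Function
    End If

    lCloseStart = InStr(lOpenEnd + 1, sInput, "<")
    If lCloseStart = 0 Then
        ' No closing tag: take everything after the opening tag
        TextBetweenTags = Mid$(sInput, lOpenEnd + 1)
        Exit Function
    End If

    ' Length = start of closing tag - end of opening tag - 1
    TextBetweenTags = Mid$(sInput, lOpenEnd + 1, lCloseStart - lOpenEnd - 1)
End Function
```

For <option value="41">GECommonUI</option>, the ">" is at position 19 and the next "<" at position 30, so the function returns the 10 characters starting at position 20: GECommonUI. Nested or multiple tags need one of the regex approaches instead.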