Word frequency across the cells in a range

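I have a column of about 50 cells. Each cell contains a block of text, anywhere from 3 to 8 sentences.

I'd like to build a list of the words being used and get each word's frequency across the whole range (A1:A50).

I've tried adapting code I found in other posts, but it all seems to be written for cells that contain a single word rather than several.

Here is the code I found and am trying to use.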

    Sub Ftable()
        Dim BigString As String, I As Long, J As Long, K As Long
        Dim Selection As Range
        Set Selection = ThisWorkbook.Sheets("Sheet1").Columns("A")

        BigString = ""
        For Each r In Selection
            BigString = BigString & " " & r.Value
        Next r
        BigString = Trim(BigString)
        ary = Split(BigString, " ")

        Dim cl As Collection
        Set cl = New Collection
        For Each a In ary
            On Error Resume Next
            cl.Add a, CStr(a)
        Next a

        For I = 1 To cl.Count
            v = cl(I)
            ThisWorkbook.Sheets("Sheet2").Cells(I, "B").Value = v
            J = 0
            For Each a In ary
                If a = v Then J = J + 1
            Next a
            ThisWorkbook.Sheets("Sheet2").Cells(I, "C") = J
        Next I
    End Sub

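Here you go. I think a dictionary is the best way to handle this, because you can test whether the dictionary already contains an item. Post back if there's anything you don't get.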

    Sub CountWords()
        Dim dictionary As Object
        Dim sentence() As String
        Dim lastRow As Long, rowCounter As Long
        Dim x As Long, y As Long
        Dim word As Variant
        Dim ws As Worksheet, destination As Worksheet

        Set ws = Sheets("Put the source sheet name here")
        Set destination = Sheets("Put the destination sheet name here")

        rowCounter = 2
        ' Last used row in column A
        lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
        Set dictionary = CreateObject("Scripting.Dictionary")

        ' Starts at row 2, i.e. skips row 1; change to 1 if your data starts in A1
        For x = 2 To lastRow
            sentence = Split(ws.Cells(x, 1).Value, " ")
            For y = 0 To UBound(sentence)
                If Not dictionary.Exists(sentence(y)) Then
                    dictionary.Add sentence(y), 1
                Else
                    dictionary.Item(sentence(y)) = dictionary.Item(sentence(y)) + 1
                End If
            Next y
        Next x

        ' Write each word and its count, starting at row 2 of the destination sheet
        For Each word In dictionary.Keys
            destination.Cells(rowCounter, 1) = word
            destination.Cells(rowCounter, 2) = dictionary.Item(word)
            rowCounter = rowCounter + 1
        Next word
    End Sub

Try this (it worked for me on some long blocks of Lorem Ipsum text):

    Sub Ftable()
        Dim BigString As String, I As Long, J As Long
        Dim countRange As Range, r As Range
        Dim ary() As String
        Dim a As Variant, v As Variant
        Dim cl As Collection

        Set countRange = ThisWorkbook.Sheets("Sheet1").Range("A1:A50")

        ' Join the text of every cell into one space-separated string
        BigString = ""
        For Each r In countRange.Cells
            BigString = BigString & " " & r.Value
        Next r
        BigString = Trim(BigString)
        ary = Split(BigString, " ")

        ' Collect the unique words: adding a duplicate key raises an error, which is ignored
        Set cl = New Collection
        On Error Resume Next
        For Each a In ary
            cl.Add a, CStr(a)
        Next a
        On Error GoTo 0

        ' Write each unique word and its count to Sheet2, columns B and C
        For I = 1 To cl.Count
            v = cl(I)
            ThisWorkbook.Sheets("Sheet2").Cells(I, "B").Value = v
            J = 0
            For Each a In ary
                If a = v Then J = J + 1
            Next a
            ThisWorkbook.Sheets("Sheet2").Cells(I, "C") = J
        Next I
    End Sub

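I cut it down to look only at the 50 cells where you have data, rather than all 1 million+ cells in that column. I also fixed a problem where r was getting the whole column (so r.Value was an array) rather than a single-cell Range. And I renamed "Selection" to "countRange", because Selection is already defined by the application, so it was a bad name.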
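To see the difference between iterating the column and iterating its cells, here is a small sketch. The sub name is made up for illustration, and it assumes a sheet named "Sheet1" exists in the workbook:

```vba
Sub ShowColumnsVersusCells()
    Dim r As Range
    Dim ws As Worksheet
    Set ws = ThisWorkbook.Sheets("Sheet1")   ' assumed sheet name

    ' Iterating a Columns range yields whole columns, so this loop runs once
    ' and r covers every row of column A
    For Each r In ws.Columns("A")
        Debug.Print r.Address, r.Cells.Count
    Next r

    ' Iterating .Cells yields one cell at a time, so r.Value is a single value
    For Each r In ws.Range("A1:A3").Cells
        Debug.Print r.Address, r.Cells.Count
    Next r
End Sub
```

Run it with the Immediate window open (Ctrl+G). The first loop should print a single line for the whole column, the second one line per cell.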

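Also note that your code pulls from "Sheet1" and writes to columns B and C of "Sheet2". Make sure you rename your worksheets or edit these values, otherwise you'll get errors or overwrite data.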


Here is how I would approach the problem:

    Sub Ftable()
        Dim wordDict As New Dictionary
        Dim r As Range
        Dim countRange As Range
        Dim str As Variant
        Dim strArray() As String

        Set countRange = ThisWorkbook.Sheets("Sheet1").Range("A1:A50")

        ' Count every word, cell by cell
        For Each r In countRange
            strArray = Split(Trim(r.Value), " ")
            For Each str In strArray
                str = LCase(str)
                If wordDict.Exists(str) Then
                    wordDict(str) = wordDict(str) + 1
                Else
                    wordDict.Add str, 1
                End If
            Next str
        Next r

        ' Write the words to Sheet2 column B and their counts to column C
        Set r = ThisWorkbook.Sheets("Sheet2").Range("B1")
        For Each str In wordDict.Keys()
            r.Value = str
            r.Offset(0, 1).Value = wordDict(str)
            Set r = r.Offset(1, 0)
        Next str

        Set wordDict = Nothing
    End Sub

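It uses a Dictionary, so make sure you add a reference to the library (Tools > References > Microsoft Scripting Runtime). It also forces everything to lowercase. One big problem with the old code was that it didn't correctly count capitalized and uncapitalized versions of a word, which meant it missed a lot of words. Remove str = LCase(str) if you don't want this.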
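If you would rather not add the reference, a late-bound variant along these lines should also work. This is only a sketch, not the code I timed: the sub name FtableLateBound is made up, and it swaps the LCase call for the dictionary's CompareMode property, so each word keeps the casing of its first occurrence. It also skips the empty strings that Split produces when there are double spaces.

```vba
Sub FtableLateBound()
    Dim wordDict As Object
    Dim r As Range
    Dim word As Variant
    Dim outCell As Range

    ' Late binding: no reference to Microsoft Scripting Runtime needed
    Set wordDict = CreateObject("Scripting.Dictionary")
    ' Must be set while the dictionary is still empty
    wordDict.CompareMode = vbTextCompare   ' "The" and "the" become the same key

    For Each r In ThisWorkbook.Sheets("Sheet1").Range("A1:A50").Cells
        For Each word In Split(Trim(r.Value), " ")
            If Len(word) > 0 Then          ' ignore blanks from repeated spaces
                If wordDict.Exists(word) Then
                    wordDict.Item(word) = wordDict.Item(word) + 1
                Else
                    wordDict.Add word, 1
                End If
            End If
        Next word
    Next r

    ' Same output layout as above: words in Sheet2 column B, counts in column C
    Set outCell = ThisWorkbook.Sheets("Sheet2").Range("B1")
    For Each word In wordDict.Keys
        outCell.Value = word
        outCell.Offset(0, 1).Value = wordDict.Item(word)
        Set outCell = outCell.Offset(1, 0)
    Next word
End Sub
```

The trade-off with late binding is that you lose IntelliSense and compile-time checking on the dictionary, but the workbook runs on machines where the reference has not been set.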

Bonus: this method ran about 8 times faster on my test sheet.
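If you want to compare timings on your own data, a rough harness like the one below is enough. It is a sketch under assumptions: TimeFtable is a made-up name, and it expects exactly one macro named Ftable in the project (the two versions in this thread share that name, so rename one if you paste in both).

```vba
Sub TimeFtable()
    Dim startTime As Double

    startTime = Timer        ' seconds elapsed since midnight
    Ftable                   ' the macro being measured
    Debug.Print "Ftable took " & Format(Timer - startTime, "0.000") & " s"
End Sub
```

Timer is fairly coarse and resets at midnight, so treat the number as a rough comparison rather than a precise benchmark.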