FCM clustering of numeric data from a CSV / Excel file

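Hi, I asked a previous question ("Fuzzy c-means tcp dump clustering in MATLAB") that got a reasonable answer, and I thought I was back on track. The problem now is the preprocessing stage for the TCP/UDP data below, which I would like to run through MATLAB's fcm clustering algorithm. My question: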

1) What is the best way to convert the text data in the cells to numeric values, and what should those numeric values be?

EDIT: My data in Excel now looks like this:

0,tcp,http,SF,239,486,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,8,8,0.00,0.00,0.00,0.00,1.00,0.00,0.00,19,19,1.00,0.00,0.05,0.00,0.00,0.00,0.00,0.00,normal. 

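Here is an example of how I would read the data into MATLAB. You need two things: the data itself in comma-separated format, and the list of features along with their types (numeric, nominal).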

    %# read the list of features
    fid = fopen('kddcup.names','rt');
    C = textscan(fid, '%s %s', 'Delimiter',':', 'HeaderLines',1);
    fclose(fid);

    %# determine type of features
    C{2} = regexprep(C{2}, '.$','');              %# remove "." at the end
    attribNom = [ismember(C{2},'symbolic');true]; %# nominal features

    %# build format string used to read/parse the actual data
    frmt = cell(1,numel(C{1}));
    frmt( ismember(C{2},'continuous') ) = {'%f'}; %# numeric features: read as number
    frmt( ismember(C{2},'symbolic') ) = {'%s'};   %# nominal features: read as string
    frmt = [frmt{:}];
    frmt = [frmt '%s'];                           %# add the class attribute

    %# read dataset
    fid = fopen('kddcup.data','rt');
    C = textscan(fid, frmt, 'Delimiter',',');
    fclose(fid);

    %# convert nominal attributes to numeric
    ind = find(attribNom);
    G = cell(numel(ind),1);
    for i=1:numel(ind)
        [C{ind(i)},G{i}] = grp2idx( C{ind(i)} );
    end

    %# all numeric dataset
    M = cell2mat(C);
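To make the nominal-to-numeric step more concrete, here is a minimal sketch of the same idea using only base MATLAB's `unique`, in case the Statistics Toolbox (which provides `grp2idx`) is not available. The `proto` values are made-up sample data, not taken from the real file. As far as I know, `unique` numbers the labels in sorted order while `grp2idx` numbers them in order of first appearance, so the actual integer codes may differ between the two. The one-hot part is an optional alternative I am adding as an assumption: integer codes imply an ordering between categories, which a distance-based method may pick up on.

```matlab
% Hypothetical sample values for one nominal column (protocol type)
proto = {'tcp'; 'udp'; 'tcp'; 'icmp'; 'udp'};

% unique returns the sorted distinct labels and, as its third output,
% the position of each row's label within that sorted list
[labels, ~, idx] = unique(proto);
idx = idx(:);   % force a column vector regardless of platform

% Optional: one-hot encoding, one 0/1 column per distinct label,
% so that no artificial order is imposed on the categories
onehot = double(bsxfun(@eq, idx, 1:numel(labels)));
```

Here `labels` plays the role of `G{i}` in the code above (the lookup table from integer code back to the original string), and `idx` is the numeric column that replaces the text.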

You could also look into the DATASET class from the Statistics Toolbox.
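Once everything is in one numeric matrix such as `M`, the columns still have very different ranges (byte counts like 239 and 486 next to rates between 0.00 and 1.00), so it is probably worth rescaling before clustering. Below is a rough sketch of min-max scaling with base MATLAB functions; the small matrix is invented stand-in data, and the choice of min-max scaling and of 2 clusters are my assumptions, not something the question specifies.

```matlab
% Hypothetical stand-in for a few columns of the numeric matrix M
M = [239  486 1.00;
     235 1337 0.05;
       0    0 0.00;
     219 1337 1.00];

% Min-max scale each column to the range [0,1]
mn = min(M, [], 1);
rg = max(M, [], 1) - mn;
rg(rg == 0) = 1;   % guard: constant columns would otherwise divide by zero
X = bsxfun(@rdivide, bsxfun(@minus, M, mn), rg);

% With the Fuzzy Logic Toolbox installed, clustering could then look like:
% [centers, U] = fcm(X, 2);   % 2 clusters chosen arbitrarily here
```

Each row of `U` would then hold the membership grades of every sample for one cluster.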