
Q: LIBSVM data preparation

Stack Overflow user

Asked 2013-12-31 09:28:53

Answers 2, Views 4.9K, Followers 0, Votes 0

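I am working on an image processing project in Matlab and would like to use LIBSVM for supervised learning.

I have run into a problem with data preparation. My data is in CSV format, and I tried to convert it to LIBSVM format using the snippet given in the LIBSVM FAQ: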

    matlab> SPECTF = csvread('SPECTF.train'); % read a csv file
    matlab> labels = SPECTF(:, 1); % labels from the 1st column
    matlab> features = SPECTF(:, 2:end); 
    matlab> features_sparse = sparse(features); % features must be in a sparse matrix
    matlab> libsvmwrite('SPECTFlibsvm.train', labels, features_sparse);

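I get the data in the following form:

3.0012 1:2.1122 2:0.9088 ...... value1 value2 value3

That is, the first value has no index, and the value that follows index 1 is value 2.

From what I have read, the data should be in the following format:

label value1 value2 ......

I need help getting this right. Also, if anyone could give me a hint on how to assign the labels, that would be very helpful.

Thanks in advance, Sidra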

matlab

libsvm

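2 Answers

Stack Overflow user

Answered 2014-01-02 01:22:14

You do not have to write the data to a file; you can use the Matlab interface to LIBSVM instead. The interface consists of two functions, svmtrain and svmpredict. Each function prints its help text when called without arguments: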

Usage: model = svmtrain(training_label_vector, training_instance_matrix, 'libsvm_options');                                                                          
libsvm_options:                                                                                                                                                      
-s svm_type : set type of SVM (default 0)                                                                                                                            
        0 -- C-SVC                                                                                                                                                   
        1 -- nu-SVC                                                                                                                                                  
        2 -- one-class SVM                                                                                                                                           
        3 -- epsilon-SVR                                                                                                                                             
        4 -- nu-SVR                                                                                                                                                  
-t kernel_type : set type of kernel function (default 2)                                                                                                             
        0 -- linear: u'*v                                                                                                                                            
        1 -- polynomial: (gamma*u'*v + coef0)^degree
        2 -- radial basis function: exp(-gamma*|u-v|^2)
        3 -- sigmoid: tanh(gamma*u'*v + coef0)
        4 -- precomputed kernel (kernel values in training_instance_matrix)
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/num_features)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 100)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)
-v n : n-fold cross validation mode
-q : quiet mode (no outputs)

and

Usage: [predicted_label, accuracy, decision_values/prob_estimates] = svmpredict(testing_label_vector, testing_instance_matrix, model, 'libsvm_options')
Parameters:
  model: SVM model structure from svmtrain.
  libsvm_options:
    -b probability_estimates: whether to predict probability estimates, 0 or 1 (default 0); one-class SVM not supported yet
Returns:
  predicted_label: SVM prediction output vector.
  accuracy: a vector with accuracy, mean squared error, squared correlation coefficient.
  prob_estimates: If selected, probability estimate vector.

Example code for training a linear SVM on a data set of four points with three features:

training_label_vector = [1 ; 1 ; -1 ; -1];
training_instance_matrix = [1 2 3 ; 3 4 5 ; 5 6 7; 7 8 9];
model = svmtrain(training_label_vector, training_instance_matrix, '-t 0');

Applying the resulting model to test data:

testing_instance_matrix = [9 5 1; 2 9 5];
predicted_label = svmpredict(nan(2, 1), testing_instance_matrix, model)

Result:

predicted_label =

    -1
    -1

You can also pass the true labels to svmpredict so that it computes the accuracy directly; here I substituted NaN for the true testing_label_vector.
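To illustrate passing true labels, here is a minimal sketch that reuses the toy data from above. The values in testing_label_vector are made up purely for illustration, and the code assumes the LIBSVM Matlab interface is compiled and on the path.

```matlab
% Same toy training data as in the example above.
training_label_vector = [1; 1; -1; -1];
training_instance_matrix = [1 2 3; 3 4 5; 5 6 7; 7 8 9];
model = svmtrain(training_label_vector, training_instance_matrix, '-t 0');

testing_instance_matrix = [9 5 1; 2 9 5];
% Hypothetical true labels, invented for this illustration only.
testing_label_vector = [-1; 1];

% With real labels supplied, svmpredict also fills in the accuracy vector:
% [accuracy, mean squared error, squared correlation coefficient].
[predicted_label, accuracy, decision_values] = ...
    svmpredict(testing_label_vector, testing_instance_matrix, model);

% The first element is the classification accuracy, reported in percent.
fprintf('Accuracy: %.1f%%\n', accuracy(1));
```

For classification only the first element of accuracy is usually of interest; the other two are meant for regression models.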

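Note that Matlab's Statistics Toolbox also contains a function named svmtrain, which is not compatible with the one from LIBSVM. Make sure you are calling the right one.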
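One way to check which svmtrain Matlab will actually call is sketched below. The folder in libsvm_dir is a placeholder, not a real location; replace it with wherever the compiled LIBSVM Matlab interface lives on your machine.

```matlab
% List every svmtrain visible on the path. The first entry is the one
% Matlab runs; later entries are shadowed by it.
which svmtrain -all

% Placeholder folder (an assumption): adjust to your own LIBSVM install.
libsvm_dir = fullfile('path', 'to', 'libsvm', 'matlab');

% Put the LIBSVM folder at the front of the search path so that its
% svmtrain and svmpredict take precedence over any toolbox versions.
addpath(libsvm_dir, '-begin');
```

If the first entry printed by which still points into a toolbox folder after this, the LIBSVM MEX files were probably not found in the folder you added.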

Votes: 1

Stack Overflow user

Answered 2015-03-19 12:02:33

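As @A.Donda answered, if you can do the training and prediction inside Matlab, you do not have to convert your data to 'libsvm' format.

When you want to do the training and prediction work in Windows or Linux, you do have to put the data into libsvm format.

Judging from your output, I think your feature data does not contain a label in each row, so the first feature column ended up being used as the label. You should add a label in front of the features in every row of your data.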

matlab> SPECTF = csvread('SPECTF.train'); % read a csv file
matlab> features = SPECTF(:, :); % because there are no labels in your csv file
matlab> labels = [??];% to add the label as your plan 
matlab> features_sparse = sparse(features); % features must be in a sparse  matrix
matlab> libsvmwrite('SPECTFlibsvm.train', labels, features_sparse);

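You should provide more information about your data so that we can help you add the labels. By the way, the labels are usually defined by the user up front. You can use any integers as label values.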
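To make the target format concrete, here is a small sketch that builds the text lines by hand, without calling LIBSVM at all. The feature matrix and the labels are invented for illustration; in practice the labels come from whatever classes your images belong to. Each line is the label followed by index:value pairs, with zero-valued features left out.

```matlab
% Hypothetical feature matrix: 4 samples, 3 features, no label column.
features = [2.1 0 0.9; 0 1.5 0; 3.3 0.2 0; 0 0 4.0];
% Hypothetical labels, one integer per row (here classes +1 and -1).
labels = [1; 1; -1; -1];

lines = cell(size(features, 1), 1);
for i = 1:size(features, 1)
    idx = find(features(i, :));          % column indices of nonzero features
    rowText = sprintf('%d', labels(i));  % the label comes first, without an index
    if ~isempty(idx)
        pairs = [idx; features(i, idx)]; % 2-by-k: indices over values
        rowText = [rowText, sprintf(' %d:%g', pairs)];
    end
    lines{i} = rowText;
end

% Prints:
% 1 1:2.1 3:0.9
% 1 2:1.5
% -1 1:3.3 2:0.2
% -1 3:4
fprintf('%s\n', lines{:});
```

With LIBSVM available, libsvmwrite('SPECTFlibsvm.train', labels, sparse(features)) should produce a file in this same layout; the loop above is only meant to show what that layout looks like.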

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/20850344
