public interface Classifier
This interface defines the methods that a classifier must implement in order to be compatible with FunSiP.
Nested Class Summary | |
---|---|
static class |
Classifier.DATA_TYPE
The possible types of data: POSITIVE_DATA indicates that the source is positive training data. |
Method Summary | |
---|---|
void |
applyAttributeFilter(java.util.List<java.lang.Integer> attributeFilter,
int maxNumFeatures,
java.io.File toBeFilteredFile)
Applies a filter obtained through feature selection to the feature files, in order to optimize the SVMs. |
boolean |
buildClassifier()
Builds an SVM model file from a file with training examples. |
java.lang.Double |
classify_single_instance_fast(double[] features)
Uses the trained classifier to classify a single instance of data quickly, without resorting to string parsing (the recommended method for single-instance classification). |
java.lang.String |
classify_single_instance(java.lang.String instance)
Use the trained classifier to classify a single instance of data, defined by the instance parameter. |
void |
classify(java.lang.String testFile,
java.lang.String outputFile)
Uses the SVM to classify data and writes the output to an output file. The SVM may be trained or untrained, but the model file must be set beforehand. |
CrossValidationResult |
crossValidate(int n,
int maxPosTrain,
int maxNegTrain)
Performs a cross-validation on the training file. |
java.lang.String |
generateFeatureString(java.util.List<java.lang.Double> data,
Classifier.DATA_TYPE dataType)
Creates a feature string for a single feature vector. |
java.lang.String |
getFileExtension()
|
java.lang.String |
getModelFile()
|
int[] |
getPosNegExamplesInFile(java.io.File file)
Returns the number of positive and negative examples in a training file. |
double |
getSigmoid_A()
|
double |
getSigmoid_B()
|
weka.core.Instances |
getTrainingFileInstances()
|
boolean |
loadClassifier()
Sets the model file; depending on the implementation, there may be an attempt to build the SVM from this model file. |
boolean |
loadClassifier(java.lang.String modelFile)
Sets the model file; depending on the implementation, there may be an attempt to build the SVM from this model file. |
java.io.File |
mergeFeatureFiles(java.io.File tempFilePositive,
java.io.File tempFileNegative)
Merges the feature files (one with positive training features, one with negative training features) into the actual training file. |
java.util.List<ValPosCombination> |
performAttributeEvaluation(boolean sort,
weka.attributeSelection.AttributeEvaluator evaluator)
Performs feature selection by evaluating different attributes. |
java.lang.String[] |
prepareCrossvalidationCommand(int fold,
java.lang.String fileIn,
java.lang.String fileOut)
Creates an array with string values, to be parsed by the implementation of the classifier. |
java.lang.String[] |
prepareTrainingCommand(java.lang.String fileIn,
java.lang.String fileOut)
Creates an array with string values, to be parsed by the classifier. |
void |
setModelFile(java.lang.String svmModelFile)
Changes the model for the classifier by changing the name of the model file. |
void |
setOptions(ClassifierOptions options)
Changes the various options of this classifier. |
void |
setSigmoid_A(double sigmoid_A)
Changes the sigmoid variable A (see the documentation about restructuring the output by means of sigmoid curves). |
void |
setSigmoid_B(double sigmoid_B)
Changes the sigmoid variable B (see the documentation about restructuring the output by means of sigmoid curves). |
java.lang.String |
to_genomeview_output(int id,
java.lang.Double distance,
int funsite_start,
int funsite_stop,
java.lang.String classification_name)
Method which produces a string that can be used by the GenomeView program. |
java.lang.String |
to_splice_machine_output(java.lang.Double distance,
int funsite,
int increase,
java.lang.String classification_name)
Method which produces a string that is similar to the output provided by Splicemachine (with the provided results). |
java.lang.String |
to_splice_machine_output(java.lang.String classification_result,
int funsite,
int increase,
java.lang.String classification_name)
Method which produces a string that is similar to that of the Splicemachine program, according to the provided results. |
java.io.File |
writeTemporaryFeatureData(java.lang.String tempFileName,
boolean forward_strand,
java.util.List<java.util.List<java.lang.Double>> data,
Classifier.DATA_TYPE dataType)
Writes the temporary feature data (all the features extracted from one sequence, each feature in a separate list) to a file. |
java.io.File |
writeTemporaryFeatureData(java.lang.String tempFileName,
java.util.List<java.util.List<java.lang.Double>> data,
Classifier.DATA_TYPE dataType)
Writes the temporary feature data (all the features extracted from one sequence, each feature in a separate list) to a file. |
Method Detail |
---|
CrossValidationResult crossValidate(int n, int maxPosTrain, int maxNegTrain)
n
- The number of folds for the cross-validation. Common values are 2, 5 and 10.
maxPosTrain
- The maximum number of positive training examples used during the training phase of the cross-validation.
maxNegTrain
- The maximum number of negative training examples used during the training phase of the cross-validation.
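As a rough illustration of how the three parameters could interact, the sketch below assigns examples to n folds round-robin and caps the training set of each fold. The class `FoldSketch` and its methods are hypothetical and not part of FunSiP; an actual implementation may partition or sample the data differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of FunSiP: n-fold splitting with capped training sets. */
class FoldSketch {

    /** Returns the indices in [0, total) that fall into test fold `fold` (round-robin assignment). */
    static List<Integer> testIndices(int total, int n, int fold) {
        List<Integer> test = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            if (i % n == fold) {
                test.add(i);
            }
        }
        return test;
    }

    /** Returns the training indices for `fold`: everything outside the test fold, truncated to `max`. */
    static List<Integer> trainIndices(int total, int n, int fold, int max) {
        List<Integer> train = new ArrayList<>();
        for (int i = 0; i < total && train.size() < max; i++) {
            if (i % n != fold) {
                train.add(i);
            }
        }
        return train;
    }

    public static void main(String[] args) {
        int n = 5;            // number of folds, as in crossValidate(n, ...)
        int maxPosTrain = 6;  // cap on positive training examples
        int maxNegTrain = 10; // cap on negative training examples
        int positives = 10;   // assumed number of positive examples
        int negatives = 20;   // assumed number of negative examples
        for (int fold = 0; fold < n; fold++) {
            System.out.println("fold " + fold
                    + ": posTrain=" + trainIndices(positives, n, fold, maxPosTrain).size()
                    + " negTrain=" + trainIndices(negatives, n, fold, maxNegTrain).size()
                    + " posTest=" + testIndices(positives, n, fold).size()
                    + " negTest=" + testIndices(negatives, n, fold).size());
        }
    }
}
```

Truncating the training indices is the simplest way to honour the caps; random subsampling would be an equally plausible choice.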
java.lang.String[] prepareCrossvalidationCommand(int fold, java.lang.String fileIn, java.lang.String fileOut)
fold
- The fold of the cross-validation.
fileIn
- The file containing the training features.
fileOut
- The file for output (if applicable).
java.lang.String[] prepareTrainingCommand(java.lang.String fileIn, java.lang.String fileOut)
fileIn
- The name of the file containing the extracted features.
fileOut
- The name of the file to which the output should be written (if applicable).
boolean buildClassifier()
java.io.File mergeFeatureFiles(java.io.File tempFilePositive, java.io.File tempFileNegative)
tempFilePositive
- The file with the features for positive training.
tempFileNegative
- The file with the features for negative training.
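A minimal sketch of such a merge, assuming that both feature files are line-oriented text and that the training file is simply their concatenation. The class `MergeSketch`, its helper methods and the sample feature lines are hypothetical; the real file layout depends on the classifier implementation.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of FunSiP: merges two line-oriented feature files. */
class MergeSketch {

    /** Concatenates the lines of both feature files into a new temporary training file. */
    static File merge(File positive, File negative) {
        try {
            List<String> lines =
                    new ArrayList<>(Files.readAllLines(positive.toPath(), StandardCharsets.UTF_8));
            lines.addAll(Files.readAllLines(negative.toPath(), StandardCharsets.UTF_8));
            File merged = File.createTempFile("training", ".tmp");
            merged.deleteOnExit();
            Files.write(merged.toPath(), lines, StandardCharsets.UTF_8);
            return merged;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper: writes the given lines to a temporary file. */
    static File tempFileWith(String... lines) {
        try {
            File file = File.createTempFile("features", ".tmp");
            file.deleteOnExit();
            Files.write(file.toPath(), Arrays.asList(lines), StandardCharsets.UTF_8);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper: reads all lines of a file. */
    static List<String> linesOf(File file) {
        try {
            return Files.readAllLines(file.toPath(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File positive = tempFileWith("+1 1:0.5", "+1 1:0.7"); // assumed feature line format
        File negative = tempFileWith("-1 1:0.1");
        System.out.println(linesOf(merge(positive, negative)));
    }
}
```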
java.io.File writeTemporaryFeatureData(java.lang.String tempFileName, boolean forward_strand, java.util.List<java.util.List<java.lang.Double>> data, Classifier.DATA_TYPE dataType)
tempFileName
- The name of the file to which the data should be written.
forward_strand
- Indicates whether or not the data is located on the forward strand.
data
- The feature data, as a nested list.
dataType
- The type of data (see the enum in this interface).
java.io.File writeTemporaryFeatureData(java.lang.String tempFileName, java.util.List<java.util.List<java.lang.Double>> data, Classifier.DATA_TYPE dataType)
tempFileName
- The name of the file to which the data should be written.
data
- The feature data.
dataType
- The type of data (see the enum in this interface).
java.lang.String generateFeatureString(java.util.List<java.lang.Double> data, Classifier.DATA_TYPE dataType)
data
- The feature data.
dataType
- The data type (positive, negative, unclassified).
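The format of the feature string is implementation-specific and not described here. Purely as an illustration, the sketch below assumes a sparse `label index:value` layout of the kind used by common SVM tools. The class `FeatureStringSketch`, its local `DataType` enum (only `POSITIVE_DATA` is named in this documentation; the other two constant names are guesses) and the chosen labels are assumptions.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of FunSiP: builds one feature string per feature vector. */
class FeatureStringSketch {

    /** Stand-in for Classifier.DATA_TYPE; only POSITIVE_DATA is confirmed by the documentation. */
    enum DataType { POSITIVE_DATA, NEGATIVE_DATA, UNCLASSIFIED_DATA }

    /** Formats a single feature vector as "label 1:v1 2:v2 ..." (assumed layout). */
    static String featureString(List<Double> data, DataType type) {
        String label;
        switch (type) {
            case POSITIVE_DATA:
                label = "+1";
                break;
            case NEGATIVE_DATA:
                label = "-1";
                break;
            default:
                label = "0"; // assumed placeholder label for unclassified data
                break;
        }
        StringBuilder sb = new StringBuilder(label);
        for (int i = 0; i < data.size(); i++) {
            // Feature indices are 1-based in this assumed layout.
            sb.append(' ').append(i + 1).append(':').append(data.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: +1 1:0.5 2:1.0
        System.out.println(featureString(Arrays.asList(0.5, 1.0), DataType.POSITIVE_DATA));
    }
}
```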
boolean loadClassifier()
boolean loadClassifier(java.lang.String modelFile)
modelFile
- The name of the model file.
void classify(java.lang.String testFile, java.lang.String outputFile)
testFile
- The name of the file that contains the extracted features; the output directory is expected to be part of the filename.
outputFile
- The name of the output file; the output directory is expected to be part of the filename.
java.lang.String classify_single_instance(java.lang.String instance)
instance
- The instance (consisting of extracted features) to be classified.
java.lang.Double classify_single_instance_fast(double[] features)
features
- The features that make up the instance that needs to be classified.
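To show why passing a `double[]` avoids any string parsing, the sketch below computes a decision value directly from the feature array. It assumes a linear SVM whose score is `w · x + b`, a value proportional to the signed distance to the hyperplane. The class `LinearDecisionSketch` and its weights are hypothetical; actual implementations may use other kernels or another classifier altogether.

```java
/** Hypothetical stand-in, not part of FunSiP: a linear decision function over raw features. */
class LinearDecisionSketch {

    private final double[] weights; // assumed to come from a trained model
    private final double bias;      // assumed to come from a trained model

    LinearDecisionSketch(double[] weights, double bias) {
        this.weights = weights.clone();
        this.bias = bias;
    }

    /** Returns w · x + b; the sign gives the predicted class, the magnitude a confidence score. */
    Double decisionValue(double[] features) {
        if (features.length != weights.length) {
            throw new IllegalArgumentException("expected " + weights.length + " features");
        }
        double sum = bias;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * features[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        LinearDecisionSketch model = new LinearDecisionSketch(new double[] {1.0, -2.0}, 0.5);
        System.out.println(model.decisionValue(new double[] {3.0, 1.0}));
    }
}
```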
java.lang.String to_splice_machine_output(java.lang.String classification_result, int funsite, int increase, java.lang.String classification_name)
classification_result
- The result of the classification, in string format.
funsite
- The location of the functional site in the sequence.
increase
- An extra increase for the output (see documentation).
classification_name
- The name for this type of functional site.
java.lang.String to_splice_machine_output(java.lang.Double distance, int funsite, int increase, java.lang.String classification_name)
distance
- A value (the distance to the hyperplane for SVMs) that is used to give a score to a certain functional site.
funsite
- The location of the functional site in the sequence.
increase
- An extra increase for the location of the functional site in the output (see documentation).
classification_name
- The name for this type of evaluated functional site.
java.lang.String to_genomeview_output(int id, java.lang.Double distance, int funsite_start, int funsite_stop, java.lang.String classification_name)
id
- A unique id for the functional site in the sequence.
distance
- A value (the distance to the hyperplane for SVMs) that is used to give a score to a certain functional site.
funsite_start
- The start of the functional site in the sequence.
funsite_stop
- The stop of the functional site in the sequence.
classification_name
- The name for this type of evaluated functional site.
int[] getPosNegExamplesInFile(java.io.File file)
file
- The training file.
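The layout of the returned array is not specified here. The sketch below assumes that each line of the training file starts with a class label, that a leading `-` marks a negative example, and that the result is `{positives, negatives}`. The class `ExampleCountSketch` and all of these conventions are assumptions made for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Arrays;

/** Hypothetical helper, not part of FunSiP: counts labelled examples in a training file. */
class ExampleCountSketch {

    /** Returns {positives, negatives}, judged by the first token of every non-empty line. */
    static int[] countPosNeg(File file) {
        int positives = 0;
        int negatives = 0;
        try {
            for (String line : Files.readAllLines(file.toPath(), StandardCharsets.UTF_8)) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // skip blank lines
                }
                String label = trimmed.split("\\s+")[0];
                if (label.startsWith("-")) {
                    negatives++;
                } else {
                    positives++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new int[] {positives, negatives};
    }

    /** Demo helper: writes the given lines to a temporary file. */
    static File tempFileWith(String... lines) {
        try {
            File file = File.createTempFile("training", ".tmp");
            file.deleteOnExit();
            Files.write(file.toPath(), Arrays.asList(lines), StandardCharsets.UTF_8);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File training = tempFileWith("+1 1:0.5", "-1 1:0.1", "", "+1 1:0.9");
        System.out.println(Arrays.toString(countPosNeg(training)));
    }
}
```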
java.lang.String getModelFile()
void setModelFile(java.lang.String svmModelFile)
svmModelFile
- The name of the file containing the new model.
void setOptions(ClassifierOptions options)
options
- The new set of options for this classifier.
java.util.List<ValPosCombination> performAttributeEvaluation(boolean sort, weka.attributeSelection.AttributeEvaluator evaluator)
sort
- Whether to sort the resulting ValPosCombinations according to their values.
evaluator
- The evaluator used for evaluating the attributes.
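One way to read the relationship between attribute evaluation and the attribute filter taken by applyAttributeFilter is sketched below: per-attribute scores are ranked, the best positions are kept, and the resulting filter is applied to a feature vector. The class `AttributeFilterSketch`, the nested `ValPos` stand-in for ValPosCombination, the 0-based positions and the "higher score is better" ordering are all assumptions; no Weka evaluator is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper, not part of FunSiP: ranks attributes and filters feature vectors. */
class AttributeFilterSketch {

    /** Stand-in for ValPosCombination: an evaluation value paired with an attribute position. */
    static final class ValPos {
        final double value;
        final int position;

        ValPos(double value, int position) {
            this.value = value;
            this.position = position;
        }
    }

    /** Returns the positions of the best-scoring attributes, at most maxNumFeatures, in ascending order. */
    static List<Integer> topPositions(double[] scores, int maxNumFeatures) {
        List<ValPos> ranked = new ArrayList<>();
        for (int i = 0; i < scores.length; i++) {
            ranked.add(new ValPos(scores[i], i));
        }
        // Sort by value, highest first (assumes a higher score means a more useful attribute).
        ranked.sort(Comparator.comparingDouble((ValPos vp) -> vp.value).reversed());
        List<Integer> keep = new ArrayList<>();
        for (int i = 0; i < ranked.size() && i < maxNumFeatures; i++) {
            keep.add(ranked.get(i).position);
        }
        Collections.sort(keep);
        return keep;
    }

    /** Keeps only the features whose positions are listed in the filter. */
    static List<Double> applyFilter(List<Double> features, List<Integer> filter) {
        List<Double> kept = new ArrayList<>();
        for (int position : filter) {
            kept.add(features.get(position));
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] scores = {0.1, 0.9, 0.4, 0.7}; // assumed evaluator output, one score per attribute
        List<Integer> filter = topPositions(scores, 2);
        System.out.println(filter);
        System.out.println(applyFilter(Arrays.asList(10.0, 20.0, 30.0, 40.0), filter));
    }
}
```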
weka.core.Instances getTrainingFileInstances()
void applyAttributeFilter(java.util.List<java.lang.Integer> attributeFilter, int maxNumFeatures, java.io.File toBeFilteredFile)
attributeFilter
- The filter: a list with the numbers of the attributes that MUST be preserved.
maxNumFeatures
- The maximum number of features to be used by the classifier.
toBeFilteredFile
- The file containing the various features (stored in a classifier-dependent way) that should be filtered by the given attribute filter.
double getSigmoid_A()
void setSigmoid_A(double sigmoid_A)
sigmoid_A
- The new sigmoid variable A.
double getSigmoid_B()
void setSigmoid_B(double sigmoid_B)
sigmoid_B
- The new sigmoid variable B.
java.lang.String getFileExtension()
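The sigmoid variables A and B are described only as restructuring the classifier output with a sigmoid curve. A common way to do this for SVMs is Platt scaling, which maps a decision value f to 1 / (1 + exp(A·f + B)). The sketch below assumes that convention; whether FunSiP uses this exact form or sign convention is not stated in this documentation, and the class `SigmoidSketch` is hypothetical.

```java
/** Hypothetical helper, not part of FunSiP: Platt-style sigmoid rescaling of a decision value. */
class SigmoidSketch {

    /** Maps a decision value to (0, 1) using 1 / (1 + exp(a * distance + b)). */
    static double probability(double distance, double a, double b) {
        return 1.0 / (1.0 + Math.exp(a * distance + b));
    }

    public static void main(String[] args) {
        double a = -2.0; // assumed value; a negative A makes larger distances more probable
        double b = 0.0;  // assumed value
        for (double distance : new double[] {-1.0, 0.0, 1.0}) {
            System.out.println(distance + " -> " + probability(distance, a, b));
        }
    }
}
```

With a negative A, the mapping increases monotonically with the distance, and with B equal to zero a distance of zero maps to exactly 0.5.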