util.io
Class FileUtils

java.lang.Object
  extended by util.io.FileUtils

public class FileUtils
extends java.lang.Object

Utility class containing I/O methods (mostly file I/O) that the JASPr code needs but that are rather specific to it. As a result, few of these methods can be reused in other programs.

Author:
Michiel Van Bel

Constructor Summary
FileUtils()
           
 
Method Summary
static CrossValidationData fillCrossValFiles(int posLength, int negLength, java.io.File posFile, java.io.File negFile, boolean isDonor)
           
static java.io.File fillCrossValTrainingFile(int currentFold, int posLength, int negLength, java.io.File posFile, java.io.File negFile, boolean isDonor)
          Creates and fills the temporary file with data for crossvalidation.
static java.lang.String getExtension(java.lang.String fileName)
          Returns the extension of a filename
static int getLinesInFile(java.io.File file)
          Returns the number of lines (terminated by a carriage return) in the file
static void main(java.lang.String[] args)
           
static java.io.File makeCopy_DestroyOriginal(java.io.File original)
          This method makes an identical copy of a file (new filename is old filename with _copy attached at the end), and then deletes the original.
static void mergeAndDeleteResults(java.io.File f_d, java.io.File f_a, java.io.File r_d, java.io.File r_a, java.lang.String name)
          This method merges the results from 4 different files, by sorting them on the number which is in the first column (it is assumed that the files are thus well-formatted).
static CrossValidationData seperateData(java.util.SortedSet<java.lang.Integer> randomPosTest, java.util.SortedSet<java.lang.Integer> randomNegTest, java.io.File posFile, java.io.File negFile, java.io.File training, java.io.File test)
           
static void writeCrossValidation(CrossValidationResult result, Options options, org.apache.log4j.Logger logger, boolean isDonor, java.io.File featureFile, int trainPos, int testPos, int trainNeg, int testNeg)
          This method writes the results of a crossvalidation to a file.
static void writeRocGnuPlotCommand(java.util.List<RocCurveData> fileNames, java.lang.String path, boolean isDonor)
          This method generates the Gnuplot-commands to display the roc-curves (only possible with cross-validation).
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FileUtils

public FileUtils()
Method Detail

main

public static void main(java.lang.String[] args)

mergeAndDeleteResults

public static void mergeAndDeleteResults(java.io.File f_d,
                                         java.io.File f_a,
                                         java.io.File r_d,
                                         java.io.File r_a,
                                         java.lang.String name)
This method merges the results from 4 different files, by sorting them on the number which is in the first column (it is assumed that the files are thus well-formatted). The four files are the end result of the prediction algorithm, but only a single file may be delivered, hence the merge.

Parameters:
f_d - The file with forward donor data results
f_a - The file with forward acceptor data results
r_d - The file with reverse donor data results
r_a - The file with reverse acceptor data results
name - The name of the resulting merged file
TODO: This code is rather specific and dependent on its callers. A better signature would take an array or list of files, a name for the resulting file, the index of the column to sort on (in the files to be merged), and the split character (most likely tab or space).
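
The generalisation proposed in the TODO could look roughly like the sketch below. It is not the JASPr implementation: the class name MergeSketch, the method mergeSorted, and all of its parameters are hypothetical, and the sort column is assumed to hold whole numbers.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of the generalised merge described in the TODO.
final class MergeSketch {
    /**
     * Merges all input files into one output file, sorted on the number
     * found in the given zero-based column, then deletes the inputs.
     * Assumes every non-empty line holds an integer in that column.
     */
    static void mergeSorted(List<Path> inputs, Path output, int column,
                            String separatorRegex) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Path input : inputs) {
            for (String line : Files.readAllLines(input)) {
                if (!line.isEmpty()) {
                    lines.add(line);
                }
            }
        }
        // List.sort is stable, so lines with equal keys keep their input order.
        lines.sort(Comparator.comparingLong(
                (String line) -> Long.parseLong(line.split(separatorRegex)[column].trim())));
        Files.write(output, lines);
        // Delete the inputs only after the merged file has been written.
        for (Path input : inputs) {
            Files.delete(input);
        }
    }
}
```

With the four result files in a list, column 0 and "\t" as separator, this would cover the current use case under the stated assumptions.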

makeCopy_DestroyOriginal

public static java.io.File makeCopy_DestroyOriginal(java.io.File original)
This method makes an identical copy of a file (new filename is old filename with _copy attached at the end), and then deletes the original. This method is necessary when applying the feature selection filters on a line-by-line method rather than a file-by-file method.

Parameters:
original - The original file
Returns:
The copy of the original file, whose filename is the original filename with _copy appended
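
As an illustration of the copy-then-delete behaviour described above, here is a minimal sketch using java.nio.file. The class CopySketch and the method copyThenDelete are hypothetical names; the actual JASPr code may well work differently.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch; not the JASPr implementation.
final class CopySketch {
    /** Copies the file to "<name>_copy" in the same directory, then deletes the original. */
    static File copyThenDelete(File original) throws IOException {
        Path source = original.toPath();
        // "_copy" is attached at the very end of the name, after any extension.
        Path target = source.resolveSibling(source.getFileName() + "_copy");
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        // The original is removed only once the copy has succeeded.
        Files.delete(source);
        return target.toFile();
    }
}
```

Under this sketch a file named features.txt would end up as features.txt_copy.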

writeRocGnuPlotCommand

public static void writeRocGnuPlotCommand(java.util.List<RocCurveData> fileNames,
                                          java.lang.String path,
                                          boolean isDonor)
This method generates the Gnuplot-commands to display the roc-curves (only possible with cross-validation). Two different Gnuplot command files are generated:
a) The first displays all the roc-curves in a single plot. This works for a limited number of roc-curves that are also rather divergent, but becomes less useful as the number of roc-curves increases.
b) The second displays (and stores as png-files) every pair of roc-curves, which makes it easy to compare all the roc-curves with one another.

Parameters:
fileNames - A list with all the filenames and data which contain roc-curve data
path - The path in which to place the resulting gnuplot-command file
isDonor - Indicates whether we are dealing with donors or acceptors
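
The two kinds of command file could be produced along the lines of the sketch below. It is only an illustration: RocGnuplotSketch, plotAll, plotPairs and the roc_i_j.png output names are hypothetical, plain file names stand in for RocCurveData, and each data file is assumed to hold two columns (x in column 1, y in column 2).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; file names replace the RocCurveData objects of the real API.
final class RocGnuplotSketch {
    /** One gnuplot command that draws every curve in a single plot. */
    static String plotAll(List<String> dataFiles) {
        List<String> curves = new ArrayList<>();
        for (String file : dataFiles) {
            curves.add(curve(file));
        }
        return "plot " + String.join(", ", curves);
    }

    /** Commands that write one png per unordered pair of curves. */
    static List<String> plotPairs(List<String> dataFiles) {
        List<String> commands = new ArrayList<>();
        commands.add("set terminal png");
        for (int i = 0; i < dataFiles.size(); i++) {
            for (int j = i + 1; j < dataFiles.size(); j++) {
                commands.add("set output \"roc_" + i + "_" + j + ".png\"");
                commands.add("plot " + curve(dataFiles.get(i)) + ", " + curve(dataFiles.get(j)));
            }
        }
        return commands;
    }

    // Assumes column 1 is the x value and column 2 the y value of the curve.
    private static String curve(String file) {
        return "\"" + file + "\" using 1:2 with lines title \"" + file + "\"";
    }
}
```

For n curves the pairwise variant emits n(n-1)/2 plots, which is why it scales better for comparison than a single crowded plot.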

getLinesInFile

public static int getLinesInFile(java.io.File file)
Returns the number of lines (terminated by a carriage return) in the file

Parameters:
file - The file whose lines are to be counted
Returns:
The number of lines
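
A line count of this kind can be sketched as follows. LineCountSketch and countLines are hypothetical names, and the sketch assumes UTF-8 text. Note that BufferedReader.readLine also counts a final line that lacks a terminator, which may differ from the documented behaviour.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical sketch; not the JASPr implementation.
final class LineCountSketch {
    /** Counts lines by reading the file once, without loading it into memory. */
    static int countLines(File file) throws IOException {
        int count = 0;
        try (BufferedReader reader =
                     Files.newBufferedReader(file.toPath(), StandardCharsets.UTF_8)) {
            // readLine treats \n, \r and \r\n as line terminators.
            while (reader.readLine() != null) {
                count++;
            }
        }
        return count;
    }
}
```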

getExtension

public static java.lang.String getExtension(java.lang.String fileName)
Returns the extension of a filename

Parameters:
fileName - The name of the file
Returns:
The extension of the file
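
The documentation does not say how names without a dot are handled. The sketch below shows one plausible reading, returning the text after the last dot and an empty string when there is none. ExtensionSketch is a hypothetical class, and the input is assumed to be a bare file name rather than a full path.

```java
// Hypothetical sketch; the real method may treat edge cases differently.
final class ExtensionSketch {
    /** Returns the text after the last dot, or "" when there is no extension. */
    static String getExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        // No dot at all, or a trailing dot: treat both as "no extension".
        if (dot < 0 || dot == fileName.length() - 1) {
            return "";
        }
        return fileName.substring(dot + 1);
    }
}
```

For a name with several dots, such as archive.tar.gz, this sketch yields only the last component (gz).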

writeCrossValidation

public static void writeCrossValidation(CrossValidationResult result,
                                        Options options,
                                        org.apache.log4j.Logger logger,
                                        boolean isDonor,
                                        java.io.File featureFile,
                                        int trainPos,
                                        int testPos,
                                        int trainNeg,
                                        int testNeg)
This method writes the results of a crossvalidation to a file. This includes basic figures such as the number of true positives and false positives, but also derived information such as the F_1 measure and the specificity. It may seem redundant to require both the feature file and the trainPos/trainNeg counts, but both are needed: a special case of crossvalidation may be used in which the data is not split evenly, so more information about the total amount of training data is required. It is assumed that the crossvalidation result has already computed all the derived information.

Parameters:
result - The crossvalidation result
options - The different Jaspr options
logger - The logging facility (used for error-output)
isDonor - indicates whether we are dealing with donor or acceptor data
featureFile - The file containing the training-features (used for determining the total amount of positives and negatives)
trainPos - The number of positive training examples used
testPos - The number of positive test examples used
trainNeg - The number of negative training examples used
testNeg - The number of negative test examples used
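
For reference, the two derived measures named above follow from the confusion counts by their standard definitions. The sketch below is not taken from CrossValidationResult: MetricsSketch and its methods are hypothetical, and returning 0 for an empty denominator is an arbitrary choice made here.

```java
// Hypothetical helper showing the standard definitions of the derived measures.
final class MetricsSketch {
    /** Specificity = TN / (TN + FP); returns 0 when there are no negatives. */
    static double specificity(int trueNegatives, int falsePositives) {
        int negatives = trueNegatives + falsePositives;
        return negatives == 0 ? 0.0 : (double) trueNegatives / negatives;
    }

    /** F_1 = 2*TP / (2*TP + FP + FN); returns 0 when the denominator is 0. */
    static double f1(int truePositives, int falsePositives, int falseNegatives) {
        int denominator = 2 * truePositives + falsePositives + falseNegatives;
        return denominator == 0 ? 0.0 : 2.0 * truePositives / denominator;
    }
}
```

The F_1 form used here is algebraically the same as the harmonic mean of precision and recall.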

fillCrossValTrainingFile

public static java.io.File fillCrossValTrainingFile(int currentFold,
                                                    int posLength,
                                                    int negLength,
                                                    java.io.File posFile,
                                                    java.io.File negFile,
                                                    boolean isDonor)
Creates and fills the temporary file with data for crossvalidation.

Parameters:
currentFold - The current subset (starting from 1)
posLength - The length of the subset for the positive file
negLength - The length of the subset for the negative file
posFile - The positive file
negFile - The negative file
isDonor - Indicates whether this concerns a donor svm (just for naming purposes)
Returns:
The file with data
Throws:
java.lang.Exception
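
The documentation does not state which lines end up in the training file. The sketch below assumes the usual k-fold scheme, in which fold currentFold (starting from 1) covers a contiguous block of subsetLength lines that is held out, and all other lines are used for training. FoldSketch and its methods are hypothetical, and lists of strings stand in for the positive or negative file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of contiguous k-fold selection; an assumption, not the JASPr code.
final class FoldSketch {
    /** True when the zero-based line index lies in the given one-based fold. */
    static boolean isHeldOut(int lineIndex, int currentFold, int subsetLength) {
        int start = (currentFold - 1) * subsetLength;
        return lineIndex >= start && lineIndex < start + subsetLength;
    }

    /** Returns every line that is not part of the held-out fold. */
    static List<String> trainingLines(List<String> lines, int currentFold, int subsetLength) {
        List<String> training = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (!isHeldOut(i, currentFold, subsetLength)) {
                training.add(lines.get(i));
            }
        }
        return training;
    }
}
```

The same selection would be applied once to the positive file with posLength and once to the negative file with negLength.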

fillCrossValFiles

public static CrossValidationData fillCrossValFiles(int posLength,
                                                    int negLength,
                                                    java.io.File posFile,
                                                    java.io.File negFile,
                                                    boolean isDonor)

seperateData

public static CrossValidationData seperateData(java.util.SortedSet<java.lang.Integer> randomPosTest,
                                               java.util.SortedSet<java.lang.Integer> randomNegTest,
                                               java.io.File posFile,
                                               java.io.File negFile,
                                               java.io.File training,
                                               java.io.File test)
                                        throws java.lang.Exception
Throws:
java.lang.Exception