util.secondaryStructure
Class SecondaryStructureExtraction

java.lang.Object
  extended by util.secondaryStructure.SecondaryStructureExtraction

public class SecondaryStructureExtraction
extends java.lang.Object

This class is primarily used for building the secondary structure of non-training files (it can also be used for training files, but there are better ways of doing that). It provides a link to the native RNAfold program, which is assumed to be installed locally.

Author:
Michiel Van Bel, VIB, Ghent University, Michiel.vanbel@psb.ugent.be

Constructor Summary
SecondaryStructureExtraction(org.apache.log4j.Logger logger)
          Constructor of the secondary structure builder class
 
Method Summary
 java.lang.String extractSequenceString(java.lang.String sequence, int location, int maxUp, int maxDown)
          This method extracts a substring around the splice site location whose length is fixed by the maximum upstream and downstream lengths, wherever the splice site lies in the sequence.
 SecondaryStructureData extractStructures(java.lang.String sequence, java.util.List<java.lang.Integer> locations, java.lang.String outputDirectory, int maxUp, int maxDown)
          This method extracts the actual secondary structures from the sequence at the given list of possible splice sites.
static void main(java.lang.String[] args)
           
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SecondaryStructureExtraction

public SecondaryStructureExtraction(org.apache.log4j.Logger logger)
Constructor of the secondary structure builder class

Parameters:
logger - The logging facility for the project
Method Detail

extractStructures

public SecondaryStructureData extractStructures(java.lang.String sequence,
                                                java.util.List<java.lang.Integer> locations,
                                                java.lang.String outputDirectory,
                                                int maxUp,
                                                int maxDown)
This method extracts the actual secondary structures from the sequence at the given list of possible splice sites. For each splice site, a subsequence of a certain length (determined by maxUp and maxDown) is extracted from the sequence around that site. This is done because RNAfold runs faster on shorter sequences; the results are not as good, so this is a trade-off between speed and accuracy.

Parameters:
sequence - The sequence that contains the splice sites and from which the secondary structures will be extracted
locations - The different (pseudo-) splice sites within the sequence
outputDirectory - The output directory for temporary files
maxUp - Maximum upstream range from the splice site
maxDown - Maximum downstream range from the splice site
Returns:
Data structure containing the secondary structures, the associated primary sequences of the same length and the associated free energies.
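
The returned data pairs each structure with a free energy. This page does not document how the class reads RNAfold's output, so the following is only a minimal sketch, assuming the usual RNAfold result line of a dot-bracket structure followed by the energy in parentheses. The class RnaFoldLineSketch and its methods are hypothetical names and are not part of this API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of util.secondaryStructure.
final class RnaFoldLineSketch {
    // Assumed line shape: "<dot-bracket structure> (<energy>)", e.g. "((((....)))) (-3.40)".
    // The energy may be padded with spaces inside the parentheses.
    private static final Pattern LINE =
            Pattern.compile("^([.()]+)\\s+\\(\\s*(-?\\d+(?:\\.\\d+)?)\\s*\\)$");

    private RnaFoldLineSketch() {}

    private static Matcher match(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a structure/energy line: " + line);
        }
        return m;
    }

    // Returns the dot-bracket structure part of the line.
    static String structure(String line) {
        return match(line).group(1);
    }

    // Returns the free energy part of the line as a double.
    static double energy(String line) {
        return Double.parseDouble(match(line).group(2));
    }

    public static void main(String[] args) {
        String line = "((((....)))) (-3.40)";
        System.out.println(structure(line) + " " + energy(line));
    }
}
```

Splitting on the last space would not be reliable here because the energy field may itself contain padding spaces, which is why the sketch uses a regular expression.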

extractSequenceString

public java.lang.String extractSequenceString(java.lang.String sequence,
                                              int location,
                                              int maxUp,
                                              int maxDown)
This method extracts a substring around the splice site location whose length is fixed by the maximum upstream and downstream lengths, wherever the splice site lies in the sequence. If the boundaries fall outside the bounds of the sequence, 'N' nucleotide characters are added as padding.

Parameters:
sequence - The sequence
location - The location of the splice site
maxUp - The maximum upstream length
maxDown - The maximum downstream length
Returns:
The substring, with a length fixed by maxUp and maxDown and padded with 'N' where needed
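
The padding behaviour described above can be illustrated with a minimal sketch. It is not the class's actual implementation: it assumes a 0-based location and a window covering positions location - maxUp up to, but not including, location + maxDown, neither of which this page confirms. The class WindowSketch and the method extractWindow are hypothetical names.

```java
// Hypothetical illustration, not part of util.secondaryStructure.
final class WindowSketch {
    private WindowSketch() {}

    // Assumption: location is 0-based and the window is [location - maxUp, location + maxDown).
    static String extractWindow(String sequence, int location, int maxUp, int maxDown) {
        StringBuilder window = new StringBuilder(maxUp + maxDown);
        for (int i = location - maxUp; i < location + maxDown; i++) {
            // Positions outside the sequence are filled with the unknown nucleotide 'N'.
            boolean inside = i >= 0 && i < sequence.length();
            window.append(inside ? sequence.charAt(i) : 'N');
        }
        return window.toString();
    }

    public static void main(String[] args) {
        // Window starts two positions before the sequence, so two 'N' are prepended.
        System.out.println(extractWindow("ACGTACGT", 1, 3, 2)); // prints NNACG
    }
}
```

Under these assumptions every window has length maxUp + maxDown, so all extracted strings have the same length regardless of how close the splice site is to either end of the sequence.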

main

public static void main(java.lang.String[] args)