Monday, December 17, 2012

A plethora of solid2fastq (csfasta-to-fastq) converters

I hadn't realised how many programs and scripts I had accumulated that all do the same task. At last count there are 4 of them in my tool closet.


The C binary from bfast

solid2fastq 0.6.4a

Usage: solid2fastq [options]
        -c              produce no output.
        -n      INT     number of reads per file.
        -o      STRING  output prefix.
        -j              input files are bzip2 compressed.
        -z              input files are gzip compressed.
        -J              output files are bzip2 compressed.
        -Z              output files are gzip compressed.
        -t      INT     trim INT bases from the 3' end of the reads.
        -h              print this help message.

 send bugs to bfast-help@lists

solid2fastq.pl from bfast-0.6.4a
The header comments in the script point back to the C binary above:
# Author: Nils Homer
# Please see the C implementation of this script.


EDIT: Thanks to iceman for the reminder in the comments:
"Make sure that you use the BWA's solid2fastq.pl if you are going to use BWA as it "double-encodes" the reads."

solid2fastq.pl from bwa-0.5.7
Usage: solid2fastq.pl <in.title> <out.prefix>

Note: <in.title> is the string showed in the `# Title:' line of a
      ".csfasta" read file. Then <in.title>F3.csfasta is read sequence
      file and <in.title>F3_QV.qual is the quality file. If
      <in.title>R3.csfasta is present, this script assumes reads are
      paired; otherwise reads will be regarded as single-end.

      The read name will be <out.prefix>:panel_x_y/[12] with `1' for R3
      tag and `2' for F3. Usually you may want to use short <out.prefix>
      to save diskspace. Long <out.prefix> also causes troubles to maq.

# Author: lh3
# Note: Ideally, this script should be written in C. It is a bit slow at present.
# Also note that this script is different from the one contained in MAQ.
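
To make the "double-encoding" from the EDIT above a bit more concrete, here is a minimal Python sketch of the idea: the color-space digits are rewritten as pseudo-bases so that a base-space aligner can read them. It is not the BWA script itself. The function name is made up, and I am assuming the usual mapping (0123. to ACGTN) and that the leading primer base and the first color are dropped.

```python
# Hypothetical helper, not part of BWA. Assumed mapping: 0123. -> ACGTN.
COLOR_TO_BASE = str.maketrans("0123.", "ACGTN")

def double_encode(csfasta_read: str) -> str:
    """Rewrite a color-space read such as 'T01230.21' as pseudo-bases.

    Assumption: the first character is the primer base and the first
    color depends on it, so both are dropped before translating.
    """
    colors = csfasta_read.strip()[2:]
    return colors.translate(COLOR_TO_BASE)

if __name__ == "__main__":
    print(double_encode("T01230.21"))
```

The output letters are still colors in disguise, not real nucleotides, which is why reads converted this way only make sense with an aligner that expects them.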

maq-0.7.1/scripts/solid2fastq.pl

Usage: solid2fastq.pl <in.title> <out.prefix>

Note: <in.title> is the string showed in the `# Title:' line of a
      ".csfasta" read file. Then <in.title>F3.csfasta is read sequence
      file and <in.title>F3_QV.qual is the quality file. If
      <in.title>R3.csfasta is present, this script assumes reads are
      paired; otherwise reads will be regarded as single-end.

      The read name will be <out.prefix>:panel_x_y/[12] with `1' for F3
      tag and `2' for R3. Usually you may want to use short <out.prefix>
      to save diskspace. Long <out.prefix> also causes troubles to maq.

# Author: lh3
# Note: Ideally, this script should be written in C. It is a bit slow at present.
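
One easy-to-miss difference between the two usage notes is the mate numbering: the BWA script gives R3 the `/1` suffix, while the MAQ script gives it to F3. A small Python sketch of the naming rule, under my own assumptions: the csfasta header looks like `>panel_x_y_F3`, and the names `read_name`, `out_prefix` and `flavor` are invented for illustration.

```python
import re

# Mate numbers as stated in the two usage notes (tag -> 1 or 2).
MATE_NUMBER = {
    "bwa": {"R3": 1, "F3": 2},
    "maq": {"F3": 1, "R3": 2},
}

def read_name(header: str, out_prefix: str, flavor: str) -> str:
    """Build 'out_prefix:panel_x_y/N' from a header like '>1_2_3_F3'.

    Assumption: headers have exactly the form '>panel_x_y_TAG' with
    TAG being F3 or R3; anything else is rejected.
    """
    m = re.match(r"^>(\d+_\d+_\d+)_(F3|R3)\s*$", header)
    if m is None:
        raise ValueError(f"unexpected csfasta header: {header!r}")
    coords, tag = m.groups()
    return f"{out_prefix}:{coords}/{MATE_NUMBER[flavor][tag]}"

if __name__ == "__main__":
    print(read_name(">1_2_3_F3", "s", "bwa"))
    print(read_name(">1_2_3_F3", "s", "maq"))
```

So the same F3 read ends up as `/2` with one script and `/1` with the other, which is worth keeping in mind when mixing their output.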
