Usage:

    snpEff [eff] [options] genome_version [variants_file]
    or: snpEff download [options] genome_version
    or: snpEff build [options] genome_version
    or: snpEff dump [options] genome_version
    or: snpEff cds [options] genome_version

There are four main 'commands': calculate effects (eff, which is the default), build database (build), dump database (dump), test CDS in database (cds). In addition, the download command downloads and installs a database.
Calculate variant effects: snpEff [eff]
If you type the command without any arguments, it shows all available options ("java -jar snpEff.jar eff"):

    Usage: snpEff [eff] genome_version [variants_file]

    Input file: Default is STDIN

    Options:
      -a , -around                  : Show N codons and amino acids around change (only in coding regions). Default is 0 codons.
      -i format                     : Input format [ vcf, txt, pileup, bed ]. Default: VCF.
      -o format                     : Output format [ txt, vcf, gatk, bed, bedAnn ]. Default: VCF.
      -interval                     : Use a custom interval file (you may use this option many times)
      -chr string                   : Prepend 'string' to chromosome name (e.g. 'chr1' instead of '1'). Only on TXT output.
      -s, -stats                    : Name of stats file (summary). Default is 'snpEff_summary.html'
      -t                            : Use multiple threads (implies '-noStats'). Default 'off'

    Sequence change filter options:
      -del                          : Analyze deletions only
      -ins                          : Analyze insertions only
      -hom                          : Analyze homozygous variants only
      -het                          : Analyze heterozygous variants only
      -minQ X, -minQuality X        : Filter out variants with quality lower than X
      -maxQ X, -maxQuality X        : Filter out variants with quality higher than X
      -minC X, -minCoverage X       : Filter out variants with coverage lower than X
      -maxC X, -maxCoverage X       : Filter out variants with coverage higher than X
      -mnp                          : Only MNPs (multiple nucleotide polymorphisms)
      -snp                          : Only SNPs (single nucleotide polymorphisms)

    Results filter options:
      -fi bedFile                   : Only analyze changes that intersect with the intervals specified in this file (you may use this option many times)
      -no-downstream                : Do not show DOWNSTREAM changes
      -no-intergenic                : Do not show INTERGENIC changes
      -no-intron                    : Do not show INTRON changes
      -no-upstream                  : Do not show UPSTREAM changes
      -no-utr                       : Do not show 5_PRIME_UTR or 3_PRIME_UTR changes

    Annotations filter options:
      -canon                        : Only use canonical transcripts.
      -onlyReg                      : Only use regulation tracks.
      -onlyTr file.txt              : Only use the transcripts in this file. Format: One transcript ID per line.
      -reg name                     : Regulation track to use (this option can be used several times).
      -treatAllAsProteinCoding bool : If true, all transcripts are treated as if they were protein coding. Default: Auto
      -ud, -upDownStreamLen         : Set upstream downstream interval length (in bases)

    Generic options:
      -0                            : File positions are zero-based (same as '-inOffset 0 -outOffset 0')
      -1                            : File positions are one-based (same as '-inOffset 1 -outOffset 1')
      -c , -config                  : Specify config file
      -h , -help                    : Show this help and exit
      -if, -inOffset                : Offset input by a number of bases. E.g. '-inOffset 1' for one-based input files
      -of, -outOffset               : Offset output by a number of bases. E.g. '-outOffset 1' for one-based output files
      -noLog                        : Do not report usage statistics to server
      -noStats                      : Do not create stats (summary) file
      -q , -quiet                   : Quiet mode (do not show any messages or errors)
      -v , -verbose                 : Verbose mode

Options
Option | Note |
---|---|
-a, -around | Show N codons and amino acids around change (only in coding regions). Default is 0 codons (i.e. by default is turned off). |
-i | Input format: [ txt, vcf, pileup, bed ]. Default: VCF. |
-interval | Add custom interval file. You may use this option many times to add many interval files. |
-o | Output format: [ txt, vcf, gatk, bed, bedAnn ]. Default: VCF. |
-s, -stats | Name of stats file (summary). Default is 'snpEff_summary.html'. |
-chr {string} | Prepend 'string' to each chromosome name when printing (e.g. with '-chr chr', 'chr7' is printed instead of '7'). Applies only to TXT output. |
-t | Use multiple threads (implies '-noStats'). If active, tries to use available cores in the computer. Default 'off' |
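As a minimal sketch of how the options above combine, the script below assembles an "eff" command that reads VCF, writes VCF, and gives the summary file a custom name. The file names (variants.vcf, variants.eff.vcf, my_summary.html) are hypothetical, GRCh37.64 is just the genome version used elsewhere on this page, and redirecting the annotated output from STDOUT into a file is an assumption about how you want to collect the results.

```shell
#!/bin/sh
# Hypothetical names: adjust the genome version and file names to your setup.
genome="GRCh37.64"
input="variants.vcf"
output="variants.eff.vcf"

# Assemble the command: VCF in, VCF out, custom name for the summary file.
set -- java -jar snpEff.jar eff -i vcf -o vcf -s my_summary.html "$genome" "$input"
cmd_line="$*"
echo "Command: $cmd_line"

# Run only when the jar and the input file are actually present.
if [ -f snpEff.jar ] && [ -f "$input" ]; then
    "$@" > "$output"
else
    echo "snpEff.jar or $input not found; nothing was run."
fi
```

Printing the command before running it makes it easy to check which options are in effect.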
Sequence change filter options
Option | Note |
---|---|
-del | Analyze deletions only (filter out insertions, SNPs and MNPs). |
-hom | Analyze homozygous sequence changes only (filter out heterozygous changes). |
-het | Analyze heterozygous sequence changes only (filter out homozygous changes). Note that this option may not be valid when using VCF4 files: there might be more than two changes per line, in which case the notion of a heterozygous change is lost. |
-ins | Analyze insertions only (filter out deletions, SNPs and MNPs). |
-minC, -minCoverage | Filter out sequence changes with coverage lower than X. |
-maxC, -maxCoverage | Filter out sequence changes with coverage higher than X. |
-minQ, -minQuality | Filter out sequence changes with quality lower than X. |
-maxQ, -maxQuality | Filter out sequence changes with quality higher than X. |
-mnp | Analyze MNPs only (filter out insertions, deletions and SNPs). |
-snp | Analyze SNPs only (filter out insertions, deletions and MNPs). |
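The sequence change filters can be sketched as follows. The thresholds (quality 30, coverage 10) and file names are illustrative only, and the example assumes that several filters given together are all applied at once. No variants file is passed on the command line, so the input is read from STDIN, which is the documented default.

```shell
#!/bin/sh
genome="GRCh37.64"      # genome version used elsewhere on this page
input="variants.vcf"    # hypothetical input file
min_quality=30          # illustrative threshold, not a recommendation
min_coverage=10         # illustrative threshold, not a recommendation

# Keep homozygous SNPs with quality >= 30 and coverage >= 10
# (assumes the filters are combined when given together).
set -- java -jar snpEff.jar eff -snp -hom -minQ "$min_quality" -minC "$min_coverage" "$genome"
cmd_line="$*"
echo "Command: $cmd_line < $input"

# Feed the variants through STDIN; run only if the jar and input exist.
if [ -f snpEff.jar ] && [ -f "$input" ]; then
    "$@" < "$input" > variants.snp_hom.vcf
else
    echo "snpEff.jar or $input not found; nothing was run."
fi
```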
Results filter options
Option | Note |
---|---|
-fi {bedFile} | Only analyze changes intersecting intervals in file (you may use this option many times) |
-no-downstream | Do not show DOWNSTREAM changes |
-no-intergenic | Do not show INTERGENIC changes |
-no-intron | Do not show INTRON changes |
-no-upstream | Do not show UPSTREAM changes |
-no-utr | Do not show 5_PRIME_UTR or 3_PRIME_UTR changes |
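A small sketch of the results filters: it writes a two-interval BED file and restricts the analysis to those intervals with "-fi", while hiding intergenic and intron changes. The coordinates and file names are made up for illustration, and the chromosome names ('1', '2') are an assumption; they need to match the naming used by your genome database.

```shell
#!/bin/sh
# Work in a temporary directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
bed="$workdir/regions.bed"

# BED columns are tab-separated: chromosome, start, end (made-up intervals).
printf '1\t1000000\t1100000\n2\t5000000\t5050000\n' > "$bed"
n_regions=$(wc -l < "$bed" | tr -d ' ')
echo "Wrote $n_regions intervals to $bed"

# Only analyze changes inside the intervals; hide intergenic and intron changes.
set -- java -jar snpEff.jar eff -fi "$bed" -no-intergenic -no-intron GRCh37.64 variants.vcf
cmd_line="$*"
echo "Command: $cmd_line"

if [ -f snpEff.jar ] && [ -f variants.vcf ]; then
    "$@" > variants.regions.vcf
else
    echo "snpEff.jar or variants.vcf not found; nothing was run."
fi
```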
Annotations filter options
Option | Note |
---|---|
-canon | Only annotate using "canonical" transcripts. The canonical transcript is defined as the transcript having the longest CDS. |
-treatAllAsProteinCoding {val} | If value is 'true', report all transcripts as if they were coding. Default: Auto, i.e. if any transcripts are marked as 'protein_coding' then it is set to 'false'; if no transcripts are marked as 'protein_coding' then it is set to 'true'. |
-ud, -upDownStreamLen | Set upstream downstream interval length (in bases). If set to zero or negative, then no UPSTREAM or DOWNSTREAM effects are reported. |
-onlyReg | Only use regulation tracks |
-reg {name} | Regulation track to use (this option can be used several times). |
-onlyTr {file.txt} | Only use the transcripts in this file. Format: One transcript ID per line. |
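To illustrate the annotation filters, the sketch below writes a transcript list in the format "-onlyTr" expects (one transcript ID per line) and sets the upstream/downstream interval length with "-ud". The transcript IDs are placeholders, not real identifiers, and the 1000-base interval is an arbitrary value chosen for the example.

```shell
#!/bin/sh
workdir=$(mktemp -d)
tr_file="$workdir/transcripts.txt"

# One transcript ID per line. These IDs are placeholders for illustration.
cat > "$tr_file" <<'EOF'
TRANSCRIPT_ID_1
TRANSCRIPT_ID_2
TRANSCRIPT_ID_3
EOF
n_transcripts=$(wc -l < "$tr_file" | tr -d ' ')
echo "Wrote $n_transcripts transcript IDs to $tr_file"

# Annotate using only the listed transcripts, with 1000-base
# upstream/downstream intervals (arbitrary example value).
set -- java -jar snpEff.jar eff -onlyTr "$tr_file" -ud 1000 GRCh37.64 variants.vcf
cmd_line="$*"
echo "Command: $cmd_line"

if [ -f snpEff.jar ] && [ -f variants.vcf ]; then
    "$@" > variants.selected_tr.vcf
else
    echo "snpEff.jar or variants.vcf not found; nothing was run."
fi
```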
Generic options
Option | Note |
---|---|
-0 | Indicates that input and output positions are zero-based. That means that the first base in a chromosome is base number 0. This is equivalent to '-inOffset 0 -outOffset 0'. |
-1 | Indicates that input and output positions are one-based. That means that the first base in a chromosome is base number 1. This is equivalent to '-inOffset 1 -outOffset 1'. This is the default. |
-c, -config | Specifies the location of a configuration file. Default location is in current directory. |
-h, -help | Print help and exit. |
-if, -inOffset | Offset all positions in input files by a number of bases. E.g. '-inOffset 1' for one-based input files. |
-of, -outOffset | Offset all outputs by a number of bases. E.g. '-outOffset 1' for one-based outputs. |
-v, -verbose | Verbose mode. |
-q, -quiet | Quiet mode (do not show any messages or errors). |
-noLog | Do not report usage statistics to server. |
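The difference between zero-based and one-based positions is plain arithmetic, sketched below. The convert_position helper is hypothetical and only illustrates the coordinate shift; it is not how snpEff implements the offsets internally. The final command assumes that '-inOffset' and '-outOffset' may be set to different values in one run, and variants.txt is a hypothetical file name.

```shell
#!/bin/sh
# Hypothetical helper: shift a position from one coordinate base to another.
# in_base and out_base are 0 or 1, mirroring the values of -inOffset/-outOffset.
convert_position() {
    pos=$1; in_base=$2; out_base=$3
    echo $((pos - in_base + out_base))
}

# The first base of a chromosome: 0 in zero-based files, 1 in one-based files.
first_base_one_based=$(convert_position 0 0 1)
# The base numbered 100 in a one-based file is numbered 99 in a zero-based file.
same_base_zero_based=$(convert_position 100 1 0)
echo "zero-based 0 is one-based $first_base_one_based"
echo "one-based 100 is zero-based $same_base_zero_based"

# Reading zero-based input and writing one-based output (assumes the two
# offsets can differ within a single run).
cmd_line="java -jar snpEff.jar eff -i txt -o txt -inOffset 0 -outOffset 1 GRCh37.64 variants.txt"
echo "Command: $cmd_line"
```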
Download a database: snpEff download
Download and install a database.
A list of databases is available at the download page.

    Usage: snpEff download [options] genome_version

    Generic options:
      -c , -config  : Specify config file
      -h , -help    : Show this help and exit
      -v , -verbose : Verbose mode
      -noLog        : Do not report usage statistics to server

E.g. to download GRCh37.64, just run:
java -jar snpEff.jar download GRCh37.64
Build database: snpEff build
If you type the command without any arguments, it shows all available options ("java -jar snpEff.jar build"):

    Usage: snpEff build [options] genome_version

    Build DB options:
      -embl           : Use Embl format.
      -genbank        : Use GenBank format.
      -gff2           : Use GFF2 format (obsolete).
      -gff3           : Use GFF3 format.
      -gtf22          : Use GTF 2.2 format.
      -refseq         : Use RefSeq table from UCSC.
      -txt            : Use TXT format (obsolete).
      -onlyReg        : Only build regulation tracks.
      -cellType type  : Only build regulation tracks for cellType "type".

    Generic options:
      -0              : File positions are zero-based (same as '-inOffset 0 -outOffset 0')
      -1              : File positions are one-based (same as '-inOffset 1 -outOffset 1')
      -c , -config    : Specify config file
      -h , -help      : Show this help and exit
      -if, -inOffset  : Offset input by a number of bases. E.g. '-inOffset 1' for one-based input files
      -of, -outOffset : Offset output by a number of bases. E.g. '-outOffset 1' for one-based output files
      -noLog          : Do not report usage statistics to server
      -q , -quiet     : Quiet mode (do not show any messages or errors)
      -v , -verbose   : Verbose mode

Option | Note |
---|---|
-embl | Use Embl format. It will look for gene information in a file called './data/GENOME/genes.embl', which is assumed to be in EMBL format (assuming 'data_dir=./data/' in your snpEff.config file). |
-genbank | Use GenBank format. It will look for gene information in a file called './data/GENOME/genes.gb', which is assumed to be in GenBank format (assuming 'data_dir=./data/' in your snpEff.config file). |
-gff3 | Use GFF3 format. It will look for gene information in a file called './data/GENOME/genes.gff', which is assumed to be in GFF3 format (assuming 'data_dir=./data/' in your snpEff.config file). |
-gff2 | Use GFF2 format. It will look for gene information in a file called './data/GENOME/genes.gff', which is assumed to be in GFF2 format (assuming 'data_dir=./data/' in your snpEff.config file). WARNING: GFF2 format is obsolete and should not be used. |
-gtf22 | Use GTF 2.2 format. It will look for gene information in a file called './data/GENOME/genes.gtf', which is assumed to be in GTF 2.2 format (assuming 'data_dir=./data/' in your snpEff.config file). |
-refseq | Use RefSeq table. It will look for gene information in a file called './data/GENOME/genes.txt', which is assumed to be a RefSeq table from UCSC (assuming 'data_dir=./data/' in your snpEff.config file). |
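The file layout described in the table can be sketched for the GFF3 case as follows. The genome name "mygenome" and the source file "my_annotations.gff" are hypothetical, and the script assumes 'data_dir=./data/' in snpEff.config as stated above. Other steps your snpEff version may require before building (for example, entries in the config file) are not covered here.

```shell
#!/bin/sh
genome="mygenome"                 # hypothetical genome name
data_dir="./data"                 # assumes data_dir=./data/ in snpEff.config
source_gff="my_annotations.gff"   # hypothetical GFF3 file you already have

# The build command looks for the annotations at ./data/GENOME/genes.gff
genes_path="$data_dir/$genome/genes.gff"
mkdir -p "$data_dir/$genome"

# Copy the annotations into place if the source file exists.
if [ -f "$source_gff" ]; then
    cp "$source_gff" "$genes_path"
fi

# Build the database from the GFF3 file, in verbose mode.
set -- java -jar snpEff.jar build -gff3 -v "$genome"
cmd_line="$*"
echo "Command: $cmd_line"

if [ -f snpEff.jar ] && [ -f "$genes_path" ]; then
    "$@"
else
    echo "snpEff.jar or $genes_path not found; nothing was run."
fi
```

For the other formats, only the build flag and the expected file name change (genes.embl, genes.gb, genes.gtf or genes.txt, as listed in the table).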