PairSeq
Sorts and matches sequence records with matching coordinates across files
usage: PairSeq [--version] [-h] -1 SEQ_FILES_1 [SEQ_FILES_1 ...] -2
SEQ_FILES_2 [SEQ_FILES_2 ...] [--outdir OUT_DIR]
[--outname OUT_NAME] [--failed] [--fasta]
[--delim DELIMITER DELIMITER DELIMITER]
[--1f FIELDS_1 [FIELDS_1 ...]] [--2f FIELDS_2 [FIELDS_2 ...]]
[--act {min,max,sum,set,cat}]
[--coord {illumina,solexa,sra,454,presto}]
- --version
show program’s version number and exit
- -h, --help
show this help message and exit
- -1 <seq_files_1>
An ordered list of FASTA/FASTQ files containing head/primary sequences.
- -2 <seq_files_2>
An ordered list of FASTA/FASTQ files containing tail/secondary sequences.
- --outdir <out_dir>
Changes the output directory to the location specified. The input file directory is used if this is not specified.
- --outname <out_name>
Changes the prefix of the successfully processed output file to the string specified. May not be specified with multiple input files.
- --failed
If specified, create files containing records that fail processing.
- --fasta
Specify to force output as FASTA rather than FASTQ.
- --delim <delimiter>
A list of the three delimiters that separate annotation blocks, field names and values, and values within a field, respectively.
- --1f <fields_1>
The annotation fields to copy from file 1 records into file 2 records. If a copied annotation already exists in a file 2 record, then the annotations copied from file 1 will be added to the front of the existing annotation.
- --2f <fields_2>
The annotation fields to copy from file 2 records into file 1 records. If a copied annotation already exists in a file 1 record, then the annotations copied from file 2 will be added to the end of the existing annotation.
- --act {min,max,sum,set,cat}
The collapse actions to take on all fields copied between files to combine duplicate fields into a single value. The actions “min”, “max”, and “sum” perform the corresponding mathematical operation on numeric annotations. The action “set” collapses annotations into a comma-delimited list of unique values. The action “cat” concatenates the values together into a single string. Only applies if the field already exists in the header before being copied from the other file.
- --coord {illumina,solexa,sra,454,presto}
The format of the sequence identifier which defines shared coordinate information across mate pairs.
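To make the idea of coordinate-based pairing concrete, here is a minimal Python sketch. It is not PairSeq's implementation: the function names `coord_key` and `pair_records` are hypothetical, and the two splitting rules are assumptions based on common header layouts (a space before the read descriptor in newer Illumina headers, a trailing `/1` or `/2` in older Solexa-style headers). The other coordinate formats are not covered.

```python
def coord_key(header, scheme):
    """Return the part of a read header shared by both mates.

    The rules below are illustrative assumptions, not the tool's own logic.
    """
    if scheme == "illumina":
        # Assumed layout: "<coordinates> <read>:<filtered>:<control>:<index>"
        return header.split(" ")[0]
    if scheme == "solexa":
        # Assumed layout: "<coordinates>/<read number>"
        return header.rsplit("/", 1)[0]
    raise ValueError(f"scheme not covered by this sketch: {scheme}")


def pair_records(records_1, records_2, scheme):
    """Match (header, sequence) tuples from two files by coordinate key.

    Returns (pairs, unpaired): matched mate pairs, and records from either
    file that have no mate in the other file.
    """
    # Index the second file by coordinate so lookups do not depend on order.
    index_2 = {coord_key(h, scheme): (h, s) for h, s in records_2}
    pairs, unpaired = [], []
    for header, seq in records_1:
        mate = index_2.pop(coord_key(header, scheme), None)
        if mate is None:
            unpaired.append((header, seq))
        else:
            pairs.append(((header, seq), mate))
    # Whatever is left in the index never found a mate in file 1.
    unpaired.extend(index_2.values())
    return pairs, unpaired


# Made-up example records: only the 300:400 coordinate occurs in both files.
reads_1 = [("M1:1:FC:1:1101:100:200 1:N:0:1", "ACGT"),
           ("M1:1:FC:1:1101:300:400 1:N:0:1", "GGCC")]
reads_2 = [("M1:1:FC:1:1101:300:400 2:N:0:1", "TTAA"),
           ("M1:1:FC:1:1101:500:600 2:N:0:1", "CCGG")]
pairs, unpaired = pair_records(reads_1, reads_2, "illumina")
print(len(pairs), len(unpaired))
```

Indexing one file by key is a design choice of this sketch; it lets the two inputs be in different orders, at the cost of holding one file's records in memory.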
- output files:
- pair-pass
successfully paired reads with modified annotations.
- pair-fail
raw reads that could not be assigned to a mate-pair.
- output annotation fields:
- <user defined>
annotation fields specified by the --1f or --2f arguments.
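As a rough illustration of how the --delim, --1f/--2f, and --act options interact, the Python sketch below parses two annotated headers, copies fields between them, and collapses a duplicated field. It is not PairSeq's code. The delimiter values `|`, `=`, and `,`, the `ID|FIELD=value` header layout, the field names, and the helper names (`parse_header`, `collapse`, `copy_fields`, `format_header`) are all assumptions made for the example.

```python
DELIM = ("|", "=", ",")  # assumed for illustration: block, field/value, value-list


def parse_header(header, delim=DELIM):
    """Split 'ID|FIELD=value|...' into a dict; the identifier is kept under 'ID'."""
    blocks = header.split(delim[0])
    fields = {"ID": blocks[0]}
    for block in blocks[1:]:
        name, value = block.split(delim[1], 1)
        fields[name] = value
    return fields


def collapse(value, action, delim=DELIM):
    """Combine a delimited list of values into a single value."""
    values = value.split(delim[2])
    if action in ("min", "max", "sum"):
        result = {"min": min, "max": max, "sum": sum}[action]([float(v) for v in values])
        return str(int(result)) if result.is_integer() else str(result)
    if action == "set":
        # Unique values; keeping first-seen order is a choice made in this sketch.
        return delim[2].join(dict.fromkeys(values))
    if action == "cat":
        return "".join(values)
    raise ValueError(f"unknown action: {action}")


def copy_fields(source, target, names, prepend, action=None, delim=DELIM):
    """Copy the named fields from source into a copy of target.

    If a field already exists in target, the copied value goes to the front
    (prepend=True) or to the end (prepend=False) of the existing value, and
    the optional collapse action is then applied to the combined value.
    """
    merged = dict(target)
    for name in names:
        if name not in source:
            continue
        if name in merged:
            if prepend:
                parts = [source[name], merged[name]]
            else:
                parts = [merged[name], source[name]]
            merged[name] = delim[2].join(parts)
            if action is not None:
                merged[name] = collapse(merged[name], action, delim)
        else:
            merged[name] = source[name]
    return merged


def format_header(fields, delim=DELIM):
    """Rebuild the header string from the field dict."""
    blocks = [fields["ID"]]
    blocks += [f"{k}{delim[1]}{v}" for k, v in fields.items() if k != "ID"]
    return delim[0].join(blocks)


# Made-up records; mimics copying BARCODE and COUNT from file 1 into file 2 with "sum".
rec1 = parse_header("SEQ1|BARCODE=ACGT|COUNT=3")
rec2 = parse_header("SEQ1|COUNT=5|PRIMER=IGHC")
merged = copy_fields(rec1, rec2, ["BARCODE", "COUNT"], prepend=True, action="sum")
print(format_header(merged))
```

In this sketch the collapse action touches only fields that were already present in the receiving record, matching the description of --act; a field that is new to the record is copied unchanged.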