Interface
Percolator can handle two types of input files: the tab-delimited PIN.tsv format (recommended) and PIN.xml format. Input files can be generated from search engine outputs using our converters.
To run Percolator on a tab-delimited input, use the following options:
$ percolator input.tsv -X output.xml
where input.tsv is a valid tab-delimited file.
To run Percolator on an XML file in PIN format, use the -k flag:
$ percolator -k pin.xml -X output.xml
Percolator accepts input in a simple tab-delimited format where each row contains features associated with a single PSM:
id <tab> label <tab> scannr <tab> feature1 <tab> ... <tab> featureN <tab> peptide <tab> proteinId1 <tab> .. <tab> proteinIdM
label is a flag set to 1 for target PSMs and -1 for decoys, and scannr is an integer value.
These rows should be preceded by a header line that names the columns. The scan number column must use the exact string ScanNr, and it is followed by the names of the individual features, separated by tabs.
Optionally, the spectrum filename can be specified in a column directly after the ScanNr column, with the header filename or spectrafile; it will be propagated to the result file(s).
An optional second line specifying the default scoring vector should contain the String DefaultDirection in its first column, e.g.
PSMId <tab> Label <tab> ScanNr <tab> feature1name <tab> ... <tab> featureNname <tab> Peptide <tab> Proteins
DefaultDirection <tab> - <tab> - <tab> feature1weight <tab> ... <tab> featureNweight [optional]
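The layout above can be sketched in Python as follows. This is a minimal illustration, not part of Percolator: the helper write_pin, the feature names, and every PSM value (identifiers, peptides, protein names) are made up for the example, and the header names follow the example lines above.

```python
import csv
import io

def write_pin(handle, feature_names, psms, default_direction=None):
    """Write PSMs in the tab-delimited layout described above (hypothetical helper)."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    # Header line: fixed columns, then one column per feature, then peptide and proteins.
    writer.writerow(["PSMId", "Label", "ScanNr", *feature_names, "Peptide", "Proteins"])
    if default_direction is not None:
        # Optional second line: one weight per feature.
        writer.writerow(["DefaultDirection", "-", "-", *default_direction])
    for psm in psms:
        if psm["label"] not in (1, -1):
            raise ValueError("label must be 1 (target) or -1 (decoy)")
        # A PSM may map to several proteins; each one gets its own trailing column.
        writer.writerow([psm["id"], psm["label"], int(psm["scannr"]),
                         *psm["features"], psm["peptide"], *psm["proteins"]])

# Illustrative data only.
example_psms = [
    {"id": "run1_101_2_1", "label": 1, "scannr": 101, "features": [3.2, 0.5],
     "peptide": "K.PEPTIDER.A", "proteins": ["protA", "protB"]},
    {"id": "run1_102_2_1", "label": -1, "scannr": 102, "features": [1.1, 0.9],
     "peptide": "K.REDITPEPK.A", "proteins": ["decoy_protA"]},
]

buffer = io.StringIO()
write_pin(buffer, ["feature1", "feature2"], example_psms, default_direction=[1.0, 0.0])
pin_text = buffer.getvalue()
print(pin_text)
```

Writing to a file instead of a StringIO buffer only requires passing an open text file handle as the first argument.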
If pin.xml is a valid XML file, it is possible to use Percolator as a converter and generate tab-delimited files from XML files by using the following options:
$ percolator -k pin.xml -J pin.tsv
After successful termination, pin.tsv will hold the tab-delimited data, which can be fed to Percolator as described above; the file is created if it does not already exist and overwritten otherwise.
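A tab-delimited file such as pin.tsv can be sanity-checked before it is passed to Percolator. The sketch below is a hypothetical reader, not a Percolator API. It assumes the header uses the names Label and ScanNr as in the example header shown earlier, and that the last two header columns are the peptide and the protein list. The sample text is invented.

```python
import io

def read_pin(handle):
    """Parse tab-delimited PIN text into a list of PSM dictionaries (hypothetical helper)."""
    header = handle.readline().rstrip("\r\n").split("\t")
    if "ScanNr" not in header:
        raise ValueError("header line must contain the exact string ScanNr")
    label_col = header.index("Label")  # assumes the header naming from the example
    scan_col = header.index("ScanNr")
    peptide_col = len(header) - 2      # second-to-last header column
    psms = []
    for line in handle:
        fields = line.rstrip("\r\n").split("\t")
        if fields[0] in ("", "DefaultDirection"):
            continue  # skip blank lines and the optional default scoring vector
        psms.append({
            "id": fields[0],
            "label": int(fields[label_col]),
            "scannr": int(fields[scan_col]),
            "peptide": fields[peptide_col],
            # Rows may be longer than the header: every trailing field is a protein ID.
            "proteins": fields[peptide_col + 1:],
        })
    return psms

# Invented sample input with one target and one decoy PSM.
sample = (
    "PSMId\tLabel\tScanNr\tfeature1\tPeptide\tProteins\n"
    "DefaultDirection\t-\t-\t1.0\n"
    "run1_101_2_1\t1\t101\t3.2\tK.PEPTIDER.A\tprotA\tprotB\n"
    "run1_102_2_1\t-1\t102\t1.1\tK.REDITPEPK.A\tdecoy_protA\n"
)
psms = read_pin(io.StringIO(sample))
targets = sum(1 for p in psms if p["label"] == 1)
decoys = sum(1 for p in psms if p["label"] == -1)
print(f"{targets} target PSM(s), {decoys} decoy PSM(s)")
```

Counting targets and decoys this way is a quick check that the label column was filled in as intended.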
The percolator-converters package contains a set of converters that turn the output of SEQUEST/Crux (sqt2pin), X!Tandem (tandem2pin) and MS-GF+ (msgf2pin) into the tab-delimited file format.
Usage:
sqt2pin [options] -o output.tsv target.sqt decoy.sqt
Here, output.tsv is the file the Percolator input will be written to (make sure you have read and write access to it), target.sqt is the target sqt-file, and decoy.sqt is the decoy sqt-file. Small data sets may be merged by replacing the sqt-files with meta files. Meta files are text files containing the paths of sqt-files, one path per line. For a successful result, the different runs should be generated under similar conditions.
The same applies to msgf2pin and tandem2pin, with mzid-files or X!Tandem-files instead of sqt-files, respectively.
It is also still possible to write XML files in PIN format by passing the -k flag instead of the -o flag used for the tab-delimited file format.
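As an illustration of the meta file mechanism, the following sketch writes two meta files and assembles the corresponding sqt2pin argument list. The helper names, the .meta extension and all sqt-file names are assumptions made for the example; only the argument order comes from the usage line above. The command is built but not executed here.

```python
import tempfile
from pathlib import Path

def write_meta_file(meta_path, sqt_paths):
    """Write one sqt-file path per line (hypothetical helper)."""
    meta_path = Path(meta_path)
    meta_path.write_text("".join(f"{path}\n" for path in sqt_paths))
    return meta_path

def sqt2pin_command(output_tsv, target, decoy):
    """Build the argument list following: sqt2pin [options] -o output.tsv target.sqt decoy.sqt"""
    return ["sqt2pin", "-o", str(output_tsv), str(target), str(decoy)]

with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    # Invented file names standing in for runs acquired under similar conditions.
    target_meta = write_meta_file(tmp / "targets.meta",
                                  ["run1.target.sqt", "run2.target.sqt"])
    decoy_meta = write_meta_file(tmp / "decoys.meta",
                                 ["run1.decoy.sqt", "run2.decoy.sqt"])
    meta_lines = target_meta.read_text().splitlines()
    command = sqt2pin_command(tmp / "output.tsv", target_meta, decoy_meta)
    print(" ".join(command))
```

With the converters installed, the resulting list could be passed to subprocess.run to perform the conversion.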
The converters create an identifier for each PSM of the form <file_identifier>_<scan_number>_<charge>_<rank>, e.g. my_interesting_raw_file_24326_2_1.
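Because the file identifier may itself contain underscores, such an identifier is easiest to take apart from the right. The sketch below assumes the last three underscore-separated fields are always the scan number, charge and rank, as in the pattern above. The PsmId type and parse_psm_id function are hypothetical names, not part of Percolator.

```python
from typing import NamedTuple

class PsmId(NamedTuple):
    file_identifier: str
    scan_number: int
    charge: int
    rank: int

def parse_psm_id(psm_id: str) -> PsmId:
    """Split <file_identifier>_<scan_number>_<charge>_<rank> from the right."""
    # rsplit with maxsplit=3 keeps underscores inside the file identifier intact.
    file_identifier, scan, charge, rank = psm_id.rsplit("_", 3)
    return PsmId(file_identifier, int(scan), int(charge), int(rank))

parsed = parse_psm_id("my_interesting_raw_file_24326_2_1")
print(parsed)
```

An identifier with fewer than four fields, or with non-numeric trailing fields, raises a ValueError rather than being silently misparsed.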
Since version 1.15, Percolator has had its own XML input format, whose structure is defined by the schema percolator_in.xml.
Similarly, Percolator’s output (called POUT for Percolator-OUT) is defined by the schema percolator_out.xml.
If pin.xml is a valid Percolator XML file, Percolator can be run using the following options:
$ percolator [options] -k pin.xml -X output.xml
After a successful termination, output.xml will contain Percolator’s output formatted in POUT format; the file will be overwritten, or created if it does not already exist.