A mapping from internal scan numbers (which start at one) to the scan numbering in the
original files. mzXML files are required to provide a scan number, but those scan numbers are not
always contiguous, e.g.:
-
In mzXML data converted from AB Sciex raw files, the first scan number might be 2037, the
second 4056, and so on; the numbers are increasing, but not in steps of one.
-
In mzML data converted from Agilent raw files using ProteoWizard (msconvert.exe), each scan has a
separate "index" attribute, which starts at zero (index="0"), and an "id", which is a unique
textual representation of the scan's identity of the form id="cycleNumber=1, experimentNumber=1,
scanNumber=1". This scanNumber=1 is not always a unique number; it is an internal
vendor-specific string.
-
There are many other examples.
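As an illustration of why such "id" strings are awkward to use as scan numbers, the following is a
minimal Python sketch that splits an id of the form shown above into its key/value pairs. The
function name parse_native_id is hypothetical and is not part of MSFileToolbox; the values are
kept as strings because, as noted above, they are vendor-specific and not guaranteed to be unique
numbers.

```python
import re
from typing import Dict


def parse_native_id(native_id: str) -> Dict[str, str]:
    """Split an id such as 'cycleNumber=1, experimentNumber=1, scanNumber=1'
    into a dictionary of key/value pairs.

    Hypothetical helper for illustration only. Pairs may be separated by
    commas and/or whitespace; values are returned as strings.
    """
    # Each pair is a word, an '=' sign, then everything up to a comma or space.
    pairs = re.findall(r"(\w+)=([^,\s]+)", native_id)
    return dict(pairs)


fields = parse_native_id("cycleNumber=1, experimentNumber=1, scanNumber=1")
print(fields["scanNumber"])
```

Even after parsing, the scanNumber field alone cannot safely identify a scan, which is one reason
a separate internal numbering is useful.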
To avoid confusion about scan numbering, MSFileToolbox uses a separate numbering
and indexing scheme. Internally, all scan numbers start at one and increment by one in
retention time order. The numbering from the original file is maintained, whether it comes from a
raw vendor file (e.g. Thermo uses normal 1-based numbering; Agilent stores an ordinal number
internally in its files, but also has the 'experiment number' and other vendor-specific
identifiers) or from an XML-based file. The ID is also retained.
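
The scheme described above can be sketched in Python as follows. This is only an illustration
under stated assumptions, not the MSFileToolbox implementation: the names OriginalScan and
build_scan_number_map are hypothetical, and each scan is assumed to carry a retention time, an
optional original scan number, and an optional original id.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class OriginalScan:
    """What the original file says about one scan (hypothetical type)."""
    retention_time: float
    original_number: Optional[int] = None  # e.g. 2037 in an mzXML file
    native_id: Optional[str] = None        # e.g. the mzML "id" attribute


def build_scan_number_map(scans: Iterable[OriginalScan]) -> Dict[int, OriginalScan]:
    """Assign internal scan numbers 1, 2, 3, ... in retention time order
    and map each one to the numbering retained from the original file."""
    ordered = sorted(scans, key=lambda scan: scan.retention_time)
    return {internal: scan for internal, scan in enumerate(ordered, start=1)}


# Original numbers are increasing, but not in steps of one.
scans = [
    OriginalScan(retention_time=1.5, original_number=4056),
    OriginalScan(retention_time=0.7, original_number=2037),
]
scan_map = build_scan_number_map(scans)
print(scan_map[1].original_number)
```

Looking up an internal scan number in the resulting dictionary returns the original numbering and
id, so both schemes remain available side by side.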