Website of Frank Rügheimer

mapvalues - a versatile identifier mapping tool

Download

Description

Mapvalues is a bash command line script for general identifier mapping tasks on table data. Initially a simple interface to sequences of cut, paste, sed, sort and join commands, it has over time acquired a number of features for supporting operations on relation graphs and, of course, for mapping back and forth between gene identifiers. Mapvalues operates as a configurable filter for tables of tab-separated values. It reads the identifiers in a specified table column and computes a new column containing the respective images of those identifiers under a mapping. The resulting column then either replaces the original or is inserted as a new column at a desired position. Behavior with respect to unmapped values is controlled via command line options and can be set to identity mapping, missing value indicators, or row removal. Its simple command line interface makes mapvalues convenient for use in scripts and makefiles.
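The core filter behavior described above (replace one column by its image under a mapping, keeping unmapped values unchanged) can be approximated with standard tools. The following is only a minimal sketch under stated assumptions, not the actual mapvalues implementation: the function name map_column is hypothetical, the mapping is assumed to be in 2-column form, and each identifier is assumed to have at most one image.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (not mapvalues itself): replace column COL of a
# tab-separated table with its image under a 2-column mapping.
# Unmapped values are left as they are (identity mapping).
# usage: map_column COL tabfile mapfile
map_column() {
    local col="$1" tabfile="$2" mapfile="$3"
    awk -F'\t' -v OFS='\t' -v col="$col" '
        FILENAME == ARGV[1] { map[$1] = $2; next }    # first file: load the mapping
        { if ($col in map) $col = map[$col]; print }  # second file: rewrite the table
    ' "$mapfile" "$tabfile"
}
```

Called as map_column 1 table.tsv map.tsv (both file names are placeholders), the sketch writes the rewritten table to stdout, so it can be chained with other filters in the same way as mapvalues.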

Command interface

General usage and arguments are explained in the help text that comes with the program. It can be directly accessed from the command line by including the "-?" option:

USAGE: mapvalues  [OPTIONS]  tabfile mapfile
DESCRIPTION: apply a mapping to all values in the columns of a table
If multiple mappings are specified for an element, the respective
lines of the input table will be replicated and output is produced for
each of the possible mappings (relational join operation). One of the
input file names may optionally be replaced by '-' to indicate that
input is to be read from stdin instead.
Output is written to stdout.
ARGUMENTS:
        tabfile: table file - attributes correspond to tab-separated columns
        mapfile: mapping in either of the following formats:
  a) 3-column notation with lines of the form
            name        ->      newname    (columns separated by <TAB>)
  b) 2-column (simplified) notation with lines of the form
            name        newname                  (columns separated by <TAB>)
OPTIONS:
-?      Show this help page
-a      Append new column to table instead of replacing original values
-h      Show this help page
-i#     Insert results as #th column in result table -- all subsequent
        columns are shifted by one position
-k      suppress (kill) unmapped table file records in output
-m      indicate unmapped values as '?' (default: use value in preimage)
-p      Prepend result column to table
-f#     Select field number (counting from one), whose values are to
        be mapped, negative numbers can be used to select columns from
        the end of the table; for instance "-f -1" would select the last
        column of the table, "-f -2" the second to last etc. If the parameter is
        0 or its absolute value exceeds the total number of columns an error
        message is generated.                (default=1)
-r      Replace values in original column with their images under the mapping
        (default behavior)
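
The semantics documented in the help text (row replication for multiple mappings, the three ways of treating unmapped values, and negative field numbers) can be illustrated with a small awk emulation. This is a hedged sketch, not the mapvalues source: the function name map_relational and its policy keywords keep, mark and kill are hypothetical stand-ins for the default behavior and the documented -m and -k options, and only in-place replacement is shown.

```shell
#!/usr/bin/env bash
# Hypothetical emulation of the documented semantics (not mapvalues itself).
# usage: map_relational POLICY FIELD tabfile mapfile
#   POLICY: keep (identity mapping), mark (use '?'), or kill (drop the row)
#   FIELD:  1-based column number; negative values count from the end
map_relational() {
    local policy="$1" field="$2" tabfile="$3" mapfile="$4"
    awk -F'\t' -v OFS='\t' -v policy="$policy" -v field="$field" '
        FILENAME == ARGV[1] {              # first file: collect every image of a name
            img = (NF >= 3) ? $3 : $2      # 3-column (name -> newname) or 2-column line
            n[$1]++; map[$1, n[$1]] = img
            next
        }
        {
            c = (field < 0) ? NF + field + 1 : field
            if (c < 1 || c > NF) {         # 0 or out of range is an error
                print "field out of range" > "/dev/stderr"; exit 1
            }
            key = $c
            if (key in n) {                # one output row per image (relational join)
                for (i = 1; i <= n[key]; i++) { $c = map[key, i]; print }
            } else if (policy == "mark") { $c = "?"; print }
            else if (policy == "keep")   { print }
            # policy "kill": the unmapped row is dropped
        }
    ' "$mapfile" "$tabfile"
}
```

In this sketch a table row whose identifier has two images in the mapping yields two output rows, one per image, which mirrors the relational join behavior described in the help text.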