Website of Frank Rügheimer

scoreKO regulatory network search tool

Download

name		description
scoreKO		Python program only
scoreKO.zip		archive (contains program, examples and converter scripts)
scoreKO.tar.gz		archive (contains program, examples and converter scripts)

Linked Publications

Rue_jobim_2011.pdf short introduction to the tool and its application context (presented at JOBIM 2011)

Description

The scoreKO program extracts plausible regulatory pathways from a network of weighted potential interactions. It does so by conducting a graph search on the given network, scoring each path by aggregating the plausibility of its edges. The tool can be configured to report the families of pathways for up to the n most plausible score ranks. The currently provided aggregation operators are associative, commutative and non-increasing (extending a path never raises its score). These properties are exploited to speed up the search: a pathway is not explored further once it can no longer meet the selection criteria.
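The pruning idea can be illustrated with a minimal Python sketch. It is not taken from the scoreKO source: the names `hamacher`, `OPERATORS` and `best_score` are hypothetical, the Hamacher product is used in its standard form a*b / (a + b - a*b), and the search is an ordinary best-first search that only relies on the operators being non-increasing.

```python
import heapq
import math

def hamacher(a, b):
    """Standard Hamacher product a*b / (a + b - a*b); 0 when both inputs are 0."""
    denom = a + b - a * b
    return 0.0 if denom == 0 else a * b / denom

# Hypothetical operator table: (combine function, score of the empty path).
# Combining with a further edge never increases the score.
OPERATORS = {
    "min": (min, 1.0),
    "prod": (lambda a, b: a * b, 1.0),
    "hprod": (hamacher, 1.0),
    "lsum": (lambda a, b: a + b, 0.0),  # for logarithmic weights from (-inf, 0]
}

def best_score(edges, sources, targets, op="hprod"):
    """Return the best aggregated score of any source-to-target path, or None."""
    combine, neutral = OPERATORS[op]
    graph = {}
    for node_a, node_b, weight in edges:
        graph.setdefault(node_a, []).append((node_b, weight))
    best = {s: neutral for s in sources}
    heap = [(-neutral, s) for s in sources]  # max-heap via negated scores
    heapq.heapify(heap)
    while heap:
        neg_score, node = heapq.heappop(heap)
        score = -neg_score
        if score < best.get(node, -math.inf):
            continue  # outdated entry, a better path to this node is known
        if node in targets:
            return score  # scores only decrease, so this is the optimum
        for nxt, weight in graph.get(node, []):
            candidate = combine(score, weight)
            # Pruning step: a partial path that does not improve on the best
            # known score for nxt is dropped, since extending it cannot help.
            if candidate > best.get(nxt, -math.inf):
                best[nxt] = candidate
                heapq.heappush(heap, (-candidate, nxt))
    return None
```

For example, `best_score([("C", "A", 0.83), ("A", "H", 0.8), ("H", "I", 0.97)], {"C"}, {"I"}, "min")` evaluates the single chain C->A->H->I under the minimum operator. Reporting whole families of pathways for several score ranks, as the tool does, would need additional bookkeeping that this sketch omits.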

The program is intended to be used in conjunction with interaction measures that generate networks of potential interactions from large-scale empirical data (optionally modified by prior knowledge from literature or protein interaction data). It aggregates predictions about local network structure into testable regulatory hypotheses linking perturbations to observable effects. It can also be used iteratively: its output guides the selection of experiments that develop a set of regulatory hypotheses into an increasingly refined and validated regulatory structure.

Command-line interface

General usage and arguments are explained in the help text that comes with the program. It can be directly accessed from the command line by including the "-?" option:

USAGE:     scoreKO [OPTIONS] sourcelist targetlist edgelist
CONTENTS:  score regulatory network node by path hypotheses.
OPTIONS:   '-?' or '-h' show this help screen
           '-x' enable output mode with additional columns
           '-p' print top scoring families of pathway hypotheses
                rather than node scores. If this option is spec-
                ified the program acts as a filter to the input
                graph that extracts the top scoring putative
                signaling pathways connecting the given
                sub-networks. This option may not be used in
                conjunction with -e or -x.      (default=disabled)
           '-e' print list of non-zero edge scores rather than
                node scores. This option  may not be combined
                with -p or -x.                  (default=disabled)
           '-l #' set maximum plausibility rank to be considered
                in pathway hypothesis family mode; used in con-
                junction with -p option. Output will be based on
                the top # score ranks                  (default=1)
           '-a <op>' set aggregation operator to be used for
                path scoring. <op> is a selector for an aggre-
                gation method. Currently supported values are:
                min   - minimum (smallest edge weight on path)
                prod  - product (multiply edge weights on path)
                hprod - Hamacher product (used when aggregating
                        several edges with compar. low scores)
                lsum  - sum (used with logarithmic weights from
                        (-inf,0]; equivalent to prod on untrans-
                        formed data               (default: hprod)
ARGUMENTS: sourcelist: File with node identifiers serving as
                points of origin (optional source of reg. signal
                Several nodes may be specified, but each iden-,
                tifier must be given on a separate line.
           targetlist: File with node identifiers serving as
                regulation targets.
                Several nodes may be specified, but each iden-,
                tifier must be given on a separate line.
           edgelist:  File containing edge specifications of the
                form  "nodeA<TAB>nodeB<TAB>value",
                were value is a number from the real interval
                [0,1] which designates the weight of the direc-
                ted edge nodeA->nodeB.
                Several edges may be specified, but each speci-,
                fication must be given on a separate line.
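The input formats described in the help text are simple enough to produce or check with a few lines of Python. The sketch below is an illustration only and not part of the package; the function names `read_node_list` and `read_edge_list` are hypothetical, and skipping blank lines is an assumption rather than documented behaviour. The range check follows the [0,1] interval stated in the help text and would have to be relaxed for logarithmic weights used with lsum.

```python
def read_node_list(lines):
    """Return node identifiers, one per non-empty line."""
    return [line.strip() for line in lines if line.strip()]

def read_edge_list(lines):
    """Parse 'nodeA<TAB>nodeB<TAB>value' lines into (nodeA, nodeB, weight) tuples."""
    edges = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            continue  # assumption: blank lines are ignored
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(f"line {number}: expected 3 tab-separated fields")
        node_a, node_b, raw_value = fields
        weight = float(raw_value)
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"line {number}: weight {weight} outside [0,1]")
        edges.append((node_a, node_b, weight))
    return edges
```

Both functions accept any iterable of lines, so an open file object can be passed directly, e.g. `read_edge_list(open("exam/edges.tab"))`.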

Sample session

Unpacking the archive file creates a directory exam containing sample input files. srcnodes.lst contains a list of nodes considered as the source of a perturbation. dstnodes.lst lists target nodes for which the effect of the perturbation can be observed. Finally, the file edges.tab contains specifications of the directed edges in the regulatory network and their respective plausibility scores.

To assign to each node a score that reflects the plausibility of the optimal pathway through this node, using the minimum as path aggregator, type:

> scoreKO -a min -x exam/srcnodes.lst exam/dstnodes.lst exam/edges.tab


ID      path_score      con2src con2trg
A       0.8     1.0     0.8
B       0.67    1.0     0.67
C       0.8     1.0     0.8
I       0.8     0.8     1.0
E       0.67    0.8     0.67
F       0.12    0.75    0.12
H       0.8     0.8     0.97
D       0.48    0.8     0.48
G       0.79    0.79    0.8
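In the rows printed above, path_score coincides with the smaller of the con2src and con2trg columns, which is what one would expect if the score of the best pathway through a node is obtained by combining its source-side and target-side parts with the minimum operator. The short Python check below confirms this for the sample rows only; it is an observation about this output, not a statement about the program's internals, and the helper name `parse_rows` is hypothetical.

```python
SAMPLE = """\
A       0.8     1.0     0.8
B       0.67    1.0     0.67
C       0.8     1.0     0.8
I       0.8     0.8     1.0
E       0.67    0.8     0.67
F       0.12    0.75    0.12
H       0.8     0.8     0.97
D       0.48    0.8     0.48
G       0.79    0.79    0.8
"""

def parse_rows(text):
    """Split whitespace-separated rows into (ID, path_score, con2src, con2trg)."""
    rows = []
    for line in text.splitlines():
        node, score, con2src, con2trg = line.split()
        rows.append((node, float(score), float(con2src), float(con2trg)))
    return rows

# Collect nodes whose path_score differs from min(con2src, con2trg).
mismatches = [node for node, score, src, trg in parse_rows(SAMPLE)
              if score != min(src, trg)]
print(mismatches)  # prints [] for the sample rows
```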

The -x option enables two additional columns in the output, reflecting the connectivity to the nearest nodes from the source and the target node sets. To view output in pathway mode instead, using the default Hamacher product as pathway aggregator, type:

> scoreKO -pl3 exam/srcnodes.lst exam/dstnodes.lst exam/edges.tab


from    to      weight
A       H       0.8
C       A       0.83
C       G       0.79
H       I       0.97
G       A       0.9

Note that the -p and -l options were combined to obtain the pathway representations for the top-3 plausibility levels.

The Hamacher product is set as the default operator because it can distinguish paths that would be assigned the same score under the minimum aggregation operator. In the example this becomes quite evident when re-running the analysis with the minimum as pathway aggregation operator and comparing the results.

> scoreKO -pl3 -a min exam/srcnodes.lst exam/dstnodes.lst exam/edges.tab


from    to      weight
A       E       0.68
A       H       0.8
B       E       0.8
C       A       0.83
C       B       0.81
C       G       0.79
E       A       0.67
H       I       0.97
G       A       0.9
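The difference between the two operators can be reproduced by hand on edge weights taken from the listings above. The Python sketch below aggregates the chains A->H->I and C->A->H->I with both operators; it assumes the standard form of the Hamacher product, a*b / (a + b - a*b), and the names `hamacher`, `chain_short` and `chain_long` are chosen for illustration only.

```python
from functools import reduce

def hamacher(a, b):
    """Standard Hamacher product a*b / (a + b - a*b); 0 when both inputs are 0."""
    denom = a + b - a * b
    return 0.0 if denom == 0 else a * b / denom

chain_short = [0.8, 0.97]        # A->H, H->I
chain_long = [0.83, 0.8, 0.97]   # C->A, A->H, H->I

for label, weights in (("A->H->I", chain_short), ("C->A->H->I", chain_long)):
    min_score = min(weights)                  # weakest edge on the chain
    hprod_score = reduce(hamacher, weights)   # every edge lowers the score
    print(label, min_score, round(hprod_score, 3))
```

Both chains score 0.8 under the minimum, because they share the same weakest edge, while the Hamacher product gives roughly 0.78 for the shorter chain and 0.67 for the longer one, so the two are placed on different plausibility ranks.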

Output converters

To convert the output of the pathway mode to popular graph language formats, shell scripts are provided with the program package. The scripts operate as filters. This allows output to be piped directly from scoreKO to a filter script, filters to be cascaded, and the processed output to be passed directly to other tools. net2dot converts edges from the resulting table format to .dot format (used, e.g., with the graphviz tool suite). Once the output is in .dot format, the scripts dot2sif and dot2reg can be applied to create output readable, e.g., by Cytoscape. Since all output formats are text-based, simple stream processing can be used to adapt the scripts for output in other formats if desired.
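As an illustration of how little stream processing such a conversion needs, the following Python sketch turns the pathway table into a DOT digraph. It is not the net2dot script shipped with the package and makes no claim about that script's exact output; the function name `table_to_dot`, the graph name and the use of the weight as an edge label are all assumptions.

```python
def table_to_dot(lines, name="pathways"):
    """Convert 'from to weight' rows into a DOT digraph with labelled edges."""
    out = [f"digraph {name} {{"]
    for line in lines:
        fields = line.split()
        # Skip blank lines, malformed rows and the column header.
        if len(fields) != 3 or fields == ["from", "to", "weight"]:
            continue
        node_a, node_b, weight = fields
        out.append(f'    "{node_a}" -> "{node_b}" [label="{weight}"];')
    out.append("}")
    return "\n".join(out)

# Rows as they appear in the pathway-mode output of the sample session.
table = ["from    to      weight", "A       H       0.8", "H       I       0.97"]
print(table_to_dot(table))
```

To use such a function as a filter in a pipeline, the lines would be read from standard input instead of a list.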