Identify biological processes that are significantly enriched in network modules.

modEnrich(ig, mods, levels = c("x", "y"), go2eg, glb = 15, minGO = 5,
  thresh = 0.05, adjust = "BH", prefix = "M", process_alias = NULL)

Arguments

ig

An igraph network object output from adj2igraph, rankHub, or cisTrans.

mods

Modules, supplied in one of three accepted formats:

  • An igraph object of class 'communities', which is typically the output of the cluster_* functions of igraph.

  • A list where each element is a character vector containing only node identifiers taken from the yinfo$id or xinfo$id input to the function adj2igraph.

  • An integer vector whose names are node identifiers and whose values are integer module labels. For example, c(id1 = 1, id2 = 2, id3 = 2, id4 = 1).
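The list and integer-vector formats carry the same information, so one can be derived from the other. The following is a minimal base R sketch using the toy identifiers from the example above; the object names mods_vec and mods_list are illustrative and not part of the package.

```r
# Integer-vector format: names are node identifiers, values are module labels.
mods_vec <- c(id1 = 1L, id2 = 2L, id3 = 2L, id4 = 1L)

# Equivalent list format: one character vector of node identifiers per module.
mods_list <- split(names(mods_vec), mods_vec)

# Inspect the result: a list with one element per module label.
str(mods_list)
```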

levels

Any given module can contain x nodes, y nodes, or both. If both predictors and responses have a functional mapping in the go2eg argument, then specify levels = c("x","y"). Otherwise, specify only the level whose nodes have a functional mapping. See Details for more discussion.

go2eg

Named list where each name denotes a biological process (e.g. a Gene Ontology ID) and each element is a vector of the members belonging to that biological process. The names of the list should be unique. For example, list(bio_proc_1 = c("gene1", "gene2", "gene3"), bio_proc_2 = c("gene4", "gene5", "gene6")).
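Functional annotations often arrive as a two-column table rather than a named list. The sketch below assumes a hypothetical data frame, here called map, with one row per (process, gene) pair, and shows one way to build a list of the required shape in base R.

```r
# Hypothetical long-format annotation table: one row per (process, gene) pair.
map <- data.frame(
  process = rep(c("bio_proc_1", "bio_proc_2"), each = 3),
  gene    = paste0("gene", 1:6),
  stringsAsFactors = FALSE
)

# Group member identifiers by process to obtain a named list.
go2eg <- split(map$gene, map$process)

# The names must be unique (non-redundant) process identifiers.
stopifnot(anyDuplicated(names(go2eg)) == 0)
```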

glb

Integer defining the minimum size a module must have in order to be tested for enrichment.

minGO

Integer defining the minimum number of nodes that must be represented in a biological process for it to be called a significant enrichment of that biological process.
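The two size filters can be pictured with a small base R sketch. This is an illustration only, not the package's internal code: it assumes that glb is compared against the module size and that minGO is compared against the number of module nodes found in the biological process. The module and gene names are made up.

```r
# Toy modules in list format and one toy biological process.
mods_list <- list(M1 = c("g1", "g2", "g3", "g4"), M2 = c("g5", "g6"))
process   <- c("g1", "g2", "g3", "g9")

glb   <- 3  # smallest module size eligible for testing (toy value)
minGO <- 2  # smallest overlap eligible to be called enriched (toy value)

# Keep only modules that are large enough to be tested.
testable <- mods_list[lengths(mods_list) >= glb]

# Count how many nodes of each remaining module fall in the process.
overlap <- vapply(testable, function(m) length(intersect(m, process)), integer(1))

# Assumption: only overlaps of at least minGO nodes can be reported.
eligible <- overlap >= minGO
```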

thresh

Numeric between 0 and 1 indicating the threshold at which adjusted P-values should be considered significant.

adjust

Character string, one of stats::p.adjust.methods, specifying the type of multiple comparison adjustment desired.
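To see how adjust and thresh interact, the following sketch applies stats::p.adjust to a toy vector of raw P-values and compares the result against a threshold. The P-values are invented for illustration, and the sketch does not claim to reproduce how modEnrich groups its tests before adjustment.

```r
# Toy raw P-values from four hypothetical (module, process) tests.
raw_p <- c(0.01, 0.02, 0.03, 0.5)

# The accepted values for adjust are listed in stats::p.adjust.methods.
adjust <- "BH"
stopifnot(adjust %in% stats::p.adjust.methods)

# Benjamini-Hochberg adjustment, then comparison against thresh.
adj_p  <- stats::p.adjust(raw_p, method = adjust)
thresh <- 0.05
significant <- adj_p <= thresh
```

With these toy values the three smallest adjusted P-values all equal 0.04 and pass the 0.05 threshold, while the fourth stays at 0.5.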

prefix

Character to prefix module identifiers.

process_alias

Vector mapping the biological process identifiers in go2eg to biologically meaningful descriptions. The names of process_alias must match the names of go2eg, and its elements are the corresponding descriptions.
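A minimal sketch of a matching process_alias vector, reusing the toy go2eg list from the argument description above. The descriptions are placeholders, not real annotation terms.

```r
go2eg <- list(
  bio_proc_1 = c("gene1", "gene2", "gene3"),
  bio_proc_2 = c("gene4", "gene5", "gene6")
)

# Named character vector: names are process identifiers, values are descriptions.
process_alias <- c(
  bio_proc_1 = "placeholder description of process 1",
  bio_proc_2 = "placeholder description of process 2"
)

# Check that every process in go2eg has exactly one alias.
stopifnot(setequal(names(process_alias), names(go2eg)),
          anyDuplicated(names(process_alias)) == 0)

# Look up descriptions in the order of go2eg.
aliases_in_order <- unname(process_alias[names(go2eg)])
```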

Value

A three-item list:

  • The element "ig" is the input igraph network object updated with a "process_id" attribute for nodes affiliated with a significant GO-term. The "process_id" and "module" attributes together can be especially useful for visualizing which nodes of a module are enriched for a specific biological function.

  • The element "etab" is the polished module enrichment table, organized to report the significant GO terms in each module, the representation of each GO term in the module relative to the size of the GO term, and which x-hubs may belong to the module.

  • The element "eraw" contains details for each (module, GO-term) pair that was subjected to the hypergeometric test. This output gives the user more control (if desired) over enrichment by reporting all tests, the relative over-representation of a GO term in that module, the raw P-value, and the adjusted P-value.

Details

The hypergeometric test is used to test for over-representation of a biological process. In the phyper R function, parameter q is the overlap between the biological process group and the module, where the module is reduced to only its y node members if levels = "y". Parameter m is the size of the biological process. Parameter n is the number of nodes in the network that are not in the biological process, excluding node levels that do not have a functional mapping. In other words, if no x nodes appear in the mapping of go2eg, corresponding to levels = "y", then x nodes are not counted. However, y nodes without a mapping are still counted, because most y nodes do have a mapping.
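The parameterization above can be illustrated with a small base R sketch on a toy network, mirroring the case levels = "y". Several choices here are assumptions rather than a description of the package internals: the number of draws k is taken to be the size of the module after reduction to y nodes, and the upper-tail probability P(X >= overlap) is obtained with the common convention of passing overlap - 1 together with lower.tail = FALSE. All node names are made up.

```r
# Toy network: 20 y nodes and 3 x nodes; only y nodes have a functional mapping.
y_nodes <- paste0("y", 1:20)
x_nodes <- paste0("x", 1:3)

# Toy biological process with 5 members, all of them y nodes.
process <- paste0("y", 1:5)

# Toy module containing one x node and six y nodes.
module <- c("x1", "y1", "y2", "y3", "y4", "y10", "y11")

# Reduce the module to its y node members, since x nodes are not counted.
module_y <- intersect(module, y_nodes)

overlap <- length(intersect(module_y, process))  # nodes shared by module and process
m <- length(intersect(process, y_nodes))         # size of the biological process
n <- length(y_nodes) - m                         # y nodes not in the process
k <- length(module_y)                            # assumed: reduced module size

# Upper-tail probability of seeing at least `overlap` process members in the module.
p_value <- phyper(overlap - 1, m, n, k, lower.tail = FALSE)
```

In this toy case the overlap is 4 out of a process of size 5, with 15 y nodes outside the process and a reduced module of 6 nodes. The x node in the module changes neither the overlap nor the background size.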

See also

adj2igraph, rankHub, cisTrans, reportHubs, xHubEnrich