

The edgelist and non-edgelist are strings representing file locations and are read in using data.table's fread() command. These are where users can specify known edges/non-edges of their own. Each observation is assumed to be a directed edge (for edgelist) or a directed non-edge (for non-edgelist). Both files must have 4 columns in the following order. The columns do not need to be named as shown; they simply must be in this specific order:

1st column - Source gene (base_gene_src), e.g. "7534"
2nd column - Gene identifier of the source gene (base_id_src), e.g. "ENTREZID"
3rd column - Destination gene (base_gene_dest), e.g. "8607"
4th column - Gene identifier of the destination gene (base_id_dest), e.g. "UNIPROT"

Note that each edge/non-edge is assumed to be directed, so if you want an undirected edge/non-edge you should put in two observations, one for each direction, using the columns base_gene_src, base_id_src, base_gene_dest, base_id_dest.

After having the data set up, the first step in pathway enrichment analysis with netgsa is to estimate the adjacency matrices. Remember, the rownames of the data matrix X must be named as "GENE_ID:GENE_VALUE" as in "ENTREZID:7534".

The databases argument can be either (1) the result of obtainEdgeList or (2) a character vector defining the databases to search. These are essentially the same thing; the only difference is that for the character vector method, obtainEdgeList is called inside prepareAdjMat and the resulting edgelist cannot be saved. Since prepareAdjMat queries the graphite databases when it is called and graphite databases can change over time, this may not be desirable for reproducibility. Using the obtainEdgeList method, one can save the edgelist to ensure the same network information is used across iterations or in the future.

In both methods, one must specify the databases to search. The options are the databases for Homo sapiens available in graphite or NDEx (only for development version on GitHub). The graphite databases are:

# "kegg" "panther" "pathbank" "pharmgkb" "reactome"

Note the databases argument is case sensitive, so make sure to pass "reactome" and not "Reactome". The cluster argument controls whether or not clustering is used when estimating the adjacency matrix.
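
The input conventions described in this section can be sketched in base R as follows. This is only an illustration under stated assumptions: the gene pair (7534, 8607), the temporary CSV path, the toy matrix, and the variable names are made up for the example, and nothing here calls netgsa itself. It shows a four-column edgelist with an undirected edge entered as two directed rows, rownames in "GENE_ID:GENE_VALUE" form, and lower-casing database names before use.

```r
# Toy user edgelist (assumed values). An undirected edge between genes
# 7534 and 8607 is entered as two directed rows, one per direction.
edges <- data.frame(
  base_gene_src  = c("7534", "8607"),
  base_id_src    = c("ENTREZID", "ENTREZID"),
  base_gene_dest = c("8607", "7534"),
  base_id_dest   = c("ENTREZID", "ENTREZID"),
  stringsAsFactors = FALSE
)

# Write the edgelist to a file; the resulting path is the kind of string
# that would be supplied as the edgelist location. CSV is an assumption here.
edge_file <- tempfile(fileext = ".csv")  # hypothetical location
write.csv(edges, edge_file, row.names = FALSE, quote = FALSE)

# Read it back with base R to confirm the column order survived the round trip.
edges_back <- read.csv(edge_file, colClasses = "character")

# Toy data matrix: 2 genes by 6 samples, with rownames in the
# "GENE_ID:GENE_VALUE" form, e.g. "ENTREZID:7534".
gene_ids <- c("7534", "8607")
x <- matrix(rnorm(2 * 6), nrow = 2)
rownames(x) <- paste("ENTREZID", gene_ids, sep = ":")

# Database names are case sensitive, so normalise user input to lower case
# and check it against the graphite databases listed in the text.
graphite_dbs <- c("kegg", "panther", "pathbank", "pharmgkb", "reactome")
requested <- c("Reactome", "KEGG")
databases <- tolower(requested)
stopifnot(all(databases %in% graphite_dbs))
```

Writing the edgelist once and reusing the same file is in the same spirit as saving the result of obtainEdgeList: the network information stays fixed across runs.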
