`pullString` - R Interface to STRING database

Installation

To install the package from GitHub, run:

devtools::install_github("BIGslu/pullString")

Before you start coding:

Be considerate: when submitting requests in a for loop, it's a good practice to add Sys.sleep(1) after each call, so that server won't get overloaded.
Although STRING understands a variety of identifiers and does its best to disambiguate your input it is recommeded to map your identifier first (see: mapping). Querying the API with a disambiguated identifier (for example 9606.ENSP00000269305 for human TP53) will guarantee much faster server response.
Another way to guarantee faster server response is to specify from which species your proteins come from (see 'species' parameter). In fact API will reject queries for networks larger than 10 proteins without the specified organism.
Current STRING version is 12.0 (https://version-12-0.string-db.org)

Usage & examples

First, load the package and create a gene set we will use to query STRING database

library(pullString)
gene_set <- c("POSTN", "IRF1", "CXCL10", "IFNA1", "SERPINB2", "OAS1")

NOTE: For more details on each function and full parameter list, look up help pages, e.g. ?get_net_interactions

Check current version to check the current STRING database version

get_string_version()

Retrieve and save `.png` network and enrichment plot

You can retrieve and save STRING network .png file on your disk:

get_png_network(
  ids = gene_set,
  png_filename = "./string_net.png"
)

Help page ?get_png_network lists all available parameters that you can use to control the appearance of the network, node label font size, some of them includde:

get_png_network(
  ids = gene_set,
  custom_label_font_size = 16,          # increase label font size, default is 12
  flat_node_design = 1,                 # disable 3D bubble design
  block_structure_pics_in_bubbles = 1,  # removes 3d protein structure pictures inside the bubble
  png_filename = "./string_net.png"
)

If you want to save a higher resolution png image, you can run the function below with exact same arguments

get_highres_png_network()

Embed interactive network in html Rmarkdown document:

To include interactive STRING network in your Rmarkdown document (HTML), include these (keeping the order) in your document:

pullString::load_string_js()
pullString::string_request_js(ids = gene_set)
pullString::send_string_request_js()
pullString::embed_string_net()

NOTE: you can control apperance of the network, similarly as you do in get_png_network function, by adding argument to string_request_js function. Manual page lists them all ?string_request_js.

Mapping identifiers

You can map your various identifiers like gene ensembl id, HGNC symbol, protein symbol, etc. onto the STRING protein/gene names by calling:

get_string_identifiers(ids = gene_set)

This will return all the STRING genes/proteins that were mapped to your input and their descriptive annotation. As said earlier, using identifiers that are readily understood by STRING, can make querying faster. By default, this function uses species = 9606 (human). You can browse all available organisms and use their taxon id in the function call.

You can retrieve your STRING interaction network for one or multiple proteins. It will tell you the combined score and all the channel specific scores for the set of proteins. You can also extend the network neighborhood by setting add_nodes, which will add, to your network, new interaction partners in order of their confidence.

Adding required_score parameter, will put a threshold of significance to include an interaction. You can retrieve either functional or physical network types. The dafult is functional.

get_net_interactions(
  ids = gene_set,
  required_score = 800,
  network_type = "functional"
)

Enrichment analysis

Using pullString you can also perform functional enrichment analysis.

STRING maps several databases onto its proteins, this includes: Gene Ontology, KEGG pathways, UniProt Keywords, PubMed publications, Pfam domains, InterPro domains, and SMART domains.

This function returns enrichment analysis results table as well as statistics for each enriched term

get_enrichment(ids = gene_set)

Visualize enrichment results

To visualize your enrichment results and save .png file the path provided, run get_enrichment_plot function. Color scales can be controlled, see ?get_enrichment_plot for full list of parameters

get_enrichment_plot(
  ids = gene_set,
  png_filename = "./string_enrichment.png"
)

For other enrichment/annotation tasks, look into:

get_functional_annotation()
get_ppi_enrichment()

You can get interaction partners of your list of genes and all other STRING proteins by running:

get_interact_partners()

Homology

If you want to find homologies across species, you can look into get_best_homology and get_homology functions

get_best_homology(
  ids = "POSTN", 
  species = 9606, # human
  species_b = 10090 # mouse
)

STRING internally uses the Smith–Waterman bit scores as a proxy for protein homology. The original scores are computed by SIMILARITY MATRIX OF PROTEINS (SIMAP) project

Get URL to STRING website with genes you provided

You can also generate a URL link to STRING website:

get_link(
  ids = gene_set
)

Citing STRING DB:

Szklarczyk D, Kirsch R, Koutrouli M, Nastou K, Mehryary F, Hachilif R, Annika GL, Fang T, Doncheva NT, Pyysalo S, Bork P, Jensen LJ, von Mering C. The STRING database in 2023: protein–protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 2023 Jan 6;51(D1):D638-646.PubMed

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
.github		.github
R		R
man		man
tests		tests
.gitignore		.gitignore
DESCRIPTION		DESCRIPTION
NAMESPACE		NAMESPACE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

`pullString` - R Interface to STRING database

Installation

Usage & examples

Check current version to check the current STRING database version

Retrieve and save `.png` network and enrichment plot

Embed interactive network in html Rmarkdown document:

Mapping identifiers

Enrichment analysis

Visualize enrichment results

Homology

Get URL to STRING website with genes you provided

Citing STRING DB:

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

pullString - R Interface to STRING database

Installation

Usage & examples

Check current version to check the current STRING database version

Retrieve and save .png network and enrichment plot

Embed interactive network in html Rmarkdown document:

Mapping identifiers

Enrichment analysis

Visualize enrichment results

Homology

Get URL to STRING website with genes you provided

Citing STRING DB:

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`pullString` - R Interface to STRING database

Retrieve and save `.png` network and enrichment plot

Packages