@ Biocomputing Unit - INB - ELIXIR








- HOW TO ...

3DBIONOTES-WS can be accessed in two ways: by submitting a structure file or by querying the application with a PDB code, EMDB ID or UniProt accession. Each way has its own web form, reachable from the main menu of the web server (Figure 1). The 'SUBMIT' option (Figure 1A) opens the structure model submission form, and the 'QUERY' option (Figure 1B) is used to query the application with an EMDB, PDB or UniProt entry.
Figure 1
To query 3DBIONOTES-WS using a database ID, type the PDB code, EMDB ID or UniProt accession in the text field (Figure 2A) and then click the 'Submit' button.
Figure 2
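The three kinds of identifier accepted by the query form can usually be told apart by their shape. The sketch below is only an illustration: the function name `classify_identifier` and the patterns are assumptions based on the public PDB, EMDB and UniProt identifier formats, not on how 3DBIONOTES-WS itself parses the text field.

```python
import re

# Assumed patterns, derived from the public identifier formats.
# PDB codes: one digit followed by three alphanumeric characters.
PDB_RE = re.compile(r"^[0-9][A-Za-z0-9]{3}$")
# EMDB ids: "EMD-" followed by four or five digits (assumption).
EMDB_RE = re.compile(r"^EMD-[0-9]{4,5}$", re.IGNORECASE)
# UniProt accessions: 6 or 10 characters, following the pattern published by UniProt.
UNIPROT_RE = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)


def classify_identifier(text):
    """Return 'PDB', 'EMDB', 'UniProt' or 'unknown' for a query string."""
    token = text.strip()
    if EMDB_RE.match(token):
        return "EMDB"
    if PDB_RE.match(token):
        return "PDB"
    if UNIPROT_RE.match(token.upper()):
        return "UniProt"
    return "unknown"
```

A check of this kind can be useful before typing an identifier into the form, for example to catch a UniProt accession with a missing character.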
Information for multiple UniProt accessions can be requested in order to explore protein-protein interaction networks. To that end, first select the organism of interest using the select menu (Figure 3A); then type the UniProt accessions in the text panel (Figure 3B). If the checkbox (Figure 3C) is checked, the application will also return information for those interaction partners of the submitted proteins for which structural information is available.
Figure 3
A structural model file can be uploaded through the web form that opens when clicking the 'SUBMIT' option, using the file input field to upload the coordinates (Figure 4A). The accepted file formats are PDB and mmCIF. When the file is submitted, the system performs a BLAST search to identify the possible UniProt accessions of the submitted proteins. This step can take a few minutes, since UniProt TrEMBL contains more than 40,000,000 protein sequences. The 'JOB TITLE' field (Figure 4B) is an optional text field that can be used to identify the macromolecular structure or project; when it is left empty, the name of the structure file becomes the title of the project.
Figure 4
When the structure file is submitted, the web server performs a BLAST search against the SwissProt and TrEMBL databases to find the potential proteins corresponding to the different chains contained in the structure. When this process is finished, a new form (Figure 5) is returned to the user with the best hits, so that the proper protein can be selected for each chain. This form contains one table per chain, and the user is asked to choose a protein (table row) in each table. The tables contain the following fields: (A) check button for the protein selection, (B) gene symbol, (C) protein name, (D) organism name, (F) UniProt accession, (G) sequence identity, (H) BLAST e-value, (I) sequence alignment start position and (L) sequence alignment end position. It is not necessary to select a protein for every chain; however, chains that are not identified by the user will not be annotated. Once the desired proteins have been selected, clicking the 'Submit' button sends this information back to the server, where a sequence alignment between the chains and the UniProt sequences is calculated.
Figure 5
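The per-chain hit tables described above can be pictured as a small data structure. The following is a minimal sketch, assuming a hypothetical `BlastHit` record whose fields mirror the table columns and a hypothetical `preselect_hits` helper that suggests the hit with the lowest e-value for each chain. The web form leaves the final choice to the user; this helper is not part of 3DBIONOTES-WS.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """One table row; field names are hypothetical, modelled on the form columns."""
    gene_symbol: str
    protein_name: str
    organism: str
    accession: str
    identity: float  # sequence identity, in percent
    e_value: float   # BLAST e-value
    start: int       # alignment start position
    end: int         # alignment end position


def preselect_hits(hits_by_chain, min_identity=0.0):
    """Suggest one hit per chain: lowest e-value, ties broken by higher identity.

    Chains without any acceptable hit are left out, which corresponds to
    leaving the chain unselected (and therefore not annotated).
    """
    selection = {}
    for chain, hits in hits_by_chain.items():
        candidates = [h for h in hits if h.identity >= min_identity]
        if candidates:
            selection[chain] = min(candidates, key=lambda h: (h.e_value, -h.identity))
    return selection


# Illustrative rows only, not real BLAST output.
example_hits = {
    "A": [
        BlastHit("KRAS", "GTPase KRas", "Homo sapiens", "P01116", 100.0, 1e-120, 1, 169),
        BlastHit("Kras", "GTPase KRas", "Mus musculus", "P32883", 97.0, 1e-115, 1, 169),
    ],
    "B": [],  # no hit found: this chain stays unannotated
}
suggested = preselect_hits(example_hits)
```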
The graphical interface comprises three panels (Figure 6). The 3D panel (Figure 6A) is based on the NGL protein viewer and displays the macromolecular structure submitted by the user. The viewer controls are explained in the STRUCTURE PANEL section. The annotation panel (Figure 6B) is built on the ProtVista plug-in and shows the biomedical and biochemical data collected from the different sources of information. Finally, the sequence panel (Figure 6C) represents the alignment between the selected UniProt sequence and the active chain of the structural model.
Figure 6
All these panels are interconnected and selecting an annotation or a sequence region will highlight the involved amino acids in the other panels (Figure 7). See each of the components for a more detailed explanation.
Figure 7
The annotation panel comprises a set of lanes where the biochemical and biomedical annotations are represented using different shapes and colors (Figure 8). Currently the web server collects information from multiple sources: post-translational modifications (dbPTM and PhosphoSitePlus), genomic variations and diseases (dSysMap and BioMuta), short linear motifs (ELM), disordered regions (MobiDB) and domain families (Pfam, ProSite and SMART).
Figure 8
An annotation can be selected by clicking it; the annotation panel will then display a description of the selected annotation and a link to the original source or publication (Figure 9).
Figure 9
Also, when an annotation is selected, the application will display the region in the sequence panel and highlight the amino acids in the 3D viewer (Figure 10).
Figure 10
The annotation panel provides different options: download the displayed information, reset the annotation panel and zoom to the selected region (Figure 11A, left to right).
Figure 11
The variant viewer lane contains the genomic variants associated with the protein (Figure 12). The variants can be filtered according to their consequences (disease, non-disease, predicted deleterious, ...) using the legend menu (Figure 12B). This viewer provides two options: filter by disease and reset the variant panel (Figure 12A, left to right).
Figure 12
Clicking the disease filter (Figure 13A) will display a list of diseases associated with the variants (Figure 14B). When one of the disease names is clicked, the variant panel will display only those variants that are associated with that disease.
Figure 14
The sequence panel shows the alignments between the chains in the structural model and their corresponding UniProt accessions that were selected in the identification step (Figure 15).
Figure 15
This frame allows the user to select a segment of the sequence in order to highlight the corresponding residues in the structural viewer and the corresponding region in the annotation panel (Figure 16).
Figure 16
The 3D viewer displays the macromolecular structural model that was submitted to, or queried from, the server (Figure 17). The menu at the top of the panel (Figure 17A) can be used to select the subunit (protein) of the structure for which the annotations are visualized. It also allows the user to change the representation of a selected annotation and to navigate through the different models if the structure contains more than one.
Figure 17
The tool bar in the 3D viewer comprises two components (Figure 18). The select menu at the top is used to navigate through the different subunits or proteins contained in the structural model. Changing this component also changes the annotation and sequence panels to show the information related to the selected protein. From left to right, the icons at the bottom of the tool bar are: the clockwise arrow, to reset all the panels and clear the selections; the light icon, to change the representation of the selected residues; the eye icon, to display or hide the heteroatoms; the arrow icons, to navigate through the different models; the scope icon, to mark and label a selected annotation (see next); the menu icon, to display the labeled annotations (see next); and finally the camera icon, to save an image of the 3D viewer.
Figure 18
The scope icon (Figure 19) is used to label a selected annotation on the 3D view of the structure. The label shows a short description of the annotation and its position, and its color is the same as in the annotation panel. Labels are not deleted when a new selection is made or when any of the panels is reset. The user can select other subunits to annotate the 3D structure, and all selected labels will be kept. Labels are removed, hidden or displayed through the menu icon (see next).
Figure 19
The menu icon (Figure 20) is used to list all the currently selected labels in the 3D viewer. When it is clicked, a list of the current labels is displayed. Each label in the list is preceded by two checkboxes and a close icon. The first checkbox displays or hides the text of the label, the second checkbox displays or hides the label representation on the structure, and the close icon removes the label completely.
Figure 20
The gene panel (Figure 21) displays the information collected from the ENSEMBL database. The panel header (Figure 21A) contains the title, assembly and coordinates of the ENSEMBL gene associated with the selected protein. The select menu (Figure 21B) is used to choose among the different ENSEMBL transcripts associated with the gene. The checkboxes (Figure 21C) can be used to hide or display the different panel tracks (Figure 21D). From top to bottom, the tracks display: (POS STRAND) positive strand sequence of the gene DNA; (NEG STRAND) negative strand sequence of the gene DNA; (PROTEIN SEQ) sequence of the protein; (KRAS-202) exons of the currently selected transcript, in this example KRAS-202; (CODING REGION) transcript regions that are translated; (UNIPROT ALIGN) regions that can be aligned with the associated UniProt sequence.
Figure 21
Protein-protein interaction networks are displayed using a graph-based representation (Figure 22). In this graph, nodes represent proteins and edges represent the interactions among them. The color code indicates the source of the structural information: blue if it is compiled from experimental data, yellow if it comes from homology models, or gray when structural data is not available. In addition, yellow edges can be drawn as a continuous or discontinuous line, depending on whether the structure was predicted from sequence homology modeling or from domain-domain interaction templates.
Figure 22
The “FEATURES” menu located on the top right side of the GUI (Figure 23A) can be used to map annotations on the network.
Figure 23
The "FILTER *" menu (Figure 24A) can be used to hide and display specific annotations.
Figure 24
For example, in Figure 25 only annotations associated with “Colorectal cancer” are displayed.
Figure 25
The analysis panel displays those diseases for which the co-occurrence of their associated variants with a specific biochemical feature is statistically significant according to Fisher’s exact test. Statistical significance is defined in terms of the Fisher’s exact test p-value, with the Benjamini–Hochberg procedure applied at a false discovery rate of 1% to control the number of false positives (Type I errors). This information is displayed in two tracks (Figure 26): the first track displays the biochemical feature regions (Figure 26A), and the second track displays the list of diseases with the results of Fisher’s exact test (Figure 26B) and the genomic variants (Figure 26C). Variants are colored red if they lie within the selected biochemical feature region ('Nucleotide Binding Site', in this case) or grey if they lie outside it.
Figure 26
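The statistical procedure described above can be sketched in a few lines. This is an illustration under stated assumptions, not the server's implementation: it assumes a 2x2 table per disease (disease variants inside and outside the feature region, versus all other variants inside and outside) and a one-sided test for enrichment, neither of which is specified in this guide. The function names are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    a: disease variants inside the feature region
    b: disease variants outside the feature region
    c: other variants inside the feature region
    d: other variants outside the feature region
    Returns the probability of observing a or more disease variants inside
    the region under the hypergeometric null distribution.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1))
    return tail / total


def benjamini_hochberg(p_values, fdr=0.01):
    """Return a list of booleans marking the p-values accepted at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant
```

In this sketch, one p-value would be computed per disease and feature pair, and only the pairs flagged by `benjamini_hochberg` at `fdr=0.01` would be reported.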
Variants can be filtered by disease in such a way that only variants for the selected diseases are visible. For example, in Figure 27 only variants associated to 'Colorectal cancer' are visible.
Figure 27
A custom annotation file can be attached when requesting information from 3DBIONOTES. Annotation files must be submitted in JSON format (Figure 28). All annotations must be contained in a single array. Each annotation (Figure 28A) encodes its information as a hash. The “track_name” key gives the name of the lane that will appear in the annotation viewer. The “visualization_type” key selects the type of data: “variants” or “continuous”; when “visualization_type” is absent, the data are displayed as segments. The “chain” key indicates that the annotations are associated with that PDB chain, whereas the “acc” key (Figure 28B) indicates that the annotations are associated with proteins with this accession. Finally, “data” contains an array describing the positions in the sequence, value, amino acid variation and wild type, type (sub-track name in the annotation viewer) and a description. All analysis tools and viewers are available for the attached annotations.
Figure 28
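As a rough illustration of the format described above, the following sketch builds an annotation array and serializes it to JSON. The top-level keys (“track_name”, “visualization_type”, “chain”, “acc”, “data”) are taken from the description; the keys used inside each “data” item (`begin`, `end`, `type`, `description`) and the helper name `make_track` are assumptions for illustration and should be checked against the example in Figure 28.

```python
import json

# Values named in the text; an absent "visualization_type" means segments.
ALLOWED_TYPES = {"variants", "continuous"}


def make_track(track_name, data, chain=None, acc=None, visualization_type=None):
    """Build one annotation hash for the custom annotation array."""
    if visualization_type is not None and visualization_type not in ALLOWED_TYPES:
        raise ValueError("visualization_type must be 'variants', 'continuous' or omitted")
    track = {"track_name": track_name, "data": list(data)}
    if visualization_type is not None:
        track["visualization_type"] = visualization_type
    if chain is not None:
        track["chain"] = chain  # annotations tied to a PDB chain
    if acc is not None:
        track["acc"] = acc      # annotations tied to a UniProt accession
    return track


annotations = [
    make_track(
        "My binding sites",
        # Hypothetical item keys: begin/end positions, sub-track type, description.
        [{"begin": 10, "end": 17, "type": "site", "description": "example segment"}],
        chain="A",
    ),
]

# The whole file is a single JSON array of annotation hashes.
annotation_json = json.dumps(annotations, indent=2)
```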
The annotation panel provides tools to add custom annotations or to transfer them from similar proteins. These tools are available in the annotation panel 'MENU' (Figure 29).
Figure 29
WARNING: These annotations will not be included in the genomic variant analysis and viewers.
Your own annotations can be added in two ways: by uploading a file (Figure 30A) or manually (Figure 30B). Uploaded files must be in JSON format; an example and a description of the format are shown in Figure 28. Adding an annotation manually only requires filling in the form displayed in Figure 30B. The 'COLOR' field describes the annotation color using its name (e.g. red, orange, pink) or its hexadecimal code (e.g. #FF0000, #C345D8). The 'INDEX' selector is used to choose the coordinate system: if the 'INDEX' value is 'SEQUENCE', the 'BEGIN' and 'END' values refer to sequence positions, whereas if the 'INDEX' value is 'STRUCTURE', they refer to the residue IDs of the PDB structure.
Figure 30
When new annotations are added new tracks containing them will be displayed and the new annotations will behave as the native ones (Figure 31).
Figure 31
Annotations can be transferred from other proteins that share at least 80% sequence similarity with the target protein. Transferring annotations from similar proteins is done through the annotation menu (Figure 20). When 'TRANSFER ANNOTATIONS FROM SIMILAR PROTEINS' is clicked, the application displays a table listing the proteins that share at least 80% sequence similarity (Figure 32). Each row of the table displays: (A) gene symbol, (B) protein name, (C) organism name, (D) UniProt accession, (E) number of UniProt annotations and (F) sequence identity.
Figure 32
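The 80% threshold can be illustrated with a small calculation on a pairwise alignment. This is a sketch only: it uses percent identity over aligned (non-gap) positions as the measure, which is one common definition and an assumption here, since this guide does not state how the server computes similarity. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two aligned sequences of equal length.

    Gap characters ('-') are skipped, so the denominator is the number of
    positions where both sequences have a residue (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def similar_enough(aligned_target, aligned_candidate, threshold=80.0):
    """True if the candidate reaches the threshold used for annotation transfer."""
    return percent_identity(aligned_target, aligned_candidate) >= threshold
```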
When a protein is selected, the application updates the annotation and sequence panels (Figure 33). The sequence panel will now contain an alignment between three sequences (Figure 33A): the first is the original sequence (target protein), the second is the structure chain and the third is the protein selected to transfer annotations from. The annotation panel will display the annotations of the selected protein, and when an annotation is selected its tooltip will show a 'TRANSFER' button on the right side of its title. Clicking this button transfers the annotation to the target protein. The annotation panel will also include a selector (Figure 33C) to navigate through the different imported proteins.
Figure 33
Annotations can be transferred from multiple proteins. The selector at the top of the annotation menu (Figure 34A) shows all imported proteins and the target protein. To display the target protein, to which the selected annotations have been transferred, select the original protein in the selector, which is the first item (Figure 35A).
Figure 35
The transferred annotations are displayed in a track named 'Imported annotations' and grouped by their type (Figure 35B). When a transferred annotation is selected a warning message displaying the protein source of the annotation will appear in its tooltip (Figure 35C).
Figure 35