About GPML2021

What’s new?


Figure 1: GPML2021 Schema Diagram

Streamlined Annotations, Citations, and Evidences

In GPML2021, Annotation/AnnotationRef and Citation/CitationRef replaces Biopax OpenControlledVocabulary and PublicationXref/BiopaxRef respectively. Annotations and Citations of a Pathway are written at the end of a GPML, and are referenced using AnnotationRefs and CitationRefs of the Pathway or pathway elements. These new GPML2021 features allow more flexibility in annotating individual pathway elements. And are an improvement on GPML2013a for which Annotations/OpenControlledVocabulary could only be linked to the Pathway and not to individual pathway elements.

AnnotationRef and CitationRef are grouped in CommentGroup along with Comment, Property, and EvidenceRef. DataNodes, States, Interactions, GraphicalLines, Labels, Shapes, and Groups can all have CommentGroup. In addition, CitationRefs and EvidenceRefs can be nested in AnnotationRefs; and AnnotationRefs can be nested in CitationRefs to provide related information.

Related Read: Guide to Adding Annotations, Citations, and Evidences

Annotation and AnnotationRef

An Annotation has:

An AnnotationRef has:

GPML2013a OpenControlledVocabulary compared with GPML2021 Annotation:

Example of GPML2013a OpenControlledVocabulary

<Biopax>
    <bp:openControlledVocabulary xmlns:bp="http://www.biopax.org/release/biopax-level3.owl#">
        <bp:TERM xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" rdf:datatype="http://www.w3.org/2001/XMLSchema#string">thyroid cancer</bp:TERM>
        <bp:ID xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" rdf:datatype="http://www.w3.org/2001/XMLSchema#string">DOID:1781</bp:ID>
        <bp:Ontology xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Disease</bp:Ontology>
    </bp:openControlledVocabulary>
    ...
</Biopax>

Example of GPML2021 Annotation and AnnotationRef

<Annotations>
    <Annotation elementId="a2" name="thyroid cancer" type="Disease">
        <Xref identifier="1781" dataSource="DOID"/>
	<Url link="https://identifiers.org/DOID:1781"/>
    </Annotation>
    ...
</Annotations>

<AnnotationRef elementRef="a2">
    <CitationRef elementRef="cbc" />
    <EvidenceRef elementRef="evd" />
</AnnotationRef>

Citation and CitationRef

A Citation has:

From a Citation Xref all information about a publication can be found. Therefore, publication details (e.g. author, title) are not written in GPML2021.

A CitationRef has:

GPML2013a PublicationXref compared with GPML2021 Citation:

Example of GPML2013a PublicationXref and BiopaxRef

<Biopax>
    <bp:PublicationXref xmlns:bp="http://www.biopax.org/release/biopax-level3.owl#" 
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" rdf:id="cbc">
        <bp:ID rdf:datatype="http://www.w3.org/2001/XMLSchema#string">7730304</bp:ID>
        <bp:DB rdf:datatype="http://www.w3.org/2001/XMLSchema#string">PubMed</bp:DB>
        <bp:TITLE rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Isotopomer analysis of citric acid cycle and 
        gluconeogenesis in rat liver. Reversibility of isocitrate dehydrogenase and involvement of ATP-citrate lyase in 
        gluconeogenesis.</bp:TITLE>
        <bp:SOURCE rdf:datatype="http://www.w3.org/2001/XMLSchema#string">J Biol Chem</bp:SOURCE>
        <bp:YEAR rdf:datatype="http://www.w3.org/2001/XMLSchema#string">1995</bp:YEAR>
        <bp:AUTHORS rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Des Rosiers C</bp:AUTHORS>
        <bp:AUTHORS rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Di Donato L</bp:AUTHORS>
        <bp:AUTHORS... 
    </bp:PublicationXref>
    ...
</Biopax>

<BiopaxRef>cbc</BiopaxRef>

Example of GPML2021 Citation and CitationRef

<Citations>
    <Citation elementId="cbc">
        <Xref identifier="7730304" dataSource="pubmed"/>
        <Url link="https://identifiers.org/pubmed:7730304"/>
    </Citation>
    ...
</Citations>

<CitationRef elementRef="cbc">
    <AnnotationRef elementRef="a2" />
</CitationRef>

Url

An Url has:

New Evidence Code

New elements Evidence and EvidenceRef (reference to an Evidence) are introduced for the annotation of Evidence Codes. Evidences of a Pathway are written at the end of a GPML after Annotations and Citations, and are referenced using EvidenceRefs. EvidencRef is grouped in CommentGroup along with AnnotationRef, CitationRef, Comment, and Property. DataNodes, States, Interactions, GraphicalLines, Labels, Shapes, and Groups can all have CommentGroup. EvidenceRef can also be nested in an AnnotationRef element.

An Evidence has:

An EvidenceRef has:

Example of GPML2021 Evidence and EvidenceRef

<Evidences>
    <Evidence elementId="evd" value="experimental evidence">
        <Xref identifier="0000006" dataSource="ECO"/>
        <Url link="https://identifiers.org/ECO:0000006"/>
   </Evidence>
   ...
</Evidences>

<EvidenceRef elementRef="evd" />

Updated Extensible Type Definitions

New categories for DataNode (Molecules vs. Concepts), State, Group, and Annotation types are defined. As well as enumeration types for connectorType, StyleType, and Anchor shapeType. All Types are extensible with type string and written in UpperCamelCase format.

DataNode type   State type Group type Annotation type
Undefined (default)   Undefined (default) Group (default) Undefined (default)
Molecules: Concepts: ProteinModification Transparent Ontology
GeneProduct Pathway GeneticVariant Complex Taxonomy
DNA Disease EpigeneticModification Pathway  
RNA Phenotype   Analog  
Protein Alias   Paralog  
Complex Event      
Metabolite        


ConnectorType Line/BorderStyle) Anchor shapeType
Straight (default) Solid (default) Square (default)
Elbow Dashed Circle
Curved Double None
Segmented    

Additionally, LineStyle “Double” and CellularComponent shapeTypes enumeration types are cleaned up so that such information no longer needs to be stored in key-value pair Property.

New Interaction Panel

A new Interaction Panel (Interaction or Line arrowHead type) is introduced. ArrowHead type is extensible with type string and written in UpperCamelCase format. Plugins will be available for:

The GPML Interaction Panel ArrowHead types can be mapped to equivalent Old ArrowHead Types and MIM ArrowHead types.

Interaction Panel Old MIM ArrowHead
Undirected (default) Line  
Directed Arrow  
Conversion   mim-conversion; mim-modification; mim-cleavage; mim-gap; mim-branching-left; mim-branching-right
Inhibition TBar mim-inhibition
Catalysis   mim-catalysis
Stimulation   mim-stimulation; mim-necessary-stimulation
Binding   mim-binding; mim-covalent-bond
Translocation   mim-translocation
Transcription-translation   mim-transcription-translation

Because there are many more SBGN ArrowHead types, they are not mapped to the GPML Interaction Panel. The SBGN Plugin handles SBGN Arrowhead types.

DataNode Type Alias and Attribute AliasRef

An Alias for a Group can be represented by a DataNode with type=”Alias” and aliasRef referring to the elementId of the parent Group.

Example of GPML2021 Group and DataNode Alias:

<DataNode elementId="c27d1" aliasRef="d4107" textLabel="Alias_for_abc" .../>

<DataNode elementId="ed3f8" textLabel="DataNode" type="GeneProduct" groupRef="d4107"  .../ >

<Label elementId="e180c"  textLabel="group_abc" groupRef="d4107".../>

<Group elementId="d4107" style="Complex".../>

Nested Groups are also possible by utilizing DataNode as Aliases.

Example of GPML2021 Nested Groups and DataNode Aliases:

  <DataNode elementId="a1" elementRef="efg" textLabel="Alias_for_efg" />
  <DataNode elementId="a2" elementRef="abc" groupRef="efg" textLabel="Alias_for_abc" />
  <DataNode elementId="a3" elementRef="abc" textLabel="Alias_for_abc" />

  <DataNode elementId="n1" groupRef="abc" textLabel="DataNode;" type="GeneProduct" .../>
  <DataNode elementId="n2" groupRef="efg" textLabel="DataNode;" type="GeneProduct" .../>
  <Label elementId="n3" groupRef="abc" textLabel="group_abc" .../>
  <Label elementId="n4" groupRef="efg" textLabel="group_efg".../>

  <Group elementId="abc" Style="Complex" textLabel="GroupABC" groupRef="efg" />
  <Group elementId="efg" Style="Complex" textLabel="GroupEFG" />

States Nested in DataNodes

Given that a State is always linked to a DataNode, States are moved to be nested inside of DataNodes in GPML2021.

Example of GPML2021 DataNode with States:

<DataNodes>
    <DataNode elementId="b4d99" textLabel="ERBB2" type="GeneProduct">
        <Xref identifier="ENSG00000141736" dataSource="ensembl" />
        <States>
            <State elementId="c0d56" textLabel="+" type="Undefined">
              <Graphics ... />
            </State>
            ...
      </DataNode>
</DataNodes>

Pathway Description and Author Elements

The Description element stores the pathway textual description (previously stored in a Comment element).

Example of GPML2021 Description:

...
<Description> textual description for pathway </Description
...

The Author elements nested in Authors allows the storing of pathway author information, including:

Example of GPML2021 Authors:

<Authors>
		<Author name="Mkutmon" username="Martina Kutmon" order="1">
			<Xref identifier="0000" dataSource="orcid"/>
		</Author>
		<Author name="Egonw" username="Egon Willighagen" order="2"/>
	  ...
</Authors>

Graphics Customization

Additional customization of graphics (e.g. color, shape) is now possible.

Xref DataSource as Compact Identifier Prefix

Xref dataSource(s) will be retrieved from BridgeDb, and it is also possible to register new data sources. The DataSource compact identifier prefix is given priority and written for by default GPML2021 format (in contrast, DataSource full name is written by default for GPML2013a). See registered BridgeDB datasources. Additionally, Xref is added for the Pathway.

Example of GPML2021 Pathway Xref:

  <Xref identifier="WP1_r1" dataSource="wikipathways"/>

For Example:

identifier compact identifier prefix full name system code
ENSG00000139618 ensembl Ensembl En

Merged Group GroupId and GraphId

Group Element previously had both GroupId and GraphId, identification of Groups are merged into just elementId.

Removed Unused Elements and Attributes

Unimplemented elements and attributes of GPML2013a are removed in GPML2021.
For Example:

Modified Schema Format and Structure