
Table 2 Basic secondary structure grammar. This grammar is a simplified version, included for illustrative purposes. The grammar actually used for calculating shape probabilities is larger because it must be unambiguous; see the discussion in the paragraph "A non-ambiguous grammar with correct dangles" and Table 6. Part a) shows the grammar in its algebraic form: ||| separates alternative right-hand sides of productions, ... h denotes the application of the choice function h, and ~~~ denotes juxtaposition of terms. <<< denotes application of the operator on its left-hand side to the arguments on its right-hand side. Operators are as in Table 1, plus ul(x) as an abbreviation for ad(x,e), str for structures, and blk for blocks. The axiom of the grammar is struct. Part b) shows the same grammar in EBNF notation, naturally without the operators to be applied.

From: Complete probabilistic analysis of RNA shapes

a)

struct = str <<< comps |||

   str <<< singlestrand |||

   str <<< (e <<< empty) ... h

block = ad <<< singlestrand ~~~ closed ... h

comps = ad <<< block ~~~ comps |||

   block

   ad <<< block ~~~ singlestrand ... h

singlestrand = ss <<< region

closed = (hl <<< base ~~~ region3 ~~~ base |||

   sp <<< base ~~~ closed ~~~ base |||

   sr <<< base ~~~ (bl <<< region ~~~ closed) ~~~ base |||

   sr <<< base ~~~ (br <<< closed ~~~ region) ~~~ base |||

   ml <<< base ~~~ (ad <<< block ~~~ comps) ~~~ base |||

   sr <<< base ~~~ (il <<< region ~~~ closed ~~~ region) ~~~ base)

   'with' basepairing ... h

region3 = region 'with' (minsize 3)
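
The notation of part a) can be mimicked with a few small parser combinators. The following Python sketch is only an illustration of how |||, ~~~, <<<, 'with' and ... h fit together; it is not the implementation used in the paper. All function names (alt, seq, app, with_, choice, count) are hypothetical, the dot-bracket strings returned by hl and ss are assumed stand-ins for the operators of Table 1, and the pairing rule (Watson-Crick plus G-U) is an assumption about the basepairing filter.

```python
from typing import Any, Callable, List, Tuple

Result = Tuple[Any, ...]
Parser = Callable[[str, int, int], List[Result]]

# Assumed pairing rule for the 'with' basepairing filter.
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}

def base(s: str, i: int, j: int) -> List[Result]:
    """Terminal parser: exactly one base."""
    return [(s[i],)] if j - i == 1 else []

def region(s: str, i: int, j: int) -> List[Result]:
    """Terminal parser: a non-empty run of bases."""
    return [(s[i:j],)] if j > i else []

def alt(*parsers: Parser) -> Parser:
    """|||  collect the results of all alternatives."""
    return lambda s, i, j: [r for p in parsers for r in p(s, i, j)]

def seq(p: Parser, q: Parser) -> Parser:
    """~~~  juxtaposition: try every split point k of the subword (i, j)."""
    return lambda s, i, j: [a + b
                            for k in range(i, j + 1)
                            for a in p(s, i, k)
                            for b in q(s, k, j)]

def app(f: Callable[..., Any], p: Parser) -> Parser:
    """<<<  apply operator f to the arguments parsed by p."""
    return lambda s, i, j: [(f(*args),) for args in p(s, i, j)]

def with_(p: Parser, pred: Callable[[str, int, int], bool]) -> Parser:
    """'with'  keep the results of p only if the subword passes the filter."""
    return lambda s, i, j: p(s, i, j) if pred(s, i, j) else []

def choice(p: Parser, h: Callable[[List[Result]], List[Result]]) -> Parser:
    """... h  apply the choice function h to the list of candidates."""
    return lambda s, i, j: h(p(s, i, j))

def minsize(n: int) -> Callable[[str, int, int], bool]:
    return lambda s, i, j: j - i >= n

def basepairing(s: str, i: int, j: int) -> bool:
    return j - i >= 2 and (s[i], s[j - 1]) in PAIRS

# Stand-in operators that render dot-bracket strings (an assumption).
def hl(lb: str, loop: str, rb: str) -> str:
    return "(" + "." * len(loop) + ")"

def ss(r: str) -> str:
    return "." * len(r)

def count(results: List[Result]) -> List[Result]:
    """A choice function that counts candidates instead of listing them."""
    return [(len(results),)]

region3 = with_(region, minsize(3))
hairpin = with_(app(hl, seq(seq(base, region3), base)), basepairing)
hairpin_or_strand = alt(hairpin, app(ss, region))
```

For the input GAAAC, hairpin yields the single candidate (...), and hairpin_or_strand yields both the hairpin and the fully unpaired string. Wrapping a parser in choice with count as h returns the number of candidates rather than the candidates themselves.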

b)

struct = comps |

   singlestrand |

   empty

block = singlestrand closed |

comps = block comps |

   block |

   block singlestrand

singlestrand = region

closed = base region3 base |

   base closed base |

   base region closed base |

   base closed region base |

   base region closed region base |

   base block comps base

region3 = base base region

region = base |

   base region

base = 'A' | 'C' | 'G' | 'U'
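
As an illustration of how the EBNF grammar of part b) applies to a concrete sequence, the following Python sketch counts the derivations of struct for a given input by memoised recursion over subwords. It rests on several assumptions: the function name count_derivations is hypothetical, block is taken to have only the single alternative shown in part a) (the trailing | in part b) is read as having no further alternative), closed is restricted by a basepairing filter as in part a), and the pairing rule is taken to be Watson-Crick plus G-U. It is not the dynamic programming code of the paper.

```python
from functools import lru_cache

# Assumed pairing rule for the basepairing filter of part a).
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}

def count_derivations(seq: str) -> int:
    """Count derivations of `struct` for seq under the grammar of part b)."""
    n = len(seq)

    def region(i: int, j: int) -> int:
        # region = base | base region: one derivation per non-empty subword
        return 1 if j > i else 0

    def region3(i: int, j: int) -> int:
        # region3 = base base region: at least three bases
        return 1 if j - i >= 3 else 0

    singlestrand = region

    @lru_cache(maxsize=None)
    def closed(i: int, j: int) -> int:
        # The outermost bases of the subword must be able to pair.
        if j - i < 2 or (seq[i], seq[j - 1]) not in PAIRS:
            return 0
        a, b = i + 1, j - 1                               # enclosed subword
        total = region3(a, b)                             # base region3 base
        total += closed(a, b)                             # base closed base
        for k in range(a + 1, b):
            total += region(a, k) * closed(k, b)          # base region closed base
            total += closed(a, k) * region(k, b)          # base closed region base
            total += block(a, k) * comps(k, b)            # base block comps base
            for l in range(k + 1, b):                     # base region closed region base
                total += region(a, k) * closed(k, l) * region(l, b)
        return total

    @lru_cache(maxsize=None)
    def block(i: int, j: int) -> int:
        # block = singlestrand closed
        return sum(singlestrand(i, k) * closed(k, j) for k in range(i + 1, j))

    @lru_cache(maxsize=None)
    def comps(i: int, j: int) -> int:
        total = block(i, j)                               # block
        for k in range(i + 1, j):
            total += block(i, k) * comps(k, j)            # block comps
            total += block(i, k) * singlestrand(k, j)     # block singlestrand
        return total

    # struct = comps | singlestrand | empty
    return comps(0, n) + singlestrand(0, n) + (1 if n == 0 else 0)
```

Under these assumptions, count_derivations("AGAAAC") returns 2: one derivation for the fully unpaired strand and one for a leading unpaired A followed by a hairpin closed by the G-C pair. For "GAAAC" it returns 1, because block as printed requires a single-stranded region before each closed substructure.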