Complete probabilistic analysis of RNA shapes

Table 2 Basic secondary structure grammar. This grammar is a simplified version, included for illustrative purposes. The grammar actually used for calculating shape probabilities is larger, owing to the requirement to be unambiguous; see the discussion in paragraph "A non-ambiguous grammar with correct dangles" and Table 6. Part a) shows the grammar in its algebraic form: ||| separates alternative right-hand sides of a production, ... h denotes the application of the choice function h, and ^~~~ denotes the juxtaposition of terms. <<< denotes the application of the operator on its left-hand side to the arguments on its right-hand side. Operators are as in Table 1, plus ul(x) as an abbreviation for ad(x,e), str for structures, and blk for blocks. The axiom of the grammar is struct. Part b) shows the same grammar in EBNF notation, naturally without the operators to be applied.

a)
struct       = str <<< comps |||
               str <<< singlestrand |||
               str <<< (e <<< empty) ... h
block        = ad <<< singlestrand ^~~~ closed ... h
comps        = ad <<< block ^~~~ comps |||
               block |||
               ad <<< block ^~~~ singlestrand ... h
singlestrand = ss <<< region
closed       = (hl <<< base ^~~~ region3 ^~~~ base |||
                sp <<< base ^~~~ closed ^~~~ base |||
                sr <<< base ^~~~ (bl <<< region ^~~~ closed) ^~~~ base |||
                sr <<< base ^~~~ (br <<< closed ^~~~ region) ^~~~ base |||
                ml <<< base ^~~~ (ad <<< block ^~~~ comps) ^~~~ base |||
                sr <<< base ^~~~ (il <<< region ^~~~ closed ^~~~ region) ^~~~ base)
               'with' basepairing ... h
region3      = region 'with' (minsize 3)
b)
struct       = comps |
               singlestrand |
               empty
block        = singlestrand closed |
comps        = block comps |
               block |
               block singlestrand
singlestrand = region
closed       = base region3 base |
               base closed base |
               base region closed base |
               base closed region base |
               base region closed region base |
               base block comps base
region3      = base base region
region       = base |
               base region
base         = 'A' | 'C' | 'G' | 'U'
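
To make the grammar more concrete, the following Python sketch counts the derivations that part b) admits for a given sequence. It is an illustration under stated assumptions, not the implementation used for the shape probabilities: the function name count_derivations is hypothetical, the productions are taken exactly as printed in part b), and the basepairing filter is assumed to accept the canonical pairs A-U and C-G plus the wobble pair G-U. The choice function h is replaced by plain counting, so the result is a number of derivations, which need not equal the number of distinct structures when a grammar is ambiguous.

```python
from functools import lru_cache

# Assumption: 'basepairing' accepts Watson-Crick pairs and G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}


def count_derivations(seq: str) -> int:
    """Count derivations of the axiom struct over seq, following part b) as printed."""
    n = len(seq)

    def region(i: int, j: int) -> int:
        # region = base | base region: exactly one derivation per non-empty span
        return 1 if j - i >= 1 else 0

    def region3(i: int, j: int) -> int:
        # region3 = base base region: a region of at least three bases
        return 1 if j - i >= 3 else 0

    singlestrand = region  # singlestrand = region

    @lru_cache(maxsize=None)
    def closed(i: int, j: int) -> int:
        # The smallest closed structure is base region3 base (five bases),
        # and the outermost two bases must pair.
        if j - i < 5 or (seq[i], seq[j - 1]) not in PAIRS:
            return 0
        a, b = i + 1, j - 1  # span enclosed by the pair
        total = region3(a, b)                                    # hairpin loop
        total += closed(a, b)                                    # stacked pair
        total += sum(region(a, k) * closed(k, b) for k in range(a + 1, b))  # left bulge
        total += sum(closed(a, k) * region(k, b) for k in range(a + 1, b))  # right bulge
        total += sum(region(a, k) * closed(k, l) * region(l, b)            # internal loop
                     for k in range(a + 1, b) for l in range(k + 1, b))
        total += sum(block(a, k) * comps(k, b) for k in range(a + 1, b))    # multiloop
        return total

    @lru_cache(maxsize=None)
    def block(i: int, j: int) -> int:
        # block = singlestrand closed
        return sum(singlestrand(i, k) * closed(k, j) for k in range(i + 1, j))

    @lru_cache(maxsize=None)
    def comps(i: int, j: int) -> int:
        # comps = block comps | block | block singlestrand
        total = block(i, j)
        for k in range(i + 1, j):
            total += block(i, k) * (comps(k, j) + singlestrand(k, j))
        return total

    # struct = comps | singlestrand | empty
    return comps(0, n) + singlestrand(0, n) + (1 if n == 0 else 0)


if __name__ == "__main__":
    for s in ["", "AAAA", "AGAAAC"]:
        print(repr(s), count_derivations(s))
```

For the sequence AGAAAC, for example, the sketch finds two derivations: the fully unpaired single strand, and a leading unpaired A followed by a hairpin closed by the G-C pair. Replacing the counting by a sum of Boltzmann weights would turn the same recursion into a partition function, which is only meaningful on an unambiguous grammar.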

ISSN: 1741-7007