Babel Bridge
Parse Tree Class Structure
Once you are familiar with how to do basic parsing in Babel-Bridge, you'll want to do something with the results. The first step is understanding the parse tree class structure.
Defining a parser automatically generates several classes. For example:
class MyParser < BabelBridge::Parser
  rule :foo, "foo"
end
Generates:
MyParser < BabelBridge::Parser
MyParser::FooNode < BabelBridge::Node
MyParser::FooNode1 < MyParser::FooNode
FooNode was generated by the :foo rule. It inherits from the BabelBridge::Node class. FooNode1 represents the first (and only) variant of :foo. FooNode is never instantiated, but a FooNode1 is created whenever the first variant of :foo matches.
irb example:
>> MyParser.new.parse("foo").class
=> MyParser::FooNode1
You can examine the children of FooNode1 with the matches method:
>> MyParser.new.parse("foo").matches
=> ["foo"]
>> MyParser.new.parse("foo").matches[0].class
=> BabelBridge::TerminalNode
Let's do a more complex example. Below is a parser that recognizes any number of non-negative integers joined by plus signs. Note that the :add rule has two variants, which create two variant subclasses, AddNode1 and AddNode2, of the rule's parse-tree-node class AddNode.
class MyMathParser < BabelBridge::Parser
  rule :add, :number, "+", :add
  rule :add, :number
  rule :number, /[0-9]+/
end
puts MyMathParser.new.parse("34+12").inspect
Running the code above outputs:
MyMathParser::AddNode1
  MyMathParser::NumberNode1 > "34"
  "+"
  MyMathParser::AddNode2 > MyMathParser::NumberNode1 > "12"
If you inspect the classes of the child matches of the root AddNode1, you'll get:
>> MyMathParser.new.parse("34+12").matches.collect { |m| m.class }
=> [MyMathParser::NumberNode1,
    BabelBridge::TerminalNode,
    MyMathParser::AddNode2]
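To make the naming scheme concrete, here is a minimal plain-Ruby sketch of how a rule declaration could produce a rule class plus numbered variant subclasses. It does not use Babel Bridge and is not its implementation: SketchParser, GeneratedBase and MySketchParser are hypothetical names, and the pattern arguments are ignored.

```ruby
# Hypothetical stand-ins for illustration only; these are not Babel Bridge classes.
class GeneratedBase; end

class SketchParser
  # Each call to `rule` makes sure a <Name>Node class exists, then adds a
  # numbered variant subclass: <Name>Node1, <Name>Node2, ...
  # The pattern itself is ignored in this sketch.
  def self.rule(name, *_pattern)
    base_name = "#{name.to_s.capitalize}Node"
    base =
      if const_defined?(base_name, false)
        const_get(base_name, false)
      else
        const_set(base_name, Class.new(GeneratedBase))
      end
    variant_counts[name] += 1
    const_set("#{base_name}#{variant_counts[name]}", Class.new(base))
  end

  # Per-parser-class counter of how many variants each rule has so far.
  def self.variant_counts
    @variant_counts ||= Hash.new(0)
  end
end

class MySketchParser < SketchParser
  rule :add, :number, "+", :add
  rule :add, :number
  rule :number, /[0-9]+/
end

puts MySketchParser::AddNode2.superclass  # prints MySketchParser::AddNode
puts MySketchParser::AddNode.superclass   # prints GeneratedBase
```

Declaring :add twice yields AddNode1 and AddNode2 under a shared AddNode, which mirrors the hierarchy shown in the output above.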
Every rule consists of one or more pattern elements which must match in order. The index of each pattern element directly corresponds to the index of its parse-tree-node in the matches list.
There are several ways to access the child matches of a Node. All of the examples below return the parse-tree-node for the first number:
# returns the first matched pattern-element
MyMathParser.new.parse("34+12").matches[0]
# shortcut that also returns the first pattern-element
# '.matches' is optional
# Nodes implement Enumerable over their matches
MyMathParser.new.parse("34+12")[0]
# matched sub-rules can also be accessed by name
MyMathParser.new.parse("34+12").number
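The three access styles above can be imitated in plain Ruby. The following is only a sketch of the idea, not Babel Bridge's Node class: IndexedNode is a hypothetical name, its children are plain strings rather than parse-tree-nodes, and the sub-rule names are passed in by hand.

```ruby
# Hypothetical stand-in for a parse-tree node; not Babel Bridge's Node class.
class IndexedNode
  include Enumerable

  attr_reader :matches

  # matches: child matches in pattern order
  # names:   the sub-rule name for each child (nil for a literal), same order
  def initialize(matches, names)
    @matches = matches
    names.each_with_index do |name, index|
      # Skip literals, and keep the first child when a name repeats.
      next if name.nil? || respond_to?(name)
      define_singleton_method(name) { @matches[index] }
    end
  end

  # node[i] is shorthand for node.matches[i]
  def [](index)
    matches[index]
  end

  # Enumerable only requires #each
  def each(&block)
    matches.each(&block)
  end
end

node = IndexedNode.new(["34", "+", "12"], [:number, nil, :add])
puts node.matches[0]  # prints 34
puts node[0]          # prints 34
puts node.number      # prints 34
puts node.first       # prints 34 (Enumerable#first)
```

Because each child sits at the same index as its pattern element, index access and name access both resolve to the same object.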
Adding Functionality to the Parse Tree
Manually walking the parse tree is nice and all, but things really start to get fun when we add methods to the rule-variant parse-tree-nodes. This is done by adding a Ruby do-block to the end of a rule declaration. Inside this do-block you can add anything you want to that rule variant's class definition.
Example:
class MyMathParser < BabelBridge::Parser
  rule :add, :number, "+", :add do
    def result
      number.result + add.result
    end
  end
  rule :add, :number
  rule :number, /[0-9]+/ do
    def result
      to_s.to_i
    end
  end
end
puts MyMathParser.new.parse("34+12").result
# outputs "46"
There is a little bit of magic going on here. First, for the first variant of :add (AddNode1), we define a method "result". The result is just the sum of the results of the left- and right-hand sides of the add operator. We can access the sub-matched parse-tree-nodes by their rule names, in this case "number" and "add". Then we just recursively call "result" on them and add their return values.
The second bit of magic is in :number's "result" method, where we call to_s on self. The to_s method on a Node just returns the string of characters that the rule matched. In this case, a string of digits is returned, and calling to_i on it gives us the integer value.
The last bit of magic is that we never define a "result" method for the second variant of :add (AddNode2). By convention, if a Node doesn't know how to respond to a method, it forwards the method call to its first sub-match. In this case, calling "result" on AddNode2 automatically calls "result" on the sub-matched NumberNode1.
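The forwarding convention can be sketched in plain Ruby with method_missing. This is an assumption about how such forwarding might be built, not a description of Babel Bridge's internals. ForwardingNode, NumberSketch and SingleNumberSketch are hypothetical names.

```ruby
# Hypothetical stand-ins; not Babel Bridge's actual implementation.
class ForwardingNode
  attr_reader :matches

  def initialize(*matches)
    @matches = matches
  end

  # Forward an unknown method to the first sub-match if it can handle it;
  # otherwise fall back to the normal NoMethodError.
  def method_missing(name, *args, &block)
    first = matches.first
    if !first.nil? && first.respond_to?(name)
      first.public_send(name, *args, &block)
    else
      super
    end
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    first = matches.first
    (!first.nil? && first.respond_to?(name, include_private)) || super
  end
end

# Plays the role of NumberNode1: it defines its own `result`.
class NumberSketch < ForwardingNode
  def initialize(text)
    super()
    @text = text
  end

  def result
    @text.to_i
  end
end

# Plays the role of AddNode2: it defines no `result` of its own.
class SingleNumberSketch < ForwardingNode; end

wrapper = SingleNumberSketch.new(NumberSketch.new("12"))
puts wrapper.result                # prints 12
puts wrapper.respond_to?(:result)  # prints true
```

Defining respond_to_missing? alongside method_missing keeps respond_to? truthful about the forwarded methods, which is the usual Ruby practice for this pattern.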