Release date: 2001-10-26
Available from: http://rockit.sf.net
Author: Robert Feldt, firstname.lastname@example.org
README version: $Id: README,v 1.11 2001/10/17 02:04:19 feldt Exp $
Project page: http://sourceforge.net/projects/rockit
An easy-to-use, object-oriented compiler construction toolkit written in
and generating code for Ruby. Currently focusing on the "front-end"
phases of compiler construction.
Main features of rockit:
- Grammars written in Extended Backus-Naur Form. (=> use *, ? and + ops).
- Generates both lexer and parser.
- Parsers will return abstract syntax trees (AST).
- Generated ASTs support simple tree-walking using iterators.
- "Ruby-friendly" with for example Arrays for repetition, Regexps
for tokens etc.
- More advanced parser than yacc's LALR(1) (If you're curious, it's called
"Generalized LR parsing with scanner forking"!)
- ASTs can be dumped to PostScript (if you have graphviz/dot)
- Reports when the grammar is ambiguous and shows the alternative ways
to parse the sentence. Helps you resolve ambiguities.
- Associativity and precedence can be specified based on productions/rules
in the grammar (NOT on operators which is less "portable").
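To give a feel for what "tree-walking using iterators" can look like in Ruby, here is a minimal sketch. The Node class, its name and children accessors, and the pre-order traversal are all assumptions made for illustration; they are not rockit's actual AST classes or API.

```ruby
# Hypothetical AST node for illustration only (NOT rockit's real class).
class Node
  include Enumerable

  attr_reader :name, :children

  def initialize(name, *children)
    @name = name
    @children = children
  end

  # Pre-order walk: yield this node first, then recurse into child nodes.
  # Children that are not Nodes (plain strings, numbers) are skipped.
  def each(&block)
    return enum_for(:each) unless block
    block.call(self)
    children.each { |child| child.each(&block) if child.is_a?(Node) }
    self
  end
end

# Hand-built tree for "1 + 2 * 3".
tree = Node.new(:Plus,
                Node.new(:Number, "1"),
                Node.new(:Mul, Node.new(:Number, "2"), Node.new(:Number, "3")))

# Because Node mixes in Enumerable, map/select/etc. come for free.
puts tree.map(&:name).inspect
puts tree.select { |n| n.name == :Number }.map { |n| n.children.first }.inspect
```

Defining a single each method and mixing in Enumerable is the usual Ruby way to get iterator-style traversal, which is why it is used in this sketch.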
- unpack tarball (if you haven't already)
- install: ruby install.rb
- If you've got RubyUnit you can also run tests: ruby tests/runtests.rb
Yes, but they (racc and rbison) both use the bison/yacc format which, IMHO,
is not optimal for an OO language like Ruby. You also have
to write the "action code" (the one to be executed for each production)
by hand. This is sometimes a good thing if you simply want to extract some
info; but for general use you probably want to make several passes over the
result from the parse (which will likely be an AST). Rather than making you
write the code for building the AST, rockit does it for you.
- No need to write a lexer/scanner; rockit gives you both a lexer and a
parser from the same grammar.
- No need to write standard code for building an abstract syntax tree;
rockit automatically generates it and you can specify how the tree should
- rockit-generated parsers build the AST; NO need to write "action code"
in the grammar. "Action code" is kept separate from the grammar.
- More powerful operators.
- You can write grammars directly in Ruby code.
- Rockit will show you why your grammar is ambiguous (if it is!) by showing
you the two ways the sentence can be parsed. This helps you resolve
the ambiguity by introducing priorities.
In the longer term rockit will include components that are typically not part
of compiler compilers like yacc and bison, for example syntax-directed
translation and pretty-printer generation.
$ rockit my_grammar.grammar myparser.rb MyModule my_parser
Generated parser for my_grammar.grammar and saved it in myparser.rb. Use it by
ast = MyModule.my_parser.parse "..."
parser = Parse.generate_parser <<-'END_OF_GRAMMAR'
Blank = /\s+/ [:Skip]
Number = /\d+/
Expr -> Number [^]
| Expr '+' Expr [Plus: left,_,right]
| Expr '-' Expr [Minus: left,_,right]
| Expr '*' Expr [Mul: left,_,right]
| Expr '/' Expr [Div: left,_,right]
| '(' Expr ')' [^: _,expr,_]
left(Plus), left(Minus), left(Mul), left(Div)
Div = Mul > Plus = Minus
calc_eval(ast.left) + calc_eval(ast.right)
calc_eval(ast.left) - calc_eval(ast.right)
calc_eval(ast.left) * calc_eval(ast.right)
calc_eval(ast.left) / calc_eval(ast.right)
calc_eval(parser.parse '(4*((2+6)-3))/2') # => 10
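The evaluation step can also be tried without rockit installed. The following is a minimal sketch under stated assumptions: the node classes (Num, Plus, Minus, Mul, Div) are hypothetical Structs standing in for whatever the generated parser returns, and the tree is built by hand rather than parsed.

```ruby
# Hypothetical node classes standing in for a generated AST (NOT rockit's).
Num   = Struct.new(:value)
Plus  = Struct.new(:left, :right)
Minus = Struct.new(:left, :right)
Mul   = Struct.new(:left, :right)
Div   = Struct.new(:left, :right)

# Recursively evaluate a calculator AST by dispatching on the node class.
def calc_eval(ast)
  case ast
  when Num   then ast.value
  when Plus  then calc_eval(ast.left) + calc_eval(ast.right)
  when Minus then calc_eval(ast.left) - calc_eval(ast.right)
  when Mul   then calc_eval(ast.left) * calc_eval(ast.right)
  when Div   then calc_eval(ast.left) / calc_eval(ast.right)
  else raise ArgumentError, "unknown node: #{ast.inspect}"
  end
end

# Hand-built tree for '(4*((2+6)-3))/2'.
tree = Div.new(
  Mul.new(Num.new(4),
          Minus.new(Plus.new(Num.new(2), Num.new(6)), Num.new(3))),
  Num.new(2)
)

puts calc_eval(tree)  # prints 10
```

Note that the parentheses from the input leave no trace in the tree: grouping is expressed purely by nesting, so the evaluator never needs to know about them.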
Memoize, available from RAA (or my Ruby page), gives a slight performance
increase, but rockit should work without it. Please mail me if it doesn't!
Otherwise it should work with any Ruby >= 1.6. If you've got strscan by Minero
Aoki installed it will be used and give a slight performance increase.
But things work even if you haven't.
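The "use it if installed, work anyway if not" behaviour described above is commonly written in Ruby as a guarded require. The sketch below illustrates that pattern only; it is not taken from rockit's sources, and the HAVE_STRSCAN constant and the numbers* helper methods are hypothetical names.

```ruby
# Try to load the optional library; remember whether it worked.
begin
  require 'strscan'
  HAVE_STRSCAN = true
rescue LoadError
  HAVE_STRSCAN = false
end

# Fast path: pull number lexemes out of text with StringScanner.
def numbers_with_strscan(text)
  scanner = StringScanner.new(text)
  found = []
  until scanner.eos?
    if scanner.scan(/\s+/)
      next                                   # skip blanks
    elsif (lexeme = scanner.scan(/\d+/))
      found << lexeme
    else
      raise ArgumentError, "unexpected input at #{scanner.pos}"
    end
  end
  found
end

# Fallback: same result using Regexp#match with \G anchored at pos.
def numbers_plain(text)
  found = []
  pos = 0
  while pos < text.size
    if (m = /\G\s+/.match(text, pos))
      pos = m.end(0)                         # skip blanks
    elsif (m = /\G\d+/.match(text, pos))
      found << m[0]
      pos = m.end(0)
    else
      raise ArgumentError, "unexpected input at #{pos}"
    end
  end
  found
end

# Callers use this entry point and never care which path is taken.
def numbers(text)
  HAVE_STRSCAN ? numbers_with_strscan(text) : numbers_plain(text)
end

puts numbers("12 7  305").inspect
```

Keeping both code paths behind one method means the optional dependency only affects speed, never results.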
I've successfully used rockit with Ruby 1.7.1 (2001-09-20) and
cygwin 1.1.8 (gcc version 2.95.2-6) on Windows 2000 Professional. People
have also reported that it works with 1.6.5.
NOTE THAT THIS IS AN ALPHA RELEASE SO THERE WILL LIKELY BE BUGS AND
THE API WILL LIKELY CHANGE.
RubyUnit is needed to run unit tests.
Not much yet. Check out the examples in the examples dir.
You can get a good intro to writing grammars by looking at the grammar
for rockit grammars. It's in lib and called 'rockit-grammar-files.grammar'.
You can also compare it to the grammar in bootstrap.rb which is (almost)
the same grammar but written directly in Ruby code.
Also check out the tests. Lots of good info and examples in there.
There is some stuff in the examples dir:
rockit is copyright (c) 2001 Robert Feldt, email@example.com.
All rights reserved.
rockit is distributed under LGPL. See LICENSE and COPYING-LESSER.
Parsers you generate are LGPL so should not restrict you. If it does please
Rockit is currently:
SLOW! Both when generating the parser and when parsing.
I haven't given performance much thought yet and haven't profiled
so expect significant performance gains when we get to this issue on
BAD AT HANDLING AND REPORTING ERRORS! Will be fixed when someone shows me
"the/a right way" to do it.
If you're installing from sources grabbed by CVS you should remove the file
"lib/rockit_grammars_parser.rb" before running "ruby install.rb".
Lots of stuff, see TODO.
I'd appreciate it if you dropped me a note if you're successfully using
rockit. If there are some known users I'll be more motivated to pack
up additions / new versions and post them to RAA.
Please give feedback!
(You don't need to understand this to use rockit but if you're interested
you might learn something about parsing...)
It is a pseudo-parallel parsing algorithm which runs a dynamically varying
number of LR parsers in parallel. LR parser generators, such as yacc and
bison, generate a table with parsing actions. If there is an ambiguity in the
language, or if the generation technique used introduces ambiguities because
it's "imperfect", there are multiple actions in some position(s) in the table.
In ordinary (yacc-style) LALR(1) parsing these are called conflicts and must
be resolved by rewriting the grammar or introducing associativity and
precedence rules, since the parser must take one and only one action.

In generalized LR parsing all actions are taken, by spawning off a parser for
each one of them. So if the ambiguity arose not because of the grammar but
because of the limitations of the parser generator, all but the parser taking
the correct action will fail. And if there are multiple ways to parse the
sequence they will all be found! This procedure incurs a performance penalty
while parsing, but it can be overcome by clever encodings of the different
parsers and their data.
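The forking idea can be illustrated with a toy. The sketch below is NOT rockit's algorithm and uses no LR table: it is a naive nondeterministic shift-reduce parser that forks a new parser for every applicable action and lets the wrong guesses die off. The grammar (E -> num | E minus E), the RULES constant, the Fork struct and parse_all are all hypothetical names chosen for this example.

```ruby
# Toy ambiguous grammar (hypothetical): E -> :num | E :minus E
RULES = [
  [:E, [:num]],
  [:E, [:E, :minus, :E]]
].freeze

# One "parser": a stack of [symbol, tree] pairs plus an input position.
Fork = Struct.new(:stack, :pos)

# Run all parsers in lock-step; return every complete parse tree.
def parse_all(tokens)
  live = [Fork.new([], 0)]
  accepted = []
  until live.empty?
    spawned = []
    live.each do |cfg|
      symbols = cfg.stack.map(&:first)

      # Accept: all input consumed and the stack is exactly the start symbol.
      accepted << cfg.stack[0][1] if cfg.pos == tokens.size && symbols == [:E]

      # Fork one parser per applicable reduce action.
      RULES.each do |lhs, rhs|
        next unless symbols.last(rhs.size) == rhs
        kids = cfg.stack.last(rhs.size).map(&:last)
        tree = rhs.size == 1 ? kids[0] : kids
        spawned << Fork.new(cfg.stack[0...-rhs.size] + [[lhs, tree]], cfg.pos)
      end

      # Fork one parser for the shift action, if input remains.
      if cfg.pos < tokens.size
        type, value = tokens[cfg.pos]
        spawned << Fork.new(cfg.stack + [[type, value]], cfg.pos + 1)
      end
      # A parser with no applicable action spawns nothing and so "fails".
    end
    live = spawned
  end
  accepted
end

# "1 - 2 - 3" can be read as (1-2)-3 or 1-(2-3); both trees are found.
tokens = [[:num, 1], [:minus, "-"], [:num, 2], [:minus, "-"], [:num, 3]]
parse_all(tokens).each { |tree| puts tree.inspect }
```

A real generalized LR parser consults the LR table so that it only forks where the table has conflicts, and shares stack prefixes between parsers; this sketch forks blindly and copies stacks, which keeps it short but makes it exponentially slower on long inputs.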
Robert Feldt, firstname.lastname@example.org