Wednesday, May 16, 2007

Oniguruma and Named Regexes in Ruby 1.8

Unlike Python and C# (and probably others) Ruby 1.8 does not support named groups. This is available in 1.9 though. Or so I hear.

What are named groups? Basically, they let you reference the results of the match object by the nice names you embedded in the regex. Since Blogger hates anything with <>'s, you can see an example here. Look near the synopsis.
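For a rough idea of what named groups look like, here is a minimal sketch. It assumes the built-in Regexp of Ruby 1.9 or later, not the 1.8 GEM discussed in this post, and the names DATE_RE and parse_date are made up for illustration.

```ruby
# Named groups with Ruby's built-in Regexp (assumes Ruby 1.9 or later).
# DATE_RE and parse_date are hypothetical names, not part of any library.
DATE_RE = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/

# Returns a hash of the named captures as integers, or nil if nothing matched.
def parse_date(text)
  m = DATE_RE.match(text)
  return nil unless m
  # Each group is looked up by the name embedded in the regex,
  # instead of by position (m[1], m[2], m[3]).
  { :year => m[:year].to_i, :month => m[:month].to_i, :day => m[:day].to_i }
end

p parse_date("posted 2007-05-16")
```

The point is that the lookup code keeps working if you later add or reorder groups, which is not true of numbered groups.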

So how do you get this in 1.8.6?

Windows users have it easy: just install the win32 GEM. OSX (and, I assume, other UNIX) users need a few extra steps, and it is sort of confusing because there are lots of versions of the Oniguruma C Library to choose from (do I really want to download stuff hosted on geocities Japan?). There is also a way to install it that requires recompiling your Ruby, but that only works for 1.8.4, so I didn't bother.

So here is what I did to get it working on OSX (PPC) with 1.8.6:

1. Download 4.6.2 of the C Library. Configure with whatever prefix you are using for Ruby (I use /my) and make install

2. Download version 1.10 of the GEM Source Tarball and do the standard ruby extconf.rb dance within ext.

However, change INCFLAGS in the generated Makefile so it can find oniguruma.h.

Mine looks like this because Ruby is installed in /my:

INCFLAGS = -I. -I. -I/my/include -I/my/lib/ruby/1.8/powerpc-darwin8.9.0 -I.

If you don't do this you will get:


gcc -I/my -I. -I. -I/my/lib/ruby/1.8/powerpc-darwin8.9.0 -I. -fno-common -Wall -c oregexp.c
oregexp.c:2:23: error: oniguruma.h: No such file or directory
oregexp.c:9: error: parse error before 'regex_t'
oregexp.c:9: warning: no semicolon at end of struct or union
oregexp.c:10: warning: type defaults to 'int' in declaration of 'ORegexp'
oregexp.c:10: warning: data definition has no type or storage class
oregexp.c:15: error: parse error before '*' token
oregexp.c: In function 'oregexp_free':
oregexp.c:16: warning: implicit declaration of function 'onig_free'
oregexp.c:16: error: 'oregexp' undeclared (first use in this function)
oregexp.c:16: error: (Each undeclared identifier is reported only once
oregexp.c:16: error: for each function it appears in.)
oregexp.c: In function 'oregexp_allocate':
oregexp.c:21: error: 'oregexp' undeclared (first use in this function)
oregexp.c: At top level:
oregexp.c:27: error: parse error before '*' token
oregexp.c:27: warning: return type defaults to 'int'
oregexp.c: In function 'int2encoding':
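A quick way to avoid that wall of errors is to check, before running make, that one of the -I directories really contains oniguruma.h. The following is only a sketch: dirs_with_header is a hypothetical helper, and the flags string is just my INCFLAGS line from above.

```ruby
# Hypothetical helper: lists the -I directories in an INCFLAGS string
# that actually contain the given header file.
def dirs_with_header(incflags, header = "oniguruma.h")
  incflags.split.
    grep(/\A-I/).                            # keep only the -I flags
    map { |flag| flag.sub(/\A-I/, "") }.     # strip the -I prefix
    uniq.
    select { |dir| File.exist?(File.join(dir, header)) }
end

# My INCFLAGS line; substitute whatever your Makefile contains.
flags = "-I. -I. -I/my/include -I/my/lib/ruby/1.8/powerpc-darwin8.9.0 -I."
found = dirs_with_header(flags)
puts(found.empty? ? "oniguruma.h not found on the include path" : found.join("\n"))
```

If it reports nothing found, the compile will fail at the #include in oregexp.c, just like the log above.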


If you compile it successfully you'll see the library (oregexp.bundle, whatever that is) put in /my/lib/ruby/site_ruby/1.8/powerpc-darwin8.9.0 (or whatever your path is).

However, require 'oniguruma.rb' will still fail until you do this:

cp oniguruma.rb /my/lib/ruby/site_ruby/1.8/
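To see whether Ruby can actually find the file after the copy, you can search the load path by hand. This is a sketch only; find_in_load_path is a hypothetical helper and does nothing more than look for a plain file in each directory of $LOAD_PATH.

```ruby
# Hypothetical helper: returns the full path of the first file called
# `name` found in the given load path, or nil if there is none.
def find_in_load_path(name, load_path = $LOAD_PATH)
  load_path.
    map { |dir| File.join(dir.to_s, name) }.
    find { |path| File.file?(path) }
end

path = find_in_load_path("oniguruma.rb")
puts(path ? "found #{path}" : "oniguruma.rb is not on the load path")
```

If it prints nothing useful, the cp above went to a directory your Ruby does not search.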


But if you did get it installed, you can run the test suite:


franz-g4:/tmp root# ruby test_oniguruma.rb
4.6.2
Loaded suite test_oniguruma
Started
..............................................
Finished in 0.046221 seconds.

46 tests, 105 assertions, 0 failures, 0 errors


But of course the example from the README doesn't work (either on OSX or Windows) in typical Ruby fashion.

Simple, easy, and fun!

Tuesday, May 15, 2007

Death, Rebirth, and yet another Ruby Blog

I deinstalled (yes, FreeBSD is assimilating me) the ports version of Ruby on my 12-inch Powerbook (still the best damn laptop ever; Intel Macs may be fast but they are flimsy, hot, and noisy!) this afternoon due to all the pain of trying to get oniguruma installed just so I could do named group regular expressions like in Python (more on that later).

But sometimes it is good just to clean house and start a new page. So except for the Rails nonsense, I did pretty much everything here. You know, installing libreadline, a fresh install from the 1.8.6 tarball, etc. Except I'm starting from scratch with everything in /my.

But enough procrastination. Why the new blog? Why another Ruby blog? Well, the manifesto (oh, yeah, and that I'm spending too much time on Ruby on BlogFranz) is below. I hope you hold me (or us, should anyone else agree to the spirit, if not necessarily the letter of this effort) to it:

Back to the Basics - I happened to look at the README during the build and the non-hyped description of Ruby (obviously back before Ruby was cool) was:

Ruby is the interpreted scripting language for quick and easy object-oriented programming. It has many features to process text files and to do system management tasks (as in Perl). It is simple, straight-forward, and extensible.

Unfortunately, most of the stuff I try to do (which is simple and system-management-ish) that I could achieve in the blink of an eye in Python (because it just works and is in there) does not work (or is not built into the standard library) with Ruby. I guess everybody is doing web development in Ruby except me. Or the various OSX, Linux, FreeBSD, and OpenBSD boxes I work on with Ruby are all hopelessly broken. I assume others are in this boat as well.

Which leads me to #2.

Rails-free - This is a timewarp. We go back to 2001 when I first encountered Ruby and it was just weird and Japanese and the socket APIs were crappy, before Metasploit 3.0. Before BaseCamp (yes, my previous company actually used it for a while, and it was "all right" if a bit sluggish). Back when my wife owned the only Mac in the family. Bottom line, if it's related to Rails you won't find it here. Period. End of story. No exceptions.

Mad as hell - The "happy-ass-ness" (yes, a psychological term) of Ruby (packaging bliss, instant enlightenment? fun programming?) is maddening to me. Vomitus negro. I loathe O'Reilly Ruby blogs and in particular I don't like this guy or anyone else that does "math for fun." What you find here will be hard, critical, and hype-free. No advocacy. No fanboys. Just posts on (or references to other blogs on) how to get shit done and solve problems. A lot of this began over on BlogFranz where I talked about sloppy documentation, insecure implementations, YAML, API comparisons, and other fare that was getting way too much attention there, where I want to focus more on security, Linux, and BSDs.

JRuby - You think Ruby is bad? Try Java. But sometimes there are Java APIs you just have to use (or abuse). Jython is obsolete and what is the point of something like Groovy?!

So there you have it. I've decided to use it with UbuntuTrinux and we're using it at work. So I have to make the best of using Ruby. So let the fun begin. And yeah, I'm not really mad. Just a blog persona.