September 17, 2007

Using Ruby instead of Awk/SED/Grep/SH/Perl

I have recently been wondering whether Ruby is a better language for script kiddie stuff, the kind of stuff I use awk, sed, grep, csh, and sh to accomplish. Some crazy people use Perl to do this, but some folks suggest that Ruby is good for it too. This is a very different application from Rails - it is Ruby as super-Perl.

I read on the Internet somewhere that Matz had a plan to make a better and more consistent Perl when he started Ruby, and you can see that in its design and in what it has out of the box.

I recently decided to switch to the iPhone (previous post) and had to switch from Meeting Maker to iCal / iPhone / Google Calendar. But all my contacts were in Meeting Maker - what to do?

Meeting Maker has a text export of the contacts - the file is tab-delimited with DOS-style line ends. The Apple Address Book takes vCard format.
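To make the two formats concrete, here is a minimal sketch of turning one tab-delimited row into a vCard 3.0 entry. The helper name row_to_vcard and the three-column layout (first name, last name, email) are made up for illustration - the real export has many more columns.

```ruby
# Hypothetical helper: convert one tab-delimited row into a minimal vCard.
# Assumed column order for this sketch only: first name, last name, email.
def row_to_vcard(row)
  fields = row.split("\t", -1).map { |f| f.strip }
  first = fields[0].to_s   # to_s turns a missing column (nil) into ""
  last  = fields[1].to_s
  email = fields[2].to_s

  card = []
  card << "BEGIN:VCARD"
  card << "VERSION:3.0"
  card << "N:#{last};#{first};;;"
  card << "FN:#{[first, last].reject { |s| s.empty? }.join(' ')}"
  card << "EMAIL;type=INTERNET:#{email}" unless email.empty?
  card << "END:VCARD"
  card.join("\n") + "\n"
end

puts row_to_vcard("Ada\tLovelace\tada@example.com")
```

The N line is "last;first" followed by empty slots for the other name parts, while FN is the display name as a person would read it.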

Usually faced with this, I write some sed, awk, and shell script stuff - but this looked a little too nasty for that. Usually when I am writing multiline awk programs in separate files to get the job done, I realize that I have gone too far. Often I fall into something like Java at that point and kind of have to start over - but ultimately I get it done in Java feeling kind of bad about it.

So this was the perfect high-motivation situation for me to try Ruby on the command line.

Short story is that I really liked it - it is simple but powerful. Want to make an object? Do it. Want to split lines? Do it. Want to read the whole file into a string and break it into an array of strings based on CTRL-M? Do it.
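As a rough sketch of what I mean (not the script itself), here is one way to make a small object, split text into lines whatever the line ending is, and split each line on tabs. The names Contact, split_lines, and parse_contacts are just for this example, and the two-column sample data is invented.

```ruby
# A tiny value object with two fields
Contact = Struct.new(:first, :last)

# Split text into lines regardless of line-ending convention:
# "\r\n" (DOS), "\r" (CTRL-M only), or "\n" (Unix).
def split_lines(text)
  text.split(/\r\n|\r|\n/)
end

# Turn tab-delimited text into an array of Contact objects
def parse_contacts(text)
  split_lines(text).map do |line|
    first, last = line.split("\t")
    Contact.new(first.to_s, last.to_s)
  end
end

contacts = parse_contacts("Ada\tLovelace\rGrace\tHopper\r")
contacts.each { |c| puts "#{c.last}, #{c.first}" }
```

To run this over a real file you could pass STDIN.read to parse_contacts instead of the sample string.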

All in all I ended up with a pretty nice Meeting Maker to VCard conversion script. I will likely clean it up, add some doc, and then release it - I include my basic version below.

#!/usr/bin/ruby

# how many columns we expect
MAX = 25

FIRST = 0
LAST = 1
TITLE = 4
COMPANY = 6 
EMAIL = 12
HOME = 13
WORK = 14
CELL = 15 
NOTES = 25

# These are assumed to be contiguous and in the 
# right order, see below
ADDR = 7
CITY = 8
STATE = 9
ZIP = 10
COUNTRY = 11

records = []
lines = []

#depending on line end - this will either get the
# whole file or the first line
firstline = gets
if firstline
  secondline = gets
  if secondline
    # puts "Reading lines from input ..."
    lines << firstline
    lines << secondline
    while gets
      lines << $_
    end
  else
    # split the lines based on carriage return (CTRL-M)
    lines = firstline.split("\015")
    # puts "Split lines based on newline" 
  end
  # puts "Lines read: " + lines.size.to_s
else
  puts "No Input - nothing to process"
end

# Ignore the first three lines - 
# the first is a blank line
# the second is "Contacts" and 
# the third is the Column headings
3.upto(lines.size - 1 ) { |recpos|
  # puts recpos.to_s + lines[recpos].to_s
  record = lines[recpos].split("\011")
  # puts record.size

  # Set all nils to empty string to simplify code below
  0.upto(MAX) { |j| 
    if record[j] == nil
      record[j] = "" 
    end
    # Double quotes so that "\n" is a real newline, not backslash-n
    record[j] = record[j].gsub("\n", ' ')
    record[j] = record[j].strip
    # print j, " ", record[j], "\n"
  }

  # Insist on at least a first or last name
  fname = record[FIRST]
  lname = record[LAST]

  # no need to continue...
  if fname.empty? and lname.empty?
    # puts "Skipping " + recpos.to_s
    next
  end

  print "BEGIN:VCARD\n"
  print "VERSION:3.0\n"

  # "FN:AAA First AAA Last\n"
  print "N:"
  print lname unless lname.empty?
  print ";"
  print fname unless fname.empty?
  print ";;;\n"
  print "FN:"
  print fname unless fname.empty? 
  print " " unless ( fname.empty? || lname.empty? )
  print lname unless lname.empty? 
  print "\n";
  
  unless record[COMPANY].empty?
    print "ORG:"+record[COMPANY]+";\n"
  end
  unless record[TITLE].empty?
    print "TITLE:"+record[TITLE]+";\n"
  end
  
  # print "EMAIL;type=INTERNET;type=WORK;type=pref:email@work.com\n"
  unless record[EMAIL].empty?
    print "EMAIL;type=INTERNET;type=WORK;type=pref:"+record[EMAIL]+"\n"
  end

  # print "TEL;type=WORK;type=pref:734-work\n"
  unless record[WORK].empty?
    print "TEL;type=WORK;type=pref:"+record[WORK]+"\n"
  end
  unless record[HOME].empty?
    print "TEL;type=HOME;type=pref:"+record[HOME]+"\n"
  end
  unless record[CELL].empty?
    print "TEL;type=CELL;type=pref:"+record[CELL]+"\n"
  end

  # Trailing && tells Ruby the condition continues on the next line
  unless record[ADDR].empty? && record[CITY].empty? &&
         record[STATE].empty? && record[ZIP].empty? &&
         record[COUNTRY].empty?
    print "item1.ADR;type=WORK;type=pref:;;"
    # Assume they are contiguous
    ADDR.upto(COUNTRY) { |pos|
      field = record[pos]
      print field unless field.empty?
      print ";"
    }
    print "\n"
  end
  print "END:VCARD\n"
}
Posted by csev at September 17, 2007 12:24 PM