Monthly Archives: November 2008

Visualizing Sakai Commit Data

I am preparing for my Data Mining/Social Networking course next semester (SI301) and of course the main project will be about analyzing Sakai activities as a developer community and social community. Of course visualization will be a big part of the course activity.

So I got a head start and played with some data from commit logs. Because I was not aware of the XML output from svn log --xml (thanks Seth) – I did this by screen scraping ViewSVN at source.sakaiproject.org. I scraped all the commits in the main SVN and Contrib into some SQLite3 databases. The databases allowed me to restart the process if it croaked. It took about a day and I ended up with 540MB of data. Then I ran a process to parse and categorize the data – that process read all 540MB and produced a nice, normalized database of 3MB in about 20 minutes.
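For anyone who wants to skip the screen scraping, here is a rough sketch of what reading the XML output might look like in Python. This is not the code I actually ran. The sample log is hand-written in the shape that svn log --xml emits, the authors and messages are made up, and parse_svn_log is just a name I picked for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape `svn log --xml` emits; the authors,
# dates, and messages here are made up for illustration.
SAMPLE_LOG = """<?xml version="1.0"?>
<log>
<logentry revision="101">
<author>alice</author>
<date>2008-11-01T14:03:22.000000Z</date>
<msg>Fix typo in tool registration</msg>
</logentry>
<logentry revision="102">
<author>bob</author>
<date>2008-11-02T09:15:00.000000Z</date>
<msg>Add LTI launch support</msg>
</logentry>
</log>
"""


def parse_svn_log(xml_text):
    """Return a list of (revision, author, date, message) tuples."""
    commits = []
    root = ET.fromstring(xml_text)
    for entry in root.iter("logentry"):
        commits.append((
            int(entry.get("revision")),
            # Some entries can lack an author or message, so default to "".
            entry.findtext("author", default=""),
            entry.findtext("date", default=""),
            entry.findtext("msg", default=""),
        ))
    return commits


commits = parse_svn_log(SAMPLE_LOG)
for revision, author, date, _msg in commits:
    # Print the revision, who committed it, and the day (first 10 chars).
    print(revision, author, date[:10])
```

In practice the XML text would come from saving the output of svn log --xml to a file and reading it in, instead of the inline sample.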

After I had my nice 3MB database, I wrote some Python+SQLite3 scripts to grok and accumulate the data in various ways. The work is still in draft form.
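To give a flavor of the accumulation step, here is a minimal sketch of counting commits per author per month and keeping a running total. The commits table, its columns, and the sample rows are all assumptions made up for this example; the real normalized database is laid out differently.

```python
import sqlite3

# Hypothetical schema and made-up rows, only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE commits (author TEXT, committed_on TEXT, repo TEXT)")
conn.executemany("INSERT INTO commits VALUES (?, ?, ?)", [
    ("alice", "2008-09-03", "main"),
    ("alice", "2008-09-20", "contrib"),
    ("bob", "2008-09-11", "main"),
    ("alice", "2008-10-02", "main"),
    ("bob", "2008-10-15", "contrib"),
    ("bob", "2008-10-28", "contrib"),
])

# Let SQLite do the per-author, per-month counting (month = "YYYY-MM").
monthly = conn.execute("""
    SELECT author, substr(committed_on, 1, 7) AS month, COUNT(*) AS n
    FROM commits
    GROUP BY author, month
    ORDER BY month, author
""").fetchall()

# Accumulate a running total per author in Python.
running = {}
cumulative = []
for author, month, n in monthly:
    running[author] = running.get(author, 0) + n
    cumulative.append((month, author, n, running[author]))

for row in cumulative:
    print(row)
```

Rows of (month, author, monthly count, running total) are roughly the shape a movement-style graph wants: one row per person per time step.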

http://www-personal.umich.edu/~csev/sakai/data/

There are five graphs – the most fun one is the movement graph including both the main SVN and Contrib. To make it visually more fun, follow these hints: (1) change from “Same Color” to “Unique Colors”, (2) change from “Same Size” to “Cumulative”, (3) experiment with switching from Linear to Log scale to make it more exciting, (4) hover over a data point to see who it represents, and (5) click on a data point and then replay the data to make it have a “trail”.


This is all pretty cool and it uses the Google Visualization API which does all the work in Flash/JavaScript.

Wow – An Index really matters

I am doing some spidering using Python and SQLite 3. I made the spider restartable by storing everything it retrieves in the database. To make sure I did not re-retrieve a page after a restart, I was doing the following select to check whether a url was already stored:
select text from revisions where url = ? limit 0,1
But after there were about 50000 pages – this started to slow down a lot. It was doing a full scan. So I stopped the process and added an index:
create unique index revidx on revisions (url) ;
Wow – it is so much faster. Nothing like a ton of data to remind you of the speed difference between Order(N) and Order(Log N).
I am liking Python and its built-in support for SQLite3. Nice.
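If you want to see the difference for yourself, here is a small sketch that asks SQLite how it plans to run the lookup before and after the index is created. The revisions table, the select, and the revidx index are the ones from above; the example.com urls, the page text, and the helper names query_plan and already_fetched are made up for the demonstration.

```python
import sqlite3

# In-memory stand-in for the spider database, filled with made-up urls.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revisions (url TEXT, text TEXT)")
conn.executemany(
    "INSERT INTO revisions (url, text) VALUES (?, ?)",
    ((f"http://example.com/rev/{i}", f"page {i}") for i in range(50000)),
)

LOOKUP = "SELECT text FROM revisions WHERE url = ? LIMIT 0,1"


def query_plan(conn, url):
    """Return SQLite's description of how it will run the lookup."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + LOOKUP, (url,)).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


def already_fetched(conn, url):
    """True if the spider has already stored this url."""
    return conn.execute(LOOKUP, (url,)).fetchone() is not None


target = "http://example.com/rev/49999"

plan_before = query_plan(conn, target)   # expect a full table scan
conn.execute("CREATE UNIQUE INDEX revidx ON revisions (url)")
plan_after = query_plan(conn, target)    # expect a search using revidx

print("before:", plan_before)
print("after: ", plan_after)
print("already fetched?", already_fetched(conn, target))
```

The exact wording of the plan text varies between SQLite versions, but the first plan should mention a scan of the table and the second should mention the revidx index. A nice side effect of making the index unique is that an accidental second insert of the same url fails instead of silently storing a duplicate.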

Laptop Serendipity

This is a Shaggy Dog story about my Mac Air. I have been doing some Sakai coding lately (adding LTI to Melete) so I have been carrying both my MacBook Pro and MacAir in my backpack this week. The MBP is just *so much faster* when doing big stuff that it was worth carrying around two laptops. So my backpack weighs differently than usual this week.
This is a short and stressful week – My SI539 Assignment 10 (true to form) missed a few essential steps at the beginning so I had a lot to do in lab Tuesday night running around – figuring out how students got confused by my instructions and figuring out how to un-confuse things for the whole class as quickly as possible. I foolishly sent out some code and by mistake sent out a partial answer key. And I did not have slides ready for Wednesday morning’s lecture (10 short hours away) – so I wanted to get out of lab on time – but I needed to stay.
All in all Tuesday evening was pretty frazzling. So when I put my Mac Air laptop down in the third row working with one student and then ran off to help several other students – I quickly forgot about the laptop. As folks got a little caught up – and the last bus was leaving – the room started clearing out. I rushed to the front, grabbed the backpack – got dressed and took off. The backpack *felt* pretty good. It was heavy and felt about right. Of course the Air is so light – weight is a *bad* way to judge whether or not you have what you need.
Of course I left the Air sitting on a desk in the third row.
About 20 minutes later cruising down the highway – I get an e-mail on my iPhone from Lisa – the last student left in the lab – telling me that I left my laptop! Of course! I quickly told her to grab it and bring it to class – but she never got the message.
She was unable to decide whether to take it or not – she thought if she took it – I would be rushing back and it would be gone – if she did not take it – of course that is a “not in possession of a known person” problem.
I decided not to turn around since I was 1/3 of the way home and it was 11PM and I had to be back at 7AM anyways for the next day’s lecture – I figured that either Lisa took the laptop or it would be there at the crack of dawn when I got in.
Of course between 11PM and 7AM I had to do some slide preparation. Here is where I lucked out – I had done a full Time Machine backup at 6AM that very same day. I plugged the Time Machine disk into my MacBook Pro – and Voilà! I could get my partially worked on slides onto the MBP. So I worked on slides that night before going to bed at 1:30AM. Then I got up at 5:00 AM and worked on slides some more and at 6AM I started to drive back to UM.
Up to this point – no mail from Lisa – so it was “rush into the lab” and hunt for the Laptop. The library was not open until 8AM. So I waited until it opened and rushed in (with the 2 other 8AM library patrons) and rushed up to the lab.
Of course – no laptop. Not so good.
I figured – not much could be done – so I pulled out my MacBook and started finishing the last 15 slides or so aiming for the 9AM lecture. I sat alone in the lab working on slides in the partial darkness for about 25 minutes.
Then the door opened and the janitor popped in with a vacuum cleaner. He says, “Sorry – usually there is no one here at this time in the morning – I hope I don’t bother you.” I said, “Don’t worry – I am only here because I am depressed because I stupidly left my laptop here last night and it is not here.”
He said, “That’s funny – I found a laptop here last night. The guy who is missing the laptop must be pretty bummed – it is one of those fancy Apple ones and it has a bunch of stickers on it.”
So I said “Where would the laptop be now?” He said he gives that kind of stuff to his boss – and if I hurry down to the front desk – they might not have turned it over to the police yet. He said the police are a little slow at 8AM.
I said, “Boy I am glad you had to vacuum this room this morning.” He said, “You just must have good Karma.” I agree that it never hurts to have a bit of spare Karma around for situations just like this.
So I thank him profusely and rush down to the front desk. There is my pride and joy on the back counter – I tell them that it is mine. They ask me to identify it – Duh – they were kidding. Their tech support guy had looked around on the desktop and realized it was my computer. So with a flash of my UM id, I had my baby back! And it was still 8:20 – time enough to get a $2.00 latte, knock out 5-6 more slides and make lecture at 9:10.
At lecture – I ran into Lisa and she shared with me that she just could not decide if she should have taken it or not. I told her that I generally prefer “in possession of a known human” versus “sitting on a desk in the open” any time.
After lecture I caught up on my e-mail and had a nice message from the library staff telling me they had my laptop. Since I had already picked it up – I just sent a nice thank you note.
All is well that ends well. Maybe I should put a password on the system. Hmmm.

Mission: AC/DC

A few months back, I promised Brent that we would go see AC/DC. I kind of figure that this is likely the last chance to see them touring. Unfortunately they were in Detroit on November 5 and I was away at an IMS meeting. There were several other options – there was a Saturday date in Chicago and some week day dates in Cleveland and Columbus. I teach during the week so the weekday dates were impractical.
The only other weekend dates were in Los Angeles at the Forum. So Brent and I will be off to Los Angeles to see AC/DC on December 6 at the Forum. He will miss one day of school (Monday). The seats are not so great – I got them from StubHub at about 2.5X face value – the tickets are pretty pricey but this probably is the last opportunity for another 5-10 years – so what the heck. Christmas comes a little early.

I Want This Right Now: Google Friend Connect

This is the coolest thing since sliced bread. It is a set of OpenSocial Widgets that you can add to your site that makes your site “instantly social”. Take a look:
http://www.google.com/friendconnect/
I want this SOOOO badly. Because I am writing all these widgets as we speak. And my stuff is not OpenSocial enabled – and the Google UIs are much better than mine – makes me sad to be wasting time – but until I have access to these awesome widgets – I simply cannot wait and have to keep coding. Grrr. Sakai 3.0 needs these as part of its initial release.
There is a great video:

And a cool sample site:
http://www.myfirstearthquake.com/index_fc.php
I have sent a note to Google pleading for early access.

Book Proposal: Google Application Engine – Up and Running

As I have been teaching SI502 (www.si502.com) and SI539 (www.si539.com) I have been accumulating a bunch of lecture and written material about the Google Application Engine. It started to very much feel like a book that should have a pretty good market if it could be done quickly.
So I called up my buddy Michael Loukides at O’Reilly and pitched a book proposal to him. Half of the book already has first draft written materials from SI539 and the other half of the book has well-developed lecture materials from a combination of SI539 and SI502.
Here is my book proposal for you to look at if you find it interesting:
http://www.appenginelearn.com/chapters/toc.htm
The proposal has links to existing printed and lecture material for all of the proposed chapters.
I hope and expect to be able to keep the first drafts of the AppEngine chapters in their current form under Creative Commons Attribution License. Once I sign a contract with O’Reilly, further work will be according to their terms and conditions. Of course ORA is pretty cool about giving books back to me under CC after they are out of print – another reason I like ORA as a publisher.
As always, comments are welcome. I may be looking for reviewers if the book moves forward.

Pedagogy: Note to Self – Never Violate this Rule Again!

Big mistake giving out the grade assignment and asking students to turn it into a login assignment. DOH! There are so many unintended consequences! It is better to hand out a complex chapter with bits and pieces scattered throughout and have the students construct the application by typing in little snippets of code and getting them right – working through little issues that naturally come up.
Far more learning happens as students attempt something and backtrack and re-read/re-attempt stuff.
The next 539 assignment will be different. I have to send a note to my future self to make sure I don’t make this mistake over and over each semester.

Neat Keynote on Google Search from Google I/O

This is a keynote at the Google I/O developers conference by Marissa Mayer, VP of Search and User Experience. I was there at Google I/O and really loved the talk:
Keynote: Imagination, Immediacy, and Innovation… and a little glimpse under the hood at Google
She talks about user studies and the design of the search page – seemingly simple stuff – but a lot of thought goes into things.

Learning Objectives .versus. learning objectives

Jeff MacKie-Mason is our Dean of Academic Affairs at the UM School of Information and my boss. He is a cool dean and I really like his style – he has an agenda which he moves forward firmly, but he takes a really easygoing approach to things and always takes input along the way.
Recently we adopted a policy about requiring Learning Objectives in all syllabuses starting Winter 2009 – and in Dean 2.0 fashion – Jeff used his blog to host an open discussion about the policy.
Here is Jeff’s original blog post on the topic.
I tried and tried to resist commenting – but a lazy Saturday morning, a free keyboard, a fresh and warm cup of coffee, and the fact that it is really too cold outside to do anything but write blog posts – and I was hooked.
I reproduce my comments below (which will likely get me in trouble – but what the heck – Jeff asked…)

Continue reading