{"id":571,"date":"2008-11-30T21:57:25","date_gmt":"2008-12-01T01:57:25","guid":{"rendered":"http:\/\/www.dr-chuck.com\/wordpress\/?p=571"},"modified":"2011-12-17T12:28:35","modified_gmt":"2011-12-17T16:28:35","slug":"visualizing-sakai-commit-data","status":"publish","type":"post","link":"https:\/\/www.dr-chuck.com\/csev-blog\/2008\/11\/visualizing-sakai-commit-data\/","title":{"rendered":"Visualizing Sakai Commit Data"},"content":{"rendered":"<p>I am preparing for my Data Mining\/Social Networking course next semester (SI301) and of course the main project will be about analyzing Sakai activities as a developer community and social community.  Of course visualization will be a big part of the course activity.<\/p>\n<p>\nSo I got a head start and played with some data from commit logs.  Because I was not aware of the XML output of svn log --xml (thanks Seth), I did this by screen scraping ViewSVN at source.sakaiproject.org.  I scraped all the commits in the main SVN and Contrib into some SQLite3 databases.  The databases allowed me to restart the process if it croaked.  It took about a day and I ended up with 540 MB of data.  Then I ran a process to parse and categorize the data &#8211; that process read all 540 MB and produced a nice, normalized database of 3 MB in about 20 minutes.<\/p>\n<p>\nAfter I had my nice 3 MB database, I wrote some Python+SQLite3 scripts to grok and accumulate the data in various ways.  The work is still in draft form.<\/p>\n<p>\n<a href=http:\/\/www-personal.umich.edu\/~csev\/sakai\/data\/>http:\/\/www-personal.umich.edu\/~csev\/sakai\/data\/<\/a><\/p>\n<p>\nThere are five graphs &#8211; the most fun one is the <a href=http:\/\/www-personal.umich.edu\/~csev\/sakai\/data\/movement-all.htm>movement graph including both the main SVN and Contrib<\/a>.  
<b>Hint:<\/b> To make it more visually fun, try the following: (1) change from &#8220;Same Color&#8221; to &#8220;Unique Colors&#8221;, (2) change from &#8220;Same Size&#8221; to &#8220;Cumulative&#8221;, (3) experiment with switching from Linear to Log scale to make it more exciting, (4) hover over a data point to see who it represents, and (5) click on a data point and then replay the data to give it a &#8220;trail&#8221;.<br \/>\n<center><\/p>\n<p><a href=\"http:\/\/www-personal.umich.edu\/~csev\/sakai\/data\/movement-all.htm\"><img decoding=\"async\" src=\"http:\/\/www-personal.umich.edu\/~csev\/sakai\/data\/movement-all.jpg\" width=500 border=0 \/><\/a><br \/>\n<\/center><\/p>\n<p>\nThis is all pretty cool and it uses the <a href=\"http:\/\/code.google.com\/apis\/visualization\/\" target=\"_new\">Google Visualization API<\/a>, which does all the work in Flash\/JavaScript.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I am preparing for my Data Mining\/Social Networking course next semester (SI301) and of course the main project will be about analyzing Sakai activities as a developer community and social community. Of course visualization will be a big part of the course activity. 
So I got a head start and played with some data from [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-571","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/posts\/571","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/comments?post=571"}],"version-history":[{"count":1,"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/posts\/571\/revisions"}],"predecessor-version":[{"id":2682,"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/posts\/571\/revisions\/2682"}],"wp:attachment":[{"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/media?parent=571"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/categories?post=571"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.dr-chuck.com\/csev-blog\/wp-json\/wp\/v2\/tags?post=571"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}