I will publish all the data in a recorded lecture summarizing the class, but I wanted to give a sneak preview of some of the geographic data results because the Python code to retrieve the data was fun to build. Click on each image to play with a zoomable map of the visualized data in a new window. At the end of the post, I describe how the data was gathered, processed and visualized.
Where are you taking the class from (State/Country)?
If you went to college or are currently going to college, what is the name of your college or university?
The second graph is naturally more detailed as the first question asked them to reduce their answer to a state versus the second question asking about a particular university. The data is noisy because it is all based on user-entered data with no human cleanup.
Gathering the data
Both fields were open-ended (i.e. the user was not picking from a drop-down). I had no idea how I would ever clean up the data, and when I got 4701 responses, I figured I would just take a look around and realized that my students were from a lot of places. On a lark Friday morning I started looking for the Yahoo! Geocoding API that I had heard about several years ago at a Yahoo! hackathon on the UM campus where I met Rasmus Lerdorf – the inventor of PHP. I was disappointed to find out that Y! was out of the geocoding business because it sounded cool. But I was pleased to find Google’s Geocoding API looked like it provided the same functionality and was available and easy to use.
So I set out to write a spider in Python that would go through the user-entered data, submit it to the geocoder lookup API, and retrieve the results. I used a local SQLite3 database to make sure that I only looked up each string once. I had two data sets with nearly 6000 items total and the Google API stops you after 2500 queries in a 24 hour period. So it took three days to get the data all geocoded.
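The look-up-once idea can be sketched roughly as follows. This is a minimal sketch and not the actual spider: the table name `Locations`, the functions `open_cache` and `geocode_all`, and the `fetch` callable are all names made up for illustration, and the real network call to the geocoding API is deliberately left out so the example runs offline.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) the SQLite cache of strings already geocoded."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS Locations "
        "(address TEXT PRIMARY KEY, geodata TEXT)"
    )
    return conn


def geocode_all(conn, addresses, fetch, max_queries=2500):
    """Look up each address at most once; return how many API calls were made.

    `fetch` is a hypothetical callable that takes the user-entered text
    and returns the geocoder's response as a string.
    """
    calls = 0
    for address in addresses:
        row = conn.execute(
            "SELECT geodata FROM Locations WHERE address = ?", (address,)
        ).fetchone()
        if row is not None:
            continue  # already geocoded, possibly on an earlier run
        if calls >= max_queries:
            break  # daily quota used up; rerun later to pick up the rest
        geodata = fetch(address)
        calls += 1
        conn.execute(
            "INSERT INTO Locations (address, geodata) VALUES (?, ?)",
            (address, geodata),
        )
        conn.commit()  # commit per row so an interrupted run keeps its work
    return calls
```

Because the cache lives in a database file, the same script can simply be rerun each day: strings that were already resolved are skipped and only the remaining ones count against the quota.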
I did not clean up the data at all – I just submitted the user-entered text to Google’s API and took back what it said. Then I used Google’s Maps API for Javascript to produce the zoomable maps.
If you are curious about the nature of the spider, I adapted the code from the sample code in chapters 12-14 in my Python for Informatics textbook.
This is a bit of a weird blog post. In my Internet History, Technology, and Security Coursera course, I adjusted the grading policy as the course went along as events happened. I was not pleased with the rubric on my week 2 assignment so I gave everyone full credit. We had two peer-graded extra credit assignments.
I ended up putting up a second copy of the final exam (hence two exam columns) because some students had reported that something went wrong (hard to verify these things) with the final when they took it the first time. I made it abundantly clear that the second final was only for students who had technical difficulties with the first final. For those who took the second final twice but had a reasonable score on the first exam (19 students) – I decided *not* to add the extra five point penalty after looking at the patterns in the data and merely took the lower of two exam scores. For those students with some obvious technical, internet, user error, or timing problem on the first exam (67 students) – the second exam was *very much needed*. I will share all the data with Coursera tech support and we can dig through logs to try to narrow down what might have gone wrong and see what we can learn.
This all ends up in a Python program to do the grading. I am putting it up for code review for a few days to see if anyone sees a bug before this little bit-o-code decides who gets certificates.
The output includes the computed values and the input data as the last value in the tuple to allow verification and checking of the algorithm.
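To illustrate only the shape of that output, here is a minimal sketch. It is not the actual grading program: the function name `grade_row`, the layout of the input row, and the way points are summed are assumptions made for illustration. The 75-point certificate threshold is the one mentioned in the course forum discussion.

```python
CERTIFICATE_THRESHOLD = 75  # points needed to earn a certificate


def grade_row(row):
    """Return (student_id, total, earned_certificate, row).

    `row` is a hypothetical input record of the form
    (student_id, [points, points, ...]). The untouched input is kept
    as the last value of the result so each line can be checked by hand.
    """
    student_id, points = row
    total = sum(points)
    return (student_id, total, total >= CERTIFICATE_THRESHOLD, row)


if __name__ == "__main__":
    for record in [("s001", [40, 30, 8.5]), ("s002", [35, 20, 0])]:
        print(grade_row(record))
```

Keeping the raw input next to the computed values means a reviewer can redo the arithmetic for any single student without rerunning the program.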
Dr. Severance taught the online course “Internet History, Technology, and Security” using the Coursera teaching platform. His course started July 23, 2012 and was free to all who wanted to register. The course has over 46,000 registered students from all over the world and 6,000 are on track to complete the course and earn a certificate. In this keynote, we will look at the current trends in teaching and learning technology as well as look at the technology and pedagogy behind the course, and behind Coursera in general. We will look at the data gathered for the course and talk about what worked well and what could be improved. We will also look at some potential long-term effects of the current MOOC efforts.
Charles is a Clinical Associate Professor and teaches in the School of Information at the University of Michigan. Charles is a founding faculty member of the Informatics Concentration undergraduate degree program at the University of Michigan. He also works for Blackboard as Sakai Chief Strategist. He also works with the IMS Global Learning Consortium promoting and developing standards for teaching and learning technology. Previously he was the Executive Director of the Sakai Foundation and the Chief Architect of the Sakai Project.
I am not an expert on rubrics. For the first peer-graded assessment in my Coursera Internet History, Technology, and Security course, my rubric was really poor. This triggered a discussion in the student forums led by a student named Su-Lyn to produce what the students felt would be the ideal rubric. There were several rounds of edits and comments before the students reached their “final” rubric.
I adopted this rubric for the rest of the peer-graded assignments for the course and it was far superior to the rubric I used in the first assignment.
The mistake I made in the first rubric I built was that I was trying to construct a rubric that ended up with an average of about 8.5 / 10 – but then all the rubric items were too simplistic and no one felt that they could express their assessment appropriately. The grade on that first assignment was 8.85 / 10 with a standard deviation of 1.49 – pretty much exactly as I planned from a numbers perspective – but the students did not like it.
The student-built rubric was a little harsher but at least it felt expressive when evaluating basic expository writing – so students assessing each other’s work felt like what they were communicating in their grading was *useful*.
The first peer-graded assignment that used the student-built rubric had a range of -6.0 – 10.0 with an average of 7.15 and a standard deviation of 2.83. Clearly the second rubric was far more expressive.
I don’t like a mean of 7.15 for a straight scale graded course. So I would need to come up with a formula that mapped from the raw score to the actual score. I punted on any formula and just made the peer grading assignments “extra credit” – this meant that students were going to have to fight a little to get those extra points and that felt right to me. If you were going to do the extra credit – you had better do some good writing – because if you just cut and pasted Wikipedia in – you would get a quick -6. Negative scores will be changed to zeros – people should not lose points on extra credit.
The last little bit of data is that 5808 students took the first (bad) required peer-graded assignment, 758 students took the second optional assignment and 641 took the third optional assignment. Interestingly the data on the third assignment was a range of -2 – 10 with a mean of 7.99 and a standard deviation of 2.35. I would interpret the drop in the number of students between the second and third assignments, as well as the change in range and mean, to mean that students who did badly on the first assignment just gave up and did not submit the second assignment.
This supports my instinct that perhaps in a course like mine, writing needs to be optional / extra credit.
Well, enough of the prelude – on to the question and rubric.
My Question and Rubric
Question: What element of Internet History prior to 2003 would you add to the History of the Internet as described in this course and where would it fit in the course? Draw from the course material and support with additional materials as necessary.
Essay length: 500-1000 words not including references. A separate space for references will be provided – only use this space for references (i.e. don’t continue your essay in this space). There is no specific citation format. While there is no minimum nor maximum required references, most essays will have somewhere between two and five references. If your references are web sites use the URL – if the references are papers include enough information to identify the source using APA http://owl.english.purdue.edu/owl/resource/560/01/ format. Graders will not take points off for syntax errors in references, but they are welcome to suggest how the syntax of references can be improved.
While we would like your answers to be well written, given the number of different languages in the course, graders will **not** take points off for structural mistakes like grammar or punctuation. Graders may *comment* on how to improve the writing technique – but the grade will be based on the quality of the ideas in the answers and how well thought out the arguments are that support those ideas. As graders, please make your comments constructive and helpful and focused on improving learning.
Plagiarism: Looking for plagiarism should *not* be the primary purpose of peer graders. The purpose of the plagiarism deduction rubric is to give graders a way to note when plagiarism is clear and obvious. If you are taking points off for plagiarism, include the source of the material you consider to have been copied in your comments. Please do not add any editorial comments or value judgements about the author or plagiarism in your comments. Be respectful in your comments and make sure to focus on making this a learning experience for the author.
And above all, while the purpose of peer-grading is to assign an accurate score – we are all here to learn. Graders should approach weak or flawed essays as situations where they can help the essay author learn through useful and constructive comments. Our prime directive is to teach each other – within that directive we also assign scores.
Interest (4 points): Is the answer interesting to read? Did the answer make you think? Did you learn something from the answer?
0 – No
2 – Somewhat
4 – Yes
Relevance (2 points): Does the essay answer the question? Is the answer on-topic?
0 – No
1 – Somewhat
2 – Yes
Analysis (2 points): Are the ideas logical and communicated clearly? Are the arguments reasonable / plausible? Does the analysis go beyond simply stating the obvious?
0 – No
1 – Somewhat
2 – Yes
Evidence (2 points): Does the essay use good examples? Are the arguments well-supported by facts? Does the essay cite its sources?
0 – No
1 – Somewhat
2 – Yes
Plagiarism (up to 6 point deduction): Is there evidence of plagiarism such as simply cutting and pasting all or part of text from another source without citing?
0 – The essay did not have any evidence of plagiarism
-3 – A portion of the answer was literal text from another source
-6 – The entire essay was taken from another source
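As a rough illustration of how the rubric adds up, here is a minimal sketch of scoring a single grader's marks. The names `RUBRIC`, `raw_score`, and `extra_credit` are made up for this example, and it does not attempt to show how the marks from several peer graders are combined into one grade. The allowed values come straight from the rubric, and the zero floor reflects the rule that negative scores are changed to zeros for extra credit.

```python
# Allowed values for each rubric item, copied from the rubric.
RUBRIC = {
    "interest": (0, 2, 4),
    "relevance": (0, 1, 2),
    "analysis": (0, 1, 2),
    "evidence": (0, 1, 2),
    "plagiarism": (0, -3, -6),
}


def raw_score(marks):
    """Sum one grader's marks after checking each against the rubric."""
    for item, allowed in RUBRIC.items():
        if marks[item] not in allowed:
            raise ValueError(f"{item}: {marks[item]} is not one of {allowed}")
    return sum(marks[item] for item in RUBRIC)


def extra_credit(raw):
    """Negative raw scores count as zero when used as extra credit."""
    return max(0, raw)


if __name__ == "__main__":
    marks = {"interest": 2, "relevance": 2, "analysis": 1,
             "evidence": 1, "plagiarism": -3}
    print(raw_score(marks), extra_credit(raw_score(marks)))
```

With these values the raw score can run from -6 (nothing earned and a full plagiarism deduction) up to 10 (full marks with no deduction).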
As my Internet History, Technology, and Security Coursera course is winding down and students take the final, the discussion is turning to the issue of certificates. There will certainly be certificates for those that meet the minimum score. But there is still a discussion around how the certificates look and what the certificates contain.
One student (R.D.R.C.) posed a great question:
I know that in some of the other courses on Coursera they are giving a Certificate with Distinction for those how score very high, I was wondering if we would have that here since there is no mention of it. Should those with 75 average get the same “commendation (since it isnt accreditation”) as those who scored 95 or above? Was just wondering.
Here is the answer I posted to the course forum:
I am not going to distinguish the certificates. There are lots of factors that lead to the ultimate number of points. An important factor for many was technical issues and problems. I do not have the time to double grade 6000 students’ assignments if there was a glitch. The grade of “75 points” allowed a certificate to be earned by a diligent student even if there were some technical difficulties. I did not want to make points “so valuable” that students would get upset over every little thing that went wrong. If I set some “91” as “distinguished” – I would start hearing from hundreds of people who got a 90 because there was a bad question or the Coursera software or their Internet connection suffered a glitch on them.
It is also why I am not putting the points earned on the certificate. I was in communication with a student coming to the University of Michigan and he sent me two of his Coursera certificates as evidence of his skill. They were hard courses and so I knew that the certificates represented real work. And the rest of his transcript/resume supported that he was a very talented student.
But his certificates from Coursera had the scores on them – one of his scores was 750 / 700 – and it made me wonder. I did not wonder about the student’s achievement. It made it look to me like the points were too easy to earn – which makes me question the teacher of the class. I too have given “extra credit” in my course – so if you think I am being a little inconsistent – you are right. :) The difference is that in my course, I know *exactly why* I set up grading the way I did and how easy/hard points were to earn. In this other course, I don’t have inside information on how hard points were to get – or what the purpose of extra credit was. My point is that not knowing the grading approach and seeing a 750/700 – caused me to question the *course* but not the student’s achievement.
In a course like this, we need to be flexible in awarding points for many reasons – but as a result students who (a) are highly skilled before they come into the class, (b) have nothing go wrong, or (c) have lots of free time and are not juggling family or other schooling achieve these astronomical scores. Including the score to me reduced the value of the certificate IMHO even though the student I was interacting with had an extremely high score. It is like including a grade point average on a diploma when you graduate – a diploma is far more than your grade point average.
We need to learn in this kind of new teaching and learning pattern that your achievement is not automatically higher because you are in the top 10 percent of the ultimate score. The score is only a proxy / approximation for what you have learned. And what you have taught others as part of a learning community is even more important and hard to measure.
In this course we have some students who are near 120 / 100 – it is great to get these high scores – but it does not diminish those who got 80/100 or even 78/100.
What I am thinking of doing is labeling all the certificates as “the first time the course was taught” (or similar wording) to indicate that you are all pioneers and helped me so much in crafting what the course has become. You are the “first graduating class of the University of IHTS”. From this point forward whenever the course is taught (in Coursera or live) – your contributions will be part of the course and I thank you.
I ended up in an E-Mail discussion with the folks in the University of Michigan School of Information mailing list and ended up writing this little essay on Coursera – it sounded pretty good so I decided to keep it. It is just kind of a thought piece.
July 23, 2012
There is way too much hype around Coursera and what it means, portends, and how a meteor will strike the earth and cause all of higher education to have a thin dusting of Iridium, etc etc. We need to factor *all* that out.
Coursera is a way for us to share a tiny tiny fraction of our niftiest on-campus courses and faculty members to an extremely wide audience, nearly all of whom will never get to Ann Arbor, let alone be enrolled in the University of Michigan. Sharing what we know and do with the world for the betterment of the world is what we do and who we are and for me as you well know it is doubly what I do and who I am. Hence my work with Sakai, Moodle, Open.Michigan, the open textbooks I write, appenginelearn.com, pythonlearn.com, and every other venue that I can share with the world the things that I do.
Coursera is a wonderful piece of technology that is tuned to allow me to share my material with 35,000 students around the world and it works amazingly well. Me teaching a Coursera course is my *research* in how we can better use technology to get reasonable education in the hands of underserved people. This is a problem that the world must solve in the next 20 years and working with Coursera is the most exciting thing I have done in my career because I can almost touch that seemingly impossible future with the help of Coursera. And there are a bunch of researchers here at SI and the School of Ed that are with me every step of the way in trying to understand this new form and help improve it and evolve it.
Coursera is six classes at UM – it is not a sea-change. It is a grand experiment and one that in my opinion we are duty-bound to participate in to fulfill our mission and if I were not involved, I would be in grave pain because I would know that the future was being explored and I was not part of it.
Now after all that hyperbole, there are some caveats. It is early days. At this point in time, there is no way to achieve the same rigor (there is that word again) in my Coursera course that I achieve in SI502. Even if I put every single lecture and assignment of SI502 into Coursera – it would not be the same as SI502 because of the lack of rigor. Everything in a Coursera course must be scalable and rigor is hard to scale – especially when folks are learning very emergent skills that are cognitively challenging – it is too easy to just quit and walk away. Coursera works well when students strive to gain the knowledge and they fiercely want the knowledge. But in a class like SI502, a large number of students (at least for the first 5-6 weeks) really might be happier without the knowledge in SI502 and if it were a Coursera course they would quietly drop out or game the system to get some weak but passing grade and get the certificate.
So it turns out that *not* putting too much value on the Coursera certificates is an essential founding notion of what it takes to make a scalable course scale. What students get out of these courses is best correlated to whatever they put in – and there is no good measure for that.
But even with all its limitations, Coursera is far better than anything that came before it at solving the use case of “teach the world”. Folks will find lots of flaws and those flaws are indeed there – but the best way to fix those flaws is to jump in and live with the flaws and let the solutions come to us as we gain experience.