Code Review Requested: Coursera IHTS Grading Algorithm

This is a bit of a weird blog post. In my Internet History, Technology, and Security (IHTS) Coursera course, I adjusted the grading policy as the course went along, in response to events as they happened. I was not pleased with the rubric on my week 2 assignment, so I gave everyone full credit for it. We also had two peer-graded extra credit assignments.

I ended up putting up a second copy of the final exam (hence two exam columns) because some students reported that something went wrong with the final when they took it the first time (these things are hard to verify). I made it abundantly clear that the second final was only for students who had technical difficulties with the first final. For those who took the second final but had a reasonable score on the first exam (19 students), I decided *not* to add the extra five point penalty after looking at the patterns in the data, and merely took the lower of the two exam scores. For those students with some obvious technical, internet, user error, or timing problem on the first exam (67 students), the second exam was *very much needed*. I will share all the data with Coursera tech support and we can dig through logs to try to narrow down what might have gone wrong and see what we can learn.
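As a rough sketch, the exam rule described above can be written as one small function. The function name `pick_exam_score` and the `low_cutoff` parameter are hypothetical names used only for illustration. The 10-point cutoff for "an obvious problem on the first exam" is the one used in the grading program in this post, and an empty second-exam column is assumed to have already been turned into 0.

```python
def pick_exam_score(exam1, exam2, low_cutoff=10):
    """Pick which final exam score counts (illustrative sketch only).

    exam1, exam2: scores on the first and second copy of the final.
    An exam2 of 0 is treated as "did not take the second final".
    """
    if exam2 == 0.0:
        # Only the first final was taken
        return exam1
    if exam1 <= low_cutoff and exam2 > exam1:
        # First attempt looks like a technical problem: second exam counts
        return exam2
    # Took the final twice without an obvious problem: lower score counts,
    # with no extra five point penalty, and never below zero
    return max(min(exam1, exam2), 0)
```

For example, `pick_exam_score(5, 25)` gives 25, while `pick_exam_score(28, 30)` gives 28.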

This all ends up in a Python program to do the grading. I am putting it up for code review for a few days to see if anyone sees a bug before this little bit-o-code decides who gets certificates. I include the code and some sample output with names removed. You may need to make your screen quite wide or just view source to get the real data.

Code

fh = open('Gradebook.csv')
grades = list()
for line in fh:
    line = line.rstrip()
    fields = line.split(';')

    for i in range(len(fields)) :
        if len(fields[i]) < 1 : fields[i] = '0'

    # Skip header lines
    try : id = int(fields[0])
    except ValueError :
        print(line)
        continue

    name = fields[1]
    name = name.replace('"','')

    # fields[2] : "Late Days Left"
    # fields[3] : "Optional: Demographic Survey"
    # fields[4] : "Optional: Propose Final Exam Questions"

    quiz = 0
    # Quizzes for weeks 1 and 3-7 (the week 2 assignment is fields[13])
    for i in [5, 6, 7, 8, 9, 10] :
        quiz = quiz + float(fields[i])

    # Exams
    exam1 = float(fields[11])
    exam2 = float(fields[12])

    if ( exam2 == 0.0 ) :
        exam = exam1
    elif ( exam1 <= 10 and exam2 > exam1 ) :
        # print 'Second exam OK ', id, name, exam1, exam2
        exam = exam2
    else:
        # print 'Second exam penalty ', id, name, exam1, exam2
        exam = exam1
        if exam2 < exam1 : exam = exam2
        # exam = exam - 5  # penalty
        if exam < 0 : exam = 0

    # fields[13] was peer-graded week 2 - free 10 points
    excr1 = float(fields[14])
    if ( excr1 < 0 ) : excr1 = 0
    excr2 = float(fields[15])
    if ( excr2 < 0 ) : excr2 = 0
    excr = excr1 + excr2
    
    # Ten points was week 2 peer-graded assessment
    tot = quiz + excr + exam + 10  
    # print fields, tot, quiz, excr, exam, exam1, exam2
    tup = (tot, quiz, excr, exam, id, name, line)
    # print tup
    grades.append( tup )

grades.sort(reverse=True)
for i in grades:
    print(i)

Output

The output includes the computed values and the input data as the last value in the tuple to allow verification and checking of the algorithm. I have manually line-broken the header line. Names are first and last initial and the user id is all zeros to obscure the data.

"User ID";"Full Name";"Late Days Left";"Optional: Demographic Survey";
"Optional: Propose Final Exam Questions";"Week 1 Quiz";"Week 3 Quiz";
"Week 4 Quiz";"Week 5 Quiz";"Week 6 Quiz";"Week 7 Quiz";
"Final Exam - IHTS";"Final Exam (2) - Do not take the Final Twice - See Email";
"Internet HTS - assignment 1";"Extra Credit - Assignment 1";"Extra Credit - Assignment 2"

(120.0, 60.0, 20.0, 30.0, 0000, 'DT', '0000;"DT";8;5.125;0;10;10;10;10;10;10;30;;9;10;10')
(120.0, 60.0, 20.0, 30.0, 0000, 'AC', '0000;"AC";8;6.25595;;10;10;10;10;10;10;30;;10;10;10')
(120.0, 60.0, 20.0, 30.0, 0000, 'JB', '0000;"JB";8;;;10;10;10;10;10;10;30;;10;10;10')
(119.5, 60.0, 19.5, 30.0, 0000, 'BL', '0000;"BL";8;6.71429;;10;10;10;10;10;10;30;;10;9.5;10')
(119.0, 60.0, 20.0, 29.0, 0000, 'KP', '0000;"KP";8;5.72619;;10;10;10;10;10;10;29;;9;10;10')
(106.5, 50.0, 18.5, 28.0, 0000, 'PJ', '0000;"PJ";8;4.83333;;0;10;10;10;10;10;28;;;8.5;10')
(99.8, 44.75, 18.0, 27.0, 0000, 'VK', '0000;"VK";0;;;9;8.75;7;0;10;10;27;;7;10;8')
(80.9, 47.9, 0.0, 23.0, 0000, 'MM', '0000;"MM";8;6.875;;7;9.75;8;7;8.25;7.9;23;;9;;')
(77.8, 43.8, 0.0, 24.0, 0000, 'DE', '0000;"DE";8;4.76786;;8;0;8;10;8;9.8;24;;7;;')
(76.8, 36.8, 0.0, 30.0, 0000, 'MS', '0000;"MS";8;5.98809;;9;0;0;9;9;9.8;30;;6.4;;')
(60.8, 50.75, 0.0, 0.0, 0000, 'JA', '0000;JA;8;5.96429;;9;9;10;7;5.75;10;;;9;;')
(55.8, 19.8, 0.0, 26.0, 0000, 'JO', '0000;"JO";7;5.09524;;10;0;0;0;0;9.8;26;;9;;')
(48.8, 38.75, 0.0, 0.0, 0000, 'AG', '0000;"AG";1;;;10;9.75;10;9;-0;;;;7.2;;')

This is a *tiny* representative sample pulled from the 45627 lines of output the program produced.
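If you want to spot-check a row by hand, here is a minimal sketch that recomputes the total from the raw input line stored as the last value of each tuple. The function name `recompute_total` is hypothetical. It is meant to mirror the grading program above, not to replace it, and it assumes the same column order as the header shown in this post.

```python
def recompute_total(raw_line):
    """Recompute the total for one semicolon-separated gradebook line."""
    # Empty columns count as zero, as in the grading program
    fields = [f if f else '0' for f in raw_line.rstrip().split(';')]

    # Quizzes for weeks 1 and 3-7 are columns 5 through 10
    quiz = sum(float(fields[i]) for i in range(5, 11))

    # Two copies of the final exam
    exam1 = float(fields[11])
    exam2 = float(fields[12])
    if exam2 == 0.0:
        exam = exam1
    elif exam1 <= 10 and exam2 > exam1:
        exam = exam2
    else:
        exam = max(min(exam1, exam2), 0)

    # Extra credit, with negative values treated as zero
    excr = max(float(fields[14]), 0) + max(float(fields[15]), 0)

    # Ten free points for the week 2 peer-graded assignment
    return quiz + excr + exam + 10


print(recompute_total('0000;"BL";8;6.71429;;10;10;10;10;10;10;30;;10;9.5;10'))
```

For the BL row this prints 119.5, which matches the first value in that row's tuple above.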