Paper 2 Analytics: first explorations

Posted on May 4, 2014

I’m a connoisseur of chemistry exams.  (I need a life!)  Writing a good exam is an art, where the final paper needs to represent an optimal blend of objectives (skills required), topic distribution, chemical theme, and algorithmic complexity, all within a fixed budget of points.  It’s quite sad that this intricacy is invisible to most.

IB chemistry paper 2 is distinctive in that while the format has been preserved over the past 20 years (135 minutes, 90 marks; 40 marks compulsory, 25×2 marks optional), there is a subtle evolution within the questions themselves, with the later years reading as more beautiful to my eyes.  I think I see a trend of more even topic coverage, enhanced connection between topics, and deeper links with the practical side of doing chemistry.  This is manifested both at the scale of individual questions and over the whole paper.

A hunch does not science make, and I started doing some analytics to better understand what it is that my “sense of beauty” is telling me.  The first take is to emulate the IB Questionbanks: simply take an exam and categorize each question as a topic/points pair (e.g., topic 9 – redox; 12 pts for Q1).  Doing this for May 2000 / 2006 / 2009 / 2013 shows that it “works”, but not at a fine enough granularity; it is much better to tackle this on a sub-topic (e.g., 20.4, elimination) and sub-question (e.g., 1a-iii) scale.

Doing this on a spreadsheet, using some primitive conditional formatting and summation, can show the big pattern: an example for the May 2000 paper 2 is shown here.  (The original spreadsheet.)

Visualizing May 2000 paper 2.  (Incomplete; the entire paper would span another page.)

Procedurally, I first established (A) the sub-topics (color-coded for HL/SL material) and (B) the questions and their sub-parts.  For each sub-part I would then (C) enter its point value and (D) locate the relevant sub-topics (referencing the point values from part C).

These can then be (E) automatically tallied up, with conditional formatting used to visualize how well each topic is represented.
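For readers who prefer code to spreadsheets, steps C through E can be sketched in a few lines of Python.  This is only an illustration, not the spreadsheet itself: the sub-question labels, point values, and sub-topic codes below are made up, and it assumes that a sub-part touching several sub-topics credits its full point value to each of them.

```python
from collections import defaultdict

# Hypothetical rows: (sub-question label, points, sub-topics it draws on).
# None of these values come from a real paper.
rows = [
    ("1a-i", 2, ["9.1"]),
    ("1a-ii", 3, ["9.2"]),
    ("1b", 4, ["20.4"]),
    ("2a", 1, ["9.1", "9.2"]),
]


def tally_by_subtopic(rows):
    """Sum points per sub-topic (step E).

    Assumption: a sub-part's full point value is credited to every
    sub-topic it touches, so the totals can exceed the paper total.
    """
    totals = defaultdict(int)
    for _label, points, subtopics in rows:
        for code in subtopics:
            totals[code] += points
    return dict(totals)


def tally_by_topic(subtopic_totals):
    """Roll sub-topic totals up to whole topics, e.g. "20.4" -> "20"."""
    totals = defaultdict(int)
    for code, points in subtopic_totals.items():
        totals[code.split(".")[0]] += points
    return dict(totals)


sub_totals = tally_by_subtopic(rows)
print(sub_totals)
print(tally_by_topic(sub_totals))
```

The conditional formatting step then amounts to coloring each sub-topic by its total.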

The advantage of this is its technical simplicity; entering new papers is a simple matter, and the results are immediately visible.  I’ll be referring to this as the Instant Gratification (IG) method.

Having thought about this some more, however, I find that this instant gratification is ultimately a waste of time.  A much more upwind option is to systematically tag each sub-question for:

  • points
  • objective (skill requested: e.g., “define”, “calculate”, etc.)
  • theme (e.g., organic, environmental)
  • sub-sub-topic (e.g., 4.2.1) in the 2009-15 syllabus, with point value [4.2.1, 3]
  • examiners’ notes
  • JC comments
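A tagged sub-question along these lines might be represented as follows.  This is a minimal sketch: the class name, field names, and the example values are hypothetical, and the consistency check assumes that the points assigned to sub-sub-topics should add up to the sub-question's points.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SubQuestion:
    """One tagged sub-question; all field names here are illustrative."""
    paper: str                 # e.g. "May 2000 P2" (hypothetical label format)
    label: str                 # e.g. "1a-iii"
    points: int
    objective: str             # skill requested, e.g. "define", "calculate"
    theme: str                 # e.g. "organic", "environmental"
    # (sub-sub-topic, points) pairs, e.g. [("4.2.1", 3)]
    topics: List[Tuple[str, int]] = field(default_factory=list)
    examiner_notes: str = ""
    jc_comments: str = ""

    def points_balanced(self) -> bool:
        """Assumed check: topic points sum to the sub-question's points."""
        return sum(p for _, p in self.topics) == self.points


# Made-up example entry, not taken from a real paper.
q = SubQuestion(
    paper="May 2000 P2",
    label="1a-iii",
    points=3,
    objective="calculate",
    theme="organic",
    topics=[("4.2.1", 3)],
)
print(q.points_balanced())
```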

All of the questions from all years are placed in a single array, and subsequently visualized.  The time required to enter the data is marginally higher than for the IG method, and the results would need to be visualized programmatically.  The benefit of having all this data is that it would be possible to skin the cat in more than one way.
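To make "more than one way" concrete, here is a small sketch of the single-array idea: one flat list of tagged sub-questions, sliced by whichever tag is of interest.  The records and key names are invented for illustration and do not come from any real paper.

```python
from collections import Counter

# Hypothetical records: one dict per sub-question, all years in one list.
questions = [
    {"year": 2000, "label": "1a-i", "points": 2,
     "objective": "define", "theme": "organic", "topic": "10.1.1"},
    {"year": 2000, "label": "1a-ii", "points": 4,
     "objective": "calculate", "theme": "organic", "topic": "10.1.1"},
    {"year": 2013, "label": "2b", "points": 3,
     "objective": "calculate", "theme": "environmental", "topic": "9.2.1"},
    {"year": 2013, "label": "3a", "points": 1,
     "objective": "define", "theme": "environmental", "topic": "9.2.1"},
]


def points_by(records, key):
    """Total the points of all records, grouped by the given tag."""
    totals = Counter()
    for rec in records:
        totals[rec[key]] += rec["points"]
    return dict(totals)


# The same array, viewed three different ways.
print(points_by(questions, "objective"))
print(points_by(questions, "theme"))
print(points_by(questions, "year"))
```

Each grouping can be fed to a plotting routine, which is where the programmatic visualization would come in.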

Adding to the value of the more rigorous approach is the impending switch to the 2016-22 syllabus.  The new syllabus has entirely different topic numbers from previous syllabi, and would render the 1st/2nd edition IB questionbanks obsolete (the mapping is incomplete and inconsistent).  This indexing / analytics effort would thus double as the foundation of a complete questionbank that goes back 20 years.  Having this database at hand is extra-spiffy when coupled to the “chemical dependency” project I’ve been chipping away at (more in a later post); it would also open the possibility for student analytics, wherein after a student’s attempts an automated report can be generated to pinpoint their strengths and deficiencies.

So at the moment I’m slowly plugging away at it, mildly burdened with anxiety that my first pass does not capture all that is needed.  A complete pass of a paper 2 takes 2-3 hours, and it takes mental work but not to a prohibitive extent.  Going back to 2000 would take ~80 hours in total, with an additional 6 hours each subsequent year for maintenance.  Visualization of those data will be the subject of the next post.