# 1. Planning

# Choosing a Topic

Choosing an appropriate IA topic is critically important, but it is also the hardest part to get right. You should give it significant thought (and perhaps do some preliminary exploration) before committing to a topic, because the topic is not only directly responsible for the Exploration criterion, but also sets a ceiling for what you can attain in Analysis and Conclusion / Evaluation.

The development of an IA topic has three parts:

  1. a research question,
  2. a hypothesis, and
  3. a method

A research question needs to fulfill two criteria. First, it must be one that you can adequately address within the limits of the resources you have. This includes considerations of safety and ethics, as well as the time, equipment, reagents, and abilities that you have (or will be able to learn). Second, your topic must give you the scope to shine. Through the project you should be able to showcase your grasp of chemistry, your mastery of analytical skills, and your resourcefulness in overcoming problems and working independently.

Think of choosing a research question as designing a routine for a gymnastics competition, or choreographing a dance show. You need to know what resources you have available (no point in designing a rings routine if the venue has no rings!). You must also take inventory of your strengths, so that you neither fall from the 1080-degree turn nor short-change yourself. The most perfect cartwheel will not suffice for a 10 / 10 in the Olympics.

I have a method to systematically lead you to a research question; along the way I will show you some of the common "good ideas" that turn out to be problematic. Read this section completely, then follow the step-by-step preparation checklist to tally your resources.


This is an opinionated guide that leads you to a particular "form" of research question. Where we end up will be at once too broad and too narrow.

  • The form is overly broad because there are duds within it; some questions that fit the mold will not work. You will need good judgment to separate the wheat from the chaff, and an experienced teacher can help you develop this judgment.
  • The form is too narrow because, regardless of the details, every question in it tests a hypothesis. Not all science needs to test a hypothesis. Open-ended exploration and descriptive observation are perfectly acceptable ways of doing science, but they are generally not suitable for your Internal Assessment.

# The Research Question

# Developing the Master Question

A historian differs from a physicist not only in the content they hold in memory. They differ in what questions they ask in the first place, and in how they approach questions in general. What does it mean, then, to "think as a scientist", and more specifically, to "think as a chemist"?

Natural Science is a codeword for humanity's attempt to systematically explain and understand the world around us. A scientist asks "how" and "why" with reference to aspects of the material world that they do not yet understand. (I will leave a fuller exposition to your ToK teacher.) Our elementary research question thus looks like this:

  • How does / Why does
  • [a dependent variable]
  • depend on
  • [an independent variable]
  • in
  • [a natural science context]?

Note that this should be a question to which you genuinely do not know the answer. Avoid the temptation to play it safe and choose something that you have learnt from the syllabus or that is known to be well understood (e.g., "How does boiling point depend on molar mass?").

Chemists operate at different levels of abstraction: an organic chemist may prioritize shape and reactivity, whereas a physical chemist works at a higher abstraction of collective properties such as enthalpy. However, at the core they share an implicit understanding. To "think chemically" is to interpret the world (the observable, macroscopic world we inhabit and interact with) in terms of the invisible, microscopic world of protons, electrons, atoms, ions, and molecules.

Putting these together, our current "master chemistry question", then, is a menu that comprises four parts:

  • How does / Why does
  • [an aspect of the macroscopic world]
  • depend on
  • [an aspect of the microscopic world]
  • in
  • [a natural science context]?

This general form is very broad, and doesn't seem very actionable. In the next sections we will define and refine the moving parts so you can come to a good research question. However, you should note that this excludes these other general forms:

  1. What is the [property] of [X]? Examples: What is the vitamin C content of an orange? What is the caffeine concentration of cold-brew coffee? What is the enthalpy of combustion of octanol? What is the pKb of triethylamine?
  2. Which of {X / Y / Z} has more/most/highest [property]? Examples: Which brand of orange juice has the most vitamin C? Which kind of tea has the most caffeine? Which functional group, at the same molar mass, gives the largest enthalpy of combustion? Which substituted amine has the highest pKa?

In the section "Common Pitfalls" we will look, in more detail, at why these two forms tend to doom IAs to low ceilings.

# Using the Levels of Measurements to Refine your RQ

To begin our refinement of the master question, you need to have an understanding of Levels of Measurement. If you already know what nominal or rational variables are, feel free to skip to the next section. If not, read on.

To resolve a scientific investigation we need to make observations; "levels of measurement" is a way of classifying observations into different types. Here I will discuss the simplest classification, introduced in the 1940s by the psychologist Stevens, and extend it to highlight what you may encounter in chemistry.

Stevens looked at the observations one could make, and noted that they can be split into four distinct categories. He named these categories as follows:

  • Nominal observations differ by name. Examples include: "ester" / "amide", "crystalline" / "powdery", "caffeine" / "theobromine", or "blue" / "brown". You can think of these as qualitative variables.
  • Ordinal observations differ in ranking. An example is "a chlorine atom is smaller than a chloride ion". Knowing this allows us to rank the sizes and order the items, but we cannot tell just how much smaller chlorine is than chloride.
  • Interval observations are numerical observations where we can judge the degree of difference but not the ratio. An example is the temperature of water in $\celsius{}$: 20 $\celsius{}$ water is hotter than 10 $\celsius{}$ water by the same degree as 70 $\celsius{}$ water is hotter than 60 $\celsius{}$ water, but it does not make sense to say that 20 $\celsius{}$ water is "twice as hot" as 10 $\celsius{}$ water.
  • Rational observations (Stevens' own term is "ratio") are numerical observations where we can judge not only the degree of difference but also the ratio. Absolute temperature, in kelvin, is an example. It is a measure of the average kinetic energy of the particles; for an ideal gas, the particles in a sample at 20 $\kelvin{}$ have twice the average kinetic energy of those at 10 $\kelvin{}$.
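
The difference between the interval and ratio levels can be checked with a few lines of arithmetic. The following is a minimal Python sketch; the function name `celsius_to_kelvin` is introduced here purely for illustration and is not part of any course material.

```python
def celsius_to_kelvin(t_celsius: float) -> float:
    """Convert a Celsius reading (interval scale) to kelvin (ratio scale)."""
    return t_celsius + 273.15


# Dividing two Celsius readings gives a number, but not a meaningful one,
# because 0 degrees Celsius does not mean "no thermal energy".
naive_ratio = 20 / 10

# The same two samples compared on the absolute scale.
kelvin_ratio = celsius_to_kelvin(20) / celsius_to_kelvin(10)

print(naive_ratio)             # 2.0
print(round(kelvin_ratio, 3))  # 1.035
```

On the absolute scale the 20 $\celsius{}$ sample is only about 3.5% "hotter" than the 10 $\celsius{}$ sample, which is why the Celsius ratio of 2 should not be interpreted physically.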

[insert link to original paper]

You can test your understanding by classifying these observations using Stevens' Levels of Measurement. (Hint: if a meaningful 0 exists, one that indicates a complete absence, the observation should be classified as a ratio (rational) variable.)

  • The mixture boiled at 65 $\celsius{}$
  • The reaction with iron(III) catalyst was faster
  • The cell generated a potential of 1.40 V
  • The heat released was 34 $\kilo\joule{}$
  • The heat released was 34 $\kilo\joule{} / mol$
  • The crystal was blue
  • The solution absorbed the most light at 400 nm wavelength.
  • The pressure was 1.32 atm
  • The concentration is 4.2 $\molar{}$
  • The equilibrium constant is 3.0

Now that you are aware of this classification, let us use it to revise our Master Question. Both your macroscopic and microscopic aspects should be rational. That is,

  • How does / Why does
  • [a rational aspect of the macroscopic world]
  • depend on
  • [a rational aspect of the microscopic world]
  • in
  • [a natural science context]?

This is essential if you want to adequately control your investigation, and to generate testable hypotheses. Let me show you two examples of what happens when you break this rule:

  • (For [X reaction],) how does the rate of reaction depend on whether an iron(II), copper(II), or nickel(II) catalyst was added?
  • Why does the smell of a compound depend on the functional group?

Both of these questions include at least one nominal variable (type of catalyst, smell, type of functional group). A nominal variable conceals a multitude of different properties: an alcohol differs from an ester in many ways, as does an iron(II) ion from a nickel(II) ion. Even when the experiment is conducted perfectly, it remains impossible to discern which property is the cause. A nominal variable is also problematic as an observable: sample A smells like oranges and sample B smells like peaches. How does one begin to compare these categorically different observations?

When you choose an aspect to investigate, keep in mind these two criteria (in addition to it being a number):

  • the zero is meaningful and designates a complete absence.
  • a difference of 2X on the scale always corresponds to twice the physical difference of X.

There are quantities that are common and numerical but do not satisfy these criteria. An example that fulfills neither criterion is pH. pH 0 does not designate the absence of $\ce{H+}$ (quite the opposite), and the pH scale is logarithmic, such that a pH 7 solution has 100 times the [$\ce{H+}$] of a pH 9 solution (not 20 times, as we might expect from doubling the 10-fold difference between pH 7 and pH 8).

However, as this example shows, if you identify that the variable you have selected is not suitable, it can often be recast into a rational variable. While pH is not suitable, [$\ce{H+}$] is; color can be recast into the wavelength of maximum absorption.
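
As an illustration of recasting, the sketch below converts pH readings into [$\ce{H+}$]. It assumes the usual classroom approximation pH = -log10([$\ce{H+}$]) with concentration in mol/L (strictly, pH is defined through activity); the function name is hypothetical.

```python
def hydrogen_ion_concentration(ph: float) -> float:
    """Return [H+] in mol/L, assuming pH = -log10([H+])."""
    return 10.0 ** (-ph)


# pH 0 is not "no acid": it corresponds to 1 mol/L of H+.
print(hydrogen_ion_concentration(0))  # 1.0

# A two-unit difference in pH is a 100-fold difference in [H+].
ratio_7_vs_9 = hydrogen_ion_concentration(7) / hydrogen_ion_concentration(9)
print(round(ratio_7_vs_9))  # 100
```

Once the readings are expressed as concentrations, a zero genuinely means "no $\ce{H+}$" and ratios between samples can be interpreted directly.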

(Sometimes you can get away without the meaningful zero, i.e., by choosing an interval variable.)

# Refining the Macroscopic Dependent Variable

What are examples of suitable "rational aspect of the macroscopic world"? For experimental work this depends on what equipment you are in command of. You should consult a list of the equipment you have available in the laboratory, and for each piece of equipment note the properties you could measure with it. Some examples are given in Table XX.

  • Thermometer: temperature change (and thus enthalpies of combustion, of reaction, of solution...)
  • Burette: volume (and thus concentration / amount)
  • Electronic balance: mass (and mass gain, mass loss)
  • pH sensor: pH (and thus $\ce{H+}$ concentration)
  • colorimeter: wavelength / absorbance
  • multimeter: voltage / current / conductivity

You may notice equipment that you have not yet used in your course, or that is outside the official IB chemistry curriculum. This does not invalidate its use in the IA. Investigate what it does. My students have used thin layer chromatography $R_f$, surface tension, viscosity, or luminosity (as recorded by a camera) as their dependent property.

For database projects these are properties that can be found in the literature. Examples include:

  • spectral data (e.g. IR / NMR from SDBS)
  • solubility
  • boiling point
  • conductivity

Molecular modeling projects will often use a database value --- a real-world point of reference --- for the macroscopic property. If a real-world reference cannot be found, it may be appropriate to substitute with a calculated macroscopic variable that could be experimentally measured or found in databases. The most common computed property is the enthalpy of formation.

However, molecular modeling additionally opens up the ability to look into the microscopic world. It is thus possible to correlate two microscopic properties against one another; for example, examining how the bond angle or torsional angle changes with increasing volume of substituents.

# Refining the Microscopic Independent Variable

For "rational, microscopic" variables you should think of numbers that characterize changes at the molecular level. While obvious examples include the dipole moment or molar mass of molecules, you should not neglect seemingly macroscopic properties that, at their core, reflect what happens at the molecular level. Temperature, for example, is an indication of kinetic energy; concentration is about the fraction of "active" particles in a volume of space.

There are two additional constraints on the independent variable. First, you need to be able to dial it up and down by increments. To investigate a trend you generally need at least five "settings" of your increments. A binary presence--absence of a feature usually yields data that are too simplistic to support high-calibre analysis.
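
When the independent variable is a concentration, five settings can be planned as a dilution series. The sketch below applies C1·V1 = C2·V2 and assumes each solution is made up to a fixed final volume (for example in a volumetric flask); the function name, the 1.0 mol/L stock, and the 50 mL final volume are hypothetical values chosen only for illustration.

```python
def dilution_plan(stock_conc, target_concs, final_volume):
    """Return (target, stock volume, solvent volume) for each target
    concentration, using C1 * V1 = C2 * V2."""
    plan = []
    for target in target_concs:
        if not 0 <= target <= stock_conc:
            raise ValueError("target must lie between 0 and the stock concentration")
        v_stock = target * final_volume / stock_conc
        plan.append((target, v_stock, final_volume - v_stock))
    return plan


# Five evenly spaced settings (mol/L), 50 mL each, from a 1.0 mol/L stock.
for target, v_stock, v_solvent in dilution_plan(1.0, [0.2, 0.4, 0.6, 0.8, 1.0], 50.0):
    print(f"{target:.1f} mol/L: {v_stock:.1f} mL stock + {v_solvent:.1f} mL solvent")
```

Planning the increments in advance also shows early on whether the required volumes can actually be measured with the glassware you have.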

Second, you should be able to defend why you expect a relationship between your macroscopic and microscopic variables. It would be folly to investigate "how does the mass of a steel chunk depend on temperature changes between 20--50 celsius" --- there is no reason to expect the mass to change, and the likely result is that, after the experimentation, you find out that the mass really didn't change. What discussion does that "discovery" support?

Table 2 summarizes some possible dependent and independent variables.

# Choosing an Appropriate Scope

There is only one task left: to clearly restrain the scope, so that it is narrow enough for you to address with the resources you have. Within 10 hours you are unlikely to be able to address all contexts of "how does surface tension of a solution depend on concentration?" Concentration of what solute? In what solvent? Within which ranges of concentrations? These need to be clearly defined. A more specific phrasing of the above may be [B], "How does the surface tension of aqueous solutions depend on the molar concentration of NaCl?"

A Trick for New Scopes

Solvent mixtures (aqueous methanol, aqueous DMSO, aqueous acetone) are an interesting way of creating a new context from an existing project whose outcomes are well known. The addition of a secondary solvent often changes the properties of the system in a way that is difficult to predict quantitatively at the outset.

Note that making this constraint can open up many possibilities. You may find that the answer to question [B] is readily found on the internet. But how does the surface tension change when other salts are added? Or in a solvent that is not pure water, e.g., 20% methanol? Now you find yourself trying to answer a genuinely unknown question, and the results will likely support a high-level discussion.

We thus arrive at:

Why/How does [macroscopic, rational/interval] depend on [dialable, microscopic, rational/interval] in [clearly determined context]?
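
The template can also be treated as a checklist. The following Python sketch assembles a question from its parts and rejects variables that are not at the interval or ratio ("rational") level; the class and function names are hypothetical, and the check only automates the level-of-measurement rule, not the judgment about whether the question is worth asking.

```python
from dataclasses import dataclass

LEVELS = ("nominal", "ordinal", "interval", "ratio")


@dataclass
class Variable:
    name: str
    level: str  # one of LEVELS


def build_question(dependent: Variable, independent: Variable, context: str) -> str:
    """Assemble a research question, refusing nominal or ordinal variables."""
    for var in (dependent, independent):
        if var.level not in LEVELS:
            raise ValueError(f"unknown level of measurement: {var.level!r}")
        if var.level not in ("interval", "ratio"):
            raise ValueError(f"{var.name!r} is {var.level}; recast it as an interval or ratio variable")
    return f"How does {dependent.name} depend on {independent.name} in {context}?"


print(build_question(
    Variable("the surface tension", "ratio"),
    Variable("the molar concentration of NaCl", "ratio"),
    "aqueous solutions",
))
```

Passing a nominal variable such as `Variable("the smell", "nominal")` raises a `ValueError`, mirroring the advice to recast such variables before committing to a question.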

Do not rush into the research question, and do not leave it until the last moment. Start early. Read some exemplars to know what to expect for an IA. Look through the equipment list to see what is available. Choose an area you find interesting, do your preliminary research, give it some time to age, and speak with your teacher. Expect to adjust course and iterate. Having a good research question is half the battle won.

# Recurring Poor Topics

  • Product measurements: 10 hours later you have found what was written on the box. Why didn't you just read the box? In any case, where is the science, the search for cause and effect? Examples:

    • How much vitamin C is in Mr Juicy orange juice?
    • How much iron is in an iron tablet?
  • Value measurements: why are you measuring $\Delta{}H_{c}$ of ethanol... when you can read it off a data booklet?

  • Melting point: melting points depend on how molecules / ions pack. No one knows how to accurately predict melting points. You are probably going to get stuck too.

  • Experimentally problematic. Some experiments are very hard to do right with common high-school lab equipment. The errors and uncertainties are often unacceptably large.

    • enthalpy of combustion using a simple calorimeter: the heat loss is sometimes > 60%.
    • an alternative synthesis of vitamin B12. This took 99 PhDs on two continents 10 years to do. You may not finish your IA on time.
  • Comparison between nominal variables. Nominal variables usually conceal too much difference, and make it impossible to attribute a cause even if the experiment is done right.

    • brand comparison: what is a more effective brand of soap? Because there is limited chemistry that can be talked about, the discussion tends to focus on experimental errors.
  • Theoretically fully known, or too trivial. If the phenomenon is fully known, why are you investigating it? (The most common reason is that the student hasn't done any reading, and didn't know that it is fully known. Reading is an essential part of your scholarship.) Examples:

    • colligative properties (freezing point depression, boiling point elevation),
    • whether weak acids incompletely dissociate,
    • "is a larger amount of HCl needed to neutralize more concentrated NaOH"
  • Human senses as the measuring instrument: odor, taste... It is tough to get reproducible, objective results from these. To do this kind of research you need a large number of volunteers (>50) in a highly controlled environment.

  • Too many research questions. If your research question is subdivided into several separable (or even unrelated!) parts, it needs focus. You only have 10 hours.

    • "How does the rate of [...] reaction depend on the concentrations of [X], [Y], and [Z]? Does it change when the temperature is raised? What happens when [C1] is added as a catalyst instead of [C2]?"

It is hard to anticipate all the problems, especially when you are doing chemistry for the first time. It is essential to approach and seek guidance from your teacher.

# The Hypothesis

Now that you have a research question (an unknown that you can investigate), you should formulate a hypothesis. A hypothesis is a specific, testable statement that your experiments could refute (show to be false). Formulate it in quantitative terms, and be as mathematically specific as possible. For example, “the surface tension of water increases with concentration of methanol” seems specific, but neither of the following cases would refute it, even though the two are very different:

[insert graph]

It would be better if the hypothesis is modified to be “the surface tension of water increases linearly with concentration of methanol”.
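
The value of the more specific wording can be seen with two synthetic data sets. The numbers below are generated from formulas purely for illustration (they are not measurements and carry no units), and the function names are hypothetical. Both series increase, so the vague hypothesis survives either one; only the linear hypothesis can tell them apart, here by looking at the residuals from a least-squares straight line.

```python
def is_increasing(ys):
    """True if every value is larger than the one before it."""
    return all(b > a for a, b in zip(ys, ys[1:]))


def residuals(xs, ys):
    """Residuals (observed minus fitted) from an ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
straight = [2.0 * x + 1.0 for x in xs]  # synthetic: increases linearly
curved = [x ** 2 for x in xs]           # synthetic: increases, but not linearly

print(is_increasing(straight), is_increasing(curved))  # True True
print([round(r, 2) for r in residuals(xs, straight)])  # [0.0, 0.0, 0.0, 0.0, 0.0]
print([round(r, 2) for r in residuals(xs, curved)])    # [2.0, -1.0, -2.0, -1.0, 2.0]
```

The curved series leaves a systematic pattern in the residuals (positive, negative, positive), which is the kind of evidence that can count against a linear hypothesis. With real data the residuals would also contain random scatter, so how large a pattern counts as a refutation remains a judgment to be argued in the analysis.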

Earlier on you took pains to ensure that the relationship between the variables is plausible; use this to articulate why your hypothesis is reasonable. By crafting a specific hypothesis here, you will be able to argue that your data support or refute it. It does not matter whether your hypothesis turns out to be wrong; what matters is that it exists, which allows for a very high ceiling of discussion.

# The Method

# Writing up the Exploration

Last Updated: 2 years ago