6 tips for AI-assisted coding in science

AI tools are here to stay, and they can help you write code blazingly fast. But you shouldn’t just “vibe-code” and copy-paste AI-generated code until the results look fine. Why? Because:

  1. there might be errors in there that you don’t see
  2. the code will likely be very hard to understand and maintain
  3. the code won’t be reusable

So, here are 6 tips that I have gathered while coding with AI. At the end, I will also share a final take-away!

I will walk through these tips using the example from the image above: using an output file of a forest model to plot the total carbon stored in each European country’s forests.

With AI-assisted coding, I was able to create this plot by combining two datasets and aggregating values in a matter of minutes instead of hours.

This is what the data behind the plot looks like:

      Lon      Lat     VegC
     6.25    51.75    0.954
    27.25    62.75    5.302
      ...      ...      ...

So let’s go!!

1. Use the correct tool

Are you just writing scripts, or are you maintaining a larger amount of code? For smaller tasks, like analyzing and plotting datasets, your go-to AI such as ChatGPT or Gemini is good enough. But if you want the AI to know your entire project, you should use a different tool like Copilot, Claude, or Cursor. Check for yourself which of these suits you best. For the example I want to show here, ChatGPT is enough, as it does not need to know any other code to solve the problem.

2. Code in an iterative way

Don’t try to get the whole task done at once; break it up into smaller pieces. Especially when working with notebooks, you can run the small pieces one at a time and check whether they work correctly (see also point 4). In my example:

  1. Get country data and plot it
  2. Assign each row in the dataset the country in which it lies
  3. For each row in the dataset, compute the area of its grid cell (depending on the geographic projection)
  4. Group the data by country and sum up
  5. Plot the result

With a quick plot I can then simply check that step 2 was executed correctly; see the sketch below.
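
To make this concrete, here is a minimal sketch of steps 1 and 2 using geopandas, with the file and column names from this example (NAME is the country-name attribute of the Natural Earth shapefile); the area computation of step 3 is covered under tip 4:

import geopandas as gpd
import pandas as pd

# Step 1: load the country polygons and plot them as a quick visual check
countries = gpd.read_file("ne_110m_admin_0_countries.shp")
countries.plot()

# Step 2: assign each grid cell the country it lies in via a spatial join
cpool = pd.read_csv("cpool.csv", sep="\t")
cells = gpd.GeoDataFrame(
    cpool,
    geometry=gpd.points_from_xy(cpool["Lon"], cpool["Lat"]),
    crs="EPSG:4326",
)
cells = gpd.sjoin(cells, countries[["NAME", "geometry"]], how="left", predicate="within")

# Check step 2: plot the grid cells colored by their assigned country
cells.plot(column="NAME")

# Steps 4 and 5 would then group by "NAME", sum the carbon, and plot the result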

3. Use thorough prompts and context

Prompt engineering is a key technique for getting good results from AIs. So give the AI all the information you have: explain the datasets thoroughly and upload any existing files or scripts. That way, the AI can make use of all this information to write better code. For instance like this:

I have a csv file, cpool.csv, which is delimited by tabs. This file contains the columns Lon, Lat, Year, VegC. It has one row per grid cell and year at a 0.5 degree resolution and spans Europe’s land area. First, read in the file. Second, keep only those rows where the year is 2010, and drop the Year column. Then, add a column called “country” to each row and assign it the name of the country that the grid cell lies in using the shapefile ne_110m_admin_0_countries.shp.
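
For reference, the first two instructions of that prompt boil down to a few lines of pandas; a sketch, assuming the file and column names from the prompt:

import pandas as pd

# Read the tab-delimited file, keep only the year 2010, and drop the Year column
cpool = pd.read_csv("cpool.csv", sep="\t")
cpool_2010 = cpool[cpool["Year"] == 2010].drop(columns="Year")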

4. Let the AI add tests and assertions

This is a key point. Don’t just copy-paste the code and hope for the best - CHECK IT! Assertions and tests are perfect tools for checking intermediate steps of the process and making sure that everything is correct.

Tests

This might be a complicated part, but I urge anyone to familiarize themselves with the concept of unit testing!

In Python, you can keep tests inline with doctest, but in any programming language you can add these so-called unit tests to check the correctness of the code. For instance:

From the script you generated, please extract the area computation into a function compute_gridcell_area() that takes the latitude and the size (in degrees) of a grid cell as input and computes the grid-cell area in square meters. Please add a Python doctest checking that this function works for size 0.5° with an error margin of 1%, using the following test cases:
lat = 0° –> 3077274420 m²
lat = 45° –> 2190625912 m²
lat = 75° –> 806495859 m²

The outcome is (with some adaptations by me):

import numpy as np

def compute_gridcell_area_meters(lat, size_deg=0.5, earth_radius=6371000):
    """
    Using some values to test whether the function works correctly:
    >>> abs(compute_gridcell_area_meters(0, 0.5) - 3077274420) / 3077274420 < 0.01
    True
    >>> abs(compute_gridcell_area_meters(45, 0.5) - 2190625912) / 2190625912 < 0.01
    True
    >>> abs(compute_gridcell_area_meters(75, 0.5) - 806495859) / 806495859 < 0.01
    True
    """
    lat_rad = np.radians(lat)
    dphi = size_deg * (np.pi / 180) * earth_radius       # N–S extent of the cell
    dlambda = size_deg * (np.pi / 180) * earth_radius * np.cos(lat_rad)  # E–W extent
    return dphi * dlambda

One critical thing here: check that the test can also fail, for instance by changing some of the numbers. Only then can you be sure that the test is actually doing something.
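
To actually run the doctests, you can add a small entry point to the script (or run python -m doctest your_script.py -v from the shell):

if __name__ == "__main__":
    import doctest
    # Runs all doctests in this module; silent when everything passes,
    # prints each failing example otherwise
    doctest.testmod()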

Assertions

Assertions are like mini-tests within the code, and you can use them to check whether intermediate steps are correct. Funnily enough, in this example ChatGPT actually made an error when converting units! I detected it easily by letting it add assertions:

Please add an assertion checking that the total area of the data points sums up to 6 million square kilometers!

The result was the snippet below, and it helped me find the error in ChatGPT’s code:

total_grid_area_km2 = gdf["area"].sum() / 1e6  # convert m² to km²

# Assert the total area is roughly 6 million km²
assert 5.9e6 <= total_grid_area_km2 <= 6.1e6, f"Total grid area seems incorrect: {total_grid_area_km2:.0f} km²"
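
In the same spirit, a few cheap sanity checks on the input data can catch parsing or unit errors early. A sketch, assuming the column names from the example data and the raw dataframe from the sketch in tip 2 (the thresholds are my own):

# Coordinates must be plausible
assert cpool["Lat"].between(-90, 90).all(), "Latitude out of range"
assert cpool["Lon"].between(-180, 180).all(), "Longitude out of range"

# Vegetation carbon pools cannot be negative
assert (cpool["VegC"] >= 0).all(), "Negative vegetation carbon found"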

5. Clean up and make it tidy and understandable

When you or somebody else needs to get back to this code later, it will otherwise be super hard to understand. That’s why you should rename variables and functions, extract pieces into new functions, remove duplication, and replace magic numbers with named constants. For instance:

  1. Instead of calling the dataframe df, call it forest_carbon
  2. Instead of having one large file, extract functions like compute_grid_cell_area()
  3. Instead of using magic numbers like 9.81, store them in constants, e.g. GRAVITATIONAL_ACCELERATION = 9.81, and use these.

For more on this topic, please also see my other post: variable naming and function extraction

Notably, AI can also help you with the cleanup itself. You can even ask it for a code review, and it will give you some tips:

Please give me a code review for the following script: …

Here is part of the response:

Magic numbers in conversions: You compute PgC as kgC / (1000 * 1000 * 1000 * 1000). That’s correct (1 Pg = 10¹⁵ g = 10¹² kg), but it’s error-prone. 👉 Better: PgC = total_europe_kgC / 1e12 # since VegC is in kgC

You can see that the response is not great, but it does point you to the problem. So this is how I changed it:

Before:

print(f"Total vegetation carbon in Europe: {total_europe_kg / (1000*1000*1000*1000):.2f} PgC")

After:

CONVERSION_FACTOR_kg_TO_Pg = 1e12
print(f"Total vegetation carbon in Europe: {total_europe_kg / CONVERSION_FACTOR_kg_TO_Pg:.2f} PgC")

6. Use a new context for each problem

On ChatGPT and similar tools, don’t keep one chat open and dump all your problems into it. Start a new chat for each problem: leftover context from earlier, unrelated questions can confuse the model and degrade the quality of its answers.

My final take-away

Think of the AI as a new intern who is really good at coding. They can only do satisfactory work if you guide them well, and at the end you need to review whether they did a good job and improve their first draft.

Hopefully this was helpful to you!

If you want to have a little laugh at AI, check out my ChatGPT song on YouTube



