Image from William Putman/NASA Goddard Space Flight Center


I mainly use Python with the scientific stack of libraries (e.g., NumPy, xarray, pandas, Matplotlib, Jupyter, Dask).

Here is a course I created on High Performance Python. It includes profiling, vectorisation with NumPy, compiling with Numba, parallelisation with Dask and Ray, and using GPUs with JAX and CUDA/Numba.

Numerical atmospheric models

I taught and provided support for a complex air quality model, WRFChem (Bash and Fortran).

Machine learning

I used machine learning models to predict optimal emission reduction strategies to improve air quality and public health in China. These were Gaussian process emulators trained on ~20 TB of simulated data from numerical atmospheric models.

I provided training for these emulators (video, Colab, slides, and GitHub).

Here is a course I created, Introduction to Machine Learning. It covers fundamentals, machine learning with scikit-learn, deep learning with TensorFlow / Keras and PyTorch / PyTorch Lightning, data pipelines, model tuning, transfer learning, and distributed training.

Useful learning resources

Here are some great learning resources for topics in data science, software engineering, and machine learning that I’ve found helpful.