Jesse Wood
PhD AI Candidate
Wellington, NZ

Fin-tuned Deep Learning

Jesse Wood @ Victoria University of Wellington

Transitioning from NIWA software engineering to a PhD in AI, using code as a scientific medium to unlock biochemical secrets through transformer architectures.

[Figure: mass spectrum; x-axis: m/z (Mass-to-Charge Ratio), y-axis: Intensity (Counts)]

Featured Research

fishy-business

Gone Phishing

100% Accuracy

Species Identification (Hoki vs. Mackerel) using MoE Transformer architectures.

Autobots Ensemble

74.13% Accuracy

Body Part Classification, significantly outperforming traditional OPLS-DA (51.17%).

SpectroSim & XAI

Batch Traceability

Self-supervised contrastive learning enabling physical tag-less tracking. Decisions mapped via LIME/SHAP to specific m/z chemical peaks.


Expertise Compass

The Scientific Tech Stack

AI & Data Science

PyTorch
Transformers
Scikit-Learn
Pandas
NumPy
HuggingFace

Optimization & Systems

Optuna
DEAP
Rust
C++
Haskell
Docker
Streamlit

Explainable AI (XAI)

LIME
SHAP
WandB

Scientific Papers

Advancing self-supervised learning, Masked Spectra Modeling (MSM), and Evolutionary Computation.

2026
Te Herenga Waka—Victoria University of Wellington

Machine Learning for Rapid Evaporative Ionization Mass Spectrometry for Marine Biomass Analysis

This thesis advances seafood processing by applying deep learning to Rapid Evaporative Ionization Mass Spectrometry (REIMS) data, enabling automated and accurate marine biomass analysis. The research addresses critical industry challenges in quality control, food safety, and fraud prevention. These include foundational tasks such as species identification, to combat mislabeling fraud, and body part classification, to optimize by-product utilization and prevent adulteration with offal. It also formalizes novel food safety problems, such as detecting oil contamination (a food safety hazard) and cross-species adulteration (a form of economic fraud). Finally, it tackles batch traceability, a task essential for rapid product recalls that currently relies on costly and impractical physical tagging methods.

To achieve this, the thesis first establishes a suite of advanced models for the foundational tasks of species and body part identification (Chapter 4), drawing on datasets described in Chapter 3. Specifically, this involves a binary classification dataset of 106 samples for Hoki and Mackerel identification and a multi-class dataset of 33 samples for classifying seven distinct body parts (fillets, heads, livers, skins, gonads, guts, and frames). The work then formalizes and solves novel food safety problems (Chapter 5), also based on Chapter 3 data: oil contamination (seven ordinal concentration levels from 50% to 0% in a 126-sample dataset) and cross-species adulteration (pure Hoki, pure Mackerel, and mixed Hoki-Mackerel samples from a 144-sample dataset). Finally, it introduces a self-supervised framework to address the challenge of batch traceability using a highly imbalanced, pairwise comparison dataset of 2,556 instances, derived from 72 fish samples across 24 distinct batches (Chapter 6). The high-dimensional nature of the REIMS data, with 2,080 features per spectrum for all tasks, necessitates sophisticated solutions.

Key contributions include the development of advanced Transformer architectures, which significantly outperform traditional methods by achieving up to 100% accuracy in species identification. The study also critically evaluates Mixture of Experts (MoE) Transformers to determine optimal configurations and investigates the asymmetric effects of transfer learning across different tasks, namely species identification, body part classification, oil contamination, and cross-species adulteration detection. Furthermore, a comparative analysis of ordinal classification techniques is included, demonstrating that models designed for ordered data significantly reduce error distance in graded contamination tasks. For batch traceability, a novel self-supervised contrastive learning method, SpectroSim, is introduced, which identifies sample origins with 70.8% accuracy without using class labels. Explainable AI techniques are employed to ensure model predictions are interpretable, enhancing trust and providing chemically relevant insights.

This research is motivated by the need to overcome the limitations of current state-of-the-art analytical methods. The standard approach for REIMS analysis relies on traditional chemometric techniques like Orthogonal Partial Least Squares Discriminant Analysis (OPLS-DA), which are often limited in their ability to model the complex, non-linear, and high-dimensional nature of mass spectra. For the foundational tasks of species and body part identification (Chapter 4), the challenge was to classify samples from complex and high-dimensional REIMS data. This was addressed by developing novel Transformer architectures that achieved near-perfect accuracy (up to 100%), establishing a new performance benchmark. For the novel tasks of oil contamination and cross-species adulteration (Chapter 5), the challenge was to detect subtle chemical signatures often masked by the sample's primary chemical profile. This was solved by adapting advanced models to these difficult classification problems and systematically applying transfer learning, which was found to consistently improve the detection of oil contamination. Finally, for batch traceability (Chapter 6), the challenge was to create a method that avoids costly physical tags and the need for impractical supervised labels. This was achieved by developing "SpectroSim", a self-supervised framework that accurately determines sample origin without labels, offering a practical and cost-effective solution for the industry.
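The pairwise dataset size quoted in the abstract follows from the sample count: 72 samples give C(72, 2) = 2,556 unordered pairs. The sketch below reproduces that count and illustrates why such a dataset is highly imbalanced. It assumes the 72 samples are spread evenly at three per batch, which the abstract does not state (it gives only the totals), and the batch labelling scheme is hypothetical.

```python
from itertools import combinations

N_SAMPLES = 72   # from the abstract
N_BATCHES = 24   # from the abstract
# Assumption: samples are split evenly, i.e. 3 per batch.
PER_BATCH = N_SAMPLES // N_BATCHES

# Hypothetical batch labels: sample i belongs to batch i // PER_BATCH.
batch_of = [i // PER_BATCH for i in range(N_SAMPLES)]

# Every unordered pair of samples, labelled 1 if both share a batch, else 0.
pairs = [
    (a, b, int(batch_of[a] == batch_of[b]))
    for a, b in combinations(range(N_SAMPLES), 2)
]

n_pairs = len(pairs)
n_same = sum(label for _, _, label in pairs)
print(n_pairs, n_same, round(n_same / n_pairs, 4))
```

Under the even-split assumption, only 72 of the 2,556 pairs (about 2.8%) are same-batch pairs.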

2025
Intelligent Marine Technology and Systems

Hook, Line and Spectra: Machine Learning for Fish Species and Part Classification using Rapid Evaporative Ionization Mass Spectrometry

Marine biomass composition analysis traditionally requires time-consuming processes and domain expertise. This study demonstrates the effectiveness of rapid evaporative ionization mass spectrometry (REIMS) combined with advanced machine learning (ML) techniques for accurate marine biomass composition determination. Using fish species and body parts as model systems representing diverse biochemical profiles, we investigate various ML methods, including unsupervised pretraining strategies for transformers. The deep learning approaches consistently outperformed traditional machine learning across all tasks. For fish species classification, the pretrained transformer achieved 99.62% accuracy, and for fish body parts classification, the transformer achieved 84.06% accuracy. We further explored the explainability of the best-performing and predominantly black box models using local interpretable model-agnostic explanations and gradient-weighted class activation mapping to identify the important features driving the decisions behind each of the best performing classifiers. REIMS analysis with ML can be an accurate and potentially explainable technique for automated marine biomass composition analysis. Thus, REIMS analysis with ML has potential applications in quality control, product optimization, and food safety monitoring in marine-based industries.

2022
Australasian Joint Conference on Artificial Intelligence

Automated Fish Classification Using Unprocessed Fatty Acid Chromatographic Data: A Machine Learning Approach

Fish is approximately 40% edible fillet. The remaining 60% can be processed into low-value fertilizer or high-value pharmaceutical-grade omega-3 concentrates. High-value manufacturing options depend on the composition of the biomass, which varies with fish species, fish tissue and seasonally throughout the year. Fatty acid composition, measured by Gas Chromatography, is an important measure of marine biomass quality. This technique is accurate and precise, but processing and interpreting the results is time-consuming and requires domain-specific expertise. The paper investigates different classification and feature selection algorithms for their ability to automate the processing of Gas Chromatography data. Experiments found that SVM could classify compositionally diverse marine biomass based on raw chromatographic fatty acid data. The SVM model is interpretable through visualization which can highlight important features for classification. Experiments demonstrated that applying feature selection significantly reduced dimensionality and improved classification performance on high-dimensional low sample-size datasets. According to the reduction rate, feature selection could accelerate the classification system up to four times.

2022
PhD Proposal, Victoria University of Wellington

Rapid determination of bulk composition and quality of marine biomass in Mass Spectrometry

Navigating the analysis of mass spectrometry data for marine biomass and fish demands a technologically adept approach to derive accurate and actionable insights. This research will introduce a novel AI methodology to interpret a substantial repository of mass spectrometry datasets, utilizing pre-training strategies like Next Spectra Prediction and Masked Spectra Modeling, targeting enhanced interpretability and correlation of spectral patterns with chemical attributes. Three core research objectives are explored: 1) precise fish species and body part identification via binary and multi-class classification, respectively; 2) quantitative contaminant analysis employing multi-label classification and multi-output regression; and 3) traceability through pair-wise comparison and instance recognition. By validating against traditional baselines and various downstream tasks, this work aims to enhance chemical analytical processes and offer fresh insights into the chemical and traceability aspects of marine biology and fisheries through advanced AI applications.
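As a rough illustration of the Masked Spectra Modeling idea named in the proposal, the sketch below hides a random subset of intensity bins and scores a reconstruction only on the hidden bins. The 15% mask ratio, the zero mask value, the mean-squared-error loss, and both function names are assumptions made for illustration; the proposal does not specify them. The 2,080-bin spectrum length matches the feature count given in the thesis abstract, and the spectrum values are synthetic.

```python
import random

def mask_spectrum(intensities, mask_ratio=0.15, mask_value=0.0, seed=None):
    """Hide a random subset of bins; return (masked copy, masked indices)."""
    rng = random.Random(seed)
    n_masked = max(1, round(len(intensities) * mask_ratio))
    masked_idx = sorted(rng.sample(range(len(intensities)), n_masked))
    masked = list(intensities)
    for i in masked_idx:
        masked[i] = mask_value
    return masked, masked_idx

def reconstruction_loss(predicted, original, masked_idx):
    """Mean squared error computed only over the hidden bins."""
    return sum((predicted[i] - original[i]) ** 2 for i in masked_idx) / len(masked_idx)

# Synthetic spectrum with 2,080 bins (values are placeholders, not real data).
spectrum = [float(i % 7) for i in range(2080)]
masked, idx = mask_spectrum(spectrum, seed=0)

# A model would predict the hidden bins from `masked`; a perfect prediction scores 0.
print(len(idx), reconstruction_loss(spectrum, spectrum, idx))
```

In pre-training, a model would receive `masked` as input and be optimized to lower `reconstruction_loss` against the original spectrum.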

Technical Projects

Bartender

Godot
GDScript

Step behind the bar in this high-fidelity mixology simulator. Unlike standard click-and-serve games, this experience leverages a physics-based interaction system where every pour, shake, and clink of ice matters. As the shifts progress, the pressure mounts, and only the most precise bartenders will survive the rush.

Cloudy with a Chance of Git Pulls

JavaScript
GitHub Actions

This GitHub Action uses the OpenWeather API to display the weather forecast for a given area, updating once every 30 minutes. The forecast is written between predefined tags (hidden inside HTML comments), so it does not overwrite any other content in the README.
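The tag-based update described above can be sketched as follows. This is a minimal Python illustration, not the Action's actual JavaScript code: the marker names `WEATHER:START` and `WEATHER:END`, the function name, and the sample forecast text are all hypothetical.

```python
import re

# Hypothetical marker names; the Action's real tags are not given here.
START = "<!-- WEATHER:START -->"
END = "<!-- WEATHER:END -->"

def update_readme(readme: str, forecast: str) -> str:
    """Replace only the text between the markers, leaving the rest intact."""
    pattern = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)
    block = f"{START}\n{forecast}\n{END}"
    # A function replacement keeps backslashes in `forecast` from being
    # interpreted as regex escape sequences.
    return pattern.sub(lambda _: block, readme, count=1)

readme = "# Title\n<!-- WEATHER:START -->\nold\n<!-- WEATHER:END -->\nFooter\n"
updated = update_readme(readme, "Example forecast text")
print(updated)
```

Because the markers are HTML comments, they stay invisible in the rendered README while still giving the updater a stable region to rewrite.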

Fishy Business

Python
PyTorch
DEAP
NumPy

Machine Learning for Rapid Evaporative Ionization Mass Spectrometry for Marine Biomass Analysis: a doctoral thesis by Jesse Wood

Ionic Scholar

Ionic
Semantic Scholar
Gemini

Ionic Scholar is a solo-developed app for keeping track of academic references. It remembers the user's progress, keywords, and quotes, and it can generate citations. Keeping track of many scholarly articles while preparing a paper is often difficult, so the app is designed to reduce the stress of academic writing.

qsh | Qwen Shell

Rust
Python
SQL

A local-first, privacy-focused CLI that brings vision and semantic understanding to the Unix pipe. Powered by Qwen 3.5-0.8B and Rust. 🦀

Skyrim Wellbeing Manager

React
Firebase
Leaflet
Three.js

Moody bitch is a high-fidelity mental health odyssey that transforms real-world self-care into an epic, Skyrim-inspired RPG experience. By completing daily "quests," users earn experience points to level up legendary skill constellations, unlock psychological perks, and loot powerful artifacts that grant permanent growth bonuses.

Thesis

Python
JavaScript
LaTeX

A wiki for my thesis, inspired by a Karpathy tweet https://x.com/karpathy/status/2039805659525644595

Wordle Solving Transformer

Python
ONNX
PyTorch

A Wordle solver built on a transformer neural network.

Collaborate

Seeking a scientific partner or technical engineer? Let's establish a connection in the deep.

Fun Fact

"I once built a Skyrim Wellbeing Manager to gamify tracking mental health. Currently, I'm navigating the depths of Baldur's Gate 3."

Machine Learning Engineer

Jesse Wood


Lead Researcher

Specializing in the intersection of deep learning and marine biochemistry.