Hands-On: Climb the Algorithm Ladder

ISOEN 2026 — Tutorial 6 — Pillar 1: Selectivity in electrochemical sensor arrays

In the next 5–10 minutes you will classify four overlapping electroactive analytes using three approaches in sequence: raw-signal PCA, engineered features, and a small neural network. You will see directly how algorithm choice interacts with dataset size — the central message of this tutorial.

Everything runs in your browser. No login, no install, no data leaves your device. Work at your own pace.

Step 1 — Look

The dataset

The dataset holds 80 simulated cyclic voltammograms: 20 each of dopamine, serotonin, ascorbic acid, and uric acid, on a microelectrode at physiological pH. The simulation includes realistic peak overlap, electrode-to-electrode variability, and noise.
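To make the shape of such a dataset concrete, here is a minimal sketch of how 80 overlapping scans could be generated. It is not the simulator behind this page: the Gaussian peak shape, the potential window, the peak centers and widths, and the noise levels are all arbitrary placeholders chosen only so that the peaks overlap, and only a single sweep of 200 points is modeled.

```python
import numpy as np

def simulate_cvs(n_per_class=20, n_points=200, seed=0):
    """Return (E, X, y): potential axis, 80 x 200 currents, integer class labels."""
    rng = np.random.default_rng(seed)
    E = np.linspace(-0.2, 0.8, n_points)  # assumed potential window (V)
    # Hypothetical (center, width) pairs in volts; NOT literature values.
    peaks = {"DA": (0.20, 0.08), "5-HT": (0.35, 0.09),
             "AA": (0.10, 0.15), "UA": (0.40, 0.10)}
    X, y = [], []
    for label, (center, width) in enumerate(peaks.values()):
        for _ in range(n_per_class):
            shift = rng.normal(0.0, 0.02)   # electrode-to-electrode peak shift
            gain = rng.uniform(0.7, 1.3)    # electrode-to-electrode sensitivity
            current = gain * np.exp(-0.5 * ((E - center - shift) / width) ** 2)
            current += rng.normal(0.0, 0.03, n_points)  # measurement noise
            X.append(current)
            y.append(label)
    return E, np.array(X), np.array(y)

E, X, y = simulate_cvs()
print(X.shape, np.bincount(y))
```

Each row of X is one scan and each column one sampled current, which is the layout the later steps assume.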

Plot legend: dopamine (DA), serotonin (5-HT), ascorbic acid (AA), uric acid (UA).

The forward-scan oxidation peaks overlap. Selectivity has to come from the algorithm, not from peak height alone.

Step 2 — Try raw PCA

PCA on the raw 200-point waveform

First attempt: throw the entire voltammogram (200 sampled current points per scan) at PCA and ask whether the classes separate in the first two components.
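For reference, raw-signal PCA amounts to mean-centering the 80 x 200 matrix and projecting onto its leading singular vectors. The sketch below uses random numbers as a stand-in for the scans; the function name pca_scores is ours, not part of the page.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project the rows of X onto the top principal components."""
    Xc = X - X.mean(axis=0)  # center each of the 200 sample points
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T
    explained = S**2 / np.sum(S**2)  # fraction of variance per component
    return scores, explained[:n_components]

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 200))  # placeholder for 80 scans x 200 points
scores, explained = pca_scores(X)
print(scores.shape, explained)
```

Note that PCA never sees the class labels. It keeps the directions of largest variance, which need not be the directions that distinguish the analytes.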

Step 3 — Engineer features

Pick features, watch the projection

Now reduce each voltammogram to a handful of interpretable, chemistry-aware features. Toggle which features to include. The projection and the classifier accuracy update live.

Peak position: potential at maximum forward current.
Peak height: baseline-corrected current at peak.
Peak width: full width at half-maximum.
Reversibility ratio: reverse-peak depth divided by forward-peak height (0 = irreversible, 1 = fully reversible).
Forward slope: di/dE on the rising edge of the peak.
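The five definitions above can be turned into code along these lines. This is a sketch under stated assumptions, not the page's implementation: the forward and reverse sweeps are passed separately on the same potential grid, the baseline is taken as the median of the first few forward points, and the forward sweep has a single dominant peak. The function name extract_features and the parameter n_baseline are hypothetical.

```python
import numpy as np

def extract_features(E, i_fwd, i_rev, n_baseline=10):
    """Reduce one voltammogram to five interpretable features."""
    baseline = np.median(i_fwd[:n_baseline])  # assumed flat pre-peak baseline
    i_corr = i_fwd - baseline
    k = int(np.argmax(i_corr))                # index of the forward peak
    height = float(i_corr[k])
    # Full width at half-maximum: span of points at or above half the peak height.
    above = np.flatnonzero(i_corr >= 0.5 * height)
    width = float(E[above[-1]] - E[above[0]])
    # Reverse-peak depth is measured below the same baseline.
    rev_depth = max(0.0, float(baseline - np.min(i_rev)))
    # Steepest di/dE on the rising edge, i.e. up to and including the peak.
    slope = float(np.max(np.gradient(i_corr, E)[: k + 1]))
    return {"position": float(E[k]), "height": height, "width": width,
            "reversibility": rev_depth / height, "slope": slope}

# Synthetic check: Gaussian forward peak, half-size mirrored reverse peak.
E = np.linspace(0.0, 1.0, 101)
i_fwd = np.exp(-0.5 * ((E - 0.5) / 0.1) ** 2)
i_rev = -0.5 * i_fwd
features = extract_features(E, i_fwd, i_rev)
print(features)
```

On noisy data the half-maximum span can be inflated by stray points far from the peak, so smoothing before extraction is a reasonable precaution.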

Live readouts: nearest-centroid accuracy and the features used.

A few well-chosen features beat 200-dimensional PCA. The chemistry lives in the shape of the wave, not in every individual current sample.
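A nearest-centroid classifier, the accuracy readout used in this step, assigns each scan to the class whose mean feature vector is closest. The sketch below standardizes the features first, which is our assumption (the features have different units), and runs on made-up clusters rather than the tutorial data; fit_centroids and predict are hypothetical names.

```python
import numpy as np

def fit_centroids(X_train, y_train):
    """Store per-feature scaling and one centroid per class."""
    classes = np.unique(y_train)
    mu = X_train.mean(axis=0)
    sd = X_train.std(axis=0) + 1e-12  # guard against constant features
    Z = (X_train - mu) / sd
    centroids = np.stack([Z[y_train == c].mean(axis=0) for c in classes])
    return classes, centroids, mu, sd

def predict(X, model):
    """Assign each row of X to the class with the nearest centroid."""
    classes, centroids, mu, sd = model
    Z = (X - mu) / sd
    dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]

# Made-up data: 4 classes x 20 samples x 5 features, well separated.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(4), 20)
centers = np.arange(4)[:, None] * np.array([3.0, -2.0, 1.0, 4.0, 2.0])
X = centers[y] + rng.normal(0.0, 0.5, size=(80, 5))

train = np.arange(80) % 4 != 0  # hold out every fourth sample
model = fit_centroids(X[train], y[train])
accuracy = np.mean(predict(X[~train], model) == y[~train])
print(accuracy)
```

With only one mean vector per class to estimate, this model has very few free parameters, which is why it behaves well on small datasets.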

Step 4 — Try a small neural network

Deep learning on raw waveforms

Last attempt: feed the raw 200-point waveform into a small multilayer perceptron (200 → 16 → 4, about 3,300 parameters). No feature engineering — let the network learn from the raw signal. Same train/test split as before.
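The parameter count quoted above follows from the layer sizes: each dense layer has one weight per input-output pair plus one bias per output. The sketch below checks the arithmetic and runs an untrained forward pass; the ReLU hidden activation, softmax output, and random initialization are our assumptions, since the page does not specify them.

```python
import numpy as np

def count_params(layer_sizes):
    """Weights plus biases for a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

def forward(X, W1, b1, W2, b2):
    """200 -> 16 -> 4 forward pass with ReLU and softmax (assumed activations)."""
    hidden = np.maximum(0.0, X @ W1 + b1)
    logits = hidden @ W2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

n_params = count_params([200, 16, 4])  # 200*16 + 16 + 16*4 + 4
params_per_scan = n_params / 80        # 80 scans in the whole dataset

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 200))         # placeholder for the raw waveforms
W1, b1 = rng.normal(0, 0.1, (200, 16)), np.zeros(16)
W2, b2 = rng.normal(0, 0.1, (16, 4)), np.zeros(4)
probs = forward(X, W1, b1, W2, b2)
print(n_params, round(params_per_scan, 1), probs.shape)
```

The network has roughly 41 free parameters for every scan in the dataset, before any of the 80 scans are held out for testing.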

Step 5 — Debrief

Takeaways

1. Match algorithm capacity to dataset size. Climb the ladder only when the data justifies it.
2. Engineered voltammogram features (position, height, width, reversibility ratio, slope) beat raw-signal deep learning at the dataset sizes most labs work with.
3. If unsupervised PCA fails to separate your classes, try a supervised method (PLS-DA, LDA) before concluding it is a sensor problem.
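Takeaway 3 can be illustrated with a toy case in which the largest variance is unrelated to class. The sketch below is a two-class, two-feature simplification with made-up data, not the tutorial dataset: the first principal component follows the nuisance variance, while a Fisher LDA direction, which uses the labels, recovers the separation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40
# Axis 0: large nuisance variance shared by both classes.
# Axis 1: small variance, but this is where the classes differ.
X0 = rng.normal([0.0, -1.0], [10.0, 0.3], size=(n, 2))
X1 = rng.normal([0.0, 1.0], [10.0, 0.3], size=(n, 2))
X = np.vstack([X0, X1])
y = np.repeat([0, 1], n)
Xc = X - X.mean(axis=0)

# Unsupervised: first principal direction via SVD.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Vt[0]

# Supervised: Fisher LDA direction, w proportional to Sw^-1 (mean1 - mean0).
Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)
w = np.linalg.solve(Sw, X1.mean(axis=0) - X0.mean(axis=0))
w /= np.linalg.norm(w)

def separation(direction):
    """Gap between class means along a direction, in pooled standard deviations."""
    p = Xc @ direction
    gap = abs(p[y == 1].mean() - p[y == 0].mean())
    spread = np.sqrt(0.5 * (p[y == 0].var() + p[y == 1].var()))
    return gap / spread

print(separation(pc1), separation(w))
```

The sensor signal contains the class information in both cases; only the supervised projection looks for it.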

Tell the room

Which single feature gave you the biggest accuracy jump when you turned it on? Vote at menti.com, code: 6320 5978