ISOEN 2026 — Tutorial 6 — Pillar 1: Selectivity in electrochemical sensor arrays
In the next 5–10 minutes you will classify four overlapping electroactive analytes using three approaches in sequence: raw-signal PCA, engineered features, and a small neural network. You will see directly how algorithm choice interacts with dataset size — the central message of this tutorial.
Everything runs in your browser. No login, no install, no data leaves your device. Work at your own pace.
The dataset: eighty simulated cyclic voltammograms, 20 each of dopamine, serotonin, ascorbic acid, and uric acid, on a microelectrode at physiological pH. The simulation includes realistic peak overlap, electrode-to-electrode variability, and noise.
The forward-scan oxidation peaks overlap. Selectivity has to come from the algorithm, not from peak height alone.
First attempt: throw the entire voltammogram (200 sampled current points per scan) at PCA and ask whether the classes separate in the first two components.
Some structure appears, but PCA is unsupervised: in 200 dimensions it finds the directions of largest variance, which are not necessarily the directions that separate the classes, and noise takes up part of that variance. You can usually see partial overlap between at least two of the four classes.
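If you want to reproduce the raw-signal PCA step offline, a minimal sketch follows. It is not the code behind this page: the stand-in data (Gaussian peaks, the potential window, the peak centers, the noise level) are made-up assumptions chosen only to give four overlapping classes of 20 scans with 200 points each.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)                      # center each current sample
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T            # coordinates in PC space
    explained = S**2 / np.sum(S**2)              # variance fraction per component
    return scores, explained[:n_components]

# Hypothetical stand-in data, NOT the tutorial's simulator.
rng = np.random.default_rng(0)
E = np.linspace(-0.2, 0.8, 200)                  # assumed potential axis (V)
centers = [0.20, 0.30, 0.25, 0.35]               # made-up peak potentials
X = np.vstack([
    np.exp(-((E - c + rng.normal(0, 0.01)) / 0.08) ** 2)
    + rng.normal(0, 0.05, E.size)
    for c in centers for _ in range(20)
])
labels = np.repeat(np.arange(4), 20)             # class index for each scan

scores, explained = pca_project(X)
```

Plotting `scores[:, 0]` against `scores[:, 1]`, colored by `labels`, gives the kind of two-component view shown above. Note that `labels` never enters `pca_project`, which is why good class separation is not guaranteed.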
Now reduce each voltammogram to a handful of interpretable, chemistry-aware features. Toggle which features to include. The projection and the classifier accuracy update live.
Peak position: potential at maximum forward current.
Peak height: baseline-corrected current at peak.
Peak width: full width at half-maximum.
Reversibility ratio: reverse-peak depth divided by forward-peak height (0 = irreversible, 1 = fully reversible).
Forward slope: di/dE on the rising edge of the peak.
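The five feature definitions above can be sketched in a few lines of NumPy. This is one possible reading, not the page's implementation: the function name `extract_features`, the flat baseline taken from the first `n_base` points, the choice of the steepest point as the "rising-edge" slope, and the assumption that forward and reverse scans share one potential grid are all illustrative.

```python
import numpy as np

def extract_features(E, i_fwd, i_rev, n_base=10):
    """Five shape features from one voltammogram (forward and reverse scans)."""
    baseline = np.median(i_fwd[:n_base])   # assumed flat baseline before the wave
    y = i_fwd - baseline                   # baseline-corrected forward current
    k = int(np.argmax(y))                  # index of the forward peak
    height = float(y[k])
    half = height / 2.0

    # Walk outward from the peak until the current drops below half-maximum.
    lo = k
    while lo > 0 and y[lo - 1] >= half:
        lo -= 1
    hi = k
    while hi < len(y) - 1 and y[hi + 1] >= half:
        hi += 1

    # Reverse-peak depth measured below the same baseline, never negative.
    depth = max(0.0, float(baseline - np.min(i_rev)))
    # Steepest di/dE between the start of the scan and the peak.
    slope = float(np.max(np.gradient(i_fwd, E)[: k + 1]))

    return {
        "peak_position": float(E[k]),
        "peak_height": height,
        "peak_width": float(E[hi] - E[lo]),
        "reversibility_ratio": depth / height if height > 0 else 0.0,
        "forward_slope": slope,
    }

# Hypothetical noise-free example: one forward wave, one half-size reverse wave.
E = np.linspace(-0.2, 0.8, 200)
i_fwd = np.exp(-((E - 0.30) / 0.10) ** 2)
i_rev = -0.5 * np.exp(-((E - 0.25) / 0.10) ** 2)
feats = extract_features(E, i_fwd, i_rev)
```

On noisy scans you would probably smooth the current before taking the gradient and the half-maximum crossings; that step is left out here to keep the definitions visible.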
A few well-chosen features beat 200-dimensional PCA. The chemistry lives in the shape of the wave, not in every individual current sample.
Last attempt: feed the raw 200-point waveform into a small multilayer perceptron (200 → 16 → 4, about 3,300 parameters). No feature engineering — let the network learn from the raw signal. Same train/test split as before.
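To see where "about 3,300 parameters" comes from, here is a small sketch that builds the 200 → 16 → 4 layer shapes and counts weights and biases. The ReLU hidden activation, the softmax output, and the initialization scale are assumptions; the page does not state them.

```python
import numpy as np

def init_mlp(sizes, rng):
    """One (weights, biases) pair per layer; He-style scaling is an assumption."""
    return [
        (rng.normal(0.0, np.sqrt(2.0 / m), (m, n)), np.zeros(n))
        for m, n in zip(sizes[:-1], sizes[1:])
    ]

def count_params(params):
    """Total number of trainable weights and biases."""
    return sum(W.size + b.size for W, b in params)

def forward(params, X):
    """Class probabilities: ReLU hidden layers (assumed), softmax output."""
    h = X
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)
    W, b = params[-1]
    logits = h @ W + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

params = init_mlp([200, 16, 4], np.random.default_rng(0))
n_params = count_params(params)   # 200*16 + 16 + 16*4 + 4 = 3284
probs = forward(params, np.zeros((5, 200)))
```

With 3,284 parameters and 80 scans in total, the network has roughly 41 parameters per voltammogram before any data is held out for testing.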
Same data, same train/test split. The MLP usually fits the training set almost perfectly and stumbles on the test set: classic overfitting at small N, with roughly 3,300 parameters fitted to at most 80 scans. The engineered features win. This is exactly what Pillar 1 of the tutorial predicted, and it is the regime almost every electrochemical-array lab works in.
Click Re-train MLP a few times: the engineered classifier holds its accuracy, while the MLP wobbles by 10–20 percentage points depending on the random initialization. That instability is itself a warning sign.
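The Re-train experiment can be mimicked offline by training the same small network from several random seeds and looking at the spread in test accuracy. The sketch below is an assumption-laden stand-in, not the page's code: the synthetic data, the 15/5 per-class split, the ReLU activation, and the plain full-batch gradient descent with `epochs=200` and `lr=0.05` are all made up for illustration, so the numbers it produces will not match the demo.

```python
import numpy as np

def train_mlp(Xtr, ytr, seed, hidden=16, epochs=200, lr=0.05):
    """Full-batch gradient descent on mean cross-entropy (assumed settings)."""
    rng = np.random.default_rng(seed)
    n_in, n_out = Xtr.shape[1], int(ytr.max()) + 1
    W1 = rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, np.sqrt(2.0 / hidden), (hidden, n_out))
    b2 = np.zeros(n_out)
    Y = np.eye(n_out)[ytr]                       # one-hot targets
    for _ in range(epochs):
        H = np.maximum(Xtr @ W1 + b1, 0.0)       # ReLU hidden layer
        Z = H @ W2 + b2
        P = np.exp(Z - Z.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)        # softmax probabilities
        dZ = (P - Y) / len(Xtr)                  # gradient w.r.t. logits
        dH = (dZ @ W2.T) * (H > 0)               # backprop through ReLU
        W2 -= lr * (H.T @ dZ)
        b2 -= lr * dZ.sum(axis=0)
        W1 -= lr * (Xtr.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return W1, b1, W2, b2

def accuracy(params, X, y):
    W1, b1, W2, b2 = params
    pred = np.argmax(np.maximum(X @ W1 + b1, 0.0) @ W2 + b2, axis=1)
    return float(np.mean(pred == y))

def seed_spread(Xtr, ytr, Xte, yte, seeds):
    """Test accuracy per seed, plus the max-minus-min spread."""
    accs = np.array([accuracy(train_mlp(Xtr, ytr, s), Xte, yte) for s in seeds])
    return accs, float(accs.max() - accs.min())

# Hypothetical stand-in data: 4 classes x 20 scans x 200 points.
rng = np.random.default_rng(0)
E = np.linspace(-0.2, 0.8, 200)
centers = [0.20, 0.30, 0.25, 0.35]               # made-up peak potentials
X = np.vstack([
    np.exp(-((E - c + rng.normal(0, 0.01)) / 0.08) ** 2)
    + rng.normal(0, 0.05, E.size)
    for c in centers for _ in range(20)
])
y = np.repeat(np.arange(4), 20)

idx = np.arange(80).reshape(4, 20)               # rows = classes
train, test = idx[:, :15].ravel(), idx[:, 15:].ravel()   # assumed 15/5 split
accs, spread = seed_spread(X[train], y[train], X[test], y[test], seeds=range(5))
```

A large `spread` relative to the mean of `accs` is the offline equivalent of the wobble you see when clicking Re-train MLP: the data and split are fixed, so the only thing changing is the initialization.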
Which single feature gave you the biggest accuracy jump when you turned it on? Vote at:
menti.com
code: 6320 5978