Recovering Coin Segmentation from Low-Light Images

— a Classical Pipeline that Beats the Dark

Overview

Take a perfectly clean photograph of a few coins on a desk. Run Otsu's thresholding on it and you get a near-perfect binary mask of every coin. Now turn off the lights, drop the exposure, and let the sensor noise creep in. Run the same Otsu thresholding again. The mask is garbage.

This project asks one tightly-scoped question: can a small toolbox of classical image-processing operators rescue segmentation accuracy when the input image is severely underexposed?

No deep learning. No labelled dataset. Just Gamma Correction, CLAHE, Histogram Equalisation, and the four classical evaluation metrics that quantify whether any of it is actually working.

Code on GitHub: github.com/mraknar/low-light-coin-segmentation

The Problem

Low-light photography breaks classical segmentation for three compounding reasons:

Brightness collapses. Most of the image lives in the bottom of the histogram, where small absolute differences map to indistinguishable pixel values.
Contrast disappears. Object boundaries — the actual signal Otsu's thresholding needs to do its job — wash out.
Sensor noise dominates. What little signal remains gets swamped by Gaussian-ish noise from the imaging chip.

Otsu's algorithm assumes a roughly bimodal intensity histogram (one peak for background, one for foreground). In a low-light image, the histogram is a single compressed lump — and Otsu picks a threshold somewhere inside the noise. The result looks like static.
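To make the bimodal assumption concrete, here is a minimal NumPy sketch of Otsu's criterion: it scans all 256 candidate thresholds and keeps the one that maximises the between-class variance. The function name `otsu_threshold` is illustrative and not taken from the project repository; the input is assumed to be a single-channel uint8 image.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Return the level t that maximises between-class variance,
    splitting pixels into {<= t} and {> t}. Expects a uint8 image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256, dtype=np.float64)

    w0 = np.cumsum(prob)               # fraction of pixels at or below each level
    w1 = 1.0 - w0                      # fraction of pixels above each level
    mu_cum = np.cumsum(prob * levels)  # cumulative (unnormalised) mean
    mu_total = mu_cum[-1]              # global mean intensity

    denom = w0 * w1
    valid = denom > 1e-12              # skip splits that leave one class empty
    sigma_b = np.zeros(256)
    sigma_b[valid] = (mu_total * w0[valid] - mu_cum[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))


# Two clean populations: dark background (50) and bright coins (200).
bimodal = np.concatenate([np.full(500, 50), np.full(500, 200)]).astype(np.uint8)
t = otsu_threshold(bimodal)
mask = bimodal > t  # True for the bright population
```

With two well-separated populations, any threshold between them separates the classes cleanly. When the histogram is a single lump, the variance curve has no pronounced peak and the chosen threshold is driven largely by noise.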

Setting Up a Fair Experiment

Comparing "before" and "after" needs a careful setup, because we don't have a hand-labelled dataset of coin photos and we don't have paired low-light/well-lit shots of the same scene. So I built one synthetically:

Start with a well-lit coin image.
Run Otsu + morphology on it to produce a gold-standard segmentation mask. This is the "ground truth" — it's what the segmentation should look like, generated under ideal conditions.
Apply synthetic darkening: I_dark = (I/255)^γ × 255 with γ = 4.0, then add Gaussian noise with σ = 25. This is an aggressive transformation: the resulting image is genuinely hard to read, even by eye.
Apply each enhancement candidate to the dark image.
Run the same Otsu + morphology pipeline on every variant and compare against the ground-truth mask.
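The darkening step above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the repository's code: the noise is assumed to be added on the 0 to 255 intensity scale, the result is clipped back to uint8, and the `seed` parameter is a hypothetical addition for reproducibility.

```python
import numpy as np


def darken(img: np.ndarray, gamma: float = 4.0, sigma: float = 25.0,
           seed: int = 0) -> np.ndarray:
    """Simulate underexposure: I_dark = (I/255)^gamma * 255, plus Gaussian noise."""
    rng = np.random.default_rng(seed)            # seed is an assumption, for repeatability
    norm = img.astype(np.float64) / 255.0
    dark = np.power(norm, gamma) * 255.0         # gamma > 1 crushes mid-tones toward black
    noisy = dark + rng.normal(0.0, sigma, size=img.shape)
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)


# A mid-grey patch: 128 maps to about 16 before noise is added.
patch = np.full((4, 4), 128, dtype=np.uint8)
noise_free = darken(patch, sigma=0.0)
```

A mid-grey value of 128 lands at about 16 after the power law, so noise with σ = 25 is larger than most of the remaining signal.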

Why is this fair? Because the ground truth is built using exactly the algorithm we're trying to evaluate — so any difference in the resulting masks is attributable to enhancement quality, not to a different segmentation strategy. We're not comparing Otsu against U-Net. We're comparing Otsu-on-dark against Otsu-on-enhanced, against Otsu-on-original.

The Four Enhancers

I deliberately chose four candidates that operate on different parts of the image-intensity model:

1) Gamma Correction (γ = 0.4)

The simplest possible enhancer: I_bright = (I/255)^γ × 255. With γ < 1, dark values get pulled up disproportionately. Pre-computed as a 256-entry lookup table, it runs in microseconds. Great for globally underexposed images. Useless when the histogram is genuinely uniform (it just shifts the lump up).
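A lookup-table version of gamma correction might look like the sketch below. It assumes the convention out = 255 × (in/255)^γ, the same form as the darkening step, so that γ = 0.4 brightens. The helper names are illustrative.

```python
import numpy as np


def gamma_lut(gamma: float = 0.4) -> np.ndarray:
    """Build a 256-entry table for out = 255 * (in/255)^gamma."""
    levels = np.arange(256, dtype=np.float64) / 255.0
    table = np.rint(np.power(levels, gamma) * 255.0)
    return np.clip(table, 0, 255).astype(np.uint8)


def apply_gamma(img: np.ndarray, gamma: float = 0.4) -> np.ndarray:
    """Apply the table by fancy indexing; works for grey or colour uint8 arrays."""
    return gamma_lut(gamma)[img]


lut = gamma_lut(0.4)
dark_pixel_out = int(lut[16])  # a value of 16 is lifted to roughly 84
```

The table is built once and indexed per pixel, so the cost of the power function is paid 256 times regardless of image size.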

2) CLAHE (Contrast Limited Adaptive Histogram Equalisation)

Cuts the image into an 8×8 grid of tiles. For each tile it computes a local histogram, clips it at a contrast limit (so noise doesn't get amplified into spurious detail), redistributes the clipped mass, and equalises. Bilinear interpolation across tile boundaries prevents block artefacts. Operates on the L channel of LAB to preserve colour. The strongest single tool for non-uniform lighting.

3) Histogram Equalisation

The classic global histogram stretch — but applied to the Y channel of YCrCb to keep the chroma intact. Effective when the dynamic range is genuinely compressed; tends to over-do it (and amplify noise) when the original distribution was already reasonable.
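For reference, global histogram equalisation on a single channel can be written directly from the cumulative histogram. The sketch below assumes the Y plane has already been extracted (for example with `cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)`); the function name is illustrative.

```python
import numpy as np


def equalize_channel(channel: np.ndarray) -> np.ndarray:
    """Map a uint8 channel through its normalised cumulative histogram."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = np.cumsum(hist)
    cdf_min = cdf[cdf > 0][0]          # CDF at the darkest occupied level
    total = channel.size
    if total == cdf_min:               # constant image: nothing to stretch
        return channel.copy()
    lut = np.rint((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[channel]


# Four levels squeezed into 10..40 get spread across the full range.
squeezed = np.array([[10, 20], [30, 40]], dtype=np.uint8)
stretched = equalize_channel(squeezed)
```

Because the mapping depends only on pixel rank, noise in a compressed histogram is stretched by the same factor as the signal.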

4) Combined (Gamma → CLAHE)

Gamma correction first to fix global brightness, then CLAHE to sharpen local contrast. Two-stage pipeline. Usually the most consistent performer, though it occasionally over-processes images that didn't need both stages.

Evaluation: Four Metrics, Different Stories

I evaluated every (image × method) pair against ground truth using:

Intersection over Union (IoU) — strict overlap. The metric that matters most.
Dice Coefficient — F1 for binary masks. Less punishing than IoU.
Precision — "of what I called coin, how much really is?"
Recall — "of all coin pixels, how many did I catch?"

Each tells a different story. A method with high recall and low precision is over-segmenting (labelling background as coin). A method with high precision and low recall is under-segmenting (missing parts of coins). IoU is the strict joint measure.
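All four metrics fall out of the same three pixel counts. A minimal sketch, assuming both masks are arrays of the same shape where non-zero means "coin"; returning 0.0 for an empty denominator is a convention chosen here, not something the post specifies.

```python
import numpy as np


def mask_metrics(pred: np.ndarray, truth: np.ndarray) -> dict:
    """IoU, Dice, precision and recall for two binary masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    tp = int(np.count_nonzero(pred & truth))    # coin pixels correctly found
    fp = int(np.count_nonzero(pred & ~truth))   # background labelled as coin
    fn = int(np.count_nonzero(~pred & truth))   # coin pixels missed

    def ratio(num: int, den: int) -> float:
        return num / den if den else 0.0        # assumed convention for empty masks

    return {
        "iou": ratio(tp, tp + fp + fn),
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


# Ground truth covers columns 0-1, prediction covers columns 1-2.
truth = np.zeros((4, 4), dtype=np.uint8)
truth[:, 0:2] = 1
pred = np.zeros((4, 4), dtype=np.uint8)
pred[:, 1:3] = 1
scores = mask_metrics(pred, truth)
```

In this example half of each mask overlaps, giving precision, recall and Dice of 0.5 but an IoU of only 1/3. Dice equals 2·IoU / (1 + IoU), which is why it always reads at least as high as IoU.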

What I Found

The headline numbers, on a coin dataset of ~180 images:

Method	Avg IoU	Notes
Dark (no enhancement)	≈ 0.30	Otsu has effectively given up — the mask is mostly noise.
Gamma alone	≈ 0.65	Big jump from baseline. Globally brightens, but local contrast is still flat.
CLAHE alone	≈ 0.78	The strongest single method. Local contrast does the heavy lifting.
Histogram-EQ	≈ 0.55	Over-amplifies noise on already-noisy inputs.
Combined (γ + CLAHE)	≈ 0.79	Most consistent across images. Average improvement vs dark: +143%.

The most interesting finding wasn't the best method but the failure modes:

Histogram-EQ amplifies noise. Because it spreads the histogram aggressively, every tiny noise speckle gets a brighter neighbourhood. Otsu then trips on speckle-class boundaries.
CLAHE can over-segment very dark regions. When it boosts local contrast in a region that should be background, it hallucinates structure.
Gamma alone is too gentle. It restores brightness but leaves the histogram compressed. Otsu still struggles to find a clean threshold.
Combined is the most reliable, but it's also the slowest and the most prone to over-processing.

What I Learned

Synthetic darkening is a powerful experimental tool. I didn't need a paired low-light dataset; I built my own controlled experiment by darkening well-lit images and using their original Otsu masks as ground truth. This is a practical way to approach evaluation when labelled data is scarce.
Otsu's thresholding is more fragile than it looks. It's an excellent algorithm under bimodal conditions, and it gracefully degrades to nothing when those conditions aren't met. Half the work in this project was recreating the bimodal histogram that Otsu needs.
CLAHE punches well above its weight. Of the four enhancers I tested, CLAHE alone often matched the combined pipeline. Local contrast enhancement is doing more of the work than global brightness correction.
Morphology is non-negotiable. Every mask in this project went through opening (remove specks) + closing (fill holes) with a small elliptical kernel. Without that step, even a "good" Otsu mask looks ragged.

Resources

💻 GitHub Repository — full pipeline (Python + OpenCV)

You can view and download my presentation PDF file below.

Download presentation
