Eigenfaces — User Guide & Technical Documentation
A complete interactive visualization of the Eigenfaces face recognition system,
built on real sklearn.decomposition.PCA running over the AT&T Olivetti
Faces dataset — 400 real face images, 40 identities, all processed through
Principal Component Analysis and visualized in the browser.
Quick Start
Four short commands in Git Bash are all you need. Make sure all project files are in the same folder before starting.
pip install scikit-learn numpy pillow
Downloads scikit-learn (includes sklearn PCA, the Olivetti dataset loader), numpy for array math, and Pillow for image resizing. Only needed once.
cd /c/Users/YourName/Downloads/eigenfaces
python generate_eigenfaces.py
The script fetches the real Olivetti faces, runs PCA(n_components=50).fit(X), encodes results, and generates the HTML file. Takes 15–30 seconds.
start eigenfaces_olivetti.html
Opens the self-contained visualization in your default browser. No server, no internet required after generation.
Open eigenfaces_sklearn.html right now — it contains the full interface loaded with simulated data so you can explore before running anything.
Windows Setup
Detailed step-by-step guide for running the Eigenfaces pipeline on Windows using Git Bash. The script downloads the AT&T Olivetti dataset, runs real sklearn PCA, and generates the self-contained HTML visualization — all from five simple commands.
Pipeline Overview
sklearn runs on your machine, fetches the real AT&T Olivetti dataset, computes PCA, then bakes everything into a self-contained HTML file you open in any browser.
Setup Steps — Git Bash on Windows
Open Git Bash and verify Python works. You need Python 3.7 or higher.
Git Bash
python --version
# Python 3.11.x
# If that fails, try:
python3 --version
Download from python.org/downloads — check "Add Python to PATH" during install, then restart Git Bash.
Git Bash uses Unix-style paths — drive letter becomes /c/ instead of C:\
Git Bash
# Change to match where you saved the files:
cd /c/Users/YourName/Downloads/eigenfaces
# Verify all files are there:
ls
generate_eigenfaces.py eigenfaces_template.html eigenfaces_sklearn.html
Only needs to be done once. Pillow is optional — the script has a numpy fallback if it's missing.
Git Bash
pip install scikit-learn numpy pillow
If pip is not found:
python -m pip install scikit-learn numpy pillow
Downloads the real Olivetti dataset (~5 MB from figshare.com), runs sklearn PCA, and generates the HTML. Takes about 15–30 seconds total.
Git Bash
python generate_eigenfaces.py
Expected output:
[1/5] Fetching Olivetti faces dataset via sklearn...
X.shape = (400, 4096)
[2/5] Running PCA(n_components=50).fit(X)...
[3/5] Downsampling 64×64 → 32×32...
[4/5] Encoding to base64...
[5/5] Building eigenfaces_olivetti.html...
✓ eigenfaces_olivetti.html (2100 KB)
eigenfaces_olivetti.html will appear in the same folder with all 400 real Olivetti faces embedded — no internet required to open it.
The generated HTML is fully self-contained — no server, no dependencies needed.
Git Bash
start eigenfaces_olivetti.html
Alternative:
# Double-click the file in Windows Explorer
# Or drag it into Chrome / Firefox
Troubleshooting
Try python3 instead of python, or reinstall Python from python.org and check "Add Python to PATH".
Run python -m pip install scikit-learn numpy pillow instead.
Your network is blocking figshare.com. Try a different network, mobile hotspot, or VPN. The script only needs internet for step 4 (first run only — dataset is cached after that).
The script must be in the same folder as the template. Run ls to confirm both files are present in the current directory.
Open eigenfaces_sklearn.html right now — it works with simulated data and shows the full interface with no setup required.
Project Files
Four files make up the project. Keep them all in the same folder — the Python script looks for the template by relative path.
eigenfaces_template.html contains a /* __SKLEARN_DATA__ */ placeholder comment; the script replaces it with the real PCA JSON payload.

| File | Size | Created by | Role |
|---|---|---|---|
| generate_eigenfaces.py | ~3.5 KB | Provided | Run once to generate the HTML |
| eigenfaces_template.html | ~41 KB | Provided | HTML shell, keep alongside the script |
| eigenfaces_sklearn.html | ~2.1 MB | Provided | Preview — open immediately |
| eigenfaces_olivetti.html | ~2+ MB | Script output | Final file with real Olivetti data |
Workflow
The core challenge is that browsers cannot run Python. The solution is a build-time bridge: Python runs once on your machine, does all the heavy computation, then serializes the results into a self-contained HTML file the browser can display with pure JavaScript.
fetch_olivetti_faces() → PCA(...).fit(X) → base64 encode → template .replace() → browser renders with JS
Why not just run sklearn in the browser?
Browsers execute only JavaScript. Python libraries like sklearn, numpy,
and scipy cannot run client-side. Pyodide (Python-in-WebAssembly) could in theory
work but would require downloading a 30+ MB runtime and would be far slower.
The chosen approach — pre-compute in Python, embed in HTML — gives the best of both worlds: real sklearn results with zero browser dependencies. The generated HTML is completely self-contained and can be shared, emailed, or opened offline.
Data Flow Detail
| Stage | Where | Input | Output |
|---|---|---|---|
| fetch_olivetti_faces() | Python | Internet (figshare) | X (400,4096), y (400,) |
| PCA.fit(X) | Python/sklearn | X | components_, mean_, EVR, projections |
| Downsample + encode | Python/Pillow | 64×64 images | 32×32 uint8 → base64 strings |
| JSON injection | Python | All PCA results | eigenfaces_olivetti.html |
| Decode + render | Browser JS | base64 JSON | Canvas drawings, charts |
| Recognition | Browser JS | User interaction | Nearest-neighbor in eigenspace |
Methodology
Eigenfaces is a classical computer vision technique introduced by Turk and Pentland (1991). It applies Principal Component Analysis (PCA) to a dataset of face images, discovering the directions of greatest variance — the "eigenfaces" — and uses projection into this low-dimensional space for recognition.
The AT&T Olivetti Faces Dataset
Collected at AT&T Laboratories Cambridge between 1992–1994. Contains 400 grayscale photographs of 40 distinct subjects, with 10 images per person. Variations include different lighting conditions, facial expressions (open/closed eyes, smiling/neutral), and accessories (glasses or no glasses). Each image is 64×64 pixels = 4,096 dimensions.
| Property | Value |
|---|---|
| Total images | 400 |
| Identities | 40 subjects |
| Images per person | 10 |
| Image size | 64×64 pixels (original) · 32×32 (display) |
| Feature dimension D | 4,096 (64×64 flattened) |
| Color | Grayscale, uint8, range [0, 255] |
| sklearn access | fetch_olivetti_faces() |
Core Methodology Steps
Build the Data Matrix
Flatten each 64×64 face image into a 4,096-dimensional row vector. Stack all 400 images to form the matrix X ∈ ℝ⁴⁰⁰ˣ⁴⁰⁹⁶. Each row is one face; each column is one pixel across all faces.
Compute and Subtract the Mean Face
Calculate μ = (1/N) Σ xᵢ — the average face across all 400 images (pca.mean_). Subtract from every face: X_c = X − μ. This centers the data before SVD.
Singular Value Decomposition (SVD)
Decompose the centered matrix: X_c = U Σ Vᵀ. sklearn computes a truncated SVD internally (LAPACK's full SVD or a randomized solver, chosen automatically by data size). The right singular vectors Vᵀ are the eigenfaces — directions of maximal variance in face-space.
Select Top-k Eigenfaces
Retain only the top k singular vectors from Vᵀ. These become pca.components_ with shape (k, 4096). Each row reshaped to 64×64 is one eigenface — a "face-like" basis image capturing one major mode of variation.
Project All Faces into Eigenspace
For each face, compute its weights: wᵢ = (xᵢ − μ) · Vᵏᵀ. This is pca.transform(X), producing a (400, k) matrix of projections. Recognition operates entirely in this low-dimensional eigenspace.
Nearest-Neighbor Recognition
Project query face into eigenspace: w_q = pca.transform(query). Find the training face with minimum Euclidean distance: match = argmin ||W − w_q||₂. The identity of the matched face is the prediction.
Reconstruction
Reconstruct any face from its eigenspace weights: x̂ = w · Vᵏ + μ — equivalent to pca.inverse_transform(w). Using all k=50 components gives near-perfect reconstruction; fewer components give a blurrier approximation.
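The steps above can be run end-to-end with a few lines of sklearn. A minimal sketch, using small random data in place of the Olivetti faces so it runs instantly (shapes are illustrative, not the real 400×4096):

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for the face matrix X: 100 "faces", 256 "pixels" each
rng = np.random.default_rng(0)
X = rng.random((100, 256)).astype(np.float32)

k = 20
pca = PCA(n_components=k).fit(X)       # mean subtraction + truncated SVD
W = pca.transform(X)                   # project all faces: (100, k)

# Nearest-neighbor recognition; the query is face 0, so it matches itself
w_q = pca.transform(X[:1])             # (1, k) query projection
dists = np.linalg.norm(W - w_q, axis=1)
match = int(np.argmin(dists))          # index of the closest training face

# Reconstruction from eigenspace weights: x̂ = w·Vᵏ + μ
x_hat = pca.inverse_transform(w_q)     # (1, 256)
```

Swapping in the real `fetch_olivetti_faces()` data changes only the shapes; the calls are identical.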
Mathematics
SVD Decomposition
sklearn PCA centers the data matrix and applies truncated SVD:
X_c = X − μ = U Σ Vᵀ,  with U ∈ ℝ⁴⁰⁰ˣᵏ, Σ diagonal (k×k), Vᵀ ∈ ℝᵏˣ⁴⁰⁹⁶
Explained Variance Ratio
Each singular value σᵢ corresponds to one eigenface. The fraction of total variance it explains:
EVRᵢ = σᵢ² / Σⱼ σⱼ²
Projection (Encoding)
w = (x − μ) · Vᵏᵀ, computed by pca.transform(); one k-dimensional weight vector per face.
Reconstruction (Decoding)
x̂ = w · Vᵏ + μ, computed by pca.inverse_transform(); rebuilds a face from its weights.
Recognition Distance
d(q, i) = ‖wᵢ − w_q‖₂; the query is assigned the identity of the training face with minimum distance.
Projecting into eigenspace removes high-frequency noise and retains only the principal modes of variation — lighting, expression, pose. Two images of the same person cluster closer in this space than images of different people, making simple L2 distance effective for recognition.
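These identities can be checked numerically against sklearn's own outputs. A small sketch on toy data (any matrix works in place of the face data):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.random((50, 64))
pca = PCA(n_components=10).fit(X)

# Projection: w = (x − μ) · Vᵏᵀ reproduces pca.transform
W = (X - pca.mean_) @ pca.components_.T
assert np.allclose(W, pca.transform(X))

# Reconstruction: x̂ = w · Vᵏ + μ reproduces pca.inverse_transform
X_hat = W @ pca.components_ + pca.mean_
assert np.allclose(X_hat, pca.inverse_transform(W))

# EVR: each component's variance divided by the total variance of X
total_var = np.var(X, axis=0, ddof=1).sum()
assert np.allclose(pca.explained_variance_ / total_var,
                   pca.explained_variance_ratio_)
```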
Python Script Explained
generate_eigenfaces.py runs in five stages. Each stage is clearly printed
to the terminal so you can follow progress.
Stage 1 — Fetch the Dataset
sklearn downloads the AT&T Olivetti dataset from figshare.com (~5 MB) and caches it locally in ~/scikit_learn_data/. Future runs use the cache.
Python · generate_eigenfaces.py · Stage 1
from sklearn.datasets import fetch_olivetti_faces
dataset = fetch_olivetti_faces(shuffle=False)
X = dataset.data # (400, 4096) float32 — flattened images, range [0,1]
y = dataset.target # (400,) int — identity labels 0..39
images = dataset.images # (400,64,64) float32 — 2D pixel arrays
Stage 2 — Run Real sklearn PCA
PCA.fit(X) centers X by its mean, then computes a truncated SVD. All results are standard sklearn attributes.
Python · generate_eigenfaces.py · Stage 2
from sklearn.decomposition import PCA
K = 50
pca = PCA(n_components=K, whiten=False)
pca.fit(X)
# Standard sklearn PCA attributes:
eigenfaces = pca.components_ # (50, 4096) — right singular vectors
mean_face = pca.mean_ # (4096,) — mean over training set
evr = pca.explained_variance_ratio_ # (50,) — variance per component
singular_v = pca.singular_values_ # (50,) — the σᵢ from the SVD
# Project all 400 faces into eigenspace:
projections = pca.transform(X) # (400, 50) — real sklearn output
Stage 3 — Downsample for Display
Original images are 64×64 but displayed at 32×32 to keep the HTML file size manageable (~2 MB vs ~8 MB). Pillow's LANCZOS filter gives high quality downsampling.
Python · generate_eigenfaces.py · Stage 3
from PIL import Image as PILImage
import numpy as np

def normalize(arr):
    # Scale an eigenface (which has negative entries) into [0,1] for display
    return (arr - arr.min()) / (arr.max() - arr.min() + 1e-8)

def resize32(arr_64):
    u8 = (arr_64.clip(0,1) * 255).astype('uint8')
    pil = PILImage.fromarray(u8, mode='L').resize((32,32), PILImage.LANCZOS)
    return np.array(pil, dtype='float32') / 255.0

faces_32 = np.stack([resize32(images[i]) for i in range(400)])     # (400,32,32)
mean_32 = resize32(mean_face.reshape(64,64))                       # (32,32)
ef_32 = np.stack([resize32(normalize(eigenfaces[i]).reshape(64,64))
                  for i in range(K)])                              # (50,32,32)
Stage 4 — Encode to base64
Float arrays are quantized to uint8 and base64-encoded so they can be embedded as strings inside the HTML/JavaScript. The browser decodes them back with atob().
Python · generate_eigenfaces.py · Stage 4
import base64, json
def b64(arr):
u8 = (arr.clip(0,1) * 255).astype('uint8')
return base64.b64encode(u8.tobytes()).decode('ascii')
payload = {
    "img_w": 32, "img_h": 32,                      # display size, read by the JS
    "faces_b64": b64(faces_32.reshape(400,-1)),    # 400 × 1024 pixels
    "mean_b64": b64(mean_32.flatten()),            # 1024 pixels
    "ef_b64": b64(ef_32.reshape(K,-1)),            # 50 × 1024 pixels
    "projections": projections.tolist(),           # (400, 50) full precision
    "evr": evr.tolist(),                           # (50,)
    "labels": y.tolist(),                          # (400,) identity IDs
}
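The cost of this uint8 quantization can be sanity-checked in isolation. A sketch where `unb64` mirrors what the browser's atob() decoding does (`unb64` is a name introduced here for illustration):

```python
import base64
import numpy as np

def b64(arr):
    # Quantize [0,1] floats to uint8 and base64-encode, as in Stage 4
    u8 = (arr.clip(0, 1) * 255).astype('uint8')
    return base64.b64encode(u8.tobytes()).decode('ascii')

def unb64(s, shape):
    # Reverse the encoding: base64 → bytes → uint8 → [0,1] floats
    u8 = np.frombuffer(base64.b64decode(s), dtype=np.uint8)
    return u8.reshape(shape).astype('float32') / 255.0

faces = np.random.default_rng(2).random((4, 1024)).astype('float32')
restored = unb64(b64(faces), faces.shape)

# Quantization loses less than one gray level (< 1/255 ≈ 0.004) per pixel
assert float(np.abs(restored - faces).max()) < 1/255 + 1e-6
```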
Stage 5 — Inject into HTML Template
The template contains a single placeholder comment. Python replaces it with the full JSON payload — resulting in a self-contained HTML file.
Python · generate_eigenfaces.py · Stage 5
json_str = json.dumps(payload, separators=(',',':')) # compact JSON
html = open('eigenfaces_template.html', encoding='utf-8').read()
# Replace placeholder with real sklearn data:
html = html.replace(
'/* __SKLEARN_DATA__ */',
f'const DATA = {json_str};'
)
open('eigenfaces_olivetti.html', 'w', encoding='utf-8').write(html)
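The replacement step can be verified on a stand-in template string (a sketch; the tiny payload here is illustrative, not the real Stage 4 payload):

```python
import json

template = '<script>/* __SKLEARN_DATA__ */</script>'   # stand-in for the template file
payload = {"evr": [0.2, 0.1], "labels": [0, 1]}        # tiny example payload
json_str = json.dumps(payload, separators=(',', ':'))
html = template.replace('/* __SKLEARN_DATA__ */', f'const DATA = {json_str};')

assert '__SKLEARN_DATA__' not in html    # placeholder fully replaced
assert 'const DATA =' in html            # data is now a native JS constant
```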
HTML Visualization Explained
The HTML file is pure vanilla JavaScript — no frameworks, no external dependencies.
All face rendering uses the HTML5 <canvas> API.
Decoding Embedded Data
When the page loads, the injected DATA constant is decoded back from base64 to pixel arrays.
JavaScript · eigenfaces_template.html · Data Decoding
// DATA is injected by Python — the const is already in the HTML
const FW = DATA.img_w, FH = DATA.img_h, FD = FW * FH; // 32, 32, 1024
function decodeU8B64(b64, rows, cols) {
const raw = atob(b64); // base64 → raw bytes
const u8 = new Uint8Array(raw.length);
for (let i = 0; i < raw.length; i++)
u8[i] = raw.charCodeAt(i);
const arr = [];
for (let i = 0; i < rows; i++) {
const row = new Float32Array(cols);
for (let j = 0; j < cols; j++)
row[j] = u8[i*cols + j] / 255; // uint8 → [0,1]
arr.push(row);
}
return arr;
}
const ALL_FACES = decodeU8B64(DATA.faces_b64, 400, FD);
const MEAN_FACE = decodeU8B64(DATA.mean_b64, 1, FD)[0];
const EIGENFACES = decodeU8B64(DATA.ef_b64, 50, FD);
const PROJECTIONS = DATA.projections; // (400, 50) full float64 from sklearn
const LABELS = DATA.labels; // (400,) identity IDs, used by recognition
Drawing a Face on Canvas
Every face image — original, eigenface, reconstructed — uses the same canvas drawing function.
JavaScript · Face Rendering
function drawFace(canvas, face, w, h, displayW, displayH) {
canvas.width = displayW; canvas.height = displayH;
const ctx = canvas.getContext('2d');
// Normalize to [0,1] for rendering (handles negative eigenface values)
let mn = Infinity, mx = -Infinity;
for (const v of face) { if (v < mn) mn=v; if (v > mx) mx=v; }
// Write pixel-by-pixel into ImageData at native resolution
const img = ctx.createImageData(w, h);
for (let i = 0; i < w*h; i++) {
const v = Math.round((face[i]-mn)/(mx-mn+1e-8) * 255);
img.data[i*4] = v; // R
img.data[i*4+1] = v; // G
img.data[i*4+2] = v; // B
img.data[i*4+3] = 255; // A (fully opaque)
}
// Scale up from 32×32 → display size with pixelated rendering
const tmp = document.createElement('canvas');
tmp.width=w; tmp.height=h;
tmp.getContext('2d').putImageData(img, 0, 0);
ctx.drawImage(tmp, 0, 0, displayW, displayH);
}
Reconstruction in JavaScript
The browser re-implements pca.inverse_transform() using the pre-computed projections and eigenfaces from sklearn.
JavaScript · Face Reconstruction
function reconstructFace(faceIdx, k) {
// Start from mean face (equivalent to pca.mean_)
const recon = new Float32Array(FD);
for (let j=0; j<FD; j++) recon[j] = MEAN_FACE[j];
// Add weighted eigenfaces: Σᵢ wᵢ · efᵢ
for (let ki=0; ki<k; ki++) {
const w = PROJECTIONS[faceIdx][ki]; // real sklearn projection weight
const ef = EIGENFACES[ki]; // real sklearn eigenface
for (let j=0; j<FD; j++) recon[j] += w * ef[j];
}
return recon;
}
Nearest-Neighbor Recognition
The recognition step projects the (noisy) query face into eigenspace using the stored eigenfaces, then finds the nearest training projection by Euclidean distance.
JavaScript · Recognition
function recognize(queryFace, k) {
// 1. Center the query face
const centered = queryFace.map((v, j) => v - MEAN_FACE[j]);
// 2. Project: w_q = (query - mean) · Vᵏᵀ
const qProj = EIGENFACES.slice(0, k).map(ef => {
let d = 0;
for (let j=0; j<FD; j++) d += centered[j] * ef[j];
return d;
});
// 3. Find nearest training projection by L2 distance
let minDist = Infinity, bestIdx = 0;
PROJECTIONS.forEach((p, i) => {
let d = 0;
for (let ki=0; ki<k; ki++) d += (p[ki] - qProj[ki]) ** 2;
const dist = Math.sqrt(d);
if (dist < minDist) { minDist = dist; bestIdx = i; }
});
return { match: bestIdx, dist: minDist, identity: LABELS[bestIdx] };
}
The sklearn Bridge Pattern
This project demonstrates a general pattern for bringing any Python ML computation into a static HTML page. The pattern has three parts:
Python computes, JavaScript renders
All heavy math (SVD, PCA, projection) happens in Python at build time. The browser only handles display and user interaction — tasks JavaScript does well with no performance issues.
Placeholder injection
The HTML template has a single placeholder /* __SKLEARN_DATA__ */ inside a <script> tag. Python replaces it with const DATA = {...}; — making the data a native JavaScript constant, not a runtime fetch.
Zero runtime dependencies
The generated HTML imports no external scripts, makes no HTTP requests, and needs no server. It works offline, can be emailed, shared via USB, or hosted on any static file server.
Replace fetch_olivetti_faces() with your own dataset. Replace PCA with any sklearn estimator. Serialize what your JavaScript needs (weights, predictions, embeddings). Inject via the same replace(placeholder, json_str) pattern.
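As an illustration of that generalization, here is a hedged sketch swapping in KMeans; the placeholder name, payload keys, and data are all hypothetical:

```python
import json
import numpy as np
from sklearn.cluster import KMeans

X = np.random.default_rng(3).random((60, 8))        # your own dataset here
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Serialize only what the JavaScript side needs
payload = {
    "centers": km.cluster_centers_.tolist(),
    "labels": km.labels_.tolist(),
}
template = '<script>/* __MODEL_DATA__ */</script>'  # hypothetical placeholder
html = template.replace('/* __MODEL_DATA__ */',
                        f'const DATA = {json.dumps(payload)};')
```

The browser side then reads `DATA.centers` and `DATA.labels` exactly as the eigenfaces page reads its payload.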
Interface Guide
The visualization has a fixed two-column layout: a sidebar on the left for controls and stats, and a main panel on the right with five tabs corresponding to the five stages of the Eigenfaces pipeline.
Layout Regions
| Region | Contents |
|---|---|
| Header | Title, dataset info, badges |
| Pipeline tabs | Five clickable steps — click any to jump to that stage |
| Sidebar | k slider, noise slider, action buttons, live stats, sklearn code reference |
| Main panel | Content for the currently selected pipeline step |
Controls Reference
| Control | Range | Effect |
|---|---|---|
| k slider | 1 – 50 | Number of eigenfaces used. Higher k = more variance captured = better reconstruction and recognition. Lower k = faster, more compressed, less accurate. |
| Noise slider | 0 – 60% | Gaussian noise added to the query face before recognition. Tests robustness. At 0% the system uses a clean image; at 60% the face is heavily degraded. |
| Apply k & Refresh | Button | Re-renders all panels with the current k value. Updates eigenface grid, reconstruction pairs, variance chart, and stats. |
| Recognize Query | Button | Projects the current query face (with noise) into eigenspace using the current k, runs nearest-neighbor, and switches to Step 5 to show results. |
| New Query | Button | Selects a random face from the 400 Olivetti images as the new query. Also updates the person grid selection highlight. |
| Person grid | Click | Click any of the 40 identity cards in Step 1 to set a face from that person as the query. Highlights selected identity. |
The Five Pipeline Steps
Step 1 — Dataset · fetch_olivetti_faces()
Displays all 40 identity cards (one representative image each) and the full 400-face gallery. Click any identity card to select it as the recognition query. Hover any face in the gallery to see its person ID. This step corresponds to loading X and y from sklearn.
Step 2 — PCA / SVD · PCA.fit(X)
Shows the mean face (pca.mean_) and a bar chart of singular values σᵢ. Bars highlighted in green are the k selected components; grey bars are excluded. The SVD equation is displayed with the actual matrix shapes. Adjusting k updates the green/grey split.
Step 3 — Eigenfaces · pca.components_
Shows the cumulative explained variance curve with a 90% reference line and a green dot marking the current k. Below the chart, all k eigenfaces are displayed as thumbnails — these are the rows of pca.components_, reshaped to 32×32 and normalized for display. Earlier eigenfaces capture broad features; later ones capture fine detail.
Step 4 — Reconstruction · pca.inverse_transform()
Displays 20 side-by-side pairs: original face (left) and reconstruction (right). Reconstruction uses the stored sklearn projections plus the first k eigenfaces. Lower k gives blurrier, approximate faces; at k=50 the reconstruction is nearly perfect. The percentage label shows how much variance is retained.
Step 5 — Recognition · argmin(||W − w_q||₂)
Shows the noisy query face, the recognition result (HIT or MISS), and a horizontal bar chart of the top-8 nearest neighbors with their L2 distances in eigenspace. The green bar is the best match; blue bars are runners-up. Adjusting noise or k and re-running shows how both parameters affect accuracy.
Troubleshooting
| Error | Cause | Fix |
|---|---|---|
| python: command not found | Python not in PATH | Try python3 instead, or reinstall Python from python.org and check "Add Python to PATH" |
| pip: command not found | pip not in PATH | Use python -m pip install ... |
| URLError: Tunnel connection failed | figshare.com blocked by network | Use a mobile hotspot, VPN, or different Wi-Fi network. Only needed for the first download; cached locally after that. |
| eigenfaces_template.html not found | Files not in same folder | Run ls to confirm both generate_eigenfaces.py and eigenfaces_template.html are in the current directory |
| ModuleNotFoundError: PIL | Pillow not installed | Script has a numpy fallback — just re-run. Or install: pip install pillow |
| Blank page in browser | File too large / wrong browser | Use Chrome or Firefox. Edge and Safari may have issues with large inline scripts. Try python -m http.server 8080 and open localhost:8080 |
| Faces look pixelated | Expected — 32×32 display | This is intentional for file size. The recognition math still uses full-precision sklearn data. |
sklearn caches the Olivetti dataset in ~/scikit_learn_data/ after the first download. All subsequent runs of generate_eigenfaces.py use the cache and need no internet connection.
Glossary
| Term | Definition |
|---|---|
| PCA | Principal Component Analysis — dimensionality reduction technique that finds directions of maximum variance in data. |
| SVD | Singular Value Decomposition — matrix factorization A = UΣVᵀ. sklearn PCA uses truncated SVD internally. |
| Eigenface | A principal component of the face dataset — one row of pca.components_, reshaped to an image. Looks like a ghostly face pattern. |
| Eigenspace | The low-dimensional coordinate system spanned by the top-k eigenfaces. Each face maps to a point in this space. |
| Projection | The coordinates of a face in eigenspace — computed by pca.transform(). Shape: (k,) per face. |
| Reconstruction | Rebuilding a face from its eigenspace coordinates — pca.inverse_transform(). Quality improves with larger k. |
| EVR | Explained Variance Ratio — fraction of total variance captured by each component. Values sum to 1 across all components. |
| Nearest Neighbor | Recognition method — finds the training face whose eigenspace projection is closest (min L2 distance) to the query. |
| mean_ (μ) | pca.mean_ — the average pixel values across all training faces. Subtracted before projection, added during reconstruction. |
| components_ | pca.components_ — shape (k, D). The eigenfaces. Row i is the i-th principal direction in pixel space. |
| k | Number of eigenfaces (PCA components) to use. Trades off accuracy vs. dimensionality. |
| base64 | Binary-to-text encoding that lets pixel data be stored as ASCII strings inside HTML/JavaScript. |
| AT&T Olivetti | Standard face recognition benchmark dataset. 400 images, 40 subjects × 10 photos, 64×64 grayscale. |
Eigenfaces · sklearn PCA · AT&T Olivetti Faces · MSDE Cohort 4 · User Guide v1.0