Methodology: Measuring the Eigen-Drift Law


Overview

The experiment has three components:

  1. ChaoticAnchor — a controllable adversarial signal with tunable structural coherence time (τ)
  2. DeepMLP — a learner with measurable gradient alignment behavior
  3. EigenDriftTracker — a measurement apparatus for τ_structure and τ_alignment

The core hypothesis being tested:

τ_structure < τ_alignment  →  non-convergence

Measurement Apparatus: EigenDriftTracker

Signal Space (τ_structure)

We track the anchor signal in a sliding window and compute a time-delay embedding followed by SVD:

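import numpy as np

# Time-delay embedding: each row is a length-embed_dim slice of the window X
X_emb = np.array([X[i:i + embed_dim] for i in range(len(X) - embed_dim)])
U, S, Vt = np.linalg.svd(X_emb, full_matrices=False)
v = Vt[0]  # principal direction (first right singular vector)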

The cosine similarity between successive principal directions (with exponential smoothing) gives a proxy for structural stability:

smoothed_sim = alpha * raw_sim + (1 - alpha) * smoothed_sim
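The steps above (embed, decompose, compare, smooth) can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not the tracker's actual code: the function names, `embed_dim=8`, and the choice to seed the smoother with the first raw value are hypothetical. It also takes the absolute value of the cosine, because the sign of a singular vector is arbitrary; whether EigenDriftTracker does the same is not stated here.

```python
import numpy as np

def principal_direction(window, embed_dim):
    """First right singular vector of the time-delay embedding of `window`."""
    # Each row is one length-embed_dim slice of the window.
    rows = [window[i:i + embed_dim] for i in range(len(window) - embed_dim)]
    X_emb = np.asarray(rows, dtype=float)
    _, _, Vt = np.linalg.svd(X_emb, full_matrices=False)
    return Vt[0]  # unit-norm principal direction

def smoothed_similarity(signal, window_len=100, embed_dim=8, alpha=0.2):
    """Exponentially smoothed |cosine| between successive principal directions."""
    history, prev_v, smoothed = [], None, None
    for end in range(window_len, len(signal) + 1):
        v = principal_direction(signal[end - window_len:end], embed_dim)
        if prev_v is not None:
            # abs() removes the arbitrary sign of singular vectors (assumption).
            raw_sim = abs(float(np.dot(v, prev_v)))
            if smoothed is None:
                smoothed = raw_sim  # seed the smoother (assumption)
            else:
                smoothed = alpha * raw_sim + (1 - alpha) * smoothed
            history.append(smoothed)
        prev_v = v
    return history

# Toy input: a pure sinusoid, one similarity value per window shift.
t = np.arange(400)
sims = smoothed_similarity(np.sin(2 * np.pi * t / 25.0))
```

The returned list is the similarity history that the τ computation consumes.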

τ_structure is defined as the number of consecutive steps where this similarity stays above a threshold — i.e., how long the dominant direction persists.

def _compute_tau_from_history(history, threshold):
    tau = 0
    for s in reversed(history):
        if s >= threshold:
            tau += 1
        else:
            break
    return tau if tau > 0 else None

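Note: this measures the persistence of the current run, i.e. how many of the most recent consecutive steps stayed at or above the threshold. A single dip resets the count, and a history that ends below the threshold yields None rather than 0.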
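A small worked example makes the counting rule concrete. The function below repeats the logic of `_compute_tau_from_history` under the hypothetical name `compute_tau` so the snippet runs on its own; the similarity values are made up for illustration.

```python
def compute_tau(history, threshold):
    """Length of the trailing run of values >= threshold, or None if that run is empty."""
    tau = 0
    for s in reversed(history):
        if s >= threshold:
            tau += 1
        else:
            break
    return tau if tau > 0 else None

history = [0.9, 0.8, 0.3, 0.7, 0.6, 0.8]  # made-up similarity values

tau_05 = compute_tau(history, 0.5)  # 3: the run 0.7, 0.6, 0.8 follows the dip to 0.3
tau_07 = compute_tau(history, 0.7)  # 1: only the last value; 0.6 breaks the run
tau_09 = compute_tau(history, 0.9)  # None: the most recent value is below 0.9
```

The early values 0.9 and 0.8 never count toward τ at threshold 0.5, because the dip to 0.3 separates them from the current run.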

Model Space (τ_alignment)

τ_alignment approximates the minimum temporal horizon required for the learner's gradient updates to establish a consistent descent direction. It is measured via the directional stability of weight updates (∆W), not absolute weights:

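# Direction of the latest update versus the previous one (flattened weight vectors)
delta = current_w - prev_w
sim = np.dot(delta, prev_delta) / (np.linalg.norm(delta) * np.linalg.norm(prev_delta) + 1e-12)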

Using weight updates rather than weights captures whether the learner is moving in a consistent direction — the actual signal of alignment. Absolute weight similarity is misleading in high dimensions because it is dominated by magnitude and can appear stable even when the gradient direction is chaotic.

τ_alignment is computed the same way as τ_structure: consecutive steps where ∆W direction similarity stays above threshold.
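As an illustration of the ∆W measurement, here is a minimal sketch that turns a sequence of weight snapshots into a direction-similarity history. The function name, the flattening of weights into one vector, and the `eps` guard against zero-length updates are assumptions made for the example, not details taken from the tracker.

```python
import numpy as np

def update_direction_similarity(weights_over_time, eps=1e-12):
    """Cosine similarity between consecutive weight updates (delta W)."""
    sims, prev_delta = [], None
    for prev_w, current_w in zip(weights_over_time[:-1], weights_over_time[1:]):
        delta = np.ravel(current_w) - np.ravel(prev_w)
        if prev_delta is not None:
            denom = np.linalg.norm(delta) * np.linalg.norm(prev_delta)
            sims.append(float(np.dot(delta, prev_delta) / (denom + eps)))
        prev_delta = delta
    return sims

# Toy trajectory: two steps in the same direction, then a reversal.
w0 = np.zeros(3)
d = np.array([1.0, 2.0, -1.0])
trajectory = [w0, w0 + d, w0 + 2 * d, w0 + d]

sims = update_direction_similarity(trajectory)  # approximately [1.0, -1.0]
```

The resulting list can be passed to the same run-length count used for τ_structure.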

Spectral Gap

As a second independent indicator of structural integrity:

gap = S[0] / (S[1] + 1e-12)
  • High gap → dominant structure exists (one direction accounts for most variance)
  • Gap → 1 → structure dissolving (singular values becoming uniform)

Empirically, a gap that stays above 3 indicates the transient-invariant regime (structure exists but is short-lived) rather than the annihilation regime.
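To see how the gap separates the two cases, the sketch below compares a nearly rank-one matrix with pure noise. The matrix shapes, noise level, and the helper name `spectral_gap` are illustrative assumptions; only the gap formula comes from the text above.

```python
import numpy as np

def spectral_gap(X_emb):
    """Ratio of the two largest singular values of an embedding matrix."""
    S = np.linalg.svd(np.asarray(X_emb, dtype=float), compute_uv=False)
    return S[0] / (S[1] + 1e-12)

rng = np.random.default_rng(0)

# Rank-one structure plus weak noise: one direction carries most of the variance.
structured = np.outer(rng.standard_normal(90), rng.standard_normal(10))
structured += 0.01 * rng.standard_normal(structured.shape)

# Pure noise: the leading singular values are comparable.
noise = rng.standard_normal((90, 10))

gap_structured = spectral_gap(structured)  # far above 3
gap_noise = spectral_gap(noise)            # close to 1
```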


Threshold Considerations

We use multiple thresholds simultaneously to avoid threshold-artifact criticism:

thresholds = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]

The operationally meaningful threshold in these experiments is 0.5. At 0.7, the similarity rarely holds long enough for a non-null measurement; at 0.5, we capture real persistence.

When reporting τ: always specify which threshold was used.
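One way to keep the threshold attached to every reported value is to return τ keyed by threshold. The following is a sketch with a hypothetical helper name and made-up similarity values; it applies the same trailing-run count described for τ_structure.

```python
def tau_by_threshold(history, thresholds=(0.9, 0.8, 0.7, 0.6, 0.5, 0.4)):
    """Map each threshold to the trailing run length at or above it (None if zero)."""
    report = {}
    for th in thresholds:
        run = 0
        for s in reversed(history):
            if s < th:
                break
            run += 1
        report[th] = run if run > 0 else None
    return report

history = [0.95, 0.45, 0.62, 0.55, 0.71, 0.66]  # illustrative values only
report = tau_by_threshold(history)
# report == {0.9: None, 0.8: None, 0.7: None, 0.6: 2, 0.5: 4, 0.4: 6}
```

In this toy history the stricter thresholds all return None while 0.5 and below return a usable run length, which mirrors the pattern described above.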


Window Size

Window = 100 samples (not 500).

This is critical. If the window is much larger than τ_structure, SVD never stabilizes and similarity never rises above threshold — giving the misleading appearance that structure doesn't exist. The window should be on the same order of magnitude as τ_structure.

At 16 kHz, 100 samples is 6.25 ms.
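The conversion from window length to duration is simple arithmetic; the helper name below is hypothetical.

```python
def window_duration_ms(n_samples, sample_rate_hz):
    """Duration of a window of n_samples at the given sample rate, in milliseconds."""
    return 1000.0 * n_samples / sample_rate_hz

short_ms = window_duration_ms(100, 16_000)  # 6.25
long_ms = window_duration_ms(500, 16_000)   # 31.25
```

The 500-sample window spans five times as much signal as the 100-sample one.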


Anchor Configuration

The ChaoticAnchor provides four layers of interference:

| Layer | Effect | Parameter |
|---|---|---|
| L1: Chaos | Sinusoidal + logistic map phase drift | tau_samples |
| L2: Adaptive chaos | Drift rate itself varies | alpha |
| L3: Private key | Hidden basis switches | key_transition_interval |
| L4: Orthogonal jumps | Instantaneous frame rotation | orthogonal_switch_prob |

Anchor speed presets:

| Speed | key_transition_interval | orthogonal_switch_prob | tau_samples |
|---|---|---|---|
| slow | 1000 | 0.01 | 100 |
| normal | 500 | 0.02 | 50 |
| fast | 200 | 0.05 | 25 |
| extreme | 80 | 0.10 | 10 |
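For scripting the sweeps, the presets above could be held in a small lookup table. This is only a sketch: the `AnchorPreset` class and `ANCHOR_PRESETS` name are hypothetical and say nothing about how ChaoticAnchor actually takes its configuration; the numbers are the ones listed in the presets table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AnchorPreset:
    """Hypothetical container for one anchor speed preset."""
    key_transition_interval: int   # samples between hidden basis switches
    orthogonal_switch_prob: float  # per-step probability of a frame rotation
    tau_samples: int               # chaos-layer coherence parameter

ANCHOR_PRESETS = {
    "slow": AnchorPreset(1000, 0.01, 100),
    "normal": AnchorPreset(500, 0.02, 50),
    "fast": AnchorPreset(200, 0.05, 25),
    "extreme": AnchorPreset(80, 0.10, 10),
}

# Presets ordered from least to most adversarial.
speed_order = ["slow", "normal", "fast", "extreme"]
tau_values = [ANCHOR_PRESETS[name].tau_samples for name in speed_order]
```

From slow to extreme, both interval parameters shrink while the switch probability grows.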

Experiment Design

Experiment 1: Adversary Scaling

Fix anchor at "normal". Vary MLP architecture (mem_len, hidden units, depth).

Purpose: Confirm that capacity doesn't fix the failure.

Experiment 2: Discontinuity Scaling

Fix MLP (m64/h128/d2). Vary anchor speed: slow → normal → fast → extreme.

Purpose: Show τ_structure dropping monotonically and SINR tracking it.

Experiment 3: Race Matrix

Cross all anchor speeds with small and large MLPs.

Purpose: Show the boundary is determined by anchor speed, not model size.

Experiment 4: Learner Speed Sweep

Fix anchor at "normal". Vary learning rate: 0.0001, 0.0005, 0.001.

Purpose: Test whether τ_alignment can be reduced by tuning LR. (Result: minimal effect, which suggests τ_alignment is largely an architectural property.)


Key Metric: Late SINR

SINR is estimated on the last 25% of each run:

ls = int(0.75 * N)
late = estimate_sinr_gain_db(clean[ls:], pre=d0[ls:], post=e_out[ls:])

Using the late window captures steady-state behavior rather than transient startup. Early SINR can look good even in failing systems as the learner partially tracks before the anchor resets.
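The late-window slicing can be tried on synthetic data. The sketch below does not reproduce `estimate_sinr_gain_db`, whose definition is not given here. It substitutes a hypothetical stand-in, `sinr_gain_db_sketch`, which assumes the gain is the ratio of residual (non-clean) power before and after processing, expressed in dB. The signals are toy data.

```python
import numpy as np

def sinr_gain_db_sketch(clean, pre, post, eps=1e-12):
    """Hypothetical stand-in: residual (non-clean) power before vs. after, in dB."""
    clean, pre, post = (np.asarray(a, dtype=float) for a in (clean, pre, post))
    resid_pre = np.mean((pre - clean) ** 2)    # interference power at the input
    resid_post = np.mean((post - clean) ** 2)  # interference power at the output
    return 10.0 * np.log10((resid_pre + eps) / (resid_post + eps))

rng = np.random.default_rng(1)
N = 4000
clean = np.sin(2 * np.pi * np.arange(N) / 40.0)
interference = rng.standard_normal(N)
d0 = clean + interference            # input mixture
e_out = clean + 0.1 * interference   # toy output: interference amplitude cut to 10%

ls = int(0.75 * N)                   # start of the last 25% of the run
late = sinr_gain_db_sketch(clean[ls:], pre=d0[ls:], post=e_out[ls:])  # about 20 dB
```

A tenfold amplitude reduction is a hundredfold power reduction, hence the 20 dB figure in this toy case.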


Causal Ordering to Look For

The valid claim requires causal ordering, not just correlation:

  1. τ_structure drops
  2. ∆W direction similarity destabilizes
  3. SINR flatlines

This ordering, if consistently present across runs, is evidence of mechanism rather than outcome correlation. To check it, draw a vertical break marker across all four panels at the first step where the structural similarity falls below threshold (that is, where τ_structure resets).
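The ordering check can also be automated per run. The sketch below is hypothetical throughout: the helper names are invented, and it treats "SINR flatlines" as the SINR series first dropping below a chosen floor, which is a simplifying assumption rather than the definition used in the experiments.

```python
def first_below(series, threshold):
    """Index of the first value below threshold, or None if it never drops."""
    for i, value in enumerate(series):
        if value < threshold:
            return i
    return None

def ordering_holds(structure_sim, delta_w_sim, sinr_db, sim_threshold, sinr_floor_db):
    """True if the break points occur strictly in the order structure, update, SINR."""
    breaks = [
        first_below(structure_sim, sim_threshold),
        first_below(delta_w_sim, sim_threshold),
        first_below(sinr_db, sinr_floor_db),
    ]
    if any(b is None for b in breaks):
        return False  # at least one series never breaks, so there is no ordering to test
    return breaks[0] < breaks[1] < breaks[2]

# Made-up per-step series for one run.
structure_sim = [0.9, 0.8, 0.4, 0.3, 0.3, 0.2]  # drops below 0.5 at step 2
delta_w_sim = [0.8, 0.8, 0.7, 0.4, 0.2, 0.1]    # drops below 0.5 at step 3
sinr_db = [6.0, 6.5, 6.2, 5.0, 2.0, 0.5]        # drops below 3 dB at step 4

ok = ordering_holds(structure_sim, delta_w_sim, sinr_db, 0.5, 3.0)        # True
swapped = ordering_holds(delta_w_sim, structure_sim, sinr_db, 0.5, 3.0)   # False
```

Counting the fraction of runs for which the check passes gives a simple consistency figure to report alongside the plots.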


What a Null τ_alignment Means

When τ_alignment is reported as None or null at threshold 0.7, this does not mean the learner isn't updating. It means the ∆W direction similarity never reaches that threshold, so the learner does not hold a stable direction for even a single step.

This is informative: the learner is updating actively (loss changes) but has no stable gradient direction. That is the signature of gradient spin — the learner reorients faster than any alignment accumulates.


Known Limitations

  1. SVD sensitivity: With small windows and high noise, singular vectors can jitter. Exponential smoothing (α = 0.2) reduces this but doesn't eliminate it.

  2. Single signal: The tracker uses the pure anchor signal (a_val). Mixing in the background (x_val + 0.6 * a_val) contaminates the structural measurement with learnable components.

  3. τ_alignment estimates: LR alone is a weak knob for τ_alignment. Architecture (memory length, depth) is likely the dominant factor, not yet systematically swept.

  4. Threshold dependence: While multi-threshold tracking helps, the mapping between threshold value and "true" coherence time is not yet analytically grounded.

  5. Generalization scope: All experiments used one signal family (chaotic anchor) and one learner class (MLP with online backprop). The τ_structure ≥ τ_alignment principle is hypothesized to generalize to other non-stationary systems and learner architectures, but is currently validated only on this system. Extensions to recurrent learners, ensemble methods, and different signal families remain open.