Idiya Market Phasor (IMP)

1 · The one idea

Every chart you've seen shows one thing: where price has been. This framework measures two things — the visible move, and the hidden pressure that changes before price does.

Kettle analogy: the visible move is steam you can see; the hidden pressure is the water temperature inside. By the time you see steam, the temperature has been climbing for minutes. We measure the temperature, not just the steam.

2 · The four situations

Each of the two numbers can be positive or negative. Four combinations, four clear actions.

Visible UP · Hidden UP Hold or add The trend has fuel. Both axes agree.	Visible UP · Hidden DOWN ⚠️ Take profits Price looks fine but pressure has flipped. Smart money leaving. Distribution.
Visible DOWN · Hidden UP Watch for the bottom Falling still, but buyers stepping in. Reversal setup.	Visible DOWN · Hidden DOWN Wait No floor. Don't catch the falling knife.

3 · The six stages of a cycle

Price moves in cycles, like seasons. Six stages, each with a name borrowed from Wyckoff market structure.

Accumulation — smart money quietly buying. Chart looks flat. Hidden pressure rising. The move begins here.
Markup — the visible uptrend.
Distribution — top forms. Price still rising, hidden pressure flipping. Warning.
Markdown — visible downtrend.
Capitulation — panic bottom.
Re-accumulation — base rebuilds. Cycle restarts.

4 · Euler's formula — "the most beautiful equation"

In 1748 Leonhard Euler published an identity that connects five of the most fundamental numbers in mathematics:

e^iπ + 1 = 0

Richard Feynman called it "the most remarkable formula in mathematics." It's a special case of a more general identity — and the general form is the one we use here:

e^iθ = cos θ + i·sin θ

This says: raising e to an imaginary power traces a circle in the complex plane. As θ sweeps from 0 to 2π, the point e^iθ walks once around the unit circle. The real part is cos θ, the imaginary part is sin θ. These two components are always 90° out of phase — one leads the other by a quarter cycle.

Multiply by a magnitude r and you get a phasor — a single complex number that encodes both how big something is and where it is in its cycle:

z(t) = r(t)·e^iθ(t) = r·cos θ + i·r·sin θ

Engineers have used this since the 1890s to describe AC electricity. Quantum mechanics uses it for wave functions. Signal processing uses it for audio, radio, radar, and sonar. We use it for price.

Market link: Price is a wave. If we can recover its magnitude and phase at every point in time, we get a complete description of where the stock is in its cycle and how much energy is behind the move. r becomes move strength; θ becomes the stage.

5 · What is a phasor?

A phasor (phase vector) is a complex-number shortcut for describing an oscillating signal. Anything that swings back and forth — a voltage, a sound pressure, a price — can be written as:

v(t) = V · cos(ωt + θ)

Three pieces: V is the amplitude (how big the swing is), ω is the angular frequency (how fast it cycles), and θ is the phase angle (where in the cycle it starts). Carrying all three through calculus is painful. So engineers collapse them into a single complex number:

V∠θ = V · e^iθ

That's the phasor. It's a vector in the complex plane of length V pointing at angle θ. The time-varying ωt part falls out, because in a steady-state system every element oscillates at the same frequency — it's the relative amplitudes and phases between signals that matter, not the absolute clock.

Why phasors exist at all

Charles Proteus Steinmetz, a General Electric engineer, popularized phasors in 1893 to make AC circuit analysis tractable. Before him, you'd solve an AC circuit by writing differential equations for the sinusoidal voltages and currents and integrating by hand. With phasors, the same problem becomes algebra — complex-number multiplication and division. Differentiation of a sine turns into multiplying the phasor by iω. Integration turns into dividing by iω. A circuit with capacitors, inductors, and resistors becomes a set of linear equations you can solve with pencil and paper.

The phasor diagram

When you draw phasors as arrows on the complex plane, you can see the relationships. Voltage and current in a capacitor are 90° apart. Current lags voltage in an inductor by 90°. A power factor of 0.8 means the load's current phasor is rotated 37° from the voltage phasor. One diagram captures the whole AC behavior of a circuit. It's the reason every electrical engineering textbook is full of arrows on a complex plane.

Where phasors show up

Power systems — grid synchronization, fault analysis, reactive power.
Communications — IQ modulation, FM demodulation, software-defined radio.
Control theory — frequency response, Bode plots, Nyquist stability.
Signal processing — Fourier analysis, filter design, vibration analysis.
Quantum mechanics — wave functions are literally complex phasors evolving in time.

How it applies to the stock market

A stock's price over time is also an oscillating signal. Wyckoff was describing this intuitively a century ago when he named the cycle stages — Accumulation, Markup, Distribution, Markdown, Capitulation, Re-accumulation. That's not four separate phenomena. It's one cycle at different angular positions, and each stage corresponds to a specific slice of the 360° circle.

If we can extract the magnitude and phase of that cycle from the observed price, we get a phasor per stock per bar:

z(t) = r(t) · e^iθ(t)

where:

r(t) = the amplitude — how big the current move is. This is move strength. High r means real conviction behind the move; low r means the stock is just grinding.
θ(t) = the phase angle — exactly where in the Wyckoff cycle we are. θ = 90° is pure Markup. θ = 180° is Distribution peak. θ = 270° is Capitulation bottom. θ = 0° is Accumulation.
Re(z) = r·cos(θ) = the real part = visible price momentum. What you see on the chart.
Im(z) = r·sin(θ) = the imaginary part = hidden flow pressure. What the chart hasn't shown yet. Leads price by a quarter cycle.

The Hilbert transform is the machine that turns a raw price series into a phasor. It's parameter-free — no model to train, no thresholds to tune. You feed in 500 bars of closing prices, you get back 500 phasors: one complex number per day, each containing the full state of that stock at that moment.

Why this is useful: A single phasor captures what a trader normally tries to piece together from 5-6 separate indicators — trend direction, strength, momentum, divergence, and cycle position. Instead of saying "RSI is 45 and MACD is crossing down but Bollinger bands are narrowing and volume is spiking," you say "θ is 138° and r is 0.07" — one number that already contains all those relationships, because they're all projections of the same underlying oscillation. And because the math is the same math the power grid runs on, it's deterministic, auditable, and reproducible. The phasor is the minimal complete description of a stock's current state.

The analogy in one picture

Picture an arrow rotating counterclockwise around the origin of a graph, once per market cycle. The arrow's length is the amplitude — how much energy is in the move. The arrow's angle is the phase — which stage we're in. The shadow the arrow casts on the horizontal axis is the visible price momentum. The shadow it casts on the vertical axis is the hidden flow pressure. Every day, you take a snapshot of the arrow. That snapshot is the phasor.

In a perfect world, price would be a clean sine wave and the arrow would rotate at constant speed. In reality, price is noisy and the arrow wobbles — it speeds up, slows down, sometimes reverses. The rate at which the angle changes (dθ/dt) is how fast the cycle is advancing, and a sudden spike in that rate is the signature of a regime transition.

6 · Complex numbers as market state

A complex number z = a + i·b has two parts that can't be collapsed into one. This is exactly what a stock needs: it's not enough to know the price went up — you also need to know why, and whether the reason is strengthening or weakening.

In our framework:

Re(z) = the visible price move. Positive = rising, negative = falling.
Im(z) = the hidden flow. Positive = buying pressure, negative = selling pressure.
|z| = r = the amplitude — how big the current move is.
∠z = θ = the phase angle — which stage of the cycle we're in.

Re and Im are orthogonal. No matter what Re does, Im can tell a different story. That's the whole point. Pearson correlation collapses a stock to a single time series, discarding everything except the Re axis. The phasor sees both.

7 · The Hilbert transform

Here's the problem: a chart only gives us Re. We need a principled way to construct the Im component from the observed price alone. The Hilbert transform does exactly this.

Given a real signal x(t), the Hilbert transform H{x}(t) is the unique linear operator that shifts every frequency component by 90°. The analytic signal is then formed as:

z(t) = x(t) + i · H{x}(t)

The resulting z(t) is a complex time series whose magnitude is the signal's envelope and whose phase is its instantaneous angle. This is how we recover both axes from a single observed price stream.

Mathematically, in the frequency domain, the Hilbert transform multiplies positive frequencies by −i and negative frequencies by +i. Time-domain equivalent: a convolution against 1/(π·t).

The transform was introduced by David Hilbert around 1905 and became fundamental to communications theory through Dennis Gabor's 1946 paper on signal expansion. It's the math behind FM demodulation, speech processing, structural vibration analysis, radar echo processing, and MRI k-space reconstruction.

Market link: Price is treated as a 1-D signal, just like voltage on an antenna. The Hilbert transform is parameter-free — there's nothing to tune, no model to train. The imaginary component is a deterministic function of the past, not a guess. This is why the framework is fully reproducible.

8 · The Butterworth filter

Raw log returns are too noisy to phase-decompose directly. High-frequency jitter dominates, masking the cycle we care about. A lowpass filter removes the noise before the Hilbert step.

The Butterworth filter, invented by Stephen Butterworth in 1930, has the flattest possible passband response of any filter. For a lowpass Butterworth of order n with cutoff ω_c:

|H(ω)|² = 1 / (1 + (ω/ω_c)²ⁿ)

It's the only parameter choice in the pipeline. Default cutoff is 0.25 of the Nyquist frequency, order 4. This preserves cycles of roughly 8 bars and longer, removes day-to-day noise.

Market link: Without this filter, the phasor would react to every bid-ask wiggle and the regime would flip violently. With it, we get clean multi-day cycles that match what traders actually see. The cutoff is the one dial on the whole machine.

Laplace transform — the s-plane lens

The phasor tells you where a stock is on its cycle. The Laplace transform tells you what kind of system the cycle belongs to — exploding, decaying, oscillating, or mean-reverting. One pole on a 2-D map captures it.

From e^iωt to e^st

The phasor identity e^iωt traces a circle at constant amplitude — a perfect, undamped oscillation. Real markets aren't perfect; moves grow, decay, and reverse. To describe those, we replace iω with a complex number:

s = σ + iω

Now the same expression e^st = e^(σ+iω)t = e^σt·e^iωt says two things at once. e^σt is the envelope (growing, shrinking, or steady); e^iωt is the rotation inside that envelope. σ is the real part — damping rate. ω is the imaginary part — oscillation frequency. Together they form a single point in the complex plane called the s-plane.

Pierre-Simon Laplace published this in 1782, generalising Euler's circle into a tool for solving differential equations. Oliver Heaviside in the 1880s used it to tame electrical transients. Every control engineer since 1930 has thought in s-plane: the location of a system's poles tells you, at a glance, whether it's stable, oscillatory, damped, or about to blow up.

The four quadrants — what kind of system?

Plotting (σ, ω) places every market state in one of four geometric regions, each with a clear behavioural signature:

σ > 0, ω ≠ 0 — explosive oscillation Riding a momentum move Amplitude growing and rotating. Markup phase with conviction. Trends until σ rolls negative.	σ > 0, ω ≈ 0 — pure exponential growth Late-stage parabolic No cycle, just compounding. Crypto bubbles, manias. Beautiful until σ flips.
σ < 0, ω ≠ 0 — damped oscillation Mean-reverting cycle Range-bound, oscillating but losing energy. Pair-trade territory.	σ < 0, ω ≈ 0 — pure decay Markdown / capitulation Falling without rotation. The classic dead-cat tape.

The same stock visits all four quadrants over a full Wyckoff cycle. Capitulation lives in the bottom-right; the floor is when σ crosses zero and ω begins to spiral the dot back up into accumulation; markup walks the top-right; distribution is the slow rotation through the top-left as σ flips negative again. The s-plane trail is a year of behaviour drawn as one geometric shape.

How we estimate σ and ω from price

The phasor pipeline already produces the amplitude r(t) and phase θ(t) per bar. From those:

σ ≈ d(ln r)/dt — the logarithmic growth rate of amplitude. Positive σ means the move is gaining energy; negative σ means it's bleeding out. We compute σ as the slope of a 5-bar linear regression on ln r, which smooths edge noise without introducing meaningful lag.
ω ≈ dθ/dt — the angular velocity, the same number used in the instantaneous-frequency view. Already a column in the parquet.

One pair (σ, ω) per bar; one trail through the s-plane per stock.

Worked example — TLKM through one full year

Run TLKM through the Laplace page and the trail traces a recognisable shape. Early in the window the dot sits in the bottom-left (σ ≈ −0.02, ω ≈ +5°/bar) — damped oscillation, mean-reverting around 3,400 IDR. A three-week markup pulls the dot into the top-right (σ ≈ +0.04, ω ≈ +6°/bar) — explosive oscillation. Then σ rolls over and the dot drifts left through the top-left quadrant — distribution. The trail finishes the year back near origin, having made one complete loop.

The pole's distance from the origin |s| measures how energetic the system is overall. A dot tightly bunched near origin is a quiet, range-bound stock; a dot orbiting at radius 0.05 is a stock with real conviction in either direction. Reading the trail end-to-end is reading a year of behaviour as one geometric shape.

Why this is genuinely complementary to the phasor: the phasor says "today you are at θ=130°, in distribution." The Laplace pole says "today you are in a system that is decaying with σ=−0.03 and rotating at ω=4°/bar — i.e. a damped oscillator, not an explosion." Two stocks can be at identical θ (both in distribution) but with very different σ — one is collapsing, the other is gently rolling over. The phasor cannot distinguish them; the s-plane can.

The interactive layer

The Laplace page mirrors the Phasor page's UX so the two feel like the same instrument viewed from different angles:

Price chart at the top with a bar-by-bar cursor — hover to highlight, click to lock the timeline to that bar. Identical UX to the Phasor tab so the muscle memory transfers.
Animated scrub slider with a play/pause button. Press play and the dot walks the s-plane, the price cursor advances in lockstep, and the small "behaviour classifier" caption updates in real time ("damped oscillation → explosive oscillation → markdown decay").
Quadrant-residence histogram — a four-bar chart showing what fraction of the year was spent in each quadrant. A markup-heavy year shows tall right-side bars; a distribution-heavy year shows tall left-side bars; a healthy mean-reverting stock shows roughly equal bars across all four.
Sparklines for σ(t) and ω(t) so you can scan when each component crossed zero — those crossings are the regime transitions.
Faded polyline trail behind the current pole, with a pulsing ring on the live dot, so even on a static screenshot you can see the direction the system has been moving.

Reading the trail in three example shapes

Tight knot near origin — a quiet stock. Low amplitude, slow rotation. Not interesting until something breaks the knot.
Counter-clockwise loop centred on origin — a textbook Wyckoff cycle. The healthiest possible shape.
Long line drifting outward — a runaway. Either the stock is in genuine sustained markup (rare and beautiful) or it's a data artefact (delisting, suspension). Worth investigating either way.

Fourier transform — the cycle spectrum

The phasor measures the dominant cycle. The Laplace pole describes the system's character. The Fourier transform maps every cycle in the window — earnings rhythms, options expiries, weekly behaviour, quarterly rebalancing — at once.

Joseph Fourier and the heat equation

In 1822 Joseph Fourier published Théorie analytique de la chaleur, in which he claimed that any function — no matter how irregular — could be written as an infinite sum of sines and cosines. Mathematicians of the time refused to believe him. He was right. The transform that bears his name decomposes a signal into its constituent frequencies:

X(f) = ∫ x(t)·e^−i2πft dt

The integrand is the same Euler kernel e^iθ we've been using for everything — only this time we sweep f across all frequencies and ask, at each one, "how much of the signal is rotating at this rate?" The answer is a complex number: magnitude = how much energy at that frequency, phase = where in the cycle we currently are.

Why this is genuinely a third lens

Lens	Question it answers
Phasor (Hilbert)	At this moment, the dominant cycle has phase θ and amplitude r. Quasi-stationary, mono-frequency.
Laplace (s-plane)	What kind of system is generating the dynamics — explosive, damped, oscillating, decaying?
Fourier (spectrum)	Across this window, the full distribution of energy across all frequencies. All cycles at once, not just the dominant.

The Fourier view answers a question the others can't: which discrete cycles are alive in this signal, and how strong are they relative to noise? Earnings rhythms (~63 trading days), monthly options expiries (~21 days), weekly drift (~5 days), Indonesian dividend mid-year cluster (~125 days) — all of these can coexist in one stock and the periodogram surfaces them simultaneously.

The pipeline — discrete, browser-side, parameter-light

Log returns. Raw price has a trend; the Fourier transform requires zero mean. We compute ln(p[t]/p[t−1]), then subtract the mean.
Hann window. The FFT implicitly assumes the signal is periodic — that bar N−1 is followed by bar 0. For a price series that's egregious. Multiplying by a Hann window (raised cosine) tapers the edges to zero, suppressing the spectral leakage that would otherwise contaminate every bin.
Zero-pad to next power of 2. Cooley-Tukey radix-2 FFT requires N a power of 2; padding to 512 from a 365-bar year costs nothing and improves frequency resolution.
FFT. Hand-rolled 50-line in-place radix-2 Cooley-Tukey in the browser. At N≤512 it runs in microseconds.
Magnitude squared = power. Bin k corresponds to a cycle of period N/k trading days. We display only periods of 4 days to N/2, since edge bins are dominated by windowing artefacts.

James Cooley and John Tukey published the radix-2 algorithm in 1965, reducing the FFT cost from O(N²) to O(N log N) and making modern signal processing possible. We use the textbook in-place version; the whole transform fits on one screen of JavaScript.

The five visualisations

1 · Periodogram

Power spectrum on a log-x axis, x = period in trading days, y = amplitude. Tall spikes = dominant cycles. The top five peaks are auto-labelled in plain English ("21d ≈ monthly", "63d ≈ quarterly earnings", "252d ≈ annual"). A median-density line on the chart marks the noise floor; peaks above 3× median are flagged "strong", above 1.5× are "medium", below are "noise".

2 · Reconstruction overlaid on price

Take only the top five peaks, sum their cosines and sines (with the FFT-derived amplitudes and phases), exp-cumsum back to price space, overlay on the actual price. The headline number: R² — what fraction of the price variance is captured by these five cycles? Three example readings:

Ticker	R² (top 5)	Strongest cycle	Reading
TLKM	~22%	64d (quarterly earnings)	Cyclically faithful — repeating 64-day rhythm explains a fifth of total variance.
BBCA	~8%	2.7d (microstructure)	Mostly trending — what's not in the trend is short-cycle noise, not interpretable cycles.
BTCUSDT	~6%	2.7d	Almost entirely trending; cycle hunting on crypto is mostly futile and the low R² says so honestly.

Low R² is not a failure of the framework — it's an honest answer that the signal doesn't have meaningful cycles. We deliberately don't smooth, fit, or detrend further to manufacture prettier numbers.

3 · Spectrogram (STFT)

Heatmap, x = time (rolling-window centre), y = period, colour = amplitude. Computed as a Short-Time Fourier Transform with a 60-bar window stepped by 10 bars, so a 365-bar series produces ~30 columns × ~30 frequency rows. Reading the heatmap: vertical bands of colour at a constant period = a stable cycle alive across the whole window; bands that appear and fade = cycles that came and went (rare; usually triggered by a corporate-action change in cash-flow rhythm). Click any column on the spectrogram and the periodogram below it scrubs to that exact 60-bar window — visually demonstrating how the dominant cycle drifted over the year.

4 · Cycle catalogue with calendar labels

A small table beside the periodogram, listing the top five cycles with calendar interpretation:

Period (bars)	Calendar reading	Amplitude	Significance
5	≈ weekly	0.018	strong
21	≈ monthly (options)	0.012	strong
63	≈ quarterly (earnings)	0.009	medium
252	≈ annual	0.004	noise floor

Calendar mapping is auto-applied: 5d → weekly, 21d → monthly, 63d → quarterly, 125d → semi-annual, 252d → annual. Periods that don't fall near a calendar anchor are labelled simply by their length ("~12 days").

5 · Phase coherence dial

For the top three cycles, compute their current phases. Display a small dial showing where each one sits on its own circle. The trader's question: are they reinforcing or interfering?

If two or more dominant cycles are simultaneously near a peak — they reinforce. The dial shows a "reinforcement zone" tag.
If one is at peak and another is at trough — they interfere. The market sees offsetting pressures and tends to range-bound.

This is descriptive, not predictive. The dial reads the structure that is currently in place; it doesn't forecast a turn.

Long-history mode — Yahoo Finance fallback

The default Fourier window is the 365-bar parquet that the nightly batch fetches via the local MCP gateway. That's enough resolution to see weekly, monthly, quarterly, and annual cycles — but not enough for macro-scale rhythms. The famous Kitchin (3–5 years), Juglar (7–11), Kuznets (15–25), and Kondratieff (45–60) cycles need decades of data to detect.

The Fourier page exposes a Long history toggle. When enabled, the server fetches up to 10 years of daily bars from Yahoo Finance — IDX tickers via the .JK suffix (e.g. TLKM.JK), crypto via the −USD suffix (e.g. BTC-USD) — and runs the full Fourier pipeline over the extended window. TLKM.JK returns ~2,460 bars; BTC-USD returns ~3,650. With a 2,460-bar window, FFT bins resolve down to a 9-day period at the high end and up to ~1,230 days (≈ 5 years) at the low end — enough to surface a Kitchin candidate if one exists.

Honesty caveat: 10 years is short for a 60-year Kondratieff. The long-history mode will not "find" cycles that need 200 years of data. What it will do is let you check whether the textbook 4-year Kitchin and 9-year Juglar candidates are visible on Indonesian and crypto data. Often they aren't — and that's an honest answer too.

RFFT — scipy real FFT and Welch averaging

The Fourier page runs a single-window FFT in JavaScript. The RFFT page runs scipy's rfft on the server and adds a Welch averaged periodogram — the same idea, but with much less variance in the spectrum estimate.

Market link: A single-window FFT will tell you a stock has a 21-day cycle, then a different "21-day cycle" next time you reload — half the peaks are statistical noise. Welch tells you which cycles are stable across the whole window. Trading off a high-variance peak burns money; trading off a Welch-stable peak is a bet on a real rhythm.

Why "real" FFT

Price returns are real-valued, but a generic FFT treats the input as complex and returns a symmetric spectrum that mirrors itself across the Nyquist frequency. scipy.fft.rfft exploits the symmetry to return only the unique half (length ⌊N/2⌋+1) and runs ~2× faster on long histories. For our purposes the numbers come out the same as the JS Cooley-Tukey FFT on the Fourier page — the value isn't precision, it's that we're now on a server-side numerical stack that lets us layer Welch on top.

Welch — averaging away spectrum variance

A single-window FFT of a noisy signal is itself noisy: each bin's amplitude has an enormous standard error. Two different 365-bar windows of the same stock can produce completely different "top peaks". Welch's method splits the signal into overlapping windows (we use 256-bar segments with 50% overlap), runs an FFT on each, and averages the resulting periodograms. The averaging cancels the random-component noise while preserving structurally stable cycles. Cycles that survive Welch are real; cycles that only show in the single-window FFT are noise.

Market link: The strongest peak on the single-window FFT often shifts by a week between two adjacent runs of the same ticker. The strongest peak on Welch barely moves. If you are about to size a trade around "this stock has an X-day cycle", confirm that cycle on Welch first — peaks that wobble are statistical mirages.

Cycle alignment across assets

If you pass a comma-separated list of peer tickers, the page also computes Jaccard alignment of their dominant cycles against the primary. Two stocks with alignment ≈ 1.0 share the same set of structural rhythms — they're driven by the same macro inputs — even if their Pearson correlation is mediocre because one trends harder than the other. This is the input the Coherence page should eventually consume in place of (or alongside) Pearson.

Market link: Two stocks with high cycle alignment but mediocre Pearson correlation are textbook pair-trade candidates — they share the same cyclic engine but their relative trend is unstable. Pearson would have called them unrelated; cycle alignment correctly says they're structurally coupled.

What this fixes that the Fourier page can't

The Fourier page's "top peaks" can change between reloads because the single-window estimate is high variance. If you look at a top-peaks list and feel uncertain whether they're stable cycles or coincidence, run the same ticker through RFFT and see which peaks survive Welch averaging — the ones that do are the ones worth basing a thesis on.

Market link: Use this when you've already identified a candidate cycle on the Fourier page and want to gut-check it before sizing. RFFT is the second opinion that turns a hunch into a thesis.

How the page is laid out

Spectrum panel — a single chart with two lines overlaid on log-period x-axis. Blue = single-window rfft amplitude. Orange = Welch averaged amplitude. Green vertical pins mark the top-5 cycles. The orange line is dramatically smoother than the blue — that's the whole point.
Top-5 cycles table — period in bars, calendar label (~weekly, ~monthly, ~quarterly), % of total variance, and the cycle's current phase in degrees (0° = peak, 180° = trough).
Spectral entropy gauge — a single number from 0 (perfectly cyclic) up to log(N/2) (pure noise). Coloured green / orange / red depending on where it sits in that range.
Cycle alignment matrix — only appears when peers are passed. For each peer, shows their dominant period and the Jaccard alignment of their top-5 cycle set against the primary's. Bars on the right give the visual.

The peers input

The second input field on the page accepts a comma-separated list of peer tickers (e.g. BBCA,UNVR,ASII). The server fetches each peer's bars in parallel through the same data path as the primary, runs SpectralMap on each, and computes Jaccard similarity between their dominant-period sets. A peer with alignment ≈ 1.0 shares the same structural rhythms as the primary — they're driven by the same macro inputs even if their Pearson correlation is mediocre.

This is the input the Coherence page should eventually consume in place of (or alongside) Pearson. Pearson conflates trend slope and cyclic alignment; Jaccard on the rounded period set isolates the cyclic alignment alone.

Pipeline — server-side, scipy, sub-second

Bars in. Same /api/stock/:ticker[/long] data path as Fourier and Wavelets.
Worker spawn. Node pipes { mode: "spectral", ticker, dates, prices, peers } to server/phasor_worker.py.
Compute. Log-returns + demean + Hann window, scipy.fft.rfft for the spectrum, scipy.signal.welch with 256-bar segments and 50% overlap for the averaged periodogram. Then SpectralMap from src/spectral.py for top cycles, entropy, dominant period, phase-at-frequency, and (for each peer) cycle_alignment.
JSON out. The browser receives { spectrum, welch, top_cycles, entropy, dominant, alignment } and renders the panels via Canvas 2D.

Wavelets — when each cycle existed, not just whether

Fourier's fundamental limitation: it tells you which cycles are present in the window, but not when. A 22-day cycle that died 90 bars ago and a 22-day cycle that's alive right now look identical in the periodogram. Wavelets fix this — they trade infinite frequency precision for finite time localization.

Market link: A flash crash on day 200 of a 365-day window reads, on the Fourier page, as ringing across every cycle band — useless for risk management. On Wavelets it's a single fine-scale spike at day 200 — exactly when, exactly how strong. The same goes for regime changes: wavelets find the date the macro pull weakened, instead of averaging it across the year.

The little wave

A Fourier basis function is e^i2πft: a sine that runs forever. A wavelet is a little wave — it starts at zero, oscillates briefly, and returns to zero. The Morlet wavelet (a Gaussian-windowed complex sinusoid) is the standard choice for finance: it gives the best simultaneous time and frequency localization of any function obeying the uncertainty principle.

By scaling the wavelet (stretching or compressing it) we tune the cycle period it's sensitive to. By translating it across the time axis we ask "is there a cycle of this period at this time?". Doing both for every (scale, time) pair produces the scalogram — a 2D heatmap that the Fourier transform's 1D periodogram cannot draw.

Market link: "Scale" is just "what cycle period are we asking about". "Translation" is "what time are we asking about". The whole page is a 2D answer to a 2D question — what cycle was driving the price, when. Fourier could only answer the first half.

The CWT scalogram — computed, not shown

The continuous wavelet transform on the Wavelets page runs 48 scales (geometrically spaced from 2 bars to ~N/4) using a complex Morlet wavelet. The output is a (48 × time) matrix of magnitudes — the full scalogram, time on x, cycle period on y. We compute it, but we don't render the heatmap. Two-dimensional rainbow heatmaps were confusing more than they explained: the eye couldn't tell whether a bright spot meant a cycle was alive or just that the colormap was non-linear. We surface the same information through two cursor-aware panels instead — a vertical-slice bar chart and a band-energy table — that read the scalogram column at a single time.

"Cycles alive at [date]" — the vertical slice

At the cursor position, we read out the scalogram column: 48 magnitudes, one per scale. Each is row-normalized (each scale divided by its own row max) so coarse cycles aren't drowned out by fine ones, then plotted as a horizontal bar chart with period on the x-axis and magnitude on the y-axis. Move the cursor and the bars rearrange — high-energy periods light up, dead cycles fade. This is the answer Fourier can't give: at this specific moment, which cycles are driving the price?

Market link: Read the dominant bar before anything else on the page. If the ~22-day bar dominates, you're in a swing-trade regime — fade the extremes. If the ~252-day bar dominates, you're being driven by macro forces — go with the trend. If multiple bars are nearly equal, the regime is in transition, which is exactly when sizing should come down.

"Energy share at [date]" — the live DWT-style breakdown

For a single, plain-English number per band, we group the 48 CWT scales into the same period bands as a 6-level Daubechies-4 DWT — D₁ (~2–4 bars), D₂ (~4–8), D₃ (~8–16), D₄ (~16–32), D₅ (~32–64), D₆ (~64–128), and an approximation band (~128+). At every cursor position we sum |CWT|² within each band and re-normalize so the column sums to 100%. The result is a 6×T matrix of percentages — every column tells you what fraction of cyclic energy is in that band at that moment. The table updates live as you scrub.

The whole-history DWT decomposition is computed separately — that's the standard pywt.wavedec output, used for the three reconstructions and their R²:

Trend = approximation + D₅ + D₆. Period roughly 64+ bars (multi-month).
Swing = D₃ + D₄. Period roughly 8–32 bars (weekly–monthly).
Random = D₁ + D₂. Period roughly 2–8 bars (sub-weekly noise).

Each reconstruction inverts pywt.waverec with only its own bands non-zero. The R² of each vs the original log returns answers a different question than the live cursor: it tells you which behavior dominates the stock across its whole history, not just right now. Both views matter — the live view is the snapshot, the R² is the long-run baseline.

Market link: Trend / Swing / Random are the three behaviors a stock can express. Knowing which one dominates right now tells you whether to ride momentum, fade extremes, or sit out. Knowing how that compares to the long-run R² tells you whether the snapshot is rare or normal — a stock that is normally Random going briefly Trend is a setup; a stock that is always Random is just noise that briefly looked like a trend.

Jump detection — what Fourier can't do at all

Fourier needs an infinite number of sines to construct a sharp edge, which is why a flash crash creates Gibbs ringing across the entire spectrum. Wavelets at fine scales (D₁, D₂) are locally sensitive to discontinuities. The page flags time bins where fine-scale coefficients exceed 5×MAD — an established threshold for outlier detection — and overlays them as red verticals on the price chart. On crypto histories you'll see the FTX collapse, the ETF approval pump, and major exchange listings as crisp vertical lines, dated to within ±2 bars. The 50 strongest are listed in a table at the bottom of the page.

Market link: Localized jumps are where stop-losses fire and where new regimes begin. Having them dated and ranked lets you check, after a loss, whether you were trading a regime that already changed. Most of the surprising losses in a portfolio post-mortem cluster on dates the wavelet flagged days earlier.

The plain-English read

The header card on the page is a one-glance summary that reads the live numbers and writes a five-line story in the same labeled style as the Phasor page's formula box. The classification thresholds are deliberately blunt:

Live numbers	Regime	Action
Random ≥ 55%	Random	sit on hands; no rhythm to predict
Trend ≥ 55%	Trend	go with the direction; don't bet against the bigger picture
Swing ≥ 35%	Swing	buy near a recent low, sell near a recent high; size around one cycle period
otherwise	Changing	wait; trade smaller until one bucket clearly leads

The "Versus history" line then compares the live mix against the same buckets averaged over the whole history, and flags shifts of ≥20 percentage points — that's how the page tells you "this regime is unusual for this stock right now". The "Credibility" line uses the whole-history reconstruction R² as a sanity check: if a stock's long run is 80%+ random, even a swing-looking moment is a snapshot of which way the random is leaning, not a deep pattern. The story is generated entirely from observable numbers — no LLM, no opinion, no surprise outputs.

Market link: Transitions come in two flavors — organic (slow, well-ordered, the regime drifts gradually) and chaotic (violent, the regime flips in days). The organic ones are where pair trades and rotation bets work. The chaotic ones are where you get stopped out. The "Versus history" line tells you which kind you're in: a small drift from baseline is organic; a 30+ point swing is chaotic.

How the page is laid out

Toolbar — slider scrubber + ▶ Animate button. Same widget the Phasor page uses (.toolbar-row + .inline-scrub + .anim-btn) so muscle memory carries over.
Price + jump markers — the canvas you actually trade off. Hover scrubs the cursor; click locks it and stops autoplay; ←/→ steps ±1 bar, Shift+←/→ steps ±10. A purple cursor on the chart and a synchronized cursor in every other panel keep the time aligned.
Cycles alive at [date] — the bar chart that replaces the scalogram heatmap.
Plain-English read — the five-line story (Regime / Reading / Action / Versus history / Credibility).
Energy share at [date] + Reconstructions (R²) side-by-side — the live DWT-style table and the whole-history R² breakdown.
Detected jumps — top 10 by magnitude, dates listed.

Every card heading carries the ticker (TLKM — price + jump markers, etc.) so a screenshot of any single panel is unambiguous.

How this connects to the rest of the system

The phasor framework collapses each ticker to one dominant cycle (r·e^iθ) via the Hilbert transform, which assumes the signal is approximately mono-frequency. Wavelets reveal where that assumption breaks: if the live "Cycles alive" bar chart shows energy spread across multiple periods at a given time, the phasor's "dominant cycle" is a misleading single number. The Wavelets page is therefore not a replacement for the phasor — it's the diagnostic that tells you when to trust it.

Market link: The Phasor page tells you what stage of the cycle the stock is in. Wavelets answer the question that comes before that one: "is there actually a cycle to be in?" If the energy is split across three different periods at a given moment, the phasor's stage label is averaging away real information. Use Wavelets to know when to trust the phasor and when to discount it.

Pipeline — server-side, Python, sub-second

Bars in. Same data path as the Fourier page: /api/stock/:ticker for the 365-bar default, /api/stock/:ticker/long for the 10-year Yahoo history.
Worker spawn. The Node server pipes { mode: "wavelet", ticker, dates, prices, family } to server/phasor_worker.py via stdin — same IPC pattern the phasor pipeline uses.
Compute. Log-returns + demean, then pywt.cwt for the 48-scale scalogram and pywt.wavedec for the 6-level DWT. Bands grouped, energies summed, columns normalized, R² computed against the original log returns, jumps detected at 5×MAD.
Downsample. Long history can be 4500 bars × 48 scales = 216k cells. We stride the time axis to ≤600 columns to keep the JSON payload under ~2 MB.
JSON out. The browser receives { scalogram, dwt: { bands, band_energy_t }, reconstructions, jumps, prices, dates } and renders everything client-side via Canvas 2D. No charting library, no build step.

Look-ahead bias — quantified

A practical question led to an uncomfortable discovery: every observation in the phasor framework is silently influenced by future data. Here's how big the effect actually is, and why it matters.

The question that started it

A natural assumption when looking at any charting tool is: "when the model tells me what a stock was doing on March 15, it's using information that was available on March 15." That assumption turns out to be wrong for this pipeline — and for almost every signal-processing pipeline applied to financial data.

The culprits are two functions in [src/market_phasor.py:191-197](src/market_phasor.py#L191-L197):

scipy.signal.filtfilt — the Butterworth lowpass is applied forward then backward to cancel phase distortion. The backward pass at bar t uses bars t+1, t+2, …, N as inputs.
scipy.signal.hilbert — the Hilbert transform uses a full-series FFT. Every output sample is a linear combination of every input sample, past and future.

Both are standard best practice for offline signal analysis. Neither is appropriate for a strategy that would run in real time. The output at bar t silently depends on bars that, in reality, hadn't happened yet.

The invariant that makes the experiment possible

At the very last bar of any series, there is no future for the filter to peek at — the series ends there. So at t = N, the reality pipeline and the causal pipeline must produce exactly the same answer. This is a mathematical identity, not an approximation.

reality(prices)[N] ≡ causal(prices[:N+1])[−1]

It's the sanity check that validates every causal experiment: if these ever disagree, something is broken. They don't. They agree to full floating-point precision. Verified live on TLKM, DFAM, BTCUSDT, and every other ticker we've checked.

How the causal trail is built

For every bar t from the warmup threshold (~80) up to N:

causal[t] = MarketPhasor( prices[:t+1] ).to_dataframe().iloc[-1]

That is: rerun the same code, with the same Butterworth cutoff, on prices up to and including bar t, and keep the last row of the output. That row is the phasor state a real-time observer would have computed on day t. Stack all of these up and you get the causal trail — the honest, look-ahead-free time series.

Cost: O(N²) per stock. In practice, on a modern laptop, ~200–400 ms for 365 bars. Not the bottleneck. The bottleneck was thinking of doing this in the first place.

The disagreement percentage

For each bar where both trails are defined, compare the regime labels. The disagreement percentage is the fraction of bars where the reality regime and the causal regime are different:

disagreement = count( regime_reality ≠ regime_causal ) / total comparable bars

This is a direct measure of how much the non-causal filter is "fixing" the historical story. A stock with disagreement near 0% has a historical regime timeline that's stable under the arrival of new data. A stock with disagreement near 100% has a history that gets almost completely rewritten every time the filter sees another bar.

What we actually found

The nightly batch computes disagreement for every ticker in the IDX universe. Results from the most recent run:

Ticker	Sector	Disagreement	Note
SCNP	Consumer Non-Cyclicals	97%	29 of the last 30 bars relabeled
FIMP	Infrastructure	97%	29 of 30
MENN	Technology	97%	29 of 30
DFAM	Consumer Non-Cyclicals	70%	21 of 30 — the user's original mismatch
TLKM	Infrastructure	63%	19 of 30
BTCUSDT	Crypto (Binance)	66%	20 of 30

More than half of the universe has meaningful look-ahead bias. The top decile is effectively entirely re-labeled by hindsight. The phenomenon is not exotic and it is not small.

Market implication: any strategy that takes the reality regime labels as input — a momentum filter that enters on Markup and exits on Distribution, for example — is being tested on data that would not have been available in real time. The reported backtest numbers are systematically optimistic. The degree depends on the stock's volatility: trending large caps have low disagreement (the filter broadly agrees with itself across windows); volatile small caps and recent IPOs have very high disagreement (the filter relabels history constantly).

The four views we now expose

With this finding documented, the framework carries four separate views of every stock:

Reality (snapshot) — the classic non-causal output at the moment the batch ran. What the Screener shows today.
Reality (live) — the non-causal output re-computed on today's price series. What the Phasor tab shows when you type a ticker. Can differ from the snapshot even for the same historical bar, because the filter saw more data.
Causal (frozen) — the real-time value for each bar, pre-computed and stored in the per-ticker parquet. Does not drift when new bars arrive. Exposed via the r_causal, regime_causal, etc. columns in the snapshot and in the Phasor-tab overlay.
Disagreement score — the aggregate measure of how much reality and causal diverge over the last 30 bars per ticker. Surfaced on the Screener as the coloured bias flag and as a filter chip (Look-ahead bias ≥ 30% / 50% / 80%).

How to read the Screener's bias flag

No flag — under 20% of recent bars disagree. The reality labels are roughly trustworthy for historical analysis.
Amber ⚠ 20–49% — noticeable relabelling. Treat regime-based backtests on this stock as overstated.
Orange ⚠ 50–79% — majority of recent bars get rewritten by hindsight. Historical labels should not be trusted for strategy development.
Red (pulsing) ⚠ 80%+ — the filter is essentially rewriting history every day. Use only the live latest-bar reading; anything else is noise.

The invariant in action — the "why doesn't this agree?" moment

A real case: user runs the Screener, sees DFAM as "L1 Accumulation, strength 31.4%". Clicks through to the Phasor tab and sees "Markdown, strength 14.9%" for the same stock. The numbers are very different.

What happened: the Screener was showing DFAM's state as of the previous day's batch, when the series ended at 2026-04-13 with price 130. At that moment, filtfilt's backward pass had no future to pull against — the edge artifact inflated r to 0.31 and put the phase near the Accumulation zone. Overnight, 2026-04-14 arrived with a price drop to 112. The Phasor tab now re-runs the pipeline on the extended series; the backward pass finally has context on the right edge; the 2026-04-13 label flips from L1 Accumulation to L4 Markdown. For the same date.

Both answers are "correct" outputs of the same math. They disagree because the math is non-causal. The causal column, by contrast, is frozen: r_causal for 2026-04-13 is 0.31 and will stay 0.31 no matter how many future bars arrive. That's the honest real-time reading, and it's now stored in the parquet permanently.

Why `filtfilt` specifically is the culprit

A Butterworth lowpass is an IIR filter — infinite impulse response. Applied once (forward), it introduces phase distortion: different frequency components get delayed by different amounts, smearing sharp transitions. For offline work, the standard cure is to apply the same filter twice — once forward, once backward — so the phase distortions cancel exactly. The result has perfectly zero phase lag. That's filtfilt.

The cost of zero phase lag is that every output sample depends on every input sample on both sides. Concretely, for a 4th-order Butterworth at cutoff 0.25, the impulse response decays over roughly 10–20 samples. The backward pass at bar t therefore mixes in non-negligible contributions from bars t+1 through t+20. That's not a theoretical quibble — it's a direct linear combination with measurable weights.

In a real-time system you can't run the backward pass because the future samples don't exist yet. Your choices are: (a) accept the forward-only phase lag, which means every signal appears delayed by several bars relative to the true underlying cycle, or (b) use a different filter architecture. The existing pipeline chose (c) — pretend you're offline and use filtfilt — which is the right call for historical analysis and the wrong call for live trading. Both can be true simultaneously. The causal column exists to make the distinction auditable.

The edge effect, visualised

The Hilbert transform in scipy is computed as ifft(fft(x) · H) where H zeros out negative frequencies. The FFT implicitly assumes the signal is periodic — that bar N−1 is followed by bar 0. For a price series this assumption is egregiously wrong, so the first and last ~20 bars of the Hilbert output are contaminated by wrap-around from the other end of the series.

The same price series, extended by one bar, produces different Hilbert outputs everywhere — but the difference is concentrated at the edges. The middle of the series moves by microns; the last 10 bars can move by tens of percent. This is why the snapshot and the live re-run disagree most on recent history, not on ancient history. And it's why the disagreement percentage reported on the Screener is computed over the last 30 bars, not over the whole series — older bars are effectively stable, recent bars are where hindsight rewrites the story.

The warmup threshold of ~80 bars on the causal trail exists for the same reason, at the opposite edge: before bar 80, filtfilt's forward-then-backward pass on a too-short window produces garbage (the filter hasn't converged). The causal trail returns NaN for those bars rather than lying about them.

Paths to a genuinely real-time pipeline

The causal column solves the audit problem — you can now see the bias and filter stocks that suffer from it. It does not solve the production problem: computing an O(N²) trail nightly for every ticker in the universe is fine, but running it tick-by-tick at market speed is not. A true streaming pipeline needs O(1) update per new bar. Three well-understood options exist:

Forward-only IIR — use lfilter instead of filtfilt. O(1) per bar, causal by construction, but introduces phase lag of roughly n/ω_c bars. For a 4th-order Butterworth at cutoff 0.25, that's ~16 bars of lag. You'd be looking at a cycle position that was true two weeks ago. Acceptable for very slow strategies; unacceptable for anything responsive.
Causal FIR Hilbert — replace the FFT-based Hilbert with a finite impulse response approximation (e.g. Parks-McClellan design with a specified passband). FIR Hilbert transforms are strictly causal if you allow a fixed group delay, typically 30–50 samples. You pay with latency, but the latency is constant and known — you can simply offset your decisions by the group delay. This is how real-time radios demodulate SSB.
Kalman analytic signal — model (r, θ, dθ/dt) as a state-space system and run a Kalman filter. Each new bar updates the state in O(1) with no lookahead. Edge performance is dramatically better than FFT-Hilbert because the filter doesn't assume periodicity. The tradeoff is a tuning step (process and measurement noise) that the current pipeline avoids — we'd lose the "parameter-free" property that makes the framework reproducible.

None of these are in the current codebase. The causal column is the cheapest possible first step: honest about the bias, reuses the existing math without modification, and lets the rest of the framework surface the problem without committing to a rewrite of the signal chain.

Production architecture — how the nightly machine works

The theory above is the math. This section is the plumbing: how ~1,000 tickers get fetched, phasor'd, causal-trailed, interpreted by Claude, and written to disk every night before market open. Five moving parts, each of which is its own small project.

1 · Causal pre-computation

For every ticker in the universe, the nightly batch runs the O(N²) causal loop described above — MarketPhasor(prices[:t+1]) for each t from the warmup threshold up to N — and stores the result as extra columns (r_causal, theta_causal, regime_causal, real_causal, imag_causal) inside the per-ticker parquet file. These columns are frozen: once written for a bar, they never change, even when new bars arrive tomorrow. The reality columns (r, theta, …) continue to drift as filtfilt sees more future; the causal columns don't.

Cost budget: ~400 ms per ticker × 1,032 tickers ≈ 7 minutes on a single worker. The pipeline parallelizes 4-wide across Python subprocesses, so wall time is ~2 minutes. This is the cheapest possible honesty mechanism — no new math, no new libraries, just "run the existing pipeline a lot of times and keep the edge."

2 · AI interpretation

Once the phasors are computed, each ticker gets handed to Claude via the Agent SDK (@anthropic-ai/claude-agent-sdk) to produce a single plain-English sentence describing what the stock is actually doing. These sentences land in summaries_latest.parquet and become the searchable corpus behind the Ask tab.

The batch chunks the universe into groups of 25 tickers per Claude call, pools 4 concurrent calls, and asks for a strict JSON array back — one summary per ticker. Each call sees only the numeric state (r, theta, regime, coherence, sector, subsector, last price, last return), never raw prices, so summaries are grounded in the framework's vocabulary and are internally consistent across the universe. Single-stock deep interpretation on the Phasor tab uses Sonnet 4.5 for quality; the batch uses the same model for narrative uniformity.

Natural-language query on the Ask tab reverses the flow: the entire summary corpus (~1,031 one-line descriptions) is packed into a prompt along with the user's question, Claude picks the top matches, and the server streams the ranked result back to the browser as JSON. No vector database, no embeddings — just a single shot against a pre-built corpus that's small enough to fit in one context window.

3 · Multi-source MCP gateway

Market data comes from a local MCP (Model Context Protocol) gateway that the Node server talks to over HTTP. Two protocols coexist: classic REST for batch pulls (daily OHLC, sector listings, fundamentals) and server-sent events for streaming updates. A thin routing layer in server/server.js picks the right transport per endpoint and normalises the response shape so the rest of the app never has to care which protocol served the data.

The gateway is also the single choke point for caching. Stock lists, sector memberships, and fundamentals are cached for 24 hours; price bars are cached only until the next 16:00 close; FX rates and macro indicators are cached for 1 hour. Everything flows through one function so the cache policy is one file to audit, not fifteen.

4 · The daily parquet pipeline

Parquet is the only storage format. No Postgres, no Redis, no SQLite. DuckDB reads parquet files directly from disk at SQL speed, and the Node server uses @duckdb/node-api to query them without ever loading a row into JavaScript memory.

Write path (nightly, triggered by node-cron at 06:00 local): fetch prices via MCP → compute phasors + causal trails in Python → write per-ticker parquets → rebuild snapshot_latest.parquet (the single-row-per-ticker latest state file) via a DuckDB COPY (SELECT … FROM read_parquet('tickers/*.parquet')) TO 'snapshot_latest.parquet' statement → run the Claude interpret batch → write summaries_latest.parquet. Five steps, one cron job, fully idempotent — you can rerun any stage and the downstream stages will pick up the freshest inputs automatically.

Compression is zstd level 3 throughout. The entire universe (1,032 tickers × ~500 bars × ~25 columns × both reality and causal trails) fits in under 40 MB on disk. The snapshot file is under 500 KB. Loading the snapshot for a Screener page costs one read_parquet call and ~15 ms.

5 · Determinism as a first-class property

Every stage above is reproducible bit-for-bit given the same input. The phasor math is parameter-free (Butterworth cutoff is the only dial, and it's fixed in code). The causal trail is a deterministic function of the input series. The MCP gateway caches by content hash. The parquet writes use stable column ordering and no per-run metadata. Only the Claude interpretation step is non-deterministic — and that's why its outputs live in their own parquet file and are never used as inputs to any downstream numerical calculation. Numbers stay reproducible; narratives are allowed to drift.

The point: the "AI" part of this system is strictly a presentation layer over a fully deterministic numerical pipeline. If Claude disappears tomorrow, the Screener, Phasor tab, causal analysis, and every downstream strategy still work identically. If the phasor math disappears tomorrow, there's nothing left to describe.

Natural-language search — the Ask layer

A 1,032-row catalogue is too big to scroll and too small to hide a vector database under. The Ask page gives the user a single-line text input, calls Claude Haiku with the entire catalogue as a cached prompt block, and streams ranked tickers back as JSON. No embeddings, no fine-tuning, no model state.

What the catalogue contains

Every night, after the phasor batch and the Claude interpret pass have finished, every ticker has one row of structured state and one row of plain-English summary. The Ask endpoint joins them and emits a flat text catalogue — one stock per line — that fits in a single Anthropic context window:

TICKER [sector/subsector] regime visX hidY r=R% clar=C — summary

Concrete example, taken verbatim from a recent run:

BNII [Keuangan/Bank] accumulation vis↑ hid↑ r=1.0% clar=0.38 — quietly accumulating with both visible rising and hidden building, early stage but clarity weak

Each line is roughly 200 characters; the full catalogue is around 200 KB. Well under the 200-KB cache-block ceiling and far under the 200K-token context limit.

The single-shot retrieval architecture

A typical "search across thousands of items" pipeline reaches for embeddings + vector DB + reranker. We don't, because:

The catalogue is small enough to fit in one prompt — 1,032 rows × 200 chars ≈ 60K tokens, well under model limits.
It changes once per day, not per query — perfectly suited to prompt caching. The Ask endpoint marks the catalogue block with cache_control: { type: "ephemeral" }, so the second and subsequent queries of the day reuse it for free.
Embeddings collapse semantics into a fixed-dim vector and lose precision on multi-attribute queries ("banks AND silently accumulating AND r < 4%"). A frontier LLM reading the raw text gets every attribute verbatim.

The pipeline is:

Server loads the catalogue from summaries_latest.parquet (joined with snapshot_latest.parquet for bar-count and active-pct gates), filters out low-quality rows (bar_count ≥ 252 AND active_pct ≥ 0.30), formats one line per ticker.
Builds a system prompt with strict definitions and explicit sector mappings (full text below).
Streams a single Haiku 4.5 call with the catalogue cached and the user's question as the only uncached payload.
Streams the JSON response back to the browser as it generates, so the user sees ranked tickers appear progressively.

Latency on a cached repeat call is roughly 1.5–3 s end-to-end. First call (cache miss) is ~6–8 s. Both numbers are roughly an order of magnitude faster than the previous Agent-SDK harness, which added ~30 s of multi-turn orchestration overhead the catalogue scan didn't need.

Why Haiku, not Sonnet

The single-stock deep interpretation on the Phasor tab uses Sonnet 4.5 — narrative quality matters, and the prompt is small. Ask is a different task: a fixed-format catalogue scan with a one-shot return. Haiku 4.5 handles it at one-quarter the cost and one-third the latency, and the strict JSON schema keeps the output rigid enough that quality differences between the two models are invisible.

Definition pinning — why "silently accumulating" maps to hard rules

Free-form natural language is ambiguous. "Silently accumulating" could mean a stock that's quietly building a base (the intended reading), or one whose summary happens to mention "building hidden demand" — even if its actual regime label is distribution. An early version of the page conflated those two and returned LIFE (an insurer in distribution) as a "silently accumulating bank" because its summary had the right vocabulary.

The fix was to write the system prompt as a tight rule sheet:

The regime label is the one-word value before vis. A summary that mentions "building demand" on a stock whose label is distribution is not accumulation.
silently accumulating = regime IN (accumulation, re_accumulation) AND hid↑ AND r ≤ 4%. Higher r means the stock is no longer silent — it's already breaking out.
"Banks" matches subsector = 'Bank' — not the parent sector "Keuangan", which also covers insurance (Asuransi), multifinance (Pembiayaan), and securities. "Financials" is the term that means the broader Keuangan parent.

The catalogue line was changed to expose the subsector explicitly ([Keuangan/Bank] instead of just [Keuangan]) so the rule actually has the substring it needs to match. Both fixes together turn an ambiguous LLM call into a near-deterministic catalogue scan.

What the user types vs. what runs

User question	Effective filter
"banks silently accumulating"	`subsector='Bank' AND regime∈(accum, re_accum) AND imag>0 AND r<0.04`
"insurers in distribution"	`subsector='Asuransi' AND regime='distribution'`
"strong markup with high clarity"	`regime='markup' AND r>0.05 AND coherence>0.7`
"crypto turning bottom"	`source='crypto' AND regime∈(capitulation, accumulation) AND imag>0`

The translation is performed inside Haiku, not in code — but the rule sheet in the system prompt makes the translation reliable enough to behave like a structured query in practice.

The takeaway: a 200-KB cached catalogue plus a tight rule sheet is faster, cheaper, and more auditable than an embedding pipeline for any corpus that fits in one context window. The cost is forcing yourself to write the rule sheet — but writing it explicitly is also what makes the Ask page debuggable. When a wrong ticker comes back, the rule that was misapplied is a single line of prose to fix.

Universe filters — keeping dead tickers out of the corpus

A market data feed is a museum of dead, halted, suspended, and never-traded instruments. Letting any of them into the analytical corpus poisons every downstream metric. Three small filters do most of the heavy lifting.

The three filters

Bar-count floor — drop tickers with fewer than 252 trading bars. A stock with only 90 bars of history can't have its annual cycle measured, can't have a coherence trail, can't be compared to peers. The floor is a hard requirement, not a soft warning. Default 252 (one trading year). Chips on the Universe page let the user relax this to 180, or tighten it to 500 (≥2y) for cycle research.
Active-pct floor — drop tickers whose price has changed on fewer than 30% of their bars. A stock at the same price for 70% of the window isn't quietly accumulating — it's not trading. The active-pct gate filters out halted, suspended, locked-up, and recently-IPO'd low-data names whose phasor would be dominated by zero-return bars. Default 30%; chips offer 0% (any) and 60% (liquid only).
Constant-price drop — applied at the equity-screener level, removes any ticker whose last close has equalled its mean close for >50% of the window. These are the truly dead names where the feed is still publishing yesterday's last print every day.

Why these filters are visible to the user

Each filter is exposed as a chip on the Universe and Screener pages, with the active selection highlighted. The user can always relax them — for example to research a recently-listed stock with only 90 bars — but the default catalogue, the Ask layer, and the regime-cluster visualisations all run with the strict defaults applied. The honest, rigorous corpus is the one the user sees first; the relaxed view is opt-in.

The point: a "1,500-ticker universe" is misleading if 500 of those tickers haven't traded in three months. The filtered universe of ~1,032 tickers is what every nightly batch — phasor compute, Claude interpret, regime cluster, Ask — actually operates on. The headline ticker count on the Universe tab is the filtered count, deliberately, so the rest of the platform's numbers are interpretable without a footnote.

9 · Instantaneous frequency & phase velocity

Once we have z(t) = r(t)·e^iθ(t), we can differentiate the phase:

ω(t) = dθ/dt

This is the instantaneous frequency. In music it's how fast a note is changing pitch. In markets it's how fast the cycle is advancing. A steady cycle has a slow, near-constant dθ/dt. A regime transition comes with a big spike in |dθ/dt| — the phase is rotating quickly through a boundary.

We flag a bar as being "in transition" when |dθ/dt| exceeds a threshold (default 25°/bar). These are the moments to pay attention.

Market link: Transitions come in two flavors — organic (slow, well-ordered, high coherence) and chaotic (violent, low coherence). The organic ones are where pair trades and rotation bets work. The chaotic ones are where you get stopped out.

10 · Net phasor & coherence

For a universe of stocks we compute the cap-weighted vector sum of every phasor:

z_net = (1/W) · Σ w_i · r_i · e^iθ_i

The magnitude |z_net| divided by the mean amplitude gives net coherence — a number between 0 and 1 that tells you how phase-aligned the whole sector is.

This is mathematically identical to the order parameter in statistical physics. It measures directional alignment of many oscillators.

> 0.70 — all stocks pointing the same way. Clear trend. Ride it.
0.40 – 0.70 — stocks split across stages. Rotation. Pair trades work.
< 0.40 — scattered. No theme. Stock-pick individually.

Market link: Coherence switches regime before returns data confirms the shift, because it measures the alignment of latent flows, not realized prices. It's the single most important environment indicator the framework produces.

11 · Kuramoto synchronization

In 1975 Yoshiki Kuramoto introduced a model for how large populations of coupled oscillators synchronize. Fireflies flashing in unison, heart cells firing together, electric grid generators locking into phase — they all follow the same equation:

dθ_i/dt = ω_i + (K/N) · Σ sin(θ_j − θ_i)

The order parameter r · e^iψ = (1/N) · Σ e^iθ_j is exactly our net coherence. Below a critical coupling K_c the population is incoherent; above it, a macroscopic fraction locks into phase.

Market link: A stock market under stress undergoes a phase transition in the exact Kuramoto sense — when a macro event hits (rate shock, earnings surprise, geopolitical crisis), the coupling between stocks rises, coherence jumps, and the whole market locks into one direction. That's the "risk-off everything-correlated-to-1" day. The same math that describes synchronizing fireflies describes it.

12 · Anti-phase pairs & structural arbitrage

Two stocks at phase angles θ_A and θ_B have a cross-phase distance:

Δ(A, B) = |θ_A − θ_B| wrapped to [0°, 180°]

When Δ ≈ 180°, they're in anti-phase — one is in Markup while the other is in Capitulation. Long A, short B, and you have a pair trade that doesn't care about the broader market direction, because it's structurally orthogonal to it.

Correlation-based pair selection can't find these. Pearson correlation collapses a stock to Re alone and penalizes pairs with opposite price moves. The phasor preserves the full complex structure, so it detects pairs that are structurally mirror images even if their return correlation looks normal.

13 · Connection to QKV attention

In 2003, US patent 8,572,041 proposed a key-value store indexed by historical state. In 2017, transformer models reformulated this as scaled dot-product attention: Q·K^T·V / √d_k. In both cases you have keys (memory), values (payloads), and queries (what to retrieve).

Applied to markets:

K (keys) = historical price levels → what the market remembers
V (values) = capital flow magnitudes → how much moved
Q (query) = the Hilbert phasor → what is being asked right now

The Hilbert transform constructs Q from the observed K and V stream. The imaginary axis is a learnable-free query that asks "given everything that came before, what's the latent pressure now?"

Market link: The QKV architecture is not a metaphor. It is the same mathematical structure, applied to capital instead of tokens. The framework is, in a literal sense, an attention mechanism on price — and like an attention head, its output is a single complex-valued context vector per time step.

14 · Why deterministic matters

Everything on this page is parameter-free once the Butterworth cutoff is fixed. No training data. No hyperparameter search. No drift. The same code on the same OHLC produces the same numbers today, next year, and five years from now.

For a fund this means: every trade is explainable, every signal is reproducible, and every backtest is identical to live. Regulators (OJK, MAS, SEC) can audit the pipeline end to end. A quant in Jakarta and a quant in Singapore running the same code will see the same regime labels. There is no "model version" to argue about.

15 · Phase extrapolation — trading the projected regime

The phasor tells you where a stock is. The derivative tells you how fast it's moving. Together they tell you where it's going.

Taylor series — the idea

Brook Taylor published this in 1715. The idea: if you know a function's value and all its derivatives at a single point, you can reconstruct the function's value at any nearby point. For a smooth function f(t), the value at t + k is:

f(t+k) = f(t) + f'(t)·k + ½f''(t)·k² + ⅙f'''(t)·k³ + …

Each term adds a layer of accuracy. The first term is where you are. The second is velocity (how fast you're moving). The third is acceleration (how fast the velocity is changing). And so on. The more terms, the further ahead the approximation holds — but for a smooth signal, even two or three terms are powerful over short horizons.

Why this matters: Taylor series is the same tool Newton used to compute planetary orbits, Euler used for differential equations, and every GPS receiver uses between satellite fixes. It's not a statistical model — it's calculus. It works whenever the underlying function is smooth, and the Butterworth filter guarantees that θ(t) is smooth.

Applying Taylor to phase

At every bar the phasor pipeline gives us three numbers:

θ(t) — the phase angle. Where the stock is in the Wyckoff cycle right now.
ω = dθ/dt — the angular velocity. How many degrees per bar the phase is advancing. Already computed by MarketPhasor as the d_theta column.
α = d²θ/dt² — the angular acceleration. How fast ω is changing. Computed as the mean of diff(d_theta) over the last 20 bars.

Plugging these into the Taylor series, truncated at second order:

θ̂(t+k) = θ(t) + ω·k + ½α·k²

The first term says "you're at 90°." The second says "you're moving at 5°/bar, so in 5 bars you'll be at 115°." The third says "but you're decelerating at 0.3°/bar², so actually you'll be at 111°." Each term corrects the previous one.

Why we stop at second order

The third derivative (jerk) and beyond are noise for daily equity data, even after Butterworth filtering. Two terms capture the meaningful dynamics: is the stock speeding up or slowing down through the cycle? Beyond that, the signal-to-noise ratio flips. Backtest validation confirms this — adding a cubic term doesn't improve H5 or H10 accuracy.

The same expansion for amplitude

Phase tells you which regime. Amplitude tells you how much energy is behind it. We apply the same Taylor expansion to r(t):

r̂(t+k) = max(0, r(t) + dr·k + ½d²r·k²)

Clamped at zero because amplitude can't go negative. A decaying r means the stock is coasting — the move is losing energy even if the phase is still advancing. A growing r means the move has fuel. Both matter for conviction.

Why phase extrapolates better than price

Price is noisy, non-stationary, and mean-reverting at different timescales simultaneously. Phase, after filtering, is a smooth monotonic-ish function that advances through the Wyckoff cycle at a locally stable rate. The angular velocity ω changes slowly — a stock in markup doesn't suddenly teleport to capitulation. This is why a simple quadratic fit on θ produces useful projections at horizons of 1–10 bars, whereas the same fit on price would be meaningless.

Backtest result: across the IDX universe, H1 (1-bar) projections achieve 93% conviction hit rate with 11° mean residual. H5 (5-bar) achieves 65% with 60° residual. H10 degrades to 59%. The degradation is predictable and well-behaved — no cliff edges.

From projected phase to projected conviction

Once we have θ̂(t+k), we reconstruct the projected phasor in Cartesian:

r̂(t+k) = max(0, r(t) + dr·k + ½d²r·k²)
Re_hat = r̂·cos(θ̂) Im_hat = r̂·sin(θ̂)

From (Re_hat, Im_hat, projected regime) we run the same conviction classifier that the portfolio system uses at t. The output is a projected conviction — hl, dw, pr, or he — at horizon k.

This is what the portfolio system trades on. Not "where is the stock today?" but "where will it be when my position is mature?"

Five horizons

Projections are computed at five horizons, each mapping to a holding-period intent:

Horizon	Bars	Intent	Typical use
H1	1	Very short	Intraday confirmation — is the next bar likely to stay in regime?
H3	3	Short	Swing entry — will conviction hold through the entry settling period?
H5	5	Medium	Default agent horizon — the trade thesis maturity window.
H7	7	Long	Position sizing — is the move projected to have legs?
H10	10	Strategic	Conviction filter — reject entries that degrade within 10 bars.

Confidence: the uncertainty envelope

The standard deviation of dθ/dt over the lookback window measures how stable the angular velocity has been. A stock rotating at a steady 5°/bar has a tight σ; one wobbling between −30° and +20° has a wide σ. Confidence maps this to [0, 1]:

confidence = max(0, 1 − σ(ω) / 90°)

The portfolio system ignores projections with confidence below 0.40 (configurable). This prevents noisy extrapolations from triggering false signals.

The residual — hypothesis tracking

Every projection is a hypothesis. When the next bar arrives, we compute the residual:

residual = θ_actual − θ̂_projected (wrapped to −180°, 180°]

A small residual means the trajectory is holding. A growing residual means something changed. The system tracks this for every open position:

|residual| < 45° → hypothesis confirmed, hold position.
|residual| > 45° → hypothesis diverged, flag for exit review.

This is the same observe→act loop used in Kalman filtering and model-predictive control: project, observe, compute residual, decide. The threshold (45°, configurable) is one quarter of the cycle — if you're wrong by more than a full regime boundary, the trade thesis is broken.

The workflow in one line: extrapolate → filter by projected conviction with margin → commit → observe incoming bars → residual confirms or disproves → stay or leave.

16 · What can be further added

The current framework is the minimal deterministic version. Several well-understood extensions plug in naturally:

Multi-scale phasors — run the pipeline on multiple Butterworth cutoffs simultaneously (e.g. 0.1, 0.25, 0.4). You get a daily, weekly, and monthly cycle on the same chart. A stock can be in Markup on the weekly and Distribution on the daily — exactly the setup where you take partial profits.
Empirical Mode Decomposition (EMD) — Huang 1998. A data-adaptive alternative to Fourier that extracts intrinsic mode functions. Feeding each IMF through the Hilbert transform gives the Hilbert-Huang spectrum. Better for non-stationary signals, which equities are.
Wavelet phase — Morlet or Paul wavelets give time-frequency phase maps, letting you see how the stage breakdown evolves with both time and timescale.
Kalman-smoothed phase — add a state-space model over (r, θ, dθ/dt) to improve edge-of-series estimates (the Hilbert transform degrades at the boundaries).
Cross-asset phasors — apply the same pipeline to bonds, FX, commodities, crypto. The z_net across asset classes gives a macro regime indicator that no single market provides.
Phase-locked option pricing — use θ to tilt implied volatility skew. Distribution stages should command higher put skew than Accumulation stages for identical realized vol.
Event injection — overlay earnings dates, dividend dates, and FOMC meetings on the phasor. Quantify whether transitions cluster around known events (they often do) and use the residual as clean signal.
Graph coupling — build a graph where edge weight is 1 − cos(Δ_ij). The leading eigenvector gives the market's "principal phase mode" — like PCA, but on phase instead of returns.
Order-flow phasors — replace price with signed trade volume as the input signal. The Im axis then measures aggressive buying vs passive absorption, directly quantifying what market microstructure calls "informed flow."
Regime-conditional factor models — re-estimate value, momentum, and quality factors separately within each of the six phase stages. Factor premia are almost certainly stage-dependent.

The point: None of these are speculative. Each extension is a well-studied technique from DSP, applied mathematics, or econometrics, plugged into a framework that already works. Adding them doesn't require new theory — just engineering time.

17 · Glossary

Math term	Trader term	Meaning
z(t)	state	complex-valued capital state
Re	visible momentum	observable price move
Im	hidden pressure	latent flow, leads price
r(t)	move strength	amplitude, how big the move is
θ(t)	stage	phase angle, 0°–360°
dθ/dt	cycle velocity	how fast the stage is advancing
coherence	signal clarity	phase alignment over window
net coherence	togetherness	Kuramoto order parameter across stocks
Δ(A,B)	pair distance	angular distance, 0°–180°
e^iθ	unit cycle	walking the unit circle
ω = dθ/dt	rotation speed	angular velocity, degrees per bar
α = d²θ/dt²	rotation acceleration	rate of change of angular velocity
θ̂(t+k)	projected stage	extrapolated phase at horizon k bars ahead
residual	tracking error	actual θ minus projected θ, wrapped to ±180°
H5	5-bar horizon	default projection horizon for trade decisions

Ask the market.

1 · The one idea

2 · The four situations

3 · The six stages of a cycle

4 · Euler's formula — "the most beautiful equation"

5 · What is a phasor?

Why phasors exist at all

The phasor diagram

Where phasors show up

How it applies to the stock market

The analogy in one picture

6 · Complex numbers as market state

7 · The Hilbert transform

8 · The Butterworth filter

Laplace transform — the s-plane lens

From eiωt to est

The four quadrants — what kind of system?

How we estimate σ and ω from price

Worked example — TLKM through one full year

The interactive layer

Reading the trail in three example shapes

Fourier transform — the cycle spectrum

Joseph Fourier and the heat equation

Why this is genuinely a third lens

The pipeline — discrete, browser-side, parameter-light

The five visualisations

1 · Periodogram

2 · Reconstruction overlaid on price

3 · Spectrogram (STFT)

4 · Cycle catalogue with calendar labels

5 · Phase coherence dial

Long-history mode — Yahoo Finance fallback

RFFT — scipy real FFT and Welch averaging

Why "real" FFT

Welch — averaging away spectrum variance

Cycle alignment across assets

What this fixes that the Fourier page can't

How the page is laid out

The peers input

Pipeline — server-side, scipy, sub-second

Wavelets — when each cycle existed, not just whether

The little wave

The CWT scalogram — computed, not shown

"Cycles alive at [date]" — the vertical slice

"Energy share at [date]" — the live DWT-style breakdown

Jump detection — what Fourier can't do at all

The plain-English read

How the page is laid out

How this connects to the rest of the system

Pipeline — server-side, Python, sub-second

Look-ahead bias — quantified

The question that started it

The invariant that makes the experiment possible

How the causal trail is built

The disagreement percentage

What we actually found

The four views we now expose

How to read the Screener's bias flag

The invariant in action — the "why doesn't this agree?" moment

Why filtfilt specifically is the culprit

The edge effect, visualised

Paths to a genuinely real-time pipeline

Production architecture — how the nightly machine works

1 · Causal pre-computation

2 · AI interpretation

3 · Multi-source MCP gateway

4 · The daily parquet pipeline

5 · Determinism as a first-class property

Natural-language search — the Ask layer

What the catalogue contains

The single-shot retrieval architecture

Why Haiku, not Sonnet

Definition pinning — why "silently accumulating" maps to hard rules

What the user types vs. what runs

Universe filters — keeping dead tickers out of the corpus

The three filters

Why these filters are visible to the user

9 · Instantaneous frequency & phase velocity

10 · Net phasor & coherence

11 · Kuramoto synchronization

From e^iωt to e^st

Why `filtfilt` specifically is the culprit