1 · The one idea
Every chart you've seen shows one thing: where price has been. This framework measures two things — the visible move, and the hidden pressure that changes before price does.
2 · The four situations
Each of the two numbers can be positive or negative. Four combinations, four clear actions.
Visible UP · Hidden UP Hold or add The trend has fuel. Both axes agree. |
Visible UP · Hidden DOWN ⚠️ Take profits Price looks fine but pressure has flipped. Smart money leaving. Distribution. |
Visible DOWN · Hidden UP Watch for the bottom Falling still, but buyers stepping in. Reversal setup. |
Visible DOWN · Hidden DOWN Wait No floor. Don't catch the falling knife. |
3 · The six stages of a cycle
Price moves in cycles, like seasons. Six stages, each with a name borrowed from Wyckoff market structure.
- Accumulation — smart money quietly buying. Chart looks flat. Hidden pressure rising. The move begins here.
- Markup — the visible uptrend.
- Distribution — top forms. Price still rising, hidden pressure flipping. Warning.
- Markdown — visible downtrend.
- Capitulation — panic bottom.
- Re-accumulation — base rebuilds. Cycle restarts.
4 · Euler's formula — "the most beautiful equation"
In 1748 Leonhard Euler published an identity that connects five of the most fundamental numbers in mathematics:
Richard Feynman called it "the most remarkable formula in mathematics." It's a special case of a more general identity — and the general form is the one we use here:
This says: raising e to an imaginary power traces a circle in the complex plane. As θ sweeps from 0 to 2π, the point eiθ walks once around the unit circle. The real part is cos θ, the imaginary part is sin θ. These two components are always 90° out of phase — one leads the other by a quarter cycle.
Multiply by a magnitude r and you get a phasor — a single complex number that encodes both how big something is and where it is in its cycle:
Engineers have used this since the 1890s to describe AC electricity. Quantum mechanics uses it for wave functions. Signal processing uses it for audio, radio, radar, and sonar. We use it for price.
r becomes move strength; θ becomes the stage.
5 · What is a phasor?
A phasor (phase vector) is a complex-number shortcut for describing an oscillating signal. Anything that swings back and forth — a voltage, a sound pressure, a price — can be written as:
Three pieces: V is the amplitude (how big the swing is), ω is the angular frequency (how fast it cycles), and θ is the phase angle (where in the cycle it starts). Carrying all three through calculus is painful. So engineers collapse them into a single complex number:
That's the phasor. It's a vector in the complex plane of length V pointing at angle θ. The time-varying ωt part falls out, because in a steady-state system every element oscillates at the same frequency — it's the relative amplitudes and phases between signals that matter, not the absolute clock.
Why phasors exist at all
Charles Proteus Steinmetz, a General Electric engineer, popularized phasors in 1893 to make AC circuit analysis tractable. Before him, you'd solve an AC circuit by writing differential equations for the sinusoidal voltages and currents and integrating by hand. With phasors, the same problem becomes algebra — complex-number multiplication and division. Differentiation of a sine turns into multiplying the phasor by iω. Integration turns into dividing by iω. A circuit with capacitors, inductors, and resistors becomes a set of linear equations you can solve with pencil and paper.
The phasor diagram
When you draw phasors as arrows on the complex plane, you can see the relationships. Voltage and current in a capacitor are 90° apart. Current lags voltage in an inductor by 90°. A power factor of 0.8 means the load's current phasor is rotated 37° from the voltage phasor. One diagram captures the whole AC behavior of a circuit. It's the reason every electrical engineering textbook is full of arrows on a complex plane.
Where phasors show up
- Power systems — grid synchronization, fault analysis, reactive power.
- Communications — IQ modulation, FM demodulation, software-defined radio.
- Control theory — frequency response, Bode plots, Nyquist stability.
- Signal processing — Fourier analysis, filter design, vibration analysis.
- Quantum mechanics — wave functions are literally complex phasors evolving in time.
How it applies to the stock market
A stock's price over time is also an oscillating signal. Wyckoff was describing this intuitively a century ago when he named the cycle stages — Accumulation, Markup, Distribution, Markdown, Capitulation, Re-accumulation. That's not four separate phenomena. It's one cycle at different angular positions, and each stage corresponds to a specific slice of the 360° circle.
If we can extract the magnitude and phase of that cycle from the observed price, we get a phasor per stock per bar:
where:
r(t)= the amplitude — how big the current move is. This is move strength. High r means real conviction behind the move; low r means the stock is just grinding.θ(t)= the phase angle — exactly where in the Wyckoff cycle we are. θ = 90° is pure Markup. θ = 180° is Distribution peak. θ = 270° is Capitulation bottom. θ = 0° is Accumulation.Re(z) = r·cos(θ)= the real part = visible price momentum. What you see on the chart.Im(z) = r·sin(θ)= the imaginary part = hidden flow pressure. What the chart hasn't shown yet. Leads price by a quarter cycle.
The Hilbert transform is the machine that turns a raw price series into a phasor. It's parameter-free — no model to train, no thresholds to tune. You feed in 500 bars of closing prices, you get back 500 phasors: one complex number per day, each containing the full state of that stock at that moment.
The analogy in one picture
Picture an arrow rotating counterclockwise around the origin of a graph, once per market cycle. The arrow's length is the amplitude — how much energy is in the move. The arrow's angle is the phase — which stage we're in. The shadow the arrow casts on the horizontal axis is the visible price momentum. The shadow it casts on the vertical axis is the hidden flow pressure. Every day, you take a snapshot of the arrow. That snapshot is the phasor.
In a perfect world, price would be a clean sine wave and the arrow would rotate at constant speed. In reality, price is noisy and the arrow wobbles — it speeds up, slows down, sometimes reverses. The rate at which the angle changes (dθ/dt) is how fast the cycle is advancing, and a sudden spike in that rate is the signature of a regime transition.
6 · Complex numbers as market state
A complex number z = a + i·b has two parts that can't be collapsed into one. This is exactly what a stock needs: it's not enough to know the price went up — you also need to know why, and whether the reason is strengthening or weakening.
In our framework:
- Re(z) = the visible price move. Positive = rising, negative = falling.
- Im(z) = the hidden flow. Positive = buying pressure, negative = selling pressure.
- |z| = r = the amplitude — how big the current move is.
- ∠z = θ = the phase angle — which stage of the cycle we're in.
Re and Im are orthogonal. No matter what Re does, Im can tell a different story. That's the whole point. Pearson correlation collapses a stock to a single time series, discarding everything except the Re axis. The phasor sees both.
7 · The Hilbert transform
Here's the problem: a chart only gives us Re. We need a principled way to construct the Im component from the observed price alone. The Hilbert transform does exactly this.
Given a real signal x(t), the Hilbert transform H{x}(t) is the unique linear operator that shifts every frequency component by 90°. The analytic signal is then formed as:
The resulting z(t) is a complex time series whose magnitude is the signal's envelope and whose phase is its instantaneous angle. This is how we recover both axes from a single observed price stream.
Mathematically, in the frequency domain, the Hilbert transform multiplies positive frequencies by −i and negative frequencies by +i. Time-domain equivalent: a convolution against 1/(π·t).
The transform was introduced by David Hilbert around 1905 and became fundamental to communications theory through Dennis Gabor's 1946 paper on signal expansion. It's the math behind FM demodulation, speech processing, structural vibration analysis, radar echo processing, and MRI k-space reconstruction.
8 · The Butterworth filter
Raw log returns are too noisy to phase-decompose directly. High-frequency jitter dominates, masking the cycle we care about. A lowpass filter removes the noise before the Hilbert step.
The Butterworth filter, invented by Stephen Butterworth in 1930, has the flattest possible passband response of any filter. For a lowpass Butterworth of order n with cutoff ωc:
It's the only parameter choice in the pipeline. Default cutoff is 0.25 of the Nyquist frequency, order 4. This preserves cycles of roughly 8 bars and longer, removes day-to-day noise.
Laplace transform — the s-plane lens
The phasor tells you where a stock is on its cycle. The Laplace transform tells you what kind of system the cycle belongs to — exploding, decaying, oscillating, or mean-reverting. One pole on a 2-D map captures it.
From eiωt to est
The phasor identity eiωt traces a circle at constant amplitude — a perfect, undamped oscillation. Real markets aren't perfect; moves grow, decay, and reverse. To describe those, we replace iω with a complex number:
Now the same expression est = e(σ+iω)t = eσt·eiωt says two things at once. eσt is the envelope (growing, shrinking, or steady); eiωt is the rotation inside that envelope. σ is the real part — damping rate. ω is the imaginary part — oscillation frequency. Together they form a single point in the complex plane called the s-plane.
Pierre-Simon Laplace published this in 1782, generalising Euler's circle into a tool for solving differential equations. Oliver Heaviside in the 1880s used it to tame electrical transients. Every control engineer since 1930 has thought in s-plane: the location of a system's poles tells you, at a glance, whether it's stable, oscillatory, damped, or about to blow up.
The four quadrants — what kind of system?
Plotting (σ, ω) places every market state in one of four geometric regions, each with a clear behavioural signature:
|
σ > 0, ω ≠ 0 — explosive oscillation
Riding a momentum move
Amplitude growing and rotating. Markup phase with conviction. Trends until σ rolls negative.
|
σ > 0, ω ≈ 0 — pure exponential growth
Late-stage parabolic
No cycle, just compounding. Crypto bubbles, manias. Beautiful until σ flips.
|
|
σ < 0, ω ≠ 0 — damped oscillation
Mean-reverting cycle
Range-bound, oscillating but losing energy. Pair-trade territory.
|
σ < 0, ω ≈ 0 — pure decay
Markdown / capitulation
Falling without rotation. The classic dead-cat tape.
|
The same stock visits all four quadrants over a full Wyckoff cycle. Capitulation lives in the bottom-right; the floor is when σ crosses zero and ω begins to spiral the dot back up into accumulation; markup walks the top-right; distribution is the slow rotation through the top-left as σ flips negative again. The s-plane trail is a year of behaviour drawn as one geometric shape.
How we estimate σ and ω from price
The phasor pipeline already produces the amplitude r(t) and phase θ(t) per bar. From those:
σ ≈ d(ln r)/dt— the logarithmic growth rate of amplitude. Positive σ means the move is gaining energy; negative σ means it's bleeding out. We compute σ as the slope of a 5-bar linear regression onln r, which smooths edge noise without introducing meaningful lag.ω ≈ dθ/dt— the angular velocity, the same number used in the instantaneous-frequency view. Already a column in the parquet.
One pair (σ, ω) per bar; one trail through the s-plane per stock.
Worked example — TLKM through one full year
Run TLKM through the Laplace page and the trail traces a recognisable shape. Early in the window the dot sits in the bottom-left (σ ≈ −0.02, ω ≈ +5°/bar) — damped oscillation, mean-reverting around 3,400 IDR. A three-week markup pulls the dot into the top-right (σ ≈ +0.04, ω ≈ +6°/bar) — explosive oscillation. Then σ rolls over and the dot drifts left through the top-left quadrant — distribution. The trail finishes the year back near origin, having made one complete loop.
The pole's distance from the origin |s| measures how energetic the system is overall. A dot tightly bunched near origin is a quiet, range-bound stock; a dot orbiting at radius 0.05 is a stock with real conviction in either direction. Reading the trail end-to-end is reading a year of behaviour as one geometric shape.
The interactive layer
The Laplace page mirrors the Phasor page's UX so the two feel like the same instrument viewed from different angles:
- Price chart at the top with a bar-by-bar cursor — hover to highlight, click to lock the timeline to that bar. Identical UX to the Phasor tab so the muscle memory transfers.
- Animated scrub slider with a play/pause button. Press play and the dot walks the s-plane, the price cursor advances in lockstep, and the small "behaviour classifier" caption updates in real time ("damped oscillation → explosive oscillation → markdown decay").
- Quadrant-residence histogram — a four-bar chart showing what fraction of the year was spent in each quadrant. A markup-heavy year shows tall right-side bars; a distribution-heavy year shows tall left-side bars; a healthy mean-reverting stock shows roughly equal bars across all four.
- Sparklines for σ(t) and ω(t) so you can scan when each component crossed zero — those crossings are the regime transitions.
- Faded polyline trail behind the current pole, with a pulsing ring on the live dot, so even on a static screenshot you can see the direction the system has been moving.
Reading the trail in three example shapes
- Tight knot near origin — a quiet stock. Low amplitude, slow rotation. Not interesting until something breaks the knot.
- Counter-clockwise loop centred on origin — a textbook Wyckoff cycle. The healthiest possible shape.
- Long line drifting outward — a runaway. Either the stock is in genuine sustained markup (rare and beautiful) or it's a data artefact (delisting, suspension). Worth investigating either way.
Fourier transform — the cycle spectrum
The phasor measures the dominant cycle. The Laplace pole describes the system's character. The Fourier transform maps every cycle in the window — earnings rhythms, options expiries, weekly behaviour, quarterly rebalancing — at once.
Joseph Fourier and the heat equation
In 1822 Joseph Fourier published Théorie analytique de la chaleur, in which he claimed that any function — no matter how irregular — could be written as an infinite sum of sines and cosines. Mathematicians of the time refused to believe him. He was right. The transform that bears his name decomposes a signal into its constituent frequencies:
The integrand is the same Euler kernel eiθ we've been using for everything — only this time we sweep f across all frequencies and ask, at each one, "how much of the signal is rotating at this rate?" The answer is a complex number: magnitude = how much energy at that frequency, phase = where in the cycle we currently are.
Why this is genuinely a third lens
| Lens | Question it answers |
|---|---|
| Phasor (Hilbert) | At this moment, the dominant cycle has phase θ and amplitude r. Quasi-stationary, mono-frequency. |
| Laplace (s-plane) | What kind of system is generating the dynamics — explosive, damped, oscillating, decaying? |
| Fourier (spectrum) | Across this window, the full distribution of energy across all frequencies. All cycles at once, not just the dominant. |
The Fourier view answers a question the others can't: which discrete cycles are alive in this signal, and how strong are they relative to noise? Earnings rhythms (~63 trading days), monthly options expiries (~21 days), weekly drift (~5 days), Indonesian dividend mid-year cluster (~125 days) — all of these can coexist in one stock and the periodogram surfaces them simultaneously.
The pipeline — discrete, browser-side, parameter-light
- Log returns. Raw price has a trend; the Fourier transform requires zero mean. We compute
ln(p[t]/p[t−1]), then subtract the mean. - Hann window. The FFT implicitly assumes the signal is periodic — that bar N−1 is followed by bar 0. For a price series that's egregious. Multiplying by a Hann window (raised cosine) tapers the edges to zero, suppressing the spectral leakage that would otherwise contaminate every bin.
- Zero-pad to next power of 2. Cooley-Tukey radix-2 FFT requires
Na power of 2; padding to 512 from a 365-bar year costs nothing and improves frequency resolution. - FFT. Hand-rolled 50-line in-place radix-2 Cooley-Tukey in the browser. At N≤512 it runs in microseconds.
- Magnitude squared = power. Bin
kcorresponds to a cycle of periodN/ktrading days. We display only periods of 4 days to N/2, since edge bins are dominated by windowing artefacts.
James Cooley and John Tukey published the radix-2 algorithm in 1965, reducing the FFT cost from O(N²) to O(N log N) and making modern signal processing possible. We use the textbook in-place version; the whole transform fits on one screen of JavaScript.
The five visualisations
1 · Periodogram
Power spectrum on a log-x axis, x = period in trading days, y = amplitude. Tall spikes = dominant cycles. The top five peaks are auto-labelled in plain English ("21d ≈ monthly", "63d ≈ quarterly earnings", "252d ≈ annual"). A median-density line on the chart marks the noise floor; peaks above 3× median are flagged "strong", above 1.5× are "medium", below are "noise".
2 · Reconstruction overlaid on price
Take only the top five peaks, sum their cosines and sines (with the FFT-derived amplitudes and phases), exp-cumsum back to price space, overlay on the actual price. The headline number: R² — what fraction of the price variance is captured by these five cycles? Three example readings:
| Ticker | R² (top 5) | Strongest cycle | Reading |
|---|---|---|---|
| TLKM | ~22% | 64d (quarterly earnings) | Cyclically faithful — repeating 64-day rhythm explains a fifth of total variance. |
| BBCA | ~8% | 2.7d (microstructure) | Mostly trending — what's not in the trend is short-cycle noise, not interpretable cycles. |
| BTCUSDT | ~6% | 2.7d | Almost entirely trending; cycle hunting on crypto is mostly futile and the low R² says so honestly. |
Low R² is not a failure of the framework — it's an honest answer that the signal doesn't have meaningful cycles. We deliberately don't smooth, fit, or detrend further to manufacture prettier numbers.
3 · Spectrogram (STFT)
Heatmap, x = time (rolling-window centre), y = period, colour = amplitude. Computed as a Short-Time Fourier Transform with a 60-bar window stepped by 10 bars, so a 365-bar series produces ~30 columns × ~30 frequency rows. Reading the heatmap: vertical bands of colour at a constant period = a stable cycle alive across the whole window; bands that appear and fade = cycles that came and went (rare; usually triggered by a corporate-action change in cash-flow rhythm). Click any column on the spectrogram and the periodogram below it scrubs to that exact 60-bar window — visually demonstrating how the dominant cycle drifted over the year.
4 · Cycle catalogue with calendar labels
A small table beside the periodogram, listing the top five cycles with calendar interpretation:
| Period (bars) | Calendar reading | Amplitude | Significance |
|---|---|---|---|
| 5 | ≈ weekly | 0.018 | strong |
| 21 | ≈ monthly (options) | 0.012 | strong |
| 63 | ≈ quarterly (earnings) | 0.009 | medium |
| 252 | ≈ annual | 0.004 | noise floor |
Calendar mapping is auto-applied: 5d → weekly, 21d → monthly, 63d → quarterly, 125d → semi-annual, 252d → annual. Periods that don't fall near a calendar anchor are labelled simply by their length ("~12 days").
5 · Phase coherence dial
For the top three cycles, compute their current phases. Display a small dial showing where each one sits on its own circle. The trader's question: are they reinforcing or interfering?
- If two or more dominant cycles are simultaneously near a peak — they reinforce. The dial shows a "reinforcement zone" tag.
- If one is at peak and another is at trough — they interfere. The market sees offsetting pressures and tends to range-bound.
This is descriptive, not predictive. The dial reads the structure that is currently in place; it doesn't forecast a turn.
Long-history mode — Yahoo Finance fallback
The default Fourier window is the 365-bar parquet that the nightly batch fetches via the local MCP gateway. That's enough resolution to see weekly, monthly, quarterly, and annual cycles — but not enough for macro-scale rhythms. The famous Kitchin (3–5 years), Juglar (7–11), Kuznets (15–25), and Kondratieff (45–60) cycles need decades of data to detect.
The Fourier page exposes a Long history toggle. When enabled, the server fetches up to 10 years of daily bars from Yahoo Finance — IDX tickers via the .JK suffix (e.g. TLKM.JK), crypto via the −USD suffix (e.g. BTC-USD) — and runs the full Fourier pipeline over the extended window. TLKM.JK returns ~2,460 bars; BTC-USD returns ~3,650. With a 2,460-bar window, FFT bins resolve down to a 9-day period at the high end and up to ~1,230 days (≈ 5 years) at the low end — enough to surface a Kitchin candidate if one exists.
RFFT — scipy real FFT and Welch averaging
The Fourier page runs a single-window FFT in JavaScript. The RFFT page runs scipy's rfft on the server and adds a Welch averaged periodogram — the same idea, but with much less variance in the spectrum estimate.
Why "real" FFT
Price returns are real-valued, but a generic FFT treats the input as complex and returns a symmetric spectrum that mirrors itself across the Nyquist frequency. scipy.fft.rfft exploits the symmetry to return only the unique half (length ⌊N/2⌋+1) and runs ~2× faster on long histories. For our purposes the numbers come out the same as the JS Cooley-Tukey FFT on the Fourier page — the value isn't precision, it's that we're now on a server-side numerical stack that lets us layer Welch on top.
Welch — averaging away spectrum variance
A single-window FFT of a noisy signal is itself noisy: each bin's amplitude has an enormous standard error. Two different 365-bar windows of the same stock can produce completely different "top peaks". Welch's method splits the signal into overlapping windows (we use 256-bar segments with 50% overlap), runs an FFT on each, and averages the resulting periodograms. The averaging cancels the random-component noise while preserving structurally stable cycles. Cycles that survive Welch are real; cycles that only show in the single-window FFT are noise.
Cycle alignment across assets
If you pass a comma-separated list of peer tickers, the page also computes Jaccard alignment of their dominant cycles against the primary. Two stocks with alignment ≈ 1.0 share the same set of structural rhythms — they're driven by the same macro inputs — even if their Pearson correlation is mediocre because one trends harder than the other. This is the input the Coherence page should eventually consume in place of (or alongside) Pearson.
What this fixes that the Fourier page can't
The Fourier page's "top peaks" can change between reloads because the single-window estimate is high variance. If you look at a top-peaks list and feel uncertain whether they're stable cycles or coincidence, run the same ticker through RFFT and see which peaks survive Welch averaging — the ones that do are the ones worth basing a thesis on.
How the page is laid out
- Spectrum panel — a single chart with two lines overlaid on log-period x-axis. Blue = single-window rfft amplitude. Orange = Welch averaged amplitude. Green vertical pins mark the top-5 cycles. The orange line is dramatically smoother than the blue — that's the whole point.
- Top-5 cycles table — period in bars, calendar label (~weekly, ~monthly, ~quarterly), % of total variance, and the cycle's current phase in degrees (0° = peak, 180° = trough).
- Spectral entropy gauge — a single number from 0 (perfectly cyclic) up to
log(N/2)(pure noise). Coloured green / orange / red depending on where it sits in that range. - Cycle alignment matrix — only appears when peers are passed. For each peer, shows their dominant period and the Jaccard alignment of their top-5 cycle set against the primary's. Bars on the right give the visual.
The peers input
The second input field on the page accepts a comma-separated list of peer tickers (e.g. BBCA,UNVR,ASII). The server fetches each peer's bars in parallel through the same data path as the primary, runs SpectralMap on each, and computes Jaccard similarity between their dominant-period sets. A peer with alignment ≈ 1.0 shares the same structural rhythms as the primary — they're driven by the same macro inputs even if their Pearson correlation is mediocre.
This is the input the Coherence page should eventually consume in place of (or alongside) Pearson. Pearson conflates trend slope and cyclic alignment; Jaccard on the rounded period set isolates the cyclic alignment alone.
Pipeline — server-side, scipy, sub-second
- Bars in. Same
/api/stock/:ticker[/long]data path as Fourier and Wavelets. - Worker spawn. Node pipes
{ mode: "spectral", ticker, dates, prices, peers }toserver/phasor_worker.py. - Compute. Log-returns + demean + Hann window,
scipy.fft.rfftfor the spectrum,scipy.signal.welchwith 256-bar segments and 50% overlap for the averaged periodogram. ThenSpectralMapfromsrc/spectral.pyfor top cycles, entropy, dominant period, phase-at-frequency, and (for each peer)cycle_alignment. - JSON out. The browser receives
{ spectrum, welch, top_cycles, entropy, dominant, alignment }and renders the panels via Canvas 2D.
Wavelets — when each cycle existed, not just whether
Fourier's fundamental limitation: it tells you which cycles are present in the window, but not when. A 22-day cycle that died 90 bars ago and a 22-day cycle that's alive right now look identical in the periodogram. Wavelets fix this — they trade infinite frequency precision for finite time localization.
The little wave
A Fourier basis function is ei2πft: a sine that runs forever. A wavelet is a little wave — it starts at zero, oscillates briefly, and returns to zero. The Morlet wavelet (a Gaussian-windowed complex sinusoid) is the standard choice for finance: it gives the best simultaneous time and frequency localization of any function obeying the uncertainty principle.
By scaling the wavelet (stretching or compressing it) we tune the cycle period it's sensitive to. By translating it across the time axis we ask "is there a cycle of this period at this time?". Doing both for every (scale, time) pair produces the scalogram — a 2D heatmap that the Fourier transform's 1D periodogram cannot draw.
The CWT scalogram — computed, not shown
The continuous wavelet transform on the Wavelets page runs 48 scales (geometrically spaced from 2 bars to ~N/4) using a complex Morlet wavelet. The output is a (48 × time) matrix of magnitudes — the full scalogram, time on x, cycle period on y. We compute it, but we don't render the heatmap. Two-dimensional rainbow heatmaps were confusing more than they explained: the eye couldn't tell whether a bright spot meant a cycle was alive or just that the colormap was non-linear. We surface the same information through two cursor-aware panels instead — a vertical-slice bar chart and a band-energy table — that read the scalogram column at a single time.
"Cycles alive at [date]" — the vertical slice
At the cursor position, we read out the scalogram column: 48 magnitudes, one per scale. Each is row-normalized (each scale divided by its own row max) so coarse cycles aren't drowned out by fine ones, then plotted as a horizontal bar chart with period on the x-axis and magnitude on the y-axis. Move the cursor and the bars rearrange — high-energy periods light up, dead cycles fade. This is the answer Fourier can't give: at this specific moment, which cycles are driving the price?
"Energy share at [date]" — the live DWT-style breakdown
For a single, plain-English number per band, we group the 48 CWT scales into the same period bands as a 6-level Daubechies-4 DWT — D₁ (~2–4 bars), D₂ (~4–8), D₃ (~8–16), D₄ (~16–32), D₅ (~32–64), D₆ (~64–128), and an approximation band (~128+). At every cursor position we sum |CWT|² within each band and re-normalize so the column sums to 100%. The result is a 6×T matrix of percentages — every column tells you what fraction of cyclic energy is in that band at that moment. The table updates live as you scrub.
The whole-history DWT decomposition is computed separately — that's the standard pywt.wavedec output, used for the three reconstructions and their R²:
- Trend = approximation + D₅ + D₆. Period roughly 64+ bars (multi-month).
- Swing = D₃ + D₄. Period roughly 8–32 bars (weekly–monthly).
- Random = D₁ + D₂. Period roughly 2–8 bars (sub-weekly noise).
Each reconstruction inverts pywt.waverec with only its own bands non-zero. The R² of each vs the original log returns answers a different question than the live cursor: it tells you which behavior dominates the stock across its whole history, not just right now. Both views matter — the live view is the snapshot, the R² is the long-run baseline.
Jump detection — what Fourier can't do at all
Fourier needs an infinite number of sines to construct a sharp edge, which is why a flash crash creates Gibbs ringing across the entire spectrum. Wavelets at fine scales (D₁, D₂) are locally sensitive to discontinuities. The page flags time bins where fine-scale coefficients exceed 5×MAD — an established threshold for outlier detection — and overlays them as red verticals on the price chart. On crypto histories you'll see the FTX collapse, the ETF approval pump, and major exchange listings as crisp vertical lines, dated to within ±2 bars. The 50 strongest are listed in a table at the bottom of the page.
The plain-English read
The header card on the page is a one-glance summary that reads the live numbers and writes a five-line story in the same labeled style as the Phasor page's formula box. The classification thresholds are deliberately blunt:
| Live numbers | Regime | Action |
|---|---|---|
| Random ≥ 55% | Random | sit on hands; no rhythm to predict |
| Trend ≥ 55% | Trend | go with the direction; don't bet against the bigger picture |
| Swing ≥ 35% | Swing | buy near a recent low, sell near a recent high; size around one cycle period |
| otherwise | Changing | wait; trade smaller until one bucket clearly leads |
The "Versus history" line then compares the live mix against the same buckets averaged over the whole history, and flags shifts of ≥20 percentage points — that's how the page tells you "this regime is unusual for this stock right now". The "Credibility" line uses the whole-history reconstruction R² as a sanity check: if a stock's long run is 80%+ random, even a swing-looking moment is a snapshot of which way the random is leaning, not a deep pattern. The story is generated entirely from observable numbers — no LLM, no opinion, no surprise outputs.
How the page is laid out
- Toolbar — slider scrubber + ▶ Animate button. Same widget the Phasor page uses (
.toolbar-row+.inline-scrub+.anim-btn) so muscle memory carries over. - Price + jump markers — the canvas you actually trade off. Hover scrubs the cursor; click locks it and stops autoplay; ←/→ steps ±1 bar, Shift+←/→ steps ±10. A purple cursor on the chart and a synchronized cursor in every other panel keep the time aligned.
- Cycles alive at [date] — the bar chart that replaces the scalogram heatmap.
- Plain-English read — the five-line story (Regime / Reading / Action / Versus history / Credibility).
- Energy share at [date] + Reconstructions (R²) side-by-side — the live DWT-style table and the whole-history R² breakdown.
- Detected jumps — top 10 by magnitude, dates listed.
Every card heading carries the ticker (TLKM — price + jump markers, etc.) so a screenshot of any single panel is unambiguous.
How this connects to the rest of the system
The phasor framework collapses each ticker to one dominant cycle (r·eiθ) via the Hilbert transform, which assumes the signal is approximately mono-frequency. Wavelets reveal where that assumption breaks: if the live "Cycles alive" bar chart shows energy spread across multiple periods at a given time, the phasor's "dominant cycle" is a misleading single number. The Wavelets page is therefore not a replacement for the phasor — it's the diagnostic that tells you when to trust it.
Pipeline — server-side, Python, sub-second
- Bars in. Same data path as the Fourier page:
/api/stock/:tickerfor the 365-bar default,/api/stock/:ticker/longfor the 10-year Yahoo history. - Worker spawn. The Node server pipes
{ mode: "wavelet", ticker, dates, prices, family }toserver/phasor_worker.pyvia stdin — same IPC pattern the phasor pipeline uses. - Compute. Log-returns + demean, then
pywt.cwtfor the 48-scale scalogram andpywt.wavedecfor the 6-level DWT. Bands grouped, energies summed, columns normalized, R² computed against the original log returns, jumps detected at 5×MAD. - Downsample. Long history can be 4500 bars × 48 scales = 216k cells. We stride the time axis to ≤600 columns to keep the JSON payload under ~2 MB.
- JSON out. The browser receives
{ scalogram, dwt: { bands, band_energy_t }, reconstructions, jumps, prices, dates }and renders everything client-side via Canvas 2D. No charting library, no build step.
Look-ahead bias — quantified
A practical question led to an uncomfortable discovery: every observation in the phasor framework is silently influenced by future data. Here's how big the effect actually is, and why it matters.
The question that started it
A natural assumption when looking at any charting tool is: "when the model tells me what a stock was doing on March 15, it's using information that was available on March 15." That assumption turns out to be wrong for this pipeline — and for almost every signal-processing pipeline applied to financial data.
The culprits are two functions in [src/market_phasor.py:191-197](src/market_phasor.py#L191-L197):
scipy.signal.filtfilt— the Butterworth lowpass is applied forward then backward to cancel phase distortion. The backward pass at bartuses barst+1, t+2, …, Nas inputs.scipy.signal.hilbert— the Hilbert transform uses a full-series FFT. Every output sample is a linear combination of every input sample, past and future.
Both are standard best practice for offline signal analysis. Neither is appropriate for a strategy that would run in real time. The output at bar t silently depends on bars that, in reality, hadn't happened yet.
The invariant that makes the experiment possible
At the very last bar of any series, there is no future for the filter to peek at — the series ends there. So at t = N, the reality pipeline and the causal pipeline must produce exactly the same answer. This is a mathematical identity, not an approximation.
It's the sanity check that validates every causal experiment: if these ever disagree, something is broken. They don't. They agree to full floating-point precision. Verified live on TLKM, DFAM, BTCUSDT, and every other ticker we've checked.
How the causal trail is built
For every bar t from the warmup threshold (~80) up to N:
That is: rerun the same code, with the same Butterworth cutoff, on prices up to and including bar t, and keep the last row of the output. That row is the phasor state a real-time observer would have computed on day t. Stack all of these up and you get the causal trail — the honest, look-ahead-free time series.
Cost: O(N²) per stock. In practice, on a modern laptop, ~200–400 ms for 365 bars. Not the bottleneck. The bottleneck was thinking of doing this in the first place.
The disagreement percentage
For each bar where both trails are defined, compare the regime labels. The disagreement percentage is the fraction of bars where the reality regime and the causal regime are different:
This is a direct measure of how much the non-causal filter is "fixing" the historical story. A stock with disagreement near 0% has a historical regime timeline that's stable under the arrival of new data. A stock with disagreement near 100% has a history that gets almost completely rewritten every time the filter sees another bar.
What we actually found
The nightly batch computes disagreement for every ticker in the IDX universe. Results from the most recent run:
| Ticker | Sector | Disagreement | Note |
|---|---|---|---|
| SCNP | Consumer Non-Cyclicals | 97% | 29 of the last 30 bars relabeled |
| FIMP | Infrastructure | 97% | 29 of 30 |
| MENN | Technology | 97% | 29 of 30 |
| DFAM | Consumer Non-Cyclicals | 70% | 21 of 30 — the user's original mismatch |
| TLKM | Infrastructure | 63% | 19 of 30 |
| BTCUSDT | Crypto (Binance) | 66% | 20 of 30 |
More than half of the universe has meaningful look-ahead bias. The top decile is effectively entirely re-labeled by hindsight. The phenomenon is not exotic and it is not small.
The four views we now expose
With this finding documented, the framework carries four separate views of every stock:
- Reality (snapshot) — the classic non-causal output at the moment the batch ran. What the Screener shows today.
- Reality (live) — the non-causal output re-computed on today's price series. What the Phasor tab shows when you type a ticker. Can differ from the snapshot even for the same historical bar, because the filter saw more data.
- Causal (frozen) — the real-time value for each bar, pre-computed and stored in the per-ticker parquet. Does not drift when new bars arrive. Exposed via the
r_causal,regime_causal, etc. columns in the snapshot and in the Phasor-tab overlay. - Disagreement score — the aggregate measure of how much reality and causal diverge over the last 30 bars per ticker. Surfaced on the Screener as the coloured bias flag and as a filter chip (
Look-ahead bias ≥ 30% / 50% / 80%).
How to read the Screener's bias flag
- No flag — under 20% of recent bars disagree. The reality labels are roughly trustworthy for historical analysis.
- Amber ⚠ 20–49% — noticeable relabelling. Treat regime-based backtests on this stock as overstated.
- Orange ⚠ 50–79% — majority of recent bars get rewritten by hindsight. Historical labels should not be trusted for strategy development.
- Red (pulsing) ⚠ 80%+ — the filter is essentially rewriting history every day. Use only the live latest-bar reading; anything else is noise.
The invariant in action — the "why doesn't this agree?" moment
A real case: user runs the Screener, sees DFAM as "L1 Accumulation, strength 31.4%". Clicks through to the Phasor tab and sees "Markdown, strength 14.9%" for the same stock. The numbers are very different.
What happened: the Screener was showing DFAM's state as of the previous day's batch, when the series ended at 2026-04-13 with price 130. At that moment, filtfilt's backward pass had no future to pull against — the edge artifact inflated r to 0.31 and put the phase near the Accumulation zone. Overnight, 2026-04-14 arrived with a price drop to 112. The Phasor tab now re-runs the pipeline on the extended series; the backward pass finally has context on the right edge; the 2026-04-13 label flips from L1 Accumulation to L4 Markdown. For the same date.
Both answers are "correct" outputs of the same math. They disagree because the math is non-causal. The causal column, by contrast, is frozen: r_causal for 2026-04-13 is 0.31 and will stay 0.31 no matter how many future bars arrive. That's the honest real-time reading, and it's now stored in the parquet permanently.
Why filtfilt specifically is the culprit
A Butterworth lowpass is an IIR filter — infinite impulse response. Applied once (forward), it introduces phase distortion: different frequency components get delayed by different amounts, smearing sharp transitions. For offline work, the standard cure is to apply the same filter twice — once forward, once backward — so the phase distortions cancel exactly. The result has perfectly zero phase lag. That's filtfilt.
The cost of zero phase lag is that every output sample depends on every input sample on both sides. Concretely, for a 4th-order Butterworth at cutoff 0.25, the impulse response decays over roughly 10–20 samples. The backward pass at bar t therefore mixes in non-negligible contributions from bars t+1 through t+20. That's not a theoretical quibble — it's a direct linear combination with measurable weights.
In a real-time system you can't run the backward pass because the future samples don't exist yet. Your choices are: (a) accept the forward-only phase lag, which means every signal appears delayed by several bars relative to the true underlying cycle, or (b) use a different filter architecture. The existing pipeline chose (c) — pretend you're offline and use filtfilt — which is the right call for historical analysis and the wrong call for live trading. Both can be true simultaneously. The causal column exists to make the distinction auditable.
The edge effect, visualised
The Hilbert transform in scipy is computed as ifft(fft(x) · H) where H zeros out negative frequencies. The FFT implicitly assumes the signal is periodic — that bar N−1 is followed by bar 0. For a price series this assumption is egregiously wrong, so the first and last ~20 bars of the Hilbert output are contaminated by wrap-around from the other end of the series.
The same price series, extended by one bar, produces different Hilbert outputs everywhere — but the difference is concentrated at the edges. The middle of the series moves by microns; the last 10 bars can move by tens of percent. This is why the snapshot and the live re-run disagree most on recent history, not on ancient history. And it's why the disagreement percentage reported on the Screener is computed over the last 30 bars, not over the whole series — older bars are effectively stable, recent bars are where hindsight rewrites the story.
The warmup threshold of ~80 bars on the causal trail exists for the same reason, at the opposite edge: before bar 80, filtfilt's forward-then-backward pass on a too-short window produces garbage (the filter hasn't converged). The causal trail returns NaN for those bars rather than lying about them.
Paths to a genuinely real-time pipeline
The causal column solves the audit problem — you can now see the bias and filter stocks that suffer from it. It does not solve the production problem: computing an O(N²) trail nightly for every ticker in the universe is fine, but running it tick-by-tick at market speed is not. A true streaming pipeline needs O(1) update per new bar. Three well-understood options exist:
- Forward-only IIR — use
lfilterinstead offiltfilt. O(1) per bar, causal by construction, but introduces phase lag of roughlyn/ωcbars. For a 4th-order Butterworth at cutoff 0.25, that's ~16 bars of lag. You'd be looking at a cycle position that was true two weeks ago. Acceptable for very slow strategies; unacceptable for anything responsive. - Causal FIR Hilbert — replace the FFT-based Hilbert with a finite impulse response approximation (e.g. Parks-McClellan design with a specified passband). FIR Hilbert transforms are strictly causal if you allow a fixed group delay, typically 30–50 samples. You pay with latency, but the latency is constant and known — you can simply offset your decisions by the group delay. This is how real-time radios demodulate SSB.
- Kalman analytic signal — model
(r, θ, dθ/dt)as a state-space system and run a Kalman filter. Each new bar updates the state in O(1) with no lookahead. Edge performance is dramatically better than FFT-Hilbert because the filter doesn't assume periodicity. The tradeoff is a tuning step (process and measurement noise) that the current pipeline avoids — we'd lose the "parameter-free" property that makes the framework reproducible.
None of these are in the current codebase. The causal column is the cheapest possible first step: honest about the bias, reuses the existing math without modification, and lets the rest of the framework surface the problem without committing to a rewrite of the signal chain.
Production architecture — how the nightly machine works
The theory above is the math. This section is the plumbing: how ~1,000 tickers get fetched, phasor'd, causal-trailed, interpreted by Claude, and written to disk every night before market open. Five moving parts, each of which is its own small project.
1 · Causal pre-computation
For every ticker in the universe, the nightly batch runs the O(N²) causal loop described above — MarketPhasor(prices[:t+1]) for each t from the warmup threshold up to N — and stores the result as extra columns (r_causal, theta_causal, regime_causal, real_causal, imag_causal) inside the per-ticker parquet file. These columns are frozen: once written for a bar, they never change, even when new bars arrive tomorrow. The reality columns (r, theta, …) continue to drift as filtfilt sees more future; the causal columns don't.
Cost budget: ~400 ms per ticker × 1,032 tickers ≈ 7 minutes on a single worker. The pipeline parallelizes 4-wide across Python subprocesses, so wall time is ~2 minutes. This is the cheapest possible honesty mechanism — no new math, no new libraries, just "run the existing pipeline a lot of times and keep the edge."
2 · AI interpretation
Once the phasors are computed, each ticker gets handed to Claude via the Agent SDK (@anthropic-ai/claude-agent-sdk) to produce a single plain-English sentence describing what the stock is actually doing. These sentences land in summaries_latest.parquet and become the searchable corpus behind the Ask tab.
The batch chunks the universe into groups of 25 tickers per Claude call, pools 4 concurrent calls, and asks for a strict JSON array back — one summary per ticker. Each call sees only the numeric state (r, theta, regime, coherence, sector, subsector, last price, last return), never raw prices, so summaries are grounded in the framework's vocabulary and are internally consistent across the universe. Single-stock deep interpretation on the Phasor tab uses Sonnet 4.5 for quality; the batch uses the same model for narrative uniformity.
Natural-language query on the Ask tab reverses the flow: the entire summary corpus (~1,031 one-line descriptions) is packed into a prompt along with the user's question, Claude picks the top matches, and the server streams the ranked result back to the browser as JSON. No vector database, no embeddings — just a single shot against a pre-built corpus that's small enough to fit in one context window.
3 · Multi-source MCP gateway
Market data comes from a local MCP (Model Context Protocol) gateway that the Node server talks to over HTTP. Two protocols coexist: classic REST for batch pulls (daily OHLC, sector listings, fundamentals) and server-sent events for streaming updates. A thin routing layer in server/server.js picks the right transport per endpoint and normalises the response shape so the rest of the app never has to care which protocol served the data.
The gateway is also the single choke point for caching. Stock lists, sector memberships, and fundamentals are cached for 24 hours; price bars are cached only until the next 16:00 close; FX rates and macro indicators are cached for 1 hour. Everything flows through one function so the cache policy is one file to audit, not fifteen.
4 · The daily parquet pipeline
Parquet is the only storage format. No Postgres, no Redis, no SQLite. DuckDB reads parquet files directly from disk at SQL speed, and the Node server uses @duckdb/node-api to query them without ever loading a row into JavaScript memory.
Write path (nightly, triggered by node-cron at 06:00 local): fetch prices via MCP → compute phasors + causal trails in Python → write per-ticker parquets → rebuild snapshot_latest.parquet (the single-row-per-ticker latest state file) via a DuckDB COPY (SELECT … FROM read_parquet('tickers/*.parquet')) TO 'snapshot_latest.parquet' statement → run the Claude interpret batch → write summaries_latest.parquet. Five steps, one cron job, fully idempotent — you can rerun any stage and the downstream stages will pick up the freshest inputs automatically.
Compression is zstd level 3 throughout. The entire universe (1,032 tickers × ~500 bars × ~25 columns × both reality and causal trails) fits in under 40 MB on disk. The snapshot file is under 500 KB. Loading the snapshot for a Screener page costs one read_parquet call and ~15 ms.
5 · Determinism as a first-class property
Every stage above is reproducible bit-for-bit given the same input. The phasor math is parameter-free (Butterworth cutoff is the only dial, and it's fixed in code). The causal trail is a deterministic function of the input series. The MCP gateway caches by content hash. The parquet writes use stable column ordering and no per-run metadata. Only the Claude interpretation step is non-deterministic — and that's why its outputs live in their own parquet file and are never used as inputs to any downstream numerical calculation. Numbers stay reproducible; narratives are allowed to drift.
Natural-language search — the Ask layer
A 1,032-row catalogue is too big to scroll and too small to hide a vector database under. The Ask page gives the user a single-line text input, calls Claude Haiku with the entire catalogue as a cached prompt block, and streams ranked tickers back as JSON. No embeddings, no fine-tuning, no model state.
What the catalogue contains
Every night, after the phasor batch and the Claude interpret pass have finished, every ticker has one row of structured state and one row of plain-English summary. The Ask endpoint joins them and emits a flat text catalogue — one stock per line — that fits in a single Anthropic context window:
Concrete example, taken verbatim from a recent run:
Each line is roughly 200 characters; the full catalogue is around 200 KB. Well under the 200-KB cache-block ceiling and far under the 200K-token context limit.
The single-shot retrieval architecture
A typical "search across thousands of items" pipeline reaches for embeddings + vector DB + reranker. We don't, because:
- The catalogue is small enough to fit in one prompt — 1,032 rows × 200 chars ≈ 60K tokens, well under model limits.
- It changes once per day, not per query — perfectly suited to prompt caching. The Ask endpoint marks the catalogue block with
cache_control: { type: "ephemeral" }, so the second and subsequent queries of the day reuse it for free. - Embeddings collapse semantics into a fixed-dim vector and lose precision on multi-attribute queries ("banks AND silently accumulating AND r < 4%"). A frontier LLM reading the raw text gets every attribute verbatim.
The pipeline is:
- Server loads the catalogue from
summaries_latest.parquet(joined withsnapshot_latest.parquetfor bar-count and active-pct gates), filters out low-quality rows (bar_count ≥ 252 AND active_pct ≥ 0.30), formats one line per ticker. - Builds a system prompt with strict definitions and explicit sector mappings (full text below).
- Streams a single Haiku 4.5 call with the catalogue cached and the user's question as the only uncached payload.
- Streams the JSON response back to the browser as it generates, so the user sees ranked tickers appear progressively.
Latency on a cached repeat call is roughly 1.5–3 s end-to-end. First call (cache miss) is ~6–8 s. Both numbers are roughly an order of magnitude faster than the previous Agent-SDK harness, which added ~30 s of multi-turn orchestration overhead the catalogue scan didn't need.
Why Haiku, not Sonnet
The single-stock deep interpretation on the Phasor tab uses Sonnet 4.5 — narrative quality matters, and the prompt is small. Ask is a different task: a fixed-format catalogue scan with a one-shot return. Haiku 4.5 handles it at one-quarter the cost and one-third the latency, and the strict JSON schema keeps the output rigid enough that quality differences between the two models are invisible.
Definition pinning — why "silently accumulating" maps to hard rules
Free-form natural language is ambiguous. "Silently accumulating" could mean a stock that's quietly building a base (the intended reading), or one whose summary happens to mention "building hidden demand" — even if its actual regime label is distribution. An early version of the page conflated those two and returned LIFE (an insurer in distribution) as a "silently accumulating bank" because its summary had the right vocabulary.
The fix was to write the system prompt as a tight rule sheet:
- The regime label is the one-word value before
vis. A summary that mentions "building demand" on a stock whose label isdistributionis not accumulation. silently accumulating=regime IN (accumulation, re_accumulation) AND hid↑ AND r ≤ 4%. Higher r means the stock is no longer silent — it's already breaking out.- "Banks" matches
subsector = 'Bank'— not the parent sector "Keuangan", which also covers insurance (Asuransi), multifinance (Pembiayaan), and securities. "Financials" is the term that means the broader Keuangan parent.
The catalogue line was changed to expose the subsector explicitly ([Keuangan/Bank] instead of just [Keuangan]) so the rule actually has the substring it needs to match. Both fixes together turn an ambiguous LLM call into a near-deterministic catalogue scan.
What the user types vs. what runs
| User question | Effective filter |
|---|---|
| "banks silently accumulating" | subsector='Bank' AND regime∈(accum, re_accum) AND imag>0 AND r<0.04 |
| "insurers in distribution" | subsector='Asuransi' AND regime='distribution' |
| "strong markup with high clarity" | regime='markup' AND r>0.05 AND coherence>0.7 |
| "crypto turning bottom" | source='crypto' AND regime∈(capitulation, accumulation) AND imag>0 |
The translation is performed inside Haiku, not in code — but the rule sheet in the system prompt makes the translation reliable enough to behave like a structured query in practice.
Universe filters — keeping dead tickers out of the corpus
A market data feed is a museum of dead, halted, suspended, and never-traded instruments. Letting any of them into the analytical corpus poisons every downstream metric. Three small filters do most of the heavy lifting.
The three filters
- Bar-count floor — drop tickers with fewer than 252 trading bars. A stock with only 90 bars of history can't have its annual cycle measured, can't have a coherence trail, can't be compared to peers. The floor is a hard requirement, not a soft warning. Default 252 (one trading year). Chips on the Universe page let the user relax this to 180, or tighten it to 500 (≥2y) for cycle research.
- Active-pct floor — drop tickers whose price has changed on fewer than 30% of their bars. A stock at the same price for 70% of the window isn't quietly accumulating — it's not trading. The active-pct gate filters out halted, suspended, locked-up, and recently-IPO'd low-data names whose phasor would be dominated by zero-return bars. Default 30%; chips offer 0% (any) and 60% (liquid only).
- Constant-price drop — applied at the equity-screener level, removes any ticker whose last close has equalled its mean close for >50% of the window. These are the truly dead names where the feed is still publishing yesterday's last print every day.
Why these filters are visible to the user
Each filter is exposed as a chip on the Universe and Screener pages, with the active selection highlighted. The user can always relax them — for example to research a recently-listed stock with only 90 bars — but the default catalogue, the Ask layer, and the regime-cluster visualisations all run with the strict defaults applied. The honest, rigorous corpus is the one the user sees first; the relaxed view is opt-in.
9 · Instantaneous frequency & phase velocity
Once we have z(t) = r(t)·eiθ(t), we can differentiate the phase:
This is the instantaneous frequency. In music it's how fast a note is changing pitch. In markets it's how fast the cycle is advancing. A steady cycle has a slow, near-constant dθ/dt. A regime transition comes with a big spike in |dθ/dt| — the phase is rotating quickly through a boundary.
We flag a bar as being "in transition" when |dθ/dt| exceeds a threshold (default 25°/bar). These are the moments to pay attention.
10 · Net phasor & coherence
For a universe of stocks we compute the cap-weighted vector sum of every phasor:
The magnitude |znet| divided by the mean amplitude gives net coherence — a number between 0 and 1 that tells you how phase-aligned the whole sector is.
This is mathematically identical to the order parameter in statistical physics. It measures directional alignment of many oscillators.
- > 0.70 — all stocks pointing the same way. Clear trend. Ride it.
- 0.40 – 0.70 — stocks split across stages. Rotation. Pair trades work.
- < 0.40 — scattered. No theme. Stock-pick individually.
11 · Kuramoto synchronization
In 1975 Yoshiki Kuramoto introduced a model for how large populations of coupled oscillators synchronize. Fireflies flashing in unison, heart cells firing together, electric grid generators locking into phase — they all follow the same equation:
The order parameter r · eiψ = (1/N) · Σ eiθj is exactly our net coherence. Below a critical coupling Kc the population is incoherent; above it, a macroscopic fraction locks into phase.
12 · Anti-phase pairs & structural arbitrage
Two stocks at phase angles θA and θB have a cross-phase distance:
When Δ ≈ 180°, they're in anti-phase — one is in Markup while the other is in Capitulation. Long A, short B, and you have a pair trade that doesn't care about the broader market direction, because it's structurally orthogonal to it.
Correlation-based pair selection can't find these. Pearson correlation collapses a stock to Re alone and penalizes pairs with opposite price moves. The phasor preserves the full complex structure, so it detects pairs that are structurally mirror images even if their return correlation looks normal.
13 · Connection to QKV attention
In 2003, US patent 8,572,041 proposed a key-value store indexed by historical state. In 2017, transformer models reformulated this as scaled dot-product attention: Q·KT·V / √dk. In both cases you have keys (memory), values (payloads), and queries (what to retrieve).
Applied to markets:
V (values) = capital flow magnitudes → how much moved
Q (query) = the Hilbert phasor → what is being asked right now
The Hilbert transform constructs Q from the observed K and V stream. The imaginary axis is a learnable-free query that asks "given everything that came before, what's the latent pressure now?"
14 · Why deterministic matters
Everything on this page is parameter-free once the Butterworth cutoff is fixed. No training data. No hyperparameter search. No drift. The same code on the same OHLC produces the same numbers today, next year, and five years from now.
For a fund this means: every trade is explainable, every signal is reproducible, and every backtest is identical to live. Regulators (OJK, MAS, SEC) can audit the pipeline end to end. A quant in Jakarta and a quant in Singapore running the same code will see the same regime labels. There is no "model version" to argue about.
15 · Phase extrapolation — trading the projected regime
The phasor tells you where a stock is. The derivative tells you how fast it's moving. Together they tell you where it's going.
Taylor series — the idea
Brook Taylor published this in 1715. The idea: if you know a function's value and all its derivatives at a single point, you can reconstruct the function's value at any nearby point. For a smooth function f(t), the value at t + k is:
Each term adds a layer of accuracy. The first term is where you are. The second is velocity (how fast you're moving). The third is acceleration (how fast the velocity is changing). And so on. The more terms, the further ahead the approximation holds — but for a smooth signal, even two or three terms are powerful over short horizons.
Applying Taylor to phase
At every bar the phasor pipeline gives us three numbers:
θ(t)— the phase angle. Where the stock is in the Wyckoff cycle right now.ω = dθ/dt— the angular velocity. How many degrees per bar the phase is advancing. Already computed by MarketPhasor as thed_thetacolumn.α = d²θ/dt²— the angular acceleration. How fast ω is changing. Computed as the mean ofdiff(d_theta)over the last 20 bars.
Plugging these into the Taylor series, truncated at second order:
The first term says "you're at 90°." The second says "you're moving at 5°/bar, so in 5 bars you'll be at 115°." The third says "but you're decelerating at 0.3°/bar², so actually you'll be at 111°." Each term corrects the previous one.
Why we stop at second order
The third derivative (jerk) and beyond are noise for daily equity data, even after Butterworth filtering. Two terms capture the meaningful dynamics: is the stock speeding up or slowing down through the cycle? Beyond that, the signal-to-noise ratio flips. Backtest validation confirms this — adding a cubic term doesn't improve H5 or H10 accuracy.
The same expansion for amplitude
Phase tells you which regime. Amplitude tells you how much energy is behind it. We apply the same Taylor expansion to r(t):
Clamped at zero because amplitude can't go negative. A decaying r means the stock is coasting — the move is losing energy even if the phase is still advancing. A growing r means the move has fuel. Both matter for conviction.
Why phase extrapolates better than price
Price is noisy, non-stationary, and mean-reverting at different timescales simultaneously. Phase, after filtering, is a smooth monotonic-ish function that advances through the Wyckoff cycle at a locally stable rate. The angular velocity ω changes slowly — a stock in markup doesn't suddenly teleport to capitulation. This is why a simple quadratic fit on θ produces useful projections at horizons of 1–10 bars, whereas the same fit on price would be meaningless.
From projected phase to projected conviction
Once we have θ̂(t+k), we reconstruct the projected phasor in Cartesian:
Re_hat = r̂·cos(θ̂) Im_hat = r̂·sin(θ̂)
From (Re_hat, Im_hat, projected regime) we run the same conviction classifier that the portfolio system uses at t. The output is a projected conviction — hl, dw, pr, or he — at horizon k.
This is what the portfolio system trades on. Not "where is the stock today?" but "where will it be when my position is mature?"
Five horizons
Projections are computed at five horizons, each mapping to a holding-period intent:
| Horizon | Bars | Intent | Typical use |
|---|---|---|---|
| H1 | 1 | Very short | Intraday confirmation — is the next bar likely to stay in regime? |
| H3 | 3 | Short | Swing entry — will conviction hold through the entry settling period? |
| H5 | 5 | Medium | Default agent horizon — the trade thesis maturity window. |
| H7 | 7 | Long | Position sizing — is the move projected to have legs? |
| H10 | 10 | Strategic | Conviction filter — reject entries that degrade within 10 bars. |
Confidence: the uncertainty envelope
The standard deviation of dθ/dt over the lookback window measures how stable the angular velocity has been. A stock rotating at a steady 5°/bar has a tight σ; one wobbling between −30° and +20° has a wide σ. Confidence maps this to [0, 1]:
The portfolio system ignores projections with confidence below 0.40 (configurable). This prevents noisy extrapolations from triggering false signals.
The residual — hypothesis tracking
Every projection is a hypothesis. When the next bar arrives, we compute the residual:
A small residual means the trajectory is holding. A growing residual means something changed. The system tracks this for every open position:
- |residual| < 45° → hypothesis confirmed, hold position.
- |residual| > 45° → hypothesis diverged, flag for exit review.
This is the same observe→act loop used in Kalman filtering and model-predictive control: project, observe, compute residual, decide. The threshold (45°, configurable) is one quarter of the cycle — if you're wrong by more than a full regime boundary, the trade thesis is broken.
16 · What can be further added
The current framework is the minimal deterministic version. Several well-understood extensions plug in naturally:
- Multi-scale phasors — run the pipeline on multiple Butterworth cutoffs simultaneously (e.g. 0.1, 0.25, 0.4). You get a daily, weekly, and monthly cycle on the same chart. A stock can be in Markup on the weekly and Distribution on the daily — exactly the setup where you take partial profits.
- Empirical Mode Decomposition (EMD) — Huang 1998. A data-adaptive alternative to Fourier that extracts intrinsic mode functions. Feeding each IMF through the Hilbert transform gives the Hilbert-Huang spectrum. Better for non-stationary signals, which equities are.
- Wavelet phase — Morlet or Paul wavelets give time-frequency phase maps, letting you see how the stage breakdown evolves with both time and timescale.
- Kalman-smoothed phase — add a state-space model over
(r, θ, dθ/dt)to improve edge-of-series estimates (the Hilbert transform degrades at the boundaries). - Cross-asset phasors — apply the same pipeline to bonds, FX, commodities, crypto. The
znetacross asset classes gives a macro regime indicator that no single market provides. - Phase-locked option pricing — use
θto tilt implied volatility skew. Distribution stages should command higher put skew than Accumulation stages for identical realized vol. - Event injection — overlay earnings dates, dividend dates, and FOMC meetings on the phasor. Quantify whether transitions cluster around known events (they often do) and use the residual as clean signal.
- Graph coupling — build a graph where edge weight is
1 − cos(Δij). The leading eigenvector gives the market's "principal phase mode" — like PCA, but on phase instead of returns. - Order-flow phasors — replace price with signed trade volume as the input signal. The Im axis then measures aggressive buying vs passive absorption, directly quantifying what market microstructure calls "informed flow."
- Regime-conditional factor models — re-estimate value, momentum, and quality factors separately within each of the six phase stages. Factor premia are almost certainly stage-dependent.
17 · Glossary
| Math term | Trader term | Meaning |
|---|---|---|
| z(t) | state | complex-valued capital state |
| Re | visible momentum | observable price move |
| Im | hidden pressure | latent flow, leads price |
| r(t) | move strength | amplitude, how big the move is |
| θ(t) | stage | phase angle, 0°–360° |
| dθ/dt | cycle velocity | how fast the stage is advancing |
| coherence | signal clarity | phase alignment over window |
| net coherence | togetherness | Kuramoto order parameter across stocks |
| Δ(A,B) | pair distance | angular distance, 0°–180° |
| eiθ | unit cycle | walking the unit circle |
| ω = dθ/dt | rotation speed | angular velocity, degrees per bar |
| α = d²θ/dt² | rotation acceleration | rate of change of angular velocity |
| θ̂(t+k) | projected stage | extrapolated phase at horizon k bars ahead |
| residual | tracking error | actual θ minus projected θ, wrapped to ±180° |
| H5 | 5-bar horizon | default projection horizon for trade decisions |