OHLC-Vol

A Python toolkit for computing OHLC-based volatility estimators, studying their statistical properties, and evaluating their ability to forecast future realized volatility using non-negative least squares (NNLS) regression.

Overview

Daily OHLC (Open, High, Low, Close) prices contain more information about intraday volatility than close-to-close returns alone. This project implements a suite of classical OHLC estimators, compares them via in-sample regressions and walk-forward out-of-sample evaluation, and experiments with several functional forms for the vol-on-vol forecasting relationship.

Volatility estimators

Column	Estimator
`vol_adj_cc_ann`	Adjusted close-to-close
`vol_cc_ann`	Unadjusted close-to-close
`vol_parkinson_ann`	Parkinson (1980)
`vol_gk_ann`	Garman–Klass (1980)
`vol_rs_ann`	Rogers–Satchell (1991)
`vol_yz_ann`	Yang–Zhang (2000)
`vol_meilijson_ann`	Meilijson (2011)
`vol_ht_ann`	Hodges–Tompkins (2002) (bias-corrected CC)
`vol_tr_ann`	True Range
`vol_on_rs_ann`	Overnight + Rogers–Satchell
`neg_ret_adj_cc`	Negative adjusted return (leverage proxy)
`neg_ret_adj_cc_pos`	`max(neg_ret_adj_cc, 0)` (down-day leverage)

All estimators are annualized (multiplied by √252) and expressed as percentage volatility.

Regression models

Four functional forms are supported for predicting h-day-ahead realized volatility from linearly-weighted moving averages (LWMA) of past estimators:

Model	Specification	Notes
`level`	vol = c₀ + Σ cⱼ·xⱼ	Linear NNLS; fastest
`sqrt-var`	vol = √(c₀ + Σ cⱼ·xⱼ²)	Non-linear NLS; slowest
`log-vol`	log(vol) = c₀ + Σ cⱼ·log(xⱼ)	Log-log NNLS + Duan smearing
`var-space`	vol² = c₀ + Σ cⱼ·xⱼ²	Linear NNLS on squared quantities + ratio smearing

R², residual statistics, and BIC are reported on the vol scale for all four models, enabling direct comparison. Non-negative constraints ensure forecasts are always non-negative. Predictors are selected via a t-statistic pruning loop and an optional one-per-group filter that retains only the best LWMA window for each estimator.

Files

Library modules

File	Purpose
`ohlc_vol.py`	Compute all OHLC vol estimators from raw OHLC data
`ohlc_io.py`	Read and write the multi-symbol OHLC CSV format
`vol_analysis.py`	Summary stats, ACF tables, correlation matrices, predictor builders (lag, MA, LWMA), forward vol target (RMS)
`nnls_reg.py`	NNLS regression engine: fitting, pruning, SE, BIC, and the four model types

Analysis scripts

Script	Purpose
`xohlc_vol.py`	Main analysis: in-sample NNLS regressions for each estimator and forecast horizon; regression model comparison summary
`xohlc_vol_oos.py`	Walk-forward out-of-sample evaluation; OOS R², RMSE, MAE, bias per model and symbol
`xohlc_vol_measures.py`	Lightweight script: compute and display vol measures with summary stats and ACF; optionally write to CSV
`xohlc_vol_var.py`	Extends `xohlc_vol.py` with VAR models (OLS and NNLS, full and restricted)
`xreturn_stats.py`	Return and vol summary statistics and autocorrelations; useful for comparing original vs. resampled data
`xresample_ohlc.py`	Bootstrap resampler: draws daily OHLC rows with replacement to create a null dataset that destroys temporal autocorrelation while preserving marginal distributions
`vector_autoreg.py`	VAR fitting library used by `xohlc_vol_var.py`: OLS and NNLS VAR estimation, companion matrix, RMSE and IC tables

Data

File	Description
`prices_ohlc.csv`	Multi-symbol daily OHLC prices (SPY, QQQ, GLD, USO, HYG; 2010–2026)

Input data format

The CSV uses a two-row header: the first row repeats the ticker symbol for each field; the second row contains field names (Open, High, Low, Close, Adj Close). The first column is the date index. This is the format produced by yfinance when downloading multiple tickers.

         SPY                          QQQ          ...
         Open   High    Low  Close  Adj Close  Open ...
2010-01-04  ...

Usage

Compute vol measures and print summary statistics

python xohlc_vol_measures.py
python xohlc_vol_measures.py --symbol GLD --output gld_vol.csv

In-sample regression analysis

# All symbols, default horizons (1, 5, 21 days)
python xohlc_vol.py

# Single symbol, 21-day horizon, SPY as external predictor
python xohlc_vol.py --symbol QQQ --horizons 21 --external-symbols SPY

# Predict each measure only from its own past values (baseline)
python xohlc_vol.py --same-measure --horizons 21

# Write vol measures to CSV
python xohlc_vol.py --output vol_measures.csv

Walk-forward out-of-sample evaluation

# Default: level model, all symbols, horizons 1 5 21, annual refit
python xohlc_vol_oos.py

# Compare level and var-space models, monthly refit
python xohlc_vol_oos.py --models level var-space --step 21

# Same-measure baseline OOS
python xohlc_vol_oos.py --same-measure --horizons 21

# With external SPY predictors
python xohlc_vol_oos.py --external-symbols SPY --horizons 21

Resampling null check

# Create resampled prices and compare return statistics
python xresample_ohlc.py --input prices_ohlc.csv --output prices_ohlc_resampled.csv
python xreturn_stats.py --file prices_ohlc_resampled.csv

Key options

Most options are available as both a constant at the top of each script (for persistent defaults) and as a command-line argument (for one-off overrides).

`xohlc_vol.py`

Option	Default	Description
`--symbol SYM`	all	Analyse one symbol
`--horizons H [H ...]`	1 5 21	Forecast horizons in trading days
`--external-symbols SYM [...]`	none	Add another symbol's vol measures as predictors
`--same-measure`	off	Restrict predictors to the same estimator as the target
`--output FILE`	none	Write vol measures to CSV

`xohlc_vol_oos.py`

Option	Default	Description
`--symbol SYM`	all	Analyse one symbol
`--horizons H [H ...]`	1 5 21	Forecast horizons
`--models MODEL [...]`	level	Models: `level` `sqrt-var` `log-vol` `var-space`
`--min-train N`	504	Minimum training days (expanding window)
`--step N`	252	Refit interval in trading days
`--external-symbols SYM [...]`	none	External predictor symbols
`--same-measure`	off	Restrict predictors to the same estimator

Dependencies

numpy
pandas
scipy

Selected findings (SPY, QQQ, GLD, USO, HYG; 2010–2026)

Level model wins in-sample and OOS for equity-like assets (SPY, QQQ, HYG, USO). The variance-based models (sqrt-var, var-space) do not consistently improve on the simpler linear specification.
GLD is an exception: variance-based models show a meaningful in-sample improvement (R² +0.03), consistent with gold volatility having more symmetric dynamics than equities.
OOS R² is 3–10 points below in-sample R², a typical degradation from predictor selection.
5-day horizon forecasts best on average (OOS R² ≈ 0.45); 1-day forecasts are hardest (R² ≈ 0.25).
External SPY predictors help USO OOS but hurt GLD and HYG, suggesting the cross-asset vol relationship is non-stationary.
Same-measure baseline (predicting vol from its own past only) is substantially weaker than pooling all OHLC estimators, confirming that the other estimators add genuine predictive value.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

OHLC-Vol

Overview

Volatility estimators

Regression models

Files

Library modules

Analysis scripts

Data

Input data format

Usage

Compute vol measures and print summary statistics

In-sample regression analysis

Walk-forward out-of-sample evaluation

Resampling null check

Key options

`xohlc_vol.py`

`xohlc_vol_oos.py`

Dependencies

Selected findings (SPY, QQQ, GLD, USO, HYG; 2010–2026)

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
nnls_reg.py		nnls_reg.py
ohlc_io.py		ohlc_io.py
ohlc_vol.py		ohlc_vol.py
prices_ohlc.csv		prices_ohlc.csv
vector_autoreg.py		vector_autoreg.py
vol_analysis.py		vol_analysis.py
xohlc_vol.py		xohlc_vol.py
xohlc_vol_measures.py		xohlc_vol_measures.py
xohlc_vol_oos.py		xohlc_vol_oos.py
xohlc_vol_var.py		xohlc_vol_var.py
xresample_ohlc.py		xresample_ohlc.py
xreturn_stats.py		xreturn_stats.py

Folders and files

Latest commit

History

Repository files navigation

OHLC-Vol

Overview

Volatility estimators

Regression models

Files

Library modules

Analysis scripts

Data

Input data format

Usage

Compute vol measures and print summary statistics

In-sample regression analysis

Walk-forward out-of-sample evaluation

Resampling null check

Key options

xohlc_vol.py

xohlc_vol_oos.py

Dependencies

Selected findings (SPY, QQQ, GLD, USO, HYG; 2010–2026)

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`xohlc_vol.py`

`xohlc_vol_oos.py`

Packages