Changes from 1 commit
Commits
Show all changes
27 commits
7490ee4
Add shared forecast pipeline utilities and tests
SamuelBrand1 Dec 11, 2025
1641e59
change to relative imports
SamuelBrand1 Dec 11, 2025
2e434ea
add Parquet dep
SamuelBrand1 Dec 11, 2025
fca72b0
reduce docstring bloat
SamuelBrand1 Dec 11, 2025
f89673e
Add PipelineOutput support for pipeline forecasts
SamuelBrand1 Dec 11, 2025
0cbc8fd
Add DEFAULT_TARGET_LETTER and update output filenames
SamuelBrand1 Dec 11, 2025
5156147
move utils and rename paths dataclass
SamuelBrand1 Dec 12, 2025
4401bef
Add use_percentage flag to EpiAutoGPInput and output logic
SamuelBrand1 Dec 12, 2025
44fa0af
Refactor EpiAutoGP pipeline and add end-to-end tests
SamuelBrand1 Dec 12, 2025
966f4d6
Update .gitignore
SamuelBrand1 Dec 12, 2025
02591bb
Refactor EpiAutoGP post-processing into utility function
SamuelBrand1 Dec 12, 2025
9b46362
Refactor forecast utils to use context methods
SamuelBrand1 Dec 12, 2025
ada1b74
Update README.md
SamuelBrand1 Dec 12, 2025
6cef44d
Add frequency to input and generalize forecast horizon
SamuelBrand1 Dec 12, 2025
74cbe8d
Add ed_visit_type to input and output handling
SamuelBrand1 Dec 12, 2025
b693b8f
Add ed_visit_type param for NSSP/ED visit modeling
SamuelBrand1 Dec 12, 2025
2d258cb
Add daily NSSP forecast tests and support for ED visit type
SamuelBrand1 Dec 12, 2025
6a4426e
Refactor forecast utils tests and remove prep_epiautogp tests
SamuelBrand1 Dec 12, 2025
ea73cd8
update epiautogp docstrings
SamuelBrand1 Dec 15, 2025
511cf26
Update prep_epiautogp_data.py
SamuelBrand1 Dec 15, 2025
0dd0f9f
Update output.jl
SamuelBrand1 Dec 15, 2025
ae9313e
add nhsn test coverage
SamuelBrand1 Dec 15, 2025
e16f115
reorg unit tests
SamuelBrand1 Dec 15, 2025
65e5d2d
Update pipelines/epiautogp/process_epiautogp_forecast.py
SamuelBrand1 Dec 15, 2025
551543a
caught anti-pattern
SamuelBrand1 Dec 15, 2025
6bb924c
Merge branch '780-add-forecast_epiautogp-function' of https://github.…
SamuelBrand1 Dec 15, 2025
bc5dbce
explain use of percentage
SamuelBrand1 Dec 15, 2025
reduce docstring bloat
SamuelBrand1 committed Dec 12, 2025
commit fca72b011c55939b754669f29b7ff3fce42c9881
75 changes: 0 additions & 75 deletions EpiAutoGP/src/input.jl
@@ -16,22 +16,6 @@ with nowcasting requirements and forecast parameters.
- `nowcast_dates::Vector{Date}`: Dates requiring nowcasting (typically recent dates with incomplete data)
- `nowcast_reports::Vector{Vector{Real}}`: Uncertainty bounds or samples for nowcast dates

# Examples
```julia
# Create a simple input dataset
data = EpiAutoGPInput(
[Date("2024-01-01"), Date("2024-01-02"), Date("2024-01-03")],
[45.0, 52.0, 38.0],
"COVID-19",
"CA",
Date("2024-01-03"),
[Date("2024-01-02"), Date("2024-01-03")],
[[50.0, 52.0, 54.0], [36.0, 38.0, 40.0]]
)

# Validate the input
validate_input(data) # returns true if valid
```
"""
struct EpiAutoGPInput
dates::Vector{Date}
@@ -65,30 +49,6 @@ Performs comprehensive validation including:

# Returns
- `Bool`: Returns `true` if validation passes

# Throws
- `ArgumentError`: If any validation check fails, with descriptive error message

# Examples
```julia
# Valid data passes validation
valid_data = EpiAutoGPInput(
[Date("2024-01-01"), Date("2024-01-02")],
[45.0, 52.0],
"COVID-19", "CA", Date("2024-01-02"),
Date[], Vector{Real}[]
)
validate_input(valid_data) # returns true

# Invalid data throws ArgumentError
invalid_data = EpiAutoGPInput(
[Date("2024-01-01")],
[-5.0], # negative values not allowed
"COVID-19", "CA", Date("2024-01-01"),
Date[], Vector{Real}[]
)
validate_input(invalid_data) # throws ArgumentError
```
"""
function validate_input(data::EpiAutoGPInput; valid_targets = ["nhsn", "nssp"])
@assert data.target in valid_targets "Target must be one of $(valid_targets), got '$(data.target)'"
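Reviewer note on this hunk: the target check uses `@assert`, and in Julia a failed `@assert` raises `AssertionError` rather than `ArgumentError`. The sketch below illustrates that behavior under stated assumptions: `check_target` is a hypothetical stand-in that copies only the one visible assertion, and it is not the package's `validate_input`.

```julia
# Hypothetical stand-in for the target check shown in the hunk above;
# this is an illustration, not EpiAutoGP's `validate_input`.
function check_target(target::String; valid_targets = ["nhsn", "nssp"])
    @assert target in valid_targets "Target must be one of $(valid_targets), got '$(target)'"
    return true
end

check_target("nhsn")  # passes and returns true

# Capture the exception raised for an unsupported target.
err = try
    check_target("flu")
catch e
    e
end

err isa AssertionError  # true: `@assert` does not throw `ArgumentError`
```

The Julia documentation also warns that assertions may be disabled at some optimization levels, so callers that rely on this check always running may prefer an explicit `throw(ArgumentError(...))`.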
@@ -221,41 +181,6 @@ end
read_and_validate_data(path_to_json::String) -> EpiAutoGPInput

Read epidemiological data from JSON file with automatic validation.

This is the recommended function for loading input data in production workflows.
It combines [`read_data`](@ref) and [`validate_input`](@ref) to ensure that
loaded data is both structurally correct and passes all validation checks.

# Arguments
- `path_to_json::String`: Path to the JSON file containing input data

# Returns
- `EpiAutoGPInput`: Validated data structure ready for modeling

# Throws
- `SystemError`: If the file cannot be read
- `JSON3.StructuralError`: If JSON structure is invalid
- `ArgumentError`: If data fails validation checks

# Examples
```julia
# Load and validate data in one step
data = read_and_validate_data("epidata.json")

# This is equivalent to:
data = read_data("epidata.json")
validate_input(data)

# Use in a try-catch block for error handling
try
data = read_and_validate_data("uncertain_data.json")
println("Data loaded successfully")
catch e
@error "Failed to load data" exception=e
end
```

See also: [`read_data`](@ref), [`validate_input`](@ref), [`EpiAutoGPInput`](@ref)
"""
function read_and_validate_data(path_to_json::String)
data = read_data(path_to_json)
37 changes: 0 additions & 37 deletions EpiAutoGP/src/modelling.jl
@@ -21,12 +21,6 @@ A NamedTuple containing:
- `n_forecasts_per_nowcast::Int`: Number of forecast samples per nowcast scenario
- `transformation::Function`: Forward transformation function
- `inv_transformation::Function`: Inverse transformation function

# Examples
```julia
input = EpiAutoGPInput(...)
model_setup = prepare_for_modelling(input, "boxcox", 4, 1000)
```
"""
function prepare_for_modelling(input::EpiAutoGPInput, transformation_name::String,
n_forecast_weeks::Int, n_forecasts::Int)
@@ -84,14 +78,6 @@ combination with nowcast scenarios.

# Returns
- Fitted AutoGP model ready for forecasting

# Examples
```julia
dates = [Date(2024,1,1), Date(2024,1,8), Date(2024,1,15)]
values = [100.0, 120.0, 95.0]
transform_func, _ = get_transformations("boxcox", values)
model = fit_base_model(dates, values; transformation=transform_func)
```
"""
function fit_base_model(dates::Vector{Date}, values::Vector{<:Real};
transformation::Function,
@@ -168,20 +154,6 @@ A NamedTuple containing:
- `forecast_date::Date`: The reference date for forecasting (from input.forecast_date)
- `location::String`: The location identifier (from input.location)
- `disease::String`: The disease name (from input.disease)

# Examples
```julia
# Basic forecasting
input = EpiAutoGPInput(...)
results = forecast_with_epiautogp(input)
forecast_dates, forecasts = results.forecast_dates, results.forecasts

# Custom parameters
results = forecast_with_epiautogp(input;
n_forecast_weeks=4,
n_forecasts=1000,
transformation_name="positive")
```
"""
function forecast_with_epiautogp(input::EpiAutoGPInput;
n_forecast_weeks::Int = 8,
@@ -241,15 +213,6 @@ with parsed command-line arguments to execute the full nowcasting and forecastin
- `"smc-data-proportion"`: SMC data proportion
- `"n-mcmc"`: Number of MCMC samples
- `"n-hmc"`: Number of HMC samples

# Examples
```julia
# Typical usage pattern
args = parse_arguments()
input_data = read_and_validate_data(args["json-input"])
results = forecast_with_epiautogp(input_data, args)
forecast_dates, forecasts = results.forecast_dates, results.forecasts
```
"""
function forecast_with_epiautogp(input::EpiAutoGPInput, args::Dict{String, Any})
return forecast_with_epiautogp(input;
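Reviewer note on the `args::Dict{String, Any}` method at the end of the `modelling.jl` diff: it appears to be a thin wrapper that forwards parsed command-line arguments to the keyword-based method. A minimal sketch of that pattern follows. `run_forecast` and the keyword names `n_mcmc` and `n_hmc` are assumptions made for illustration; only the dictionary keys `"n-mcmc"` and `"n-hmc"` come from the docstring in the diff.

```julia
# Hypothetical keyword-based entry point; names and defaults are illustrative only.
function run_forecast(x::Vector{Float64}; n_mcmc::Int = 200, n_hmc::Int = 50)
    return (n_obs = length(x), n_mcmc = n_mcmc, n_hmc = n_hmc)
end

# Thin wrapper: unpack parsed command-line arguments into keyword arguments.
function run_forecast(x::Vector{Float64}, args::Dict{String, Any})
    return run_forecast(x; n_mcmc = args["n-mcmc"], n_hmc = args["n-hmc"])
end

# The explicit `Dict{String, Any}` constructor matters: a literal such as
# `Dict("n-mcmc" => 100)` has type `Dict{String, Int}` and would not match
# the wrapper's signature, because Julia's parametric types are invariant.
args = Dict{String, Any}("n-mcmc" => 100, "n-hmc" => 25)
result = run_forecast([1.0, 2.0, 3.0], args)
```

Keeping the wrapper this thin means defaults live in one place, the keyword method, while the dictionary method only translates key names.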
76 changes: 9 additions & 67 deletions EpiAutoGP/src/output.jl
@@ -2,47 +2,28 @@
AbstractForecastOutput

Abstract base type for all forecast output formats in EpiAutoGP.

This type serves as the root of the forecast output type hierarchy, allowing for
extensible output formatting while maintaining type safety and dispatch.
"""
abstract type AbstractForecastOutput end

"""
AbstractHubverseOutput <: AbstractForecastOutput

Abstract type for hubverse-compatible forecast outputs.

The hubverse is a standardized format for epidemiological forecasting used by
the CDC and other public health organizations. All concrete subtypes must
produce outputs compatible with hubverse table specifications, e.g. quantile-based
forecasts, sample-based forecasts, etc.
Abstract type for hubverse-compatible forecast outputs in CSV format.
"""
abstract type AbstractHubverseOutput <: AbstractForecastOutput end

"""
QuantileOutput <: AbstractHubverseOutput

Configuration for quantile-based forecast outputs compatible with hubverse specifications.
PipelineOutput <: AbstractForecastOutput

This struct defines the quantile levels to be computed and included in the
hubverse-compatible output table. The default quantile levels follow CDC
forecast hub standards.

# Fields
- `quantile_levels::Vector{Float64}`: Vector of quantile levels between 0 and 1

# Examples
```julia
# Use default quantiles (23 levels from 0.01 to 0.99)
output = QuantileOutput()
Abstract type for directly outputting forecasts as typical pipeline outputs for
`pyrenew-hew`.
"""
abstract type PipelineOutput <: AbstractForecastOutput end

# Custom quantiles for specific use case
output = QuantileOutput(quantile_levels = [0.25, 0.5, 0.75])
"""
QuantileOutput <: AbstractHubverseOutput

# Single quantile (median only)
output = QuantileOutput(quantile_levels = [0.5])
```
Configuration for quantile-based forecast outputs compatible with hubverse specifications.
"""
@kwdef struct QuantileOutput <: AbstractHubverseOutput
quantile_levels::Vector{Float64} = [
@@ -65,14 +46,6 @@ date (when the forecast was made) and each target date.

# Returns
- `Vector{Int}`: Vector of horizons in weeks (integer division by 7 days)

# Examples
```julia
ref_date = Date("2024-01-01")
targets = [Date("2024-01-08"), Date("2024-01-15"), Date("2024-01-22")]
horizons = _make_horizon_col(targets, ref_date)
# Returns: [1, 2, 3]
```
"""
function _make_horizon_col(target_end_dates::Vector{Date}, reference_date::Date)
return [Dates.value(d - reference_date) ÷ 7 for d in target_end_dates]
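Reviewer note on the horizon computation: since this commit drops the docstring example, here is a small self-contained sketch of what the expression does. `horizons` is a local helper written for illustration that reuses the expression from the diff. Note that `÷` truncates toward zero, so partial weeks are dropped and dates shortly before the reference date map to horizon 0.

```julia
using Dates

# Same expression as in the diff, wrapped in a local helper for illustration.
horizons(targets::Vector{Date}, ref::Date) = [Dates.value(d - ref) ÷ 7 for d in targets]

ref_date = Date("2024-01-01")

whole_weeks  = horizons([Date("2024-01-08"), Date("2024-01-15")], ref_date)  # 7 and 14 days
partial_week = horizons([Date("2024-01-10")], ref_date)                      # 9 days
before_ref   = horizons([Date("2023-12-29")], ref_date)                      # -3 days

println(whole_weeks)   # [1, 2]
println(partial_week)  # [1]
println(before_ref)    # [0]
```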
@@ -99,15 +72,6 @@ core forecast data needed for hubverse tables.
- `value`: Computed quantile value
- `target_end_date`: Date for which the forecast applies
- `output_type`: Always "quantile" for this method

# Examples
```julia
results = (forecast_dates = [Date("2024-01-08"), Date("2024-01-15")],
forecasts = rand(2, 100)) # 2 dates × 100 samples
output_config = QuantileOutput(quantile_levels = [0.25, 0.5, 0.75])
df = create_forecast_df(results, output_config)
# Returns DataFrame with 6 rows (2 dates × 3 quantiles)
```
"""
function create_forecast_df(results::NamedTuple, output_type::QuantileOutput)
# Extract relevant data
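Reviewer note on `create_forecast_df`: the removed example described a dates × samples matrix being reduced to one row per (date, quantile level) pair. The sketch below shows that reduction using only the standard library. It is an assumption about the shape of the computation, not the package's implementation: it returns a vector of named tuples instead of a `DataFrame`, and the field name `quantile_level` is hypothetical.

```julia
using Dates, Statistics

# Illustrative inputs: 2 forecast dates × 3 samples (rows are dates).
forecast_dates = [Date("2024-01-08"), Date("2024-01-15")]
forecasts = [1.0 2.0 3.0;
             10.0 20.0 30.0]
levels = [0.25, 0.5, 0.75]

# One row per (date, quantile level); `quantile_level` is a hypothetical field name.
rows = [(target_end_date = d,
         quantile_level = q,
         value = quantile(forecasts[i, :], q),
         output_type = "quantile")
        for (i, d) in enumerate(forecast_dates) for q in levels]

length(rows)  # 6 rows: 2 dates × 3 quantile levels
```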
@@ -163,28 +127,6 @@ hubverse table, optionally saving it to disk.
- `horizon`: Forecast horizon in weeks
- `target_end_date`: Date for which forecast applies
- `location`: Geographic location identifier

# Examples
```julia
# Create and save hubverse table
output_type = QuantileOutput()
df = create_forecast_output(
input_data, results, "./output", output_type;
save_output = true,
group_name = "CDC",
model_name = "EpiAutoGP-v1"
)

# Create table without saving
df = create_forecast_output(
input_data, results, "./output", output_type;
save_output = false
)
```

# File Output
When `save_output = true`, creates a CSV file with filename format:
`{reference_date}-{group_name}-{model_name}-{location}-{disease_abbr}-{target}.csv`
"""
function create_forecast_output(
input::EpiAutoGPInput,