ARPS decline curve analysis and production forecasting for oil and gas wells with comprehensive data validation.
- ARPS Decline Curves: Automatic fitting of exponential, hyperbolic, and harmonic decline curves
- Smart Data Processing: Peak detection, outlier removal, and data quality filters
- High-Quality Forecasts: Average R² of 0.69 with 70% of wells achieving good fits (R² ≥ 0.6)
- Data Validation Framework: Bronze-to-silver data quality validation with change tracking
- Interactive Visualization: Streamlit app for exploring forecasts
- Audit & Compliance: JSON audit reports for data lineage and quality monitoring
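The three ARPS decline forms can be sketched as follows. This is a minimal illustration of the standard equations only, not the package's internal implementation; the function name `arps_rate` is hypothetical:

```python
import numpy as np

def arps_rate(t, qi, di, b=0.0):
    """ARPS production rate q(t) for initial rate qi and initial decline di.

    b == 0     -> exponential: q = qi * exp(-di * t)
    0 < b < 1  -> hyperbolic:  q = qi / (1 + b * di * t) ** (1 / b)
    b == 1     -> harmonic:    q = qi / (1 + di * t)
    """
    t = np.asarray(t, dtype=float)
    if b == 0.0:
        return qi * np.exp(-di * t)
    return qi / (1.0 + b * di * t) ** (1.0 / b)

# Example: 1000 units/month initial rate, 5%/month initial decline, 24 months
t = np.arange(24)
q_exp = arps_rate(t, 1000.0, 0.05, b=0.0)  # steepest decline
q_hyp = arps_rate(t, 1000.0, 0.05, b=0.5)  # flattens over time
q_har = arps_rate(t, 1000.0, 0.05, b=1.0)  # slowest decline
```

Curve fitting then reduces to estimating `qi`, `di`, and `b` from observed monthly volumes.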
Install dependencies:

```bash
uv sync
```

Run the validation pipeline:

```bash
uv run python run_validation.py
```

Or generate forecasts directly in Python:

```python
import pandas as pd

from petrinex.forecast import prepare_well_data, forecast_multiple_wells

# Load and prepare data
df = pd.read_parquet('fixtures/ngl_vol_bronze_cvx.parquet')
processed = prepare_well_data(df, min_months=6)

# Generate forecasts
forecasts = forecast_multiple_wells(processed, forecast_months=24, production_column='GasProduction')
```

We have created a Streamlit app to allow users to explore the forecasts:
```bash
uv run streamlit run streamlit_app.py
```

For deploying to Databricks:

```bash
databricks bundle validate
databricks bundle deploy --target dev
databricks bundle run
```

The project includes a comprehensive data quality framework that:
- Bronze-to-Silver Validation: Automatically validates incoming data against silver tables
- Change Detection: Efficiently tracks data changes using row hashes
- Data Quality Checks: Validates data types, ranges, and handles NA values
- Audit Reports: Generates JSON audit trails for compliance and monitoring
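Row-hash change detection of the kind described can be sketched in pandas like this. It is illustrative only; the key column, hashing scheme, and function names are assumptions, not the project's actual implementation:

```python
import hashlib

import pandas as pd

def add_row_hash(df: pd.DataFrame) -> pd.DataFrame:
    """Attach a stable per-row hash over all columns."""
    out = df.copy()
    out["row_hash"] = [
        hashlib.sha256("|".join(map(str, row)).encode()).hexdigest()
        for row in df.itertuples(index=False)
    ]
    return out

def detect_changes(bronze: pd.DataFrame, silver: pd.DataFrame, key: str) -> pd.DataFrame:
    """Return bronze rows whose hash differs from, or is absent in, silver."""
    merged = add_row_hash(bronze).merge(
        add_row_hash(silver)[[key, "row_hash"]],
        on=key, how="left", suffixes=("", "_silver"),
    )
    changed = merged[merged["row_hash"] != merged["row_hash_silver"]]
    return changed.drop(columns=["row_hash", "row_hash_silver"])
```

Only changed or new rows then need to be merged into the silver table.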
- Bronze Tables: Raw Petrinex production data (`ngl_vol_bronze_cvx.parquet`, `conv_vol_bronze_cvx.parquet`)
- Validation: Data type checking, range validation, NA handling, schema compliance
- Silver Tables: Validated bronze data with metadata (hash, timestamps) for change tracking
- Audit Reports: JSON files tracking data quality metrics and changes over time
All files are stored in the `fixtures/` directory for easy access and future Spark integration.
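The validation checks listed above might look roughly like this in pandas; the column name `GasProduction` appears in the quick-start example, but the function name and summary structure here are hypothetical:

```python
import pandas as pd

def validate_bronze(df: pd.DataFrame) -> dict:
    """Run basic quality checks and return an audit-style summary dict."""
    issues = {}
    # Range check: production volumes must be non-negative
    if "GasProduction" in df.columns:
        issues["negative_volumes"] = int((df["GasProduction"] < 0).sum())
    # NA handling: count missing values per column
    issues["na_counts"] = df.isna().sum().to_dict()
    # Type check: the production column should be numeric
    issues["non_numeric_production"] = not pd.api.types.is_numeric_dtype(
        df.get("GasProduction", pd.Series(dtype=float))
    )
    return issues
```

A summary dict like this can be serialized directly to the JSON audit reports.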
Run the complete validation pipeline using the Jupyter notebook:

```bash
uv run jupyter lab src/03_validate.ipynb
```

Or use the standalone script:

```bash
uv run python run_validation.py
```

Run tests with:
```bash
uv run pytest
```

Design notes:

- Imperial unit conversions are handled in the presentation layer (the Streamlit app) rather than cluttering the core business logic.
- Logging happens only within the codebase (`src/petrinex`), not in the notebooks.
- Works with volumes only, not DBFS.
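Since Petrinex source volumes are metric, the presentation-layer conversion mentioned above might be sketched as follows. The conversion factors are standard, but the function name and fluid labels are assumptions:

```python
# Standard metric-to-imperial volume factors
M3_TO_BBL = 6.2898     # 1 m³ of liquid ≈ 6.2898 barrels
M3_TO_MCF = 0.0353147  # 1 m³ of gas ≈ 0.0353147 Mcf (35.3147 ft³ per m³)

def to_imperial(volume_m3: float, fluid: str) -> float:
    """Convert a metric volume to imperial display units (UI layer only)."""
    if fluid == "gas":
        return volume_m3 * M3_TO_MCF
    return volume_m3 * M3_TO_BBL
```

Keeping this helper in the Streamlit layer means the core fitting code works in one consistent unit system.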