@tasansal tasansal commented Sep 7, 2025

No description provided.

tasansal and others added 30 commits May 27, 2025 09:04
* Update from list to discrete values for coordinate metadata

* Add docs to help users understand difference

* Update docs and fix case sensitivity.

* Linting

* Add CoordinateMetadata to docs
# Conflicts:
#	uv.lock
* Update dependencies to latest versions

* Update linter type-checking code to 'TC' in pyproject.toml

https://astral.sh/blog/ruff-v0.8.0#new-error-codes-for-flake8-type-checking-rules

* Refactor: Move Zarr codec imports to top-level

* disable safety in CI (temporary)

* Refactor: Replace Zarr codec imports with numcodecs equivalents

* Refactor: Remove unused numcodecs imports and related methods

* pin zarr due to zarr 3.0.9 bug
dmitriyrepin and others added 15 commits August 12, 2025 12:49
* Fix integration import tests

* Fix integration import tests

* mask_and_scale=False

* PR Review

* pre-commit

* PR Review issues

* add todo for headers

* update line length limit to 120 in pyproject.toml

* compact nested code for improved readability in validation tests

* compact coordinate and dimension name definitions in 2D/3D prestack shot templates

* refactor names in header validation in SEG-Y export tests

* remove v1 suffix

* compact code by merging multi-line blocks into single lines where possible

* bump prettier to v3.1.0 and remove prettier-plugin-toml

* update lock file

---------

Co-authored-by: Altay Sansal <[email protected]>
* Fix integration import tests

* mask_and_scale=False

* pre-commit

* PR Review issues

* serialize-text-and-binary-headers

* remove dev test data

* add back whitespace

* revert import changes

* fix attribute initialization in `_add_text_binary_headers`

* Add tests

* refactor: improve type annotations and docstrings in test utilities

* fix formatting

* remove redundant `str()` casting in `xr.open_dataset` calls

---------

Co-authored-by: Altay Sansal <[email protected]>
* update helper to support structured types in variable validation

* add Seismic3DPreStackCocaTemplate and corresponding unit tests

* register Seismic3DPreStackCocaTemplate in template registry

* reorganize template registrations in template_registry and remove depth ones from shots.

* use registered templates instead of listing them all by hand.

* simplify template instantiation in unit tests

* fix default templates and add missing ones

* refactor default template assertions using shared constant
* Implement fixes to ensure lazy allocation of data arrays on serialization

* Avoid unnecessary copies of data in memory

* Linting

* Eliminate immediate overwrite of `data` bug

* Remove unused import

* Set appropriate fill value for lazy arrays

* Clean up header value handler

* Resolve data serialization issues

* Ensure all encodings are captured

* Simplify dataset coordinate population logic by removing unused imports and redundant variable handling

* Refactor `_workers.py` to streamline variable handling, replace manual Variable creation with direct assignment, and resolve redundant imports.

* make better use of grid

* fix type hint

* make better use of grid

* fix(regression): make dataset serialization less eager

* update zarr

* remove comment

---------

Co-authored-by: Altay Sansal <[email protected]>
Fix memory and core utilization regressions
* Export part 1

* Enable header value validation

* Revert the test names back

* Remove Endianness, new_chunks API args and traceDomain

* PR review

* lint

* create/use new api location and lint

* allow configuring opener chunks

* clarify xarray open parameters

* fix regression of not-opening with native dask re-chunking

* fix regression of not-opening with native dask re-chunking

* make export rechunker work with named dimension sizes and chunks

* make StorageLocation available at library level and update mdio to segy example

* pre-open with zarr backend and simplify dataset slicing after lazy loading

* better opener docs

* more explicit xarray selection

* rename trace variable name to default variable name

* remove the guard for setting storage options to empty dictionary. new zarr is ok with None.

* update lockfile

* fix broken tests and inconsistent type hints

* clean up comments

* clarify binary header scaling

* make test names clearer

* fix broken unit tests due to storage_options handling

---------

Co-authored-by: Altay Sansal <[email protected]>
* AutoChannelWrap over updated-v1

* Fix test

* rename function for new behaviour and improve type hint for grid_overrides

* simplify metadata handling

* lint

* gridOverride is not required

* remove unnecessary byte order change, handled upstream.

* remove rtol adds, tests pass.

* remove expected behaviour comment

* clean up tests

* use grouped assignments to fix PLR0915

* add comments to clarify

---------

Co-authored-by: Altay Sansal <[email protected]>
* remove all zarr v2 refs and fix fill_value attributes

* fix codec initialization for zarr3

* use correct kwargs for compressor definition

* fix fill value for structs

* fix numpy imports

* fix creation logic

* make numpy import namespace

* ensure fill value is correct for structured arrays

* fill value all fields

* remove legacy test for bug in v2

* fix codec related issues and warning spamming

* use UPath instead of StorageLocation and remove all v0 stuff

* undo warning suppression for now

* remove v0 dataset schema

* make metadata tuples immutable, optimize performance, and apply consistent code styling

- remove old zarr APIs
- Ensure grid attrs (map and live) get compressed properly.
- move grid_map slicing to worker from main process
* fix output uri handling for remote stores

* switch from `as_uri` to `as_posix` for compatibility with xarray
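
For context on the `as_uri` to `as_posix` switch above, here is a minimal sketch of how the two methods differ. It uses the standard library's `pathlib` rather than `UPath`, and the path `/data/survey.mdio` is a hypothetical example, not one taken from this PR. The assumption is that the relevant difference is the `file://` scheme prefix that `as_uri` adds for local paths.

```python
from pathlib import PurePosixPath

# Hypothetical local output location, used only for illustration.
output_path = PurePosixPath("/data/survey.mdio")

# as_uri() prefixes the path with a "file://" scheme.
uri_form = output_path.as_uri()

# as_posix() returns the plain forward-slash path string.
posix_form = output_path.as_posix()

print(uri_form)    # file:///data/survey.mdio
print(posix_form)  # /data/survey.mdio
```

How `UPath` renders remote store locations with each method is not shown here and may differ from the local case.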
* reorg and simplify

* fix comparison of stats

* fix regression in dataset attribute serialization

* ensure histogram alias is compared correctly

* update docs references

* fix broken refs

* remove top level metadata ref

* remove blosc config refs (we now get from zarr)

* delete removed stats metadata wrapper

* update deps and remove safety

- reason for removal: pyupio/safety#673

* fix numpy rng lint errors

* exclude lower level members

* remove singleton from template registry title

* make template registry api ref with autodoc
* rename things to be more sensible and add angle gathers configuration to PreStackCdp templates.

- add missing 2d test

* align shot data template with prod

* fix tests for 3d pre-stack shot

* remove deleted attribute (processingStage)

* rename gatherType for coca

* lint and fix 1 bug

* rename gather -> ensemble or raw field data

* add missing 2d shot

* fix docstrings

* fix wrong validation namings
* fix ingestion for coordinates that don't share all dimensions

* add todo for verification of reduced dimensions
* add todo markers for disabled tests.

* set coverage minimum to 85% due to disabled tests
@tasansal tasansal marked this pull request as ready for review September 8, 2025 01:23
@tasansal tasansal merged commit 65d8263 into main Sep 8, 2025
16 checks passed
@tasansal tasansal deleted the v1 branch September 8, 2025 01:48
Projects

Status: Done

4 participants