Draft
Changes from 1 commit
Commits
48 commits
88a3e0f
Update documentation for the relationship between CIF grammar and the…
janbridley Dec 8, 2025
39a9396
Fix string format
janbridley Dec 8, 2025
e283d39
Unix-style wildcards doc
janbridley Dec 8, 2025
aaf5ad5
Remove exclamation points
janbridley Dec 8, 2025
c498b9d
Clean up development guide
janbridley Dec 8, 2025
6db5a2a
Swap to RST link style for doc comment
janbridley Dec 8, 2025
f77c6f8
Update type hints
janbridley Dec 8, 2025
5e82c5c
Add type hints where they cannot be inferred
janbridley Dec 8, 2025
4d0ba4d
Standardize error handling for gemmi tests
janbridley Dec 8, 2025
d032730
Clean up unused TODOs
janbridley Dec 8, 2025
ea0e515
One straggler TODO
janbridley Dec 8, 2025
92ef1f3
Note TODO
janbridley Dec 8, 2025
3550eae
Swap note -> attention admonition
janbridley Dec 8, 2025
8b0e7ff
Swap to caution
janbridley Dec 8, 2025
81ef61a
One more caution
janbridley Dec 8, 2025
f9522b7
Add GSD requirement for testing
janbridley Dec 8, 2025
c66d871
Fix typos
janbridley Dec 8, 2025
2fef24d
Add HOOMD-Blue example and examples toc section
janbridley Dec 8, 2025
e083212
Add LAMMPS example
janbridley Dec 8, 2025
fe34e8d
Fix LAMMPS example
janbridley Dec 8, 2025
13b2823
Remove unused line
janbridley Dec 8, 2025
9398394
Add noisy data and fix headers
janbridley Dec 8, 2025
6357459
Final title
janbridley Dec 8, 2025
a8d80f4
Doctest LAMMPS output
janbridley Dec 8, 2025
239dc7e
Add example on numerical precision
janbridley Dec 9, 2025
81b0771
Update requirements file for py3.14
janbridley Dec 9, 2025
6ec6f29
Update changelog.rst
janbridley Dec 9, 2025
1b5be04
Fix label in CHANGELOG
janbridley Dec 9, 2025
ab33ab2
Pre-compile patterns for unit cell evaluation
janbridley Dec 10, 2025
4129943
Add example for setting Wyckoff sites
janbridley Dec 10, 2025
fbf40fb
Fix doctest-requires
janbridley Dec 10, 2025
21a5c3f
Add warning for setting structure
janbridley Dec 10, 2025
2287526
Fix type annotation
janbridley Dec 10, 2025
a624d3a
Include only necessary data
janbridley Dec 10, 2025
d3c189c
Merge branch 'doc/grammar' into feat/set_basis
janbridley Dec 10, 2025
84c08e3
sphinx-inline-tabs
janbridley Dec 12, 2025
6ca3a23
Description of Wyckoff postions
janbridley Dec 12, 2025
02c7740
Move up table and fix formatting
janbridley Dec 12, 2025
0860b82
Merge branch 'main' into feat/set_basis
janbridley Dec 12, 2025
ff427e7
Restore heading
janbridley Dec 12, 2025
4825342
Merge branch 'main' into feat/set_basis
janbridley Jan 2, 2026
e5b02b8
Fix format
janbridley Jan 2, 2026
a8c0031
Clean up text
janbridley Jan 2, 2026
3887499
Clarify number of atoms
janbridley Jan 2, 2026
9b66854
Clarify optimal structure
janbridley Jan 2, 2026
bcfa11e
en dash -> em dash
janbridley Jan 2, 2026
d48bcea
Tighten up text
janbridley Jan 2, 2026
66c9119
Fix indentation typo
janbridley Jan 2, 2026
Add example for setting Wyckoff sites
janbridley committed Dec 10, 2025
commit 4129943074d9a9db23bf2e68eb8d9ced389c4870
302 changes: 302 additions & 0 deletions doc/source/_static/perfect_imperfect_bmn.svg
52 changes: 52 additions & 0 deletions doc/source/betamn.cif
@@ -0,0 +1,52 @@
data_beta_manganese

_chemical_name_mineral "beta"
_chemical_formula_sum "Mn"

_symmetry_Int_Tables_number 213

_cell_length_a 6.315000
_cell_length_b 6.315000
_cell_length_c 6.315000
_cell_angle_alpha 90.00000
_cell_angle_beta 90.00000
_cell_angle_gamma 90.00000

loop_
_space_group_symop_id
_space_group_symop_operation_xyz
1 x,y,z
2 x+1/2,-y+1/2,-z
3 -x,y+1/2,-z+1/2
4 -x+1/2,-y,z+1/2
5 y,z,x
6 y+1/2,-z+1/2,-x
7 -y,z+1/2,-x+1/2
8 -y+1/2,-z,x+1/2
9 z,x,y
10 z+1/2,-x+1/2,-y
11 -z,x+1/2,-y+1/2
12 -z+1/2,-x,y+1/2
13 -y+3/4,-x+3/4,-z+3/4
14 -y+1/4,x+3/4,z+1/4
15 y+1/4,-x+1/4,z+3/4
16 y+3/4,x+1/4,-z+1/4
17 -x+3/4,-z+3/4,-y+3/4
18 -x+1/4,z+3/4,y+1/4
19 x+1/4,-z+1/4,y+3/4
20 x+3/4,z+1/4,-y+1/4
21 -z+3/4,-y+3/4,-x+3/4
22 -z+1/4,y+3/4,x+1/4
23 z+1/4,-y+1/4,x+3/4
24 z+3/4,y+1/4,-x+1/4

loop_
_atom_site_label
_atom_site_type_symbol
_atom_site_symmetry_multiplicity
_atom_site_Wyckoff_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
Mn1 Mn 8 c 0.06361 0.06361 0.06361
Mn2 Mn 12 d 0.12500 0.20224 0.45224
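The two Wyckoff sites above expand to the full unit cell once every symmetry operation in the file is applied to them. The following is a minimal sketch of that expansion in plain Python, not **parsnip**'s implementation: the helper names `apply_symop` and `expand_site` are hypothetical, the operation strings are copied from the loop above, and duplicates are detected by rounding, which is an assumption that suits this file but not noisy data in general.

```python
# Symmetry operations copied from the betamn.cif loop above.
SYMOPS = [
    "x,y,z", "x+1/2,-y+1/2,-z", "-x,y+1/2,-z+1/2", "-x+1/2,-y,z+1/2",
    "y,z,x", "y+1/2,-z+1/2,-x", "-y,z+1/2,-x+1/2", "-y+1/2,-z,x+1/2",
    "z,x,y", "z+1/2,-x+1/2,-y", "-z,x+1/2,-y+1/2", "-z+1/2,-x,y+1/2",
    "-y+3/4,-x+3/4,-z+3/4", "-y+1/4,x+3/4,z+1/4",
    "y+1/4,-x+1/4,z+3/4", "y+3/4,x+1/4,-z+1/4",
    "-x+3/4,-z+3/4,-y+3/4", "-x+1/4,z+3/4,y+1/4",
    "x+1/4,-z+1/4,y+3/4", "x+3/4,z+1/4,-y+1/4",
    "-z+3/4,-y+3/4,-x+3/4", "-z+1/4,y+3/4,x+1/4",
    "z+1/4,-y+1/4,x+3/4", "z+3/4,y+1/4,-x+1/4",
]


def apply_symop(op, site):
    """Evaluate one 'x+1/2,-y,z'-style operation and wrap into [0, 1)."""
    x, y, z = site
    variables = {"x": x, "y": y, "z": z}
    # eval is acceptable here only because the strings are trusted literals.
    return tuple(
        eval(expr, {"__builtins__": {}}, variables) % 1.0
        for expr in op.split(",")
    )


def expand_site(site, symops=SYMOPS, decimals=4):
    """Return the unique symmetry images of a fractional coordinate."""
    images = {
        tuple(round(c, decimals) % 1.0 for c in apply_symop(op, site))
        for op in symops
    }
    return sorted(images)


mn1_images = expand_site((0.06361, 0.06361, 0.06361))
mn2_images = expand_site((0.12500, 0.20224, 0.45224))
print(len(mn1_images), len(mn2_images))
```

The number of unique images per site should agree with the `_atom_site_symmetry_multiplicity` column (8 and 12), giving the twenty atoms of the unit cell.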
129 changes: 129 additions & 0 deletions doc/source/example_new_structures.rst
@@ -0,0 +1,129 @@
Refining and Experimenting with Structures
==========================================

**parsnip** allows users to set the Wyckoff positions of a crystal, enabling the
construction of modified -- or entirely new -- structures. In this example, we show
how an experimental beta-Manganese (cP20-Mn) structure can be refined into the
more uniform variant described by `O'Keefe and Andersson`_.

.. _`O'Keefe and Andersson`: https://doi.org/10.1107/S0567739477002228

These are the Wyckoff positions for elemental Beta-Manganese:

.. literalinclude:: betamn.cif
    :lines: 50-52


.. testsetup::

    >>> import os
    >>> import numpy as np
    >>> if "doc/source" not in os.getcwd(): os.chdir("doc/source")

Loading the file shows the twenty atoms we expect for β-Mn:

.. doctest::

    >>> from parsnip import CifFile
    >>> filename = "betamn.cif"
    >>> cif = CifFile(filename)
    >>> uc = cif.build_unit_cell()
    >>> assert uc.shape == (20, 3)

Introducing Beta-Manganese
^^^^^^^^^^^^^^^^^^^^^^^^^^

Beta-Manganese is a `tetrahedrally close-packed`_ (TCP) structure, a class of complex
phases whose geometry minimizes the distance between atoms in a manner that prevents the
formation of octahedral interstitial sites. Intuitively, one can imagine the bond network
of a TCP structure as a space-filling collection of irregular tetrahedra, with some
distortion imposed by the requirement that the structure tile space.

It turns out that natural beta-Manganese actually has *more* variation in bond lengths
than is strictly required for this topology. `O'Keefe and Andersson`_ noticed that
shifting the free coordinates of the ``Mn1`` and ``Mn2`` Wyckoff positions by just
``0.0042`` and ``0.0012`` fractional units, respectively, results in a TCP structure
whose maximum relative difference in bond length is lower than that observed in
experiment.

.. _`tetrahedrally close-packed`: https://www.chemie-biologie.uni-siegen.de/ac/hjd/lehre/ss08/vortraege/mehboob_tetrahedrally_close_packing_corr_.pdf

Using **parsnip**, we can explore the differences between experimental and ideal
beta-Manganese, quantifying the distribution of bond lengths in the crystal:

.. doctest::

    >>> from parsnip import CifFile
    >>> from math import sqrt
    >>> filename = "betamn.cif"
    >>> cif = CifFile(filename)
    >>> atomic_uc = cif.build_unit_cell()
    >>> assert atomic_uc.shape == (20, 3)
    >>> # Values are drawn from O'Keefe and Andersson, linked above.
    >>> x = 1 / (9 + sqrt(33))
    >>> mn1 = [x, x, x]
    >>> mn1 # doctest: +FLOAT_CMP
    [0.0678216, 0.0678216, 0.0678216]
    >>> y = (9 - sqrt(33)) / 16
    >>> z = (13 - sqrt(33)) / 16
    >>> mn2 = [1 / 8, y, z]
    >>> mn2 # doctest: +FLOAT_CMP
    [0.1250000, 0.2034648, 0.4534648]

    >>> cif.set_wyckoff_positions([mn1, mn2])
    CifFile(file=betamn.cif) : 9 data entries, 2 data loops
    >>> # We should still have the same number of atoms
    >>> ideal_uc = cif.build_unit_cell(n_decimal_places=4)
    >>> assert ideal_uc.shape == atomic_uc.shape


Analyzing our New Structure
^^^^^^^^^^^^^^^^^^^^^^^^^^^

The following plot shows a histogram of neighbor distances for experimental
beta-Manganese (top) and the ideal structure (bottom). Each bar corresponds to a
single neighbor bond length, and every particle's neighbors sit at one of the
plotted distances. Interestingly, although the ideal structure has a more uniform
topology with fewer distinct edges, the observed atomic structure distributes
bonds more uniformly among the particles.


.. image:: _static/perfect_imperfect_bmn.svg
    :width: 100%


A Note on Symmetry
^^^^^^^^^^^^^^^^^^

Modifying the Wyckoff positions of a crystal (without changing the symmetry operations)
cannot reduce the symmetry of the structure -- however, some choices of sites can
result in *additional* symmetry operations that are not present in the input space
group. While the example provided above preserved the space group of our crystal,
choosing a fractional coordinate that lies on a high-symmetry point (like the origin,
or the center of the cell) can yield a structure with a different space group.


.. doctest-requires:: spglib

    >>> import spglib
    >>> box = cif.lattice_vectors
    >>> # Verify that our initial and "ideal" beta-Manganese cells share a space group
    >>> spglib.get_spacegroup((box, atomic_uc, [0] * 20))
    'P4_132 (213)'
    >>> spglib.get_spacegroup((box, ideal_uc, [0] * 20))
    'P4_132 (213)'
    >>> cif["_symmetry_Int_Tables_number"] # Data from the initial file.
    '213'

    >>> cif = CifFile("betamn.cif").set_wyckoff_positions([[0.0, 0.0, 0.0]])
    >>> different_uc = cif.build_unit_cell()
    >>> spglib.get_spacegroup((box, different_uc, [0] * len(different_uc)))
    'Fd-3m (227)'

Takeaways
^^^^^^^^^

**parsnip** allows us to use existing structural data to generate new crystals,
including those that have not been observed in experiment. While the example shown here
is relatively simple, assigning alternative Wyckoff positions enables high-throughput
materials discovery research and offers a simple framework by which structural features
can be explored.
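The neighbor-distance comparison described under "Analyzing our New Structure" can be approximated with a few lines of NumPy. This is a sketch under stated assumptions rather than the code that produced the figure: the function names `minimum_image_distances` and `distance_histogram` are hypothetical, the cell is assumed to be cubic, and each pair of atoms is counted once at its nearest periodic image, which is only adequate when the distances of interest are shorter than half the cell length.

```python
import numpy as np


def minimum_image_distances(frac_coords, cell_length):
    """Pair distances in a cubic cell using the minimum-image convention."""
    frac = np.asarray(frac_coords, dtype=float)
    # Differences between every pair of fractional coordinates.
    delta = frac[:, None, :] - frac[None, :, :]
    # Shift each component into [-0.5, 0.5] to pick the nearest image.
    delta -= np.round(delta)
    dist = cell_length * np.linalg.norm(delta, axis=-1)
    # Keep each unordered pair once (upper triangle, no diagonal).
    i, j = np.triu_indices(len(frac), k=1)
    return dist[i, j]


def distance_histogram(distances, decimals=3):
    """Count how many pairs share each (rounded) distance."""
    values, counts = np.unique(np.round(distances, decimals), return_counts=True)
    return dict(zip(values.tolist(), counts.tolist()))


# Toy check on a body-centered cubic cell with edge length 2.
bcc = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]]
bcc_distances = minimum_image_distances(bcc, cell_length=2.0)
bcc_histogram = distance_histogram(bcc_distances)
```

Passing `atomic_uc` or `ideal_uc` from the example together with the cell length `6.315` and keeping only the shortest distances would give a rough analogue of the two histograms.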
18 changes: 18 additions & 0 deletions doc/source/examples.rst
@@ -0,0 +1,18 @@
.. _examples:

========
Examples
========

This tutorial provides a complete introduction to **parsnip**, including its place in
the broader simulation and data science ecosystems. We begin by illustrating how
**parsnip** aids in simulation initialization in two common libraries, HOOMD-Blue and
LAMMPS. We then highlight **parsnip**'s class-leading performance reconstructing noisy
experimental data. We conclude with a tutorial on using **parsnip** to generate new
structures from existing data.


.. toctree::
    :maxdepth: 2

    example_new_structures
1 change: 1 addition & 0 deletions doc/source/index.rst
@@ -16,6 +16,7 @@

     installation
     quickstart
+    examples


 .. toctree::
85 changes: 84 additions & 1 deletion parsnip/parsnip.py
@@ -852,7 +852,10 @@ def _read_wyckoff_positions(self):
             for (k, v) in zip(self.__class__._WYCKOFF_KEYS, wyckoff_position_data)
             if v is not None
         ]
-        return np.hstack([x for x in wyckoff_position_data if x is not None] or [[]])
+        data_to_stack = [x for x in wyckoff_position_data if x is not None]
+        if not data_to_stack:
+            return np.array([])
+        return np.column_stack(data_to_stack)

     @property
     def wyckoff_positions(self):
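The change from `np.hstack` to `np.column_stack` in this hunk matters most when the per-axis coordinate columns are one-dimensional. The snippet below illustrates the difference with made-up column data taken from the CIF above; whether **parsnip**'s internal columns are 1-D or already shaped `(N, 1)` is not visible in this diff, so treat the 1-D case as an assumption. For `(N, 1)` inputs both functions return an `(N, 3)` array.

```python
import numpy as np

# Hypothetical per-axis columns, one entry per Wyckoff site.
x = np.array(["0.06361", "0.12500"])
y = np.array(["0.06361", "0.20224"])
z = np.array(["0.06361", "0.45224"])

# hstack concatenates 1-D inputs end to end, losing the site grouping.
flat = np.hstack([x, y, z])

# column_stack treats each 1-D input as a column of a 2-D table.
table = np.column_stack([x, y, z])

# The explicit empty-input branch returns a zero-length array.
empty = np.array([])

print(flat.shape, table.shape, empty.shape)
```

Each row of `table` holds the three coordinates of one site, which is the layout an `(N, 3)` consumer expects.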
@@ -867,6 +870,86 @@ def wyckoff_positions(self):
         """
         return cast_array_to_float(self._read_wyckoff_positions(), dtype=float)

+    def set_wyckoff_positions(self, wyckoff_sites: np.ndarray[(None, 3), np.float64]):
+        r"""Set the Wyckoff sites in the CIF file data.
+
+        This method updates the values of the Wyckoff position coordinates
+        (e.g., ``_atom_site_fract_x``, ``_atom_site_fract_y``, ``_atom_site_fract_z``)
+        in the corresponding loop structure. The input is a NumPy array of floating
+        point values, which will be converted to strings for storage.
+
+        If the provided array has a different number of rows than the existing
+        data, the loop will be resized. When adding new sites, placeholder
+        data ("?") will be used for non-coordinate columns. When removing sites,
+        rows are removed from the end of the loop.
+
+        Parameters
+        ----------
+        wyckoff_sites : np.ndarray[(None, 3), np.float64]
+            A NumPy array of shape (N, 3) containing the new Wyckoff sites.
+
+        Raises
+        ------
+        ValueError
+            If the Wyckoff position keys cannot be found in any loop, or if the
+            input array does not have 3 columns.
+        """
+        wyckoff_sites = np.asarray(wyckoff_sites)
+        if len(self._raw_wyckoff_keys) == 0:
+            self._read_wyckoff_positions()
+
+        keys_to_set = self._wyckoff_site_keys
+
+        # If we have both fractional and cartesian, only use the first three (fract)
+        if len(keys_to_set) > 3:
+            keys_to_set = keys_to_set[:3]
+
+        if len(keys_to_set) != 3:
+            raise ValueError(f"Found {len(keys_to_set)} Wyckoff keys, expected 3.")
+
+        target_loop_idx = -1
+        for i, loop in enumerate(self._loops):
+            if all(key in loop.dtype.names for key in keys_to_set):
+                target_loop_idx = i
+                break
+
+        if target_loop_idx == -1:
+            raise ValueError(
+                f"Could not find a loop containing all Wyckoff keys: {keys_to_set}"
+            )
+
+        if wyckoff_sites.ndim != 2 or wyckoff_sites.shape[1] != 3:
+            raise ValueError(
+                "Input `wyckoff_sites` must have shape (N, 3), but has shape "
+                f"{wyckoff_sites.shape}."
+            )
+
+        target_loop = self._loops[target_loop_idx]
+        n_current = len(target_loop)
+        n_new = len(wyckoff_sites)
+
+        new_loop = np.empty(n_new, dtype=target_loop.dtype)
+
+        # Copy over existing data for columns that are not being set
+        other_keys = [
+            name for name in target_loop.dtype.names if name not in keys_to_set
+        ]
+        n_to_copy = min(n_current, n_new)
+        for key in other_keys:
+            new_loop[key][:n_to_copy] = target_loop[key][:n_to_copy].squeeze()
+
+        # Set new coordinates
+        for i, key in enumerate(keys_to_set):
+            new_loop[key] = [f"{val:.8f}" for val in wyckoff_sites[:, i]]
+
+        # Fill in default values for added rows
+        if n_new > n_current:
+            for key in other_keys:
+                new_loop[key][n_current:] = "?"
+
+        self._loops[target_loop_idx] = new_loop
+        return self  # Allow for chaining.
+
     @property
     def cast_values(self):
         """Bool : Whether to cast "number-like" values to ints & floats.
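The resizing behavior described in the `set_wyckoff_positions` docstring (copy the non-coordinate columns, overwrite the coordinates, and pad new rows with `"?"`) can be tried in isolation on a NumPy structured array. The sketch below is a standalone illustration, not the method itself: `resize_loop` is a hypothetical helper, and the field names `label`, `x`, `y`, and `z` with a `U16` string dtype are assumptions chosen for brevity.

```python
import numpy as np


def resize_loop(loop, coord_keys, new_coords, placeholder="?"):
    """Return a copy of `loop` holding `new_coords` in the coordinate fields."""
    new_coords = np.asarray(new_coords, dtype=float)
    if new_coords.ndim != 2 or new_coords.shape[1] != len(coord_keys):
        raise ValueError(
            f"Expected shape (N, {len(coord_keys)}), got {new_coords.shape}."
        )

    n_old, n_new = len(loop), len(new_coords)
    resized = np.empty(n_new, dtype=loop.dtype)
    n_copy = min(n_old, n_new)

    # Carry over the other columns, padding any added rows.
    for key in loop.dtype.names:
        if key in coord_keys:
            continue
        resized[key][:n_copy] = loop[key][:n_copy]
        resized[key][n_copy:] = placeholder

    # Store coordinates as fixed-precision strings, as CIF values are text.
    for i, key in enumerate(coord_keys):
        resized[key] = [f"{value:.8f}" for value in new_coords[:, i]]
    return resized


fields = [("label", "U16"), ("x", "U16"), ("y", "U16"), ("z", "U16")]
loop = np.array(
    [("Mn1", "0.06361", "0.06361", "0.06361"),
     ("Mn2", "0.12500", "0.20224", "0.45224")],
    dtype=fields,
)

grown = resize_loop(loop, ("x", "y", "z"), [[0, 0, 0], [0.5, 0.5, 0.5], [0.25, 0.25, 0.25]])
shrunk = resize_loop(loop, ("x", "y", "z"), [[0, 0, 0]])
```

Growing the loop keeps the existing labels and marks the new row's label as unknown, while shrinking drops rows from the end. Validating the input shape before any other work, as this sketch does, is a design choice that reports malformed input as early as possible.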