Merged
Changes from 1 commit
32 commits
db390c8
Rebase 1
lwasser Feb 2, 2023
47ca25a
Fix: edits to package guide from review
lwasser Feb 2, 2023
e114a1b
Fix: cleanup packaging page and add decision tree diagram
lwasser Feb 16, 2023
4db3074
Add: more edits from massive pr
lwasser Feb 17, 2023
9cae7fa
Fix: final edits to packaging section broken links etc
lwasser Feb 22, 2023
3c6f7ff
Update index.md
lwasser Mar 13, 2023
9be2c29
Apply suggestions from code review
lwasser Mar 13, 2023
f657772
Apply suggestions from code review
lwasser Mar 13, 2023
cf36261
Apply suggestions from code review
lwasser Mar 13, 2023
755cdd5
Apply suggestions from code review
lwasser Mar 13, 2023
3e78288
Fix: Apply suggestions from code review
lwasser Mar 13, 2023
0c4b77c
Fix: confpy file
lwasser Mar 13, 2023
0a9cf7a
Fix: clean up some of the discussion around poetry
lwasser Mar 13, 2023
659ffeb
Updated decision tree diagram
lwasser Mar 13, 2023
61b86eb
Update package-structure-code/python-package-structure.md
lwasser Mar 13, 2023
4b80f31
Update package-structure-code/complex-python-package-builds.md
lwasser Mar 15, 2023
7ba8292
Update package-structure-code/intro.md
lwasser Mar 15, 2023
b64b59a
Fix: many more comments from review 2
lwasser Mar 15, 2023
40eb8c6
A few more fixes to the build tools page
lwasser Mar 15, 2023
1cbae63
Update package-structure-code/complex-python-package-builds.md
lwasser Mar 16, 2023
fbf3ded
Update package-structure-code/python-package-distribution-files-sdist…
lwasser Mar 16, 2023
b082fb1
Fix: csv table delim and numerous other review fixes
lwasser Mar 16, 2023
f2c1966
Fix: remove notes from docs
lwasser Mar 16, 2023
4a75fd9
Update package-structure-code/python-package-build-tools.md
lwasser Mar 16, 2023
e46dab2
Update package-structure-code/python-package-distribution-files-sdist…
lwasser Mar 16, 2023
68f5723
Update package-structure-code/python-package-distribution-files-sdist…
lwasser Mar 16, 2023
bef6801
Fix: many more great comments to address
lwasser Mar 16, 2023
a28b971
Fix: more edits from the current review
lwasser Mar 20, 2023
45b7ff8
Add: new cleaned up diagram
lwasser Mar 21, 2023
d427e08
Fix: clarify the section on adding tests to src layout
lwasser Mar 21, 2023
5c74e82
Cleanup of package structure page
lwasser Mar 22, 2023
6d1cd13
Final edits?! yaas
lwasser Mar 22, 2023
Fix: remove notes from docs
lwasser committed Mar 16, 2023
commit f2c1966b39361f3a4008545a5089d04d504bda74
1 change: 1 addition & 0 deletions .gitignore
@@ -5,3 +5,4 @@ tmp/
.DS_Store
.nox
__pycache__
*notes-from-review.md
131 changes: 1 addition & 130 deletions package-structure-code/complex-python-package-builds.md
@@ -14,134 +14,5 @@ back-end tools.
1. **Pure-python packages:** these are packages that only rely on Python to function. Building a pure Python package is simpler. As such, you can choose a tool below that
has the features that you want and be done with your decision!
2. **Python packages with non-Python extensions:** These packages have additional components called extensions written in other languages (such as C or C++). If you have a package with non-python extensions, then you need to select a build back-end tool that allows you to add additional build steps needed to compile your extension code. Further, if you wish to use a front-end tool to support your workflow, you will need to select a tool that
supports additional build steps. In this case, you could use setuptools. However, we suggest that you choose a build tool that supports custom build steps such as Hatch with Hatchling or PDM. PDM is an excellent choice as it allows you to also select your build back-end of choice. We will discuss this at a high level on the complex builds page.
3. **Python packages that have extensions written in different languages (e.g. Fortran and C++) or that have non Python dependencies that are difficult to install (e.g. GDAL):** These packages often have complex build steps (more complex than a package with just a few C extensions for instance). As such, these packages require tools such as [scikit-build](https://scikit-build.readthedocs.io/en/latest/)
supports additional build steps. In this case, you could use setuptools. However, we suggest that you choose a build tool that supports custom build steps such as Hatch with Hatchling or PDM. PDM is an excellent choice as it allows you to also select your build back-end of choice. We will discuss this at a high level on the complex builds page. 3. **Python packages that have extensions written in different languages (e.g. Fortran and C++) or that have non Python dependencies that are difficult to install (e.g. GDAL):** These packages often have complex build steps (more complex than a package with just a few C extensions for instance). As such, these packages require tools such as [scikit-build](https://scikit-build.readthedocs.io/en/latest/)
or [meson-python](https://mesonbuild.com/Python-module.html) to build. NOTE: you can use meson-python with PDM.
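
The three cases in the list above can be read as a small decision rule. The sketch below is only an illustration of that rule as the text states it; `PackageTraits` and `suggest_build_tools` are hypothetical names invented here, not part of any packaging tool, and the returned tool names are simply the ones the list mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageTraits:
    """Hypothetical description of a package (illustration only)."""

    has_compiled_extensions: bool = False    # e.g. a few C or C++ extensions
    mixes_extension_languages: bool = False  # e.g. Fortran and C++ together
    has_hard_non_python_deps: bool = False   # e.g. GDAL


def suggest_build_tools(traits: PackageTraits) -> list[str]:
    """Return the tools the list above names for each of the three cases."""
    if traits.mixes_extension_languages or traits.has_hard_non_python_deps:
        # Case 3: complex builds need a dedicated build tool.
        return ["scikit-build", "meson-python"]
    if traits.has_compiled_extensions:
        # Case 2: the tool must support additional build steps.
        return ["setuptools", "Hatch with Hatchling", "PDM"]
    # Case 1: pure Python, so the choice is a matter of preferred features.
    return ["any build tool whose features you want"]


print(suggest_build_tools(PackageTraits(has_compiled_extensions=True)))
```

The checks run from most to least demanding, so a package that matches case 3 is never given the lighter-weight suggestions of case 2.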

<!--
On this page, we will focus on using front-end tools to package pure python
packages. We will note if a package does have the flexibility to support other
back-ends and in turn more complex builds (*mentioned in #2 and #3 above*). -->
<!--
## Combine the two sets of statements below...
ELI:
PDM supports C/Cython extensions too: https://pdm.fming.dev/latest/pyproject/build/#build-platform-specific-wheels

It does this by allowing you to write a python script that gets injected into a setuptools build process :) so that's not necessarily the greatest choice. It's a bit like using setuptools directly. ;)

Ralf:
Hatch only supports pure Python packages as of now. setuptools is still a very reasonable choice, and okay if all you have is a few C/Cython extensions. But I'd say you should probably recommend meson-python and scikit-build-core as the two best tools for building packages containing compiled extensions.


* link to ralf's blog and book on complex builds
* keep this page high level so we don't get weighed down
* can use the examplePy repo stefan and I are working on that will test various build combinations

*****

ELI: It would be more accurate to say that PDM supports using PDM and setuptools at the same time, so you run setuptools to produce the C extensions and then PDM receives the compiled extension files (.so, .pyd) and packages it up alongside the pure python files.

Hatch build hooks - https://hatch.pypa.io/latest/config/build/#build-hooks

Ralf -
Hatch has the worst take on building compiled code by some distance. Unless its author starts developing an understanding of build systems / needs, and implements support for PEP 517 build back-end hooks in pyproject.toml, it's pretty much a dead end.
****


Henry: Poetry will move to PEP 621 configuration in version 2.

* pdm, hatch and poetry all have "ways" of supporting c extensions via pdm-backend, hatchling and poetry's build back-end.
* poetry's support for C extensions is not fully developed and documented (yet).
* Poetry doesn't offer a way to facilitate "communication" between poetry front end and another back-end like meson to build via a build hook. so while some have used it with other back-end builds it's not ideal for this application
* pdm and poetry both rely on setuptools for C extensions. pdm's support claims to be fully developed and documented. poetry claims nothing, and doesn't document it.
* hatch offers a plugin type approach to support custom build steps
PDM (right now) is the only tool that supports other back-ends (hatch is working on this - 2 minor releases away)
At some point a build becomes so complex that you need to use a tool like scikit or meson to support that complexity.



**Setuptools** is the oldest tool in the above list. While it doesn't have a
friendly user front end, we discuss it here because it is the "OG" tool that has been used for Python packaging for over a decade.

**Hatch** and PDM are newer, more modern tools that support customization of any
part of your packaging steps. These tools also support some C and C++
extensions.


Ofek - Why use hatchling vs pdm back-end -
File inclusion is more configurable and easier by default
There is already a rich ecosystem of plugins and a well-thought-out interface
Consistency since the official Python packaging tutorial uses Hatchling by default


Henry -
The scikit-hep cookie provides 11 back-ends including flit-core and hatchling, and I've moved packaging to flit-core, and lots of other things to hatchling, and I can say that hatchling's defaults are much nicer than flit-core's. Hatchling uses .gitignore to decide what to put in the sdist. Flit-core basically tries to keep its hands off of adding defaults, so you have to configure everything manually. To make it even more confusing, if you use flit instead of a standard tool like build, it will switch to using VCS and those ignored files won't be added - meaning it is really easy to have a project that doesn't support build, including various GitHub Actions. Hatchling wins this by a ton.

<!-- TODO: add - compatible with other build back-ends eg pdm can work with hatchling

Eli:
poetry: supports it, but is undocumented and uses setuptools under the hood, they plan to change how this works and then document it
pdm-back-end: supports it, and documents it -- and also uses setuptools under the hood
hatchling: permits you to define hooks for you to write your own custom build steps, including to build C++ extensions

-->

<!-- from eli about pdm
It would be more accurate to say that PDM supports using PDM and setuptools at the same time, so you run setuptools to produce the C extensions and then PDM receives the compiled extension files (.so, .pyd) and packages it up alongside the pure Python files.

Comment about hatch.
https://github.com/pyOpenSci/python-package-guide/pull/23#discussion_r1081108118

From ralf: There are no silver bullets here yet, no workflow tool is complete. Both Hatch and PDM are single-author tools, which is another concern. @eli-schwartz's assessment is unfortunately correct here I believe (at a high level at least, not sure about details). Hatch has the worst take on building compiled code by some distance. Unless its author starts developing an understanding of build systems / needs, and implements support for PEP 517 build back-end hooks in pyproject.toml, it's pretty much a dead end.

-->

<!--TODO Add examples of builds using each of the tools below?

pdm, hatch and poetry all have "ways" of supporting c extensions via pdm-build, hatchling and poetry's build back-end.
poetry's support for C extensions is not fully developed and documented (yet). Poetry doesn't offer a way to facilitate "communication" between poetry front end and another back-end like meson to build via a build hook.
PDM and hatch both offer a plugin type approach to support custom build steps
PDM (right now) is the only tool that supports other back-ends (hatch is working on this - 2 minor releases away)
At some point a build becomes so complex that you need to use a tool like scikit or meson to support that complexity.

CORRECTIONS:
pdm doesn't use plugins. Hatch does.
pdm and poetry both rely on setuptools for C extensions. pdm's support claims to be fully developed and documented. poetry claims nothing, and doesn't document it.


??
Poetry supports extensions written in other languages but this functionality is
currently undocumented.

Tools such as Setuptools, PDM, Hatch and Poetry all have some level of support
for C and C++ extensions.
Some Python packaging tools,
such as **Flit** and the **flit-core** build back-end only support pure-Python
package builds.
Some front-end packaging tools, such as PDM, allow you to use other
build back-ends such as **meson** and **scikit-build**.


me:
pdm, hatch and poetry all have "ways" of supporting c extensions via pdm-build, hatchling and poetry's build back-end.
poetry's support for C extensions is not fully developed and documented (yet). Poetry doesn't offer a way to facilitate "communication" between poetry front end and another back-end like meson to build via a build hook.
PDM and hatch both offer a plugin type approach to support custom build steps
PDM (right now) is the only tool that supports other back-ends (hatch is working on this - 2 minor releases away)
At some point a build becomes so complex that you need to use a tool like scikit or meson to support that complexity.
@eli-schwartz eli-schwartz 3 weeks ago
PDM and hatch both offer a plugin type approach to support custom build steps

ELI:
pdm doesn't use plugins. Hatch does.
pdm and poetry both rely on setuptools for C extensions. pdm's support claims to be fully developed and documented. poetry claims nothing, and doesn't document it.


https://pdm.fming.dev/latest/pyproject/build/#build-platform-specific-wheels
-->

<!-- https://github.com/pyOpenSci/python-package-guide/pull/23#discussion_r1071541329
ELI: A complex build could mean running a python script that processes some data file and produces a pure python module file.

Probably not common in the scientific community specifically, but I've seen quite a few setup.py files that contain custom build stages which e.g. build gettext locale catalogs.

The main point is that it is more "complex" than simply copying files or directories as-is into the built wheel.
-->
132 changes: 1 addition & 131 deletions package-structure-code/python-package-build-tools.md
@@ -8,8 +8,6 @@ We focus on pure-python packages in this guide. However, we also
highlight tools that currently support packages with C/C++ and other language
extensions.

<!-- TODO: create build tool selection diagram - https://www.canva.com/design/DAFawXrierc/O7DTnqW5fmMEZ31ao-TK9w/edit -->

:::{figure-md} fig-target

<img src="../images/python-package-tools-decision-tree.png" alt="Figure showing... will finish this once we are all happy with the figure and it's not going to change more..." width="700px">
@@ -53,44 +51,6 @@ and **scikit-build** support complex builds with custom steps. If your
build is particularly complex (i.e. you have more than a few `C`/`C++`
extensions), then we suggest you use **meson.build** or **scikit-build**.

<!--
### Build front-ends

Build front-ends have a user-friendly interface that allows you to perform
common Python packaging tasks such as building your package, creating an
environment to run package tests and build documentation, and pushing to PyPI.

For instance, you can use **Flit**, **Hatch**, **Poetry** and **PDM** to both build your
package and to publish your package to PyPI (or test PyPI). However, if you
want a tool that also supports environment management and versioning your package,
then you might prefer to use **Hatch**, **Poetry** or **PDM**.

Using a tool like **Flit**, **Hatch**, **Poetry** or **PDM** will simplify your workflow.

Example to build your package with **Flit**:

`flit build`

Example to publish to PyPI:
`flit publish --repository testpypi`

In the Python package build space **setuptools** is
the "OG" - the original tool that everyone used to use.
With a tool like `setuptools` you have the flexibility
to publish pure Python packages and packages with custom build steps. However, you will also need to use other tools. For example, you will use `twine` to publish to PyPI.

## An ecosystem of Python build tools

Below we introduce several of the most commonly used
Python packaging build tools. Each tool has various
features that might make you choose to use it
or not use it. There is no right or wrong tool to use
as far as pyOpenSci is concerned. We are just trying to
help you find the tool that works best for
your workflow.
Example build steps using setuptools:
======= -->

### Python package build front-ends
Member Author

TODO: i just realized there is another big difference between flit, setuptools and the other three - some have an init workflow that creates your package structure for you. Whereas flit init for instance only creates the .toml file and adds a license. it relies on the user to create the structure.

also the way it creates the toml needs to be noted, as by default it assumes dynamic version and description. so if you try to do flit init and then build, it will fail by default unless you already have everything set up.

i'm not sure if the readme needs to be added as file = somewhere or what dynamic description means so need to look into this.

Member Author

this is particularly important for a new user. it would be nice to be able to create a structure, add code and install without running into errors that you don't understand initially.

Member Author

PDM does the same but it includes a README.md and a pyproject.toml only - no license file

Contributor

Hatch should automatically create and include the license for generated projects

Member Author

ok many thanks. i need to test hatch next. maybe that is what i'll do today!!

i'm working on examples of each tool here, @ofek https://github.com/pyOpenSci/examplePy hatch isn't there yet but it will be, it's just taking me time. each readme in that repo will then link to an online blog post about the tool.
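
On the question raised earlier in this thread about what a dynamic version and description mean for `flit init`: a minimal sketch, assuming Flit resolves a dynamic `version` from a module-level `__version__` and a dynamic `description` from the module docstring. The helper `missing_dynamic_fields` is hypothetical and written only for this illustration; it is not part of Flit. It handles plain assignments only, so an annotated `__version__: str = "..."` would need extra handling.

```python
import ast


def missing_dynamic_fields(module_source: str) -> list[str]:
    """List the dynamic fields that cannot be read from the module source."""
    tree = ast.parse(module_source)
    missing = []
    # Assumption: the description comes from the module docstring.
    if ast.get_docstring(tree) is None:
        missing.append("description")
    # Assumption: the version comes from a top-level `__version__ = ...`.
    has_version = any(
        isinstance(node, ast.Assign)
        and any(
            isinstance(target, ast.Name) and target.id == "__version__"
            for target in node.targets
        )
        for node in tree.body
    )
    if not has_version:
        missing.append("version")
    return missing


# A freshly created, empty package module versus one that is ready to build.
empty_module = ""
ready_module = '"""Tools for reading example data."""\n\n__version__ = "0.1.0"\n'

print(missing_dynamic_fields(empty_module))
print(missing_dynamic_fields(ready_module))
```

Under those assumptions, a build right after `flit init` would fail until the module has both a docstring and a `__version__`, which matches the behavior described in the comment.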


A packaging front-end tool refers to a tool that makes it easier for you to
@@ -214,17 +174,6 @@ questions:
NOTE: You can also use Hatch for non pure python builds but you will need to
write your own plugin for this support.

<!-- ### Build tools for Python packages with complex build steps
If your package is not pure Python, or it has complex build steps (or build
steps that you need to customize), then you should consider using:

* Setuptools
* Hatch
* PDM

These tools allow you to customize your workflow with build steps needed
to compile code. -->

## Python packaging tools summary

<!-- NOTE - add language around the front end means that you have less individual tools in your build - such as nox / make with hatch -->
@@ -379,9 +328,6 @@ Build your sdist and wheel distributions|✅| Hatch will build the sdist and whe

_\*\* There is some argument about this approach placing a burden on maintainers to create a custom build system. But others appreciate the flexibility_

<!-- QUESTION: Does hatch allow you to use other envt managers like conda?? i don't see that it does
so it might be similar to Poetry in that regard -->

### Why you might not want to use Hatch

There are a few features that hatch is missing that may be important for some.
@@ -427,7 +373,7 @@ Install your package in editable mode|✅|Poetry supports installing your packag
Build your sdist and wheel distributions|✅|Poetry will build your sdist and wheel distributions using `poetry build`
```

<!-- TODO: update this given responses here: https://github.com/python-poetry/poetry/discussions/7525 -->
<!-- TODO: responses here on poetry's future dev work: https://github.com/python-poetry/poetry/discussions/7525 -->

### Challenges with Poetry

@@ -516,79 +462,3 @@ when using setuptools. For instance:
\*Setuptools also will include all of the files in your package
repository if you do not explicitly tell it to exclude files using a
**MANIFEST.in** file
Contributor

Speaking of this file, I think we should note how difficult it is to choose precisely what goes inside source distributions

Member Author

yes. this was actually a pain point for me with our stravalib package. and it defaults to adding everything. do you have a suggestion for language to add here by chance? i'm not exactly sure what you mean by precisely as you can use that manifest file but i also haven't played with it extensively. i just saw what it did by default without a manifest file and was surprised

Contributor

I mean 2 things I suppose: one is that the pattern notation and directives are cumbersome to use in comparison to Git patterns and the other is that options in setup.py also affect the behavior of what gets included in the source distribution which makes it all the more confusing

Member Author

ok so i haven't actually worked on our manifest file yet.

one is that the pattern notation and directives are cumbersome to use in comparison to Git patterns

are you saying that there is some sort of regex needed to specify patterns in the MANIFEST.in file for setuptools that is cumbersome to use? vs what you'd use in a .gitignore file?

options in setup.py also affect the behavior of what gets included in the source distribution which makes it all the more confusing

this i'm less familiar with. can you please clarify so i can add a bit of language?
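
One way to see what actually ends up in a source distribution, as discussed in this thread, is to list the archive contents and flag anything outside the paths you expect. This is a minimal sketch using only the standard library; `sdist_members`, `unexpected_members`, and the project name `examplepy-0.1.0` are hypothetical names chosen for illustration, and the demo archive is built in memory rather than produced by a real build.

```python
import io
import tarfile


def sdist_members(sdist_bytes: bytes) -> list[str]:
    """Return the sorted file names stored in a .tar.gz sdist."""
    with tarfile.open(fileobj=io.BytesIO(sdist_bytes), mode="r:gz") as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())


def unexpected_members(names: list[str], allowed_prefixes: list[str]) -> list[str]:
    """Return the names that do not start with any allowed prefix."""
    return [n for n in names if not n.startswith(tuple(allowed_prefixes))]


def _demo_sdist() -> bytes:
    """Build a tiny in-memory archive that stands in for a real sdist."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name in (
            "examplepy-0.1.0/pyproject.toml",
            "examplepy-0.1.0/src/examplepy/__init__.py",
            "examplepy-0.1.0/notes-from-review.md",  # a file you did not mean to ship
        ):
            info = tarfile.TarInfo(name)
            info.size = 0  # empty file contents are enough for the demo
            archive.addfile(info, io.BytesIO(b""))
    return buffer.getvalue()


names = sdist_members(_demo_sdist())
extras = unexpected_members(
    names, ["examplepy-0.1.0/pyproject.toml", "examplepy-0.1.0/src/"]
)
print(extras)
```

For a real project you could pass the bytes of the archive in your `dist/` directory instead of the demo archive, then decide whether each flagged file needs an exclusion rule.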


<!-- From stefan: build, run tests on built version, load the built version into
Python (?how is this different from install??), make editable install, build
wheel, build sdist -->

<!--
The example below is taken from [this thread in GitHub](https://github.com/py-pkgs/py-pkgs/issues/95#issuecomment-1035584750).

```toml
[tool.poetry.dependencies]
python = ">=3.6" # This is muddled in as a dependency, while it's not like the others
numpy = ">=1.13.3"
typing_extensions = { version = ">=3.7", python = "<3.8" }

sphinx = {version = "^4.0", optional = true}
sphinx_book_theme = { version = ">=0.0.40", optional = true }
sphinx_copybutton = { version = ">=0.3.1", optional = true }
pytest = { version = ">=6", optional = true }
importlib_metadata = { version = ">=1.0", optional = true, python = "<3.8" } # TOML error to add an ending comma or new line, even if this gets long
boost-histogram = { version = ">=1.0", optional = true }

[tool.poetry.dev-dependencies]
pytest = ">=5.2" # All the optional stuff above doesn't help here!
importlib_metadata = {version = ">=1.0", python = "<3.8" }
boost_histogram = ">=1.0"

[tool.poetry.extras]
docs = ["sphinx", "sphinx_book_theme", "sphinx_copybutton"]
test = ["pytest", "importlib_metadata", "boost_histogram" ]
```

vs PDM

```toml
[project]
requires-python = ">=3.6"
dependencies = [
"numpy>=1.13.3",
"typing-extensions>=3.7; python_version<'3.8'",
]

# Only needed for extras
[project.optional-dependencies]
docs = [
"sphinx>=4.0",
"sphinx-book-theme>=0.0.40",
"sphinx-copybutton>=0.3.1",
]
test = [
"pytest>=6",
"importlib-metadata>=1.0; python_version<'3.8'",
"boost-histogram>=1.0",
]

# Only needed for "dev" installs
[tool.pdm.dev-dependencies]
dev = [
"pytest>=6",
"importlib-metadata>=1.0; python_version<'3.8'",
"boost-histogram>=1.0",
]
```

From Eli:

poetry: supports it (c extensions), but is undocumented and uses setuptools under the hood, they plan to change how this works and then document it
pdm-back-end: supports it, and documents it -- and also uses setuptools under the hood
hatchling: permits you to define hooks for you to write your own custom build steps, including to build C++ extensions


**PDM** does have some support for `C`/[`Cython`](https://cython.org/) extensions. [Click here to
learn more](https://pdm.fming.dev/latest/pyproject/build/#build-platform-specific-wheels). This functionality uses setuptools "under the
hood".


-->