forked from pytorch/pytorch
-
Notifications
You must be signed in to change notification settings - Fork 0
Proposal for a PyTorch conda "distribution" #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,391 @@ | ||
# A PyTorch conda "distribution" | ||
|
||
This proposal addresses the need for a PyTorch conda distribution, meaning a | ||
collection of integration-tested packages that can be installed from a single | ||
channel, to enable package authors to release packages that depend on PyTorch | ||
and let users install them in a reliable way. | ||
|
||
|
||
## Motivation and Scope | ||
|
||
For developers of libraries that depend on PyTorch, it is currently (Nov'20) | ||
quite difficult to express that dependency in a way that makes their package | ||
easily installable with `conda` (or `pip`) by end users. With the PyTorch | ||
ecosystem growing and the dependency graphs of sets of packages users use in | ||
a single environment becoming more complex, streamlining the package | ||
distribution and installation experience is important. | ||
|
||
Examples of packages for which there's interest in making them more easily | ||
available to end users: | ||
|
||
- [fastai](https://docs.fast.ai/): Jeremy Howard expressed interest, and | ||
plans to copy `pytorch` and other dependencies of fastai over to the `fastai` | ||
channel in case this proposal doesn't work out. | ||
- [fairseq](https://github.com/pytorch/fairseq): a fairseq developer inquired | ||
about being added to the `pytorch` channel | ||
[here](https://github.com/pytorch/builder/issues/563), and a conda-forge | ||
contributor wanted to package both PyTorch and fairseq in conda-forge, see | ||
[here](https://github.com/conda-forge/pytorch-cpu-feedstock/issues/7#issuecomment-688467743). | ||
- [TorchANI](https://github.com/aiqm/torchani): see a TorchANI user's recent | ||
attempt to add a conda-forge package | ||
[here](https://github.com/conda-forge/torchani-feedstock/pull/1). | ||
|
||
In scope for this proposal are: | ||
|
||
- Processes related to adding new packages to the `pytorch` conda channel. | ||
- CI infrastructure needed for integration testing and moving already built | ||
packages to the `pytorch` channel. | ||
|
||
_Note: using the `pytorch` channel seems like the most obvious choice for a | ||
single integration channel; using a new channel is also possible, it won't | ||
change the rest of this proposal materially._ | ||
|
||
Out of scope are: | ||
|
||
- Changes related to how libraries are built or packages for conda are created. | ||
- Updating PyTorch packaging in `defaults` or `conda-forge`. | ||
- Improvements to installing with pip or wheel builds. | ||
|
||
|
||
### The current state of affairs | ||
|
||
PyTorch is packaged in the `pytorch` channel; users must either add that | ||
channel to the channels list globally or in an environment (using, e.g., | ||
`conda config --env --add channels pytorch`), or add `-c pytorch` to every | ||
`conda` command they run. Note that the channels method is preferred over `-c | ||
pytorch` but installation instructions invariably use the latter, which can | ||
lead to problems when it's forgotten by the user at some point. | ||
|
||
PyTorch is also packaged in `defaults`, but it's really outdated (1.4.0 for | ||
CUDA-enabled packages, 1.5.0 for CPU-only). The `conda-forge` channel doesn't | ||
have PyTorch packages - there's a desire to add them, however it's unclear if | ||
and how that will happen. | ||
|
||
Authors of _pure Python packages_ tend to use their own conda channel to | ||
distribute their own package. Installation instructions will then have both | ||
the `pytorch` and their own channel in them. For example for fastai and | ||
BoTorch: | ||
|
||
``` | ||
conda install -c fastai -c pytorch fastai | ||
``` | ||
|
||
``` | ||
conda install botorch -c pytorch -c gpytorch | ||
``` | ||
|
||
When a user needs multiple packages, that becomes unwieldy quickly with each | ||
package adding its own channel. Note: alternatively, pure Python packages can | ||
choose to distribute on PyPI only (see the _PyPI, pip and wheels_ section | ||
further down) - Kornia is an example of a package that does this. | ||
|
||
Authors of _packages containing C++ or CUDA code_ which use the PyTorch C++ | ||
API have an additional issue: they need to release new package versions in | ||
sync with PyTorch itself, because there's no stable ABI that would allow | ||
depending on multiple PyTorch versions. For example, the torchvision | ||
`install_requires` dependency is determined like: | ||
|
||
```python | ||
pytorch_dep = 'torch' | ||
if os.getenv('PYTORCH_VERSION'): | ||
pytorch_dep += "==" + os.getenv('PYTORCH_VERSION') | ||
|
||
requirements = [ | ||
'numpy', | ||
pytorch_dep, | ||
] | ||
``` | ||
and its build script ensure a one-to-one correspondence of `pytorch` and | ||
`torchvision` versions of packages. | ||
|
||
The `pytorch` channel currently already contains other packages that depend | ||
on PyTorch. Those fall into two categories: needed dependencies (e.g., | ||
`magma-cuda`, `ffmpeg`) , and PyTorch-branded and Facebook-owned projects | ||
like `torchvision`, `torchtext`, `torchaudio`, `captum`, `faiss`, `ignite`, etc. | ||
See https://anaconda.org/pytorch/repo for a complete list. | ||
|
||
Those packages maintain their own build and packaging scripts (see | ||
[this comment](https://github.com/pytorch/builder/issues/563#issuecomment-722667815)), | ||
and the integration testing and uploading to the `pytorch` conda channel is done | ||
via scripts in the [pytorch/builder](https://github.com/pytorch/builder) repo. | ||
|
||
There's more integration testing happening already: | ||
- The `test_community_repos/` directory in the `builder` repo contains a | ||
significantly larger set of packages that's tested in addition to the packages | ||
that are distributed on the `pytorch` conda channel. | ||
- The [pytorch-integration-testing](https://github.com/pytorch/pytorch-integration-testing) | ||
repo contains tooling to test PyTorch release candidates. | ||
- An overview of integration test results from the `builder` repo (last updated Oct'19, | ||
so perhaps no longer maintained) can be found | ||
[here](http://ossci-integration-test-results.s3-website-us-east-1.amazonaws.com/test-results.html). | ||
|
||
|
||
## Usage and Impact | ||
|
||
### End users | ||
|
||
The intended outcome for end users is that they will be able to install many | ||
of the most commonly packages easily with `conda` from a single channel, | ||
e.g.: | ||
|
||
``` | ||
conda install pytorch torchvision kornia fastai mmf -c pytorch | ||
``` | ||
|
||
or, a little more complete: | ||
|
||
``` | ||
# Use a new environment for a new project | ||
conda create -n myenv | ||
conda activate myenv | ||
# Add channel to env, so all conda commands will now pick up packages | ||
# in the pytorch channel: | ||
conda config --env --add channels pytorch | ||
conda install pytorch torchvision kornia fastai mmf | ||
``` | ||
|
||
### Maintainers of packages depending on PyTorch | ||
|
||
The intended outcome for maintainers is that: | ||
|
||
1. They have clear documentation on how to add their package to the `pytorch` channel, | ||
including the criteria their packages should meet, how to run integration tests, | ||
and how to release new versions. | ||
2. They can declare their dependencies correctly | ||
3. They will still need their own channel or some staging channel to host packages | ||
before they get `anaconda copy`'d to the `pytorch` channel. | ||
4. They can provide a single install command to their users, `conda install mypkg -c pytorch`, | ||
that will work reliably. | ||
|
||
|
||
## Processes | ||
|
||
### Proposing a new package for inclusion | ||
|
||
Prerequisites for a package being considered for inclusion in the `pytorch` channel are: | ||
|
||
1. The package naturally belongs in the PyTorch ecosystem. I.e., PyTorch is a | ||
key dependency, and the package is focused on an area like deep learning, | ||
machine learning or scientific computing. | ||
2. All runtime dependencies of the package are available in the `defaults` or | ||
`pytorch` channel, or adding them to the `pytorch` is possible with a | ||
reasonable amount of effort. | ||
3. A working recipe for creating a conda package is available. | ||
|
||
A GitHub repository (working name `conda-distro`) will be used for managing | ||
proposals for new packages as well as integration configuration and tooling. | ||
To propose a new package, open an issue and fill out the instructions in the | ||
GitHub issue template. When a maintainer approves the request, the proposer | ||
can open a PR to that same repo to add the package to the integration | ||
testing. | ||
|
||
|
||
### Integration testing infrastructure | ||
|
||
The CI connected to the `conda-distro` repo has to do the following: | ||
|
||
1. Trigger on PRs that add or update an individual package, running the tests | ||
for that package _and_ downstream dependencies of that package. | ||
2. If tests for (1) are successful, sync the conda packages in question to | ||
the `pytorch` channel with `anaconda copy`. | ||
3. Provide a way to run the tests of all packages together. | ||
4. Send notifications if a package releases requires an update (e.g. a | ||
version bump) to a downstream package. | ||
|
||
The individual packages have to do the following: | ||
|
||
1. Ensure there are _upper bounds on dependency versions_, so new releases of | ||
PyTorch or another dependency cannot break already released versions of | ||
the individual package in question. Note that that does mean that a new | ||
PyTorch releases requires version bumps on existing packages - more detail | ||
in strategy will be needed here. | ||
2. Tests for a package should be _runnable in a standardized way_, via | ||
`conda-build --test`. This is easy to achieve via either a `test:` section | ||
in the recipe (`meta.yaml`) or a `run_test.py` file. See [this section of | ||
the conda-build docs](https://docs.conda.io/projects/conda-build/en/latest/resources/define-metadata.html#test-section) | ||
for details. An advantage of this method is that `conda-build` is already | ||
aware of channels and dependencies, so it should work with very little | ||
extra effort. | ||
|
||
|
||
### What happens when a new PyTorch release is made? | ||
|
||
For minor or major versions of PyTorch, new releases of downstream packages | ||
will also be necessary. A number of packages, such as `torchvision`, | ||
`torchaudio` and `torchtext`, are anyway released in sync. Other packages in | ||
the `pytorch` channel may need to be manually released via a PR to the | ||
`conda-distro` repo). | ||
|
||
Version constraints should be set such that a bugfix release of PyTorch does | ||
not require any new downstream package releases. | ||
|
||
|
||
### Dealing with packages that aren't maintained | ||
|
||
Proposing a package for inclusion in the `pytorch` channel implies a | ||
commitment to keep maintaining the package. There wil be a place to list one | ||
or more maintainers for each package so they can be pinged if needed. In case | ||
a package is not up-to-date or broken and it does not get fixed, after a | ||
certain duration (length TBD) it may be removed from the channel. | ||
|
||
|
||
## Alternatives | ||
|
||
### Conda-forge | ||
|
||
The main alternative to making the `pytorch` channel an integration channel | ||
that distributes many packages that depend on PyTorch is to have a | ||
(GPU-enabled) PyTorch package in conda-forge, and tell users and package | ||
authors that that is the place to go. It will require working with | ||
conda-forge in order to ensure that the `pytorch` package is of high quality, | ||
either by copying over the binaries from the `pytorch` channel or by | ||
migrating recipes and keeping them in sync. See | ||
[this very long discussion](https://github.com/conda-forge/pytorch-cpu-feedstock/issues/7) | ||
for details (and issues). | ||
|
||
Advantages of this alternative are: | ||
|
||
- Conda-forge has a lot of packages, so it will be easier to install PyTorch | ||
in combination with other non-deep learning packages (e.g. the geo-science | ||
stack). | ||
- Conda-forge already has established tools and processes for adding and | ||
updating them. Which means it's less likely for there to be issues with | ||
dependencies (e.g. packages with many or unusual dependencies may not be | ||
accepted into the `pytorch` channel, while `conda-forge` will be fine with | ||
them). | ||
- Users are likely already familiar with using the `conda-forge` channel. | ||
|
||
Disadvantages of this alternative are: | ||
|
||
- As of today, conda-forge doesn't have GPU hardware. Building is stil | ||
possible using CUDA stubs, however testing cannot really be done inside CI, | ||
only manually (which is a pain, especially when having to test multiple | ||
hardware and OS platforms). | ||
_Note that there are packages that follow this approach (mostly without | ||
problems so far), for example `arrow-cpp` and `cupy`. To obtain a full list of packages, clone https://github.com/conda-forge/feedstocks and run | ||
`grep 'compiler(' feedstocks/*/meta.yaml | grep cuda`._ | ||
- `conda-forge` and `defaults` aren't guaranteed to be compatible, so | ||
standardizing on `conda-forge` may cause problems for people who prefer | ||
`defaults`. | ||
- Exotic hardware support may be difficult. PyTorch has support for TPUs (via | ||
XLA), AMD ROCm, Linux on ARM64, Vulkan, Metal, Android NNAPI - this list | ||
will continue to grow. Most of this is experimental and hence not present | ||
in official binaries (and/or in the C++/Java packages which aren't | ||
distributed with conda), but this is likely to change and present issues | ||
with compilers or dependencies not present in conda-forge. | ||
For more details, see [this comment by Soumith](https://github.com/conda-forge/pytorch-cpu-feedstock/issues/7#issuecomment-538253388). | ||
- Release coordination is more difficult. For a PyTorch release, packages for | ||
`pytorch`, `torchvision`, `torchtext`, `torchaudio` will all be built | ||
together and then released. There may be manual quality assurance steps | ||
before uploading the packages. | ||
Building a set of packages like that depend on each other and releasing | ||
them in a coordinated fashion is hard to do on conda-forge, given that if | ||
everything is in feedstocks, the new pytorch package must already be | ||
available before the next build can start. It may be possible to do this | ||
with channel labels (build sequentially, then move all packages to the | ||
`main` label at once), but either way all the released artifacts will be | ||
publicly visible before the official release. | ||
|
||
Other points: | ||
|
||
- If the PyTorch team does not package for conda-forge, someone else will do | ||
that at some point. | ||
- Conda-forge no longer uses a single compiler toolchain for all packages it | ||
builds for a given platform - it is now possible to use a newer compiler, | ||
which itself is built with an older glibc/binutils (that does need to be | ||
common). See | ||
[this example](https://github.com/conda-forge/omniscidb-feedstock/blob/master/recipe/conda_build_config.yaml) | ||
for how to specify using GCC 8. So not having a recent enough compiler | ||
available is unlikely to be a relevant concern. | ||
- Mirroring packages in the `pytorch` channel to the `conda-forge` channel | ||
would alleviate worries about the disadvantages here, however there's no | ||
conda-forge tooling currently to verify ABI compatibility of the packages, | ||
which is the main worry of the conda-forge team with this approach. | ||
|
||
|
||
### DIY for every package | ||
|
||
Letting authors of every package depending on PyTorch find their own solution | ||
is basically the status quo of today. The most likely outcome longer-term is | ||
that PyTorch plus those packages depending on it will be packaged in | ||
conda-forge independently. At that point there are two competing `pytorch` | ||
packages, one in the `pytorch` and one in the `conda-forge` channel. And | ||
users who need a prebuilt version of other packages not available in the | ||
`pytorch` channel will likely migrate to `conda-forge`. | ||
|
||
The advantage is: no need to do any work to implement this proposal. The | ||
disadvantage is: depending on PyTorch will remain difficult for downstream | ||
packages. | ||
|
||
|
||
## Related work and issues | ||
|
||
### Conda channels | ||
|
||
Mixing multiple conda channels is rarely a good idea. It isn't even | ||
completely clear what a channel is for, opinions of conda and conda-forge | ||
maintainers differ - see | ||
https://github.com/conda-forge/conda-forge.github.io/issues/883. | ||
|
||
|
||
### RAPIDS | ||
|
||
RAPIDS has a really complex setup for distributing conda packages. Its install instructions currently look like: | ||
``` | ||
conda create -n rapids-0.16 -c rapidsai -c nvidia -c conda-forge \ | ||
-c defaults rapids=0.16 python=3.7 cudatoolkit=10.1 | ||
``` | ||
|
||
Depending on a user's config (e.g. having `channel_priority: strict` in | ||
`.condarc`), this may not work even in a clean environment. If one would add | ||
the `pytorch` channel as well, for users that need both PyTorch and RAPIDS, | ||
it's even less likely to work - the conda solver cannot handle that many | ||
channels and will fail to find a solution. | ||
|
||
|
||
### Cudatoolkit | ||
|
||
CUDA libraries are distributed for conda users via the `cudatoolkit` package. | ||
That package is only available in the `nvidia`, `defaults` and `conda-forge` | ||
channels. The license of the package prohibits redistribution, and an | ||
exception is difficult to obtain. Therefore it should not be added to the | ||
`pytorch` channel (also not necessary, obtaining it from `defaults` is fine). | ||
|
||
|
||
### PyPI, pip and wheels | ||
|
||
The experience installing PyTorch with `pip` is suboptimal, mainly because | ||
there's no way to control CUDA versions via `pip`, so the user gets whatever | ||
the default CUDA version is (10.2 at the time of writing) when running `pip | ||
install torch`. In case the user needs a different CUDA version or the | ||
CPU-only package, the install instruction looks like: | ||
``` | ||
pip install torch==1.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html | ||
``` | ||
There's the [pytorch-pip-shim](https://github.com/pmeier/pytorch-pip-shim) | ||
tool to handle auto-detecting CUDA versions and retrieving the right wheel. | ||
It relies on monkeypatching pip though, so it may break when new versions of | ||
pip are released. | ||
|
||
For package authors wanting to add a dependency on PyTorch, the above | ||
usability issue is a serious problem. If they add a runtime dependency on | ||
PyTorch (via `install_requires` in `setup.py` or via `pyproject.toml`), the | ||
only thing they can add is `torch` and there's no good way of signalling to | ||
the user that there's a CUDA version issue or how to deal with it. | ||
|
||
Finally note that `pip` and `conda` work together reasonably well, so for | ||
package authors that want to release packages that _do not contain C++ or | ||
CUDA code_, releasing on PyPI only and telling their users to install PyTorch | ||
with `conda` and their package with `pip` will work best. As soon as C++/CUDA | ||
code gets added, that's no longer reliable though. | ||
|
||
|
||
## Effort estimate | ||
|
||
TODO | ||
|
||
### Initial setup | ||
|
||
|
||
### Ongoing effort | ||
|
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.