include pip installs in `conda env export --from-history`
Hi -
I like the idea of the --from-history flag for conda env export: it helps cross-platform compatibility and gives simpler-looking .yml files. However, it does not export pip-installed packages, as the command would without --from-history. Since pip is still often used to install packages not available on conda channels, this greatly reduces the usefulness of --from-history.
I wonder whether it would be possible to track the history of pip installs. If that is difficult to do in the same way as for conda packages, perhaps there could be an additional argument, e.g. '--include-pip-in-history', that would just append all the pip-installed packages to the output of conda env export when --from-history is used.
Thanks a lot Nick
This is definitely a needed feature.
But more generally it'd be nice to have more control over what is and isn't emitted by conda env export. E.g. the name and prefix fields also often need to be deleted in my experience (unless I've been using conda create --file incorrectly).
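Until conda env export offers that kind of control, the name and prefix fields can be dropped in a post-processing step. The following is only a minimal sketch, not part of conda: `strip_fields` is a hypothetical helper, and it assumes the export has been captured as text and that both fields appear as top-level `key: value` lines.

```python
def strip_fields(env_yaml: str, fields=("name", "prefix")) -> str:
    """Drop top-level `key: value` lines for the given keys from an exported environment file.

    Hypothetical helper for illustration; it works on plain text and does not parse YAML.
    """
    prefixes = tuple(f"{field}:" for field in fields)
    # Indented lines (list items under channels/dependencies) never start with a key name,
    # so only top-level fields are affected.
    kept = [line for line in env_yaml.splitlines() if not line.startswith(prefixes)]
    return "\n".join(kept) + "\n"
```

Working on text rather than parsed YAML keeps the sketch dependency-free, at the cost of assuming the simple layout that conda env export normally produces.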
Any updates on this feature?
Here's a script I've been using for this purpose: https://gist.github.com/gwerbin/dab3cf5f8db07611c6e0aeec177916d8
I'll try to block out some time to draft a PR that incorporates this behavior into conda env export.
This would be nice to have since --from-history is used to create a cross-platform environment file and usually in that case you also want to replicate whatever pip-only packages you may have.
Very keen to have this feature! 🙏
+1 on this feature!
Aye aye, I could really use this right now.
This would be great!
+1 :smile:
Would really benefit from this as well.
This would be MUCH appreciated. Current conda export behavior is one of my main obstacles to cross-platform compatibility.
+1.
+1
Would also love to see this! If the implementation by @gwerbin is acceptable, shall I submit a PR to merge this in?
+1
+1
I submitted a draft PR to address this issue, as well as @gwerbin's comment. This PR has functionality to include pip dependencies in conda env export --from-history output and adds a flag to remove the prefix from the result by doing conda env export --override-prefix. Removing the name is more involved and not currently in the works (also lower priority in my eyes).
However! This PR is not yet complete. Please head to #11532 if you have spare capacity to address the outstanding issues (see tick boxes and description). I unfortunately will not be able to invest much more time into it, and it is not yet ready to be merged.
Any updates on this from the team?
I did not hear from the team, but addressed the failing unit test just now. While I did not have time to address documentation, the PR should be in a merge-able state (at least on my system tests pass now). Hoping we can get this merged in soon!
Looking forward to when this feature is merged.
Thank you for your effort @jeskowagner. Is your approach also able to filter the pip-installed packages down to the ones that were explicitly installed?
That is, without the dependencies a (pip) package might pull in, the way --from-history does for conda packages.
It follows the same logic regarding pip as conda env export does without --from-history. For example:
conda create -y -n test pip
conda activate test
pip install pandas
conda env export --from-history
Results in the following dependencies:
dependencies:
  - pip=22.2.2=pyhd8ed1ab_0
  - pip:
    - numpy==1.23.2
    - pandas==1.4.3
    - python-dateutil==2.8.2
    - pytz==2022.2.1
    - six==1.16.0
In other words, unfortunately all PyPI packages are automatically included. I see that pip has a pip list --not-required flag, which would be what we're looking for, but I'm not sure how easy it would be to get that functionality within conda. Will think about that.
Part of the problem with Pip is that it still after all these years has no stable Python API, and users are specifically discouraged from trying to use any of its Python interfaces. I don't know if there's a way around that other than invoking Pip in a subprocess, and I don't know if that's considered acceptable within the Conda codebase.
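For reference, the subprocess route could look roughly like the sketch below. It is not taken from the conda codebase: `top_level_pip_packages` and `parse_pip_list_json` are hypothetical names, and the sketch assumes the target environment's Python executable is known. It relies only on pip's documented `list --not-required --format=json` command line. Note that pip reports every distribution it can see, so packages installed by conda would still show up in the result.

```python
import json
import subprocess


def parse_pip_list_json(payload: str) -> list[str]:
    """Turn the JSON emitted by `pip list --format=json` into `name==version` pins."""
    return [f"{pkg['name']}=={pkg['version']}" for pkg in json.loads(payload)]


def top_level_pip_packages(python_executable: str) -> list[str]:
    """Ask the pip belonging to `python_executable` for packages nothing else depends on.

    Hypothetical helper: it shells out to pip instead of importing pip's internal modules.
    """
    result = subprocess.run(
        [python_executable, "-m", "pip", "list", "--not-required", "--format=json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if pip exits with an error
    )
    return parse_pip_list_json(result.stdout)


# Example call (the path is a placeholder, not a real environment):
# top_level_pip_packages("/path/to/envs/test/bin/python")
```

Running pip as `python -m pip` with the environment's own interpreter avoids depending on whichever pip happens to be first on PATH.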
@gwerbin That's a really good point you're making about using pip through a subprocess here. Regarding whether it's acceptable: it turns out that, to install dependencies from a requirements file that contains pip dependencies, subprocesses are the preferred way. In fact, there is a function here that we could use to call this subprocess.
That said, it looks like most functions in that same script are no longer used and in fact dysfunctional. It would be good to have a more senior developer chime in on whether it will be deprecated soon.
See what this function can do below. I first set up the environment:
conda create -y -n test2 pip
conda activate test2
pip install pandas
Then call the function:
import os
from conda_env.pip_util import pip_subprocess
out, err = pip_subprocess(["list", "--not-required"], "<...>envs/test2", os.getcwd())  # removed path
print(out)
# Package    Version
# ---------- ----------------------
# conda      4.13.0.post7+964ac75d8
# pandas     1.4.3
# pip        22.2.2
# setuptools 65.2.0
# wheel      0.37.1
So it's not perfect - it gives us an ugly string, and it does contain more than just pandas.
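Both problems can be handled with a little parsing. The sketch below is only an illustration: `parse_pip_list_table` is a hypothetical helper, and its default ignore list (conda, pip, setuptools, wheel) is my assumption about which entries are environment tooling rather than packages the user asked for.

```python
def parse_pip_list_table(text: str, ignore=("conda", "pip", "setuptools", "wheel")) -> dict[str, str]:
    """Parse the column output of `pip list` into a {name: version} mapping.

    Hypothetical helper; the default `ignore` tuple is an assumption, not a conda convention.
    """
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the "Package Version" header, and the dashed rule below it.
        if len(parts) < 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        if parts[0].lower() not in ignore:
            packages[parts[0]] = parts[1]
    return packages
```

Applied to the output shown above, this would leave only the pandas entry. Asking pip for `--format=json` instead would avoid the column parsing altogether.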
I played around with pip_subprocess a bit, and it seems to be stable enough in my hands. Using it, I have now submitted a new commit to the PR that respects --from-history even for PyPI dependencies. Example:
conda create -y -n test pip pandas
conda activate test
pip install scikit-learn
conda env export --from-history --override-prefix
Yields
name: test
channels:
  - conda-forge
dependencies:
  - pandas=1.4.3=py310h3099161_0
  - pip=22.2.2=pyhd8ed1ab_0
  - pip:
    - scikit-learn==1.1.2
If you leave out the --from-history flag, you get the following:
Output
name: test
channels:
  - conda-forge
dependencies:
  - bzip2=1.0.8=h0d85af4_4
  - ca-certificates=2022.6.15=h033912b_0
  - libblas=3.9.0=16_osx64_openblas
  - libcblas=3.9.0=16_osx64_openblas
  - libcxx=14.0.6=hce7ea42_0
  - libffi=3.4.2=h0d85af4_5
  - libgfortran=5.0.0=10_4_0_h97931a8_25
  - libgfortran5=11.3.0=h082f757_25
  - liblapack=3.9.0=16_osx64_openblas
  - libopenblas=0.3.21=openmp_h947e540_2
  - libsqlite=3.39.2=h5a3d3bf_1
  - libzlib=1.2.12=hfe4f2af_2
  - llvm-openmp=14.0.4=ha654fa7_0
  - ncurses=6.3=h96cf925_1
  - numpy=1.23.2=py310hf910466_0
  - openssl=3.0.5=hb81d4ab_1
  - pandas=1.4.3=py310h3099161_0
  - pip=22.2.2=pyhd8ed1ab_0
  - python=3.10.6=hc14f532_0_cpython
  - python-dateutil=2.8.2=pyhd8ed1ab_0
  - python_abi=3.10=2_cp310
  - pytz=2022.2.1=pyhd8ed1ab_0
  - readline=8.1.2=h3899abd_0
  - setuptools=65.2.0=py310h2ec42d9_0
  - six=1.16.0=pyh6c4a22f_0
  - tk=8.6.12=h5dbffcc_0
  - tzdata=2022c=h191b570_0
  - wheel=0.37.1=pyhd8ed1ab_0
  - xz=5.2.6=h775f41a_0
  - pip:
    - joblib==1.1.0
    - scikit-learn==1.1.2
    - scipy==1.9.0
    - threadpoolctl==3.1.0
I remain hopeful that using the subprocess is okay here, given that it is used in the installation routine already.
What is the status of this? It looks like awesome work from @jeskowagner . But has it been actually merged into conda?
Unfortunately, PR #11532 is still pending feedback on implementation details from those with rights to merge the changes, so at the moment these changes have not been merged into conda.
Any updates about this?
Hi @gemyhamed, any movement on this would happen over at PR #11532, but at the moment my previous comment stands unchanged.
I'm using this script to get the job done while the PR isn't merged. It does the trick:
# Extract installed pip packages
pip_packages=$(conda env export | grep -A9999 ".*- pip:" | grep -v "^prefix: ")
# Export conda environment without builds, and append pip packages
conda env export --from-history | grep -v "^prefix: " > environment.yml
echo "$pip_packages" >> environment.yml
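For anyone without grep (e.g. on Windows), the same idea can be sketched in Python. This is an illustration rather than a tested replacement: `merge_pip_section` is a hypothetical helper that takes the text of both exports as strings, and like the shell version it assumes that dependencies is the last section before prefix in the --from-history output.

```python
def merge_pip_section(history_yaml: str, full_yaml: str) -> str:
    """Append the `- pip:` block of a full export to a --from-history export.

    Hypothetical helper working on plain text; both arguments are the captured
    output of `conda env export`, with and without --from-history.
    """
    pip_block = []
    pip_indent = None
    for line in full_yaml.splitlines():
        indent = len(line) - len(line.lstrip())
        if pip_indent is None:
            # Look for the start of the pip section.
            if line.strip() == "- pip:":
                pip_indent = indent
                pip_block.append(line)
        elif line.strip() and indent > pip_indent:
            # Entries of the pip section are indented deeper than its header.
            pip_block.append(line)
        else:
            break  # first line after the pip section, e.g. `prefix: ...`
    # Drop the machine-specific prefix line from the history export.
    body = [line for line in history_yaml.splitlines() if not line.startswith("prefix:")]
    return "\n".join(body + pip_block) + "\n"
```

The two input strings could be captured with `subprocess.run(["conda", "env", "export", ...], capture_output=True, text=True)`. As with the shell script, every pip-installed package is appended, including dependencies of the ones you actually asked for.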
+1! this would be ideal!
I often have to use the '--from-history' flag because (although I use Linux) lots of people I work with use Windows or Mac, and the packages/versions don't always align across the three OSes when exported without the flag.
The '--from-history' flag allows for a somewhat more OS-agnostic environment file... but the fact that it doesn't include pip 'from-history' installs is really limiting.