
include pip installs in `conda env export --from-history`

Open nblauch opened this issue 6 years ago • 33 comments

Hi -

I like the idea of the --from-history flag for conda env export for cross-platform compatibility and simpler-looking .yml files. However, it does not export the pip-installed packages, as conda env export normally would without the --from-history flag. Since pip is often still used to install packages not available on conda channels, this greatly reduces the usefulness of the --from-history flag.

I'm wondering whether it would be possible to track the history of pip installs. If it is difficult to do this in the same way as for conda packages, I wonder if there could be an additional argument, e.g. '--include-pip-in-history', that would just append all the pip-installed packages to the output of conda env export when --from-history is used.

Thanks a lot Nick

nblauch avatar Jan 23 '20 17:01 nblauch

This is definitely a needed feature.

But more generally it'd be nice to have more control over what is and isn't emitted by conda env export. E.g. the name and prefix fields also often need to be deleted in my experience (unless I've been using conda create --file incorrectly).

gwerbin avatar Mar 30 '20 19:03 gwerbin

Any updates on this feature?

dshahrokhian avatar Jan 08 '21 14:01 dshahrokhian

Here's a script I've been using for this purpose: https://gist.github.com/gwerbin/dab3cf5f8db07611c6e0aeec177916d8

I'll try to block out some time to draft a PR that incorporates this behavior into conda env export.

gwerbin avatar Jan 08 '21 16:01 gwerbin

This would be nice to have since --from-history is used to create a cross-platform environment file and usually in that case you also want to replicate whatever pip-only packages you may have.

stephenonethree avatar Feb 05 '21 21:02 stephenonethree

Very keen to have this feature! 🙏

leej11 avatar Aug 17 '21 13:08 leej11

+1 on this feature!

dataframing avatar Jan 13 '22 21:01 dataframing

Aye aye, I could really use this right now.

sefabey avatar Feb 10 '22 19:02 sefabey

This would be great!

rajkundu avatar Mar 01 '22 07:03 rajkundu

+1 :smile:

askuric avatar Mar 25 '22 15:03 askuric

Would really benefit from this as well.

mvonpohle avatar Mar 30 '22 20:03 mvonpohle

This would be MUCH appreciated. Current conda export behavior is one of my main obstacles to cross-platform compatibility.

davidkartchner avatar Mar 31 '22 15:03 davidkartchner

+1.

bnaman50 avatar May 03 '22 02:05 bnaman50

+1

ThobiasAggerholm avatar May 05 '22 05:05 ThobiasAggerholm

Would also love to see this! If the implementation by @gwerbin is acceptable, shall I submit a PR to merge this in?

jeskowagner avatar May 05 '22 10:05 jeskowagner

+1

KhanMechAI avatar Jun 04 '22 00:06 KhanMechAI

+1

vaxenburg avatar Jun 08 '22 17:06 vaxenburg

I submitted a draft PR to address this issue, as well as @gwerbin's comment. This PR has functionality to include pip dependencies in conda env export --from-history output and adds a flag to remove the prefix from the result by doing conda env export --override-prefix. Removing the name is more involved and not currently in the works (also lower priority in my eyes). However! This PR is not yet complete. Please head to #11532 if you have spare capacity to address the outstanding issues (see tick boxes and description). I unfortunately will not be able to invest much more time into it, and it is not yet ready to be merged.

jeskowagner avatar Jun 09 '22 11:06 jeskowagner

Any updates on this from the team?

modem7 avatar Aug 02 '22 21:08 modem7

I did not hear from the team, but addressed the failing unit test just now. While I did not have time to address documentation, the PR should be in a mergeable state (at least on my system, tests pass now). Hoping we can get this merged in soon!

jeskowagner avatar Aug 05 '22 10:08 jeskowagner

Looking forward to when this feature is merged. Thank you for your effort @jeskowagner. Is your approach also able to filter the pip-installed packages down to the ones that were explicitly installed? That is, without the dependencies a (pip) package might pull in, the way --from-history does for conda packages.

fmfreeze avatar Aug 23 '22 11:08 fmfreeze

It follows the same logic regarding pip as conda env export does without --from-history. For example:

conda create -y -n test pip
conda activate test
pip install pandas
conda env export --from-history

Results in the following dependencies:

dependencies:
  - pip=22.2.2=pyhd8ed1ab_0
  - pip:
    - numpy==1.23.2
    - pandas==1.4.3
    - python-dateutil==2.8.2
    - pytz==2022.2.1
    - six==1.16.0

In other words, unfortunately all PyPI packages are automatically included. I see that pip has a pip list --not-required flag which would be what we're looking for, but I'm not sure how easy it would be to gain that functionality within conda. Will think about that.
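
A minimal sketch of calling that flag from Python, without importing anything from pip itself, might look like the following. It assumes the interpreter of the target environment is passed in as `python` and that pip is installed there. The helper names `parse_pip_list_json` and `pip_not_required` are made up for illustration and are not part of conda.

```python
import json
import subprocess
import sys


def parse_pip_list_json(text):
    """Turn the JSON printed by `pip list --format=json` into {name: version}."""
    return {entry["name"]: entry["version"] for entry in json.loads(text)}


def pip_not_required(python=sys.executable):
    """Run `pip list --not-required` with the given interpreter (hypothetical helper)."""
    result = subprocess.run(
        [python, "-m", "pip", "list", "--not-required", "--format=json"],
        capture_output=True,
        text=True,
        check=True,  # raise if pip exits with an error
    )
    return parse_pip_list_json(result.stdout)
```

Asking pip for `--format=json` avoids having to parse the column layout of the default output.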

jeskowagner avatar Aug 23 '22 12:08 jeskowagner

Part of the problem with Pip is that it still after all these years has no stable Python API, and users are specifically discouraged from trying to use any of its Python interfaces. I don't know if there's a way around that other than invoking Pip in a subprocess, and I don't know if that's considered acceptable within the Conda codebase.
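
For the narrower question of which installed packages nothing else depends on, one possible way around pip's missing API is the standard library's `importlib.metadata`. The sketch below is an assumption-laden illustration, not what pip or conda do: it must run inside the environment being inspected, it ignores environment markers other than extras (so a package may be wrongly treated as required), and all function names are hypothetical.

```python
import re
from importlib import metadata


def canonical(name):
    """Normalize a distribution name (lowercase, runs of -_. become -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement):
    """Return the normalized project name at the start of a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return canonical(match.group(1)) if match else None


def top_level(requires_by_name):
    """Names in the mapping that no other entry lists as a requirement."""
    required = set()
    for name, requirements in requires_by_name.items():
        for requirement in requirements:
            # Skip optional extras, which are not installed by default.
            if "extra ==" in requirement or "extra==" in requirement:
                continue
            dep = requirement_name(requirement)
            if dep and dep != canonical(name):
                required.add(dep)
    return sorted(n for n in map(canonical, requires_by_name) if n not in required)


def installed_requirements():
    """Map each installed distribution to its declared requirement strings."""
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken metadata
            result[name] = dist.requires or []
    return result
```

Calling `top_level(installed_requirements())` would list every distribution that nothing else requires, including ones conda installed, so the result would still need filtering.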

gwerbin avatar Aug 23 '22 13:08 gwerbin

@gwerbin That's a really good point you're making about using pip through a subprocess here. Regarding whether it's acceptable: it turns out that, to install dependencies from a requirements file that contains pip dependencies, subprocesses are the preferred way. In fact, there is a function here that we could use to call this subprocess.

That said, it looks like most functions in that same script are no longer used and in fact dysfunctional. It would be good to have a more senior developer chime in on whether it will be deprecated soon.

See what this function can do below. I first set up the environment:

conda create -y -n test2 pip
conda activate test2
pip install pandas

Then call the function:

import os
from conda_env.pip_util import pip_subprocess
out, err = pip_subprocess(["list", "--not-required"],"<...>envs/test2",os.getcwd()) # removed path
print(out)
# Package    Version
# ---------- ----------------------
# conda      4.13.0.post7+964ac75d8
# pandas     1.4.3
# pip        22.2.2
# setuptools 65.2.0
# wheel      0.37.1

So it's not perfect - it gives us an ugly string, and it does contain more than just pandas.
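
As a rough sketch of cleaning up that string, the table can be parsed into a dictionary and the unwanted entries dropped. Which names to exclude is an assumption the caller has to make; in the example above, conda, pip, setuptools and wheel would be the candidates. `parse_pip_table` and `drop_packages` are hypothetical helpers, not part of conda or pip.

```python
def parse_pip_table(text):
    """Parse the columnar output of `pip list` into a {name: version} dict."""
    packages = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the "Package Version" header, and the dashed rule.
        if len(parts) < 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        packages[parts[0]] = parts[1]
    return packages


def drop_packages(packages, exclude):
    """Return a copy of the dict without the excluded names (case-insensitive)."""
    excluded = {name.lower() for name in exclude}
    return {n: v for n, v in packages.items() if n.lower() not in excluded}
```

Applied to the output shown above with those four names excluded, only pandas would remain.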

jeskowagner avatar Aug 23 '22 14:08 jeskowagner

I played around with pip_subprocess a bit, which seems to be stable enough in my hands. With it, I have now submitted a new commit to the PR that will respect --from-history even for PyPI dependencies. Example:

conda create -y -n test pip pandas
conda activate test
pip install scikit-learn
conda env export --from-history --override-prefix

Yields

name: test
channels:
  - conda-forge
dependencies:
  - pandas=1.4.3=py310h3099161_0
  - pip=22.2.2=pyhd8ed1ab_0
  - pip:
    - scikit-learn==1.1.2

If you leave out the --from-history flag, you get the following:

Output
name: test
channels:
  - conda-forge
dependencies:
  - bzip2=1.0.8=h0d85af4_4
  - ca-certificates=2022.6.15=h033912b_0
  - libblas=3.9.0=16_osx64_openblas
  - libcblas=3.9.0=16_osx64_openblas
  - libcxx=14.0.6=hce7ea42_0
  - libffi=3.4.2=h0d85af4_5
  - libgfortran=5.0.0=10_4_0_h97931a8_25
  - libgfortran5=11.3.0=h082f757_25
  - liblapack=3.9.0=16_osx64_openblas
  - libopenblas=0.3.21=openmp_h947e540_2
  - libsqlite=3.39.2=h5a3d3bf_1
  - libzlib=1.2.12=hfe4f2af_2
  - llvm-openmp=14.0.4=ha654fa7_0
  - ncurses=6.3=h96cf925_1
  - numpy=1.23.2=py310hf910466_0
  - openssl=3.0.5=hb81d4ab_1
  - pandas=1.4.3=py310h3099161_0
  - pip=22.2.2=pyhd8ed1ab_0
  - python=3.10.6=hc14f532_0_cpython
  - python-dateutil=2.8.2=pyhd8ed1ab_0
  - python_abi=3.10=2_cp310
  - pytz=2022.2.1=pyhd8ed1ab_0
  - readline=8.1.2=h3899abd_0
  - setuptools=65.2.0=py310h2ec42d9_0
  - six=1.16.0=pyh6c4a22f_0
  - tk=8.6.12=h5dbffcc_0
  - tzdata=2022c=h191b570_0
  - wheel=0.37.1=pyhd8ed1ab_0
  - xz=5.2.6=h775f41a_0
  - pip:
    - joblib==1.1.0
    - scikit-learn==1.1.2
    - scipy==1.9.0
    - threadpoolctl==3.1.0

I remain hopeful that using the subprocess is okay here, given that it is used in the installation routine already.

jeskowagner avatar Aug 23 '22 18:08 jeskowagner

What is the status of this? It looks like awesome work from @jeskowagner. But has it actually been merged into conda?

vsisl avatar May 31 '23 13:05 vsisl

Unfortunately PR #11532 is still pending feedback on implementation details by those with rights to merge the changes. At the moment these changes have therefore not been merged into conda.

jeskowagner avatar Jun 01 '23 11:06 jeskowagner

Any updates about this?

gemyhamed avatar Jun 21 '23 12:06 gemyhamed

Hi @gemyhamed, any movement on this would happen over at PR #11532, but at the moment my previous comment stands unchanged.

jeskowagner avatar Jun 21 '23 12:06 jeskowagner

I'm using this script to get the job done while the PR isn't merged. It does the trick:

# Extract installed pip packages
pip_packages=$(conda env export | grep -A9999 ".*- pip:" | grep -v "^prefix: ")

# Export conda environment without builds, and append pip packages
conda env export --from-history | grep -v "^prefix: " > environment.yml
echo "$pip_packages" >> environment.yml
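
The same idea can be sketched in Python for platforms without grep. This is an illustration under assumptions, not a tested replacement: it takes the text of both exports as strings, relies on `dependencies:` being the last key once `prefix:` is removed (as the shell version does), and uses made-up function names.

```python
def extract_pip_section(full_export):
    """Return the `- pip:` line and its nested entries from a full export."""
    lines = full_export.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == "- pip:":
            indent = len(line) - len(line.lstrip())
            section = [line]
            for follow in lines[i + 1:]:
                follow_indent = len(follow) - len(follow.lstrip())
                # Nested entries are indented deeper than the `- pip:` line.
                if follow.strip() and follow_indent > indent:
                    section.append(follow)
                else:
                    break
            return section
    return []  # no pip-installed packages in the export


def merge_exports(history_export, full_export):
    """Append the pip section of a full export to a --from-history export."""
    kept = [l for l in history_export.splitlines() if not l.startswith("prefix:")]
    while kept and not kept[-1].strip():  # drop trailing blank lines
        kept.pop()
    return "\n".join(kept + extract_pip_section(full_export)) + "\n"
```

Unlike the shell version, this appends nothing when the full export has no pip section.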

cmd8 avatar Jun 27 '23 07:06 cmd8

+1! this would be ideal!

I often have to use the '--from-history' flag because (although I use Linux) lots of people I work with use Windows or Mac, and the packages/versions don't always align across the three OSes when exported without the flag.

The '--from-history' flag allows for a somewhat more OS-agnostic environment file... but the fact that it doesn't include pip 'from-history' installs is really limiting.

jonathancosme avatar Jul 18 '23 05:07 jonathancosme