Skip to content

Support Yarn v2#1

Closed
ocavue wants to merge 1 commit intouhop:masterfrom
ocavue-forks:yarn_cache
Closed

Support Yarn v2#1
ocavue wants to merge 1 commit intouhop:masterfrom
ocavue-forks:yarn_cache

Conversation

@ocavue
Copy link
Copy Markdown

@ocavue ocavue commented Dec 21, 2020

I'm using Yarn and I found that Yarn has to rebuild node-re2 every time in the CI pipeline. The reason is that Yarn doesn't provide the environment variable npm_package_json. In this PR, I use the environment variable PWD to support Yarn (and also other package managers).

@uhop uhop self-assigned this Dec 21, 2020
@uhop uhop added the enhancement New feature or request label Dec 21, 2020
@uhop
Copy link
Copy Markdown
Owner

uhop commented Dec 21, 2020

I am not sure I follow the fix. Line 141 is used only on NPM >= 7. At this version, NPM has changed how it passes over package.json values — I populate them myself as environment variables. In older versions and in yarn those environment variables are already set. I am not sure your fix is correct there. At best it does nothing.

Could you do more research on that?

@ocavue
Copy link
Copy Markdown
Author

ocavue commented Dec 21, 2020

Thanks for your quick response.

I populate them myself as environment variables

According to the document from here, it seems that npm run should set the environment variable npm_package_json. I didn't find any code related to npm_package_json in the codebase of node-re2.

I will do some more research on this. I will update my result soon in this issue.

@ocavue ocavue changed the title Support Yarn Support Yarn v2 Dec 21, 2020
@ocavue
Copy link
Copy Markdown
Author

ocavue commented Dec 21, 2020

After some experiments, this issue only occurs on Yarn v2. I built a repository to reproduce this issue, you can run the following command to check more details.

$ docker ps # make sure that the docker is running 
$ git clone https://github.com/ocavue/node-re2-cache-issue 
$ cd node-re2-cache-issue  
$ make run > result.log
$ cat result.log

I test some package managers and the result is as below:

package manager time to install node-re2 has environment variable npm_package_github has environment variable npm_package_json
[email protected] 0m8.086s yes no
[email protected] 0m5.628s yes no
[email protected] 0m38.269s no no
[email protected] 0m4.892s yes no

BTW, I also collect all environment variables for these four tools when they run bin/install-from-cache.js and put them at here.

To summarize, Yarn v2 needs to recompile node-re2 because it doesn't provide npm_package_github nor npm_package_json in environement variable.


This PR should be able to fix this issue since all four package manager provided the correct package.json path by reading the environment variable PWD:

$ cat result.log | grep 'PWD + package.json'

# npm
[DEBUG] [install-artifact-from-github]  PWD + package.json: /data/node-re2-cache-issue/npm-example/node_modules/re2/package.json exists: true

# yarn v1
[DEBUG] [install-artifact-from-github]  PWD + package.json: /data/node-re2-cache-issue/yarn-example/node_modules/re2/package.json exists: true

# yarn v2
➤ YN0000: │ re2@https://github.com/ocavue/node-re2.git#commit=ae4ea3e9c933d4f99d00dc89dd957a76194b210a STDOUT [DEBUG] [install-artifact-from-github]  PWD + package.json: /data/node-re2-cache-issue/yarnv2-example/node_modules/re2/package.json exists: true

# pnpm 
.../node_modules/re2 install: [DEBUG] [install-artifact-from-github]  PWD + package.json: /data/node-re2-cache-issue/pnpm-example/node_modules/.pnpm/github.com/ocavue/node-re2@ae4ea3e9c933d4f99d00dc89dd957a76194b210a/node_modules/re2/package.json exists: true
codespace ➜ ~/workspace/node-re2-cache-issue (master) 

@uhop

@ocavue
Copy link
Copy Markdown
Author

ocavue commented Dec 29, 2020

@uhop Hi. Is there anything I can do to push forward this PR? Thanks in advanced.

@uhop
Copy link
Copy Markdown
Owner

uhop commented Dec 30, 2020

yarn v2 is not released yet. My understanding is that it breaks the npm convention of using npm_* environment variables to access package.json and its content.

Your fix relies on PWD, which is incorrect. For example, I created a script called show.js in a directory called zzz, which prints PWD. Now I go to its parent directory and print a value of PWD by calling the script:

yarn --cwd zzz node show.js

And it prints the name of the parent directory, which is normal behavior for a shell.

My point is that PWD is not a reliable way to find package.json. It may work for you, but it is easy to create an environment where it will fail. If you find a more reliable way to do it, I'll develop the fix myself, but as is it is a no go.

npm solves this problem by setting environment variables. To read up on npm_* variables:

PS: yarn v1 emulates npm <= 6. Interesting to see what yarn v2 does.

@uhop
Copy link
Copy Markdown
Owner

uhop commented Dec 30, 2020

That's what I see in my console with npm v7:

$ npm run show

> show
> node show.js

npm_command                   run-script
npm_config_cache              /home/eugene/.npm
npm_config_globalconfig       /home/eugene/.nvm/versions/node/v15.5.0/etc/npmrc
npm_config_init_module        /home/eugene/.npm-init.js
npm_config_metrics_registry   https://registry.npmjs.org/
npm_config_node_gyp /home/eugene/.nvm/versions/node/v15.5.0/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js
npm_config_prefix             /home/eugene/.nvm/versions/node/v15.5.0
npm_config_user_agent         npm/7.3.0 node/v15.5.0 linux x64
npm_config_userconfig         /home/eugene/.npmrc
npm_execpath                  /home/eugene/.nvm/versions/node/v15.5.0/lib/node_modules/npm/bin/npm-cli.js
npm_lifecycle_event           show
npm_lifecycle_script          node show.js
npm_node_execpath             /home/eugene/.nvm/versions/node/v15.5.0/bin/node
npm_package_json              /home/eugene/Work/temp/zzz/package.json
npm_package_name              zzz

yarn v2:

$ yarn run show
npm_config_user_agent  yarn/2.4.0 npm/? node/15.5.0 linux x64
npm_execpath           /tmp/xfs-22a53eee/yarn
npm_lifecycle_event    show
npm_node_execpath      /tmp/xfs-22a53eee/node

I would suggest to go to yarn v2 project and file a request to add npm_package_json as per nvm v7. It is a trivial code to add for them, and it will help a lot of projects including re2.

@ocavue
Copy link
Copy Markdown
Author

ocavue commented Dec 30, 2020

yarn v2 is not released yet.

yarn v2 is released. However, since it breaks so many things from v1 (this issue itself is an example 😂 ), the yarn teams encourage users to install yarn v1 globally and enable yarn v2 for each project.

Yarn v2 document: You've probably remarked the global Yarn is from the "Classic" line (1.x). This is expected! One extra perk of this system is that projects configured for Yarn 1 will keep using it instead of suddenly having to migrate to the 2.x configuration format.


My understanding is that it breaks the npm convention of using npm_* environment variables to access package.json and its content.

I agree.

I would suggest to go to yarn v2 project and file a request to add npm_package_json as per nvm v7.

There is already an issue in the Yarn v2 repository about this: yarnpkg/berry#1659. It seems that the yarn team doesn't want to add these environment variables to yarn v2.


My point is that PWD is not a reliable way to find package.json. It may work for you, but it is easy to create an environment where it will fail.

I agree with this view that PWD is not very reliable. Especially considering that I have not tested the compatibility of this method on Windows.

I was wondering if there was some way to have re2 explicitly pass the current information to install-artifact-from-github instead of having install-artifact-from-github try to find re2's package.json itself.

@ocavue
Copy link
Copy Markdown
Author

ocavue commented Dec 30, 2020

I have two new proposals (or vague ideas) that don't depend on a special package manager or a special operating system.

proposal 1

Use Node'js __dirname instead of environment variables from the package manager to get the path of package.json.

// re2/package.json
{
  "name": "re2",
  "scripts": {
    "install": "node -c 'require(`install-artifact-from-github`)({dirname: __dirname, artifact: `build/Release/re2.node`, hostVar: `RE2_DOWNLOAD_MIRROR`})' || npm run rebuild",
  }
  ...
}

It seems that this method is not affected by --cwd, but I'm not sure it's correct in all cases.

➜  cat package.json
{
  "name": "npm_package_test",
  "version": "0.0.0",
  "scripts": {
    "test": "node -e 'console.log(require(`path`).resolve(__dirname))'"
  }
}
➜  yarn run test
yarn run v1.22.10
$ node -e 'console.log(require(`path`).resolve(__dirname))'
run /Users/ocavue/code/play/yarn_v2_env

➜  yarn --cwd zzz test
yarn run v1.22.10
$ node -e 'console.log(require(`path`).resolve(__dirname))'
/Users/ocavue/code/play/yarn_v2_env

proposal 2

Explicitly pass the version and github repository position to install-from-cache. The version and script.install in re2's package.json need to be updated synchronously, which means an extra step to the publishing process.

// re2/package.json
{
  "name": "re2",
  "version": "1.15.9",
  "scripts": {
    "install": "install-from-cache --version 1.15.9 --github uhop/node-re2 --artifact build/Release/re2.node --host-var RE2_DOWNLOAD_MIRROR || npm run rebuild",
  }
  ...
}

@uhop Do you have any ideas?

@uhop
Copy link
Copy Markdown
Owner

uhop commented Dec 31, 2020

It seems that the yarn team doesn't want to add these environment variables to yarn v2.

Oops. 8-0

@uhop
Copy link
Copy Markdown
Owner

uhop commented Dec 31, 2020

I am leaning towards the "proposal 2" direction + your PWD fix. The latter can generate false negatives, but it is okay because in the worst case we will rebuild. Let me meditate on that and do some experiments.

@viceice
Copy link
Copy Markdown

viceice commented May 3, 2022

this is the env for yarn v3.1

YARN_VERSION=3.1.0
HOME=/home/user
OLDPWD=/test
PATH=/tmp/xfs-c76d2d0c:/home/user/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/test/a/node_modules/re2
COREPACK_ROOT=/home/user/corepack/0.10.0/lib/node_modules/corepack
BERRY_BIN_FOLDER=/tmp/xfs-c76d2d0c
INIT_CWD=/test/a
PROJECT_CWD=/test/a
npm_execpath=/tmp/xfs-c76d2d0c/yarn
npm_node_execpath=/tmp/xfs-c76d2d0c/node
npm_package_name=re2
npm_package_version=1.0.0
npm_config_user_agent=yarn/3.1.0 npm/? node/16.15.0 linux x64
npm_lifecycle_event=install

trimed some unrelated env vars

@viceice
Copy link
Copy Markdown

viceice commented May 3, 2022

can we have this fix until yarn has a proper fix :-/ as this is blocking all our projects using re2, because we can't have (and don't want) python to rebuild

@uhop
Copy link
Copy Markdown
Owner

uhop commented Dec 19, 2022

It looks like Yarn has the fix.

@uhop uhop closed this Dec 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants