Skip to content

Prevent dynamic loading of an already loaded engine#17055

Closed
eitkin-nvidia wants to merge 5 commits intoopenssl:masterfrom
eitkin-nvidia:master
Closed

Prevent dynamic loading of an already loaded engine#17055
eitkin-nvidia wants to merge 5 commits intoopenssl:masterfrom
eitkin-nvidia:master

Conversation

@eitkin-nvidia
Copy link

Use the engine's engine_id to check if the engine already exists before we attempt to dynamically load it (again). Loading an engine more than once will always fail, and often lead to crashes (#17023).

Hence, it is preferred that we avoid this double load in the first place.

Fixes #17023.

Signed-off-by: Eyal Itkin [email protected]

Use the engine's engine_id to check if the engine already exists
before we attempt to dynamically load it (again). Loading an engine
more than once will always fail, and often lead to crashes (#17023).

Hence, it is preferred that we avoid this double load in the first
place.

Signed-off-by: Eyal Itkin <[email protected]>
@openssl-machine openssl-machine added the hold: cla required The contributor needs to submit a license agreement label Nov 17, 2021
@eitkin-nvidia
Copy link
Author

Sorry for the mess with git, this is the revision of #17025.

I am still in the process of submitting a CLA, and will update as soon as I have one.
Thanks for your understanding.

@petrovr
Copy link

petrovr commented Nov 17, 2021 via email

@eitkin-nvidia
Copy link
Author

I am not trying to solve a case in which a single configuration file specifies multiple load techniques, but a case in which the same configuration file is loaded by mistake more than once. This scenario is apparently way more common than expected, and I though to help fix it in a robust manner instead of sending programs to crash whenever an API change moved the invocation of CONF_modules_load_file().

As I mentioned in #17023, there are countless examples, including popular programs such wget and curl, of programs that by mistake cause the configuration to be loaded twice. These cases lead to segmentation faults in said programs (when engines are used), and #9481 mentions that they should be able to load the config more than once without crashing.

In a recent check Canonical found, on top of wget and curl, that there are at least 4 additional projects that they keep track of that probably suffer from the same bug.

Use the engine's engine_id to check if the engine already exists
before we attempt to dynamically load it (again). Loading an engine
more than once will always fail, and often lead to crashes (#17023).

Hence, it is preferred that we avoid this double load in the first
place.

Signed-off-by: Eyal Itkin <[email protected]>
@eitkin-nvidia
Copy link
Author

As I said earlier, I apologize for the git mess in the pull request, once it will undergo all revisions, I will prepare a tidy version. Also still working on the CLA on my side.

@petrovr
Copy link

petrovr commented Nov 17, 2021 via email

@eitkin-nvidia
Copy link
Author

My sample, like the rest of the examples I've linked to in #17025, are real-life examples from the GitHub of real engines:

We can argue that all these engines use "broken configuration!" or understand that this is how clients of OpenSSL understood that engines should be used, and that braking support for this syntax will probably break 90% of the engines deployed worldwide.

@petrovr I really am trying to help fix a communication issue between engine developers today, and OpenSSL, and I don't fully understand why this is the tone that I get back.

I worked late hours trying to debug a curl and wget issue which caused them both to crash when engines are used. The bug was a result of an API change in OpenSSL that was apparently miscommunicated to many 3rd party developers (wget and curl included) and that Canonical weren't aware of. When checking this issue I found that loading the configuration twice is a repeated issue encountered by engine developers, and the engine devs are blamed that the engine crashes, even though the fault is on the program that used OpenSSL and by mistake loaded the config twice.

I even linked in #17023 to a case from 11 years ago (!) of a crash due to this engine API, and I am now trying to help you fix it so the API will be robust and this error will stop popping to the surface every year. I'm trying to help and resolve this issue, and saying that all engine configs today are "broken!" isn't helping us move forward.

@petrovr
Copy link

petrovr commented Nov 17, 2021 via email

Will squash together all commits when the final revision will be
approved.

Signed-off-by: Eyal Itkin <[email protected]>
goto err;
if (!ENGINE_ctrl_cmd_string(e, "SO_PATH", dynamic_path, 0))
goto err;
if (!ENGINE_ctrl_cmd_string(e, "ID", engine_id, 0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible that engine_id is NULL here. It is probably better not to call in this case.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it can happen, see line 78 - name = skip_dot(name);, and there is even a debug print of the name in it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, but then I dont see why you do this at all.
even this example does not need this complication:

[engine_section]
ibmca = ibmca_section

[ibmca_section]
# The openssl engine path for ibmca.so.
# Set the dynamic_path to where the ibmca.so engine
# resides on the system.
dynamic_path = /usr/local/lib/ibmca.so
engine_id = ibmca
init = 1

the name is ibmca from the beginning and indeed the
engine_id = ibmca is superfluous but benign.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, our configure file could do something like that:

openssl_conf = ossl_cnf

[ ossl_cnf ]
engines = eng_cnf

[ eng_cnf ]

test = xxx

[ xxx ]
a.dynamic_path = /usr/local/lib/lib1.so
b.dynamic_path = /usr/local/lib/lib2.so
c.dynamic_path = /usr/local/lib/lib3.so

works now, but breaks with this patch.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for feedback, I went over your examples, alongside additional configuration files, and now I better understand the configuration types available in the .conf file.

If split to cases, we have the following cases and current (before my fix) behavior:

  1. A single engine, using only "engine_id"
  2. A single engine, using only "dynamic_path"
  3. A single engine, using "engine_id" and then "dynamic_path" - engine_id won't be used at all
  4. A single engine, using "dynamic_path" and then "engine_id" - engine_id won't be used at all
  5. Multiple engines in same section, using only "dynamic_path"
  6. Multiple engines in same section - "engine_id" would only be used if first, and init/ctrl happens before the rest of the engines

The init sequence and refcount tracking of the engines makes me believe that either there are bugs when multiple engines (5 and 6) are used in the same section, or this case wasn't meant to be supported at all. The init/ctrl will only work if they occur directly after every engine, and the chances for undefined init behavior are very high. In addition, the reference counting for such engines would be flawed (too high) as only a single engine will get decremented at the end of the function.

All in all, one can load multiple engines, but it is hard to say it will work well in regards to refcounts and init of the engines, and with very high probability "engine_id" (if passed) won't ever be used. I agree that in such cases, "engine_id" isn't well defined anyway, and shouldn't be used.

In the first case, we know the "engine_id", and "ENGINE_by_id()" will use the "ID" command for the dynamic load. Hence, the dynamic load could easily check the engine by name and avoid duplications.

In the second case, we can't really "know" in advance what will be the engine ID, and I agree the dynamic load shouldn't use the "ID" cmd.

In all other cases, "engine_id" won't ever be used, unless the flow is "engine_id", "init", "dynamic_path", and I don't think it was ever meant to be supported. Hence, I argue that in cases 3 and 4, "engine_id" is actually used by users as a "hint" and we could use it when loading the engine by supplying the "ID" command.

If going according to this logic, the only case in which we can insert errors, is in cases 3 and 4: if for some reason there is a mismatch between the "engine_id" and the one of the real loaded engine, and I don't find it very likely. In the multi-load scenario (in a single section), or in case 2, the only thing we lose is the protection against double-load. As the entire fix is a best-effort mechanism, and considering the circumstances, I think it is something we can live with.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the documentation (https://www.openssl.org/docs/man1.1.1/man5/config.html) each section is an "engine section", and only a single engine is supported per such section. This means the example you used above isn't supported, and thus cases 5 and 6 weren't supported even before my fix.

Again, this makes sense in regards to the init logic, and to the reference count issues I pointed to above.

According to the same documentation, the engine name is indeed set by the configuration section (again, singularity), and can be overriden by "engine_id". Hence, in case 2 we still know what the engine ID should be, and the dynamic engine could use it for the check.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I start to agree with @petrovr
A perfectly valid configuration will just use "dynamic_path", and not necessarily have an "engine_id" at all.
The section name may or may not be identical with the engine id, this name is currently just ignored.
Now this configuration will start to fail, while other really invalid configurations will still fail, but more gracefully.

Signed-off-by: Eyal Itkin <[email protected]>
@bernd-edlinger
Copy link
Member

@eitkin-nvidia see #17073 for how I would fix the duplicate engine loading.

@eitkin-nvidia
Copy link
Author

I saw your suggestion and I think it is indeed better than mine. Aborting this pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

hold: cla required The contributor needs to submit a license agreement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Engine crashes when loading the configuration more than once

5 participants