
Update scoring api. #6575

Merged
dbczumar merged 98 commits into mlflow:branch-2.0 from tomasatdatabricks:tomas-update-scoring-api
Sep 27, 2022

Conversation

@tomasatdatabricks
Contributor

This PR makes the following changes to the scoring API:

Headers: the format flags used in MLflow 1.0 to specify the pandas encoding (split, records, or numpy split) are no longer recognized.
Input data:
The input is a dictionary with exactly one of the dataframe_split, dataframe_records, instances, or inputs fields.
Output data:
The output is a dictionary with a predictions field that contains the jsonized model output. The output itself remains unchanged.

Implementation-wise, we can no longer use the pandas.read_json function, since the JSON is nested. Instead, we parse the JSON using the built-in json library and turn the input into a pandas DataFrame. This means that the type hints are not applied until the data is parsed, and we have to cast the types explicitly.
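The parsing path described above can be sketched as follows. This is a minimal illustration, not the actual MLflow implementation: the payload shapes are assumed from the field names listed in the PR description, and the helper name is hypothetical.

```python
import json

import pandas as pd


def parse_scoring_payload(body, dtypes=None):
    """Parse a scoring request body with the built-in json module, then build a
    DataFrame and apply dtypes explicitly (since pandas.read_json no longer
    does the type coercion for us on the nested format)."""
    payload = json.loads(body)
    keys = {"dataframe_split", "dataframe_records", "instances", "inputs"}
    present = keys & payload.keys()
    if len(present) != 1:
        raise ValueError(f"Payload must contain exactly one of {sorted(keys)}")
    (key,) = present
    if key == "dataframe_split":
        # expects {"columns": [...], "data": [...]} as in pandas' split orient
        df = pd.DataFrame(**payload[key])
    elif key == "dataframe_records":
        df = pd.DataFrame.from_records(payload[key])
    else:
        # "instances" / "inputs": tensor-style input
        df = pd.DataFrame(payload[key])
    if dtypes:
        # type hints are applied only after parsing, hence the explicit cast
        df = df.astype(dtypes)
    return df


body = '{"dataframe_split": {"columns": ["a", "b"], "data": [[1, 2], [3, 4]]}}'
df = parse_scoring_payload(body, dtypes={"a": "float64"})
```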

Signed-off-by: tomasatdatabricks [email protected]

from mlflow.exceptions import MlflowException


def infer_and_parse_json_output(decoded_input):
Contributor Author

NOTE: I am not sure what the client is for or what the intended API is. I tried to preserve the existing behavior as closely as possible, but maybe we should change it?

Collaborator

when mlflow.pyfunc.spark_udf() is called with env_manager="virtualenv" or env_manager="conda", MLflow starts a model server and uses this client to send requests to it in order to perform inference; this is the long-awaited "restore model environments in spark UDF" feature that we talked about ~ 3 years ago :). It isn't a public API, so changes are okay as long as Spark UDF continues to work.

cc @WeichenXu123

Collaborator

Yes. The scoring_server.client module is used in spark_udf with env_manager="virtualenv" or env_manager="conda".

if len(schema.inputs) == 1:
    dtypes = schema.numpy_types()[0]
if pandas_orient == "records":
    if not isinstance(decoded_input, list):
Contributor Author

NOTE: pandas is able to interpret almost anything as the records format, so the input does not have to be a list for pandas's sake. However, from our perspective, it is better to be more restrictive so there is less variance in the input encoding and it is easier to parse the input without pandas, e.g. when parsing logs.
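The restriction can be sketched like this (the helper name and error type are assumptions for illustration, not the exact MLflow code): pandas would happily build a frame from looser shapes, but the server rejects non-list records input up front.

```python
import pandas as pd


def parse_records(decoded_input):
    """Accept only a JSON list of records, even though pandas itself would
    accept other shapes; keeping the encoding narrow makes the payload
    parseable without pandas (e.g. when scanning logs)."""
    if not isinstance(decoded_input, list):
        raise TypeError("dataframe_records must be a list of records")
    return pd.DataFrame.from_records(decoded_input)


ok = parse_records([{"a": 1, "b": 2}, {"a": 3, "b": 4}])
try:
    # pandas could interpret a bare dict as records, but we reject it
    parse_records({"a": 1, "b": 2})
    rejected = False
except TypeError:
    rejected = True
```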

@dbczumar dbczumar self-requested a review August 29, 2022 06:23
local_model_path_on_executor,
workers=1,
install_mlflow=False,
install_mlflow=(mlflow_home is None),
Collaborator

Previously, we would never install MLflow. Now, it seems like we'll almost always install MLflow, since users don't typically set mlflow_home. Is this necessary? cc @WeichenXu123

Contributor Author

Oh, nice catch, it's inverted. I want to install mlflow only when MLFLOW_HOME is set. MLFLOW_HOME (not a very descriptive name in retrospect) is meant for development, so that you can test your local changes when running stuff in a conda environment. For example, in this case, the tests were failing because the conda environment was installing mlflow from pip and hence was expecting the old input.
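The intended (un-inverted) condition can be sketched as follows; the helper name is hypothetical, but the variable names mirror the diff above.

```python
import os


def should_install_mlflow(env=os.environ):
    """Install the local dev copy of mlflow into the restored environment only
    when MLFLOW_HOME points at a checkout; otherwise rely on the
    pip-installed mlflow. This is the corrected form of the inverted
    condition discussed above."""
    mlflow_home = env.get("MLFLOW_HOME")
    return mlflow_home is not None
```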

Collaborator

@WeichenXu123 WeichenXu123 Sep 5, 2022

Yes, I think we only need to set install_mlflow to true when running tests.

Comment on lines +1033 to +1034
# Used in test to force install lcoal version of mlflow when starting a model server
mlflow_home = os.environ.get("MLFLOW_HOME")
Collaborator

The impact of this appears to extend beyond tests: https://github.com/mlflow/mlflow/pull/6575/files#r959175274.

Suggested change
# Used in test to force install lcoal version of mlflow when starting a model server
mlflow_home = os.environ.get("MLFLOW_HOME")
# Used in test to force install local version of mlflow when starting a model server
mlflow_home = os.environ.get("MLFLOW_HOME")

Contributor Author

Commented above; hopefully that explains it. We can also rename MLFLOW_HOME to something like MLFLOW_DEV_PROJECT_HOME in a followup.

Collaborator

Thanks!


response_body = response["Body"].read().decode("utf-8")
return infer_and_parse_json_input(response_body)
return ScoringServerResponse.from_raw_json(response_body)
Collaborator

@dbczumar dbczumar Aug 31, 2022

This breaks the API contract of BaseDeploymentClient.predict(): https://mlflow.org/docs/latest/python_api/mlflow.deployments.html#mlflow.deployments.BaseDeploymentClient.predict, which states that the API returns a pandas DataFrame, Series, or numpy array (i.e. pyfunc output, though list should also be added to the docs).

The goal of the original implementation was to convert the response JSON into the corresponding pyfunc output object. Can we preserve that, e.g. by adding / using a utility for parsing the response as a pyfunc output object?

Contributor Author

Ah, this is a hard one. We could certainly preserve the behavior, but I am not sure if we should, because:

  1. We wrap the response in a dictionary so we can return additional data. I think we should enable access to any additional output via the client interface.
  2. It's actually impossible to know whether the response should be a pandas DataFrame, a Series, or a numpy array. Most clients will always return a pandas DataFrame. infer_and_parse_json returns a DataFrame unless the input is explicitly a tensor input, i.e. a dict with an instances or inputs field. Unfortunately, our current server does not make that distinction and just jsonizes whatever the model returned, and it is a bit late to change that now.

Therefore, I would propose we update the interface to return a pyfunc response object that can access the prediction as a DataFrame or numpy array per the user's request and will have access to all response fields. Wdyt? Is that too big a change this late? We could also preserve the original behavior by default and add a flag for the new behavior.
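A minimal sketch of the response object proposed above (the class and method names here are hypothetical, not from the PR): the raw dict stays accessible, including any extra fields, while convenience accessors give the prediction as a DataFrame or numpy array.

```python
import json

import numpy as np
import pandas as pd


class PredictionsResponse(dict):  # hypothetical name for the proposed object
    """Wraps the scoring-server response dict; extra fields remain reachable
    via normal dict access, and the prediction can be materialized as either
    a DataFrame or an ndarray on request."""

    @classmethod
    def from_json(cls, json_str):
        parsed = json.loads(json_str)
        if not isinstance(parsed, dict) or "predictions" not in parsed:
            raise ValueError("response must be a dict with a 'predictions' field")
        return cls(parsed)

    def to_dataframe(self):
        return pd.DataFrame(data=self["predictions"])

    def to_numpy(self):
        return np.asarray(self["predictions"])


resp = PredictionsResponse.from_json('{"predictions": [1.5, 2.5], "extra": "kept"}')
```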

Comment on lines +12 to +32
class ScoringServerResponse(dict):
    """Predictions returned from the MLflow scoring server. The predictions are presented as a
    dictionary with convenience methods to access the predictions parsed as a higher-level type."""

    @classmethod
    def from_raw_json(cls, json_str):
        try:
            parsed_response = json.loads(json_str)
        except Exception as ex:
            raise MlflowException(f"Model response is not a valid json. Error {ex}")
        if not isinstance(parsed_response, dict) or "predictions" not in parsed_response:
            raise MlflowException(
                "Invalid predictions data. Prediction object must be a dictionary "
                "with 'predictions' field."
            )
        return ScoringServerResponse(parsed_response)

    def get_predictions_dataframe(self):
        """Get the predictions returned from the server as a pandas.DataFrame. This method will
        fail if the returned predictions cannot be converted to a DataFrame."""
        return pd.DataFrame(data=self["predictions"])
Collaborator

@dbczumar dbczumar Aug 31, 2022

Following from https://github.com/mlflow/mlflow/pull/6575/files#r959181856, is this class necessary? It doesn't support the full set of pyfunc outputs. Is there a utility method that we can add or extend for parsing the response as a pyfunc output object?

Contributor Author

Responded above.

Collaborator

@dbczumar dbczumar left a comment

@tomasatdatabricks Awesome stuff! Left some small comments and questions. Do we have time allocated to update our docs for this, such as https://mlflow.org/docs/latest/models.html#deploy-mlflow-models? Looks like a lot of this content would be out of date after merging this PR.

Can we add docs updates to the current PR or file a separate PR that we can merge at the same time as this one?

@tomasatdatabricks
Contributor Author

Thanks for the review @dbczumar ! I am planning to update the docs, but I am still fixing tests :/ The PR grew unexpectedly in scope as I worked on it, because it impacts more things than I anticipated. So I was hoping to get all tests green before bringing everything up, getting agreement, and updating the docs at the end. I will be working on this full time next sprint; it's my last sprint before I go on parental leave, so I have to finish it.

tomasatdatabricks and others added 13 commits September 2, 2022 06:47
…field and nest dataframe inputs explicit fields

Signed-off-by: tomasatdatabricks <[email protected]>
Co-authored-by: Corey Zumar <[email protected]>
Signed-off-by: tomasatdatabricks <[email protected]>
Signed-off-by: dbczumar <[email protected]>
@dbczumar dbczumar force-pushed the tomas-update-scoring-api branch from 9e70640 to c348b17 Compare September 23, 2022 20:53
Signed-off-by: dbczumar <[email protected]>
@dbczumar
Collaborator

@mlflow-automation autoformat

Collaborator

@dbczumar dbczumar left a comment

LGTM! Thanks @tomasatdatabricks!

@dbczumar
Collaborator

Test failures are caused by sqlparse upgrade and unrelated to this PR. Merging!

@dbczumar dbczumar merged commit 1d78f30 into mlflow:branch-2.0 Sep 27, 2022
@harupy harupy mentioned this pull request Oct 27, 2022
31 tasks

Labels

area/scoring MLflow Model server, model deployment tools, Spark UDFs rn/breaking-change Mention under Breaking Changes in Changelogs.

5 participants