
[ML] Stopping a model deployment does not check if it is used by an inference endpoint #128549

@davidkyle

Elasticsearch Version

9.0.0

Installed Plugins

No response

Java Version

bundled

OS Version

any

Problem Description

Inference endpoints created with the elasticsearch service are backed by models deployed on ML nodes. The Trained Model APIs can also be used to interact with that same backing model deployment. It is possible to stop the deployment with the Trained Model APIs, but doing so breaks the inference endpoint because its model deployment has been removed.

If the deployment is used by an ingest processor, the stop action gives a warning and requires the force parameter. At the very least, the code should also check for usage by an inference endpoint and require the force parameter. Alternatively, it should not be possible at all to stop a deployment managed by the Inference API via the Trained Model APIs.
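
For comparison, the usage check that already exists for ingest processors could be mirrored for inference endpoints; both sides are reachable with existing APIs. The deployment name below is the one from the reproduction steps:

# List all inference endpoints; a usage check could scan these for one
# backed by the deployment being stopped
GET _inference/_all

# The existing escape hatch once a usage conflict is reported
POST _ml/trained_models/my-elser-inference-endpoint/deployment/_stop?force=true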

Steps to Reproduce

First, create an inference endpoint using the elasticsearch service to deploy the model on an ML node.

PUT _inference/sparse_embedding/my-elser-inference-endpoint
{
  "service": "elasticsearch",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 2,
    "model_id": ".elser_model_2_linux-x86_64"
  },
  "chunking_settings": {
    "strategy": "sentence",
    "max_chunk_size": 250,
    "sentence_overlap": 1
  }
}
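
At this point the endpoint is functional (assuming the ELSER model has already been downloaded and the deployment has finished starting); a quick sanity check is:

POST _inference/sparse_embedding/my-elser-inference-endpoint
{
  "input": "The quick brown fox"
}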

Call _stop with the Trained Model API. The deployment stops, and the inference endpoint can no longer be used because its backing model deployment is gone.

POST _ml/trained_models/my-elser-inference-endpoint/deployment/_stop
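
The endpoint definition itself still exists (GET _inference/sparse_embedding/my-elser-inference-endpoint still returns it), but inference requests now fail because the backing deployment is gone. Repeating the sanity check from the setup step demonstrates the breakage; the exact error returned varies by version:

POST _inference/sparse_embedding/my-elser-inference-endpoint
{
  "input": "The quick brown fox"
}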

Logs (if relevant)

No response

Labels

:ml (Machine learning), >bug, Team:ML (Meta label for the ML team)
