Context
There is no advantage in increasing n_batch above n_ubatch for embedding models with pooling, because the entire batch must fit in a single physical batch (i.e. n_ubatch). n_batch is always >= n_ubatch.
Proposition
Exit with a failure if --embedding is set and --ubatch-size != --batch-size in the server example. This probably also applies to the retrieval example in #6193.
Also, the KV bert.context_size probably must be taken into account.
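
A minimal sketch of the proposed check, to make the failure condition concrete. The struct `embd_params`, the function `check_embd_batch`, and the default values are hypothetical and only for illustration; they are not taken from the server code, and only the field names follow the CLI flags mentioned above.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the relevant server parameters.
// The default values are illustrative, not the real defaults.
struct embd_params {
    bool     embedding = false; // --embedding
    uint32_t n_batch   = 2048;  // --batch-size  (logical batch)
    uint32_t n_ubatch  = 512;   // --ubatch-size (physical batch)
};

// Returns an empty string when the configuration is acceptable,
// otherwise an error message describing the mismatch.
std::string check_embd_batch(const embd_params & p) {
    if (p.embedding && p.n_batch != p.n_ubatch) {
        return "--embedding requires --ubatch-size (" + std::to_string(p.n_ubatch) +
               ") == --batch-size (" + std::to_string(p.n_batch) + ")";
    }
    return "";
}
```

The caller would print the message and exit with a failure status when the returned string is not empty. This sketch does not cover the context size mentioned above.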