You can get really, really far with this approach. Even 'naive' approaches, like classifying what you're embedding and routing it to different models, or using multiple embeddings and blending their scores, can get you to a point where your results are better than anything you could pay (a lot!) for.
What's especially nice about that approach is that you can hang each of the embeddings off the same records in the db and tune how their scores are blended at query time.
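To make the blending concrete, here's a minimal sketch (names and weights are my own, not from any particular library): each document keeps one vector per embedding model, and the per-model similarities are combined with weights you can tweak per query without re-embedding anything.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def blended_score(query_vecs, doc_vecs, weights):
    """Blend per-model similarities with query-time weights.

    query_vecs / doc_vecs: one vector per embedding model,
    all hanging off the same document record.
    """
    sims = [cosine(q, d) for q, d in zip(query_vecs, doc_vecs)]
    return sum(w * s for w, s in zip(weights, sims))

# Two hypothetical models for the same doc; weights are tunable at query time.
q = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
d = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
score = blended_score(q, d, weights=[0.7, 0.3])
```

Since the weights live outside the index, tuning the blend is just a config change, not a re-index.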
If you haven't tried it yet: what you're searching is presumably standardized enough that there will be sprawling glossaries of acronyms, and processing those into custom word lists will boost scores. If you go a little further and build little graphs/maps of them all, doubly so; that also gives you 'free' autocomplete and the ability to specify, on the query side, which specific acronym(s) you meant or don't want.
I've recently been playing around with these for some code + prose + extracted-prose + records semantic search stuff; it's a fun rabbit hole.