Skip to content

Releases: blockentropy/ml-client

v0.0.1

27 Dec 17:09

Choose a tag to compare

Initial release of distributed ML endpoints.

OpenAI Compatible Endpoints

llm_client.py: LLM chat compatible endpoint. Supported models include Yi, Starling, Mixtral, Mistral, Phind, Llama, and more.
image_client.py: Image compatible endpoint. Supported models include SD1.5 diffusion models, and SDXL diffusion models. Also added support for IP adapters for images/edits endpoint.
embedding_client.py: Vector embedding compatible endpoint. Supported models include the BGE embedding models.

Custom Research Endpoints

rerank_client.py: This rerank endpoint takes in an input and a list of several output strings, then returns a rank of the best outputs. Based on research by LLM-Blender. The API endpoint is v1/rank.
compress_client.py: This compression endpoint takes in an input and compresses it, maintaining the structure and meaning of the original input. Based on research by LLMLingua. The API endpoints are v1/compress and v1/compresslong.

Conda Environments

llm_environment.yml: Conda environment needed for the LLMs.
guardrails_environment.yml: Conda environment needed to set up guardrails for those interested in trustworthy, safe, and controllable LLM conversations. Based on research by NVIDIA NeMo-Guardrails.

Config

config.ini.sample: Configuration file that points to model directories, port, upload destination (for image generation). Rename to config.ini.