Catalog Details
CATEGORY: deployment
CREATED BY:
UPDATED AT: April 15, 2024
VERSION: 1.0
What this pattern does:
Serve a large language model (LLM) with GPUs on Google Kubernetes Engine (GKE) in Standard mode. This design creates a GKE Standard cluster that uses multiple L4 GPUs and prepares the GKE infrastructure to serve either of the following models:
1. Falcon 40b
2. Llama 2 70b
Caveats and Considerations:
The number of GPUs required varies with the data format of the model. In this design, each model uses two L4 GPUs.
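To illustrate why the GPU count depends on the data format, here is a rough sketch that estimates the minimum number of L4 GPUs needed just to hold a model's weights. It is not part of the design itself. The parameter counts are the nominal ones implied by the model names, the 24 GB per L4 figure is treated as 24 * 10**9 bytes, and the names `BITS_PER_PARAM`, `weight_bytes`, and `min_l4_gpus` are hypothetical helpers introduced for this example. The estimate ignores the KV cache, activations, and runtime overhead, so it is a lower bound only and the real requirement may be higher.

```python
# Rough lower-bound estimate: how many L4 GPUs are needed to hold model weights.
# All constants below are assumptions for illustration, not values from the design.

# Bits used to store one parameter in each data format.
BITS_PER_PARAM = {"fp32": 32, "fp16": 16, "int8": 8, "4-bit": 4}

# Assumption: one NVIDIA L4 provides 24 GB, taken here as 24 * 10**9 bytes.
L4_MEMORY_BYTES = 24 * 10**9

# Assumption: nominal parameter counts implied by the model names.
MODELS = {
    "Falcon 40b": 40 * 10**9,
    "Llama 2 70b": 70 * 10**9,
}


def weight_bytes(num_params: int, data_format: str) -> int:
    """Bytes needed to store the weights alone in the given data format."""
    return num_params * BITS_PER_PARAM[data_format] // 8


def min_l4_gpus(num_params: int, data_format: str) -> int:
    """Smallest number of L4 GPUs whose combined memory fits the weights."""
    needed = weight_bytes(num_params, data_format)
    # Integer ceiling division avoids floating-point rounding at exact multiples.
    return -(-needed // L4_MEMORY_BYTES)


for name, params in MODELS.items():
    for fmt in BITS_PER_PARAM:
        gb = weight_bytes(params, fmt) / 10**9
        print(f"{name:12s} {fmt:6s} ~{gb:6.1f} GB weights -> at least {min_l4_gpus(params, fmt)} L4 GPU(s)")
```

Because serving also needs memory beyond the weights, a deployment may allocate more GPUs than this lower bound suggests.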
Compatibility: