This is a follow-up to the earlier blog post “From RAGs to the Riches” by my colleague, Amar Kapadia.
Setting up GenAI for an Enterprise involves multiple steps, which fall into three categories:
- Infrastructure Orchestration, which includes the servers/GPUs with cloud software, virtualization tools and the networking infrastructure. There may be additional requirements depending on the Enterprise's needs, such as:
  - SD-WAN setup between their locations
  - Access to the Enterprise data from their SaaS infrastructure (Confluence/Jira/Salesforce etc.)
  - Connectivity to public clouds, if needed
  - Connectivity to the repos where the GenAI models are hosted (Hugging Face etc.)
  - If this is set up in Cloud Edge DCs (such as Equinix), there may be a need to configure the fabric to connect to other Edge locations or to the public clouds, using network edge devices (routers/firewalls that run as xNFs)
- GenAI Orchestration, which includes bringing up the GenAI tools, either for training or for inferencing.
- RAG Orchestration, which includes building the necessary Vector DB from various Enterprise sources, and using that as part of the Inferencing pipeline.
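To make the RAG Orchestration step concrete, here is a minimal sketch of what "build a Vector DB from Enterprise sources and use it during inferencing" amounts to. Everything in it is an assumption made for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, `ToyVectorStore` stands in for a real Vector DB, and the source names are just labels. It is not how AMCOP or any particular product implements this.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class ToyVectorStore:
    """Hypothetical in-memory store; each entry is (source, text, vector)."""
    entries: list = field(default_factory=list)

    def ingest(self, source: str, text: str) -> None:
        """Index one document pulled from an Enterprise source."""
        self.entries.append((source, text, embed(text)))

    def search(self, query: str, k: int = 1) -> list:
        """Return the k most similar (source, text) pairs for the query."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[2]), reverse=True)
        return [(source, text) for source, text, _ in ranked[:k]]


def build_prompt(store: ToyVectorStore, question: str, k: int = 1) -> str:
    """Inference-side step: prepend the retrieved context to the user question."""
    context = "\n".join(f"[{src}] {text}" for src, text in store.search(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}"
```

In a real deployment the ingest side would run against connectors for systems such as Confluence, Jira or Salesforce, and the prompt produced by `build_prompt` would be sent to the model serving endpoint.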
All three categories require a sophisticated Orchestrator that works in a generic manner across these layers and can be driven with a single click (or a single command).
The flow will be as follows:
- The Admin creates a high-level Intent that describes the necessary infrastructure, connectivity requirements, site details and the tools
- The Orchestrator takes the Intent as input, and sets up the necessary infrastructure and applications
- The Orchestrator also monitors the infrastructure and applications for failures or performance issues, and makes the necessary adjustments (it could work with an existing tool such as TMS for this function)
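The flow above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the `Intent` fields, the step strings and the `plan`/`reconcile` functions are hypothetical names invented here, not the AMCOP intent schema or API. The sketch shows the shape of the idea: a declarative intent is expanded into ordered steps (infrastructure first, then the GenAI tools, then RAG), and a monitoring pass reports which steps need to be re-applied.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    """Hypothetical high-level intent; field names are illustrative only."""
    sites: list
    gpu_nodes_per_site: int
    connectivity: list = field(default_factory=list)  # e.g. ["sd-wan", "public-cloud"]
    genai_mode: str = "inference"                      # "training" or "inference"
    rag_sources: list = field(default_factory=list)    # e.g. ["confluence", "jira"]


def plan(intent: Intent) -> list:
    """Expand an intent into ordered steps: infrastructure, GenAI tools, then RAG."""
    if intent.genai_mode not in ("training", "inference"):
        raise ValueError(f"unsupported genai_mode: {intent.genai_mode}")
    steps = []
    for site in intent.sites:
        steps.append(f"infra:{site}:provision {intent.gpu_nodes_per_site} GPU node(s)")
        for link in intent.connectivity:
            steps.append(f"infra:{site}:connect {link}")
    steps.append(f"genai:deploy {intent.genai_mode} stack")
    for source in intent.rag_sources:
        steps.append(f"rag:index {source}")
    return steps


def reconcile(desired: list, healthy: set) -> list:
    """Monitoring pass: return the steps that are missing or unhealthy, in plan order."""
    return [step for step in desired if step not in healthy]


if __name__ == "__main__":
    intent = Intent(sites=["edge-1"], gpu_nodes_per_site=2,
                    connectivity=["sd-wan"], rag_sources=["confluence"])
    for step in plan(intent):
        print(step)
```

Keeping the intent declarative means the same description can be re-planned and compared against the observed state whenever monitoring reports a failure, rather than relying on a one-off setup script.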
I hope this clarifies how to go about setting up the underlying infrastructure for RAGOps.
AMCOP can orchestrate AI (and more specifically, GenAI) workloads on various platforms. At Aarna.ml, we offer AMCOP, an open source, zero-touch orchestrator (also offered as a SaaS, AES), for lifecycle management, real-time policy, and closed-loop automation of edge and 5G services. If you’d like to discuss your orchestration needs, please contact us for a free consultation.
Next Steps
Contact us for help getting started with RAGOps. The Aarna.ml Multi Cluster Orchestration Platform orchestrates and manages edge environments, including support for RAGOps. We have created an offering specifically suited to NSPs by focusing not just on the FM and related ML components, but also on the infrastructure, e.g., using Equinix Metal to speed up deployment and Equinix Fabric for seamless data connectivity. As an NVIDIA partner, we have deep expertise with server platforms like the NVIDIA Grace Hopper and platform components such as NVIDIA Triton and NeMo.