PacktPublishing/Kubernetes-for-Generative-AI-Solutions

Kubernetes for Generative AI Solutions

This is the code repository for Kubernetes for Generative AI Solutions, published by Packt.

A complete guide to designing, optimizing, and deploying Generative AI workloads on Kubernetes

What is this book about?

Learn step by step how to design, optimize, and deploy Generative AI projects on Kubernetes. Covering networking, observability, security, scaling, and cost optimization strategies, this guide takes you from first deployment to production excellence.

This book covers the following exciting features:

  • Explore the GenAI deployment stack, agents, RAG, and model fine-tuning
  • Implement HPA, VPA, and Karpenter for efficient autoscaling
  • Optimize GPU usage with fractional allocation, MIG, and MPS setups
  • Reduce cloud costs and monitor spending with Kubecost tools
  • Secure GenAI workloads with RBAC, encryption, and service meshes
  • Monitor system health and performance using Prometheus and Grafana
  • Ensure high availability and disaster recovery for GenAI systems
  • Automate GenAI pipelines for continuous integration and delivery

If you feel this book is for you, get your copy today!

https://www.packtpub.com/

Instructions and Navigation

All of the code is organized into folders. For example, ch2.

The code will look like the following:

...
  metadata {
    name = "gp2"
...

Who this book is for: solutions architects, product managers, engineering leads, DevOps teams, GenAI developers, and AI engineers. It's also suitable for students and academics learning about GenAI, Kubernetes, and cloud-native technologies. A basic understanding of cloud computing and AI concepts is needed, but no prior knowledge of Kubernetes is required.

With the software and hardware listed below, you can run all the code files in the book (Chapters 1–14).

Software and Hardware List

| Chapter | Category | Software/Hardware Required |
|---------|----------|----------------------------|
| 1–14 | Operating system | Linux, macOS, Windows (via WSL) |
| 1–14 | Kubernetes | Amazon EKS, kind (for local testing) |
| 1–14 | AI/ML frameworks | Hugging Face Transformers, PyTorch, TensorFlow |
| 1–14 | Accelerators | NVIDIA GPUs, AWS Trainium/Inferentia |
| 1–14 | Observability | Prometheus, Grafana, OpenTelemetry, Loki |
| 1–14 | Automation | Kubeflow, MLflow, Ray, Argo Workflows |
| 1–14 | Security tools | OPA, Kyverno |

Get to Know the Authors

Sukirti Gupta is a technologist and product management leader at Amazon Web Services (AWS), where he leads the adoption of Generative AI technologies across start-up ecosystems. With over 15 years of experience in cloud computing, AI/ML, and data center technologies, he has played influential roles in shaping product narratives and engineering solutions for high-impact workloads across AWS, AMD, and Intel. At AWS, Sukirti leads initiatives that help start-ups integrate GenAI into their product strategy, enabling them to innovate with powerful infrastructure and tools. His previous roles include leading cloud product development at AMD and managing GTM strategy for Intel’s flagship computing platforms, where he helped drive billion-dollar revenue programs. Sukirti holds a B.Tech. from IIT (BHU), Varanasi, an M.S. in electrical engineering from the University of Cincinnati, and an MBA in strategy and marketing from Santa Clara University. In addition to his corporate work, Sukirti loves to mentor AI start-ups through IIT’s accelerator programs and frequently writes on Medium about GenAI trends and product leadership.

Ashok Srirama is a principal specialist solutions architect at AWS, where he leads initiatives to architect scalable, secure, and cost-efficient container-based solutions for enterprise customers. With over 19 years of experience in IT, Ashok brings profound expertise in cloud architecture, Kubernetes, container platforms, and, most recently, Generative AI. Before joining AWS, Ashok held pivotal cloud architecture roles at AIG and IBM, where he led digital transformation initiatives and cloud migration projects across the insurance and communication sectors. His technical acumen spans distributed architecture design, infrastructure automation, and application modernization using containers and serverless technologies. As a recognized thought leader in cloud-native architecture, Ashok has authored numerous technical publications, including 20+ official AWS blogs and technical guides on Amazon EKS networking, observability, security, and container CI/CD pipelines. He has presented at more than 25 public events, including AWS re:Invent, AWS Summits, and start-up CTO cohorts, sharing his expertise with the broader technical community. Ashok’s commitment to technical excellence is reflected in his extensive certification portfolio, which encompasses all 12 AWS technical certifications and the complete suite of Kubernetes certifications from the Linux Foundation. His achievements have earned him the coveted AWS Gold Jacket and Kubestronaut accreditation. Beyond his architectural work, Ashok is passionate about enabling developers to simplify the complexity of running GenAI workloads at scale using cloud-native tools.
