Visionary technology professional with 20+ years in IT, dedicated to AI-driven innovations. Google Cloud Certified, with deep expertise in designing and scaling AI-optimized, cloud-native infrastructure using Kubernetes.
Platform serving 100,000+ daily active users with 100+ engineering team members, 100+ repositories and microservices, primarily on GCP in a multi-cloud environment.
Reduced false-positive incident creation by 98% through advanced alerting rules and signal correlation in Datadog and Prometheus.
Overhauled Cloudflare filtering and security rules, improving traffic routing accuracy and reducing attack surface exposure.
Simplified and consolidated GitHub Actions workflows, cutting pipeline maintenance overhead by 60% and accelerating feedback loops.
Leading migration from end-of-life Crossplane to Datadog Operator and GCP Config Connector for more sustainable infrastructure management (in progress).
Building advanced AI-driven solutions and cloud-native microservices on GCP, leveraging modern AI frameworks and tools for end-to-end development and deployment.
Designed and implemented production-grade microservices architecture in Python and TypeScript, integrating state-of-the-art AI models (LLMs, embeddings, RAG pipelines) for intelligent automation and data processing.
Deployed scalable, serverless and containerized workloads on GCP (Cloud Run, GKE Autopilot, Vertex AI), achieving cost-efficient auto-scaling and near-zero maintenance overhead.
Automated full development lifecycle with GitHub Actions, Terraform, and GCP-native services, enabling rapid iteration and one-click deployments.
Optimized inference pipelines for performance and cost, cutting latency by 60%+ and lowering token expenses through prompt engineering, caching, and model quantization.
Developed custom AI agents and tools combining multiple modalities (text, vision, structured data), delivering high-accuracy solutions for complex real-world use cases.
Gaming industry platform with 100,000+ daily active users, run by a specialized subdivision (engineering team of fewer than 10); company-wide engineering team of 100+ across AWS and GCP environments.
Optimized GCP infrastructure to achieve 99.99% uptime and shortened release cycles by 86% through comprehensive CI/CD modernization.
Automated infrastructure provisioning with Terraform, Packer, and Ansible; implemented GitOps deployments via ArgoCD.
Designed, deployed, and maintained an observability stack from scratch (self-hosted Prometheus and Grafana alongside Datadog), handling the full lifecycle and reducing third-party dependency costs by 40%.
Centralized developer workflows with self-hosted Backstage portal, cutting onboarding time by 70% and improving cross-team visibility.
Led incident response processes, reducing mean time to resolution (MTTR) by 75%.
Executed zero-downtime hot migrations of MySQL (from 5.6 to 8.4, addressing replication and compatibility challenges) and PostgreSQL across cloud providers.
Small team of ~20 engineers focused on complex, high-impact projects with emphasis on system design and DevOps practices.
Architected scalable microservices on GKE and Minikube, reducing downtime by 30% through self-healing and auto-scaling mechanisms.
Defined service boundaries and infrastructure standards using Terraform and Helm to ensure consistent, maintainable deployments.
Enhanced Ruby on Rails performance (20% faster response times), enabled zero-downtime deployments via Jenkins and GitLab CI/CD, and sustained 99.9% uptime with Prometheus/Grafana monitoring.
Established DevOps culture and processes; built and led engineering team while delivering a globally distributed mission-critical system from inception.
Engineering team of ~40 across multiple concurrent projects in a fast-paced development environment.
Designed and developed a high-performance cryptocurrency exchange platform using Node.js (Koa) microservices architecture with rigorous TDD/BDD practices (Jest), achieving 99%+ test coverage.
Containerized applications and orchestrated CI/CD pipelines on AWS with Docker and Drone, enabling frequent deployments with zero downtime.
Reduced legacy code complexity by 25% through modularization, service decomposition, and elimination of redundant external library dependencies, enhancing maintainability and reusability across projects.
Optimized CI/CD scripts and processes, reducing average pipeline execution time by 75% (4x speedup).
Extracted critical user data handling into a dedicated, compliant module to ensure full GDPR adherence, minimizing regulatory risk while supporting secure, scalable operations.
Revitalized a monolithic Ruby on Rails app while leading and mentoring a team of six engineers, driving best practices and fostering a culture of technical excellence.
Spearheaded the adoption of TDD/BDD and led the migration from Rails 3.x to 5 for improved performance and maintainability.
Eliminated 15+ critical security vulnerabilities, cut memory usage by 20%, and slashed test suite runtime by 60% through parallelization.
Overhauled a 20,000-test suite for stability, optimized Jenkins CI to reduce build times by 40%, and containerized the dev environment with Docker, reducing developer onboarding time by 90%.
Transformed complex legacy code into high-performance, reusable modules while driving scalable, client-focused architectural improvements.
Engineering team of ~30 developing and operating a large-scale video streaming service with proprietary CDN infrastructure managing 300+ servers.
Architected a high-performance microservices platform for video streaming at scale, sustaining sub-second latency at peak concurrent viewership.
Modernized deployment pipeline by replacing Capistrano with GitLab CI/CD, Docker, and Rancher, reducing deployment downtime by 90% and enabling zero-downtime releases.
Increased automated test coverage to 95%+ with RSpec, eliminating critical production defects and enforcing code quality through automated linting and style checks.
Optimized Chef recipes for consistent multi-environment configuration management and built custom CLI tools (Bash, Zsh, SSH) that reduced manual operational tasks by 80%.
Owned full lifecycle of proprietary CDN components, ensuring reliable content delivery and rapid incident recovery across global infrastructure.
Engineering team of ~25 focused on delivering custom Ruby-based enterprise solutions for clients.
Designed and developed production-grade Ruby applications (Sinatra and Rails) with rigorous TDD/BDD practices, consistently achieving 98%+ test coverage.
Architected and launched a complex CRM system from scratch featuring 100+ models and extensive business logic, attaining a 97+ RubyCritic code quality score.
Reduced technical debt in legacy components by 30% through refactoring, modular design, and elimination of redundant dependencies, improving long-term maintainability.
Automated deployment pipelines with Capistrano on PaaS platforms (Heroku and Locum), enabling seamless one-click deployments and reducing release time by 70%.
Collaborated closely with clients to translate intricate requirements into scalable, high-performance solutions, ensuring on-time delivery and high client satisfaction.