Project Showcase
Explore my innovative projects and technical achievements in detail.


RAG-Based Chatbot on GCP Using Vertex AI
Summary:
Designed and implemented a Retrieval-Augmented Generation (RAG) chatbot that intelligently answers enterprise-specific queries using internal knowledge bases and LLMs.
Tech Stack:
Vertex AI, LangChain, BigQuery, Cloud Functions, Pinecone/Weaviate, Google Cloud Storage, Cloud Run
Key Responsibilities:
Developed pipeline for vectorizing enterprise documents
Integrated document ingestion with LangChain's retriever
Used Vertex AI for prompt processing and response generation
Deployed on GCP with autoscaling using Cloud Run
Implemented feedback loop for query refinement
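A minimal sketch of the retrieval flow described above, assuming LangChain with Vertex AI models; the index path, model names, and sample query are illustrative, and FAISS stands in here for the Pinecone/Weaviate store:

```python
# Minimal RAG sketch: FAISS stands in for Pinecone/Weaviate; names are illustrative.
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import FAISS
from langchain_google_vertexai import ChatVertexAI, VertexAIEmbeddings

embeddings = VertexAIEmbeddings(model_name="textembedding-gecko")
store = FAISS.load_local("enterprise_docs_index", embeddings,
                         allow_dangerous_deserialization=True)

qa = RetrievalQA.from_chain_type(
    llm=ChatVertexAI(model_name="gemini-pro", temperature=0),
    retriever=store.as_retriever(search_kwargs={"k": 4}),  # top-4 chunks as context
)
print(qa.invoke({"query": "How do I reset my VPN certificate?"})["result"])
```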
Impact:
Enabled automation of 40–60% of L1 support queries
Reduced human intervention in document walkthroughs
Delivered response times under 2 seconds for 95% of queries
GenAI Agent for Ticket Classification and Routing
Summary:
Built a GenAI-powered agent that auto-classifies and routes support tickets based on contextual understanding, drastically improving triaging speed and accuracy.
Tech Stack:
Vertex AI, LangChain, Pub/Sub, Cloud Logging, Google Cloud Functions, BigQuery
Key Responsibilities:
Extracted ticket metadata from monitoring and alerting tools
Designed prompt templates for classifying issues by service/app
Built LangChain agent to suggest resolution paths or L2 ownership
Logged decisions to BigQuery for auditability and optimization
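An illustrative version of the classification prompt and chain, not the production template; the category set and ticket text are placeholders:

```python
# Illustrative triage prompt and chain; categories and ticket text are placeholders.
from langchain_core.prompts import ChatPromptTemplate
from langchain_google_vertexai import ChatVertexAI

prompt = ChatPromptTemplate.from_template(
    "You are a support triage assistant.\n"
    "Classify the ticket below into exactly one of: {categories}.\n"
    "Reply with the category name only.\n\nTicket:\n{ticket}"
)
chain = prompt | ChatVertexAI(model_name="gemini-pro", temperature=0)
label = chain.invoke({
    "categories": "network, database, application, access",
    "ticket": "Users report intermittent 502s from the billing API.",
}).content.strip()
print(label)  # published to Pub/Sub for routing in the real pipeline
```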
Impact:
Achieved 70% auto-classification accuracy in first rollout
Reduced triage turnaround time by 50%
Enabled fast handover of high-severity tickets to L2/L3


Cloud Migration Roadmap & Execution
Summary:
Led hybrid cloud migration engagements for clients in the pharma sector, assessing existing infrastructure and defining modernization blueprints.
Tech Stack:
GCP, AWS, Cloud SQL, BigQuery, ADLS Gen2, Terraform, Ansible, Google Migration Center
Key Responsibilities:
Assessed on-prem workloads and databases
Designed hybrid strategy for phased migration (GCP + AWS)
Created landing zones with secure IAM and network policies
Defined CI/CD and Infrastructure as Code practices
Migrated data lakes and BI workloads to GCP
Impact:
Enabled a 30% reduction in operational cost
Delivered phased migration roadmap within 90 days
Met all compliance requirements (HIPAA, GDPR)
Monitoring Setup with Cloud Monitoring & Vertex AI Anomaly Detection
Summary:
Established a proactive observability stack that uses AI for real-time anomaly detection and alerting across cloud infrastructure and data pipelines.
Tech Stack:
Google Cloud Monitoring, Vertex AI, Cloud Functions, Pub/Sub, Cloud Logging, BigQuery
Key Responsibilities:
Set up centralized logging & metrics publishing from distributed systems
Created anomaly detection models using Vertex AI
Configured automated alerting with email, SMS, and Slack integrations
Built dashboards for SRE/Ops teams to visualize system health
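A hedged sketch of the alert-forwarding piece, assuming a Cloud Monitoring alert policy that publishes incident JSON to a Pub/Sub topic; the Slack webhook environment variable is a placeholder:

```python
# Sketch of a Pub/Sub-triggered Cloud Function that forwards Monitoring
# incidents to Slack; SLACK_WEBHOOK_URL is a placeholder env var.
import base64
import json
import os

import functions_framework
import requests

@functions_framework.cloud_event
def forward_alert(cloud_event):
    # Cloud Monitoring alert policies can publish incident JSON to Pub/Sub.
    payload = base64.b64decode(cloud_event.data["message"]["data"])
    incident = json.loads(payload).get("incident", {})
    text = f"ALERT {incident.get('policy_name')}: {incident.get('summary')}"
    requests.post(os.environ["SLACK_WEBHOOK_URL"], json={"text": text}, timeout=10)
```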
Impact:
Reduced MTTR (Mean Time to Resolve) by 35%
Prevented major incidents by detecting anomalies before failures
Trained Ops team to use AI-driven dashboarding effectively


Real-Time ELT Pipeline for Lakehouse using Composer, BigQuery, Pub/Sub, and Cloud Storage
Summary:
Developed a real-time, orchestrated ELT data pipeline to ingest, transform, and load structured/unstructured data into a unified Lakehouse architecture on GCP. The pipeline supports both batch and streaming ingestion models.
Tech Stack:
Cloud Composer (Airflow), BigQuery, Cloud Storage, Cloud Functions, Pub/Sub, Dataform
Responsibilities:
Designed event-driven data ingestion using Pub/Sub with schema enforcement
Orchestrated end-to-end workflows using Cloud Composer (Airflow)
Parsed, cleansed, and stored raw data in Cloud Storage (Bronze layer)
Applied transformations and quality checks with BigQuery SQL (Silver layer)
Managed curated views (Gold layer) for downstream analytics & ML
Triggered notification and error alerts via Cloud Functions
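A simplified Composer (Airflow) DAG mirroring the Bronze-to-Silver hand-off above; the bucket, dataset, and stored-procedure names are illustrative:

```python
# Simplified two-task DAG: land raw files in the Bronze table, then run the
# Silver transformation; bucket, dataset, and routine names are illustrative.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator

with DAG("lakehouse_elt", start_date=datetime(2024, 1, 1),
         schedule_interval="@hourly", catchup=False) as dag:
    load_bronze = GCSToBigQueryOperator(
        task_id="load_bronze",
        bucket="raw-events-bucket",
        source_objects=["events/*.json"],
        destination_project_dataset_table="lake.bronze_events",
        source_format="NEWLINE_DELIMITED_JSON",
        write_disposition="WRITE_APPEND",
    )
    to_silver = BigQueryInsertJobOperator(
        task_id="to_silver",
        configuration={"query": {
            "query": "CALL lake.refresh_silver_events();",  # cleansing + quality checks
            "useLegacySql": False,
        }},
    )
    load_bronze >> to_silver
```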
Impact:
Reduced batch processing time from 2 hours to 20 minutes
Achieved unified governance with Lakehouse pattern
Enabled seamless consumption by GenAI models (Gemini Pro)
Scalable Batch + Streaming Data Pipeline Using Dataflow, Dataproc, and BigQuery
Summary:
Architected a hybrid batch + stream pipeline to process high-volume clickstream and sales data, leveraging GCP-native services for scalable processing and warehousing.
Tech Stack:
BigQuery, Dataflow, Dataproc (Spark), Cloud Functions, Cloud Storage, Cloud Scheduler
Responsibilities:
Built Apache Beam pipelines on Dataflow for near-real-time stream processing
Offloaded heavy joins & transformations to Dataproc Spark clusters (scheduled with Cloud Scheduler)
Integrated external data into Cloud Storage and ingested to staging tables
Transformed and enriched data in BigQuery for reporting & ML
Set up auto-scaling, fault-tolerant architecture using native GCP triggers
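A condensed Apache Beam sketch of the streaming leg; the project, topic, and table names are placeholders, and the destination table is assumed to already exist:

```python
# Condensed streaming leg; project, topic, and table are placeholders and the
# destination table is assumed to already exist.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

opts = PipelineOptions(streaming=True)  # --runner=DataflowRunner in production
with beam.Pipeline(options=opts) as p:
    (p
     | "ReadClicks" >> beam.io.ReadFromPubSub(topic="projects/demo/topics/clickstream")
     | "Parse" >> beam.Map(json.loads)
     | "KeepValid" >> beam.Filter(lambda e: "user_id" in e and "event_ts" in e)
     | "ToBigQuery" >> beam.io.WriteToBigQuery(
         "demo:analytics.clickstream_raw",
         create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
         write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))
```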
Impact:
Enabled unified analytics across batch and streaming data
Cut cloud compute costs by 25% through hybrid job design
Improved insight availability from 24 hours to 4 hours


Legacy .NET Monolith to Microservices on GKE with PostgreSQL Backend
Summary:
Led the end-to-end modernization of a legacy enterprise application originally built on .NET and IBM DB2, transforming it into a scalable, containerized microservices architecture hosted on Google Kubernetes Engine (GKE), with Python-based APIs and PostgreSQL as the new backend.
Tech Stack:
.NET (legacy), GKE, Docker, Python (FastAPI/Flask), PostgreSQL, IBM DB2, Cloud Build, GCP IAM, Cloud Logging, Cloud SQL, GitOps (Jenkins)
Responsibilities:
Monolith to Microservices Refactoring
Analyzed .NET legacy UI and business logic
Broke monolithic code into domain-driven microservices
Rewrote APIs using Python (FastAPI) to interact with the new database
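One extracted service in miniature, assuming asyncpg against the new PostgreSQL backend; the orders entity, table, and DSN are illustrative (credentials come from Secret Manager in the real deployment):

```python
# Minimal FastAPI sketch of one extracted service; entity, table, and DSN
# are illustrative placeholders.
import asyncpg
from fastapi import FastAPI, HTTPException

app = FastAPI(title="orders-service")

@app.get("/orders/{order_id}")
async def get_order(order_id: int):
    conn = await asyncpg.connect(dsn="postgresql://app@localhost/orders")
    try:
        row = await conn.fetchrow(
            "SELECT id, status, total FROM orders WHERE id = $1", order_id)
    finally:
        await conn.close()
    if row is None:
        raise HTTPException(status_code=404, detail="order not found")
    return dict(row)
```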
Database Migration
Reverse-engineered schema and data from IBM DB2
Migrated historical and operational data to PostgreSQL
Created compatibility layers for downstream reporting systems
Cloud-Native Deployment
Containerized Python services with Docker
Deployed all services to Google Kubernetes Engine (GKE)
Implemented horizontal auto-scaling, readiness/liveness probes, and rolling updates
Security and Networking
Configured GCP IAM roles, service-to-service authentication, and private access to Cloud SQL
Used internal load balancing and VPC-native clusters for secure microservice communication
Observability and CI/CD
Integrated Cloud Logging and Monitoring for each service
Set up Git-based CI/CD pipelines with Cloud Build and ArgoCD for continuous delivery
Impact:
Modernized legacy tech stack, improving scalability and maintainability
Reduced operational costs by moving from licensed DB2 to open-source PostgreSQL
Improved deployment speed with microservices delivering updates independently
Enhanced performance and fault isolation through containerized services on GKE
Enterprise MS SQL Server Migration to GCP Cloud SQL
Summary:
Successfully migrated a production-grade Microsoft SQL Server database from on-premise infrastructure to Google Cloud SQL for SQL Server, enabling better scalability, high availability, and managed backup with reduced operational overhead.
Tech Stack:
MS SQL Server (on-prem), Cloud SQL for SQL Server, Database Migration Service (DMS), Cloud Monitoring, VPC Peering, IAM, Cloud Scheduler, Terraform
Responsibilities:
Assessment & Planning
Conducted deep analysis of existing database structure, dependencies, and usage patterns
Planned zero-downtime cutover window and rollback strategy
Migration Execution
Set up Database Migration Service (DMS) with minimal downtime replication
Migrated schema, stored procedures, linked servers, SQL Jobs, and data
Tuned long-running queries and optimized indexes post-migration
Security & Networking
Enabled private IP access to Cloud SQL via VPC peering
Configured IAM roles, SSL enforcement, and automated backups
Integrated with Secret Manager for app credential handling
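A minimal sketch of the Secret Manager lookup mentioned above; the project and secret names are placeholders:

```python
# Fetching an app credential at startup; project and secret names are placeholders.
from google.cloud import secretmanager

client = secretmanager.SecretManagerServiceClient()
name = "projects/demo-project/secrets/sqlserver-app-password/versions/latest"
password = client.access_secret_version(request={"name": name}).payload.data.decode()
```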
Post-Migration Optimization
Configured Cloud Monitoring and Query Insights for performance tuning
Scheduled automated backups and maintenance windows via Cloud Scheduler
Used Terraform to version control infrastructure provisioning
Impact:
Reduced DB management overhead by 70% through managed Cloud SQL
Improved performance consistency and security posture
Enabled integration with other GCP services like BigQuery and Looker
Achieved seamless migration with <5 min downtime during cutover


Pre-Sales Project: MS SQL Server Migration Evaluation – Azure vs GCP
Summary:
Led a pre-sales engagement for a global manufacturing client to evaluate the migration of a business-critical MS SQL Server hosted on-premises. The engagement assessed the feasibility of three options: lift-and-shift, cloud-managed services, and enterprise server hosting on Azure and GCP. The goal was to provide a comprehensive architecture and operational model aligned with scalability, compliance, and cost optimization objectives.
Engagement Type:
Pre-Sales Architecture & PoC (Proof of Concept)
Client: Confidential (Manufacturing sector)
Status: Solution proposed and PoC completed, deal not closed
Key Objectives:
Evaluate whether to lift-and-shift the MS SQL Server VM or modernize to managed database offerings
Provide a comparative analysis between Azure SQL Managed Instance, GCP Cloud SQL for SQL Server, and self-hosted SQL Server on VM
Ensure support for linked servers, SSIS/SSRS workloads, Always-On availability, and Active Directory integration
Deliver a working PoC with sample workloads on both clouds
Solutioning Responsibilities:
Requirements Analysis
Worked with enterprise architects to gather inputs on current workloads, high availability, latency sensitivity, and DR expectations
Assessed dependencies on SQL Server Agent, Linked Servers, CLR objects, and stored procedures
Solution Architecture
Designed three migration paths:
Lift-and-Shift to IaaS VMs on Azure/GCP using Migrate for Compute Engine / Azure Migrate
Platform Migration to Azure SQL Managed Instance and GCP Cloud SQL (SQL Server)
High-Availability SQL Server 2022 on Azure VM (Enterprise Licensing) with DR and Always-On clustering
Created TCO comparison, networking diagrams, IAM mapping, backup/restore policies
Proof of Concept Execution
Set up Cloud SQL instance on GCP with VPC peering, private IP, and IAM integration
Created Azure SQL Managed Instance with AD Authentication and VNet
Migrated sample schema and datasets using SQL Server Migration Assistant (SSMA)
Validated workloads, performance, replication, and monitoring in both environments
Outcomes & Insights:
Azure SQL Managed Instance supported more enterprise features like linked servers and SSIS without external hacks
GCP Cloud SQL offered lower operational cost, simpler IAM, but had limitations around advanced SQL Server features (e.g., cross-database queries, SQL Server Agent scheduling)
Lift-and-shift to VM was feasible but didn't align with modernization and O&M reduction goals
Delivered a 40-page solution proposal with PoC performance benchmarks and risk assessment
Impact:
Gave client a clear, technically sound roadmap for future-state architecture
Demonstrated ability to navigate complex SQL workloads across clouds
Although the client paused the initiative due to budget review, the technical groundwork remains reusable for future cycles
Oracle Database Migration to AWS EC2 & RDS
Summary:
Led the migration of a mission-critical Oracle 11g/12c database from an on-premise data center to Amazon Web Services (AWS). The engagement focused on rehosting (lift-and-shift) for short-term continuity and replatforming select workloads onto Amazon RDS for Oracle to reduce operational overhead and licensing costs.
Key Objectives:
Migrate large Oracle transactional and analytical databases (~5TB) from aging on-premise infrastructure
Reduce hardware/maintenance costs, improve backup and recovery, and prepare for cloud-native modernization
Meet DR and HA expectations within a single-region setup
Responsibilities:
Discovery & Planning
Conducted infrastructure and application dependency analysis
Reviewed data access patterns, backup cycles, archive policies, and licensing model
Selected a hybrid migration strategy:
Lift-and-shift critical transactional DB to EC2 (Oracle EE on Linux)
Replatform reporting DBs to Amazon RDS for Oracle
Execution
Built EC2-based Oracle instance with custom filesystem layout (ASM to XFS conversion)
Migrated schema using Oracle Data Pump (expdp/impdp) and RMAN for full backups
Migrated reporting workloads to RDS for Oracle, tuning parameters for query throughput
Set up Database Links between EC2 Oracle and RDS Oracle for hybrid queries
Security & Monitoring
Implemented VPC peering, security groups, and KMS-encrypted backups
Integrated CloudWatch monitoring and custom scripts for performance tracking
Configured automated snapshots and PITR for RDS instances
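An illustrative boto3 snippet for the on-demand side of the snapshot scheduling above; the region and instance identifier are placeholders:

```python
# Taking an on-demand RDS snapshot; region and identifiers are placeholders.
from datetime import datetime, timezone

import boto3

rds = boto3.client("rds", region_name="us-east-1")
stamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M")
rds.create_db_snapshot(
    DBSnapshotIdentifier=f"reporting-oracle-{stamp}",
    DBInstanceIdentifier="reporting-oracle",
)
```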
Impact:
Reduced overall TCO by 40% annually compared to on-premise licensing + infra
Improved RTO/RPO using automated backups and snapshot scheduling
Laid the foundation for future refactoring of data pipeline to AWS-native services
Trained the client's DBAs on managing hybrid EC2 + RDS deployments


Cross-Cloud Data Pipeline: Azure to GCP via ADLS, Databricks, Apigee & BigQuery
Summary:
Designed and implemented an enterprise-grade cross-cloud data platform where data from 20+ pipelines across multiple Azure regions was ingested, processed, and transferred securely to Google Cloud. The pipeline leveraged Azure Data Factory, ADLS Gen2, Databricks, Apigee, Cloud Composer, and BigQuery for a seamless, end-to-end data ingestion and analytics flow.
Architecture Overview
Azure Side – Ingestion & Cleansing
Ingested data from 20+ source systems using Azure Data Factory (ADF) pipelines across multiple geographies
Data saved to ADLS Gen2 (Raw Layer) in partitioned format with metadata tagging
Applied cleansing, validation, and formatting rules using Azure Databricks (PySpark)
Saved output in Cleansed Layer of ADLS Gen2 in Delta Lake format
Cross-Cloud Integration
Built REST APIs using Apigee (GCP) to securely pull data from ADLS Gen2
Streamed cleaned datasets via API gateway into GCP Cloud Storage (Staging)
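A hedged sketch of the cross-cloud pull step, assuming an Apigee-fronted REST endpoint and a GCS staging bucket; the URL path, bucket, and helper name are assumptions:

```python
# Pull one cleansed batch through the API gateway and stage it in GCS;
# endpoint path, bucket, and helper name are assumptions.
import requests
from google.cloud import storage

def pull_batch_to_gcs(api_base: str, token: str, object_name: str) -> None:
    resp = requests.get(f"{api_base}/cleansed/latest",
                        headers={"Authorization": f"Bearer {token}"},
                        timeout=300)
    resp.raise_for_status()
    storage.Client().bucket("gcp-staging-ingest") \
        .blob(object_name).upload_from_string(resp.content)
```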
GCP Side – DAG-Orchestrated Processing
Used Cloud Composer DAGs to:
Pull data from Cloud Storage (RAW)
Load into BigQuery RAW Layer
Apply schema enforcement, deduplication, anomaly detection via PySpark jobs & SQL
Curated, use-case-specific data was moved into the BigQuery Curated Layer
Consumption
Curated datasets were exposed via Looker, Data Studio, and Vertex AI notebooks
Enabled data scientists to pull data from curated or cleansed layers for ML training
Ensured data availability in near real-time across regions
Security & Operations
Applied IAM roles, private VNet peering, and OAuth2 tokens for inter-cloud API security
Set up audit logging across Azure & GCP environments
Monitored pipeline failures using Cloud Logging, Azure Monitor, and Slack alerts via Cloud Functions
Impact
Reduced ETL latency from 12 hours to under 2 hours for high-volume pipelines
Enabled cross-cloud compliance and governance across Azure and GCP
Empowered 10+ data science use cases using curated data from GCP
Demonstrated real-time multi-region ingestion and hybrid cloud orchestration
Get in Touch
Feel free to reach out for collaborations, inquiries, or just to connect. I'm here to help and share ideas!
LinkedIn: linkedin.com/in/gimshra8
Portfolio: gm01.in
© 2025. All rights reserved.