Staff Data Engineer &
AI Platform Architect.

I build the infrastructure between raw data and intelligent decisions — petabyte-scale pipelines, agentic AI systems, and self-serve data platforms that run without me.

5+ years at Amazon  ·  9,000 hours of manual work automated  ·  One agentic AI platform built from zero.

Professional DNA (The Engineering Pillars)

01

The Scalability Architect

Focusing on end-to-end ETL/ELT frameworks, transforming petabyte-scale noise into governed assets in Snowflake and AWS Redshift.

02

The Observability Evangelist

Treating data as code: implementing data quality (DQ) frameworks and automated monitoring with Python and Airflow to sustain 99.9% uptime.

03

The AI & Edge Innovator

Integrating LLMs into workflows via OpenClaw agents and engineering full-stack Raspberry Pi clusters at the edge.

Career Logistics

2021 - Present

Amazon.com

Business Intelligence Engineer II & Business Analyst II

  • Architected 6+ production Airflow pipelines eliminating ~9,000 hours of manual reporting overhead annually for AMZL senior leadership
  • Built and launched Kiro, a 0-to-1 agentic AI platform on AWS Bedrock, increasing team AI utilization by 60%
  • Resolved >$1M monthly reporting variance by rebuilding AMXL NA cost allocation logic across 250+ Redshift tables
  • Reduced pipeline runtime by 78% and saved $480K/month in EMR costs by migrating legacy DJS orchestration to Apache Airflow
  • Founded CODE (Center for Data Excellence), the first cross-org data quality certification program across Seller Protection Services
  • Engineered petabyte-scale fraud detection pipelines contributing to a 12% reduction in financial abuse

Initiative: CODE (Center for Data Excellence)

Identified systemic data consistency gaps and drove creation of the first org-wide data trust framework across Seller Protection Services. Certified datasets became the standard for cross-team reporting and ML feature pipelines.

2020 - 2021

Guardhat Inc.

Solutions & Data Engineer

Managed technical product deployments and developed data extraction programs for UWB (ultra-wideband) devices used in COVID-19 contact-tracing safety solutions.

2019 - 2020

iConsult JOB-IQ

Data Engineer & App Development Lead

Developed the Job-IQ Portal serving 25K+ students. Built Python web scrapers and complex NoSQL/SQL ETL pipelines.

2015 - 2017

Cognizant Technology Solutions

Software Engineer

Engineered "Model Eye," a comprehensive risk management and compliance platform deployed across 8,000+ banking users. Designed data pipelines to ingest and normalize multi-source financial data, built reporting dashboards for risk officers, and translated regulatory requirements into system logic. Foundation in financial data systems and enterprise-scale software delivery.

Academic Architecture

Syracuse University

M.S. Information Management (Data Science)

2018 - 2020

SRM University

B.Tech Information Technology

2011 - 2015

Trine University

MBA, Artificial Intelligence

In Progress — Expected 2026

Business Analytics, AI Foundations, Generative AI, Decision Models, Big Data Advanced Analytics, and AI Consulting.

Case Study

ChitrAI Studio

A hybrid-local AI-powered culling, editing, and client-proofing platform for professional photographers.

60% Culling Time Reduced
<5ms Per-photo Ingest Speed
<10ms Biometric Search Latency
0 Cloud Upload Required for Culling

ChitrAI Studio solves the biggest bottleneck in professional photography — post-production. Photographers capture thousands of high-resolution RAW files per event, creating hundreds of gigabytes of data. Traditional workflows demand massive cloud uploads or isolated desktop tools disconnected from client selection. ChitrAI bridges both worlds with a hybrid-local architecture that keeps heavy compute on-device and shares only lightweight previews to the cloud.

System Architecture

Local: Tauri / Rust / React

On-device RAW preview extraction, temporal clustering, quality culling, AI retouching

Sidecar: FastAPI / Python

SSIM/CNN similarity, MediaPipe face analysis, Zero-DCE exposure model, rembg background removal

Cloud: Next.js / PostgreSQL

Secure client proofing portal, JPEG preview publishing, "Find Me" biometric search

End-to-End Workflow

01
Ingest — Rust multi-threading parses SD card folder, extracts EXIF data and RAW previews at sub-5ms per file. SQLite caches base64 thumbnails for instant re-opens.
02
AI Culling — Sidecar clusters photos by time gap and SSIM visual similarity. Laplacian sharpness, exposure scoring, and MediaPipe face analysis auto-select the "Hero" shot per burst.
03
Local Retouching — Zero-DCE shadow recovery, smart vibrance, OpenCV style transfer, and rembg background removal render instantly on-device.
04
Cloud Publish — Selected JPEGs upload to Google Cloud Storage / OneDrive. Metadata syncs to serverless PostgreSQL (Neon). Photographers get instant share links.
05
Client Proofing + "Find Me" — Clients log in passwordlessly, browse their gallery, and upload a selfie. Cosine similarity on 128-dim face embeddings returns matching photos in under 10ms.

Tech Stack

Desktop & Web: React 18 · TypeScript · Vite · Framer Motion · Tailwind CSS
Desktop Core: Tauri v2 · Rust (rayon, walkdir, sips, imageproc) · SQLite via sqlx
AI Microservice: FastAPI · PyInstaller · PyTorch · OpenCV · MediaPipe · rembg (U2-Net) · ffmpeg
Cloud Backend: Next.js 14 · NextAuth.js · Prisma · Neon PostgreSQL · GCS · OneDrive API · Resend

Key Engineering Features

Ultra-Fast RAW Ingest

Rust's rayon multithreading processes previews in under 5ms/file. SQLite caching means galleries re-open instantly with zero reprocessing.
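The caching idea can be sketched in Python with the standard sqlite3 module. This is an illustration only, not the app's Rust/sqlx code: the table name, the column names, the (file path, modification time) cache key, and the `extract` callback are all assumptions made for the example.

```python
import base64
import sqlite3


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open the cache database and create the (hypothetical) table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS thumbnails ("
        " file_path TEXT NOT NULL,"
        " mtime_ns INTEGER NOT NULL,"
        " thumb_b64 TEXT NOT NULL,"
        " PRIMARY KEY (file_path, mtime_ns))"
    )
    return conn


def get_or_create_thumbnail(conn, file_path, mtime_ns, extract):
    """Return (thumb_b64, was_cached); `extract` runs only on a cache miss."""
    row = conn.execute(
        "SELECT thumb_b64 FROM thumbnails WHERE file_path = ? AND mtime_ns = ?",
        (file_path, mtime_ns),
    ).fetchone()
    if row is not None:
        return row[0], True
    # Cache miss: extract the preview bytes once and store them as base64 text.
    thumb_b64 = base64.b64encode(extract(file_path)).decode("ascii")
    conn.execute(
        "INSERT INTO thumbnails (file_path, mtime_ns, thumb_b64) VALUES (?, ?, ?)",
        (file_path, mtime_ns, thumb_b64),
    )
    conn.commit()
    return thumb_b64, False
```

Keying on the modification time as well as the path means an edited file misses the cache and gets a fresh thumbnail, while an untouched gallery re-opens without any extraction work.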

🧠

Temporal & Visual Grouping

EXIF time-gap segmentation combined with SSIM visual similarity clusters bursts into logical Scenes and auto-selects a "Hero" shot per group.
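A minimal sketch of the time-gap half of this grouping, using only the Python standard library. The SSIM refinement is left out, and the `Photo` fields, the 2-second gap, and the single combined `score` are assumptions for illustration rather than the values the app uses.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Photo:
    name: str
    taken_at: datetime  # EXIF capture time
    score: float        # hypothetical combined quality score, higher is better


def group_by_time_gap(photos, max_gap=timedelta(seconds=2)):
    """Split photos into bursts wherever consecutive shots are more than max_gap apart."""
    ordered = sorted(photos, key=lambda p: p.taken_at)
    groups = []
    for photo in ordered:
        # Compare against the last photo of the current burst, not its first.
        if groups and photo.taken_at - groups[-1][-1].taken_at <= max_gap:
            groups[-1].append(photo)
        else:
            groups.append([photo])
    return groups


def pick_heroes(groups):
    """Pick the highest-scoring photo in each burst as its "Hero" shot."""
    return [max(group, key=lambda p: p.score) for group in groups]
```

Comparing each shot with the previous one, rather than with the start of the burst, lets a long continuous burst stay in one group as long as no single pause exceeds the gap.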

🎯

Deep Learning Quality Tags

MediaPipe detects blinks, smiles, and head angles. Laplacian sharpness + exposure clipping analysis auto-tags blurry, closed_eyes, good_expression and more.
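The sharpness and exposure checks can be sketched with NumPy alone. This is a hedged illustration: it uses a plain 4-neighbour Laplacian instead of the OpenCV call the sidecar presumably makes, the thresholds are arbitrary placeholders, and the `clipped_exposure` tag name is hypothetical (only `blurry` comes from the list above).

```python
import numpy as np


def laplacian_variance(gray: np.ndarray) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels; low values suggest blur."""
    g = gray.astype(np.float64)
    lap = (
        g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
        - 4.0 * g[1:-1, 1:-1]
    )
    return float(lap.var())


def clipped_fraction(gray: np.ndarray, low: int = 5, high: int = 250) -> float:
    """Fraction of pixels at or beyond the shadow and highlight thresholds."""
    g = np.asarray(gray)
    return float(np.mean((g <= low) | (g >= high)))


def quality_tags(gray: np.ndarray, blur_threshold: float = 100.0,
                 clip_threshold: float = 0.05) -> list:
    """Return tags for an 8-bit grayscale image; thresholds are placeholder values."""
    tags = []
    if laplacian_variance(gray) < blur_threshold:
        tags.append("blurry")
    if clipped_fraction(gray) > clip_threshold:
        tags.append("clipped_exposure")
    return tags
```

In practice the blur threshold depends on resolution and subject matter, so it would need tuning against real previews before the tags can be trusted.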

🎨

GPU-Accelerated Retouching

Zero-DCE illumination model lifts shadows without washing highlights. Smart vibrance preserves skin tones. Style transfer and U2-Net background removal run locally.

🔍

Biometric "Find Me" Search

128-dim face embeddings synced to PostgreSQL. Client selfie triggers vectorized NumPy cosine similarity across thousands of embeddings — no pgvector extension required, results in <10ms.
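A minimal sketch of the vectorized cosine-similarity lookup, assuming the embeddings have already been loaded from PostgreSQL into a NumPy array of shape (N, 128). The function name, the 0.6 threshold, and the `top_k` limit are assumptions for illustration, and the sketch assumes no embedding is the zero vector.

```python
import numpy as np


def find_matches(query: np.ndarray, gallery: np.ndarray,
                 threshold: float = 0.6, top_k: int = 50):
    """Return (row_index, similarity) pairs, best first, above the threshold."""
    # Normalise so that a dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q  # one matrix-vector product scores every gallery face
    order = np.argsort(-sims)[:top_k]
    return [(int(i), float(sims[i])) for i in order if sims[i] >= threshold]
```

Because the whole comparison is a single matrix-vector product, a gallery of a few thousand faces needs no vector index. Normalising the stored embeddings once at sync time would remove the per-query normalisation of the gallery.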

📦

Native AI Sidecar Bundling

PyInstaller packages the full FastAPI server, PyTorch models, and Python runtime into a single native binary. No user-facing Python install. Tauri manages the sidecar lifecycle on port 8001.

Technical Architecture

Category | Core Competencies (Expert) | Frameworks & Cloud (Proficient)
Languages | Python (Advanced), SQL (Expert), Bash/Shell | Java, Scala, YAML, Markdown
Data Platforms | Snowflake, AWS Redshift, Databricks | MongoDB, PostgreSQL, SQLite
Cloud (AWS) | S3, Lambda, EC2, Glue, Step Functions | Athena, Kinesis, IAM, CloudWatch
Orchestration | Apache Airflow, GitHub Actions | Cron, AWS EventBridge
AI & Automation | OpenClaw (Custom Agents), LLM Orchestration | LangChain, OpenAI API, Prompt Eng.
Data Modeling | Star/Snowflake Schema, Dimensional Modeling | ER Diagramming, Schema-on-Read/Write
BI & Visualization | Tableau, Amazon QuickSight | Power BI, Matplotlib, Seaborn
Hardware/Edge | Raspberry Pi (Linux/Server management) | IoT Ingestion, Edge Processing

The Labs

Kiro — Agentic AI Platform

AI

0-to-1 agentic AI platform on AWS Bedrock enabling non-technical stakeholders to query complex financial datasets in natural language via a multi-agent MCP architecture.

  • 60% increase in team AI utilization
  • Eliminated ad-hoc data request bottlenecks across Finance & Operations
  • MCP server layer with live AWS data source integration
  • Self-serve data access for Finance leadership

Stack: AWS Bedrock · Python · MCP Servers · Airflow · Redshift · S3 · Lambda

OpenClaw Agents

AI

Autonomous agents for Stock Market Research and Job Search Automation.

Raspberry Pi System

Live

T-Mobile Bill Splitter

Uptime: 99.8%
PDF Parsing: Active

Through The Lens

Precision in data, perspective in art.

Developer Logs

README.md — The Architect & The Observer
Ayush Pramod Kumar
Location: Bellevue, WA
Current Project: ChitrAI.Studio
Co-Pilot: Tyson (Husky)

> "Based in the Pacific Northwest, I am a Data & Business Enthusiast with a decade-long obsession with how information moves. My career has been defined by a transition from interpreting data (BI) to architecting the very systems that govern it."

> "When I’m not optimizing TB-scale pipelines in Snowflake or building AWS-native ecosystems, I’m usually exploring the intersection of software and physical hardware. Whether it's deploying a custom billing utility on a Raspberry Pi cluster or engineering OpenClaw—my suite of autonomous AI agents—I treat code as a tool for personal and professional liberation."

> "My perspective is shaped by the lens of a Canon R6 Mark II. Photography taught me that the smallest detail defines the whole—a philosophy I carry into my data schemas and system architectures."

> "When the screens are off, you’ll find me exploring the trails of Washington with my Husky, Tyson, or experimenting with the perfect Paneer Tikka recipe. I believe the best engineering is like the best photography: invisible, impactful, and precisely composed."

Published Research

2015

"Improved Routing Security in Wireless Mobile ADHOC Network"

International Journal of Engineering Research & Management Technology (IJERMT) — March 2015

2017

"Evolution and Development of DBMS in Software Development Industry"

Journal of Emerging Technologies and Innovative Research (JETIR) — November 2017