I'm a Data & Analytics Engineer who turns messy data into decision-ready insight. I work with Snowflake, Airflow, dbt, Power BI, and thoughtful modeling that business users actually adopt.
I treat data like a product. Contracts, lineage, and quality get the same rigor as code. I challenge assumptions and hunt for the simple design beneath complexity.
Outside work, I write and teach at Archstack — a small brand where I break down the ideas behind modern data platforms: warehouses, transformation, semantic layers, and the AI/ML stack growing on top of them.
Reduction in Dead Inventory
Faster Inspection Cycles
Faster Order Fulfillment
Improved Yield Accuracy
Click a layer to explore.
ERPs, apps, IoT via Fivetran, Dagster, custom connectors, and Azure Functions. CDC and event streams designed with data contracts from the start.
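Designing ingestion with contracts from the start means every event is checked against an agreed schema before it lands. A minimal sketch of that idea, with hypothetical field names and types standing in for a real contract spec:

```python
# Hypothetical minimal data-contract check applied at ingestion time.
# Field names and expected types are illustrative, not a real contract.

CONTRACT = {
    "event_id": str,
    "occurred_at": str,   # ISO-8601 timestamp kept as a string here
    "amount": float,
}

def validate(event, contract=CONTRACT):
    """Return a list of violations; an empty list means the event passes."""
    problems = []
    for field, expected in contract.items():
        if field not in event:
            problems.append(f"missing field: {field}")
        elif not isinstance(event[field], expected):
            problems.append(f"wrong type for {field}")
    return problems

good = {"event_id": "e1", "occurred_at": "2024-01-01T00:00:00Z", "amount": 9.5}
bad = {"event_id": "e2", "amount": "9.5"}  # missing timestamp, amount is a string
```

In practice the contract would live alongside the connector config and fail the pipeline (or quarantine the event) on violations, rather than letting bad rows reach the warehouse.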
Independent
Creator brand covering modern data platforms, warehousing, dbt, Snowflake, and the AI stack growing on top — on YouTube and in writing. Tools, teardowns, and architecture breakdowns.
Saint John, NB, Canada
Lead a team of Analytics Engineers, driving global analytics initiatives and ensuring delivery of scalable, high-quality data solutions.
Saint John, NB, Canada
Designed and developed optimized datasets that streamline data access and improve the efficiency of data analysts.
Saint John, NB, Canada
Gathered and analyzed customer, financial, and sales data from diverse sources to uncover valuable trends and patterns.
India - Remote
Partnered with clients to gather requirements, translate them into functional specifications, and design scalable reporting solutions.
AI Engineering
Internal chat and agent surface that answers data questions over warehouse context. It uses the semantic layer as a retrieval surface, Claude tool use to query live dbt models, and prompt caching to keep per-request cost flat.
What I run in practice.
The two tools that changed how I ship data work.
Daily driver
Every dbt model, pipeline, and PR runs through Cursor. I lean on Composer for multi-file refactors, Agent mode for bounded migrations, and project-level rules to keep style consistent across a warehouse of models.
Default model
Claude is what I reach for when the work is ambiguous — architecture reviews, deep SQL debugging, cross-system design. I build Claude-backed internal tools and agents on the Anthropic SDK with tool use and prompt caching.
See the difference in how data flows through stages.
Key Difference: ETL transforms data before storage (good for structured, known schemas). ELT loads raw data first, then transforms in-warehouse (better for flexibility and modern cloud warehouses).
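The difference is only where the transform happens relative to the load. A toy sketch, with an in-memory list standing in for the warehouse and illustrative record fields:

```python
# Toy contrast of ETL vs ELT over the same raw records.
# The lists stand in for warehouse tables; fields are hypothetical.

raw_orders = [
    {"id": 1, "amount": "19.99", "status": "SHIPPED"},
    {"id": 2, "amount": "5.00", "status": "returned"},
]

def transform(row):
    # Normalize types and casing; runs before load in ETL, after in ELT.
    return {"id": row["id"], "amount": float(row["amount"]),
            "status": row["status"].lower()}

# ETL: transform first, then load only the already-clean rows.
etl_warehouse = [transform(r) for r in raw_orders]

# ELT: load the raw rows untouched (so they can be reprocessed later),
# then derive the clean model inside the warehouse.
elt_raw_layer = list(raw_orders)
elt_modeled_layer = [transform(r) for r in elt_raw_layer]
```

The end state is the same clean table either way; ELT's advantage is that the raw layer survives, so new transformations can be replayed over history without re-extracting from the source.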
I measure success by business impact, not lines of code. Every pipeline, model, and dashboard should answer a real question.
Treat data assets like products: documented, tested, versioned, and user-focused. Quality and reliability are foundational.
Complex problems decompose into simple layers. Staging to intermediate to marts. Each layer has a clear purpose and contract.
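The staging-to-intermediate-to-marts flow can be sketched with plain functions in place of dbt SQL models; table and column names here are illustrative:

```python
# Hypothetical sketch of staging -> intermediate -> mart layering,
# using Python functions in place of dbt models.

raw_payments = [
    {"ORDER_ID": "A1", "AMT_CENTS": 1999, "STATUS": "Success"},
    {"ORDER_ID": "A1", "AMT_CENTS": 500,  "STATUS": "Success"},
    {"ORDER_ID": "B2", "AMT_CENTS": 750,  "STATUS": "Failed"},
]

def stg_payments(rows):
    # Staging: rename, cast, standardize; one model per source table.
    return [{"order_id": r["ORDER_ID"],
             "amount": r["AMT_CENTS"] / 100,
             "status": r["STATUS"].lower()} for r in rows]

def int_successful_payments(stg):
    # Intermediate: apply business logic; still not user-facing.
    return [r for r in stg if r["status"] == "success"]

def fct_order_revenue(intermediate):
    # Mart: the contract exposed to BI -- revenue per order.
    revenue = {}
    for r in intermediate:
        revenue[r["order_id"]] = revenue.get(r["order_id"], 0) + r["amount"]
    return revenue

marts = fct_order_revenue(int_successful_payments(stg_payments(raw_payments)))
```

Each layer has one job, so a schema change in the source touches only staging, and a new business rule touches only the intermediate layer, leaving the mart's contract with consumers stable.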
by Martin Kleppmann
A deep dive into building reliable, scalable, maintainable data systems. Shapes how I design ingestion, storage, and serving.
by Joe Reis & Matt Housley
Pragmatic trade-offs across ingestion, modeling, orchestration, and consumption. Clarifies ETL vs ELT choices.
by Julie Zhuo
Tactics for setting expectations, giving feedback, and building trust. Guides how I mentor engineers and run projects.
Designed and Developed by Rathin Sharma
© 2026