Jade Global — AI Data Readiness Checker

Role

AI & Data Science Intern

Timeline

Jun 2025 – Aug 2025

Context

Jade Global, enterprise technology consulting

Type

AI Product / Data Engineering

Problem

Enterprise clients were asking Jade Global how to start their AI initiatives — and the honest answer was that most of them weren't ready. Their data had quality, consistency, and governance problems that would make any ML model unreliable. But there was no structured, scalable way to assess readiness, communicate it clearly, or give clients a concrete path forward.

Context

Jade Global works with mid-to-large enterprises on technology and digital transformation. As AI adoption accelerated, nearly every client engagement included some version of 'where do we start with AI?' The answer required an honest assessment of their data infrastructure — not a generic maturity model, but a specific evaluation of what they actually had.

Why It Mattered

Organizations that move to AI without fixing their data problems waste significant investment on models that can't perform reliably on their actual data. The failure mode is predictable and expensive. An accurate readiness assessment — one that's honest about gaps and specific about what to fix — saves clients from that outcome and gives them a credible starting point.

My Role

I defined and scoped the AI Data Readiness Checker as a product, designed the feature set, engineered the Snowflake-integrated Python pipelines, built the supporting dashboard, and presented the prototype to senior leadership.

What I Did

Defined the product scope by translating 14 data quality dimensions — including completeness, bias, consistency, freshness, and lineage — into measurable, automatable features that could run against enterprise datasets. Each dimension had to be specific enough to be evaluated programmatically, not just assessed subjectively.
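To make "measurable, automatable" concrete: each dimension reduces to a function that takes a table and returns a score. The sketch below is illustrative only — the pandas-based approach, column names, and metric definitions are assumptions, not the production implementation:

```python
from datetime import datetime, timezone

import pandas as pd

def completeness(df: pd.DataFrame) -> float:
    """Fraction of non-null cells across the whole table."""
    return float(df.notna().values.mean())

def freshness_days(df: pd.DataFrame, ts_col: str) -> float:
    """Days elapsed since the most recent timestamp in ts_col."""
    latest = pd.to_datetime(df[ts_col], utc=True).max()
    return (datetime.now(timezone.utc) - latest).total_seconds() / 86400

# Toy example: one null cell out of eight.
df = pd.DataFrame({
    "customer_id": [1, 2, 3, None],
    "updated_at": ["2025-07-01", "2025-07-15", "2025-08-01", "2025-08-01"],
})
print(completeness(df))  # 7 of 8 cells populated → 0.875
```

A dimension like lineage resists this kind of single-table function, which is part of why prioritization mattered: some of the 14 dimensions are cheap to automate and some are not.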

Partnered with stakeholders to prioritize which dimensions to surface first. This wasn't just a technical decision — it required balancing enterprise usability (what clients can act on), technical feasibility (what's automatable in our stack), and time-to-value (what surfaces meaningful signal fastest).

Engineered Snowflake-integrated Python pipelines to profile 100+ large-scale client tables across the prioritized dimensions. Built a supporting dashboard that surfaced findings in a format analysts and engineers could use to identify and remediate issues directly. The pipelines replaced a manual QA process, reducing that effort by approximately 30%.
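Profiling 100+ large tables means pushing the work into the warehouse rather than pulling data out. One common pattern is generating a single profiling query per table so every column is scored in one scan; the sketch below shows that pattern with illustrative table and column names (the query shape is an assumption, not the actual pipeline):

```python
def profiling_query(table: str, columns: list[str]) -> str:
    """Build one SQL query that profiles null rates and distinct
    counts for every column in a single pass over the table."""
    metrics = []
    for col in columns:
        metrics.append(f"COUNT({col}) / COUNT(*) AS {col}_completeness")
        metrics.append(f"COUNT(DISTINCT {col}) AS {col}_cardinality")
    return f"SELECT COUNT(*) AS row_count, {', '.join(metrics)} FROM {table}"

# Hypothetical table; the generated SQL would run via the
# Snowflake Python connector and land in a results table.
sql = profiling_query("crm.customers", ["email", "region"])
print(sql)
```

Generating SQL instead of fetching rows keeps the pipeline fast on large tables, since Snowflake does the aggregation and Python only orchestrates and stores the scores.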

Presented the prototype to senior leadership. The presentation covered what the tool does, how it works, and the business case for developing it as a client-facing product to accelerate ML project launches. Got the green light.

Key Decisions & Tradeoffs

Decision 1

Scoped the 14 dimensions into a prioritized first release rather than trying to ship all of them at once. Coverage breadth matters less than reliability depth — a tool that assesses 5 dimensions well is more useful than one that checks 14 superficially.

Decision 2

Designed the output for two audiences: analysts who need to remediate specific issues, and leadership who need to understand overall readiness. Same underlying data, two different entry points in the dashboard.
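Concretely, both entry points can be derived from the same per-table results: leadership sees one rolled-up readiness number, analysts see the failing table/dimension pairs ranked worst-first. A simplified sketch (the scoring scheme, 0.8 threshold, and field names are assumptions):

```python
def summarize(results: list[dict]) -> dict:
    """Roll per-table dimension scores into the two dashboard views."""
    # Leadership view: one readiness number (mean of all scores).
    all_scores = [s for r in results for s in r["scores"].values()]
    overall = sum(all_scores) / len(all_scores)
    # Analyst view: table/dimension pairs below threshold, worst first.
    # (0.8 is an illustrative cutoff, not the real one.)
    issues = sorted(
        ((r["table"], dim, score)
         for r in results
         for dim, score in r["scores"].items()
         if score < 0.8),
        key=lambda t: t[2],
    )
    return {"overall_readiness": round(overall, 2), "issues": issues}

results = [
    {"table": "crm.customers", "scores": {"completeness": 0.95, "freshness": 0.60}},
    {"table": "erp.orders", "scores": {"completeness": 0.75, "freshness": 0.90}},
]
print(summarize(results))
```

The point of the design is that neither view requires recomputing anything: the pipeline writes scores once, and each audience gets a different projection of the same data.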

Decision 3

Framed the business case around client-facing value, not internal tooling. Positioning it as something that accelerates ML project launches — a concrete, revenue-adjacent outcome — was what secured leadership support.

Outcome

AI Data Readiness Checker prototype built and presented to senior leadership. Snowflake-integrated pipelines profile 100+ large-scale tables, cutting manual QA effort by roughly 30%. Secured leadership support to develop the tool as a client-facing product.

Reflection

This project clarified something I'd suspected but not fully understood: scoping a data tool is a product problem, not just an engineering problem. Deciding which dimensions to measure, how to surface results, and who the primary user is — those decisions determine whether the tool gets used or just built. Getting stakeholder alignment on prioritization before building anything was the move that made the rest of the project work.