Lighthouse ↗ site

Senior MLOps Engineer · Senior AI Solutions Engineer · Senior Data Engineer — Remote (Spain → EU)

Connect AI defended under partner load with golden-set evals; KITT's quality regression caught at deploy time, not by hoteliers; Smart Compset queries answered by an agent over the rate warehouse instead of N bespoke dashboards.

The threat nobody is pricing in

Your customers will be one engineer plus an agent stack. Will Lighthouse ship the same way?

A boutique-hotel revenue manager with a Claude account and a few hundred lines of glue is twelve months from doing in-house what they currently pay a SaaS for. The teams that survive aren't the largest. They're the ones whose know-how is documented as the work happens (so a model upgrade doesn't quietly break Connect AI), and whose infrastructure is agent-ready (so the next product on top of the rate warehouse is a prompt, not a rebuild). My job, in my first 90 days, is to build both: quietly, in parallel with the shipping you already have to do.

Document the know-how

Connect AI's prompt history, KITT's failure modes, the dbt logic behind Smart Compset: all captured by the same agents that operate them, not in a quarterly Confluence audit.

Make infra agent-ready

An MCP server over BigQuery + Bigtable + the dbt manifest, so the next AI surface (or the next internal tool) is one prompt away, not one team-quarter.

In parallel with shipping

No six-month transformation programme. The artefacts compound from week one — alongside the GA work the team already has on the roadmap.

Speed is the moat

Same engineer for infra, data, app, and the agent loop.

At a fintech data platform I'm the seam-engineer: Hermes (operator chat layer) · CrewAI crews (architecture / development / review) · LangGraph flows · OpenHands autonomous coder · custom MCP servers fronting Airflow / Snowflake / EKS / ArgoCD / Datadog · Langfuse for end-to-end LLM observability and evals — triggered by GitHub webhooks across Terraform, ArgoCD, and Python/Airflow repos. Feature/fix cycle compressed days/weeks → hours/minutes. Daily pipeline failures → near-zero. The pattern travels: same loop, your stack, swap the cloud vendor.
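For illustration, the webhook entry point of such a loop can be sketched in a few lines. The signature check follows GitHub's documented `X-Hub-Signature-256` scheme (HMAC-SHA256 over the raw request body). The repo-kind-to-crew mapping and the function names are hypothetical placeholders, not the routing I run in production.

```python
import hashlib
import hmac

# Hypothetical mapping from repository kind to the crew that picks the event up.
CREW_BY_REPO_KIND = {
    "terraform": "architecture",
    "argocd": "review",
    "airflow": "development",
}


def verify_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Validate the X-Hub-Signature-256 header against the raw request body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking timing information about the expected value.
    return hmac.compare_digest(expected, signature_header)


def route_event(repo_kind: str) -> str:
    """Pick a crew for a known repo kind; anything unknown goes to a human."""
    return CREW_BY_REPO_KIND.get(repo_kind, "human")
```

Rejecting unsigned payloads before any agent runs keeps the trigger surface narrow, and the "human" fallback means an unmapped repo never starts a crew by accident.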

4 generic crews shipping
30+ self-hosted services
MCP-native, not wrapped
11+ yrs eng

What your competition is doing

Three peer references already past the "AI as feature" stage.

All three operate on the axis Lighthouse just stepped onto with Connect AI: AI surfaces sitting on top of a streaming hospitality data plane, with the eval discipline that lets the team change models without partner-side regressions.

Booking.com

Genius AI Trip Planner — LLM-driven itinerary search shipped to production app users on a Kafka + warehouse stack.

Direct peer on travel-data scale. The interesting public artefact is not the launch — it's that their engineering team publishes about evaluation harnesses and per-class regression discipline. That is the maturity bar Connect AI's next twelve months will be measured against.

Source: booking.ai

Mews

PMS-native AI features (assistant, reporting agents) wired directly into their property data plane.

Smaller, AI-forward hospitality SaaS — useful as the "what AI looks like at our scale, not Booking's" reference. Their public posts describe wiring LLM features into existing operational data, which is the pattern an MCP-over-warehouse layer enables one tier up.

Source: mews.com/blog

Expedia Group

Romie travel companion + ChatGPT plugin — large-scale LLM surfaces over their travel-data graph, in-market since 2023.

Direct precedent for Connect AI's ChatGPT-store distribution model. The public learning is that the bottleneck isn't shipping the surface — it's the cost-per-conversion observability and partner trust that defends it for the eighteen months after launch.

Source: expediagroup.com/newsroom

None of the three is a hyperscaler lab. The same pattern travels to a team Lighthouse's size with one engineer who has shipped the loop end-to-end before — not with a six-month transformation programme.

Closest match — agent platform on a streaming data plane

I already run this stack end-to-end on a fintech data platform.

Production data platform (fintech SaaS) — daily failures → near-zero, dataeng cycle compressed days→minutes

Same shape as the seam Lighthouse is now hiring for: a streaming-first data platform feeding multiple product surfaces (including LLM-fronted ones), where the bottleneck is the loop around the platform — not the platform itself.

De facto tech lead in a small team. Owner of infra, ingestion, orchestration, warehouse and an agent-driven development lifecycle that compressed merge-to-prod from days to minutes. Led a Redshift→Snowflake migration. Cut daily pipeline failures to near-zero. The same Langfuse + custom-MCP + crew-based discipline maps onto the Connect AI / KITT / rate-warehouse proposals below, one swap of cloud vendors away.

Airflow
Snowflake
Terraform
EKS
ArgoCD
dbt
Langfuse
CrewAI
OpenHands
Custom MCP

↗ full case in the portfolio

What I'd do in the first 90 days

Build the boring infra around your interesting agents.

Three narrow, fast, reversible bets — each doable by one engineer in 4-6 weeks of real work, each mapped to a portfolio anchor I've shipped before.

Weeks 1-4 — Connect AI & KITT eval harness

Stand up Langfuse + a per-partner golden test set so model swaps stop being a partner-trust event

Every Connect AI / KITT request traced with cost, latency, and quality tags. A small per-partner golden set (booking-intent class, parity-question class, rate-explanation class). LLM-as-a-judge regression evals fire on every prompt or model change. Outcome: a "this prompt is safe to ship" signal the team can defend to a partner GM, not "we patched it forward after the support ticket".

Success: 100% of LLM calls traced; per-class regression bar wired into CI by week 4.
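A minimal sketch of the per-class regression gate described above, assuming the judge is any callable that scores an answer against a reference in [0, 1]. In practice that would be an LLM-as-a-judge call with scores logged alongside the traces; here it is left as a plain function. `GoldenCase`, `class_scores`, `regressions`, and the 0.02 tolerance are illustrative names and values, not an existing API.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class GoldenCase:
    partner: str
    question_class: str  # e.g. "booking-intent", "parity-question", "rate-explanation"
    prompt: str
    reference: str


Model = Callable[[str], str]         # prompt -> answer
Judge = Callable[[str, str], float]  # (answer, reference) -> score in [0, 1]


def class_scores(cases: List[GoldenCase], model: Model, judge: Judge) -> Dict[str, float]:
    """Mean judge score per question class for one model or prompt version."""
    scores: Dict[str, List[float]] = defaultdict(list)
    for case in cases:
        scores[case.question_class].append(judge(model(case.prompt), case.reference))
    return {cls: sum(vals) / len(vals) for cls, vals in scores.items()}


def regressions(baseline: Dict[str, float], candidate: Dict[str, float],
                tolerance: float = 0.02) -> Dict[str, float]:
    """Classes where the candidate scores below baseline by more than tolerance.

    Returns the score delta (candidate minus baseline) for each regressed class.
    A class missing from the candidate counts as a score of 0.0.
    """
    return {
        cls: candidate.get(cls, 0.0) - base
        for cls, base in baseline.items()
        if base - candidate.get(cls, 0.0) > tolerance
    }
```

In CI, the job would fail whenever `regressions(...)` returns a non-empty dict, which is the "safe to ship" signal in concrete form. Grouping by `(partner, question_class)` instead of class alone is a one-line change if a partner-level bar is needed.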
Weeks 5-8 — MCP layer over the rate warehouse

One typed query surface over BigQuery + Bigtable + the dbt manifest, scoped to internal RM-consultant + customer-success queries

Replaces N bespoke SQL clients ("which markets had the biggest comp-set drift last weekend?", "which clients have a >2σ revenue gap vs. their compset?") with one read surface that any internal agent — or, with auth scoping, the next iteration of Connect AI — can call. Ship the same MCP I run today; adapter pattern is one BigQuery + Bigtable client away.

Success: ≥10 internal queries migrated off bespoke clients; one customer-facing AI surface routed through the MCP by week 8.
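To make "one typed query surface" concrete, here is a minimal sketch in plain Python with no MCP SDK or warehouse client involved. Named queries are registered with typed parameters, and callers can only invoke what is registered. `QuerySpec`, `QuerySurface`, and `revenue_gap_outliers` are hypothetical names. Reading ">2σ" as a client's gap sitting more than two standard deviations from the mean gap across clients is my assumption, and the in-memory dict stands in for a warehouse read.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Any, Callable, Dict, List, Mapping


@dataclass(frozen=True)
class QuerySpec:
    name: str
    description: str
    params: Mapping[str, type]  # parameter name -> required Python type
    run: Callable[..., Any]


class QuerySurface:
    """Allowlist of named, typed queries; nothing else is callable."""

    def __init__(self) -> None:
        self._specs: Dict[str, QuerySpec] = {}

    def register(self, spec: QuerySpec) -> None:
        self._specs[spec.name] = spec

    def describe(self) -> List[Dict[str, Any]]:
        """Machine-readable catalogue an agent can list before calling."""
        return [
            {"name": s.name, "description": s.description,
             "params": {k: t.__name__ for k, t in s.params.items()}}
            for s in self._specs.values()
        ]

    def call(self, name: str, **kwargs: Any) -> Any:
        spec = self._specs[name]  # KeyError for anything not registered
        if set(kwargs) != set(spec.params):
            raise TypeError(f"{name} expects parameters {sorted(spec.params)}")
        for key, expected in spec.params.items():
            if not isinstance(kwargs[key], expected):
                raise TypeError(f"{key} must be {expected.__name__}")
        return spec.run(**kwargs)


def revenue_gap_outliers(gaps: Dict[str, float], sigmas: float) -> List[str]:
    """Clients whose revenue gap is more than `sigmas` std devs from the mean gap."""
    values = list(gaps.values())
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return []
    return sorted(c for c, g in gaps.items() if abs(g - mu) > sigmas * sd)
```

In an MCP server, each registered `QuerySpec` would map to one tool, and auth scoping would decide which specs a given caller sees in `describe()`.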
Weeks 9-12 — Agent loop for the dataeng team itself

CrewAI flow that owns dbt model authoring + Dataflow scaffolding + first-pass on-call triage

Pick the highest-friction internal flow on the dataeng backlog (likely: schema-change PRs that today take days because of golden-row regression nerves) and rebuild it as an agent that handles 70% end-to-end with human handoff for the rest. Same playbook I shipped at the fintech platform, where the runbook was 40 steps and the rebuild was a 4-hour agent loop. The point is the pattern, not just the feature: once one is live, the rest follow in weeks, not months.

Success: schema-change cycle compressed from days → ≤4 hours on chosen flow; rebuild blueprint documented for the next two flows.
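The human-handoff split can be sketched as a small routing rule over a schema diff. The policy here (purely additive changes go to the agent, anything that removes or retypes a column goes to a person) is an assumption chosen for illustration. The real cut would come from the team's own incident history. `SchemaChange`, `diff_schema`, and `route` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SchemaChange:
    column: str
    kind: str  # "added", "removed", or "retyped"


def diff_schema(old: Dict[str, str], new: Dict[str, str]) -> List[SchemaChange]:
    """Column-level diff between two {column: type} schemas, in column-name order."""
    changes: List[SchemaChange] = []
    for column in sorted(set(old) | set(new)):
        if column not in old:
            changes.append(SchemaChange(column, "added"))
        elif column not in new:
            changes.append(SchemaChange(column, "removed"))
        elif old[column] != new[column]:
            changes.append(SchemaChange(column, "retyped"))
    return changes


def route(changes: List[SchemaChange]) -> str:
    """Assumed policy: additive-only diffs are agent-handled, the rest need a human."""
    if not changes:
        return "no-op"
    if all(change.kind == "added" for change in changes):
        return "agent"
    return "human"
```

For example, adding a `currency` column routes to the agent, while changing `rate` from `FLOAT64` to `NUMERIC` routes to a human reviewer.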

Worth a 15-minute call?

If after a short call it isn't the right fit, no pressure. Either way you keep the analysis I already wrote about the Connect AI / KITT / rate-warehouse stack.

Book 15 min · Email instead

Fidel Perez · Agent-platform engineer · 11+ yrs · MCP-native

⌬ This page is auto-generated by a private agent platform I built and operate. It lives inside my open-source portfolio repo — github.com/fidel-perez/portfolio — so the artefact you're reading is itself the proof. The platform stays private; what it ships is what you see.