// HORIZON SHIELD · 2026

Tamper-proof AI
for construction
quotes.

A deterministic cost diagnostic system. Same input, same output, every time, verified by a 12-character audit hash (the first 12 hex characters of a SHA-256 digest) printed on every report.

Built by a 30-year carpenter to fight construction fraud in Japan.

Open dataset · CC-BY 4.0 // engrXiv DOI: 10.31224/7007 // EU AI Act-aligned

// THE PROBLEM

LLMs give different
answers each time.

That's a problem when a homeowner is negotiating a $20,000 renovation quote with a contractor. The AI's "estimate" becomes worthless at the exact moment it's needed most.

The negotiation collapse

A homeowner gets a $22,000 quote for exterior repainting. They suspect overcharging. They ask ChatGPT for a fair price range. ChatGPT says "around $9,500–$14,500."

The homeowner shows this to the contractor. The contractor opens ChatGPT, asks the same question, gets "around $11,000–$16,000." Different answer.

The contractor laughs: "AI numbers can't be trusted. We've been in business 30 years." The diagnostic is dead.

The structural issue

This isn't a prompt engineering problem. It's an architectural one. LLMs are probabilistic by design: each token is sampled from a distribution, so the same query, run twice, can produce different completions. That's how they work.

For creative writing or code generation, this is fine. For adversarial settings — where two parties with opposing interests need a shared, verifiable reference — it's catastrophic.

I needed a diagnostic that any party could re-derive on the spot, byte-for-byte identical, and verify with a single glance.

// Contractor's actual rebuttal:

"Ask the AI again. You'll get a different number. So which one is right? Yours, or mine? Neither. That's why you should trust me — I've been in this business 30 years."

// THE ARCHITECTURE

Separate concerns.
Hash everything.

The fix turned out to be straightforward once I stopped trying to make the LLM deterministic and instead removed the LLM from the deterministic path entirely.

LLM does intent classification only

Natural language ("外壁塗装 30坪", i.e. "exterior wall painting, 30 tsubo") → canonical category key ("paint_exterior_30tsubo")

↓

Deterministic formula computes pricing

87 plans × 7 categories, static dataset (CC-BY 4.0). Pure function of (region, category, year, month).

↓

War Price Coefficient applied

Monthly correction factor from the Bank of Japan Corporate Goods Price Index (CGPI) + manufacturer announcements. Currently ×1.0935 (Iran/Hormuz oil shock).

↓

SHA-256 hash of canonical inputs

hash(region + category + quote + benchmark + year-month) → first 12 hex chars on PDF.

// AUDIT HASH ON SAMPLE PDF

538811dcd9fd

Any party with the same inputs derives the same hash. The contractor can verify it in 30 seconds, on a public chat interface, in front of the homeowner. No "trust me" required.
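Steps 2 and 3 of the pipeline above can be sketched as a single pure function. This is a minimal illustration, not the production code: the table contents, the "osaka" region key, the "2026-01" month, and the choice to store the coefficient as an integer scaled by 10,000 (so no floating-point rounding can creep in) are all assumptions made for the example.

```javascript
// Hypothetical benchmark table: yen per plan, keyed by "region|category".
// The real dataset schema (data/zaisai-db*.json) is not shown in this document.
const BENCHMARKS = Object.freeze({
  "osaka|paint_exterior_30tsubo": 1000000, // illustrative figure, not real data
});

// Hypothetical coefficient table keyed by "YYYY-MM". Values are integers
// scaled by 10,000, so a coefficient of x1.0935 is stored as 10935.
const COEFFICIENTS_X10000 = Object.freeze({
  "2026-01": 10935, // the month is assumed for illustration
});

// Pure function: no clock, no randomness, no network, no LLM.
function benchmarkPrice(region, category, year, month) {
  const yearMonth = `${year}-${String(month).padStart(2, "0")}`;
  const base = BENCHMARKS[`${region}|${category}`];
  const coeff = COEFFICIENTS_X10000[yearMonth];
  if (base === undefined || coeff === undefined) {
    throw new Error(`no benchmark for ${region}/${category}/${yearMonth}`);
  }
  // Integer multiply, then a single rounding step back to whole yen.
  return Math.round((base * coeff) / 10000);
}

console.log(benchmarkPrice("osaka", "paint_exterior_30tsubo", 2026, 1)); // 1093500
```

Because the function reads only its arguments and two frozen tables, two parties running it with the same four inputs get the same yen amount.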

// TECH STACK

Built on boring
infrastructure.

// FRONTEND

Static HTML/CSS/JS. No framework. Hosted on Cloudflare Pages. Total landing-page weight: ~80 KB. Loads in <200 ms even on 3G.

// API LAYER

Cloudflare Workers (16 endpoints). LLM proxy uses Claude Sonnet 4 for intent classification only — no pricing logic touches the LLM.
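One way to keep the LLM confined to classification is to accept its reply only when it is exactly one key from a closed set. The sketch below assumes that approach; the function name and the contents of the set are hypothetical, apart from the one key quoted on this page.

```javascript
// Hypothetical closed set of canonical keys. Only the first entry appears
// in this document; the real list would come from the dataset.
const ALLOWED_KEYS = new Set([
  "paint_exterior_30tsubo",
  // ...further keys omitted
]);

// Accept the model's reply only if, after trimming, it is an allowed key.
// Anything else (prose, prices, unknown keys) is rejected as null.
function resolveCategoryKey(llmReply) {
  const key = String(llmReply).trim().toLowerCase();
  return ALLOWED_KEYS.has(key) ? key : null;
}

console.log(resolveCategoryKey("  paint_exterior_30tsubo\n")); // "paint_exterior_30tsubo"
console.log(resolveCategoryKey("around 1.2 million yen"));      // null
```

With a gate like this, a varying LLM reply can change which key is selected, or be rejected outright, but it cannot inject a number into the pricing path.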

// DATASET

JSON files in GitHub (data/zaisai-db*.json). 3,350 material items, 87 standard renovation plans, 7 work categories. CC-BY 4.0.

// PRICE SYNC

Cron-triggered Worker pulls Bank of Japan CGPI + manufacturer announcements monthly. Computes War Price Coefficient. Human-in-the-loop approval before publishing.
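The human-in-the-loop step can be pictured as a small state gate: a computed coefficient starts as a proposal and only becomes usable once a named reviewer approves it. The record shape and function names below are hypothetical, and the formula that derives the coefficient from CGPI is not described here, so this sketch covers only the approval gate.

```javascript
// Hypothetical record for a monthly coefficient proposal.
function proposeCoefficient(yearMonth, value) {
  return Object.freeze({ yearMonth, value, status: "pending", approvedBy: null });
}

// Approval returns a new frozen record; the proposal itself is never mutated.
function approve(proposal, reviewer) {
  if (!reviewer) throw new Error("a named reviewer is required");
  return Object.freeze({ ...proposal, status: "approved", approvedBy: reviewer });
}

// The pricing path would read coefficients only through this function,
// so an unapproved value can never reach a report.
function publishedCoefficient(record) {
  if (record.status !== "approved") {
    throw new Error(`coefficient for ${record.yearMonth} is not approved`);
  }
  return record.value;
}

const proposal = proposeCoefficient("2026-01", 1.0935); // month assumed for illustration
const approved = approve(proposal, "reviewer-1");
console.log(publishedCoefficient(approved)); // 1.0935
```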

// HASH FUNCTION

SHA-256(canonical_input_string), truncated to the first 12 hex characters (48 bits). Stable across regenerations. Implementation: ~20 lines of JavaScript.
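A hash is only stable if the canonical string is. The sketch below shows one possible canonicalization, and every rule in it is an assumption: fixed field order, a "|" separator, NFKC-normalized and trimmed text, integer yen amounts, and a zero-padded "YYYY-MM". The project's actual canonical format is defined in the repository, so this code is not expected to reproduce the published sample hash.

```javascript
const { createHash } = require("node:crypto");

// Assumed canonical form: region|category|quote|benchmark|YYYY-MM
function canonicalInput({ region, category, quote, benchmark, year, month }) {
  // NFKC folds full-width and half-width variants to one form before hashing.
  const text = (s) => String(s).normalize("NFKC").trim();
  const yen = (n) => {
    if (!Number.isInteger(n)) throw new TypeError("amounts must be integer yen");
    return String(n);
  };
  const yearMonth = `${year}-${String(month).padStart(2, "0")}`;
  return [text(region), text(category), yen(quote), yen(benchmark), yearMonth].join("|");
}

// SHA-256 over the UTF-8 bytes, truncated to the first 12 hex characters.
function auditHash(inputs) {
  return createHash("sha256")
    .update(canonicalInput(inputs), "utf8")
    .digest("hex")
    .slice(0, 12);
}

const sample = {
  region: "osaka", category: "paint_exterior_30tsubo", // illustrative values
  quote: 2200000, benchmark: 1093500, year: 2026, month: 1,
};
console.log(canonicalInput(sample)); // osaka|paint_exterior_30tsubo|2200000|1093500|2026-01
console.log(auditHash(sample));      // 12 lowercase hex characters
```

This version uses Node's `createHash` so it can be run from a terminal. Inside a Worker, the Web Crypto call `crypto.subtle.digest("SHA-256", bytes)` would compute the same digest.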

// COMPLIANCE

Designed to align with EU AI Act Art. 12 (record-keeping) and Art. 14 (human oversight), even though Japanese consumer-protection AI isn't currently in Annex III scope. It felt right for adversarial settings.

// AUTHOR

Why I built this.

I'm Toshikatsu Oga. I'm 48. I started as a carpenter's apprentice in Osaka at 15, in 1993.

I spent 30 years on construction sites — first as a carpenter, then site supervisor, then construction manager (the role where you represent the homeowner against contractors). I've watched homeowners get systematically overcharged for three decades. 20% margin overcharges are normal. 50% happens. 100% isn't rare.

The Japanese term for the worst pattern is "一式" (isshiki) — a single line item meaning "everything bundled together." It's how contractors hide $5,000–$10,000 in margin under one opaque number. There's no legal protection. Architects represent contractors. Lawyers don't know construction. Consumer protection agencies don't have technical expertise.

Two years ago, at 46, I started learning to code. Python first, then JavaScript. Then LLMs and Cloudflare Workers. Then cryptography. The goal was simple: build a tool that actually works at the moment of negotiation, not just produces a nice-looking PDF.

The hardest part wasn't the code. It was realizing the LLM's nondeterminism was structural, not fixable, and that I needed to architect around it instead of fighting it.

Honestly, learning to ship code at 46 was harder than learning to frame a house at 16. I still don't know what I'm doing half the time. But the system runs.

30 YEARS ON SITE

2 YEARS CODING

87 PLANS, OPEN DATA

3,350 MATERIAL ITEMS

// TRY IT

Run a diagnostic.

The live service is in Japanese (the dataset and target users are Japanese), but the architecture is what's portable. The hash will be identical for identical inputs.

For developers

Clone the repo, run the hash function on a sample input, and verify that the output matches the hash published in the engrXiv paper.
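The check itself is a recompute-and-compare. The helper below is a hypothetical stand-in for whatever the repository provides: it takes an already-canonical input string and the 12 characters printed on a report.

```javascript
const { createHash } = require("node:crypto");

// Recompute the 12-character prefix and compare it with the printed value.
// The printed value is trimmed and lowercased so a retyped hash still matches.
function matchesPrintedHash(canonicalInput, printedHash) {
  const recomputed = createHash("sha256")
    .update(canonicalInput, "utf8")
    .digest("hex")
    .slice(0, 12);
  return recomputed === String(printedHash).trim().toLowerCase();
}

// Sanity check with the standard SHA-256 test vector for "abc",
// whose digest begins ba7816bf8f01...
console.log(matchesPrintedHash("abc", "ba7816bf8f01")); // true
console.log(matchesPrintedHash("abd", "ba7816bf8f01")); // false
```

A single changed character in the canonical input produces a different prefix, which is what makes a mismatch visible at a glance.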

sha256(canonical_input) → first 12 hex chars

For curious humans

The Japanese live service is at shield.the-horizons-innovation.com. Send a quote photo via LINE, get a diagnostic in 30 seconds with an audit hash printed on the PDF.

LINE: @172piime (Japanese only)

For verification.

GitHub Repository

engrXiv Preprint

ORCID Profile
