// HORIZON SHIELD · 2026

Tamper-proof AI
for construction
quotes.

A deterministic cost diagnostic system. Same input, same output, every time, verified by a 12-character audit hash (the first 12 hex characters of a SHA-256 digest) printed on every report.

Built by a carpenter with 30 years on construction sites to fight construction fraud in Japan.

Open dataset · CC-BY 4.0 // engrXiv DOI: 10.31224/7007 // EU AI Act-aligned
// THE PROBLEM

LLMs give different
answers each time.

That's a problem when a homeowner is negotiating a $20,000 renovation quote with a contractor. The AI's "estimate" becomes worthless at the exact moment it's needed most.

The negotiation collapse

A homeowner gets a $22,000 quote for exterior repainting. They suspect overcharging. They ask ChatGPT for a fair price range. ChatGPT says "around $9,500–$14,500."

The homeowner shows this to the contractor. The contractor opens ChatGPT, asks the same question, gets "around $11,000–$16,000." Different answer.

The contractor laughs: "AI numbers can't be trusted. We've been in business 30 years." The diagnostic is dead.

The structural issue

This isn't a prompt engineering problem. It's an architectural one. LLMs are probabilistic by design: they sample each token from a distribution, so the same query, run twice, can produce different completions. That's how they work.

For creative writing or code generation, this is fine. For adversarial settings — where two parties with opposing interests need a shared, verifiable reference — it's catastrophic.

I needed a diagnostic that any party could re-derive on the spot, byte-for-byte identical, and verify with a single glance.

// Contractor's actual rebuttal:
"Ask the AI again. You'll get a different number. So which one is right? Yours, or mine? Neither. That's why you should trust me — I've been in this business 30 years."
// THE ARCHITECTURE

Separate concerns.
Hash everything.

The fix turned out to be straightforward once I stopped trying to make the LLM deterministic and instead removed the LLM from the deterministic path entirely.

01
LLM does intent classification only
Natural language ("外壁塗装 30坪", i.e. "exterior wall painting, 30 tsubo") → canonical category key ("paint_exterior_30tsubo")
02
Deterministic formula computes pricing
87 plans across 7 categories, static dataset (CC-BY 4.0). Pure function of (region, category, year, month).
03
War Price Coefficient applied
Monthly correction factor from Bank of Japan CGPI + manufacturer announcements. Currently ×1.0935 (Iran/Hormuz oil shock).
04
SHA-256 hash of canonical inputs
hash(region + category + quote + benchmark + year-month) → first 12 hex chars on PDF.
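Steps 02 and 03 can be sketched as a table lookup followed by one multiplication. This is a minimal illustration, not the repository's code: the function name `benchmarkPrice`, the table layout, the base price, and the month assigned to the ×1.0935 coefficient are all assumptions made up for the example.

```javascript
// Hypothetical base prices in yen. The real values live in the open dataset.
const BASE_PRICE_YEN = Object.freeze({
  "osaka|paint_exterior_30tsubo": 1000000,
});

// Hypothetical monthly coefficient table, keyed by "YYYY-MM".
const WAR_PRICE_COEFFICIENT = Object.freeze({
  "2026-01": 1.0935,
});

// Pure function of (region, category, year, month): no clock, no network,
// no randomness, so identical arguments always give an identical result.
function benchmarkPrice(region, category, year, month) {
  const yearMonth = `${year}-${String(month).padStart(2, "0")}`;
  const base = BASE_PRICE_YEN[`${region}|${category}`];
  const coefficient = WAR_PRICE_COEFFICIENT[yearMonth];
  if (base === undefined || coefficient === undefined) {
    throw new Error(`no data for ${region}/${category}/${yearMonth}`);
  }
  // Round to whole yen so floating-point noise never reaches the report.
  return Math.round(base * coefficient);
}

console.log(benchmarkPrice("osaka", "paint_exterior_30tsubo", 2026, 1)); // 1093500
```

Because both tables are static data, anyone holding the same dataset version can recompute the benchmark without calling a model.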
// AUDIT HASH ON SAMPLE PDF
538811dcd9fd
Any party with the same inputs derives the same hash. The contractor can verify it in 30 seconds, on a public chat interface, in front of the homeowner. No "trust me" required.
// TECH STACK

Built on boring
infrastructure.

// FRONTEND

Static HTML/CSS/JS. No framework. Hosted on Cloudflare Pages. Total landing-page weight: ~80KB. Loads in <200ms even on 3G.

// API LAYER

Cloudflare Workers (16 endpoints). LLM proxy uses Claude Sonnet 4 for intent classification only — no pricing logic touches the LLM.
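One way to keep the LLM out of the pricing path is to accept its reply only when it matches a closed list of category keys. The sketch below assumes such a guard exists; the function name `toCanonicalKey` and the contents of `CATEGORY_KEYS` are hypothetical, and no actual model call is shown.

```javascript
// Hypothetical allow-list. The real keys would come from the dataset.
const CATEGORY_KEYS = new Set(["paint_exterior_30tsubo"]);

// Accept the classifier's reply only if it is exactly a known key.
// Anything else (free text, a number, a price) is rejected, so the
// model can select a category but can never inject a figure.
function toCanonicalKey(llmReply) {
  const key = String(llmReply).trim().toLowerCase();
  return CATEGORY_KEYS.has(key) ? key : null;
}

console.log(toCanonicalKey("  Paint_Exterior_30tsubo\n")); // "paint_exterior_30tsubo"
console.log(toCanonicalKey("about 1,200,000 yen"));        // null
```

A `null` result would be handled outside the deterministic path, for example by asking the user to pick a category by hand.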

// DATASET

JSON files in GitHub (data/zaisai-db*.json). 3,350 material items, 87 standard renovation plans, 7 work categories. CC-BY 4.0.

// PRICE SYNC

Cron-triggered Worker pulls Bank of Japan CGPI + manufacturer announcements monthly. Computes War Price Coefficient. Human-in-the-loop approval before publishing.
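The approval gate can be pictured as a small state change: a computed coefficient starts as a pending draft and can only enter the published table once a named reviewer has approved it. The record shape and the function names `proposeCoefficient`, `approve`, and `publish` are assumptions for illustration. The source states only that human approval precedes publishing, and how the coefficient itself is derived from CGPI data is not shown here.

```javascript
// Create a pending draft for one month. Validates shape only.
function proposeCoefficient(yearMonth, value) {
  if (!/^\d{4}-(0[1-9]|1[0-2])$/.test(yearMonth)) {
    throw new Error(`bad year-month: ${yearMonth}`);
  }
  if (!Number.isFinite(value) || value <= 0) {
    throw new Error(`bad coefficient: ${value}`);
  }
  return Object.freeze({ yearMonth, value, status: "pending", approvedBy: null });
}

// A human reviewer signs off. Returns a new record, the draft is untouched.
function approve(draft, reviewer) {
  if (!reviewer) throw new Error("approval needs a named reviewer");
  return Object.freeze({ ...draft, status: "approved", approvedBy: reviewer });
}

// Only approved records may be added to the published table.
function publish(table, record) {
  if (record.status !== "approved") {
    throw new Error(`${record.yearMonth} is not approved`);
  }
  return Object.freeze({ ...table, [record.yearMonth]: record.value });
}

const draft = proposeCoefficient("2026-01", 1.0935);
const published = publish({}, approve(draft, "reviewer-1"));
console.log(published["2026-01"]); // 1.0935
```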

// HASH FUNCTION

SHA-256(canonical_input_string), truncated to the first 12 hex characters. Stable across regenerations. Implementation: ~20 lines of JavaScript.
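A minimal sketch of such a hash function follows, written for Node.js so it runs locally. The field order comes from the architecture description above; the `|` delimiter, the field names, and the helper names `shortSha256` and `auditHash` are assumptions, so this will not necessarily reproduce hashes printed by the live service. Inside a Cloudflare Worker the same digest would more likely go through Web Crypto's `crypto.subtle.digest`.

```javascript
const { createHash } = require("crypto");

// First 12 hex characters of the SHA-256 digest of a UTF-8 string.
function shortSha256(text) {
  return createHash("sha256").update(text, "utf8").digest("hex").slice(0, 12);
}

// Build the canonical string in a fixed field order, so that the order of
// keys in the input object never changes the result.
// The "|" delimiter is an assumption, not the published format.
function canonicalInput({ region, category, quote, benchmark, yearMonth }) {
  return [region, category, String(quote), String(benchmark), yearMonth].join("|");
}

function auditHash(inputs) {
  return shortSha256(canonicalInput(inputs));
}

// Sanity check against the standard SHA-256 test vector for "abc".
console.log(shortSha256("abc")); // "ba7816bf8f01"

// Hypothetical report inputs: amounts in yen.
console.log(auditHash({
  region: "osaka",
  category: "paint_exterior_30tsubo",
  quote: 2200000,
  benchmark: 1093500,
  yearMonth: "2026-01",
}));
```

Twelve hex characters carry 48 bits, which is short enough to read aloud across a table. That length makes the hash a quick consistency check between two parties rather than a full-strength cryptographic commitment.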

// COMPLIANCE

Designed to align with EU AI Act Art. 12 (record-keeping) and Art. 14 (human oversight), even though Japanese consumer-protection AI isn't currently in Annex III scope. It felt right for adversarial settings.

// AUTHOR

Why I built this.

I'm Toshikatsu Oga. I'm 48. I started as a carpenter's apprentice in Osaka at 15, in 1993.

I spent 30 years on construction sites — first as a carpenter, then site supervisor, then construction manager (the role where you represent the homeowner against contractors). I've watched homeowners get systematically overcharged for three decades. 20% margin overcharges are normal. 50% happens. 100% isn't rare.

The Japanese term for the worst pattern is "一式" (isshiki) — a single line item meaning "everything bundled together." It's how contractors hide $5,000–$10,000 in margin under one opaque number. There's no legal protection. Architects represent contractors. Lawyers don't know construction. Consumer protection agencies don't have technical expertise.

Two years ago, at 46, I started learning to code. Python first, then JavaScript. Then LLMs and Cloudflare Workers. Then cryptography. The goal was simple: build a tool that actually works at the moment of negotiation, not one that just produces a nice-looking PDF.

The hardest part wasn't the code. It was realizing the LLM's nondeterminism was structural, not fixable, and that I needed to architect around it instead of fighting it.

Honestly, learning to ship code at 46 was harder than learning to frame a house at 16. I still don't know what I'm doing half the time. But the system runs.

30 YEARS ON SITE
2 YEARS CODING
87 PLANS · OPEN DATA
3,350 MATERIAL ITEMS
// RESOURCES

For verification.

Everything is open. Read the paper, audit the dataset, run the code. The whole point is that you don't have to take my word for it.

// CODE + DATA

GitHub Repository

The full dataset (87 plans, 3,350 items), worker source, and hash function implementation. CC-BY 4.0 for data, MIT for code.

github.com/ogasurfproject-jpg/japan-construction-cost-database →
// TECHNICAL PAPER

engrXiv Preprint

"Japan Construction Cost Database: An Open Dataset for LLM-Based Cost Estimation and Fraud Detection." DOI: 10.31224/7007.

engrxiv.org/preprint/view/7007 →
// AUTHOR

ORCID Profile

Author identity verification. ORCID ID: 0009-0000-9180-903X. Linked to engrXiv and Zenodo records.

orcid.org/0009-0000-9180-903X →
// TRY IT

Run a diagnostic.

The live service is in Japanese (the dataset and target users are Japanese), but the architecture is what's portable. The hash will be identical for identical inputs.

For developers

Clone the repo, run the hash function on a sample input, and verify that the output matches the hash published in the engrXiv paper.

sha256(canonical_input) → first 12 hex chars
Clone the repo →

For curious humans

The Japanese live service is at shield.the-horizons-innovation.com. Send a quote photo via LINE, get a diagnostic in 30 seconds with an audit hash printed on the PDF.

LINE: @172piime (Japanese only)
See the live service (JP)