Hardagent — Sovereign Silicon for Open Inference

§ 01 — The accelerators

Three dies. Each one embodies a model.

We do not ship a single part. We compile a model class into the silicon that runs it best — a dense transformer engine, a feature-matching fabric, a fused multimodal pipeline.

HA-L

Large Language Models

transformer · dense

SRAM-heavy decode, HBM-heavy prefill. Bit-exact against your frozen checkpoint.

HA-T

Tversky Neural Nets

feature-matching · similarity

Asymmetric comparison fabric for retrieval, ranking and similarity at scale.
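
To make "asymmetric comparison" concrete, here is a minimal sketch of the classic set-based Tversky index. It illustrates the similarity measure the HA-T name refers to and is not a description of the HA-T fabric itself; the function name, the feature sets and the `alpha` / `beta` values are assumptions chosen for the example.

```python
def tversky_index(a: set, b: set, alpha: float = 0.8, beta: float = 0.2) -> float:
    """Tversky similarity of `a` (the query) to `b` (the reference).

    alpha weights features found only in `a`, beta weights features found
    only in `b`. With alpha != beta the score depends on argument order.
    """
    common = len(a & b)
    only_a = len(a - b)
    only_b = len(b - a)
    denom = common + alpha * only_a + beta * only_b
    # Convention assumed here: two empty sets count as identical.
    return common / denom if denom else 1.0


# Illustrative feature sets (hypothetical data).
query = {"red", "round", "small"}
doc = {"red", "round", "small", "sweet", "fruit"}

forward = tversky_index(query, doc)   # query compared against doc
backward = tversky_index(doc, query)  # doc compared against query
print(f"query->doc: {forward:.3f}  doc->query: {backward:.3f}")
```

Setting `alpha = beta = 1` reduces the score to the symmetric Jaccard index, which is one way to check the asymmetry is coming from the weights and not from the data.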

HA-M

Multimodal

vision · audio · text

Fused front-end pipeline that keeps tokens on-die across modalities.

§ 02 — Portfolio

One thesis. Three ways in.

Full custom is the flagship — but it shouldn't be the only door. A hybrid portfolio broadens the funnel and turns one-off tape-outs into recurring revenue.

Reference · MPW

Standard

Shared mask · fastest path

Pre-validated reference designs for popular model families
Multi-project wafer slots lower the entry cost dramatically
Single-appliance minimum — built for teams, not just enterprises
Quarterly derivative updates as models refresh

Full custom

Bespoke

Your checkpoint, your die

Workload-profiled floorplan sized to your real traffic
Bit-exact determinism against your frozen reference
Sealed enclosure, network isolation, source-code escrow
Rack-to-fab-line scale with custody and warranty silicon

Analog tier

Quiet

Power-constrained · edge

Compute-in-memory dies for always-on, low-power inference
Frozen weights embodied directly in the analog fabric
Built on specialty / mature nodes — off the leading-edge queue
For edge, defense, automotive and air-gapped deployments

§ 03 — Analog & photonic

Past the digital wall: compute in the physics.

Matrix multiply is most of inference, and matrix multiply is something physics will do for nearly free. Analog in-memory and photonic substrates trade a little precision for order-of-magnitude efficiency — and they don't compete for leading-edge foundry slots.

Custom · weights-as-fabric

Embodied analog ASIC

Your frozen weights aren't loaded onto the chip — they are the chip. Charge-domain or memristor in-memory compute holds the model in place, so a sealed unit answers without ever moving a weight across a bus. The "weights are the program" thesis, taken to its physical conclusion.

Efficiency envelope: 10–100× vs. digital on MAC-bound work
Effective precision: 4–8 bit, calibrated
Best for: edge · always-on · air-gapped sealed appliances
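
As a rough feel for what 4–8 bit effective precision means for stored weights, the sketch below applies symmetric uniform quantization to random weights and reports the RMS error at 4, 6 and 8 bits. It models rounding only; analog noise, drift and calibration are not represented, and the weight distribution and function names are assumptions for illustration.

```python
import random


def quantize(values, bits):
    """Round values to signed `bits`-bit levels and return (dequantized, scale)."""
    levels = 2 ** (bits - 1) - 1          # e.g. 7 for 4 bit, 127 for 8 bit
    peak = max(abs(v) for v in values)
    if peak == 0:
        return list(values), 0.0
    scale = peak / levels                 # step between adjacent levels
    return [round(v / scale) * scale for v in values], scale


def rms_error(values, bits):
    """Root-mean-square difference between values and their quantized form."""
    quantized, _ = quantize(values, bits)
    return (sum((v - q) ** 2 for v, q in zip(values, quantized)) / len(values)) ** 0.5


random.seed(0)
# Hypothetical weights: 1000 samples from a unit Gaussian.
weights = [random.gauss(0.0, 1.0) for _ in range(1000)]

for bits in (4, 6, 8):
    print(f"{bits}-bit RMS error: {rms_error(weights, bits):.5f}")
```

Each extra bit roughly halves the step size, so the error falls quickly between 4 and 8 bits; whether a given model tolerates the low end is something only evaluation against the frozen checkpoint can answer.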

General purpose · reconfigurable

Analog hypercompute fabric

A field-programmable analog array — FPGA-like, but the cells do math in the analog domain — paired with a digital control plane for precision and orchestration. One platform for many models, plus scientific simulation, optimization and physics-informed AI beyond LLMs.

Architecture: hybrid digital ⇄ analog, calibration in-loop
Workloads: inference · simulation · optimization
Substrate: specialty CMOS & thin-film photonics · GlobalFoundries

Industry context, not a Hardagent claim: photonic analog processors have demonstrated ~99.7% inference precision with up to ~30× lower energy in independent labs, and analog in-memory accelerators are now shipping at petaops-class density in single-digit-watt envelopes. We treat these as the efficiency ceiling we engineer toward, validated against first silicon before any number ships on a spec sheet.

§ 04 — Thesis

General-purpose silicon is paying for flexibility you don't need.

The weights are the program.

Open-weight models are fully specified, frozen, and yours to embody in any substrate — digital, analog, or optical.

Inference is not training.

Training needs gradient flow and experimentation. Inference needs one numerical recipe at maximum efficiency. Different problem, different machine.

Sealed is the new private.

If the model fits in a sealed appliance with no network egress, privacy stops being a policy and becomes a wiring diagram.

One supplier is one failure.

The cloud was a single point of pricing. A single foundry is a single point of geopolitics. Ownership means neither dependency holds your roadmap hostage.

§ 05 — Foundry resilience

Designed once. Built anywhere.

Leading-edge capacity is the tightest constraint in the industry: advanced packaging is booked out past a year and HBM is sold out for the year. A single-foundry roadmap is a liability we engineer away — node-agnostic RTL, chiplets, and qualification on parallel fabs.

Path	Role in the stack	Status
TSMC N6 / N5 → N2	Primary compute die, flagship builds	Lead path
Samsung SF2	Second-source leading edge for compute dies	Qualifying
Intel 18A	North-American supply alternative, sub-2nm	Qualifying
GlobalFoundries / SMIC	I/O, PMIC, support dies on mature nodes	Available
Specialty / TF-LiNbO₃	Analog CIM & photonic fabric	Partner-fab
Packaging — CoWoS / EMIB / FOPLP / OSAT	2.5D multi-die + HBM integration, de-risked across vendors	Multi-source

Resilience playbook → node-agnostic RTL and standard-cell discipline so a design retargets in months, not years · chiplets to mix nodes and isolate the leading-edge die from I/O · packaging diversification beyond a single CoWoS queue · multi-sourced HBM with capacity reservations · fallback designs on older nodes for continuity · geographic spread across friendly fabs. Qualifying a new foundry takes 12–24 months — which is exactly why parallel qualification starts now.

§ 06 — Process

From open weights to sealed silicon, in seven steps.

/ model

Model selection & freeze.

Pick a model, a quantization, a context window. That checkpoint is now your product.

/ profile

Workload profiling.

We instrument your real traffic — token mixes, batch shapes, latency targets. Silicon sized to the workload, not the spec sheet.
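
A minimal sketch of what such a profile could summarize, assuming a request log with the hypothetical fields shown below (the field names and sample values are illustrative, not a Hardagent log format). It reports the prefill share of tokens, the batch-shape histogram and latency percentiles.

```python
from collections import Counter
from statistics import median, quantiles

# Hypothetical request log; field names are illustrative only.
requests = [
    {"prompt_tokens": 512, "output_tokens": 128, "batch": 1, "latency_ms": 240.0},
    {"prompt_tokens": 2048, "output_tokens": 256, "batch": 1, "latency_ms": 900.0},
    {"prompt_tokens": 128, "output_tokens": 512, "batch": 4, "latency_ms": 1500.0},
    {"prompt_tokens": 1024, "output_tokens": 64, "batch": 8, "latency_ms": 400.0},
]


def profile(log):
    """Summarize token mix, batch shapes and latency for a list of requests."""
    prompt = sum(r["prompt_tokens"] for r in log)
    output = sum(r["output_tokens"] for r in log)
    latencies = [r["latency_ms"] for r in log]
    return {
        # Fraction of all tokens that are prompt (prefill) tokens.
        "prefill_share": prompt / (prompt + output),
        # How often each batch size occurs.
        "batch_shapes": Counter(r["batch"] for r in log),
        "p50_latency_ms": median(latencies),
        # Last of 19 cut points is the 95th percentile.
        "p95_latency_ms": quantiles(latencies, n=20, method="inclusive")[-1],
    }


summary = profile(requests)
for key, value in summary.items():
    print(key, value)
```

A prefill-heavy mix and a decode-heavy mix pull the memory hierarchy in different directions, which is why the token split is the first number in the summary.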

/ floorplan

Architecture & substrate.

Layer mapping, dataflow, memory hierarchy — and a digital / analog / photonic substrate decision driven by the power budget.

/ rtl

RTL, retarget & verification.

Bit-exact match against your frozen reference, written portable so it can tape out on more than one foundry.
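
Bit-exact is a stricter bar than "within tolerance". The sketch below shows one way such a check could look, assuming outputs are available as float32 values: it hashes the raw bit patterns so that any single-bit difference fails. The helper names and sample values are hypothetical, and a real flow would compare in whatever numeric format the frozen checkpoint uses.

```python
import hashlib
import struct


def digest(values):
    """SHA-256 over the little-endian IEEE-754 float32 bit patterns of `values`."""
    return hashlib.sha256(struct.pack(f"<{len(values)}f", *values)).hexdigest()


def bit_exact(reference, candidate):
    """True only if both outputs have the same length and identical bits."""
    return len(reference) == len(candidate) and digest(reference) == digest(candidate)


def within_tolerance(reference, candidate, tol=1e-5):
    """Looser check, shown for contrast: elementwise absolute difference."""
    return len(reference) == len(candidate) and all(
        abs(r - c) <= tol for r, c in zip(reference, candidate)
    )


# Hypothetical outputs from the frozen reference and from a design under test.
reference = [0.1, 0.2, 0.3]
matching = [0.1, 0.2, 0.3]
drifted = [0.1, 0.2, 0.3 + 1e-7]

print(bit_exact(reference, matching))        # identical bits
print(within_tolerance(reference, drifted))  # passes the loose check
print(bit_exact(reference, drifted))         # fails the strict check
```

Note that a bit-level comparison also distinguishes `0.0` from `-0.0`, which a numeric comparison would treat as equal.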

/ tape-out

Tape-out & first silicon.

Mask set, MPW or full reticle, first wafers — on the qualified fab with available capacity.

/ appliance

Appliance integration.

Board, chassis, firmware, network isolation. A sealed unit that boots your model and serves nothing else.

/ deliver

Delivery, custody & lifecycle.

Bonded transport, tamper-evident seals, source-code escrow — plus an update path for the day your checkpoint moves.

§ 07 — Software & services

The chip is the start. The relationship is the business.

Silicon is a one-time sale; lifecycle is recurring revenue. We wrap the appliance in the services that keep it current, trusted, and integrated.

⟳

Model update service

Re-tape-out or firmware path when your checkpoint advances. Your hardware doesn't go stale.

⌘

Verified model marketplace

A catalogue of open models pre-optimized and validated for the Hardagent fabric.

⚿

Security & audit

Independent audits, escrow, and attestation for the regulated and the sovereign.

⊞

Turnkey integration

Cooling, networking, orchestration — racks and clusters that drop into your facility.

Open peripheral IP — interfaces and firmware — to seed a developer community, the way RISC-V grew, while the core compute fabric stays proprietary.

§ 08 — The math

Why fixed-function beats general-purpose on inference.

We delete every transistor that does not multiply a weight or move a token — and the analog line deletes the bus itself.

Lead process: TSMC N6 (N5 / N2 opt.)
Second source: Samsung SF2 · Intel 18A
Die area: 815 mm²
Transistor count: 53 B
On-package memory: 192 GB HBM3e
Memory bandwidth: 5.2 TB/s
Peak throughput: 1.8 PFLOPS FP8
TDP: 700 W
Appliance power: 2.5 kW (8-card)
First silicon: Q3 2026 (target)

Metric	GPU baseline	HA-01 · DeepSeek build	Δ
Inference-relevant die area	~35%	~92%	2.6×
Tokens / sec / watt (batch=1)	0.45	14.2	32×
Idle power (model resident)	~280 W	~40 W	7×
Time-to-first-token, 32k ctx	2.4 s	0.31 s	7.7×
5-yr TCO (1B tok/day)	$18.2 M	$3.9 M	4.7×

Figures are simulation targets for the HA-01 reference design; independent silicon validation pending first wafers. Baseline: H100 SXM5 at MLPerf Inference reference settings. Analog-tier efficiency figures are engineering targets, not yet measured silicon.
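
The Δ column is a plain ratio of the two value columns, taken in whichever direction counts as an improvement. A short sketch that recomputes it from the table's numbers; the row labels with units and the `higher_is_better` flags are our reading of each metric, and the inputs remain simulation targets, not measurements.

```python
# (metric, GPU baseline, HA-01 target, higher_is_better)
rows = [
    ("Inference-relevant die area (%)", 35.0, 92.0, True),
    ("Tokens / sec / watt (batch=1)", 0.45, 14.2, True),
    ("Idle power, model resident (W)", 280.0, 40.0, False),
    ("Time-to-first-token, 32k ctx (s)", 2.4, 0.31, False),
    ("5-yr TCO, 1B tok/day ($M)", 18.2, 3.9, False),
]


def improvement(baseline, target, higher_is_better):
    """Ratio oriented so that a value above 1 means the target is better."""
    return target / baseline if higher_is_better else baseline / target


deltas = {name: improvement(b, t, hib) for name, b, t, hib in rows}

for name, delta in deltas.items():
    print(f"{name}: {delta:.1f}x")
```

Because several inputs are themselves approximate (marked with ~), the ratios carry no more precision than the least certain figure in each row.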

§ 09 — Engage

If your inference budget exceeds your headcount, we should talk.

Beyond the flagship cohort, we now meet you where your balance sheet is — buy, lease, or co-fund capacity. Sovereign-AI and on-prem hyperscaler programs welcome.

Request workload review · Join the team

Cohort — four custom customers per fabrication run · MPW reference tier — single-appliance minimum
Commercials — purchase · multi-year lease · financed appliances · customer-funded fab capacity
Programs — sovereign AI · on-prem for hyperscalers · regulated & air-gapped · defense & automotive edge

§ 10 — Careers

Wanted: people who have shipped real silicon.

Physical-design, RTL, packaging, analog and firmware engineers with first-silicon experience. Equity is generous; optics are not.

RTL

Senior RTL — Tensor Core

Own the MAC array and dataflow. SystemVerilog; formal a plus.

Physical Design Lead

Floorplan, P&R, timing on an 800+ mm² die. Bring war stories.

ANALOG

Mixed-Signal / CIM

Charge-domain compute-in-memory and calibration circuits. Rare skill, welcome here.

PKG

Packaging & Substrate

2.5D across CoWoS/EMIB/OSAT, HBM stacking, thermal. Talk to foundries directly.