ML-model-01 / Linear A

Reading function before language.

A structural-deductive lab for Linear A administrative documents: role assignment, invariant templates, and arithmetic validation without claiming phonetic decipherment.

Functional reconstruction | No phonetic decipherment claim | Constraint-first audit
Central move: Function before language

The model asks what a record does as an administrative artifact before asking what its words sounded like.

Public caution: Not decipherment

Role labels such as AG, EN, QT, and OP are structural functions, not proposed phonetic values or lexical translations.

Verification rule: Arithmetic can veto

A plausible template is rejected or downgraded when totals, residuals, or damage policy do not support it.

51 Dwork documents
68 Dsum documents
417 Normalized lines
312 Lines with numerals
274 Template-matched lines
85 Lines with UNK
58 Excluded damaged lines
Inventory register

KU-RO GRA 12

Template match

AG-EN-QT

Actor or office, commodity field, and quantity.

Arithmetic validation

total-type

12 = 12
Residual 0

Exact arithmetic validation: the recorded total equals the sum of item quantities.
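
The check in this register example comes down to one subtraction. The following is a minimal sketch, assuming quantities are already parsed as integers. `residual` is a hypothetical helper name, and the item quantities are invented for illustration because the page shows only the total of 12.

```python
def residual(recorded_total: int, item_quantities: list[int]) -> int:
    """Recorded total minus the sum of the listed item quantities."""
    return recorded_total - sum(item_quantities)


# Illustrative item quantities only: the page shows the total (12), not the items.
items = [7, 5]
r = residual(12, items)

# A residual of zero is what the page calls exact (total-type) validation.
print("exact validation" if r == 0 else f"residual {r}")
```

Any nonzero residual would pass to the residual classification rather than being read as a total.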

Role system

ML-model-01 makes every token claim pass through a role test.

AG

Agent

A sign-group behaving like an administrative actor, source, person, office, or responsible unit.

82 token types, 179 lines
EN

Commodity

A commodity-like field or item class connected to lists, logograms, and numeric accounting contexts.

95 token types, 232 lines
QT

Quantity

Numeral tokens assigned by membership first, then used as strict arithmetic evidence.

149 token types, 312 lines
OP

Operation

A balancing, summary, or operation context used to classify residuals and accounting behavior.

43 token types, 74 lines
UNK

Unknown

A conservative bucket for near-threshold or structurally unstable tokens.

64 token types, 85 lines

Audit cockpit

The model's main strength is that it knows when not to read.

Selected audit node 01

Corpus split

Claim

Dwork and Dsum are deliberately separated, so template inference and arithmetic evaluation do not contaminate each other.

Evidence

Dwork has 51 template-grade documents; Dsum has 68 summary-bearing documents, including 17 records not used for template extraction.

Risk controlled

If the same unstable documents trained and validated the model, the result would look stronger than it is.

Verdict: Strong design choice

Template concentration

Three invariant templates account for most matched administrative lines.

Among the 274 fully matched lines, the top three templates account for 204 lines (94 + 62 + 48), or about 74.5% of matched structure in the reported working corpus.

AG-EN-QT: 94 lines / 22.5% of the 417 normalized lines

Actor or office, commodity field, and quantity: the core administrative register shape.

EN-QT-OP: 62 lines / 14.9% of the 417 normalized lines

Commodity, quantity, and operation marker: likely summary or balancing context.

AG-EN-QT-OP: 48 lines / 11.5% of the 417 normalized lines

Full administrative sequence: responsible unit, item, amount, and operation state.
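
The concentration figures can be recomputed from the counts on this page. This is a small sketch with hypothetical variable names. It treats the per-template percentages as shares of the 417 normalized lines, which is an inference from the arithmetic rather than something the page states.

```python
matched_lines = 274      # fully template-matched lines
normalized_lines = 417   # all normalized lines

template_lines = {"AG-EN-QT": 94, "EN-QT-OP": 62, "AG-EN-QT-OP": 48}

# Top-three coverage of the matched lines: 204 of 274.
top_three = sum(template_lines.values())
share_of_matched = 100 * top_three / matched_lines

# Per-template share of all normalized lines, rounded to one decimal place.
share_of_normalized = {
    name: round(100 * count / normalized_lines, 1)
    for name, count in template_lines.items()
}

print(top_three, f"{share_of_matched:.2f}%")
for name, pct in share_of_normalized.items():
    print(name, f"{pct}%")
```

The two denominators differ: the combined share is taken over matched lines, while the per-template shares only reproduce the page's figures when taken over all normalized lines.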

Threshold stability

Raising thresholds reduces commitments and increases unknowns.

theta    AG    EN    OP   UNK
0.60    104   113    67    31
0.70     91   103    54    47
0.75     82    95    43    64
0.80     71    83    32    78
0.85     59    70    24    91
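
The stated trend can be checked mechanically against the table. This is a minimal sketch that copies the five rows as given. The helper names are hypothetical, and "commitments" is taken here to mean the sum of AG, EN, and OP counts, which is an assumption about the page's wording.

```python
# theta -> (AG, EN, OP, UNK), copied from the threshold table.
table = {
    0.60: (104, 113, 67, 31),
    0.70: (91, 103, 54, 47),
    0.75: (82, 95, 43, 64),
    0.80: (71, 83, 32, 78),
    0.85: (59, 70, 24, 91),
}

thetas = sorted(table)


def strictly_decreasing(values):
    return all(a > b for a, b in zip(values, values[1:]))


def strictly_increasing(values):
    return all(a < b for a, b in zip(values, values[1:]))


# Assumption: a "commitment" is any AG, EN, or OP assignment.
committed = {t: sum(table[t][:3]) for t in thetas}
unknown = {t: table[t][3] for t in thetas}

print(strictly_decreasing([committed[t] for t in thetas]))
print(strictly_increasing([unknown[t] for t in thetas]))
```

Under that reading, both checks hold for every step of the table, not just on average.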

Dsum arithmetic outcomes

Classification is allowed to fail instead of forcing a reading.

49 Exact validation (total-type)
11 Structured residual (balance-type)
8 Unstructured mismatch (unverifiable)
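
The three outcomes suggest a simple decision order. The sketch below is an illustration, not the model's published procedure. `classify` and `Outcome` are hypothetical names, and the question of whether an operator context licenses a residual is reduced to a single boolean flag, which is a strong simplification.

```python
from enum import Enum


class Outcome(Enum):
    EXACT = "total-type"
    STRUCTURED_RESIDUAL = "balance-type"
    UNSTRUCTURED_MISMATCH = "unverifiable"


def classify(recorded_total, item_quantities, has_operator_context):
    """Classify one summary-bearing record by its arithmetic behavior."""
    residual = recorded_total - sum(item_quantities)
    if residual == 0:
        return Outcome.EXACT
    # Assumption: an OP context is what licenses a nonzero residual.
    if has_operator_context:
        return Outcome.STRUCTURED_RESIDUAL
    return Outcome.UNSTRUCTURED_MISMATCH


# The reported outcome counts cover all 68 Dsum documents.
outcome_counts = {Outcome.EXACT: 49, Outcome.STRUCTURED_RESIDUAL: 11,
                  Outcome.UNSTRUCTURED_MISMATCH: 8}
assert sum(outcome_counts.values()) == 68
```

The last branch is the one that matters for the page's argument: a mismatch with no licensing context is reported as unverifiable rather than explained away.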

Failure modes

A serious decipherment-adjacent system must publish its refusal rules.

Severe damage

More than 30% unreadable/uncertain token positions, damaged numeral field, or missing segmentation boundary.

Exclude from template inference; retain only low-confidence descriptive evidence where appropriate.
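
The severe-damage rule is a disjunction of three conditions, which can be written as a single predicate. This is a sketch under stated assumptions: the `LineRecord` fields and the function name are hypothetical, and treating a line with no token positions as damaged is a choice made here, not one the page specifies.

```python
from dataclasses import dataclass


@dataclass
class LineRecord:
    token_positions: int                 # total token positions on the line
    uncertain_positions: int             # unreadable or uncertain positions
    numeral_field_damaged: bool
    segmentation_boundary_missing: bool


def is_severely_damaged(rec: LineRecord, max_uncertain_share: float = 0.30) -> bool:
    """True when the line should be excluded from template inference."""
    if rec.token_positions == 0:
        return True  # assumption: an empty line carries no usable structure
    share = rec.uncertain_positions / rec.token_positions
    return (
        share > max_uncertain_share      # "more than 30%", so strictly greater
        or rec.numeral_field_damaged
        or rec.segmentation_boundary_missing
    )


print(is_severely_damaged(LineRecord(10, 3, False, False)))  # exactly 30%
print(is_severely_damaged(LineRecord(10, 4, False, False)))  # 40%
```

The strict comparison follows the wording "more than 30%": a line at exactly 30% is kept, while any one of the three conditions alone is enough to exclude.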

Ambiguous segmentation

Multiple plausible line segmentations satisfy local tokenization rules.

Enumerate alternatives and classify only if one dominates by constraint satisfaction.
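
One way to make "dominates by constraint satisfaction" concrete is to score each candidate segmentation and accept the best only when it clearly beats the runner-up. The sketch below assumes a score is the number of satisfied constraints and that dominance means a lead of at least `margin`. Both are illustrative assumptions, as is the function name.

```python
from typing import Optional, Sequence


def dominant_segmentation(scores: Sequence[int], margin: int = 1) -> Optional[int]:
    """Return the index of the dominant alternative, or None if none dominates.

    scores[i] is the number of constraints satisfied by alternative i
    (a hypothetical scoring; the page does not define one).
    """
    if not scores:
        return None
    if len(scores) == 1:
        return 0
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    best, runner_up = order[0], order[1]
    if scores[best] - scores[runner_up] >= margin:
        return best
    return None  # tie or near-tie: leave the line unclassified


print(dominant_segmentation([3, 5, 4]))
print(dominant_segmentation([5, 5]))
```

Returning `None` on a tie keeps the line out of classification, which matches the policy of refusing a reading when no alternative dominates.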

Arithmetic inconsistency

Recorded total does not match item sums and no operator context licenses the residual.

Mark as unverifiable instead of forcing an administrative reading.

Role instability

A token behaves differently across archives or falls near thresholds under sensitivity checks.

Keep as UNK or low-confidence; exclude from invariant template claims.

Method pipeline

Constraint-first inference keeps the model from becoming speculative decipherment.

  1. Tokenize and normalize documents into line-level records.
  2. Compute positional bias and numeral adjacency in Dwork.
  3. Assign primary roles by thresholded constraints: QT, OP, AG, EN, UNK.
  4. Extract invariant role templates from stable non-UNK roles.
  5. Promote limited UNK tokens once using high-support slot evidence.
  6. Validate totals, residuals, and balance states on Dsum.
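
Step 3 can be sketched as an ordered threshold test. The thresholds are the defaults quoted on this page. Everything else is an assumption made for illustration: the per-role scores in [0, 1], the use of `>=`, the role order after QT, and applying the support minimum at assignment time rather than only at promotion.

```python
# Default thresholds quoted on the page.
THETA = {"OP": 0.80, "AG": 0.75, "EN": 0.75}
MIN_SUPPORT = 10  # assumption: applied here as a minimum occurrence count


def assign_role(token, scores, support, numerals):
    """Assign one primary role to a token type.

    scores: hypothetical per-role scores in [0, 1], e.g. derived from
            positional bias and numeral adjacency in Dwork.
    support: number of occurrences backing those scores.
    numerals: the set of numeral tokens (QT is assigned by membership first).
    """
    if token in numerals:
        return "QT"
    if support < MIN_SUPPORT:
        return "UNK"
    for role in ("OP", "AG", "EN"):
        if scores.get(role, 0.0) >= THETA[role]:
            return role
    return "UNK"


# Placeholder tokens only; none of these are real sign-groups.
print(assign_role("12", {}, 0, {"12"}))
print(assign_role("tok_a", {"AG": 0.74}, 25, {"12"}))
```

A token just under its threshold falls into UNK instead of taking the nearest role, which is the conservative behavior the role system describes.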

Uploaded source

ML-model-01: structural-deductive reconstruction of Minoan administration.

The public page presents the work as a functional reconstruction model: administrative roles, invariant templates, threshold sensitivity, ambiguity control, and arithmetic validation. It does not present phonetic decipherment as settled.

Corpus basis

GORILA, Younger transliterations, and sigLA sign control.

Default thresholds

theta_AG = 0.75, theta_OP = 0.80, theta_EN = 0.75, support minimum = 10.

Evaluation split

Dwork for templates; Dsum for arithmetic outcomes.

Public posture

Functional reconstruction only; no lexical or phonetic claim is forced.