Seshat

Documentation

Self-Learning

Self-Learning Model

Seshat's scoring weights are not static. Every Sunday, a ridge regression model trains on the previous week's labelled outcomes and updates all dimension weights. The model gets objectively better with each week of data.

Cadence: weekly (Sunday)Current: v12 · R² = 0.71

Why ridge regression

Ridge regression (L2-regularized linear regression) is the right choice for Seshat because:

Interpretable weights

Each coefficient directly maps to a scoring dimension, making it easy to audit why weights changed.

Handles collinearity

Developer score dimensions are correlated (active devs often have smart followers). Ridge handles this without overfitting.

Zero dependencies

The entire implementation is vanilla TypeScript — no numpy, no scikit-learn, runs in a Cloudflare Worker.

Bounded outputs

After training, weights are clipped to [0.05, 0.60] per dimension to prevent any single signal from dominating.

Training data

The model trains on rows from learning_outcomes, which are joined to their corresponding composite scores. Outcomes are manually labelled (or auto-labelled based on price action) as one of five categories.

CREATE TABLE learning_outcomes (
  id                 TEXT PRIMARY KEY,
  composite_score_id TEXT NOT NULL,  -- FK → composite_scores
  token_id           TEXT NOT NULL,
  outcome_label      TEXT NOT NULL,  -- see below
  outcome_notes      TEXT,
  labelled_at        INTEGER NOT NULL,
  labelled_by        TEXT NOT NULL   -- "auto" | "manual"
);

-- outcome_label values:
-- "strong_positive" → token performed very well  (→ 1.0)
-- "mild_positive"   → token performed OK         (→ 0.7)
-- "neutral"         → no significant movement    (→ 0.5)
-- "mild_negative"   → token declined             (→ 0.3)
-- "dump"            → rug / hard dump            (→ 0.0)

const LABEL_TO_NUMERIC: Record<string, number> = {
  strong_positive: 1.0,
  mild_positive:   0.7,
  neutral:         0.5,
  mild_negative:   0.3,
  dump:            0.0,
};

Feature matrix

Each training sample is a vector of four normalized dimension scores from the composite scorer at the time the token was evaluated. The target variable is the numeric outcome.

// X matrix columns (features):
// [dev_composite, fee_score, tech_score, larp_inverse]
// where larp_inverse = 1 - larp_probability

// Y vector (targets): numeric outcomes 0.0–1.0

// Example training set:
const X = [
  [0.91, 0.72, 0.84, 0.96],  // AIFORGE — strong_positive → 1.0
  [0.45, 0.30, 0.55, 0.60],  // RUGTOK  — dump           → 0.0
  [0.75, 0.88, 0.62, 0.91],  // BASEMIND — mild_positive → 0.7
  // ... 30+ samples
];
const Y = [1.0, 0.0, 0.7, ...];

Ridge regression implementation

The solver uses Gaussian elimination to solve the normal equations directly — no external ML library required. Alpha (regularization strength) is set to 0.01.

// learning.ts — vanilla TypeScript ridge regression
const ALPHA = 0.01; // regularization strength

function ridgeRegression(
  X: number[][],   // (n_samples, n_features)
  Y: number[],     // (n_samples,)
  alpha: number
): number[] {
  const n = X.length;
  const p = X[0].length;

  // Compute X^T X + alpha * I  (normal equations with L2 penalty)
  const XtX = Array.from({ length: p }, () => Array(p).fill(0));
  const XtY = Array(p).fill(0);

  for (let i = 0; i < n; i++) {
    for (let j = 0; j < p; j++) {
      XtY[j] += X[i][j] * Y[i];
      for (let k = 0; k < p; k++) {
        XtX[j][k] += X[i][j] * X[i][k];
      }
    }
  }
  // Add ridge penalty to diagonal
  for (let j = 0; j < p; j++) XtX[j][j] += alpha;

  // Solve via Gaussian elimination
  return gaussianElimination(XtX, XtY);
}

function postProcessWeights(raw: number[]): number[] {
  // Clip each weight to [0.05, 0.60]
  const clipped = raw.map(w => Math.max(0.05, Math.min(0.60, w)));
  // Normalize so weights sum to 1.0
  const sum = clipped.reduce((a, b) => a + b, 0);
  return clipped.map(w => w / sum);
}

Weight update flow

// Every Sunday at 02:00 UTC
async function runWeightUpdate(env: Env) {
  const db = env.DB;

  // 1. Pull labelled outcomes (min 30 samples required)
  const outcomes = await db.prepare(`
    SELECT cs.dev_composite, cs.fee_score, cs.tech_score,
           cs.larp_probability, lo.outcome_label
    FROM learning_outcomes lo
    JOIN composite_scores cs ON lo.composite_score_id = cs.id
    WHERE lo.labelled_at > ?
    ORDER BY lo.labelled_at DESC
    LIMIT 200
  `).bind(Date.now() / 1000 - 90 * 86400).all(); // last 90 days

  if (outcomes.results.length < 30) {
    console.log("Insufficient training data — skipping weight update");
    return;
  }

  // 2. Build X / Y
  const X = outcomes.results.map(r => [
    r.dev_composite  / 100,
    r.fee_score      / 100,
    r.tech_score     / 100,
    1 - r.larp_probability,   // larp_inverse
  ]);
  const Y = outcomes.results.map(r =>
    LABEL_TO_NUMERIC[r.outcome_label as string] ?? 0.5
  );

  // 3. Train
  const rawWeights = ridgeRegression(X, Y, ALPHA);
  const weights    = postProcessWeights(rawWeights);

  // 4. Compute R² on training set
  const predicted = X.map(row =>
    row.reduce((sum, xi, i) => sum + xi * weights[i], 0)
  );
  const rSquared = computeRSquared(Y, predicted);

  // 5. Deactivate old weights, insert new version
  await db.prepare("UPDATE scoring_weights SET active = 0").run();
  await db.prepare(`
    INSERT INTO scoring_weights (
      id, w_dev, w_fee, w_tech, w_larp_inverse,
      r_squared, training_samples, active, created_at
    ) VALUES (?, ?, ?, ?, ?, ?, ?, 1, ?)
  `).bind(
    crypto.randomUUID(),
    weights[0], weights[1], weights[2], weights[3],
    rSquared, outcomes.results.length,
    Math.floor(Date.now() / 1000)
  ).run();

  console.log(`Weight update v${version}: R²=${rSquared.toFixed(3)}`);
}

R² tracking

R² (coefficient of determination) measures how well the model explains outcome variance. An R² of 0.71 means 71% of the variation in outcomes is explained by the four scoring dimensions. This is tracked over model versions to ensure improvement over time.

function computeRSquared(actual: number[], predicted: number[]): number {
  const mean    = actual.reduce((a, b) => a + b, 0) / actual.length;
  const ssTot   = actual.reduce((s, y) => s + (y - mean) ** 2, 0);
  const ssRes   = actual.reduce((s, y, i) => s + (y - predicted[i]) ** 2, 0);
  return 1 - ssRes / ssTot;
}

// Model version history stored in scoring_weights table:
CREATE TABLE scoring_weights (
  id               TEXT PRIMARY KEY,
  w_dev            REAL NOT NULL,
  w_fee            REAL NOT NULL,
  w_tech           REAL NOT NULL,
  w_larp_inverse   REAL NOT NULL,
  r_squared        REAL NOT NULL,
  training_samples INTEGER NOT NULL,
  active           INTEGER DEFAULT 0,
  created_at       INTEGER NOT NULL
);

Current weights (v12)

Developer composite

0.34

Highest weight — dev quality most predictive

Fee score

0.28

Unclaimed fees correlate with creator engagement

Tech score

0.24

Novelty matters but less than dev credibility

LARP inverse

0.14

Hard floor — LARPs almost always dump