Self-Learning

Self-Learning Model

Seshat's scoring weights are not static. Every Sunday, a ridge regression model trains on the previous week's labelled outcomes and updates all dimension weights. The model gets objectively better with each week of data.

Cadence: weekly (Sunday)Current: v12 · R² = 0.71

Why ridge regression

Ridge regression (L2-regularized linear regression) is the right choice for Seshat because:

✓

Interpretable weights

Each coefficient directly maps to a scoring dimension, making it easy to audit why weights changed.

✓

Handles collinearity

Developer score dimensions are correlated (active devs often have smart followers). Ridge handles this without overfitting.

✓

Zero dependencies

The entire implementation is vanilla TypeScript — no numpy, no scikit-learn, runs in a Cloudflare Worker.

✓

Bounded outputs

After training, weights are clipped to [0.05, 0.60] per dimension to prevent any single signal from dominating.

Training data

The model trains on rows from learning_outcomes, which are joined to their corresponding composite scores. Outcomes are manually labelled (or auto-labelled based on price action) as one of five categories.

CREATE TABLE learning_outcomes (
  id                 TEXT PRIMARY KEY,
  composite_score_id TEXT NOT NULL,  -- FK → composite_scores
  token_id           TEXT NOT NULL,
  outcome_label      TEXT NOT NULL,  -- see below
  outcome_notes      TEXT,
  labelled_at        INTEGER NOT NULL,
  labelled_by        TEXT NOT NULL   -- "auto" | "manual"
);

-- outcome_label values:
-- "strong_positive" → token performed very well  (→ 1.0)
-- "mild_positive"   → token performed OK         (→ 0.7)
-- "neutral"         → no significant movement    (→ 0.5)
-- "mild_negative"   → token declined             (→ 0.3)
-- "dump"            → rug / hard dump            (→ 0.0)

const LABEL_TO_NUMERIC: Record<string, number> = {
  strong_positive: 1.0,
  mild_positive:   0.7,
  neutral:         0.5,
  mild_negative:   0.3,
  dump:            0.0,
};

Feature matrix

Each training sample is a vector of four normalized dimension scores from the composite scorer at the time the token was evaluated. The target variable is the numeric outcome.

// X matrix columns (features):
// [dev_composite, fee_score, tech_score, larp_inverse]
// where larp_inverse = 1 - larp_probability

// Y vector (targets): numeric outcomes 0.0–1.0

// Example training set:
const X = [
  [0.91, 0.72, 0.84, 0.96],  // AIFORGE — strong_positive → 1.0
  [0.45, 0.30, 0.55, 0.60],  // RUGTOK  — dump           → 0.0
  [0.75, 0.88, 0.62, 0.91],  // BASEMIND — mild_positive → 0.7
  // ... 30+ samples
];
const Y = [1.0, 0.0, 0.7, ...];

Ridge regression implementation

The solver uses Gaussian elimination to solve the normal equations directly — no external ML library required. Alpha (regularization strength) is set to 0.01.

// learning.ts — vanilla TypeScript ridge regression
const ALPHA = 0.01; // regularization strength

function ridgeRegression(
  X: number[][],   // (n_samples, n_features)
  Y: number[],     // (n_samples,)
  alpha: number
): number[] {
  const n = X.length;
  const p = X[0].length;

  // Compute X^T X + alpha * I  (normal equations with L2 penalty)
  const XtX = Array.from({ length: p }, () => Array(p).fill(0));
  const XtY = Array(p).fill(0);

  for (let i = 0; i < n; i++) {
    for (let j = 0; j < p; j++) {
      XtY[j] += X[i][j] * Y[i];
      for (let k = 0; k < p; k++) {
        XtX[j][k] += X[i][j] * X[i][k];
      }
    }
  }
  // Add ridge penalty to diagonal
  for (let j = 0; j < p; j++) XtX[j][j] += alpha;

  // Solve via Gaussian elimination
  return gaussianElimination(XtX, XtY);
}

function postProcessWeights(raw: number[]): number[] {
  // Clip each weight to [0.05, 0.60]
  const clipped = raw.map(w => Math.max(0.05, Math.min(0.60, w)));
  // Normalize so weights sum to 1.0
  const sum = clipped.reduce((a, b) => a + b, 0);
  return clipped.map(w => w / sum);
}

Weight update flow

// Every Sunday at 02:00 UTC
async function runWeightUpdate(env: Env) {
  const db = env.DB;

  // 1. Pull labelled outcomes (min 30 samples required)
  const outcomes = await db.prepare(`
    SELECT cs.dev_composite, cs.fee_score, cs.tech_score,
           cs.larp_probability, lo.outcome_label
    FROM learning_outcomes lo
    JOIN composite_scores cs ON lo.composite_score_id = cs.id
    WHERE lo.labelled_at > ?
    ORDER BY lo.labelled_at DESC
    LIMIT 200
  `).bind(Date.now() / 1000 - 90 * 86400).all(); // last 90 days

  if (outcomes.results.length < 30) {
    console.log("Insufficient training data — skipping weight update");
    return;
  }

  // 2. Build X / Y
  const X = outcomes.results.map(r => [
    r.dev_composite  / 100,
    r.fee_score      / 100,
    r.tech_score     / 100,
    1 - r.larp_probability,   // larp_inverse
  ]);
  const Y = outcomes.results.map(r =>
    LABEL_TO_NUMERIC[r.outcome_label as string] ?? 0.5
  );

  // 3. Train
  const rawWeights = ridgeRegression(X, Y, ALPHA);
  const weights    = postProcessWeights(rawWeights);

  // 4. Compute R² on training set
  const predicted = X.map(row =>
    row.reduce((sum, xi, i) => sum + xi * weights[i], 0)
  );
  const rSquared = computeRSquared(Y, predicted);

  // 5. Deactivate old weights, insert new version
  await db.prepare("UPDATE scoring_weights SET active = 0").run();
  await db.prepare(`
    INSERT INTO scoring_weights (
      id, w_dev, w_fee, w_tech, w_larp_inverse,
      r_squared, training_samples, active, created_at
    ) VALUES (?, ?, ?, ?, ?, ?, ?, 1, ?)
  `).bind(
    crypto.randomUUID(),
    weights[0], weights[1], weights[2], weights[3],
    rSquared, outcomes.results.length,
    Math.floor(Date.now() / 1000)
  ).run();

  console.log(`Weight update v${version}: R²=${rSquared.toFixed(3)}`);
}

R² tracking

R² (coefficient of determination) measures how well the model explains outcome variance. An R² of 0.71 means 71% of the variation in outcomes is explained by the four scoring dimensions. This is tracked over model versions to ensure improvement over time.

function computeRSquared(actual: number[], predicted: number[]): number {
  const mean    = actual.reduce((a, b) => a + b, 0) / actual.length;
  const ssTot   = actual.reduce((s, y) => s + (y - mean) ** 2, 0);
  const ssRes   = actual.reduce((s, y, i) => s + (y - predicted[i]) ** 2, 0);
  return 1 - ssRes / ssTot;
}

// Model version history stored in scoring_weights table:
CREATE TABLE scoring_weights (
  id               TEXT PRIMARY KEY,
  w_dev            REAL NOT NULL,
  w_fee            REAL NOT NULL,
  w_tech           REAL NOT NULL,
  w_larp_inverse   REAL NOT NULL,
  r_squared        REAL NOT NULL,
  training_samples INTEGER NOT NULL,
  active           INTEGER DEFAULT 0,
  created_at       INTEGER NOT NULL
);

Current weights (v12)

Developer composite

0.34

Highest weight — dev quality most predictive

Fee score

0.28

Unclaimed fees correlate with creator engagement

Tech score

0.24

Novelty matters but less than dev credibility

LARP inverse

0.14

Hard floor — LARPs almost always dump