Self-Learning Model
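Seshat's scoring weights are not static. Every Sunday, a ridge regression model retrains on recently labelled outcomes (the update job shown later in this section reads up to 200 rows from the last 90 days) and writes a new set of dimension weights. The intent is for the weights to track observed outcomes more closely as labelled data accumulates.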
Why ridge regression
Ridge regression (L2-regularized linear regression) is the right choice for Seshat because:
Interpretable weights
Each coefficient directly maps to a scoring dimension, making it easy to audit why weights changed.
Handles collinearity
Scoring dimensions are correlated (active devs often have smart followers). The L2 penalty keeps the coefficients stable in that case instead of letting them swing to large offsetting values.
Zero dependencies
The entire implementation is vanilla TypeScript — no numpy, no scikit-learn, runs in a Cloudflare Worker.
Bounded outputs
After training, weights are clipped to [0.05, 0.60] per dimension and then normalized to sum to 1.0, so that no single signal dominates. The bounds apply to the values before normalization.
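Because the clip happens before the weights are rescaled, the final values are not guaranteed to stay inside the clip range. The sketch below mirrors the clip and normalize steps of postProcessWeights later in this section. The function name `clipThenNormalize` and the raw coefficients are made up for illustration and do not come from a real training run.

```typescript
// Hypothetical mirror of the clip-then-normalize step; names and inputs are illustrative.
const MIN_WEIGHT = 0.05;
const MAX_WEIGHT = 0.60;

function clipThenNormalize(raw: number[]): number[] {
  // Clip each raw coefficient into [MIN_WEIGHT, MAX_WEIGHT].
  const clipped = raw.map(w => Math.max(MIN_WEIGHT, Math.min(MAX_WEIGHT, w)));
  // Rescale so the weights sum to 1.0.
  const sum = clipped.reduce((a, b) => a + b, 0);
  return clipped.map(w => w / sum);
}

// Made-up raw coefficients: one large, one negative, two small.
const exampleRaw = [1.2, -0.3, 0.1, 0.02];
const exampleWeights = clipThenNormalize(exampleRaw);
// clipped    = [0.60, 0.05, 0.10, 0.05] (sum 0.80)
// normalized ≈ [0.75, 0.0625, 0.125, 0.0625]
console.log(exampleWeights);
```

In this example the largest weight ends up near 0.75, above the 0.60 cap, because dividing by a sum below 1.0 scales every weight up. If the cap has to hold for the stored weights, it would need to be enforced after normalization as well.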
Training data
The model trains on rows from learning_outcomes, which are joined to their corresponding composite scores. Outcomes are manually labelled (or auto-labelled based on price action) as one of five categories.
CREATE TABLE learning_outcomes (
  id                 TEXT PRIMARY KEY,
  composite_score_id TEXT NOT NULL,    -- FK → composite_scores
  token_id           TEXT NOT NULL,
  outcome_label      TEXT NOT NULL,    -- see below
  outcome_notes      TEXT,
  labelled_at        INTEGER NOT NULL,
  labelled_by        TEXT NOT NULL     -- "auto" | "manual"
);

-- outcome_label values:
-- "strong_positive" → token performed very well (→ 1.0)
-- "mild_positive"   → token performed OK (→ 0.7)
-- "neutral"         → no significant movement (→ 0.5)
-- "mild_negative"   → token declined (→ 0.3)
-- "dump"            → rug / hard dump (→ 0.0)

const LABEL_TO_NUMERIC: Record<string, number> = {
  strong_positive: 1.0,
  mild_positive: 0.7,
  neutral: 0.5,
  mild_negative: 0.3,
  dump: 0.0,
};
Feature matrix
Each training sample is a vector of four normalized dimension scores from the composite scorer at the time the token was evaluated. The target variable is the numeric outcome.
// X matrix columns (features):
//   [dev_composite, fee_score, tech_score, larp_inverse]
//   where larp_inverse = 1 - larp_probability
// Y vector (targets): numeric outcomes 0.0–1.0

// Example training set:
const X = [
  [0.91, 0.72, 0.84, 0.96], // AIFORGE: strong_positive → 1.0
  [0.45, 0.30, 0.55, 0.60], // RUGTOK: dump → 0.0
  [0.75, 0.88, 0.62, 0.91], // BASEMIND: mild_positive → 0.7
  // ... 30+ samples
];
const Y = [1.0, 0.0, 0.7 /* ... */];
Ridge regression implementation
The solver uses Gaussian elimination to solve the normal equations directly — no external ML library required. Alpha (regularization strength) is set to 0.01.
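The solver below calls a gaussianElimination helper that this section does not show. In matrix form the system being solved is (XᵀX + αI)w = XᵀY, with one unknown per scoring dimension and no intercept column. What follows is a minimal sketch of such a helper, assuming partial pivoting and a square, non-singular system; the actual helper in learning.ts may differ.

```typescript
// Sketch of a linear solver for A·w = b using Gaussian elimination with
// partial pivoting. Assumption: A is square (p × p) and non-singular.
function gaussianElimination(A: number[][], b: number[]): number[] {
  const p = b.length;
  // Build the augmented matrix [A | b] on copies so the inputs stay untouched.
  const M = A.map((row, i) => [...row, b[i]]);

  for (let col = 0; col < p; col++) {
    // Partial pivoting: pick the row with the largest absolute value in this column.
    let pivot = col;
    for (let r = col + 1; r < p; r++) {
      if (Math.abs(M[r][col]) > Math.abs(M[pivot][col])) pivot = r;
    }
    if (Math.abs(M[pivot][col]) < 1e-12) {
      throw new Error("Matrix is singular or nearly singular");
    }
    [M[col], M[pivot]] = [M[pivot], M[col]];

    // Eliminate this column from every row below the pivot row.
    for (let r = col + 1; r < p; r++) {
      const factor = M[r][col] / M[col][col];
      for (let c = col; c <= p; c++) M[r][c] -= factor * M[col][c];
    }
  }

  // Back substitution on the resulting upper-triangular system.
  const w: number[] = new Array(p).fill(0);
  for (let row = p - 1; row >= 0; row--) {
    let acc = M[row][p];
    for (let c = row + 1; c < p; c++) acc -= M[row][c] * w[c];
    w[row] = acc / M[row][row];
  }
  return w;
}

// Usage: solve 2x + y = 5, x + 3y = 10.
const solution = gaussianElimination([[2, 1], [1, 3]], [5, 10]);
console.log(solution); // approximately [1, 3]
```

With α > 0 added to the diagonal, XᵀX + αI is positive definite, so the singular-matrix branch should not trigger for the ridge system itself; the check is there for general use of the helper.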
// learning.ts: vanilla TypeScript ridge regression
const ALPHA = 0.01; // regularization strength

function ridgeRegression(
  X: number[][], // (n_samples, n_features)
  Y: number[],   // (n_samples,)
  alpha: number
): number[] {
  const n = X.length;
  const p = X[0].length;

  // Compute X^T X + alpha * I (normal equations with L2 penalty)
  const XtX = Array.from({ length: p }, () => Array(p).fill(0));
  const XtY = Array(p).fill(0);
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < p; j++) {
      XtY[j] += X[i][j] * Y[i];
      for (let k = 0; k < p; k++) {
        XtX[j][k] += X[i][j] * X[i][k];
      }
    }
  }

  // Add ridge penalty to diagonal
  for (let j = 0; j < p; j++) XtX[j][j] += alpha;

  // Solve via Gaussian elimination
  return gaussianElimination(XtX, XtY);
}

function postProcessWeights(raw: number[]): number[] {
  // Clip each weight to [0.05, 0.60]
  const clipped = raw.map(w => Math.max(0.05, Math.min(0.60, w)));
  // Normalize so weights sum to 1.0
  const sum = clipped.reduce((a, b) => a + b, 0);
  return clipped.map(w => w / sum);
}
Weight update flow
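The Sunday 02:00 UTC schedule maps onto a Cloudflare Workers Cron Trigger. The sketch below assumes the trigger is declared in the Worker's configuration with a cron expression such as `0 2 * * sun` and that the Worker's scheduled handler calls runWeightUpdate. The interfaces are simplified local stand-ins so the snippet is self-contained (a real Worker would use the types from @cloudflare/workers-types), and the runWeightUpdate stub stands in for the real function that follows.

```typescript
// Simplified stand-ins for the Workers runtime types (assumption: the real
// project would take these from @cloudflare/workers-types instead).
interface CronEvent { cron: string; scheduledTime: number; }
interface WaitUntilContext { waitUntil(promise: Promise<unknown>): void; }
interface Env { DB: unknown; }

// Stub standing in for the real weight update job.
async function runWeightUpdate(_env: Env): Promise<void> {
  console.log("weight update would run here");
}

// Assumed cron expression for "every Sunday at 02:00 UTC".
const WEEKLY_CRON = "0 2 * * sun";

const worker = {
  // Invoked by the runtime when a Cron Trigger fires.
  async scheduled(_event: CronEvent, env: Env, ctx: WaitUntilContext): Promise<void> {
    // waitUntil keeps the invocation alive until the update finishes.
    ctx.waitUntil(runWeightUpdate(env));
  },
};

// In a Worker module this object would be the default export:
// export default worker;
```

Handing the promise to waitUntil rather than awaiting it inline is a design choice: either works for a single job, but waitUntil makes it explicit that the runtime should keep the invocation open until the update settles.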
// Every Sunday at 02:00 UTC
async function runWeightUpdate(env: Env) {
  const db = env.DB;

  // 1. Pull labelled outcomes (min 30 samples required)
  //    Timestamps are stored as whole Unix seconds, so floor the cutoff.
  const cutoff = Math.floor(Date.now() / 1000) - 90 * 86400; // last 90 days
  const outcomes = await db.prepare(`
    SELECT cs.dev_composite, cs.fee_score, cs.tech_score,
           cs.larp_probability, lo.outcome_label
    FROM learning_outcomes lo
    JOIN composite_scores cs ON lo.composite_score_id = cs.id
    WHERE lo.labelled_at > ?
    ORDER BY lo.labelled_at DESC
    LIMIT 200
  `).bind(cutoff).all();

  if (outcomes.results.length < 30) {
    console.log("Insufficient training data: skipping weight update");
    return;
  }

  // 2. Build X / Y (row values are untyped, so convert numeric columns explicitly)
  const X = outcomes.results.map(r => [
    Number(r.dev_composite) / 100,
    Number(r.fee_score) / 100,
    Number(r.tech_score) / 100,
    1 - Number(r.larp_probability), // larp_inverse
  ]);
  const Y = outcomes.results.map(r =>
    LABEL_TO_NUMERIC[r.outcome_label as string] ?? 0.5 // unknown labels count as neutral
  );

  // 3. Train
  const rawWeights = ridgeRegression(X, Y, ALPHA);
  const weights = postProcessWeights(rawWeights);

  // 4. Compute R² on the training set, using the post-processed weights
  const predicted = X.map(row =>
    row.reduce((sum, xi, i) => sum + xi * weights[i], 0)
  );
  const rSquared = computeRSquared(Y, predicted);

  // 5. Deactivate old weights and insert the new version in one batch,
  //    so a failed insert cannot leave the table without an active row
  await db.batch([
    db.prepare("UPDATE scoring_weights SET active = 0"),
    db.prepare(`
      INSERT INTO scoring_weights (
        id, w_dev, w_fee, w_tech, w_larp_inverse,
        r_squared, training_samples, active, created_at
      ) VALUES (?, ?, ?, ?, ?, ?, ?, 1, ?)
    `).bind(
      crypto.randomUUID(),
      weights[0], weights[1], weights[2], weights[3],
      rSquared, outcomes.results.length,
      Math.floor(Date.now() / 1000)
    ),
  ]);

  // The original log line referenced an undefined `version` variable.
  console.log(`Weight update: R²=${rSquared.toFixed(3)}, n=${outcomes.results.length}`);
}
R² tracking
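R² (coefficient of determination) measures how well the model explains outcome variance. An R² of 0.71 means 71% of the variation in outcomes is explained by the four scoring dimensions. Each model version stores its R² so the trend can be compared across versions. Because the update job computes it on the same rows it trained on, it is an in-sample figure and tends to be optimistic.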
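A small worked example helps calibrate what the number means. The sketch below applies the standard R² formula (the same one computeRSquared implements) to made-up labels and predictions. The helper name `rSquaredDemo` and all values are illustrative.

```typescript
// Standard R² = 1 - SS_res / SS_tot, repeated here so the snippet runs on its own.
function rSquaredDemo(actual: number[], predicted: number[]): number {
  const mean = actual.reduce((a, b) => a + b, 0) / actual.length;
  const ssTot = actual.reduce((s, y) => s + (y - mean) ** 2, 0);
  const ssRes = actual.reduce((s, y, i) => s + (y - predicted[i]) ** 2, 0);
  return 1 - ssRes / ssTot;
}

// Made-up outcomes: strong_positive, dump, mild_positive, mild_negative.
const actual = [1.0, 0.0, 0.7, 0.3]; // mean 0.5, SS_tot = 0.58

const close = rSquaredDemo(actual, [0.9, 0.1, 0.6, 0.4]);    // SS_res = 0.04 → R² ≈ 0.93
const meanOnly = rSquaredDemo(actual, [0.5, 0.5, 0.5, 0.5]); // SS_res = SS_tot → R² ≈ 0
const inverted = rSquaredDemo(actual, [0.0, 1.0, 0.3, 0.7]); // SS_res = 2.32 → R² ≈ -3
console.log({ close, meanOnly, inverted });
```

R² is not bounded below by zero: predictions that do worse than always guessing the mean outcome give a negative value, which is worth watching for when the stored figure is compared across versions.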
function computeRSquared(actual: number[], predicted: number[]): number {
  const mean = actual.reduce((a, b) => a + b, 0) / actual.length;
  const ssTot = actual.reduce((s, y) => s + (y - mean) ** 2, 0);
  const ssRes = actual.reduce((s, y, i) => s + (y - predicted[i]) ** 2, 0);
  // If every label is identical, ssTot is 0 and R² is undefined; report 0
  // rather than NaN or Infinity, which the r_squared column could not store meaningfully.
  if (ssTot === 0) return 0;
  return 1 - ssRes / ssTot;
}

-- Model version history is stored in the scoring_weights table:
CREATE TABLE scoring_weights (
  id               TEXT PRIMARY KEY,
  w_dev            REAL NOT NULL,
  w_fee            REAL NOT NULL,
  w_tech           REAL NOT NULL,
  w_larp_inverse   REAL NOT NULL,
  r_squared        REAL NOT NULL,
  training_samples INTEGER NOT NULL,
  active           INTEGER DEFAULT 0,
  created_at       INTEGER NOT NULL
);
Current weights (v12)
Dimension            Weight  Note
Developer composite  0.34    Highest weight: dev quality is the most predictive
Fee score            0.28    Unclaimed fees correlate with creator engagement
Tech score           0.24    Novelty matters, but less than dev credibility
LARP inverse         0.14    Hard floor: LARPs almost always dump
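To make the weights concrete, the sketch below applies the v12 values to the AIFORGE row from the example training set, using the same weighted sum the update job uses when it computes predictions for R². That the production scorer combines the dimensions in exactly this way is an assumption, and the names `V12_WEIGHTS` and `weightedScore` are illustrative.

```typescript
// v12 weights as listed above, in feature order:
// [dev_composite, fee_score, tech_score, larp_inverse]
const V12_WEIGHTS = [0.34, 0.28, 0.24, 0.14];

// Weighted sum of normalized dimension scores (assumed scoring rule).
function weightedScore(features: number[], weights: number[]): number {
  if (features.length !== weights.length) {
    throw new Error("features and weights must have the same length");
  }
  return features.reduce((sum, x, i) => sum + x * weights[i], 0);
}

// AIFORGE row from the example training set.
const aiforge = [0.91, 0.72, 0.84, 0.96];
// 0.91*0.34 + 0.72*0.28 + 0.84*0.24 + 0.96*0.14
//   = 0.3094 + 0.2016 + 0.2016 + 0.1344 ≈ 0.847
console.log(weightedScore(aiforge, V12_WEIGHTS).toFixed(3)); // "0.847"
```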