How Scoring Works (Overview)
A SaferSkills score is a single 0–100 number computed deterministically from five weighted sub-scores, then bucketed into one of four color bands. It is methodology, not opinion: every point of penalty traces to a documented rule and a quotable line of evidence, with no LLM in the verdict path. This page is the overview; the full methodology carries the exact math.
What are the five sub-scores?
Section titled “What are the five sub-scores?”Every scan produces five sub-scores, each scored 0–100, then combined by a fixed weighted sum:
| Sub-score | Weight | What it catches |
|---|---|---|
| Security | 35% | Prompt injection, obfuscation, dangerous shell, credential exfiltration |
| Supply Chain | 20% | Typosquat, owner-transfer, hash drift, unsigned releases |
| Maintenance | 15% | Commit recency, commit frequency, issue-response time, CI health |
| Transparency | 15% | LICENSE / README / CHANGELOG / SECURITY.md / manifest presence |
| Community | 15% | Stars, contributors, cross-registry presence, fork health |
Security carries the most weight because it catches the threats that actually compromise a machine — but the weight alone does not let security dominate, which the severity ceiling (below) corrects.
How is the aggregate calculated?
Section titled “How is the aggregate calculated?”Each finding carries a severity tier with a fixed penalty: info 0 · low 5 · medium 12 · high 25 · critical 40. A sub-score is max(0, 100 − Σ penalties). The aggregate score is the rounded weighted sum:
aggregate = round(0.35·security + 0.20·supply + 0.15·maintenance + 0.15·transparency + 0.15·community)A pure weighted sum would let a critical security flaw hide behind good docs and a healthy star count, so a severity ceiling caps the whole aggregate by the worst active finding: one active critical caps it at ≤15, one active high caps it at ≤45. This is why a serious flaw can never be diluted by the 65% of weight that sits outside Security. The exact step-by-step math — per-finding penalty, running sub-score, weighted aggregate, ceiling application — is in the detailed methodology, and every public report renders it inline.
What do the color bands mean?
Section titled “What do the color bands mean?”The aggregate buckets into four color bands:
| Band | Range | Label |
|---|---|---|
| Green | 80–100 | Approved |
| Yellow | 60–79 | Watch |
| Orange | 40–59 | Caution |
| Red | 0–39 | Block |
A band is a reading aid, not a verdict on you. SaferSkills publishes methodology, not endorsements: a Red score means review the findings before use, not “never use this.” You decide; the score tells you where to look first.
Why is it deterministic?
Section titled “Why is it deterministic?”Same input, same score — byte-for-byte. There is no model, no random seed, and no temperature anywhere in the verdict path. Every scan stamps three identifiers: rubric_version (the git SHA of the rule set), engine_version (the scan engine), and the scan run’s ref_sha (the scanned commit, or content_hash_sha256 for an upload). A vendor can check out those exact versions and re-derive any historical verdict offline, without SaferSkills’ participation. That reproducibility is what makes the vendor right-of-reply meaningful.
Where do I go deeper?
Section titled “Where do I go deeper?”- How scoring works — the full methodology — the complete scoring contract.
- The five sub-scores — what each axis measures and how it is penalized.
- Detection categories — the closed set of rule categories.
- Glossary — every term used above.