Skill discovery and workflow experiments for Codex/OpenClaw-style agent tooling.
  • TypeScript 85.6%
  • JavaScript 14.4%
Find a file
Piotr Durlej 2a332c2726
All checks were successful
CI / check (push) Successful in 23s
Merge pull request #14 from pdurlej/codex/wave1-hardening
feat: wave 1 hardening for SkillSurfer
2026-05-19 15:12:38 +02:00
.github/workflows feat: harden skill manifests and audit reads 2026-05-19 14:21:16 +02:00
docs feat: harden skill manifests and audit reads 2026-05-19 14:21:16 +02:00
fixtures feat: add route benchmark fixture 2026-05-17 18:36:42 +02:00
src feat: harden skill manifests and audit reads 2026-05-19 14:21:16 +02:00
.gitignore chore: initialize skill-surfer scaffold 2026-05-17 18:28:00 +02:00
package.json feat: harden skill manifests and audit reads 2026-05-19 14:21:16 +02:00
pnpm-lock.yaml chore: initialize skill-surfer scaffold 2026-05-17 18:28:00 +02:00
README.md feat: harden skill manifests and audit reads 2026-05-19 14:21:16 +02:00
tsconfig.json feat: harden skill manifests and audit reads 2026-05-19 14:21:16 +02:00
vitest.config.ts chore: initialize skill-surfer scaffold 2026-05-17 18:28:00 +02:00

SkillSurfer

Local-first context resolver for agent skills and MCP capability inventories.

SkillSurfer is not a marketplace, installer, or agent runtime. Its first job is narrower:

Given a user task and the local skill inventory, which reviewed skill or MCP capability should be suggested, previewed, or loaded, and why?

The project is currently a private hardening spike. The intended public shape is a small read-only layer that helps Codex first, then Claude Code and OpenClaw-style agents, orient themselves without giving agents install/update/delete powers.

Current Scope

  • Index local SKILL.md packages into stable manifests.
  • Keep stable skill identity separate from exact content revision hashes.
  • Route tasks before full skill instructions enter agent context.
  • Enforce local policy decisions before SKILL.md reads.
  • Emit audit receipts for route, preview, policy, and read decisions.
  • Keep mutating skill and MCP operations outside the runtime interface.

Non-Goals

  • Public marketplace.
  • Auto-install from the internet.
  • LLM-as-router in the first spike.
  • MCP proxy/tool caller.
  • Runtime sandbox implementation.
  • Automatic execution of skill scripts.
  • Embeddings before deterministic benchmark data proves they are needed.

CLI

pnpm install
pnpm build
pnpm test

Index Codex-style roots:

pnpm build
node dist/cli.js index --profile codex

Route a task:

node dist/cli.js route "review this PR for auth bypass and SSRF" --profile codex

Preview and gated read:

node dist/cli.js preview oracle --profile codex --json
node dist/cli.js read oracle --profile codex --confirm --json

Write an audit log with raw task text redacted:

node dist/cli.js route "review this PR" --audit-dir .skillsurfer-audit

Use --include-task-text only when the audit store is allowed to contain prompt text.

Root Profiles

--profile codex scans Codex-oriented roots, including workspace .agents/skills, user ~/.agents/skills, legacy ~/.codex/skills, and /etc/codex/skills.

--profile claude-code scans Claude Code-oriented roots, including workspace .claude/skills and user ~/.claude/skills.

--profile openclaw scans OpenClaw-oriented roots, including workspace .openclaw/skills, user ~/.openclaw/skills, and shared ~/.agents/skills.

Manual roots default to scope=user and trustTier=user. For unreviewed sources, set these explicitly:

node dist/cli.js index --root ./candidate-skills --scope imported --trust-tier imported

Security Boundary

SkillSurfer gates what SkillSurfer reads and returns. It does not automatically prevent Codex, Claude Code, or another host agent from loading native skills that are already visible in that host's own skill directories.

The recommended operating model is:

~/.skillsurfer/
  inbox/       # not visible to native agents
  reviewed/    # may be symlinked or copied into native skill roots
  rejected/    # never exposed
  audit/       # JSONL decision receipts

Keep unreviewed skills outside native agent roots.