Skill discovery and workflow experiments for Codex/OpenClaw-style agent tooling.

agent-tooling dogfooding public-tooling skills tier-2

TypeScript 85.6%
JavaScript 14.4%

Find a file

Piotr Durlej 2a332c2726 All checks were successful CI / check (push) Successful in 23s Details Merge pull request #14 from pdurlej/codex/wave1-hardening feat: wave 1 hardening for SkillSurfer		2026-05-19 15:12:38 +02:00
.github/workflows	feat: harden skill manifests and audit reads	2026-05-19 14:21:16 +02:00
docs	feat: harden skill manifests and audit reads	2026-05-19 14:21:16 +02:00
fixtures	feat: add route benchmark fixture	2026-05-17 18:36:42 +02:00
src	feat: harden skill manifests and audit reads	2026-05-19 14:21:16 +02:00
.gitignore	chore: initialize skill-surfer scaffold	2026-05-17 18:28:00 +02:00
package.json	feat: harden skill manifests and audit reads	2026-05-19 14:21:16 +02:00
pnpm-lock.yaml	chore: initialize skill-surfer scaffold	2026-05-17 18:28:00 +02:00
README.md	feat: harden skill manifests and audit reads	2026-05-19 14:21:16 +02:00
tsconfig.json	feat: harden skill manifests and audit reads	2026-05-19 14:21:16 +02:00
vitest.config.ts	chore: initialize skill-surfer scaffold	2026-05-17 18:28:00 +02:00

README.md

SkillSurfer

Local-first context resolver for agent skills and MCP capability inventories.

SkillSurfer is not a marketplace, installer, or agent runtime. Its first job is narrower:

Given a user task and the local skill inventory, which reviewed skill or MCP capability should be suggested, previewed, or loaded, and why?

The project is currently a private hardening spike. The intended public shape is a small read-only layer that helps Codex first, then Claude Code and OpenClaw-style agents, orient themselves without giving agents install/update/delete powers.

Current Scope

Index local SKILL.md packages into stable manifests.
Keep stable skill identity separate from exact content revision hashes.
Route tasks before full skill instructions enter agent context.
Enforce local policy decisions before SKILL.md reads.
Emit audit receipts for route, preview, policy, and read decisions.
Keep mutating skill and MCP operations outside the runtime interface.

Non-Goals

Public marketplace.
Auto-install from the internet.
LLM-as-router in the first spike.
MCP proxy/tool caller.
Runtime sandbox implementation.
Automatic execution of skill scripts.
Embeddings before deterministic benchmark data proves they are needed.

CLI

pnpm install
pnpm build
pnpm test

Index Codex-style roots:

pnpm build
node dist/cli.js index --profile codex

Route a task:

node dist/cli.js route "review this PR for auth bypass and SSRF" --profile codex

Preview and gated read:

node dist/cli.js preview oracle --profile codex --json
node dist/cli.js read oracle --profile codex --confirm --json

Write an audit log with raw task text redacted:

node dist/cli.js route "review this PR" --audit-dir .skillsurfer-audit

Use --include-task-text only when the audit store is allowed to contain prompt text.

Root Profiles

--profile codex scans Codex-oriented roots, including workspace .agents/skills, user ~/.agents/skills, legacy ~/.codex/skills, and /etc/codex/skills.

--profile claude-code scans Claude Code-oriented roots, including workspace .claude/skills and user ~/.claude/skills.

--profile openclaw scans OpenClaw-oriented roots, including workspace .openclaw/skills, user ~/.openclaw/skills, and shared ~/.agents/skills.

Manual roots default to scope=user and trustTier=user. For unreviewed sources, set these explicitly:

node dist/cli.js index --root ./candidate-skills --scope imported --trust-tier imported

Security Boundary

SkillSurfer gates what SkillSurfer reads and returns. It does not automatically prevent Codex, Claude Code, or another host agent from loading native skills that are already visible in that host's own skill directories.

The recommended operating model is:

~/.skillsurfer/
  inbox/       # not visible to native agents
  reviewed/    # may be symlinked or copied into native skill roots
  rejected/    # never exposed
  audit/       # JSONL decision receipts

Keep unreviewed skills outside native agent roots.