§ 03 · Engine · Features

Engine features.

Five concepts you'll meet across the CLI and the Action. One engine, one implementation of the rules, every surface.

§ 01 · Learned profile

Every dismissed finding and applied fix is recorded into a per-repo profile stored in .codetitan/learned-profile.json — inside your repo, traveling with the git history. The mechanism is implemented and locally verified; what it accumulates is specific to your codebase, not the language.

What the profile stores:

  • Dismissal patterns — dismiss the same finding three times (--dismiss) and it is auto-suppressed for this repo from then on.
  • Code conventions — the idioms and patterns your existing code uses, recorded per repo.
  • AI-drift baselines — what your code normally looks like, so AI-generated code that diverges gets flagged.
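The three-dismissal rule can be pictured with a small counter. This is only a sketch: the real layout of learned-profile.json is not documented on this page, so the "dismissals" key, the finding keys, and the function names below are all hypothetical.

```python
import json
from pathlib import Path

# Assumption: the on-disk schema is not documented here; this sketch uses
# {"dismissals": {"<finding key>": <count>}} purely for illustration.
PROFILE_PATH = Path(".codetitan/learned-profile.json")
SUPPRESS_AFTER = 3  # three dismissals, as described in the list above


def load_profile(path: Path = PROFILE_PATH) -> dict:
    """Return the stored profile, or an empty one if the file is missing."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"dismissals": {}}


def record_dismissal(profile: dict, finding_key: str) -> int:
    """Count one more dismissal of finding_key and return the new total."""
    counts = profile.setdefault("dismissals", {})
    counts[finding_key] = counts.get(finding_key, 0) + 1
    return counts[finding_key]


def is_suppressed(profile: dict, finding_key: str) -> bool:
    """True once the finding has been dismissed SUPPRESS_AFTER times."""
    return profile.get("dismissals", {}).get(finding_key, 0) >= SUPPRESS_AFTER
```

Because the counts live in a file inside the repo, they are versioned and reviewed like any other change.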

The profile is private to your repo. Never shared, never mined for training data.

§ 02 · PR Risk Score

A 0–100 composite with a letter grade — Risk: 60 (high / C) — that weighs the severity mix of the findings against the repo's learned profile. The same diff scores differently in a repo with history than in a fresh one; that is the point.

It appears in the console output, in report.json as prRiskScore, and in the PR comment the Action posts. Gate on it with --risk-threshold <number> (exit 1 at or above; default 80), or gate on severities directly with --fail-on.
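The threshold gate amounts to a single comparison. The sketch below assumes prRiskScore is a plain number at the top level of report.json; the function name is hypothetical, and the real report may nest the score differently.

```python
DEFAULT_RISK_THRESHOLD = 80  # the default stated above


def risk_gate_exit_code(report: dict, threshold: int = DEFAULT_RISK_THRESHOLD) -> int:
    """Return 1 when the score is at or above the threshold, else 0."""
    # Assumption: prRiskScore is a top-level number in report.json.
    score = report["prRiskScore"]
    return 1 if score >= threshold else 0
```

Under these assumptions, the example score of 60 passes the default gate and fails once the threshold is lowered to 60, because the comparison is "at or above".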

§ 03 · AI-drift detector

AI-generated code drifts. Copilot and Claude produce code that looks like the surrounding file but subtly diverges: a different logger, a newly added dependency, a parseInt without a radix where the rest of the file uses radix 10.

The drift detector catches:

  • Imports added that your codebase has never imported before
  • Type regressions: any used where TypeScript's strict mode would reject it
  • Helper reimplementation — AI rewrites fs.readFile inline instead of using your existing readFile helper
  • Style drift — arrow functions where the file uses named functions, etc.

These are MEDIUM findings — not blocking by default, but visible so you know what the AI changed.
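The first check in the list, imports the codebase has never seen, reduces to a set difference against a recorded baseline. This is a minimal sketch, not the detector itself: the regex stands in for a real parser, and the rule name and finding shape are hypothetical.

```python
import re

# Simplification: a regex instead of a parser. It matches the module name in
# `import x from "pkg"`, `import "pkg"`, and `require("pkg")`.
IMPORT_RE = re.compile(
    r"""(?:\bfrom\s+|\bimport\s+|\brequire\(\s*)["']([^"']+)["']"""
)


def imported_modules(source: str) -> set[str]:
    """Collect every module name imported or required in the source text."""
    return set(IMPORT_RE.findall(source))


def new_import_findings(changed_source: str, baseline: set[str]) -> list[dict]:
    """Report modules in the changed code that the baseline has never seen."""
    unseen = sorted(imported_modules(changed_source) - baseline)
    # Hypothetical finding shape; severity follows the MEDIUM level named above.
    return [
        {"rule": "drift/new-import", "module": module, "severity": "MEDIUM"}
        for module in unseen
    ]
```

Given a baseline of {"node:fs", "winston"}, a change that imports pino would produce one MEDIUM finding for pino and nothing for node:fs.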

§ 04 · 3-pass cross-file taint analysis

Taint analysis traces untrusted input (HTTP request, cookie, header) through function calls and variable assignments until it reaches a sink (database query, exec, file write). If the path doesn't pass through a sanitizer, you have a data-flow vulnerability.

The three passes:

  1. Pass 1 — identify sources and sinks. Scan every file; list entry points (e.g. req.query, req.body) and dangerous functions (e.g. db.query, child_process.exec).
  2. Pass 2 — build the call graph. Resolve imports across files. Build a directed graph of "this variable flows into that parameter".
  3. Pass 3 — trace. For each source, walk the graph. If a path reaches a sink without going through a sanitizer, emit a finding with the full trace (source file + line → intermediate hops → sink file + line).
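The tracing pass can be sketched as a graph walk. In this toy version the outputs of passes 1 and 2 (sources, sinks, sanitizers, and the flow graph) are supplied by hand, and every node label is a made-up file and line; it illustrates the idea under those assumptions and is not CodeTitan's implementation.

```python
from collections import deque


def trace_taint(sources, sinks, sanitizers, flows):
    """Pass 3: walk the flow graph from each source and report unsanitized paths.

    sources    -- list of source nodes (pass 1 output)
    sinks      -- set of sink nodes (pass 1 output)
    sanitizers -- set of nodes that clean the data; paths through them are dropped
    flows      -- dict mapping a node to the nodes its value flows into (pass 2 output)
    Returns a list of paths, each a list of nodes from source to sink.
    """
    findings = []
    for source in sources:
        queue = deque([[source]])
        visited = {source}
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                findings.append(path)  # full trace: source, hops, sink
                continue
            for nxt in flows.get(node, []):
                # Stop at sanitizers, and do not revisit nodes (avoids cycles).
                if nxt in sanitizers or nxt in visited:
                    continue
                visited.add(nxt)
                queue.append(path + [nxt])
    return findings


# Hypothetical three-file example: the value reaches the sink two ways,
# but only the route that skips the sanitizer is reported.
SOURCE = "routes.js:12 req.query.id"
SINK = "db.js:40 db.query(q)"
SANITIZER = "validate.js:8 escapeId(id)"
FLOWS = {
    SOURCE: ["service.js:7 id", SANITIZER],
    "service.js:7 id": [SINK],
    SANITIZER: [SINK],
}
paths = trace_taint([SOURCE], {SINK}, {SANITIZER}, FLOWS)
```

Because the walk marks nodes as visited, it reports one shortest path per source and sink pair rather than every possible route.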

The 3-pass approach is what makes cross-file reachability possible. A regex linter sees db.query(q) in isolation. CodeTitan sees that q was built with untrusted input three files away.

§ 05 · False-positive suppression

False positives destroy trust, so several mechanisms work against them in layers rather than one magic filter:

  • Confidence scoring — every finding carries a 0–100 confidence, shown inline. Filter with --min-confidence.
  • Structural guards — rules carry file- and line-level guards (for example, command-injection taint only fires when the file actually imports child_process).
  • The learned profile — three dismissals of the same finding and it is suppressed for this repo. Your judgment accumulates instead of repeating.
  • Optional AI filter — with your own Anthropic key, an LLM pass re-examines candidate findings before they reach you.
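The layering can be pictured as a chain of independent checks, where any one of them can drop a finding. This sketch covers the first three layers and leaves out the optional AI filter; the field names (confidence, requires_import, key, file) are hypothetical and chosen only for illustration.

```python
SUPPRESS_AFTER = 3  # three dismissals, as described above


def surviving_findings(findings, min_confidence, file_imports, dismissal_counts):
    """Apply confidence, structural-guard, and learned-profile layers in turn.

    findings         -- list of dicts with hypothetical keys:
                        "key", "file", "confidence", optional "requires_import"
    min_confidence   -- drop findings scored below this value (0-100)
    file_imports     -- dict mapping a file path to the set of modules it imports
    dismissal_counts -- dict mapping a finding key to how often it was dismissed
    """
    kept = []
    for finding in findings:
        # Layer 1: confidence scoring.
        if finding["confidence"] < min_confidence:
            continue
        # Layer 2: structural guard, e.g. only fire if the file imports child_process.
        required = finding.get("requires_import")
        if required and required not in file_imports.get(finding["file"], set()):
            continue
        # Layer 3: learned profile.
        if dismissal_counts.get(finding["key"], 0) >= SUPPRESS_AFTER:
            continue
        kept.append(finding)
    return kept
```

Keeping each layer as a separate check means a finding that is dropped can be attributed to one specific reason.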

The proof style we prefer: cold audits against real repos at pinned commits, where every surfaced finding can be checked by cloning the repo yourself.

Last updated · 2026-06-12