

Case study: pre-release cleanup, cix vs. a less-capable indexer

Date of run: April 2026
Codebase: Laravel 12 API + Vue 3 SPA, ~300 indexed files
State of the codebase: mid-rename, with several recent table and model renames in flight, and a contact-unification refactor only half-propagated above the data layer

This is a measured, side-by-side run of two AI-assisted cleanup passes on the same project, with the same prompt, by the same evaluator. The point of the exercise was to compare cix against a more conventional symbol-only indexer at a task representative of what teams actually use AI for.

The summary: cix found three additional structural issues the baseline missed, while using roughly half the tool calls and a third of the tokens.

The task

Both passes received the same instruction:

Act as a senior maintainer doing a pre-release cleanup review. Find the highest-value bugs, naming overlaps, structure problems, dead code, duplicated logic, and overengineering. Do not edit. Group findings, give file/symbol evidence, and recommend a fix order.

Cold start each pass. No memory carried over.

The two passes

Pass 1 — a symbol-only indexer. A capable LSP-flavored indexer that exposes find-symbol, find-referencing-symbols, and a symbols-overview tool, plus notes/memory features. No route table. No schema view. No project orientation. Falls back to shell grep and ls for layout discovery.

Pass 2 — cix. Full feature surface: orientation, route listing, schema view, find-usages, impact analysis, audit, and parse-error reporting.

What each pass turned up

| # | Finding | Symbol-only | cix |
| --- | --- | --- | --- |
| 1 | Model relation method references a class deleted in a recent rename — runtime ClassNotFound on first call | yes | yes |
| 2 | An entire controller queries a table dropped one migration earlier — every endpoint 500s | yes | yes |
| 3 | Feature test imports the deleted model in setUp() — file fails before the first assertion | yes | yes |
| 4 | Three orphan models (backing tables dropped, not referenced anywhere live) | yes | yes |
| 5 | A controller named for the legacy domain is now a thin façade over the unified entity — naming overlap with the real controller | yes | yes |
| 6 | Three frontend modules (Pinia store, api wrapper, modal component) with zero importers | yes | yes |
| 7 | Composable likely double-lists entries in a search dropdown (no filter on the second source) | yes | yes |
| 8 | Many-to-many relation uses a non-default key against a table whose primary-key info isn't visible — possible silent join duplication if a uniqueness assumption breaks | yes | yes |
| 9 | Leftover Python script in the Laravel public/ directory, hardcoding root MySQL credentials in a sample WSGI script | no | yes |
| 10 | Four duplicate-table warnings — two intentional, two stale | no | yes |
| 11 | Route parameter naming inconsistency between auto-generated and custom routes | no | yes |
Eight findings in common. cix surfaced three additional findings that the symbol-only pass missed entirely.

Why cix found the extra three

The credential-leaking Python file. This was surfaced by the orientation step, which scans the project for runtime database connections and flags unexpected ones. A symbol graph wouldn't list this file because nothing in the project imports it — but cix flagged it because the indexer detected a database-connecting entry point sitting in a route-served directory. This is the kind of issue that doesn't show up on anyone's checklist; the system noticed it because it was looking comprehensively, not narrowly.
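
A minimal sketch of this class of check, assuming nothing about cix's internals — the patterns and function name here are hypothetical stand-ins, written in Python for brevity:

```python
import re
from pathlib import Path

# Heuristic patterns suggesting a runtime DB connection with inline
# credentials. Illustrative stand-ins, not cix's actual detection rules.
CONNECT_PATTERNS = [
    re.compile(r"pymysql\.connect\("),
    re.compile(r"mysql\.connector\.connect\("),
    re.compile(r"mysqli_connect\("),
]
CREDENTIAL_RE = re.compile(r"(user|password)\s*=\s*['\"][^'\"]+['\"]", re.IGNORECASE)

def flag_db_entry_points(served_dir: str) -> list[str]:
    """Files in a route-served directory that open a DB connection with
    hardcoded credentials -- the shape of the leftover WSGI script."""
    hits = []
    for path in sorted(Path(served_dir).rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(errors="ignore")
        if any(p.search(text) for p in CONNECT_PATTERNS) and CREDENTIAL_RE.search(text):
            hits.append(path.name)
    return hits
```

Nothing imports such a file, so a symbol graph never visits it; a directory-level scan of served paths does.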

The duplicate-table warnings. These are emitted automatically with every schema-related response. Two duplicates were intentional (rollback recovery in a down() method); two were stale leftovers from interrupted migration work. A symbol indexer doesn't track schema, so it has no way to surface this category at all.
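
The mechanics of such a check are simple to sketch. Assuming migrations are available as plain source strings and `Schema::create` is the creation marker (a simplification of whatever cix actually parses):

```python
import re
from collections import defaultdict

# Tables created in a Laravel migration, e.g. Schema::create('contacts', ...)
CREATE_RE = re.compile(r"Schema::create\(\s*['\"](\w+)['\"]")

def duplicate_table_warnings(migrations: dict[str, str]) -> dict[str, list[str]]:
    """Map each table created in more than one migration file to the files
    involved; a reviewer then decides which duplicates are intentional."""
    created_in = defaultdict(list)
    for filename, source in migrations.items():
        for table in CREATE_RE.findall(source):
            created_in[table].append(filename)
    return {t: files for t, files in created_in.items() if len(files) > 1}
```

The hard part is not the detection — it is having a schema view to run it against, which is exactly what a symbol-only surface lacks.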

The route parameter inconsistency. A standard Laravel apiResource route auto-singularizes its parameter (e.g. {ticketE}), while custom routes elsewhere in the project use {ticket}. Type-bound binding still works, but the URL surface is inconsistent. cix surfaced this through its route-listing view, which normalizes and aligns parameter names across registration styles. A symbol-only indexer cannot see route registration as a structured surface.
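
One way to implement that alignment, sketched in Python over plain URI strings — the regex and function are hypothetical, not cix's API:

```python
import re
from collections import defaultdict

# "/tickets/{ticket}" -> segment "tickets", parameter "ticket"
SEGMENT_PARAM_RE = re.compile(r"/([\w-]+)/\{(\w+)\}")

def route_param_inconsistencies(uris: list[str]) -> dict[str, set[str]]:
    """For each resource segment, collect every parameter spelling used
    directly after it; more than one spelling means the URL surface is
    inconsistent across registration styles."""
    spellings = defaultdict(set)
    for uri in uris:
        for segment, param in SEGMENT_PARAM_RE.findall(uri):
            spellings[segment].add(param)
    return {seg: params for seg, params in spellings.items() if len(params) > 1}
```

The key point is that the input is the normalized route table, not source text — a view a symbol indexer never materializes.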

Cost comparison

| | Symbol-only | cix |
| --- | --- | --- |
| Tool calls | ~55 (~25 of them shell-fallback) | ~29 |
| Files opened/read | ~25 | 1 (a single inspection of a flagged file) |
| Distinct probe-and-verify loops | ~12 | ~6 |
| Estimated tokens | 80–100k | 30–40k |

The symbol-only pass burned calls on layout discovery — listing directories, grepping for table names, checking schema-table existence — because the indexer's surface is purely symbol-shaped. cix's orientation step returned routes, schema entities, hot spots, entry points, and the credential-leaking file finding in a single call. Each "is this imported anywhere?" question collapsed into one structured query instead of a sweep of shell commands.
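
That collapse is worth making concrete. Instead of one grep per module, a single pass can build a reverse-import index that answers every "who imports this?" question at once. This is an illustrative Python sketch over ES-style imports, with hypothetical names; it is not how either indexer is implemented:

```python
import re
from collections import defaultdict

# Captures the module specifier of an ES-style import statement.
IMPORT_RE = re.compile(r"^import\s+.*?\s+from\s+['\"]([^'\"]+)['\"]", re.MULTILINE)

def reverse_imports(sources: dict[str, str]) -> dict[str, set[str]]:
    """One pass over all sources; importers[x] answers 'who imports x?'"""
    importers = defaultdict(set)
    for filename, src in sources.items():
        for target in IMPORT_RE.findall(src):
            importers[target].add(filename)
    return importers

def orphans(sources: dict[str, str]) -> set[str]:
    """Modules nothing imports -- the zero-importer finding from the table.
    A real sweep would exclude entry points before reporting these."""
    importers = reverse_imports(sources)
    return {name for name in sources if not importers.get(name)}
```

Building the index costs one traversal; every subsequent liveness question is a lookup rather than a shell sweep.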

What the two passes had in common

Both passes nailed the same eight findings. This is the right baseline: cix does not improve on symbol-indexer-class findings — both tools handle those. What it adds are the findings that require schema, route, and orientation context the symbol-only indexer simply doesn't have.

The token and call-count savings were not the goal of the exercise — they were a side effect. They came from cix being able to answer in one structured query what the symbol-only pass needed to assemble from many small probes.

What this case study does not prove

  • It does not prove cix is uniformly faster. This was one project of moderate size with a real rename in flight. Other projects, other tasks, will produce different numbers.
  • It does not prove the credential discovery generalizes. That finding came from a specific class of orientation check; not every cleanup pass will surface something equivalent.
  • It does not prove the symbol-only indexer is "bad." It found eight of eleven findings. It would catch the same eight on most other projects. The point is not that one tool is broken; the point is that schema and route awareness are real, additional capabilities.

What it does suggest

  • Cleanup workflows benefit disproportionately from index breadth. The findings cix added were specifically the ones that required cross-cutting visibility (a runtime DB connection in a static file, a route surface as structured data, a schema view that can flag duplicates).
  • The token savings are real. When a session burns 60–70% fewer tokens for the same outcome, that has direct cost implications at team scale.
  • Coverage compounds. A sweep that surfaces eleven findings instead of eight is not 37% more useful — it's the difference between "we did a cleanup pass" and "we did a cleanup pass that caught a credential leak."

Methodology notes

  • Both passes used the same evaluator (a coding-assistant session driven by an experienced maintainer).
  • Both passes were given the same prompt verbatim, with no project-specific guidance.
  • Both passes were cold-started (no prior context).
  • Token counts are estimates derived from observed tool-call patterns, not exact billing data.
  • Domain terms in the writeup are generic stand-ins for the real (Spanish-language) entity names.
