No other domain publishes its scoring rubric with the exam. An RFP contains Section L (instructions to offerors) and Section M (evaluation criteria). These tell you what to write, how to format it, and exactly how it will be scored. Outstanding, Good, Acceptable, Marginal, Unacceptable — the rating scale is published. The subfactors are published. The relative weights are published.
This is the most explicit loss function in any industry. When the persona says "review this volume against Section M," the AI doesn't need to infer what "better" means. The contracting officer already defined it.
The other advantage: volume. A single IDIQ recompete can produce 500 pages of proposal across three volumes. An SBIR Phase II is 50 pages with strict formatting. A CPARS narrative is two paragraphs that determine whether you win the next contract. Every one of these is text, every one is iteratively refined, and every one is evaluated against criteria you already have.
The technical volume is where contracts are won or lost. Evaluators score against Section M subfactors. Every paragraph should map to an evaluation criterion. Every claim should have proof.
# .persona
You are a proposal reviewer with Shipley methodology experience.
You review technical volumes against the RFP's Section L
instructions and Section M evaluation criteria. Focus on:
compliance with every stated requirement, traceability from
evaluation factors to proposal sections, specificity of
approach, substantiation of claims, and discriminating content
that differentiates from competitors. One improvement per round.
Read .crumbs. DONE when the volume would score Outstanding
on every subfactor.
# .tools
ito
cat
grep
wc
🍞 12 times toast tech-volume.md "improve — Section M compliance, substantiation, discriminators"
a2f3b4c Section M subfactor 2.1 requires "demonstrated experience with FedRAMP High" — no mention found in volume, add proof point
b4d5e6f paragraph on staffing approach says "highly qualified" — replace with specific certs, clearance levels, and years of experience
c6e7f8a RFP requires discussion of phase-in plan within first 30 days — section exists but doesn't address knowledge transfer from incumbent
d8f9a0b risk section identifies risks but doesn't include mitigation strategies — evaluators score risk handling, not just identification
e0a1b2c past performance reference in 3.2 describes contract scope but omits relevance to THIS requirement — add mapping to PWS paragraphs
f2b3c4d Section L mandates organizational conflict of interest disclosure — not addressed anywhere in volume
a4c5d6e key personnel resume for PM lists experience but doesn't tie accomplishments to evaluation criteria — add "so what" statements
b6d7e8f quality control plan references ISO 9001 but RFP asks for CMMI Level 3 — wrong framework, replace
c8e9f0a approach section uses passive voice throughout — evaluators want to see "we will" not "it is anticipated that"
d0f1a2b management volume cross-references tech volume Section 4.3 — but that section is numbered 4.2 in current draft
e2a3b4c cost volume assumptions don't match LOE estimates in technical approach — 3 FTEs described, 2.5 FTEs priced
DONE after 11 rounds
The OCI disclosure that was never addressed. The cross-reference to a section that was renumbered. The mismatch between technical approach and cost volume. Any one of these could drop a rating from Outstanding to Acceptable — or get the proposal thrown out entirely. Each round catches one before the contracting officer does.
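Some of these are catchable before the loop even runs. A minimal pre-flight sketch, assuming the management volume lives in a hypothetical mgmt-volume.md and section headings follow a "## 4.3" pattern:
# Hypothetical cross-ref check: every tech-volume section cited in the
# management volume must exist in the current draft
grep -oE 'Section [0-9]+\.[0-9]+' mgmt-volume.md | sort -u | while read ref; do
  grep -q "^## ${ref#Section }" tech-volume.md || echo "broken cross-ref: $ref"
done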
Small Business Innovation Research proposals have published evaluation criteria, strict page limits, and a success rate around 15%. The margin between funded and not-funded is often one weak section.
# .persona
You are an SBIR proposal reviewer familiar with DoD, NIH,
and NSF evaluation criteria. You review proposals for:
technical merit and innovation, team qualifications,
commercialization potential, and compliance with agency-
specific formatting requirements. Tighten every section
to maximize score within page limits. One improvement per
round. Read .crumbs. DONE when the proposal maximizes
score on every evaluation criterion.
# .tools
ito
cat
grep
wc
🍞 10 times toast sbir-phase2.md "sharpen — evaluation criteria, commercialization, page limits"
a1c2d3e innovation section describes the technology but not what's novel about it — add comparison to state of the art with specific gaps addressed
b3d4e5f commercialization plan says "large addressable market" — quantify: $2.3B TAM with bottoms-up analysis from three customer segments
c5e6f7a Phase I results referenced but key metrics buried in a paragraph — extract to a table showing baseline vs. demonstrated performance
d7f8a9b team section lists credentials but doesn't address PI's time commitment — evaluators check for sufficient dedication, add percentage
e9a0b1c technical objectives are vague — rewrite as measurable milestones with go/no-go criteria per DoD SBIR guidance
f1b2c3d budget justification for equipment purchase doesn't explain why lease isn't feasible — add cost comparison, required per solicitation
a3c4d5e letters of support from potential customers are mentioned but not included — add as appendix or remove the claim
b5d6e7f section 3 runs 6.5 pages against a 6-page limit — cut the background subsection, evaluators already know the problem space
DONE after 8 rounds
The page limit violation alone kills the proposal. The vague commercialization plan drops the score. The missing letters of support undermine credibility. SBIR reviewers spend 2-4 hours per proposal — they're looking for reasons to score down. Each round removes one.
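The page limit is also checkable mechanically. A rough sketch, assuming the section lives in a hypothetical section-3.md and using ~500 words per page as a proxy — tune the divisor to the solicitation's font and margin rules:
# Hypothetical length gate: flag the section when it blows its word budget
words=$(wc -w < section-3.md)
[ "$words" -le 3000 ] || echo "over budget: $words words (~$((words / 500)) pages)"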
After you win, you have to comply. The Federal Acquisition Regulation is 1,900+ pages. DFARS adds another 1,000+. Clauses flow down to subcontractors. Every clause has requirements. Missing one is a finding; missing several is a corrective action; a pattern is termination for default.
# .persona
You are a contracts compliance specialist. You review
contractor policies and procedures against applicable
FAR and DFARS clauses incorporated by reference in the
contract. Focus on: clause-by-clause compliance, flowdown
to subcontractors, required certifications and representations,
and reporting obligations. One gap per round. Read .crumbs.
DONE when all incorporated clauses are addressed.
# .tools
ito
cat
grep
🍞 10 times toast compliance-matrix.md "review — FAR/DFARS clause compliance, flowdown, reporting"
a2d3e4f FAR 52.204-21 (Basic Safeguarding) incorporated but no corresponding policy for protecting covered contractor info systems
b4e5f6a DFARS 252.204-7012 (Safeguarding Covered Defense Information) requires 72-hour incident reporting to DIBNet — SOP says "promptly notify the CO", add specific timeline and channel
c6f7a8b FAR 52.222-50 (Combating Trafficking in Persons) compliance plan required but not in compliance matrix — add policy and training evidence
d8a9b0c DFARS 252.225-7001 (Buy American) — commercial item determination file missing for COTS components in BOM
e0b1c2d FAR 52.203-13 (Code of Business Ethics) requires internal control system and disclosure program — no evidence of hotline or reporting mechanism
f2c3d4e DFARS 252.227-7013 (Technical Data Rights) — data deliverables list doesn't specify rights assertions per CDRL
a4d5e6f FAR 52.216-7 (Allowable Cost and Payment) — timekeeping system description missing labor distribution methodology
b6e7f8a subcontract with company X doesn't flow down DFARS 252.204-7012 — mandatory flowdown for all subcontracts involving CDI
c8f9a0b FAR 52.219-9 (Small Business Subcontracting Plan) — plan exists but goals don't match percentages in SF 294 report
DONE after 9 rounds
The missing DFARS cyber flowdown. The trafficking compliance gap nobody thinks about until DCMA asks. The small business subcontracting percentages that don't match. A DCAA audit or DCMA surveillance visit will find these — the question is whether you find them first. Each round closes one compliance gap.
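The flowdown gap in particular is a one-liner to scan for, assuming subcontract documents sit in a hypothetical subcontracts/ directory as markdown:
# Hypothetical flowdown scan: grep -L lists files with NO match —
# any file printed here never cites the mandatory cyber clause
grep -LF "252.204-7012" subcontracts/*.md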
Contractor Performance Assessment Reporting System. Two paragraphs per evaluation area that determine your past performance score on every future proposal. The government rates you; you get to respond. Most contractors write terrible responses because nobody treats CPARS like the high-stakes document it is.
# .persona
You are a CPARS response strategist. You craft contractor
responses that address government ratings with specific
evidence, quantified accomplishments, and context. Turn
"Satisfactory" into evidence for "Exceptional" on the next
proposal. For unfavorable ratings, provide factual context
without being combative. Every sentence must serve the
purpose of strengthening future past performance evaluations.
One improvement per round. Read .crumbs. DONE when every
response maximizes future proposal value.
# .tools
ito
cat
wc
🍞 6 times toast cpars-response.md "sharpen — quantify, evidence, future proposal value"
a1e2f3a "completed tasks on time" — replace with: delivered 47 of 48 CDRLs on or ahead of schedule, sole late delivery was government-caused delay
b3f4a5b cost control narrative is defensive — rewrite to lead with 12% underrun on CPFF portion, then explain the reallocation
c5a6b7c quality section mentions "no deficiencies" — add: passed 3 DCMA surveillance reviews with zero findings, QMS maintained ISO 9001 certification
d7b8c9d management section doesn't mention key personnel retention — add: 94% retention rate across 18-month period, zero gaps in key personnel positions
e9c0d1e "Marginal" rating on schedule — response should acknowledge the delay, provide root cause (government-furnished equipment late by 6 weeks), and document the recovery plan that delivered final milestone only 8 days past original date
DONE after 5 rounds
A CPARS response that says "we did good work" scores nothing. A response that says "delivered 47 of 48 CDRLs on schedule, maintained 94% key personnel retention, and passed DCMA surveillance with zero findings" wins the next evaluation. These two paragraphs per area echo across every proposal for five years.
Every DoD contractor handling Controlled Unclassified Information needs a System Security Plan documenting implementation of 110 security controls from NIST 800-171. CMMC assessments evaluate the SSP control by control. Vague implementation descriptions fail.
# .persona
You are a CMMC Registered Practitioner. You review System
Security Plans against NIST SP 800-171 Rev 2 control
families. Every control implementation description must be
specific to THIS environment — not boilerplate. Include:
the technology used, the configuration, the responsible
role, and evidence of implementation. One control per round.
Read .crumbs. DONE when every control description is
assessment-ready.
# .tools
ito
cat
grep
🍞 15 times toast ssp.md "tighten — NIST 800-171, environment-specific, assessment-ready"
a2c3d4e AC.1.001 says "access is limited to authorized users" — specify: Active Directory GPO enforces role-based access, quarterly access reviews by IT Manager per AC-SOP-003
b4d5e6f IA.3.083 references MFA but doesn't specify the implementation — add: Duo Security with push notification, enforced at VPN and O365 conditional access policy
c6e7f8a AU.2.041 says "audit logs are reviewed" — add frequency (weekly), tool (Splunk SIEM), responsible role (SOC Analyst), and alert thresholds
d8f9a0b CM.2.061 baseline configuration described generically — specify: Windows 10 STIG V2R3 applied via SCCM, Linux hosts hardened per DISA RHEL 8 STIG
e0a1b2c IA.2.078 password complexity stated but no reference to policy enforcement — add: enforced via AD GPO, minimum 14 characters, 60-day rotation per IA-SOP-001
f2b3c4d IR.2.092 incident response capability described but no test evidence — add: tabletop exercise conducted quarterly, last exercise March 2026, after-action report filed
a4c5d6e MA.2.111 says "maintenance is performed by authorized personnel" — specify: maintenance authorization list maintained by Facility Security Officer, updated within 5 days of personnel change
b6d7e8f MP.2.120 media protection describes encryption but not the standard — add: AES-256 via BitLocker on endpoints, LUKS on Linux, Virtru for email attachments containing CUI
c8e9f0a PE.1.131 physical access control described as "badge system" — specify: Lenel OnGuard, programmed for CUI-designated areas, access logs retained 3 years
d0f1a2b PS.2.127 personnel screening references "background checks" — specify: Tier 1 (T1) investigation minimum for CUI access, T3 for privileged users, adjudicated by FSO before access provisioned
e2a3b4c RA.2.141 vulnerability scanning described but no cadence — add: Tenable Nessus scans weekly on all CUI-scoped assets, critical findings remediated within 72 hours per RA-SOP-006
f4b5c6d SC.1.175 says "communications are monitored at external boundary" — specify: Palo Alto PA-850 at network perimeter, IDS/IPS signatures updated daily, alerts to SOC
DONE after 12 rounds
A CMMC assessor reads "access is limited to authorized users" and marks it as not met — the description could apply to any organization on earth. A description that names the technology, the configuration, the responsible role, and the evidence can be verified and scored as met. Each round converts one boilerplate control into one assessable implementation.
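Finding the boilerplate is the easy half. A heuristic sweep, assuming the per-control layout used in the batch section below and a hand-picked list of phrases that could describe any organization:
# Hypothetical boilerplate sweep: list control files still leaning on
# generic language — candidates for the next round
grep -lE "authorized (users|personnel)|as appropriate|industry standard" ssp/controls/*.md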
Performance Work Statements and Statements of Work define what the contractor must deliver. Ambiguity in a SOW creates disputes. Missing requirements create gaps. Poorly defined acceptance criteria create stalemates.
# .persona
You are a contracting officer's representative reviewing
Statements of Work. You check for: completeness of
requirements, measurability of performance standards,
clarity of acceptance criteria, proper use of PWS vs SOW
language, and consistency with CDRLs and contract line
items. One improvement per round. Read .crumbs. DONE
when the SOW is unambiguous and enforceable.
# .tools
ito
cat
grep
wc
🍞 8 times toast pws.md "tighten — measurable, unambiguous, enforceable"
a3d4e5f "provide adequate staffing" — define: maintain minimum 8 FTEs across task areas per labor category mix in Attachment J-3
b5e6f7a quality standard says "in accordance with industry best practices" — replace with specific standard: IEEE 829 for test documentation, CMMI-DEV ML3 for processes
c7f8a9b CDRL DI-MGMT-81466 referenced in Section 3.4 but not listed in Contract Data Requirements List — add or remove reference
d9a0b1c response time SLA says "within a reasonable timeframe" — define: Severity 1 within 1 hour, Severity 2 within 4 hours, Severity 3 next business day
e1b2c3d deliverable acceptance criteria missing for monthly status report — add: accepted when all data elements in Attachment J-7 template are complete and accurate
f3c4d5e travel section says "travel may be required" — specify: estimated 6 trips per year to CONUS locations, ODC ceiling of $45K, requires prior CO approval
a5d6e7f transition-out requirements not addressed — add: 90-day transition period, knowledge transfer deliverables, government rights in all work product per DFARS 252.227-7014
b7e8f9a Section 5 performance metrics reference a QASP that isn't attached — add as Attachment J-8 or define metrics inline
DONE after 8 rounds
"Adequate staffing." "Reasonable timeframe." "Industry best practices." These are the phrases that create contract disputes. Each round replaces one unmeasurable standard with one enforceable requirement. When the COR conducts surveillance, they either can or can't verify compliance. Vague language makes that determination impossible.
Government contracting is audited more heavily than almost any other industry. DCAA audits your costs. DCMA surveils your performance. DIBCAC assesses your cybersecurity. Contracting officers evaluate your past performance. Every one of them wants documentation — specific, current, and traceable.
The ito log: every change logged with intent. Content-addressed. Hashed. When DCAA asks who reviewed the cost narrative and when — it's here.
The .crumbs file: what was found, what was fixed, what remains. Maps to proposal review gate documentation. Shows the color team trail.
The prompts: tied to Section M evaluation factors, FAR/DFARS clauses, or NIST controls. Explicit. Reproducible. The methodology is the document.
Compare this to the current state: a SharePoint with 14 versions of Volume I, color-coded comments from Pink Team and Red Team in different Word files, a compliance matrix in Excel that doesn't match the actual proposal, and an email thread titled "RE: RE: RE: FW: Final Final Comments." The gradient descent approach produces a better proposal and a process you can actually reconstruct after the fact.
Shipley methodology already defines iterative review gates — Blue Team, Pink Team, Red Team, Gold Team. Gradient descent maps directly onto this:
# Pink Team — storyboards and outlines
🍞 5 times toast tech-volume-outline.md "review — Section M traceability, story arc, discriminators"
# Red Team — full draft review
🍞 10 times toast tech-volume-v2.md "score — Section M evaluation criteria, substantiation, compliance"
# Gold Team — final QC
🍞 5 times toast tech-volume-final.md "final check — cross-refs, page counts, RFP compliance, no unsupported claims"
# Compliance scrub
🍞 3 times toast compliance-matrix.md "verify — every Section L requirement addressed, every Section M factor mapped"
The difference: a color team meets for four hours and produces comments that may or may not get incorporated. The gradient descent loop runs in minutes, produces a complete change trail, and every modification is individually reversible. It doesn't replace the color team — it gives them a cleaner draft to review.
A mid-size GovCon company might have 15-20 active proposals, hundreds of active contracts with compliance obligations, and a CMMC assessment coming. The pattern scales because each document is independent.
# Batch proposal QC before submission
for vol in proposal-FA8750/*.md; do
🍞 5 times toast "$vol" "final check — Section L compliance, page limits, cross-refs"
done
# Annual compliance review across all contracts
for contract in active-contracts/*/compliance.md; do
🍞 5 times toast "$contract" "review — FAR/DFARS clauses, current flowdowns, stale references"
done
# SSP refresh before CMMC assessment
for control in ssp/controls/*.md; do
🍞 3 times toast "$control" "tighten — environment-specific, evidence referenced, assessment-ready"
done
# CPARS response season
for response in cpars-2026/*.md; do
🍞 5 times toast "$response" "sharpen — quantify, evidence, future proposal value"
done
The ito history across all active proposals becomes your institutional knowledge. When you bid the follow-on in five years, you don't start from scratch — you start from a documented trail of what worked, what was scored well, and what the government valued.
The AI doesn't know your solution. It doesn't know your competitive position. It doesn't make capture decisions or pricing strategy. It doesn't replace the capture manager, the proposal manager, or the solution architect.
What it does: the mechanical compliance work that eats proposal teams alive. Checking that every Section L instruction is followed. Verifying that every Section M factor is addressed. Catching the CDRL that's referenced but not listed. Finding the cross-reference that broke when sections were renumbered at midnight before submission.
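That Section L sweep can be made mechanical too. A minimal sketch, assuming you've hand-extracted one requirement keyword per line into a hypothetical section-L-reqs.txt:
# Hypothetical Section L sweep: flag any extracted requirement with no hit
# anywhere in the volume
while read req; do
  grep -qiF "$req" tech-volume.md || echo "unaddressed: $req"
done < section-L-reqs.txt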
The humans design the solution. The AI checks the paperwork. The trail documents both.
# .persona for your domain — pick one, tune it
You are a [proposal reviewer | contracts specialist | CMMC
practitioner | COR]. You review [document type] against
[Section M | FAR clause | NIST control family]. One
improvement per round. Read .crumbs. DONE when [scoring
threshold | compliance standard].
# .tools — keep it minimal
ito
cat
grep
wc
# Start with your next proposal
$ cd proposal-FA8750 && ito init
$ cp ~/drafts/tech-volume-v1.md .
# Pair mode — see what it catches
$ toast
> review this technical volume against Section M evaluation criteria
# Then let it run
🍞 10 times toast tech-volume-v1.md "improve — Section M, substantiation, compliance"
$ ito history
Read the trail. GovCon is the domain where the loss function is most explicit — the government literally publishes the rubric with the test. Section L is the format. Section M is the scoring. The FAR is the rulebook. Run the loop against any of them.
The government contracting industry spends billions of dollars a year on proposal development and compliance documentation. The evaluation criteria haven't changed. The FAR clauses haven't gone away. The CMMC requirements are only getting stricter. The mechanical work of checking documents against published standards — that's gradient descent. The government already wrote the loss function. It's in the solicitation.