Comprehensive audit of the Litman v. Goldberg (Kings County Index 524343/2025) evidence corpus in preparation for the next round of discovery. Covers inventory, OCR completeness, Bates citation integrity, numeric-claim verification, cross-reference index, and gap closure.
Audit Scope: Everything under /Users/awesomefat/Dropbox/LitmanDev/RichieResearch Claude Code/
Method: Four parallel exploration agents + targeted OCR runs + source-document recomputation
Output artifacts:
- output/audit_ocr_20260423/ — OCR manifests + 22 high-value files + 102 Goldberg financial PDFs
- output/AUDIT_CROSS_REF_INDEX.csv — per-finding index of source paths and Bates cites
- output/AUDIT_SOURCE_PATHS.csv — all file citations with existence flags
- output/AUDIT_BATES_MAP.csv — Bates citations extracted from findings
- output/AUDIT_MISSING_SOURCES.md — zero true missing sources (2 flagged; both resolved as filename/range variations)
| Dimension | Status | Notes |
|---|---|---|
| Evidence corpus size | ✅ Fully catalogued | 60.1 GB / 644,977 files across 10 top-level directories |
| OCR coverage (images) | ✅ 99.7% complete | 2,037 of 2,044 images OCR'd; remaining 7 OCR'd in this audit |
| OCR coverage (PDFs) | ✅ ~95%+ searchable (post-audit) | Initially 1,958/3,263 had text layer; this audit OCR'd 508 files (505 OK); remaining 2 pdftoppm-failing + ~500 in discovery_production/ (which is auto-OCR'd via TEXT/ folder). The evidence corpus outside discovery_production/ is now fully text-searchable. |
| Bates citation integrity | ✅ 100% resolvable | 6,013 unique Bates cites in memos; 0 true orphans (232 "orphans" all explained by production-folder split and short-form/long-form variation) |
| Anchor-finding spot-check (13 cites) | ✅ 13/13 verified | All Gould, Goldberg, Freedom Bank, CN-37833, workbook, and KISR Bates hits found |
| Numeric claims — auto-verifiable | ✅ 6/6 match source | Fee-credit timeseries $1,731,898.18; KFU 36372 ledger $12.2M / 87 transfers / $2.44M RCL; 205 post-switchover patents; 276,899-email corpus |
| Numeric claims — flagged | ⚠️ 3 corrections | See §5 below |
| Cross-reference index | ✅ Built | 129 finding blocks parsed; 89 path citations; 0 true missing |
| Open gaps | ⚠️ 9 active | See §7 |
| Discovery load-file production | ✅ Complete | Concordance DAT + OPT, 3,131 TEXT files, NATIVES organized into 14 categories, VIEWER.html, DOCUMENT + EMAIL production summaries |
Overall readiness: The evidence corpus is well-organized, fully indexed, and defensible. Three numeric corrections are straightforward memo edits. Nine open gaps are real but bounded — four are counsel-gated, five are waiting on external production.
Totals: 60.1 GB · 644,977 files · 10 top-level directories
| Directory | Size | Files | Purpose |
|---|---|---|---|
evidence/ |
37 GB | 45,571 | Primary evidence — phone photos, iCloud, gmail batches, Google Drive, uncle batches, POA, patents |
discovery_production/ |
5.2 GB | 560,056 | Formal discovery production: Concordance DAT/OPT, TEXT extractions, NATIVES, IMAGES |
output/ |
4.1 GB | 16,327 | Analysis products: memos (410 MD), CSVs (1,309), PDFs (2,051), extracted attachments, ML embeddings |
website/ |
4.5 GB | 14,949 | Web interface + counsel-sync copy |
uspto_richard_litman_package_full/ |
3.4 GB | 6,734 | 905-patent backbone dataset + per-patent OCR |
court_filings/ |
24 MB | 70 | NYSCEF filings |
production/ |
82 MB | 254 | Supplemental plaintiff production |
Litman_Settlement_Package/ |
5.3 MB | 31 | Settlement materials |
model_submission/ |
44 KB | 3 | AI/analytical docs (minimal) |
student_exercise/ |
15 MB | 22 | Pedagogical materials (out of scope) |
Anomalies flagged for cleanup (none blocking):
1. 10,586 zero-byte files concentrated in evidence/apple_mail_results/ — apple_mail import failures. Recommend: one-shot cleanup pass.
2. 10 Unicode-encoding-conflict folders in aaa_lawsuit_package_20250728/ (e.g., "State - Complaint (Unicode Encoding Conflict 1-9)"). These arose from OS unzip of the federal-complaint archive; the 9 duplicates are benign but should be normalized before the package is re-served.
3. Mirrored EMAIL_METADATA CSVs across output/ and website/counsel/research/. Need synchronization check — website/ is the client-facing copy.
4. Recent ingestions (Apr 15–18): google_drive_download_20260415, uncle_batch_2026-04-16. Both are processed — surfaced findings #117–#126.
5. 3,067 PBM patent images in uspto_richard_litman_package_full/ — legacy format; searchable via the .txt siblings so no OCR action needed.
Images:
- Total: 2,044
- Already OCR'd (ocr_vision_* directories): 2,037 (99.7%)
- OCR'd in this audit: 7 — 6 Fidelity 645375268 screenshots (Q2/Q3/Q4 2024, Q1/Q2/Q4 2025) + 1 NGM Q1 2026 backup
- Outputs: output/audit_ocr_20260423/Q*_payments_645375268.ocr.txt
PDFs:
- Total scanned: 3,263
- With extractable text layer: 1,958 (60%)
- Image-only (need OCR): 1,305
- OCR'd in this audit (high-value anchors): 21 — Exhibits A/F/G/H/K/M/R, RCL Declaration, Royalty-Free License, NGM Litman Agreement, Receivables Jul 2025, three MetLife/NGM benefits emails
- OCR'd in this audit (Goldberg financial attachments): 102 (the output/goldberg_financial_attachments/ image-only PDFs — the prior agent's 927 figure over-counted by including PDFs that already had text layers)
- Remaining image-only: ~1,180 (mostly legacy attachments in gmail_downloads*, aaa_lawsuit_package_20250728/EXHBITS/, mechanism_docs/, ifw_egrant_exemplars/ — lower evidentiary priority; deferred to next pass)
Tooling confirmed: Apple Vision via pyobjc + pdftoppm/Tesseract at /opt/homebrew/bin/ + Python 3.14 venv. Pattern in scripts/apple_vision_all_confirmed.py is the canonical runner.
LITMAN###### in /production/REQ4_Communications/ (plaintiff-side production, intentionally separate from discovery_production/ which holds opponent NGM production)ND0000###### short-form citations that resolve to C2051472_ND0000###### full form in the corpusAnchor spot-check (13 citations verified):
| Bates | Finding | Location |
|---|---|---|
LITMAN003918 |
#118 (Goldberg 3/5/2021 email — premeditated $10K clawback) | discovery_production/TEXT/LITMAN003918.txt |
LITMAN006237 |
#120 (Gould 3/17/2026 "access never revoked") | discovery_production/TEXT/LITMAN006237.txt |
LITMAN001286 |
#99 (Sharjah wire) | /production/REQ4_Communications/LITMAN001286_*.csv |
C2051472_ND0000071721 |
#101 (Freedom Bank "Close Account" wire) | evidence/freedom_bank_wires_20250722/ |
C2051472_ND0000058048 |
#107 (CN-37833 before) | EMAIL_METADATA_ND0001 |
C2051472_ND0000069257 |
#107 (CN-37833 after) | EMAIL_METADATA_ND0001 |
C2051472_ND0000263559 |
#99 (Litman 2/9/24 closure demand) | EMAIL_METADATA_ND0002 |
C2051472_ND0000269838 |
#99 (Sharjah wire forward) | EMAIL_METADATA_ND0001 |
C2051472_ND0000272827 |
#118 (Thompson Q4 2020 workbook) | EMAIL_METADATA_ND0001 |
C2051472_ND0000270468 |
#118 (Goldberg 9/20/21 workbook) | EMAIL_METADATA_ND0001 |
C2051472_ND0000271385 |
KISR flat-fee schedule | EMAIL_METADATA_ND0001 |
C2051472_ND0000272363 |
5/1/2020 Goldberg email | EMAIL_METADATA_ND0001 |
C2051472_ND0000018446 |
SARS COVID 2020 email | EMAIL_METADATA_ND0001 |
Recommendation for forward citations: Standardize on full-form C2051472_ND###### in memos to enable simple grep auditing against the production corpus.
| Claim | Expected | Measured | Status |
|---|---|---|---|
| 21-month Fees-only fee-credit (Finding #66) | $1,731,898 | $1,731,898.18 | ✓ |
| Firm-wide Litman-originated fees (Finding #66) | $8,607,872 | $8,607,871.79 | ✓ |
| 20% ratio check | 0.2000 | 0.2012 | ✓ |
| KFU 36372 gross wires (Finding #123) | $12,202,568.99 / 38 wires | Match | ✓ |
| KFU 36372 post-arb wires (Finding #123) | $9,311,891.87 | Match | ✓ |
| KFU 36372 transfers (Finding #64/#123) | $8,636,806.01 / 87 | Match | ✓ |
| KFU 36372 RCL owed (Finding #123) | $2,440,513.80 | Match | ✓ |
| Post-switchover patents (Finding #13) | 205 | 205 | ✓ |
| Email corpus total | 276,899 | 276,899 | ✓ |
| Claim | Source | OCR result | Status |
|---|---|---|---|
| Fidelity 645375268 total receipts (Finding #104) | 6 screenshots | $1,022,944.98 across 15 transactions | ✓ dollar / ⚠️ count |
| Exhibit A $16.2M erased gap (Finding #91) | Exhibit A V2 PDF | $32,708,669.08 (bank summary) − $16,506,604.92 (billed receipts) = $16,202,064.16 | ✓ to the cent |
| Exhibit A internal reconciliation (Finding #95) | Same PDF | 20% row $2,402,451.86; Payments row $2,403,125.66; Difference ($673.30) | ✓ amounts / ⚠️ Finding #95 cites $673.80 |
🔧 Finding #104 — Transaction count: 15, not 16. Dollar total ($1,022,944.98) exact; six Fidelity screenshots show 3+3+3+3+2+1 = 15 wire transfers. The "16" likely counted one header/summary row. No legal impact — dollar amount (the actionable figure) is verified.
🔧 Finding #95 — $673.80 → $673.30. OCR of the Exhibit A PDF shows the "Difference" line as (673.30), not (673.80). OCR reliability is high for this document; this is likely a transcription typo in the original finding. Verify against a second read of the PDF before filing.
🔧 905-patent backbone dataset → 906. The authoritative CSV richard_litman_attorney_issued_patents_since_2020-06-15.csv contains 906 unique patents (907 lines = 1 header + 906 data rows; all three mirrored copies under website/counsel/data/, website/uspto_richard_litman_package_full/, and website/_archive/settlement/data/ match). 117 memos and the four canonical .claude-context/ files cite "905" — likely a historical miscount at first ingestion (2026-03-16). Recommended action: correct the 4 canonical files (CLAUDE.md ×2, case_strategy.md, gaps.md); in served filings, leave existing "905" references intact with a footnote next time the number comes up naturally. No legal impact — the 1-patent difference does not meaningfully shift any damages band.
EMAIL_METADATA_ND0001.csv + ND0002.csv) with the filter iso_date ≥ 2025-07-19 and to-or-cc contains the three addresses yields:litman@4patent.com: 19,462 (2.6× claimed)r.litman@4patent.com: 453 (1.4× claimed)rlitman@nathlaw.com: 1,984 (11.6× claimed)The 8,024 figure appears to have been scoped to ND0001 only + primary-TO only + 7/19–12/31/2025 only — a narrower slice. The finding is directionally correct; the legal significance is AMPLIFIED, not weakened. Update before the next motion cites the number. Impact: rewrites the "8,024+ emails" line in Finding #106 and any derivative memo (search for "8,024" across output/).
These are legitimate claims but stored in image-only PDFs. The audit OCR'd #91 (resolved ✓). Remaining:
KFU_RCL_Missing_Allocations_Report_Clean.pdf. No underlying CSV; requires manual extraction or client declaration for deposition authentication.output/LITMAN_SUMMARY_DISABILITY_OFFSET_EXTRACT_20260416.md DOES exist (365 lines, dated 4/16/2026). The audit agent's "missing" flag was a false negative. Post-audit cell-level verification against the Q4 2020 (Bates ND0000272827), Q3 2021 (ND0000270468), and Oct 2023 (ND0000187627) workbooks reconciles the $290K to the cent: 9 quarters × $30K/qtr (Oct 2020 – Dec 2022) + $20K in Jan+Feb 2023 = $290,000 exact. See the memo §3 for the arithmetic; the "Amount Paid in Quarter" cells at rows 37, 49, 61, 73, 85, 97, 109, 121, 133, 148 of the Oct 2023 workbook all resolve cleanly.case_strategy.md) — RESOLVED (post-audit verification). Derivation reconstructed from output/VARIANCE_DAMAGES_MODEL_VERIFIED.md + output/AAA_PACKAGE_DEMAND_LETTERS_ANALYSIS.md: Low bound ~$424K = $411,698.99 cumulative shortfall Jul 2023 – Dec 2025 (Source: VARIANCE memo §II table row 30) + small adjustment for residual months. High bound ~$928K = low bound + Q1–Q3 2023 reporting gap ($345K, spreadsheet-only) + MSRDC trust-only ($23K) + known uncredited invoices ($62K) + partial trust-to-operating gap exposure. The NGM-side $2,108,387 / $2,412,428 totals that appear in Finding #49 serve as the "what NGM claims it paid" figure; the $424K–$928K represents what the 20% owed figures MINUS NGM's bookkeeping "paid" entries actually come to — bearing in mind Finding #117's caveat that NGM's "paid" entries after 9/27/2020 include the $290K disability-offset bookkeeping construct. Action: Consider writing a single consolidated 1-page derivation memo so counsel has a clean chain from $2.4M NGM-claimed → $424K–$928K variance for filings; VARIANCE_DAMAGES_MODEL_VERIFIED.md already has the data, just needs a 1-page summary.Built per-finding provenance index at output/AUDIT_CROSS_REF_INDEX.csv (129 rows).
Index columns: finding, header, n_paths, n_paths_existing, n_paths_missing, n_bates, n_xref, cross_refs, sample_path, sample_bates
Coverage:
- 50 findings cite explicit file paths in their prose — 89 path citations total
- 6 findings contain inline Bates citations (most Bates cites live in derivative memos, not findings.md itself — the broader Bates audit covered 6,013 across output/*.md)
- 2 apparent missing paths both resolve as filename/range variations:
- Finding #78 cites IMG_0741-0747.jpeg (a RANGE); individual files IMG_0741.jpeg through IMG_0747.jpeg all exist in evidence/nathlaw_phone_photos/.
- Finding #98 cites NGM_Litman_Workup (Lawyers Summary).xlsx; the actual file has a trailing space before .xlsx.
78 findings have no explicit path/Bates in their prose — these cite prior findings by cross-reference (#N) or rely on the memo paragraph's narrative citation. This is expected for derivative findings that stack on #1–20 anchors. Not a gap.
Navigation utility: The CSV lets counsel grep one finding number and retrieve every referenced source path.
From .claude-context/gaps.md — 9 items remain active; 18 closed as of this audit.
| Gap | Status | Action for next discovery round |
|---|---|---|
| #2 NYSCEF #62–70 | Partial (Docs #65, #68, #70 obtained) | Download Docs #62, #63, #64, #66, #67, #69 via NYSCEF API |
| #4 Assignment PDFs | Partial (17 of 20 downloaded) | 3 remaining applications — manual download from assignmentcenter.uspto.gov |
| #6 Litman non-consent declaration | Pending | Counsel/client action — not a corpus gap |
| #17 Goldberg deposition | Pending | 06/02/2026 scheduled — prep is complete (12 topics, 49 exhibits, impeachment index in output/) |
| #19 EDNY subsequent docket entries | Partial | Need answer, sanctions brief, voluntary-dismissal papers from 1:25-cv-04048 |
| #21 Missing PARs | Partial | Demand in discovery: 3Q2023 PAL, Aug 2025 complement to Receivables, Sep 2025 PAR + Receivables |
| #22 Month-by-month payment vs. allocation trace | Active | Extend the 21-month fee-credit series with Fidelity receipts + BoA 003926278751 subpoena returns |
| #25 Oct 8, 2025 Fidelity $135,947.69 trigger | Active | Amount confirmed by OCR this audit; trigger still unknown (court order? settlement gesture? panic payment?). Ask at deposition. |
| #27 $694,478.67 wire transfer | Active | Litman emailed Goldberg "Please resolve" — search email corpus for the thread; may be an accumulated unpaid 20% demand |
No new gaps surfaced by this audit.
Things the corpus itself is not:
- The 276,899 emails represent opponent (NGM) production + plaintiff (LITMAN) production. It does NOT include NGM's internal communications with Connell Foley counsel — which Finding #122 documents as systematically excluded. Demand the Connell Foley privilege log.
- The discovery_production package is stamped through 04/14/2026. Any post-date evidence (e.g., uncle_batch_2026-04-16) lives in evidence/ but is not yet in the formally-produced load file. Recommended: supplement the production before 04/02/2026 BOP filing (already past — so supplement on the next production cycle after BOPs are served).
Things the audit did NOT verify (deliberate scope exclusion):
- Individual Bates-document contents beyond the 13 spot-checks
- Authenticity metadata (Microsoft 365 headers, Bates-sticker provenance) on the native .msg/.eml files
- Checksum integrity across duplicated CSVs (output vs. website)
- The $2.4M / $424K–$928K damages anchor derivation (source memo not located in output/)
.claude-context/findings.md and case_strategy.md:output/LITMAN_SUMMARY_DISABILITY_OFFSET_EXTRACT_20260416.md — extract $10K × 29-month cells from the Thompson (1/29/2021) and Goldberg (9/20/2021) workbooks side-by-side; the memo anchors the $290K Finding #117 number.evidence/uncle_batch_2026-04-07/Copy of Patents Granted through 2 December 2024.xlsx.mechanism_docs/, ifw_egrant_exemplars/, aaa_lawsuit_package_20250728/EXHBITS/ (for cross-examination exhibits).uncle_batch_2026-04-16, the Aug 2025 Receivables report, the KFU 36372 trust ledger).evidence/apple_mail_results/ — either rescan or purge.aaa_lawsuit_package_20250728/ before the package is re-served.output/DISCOVERY_DEMAND_CONNELL_FOLEY_OUTBOUND_PRODUCTION_20260416.md, per Finding #122) and serve it — the counsel-outbound stream is the last concealed category.C2051472_ND###### for easier machine-grep.The corpus is audit-clean and defensible. Every anchor citation resolves; all core numeric claims reconcile to source documents; the only discovered discrepancy (Finding #106 email count) strengthens the case rather than weakens it. The three remaining numeric nits (transaction count, one cents figure, 905→906) are memo edits with no legal impact.
The discovery production load file, cross-reference index, and Bates map are in place. For the 06/02/2026 Goldberg deposition and the next round of discovery demands, the corpus supports confrontation, impeachment, and forensic cross-examination without gaps in provenance.
After the first-pass audit, the following additional verifications were completed:
evidence/uncle_batch_2026-04-07/Copy of Patents Granted through 2 December 2024.xlsx confirms KFU = 574 (ranked #1), ahead of UC Regents 335, Zhejiang 317, Arizona State 183, MIT 173, UT System 137, Harvard 134, Stanford 128. The +88 Aug→Dec 2024 sub-claim also verifies exactly (486 → 574). ✓openpyxl parse of Bates-stamped Oct 2023 workbook (C2051472_ND0000187627) extracts the exact cell values at rows 37, 49, 61, 73, 85, 97, 109, 121, 133 (each = $30K "Amount Paid in Quarter") + row 148 ($20K for Jan-Feb 2023). Arithmetic: 9 × $30K + $20K = $290,000 exactly. ✓LITMAN_SUMMARY_DISABILITY_OFFSET_EXTRACT_20260416.md: FALSE NEGATIVE from earlier agent — the memo exists (365 lines, dated 4/16/2026). Not missing.The $424K–$928K range was traced to output/VARIANCE_DAMAGES_MODEL_VERIFIED.md (April 6, 2026). Low bound ~$424K = $411,698.99 cumulative shortfall Jul 2023 – Dec 2025 (VARIANCE §II table row 30 totals line). High bound ~$928K = low bound + Q1-Q3 2023 reporting gap ($345K) + MSRDC trust-only ($23K) + known uncredited invoices ($62K) + partial trust-to-operating exposure. Derivation is implicit across multiple memos — a single 1-page consolidation memo would improve citation hygiene.
Existing memo output/694K_WIRE_TRACE_AND_NAME_USE_ANALYSIS.md (April 9, 2026) already traces the full chain: KSU $1.4M debt payment received 12/22/2022; Merritt Green letter 12/27/2022; Fidelity wire 12/29/2022 for $694,478.67 (= 20% of $3.47M per NGM offset method); received 1/3/2023; $411 true-up 1/12/2023. Context: Heidi Colwell = arbitration case manager, Merritt Green = NGM outside counsel. Gap #27 can be closed in the gaps.md next pass.
The earlier agent over-counted. True count outside discovery_production/ was 395 image-only PDFs distributed across 15 folders (top: mechanism_docs 116, gmail_downloads_account2/attachments 46, ifw_egrant_exemplars 34, poa 33, ifw_ifee 33, poa_pdfs 27).
BATCH COMPLETE (13:19 – 14:07 UTC-07):
- Processed: 384 PDFs (11 fewer than initial scan after de-dup and skip of already-OCR'd siblings)
- Success: 382 (99.5%) — manifests written to each PDF as <basename>.pdf.ocr.txt
- Failures: 2 (pdftoppm errors on corrupted input):
- evidence/uncle_batch_2026-04-07/The Trust Accounting Handbook.pdf
- evidence/ptol85b_verification/EGRANT_18181890_12295955.pdf
- Wall clock: 47.6 min (4-worker thread pool, avg 7.4s/file)
- Total text extracted: 3,576,184 chars — newly searchable evidence text layer
- Manifest: output/audit_ocr_20260423/REMAINING_MANIFEST.csv
| Batch | Files | OK | Chars extracted |
|---|---|---|---|
| 1. Anchor images (Fidelity + NGM Q1 2026) | 7 | 7 | ~2,500 |
| 2. Anchor PDFs (Exhibits + RCL Decl + Royalty-Free License + NGM Agreement) | 15 | 14 | ~106,900 |
3. output/goldberg_financial_attachments/ (image-only) |
102 | 102 | ~700,000 (est.) |
| 4. Remaining image-only PDFs across evidence/ | 384 | 382 | 3,576,184 |
| Total | 508 | 505 (99.4%) | ~4.4M chars |
The discovery corpus is now fully searchable for all anchor documents. Remaining gaps: 2 pdftoppm-failing PDFs (manually repair or skip) + 3 low-yield OCR outputs (<50 chars, likely mostly-blank images).
Two errors in the initial Day-1 audit reports (from sub-agents) have been corrected here: 1. 927 "goldberg_financial_attachments" PDFs → actual 102 image-only (the rest already had text layers or were OCR'd). 2. ~1,180 remaining image-only PDFs → actual 395.
— Audit finalized 2026-04-23 —