May 1, 2026

From CRISPR screen to triage: twenty candidate dependencies, one SPARKIT call

It's 4pm Wednesday. A computational biologist at a Series B biotech just got the readout from a CRISPR screen: twenty candidate hits cleared the threshold and were validated against the wider DepMap dataset. By Friday morning's R&D meeting, leadership wants a triage. Which of these are worth the next twelve months of follow-up? Which map cleanly to a known mechanism? Which still represent an open commercial opportunity, and which are already in someone else's pipeline?

Doing this well means, per gene: pulling the primary literature on the dependency biology, scanning prior chemistry for named compounds, and checking which programs are in the clinic, which are preclinical, and which targets are still untouched. At a normal human cadence, that's about 25 hours per gene. Twenty genes is 500 person-hours, roughly 12 weeks of one researcher's calendar for the triage step alone, before any follow-up analysis begins. The clock says forty-eight hours.
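The arithmetic behind that estimate is short enough to check directly. A minimal sketch using the figures quoted above; the 40-hour working week is an assumption, not something the estimate states.

```python
# Back-of-envelope cost of manual triage.
HOURS_PER_GENE = 25   # figure quoted in the text
GENES = 20            # figure quoted in the text
HOURS_PER_WEEK = 40   # assumption: one full-time researcher

total_hours = HOURS_PER_GENE * GENES
weeks = total_hours / HOURS_PER_WEEK

print(f"{total_hours} person-hours, about {weeks:.1f} working weeks")
# prints "500 person-hours, about 12.5 working weeks"
```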

Below is what that workflow looks like with one variable changed: instead of skimming abstracts and triaging by gut feel, the scientist sent all twenty genes to SPARKIT in a single query.

The twenty hits

The screen flagged twenty well-validated DepMap selective dependencies, spanning the maturity spectrum from approved drugs to fully preclinical:

Gene	Selective dependency context
WRN	Microsatellite-instability-high cancers
MDM2	TP53-wildtype, especially MDM2-amplified
EZH2	SWI/SNF-mutant cancers, EZH2-mutant lymphomas
PRMT5	MTAP-deleted cancers (~10–15% of solid tumors)
POLQ	Homologous-recombination-deficient cancers
USP1	BRCA-deficient, PARPi-resistant
WEE1	TP53-mutant, replication stress
PKMYT1	CCNE1-amplified
ATR	DDR / replication stress
MAT2A	MTAP-deleted (partner of PRMT5)
BRD9	SMARCB1-deficient synovial sarcoma
SMARCA2	SMARCA4-mutant
KAT6A	ER-positive breast cancer
DNMT1	AML and hematological cancers
CHK1	DDR / replication stress
CDK7	Transcription, super-enhancer addiction
CDK12	DDR, mCRPC with biallelic CDK12 loss
PARG	DDR
WRNIP1	DDR (emerging)
SLC7A11	KEAP1-mutant, ferroptosis vulnerability

The query

A single 1,200-character prompt asking, per gene, for disease context, named candidate compounds, clinical-trial stage, and a druggability assessment, plus a closing comparison that groups the targets by maturity tier and separates open opportunities from crowded fields.
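To make the shape of such a prompt concrete, here is a rough sketch of assembling one from the gene list. The wording and the `build_triage_prompt` helper are illustrative assumptions, not the prompt that was actually sent, and no SPARKIT call is shown because the post does not describe the API's interface.

```python
# The twenty genes from the table above.
GENES = [
    "WRN", "MDM2", "EZH2", "PRMT5", "POLQ", "USP1", "WEE1", "PKMYT1",
    "ATR", "MAT2A", "BRD9", "SMARCA2", "KAT6A", "DNMT1", "CHK1",
    "CDK7", "CDK12", "PARG", "WRNIP1", "SLC7A11",
]

# The per-gene asks named in the text.
PER_GENE_ASKS = [
    "disease context",
    "named candidate compounds",
    "clinical-trial stage",
    "druggability assessment",
]


def build_triage_prompt(genes, asks):
    """Hypothetical helper: join the genes and per-gene asks into one prompt."""
    return (
        f"For each of the following {len(genes)} DepMap selective "
        f"dependencies, report: {'; '.join(asks)}. "
        f"Genes: {', '.join(genes)}. "
        "Close with a comparison that groups the targets by maturity tier "
        "and separates open opportunities from crowded fields."
    )


prompt = build_triage_prompt(GENES, PER_GENE_ASKS)
print(len(prompt), "characters")
```

Keeping the gene list and the asks as separate lists means the same template can be reused for the next screen by swapping only `GENES`.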

What came back

Wall clock: 1,782 seconds (29.7 minutes). Sources cited: 105. Below is a substantial excerpt; the full report runs to roughly 4,000 words and is the kind of thing you'd save into a research-ops doc and bring to the meeting.

Summary

The 20 targets span the maturity spectrum from FDA-approved (EZH2 with tazemetostat; DNMT1 indirectly via azacitidine/decitabine) down to fully preclinical (WRNIP1; SLC7A11/xCT). The most crowded, "late-and-late" fields are ATR, EZH2, MDM2, PRMT5 (MTA-cooperative) and WEE1, where multiple Phase 2/3 programs from large pharma are converging. The clearest commercial white space is around WRNIP1, SLC7A11, selective non-nucleoside DNMT1, BRD9 (post-failure reset), CDK12 selective drugs and PARG — each has either zero, one, or only failed clinical-stage assets, often in genetically defined niches with strong human-genetics validation.

Maturity-tier breakdown

Tier	Targets	Notes
Approved	EZH2 (tazemetostat, valemetostat), DNMT1 (legacy aza/dec only)	Only EZH2 has selective approvals; DNMT1 lacks any selective non-nucleoside drug.
Phase 3 / pivotal	MDM2 (brigimadlin, navtemadlin), EZH2 (mevrometostat), KAT6A (PF-07248144), ATR (ceralasertib LATIFY), WEE1 (azenosertib)	Leaders 1–3 years from filing; new entrants need clear differentiation.
Late Ph1 / Ph2	PRMT5 (MRTX1719, AMG193, TNG462, AZD3470), MAT2A (IDE397), POLQ (ART4215), PKMYT1 (lunresertib), USP1 (RO7623066, ISM3091), CHK1 (prexasertib/ACR-368), CDK7 (samuraciclib), PARG (IDE161)	Active proof-of-concept zone.
Early Ph1	WRN (HRO761, RO7589831), SMARCA2 (PRT3789, FHD-909), CDK12/13 (CT7439), WEE1 (Debio 0123)	First wave still de-risking.
Preclinical / failed	BRD9 (CFT8634 disc.; FHD-609 paused), WRNIP1 (no asset), SLC7A11 (no clinical asset), selective DNMT1 (GSK programs paused)	True white space.
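For anyone carrying the tier table into a notebook, one way to hold it is as an ordered list of tiers, from which the most advanced tier per gene falls out. This is a sketch of that bookkeeping only: the tier assignments are copied from the table above, the variable names are assumptions, and "CDK12/13" is recorded under CDK12.

```python
# Tiers ordered from most to least advanced, as in the table above.
TIERS = [
    ("Approved", ["EZH2", "DNMT1"]),
    ("Phase 3 / pivotal", ["MDM2", "EZH2", "KAT6A", "ATR", "WEE1"]),
    ("Late Ph1 / Ph2", ["PRMT5", "MAT2A", "POLQ", "PKMYT1", "USP1",
                        "CHK1", "CDK7", "PARG"]),
    ("Early Ph1", ["WRN", "SMARCA2", "CDK12", "WEE1"]),
    ("Preclinical / failed", ["BRD9", "WRNIP1", "SLC7A11", "DNMT1"]),
]

# Because tiers are ordered, the first tier a gene appears in is its most advanced.
most_advanced = {}
for tier, genes in TIERS:
    for gene in genes:
        most_advanced.setdefault(gene, tier)

# Genes listed in more than one tier (different assets at different stages).
multi_tier = sorted(
    g for g in most_advanced
    if sum(g in genes for _, genes in TIERS) > 1
)

# Genes whose best asset is still preclinical or failed.
white_space = [g for g, t in most_advanced.items()
               if t == "Preclinical / failed"]

print(multi_tier)   # ['DNMT1', 'EZH2', 'WEE1']
print(white_space)  # ['BRD9', 'WRNIP1', 'SLC7A11']
```

Note what the per-gene view hides: DNMT1 counts as approved because of the legacy nucleoside drugs, so the selective non-nucleoside opportunity the report flags only shows up if assets, not genes, are the unit of analysis.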

One gene, in depth — WRN

Disease context. Synthetic-lethal dependency driven by TA-dinucleotide repeat expansions that form non-B DNA in MSI-H tumors, requiring WRN's helicase/ATPase activity for resolution.

Clinical pipeline.

Novartis HRO761 — allosteric inhibitor locking D1/D2 helicase domains, Phase 1/1b in MSI-H solid tumors.

Vividion/Roche RO7589831 (VVD-133214) — covalent allosteric inhibitor, also in Phase 1, both having shown early clinical activity.

Multiple fast-followers (e.g. ZMS-4084) entering at the bioisostere/IND stage.

Druggability: now validated — chameleonic structure-based drug design overcame the historic "undruggable helicase" label.

The full report covers MDM2, EZH2, PRMT5, POLQ, USP1, WEE1, PKMYT1, ATR, MAT2A, BRD9, SMARCA2, KAT6A, DNMT1, CHK1, CDK7, CDK12, PARG, WRNIP1, and SLC7A11 in the same shape — named compounds, clinical-stage programs, druggability assessments, with every claim cited.

The closing that drives the triage

Most crowded competitive fields: ATR (5 clinical assets, one in Ph3), EZH2 (≥5 clinical, two approved, one Ph3), MDM2 (≥4 mid-to-late), MTA-cooperative PRMT5 (≥4 with rapidly converging Ph2 reads), and WEE1 (≥4 with one in pivotal pursuit). New programs in these targets need a sharp differentiator — paralog/isoform selectivity, brain penetration (e.g., TNG908, Debio 0123), cleaner hematology safety, oral/QD dosing, or biomarker-defined niches not yet claimed.

Most open commercial opportunities for new programs:

WRNIP1 — strong preclinical synthetic-lethality rationale and zero competition; risk is unproven druggability.

SLC7A11/xCT — KEAP1-mutant NSCLC is a large genetically defined population with no targeted therapy.

Selective non-nucleoside DNMT1 — large hematology market, validated biology, GSK's tools have stalled.

BRD9 — after CFT8634/FHD-609 setbacks, a cleaner degrader with no QTc liability is contestable.

PARG — only IDE161 is credible; second entrants with brain penetrance or distinct combos have a clear lane.

PKMYT1 — Repare/Debiopharm is unchallenged in CCNE1-amp; a second-generation differentiated chemotype is a single-asset opportunity.

CDK12 (selective) — high technical risk but uncontested clinically; success addresses an mCRPC/ovarian biomarker population with no targeted option.

That's the answer to the triage question, before the meeting starts.

What this would have cost by hand

Twenty genes × 25 hours per gene = 500 person-hours — three months of one comp-bio scientist's calendar, just to triage. By Friday's meeting, structurally impossible. Any honest manual workflow at this scale truncates to "skim five genes I already know, list fifteen as 'needs follow-up later,' guess." The cost of that compression is what shows up three months later as "we should have caught that earlier."

With SPARKIT: a single 1,200-character prompt, thirty minutes of wall clock, one query against the API. The triage that would have taken three months lands in time for the meeting, with citations intact so leadership can audit any part of the synthesis they want to push back on.
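The comparison can be put into numbers with the two figures reported in this post. Treat it as a rough sketch: it sets person-hours of manual work against wall-clock time for the query, and it leaves out the time a human still spends reading and checking the report.

```python
MANUAL_HOURS = 500          # 20 genes x 25 hours, from the text
WALL_CLOCK_SECONDS = 1782   # reported SPARKIT run time

minutes = WALL_CLOCK_SECONDS / 60
ratio = MANUAL_HOURS * 3600 / WALL_CLOCK_SECONDS

print(f"{minutes:.1f} minutes of wall clock, roughly {ratio:.0f}x less elapsed time")
# prints "29.7 minutes of wall clock, roughly 1010x less elapsed time"
```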

What SPARKIT didn't do

Honesty about a real workflow:

The report does not include patent-landscape synthesis — only published literature and press releases. For competitive IP review you still need Espacenet or a paid patent search tool.
Strategic recommendations are the model's reading of the literature, not market analysis. The "WRNIP1 is the most open commercial opportunity" framing is a credible reading of what's published, but a comp-bio scientist's gut feel about regulatory/payor risk, target-class precedent, or specific company posture is not in those sources. The Markdown is the substrate; the human still does the strategy.
A few targets came back thinner than others. WRNIP1 has almost no chemistry literature — SPARKIT correctly flagged that, but the report on that gene is shorter and less actionable as a result. The agent reflects the field's coverage; it does not invent literature that isn't there.

The point isn't to take the human out of the triage loop. It's to move them from skimming abstracts to evaluating sourced summaries, and to deliver in about thirty minutes what used to take months.

The pattern

Triage isn't research. It's the gate before research — the question of which of fifty hits is worth anyone's time. SPARKIT compresses the gate from months to minutes, with the citations intact so the next person can audit the call.

The science is hard. Reading the science shouldn't be the bottleneck.

