NYT v. OpenAI — 20 Million Logs in Discovery
On January 5, 2026, Judge Sidney Stein affirmed a magistrate's order compelling OpenAI to produce twenty million anonymised ChatGPT logs into the New York Times's discovery in the Southern District of New York. Summary judgment is set for April 2026. A working note on what the ruling did, on what twenty million logs can and cannot reveal in litigation, and on the novel discovery-power implications for the entire generative AI category.
NYT v. OpenAI has, for most of its life as a public case, been treated as a copyright lawsuit. It is one. The reporting on it has been dominated by two questions: whether training a large language model on the New York Times’s journalism constitutes infringement under US copyright law, and whether the model’s occasional reproduction of that journalism in its outputs does. Those are the questions on the merits and they are the questions a summary-judgment motion in April 2026 will be asked to resolve.
The publication’s view, after several months of watching the case develop, is that the discovery procedure may matter more than the merits. The January 5, 2026 ruling by Judge Sidney Stein, affirming the magistrate’s order compelling OpenAI to produce twenty million anonymised ChatGPT conversation logs into the plaintiff’s discovery, is the development that has changed the structural shape of the case and, by extension, the structural shape of every comparable case that follows it. A working note on what the ruling did, on what twenty million logs are good for, and on the precedent the discovery posture has set, is overdue.
The ruling — what Judge Stein actually affirmed
Judge Stein’s January 5 order is on the public docket. The clearest accounts of what it did are in Bloomberg Law’s coverage, the Jones Walker analysis, and the NatLawReview write-up.
The procedural posture is worth setting out plainly. The Times had requested, in discovery, a sample of ChatGPT user conversations. OpenAI had objected on several grounds, most prominently the privacy of the users whose conversations would be produced. The magistrate had granted the Times’s motion to compel and ordered production of the logs in anonymised form. OpenAI had sought reconsideration before the district judge. Judge Stein affirmed.
The ruling is narrow in its formal holdings and broad in its operational effect. It holds, on the formal side, that the magistrate’s anonymisation procedure adequately addresses the user-privacy objections OpenAI raised. It holds, on the substantive side, that the logs are relevant to the Times’s claim that ChatGPT outputs reproduce its copyrighted material in ways that infringe. It holds, on the discovery-management side, that twenty million is the appropriate sample size given the population of conversations from which the sample is drawn.
The procedural background to the case is in the NPR coverage of the March 2025 ruling, in which Judge Stein rejected OpenAI’s motion to dismiss the principal infringement claims. The progression from “case proceeds” to “twenty million logs in discovery” took roughly nine months. That is a fast schedule for a complex civil case in the Southern District of New York. The schedule has not slipped.
Summary judgment is on the calendar for April 2026, per McKool Smith’s AI Infringement Updates tracker. Some of the trade-press write-ups have, by mid-2026, treated the April hearing as the moment the case will be substantively decided. The publication’s view, on the record we have read, is that the summary-judgment motion is more likely to narrow than to resolve the case. The merits will, in our reading, go to trial unless one of the parties takes an unexpected position before April.
What twenty million logs are good for
The interesting question, for a publication on this beat, is not the procedural one. It is what twenty million logs actually let the plaintiff do.
Three things, in our reading of the published procedural material and the relevant evidentiary doctrine.
The first thing twenty million logs let the plaintiff do is establish output frequency. If the Times’s underlying claim is that ChatGPT outputs occasionally reproduce its copyrighted material, the empirical question behind that claim is how often. A small sample of cherry-picked outputs is the kind of evidence the dismissive press has called “fishing.” A statistically meaningful sample drawn from twenty million conversations is not fishing. It is a measurement. The plaintiff is, in effect, being permitted to measure how often the alleged infringement actually occurs across a representative cross-section of the defendant’s product in operation.
This is novel. In the predecessor copyright cases against earlier categories of technology — peer-to-peer file-sharing, search-engine caching, news aggregation — the plaintiffs typically had to construct their infringement claims from a small number of specific instances. They could not measure the infringement rate. They could only point to the instances. The change of evidentiary posture from “instances” to “rate” is what the twenty-million-log discovery is delivering.
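The shift from instances to a rate is, at bottom, an exercise in estimating a proportion from a sample. A minimal sketch in Python, assuming the reviewed logs are a simple random sample and that a reviewer has labelled each one as flagged or not. The function name and the counts are illustrative and are not drawn from the case record. The sketch uses the Wilson score interval, a standard way to bound a rate when the events being counted are rare.

```python
import math


def wilson_interval(hits: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= hits <= n:
        raise ValueError("hits must be between 0 and n")
    p = hits / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical numbers, for illustration only.
reviewed = 50_000   # logs a reviewer has examined
flagged = 25        # logs the reviewer marked as reproducing protected text

low, high = wilson_interval(flagged, reviewed)
print(f"observed rate: {flagged / reviewed:.4%}")
print(f"approx. 95% interval: {low:.4%} to {high:.4%}")
```

The interval narrows as the reviewed sample grows, which is one reason the size of the sample is itself worth contesting.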
The second thing twenty million logs let the plaintiff do is identify systematic patterns. A copyright plaintiff is not only interested in whether infringing outputs occur, but also in what triggers them. Twenty million logs contain enough material to identify, for example, that infringing outputs are concentrated in particular prompt patterns, that they are more common in certain product surfaces than others, or that they vary systematically over the model’s release history. Those findings, if they emerge, are the kind of finding that supports an argument about the defendant’s knowledge of the infringement and the defendant’s failure to mitigate it.
This is also novel. In a traditional copyright case the question of the defendant’s knowledge is litigated from internal documents — emails, technical specifications, after-the-fact engineering notes. In an AI case where the defendant produces an artefact at high volume, the artefact itself contains the knowledge evidence. Twenty million logs are a long record of what the model was doing and, by inference, what its operators knew about what it was doing.
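Pattern-finding of this kind amounts to comparing rates across groups. A sketch under the assumption that each reviewed log carries a grouping label (here a hypothetical `surface` field) and a reviewer's flag. Neither the field names nor the sample data reflect the actual production.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewedLog:
    surface: str    # hypothetical grouping label, e.g. "web" or "api"
    flagged: bool   # a reviewer's judgement, not a legal conclusion


def rates_by_group(logs: list[ReviewedLog]) -> dict[str, float]:
    """Return the share of flagged logs within each group."""
    totals: Counter[str] = Counter()
    hits: Counter[str] = Counter()
    for log in logs:
        totals[log.surface] += 1
        hits[log.surface] += int(log.flagged)
    return {group: hits[group] / totals[group] for group in totals}


# Illustrative data only.
sample = [
    ReviewedLog("web", True),
    ReviewedLog("web", False),
    ReviewedLog("api", False),
    ReviewedLog("api", False),
]
print(rates_by_group(sample))
```

The same grouping could be run over prompt categories or model release dates. A difference between groups is a lead for an expert to examine, not a finding in itself.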
The third thing twenty million logs let the plaintiff do is develop comparators. The summary-judgment standard requires the plaintiff to produce evidence from which a reasonable factfinder could conclude that the defendant’s product infringes. A representative log corpus allows the plaintiff to develop, statistically, evidence of the kinds of comparators a copyright analysis depends on — what the output is, what the input was, how the output relates to the plaintiff’s copyrighted works, and whether the relationship is, in the doctrinal sense, substantial. The corpus is not the evidence of infringement; the corpus is the basis from which the evidence of infringement is extracted.
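One crude way to relate an output to a reference work is to measure how many word sequences the two share. A sketch, assuming whitespace tokenisation and an arbitrary window of eight words. The function names are illustrative, and the measure is a proxy for verbatim overlap only. It is not the doctrinal test for substantial similarity.

```python
def word_ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """Set of lower-cased word n-grams in text (empty if text has fewer than n words)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_fraction(output: str, reference: str, n: int = 8) -> float:
    """Fraction of the output's n-grams that also appear in the reference."""
    out_grams = word_ngrams(output, n)
    if not out_grams:
        return 0.0
    return len(out_grams & word_ngrams(reference, n)) / len(out_grams)


# Illustrative strings only.
article = "the quick brown fox jumps over the lazy dog near the quiet river bank"
reply = "as reported the quick brown fox jumps over the lazy dog near the bridge"
print(f"{overlap_fraction(reply, article):.2f}")
```

A high fraction suggests verbatim reuse worth a closer look. A low one does not rule out paraphrase.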
The defence position, on the record, has been that the anonymisation procedure does not adequately address the privacy concerns. The publication’s reading is that the more consequential disputes about the log corpus will not be about privacy. They will be about how the plaintiff’s experts are permitted to characterise what they find. Those disputes are admissibility disputes. They will be litigated in the run-up to summary judgment and, if the case proceeds, into trial.
The novel discovery-power implications
The discovery posture in NYT v. OpenAI is, in our reading, the most consequential development in AI litigation since the Bartz pirated-set finding. The two have a common structure: a court has accepted that the defendant’s internal corpus is discoverable, in a form the plaintiff’s experts can analyse, in a sample large enough to support statistical inference. Bartz did this for the training corpus; NYT is doing it for the output corpus.
For an AI lab whose product is, in effect, a large output-generation system, that pattern produces a particular kind of exposure. The product itself, in operation, is now subject to discovery. The product’s logs are the record the plaintiff is permitted to develop its case from. The size of the discoverable sample is large enough to support empirical, not anecdotal, claims about how the product behaves.
The structural exposure this produces is worth being explicit about. An AI lab whose model occasionally produces outputs that arguably infringe a particular plaintiff’s copyrighted material now has to contemplate, in any litigation that survives a motion to dismiss, that the plaintiff will be permitted to discover a substantial fraction of the lab’s product-in-operation. The plaintiff will be permitted to measure, with statistical confidence, how often the alleged behaviour occurs. The plaintiff will be permitted to identify, from that measurement, the conditions under which it occurs. And the plaintiff will be permitted to use that measurement to characterise the lab’s product to a factfinder.
The downstream effect on the lab’s procurement conversations is, in the procurement reporting the publication has read, already visible. Enterprise buyers of frontier-model APIs are asking, in writing, about the lab’s discovery-exposure posture. The question is novel and the answers are uneven. A lab whose answer is a refusal to characterise its exposure is, in the procurement conversation, communicating something specific.
The privacy posture, honestly read
The publication’s editorial commitments require us to be specific about one part of the ruling we are sceptical of. The anonymisation procedure approved by the magistrate, and affirmed by Judge Stein, is, on the published material we have read, a substantive procedure but not a final one. The published material is consistent with the anonymisation being technically sound at the volume of twenty million logs. It is not consistent with a claim that no conversation in the discovered corpus could be linked back to an individual user by an adversary with access to the corpus and to ancillary information.
We are not the first to make this observation. The privacy-research community has been making it about every “anonymised” corpus produced into litigation for a decade. The question of whether twenty million anonymised conversations, with their full prompt content and the model’s full response content, can be re-identified by an attacker who has them and who has other data, is an open research question. The court has not resolved it. The court has decided that the procedure is adequate for the discovery’s stated purpose. Those are not the same thing.
The publication’s reading is that the privacy posture in the case is the part of the ruling most likely to be revisited. If the discovered corpus is used in ways that produce, on the record, evidence of re-identification — and the discovery is conducted under protective order, so this is a question about whether the protective order holds, not whether the corpus is public — the privacy ruling will be litigated again. The case has not produced a settled position on the privacy question. It has produced a discovery permission and a protective order, and the protective order will, in time, be tested.
What the case still has to decide
It is worth being precise about what the April 2026 summary-judgment motion is and is not.
It is a motion that will ask the court to decide that, on the evidence developed in discovery, no reasonable factfinder could find for the non-moving party on one or more of the claims. The plaintiff’s motion will ask the court to find infringement as a matter of law on the most clear-cut portions of the discovered output corpus. The defendant’s motion will ask the court to find fair use as a matter of law on the training and on the outputs, in the alternative.
It is not a trial. It is not, in either direction, the end of the case. A summary-judgment ruling can dispose of some claims, narrow others, or deny the motion and leave the case to proceed. The ruling will, in our reading, narrow the case substantially in one direction or another. It will not, in our reading, resolve it.
The publication is also tracking, without yet a clear procedural picture, the relationship between this case and the parallel cases against OpenAI and other AI labs by other rightsholders. The AI Lawsuit Tracker lists more than 160 cases as of mid-2026. The Times case is the highest-profile of them. It is also the case furthest along procedurally on the discovery side. Whatever the discovery posture produces (in admissibility, in expert testimony, in the eventual summary-judgment ruling) will be the template for the rest of them.
The precedent already set
The publication’s view, on the record currently available, is that the precedential weight of NYT v. OpenAI is already substantial regardless of what the merits ruling produces. The discovery rulings are the part of the case that has already changed what is possible in AI litigation. They have established that:
- The plaintiff in an AI copyright case is, on a properly developed record, entitled to discover a statistically meaningful sample of the defendant’s product in operation.
- The defendant’s privacy objections will be litigated, and may be partially addressed, but will not be a categorical block to the production.
- The court’s anonymisation procedure, while reviewable, is functionally compatible with the volume of production the plaintiff requires.
Those three holdings, read together, are the operational precedent. They apply to every AI lab that has, in its corporate possession, a comparable corpus of product-in-operation. They apply whether or not the lab is currently being sued. They apply prospectively to the design of every AI product whose operators are now considering what their litigation exposure will look like.
The publication’s view is that the rational response, for a lab whose product generates such a corpus, is not to hope that the discovery precedent does not extend. The rational response is to design the product, and the corpus, with the discovery precedent in mind from the beginning. The logs that will be produced in litigation, six years from now, are being generated today. The product engineering decisions about what to log, how to log it, what retention policy to apply, and what audit primitives to build into the product, are the decisions that determine what the discovery production will look like when it arrives.
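What a retention decision looks like in code can be sketched briefly. The following is a minimal illustration, assuming a fixed retention window and a hypothetical `legal_hold` flag that suspends deletion once a preservation duty attaches. The class and field names are ours and do not describe any lab's actual system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class LogRecord:
    record_id: str
    created_at: datetime
    legal_hold: bool = False   # hypothetical flag: record must be preserved


@dataclass(frozen=True)
class RetentionPolicy:
    retention: timedelta

    def may_delete(self, record: LogRecord, now: datetime) -> bool:
        """A record may be deleted only if it is past retention and not on hold."""
        if record.legal_hold:
            return False
        return now - record.created_at >= self.retention


def partition(records, policy, now):
    """Split records into (keep, delete) lists under the given policy."""
    keep = [r for r in records if not policy.may_delete(r, now)]
    delete = [r for r in records if policy.may_delete(r, now)]
    return keep, delete


# Illustrative usage with made-up records.
now = datetime(2026, 1, 1, tzinfo=timezone.utc)
policy = RetentionPolicy(retention=timedelta(days=30))
records = [
    LogRecord("a", now - timedelta(days=90)),
    LogRecord("b", now - timedelta(days=90), legal_hold=True),
    LogRecord("c", now - timedelta(days=1)),
]
keep, delete = partition(records, policy, now)
print([r.record_id for r in keep], [r.record_id for r in delete])
```

The design point is that the hold check runs before the age check, so a deletion routine cannot quietly remove material that is subject to preservation.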
This is the part of the NYT case the publication finds most interesting. The merits will be decided when they are decided. The discovery template is already set. The labs that have read the case correctly are designing for it. The labs that have not are operating on the assumption that the production they will eventually have to make will look different from the production OpenAI was just ordered to make. We see no reason on the published record to believe that it will.
Editorial note. This piece is written from the published rulings, the trade-press coverage of the case, and the procedural-tracker material available as of late spring 2026. We have not had access to the discovery material itself, which is under protective order, and we make no claim about its contents. The publication will revise this piece as the April 2026 summary-judgment ruling and the subsequent procedural posture are placed on the public record. We will not publish leaks from the discovery corpus and we will not characterise its contents from material we have not seen.