Recently, OpenAI suffered a setback in a consolidated copyright lawsuit, with a federal court ordering the company to submit 20 million anonymized ChatGPT user conversation records. The company had previously attempted to overturn this discovery order on privacy grounds but failed to gain the judge's support.

On January 5, 2026, District Court Judge Sidney H. Stein issued an order upholding Magistrate Judge Ona T. Wang's earlier ruling. Judge Stein noted that Magistrate Judge Wang had thoroughly weighed privacy concerns against evidentiary relevance when she granted the discovery request brought by the news organizations that are plaintiffs in the case. OpenAI had proposed conducting its own internal search of a sample of the 20 million records and producing only conversations involving the plaintiffs' copyrighted works, but this approach was rejected. Judge Stein explicitly stated that no precedent requires courts to adopt the least burdensome discovery method.

This ruling pushes the consolidated pretrial proceedings for 16 copyright lawsuits against OpenAI into a lengthy and complex discovery phase. The multidistrict litigation, heard in the U.S. District Court for the Southern District of New York, is a central piece of the dozens of lawsuits that content creators have filed against AI companies. It is expected to address key legal issues surrounding the training of generative AI on copyrighted material.

In July 2025, plaintiffs led by The New York Times Company and The Chicago Tribune Company initially sought a sample of 120 million records. After negotiations, OpenAI proposed providing 20 million records (approximately 0.5% of its total retained data) as an alternative, which the plaintiffs accepted. In October 2025, however, OpenAI changed its position, stating that it could not provide a complete de-identified sample and would offer only results based on specific queries. In November, Judge Wang ruled in favor of the plaintiffs, and in December she denied OpenAI's motion for reconsideration.

Judge Stein noted that OpenAI relied primarily on a securities case that differs significantly from the present one. In that case, the U.S. Court of Appeals for the Second Circuit had barred the SEC from obtaining telephone recordings from a defendant in another lawsuit. That dispute, however, turned on the legality of the original wiretapping, and the call participants enjoyed stronger privacy rights because the recordings had been made covertly. By contrast, Judge Stein stated, OpenAI's legal ownership of the ChatGPT records is undisputed, and users submitted their conversation content voluntarily.