Epstein Files Jan 30, 2026

Data hoarders on Reddit have been hard at work archiving the latest Epstein Files release from the U.S. Department of Justice. Below is a compilation of their work with download links.

Please seed all torrent files to distribute and preserve this data.

Ref: https://old.reddit.com/r/DataHoarder/comments/1qrk3qk/epstein_files_datasets_9_10_11_300_gb_lets_keep/

Epstein Files Data Sets 1-8: INTERNET ARCHIVE LINK

Epstein Files Data Set 1 (2.47 GB): TORRENT MAGNET LINK
Epstein Files Data Set 2 (631.6 MB): TORRENT MAGNET LINK
Epstein Files Data Set 3 (599.4 MB): TORRENT MAGNET LINK
Epstein Files Data Set 4 (358.4 MB): TORRENT MAGNET LINK
Epstein Files Data Set 5 (61.5 MB): TORRENT MAGNET LINK
Epstein Files Data Set 6 (53.0 MB): TORRENT MAGNET LINK
Epstein Files Data Set 7 (98.2 MB): TORRENT MAGNET LINK
Epstein Files Data Set 8 (10.67 GB): TORRENT MAGNET LINK


Epstein Files Data Set 9 (Incomplete). Only contains 49 GB of 180 GB. Multiple reports of cutoff from DOJ server at offset 48995762176.

ORIGINAL JUSTICE DEPARTMENT LINK

  • TORRENT MAGNET LINK (removed due to reports of CSAM)

/u/susadmin’s More Complete Data Set 9 (96.25 GB)
De-duplicated merger of (45.63 GB + 86.74 GB) versions

  • TORRENT MAGNET LINK (removed due to reports of CSAM)

Epstein Files Data Set 10 (78.64 GB)

ORIGINAL JUSTICE DEPARTMENT LINK

  • TORRENT MAGNET LINK (removed due to reports of CSAM)
  • INTERNET ARCHIVE FOLDER (removed due to reports of CSAM)
  • INTERNET ARCHIVE DIRECT LINK (removed due to reports of CSAM)

Epstein Files Data Set 11 (25.55 GB)

ORIGINAL JUSTICE DEPARTMENT LINK

SHA1: 574950c0f86765e897268834ac6ef38b370cad2a


Epstein Files Data Set 12 (114.1 MB)

ORIGINAL JUSTICE DEPARTMENT LINK

SHA1: 20f804ab55687c957fd249cd0d417d5fe7438281
MD5: b1206186332bb1af021e86d68468f9fe
SHA256: b5314b7efca98e25d8b35e4b7fac3ebb3ca2e6cfd0937aa2300ca8b71543bbe2
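
A minimal sketch of how a downloaded archive could be checked against the hashes listed above. The function names `file_digests` and `matches` and the filename `DataSet12.zip` are placeholders for illustration, not names taken from the DOJ site.

```python
import hashlib

def file_digests(path, algorithms=("md5", "sha1", "sha256"), chunk_size=1 << 20):
    """Return {algorithm: hex digest} for the file at `path`, read in chunks."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so multi-gigabyte archives never sit fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for h in hashers.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}

# Hashes published above for Data Set 12.
EXPECTED_DS12 = {
    "sha1": "20f804ab55687c957fd249cd0d417d5fe7438281",
    "md5": "b1206186332bb1af021e86d68468f9fe",
    "sha256": "b5314b7efca98e25d8b35e4b7fac3ebb3ca2e6cfd0937aa2300ca8b71543bbe2",
}

def matches(path, expected):
    """True only if every expected digest equals the file's computed digest."""
    actual = file_digests(path, tuple(expected))
    return all(actual[name] == value.lower() for name, value in expected.items())
```

Calling `matches("DataSet12.zip", EXPECTED_DS12)` on a local copy would return `True` only when all three digests agree. For Data Set 11, only the SHA1 above is available, so the `expected` mapping would hold that single entry.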


This list will be edited as more data becomes available, particularly with regard to Data Set 9 (EDIT: NOT ANYMORE)


EDIT [2026-02-02]: After being made aware of potential CSAM in the original Data Set 9 releases and seeing confirmation in the New York Times, I will no longer support any effort to maintain links to archives of it. There is suspicion of CSAM in Data Set 10 as well. I am removing links to both archives.

Some in this thread may be upset by this action. It is right to be distrustful of a government that has not shown signs of integrity. However, I do trust journalists who hold the government accountable.

I am abandoning this project and removing any links to content that commenters here and on Reddit have suggested may contain CSAM.

Ref 1: https://www.nytimes.com/2026/02/01/us/nude-photos-epstein-files.html
Ref 2: https://www.404media.co/doj-released-unredacted-nude-images-in-epstein-files

  • PeoplesElbow
    1 day ago

    Ok everyone, I have done a complete indexing of the first 13,000 pages of the DOJ Data Set 9.

    KEY FINDING: 3 files are listed but INACCESSIBLE

    These appear in DOJ pagination but return error pages - potential evidence of removal:

    EFTA00326497

    EFTA00326501

    EFTA00534391

    You can try them yourself (they all fail):

    https://www.justice.gov/epstein/files/DataSet 9/EFTA00326497.pdf

    The 86 GB torrent holds about 7x as many files as the DOJ website exposes

    DOJ website exposes: 77,766 files

    Torrent contains: 531,256 files

    Page Range     Min EFTA       Max EFTA       New Files
    0-499          EFTA00039025   EFTA00267311   21,842
    500-999        EFTA00267314   EFTA00337032   18,983
    1000-1499      EFTA00067524   EFTA00380774   14,396
    1500-1999      EFTA00092963   EFTA00413050   2,709
    2000-2499      EFTA00083599   EFTA00426736   4,432
    2500-2999      EFTA00218527   EFTA00423620   4,515
    3000-3499      EFTA00203975   EFTA00539216   2,692
    3500-3999      EFTA00137295   EFTA00313715   329
    4000-4499      EFTA00078217   EFTA00338754   706
    4500-4999      EFTA00338134   EFTA00384534   2,825
    5000-5499      EFTA00377742   EFTA00415182   1,353
    5500-5999      EFTA00416356   EFTA00432673   1,214
    6000-6499      EFTA00213187   EFTA00270156   501
    6500-6999      EFTA00068280   EFTA00281003   554
    7000-7499      EFTA00154989   EFTA00425720   106
    7500-7999      (no new files - all wraps/redundant)
    8000-8499      (no new files - all wraps/redundant)
    8500-8999      EFTA00168409   EFTA00169291   10
    9000-9499      EFTA00154873   EFTA00154974   35
    9500-9999      EFTA00139661   EFTA00377759   324
    10000-10499    EFTA00140897   EFTA01262781   240
    10500-12999    (no new files - all wraps/redundant)

    TOTAL UNIQUE FILES: 77,766

    Pagination limit discovered: page 184,467,440,737,095,516 (2^64/100)

    I searched random pages between 13k and this limit - NO new documents found. The pagination is an infinite loop. All work at: https://github.com/degenai/Dataset9
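
    The figures in this comment can be cross-checked with a few lines of arithmetic. The sketch below only re-derives numbers already quoted here (the per-range counts, the torrent file count, and the 2^64/100 page limit); the variable names are illustrative.

```python
# Per-range "New Files" counts copied from the table in this comment.
new_files = [21842, 18983, 14396, 2709, 4432, 4515, 2692, 329, 706,
             2825, 1353, 1214, 501, 554, 106, 10, 35, 324, 240]

doj_total = sum(new_files)         # unique files reachable through DOJ pagination
torrent_total = 531_256            # file count reported for the torrent

ratio = torrent_total / doj_total  # how many times more files the torrent holds
share = doj_total / torrent_total  # fraction of torrent files listed by the DOJ site

# Integer part of 2^64 / 100, the pagination limit quoted in this comment.
page_limit = 2**64 // 100

print(doj_total)          # 77766
print(round(ratio, 2))    # 6.83
print(f"{share:.1%}")     # 14.6%
print(f"{page_limit:,}")  # 184,467,440,737,095,516
```

    The per-range counts do sum to the stated total of 77,766, and the "7x" figure is a rounding of roughly 6.8x.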

    • PeoplesElbow
      14 hours ago

      DOJ Epstein Files: I found what’s around those 3 missing files (Part 2)

      Follow-up to my Dataset 9 indexing post. I pulled the adjacent files from my local copy of the torrent. What I found is… notable.


      TLDR

      The 3 missing files aren’t random corruption. They all cluster around one event: Epstein’s girlfriend Karyna Shuliak leaving St. Thomas (the island) in April 2016. And one of the gaps sits directly next to an email where Epstein recommends her a novel about a sympathetic pedophile—two days before the book was publicly released.


      The Big Finding: Duplicate Processing Batches

      Two of the missing files (326497 and 534391) are the same document processed twice—once with redactions, once without—208,000 files apart in the index.

      Redacted Batch      Unredacted Batch    Content
      326494-326496       534388-534390       AmEx travel booking, staff emails
      326497 - MISSING    534391 - MISSING    ???
      326498-326500                           Email chain continues
      326501 - MISSING                        ???
      326502-326506                           Reply + Invoice
                          534392              Epstein personal email

      Random file corruption hitting the same logical document in two separate processing runs, 208,000 positions apart? That’s not how corruption works. That’s how removal works.


      What’s Actually In These Files

      I pulled everything around the gaps. It’s all one email chain from April 10, 2016:

      The event: Karyna Shuliak (Epstein’s girlfriend) booked on Delta flight from Charlotte Amalie, St. Thomas → JFK on April 13, 2016.

      St. Thomas is where you fly in/out to reach Little St. James. She was leaving the island.

      The chain:

      • 11:31 AM — AmEx Centurion (black card) sends confirmation to [email protected]
      • 11:33 AM — Lesley Groff (Epstein’s executive assistant) forwards to Shuliak, CC’s staff
      • 11:35 AM — Shuliak replies “Thanks so much”
      • 3:52 PM — Epstein personally emails Shuliak
      • Next day — AmEx sends invoice

      The unredacted batch (534xxx) reveals the email addresses that are blacked out in the redacted batch (326xxx).


      The Epstein Email (EFTA00534392)

      The document immediately after missing file 534391:

      From: "jeffrey E." <jeevacation@gmail.com>
      To: Karyna Shuliak
      Date: Sun, 10 Apr 2016 19:52:13 +0000
      
      order http://softskull.com/dd-product/undone/
      

      He’s telling her to buy a book. The same day she’s being booked to leave his island.


      The Book

      “Undone” by John Colapinto (Soft Skull Press)

      On-sale date: April 12, 2016
      Epstein’s email: April 10, 2016

      He recommended it two days before public release.

      Publisher’s description:

      “Dez is a former lawyer and teacher—an ephebophile with a proclivity for teenage girls, hiding out in a trailer park with his latest conquest, Chloe. Having been in and out of courtrooms (and therapists’ offices) for a number of years, Dez is at odds with a society that persecutes him over his desires.”

      The protagonist is a pedophile who resents society for judging him.

      The author (John Colapinto) is a New Yorker staff writer, former Vanity Fair and Rolling Stone contributor. Exactly the media circles Epstein cultivated.


      What’s Missing

      So now we know the context:

      • EFTA00326497: Between the AmEx confirmation and Groff’s forward. Probably the PDF ticket attachment referenced in the emails.

      • EFTA00326501: Between the forward chain and Shuliak’s reply. Unknown.

      • EFTA00534391: Immediately before Epstein’s personal email about the pedo book. Unknown, but its position is notable.


      Open Questions

      1. How did Epstein have this book before release? Advance copy? Knows the author?

      2. What is 534391? It sits between staff logistics emails and Epstein’s direct correspondence. Another Epstein email? An attachment?

      3. Are there other Shuliak travel records with similar gaps? Is April 2016 unique or part of a pattern?

      4. What else is in the corpus from [email protected]?


      Verify It Yourself

      Try the DOJ links (all return errors):

      Check the torrent: Pull the EFTA numbers I listed. Confirm the gaps. Confirm the adjacencies.

      Grep the corpus: Search for “QWURMO” (booking reference), “Shuliak”, “jeevacation”, “Colapinto”
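
      For the "confirm the gaps, confirm the adjacencies" step, here is a minimal sketch that works on a plain list of file names only (for example, a saved directory listing). It assumes identifiers follow the `EFTA` plus eight digits pattern seen in this thread; the helper names `efta_numbers` and `neighbours` are hypothetical.

```python
import re

# Assumed identifier shape: "EFTA" followed by eight digits, e.g. EFTA00326497.
EFTA_RE = re.compile(r"EFTA(\d{8})")

def efta_numbers(filenames):
    """Collect the numeric part of every EFTA identifier found in `filenames`."""
    found = set()
    for name in filenames:
        m = EFTA_RE.search(name)
        if m:
            found.add(int(m.group(1)))
    return found

def neighbours(numbers, target, radius=3):
    """Map each ID within `radius` of `target` to True (present) or False (absent)."""
    return {n: (n in numbers) for n in range(target - radius, target + radius + 1)}

# Toy listing for illustration only; a real listing would come from a local copy.
listing = ["EFTA00326496.pdf", "EFTA00326498.pdf", "EFTA00326499.pdf"]
present = efta_numbers(listing)
for number, ok in neighbours(present, 326497, radius=2).items():
    print(f"EFTA{number:08d}", "present" if ok else "absent")

# Distance between the two batches' missing entries.
print(534391 - 326497)  # 207894, i.e. the "208,000" quoted in this comment
```

      Running `neighbours` for each of the three IDs against a full listing shows which adjacent numbers exist. An ID reported as absent only means it is not in that particular listing.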


      Summary

      Three files missing from 531,256. All three cluster around one girlfriend’s April 2016 departure from St. Thomas. Same gaps appear in two processing batches 208,000 files apart. One gap sits adjacent to Epstein personally recommending a novel about a sympathetic pedophile, sent before the book was even publicly available.

      This isn’t random corruption.

      Full analysis + all code: https://github.com/degenai/Dataset9


      If anyone has the torrent and wants to grep for Colapinto connections or other Shuliak trips, please do. This is open source for a reason.

        • PeoplesElbow
          5 hours ago

          That is new information! I wasn’t even able to get that ‘no images produced’ page, so that is good to know. I just hit a file corruption error when I tried to download from the DOJ. Thank you for the information. I guess this means the content is still missing in a way, but at least it is accounted for.

    • kongstrong
      16 hours ago

      YSK: the page limit has been fixed. It now caps out around page 9600, for a total of ~197k file entries, far fewer than the largest torrent’s 530k. I’m scraping now to get a list of the files they kept on the DOJ site so we can determine which files they don’t want out there. That would be a good lead for further investigation of the torrent.
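
      Once both lists exist, the comparison itself is a set difference. This is a minimal sketch, assuming each list is a sequence of file names or paths, one entry per file; `only_in_torrent` and `base_name` are hypothetical helpers, not part of any existing scraper.

```python
def base_name(entry):
    """Reduce a path or URL to an upper-cased bare file name for comparison."""
    return entry.strip().rsplit("/", 1)[-1].upper()

def only_in_torrent(torrent_names, doj_names):
    """Return sorted names present in the torrent listing but not in the DOJ listing."""
    doj = {base_name(n) for n in doj_names if n.strip()}
    torrent = {base_name(n) for n in torrent_names if n.strip()}
    return sorted(torrent - doj)

# Toy data for illustration only.
torrent_list = ["DataSet 9/EFTA00326496.pdf", "DataSet 9/EFTA00326497.pdf"]
doj_list = ["https://example.invalid/files/EFTA00326496.pdf"]
print(only_in_torrent(torrent_list, doj_list))  # ['EFTA00326497.PDF']
```

      Normalizing to the bare file name avoids false differences caused by differing directory prefixes or letter case between the two sources.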

      • PeoplesElbow
        13 hours ago

        Oh no… I didn’t know this. On one hand, now I need to run another scan; on the other, it could reveal something. The torrent has 500k+ files, so there is still a gap. I will run the scraper again and do a new analysis in the next day or two.

    • Wild_Cow_5769
      1 day ago

      Just like I said… In NO way do I trust DOJ… Our only hope is if someone drops the full data set 9 somewhere.

      • PeoplesElbow
        1 day ago

        My question is: why is the total download size so large and the range of displayed documents so small? Only 15% of the known documents are individually served on the site, and some aren’t seen until page 10,000.