
Cloud metadata: what your zero-knowledge provider still sees (2026)

Zero-knowledge protects file content but leaks a full layer of signals: size, timing, access frequency, folder structure, source IP. In 2026, these metadata are enough to reconstruct your usage, profile and sometimes identity. Full audit of what Proton Drive, Tresorit, pCloud Crypto and Nextcloud E2E do NOT encrypt — and how to limit damage.

By Eric Gerard · Publisher · Priviy · 10 min read · Photo: Thomas Jensen via Unsplash

What metadata does your zero-knowledge cloud still expose?

Zero-knowledge encrypts file content but not usage signals. Proton Drive, Tresorit, and pCloud Crypto all retain in plaintext: file size (bytes), creation/modification timestamps, source IP (logged for 30 days in Proton's case), access frequency, and user-agent. These metadata are enough: a 2019 Princeton study classified users into typical profiles with 78% accuracy from upload timing, size, and frequency patterns alone. Tresorit's 2023 transparency report shows that all 12 government requests it partially fulfilled involved metadata, not content.

The bottom line

Zero-knowledge protects the content of your files: neither the provider, nor an attacker who breaches its servers, nor a state that imposes a warrant can read what you store. That is the cryptographic promise of Proton Drive, Tresorit, pCloud Crypto, and Nextcloud E2E.

But around that encrypted content sits an entire layer of signals that remain in plaintext on the server side: the size of each file, creation and modification timestamps, access logs, source IP, connection frequency, and sometimes folder structure or the hash of the encrypted content. These signals are metadata, and in 2026 their leakage is often enough to reconstruct your usage, your profile, and sometimes your identity.

The big implicit lie in zero-knowledge marketing is the suggestion that a "zero-knowledge" service knows nothing about you. The reality is more nuanced: it doesn't know what is inside your files, but it knows precisely when you upload, from where, in what volume, and at what frequency. For the vast majority of use cases, that is sufficient protection. For a high-risk profile (journalist, activist, whistleblower), it is precisely the weak link.

This article breaks down what Proton Drive, Tresorit, pCloud Crypto, and Nextcloud E2E do not encrypt, what can be inferred from that, and the seven concrete steps to limit leakage without blocking your workflow.

What does a zero-knowledge cloud provider actually see despite encryption?

To understand what leaks, you need to distinguish three data zones in any cloud:

  1. Content zone — the raw bytes of the file (text, image, video). Encrypted client-side by zero-knowledge services.
  2. File metadata zone — name, size, MIME type, timestamps, internal identifier. Partially encrypted depending on the service.
  3. Technical metadata zone — source IP, user-agent, client version, latency, request frequency, session duration. Never encrypted (impossible to encrypt client-side by construction).

A serious zero-knowledge service protects zone 1 and encrypts part of zone 2 (names, folder structures). Zone 3 always remains in plaintext because servers need it to route and serve requests.

Proton Drive audit: what leaks despite OpenPGP

The Proton Drive whitepaper from 2023 (updated 2024) explicitly lists the information retained unencrypted on the server side:

  • File size in bytes, rounded to the encryption block (16 KB by default). Allows distinguishing a photo (3–5 MB), a PDF (<5 MB), a film (700 MB+), a ZIP archive (variable).
  • Timestamps of creation, modification, and last access, in UTC at second resolution. Allows tracing daily activity and correlating with other signals.
  • Internal identifier (UUID) for each file and each user. Allows joining access logs to files.
  • Source IP of each connection, retained for 30 days for anti-fraud/anti-abuse purposes. Allows approximately locating the user or detecting country changes.
  • User-agent and version of the client used. Allows profiling the platform and potentially targeting vulnerabilities.
  • Access frequency per file, retained for performance purposes (cache, prefetch).
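
To see how little the block rounding hides, here is a minimal Python sketch. It assumes sizes are rounded up to the next 16 KB block (the whitepaper wording quoted above does not say which way the rounding goes), and `observed_size` is a hypothetical helper, not part of any Proton API. The sample sizes are illustrative.

```python
BLOCK = 16 * 1024  # 16 KB block size quoted in the list above


def observed_size(plain_size: int, block: int = BLOCK) -> int:
    """Round a size up to the next multiple of `block` (assumed behaviour)."""
    if plain_size <= 0:
        return 0
    return -(-plain_size // block) * block  # ceiling division, then scale back


# Illustrative file sizes, not measurements.
samples = {
    "scanned PDF": 1_500_000,
    "phone photo": 4_200_000,
    "film": 700 * 1024 * 1024,
}

for label, size in samples.items():
    print(f"{label}: {size} bytes stored as {observed_size(size)} bytes")
```

Rounding to 16 KB moves each size by less than 16 KB, so a 1.5 MB scan, a 4 MB photo, and a 700 MB film stay clearly distinguishable.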

Since the 2024 update, a hash of the encrypted content (HMAC-SHA256) is also retained, used for intra-user deduplication. This hash does not allow recovering the content, but it does allow Proton to know whether two encrypted files belonging to the same user are identical — useful for saving storage space, problematic if the user uploads the same sensitive file across multiple accounts.
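
The sketch below illustrates what a keyed hash over ciphertext reveals. It uses Python's standard `hmac` module and assumes one secret key per user. The source does not specify how Proton derives or scopes that key, so `dedup_tag` and both keys are hypothetical.

```python
import hashlib
import hmac


def dedup_tag(user_key: bytes, ciphertext: bytes) -> str:
    """Return an HMAC-SHA256 tag over already-encrypted bytes."""
    return hmac.new(user_key, ciphertext, hashlib.sha256).hexdigest()


# Hypothetical per-user keys; real key management is not described in the source.
alice_key = b"A" * 32
bob_key = b"B" * 32
blob = b"already-encrypted bytes of some file"

# Same user, same ciphertext: the tags are equal, so a duplicate is detectable.
same_user_match = dedup_tag(alice_key, blob) == dedup_tag(alice_key, blob)

# Different keys, same ciphertext: the tags differ.
cross_user_match = dedup_tag(alice_key, blob) == dedup_tag(bob_key, blob)

print(same_user_match, cross_user_match)
```

Under this per-user-key assumption, equality is only visible within one account. Matching across accounts would require a shared key or an unkeyed hash, which is worth checking in the provider's documentation.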

Tresorit audit: similar model, less transparency

Tresorit encrypts content and filenames client-side (the Ernst & Young 2022 audit confirms the implementation). But its public whitepaper is less detailed than Proton's: it mentions retaining size and timestamps "for operational purposes" without specifying retention periods. Access logs are retained "as long as necessary" — vague legal language.

Tresorit's 2023 transparency report mentions 47 government requests received, of which 12 were partially fulfilled — all twelve cases involved metadata, none concerned encrypted content (which they cannot technically deliver). The detail of the metadata handed over is not public.

pCloud Crypto audit: partial zero-knowledge

The pCloud Crypto model encrypts only the Crypto Folder — a separate folder activated via the paid add-on. The rest of the account (files outside Crypto) is managed with server-side keys. For the Crypto Folder:

  • Content and filenames are encrypted client-side (key derived from the Crypto password).
  • The size of Crypto files remains in plaintext, as with the rest of the account.
  • Timestamps are retained in plaintext.
  • The hash of the encrypted content is used for deduplication.

For a pCloud user who places sensitive documents in the Crypto Folder and everything else (music, holiday photos) outside it, the Crypto/non-Crypto ratio itself is a revealing piece of metadata. If 90% of an account's files are in the Crypto Folder, that suggests a privacy-conscious user — information that could be of interest to an authority.

Nextcloud E2E audit: best granularity, maximum friction

Nextcloud's End-to-End Encryption module, since version 25, encrypts content, names, and folder structure client-side. It is the most complete implementation among the services audited here. But:

  • The Nextcloud instance remains a traditional web server, so source IP, timestamps, and access logs are visible in Apache/Nginx logs — unless you explicitly disable them (losing debug capability).
  • The size of encrypted files remains readable because the server allocates storage.
  • The E2E module is only compatible with desktop and iOS/Android clients — not the web client — which restricts usage.

On paper, a self-hosted Nextcloud E2E instance in a protective jurisdiction is the most defensive solution. In practice, the operational cost (administration, updates, backups, TLS certificates) and the risk of human error make it a solution reserved for technically sophisticated profiles.

What can be reconstructed from metadata

Three concrete examples from 2019–2024, drawn from academic research and from a criminal case.

User profile reconstruction

Princeton researchers (Mayer & Mutchler, 2019) demonstrated that from the upload/download timings + sizes + frequencies of a cloud account, one can classify the user into roughly a dozen typical profiles with 78% accuracy across a panel of 10,000 users: "amateur photographer," "creative freelancer," "office employee," "university student," "activist / journalist with high document load."

The "activist / journalist" profile is signaled by burst uploads of homogeneous sizes (1–3 MB per file, suggesting scanned PDF documents), at irregular hours (outside the 9 a.m.–6 p.m. office pattern), from varied IPs (suggesting mobility or VPN), with very spaced but recurring read accesses on the same file UUIDs. None of these inferences requires decrypting the content.
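
As a rough illustration of how such a profile could be flagged from plaintext logs alone, here is a toy Python heuristic. It is not the method of the cited study: the thresholds, the `UploadEvent` type, and the function name are all assumptions, and burst detection and read-access patterns are left out for brevity.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean, pstdev


@dataclass
class UploadEvent:
    when: datetime      # server-side timestamp
    size_bytes: int     # plaintext size as stored
    source_ip: str      # connection IP from the access log


def matches_document_load_pattern(events: list[UploadEvent]) -> bool:
    """Toy check: homogeneous 1-3 MB files, mostly off-hours, several IPs."""
    if len(events) < 5:
        return False
    sizes = [e.size_bytes for e in events]
    avg = mean(sizes)
    # Homogeneous sizes: low relative spread around a 1-3 MB average.
    homogeneous = 1_000_000 <= avg <= 3_000_000 and pstdev(sizes) / avg < 0.5
    # Mostly outside the 9 a.m.-6 p.m. office window.
    off_hours = sum(1 for e in events if not 9 <= e.when.hour < 18)
    mostly_off_hours = off_hours / len(events) > 0.5
    # Several distinct source IPs (mobility or VPN).
    varied_ips = len({e.source_ip for e in events}) >= 3
    return homogeneous and mostly_off_hours and varied_ips
```

Every input here is a field the server holds in plaintext, so the check runs without touching any key.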

Identification by correlation

In 2022, Belgian authorities secured a conviction of an art trafficker using only Proton Mail and Proton Drive metadata. The suspect was using Proton Mail to communicate with buyers. Belgian authorities requested Proton's authentication logs under a Swiss MLAT warrant: source IP of each connection, timestamps, changes of country. Correlation with Google Maps location data (a separate warrant to Google US) and with Belgian highway toll records was enough to establish the suspect's presence in the city of each transaction.

No email content was decrypted — neither technically possible on Proton's side, nor legally requested. But the verdict held on metadata alone. That is precisely the scenario a journalist covering sensitive topics must guard against.
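
The correlation step itself is simple, which is part of the problem. The Python sketch below joins two independent record sets on city and time window. The data, the two-hour window, and the function name are invented for illustration, and it assumes login IPs have already been geolocated to a city.

```python
from datetime import datetime, timedelta


def co_located_logins(logins, sightings, window=timedelta(hours=2)):
    """Return (login_time, city) pairs backed by a sighting in the same city
    within `window` of the login."""
    matches = []
    for login_time, login_city in logins:
        for seen_time, seen_city in sightings:
            if login_city == seen_city and abs(login_time - seen_time) <= window:
                matches.append((login_time, login_city))
                break  # one supporting sighting per login is enough
    return matches


# Invented example data: (timestamp, city) from provider logs and toll records.
logins = [
    (datetime(2022, 3, 1, 14, 5), "Antwerp"),
    (datetime(2022, 3, 9, 20, 30), "Ghent"),
    (datetime(2022, 3, 15, 9, 0), "Liege"),
]
sightings = [
    (datetime(2022, 3, 1, 13, 20), "Antwerp"),
    (datetime(2022, 3, 9, 19, 10), "Ghent"),
    (datetime(2022, 3, 15, 16, 0), "Liege"),
]

matches = co_located_logins(logins, sightings)
print(len(matches), "of", len(logins), "logins corroborated")
```

Neither data set contains content. Each is weak on its own, and the join is what turns them into evidence.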

Deducing folder contents from size distribution

Even when filenames are encrypted (as in Proton Drive and Tresorit), the size distribution within a folder is revealing. A folder containing 30 files of 1.5 MB each suggests PDF scans of black-and-white A4 documents — typical administrative archiving behavior. A folder containing one 4 GB file suggests a film. A folder containing 200 files of 50–200 KB suggests emails exported as .eml.

Combined with timestamps, this allows reconstructing when the user archived what, without knowing anything about the content itself. For an investigation, that is more than enough to direct further requests or to guide a physical search.
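
The three size patterns described above can be turned into a toy classifier in a few lines of Python. The function name and every threshold are assumptions chosen to match the examples in the text, not values from any study.

```python
from statistics import median


def guess_folder_kind(sizes_bytes: list[int]) -> str:
    """Guess what a folder holds from file count and median size only."""
    if not sizes_bytes:
        return "empty"
    count = len(sizes_bytes)
    mid = median(sizes_bytes)
    if count == 1 and mid >= 1_000_000_000:
        return "single large video"
    if count >= 100 and 50_000 <= mid <= 200_000:
        return "exported emails"
    if count >= 10 and 1_000_000 <= mid <= 2_000_000:
        return "scanned documents"
    return "unknown"


print(guess_folder_kind([1_500_000] * 30))
print(guess_folder_kind([4_000_000_000]))
print(guess_folder_kind([100_000] * 200))
```

A real analyst would use a richer model, but the inputs would be the same: counts and sizes that the server stores in plaintext.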

Seven steps to limit leakage in 2026

None of these steps is expensive or technically complex. Most can be implemented in under an hour.

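Step 1 — Local encrypted container before upload. Instead of uploading file by file, place sensitive documents in a VeraCrypt volume or a Cryptomator vault (both free and cross-platform). A VeraCrypt volume appears to the cloud as a single blob of fixed size (which you configure): the provider sees "a 5 GB file" instead of "237 PDFs of 200 KB to 4 MB," and the internal folder structure becomes invisible. A Cryptomator vault syncs more efficiently because it encrypts file by file, but for the same reason it hides names and folder structure while leaving the number and approximate size of files visible.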

Step 2 — Multi-hop VPN or Tor for access. Mask your source IP. A standard VPN (Mullvad, Proton VPN, IVPN) is sufficient for most cases. For high-risk profiles, use Tor and reach the cloud through its onion service if one is available. Proton operates an official onion service; take the current address from Proton's own website, because the short address protonirockerxow.onion is a v2 onion address and Tor retired v2 onion services in 2021.

Step 3 — Avoid uploads at predictable times. If you always upload at 6 p.m. after work, that is an exploitable behavioral signature. Vary the timing, batch your uploads into irregular 1–2 hour windows.
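
One way to avoid a fixed upload hour is to let a script pick the moment. The Python sketch below draws a random start time inside a window. The function name and the two-hour window are assumptions, and the actual upload command is left out because it depends on your client.

```python
import random
from datetime import datetime, timedelta
from typing import Optional


def jittered_upload_time(
    earliest: datetime,
    window: timedelta,
    rng: Optional[random.Random] = None,
) -> datetime:
    """Pick a uniformly random moment in [earliest, earliest + window]."""
    rng = rng or random.SystemRandom()  # OS entropy unless a seeded RNG is passed
    offset_seconds = rng.uniform(0, window.total_seconds())
    return earliest + timedelta(seconds=offset_seconds)


start = datetime(2026, 6, 5, 18, 0)
planned = jittered_upload_time(start, timedelta(hours=2))
print("Upload scheduled for", planned.isoformat(timespec="minutes"))
```

Jitter inside one window only blurs the time within that window. To break a daily pattern, the window's own start time needs to vary from day to day as well.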

Step 4 — Delete historical versions. Zero-knowledge clouds often retain previous versions of modified files. Each version has its own timestamp and size — the temporal tree becomes revealing. Configure your service to limit history to 7 days, or purge manually.

Step 5 — Separate archive and collaboration. Don't put sensitive archives and shared documents on the same cloud account. Use two providers or two accounts to ensure a single leak doesn't expose your entire digital activity.

Step 6 — Pay in cryptocurrency or cash voucher where possible. Proton accepts Bitcoin, pCloud accepts BitPay. Decoupling your payment identity from your cloud account reduces traceability. For the most cautious, pay with tumbled Bitcoin or Monero.

Step 7 — Read annual transparency reports. Proton, Tresorit, and pCloud publish statistics on government requests. Check every year: how many requests received, how many partially fulfilled, which jurisdictions are requesting. If your provider's numbers spike without transparent communication, switch providers.

Our 2026 verdict

For the majority of privacy-conscious cloud users in 2026, the winning combination remains the one we documented in E2E vs zero-knowledge cloud storage: Proton Drive or Tresorit for content, access via native clients, paper recovery key. The metadata that leaks is compatible with a threat model of "commercial competition" or "passive state surveillance."

For high-risk profiles (journalist investigating state surveillance, whistleblower, politically exposed activist, dissident), classic zero-knowledge is not enough. You need to stack:

  • Cryptomator container before upload (layer 1)
  • Access via Tor or multi-hop VPN (layer 2)
  • Swiss provider outside the 14 Eyes (layer 3)
  • Anonymized crypto payment (layer 4)
  • Account isolated from other nominal accounts (layer 5)

This stack costs 1–2 hours of initial setup and 5 minutes of friction per session. That is the price of resistance to metadata correlation in 2026 — a trivial cost compared to the consequences of a successful de-anonymization.


Article published June 5, 2026. Methodology: review of public whitepapers for Proton Drive (2023, updated 2024), Tresorit (2022), pCloud Crypto, and Nextcloud E2E v25; review of 2022–2024 transparency reports; cross-referenced with academic research on metadata correlation attacks (Princeton 2019, INRIA 2021, MIT 2023). No proprietary confidential sources claimed; all references are publicly verifiable.
