
Constructor

JmailClient(cache=True)
Create a new client instance.

Parameters

cache (bool, default: True)
Enable local ETag-based file caching at ~/.cache/jmail/. Set to False to always download fresh data.

Example

from jmail import JmailClient

# Default: caching enabled
client = JmailClient()

# No caching
client = JmailClient(cache=False)
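
To make the caching option concrete, here is a minimal sketch of how ETag-based caching generally works over HTTP. It is not the jmail implementation: EtagCache, CacheEntry, and fake_server are hypothetical names, the cache lives in memory rather than in ~/.cache/jmail/, and the server is simulated so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# A fetcher returns (status_code, etag, body). All names here are hypothetical.
Response = Tuple[int, Optional[str], bytes]
Fetcher = Callable[[str, Dict[str, str]], Response]


@dataclass
class CacheEntry:
    etag: str
    body: bytes


class EtagCache:
    """In-memory stand-in for an on-disk cache directory."""

    def __init__(self, fetch: Fetcher) -> None:
        self._fetch = fetch
        self._entries: Dict[str, CacheEntry] = {}

    def get(self, url: str) -> bytes:
        headers: Dict[str, str] = {}
        entry = self._entries.get(url)
        if entry is not None:
            # Ask the server to send the body only if it has changed.
            headers["If-None-Match"] = entry.etag
        status, etag, body = self._fetch(url, headers)
        if status == 304 and entry is not None:
            # Not Modified: reuse the cached copy, nothing was downloaded.
            return entry.body
        if etag is not None:
            # Remember the new body together with its validator.
            self._entries[url] = CacheEntry(etag=etag, body=body)
        return body


def fake_server(url: str, headers: Dict[str, str]) -> Response:
    """Simulated server that always holds version "v1" of the resource."""
    if headers.get("If-None-Match") == '"v1"':
        return 304, '"v1"', b""
    return 200, '"v1"', b"payload"


cache = EtagCache(fake_server)
first = cache.get("https://example.invalid/data")   # full download
second = cache.get("https://example.invalid/data")  # served from the cache
```

With cache=False the equivalent behavior would be to skip the If-None-Match header and always take the full-download path.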

Methods

Method | Returns | Description
manifest() | dict | API manifest with dataset metadata
emails(slim) | DataFrame | Email archive
documents(include_text) | DataFrame | Document metadata/text
photos() | DataFrame | Photo metadata
people() | DataFrame | Identified people
photo_faces() | DataFrame | Face bounding boxes
imessage_conversations() | DataFrame | iMessage conversations
imessage_messages() | DataFrame | iMessage messages
star_counts() | DataFrame | Crowd-sourced stars
release_batches() | DataFrame | Release batch info
url(dataset, fmt) | str | Raw dataset URL

manifest()

Fetch the API manifest with dataset metadata and checksums.

manifest = client.manifest()
print(manifest)

Returns: dict, the parsed JSON from data.jmail.world/v1/manifest.json
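
Since the manifest carries checksums, a downloaded file can in principle be verified against it. The sketch below is illustrative only: this page does not specify the manifest layout or the hash algorithm, so SHA-256 and the helper names sha256_of and matches_checksum are assumptions, and looking up the expected value in the manifest is left to the reader.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex string (assumed SHA-256)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

If the manifest turns out to use a different algorithm, swapping hashlib.sha256 for the matching hashlib constructor would be the only change needed.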

url(dataset, fmt="parquet")

Get the raw URL for a dataset file. Useful for passing directly to DuckDB, Polars, or any tool that reads Parquet over HTTP.

url = client.url("emails-slim")
# "https://data.jmail.world/v1/emails-slim.parquet"

url = client.url("documents", fmt="ndjson.gz")
# "https://data.jmail.world/v1/documents.ndjson.gz"

Parameters

dataset (str, required)
Dataset name. One of: emails, emails-slim, documents, photos, people, photo_faces, imessage_conversations, imessage_messages, star_counts, release_batches.

fmt (str, default: "parquet")
File format. Either parquet or ndjson.gz.

Returns: str, the full URL
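
The two examples above suggest that URLs follow the pattern https://data.jmail.world/v1/<dataset>.<fmt>. The sketch below reproduces that pattern with input validation and shows one way to hand the result to DuckDB. dataset_url and count_rows are hypothetical helpers, not part of jmail. Whether every dataset is published in both formats is an assumption. The DuckDB step needs the third-party duckdb package, network access, and DuckDB's httpfs extension, so it is defined but not called here.

```python
BASE_URL = "https://data.jmail.world/v1"

# Dataset names and formats as listed in the parameter descriptions.
DATASETS = {
    "emails", "emails-slim", "documents", "photos", "people",
    "photo_faces", "imessage_conversations", "imessage_messages",
    "star_counts", "release_batches",
}
FORMATS = {"parquet", "ndjson.gz"}


def dataset_url(dataset: str, fmt: str = "parquet") -> str:
    """Build a dataset URL following the pattern seen in the examples."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    return f"{BASE_URL}/{dataset}.{fmt}"


def count_rows(parquet_url: str) -> int:
    """Count rows in a remote Parquet file with DuckDB (not run here)."""
    import duckdb  # third-party; imported lazily so the rest runs without it

    row = duckdb.sql(
        f"SELECT count(*) FROM read_parquet('{parquet_url}')"
    ).fetchone()
    return row[0]


url = dataset_url("emails-slim")
```

In practice client.url("emails-slim") would replace dataset_url, and the resulting string can be passed to count_rows or to any other reader that accepts HTTP URLs.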