Cloud storage built for AI

S3-compatible APIs with NFS and SMB. Your buckets or ours.

Start for free
bucketfs

Try it out

# ── Use Training Pipes storage ──
$ npx bucketfs buckets create --name datasets
created bucket: datasets

# ── Mount it locally with NFS near your compute ──
$ npx bucketfs mounts create --bucket datasets --region us-east-1 --name datasets
created mount: datasets
$ npx bucketfs mount connect datasets
mounted at ./datasets
$ ls ./datasets
data-000.parquet  data-001.parquet  data-002.parquet

# ── Mount your existing S3/GCS buckets ──
$ npx bucketfs mount connect my-s3-bucket <your credentials>
mounted at ./my-s3-bucket
$ ls ./my-s3-bucket/checkpoints
epoch-08.pt  epoch-09.pt

Your data pipeline, simplified.

Features

Built for performance and reliability.

Portability

Mount your data anywhere

Run on the cheapest compute you can find, regardless of where your data is stored.

Access

Multi-protocol support

Access your data via NFSv4.0, NFSv4.1, or SMB. Use the protocol that fits your workflow, all backed by the same high-performance cache.
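On a Linux client, the choice between NFSv4.0 and NFSv4.1 usually comes down to a single mount option. The sketch below only prints the command rather than running it. The gateway host `gateway.example.com` and export path `/datasets` are hypothetical placeholders, not real Training Pipes endpoints; `vers=` is the standard Linux NFS mount option.

```shell
#!/bin/sh
# Print (do not run) the Linux mount command for a given NFS version.
# GATEWAY and EXPORT are hypothetical placeholders, not real endpoints.
GATEWAY="gateway.example.com"
EXPORT="/datasets"

nfs_mount_cmd() {
    version="$1"      # "4.0" or "4.1"
    mountpoint="$2"   # local directory to mount onto
    case "$version" in
        4.0|4.1) ;;
        *) echo "unsupported NFS version: $version" >&2; return 1 ;;
    esac
    # vers= selects the NFS protocol version on Linux clients (see nfs(5)).
    printf 'sudo mount -t nfs -o vers=%s %s:%s %s\n' \
        "$version" "$GATEWAY" "$EXPORT" "$mountpoint"
}

nfs_mount_cmd 4.1 /mnt/datasets
```

Printing the command first makes it easy to review the options before mounting anything with root privileges.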

API (Coming soon)

Built for automation

Manage your data infrastructure with our upcoming REST API. You will be able to provision mounts, configure caching, and monitor usage programmatically.

Storage

Connect any cloud

Bring your own storage from AWS S3, Google Cloud Storage, Azure Blob, or any S3-compatible provider. We handle the rest.

Scale

Deploy globally

Provision regional gateways close to your compute. Minimize latency and maximize throughput for distributed teams and workloads.

Operations

Enterprise-ready from day one.

Security

Multi-tenant isolation

Namespace isolation keeps each tenant's data private, and resource quotas limit what any one tenant can consume. Short-lived credentials and encryption at rest keep your data secure.

Monitoring

Complete observability

Track cache hit ratios, throughput, and egress with built-in Prometheus metrics. Grafana dashboards provide real-time insights into your data pipeline.
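A cache hit ratio is hits divided by total lookups. The sketch below computes it from counters in the Prometheus text format. The metric names `cache_hits_total` and `cache_misses_total` and the sample values are assumptions for illustration; the names the gateway actually exports may differ.

```shell
#!/bin/sh
# Compute a cache hit ratio from Prometheus text-format counters on stdin.
# The metric names are hypothetical; substitute whatever the gateway exports.
# Only unlabelled samples (e.g. "cache_hits_total 900") are matched here.
hit_ratio() {
    awk '
        $1 == "cache_hits_total"   { hits = $2 }
        $1 == "cache_misses_total" { misses = $2 }
        END {
            total = hits + misses
            if (total == 0) { print "n/a"; exit }
            printf "%.4f\n", hits / total
        }
    '
}

# Made-up sample scrape, for illustration only.
hit_ratio <<'EOF'
# TYPE cache_hits_total counter
cache_hits_total 900
# TYPE cache_misses_total counter
cache_misses_total 100
EOF
```

With the sample above, 900 hits out of 1000 lookups gives 0.9000. In a Grafana dashboard the same ratio would normally be computed over rates of these counters rather than raw totals.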

Reliability

Production-tested

Built for the most demanding production workloads with automatic scaling and self-healing capabilities.

Control

Fine-grained caching

Configure cache sizes, flush policies, and write-back behavior per mount. Optimize for your specific access patterns and workload requirements.

What everyone is saying

Trusted by professionals.

Thanks to Training Pipes, we're processing training data faster than ever before with streamlined pipelines.

Tina Yards

VP of Engineering, Nexus AI

Training Pipes made optimizing our ML workflows an absolute breeze.

Conor Neville

Head of ML Infrastructure, DataForge

Our data pipeline setup time dropped from days to minutes with Training Pipes.

Amy Chase

Head of Data Engineering, Synthex

We've managed to scale our ML infrastructure 10x in just 6 months with Training Pipes.

Veronica Winton

CTO, ModelOps Labs

I was able to automate 80% of our data pipeline tasks with Training Pipes AI tools.

Dillon Lenora

VP of Data Science, Quantum ML

Training data preparation is now fully automated, saving our team countless hours.

Harriet Arron

ML Platform Lead, CloudScale

Join data teams accelerating their pipelines with Training Pipes.