For Data Engineers
Build your Bronze layer without building ingest. EdgeMQ gives you a standard HTTP → S3 ingest layer: every service, device, and partner writes to the same endpoint, and you keep using Snowflake, Databricks, ClickHouse, DuckDB, and Postgres on top. EdgeMQ is the lakehouse ingest layer that keeps your S3 Bronze tables continuously fed, so the rest of your stack always has fresh data.
Yet another ingest pipeline?
You should be modeling, optimizing, and shipping insights, not wiring up one-off collectors and brittle upload scripts.
- One-off collectors for each team/partner
- curl + cron + Lambda + bash that "just uploads to S3"
- Lossy partial uploads and silent retry failures
- "Near real-time" requests without more infra budget
- Snowpipe / Autoloader / COPY INTO glued to fragile upstreams
EdgeMQ: your standard HTTP → S3 ingest layer
EdgeMQ is a global ingest service for modern data stacks. Producers send NDJSON to a single HTTPS endpoint; EdgeMQ lands it durably in your S3 prefix with clear commit semantics. You keep your warehouses, lakehouses, and tools—EdgeMQ just makes sure S3 is always up to date.
curl -X POST "https://<region>.edge.mq/ingest" \
  -H "Authorization: Bearer $EDGEMQ_TOKEN" \
  -H "Content-Type: application/x-ndjson" \
  --data-binary @events.ndjson
- Append to a WAL on NVMe at the edge
- Bounded queues with honest backpressure (503 + Retry-After)
- Compressed segments uploaded to your S3
- Commit markers define when data is safe to read
From your perspective as a data engineer, every new upstream producer just needs:
- A URL: https://<region>.edge.mq/ingest
- A token
- A shared contract: "send NDJSON; we'll see it in S3"
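A minimal producer sketch in Python built on that contract (the requests dependency, batch file path, and retry limits here are illustrative choices, not EdgeMQ requirements; replace <region> with your regional endpoint). It backs off whenever the service answers 503 with Retry-After, matching the backpressure behavior described above:

import os
import time

import requests

# Placeholders: fill in your region; EDGEMQ_TOKEN matches the curl example above.
ENDPOINT = "https://<region>.edge.mq/ingest"
TOKEN = os.environ["EDGEMQ_TOKEN"]

def send_ndjson(path, max_retries=5):
    """POST one NDJSON batch, honoring 503 + Retry-After backpressure."""
    with open(path, "rb") as f:
        body = f.read()
    headers = {
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/x-ndjson",
    }
    for attempt in range(max_retries):
        resp = requests.post(ENDPOINT, headers=headers, data=body, timeout=30)
        if resp.status_code == 503:
            # Bounded queues: wait as long as the service asks, then retry.
            time.sleep(int(resp.headers.get("Retry-After", 2 ** attempt)))
            continue
        resp.raise_for_status()
        return
    raise RuntimeError(f"gave up on {path} after {max_retries} attempts")

send_ndjson("events.ndjson")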
S3 is your Bronze layer—by design
Most warehouse and lakehouse tooling assumes S3 as the Bronze layer. EdgeMQ is built to feed that layer with compressed NDJSON + commit markers today, and Parquet, CSV, and table-friendly layouts next.
- Your bucket, your prefix, least-privilege IAM role
- Formats roadmap: Parquet, CSV, JSON objects
- Table-friendly layouts (Iceberg-style) on the horizon
Standardize ingest. Unblock your modeling roadmap.
Give every producer the same HTTPS endpoint. Land data durably in S3 with commit semantics—then model at your pace.
Fits right into your existing stack
Use EdgeMQ as your S3 staging area: point Snowpipe or COPY INTO at EdgeMQ-managed prefixes to load data into tables.
COPY INTO raw.events
FROM 's3://your-bucket/edge-events/prod/'
CREDENTIALS = (AWS_ROLE = 'arn:aws:iam::123:role/edge-snowflake-access')
FILE_FORMAT = (TYPE = JSON)
PATTERN = '.*\.json\.gz';
Treat EdgeMQ prefixes as your Auto Loader or Spark Structured Streaming source for continuous ingest into Delta tables.
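For illustration, a PySpark Auto Loader sketch under assumed names (the bucket, prefix, schema and checkpoint locations, and the bronze.edge_events table are placeholders, not EdgeMQ defaults):

# PySpark sketch: Auto Loader reading EdgeMQ-landed gzipped NDJSON into a Delta table.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already defined in Databricks notebooks

raw = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")  # NDJSON lands as gzipped JSON lines
    .option("cloudFiles.schemaLocation", "s3://your-bucket/_schemas/edge-events/")
    .load("s3://your-bucket/edge-events/prod/")
)

(
    raw.writeStream
    .option("checkpointLocation", "s3://your-bucket/_checkpoints/edge-events/")
    .trigger(availableNow=True)  # or a processing-time trigger for continuous ingest
    .toTable("bronze.edge_events")
)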
Use EdgeMQ as a burst buffer: ingest at edge speed into S3, then let a controlled loader job batch the data into your databases.
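A sketch of such a loader job, assuming a hypothetical raw_events JSONB table in Postgres and the same bucket and prefix as the examples above (boto3 and psycopg2 are illustrative choices; a production job would also track which segment files it has already loaded):

import gzip

import boto3
import psycopg2
import psycopg2.extras

# Placeholders: bucket, prefix, DSN, and table name are illustrative.
BUCKET, PREFIX = "your-bucket", "edge-events/prod/"

s3 = boto3.client("s3")
conn = psycopg2.connect("postgresql://loader@db.internal/analytics")

def drain_once():
    """One controlled batch: read landed segments and insert their JSON lines."""
    pages = s3.get_paginator("list_objects_v2").paginate(Bucket=BUCKET, Prefix=PREFIX)
    with conn, conn.cursor() as cur:
        for page in pages:
            for obj in page.get("Contents", []):
                if not obj["Key"].endswith(".json.gz"):
                    continue
                body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
                rows = [(line,) for line in gzip.decompress(body).decode().splitlines() if line]
                psycopg2.extras.execute_values(
                    cur,
                    "INSERT INTO raw_events (payload) VALUES %s",
                    rows,
                    template="(%s::jsonb)",
                )

drain_once()  # run from cron or your orchestrator, at a rate the database can absorb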
Analysts can query fresh data directly from S3 without spinning up new pipelines (for example, from DuckDB with the httpfs extension loaded and S3 credentials configured). As Parquet/table layouts arrive, reads get even faster and cheaper.
SELECT *
FROM read_json_auto('s3://your-bucket/edge-events/prod/*.json.gz');
How EdgeMQ helps data engineers specifically
- Provision an EdgeMQ endpoint + S3 prefix per domain
- Hand teams a small client snippet and schema contract
- They publish; you see consistent data in your lake
- Durable WAL at the edge before acknowledgement
- Bounded queues with 503 + Retry-After under pressure
- Commit markers—no guessing which files are safe
- Per-tenant isolation on microVM + WAL volume
- Role assumption via STS; no long-lived access keys
- Least privilege to specific prefixes
- Per-environment isolation (dev/stage/prod)
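For context, the standard AWS pattern those bullets describe, sketched with boto3 (the role ARN, session name, bucket, and object key are placeholders; the actual role EdgeMQ assumes and its prefix scoping live in your IAM configuration):

import boto3

# Temporary credentials via STS instead of long-lived access keys.
sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/edgemq-writer-prod",  # placeholder role
    RoleSessionName="edgemq-shipper",
    DurationSeconds=3600,
)["Credentials"]

s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)

# Least privilege: the role's policy allows PutObject only under the granted prefix,
# so writes anywhere else in the bucket are denied.
s3.put_object(
    Bucket="your-bucket",
    Key="edge-events/prod/segment-000001.json.gz",
    Body=b"example payload",
)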
- Spend time on dbt, table layouts, and performance
- Assume a trustworthy Bronze layer (S3)
- Build reusable Bronze → Silver → Gold patterns
Managed edge infrastructure, clear SLAs, real observability
- Per-tenant microVMs, tuned WAL, shipper, and health checks
- Starter (free), Pro, and Enterprise tiers with guarantees
- Metrics & logs for ingest rates, latencies, failures, and S3 delivery
Turn "getting it into S3" into a solved problem
Keep using the tools you know. Let EdgeMQ be the standard HTTP → S3 ingest layer that feeds them.