v0.5 Available

The Transparent Acceleration Layer for the Lakehouse

⭐ Now accepting 3 Early Access Design Partners for Iceberg acceleration.

Accelerate Apache Iceberg workloads by 5–20× with zero application changes. Faster metadata. Smarter pushdown. Unified caching.

Works seamlessly with

Spark
Trino
Flink
DuckDB
DataFusion

Why Laketap?

The Lakehouse acceleration layer — eliminating I/O bottlenecks with smarter metadata and data pushdown.

01

5–20× Faster Queries

Real Acceleration, Not Just Caching. Laketap reduces object-store latency, eliminates repetitive Parquet decoding, and applies selective pushdown to minimize CPU & network overhead.

Faster BI · Lower ETL Latency · Cost-efficient
02

Works Across All Engines

No need to tune each engine separately. Unifies caching for Spark, Trino, Flink, DuckDB, DataFusion — with more to come.

One Layer · Multi-Engine
03

Zero Application Changes

No SQL rewrite. No table changes. No engine forks. Simply deploy Laketap and update the connector.
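As a concrete illustration of the connector swap, the only configuration an application changes is the catalog URI. The catalog name `lake`, the hostname `laketap.internal:8181`, and the helper function below are illustrative placeholders; the property keys are standard Iceberg-on-Spark catalog configuration.

```python
def laketap_catalog_conf(catalog_name: str, laketap_uri: str) -> dict:
    """Spark conf entries for an Iceberg REST catalog routed through a
    Laketap endpoint. Only the URI differs from the pre-Laketap setup."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        # Before: the upstream catalog URI (e.g. a Polaris REST endpoint).
        # After: the Laketap proxy, which speaks the same REST Catalog API.
        f"{prefix}.uri": laketap_uri,
    }

conf = laketap_catalog_conf("lake", "http://laketap.internal:8181")
```

No SQL, table definitions, or engine binaries change; queries continue to reference the same catalog and table names.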

Drop-in · Seamless
04

Tiered & Adaptive Caching

Tiered design: TCache holds manifests and schemas, ACache Meta stores footers and indexes, and ACache Data caches row groups with selective decode; entries are automatically promoted and demoted between tiers based on workload.
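The promote/demote behavior can be sketched as a toy two-tier cache: entries hit often enough move into a small hot tier, and hot-tier LRU evictions fall back to the warm tier. The sizes and thresholds below are illustrative, not Laketap's actual policy.

```python
from collections import OrderedDict

class TieredCache:
    """Toy two-tier cache: warm entries promote to hot after repeated
    hits; hot-tier LRU evictions demote back to warm."""

    def __init__(self, hot_size=2, warm_size=8, promote_after=2):
        self.hot, self.warm = OrderedDict(), OrderedDict()
        self.hits = {}
        self.hot_size, self.warm_size = hot_size, warm_size
        self.promote_after = promote_after

    def put(self, key, value):
        self.warm[key] = value
        self.warm.move_to_end(key)
        while len(self.warm) > self.warm_size:
            self.warm.popitem(last=False)          # evict coldest warm entry

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        if key not in self.warm:
            return None                            # miss: fetch from object store
        self.hits[key] = self.hits.get(key, 0) + 1
        if self.hits[key] >= self.promote_after:   # hot enough: promote
            self.hot[key] = self.warm.pop(key)
            if len(self.hot) > self.hot_size:      # demote hot-tier LRU
                old_key, old_val = self.hot.popitem(last=False)
                self.put(old_key, old_val)
            return self.hot[key]
        return self.warm[key]
```

In the real system the tiers hold different kinds of objects (metadata, footers, row groups) rather than one keyspace, but the promotion mechanics are analogous.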

Smart · Lightweight
05

Runs in Your VPC

No Lock-In. Your data never leaves your environment. Laketap is an on-prem / VPC-side acceleration layer with full enterprise isolation.

Secure · Private
Architecture

Unified Acceleration Framework

A three-step architecture that accelerates metadata, optimizes split planning, and performs compute pushdown across cache and object storage.

Architecture Overview · Metadata, planning, and data paths in one layer

Laketap is now validating TCache/ACache with design partners on real Iceberg workloads.

Step 01

Catalog Cache (TCache) — Metadata Acceleration

Catalog-facing cache that implements the Iceberg REST Catalog API and stores manifests, schema, and snapshots to cut catalog latency and avoid repeated object-store round trips.

  • Snapshot-aware routing
  • Manifest list & manifest file cache
  • Schema / partition spec cache
  • Fast listFiles/listManifests
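The catalog-facing behavior above can be sketched as a cache in front of the upstream loadTable call (`GET /v1/namespaces/{ns}/tables/{table}` in the Iceberg REST Catalog API). The `fetch` callable, TTL, and invalidation policy here are illustrative stand-ins, not Laketap's actual implementation.

```python
import time

class CatalogCache:
    """Sketch of a TCache-style metadata cache fronting an Iceberg REST
    catalog. `fetch` stands in for the upstream loadTable round trip."""

    def __init__(self, fetch, ttl_s=30.0):
        self.fetch = fetch        # upstream catalog call (Polaris / Glue / HMS)
        self.ttl_s = ttl_s
        self._cache = {}          # (namespace, table) -> (expires_at, metadata)

    def load_table(self, namespace, table):
        key = (namespace, table)
        entry = self._cache.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]                       # cache hit: no round trip
        metadata = self.fetch(namespace, table)   # cache miss: go upstream
        self._cache[key] = (time.monotonic() + self.ttl_s, metadata)
        return metadata

    def invalidate(self, namespace, table):
        """Refresh path: drop the entry after a new snapshot is committed."""
        self._cache.pop((namespace, table), None)
```

Because the proxy speaks the same REST Catalog API, engines see identical responses whether metadata comes from cache or from the upstream catalog.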
Step 02

Footer & Index Cache (ACache Meta) — Intelligent Split Planning

During planning, cached Parquet footers and indexes enable row-group pruning to emit a cache-aware scan plan with only the required splits.

  • Row-group metadata caching
  • ZoneMap / dictionary reuse
  • Predicate-aware pruning
  • Cache-aware split tagging
Step 03

Data Pushdown Cache (ACache Data) — Federated Read

At execution, cached splits run selective decode and vectorized filtering; cache hits return via Arrow Flight, while cache misses stream directly from the object store, preserving native scan semantics.

  • Selective decode
  • Encoding-aware filtering
  • Arrow Flight data streaming
  • Federated read across cache & object store
[Diagram] Engine → (Iceberg REST API) → TCache Catalog Proxy (snapshots, schemas, manifests, partitions). Cache hit: cached metadata returned to the engine. Cache miss: raw metadata fetched from the upstream catalog (Polaris, Glue, HMS) and the cache refreshed.

Advanced Capabilities

Laketap isn't just a cache; it's a smart layer that learns from your data.

Enterprise Ready

Workload-Aware Adaptive Policies

Automatically adjusts cache eviction, retention, and selective decode thresholds based on real-time workload usage.

Metrics-Driven Access Heatmaps

Collects table/column heat, row-group access patterns, and cross-engine hotspots to dynamically optimize caching strategies.
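A minimal sketch of that bookkeeping: per-(table, column) read counts plus the set of engines that touched each column, from which hot and cross-engine entries can be surfaced. Class and method names are illustrative, not Laketap's API.

```python
from collections import Counter

class AccessHeatmap:
    """Toy heat tracker: counts column reads per table and records which
    engines touched each column, so cross-engine hotspots stand out."""

    def __init__(self):
        self.heat = Counter()     # (table, column) -> access count
        self.engines = {}         # (table, column) -> set of engine names

    def record(self, engine, table, columns):
        for col in columns:
            key = (table, col)
            self.heat[key] += 1
            self.engines.setdefault(key, set()).add(engine)

    def hottest(self, n=3):
        return [key for key, _ in self.heat.most_common(n)]

    def cross_engine_hotspots(self):
        return [k for k, engines in self.engines.items() if len(engines) > 1]
```

Hotspots read by multiple engines are natural candidates for the shared replicas and pre-warming described below.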

Historical Optimization

Predicts repetitive queries, pre-warms based on daily/hourly patterns, and creates shared hotspot replicas for cross-engine workloads.

Layout Optimization

Provides recommendations for manifest layout, partition heat analysis, and compaction strategies based on historical statistics.

Design Partner Insights

Early access to new planning strategies, metadata optimizations, and pushdown extensions. Collaborate directly with the engineering team.

Experience Laketap

Deploy in your VPC with a connector swap—accelerate your Lakehouse without rewriting workloads.