
Following the success of, and interest in, our C3 benchmarking project, 28Stone is releasing more data that demonstrates an important shift in capital markets: cloud infrastructure is now a viable alternative for ultra-low latency trading. In modern trading environments, the demand for lower latency, performance determinism and higher throughput never stops.
In an effort to provide our clients with actionable performance data, 28Stone is releasing tick-to-trade benchmarking of Google Cloud’s next-generation C4 machine types (powered by Intel Emerald Rapids). The results represent significant improvements over the previous generation, featuring ultra-low latencies that close the gap with traditional co-location.
Calling All CME Globex Customers
Get ready to migrate to Google Cloud with 28Stone
- Prepare for the Globex migration, including 2026 requirements
- Learn about the ultra-low latency achieved in the cloud with CME, Google Cloud and 28Stone
- Explore why cloud is a viable option vs. on-prem for high-performance trading – expanding the opportunity for your firm to modernize trading
28Stone is here to help! Contact us, and get our Globex Cloud Migration Preparedness Guide.
The New Performance Frontier (under 15 microseconds RTT)
The primary takeaway from our new data is clear: cloud-native infrastructure now offers a meaningful alternative to traditional co-location performance. Our headline results show that the C4 platform delivers an impressive P50 Round-Trip Time (RTT) of 15 microseconds and a P99 RTT of 22 microseconds in a tick-to-trade testing topology. That topology utilizes a variety of optimization techniques, including kernel bypass networking, hardware offloading, ring buffering, core pinning, vNUMA alignment, P-state and C-state tuning, and core isolation.
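For readers less familiar with percentile reporting, here is a minimal Python sketch of how P50 and P99 figures can be derived from a set of RTT samples. It assumes the nearest-rank percentile definition; the percentile method actually used in the benchmark is not stated here, and the function names and sample values below are hypothetical illustrations, not benchmark data.

```python
import math
from fractions import Fraction


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("at least one sample is required")
    # Convert via str() so that 99.9 becomes exactly 999/10 and the rank
    # is computed without floating-point rounding.
    exact_pct = Fraction(str(pct))
    if not 0 < exact_pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(exact_pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize_rtt(samples_us):
    """Return the median and tail percentiles for RTT samples in microseconds."""
    return {
        "p50": percentile(samples_us, 50),
        "p99": percentile(samples_us, 99),
        "p99.9": percentile(samples_us, 99.9),
    }


if __name__ == "__main__":
    # Made-up samples for illustration only; these are not measured results.
    hypothetical_rtts_us = [14.2, 14.8, 15.1, 15.0, 14.9, 21.7, 15.3, 14.7]
    print(summarize_rtt(hypothetical_rtts_us))
```

The median describes the typical round trip, while the tail percentiles describe the slow outliers, which is why both are reported together.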
For firms that reviewed our original C3 benchmarks, this represents up to a 40% reduction in median RTT compared to our previously published results. The improvement comes from the performance and consistency of the C4 family combined with revised configuration optimizations only possible on Google Cloud instances. Furthermore, C4 instances consistently achieve sub-1.5-microsecond in-process results in the benchmarking application.
Achieving this level of performance requires specific, advanced techniques: low-level languages combined with kernel bypass networking and hardware offloading. Accessing these performance tiers in your environment requires precise, repeatable configurations that have been benchmarked in a variety of testing scenarios. The full whitepaper provides the details on what is possible on the next generation of Google Cloud instances.
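To give a sense of what one of these techniques looks like in practice, the sketch below illustrates core pinning on Linux using Python's standard library. It is an illustration only: the benchmark itself relies on low-level languages, its actual pinning configuration is not described here, and the helper name `pin_to_core` is hypothetical.

```python
import os


def pin_to_core(core_id):
    """Restrict the calling process to a single CPU core (Linux only).

    Returns the resulting affinity set so the caller can verify it.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise OSError("CPU affinity control is not available on this platform")
    allowed = os.sched_getaffinity(0)  # pid 0 refers to the caller
    if core_id not in allowed:
        raise ValueError(f"core {core_id} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {core_id})
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    # Pick the lowest-numbered core this process is currently allowed to use.
    first_allowed = min(os.sched_getaffinity(0))
    print(pin_to_core(first_allowed))
```

Pinning keeps a latency-sensitive thread on one core so the scheduler does not migrate it. In a production setup it is typically combined with core isolation so that no other work is scheduled onto that core.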
Determinism, Consistency and Vertical Scaling
For any trading firm, performance consistency is as critical as raw speed, and doubts about it are often the core concern cited against adopting cloud infrastructure. Our results confirm C4’s stability: performance gains were stable across the full test load range (1x to 100x market replay speeds), maintaining consistent latency and throughput even under significant load and volume.
Critically, performance was consistent across the tested instance sizes (48, 96, and 192 vCPU instances). This demonstrates that firms can now scale their compute capacity vertically without compromising the absolute latency floor. It also offers significant flexibility for implementing complex trading strategies at scale, allowing engineers to dedicate more resources to proprietary logic.
Why This Matters to Trading Firms
This significant performance shift reduces the need for proprietary, specialized hardware for ultra-low latency trading. It not only lowers the barrier to entry but also provides material flexibility, allowing firms to focus capital and resources entirely on proprietary trading logic, rather than expensive, static infrastructure maintenance.
The democratization of low-latency and high-frequency trading through the new value proposition of Google Cloud allows trading firms to take advantage of new market dynamics and a broader range of strategies.
Conclusion
The C4 benchmarking results are more than an incremental improvement; they present a meaningful set of decisions for trading firms looking to address high-performance cloud trading infrastructure needs.
After the release of the C4 benchmark results, 28Stone will publish a full C3 vs C4 side-by-side comparison covering data, detailed methodology, and critical latency percentiles (P99 and P99.9), which are essential for true performance budgeting and infrastructure planning.
Download it now for the full comparison and the data required to budget your next-generation trading platform.