<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>containers on Red Hat App Services Performance Team</title><link>https://redhatperf.github.io/categories/containers/</link><description>Recent content in containers on Red Hat App Services Performance Team</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Thu, 09 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://redhatperf.github.io/categories/containers/index.xml" rel="self" type="application/rss+xml"/><item><title>Why isn't Quarkus 2x faster than Spring on my machine?</title><link>https://redhatperf.github.io/post/hidden-cost-rootless-container-networking/</link><pubDate>Thu, 09 Apr 2026 00:00:00 +0000</pubDate><guid>https://redhatperf.github.io/post/hidden-cost-rootless-container-networking/</guid><description>&lt;img src="https://redhatperf.github.io/post/hidden-cost-rootless-container-networking/diff-flamegraph.png" alt="Featured image of post Why isn't Quarkus 2x faster than Spring on my machine?" />&lt;div class="paragraph">
&lt;p>A community member ran our &lt;a href="https://github.com/quarkusio/spring-quarkus-perf-comparison">Quarkus vs Spring CRUD benchmark&lt;/a> on their bare-metal Fedora workstation and asked:&lt;/p>
&lt;/div>
&lt;div class="quoteblock">
&lt;blockquote>
&lt;div class="paragraph lead">
&lt;p>&lt;em>Why do I see only 1.19x instead of 2x?&lt;/em>&lt;/p>
&lt;/div>
&lt;/blockquote>
&lt;/div>
&lt;div class="paragraph">
&lt;p>&lt;strong>Our perf-lab shows Quarkus at 2.08x Spring’s throughput, but locally the gap nearly disappears.&lt;/strong>&lt;/p>
&lt;/div>
&lt;div class="paragraph">
&lt;p>This post walks through the investigation that found the culprit.&lt;/p>
&lt;/div>
&lt;div class="sect1">
&lt;h2 id="_the_gap">The gap&lt;/h2>
&lt;div class="sectionbody">
&lt;div class="paragraph">
&lt;p>The benchmark is a REST/CRUD application backed by PostgreSQL. The app runs on the host, PostgreSQL in a rootless podman container. Each HTTP request executes 2 SQL queries (confirmed via &lt;a href="https://www.postgresql.org/docs/current/pgstatstatements.html">pg_stat_statements&lt;/a>).&lt;/p>
&lt;/div>
&lt;div class="imageblock">
&lt;div class="content">
&lt;img src="throughput-gap.svg" alt="Throughput comparison: Local vs Perf-lab"/>
&lt;/div>
&lt;/div>
&lt;div class="paragraph">
&lt;p>Spring delivers roughly the same throughput in both environments. Quarkus, by contrast, reaches 24.5K TPS on the perf-lab but only 15.5K TPS locally. &lt;strong>Something in the local environment is capping Quarkus but not Spring.&lt;/strong>&lt;/p>
&lt;/div>
&lt;/div>
&lt;/div>
&lt;div class="sect1">
&lt;h2 id="_mpstat_where_is_the_cpu_going">mpstat: where is the CPU going?&lt;/h2>
&lt;div class="sectionbody">
&lt;div class="paragraph">
&lt;p>The benchmark collects &lt;a href="https://man7.org/linux/man-pages/man1/mpstat.1.html">mpstat&lt;/a> data during every run — per-CPU utilization split into &lt;code>%usr&lt;/code> (application code), &lt;code>%sys&lt;/code> (kernel), &lt;code>%soft&lt;/code> (softirq, mainly network packet processing), and &lt;code>%idle&lt;/code>. This is part of our &lt;a href="https://github.com/quarkusio/spring-quarkus-perf-comparison/issues/62">active benchmarking practice&lt;/a>: observing the system &lt;em>while it runs&lt;/em>, not just collecting final TPS numbers.&lt;/p>
&lt;/div>
&lt;div class="paragraph">
&lt;p>Both environments run Quarkus at 2.3 GHz with the same workload and CPU pinning. The mpstat profiles could not be more different:&lt;/p>
&lt;/div>
&lt;table class="tableblock frame-all grid-all stretch">
&lt;colgroup>
&lt;col style="width: 33.3333%;"/>
&lt;col style="width: 16.6666%;"/>
&lt;col style="width: 16.6666%;"/>
&lt;col style="width: 16.6666%;"/>
&lt;col style="width: 16.6669%;"/>
&lt;/colgroup>
&lt;thead>
&lt;tr>
&lt;th class="tableblock halign-left valign-top">Environment&lt;/th>
&lt;th class="tableblock halign-left valign-top">%usr&lt;/th>
&lt;th class="tableblock halign-left valign-top">%sys&lt;/th>
&lt;th class="tableblock halign-left valign-top">%soft&lt;/th>
&lt;th class="tableblock halign-left valign-top">%idle&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">Local (Fedora, 15,504 TPS)&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">39-50%&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">34-41%&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">9-17%&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">3-5%&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">Perf-lab (RHEL, 24,472 TPS)&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">87-94%&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">5-11%&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">0-2%&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">0%&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;div class="paragraph">
&lt;p>On the perf-lab, over 85% of CPU time goes to application code (&lt;code>%usr&lt;/code>). Locally, kernel work (&lt;code>%sys&lt;/code> plus &lt;code>%soft&lt;/code>) takes roughly half, about 43-58% when the two ranges are added. Same application, same clock speed, same workload: &lt;strong>locally, a significant fraction of CPU time is spent in the kernel rather than in application code.&lt;/strong>&lt;/p>
&lt;/div>
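&lt;div class="paragraph">
&lt;p>The kernel share can be checked with a few lines of Python. This is a minimal sketch that only re-adds the ranges from the table; the &lt;code>MPSTAT&lt;/code> dictionary and the &lt;code>kernel_share&lt;/code> helper are illustrative names, and adding range endpoints gives bounds, not a measured value.&lt;/p>
&lt;/div>
&lt;div class="listingblock">
&lt;div class="content">
&lt;pre>
```python
# Utilization ranges (percent) copied from the mpstat table above.
MPSTAT = {
    "local": {"usr": (39, 50), "sys": (34, 41), "soft": (9, 17), "idle": (3, 5)},
    "perf-lab": {"usr": (87, 94), "sys": (5, 11), "soft": (0, 2), "idle": (0, 0)},
}


def kernel_share(env):
    """Return (low, high) percent of CPU spent in kernel context: %sys + %soft."""
    sys_lo, sys_hi = MPSTAT[env]["sys"]
    soft_lo, soft_hi = MPSTAT[env]["soft"]
    # Adding endpoints yields bounds only; the extremes may not occur together.
    return (sys_lo + soft_lo, sys_hi + soft_hi)


for env in MPSTAT:
    low, high = kernel_share(env)
    print(f"{env}: kernel share {low}-{high}%")
```
&lt;/pre>
&lt;/div>
&lt;/div>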
&lt;/div>
&lt;/div>
&lt;div class="sect1">
&lt;h2 id="_where_is_the_kernel_time_going">Where is the kernel time going?&lt;/h2>
&lt;div class="sectionbody">
&lt;div class="paragraph">
&lt;p>A &lt;a href="https://www.brendangregg.com/flamegraphs.html">differential flamegraph&lt;/a> of the JFR CPU profiles (collected via &lt;a href="https://github.com/async-profiler/async-profiler">async-profiler&lt;/a>) from the perf-lab and local Quarkus runs shows exactly where the extra kernel time is spent:&lt;/p>
&lt;/div>
&lt;div class="imageblock">
&lt;div class="content">
&lt;a class="image" href="diff-flamegraph-gap.svg">&lt;img src="diff-flamegraph-gap.png" alt="Differential flamegraph: perf-lab vs local"/>&lt;/a>
&lt;/div>
&lt;/div>
&lt;div class="paragraph">
&lt;p>Red frames appear more in the local run; blue frames appear more on the perf-lab. The brightest red hotspots are kernel spin locks (&lt;code>_raw_spin_unlock_irqrestore&lt;/code>), nftables firewall evaluation (&lt;code>nft_do_chain&lt;/code>, &lt;code>nft_meta_get_eval&lt;/code>), and TCP packet processing (&lt;code>tcp_clean_rtx_queue&lt;/code>, &lt;code>skb_defer_free_flush&lt;/code>). The blue band at the bottom is application code that gets more CPU on the perf-lab — because the kernel isn’t eating it. &lt;strong>The local kernel is spending cycles on network packet processing and firewall rules that the perf-lab doesn’t need.&lt;/strong>&lt;/p>
&lt;/div>
&lt;div class="paragraph">
&lt;p>The brightest red frame — &lt;code>_raw_spin_unlock_irqrestore&lt;/code> — is worth a closer look. The stack trace shows it’s triggered by Agroal (Quarkus’s connection pool) returning a JDBC connection after a query: &lt;code>ConnectionPool.returnConnectionHandler&lt;/code> → &lt;code>LinkedTransferQueue.tryTransfer&lt;/code> → &lt;code>LockSupport.unpark&lt;/code> → kernel &lt;code>futex_wake&lt;/code> → &lt;code>try_to_wake_up&lt;/code> → spin lock. If network round-trips are slower, JDBC connections are held longer and more threads pile up waiting for a free connection. Every connection return triggers a &lt;code>futex_wake&lt;/code> to unpark a waiter — the higher the network latency, the more waiters accumulate, and the more kernel time is spent waking them.&lt;/p>
&lt;/div>
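&lt;div class="paragraph">
&lt;p>The link between round-trip latency and pool contention can be illustrated with Little’s Law (average connections in use = request rate × time each connection is held). The sketch below assumes a pool of 50 connections, matching the pgbench client count used later in this post; the hold times are hypothetical values chosen only to show the trend, not measurements.&lt;/p>
&lt;/div>
&lt;div class="listingblock">
&lt;div class="content">
&lt;pre>
```python
POOL_SIZE = 50  # assumption: pool size equal to the pgbench client count in this post


def busy_connections(requests_per_second, hold_seconds):
    """Little's Law: average connections in use = arrival rate * hold time."""
    return requests_per_second * hold_seconds


# Hypothetical hold times (not measured), to show how latency drives queuing.
for hold_ms in (1, 2, 4):
    busy = busy_connections(15504, hold_ms / 1000)
    status = "waiters queue up" if busy > POOL_SIZE else "pool keeps up"
    print(f"hold {hold_ms} ms: {busy:.0f} connections needed, {status}")
```
&lt;/pre>
&lt;/div>
&lt;/div>
&lt;div class="paragraph">
&lt;p>Under these assumptions, doubling the time a connection is held doubles the number of connections needed to sustain the same request rate; once that number exceeds the pool size, threads start parking and every returned connection has to wake one of them.&lt;/p>
&lt;/div>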
&lt;/div>
&lt;/div>
&lt;div class="sect1">
&lt;h2 id="_the_suspect_pasta_the_userspace_tcp_proxy">The suspect: pasta, the userspace TCP proxy&lt;/h2>
&lt;div class="sectionbody">
&lt;div class="paragraph">
&lt;p>Rootless podman on Fedora uses &lt;a href="https://passt.top/passt/">pasta (passt)&lt;/a> to forward container ports. Unlike rootful podman (which uses kernel-level port forwarding), pasta is a userspace process that proxies every TCP packet:&lt;/p>
&lt;/div>
&lt;div class="listingblock">
&lt;div class="content">
&lt;pre>With pasta (default rootless):
App --&amp;gt; kernel --&amp;gt; pasta (userspace) --&amp;gt; kernel --&amp;gt; container netns --&amp;gt; PostgreSQL
With --network=host:
App --&amp;gt; kernel --&amp;gt; PostgreSQL (same network namespace)&lt;/pre>
&lt;/div>
&lt;/div>
&lt;div class="paragraph">
&lt;p>Every JDBC packet traverses two extra kernel/userspace boundary crossings plus a userspace copy in the pasta process. For a chatty protocol like JDBC with small, frequent packets, this adds up fast. The kernel functions visible in the flamegraph — &lt;code>nft_do_chain&lt;/code>, &lt;code>tcp_clean_rtx_queue&lt;/code>, &lt;code>skb_defer_free_flush&lt;/code> — are not pasta’s own CPU time (pasta runs in a separate process), but they are the kernel-side cost of the extra network hops that the application’s syscalls now traverse. The connection pool contention (&lt;code>futex_wake&lt;/code> from Agroal) could be a consequence of the added queuing delay: if each round-trip takes longer, connections are held longer, and waiters accumulate.&lt;/p>
&lt;/div>
&lt;div class="paragraph">
&lt;p>Crucially, &lt;strong>pasta is single-threaded&lt;/strong>. It processes all forwarded packets on a single CPU core. If that core saturates, packet processing queues up — latency spikes and throughput hits a ceiling regardless of how many cores the application has available. The alternative is &lt;code>--network=host&lt;/code>: the container shares the host’s network namespace, so packets stay in the kernel and never pass through a proxy.&lt;/p>
&lt;/div>
&lt;/div>
&lt;/div>
&lt;div class="sect1">
&lt;h2 id="_quantifying_the_overhead_with_pgbench">Quantifying the overhead with pgbench&lt;/h2>
&lt;div class="sectionbody">
&lt;div class="paragraph">
&lt;p>To measure pasta’s impact on database traffic, we ran &lt;a href="https://www.postgresql.org/docs/current/pgbench.html">pgbench&lt;/a> with the same 2-query workload (50 clients — matching the default JDBC connection pool size for both Quarkus and Spring — prepared statements, 30 seconds) over different network paths. We also tested with Fedora’s &lt;a href="https://wiki.nftables.org/">nftables&lt;/a> firewall disabled, since the flamegraph showed &lt;code>nft_do_chain&lt;/code> in the kernel stacks:&lt;/p>
&lt;/div>
&lt;table class="tableblock frame-all grid-all stretch">
&lt;colgroup>
&lt;col style="width: 66.6666%;"/>
&lt;col style="width: 33.3334%;"/>
&lt;/colgroup>
&lt;thead>
&lt;tr>
&lt;th class="tableblock halign-left valign-top">Network path&lt;/th>
&lt;th class="tableblock halign-left valign-top">TPS&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">Host → container (pasta + nftables)&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">18,106&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">Host → container (pasta, no nftables)&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">20,402&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">Host → container (&lt;code>--network=host&lt;/code>)&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">53,262&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;div class="paragraph">
&lt;p>With &lt;code>--network=host&lt;/code>, throughput jumps from 18,106 to 53,262 TPS, a 2.9x increase. Pasta caps at ~18K TPS for this 2-query workload (about 20K with nftables disabled): that is the ceiling imposed by a single-threaded proxy.&lt;/p>
&lt;/div>
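&lt;div class="paragraph">
&lt;p>The multipliers quoted above follow directly from the table. A small sketch that recomputes them; &lt;code>PGBENCH_TPS&lt;/code> and &lt;code>speedup&lt;/code> are illustrative names, and the numbers are the ones reported in this post.&lt;/p>
&lt;/div>
&lt;div class="listingblock">
&lt;div class="content">
&lt;pre>
```python
# TPS values copied from the pgbench table above.
PGBENCH_TPS = {
    "pasta + nftables": 18106,
    "pasta, no nftables": 20402,
    "--network=host": 53262,
}


def speedup(baseline, candidate):
    """Throughput of `candidate` relative to `baseline`, as a multiplier."""
    return PGBENCH_TPS[candidate] / PGBENCH_TPS[baseline]


host = speedup("pasta + nftables", "--network=host")
no_fw = speedup("pasta + nftables", "pasta, no nftables")
print(f"host networking vs pasta: {host:.2f}x")
print(f"pasta without nftables vs with: {no_fw:.2f}x")
```
&lt;/pre>
&lt;/div>
&lt;/div>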
&lt;/div>
&lt;/div>
&lt;div class="sect1">
&lt;h2 id="_the_fix">The fix&lt;/h2>
&lt;div class="sectionbody">
&lt;div class="paragraph">
&lt;p>Run the PostgreSQL container with &lt;code>--network=host&lt;/code> instead of port-mapping (&lt;code>-p 5432:5432&lt;/code>). We added &lt;code>DB_HOST_NETWORK=true&lt;/code> to the benchmark’s &lt;a href="https://github.com/quarkusio/spring-quarkus-perf-comparison/blob/main/scripts/infra.sh">infrastructure script&lt;/a>.&lt;/p>
&lt;/div>
&lt;table class="tableblock frame-all grid-all stretch">
&lt;colgroup>
&lt;col style="width: 40%;"/>
&lt;col style="width: 20%;"/>
&lt;col style="width: 20%;"/>
&lt;col style="width: 20%;"/>
&lt;/colgroup>
&lt;thead>
&lt;tr>
&lt;th class="tableblock halign-left valign-top">Configuration&lt;/th>
&lt;th class="tableblock halign-left valign-top">Quarkus TPS&lt;/th>
&lt;th class="tableblock halign-left valign-top">Spring TPS&lt;/th>
&lt;th class="tableblock halign-left valign-top">Ratio&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">Default (pasta + nftables)&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">15,504&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">13,062&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">1.19x&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">&lt;code>--network=host&lt;/code>&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">24,116&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">13,368&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">1.80x&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">&lt;code>--network=host&lt;/code> + no nftables&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">26,039&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">13,214&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">1.97x&lt;/p>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">Perf-lab (RHEL 9.6, different hardware)&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">24,472&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">11,783&lt;/p>&lt;/td>
&lt;td class="tableblock halign-left valign-top">&lt;p class="tableblock">2.08x&lt;/p>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;div class="paragraph">
&lt;p>&lt;strong>With host networking, Quarkus throughput improves by 55% while Spring gains only 2.3%.&lt;/strong> Disabling the firewall on top recovers another 8% for Quarkus, bringing the ratio from 1.19x back to 1.97x, close to the perf-lab’s 2.08x.&lt;/p>
&lt;/div>
&lt;div class="paragraph">
&lt;p>Fedora’s &lt;code>firewalld&lt;/code> loads nearly 1000 &lt;a href="https://wiki.nftables.org/">nftables&lt;/a> rules that every packet traverses. This is independent of pasta — disabling the firewall adds another 13% throughput in the pgbench test (18,106 → 20,402 TPS).&lt;/p>
&lt;/div>
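&lt;div class="paragraph">
&lt;p>For readers who want to verify the ratios and percentage gains, the following sketch recomputes them from the results table. The names &lt;code>RESULTS&lt;/code>, &lt;code>ratio&lt;/code> and &lt;code>gain_percent&lt;/code> are illustrative and not part of the benchmark scripts.&lt;/p>
&lt;/div>
&lt;div class="listingblock">
&lt;div class="content">
&lt;pre>
```python
# (Quarkus TPS, Spring TPS) copied from the results table above.
RESULTS = {
    "default": (15504, 13062),
    "host": (24116, 13368),
    "host, no nftables": (26039, 13214),
    "perf-lab": (24472, 11783),
}


def ratio(config):
    """Quarkus throughput divided by Spring throughput, rounded to 2 decimals."""
    quarkus, spring = RESULTS[config]
    return round(quarkus / spring, 2)


def gain_percent(before, after):
    """Relative throughput change from `before` to `after`, in percent."""
    return round((after / before - 1) * 100, 1)


for name in RESULTS:
    print(f"{name}: {ratio(name)}x")
print("Quarkus gain with host networking:", gain_percent(15504, 24116), "%")
print("Spring gain with host networking:", gain_percent(13062, 13368), "%")
print("Quarkus gain from disabling nftables:", gain_percent(24116, 26039), "%")
```
&lt;/pre>
&lt;/div>
&lt;/div>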
&lt;/div>
&lt;/div>
&lt;div class="sect1">
&lt;h2 id="_why_quarkus_is_affected_but_spring_is_not">Why Quarkus is affected but Spring is not&lt;/h2>
&lt;div class="sectionbody">
&lt;div class="paragraph">
&lt;p>As the pgbench data shows, pasta caps at ~18,000 TPS for a 2-query workload. pgbench is a minimal client that does nothing between queries, so it represents the maximum throughput pasta can forward. Quarkus also processes HTTP requests and runs ORM and serialization between SQL queries, and it reaches 15,504 TPS through pasta. That is lower than pgbench’s 18,106 because the application work between queries reduces the pressure on the proxy, but Quarkus is still constrained by it.&lt;/p>
&lt;/div>
&lt;div class="paragraph">
&lt;p>With host networking, Quarkus reaches ~24,000 TPS — well above what pasta can deliver. Spring reaches ~13,000 TPS, which is below pasta’s ceiling regardless of networking mode. &lt;strong>Any application that can push close to pasta’s ceiling will be constrained by it; any application that stays well below it will not notice.&lt;/strong>&lt;/p>
&lt;/div>
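&lt;div class="paragraph">
&lt;p>The asymmetry reduces to a single comparison: does the application’s proxy-free throughput exceed what pasta can forward? A minimal sketch, taking the pgbench result through pasta as the ceiling (a simplification, since a real application does more work per request than pgbench); &lt;code>constrained_by_proxy&lt;/code> is an illustrative helper.&lt;/p>
&lt;/div>
&lt;div class="listingblock">
&lt;div class="content">
&lt;pre>
```python
# pgbench through pasta + nftables, used here as an approximation of the ceiling.
PASTA_CEILING_TPS = 18106

# Throughput measured with --network=host, i.e. without the proxy in the path.
UNCAPPED_TPS = {"Quarkus": 24116, "Spring": 13368}


def constrained_by_proxy(app):
    """True when the app's proxy-free throughput exceeds what pasta can forward."""
    return UNCAPPED_TPS[app] > PASTA_CEILING_TPS


for app in UNCAPPED_TPS:
    verdict = "constrained" if constrained_by_proxy(app) else "not constrained"
    print(f"{app}: {verdict} by pasta")
```
&lt;/pre>
&lt;/div>
&lt;/div>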
&lt;/div>
&lt;/div>
&lt;div class="sect1">
&lt;h2 id="_confirming_the_fix">Confirming the fix&lt;/h2>
&lt;div class="sectionbody">
&lt;div class="paragraph">
&lt;p>A second differential flamegraph — this time comparing the local default (pasta) run with the local &lt;code>--network=host&lt;/code> run — confirms the overhead is gone:&lt;/p>
&lt;/div>
&lt;div class="imageblock">
&lt;div class="content">
&lt;a class="image" href="diff-flamegraph.svg">&lt;img src="diff-flamegraph.png" alt="Differential flamegraph: default pasta vs host networking"/>&lt;/a>
&lt;/div>
&lt;/div>
&lt;div class="paragraph">
&lt;p>Red means more CPU in the default (pasta) run; blue means more CPU with host networking. The red stacks that dominated the first flamegraph — &lt;code>_raw_spin_unlock_irqrestore&lt;/code>, &lt;code>nft_do_chain&lt;/code>, &lt;code>tcp_clean_rtx_queue&lt;/code> — have disappeared.&lt;/p>
&lt;/div>
&lt;div class="paragraph">
&lt;p>&lt;strong>With &lt;code>--network=host&lt;/code>, the app and PostgreSQL share the same network namespace; packets never leave the kernel.&lt;/strong>&lt;/p>
&lt;/div>
&lt;/div>
&lt;/div>
&lt;div class="sect1">
&lt;h2 id="_takeaways">Takeaways&lt;/h2>
&lt;div class="sectionbody">
&lt;div class="ulist">
&lt;ul>
&lt;li>
&lt;p>&lt;strong>A benchmark that saturates an unexpected resource or component is not measuring what you think.&lt;/strong> This is the pitfall that Brendan Gregg’s &lt;a href="https://www.brendangregg.com/activebenchmarking.html">active benchmarking&lt;/a> is meant to catch:&lt;/p>
&lt;div class="quoteblock">
&lt;blockquote>
&lt;div class="paragraph">
&lt;p>&lt;em>You benchmark A, but actually measure B, and conclude you’ve measured C.&lt;/em>&lt;/p>
&lt;/div>
&lt;/blockquote>
&lt;div class="attribution">
— Brendan Gregg
&lt;/div>
&lt;/div>
&lt;div class="paragraph">
&lt;p>We benchmarked framework throughput, but Quarkus was saturating pasta’s single CPU core, so we were measuring pasta’s forwarding capacity, not framework performance. Only by collecting &lt;a href="https://man7.org/linux/man-pages/man1/mpstat.1.html">mpstat&lt;/a> and flamegraphs &lt;em>during&lt;/em> the run, as &lt;a href="https://github.com/quarkusio/spring-quarkus-perf-comparison/issues/62">required by our benchmarking practice&lt;/a>, did we identify which resource and component were saturated. Without that, the 1.19x ratio would have been taken at face value.&lt;/p>
&lt;/div>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>The impact is asymmetric.&lt;/strong> Pasta’s single-threaded ceiling affects only applications whose throughput would otherwise exceed it. In this benchmark, Quarkus exceeds the ceiling and is capped; Spring does not and is unaffected. The same logic applies to any workload — the proxy is invisible until you hit its limit.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Check your networking path.&lt;/strong> Run &lt;code>podman info | grep rootlessNetworkCmd&lt;/code> to see your backend. If it says &lt;code>pasta&lt;/code> and your benchmark talks to a containerized database, use &lt;code>--network=host&lt;/code> for the database container.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Firewall rules add up.&lt;/strong> Nearly 1000 nftables rules cost 8-13% throughput on this workload (8% for Quarkus with host networking, 13% for pgbench through pasta). For benchmarking, consider temporarily disabling the firewall or using a minimal ruleset.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/div>
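&lt;div class="paragraph">
&lt;p>The networking-path check from the list above can also be scripted. The sketch below parses the text output of &lt;code>podman info&lt;/code> for the &lt;code>rootlessNetworkCmd&lt;/code> key; the helper name and the sample excerpt are hypothetical, and the exact layout of the output may differ between podman versions.&lt;/p>
&lt;/div>
&lt;div class="listingblock">
&lt;div class="content">
&lt;pre>
```python
import re


def rootless_network_backend(podman_info_text):
    """Extract the rootlessNetworkCmd value from `podman info` output, or None."""
    match = re.search(
        r"^\s*rootlessNetworkCmd:\s*(\S+)", podman_info_text, re.MULTILINE
    )
    return match.group(1) if match else None


# Hypothetical excerpt of `podman info` output, for illustration only.
SAMPLE = """\
host:
  os: linux
  rootlessNetworkCmd: pasta
"""

backend = rootless_network_backend(SAMPLE)
if backend == "pasta":
    print("pasta in use: consider --network=host for the database container")
```
&lt;/pre>
&lt;/div>
&lt;/div>
&lt;div class="paragraph">
&lt;p>On a real machine, the text could come from &lt;code>subprocess.run(["podman", "info"], capture_output=True, text=True).stdout&lt;/code> in place of the sample string.&lt;/p>
&lt;/div>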
&lt;/div>
&lt;/div>
&lt;div class="sect1">
&lt;h2 id="_known_upstream_issues">Known upstream issues&lt;/h2>
&lt;div class="sectionbody">
&lt;div class="paragraph">
&lt;p>Our findings are consistent with several known issues in the podman/pasta ecosystem:&lt;/p>
&lt;/div>
&lt;div class="ulist">
&lt;ul>
&lt;li>
&lt;p>&lt;strong>pasta is single-threaded by design&lt;/strong> and degrades under concurrent load. Community reports confirm that above ~8 connections, even the older &lt;a href="https://github.com/rootless-containers/slirp4netns">slirp4netns&lt;/a> backend can outperform it. (&lt;a href="https://github.com/containers/podman/discussions/22559">Podman Discussion #22559&lt;/a>)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>pasta consuming 90-100% CPU&lt;/strong> has been reported under sustained network load, e.g. WireGuard tunnels on kernel 6.x. (&lt;a href="https://github.com/containers/podman/issues/23686">Podman Issue #23686&lt;/a>)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Java + PostgreSQL hang&lt;/strong> — a Spring app running PostgreSQL &lt;code>COPY FROM STDIN&lt;/code> via pasta consistently freezes mid-transfer. &lt;code>--network=host&lt;/code> fixes it. (&lt;a href="https://github.com/containers/podman/issues/22593">Podman Issue #22593&lt;/a>)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Throughput far below host capacity&lt;/strong> — rootless containers on multi-gigabit hosts achieving only ~100 Mbit/s through pasta. (&lt;a href="https://github.com/containers/podman/issues/17865">Podman Issue #17865&lt;/a>)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Traffic stalls under sustained load&lt;/strong> — TCP downloads through pasta start normally then halt, with pasta pinned at high CPU. (&lt;a href="https://github.com/containers/podman/issues/17703">Podman Issue #17703&lt;/a>)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The official &lt;a href="https://github.com/containers/podman/blob/main/docs/tutorials/performance.md">Podman performance tutorial&lt;/a> documents &lt;code>--network=host&lt;/code> and socket activation as workarounds for network-sensitive workloads.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;/div>
&lt;/div>
&lt;/div></description></item></channel></rss>