# Java vs Rust Benchmark — 10M Transactions
This page compares Spring Batch (Java 25 / Spring Boot 4.x) and Spring Batch RS (Rust) on a realistic ETL pipeline: reading 10 million financial transactions from CSV, storing them in PostgreSQL, then exporting to XML.
Both implementations use identical settings — chunk size 1 000, connection pool 10, same data schema — so the comparison is apples-to-apples.
## Test Environment

| Parameter | Value |
|---|---|
| Machine | 8-core CPU, 16 GB RAM, NVMe SSD |
| OS | Ubuntu 22.04 LTS |
| PostgreSQL | 15.4 (local, same machine) |
| Java | OpenJDK 25, Spring Boot 4.0.3, Spring Batch 6.x |
| JVM flags | -Xms512m -Xmx4g -XX:+UseG1GC + virtual threads enabled |
| Rust | 1.77 stable, --release (opt-level = 3) |
| JVM GC | G1GC, logged with -Xlog:gc*:gc.log |
| Virtual threads | Enabled (spring.threads.virtual.enabled=true) |
| Chunk size | 1 000 (both) |
| Pool size | 10 connections (both) |
## Pipeline

```text
transactions.csv (10M rows)
        │
        ▼
CsvItemReader / FlatFileItemReader
        │
        ▼
TransactionProcessor (USD/GBP → EUR conversion, CANCELLED → FAILED)
        │
        ▼
PostgresItemWriter / JdbcBatchItemWriter (bulk insert, chunk=1000)
        │
        ▼
PostgreSQL: table transactions
        │
        ▼
RdbcItemReader / JdbcPagingItemReader (paginated, page_size=1000)
        │
        ▼
XmlItemWriter / StaxEventItemWriter
        │
        ▼
transactions_export.xml
```

### Transaction record
| Field | Type | Example |
|---|---|---|
| transaction_id | string | TXN-0000000001 |
| amount | float | 1234.56 |
| currency | string | USD, EUR, GBP |
| timestamp | string | 2024-06-15T12:00:00Z |
| account_from | string | ACC-00042137 |
| account_to | string | ACC-00891023 |
| status | string | PENDING, COMPLETED, FAILED, CANCELLED |
| amount_eur | float | 1135.80 (added by processor) |
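The `amount_eur` example above follows from the processor's rounding rule (apply the rate, then round to two decimals). A minimal self-contained check of that arithmetic, with `to_eur` as an illustrative helper name:

```rust
// Convert an amount to EUR and round to 2 decimal places,
// mirroring the processor formula shown later on this page.
fn to_eur(amount: f64, rate: f64) -> f64 {
    (amount * rate * 100.0).round() / 100.0
}

fn main() {
    // The table's example row: 1234.56 USD at rate 0.92
    println!("{}", to_eur(1234.56, 0.92)); // prints 1135.8
}
```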
## Code Side by Side

### Data Model

Rust:

```rust
#[derive(Debug, Clone, Deserialize, Serialize, FromRow)]
struct Transaction {
    transaction_id: String,
    amount: f64,
    currency: String,
    timestamp: String,
    account_from: String,
    account_to: String,
    status: String,
    #[serde(default)]
    amount_eur: f64,
}
```

Java:

```java
@Entity
@Table(name = "transactions")
@XmlRootElement(name = "transaction")
@XmlAccessorType(XmlAccessType.FIELD)
public class Transaction {

    @Id
    @Column(name = "transaction_id")
    private String transactionId;

    private double amount;
    private String currency;
    private String timestamp;

    @Column(name = "account_from")
    private String accountFrom;

    @Column(name = "account_to")
    private String accountTo;

    private String status;

    @Column(name = "amount_eur")
    private double amountEur;

    // getters / setters ...
}
```

### Processor (currency conversion + status normalisation)
Rust:

```rust
#[derive(Default)]
struct TransactionProcessor;

impl ItemProcessor<Transaction, Transaction> for TransactionProcessor {
    fn process(&self, item: &Transaction) -> ItemProcessorResult<Transaction> {
        let rate = match item.currency.as_str() {
            "USD" => 0.92,
            "GBP" => 1.17,
            _ => 1.0,
        };
        let status = if item.status == "CANCELLED" {
            "FAILED".to_string()
        } else {
            item.status.clone()
        };
        Ok(Some(Transaction {
            amount_eur: (item.amount * rate * 100.0).round() / 100.0,
            status,
            ..item.clone()
        }))
    }
}
```

Java:

```java
@Component
public class TransactionProcessor implements ItemProcessor<Transaction, Transaction> {

    private static final Map<String, Double> RATES = Map.of(
        "USD", 0.92, "GBP", 1.17, "EUR", 1.0);

    @Override
    public Transaction process(Transaction item) {
        double rate = RATES.getOrDefault(item.getCurrency(), 1.0);
        item.setAmountEur(
            Math.round(item.getAmount() * rate * 100.0) / 100.0);
        if ("CANCELLED".equals(item.getStatus())) {
            item.setStatus("FAILED");
        }
        return item;
    }
}
```

### Step 1 — CSV → PostgreSQL
Rust:

```rust
let file = File::open(csv_path)?;
let buffered = BufReader::with_capacity(64 * 1024, file);

let reader = CsvItemReaderBuilder::<Transaction>::new()
    .has_headers(true)
    .from_reader(buffered);

let writer = RdbcItemWriterBuilder::<Transaction>::new()
    .postgres(&pool)
    .table("transactions")
    .add_column("transaction_id")
    // ... 8 columns total
    .postgres_binder(&TransactionBinder)
    .build_postgres();

let step = StepBuilder::new("csv-to-postgres")
    .chunk::<Transaction, Transaction>(1_000)
    .reader(&reader)
    .processor(&TransactionProcessor)
    .writer(&writer)
    .build();
```

Java:

```java
@Bean
public FlatFileItemReader<Transaction> csvReader() {
    return new FlatFileItemReaderBuilder<Transaction>()
        .name("transactionCsvReader")
        .resource(new FileSystemResource(csvPath))
        .linesToSkip(1)
        .delimited().delimiter(",")
        .names("transactionId", "amount", "currency", "timestamp",
               "accountFrom", "accountTo", "status")
        .targetType(Transaction.class)
        .build();
}

@Bean
public Step step1(...) {
    return new StepBuilder("csvToPostgresStep", repo)
        .<Transaction, Transaction>chunk(1_000, tx)
        .reader(csvReader())
        .processor(processor)
        .writer(postgresWriter(dataSource))
        .build();
}
```

### Step 2 — PostgreSQL → XML
Rust:

```rust
let reader = RdbcItemReaderBuilder::<Transaction>::new()
    .postgres(pool.clone())
    .query(
        "SELECT transaction_id, amount, currency, timestamp, \
         account_from, account_to, status, amount_eur \
         FROM transactions ORDER BY transaction_id",
    )
    .with_page_size(1_000)
    .build_postgres();

let writer = XmlItemWriterBuilder::<Transaction>::new()
    .root_tag("transactions")
    .item_tag("transaction")
    .from_path(xml_path)?;

let step = StepBuilder::new("postgres-to-xml")
    .chunk::<Transaction, Transaction>(1_000)
    .reader(&reader)
    .processor(&PassThroughProcessor::new())
    .writer(&writer)
    .build();
```

Java:

```java
@Bean
public JdbcPagingItemReader<Transaction> postgresReader(DataSource ds) {
    return new JdbcPagingItemReaderBuilder<Transaction>()
        .name("postgresTransactionReader")
        .dataSource(ds)
        .selectClause("SELECT transaction_id,amount,currency,timestamp,"
            + "account_from,account_to,status,amount_eur")
        .fromClause("FROM transactions")
        .sortKeys(Map.of("transaction_id", Order.ASCENDING))
        .rowMapper(/* maps columns → Transaction */)
        .pageSize(1_000)
        .build();
}

@Bean
public Step step2(...) {
    return new StepBuilder("postgresToXmlStep", repo)
        .<Transaction, Transaction>chunk(1_000, tx)
        .reader(postgresReader(dataSource))
        .writer(xmlWriter(marshaller))
        .build();
}
```

## Results
Measured on the reference environment described above.
### Overall performance

| Metric | Spring Batch RS (Rust) | Spring Batch (Java) | Rust advantage |
|---|---|---|---|
| Total pipeline time | 42 s | 187 s | 4.5× faster |
| Step 1 duration (CSV→PG) | 28 s | 124 s | 4.4× |
| Step 2 duration (PG→XML) | 14 s | 63 s | 4.5× |
| JVM / binary startup | < 10 ms | 3 200 ms | 320× |
| Deployable artefact size | 8 MB (binary) | 47 MB (fat JAR) | 6× smaller |
### Throughput (records/sec)

| Step | Rust | Java | Ratio |
|---|---|---|---|
| Step 1 — CSV → PostgreSQL | 357 000 | 80 600 | 4.4× |
| Step 2 — PostgreSQL → XML | 714 000 | 158 700 | 4.5× |
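The throughput column is simply records divided by step duration; a quick sketch reproducing the table's figures from the durations reported earlier:

```rust
// Recompute throughput (records/sec) from the measured step durations.
fn throughput(records: u64, seconds: u64) -> u64 {
    records / seconds
}

fn main() {
    let records = 10_000_000;
    println!("Rust step 1: {} rec/s", throughput(records, 28)); // ≈ 357 000
    println!("Rust step 2: {} rec/s", throughput(records, 14)); // ≈ 714 000
    println!("Java step 1: {} rec/s", throughput(records, 124)); // ≈ 80 600
    println!("Java step 2: {} rec/s", throughput(records, 63)); // ≈ 158 700
}
```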
### Memory (peak RSS)

| Metric | Rust | Java |
|---|---|---|
| Peak RSS | 62 MB | 1 840 MB |
| Heap peak | N/A (no GC) | 1 620 MB |
| Steady-state RSS | ~45 MB | ~820 MB |
### GC (Java only)

| Metric | Value |
|---|---|
| Total GC events | 312 |
| Total GC pause time | 8.4 s |
| Longest single pause | 340 ms |
| % of runtime in GC | 4.5% |
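The last row can be cross-checked from the other figures (8.4 s of cumulative pauses over the 187 s total run):

```rust
// Cross-check "% of runtime in GC" from cumulative pause time and total runtime.
fn gc_runtime_pct(gc_pause_s: f64, total_runtime_s: f64) -> f64 {
    gc_pause_s / total_runtime_s * 100.0
}

fn main() {
    println!("{:.1}% of runtime in GC", gc_runtime_pct(8.4, 187.0)); // prints 4.5%
}
```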
## Analysis

### Why is Rust ~4.5× faster?

1. No garbage collection. Java’s G1GC paused for a cumulative 8.4 seconds. Rust uses RAII — memory is freed the instant a chunk goes out of scope, with zero overhead and zero latency spikes.
2. Lower memory pressure. Java holds JVM metadata, class bytecode, and JIT-compiled code in addition to heap data. Spring Batch also retains JobExecution and StepExecution objects throughout the run. Rust’s binary is a single executable: 62 MB vs 1 840 MB peak RSS.
3. Zero-cost abstractions. Rust’s trait-based pipeline (ItemReader → ItemProcessor → ItemWriter) compiles to a tight loop with no virtual dispatch overhead. Java’s pipeline involves Spring AOP, proxy objects, and transaction management wrappers on every chunk boundary.
4. Startup time. The JVM takes 3.2 s to start, load classes, and JIT-compile hot paths. The Rust binary starts in under 10 ms — critical for short jobs or frequent schedules.
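Point 3 can be made concrete with a minimal sketch. `ItemProcessor` and `run_chunk` below are illustrative names for this example, not the actual Spring Batch RS API: a generic chunk loop is monomorphized per processor type, so every call is direct and inlinable rather than routed through a vtable.

```rust
// Minimal illustration of static dispatch in a trait-based pipeline.
trait ItemProcessor<I, O> {
    fn process(&self, item: I) -> O;
}

struct Doubler;

impl ItemProcessor<i64, i64> for Doubler {
    fn process(&self, item: i64) -> i64 {
        item * 2
    }
}

// Generic over P: the compiler emits a specialized copy of this loop for
// each concrete processor, so `p.process(x)` is a direct, inlinable call.
fn run_chunk<P: ItemProcessor<i64, i64>>(p: &P, chunk: &[i64]) -> Vec<i64> {
    chunk.iter().map(|&x| p.process(x)).collect()
}

fn main() {
    println!("{:?}", run_chunk(&Doubler, &[1, 2, 3])); // [2, 4, 6]
}
```

Had `run_chunk` taken `&dyn ItemProcessor<i64, i64>` instead, each item would pay a vtable lookup — the overhead the article attributes to Java's proxy-wrapped pipeline.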
### When to choose Java

- Your team is Java-first and migration cost outweighs performance gains
- You need Spring ecosystem integrations (Spring Data, Spring Cloud Task, Spring Integration)
- Your batch jobs run infrequently and throughput is not the bottleneck
- You require rich operational features: JobRepository, JobExplorer, REST API control
### When to choose Rust

- Throughput and latency are business requirements (financial settlement, real-time ETL)
- Memory is constrained (embedded systems, small containers)
- GC pauses would cause SLA violations
- You want a single statically-linked binary with no runtime dependency
- Cold-start time matters (serverless, frequent scheduling)
## How to Reproduce

### Prerequisites

```bash
# PostgreSQL 15+ (Docker):
docker run -d --name pg-bench \
  -p 5432:5432 \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=benchmark \
  postgres:15
```

### Run the Rust benchmark
```bash
# Build in release mode (required for fair comparison)
cargo build --release --example benchmark_csv_postgres_xml \
  --features csv,xml,rdbc-postgres

# Run and measure peak RSS
/usr/bin/time -v \
  cargo run --release --example benchmark_csv_postgres_xml \
  --features csv,xml,rdbc-postgres \
  2>&1 | tee rust_bench.log

# Extract key metrics
grep -E "Step|SUMMARY|Maximum resident" rust_bench.log
```

### Run the Java benchmark
Section titled “Run the Java benchmark”cd benchmark/java
# Requires Java 25 + Maven 3.9+# Build fat JAR (Spring Boot 4.0.3 / Spring Batch 6.x)mvn package -q -DskipTests
# Run with GC logging, virtual threads, and RSS measurement/usr/bin/time -v java \ -Xms512m -Xmx4g \ -XX:+UseG1GC \ -Xlog:gc*:gc.log \ -jar target/spring-batch-benchmark-1.0.0.jar \ --spring.datasource.url=jdbc:postgresql://localhost:5432/benchmark \ 2>&1 | tee java_bench.log
# Parse GC summarygrep "Pause" gc.log | tail -20grep "Maximum resident" java_bench.log