This document explains how different execution strategies work and when to use each one.
Note for Bundler Users: When using Worker-based strategies (worker, workerWasm, etc.) with bundlers like Vite or Webpack, you must explicitly specify the workerURL option. See How to Use with Bundlers for details.
CSV parsing can be a performance bottleneck, especially when dealing with large files. The performance requirements vary significantly depending on the runtime environment:
Browser Context:
Server Context (Node.js, Deno, Bun):
web-csv-toolbox addresses these challenges through flexible execution strategies that work across all JavaScript runtimes.
web-csv-toolbox supports multiple execution strategies that can be combined for optimal performance:
Parsing runs synchronously on the main thread (browser) or event loop (Node.js/Deno/Bun), occupying it and preventing other operations until completion.
Advantages:
Disadvantages:
import { parse } from 'web-csv-toolbox';
// Runs on main thread (default)
for await (const record of parse(csv)) {
  console.log(record);
}
sequenceDiagram
participant Main as Main Thread<br/>(UI Events & Rendering)
participant Worker as Worker Thread<br/>(Parsing)
Main->>Worker: Send CSV data
activate Worker
Note over Main: UI stays responsive ✓
Worker->>Worker: Parse CSV
Worker-->>Main: Send records
Worker-->>Main: Send results
deactivate Worker
Note over Main: Render results
Parsing is offloaded to a separate worker thread, keeping the main thread free.
How It Works:
sequenceDiagram
participant Main as Main Thread
participant Worker as Worker Thread
Main->>Worker: postMessage(CSV data)
activate Worker
Worker->>Main: postMessage(record 1)
Worker->>Main: postMessage(record 2)
Worker->>Main: postMessage(record 3)
deactivate Worker
Note over Worker,Main: Records sent one by one<br/>via postMessage
Records are sent one by one via the postMessage API.
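The receiving side of a record-per-message flow can be sketched as a small handler. The message shapes (`record`, `done`, `error`) and the name `createMessageHandler` are hypothetical and chosen for illustration; the library's actual wire format may differ.

```typescript
// Hypothetical message shapes; not the library's documented wire format.
type CsvRecord = Record<string, string>;
type WorkerMessage =
  | { type: "record"; record: CsvRecord }
  | { type: "done" }
  | { type: "error"; message: string };

// Builds a handler that forwards each incoming record to `onRecord`
// and signals completion through `onDone`.
function createMessageHandler(
  onRecord: (record: CsvRecord) => void,
  onDone: () => void,
): (message: WorkerMessage) => void {
  let finished = false;
  return (message) => {
    if (finished) return; // ignore anything that arrives after completion
    switch (message.type) {
      case "record":
        onRecord(message.record); // one postMessage call carries one record
        break;
      case "done":
        finished = true;
        onDone();
        break;
      case "error":
        finished = true;
        throw new Error(message.message);
    }
  };
}
```

In a real page, such a handler would typically be wired up with `worker.onmessage = (event) => handle(event.data)`. Because every record crosses the thread boundary as its own message, per-record messaging overhead grows with the number of records.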
Characteristics:
Best For:
How It Works:
sequenceDiagram
participant Main as Main Thread
participant Worker as Worker Thread
Main->>Worker: Transfer ReadableStream<br/>(zero-copy)
activate Worker
Note over Worker: Parse stream
Worker-->>Main: Transfer result stream<br/>(zero-copy)
deactivate Worker
Main->>Main: Read records
Note over Main,Worker: Transferable Streams<br/>No memory copies
The entire stream is transferred to the worker using Transferable Streams.
Characteristics:
Best For:
For detailed information about supported environments and browser compatibility, see Supported Environments.
Quick Summary:
Note: On Safari, web-csv-toolbox automatically falls back to message-streaming when Transferable Streams are not available. See Automatic Fallback Behavior for details.
Worker execution works across different JavaScript runtimes with platform-specific implementations:
Browser (Web Workers)
import { parseString, EnginePresets } from 'web-csv-toolbox';
for await (const record of parseString(csv, {
  engine: EnginePresets.balanced()
})) {
  console.log(record);
  // UI stays responsive!
}
Node.js (Worker Threads)
import { parseString, EnginePresets } from 'web-csv-toolbox';
// Worker threads are used automatically
for await (const record of parseString(csv, {
  engine: EnginePresets.balanced()
})) {
  console.log(record);
}
node:worker_threads internally (Node.js LTS)
Deno (Web Workers)
For detailed platform specifications and compatibility information, see Supported Environments.
web-csv-toolbox automatically falls back to more compatible execution methods when the requested strategy is not available. This behavior is enabled by default but can be disabled with strict mode.
Stream Transfer → Message Streaming:
Example (Default - with auto-fallback):
import { parseString, EnginePresets } from 'web-csv-toolbox';
// Request stream-transfer strategy
for await (const record of parseString(csv, {
  engine: EnginePresets.memoryEfficient()
})) {
  console.log(record);
}
// On Safari: Automatically falls back to message-streaming
// On Chrome/Firefox/Edge: Uses stream-transfer as requested
Example (Strict mode - no fallback):
import { parseString } from 'web-csv-toolbox';
// Strict mode: throw error if stream-transfer not supported
for await (const record of parseString(csv, {
  engine: {
    worker: true,
    workerStrategy: 'stream-transfer',
    strict: true // Disable automatic fallback
  }
})) {
  console.log(record);
}
// On Safari: Throws an error
// On Chrome/Firefox/Edge: Uses stream-transfer as requested
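The two examples above differ only in what happens when stream-transfer is unavailable. That decision can be sketched as a pure function. This is an illustration of the documented behavior, not the library's internal code; `resolveWorkerStrategy` and the `transferableStreams` flag are hypothetical names.

```typescript
type WorkerStrategy = "stream-transfer" | "message-streaming";

interface ResolveOptions {
  // Strategy the caller asked for.
  requested: WorkerStrategy;
  // Hypothetical flag: whether the runtime can transfer streams to a worker.
  transferableStreams: boolean;
  // Mirrors the documented `strict` option: no fallback when true.
  strict?: boolean;
}

function resolveWorkerStrategy(options: ResolveOptions): WorkerStrategy {
  const { requested, transferableStreams, strict = false } = options;
  // Nothing to resolve if the request can be honored as-is.
  if (requested !== "stream-transfer" || transferableStreams) {
    return requested;
  }
  // Strict mode: surface the incompatibility instead of hiding it.
  if (strict) {
    throw new Error("stream-transfer is not supported in this environment");
  }
  // Default: fall back to the more compatible strategy.
  return "message-streaming";
}
```

With `strict` left unset, a Safari-like environment resolves to `message-streaming`; with `strict: true` the same call throws, which matches the behavior described in the comments of the two examples.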
When to use strict mode:
For more details on the strict option, refer to the EngineConfig type documentation in your IDE or the API Reference.
✅ Use workers when:
worker preset)
balanced preset)
❌ Skip workers when:
mainThread preset instead
Browser (UI responsiveness):
import { parse, EnginePresets } from 'web-csv-toolbox';
// Worker with message streaming (Safari compatible)
for await (const record of parse(csv, {
  engine: EnginePresets.responsive()
})) {
  console.log(record);
  // UI stays responsive!
}
// Worker with stream transfer (best for Chrome/Firefox/Edge)
for await (const record of parse(response, {
  engine: EnginePresets.memoryEfficient()
})) {
  console.log(record);
  // Zero-copy streaming!
}
Server (concurrent processing):
import { parseStringStream, ReusableWorkerPool } from 'web-csv-toolbox';
import { createReadStream } from 'node:fs';
import { Readable } from 'node:stream';
// Create worker pool for handling multiple files
using pool = new ReusableWorkerPool({ maxWorkers: 4 });
// Process multiple CSV files concurrently with streaming
// (csvFiles is assumed to be an array of file paths defined elsewhere)
await Promise.all(
  csvFiles.map(async (filePath) => {
    // Stream file from disk
    const fileStream = createReadStream(filePath, 'utf-8');
    const webStream = Readable.toWeb(fileStream);
    let processedCount = 0;
    for await (const record of parseStringStream(webStream, {
      engine: { worker: true, workerPool: pool }
    })) {
      // Process each record without loading all into memory
      await processRecord(record); // e.g., insert to database
      processedCount++;
    }
    console.log(`Processed ${processedCount} records from ${filePath}`);
  })
);
async function processRecord(record) {
  // Process individual record (e.g., database insert, validation, etc.)
  // Records are not accumulated in memory
}
Parsing uses pre-compiled WebAssembly code for improved performance compared to JavaScript.
Performance Note: Actual performance improvements vary depending on data size, content complexity, and runtime environment. In many cases, WASM can provide significant speedups, but results may differ based on your specific use case.
JavaScript:
// Interpreted/JIT compiled at runtime
function parseCSV(text) {
  // Character-by-character processing
  for (let i = 0; i < text.length; i++) {
    const char = text[i];
    // ... complex logic ...
  }
}
WebAssembly:
;; Pre-compiled to machine code
(func $parse_csv
  ;; Optimized low-level operations
  ;; Direct memory access
  ;; Efficient memory operations
)
Performance Characteristics:
Advantages:
Disadvantages:
") only
loadWASM() beforehand)
// ✅ Works
parse(utf8CSV, { engine: { wasm: true } });
// ❌ Won't work
parse(shiftJISCSV, {
  charset: 'shift-jis',
  engine: { wasm: true }
});
// ✅ Works
parse('a,"b,c",d', { engine: { wasm: true } });
// ❌ Won't work
parse("a,'b,c',d", {
  quotation: "'",
  engine: { wasm: true }
});
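The two restrictions shown above (UTF-8 input and the double-quote character as quotation) can be checked before WASM is requested. The helper below is a hypothetical sketch, not part of the library. It assumes that an omitted `charset` means UTF-8 and an omitted `quotation` means `"`, and that only the labels `utf-8` and `utf8` denote UTF-8.

```typescript
interface WasmCheckOptions {
  charset?: string;   // same name as the `charset` option used above
  quotation?: string; // same name as the `quotation` option used above
}

// Returns true only when both documented WASM constraints are satisfied.
function canUseWasm(options: WasmCheckOptions = {}): boolean {
  // Assumption: missing values default to UTF-8 and the double quote.
  const charset = (options.charset ?? "utf-8").toLowerCase();
  const quotation = options.quotation ?? '"';
  const isUtf8 = charset === "utf-8" || charset === "utf8";
  return isUtf8 && quotation === '"';
}
```

A caller could use the result to decide whether to pass `engine: { wasm: true }` or choose a non-WASM configuration instead.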
✅ Use WASM when:
❌ Skip WASM when:
loadWASM())
import { parse, EnginePresets, loadWASM } from 'web-csv-toolbox';
// Recommended: Load WASM module once at startup
// This prevents initialization overhead on first parse
await loadWASM();
// Parse with WASM
for await (const record of parse(csv, {
  engine: EnginePresets.fast()
})) {
  console.log(record);
}
import { parse, EnginePresets } from 'web-csv-toolbox';
// WASM is automatically initialized on first use
// However, this can cause a noticeable delay on the first parse
for await (const record of parse(csv, {
  engine: EnginePresets.fast()
})) {
  console.log(record);
}
Note: The first WASM parse without pre-loading can take longer due to module initialization. Pre-loading with loadWASM() is recommended to avoid this bottleneck, especially in performance-critical applications.
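One common way to guarantee that an expensive initialization runs only once, however many call sites request it, is to memoize the promise it returns. The sketch below shows that generic pattern. It does not describe how loadWASM() is implemented, and `once`, `ensureReady`, and `initCount` are illustrative names.

```typescript
// Wraps an async initializer so it runs at most once; later calls
// reuse the same in-flight or settled promise.
function once<T>(init: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = init(); // the first call starts initialization
    }
    return cached;
  };
}

// Hypothetical stand-in for an expensive module initialization.
let initCount = 0;
const ensureReady = once(async () => {
  initCount += 1;
  return "ready";
});
```

Calling `ensureReady()` at application startup and again before each parse would start the work once at startup, so later calls wait only for the already-running (or already-finished) initialization.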
In Browsers:
sequenceDiagram
participant Main as Main Thread<br/>(UI Events & Rendering)
participant Worker as Worker Thread<br/>(WASM Parser)
Main->>Worker: Send CSV data
activate Worker
Note over Main: UI stays responsive ✓
Worker->>Worker: WASM Parse<br/>(faster)
Worker-->>Main: Send results
deactivate Worker
Note over Main: Render results
Note over Main,Worker: Fast + Non-blocking
On Servers:
sequenceDiagram
participant EventLoop as Event Loop<br/>(HTTP Requests)
participant Worker as Worker Thread<br/>(WASM Parser)
EventLoop->>Worker: Send CSV data
activate Worker
Note over EventLoop: Can handle other requests ✓
Worker->>Worker: WASM Parse<br/>(faster)
Worker-->>EventLoop: Send results
deactivate Worker
Note over EventLoop: Process results
Note over EventLoop,Worker: Fast + High throughput
Combines the benefits of both strategies:
Advantages:
Disadvantages:
") only
loadWASM() beforehand)
✅ Use combined when:
❌ Skip combined when:
import { parse, EnginePresets, loadWASM } from 'web-csv-toolbox';
// Recommended: Load WASM module once at startup
await loadWASM();
// Best of both worlds: Worker + WASM + stream-transfer
for await (const record of parse(csv, {
  engine: EnginePresets.responsiveFast()
})) {
  console.log(record);
  // Fast + non-blocking!
}
import { parse, EnginePresets } from 'web-csv-toolbox';
// WASM is automatically initialized on first use
// However, this can cause a delay on the first parse
for await (const record of parse(csv, {
  engine: EnginePresets.responsiveFast()
})) {
  console.log(record);
}
Performance varies significantly based on:
| Strategy | Parse Time | Main Thread/Event Loop | Memory | CPU Usage |
|---|---|---|---|---|
| Main Thread | Baseline | ❌ Occupied | Moderate | Higher |
| Worker (message) | Baseline + overhead | ✅ Free | Moderate | Higher |
| Worker (stream) | Baseline + overhead | ✅ Free | Lower | Higher |
| WASM | Generally faster | ❌ Occupied | Lower | Lower |
| Worker + WASM | Generally faster + overhead | ✅ Free | Moderate | Lower |
Key Insights:
Note: For measured performance data on specific workloads, see CodSpeed benchmarks. Always benchmark with your actual data to determine the best strategy for your use case.
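A minimal timing harness for comparing strategies on your own data might look like the following. The helper `medianTimeMs` and the placeholder workload are assumptions made for illustration. In practice the task would be a function that fully consumes a parse with the engine configuration under test.

```typescript
// Runs `task` several times and returns the median duration in milliseconds.
// The median is less sensitive to one-off spikes (JIT warm-up, GC) than the mean.
async function medianTimeMs(
  task: () => Promise<void> | void,
  runs = 5,
): Promise<number> {
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    await task();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)] ?? 0;
}

// Placeholder workload: a naive split over synthetic rows.
// Replace it with a function that iterates a real parse to completion.
const sampleCsv =
  "id,name\n" +
  Array.from({ length: 1000 }, (_, i) => `${i},user${i}`).join("\n");

async function placeholderParse(): Promise<void> {
  for (const line of sampleCsv.split("\n")) {
    line.split(",");
  }
}
```

Running the same harness once per candidate configuration, on a representative file, gives numbers that can be compared directly against the qualitative table above.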
Memory: O(n) - proportional to file size
Memory: O(n) + overhead for message copies
Memory: O(1) - constant per record (streaming)
graph TD
Start{File Size} --> |< 100KB| MainThread[mainThread]
Start --> |100KB-1MB| Choice1{Browser?}
Choice1 --> |Yes| Worker1[worker]
Choice1 --> |No| MainThread
Start --> |1MB-10MB| Balanced[balanced<br/>worker + stream]
Start --> |10MB-100MB| UTF8{UTF-8?}
UTF8 --> |Yes| Fastest[fastest<br/>worker + wasm + stream]
UTF8 --> |No| Balanced2[balanced<br/>any encoding]
Start --> |> 100MB| Stream[balanced<br/>with streaming input]
style MainThread fill:#e1f5ff
style Worker1 fill:#ccffcc
style Balanced fill:#ccffcc
style Fastest fill:#ffffcc
style Balanced2 fill:#ccffcc
style Stream fill:#ccffcc
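The flowchart can also be read as a small lookup function. The sketch below is one possible encoding of it, with two assumptions the diagram leaves open: 1KB is taken as 1024 bytes, and each range includes its lower bound. `recommendPreset` is a hypothetical helper, and the returned strings are the labels used in the chart.

```typescript
type PresetName = "mainThread" | "worker" | "balanced" | "fastest";

interface Environment {
  isBrowser: boolean; // running in a browser rather than on a server
  isUtf8: boolean;    // input is known to be UTF-8
}

const KB = 1024;
const MB = 1024 * KB;

// Maps file size and environment to the preset label from the flowchart.
function recommendPreset(fileSizeBytes: number, env: Environment): PresetName {
  if (fileSizeBytes < 100 * KB) return "mainThread";
  if (fileSizeBytes < 1 * MB) return env.isBrowser ? "worker" : "mainThread";
  if (fileSizeBytes < 10 * MB) return "balanced";
  if (fileSizeBytes <= 100 * MB) return env.isUtf8 ? "fastest" : "balanced";
  // Above 100MB the chart recommends balanced combined with streaming input.
  return "balanced";
}
```

The size thresholds are starting points from the chart rather than hard limits, so the result is best treated as a default to confirm with measurements.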
Browser - Need UI responsiveness?
→ Use worker (balanced, worker, fastest)
Server - Need high throughput?
→ Use worker with pool (balanced, fastest)
Need maximum speed?
→ Use WASM (wasm, fastest)
Need broad format support (non-UTF-8, custom quotes)?
→ Avoid WASM (balanced, worker, mainThread)
Browser - Safari support required?
→ Use message-streaming (worker, not workerStreamTransfer)
Need maximum compatibility?
→ Use mainThread (works everywhere)
Browser (UI-critical):
→ balanced or fastest (keep UI responsive)
Node.js/Deno/Bun (server-side):
→ balanced with WorkerPool (concurrent processing)
Safari:
→ worker or balanced (auto-fallback to message-streaming)
Chrome/Firefox/Edge:
→ fastest or workerStreamTransfer (zero-copy streams)
CLI tools / Scripts:
→ mainThread or wasm (no worker overhead)
For advanced configuration options, refer to the EngineConfig type documentation in your IDE or the API Reference.