Complete reference for the engine configuration option.
The `engine` option controls how CSV parsing is executed. It allows you to:

- Offload parsing to a worker thread (`worker`)
- Enable WebAssembly-based parsing (`wasm`)
- Choose how data is transferred to and from the worker (`workerStrategy`)
- Manage worker lifecycle and script location (`workerPool`, `workerURL`)
- Tune the Blob reading strategy for `parseBlob()` and `parseFile()` (`arrayBufferThreshold`)
- Adjust experimental backpressure and queuing behavior

```ts
interface EngineConfig {
  worker?: boolean;
  wasm?: boolean;
  workerStrategy?: 'message-streaming' | 'stream-transfer';
  workerPool?: WorkerPool;
  workerURL?: string;
  strict?: boolean;
  arrayBufferThreshold?: number;
  backpressureCheckInterval?: {
    lexer?: number;
    assembler?: number;
  };
  queuingStrategy?: {
    lexerWritable?: QueuingStrategy<string>;
    lexerReadable?: QueuingStrategy<Token>;
    assemblerWritable?: QueuingStrategy<Token>;
    assemblerReadable?: QueuingStrategy<CSVRecord<any>>;
  };
}
```
### worker

Type: `boolean`
Default: `false`

Enable worker thread execution to offload parsing from the main thread.

Platforms: Browsers (Web Workers) and Node.js (Worker Threads).

Example:

```ts
import { parseString } from 'web-csv-toolbox';

for await (const record of parseString(csv, {
  engine: { worker: true }
})) {
  console.log(record);
  // Main thread stays responsive!
}
```

Benefits:

- Keeps the main thread (and UI) responsive while parsing

Considerations:

- Workers have a small initialization cost (see the comparison table below)
- Records must cross the thread boundary, which adds transfer overhead
### wasm

Type: `boolean`
Default: `false`

Enable WebAssembly-based parsing for improved performance.

Initialization:

- `web-csv-toolbox` (main entry): Auto-initializes on first use. For better first-parse latency, we recommend preloading via `loadWASM()`.
- `web-csv-toolbox/slim` (slim entry): You must call `loadWASM()`. With bundlers, you may need to pass a `wasmUrl` to `loadWASM()`.

Example:

```ts
import { parseString, loadWASM } from 'web-csv-toolbox';

await loadWASM();

for await (const record of parseString(csv, {
  engine: { wasm: true }
})) {
  console.log(record);
}
```
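For the slim entry, here is a minimal sketch. The exact `loadWASM()` options shape is an assumption (the note above only says a `wasmUrl` may need to be passed), and `./csv.wasm` is a placeholder path; check the `loadWASM()` reference for the real signature:

```ts
import { parseString, loadWASM } from 'web-csv-toolbox/slim';

// Assumed options shape; './csv.wasm' is a placeholder for wherever
// your bundler or CDN serves the WASM binary
await loadWASM({ wasmUrl: new URL('./csv.wasm', import.meta.url) });

for await (const record of parseString(csv, {
  engine: { wasm: true }
})) {
  console.log(record);
}
```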
Performance: Faster parsing than the default JavaScript engine (see the comparison table below).

Limitations:
### workerStrategy

Type: `'message-streaming' | 'stream-transfer'`
Default: `'message-streaming'`

Choose how data is transferred between the main thread and the worker.

#### 'message-streaming'

Records are sent via `postMessage` one by one.
Characteristics:

- Works everywhere workers do; no Transferable Streams requirement
- Each record crosses the thread boundary individually, adding per-record overhead

Example:

```ts
{
  worker: true,
  workerStrategy: 'message-streaming'
}
```
#### 'stream-transfer'

Streams are transferred directly using Transferable Streams (zero-copy).

Characteristics:

- Zero-copy: the stream itself is transferred instead of individual messages
- Requires Transferable Streams support in the runtime

Example:

```ts
{
  worker: true,
  workerStrategy: 'stream-transfer'
}
```

Browser Support: Chrome, Firefox, and Edge support Transferable Streams; Safari does not. Without `strict`, the library automatically falls back to `'message-streaming'` where support is missing.
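If you want to pick a strategy up front instead of relying on fallback, here is a minimal feature-detection sketch (a generic web-platform technique, not part of this library's API):

```ts
function supportsStreamTransfer(): boolean {
  try {
    const { readable } = new TransformStream();
    const { port1 } = new MessageChannel();
    // Throws a DataCloneError in runtimes without Transferable Streams
    port1.postMessage(readable, [readable]);
    port1.close();
    return true;
  } catch {
    return false;
  }
}

const workerStrategy = supportsStreamTransfer()
  ? 'stream-transfer'
  : 'message-streaming';
```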
### workerPool

Type: `WorkerPool` (implemented by `ReusableWorkerPool`)
Default: Shared singleton pool

Specify a custom `WorkerPool` for managing worker lifecycle.

Why Use a Custom Pool:

- Cap concurrency (`maxWorkers`) when processing untrusted uploads
- Reuse workers across multiple parses instead of spawning new ones
- Control shutdown explicitly via `pool.terminate()`
Example:

```ts
import { ReusableWorkerPool, parseString } from 'web-csv-toolbox';

const pool = new ReusableWorkerPool({ maxWorkers: 4 });

// app: your server framework instance
app.onShutdown(() => {
  pool.terminate();
});

for await (const record of parseString(csv, {
  engine: { worker: true, workerPool: pool }
})) {
  console.log(record);
}
```
Security: Always use WorkerPool with limited maxWorkers in production applications that process user uploads.
See: How-To: Secure CSV Processing
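A minimal sketch of that guidance in a request handler (framework-agnostic; `handleUpload` and the response shapes are placeholders):

```ts
import { ReusableWorkerPool, parseBlob } from 'web-csv-toolbox';

const pool = new ReusableWorkerPool({ maxWorkers: 4 });

// Placeholder handler: reject new work when all workers are busy
async function handleUpload(file: Blob): Promise<Response> {
  if (pool.isFull()) {
    return new Response('Service busy', { status: 503 });
  }
  for await (const record of parseBlob(file, {
    engine: { worker: true, workerPool: pool }
  })) {
    // process record
  }
  return new Response('OK');
}
```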
### workerURL

Type: `string`
Default: Bundled worker script

Specify a custom worker script URL.

Use Case: Hosting the worker script yourself, for example on a CDN, as shown below.

Example:

```ts
{
  worker: true,
  workerURL: 'https://cdn.example.com/csv-worker.js'
}
```

Note: Custom workers must implement the expected message protocol.

Node.js: In Node, `engine: { worker: true }` works without `workerURL`. The bundled worker path is resolved internally.
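If you bundle and host the worker script yourself, a common bundler-friendly pattern (generic, not specific to this library; `./csv-worker.js` is a placeholder) is to resolve it relative to the current module:

```ts
{
  worker: true,
  // new URL(..., import.meta.url) lets bundlers rewrite the asset path
  workerURL: new URL('./csv-worker.js', import.meta.url).toString()
}
```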
### strict

Type: `boolean`
Default: `false`

Strict mode prevents the automatic fallback (stream-transfer → message-streaming) when `workerStrategy: 'stream-transfer'` is requested.

Behavior:

- `true`: Throws if stream transfer is unavailable; does not fall back to `'message-streaming'`.
- `false`: Automatically falls back to `'message-streaming'` and calls `onFallback`.

Notes: `strict` is valid only when `worker: true` and `workerStrategy: 'stream-transfer'`. Other combinations are invalid and will throw.

Use Case: Fail fast in environments where stream-transfer performance is required, rather than silently degrading.
Example (strict for Chrome/Firefox/Edge):

```ts
{
  worker: true,
  workerStrategy: 'stream-transfer',
  strict: true // throws on Safari where stream transfer is unsupported
}
```
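For the non-strict path, here is a sketch of observing the automatic fallback. This reference only mentions that `onFallback` is called, so treat its placement and signature here as assumptions:

```ts
import { parseString } from 'web-csv-toolbox';

for await (const record of parseString(csv, {
  engine: {
    worker: true,
    workerStrategy: 'stream-transfer',
    strict: false,
    // Assumed placement/signature: invoked when stream transfer is
    // unavailable and the engine falls back to message-streaming
    onFallback: (reason: unknown) => {
      console.warn('Fell back to message-streaming:', reason);
    }
  }
})) {
  console.log(record);
}
```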
### arrayBufferThreshold

Type: `number` (bytes)
Default: `1048576` (1MB)
Applies to: `parseBlob()` and `parseFile()` only

Controls the automatic selection between two Blob reading strategies based on file size.

Strategies:

- Files smaller than the threshold: read with `blob.arrayBuffer()` and parse via `parseBinary()`, subject to `maxBufferSize` (default 10MB)
- Files equal to or larger than the threshold: read with `blob.stream()` and parse via `parseBinaryStream()`

Special Values:

- `0` - Always use streaming (maximum memory efficiency)
- `Infinity` - Always use arrayBuffer (maximum performance for small files)

Default Rationale: The 1MB default threshold is determined by benchmarks and keeps fully buffered reads well below `maxBufferSize` (10MB).
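Conceptually, the selection reduces to a size comparison (illustrative sketch only; the real logic is internal to `parseBlob()`/`parseFile()`):

```ts
// Illustrative only: mirrors the documented behavior above
function pickReadStrategy(blob: Blob, threshold: number): 'arrayBuffer' | 'stream' {
  // Smaller than the threshold: buffer the whole file; otherwise stream it
  return blob.size < threshold ? 'arrayBuffer' : 'stream';
}
```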
Example: Always Use Streaming (Memory-Efficient)
```ts
import { parseBlob } from 'web-csv-toolbox';

const largeFile = new Blob([csvData], { type: 'text/csv' });

for await (const record of parseBlob(largeFile, {
  engine: { arrayBufferThreshold: 0 } // Always stream
})) {
  console.log(record);
}
```
Example: Custom Threshold (512KB)
```ts
import { parseBlob } from 'web-csv-toolbox';

for await (const record of parseBlob(file, {
  engine: { arrayBufferThreshold: 512 * 1024 } // 512KB threshold
})) {
  console.log(record);
}
```
Example: Always Use ArrayBuffer (Small Files)
```ts
import { parseBlob } from 'web-csv-toolbox';

const smallFile = new Blob([csvData], { type: 'text/csv' });

for await (const record of parseBlob(smallFile, {
  engine: { arrayBufferThreshold: Infinity } // Always use arrayBuffer
})) {
  console.log(record);
}
```
Security Note:
When using arrayBufferThreshold > 0, ensure files stay below maxBufferSize (default 10MB). Files exceeding this limit will throw a RangeError for security reasons.
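If an upload might exceed that limit, you can route it to the streaming path up front (sketch; the 10MB constant mirrors the default limit described above):

```ts
import { parseBlob } from 'web-csv-toolbox';

const MAX_BUFFER_SIZE = 10 * 1024 * 1024; // default maxBufferSize noted above

// file: a Blob, e.g. from an <input type="file"> element
const engine = file.size >= MAX_BUFFER_SIZE
  ? { arrayBufferThreshold: 0 } // force streaming; avoids the RangeError
  : {};                         // default 1MB threshold applies

for await (const record of parseBlob(file, { engine })) {
  console.log(record);
}
```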
### backpressureCheckInterval 🧪

Type: `{ lexer?: number; assembler?: number }`
Default: `{ lexer: 100, assembler: 10 }`
Status: Experimental

Controls how frequently the internal parsers check for backpressure during streaming operations (count-based: number of tokens/records processed).

⚠️ Advanced Performance Tuning

This is an experimental feature for advanced users. The default values are designed to work well for most scenarios. Only adjust these if profiling indicates a need for tuning or you're experiencing specific performance issues with large streaming operations.

Parameters:

- `lexer` - Check interval for the lexer stage (default: every 100 tokens processed)
- `assembler` - Check interval for the assembler stage (default: every 10 records processed)

Lower values: more frequent checks, which react to backpressure sooner at the cost of slightly more overhead.

Higher values: fewer checks, which reduce overhead but react to backpressure more slowly.
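Conceptually, count-based checking amortizes the cost of consulting the stream's `desiredSize` (illustrative sketch, not the library's internal code):

```ts
// Illustrative: consult backpressure only every `interval` items
function makeBackpressureCheck(interval: number) {
  let count = 0;
  return (controller: TransformStreamDefaultController): boolean => {
    count += 1;
    if (count % interval !== 0) return false; // skip most checks
    // desiredSize <= 0 means the downstream queue is full
    return (controller.desiredSize ?? 1) <= 0;
  };
}
```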
Example: Increase Check Frequency
```ts
import { parseString } from 'web-csv-toolbox';

for await (const record of parseString(csv, {
  engine: {
    backpressureCheckInterval: {
      lexer: 50,   // Check every 50 tokens (more responsive)
      assembler: 5 // Check every 5 records (more responsive)
    }
  }
})) {
  console.log(record);
}
```
Example: Decrease Check Frequency (Performance-Focused)
```ts
for await (const record of parseString(csv, {
  engine: {
    backpressureCheckInterval: {
      lexer: 200,   // Check every 200 tokens (less overhead)
      assembler: 20 // Check every 20 records (less overhead)
    }
  }
})) {
  console.log(record);
}
```
When to Consider Adjusting:

- Profiling shows measurable overhead from backpressure checks in large streaming workloads
- A slow downstream consumer needs the pipeline to react to backpressure sooner
Note: This API may change in future versions based on ongoing performance research.
### queuingStrategy 🧪

Type: `object`
Status: Experimental

Controls the internal queuing behavior of the CSV parser's streaming pipeline.

⚠️ Advanced Performance Tuning

This is an experimental feature for advanced users. The default queuing strategies are designed to balance memory usage and buffering behavior. Only adjust these if profiling indicates a need for tuning or you have specific memory or performance requirements.

Structure:

```ts
{
  lexerWritable?: QueuingStrategy<string>;
  lexerReadable?: QueuingStrategy<Token>;
  assemblerWritable?: QueuingStrategy<Token>;
  assemblerReadable?: QueuingStrategy<CSVRecord<any>>;
}
```

Pipeline Stages:

The CSV parser uses a two-stage pipeline:

1. Lexer: turns input strings into tokens
2. Assembler: turns tokens into CSV records

Each stage has both writable (input) and readable (output) sides:

- Lexer: `lexerWritable` (`string` in) and `lexerReadable` (`Token` out)
- Assembler: `assemblerWritable` (`Token` in) and `assemblerReadable` (`CSVRecord` out)
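To make the mapping concrete, here is a wiring sketch using the Streams API. The transformers and types are placeholders (declared, not implemented); only the position of each `QueuingStrategy` is the point:

```ts
// Placeholders to illustrate wiring only
type Token = unknown;
type CSVRecord<T> = Record<string, string>;
declare const lexerTransformer: Transformer<string, Token>;
declare const assemblerTransformer: Transformer<Token, CSVRecord<any>>;
declare const source: ReadableStream<string>;

const lexer = new TransformStream<string, Token>(
  lexerTransformer,
  new CountQueuingStrategy({ highWaterMark: 64 }), // lexerWritable
  new CountQueuingStrategy({ highWaterMark: 64 })  // lexerReadable
);
const assembler = new TransformStream<Token, CSVRecord<any>>(
  assemblerTransformer,
  new CountQueuingStrategy({ highWaterMark: 16 }), // assemblerWritable
  new CountQueuingStrategy({ highWaterMark: 16 })  // assemblerReadable
);

const records = source.pipeThrough(lexer).pipeThrough(assembler);
```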
Example: Memory-Constrained Environment
```ts
import { parseString } from 'web-csv-toolbox';

for await (const record of parseString(csv, {
  engine: {
    queuingStrategy: {
      // Minimize memory usage with smaller buffers across the entire pipeline
      lexerWritable: new CountQueuingStrategy({ highWaterMark: 1 }),
      lexerReadable: new CountQueuingStrategy({ highWaterMark: 1 }),
      assemblerWritable: new CountQueuingStrategy({ highWaterMark: 1 }),
      assemblerReadable: new CountQueuingStrategy({ highWaterMark: 1 })
    }
  }
})) {
  console.log(record);
}
```
Example: Tuning for Potential High-Throughput Scenarios
```ts
for await (const record of parseString(csv, {
  engine: {
    queuingStrategy: {
      // Larger buffers to allow more buffering
      lexerWritable: new CountQueuingStrategy({ highWaterMark: 200 }),
      lexerReadable: new CountQueuingStrategy({ highWaterMark: 100 }),
      assemblerWritable: new CountQueuingStrategy({ highWaterMark: 100 }),
      assemblerReadable: new CountQueuingStrategy({ highWaterMark: 50 })
    }
  }
})) {
  console.log(record);
}
```
Example: Optimize Token Buffer (Between Lexer and Assembler)
```ts
for await (const record of parseString(csv, {
  engine: {
    queuingStrategy: {
      // Only tune the token transfer between stages
      lexerReadable: new CountQueuingStrategy({ highWaterMark: 2048 }),
      assemblerWritable: new CountQueuingStrategy({ highWaterMark: 2048 })
    }
  }
})) {
  console.log(record);
}
```
Theoretical Trade-offs:

Adjusting `highWaterMark` values affects the balance between memory usage and buffering behavior:

- Higher values buffer more items, smoothing out bursty producers at the cost of memory
- Lower values bound memory and propagate backpressure sooner
Note: The actual performance impact depends on your specific use case, data characteristics, and runtime environment. The default values are designed to work well for most scenarios. Only adjust these settings if profiling indicates a need for tuning.
Potential Use Cases:

- Memory-constrained environments (small buffers throughout)
- High-throughput pipelines (larger buffers)
- Tuning the token hand-off between lexer and assembler
Note: This API may change in future versions based on ongoing performance research.
### EnginePresets

The following examples combine engine options using `EnginePresets`.

Example: balanced preset with a custom worker pool

```ts
import { ReusableWorkerPool, EnginePresets } from 'web-csv-toolbox';

const pool = new ReusableWorkerPool({ maxWorkers: 4 });

const config = EnginePresets.balanced({
  workerPool: pool
});
```
Why: Reuses a bounded pool of workers across parses instead of spawning a new worker per parse.
Example: responsiveFast preset (preload WASM first)

```ts
import { EnginePresets, loadWASM } from 'web-csv-toolbox';

await loadWASM();

const config = EnginePresets.responsiveFast();
```
Why: Preloading WASM avoids first-parse initialization latency (see the wasm initialization notes above).
Example: responsive preset

```ts
const config = EnginePresets.responsive();
```
Why: Prioritizes main-thread responsiveness with the preset's default settings.
Example: balanced preset with experimental tuning options

```ts
import { EnginePresets } from 'web-csv-toolbox';

const config = EnginePresets.balanced({
  arrayBufferThreshold: 2 * 1024 * 1024, // 2MB threshold
  backpressureCheckInterval: {
    lexer: 50,   // Check every 50 tokens (more responsive)
    assembler: 5 // Check every 5 records (more responsive)
  },
  queuingStrategy: {
    // Tune the entire pipeline with larger buffers
    lexerWritable: new CountQueuingStrategy({ highWaterMark: 200 }),
    lexerReadable: new CountQueuingStrategy({ highWaterMark: 100 }),
    assemblerWritable: new CountQueuingStrategy({ highWaterMark: 100 }),
    assemblerReadable: new CountQueuingStrategy({ highWaterMark: 50 })
  }
});
```
⚠️ Note: These are experimental APIs that may change in future versions.

Example: memory-efficient configuration
```ts
import { EnginePresets } from 'web-csv-toolbox';

const config = EnginePresets.balanced({
  arrayBufferThreshold: 0, // Always use streaming
  backpressureCheckInterval: {
    lexer: 10,   // Check every 10 tokens (frequent checks)
    assembler: 5 // Check every 5 records (frequent checks)
  },
  queuingStrategy: {
    // Minimal buffers throughout the entire pipeline
    lexerWritable: new CountQueuingStrategy({ highWaterMark: 1 }),
    lexerReadable: new CountQueuingStrategy({ highWaterMark: 1 }),
    assemblerWritable: new CountQueuingStrategy({ highWaterMark: 1 }),
    assemblerReadable: new CountQueuingStrategy({ highWaterMark: 1 })
  }
});
```
Why: Streams all input and keeps internal buffers minimal, trading throughput for the lowest peak memory usage.
### ReusableWorkerPool

Constructor:

```ts
new ReusableWorkerPool(options?: { maxWorkers?: number })
```

Options:

```ts
interface WorkerPoolOptions {
  maxWorkers?: number; // Default: 1
}
```

Example:

```ts
const pool = new ReusableWorkerPool({ maxWorkers: 4 });
```
### isFull()

Check if the pool has reached maximum capacity.

Returns: `boolean`

Example:

```ts
// `c` is your framework's request context (e.g., Hono)
if (pool.isFull()) {
  return c.json({ error: 'Service busy' }, 503);
}
```
### terminate()

Terminate all workers and clean up resources.

Example:

```ts
app.onShutdown(() => {
  pool.terminate();
});
```
### getWorker()

Get a worker from the pool (internal use).

Returns: `Promise<Worker>`

### size

Get the current number of active workers.

Type: `number` (read-only)

Example:

```ts
console.log(`Active workers: ${pool.size}`);
```
Platform Notes: Worker execution uses the Web Workers API in browsers and Worker Threads in Node.js.
| Configuration | Init Cost | Parse Speed | Memory | UI Blocking |
|---|---|---|---|---|
| `{ worker: false }` | None | Baseline | Low | Yes |
| `{ worker: true }` | Low (worker init) | Baseline | Low | No |
| `{ wasm: true }` | Very Low | Faster | Low | Yes |
| `{ worker: true, wasm: true }` | Low (worker init) | Faster | Low | No |
Note: Actual performance varies based on hardware, runtime, and CSV complexity. See CodSpeed benchmarks for measured results.
✅ Use workers when:

- Parsing large files in the browser where UI responsiveness matters
- The main thread must stay free for rendering and user input

❌ Skip workers when:

- Inputs are small and worker initialization cost outweighs the benefit
- Blocking the current thread is acceptable (for example, one-off scripts)
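As a rough decision helper (illustrative; the 1MB cutoff is an assumption, not library guidance):

```ts
// Offload large parses to a worker; keep small inputs on the main thread
function chooseEngine(byteSize: number) {
  return byteSize > 1024 * 1024 ? { worker: true } : { worker: false };
}
```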
### Error Handling

Example: strict worker configuration

```ts
import { parseString } from 'web-csv-toolbox';

try {
  for await (const record of parseString(csv, {
    // strict requires workerStrategy: 'stream-transfer' (see strict above)
    engine: { worker: true, workerStrategy: 'stream-transfer', strict: true }
  })) {
    console.log(record);
  }
} catch (error) {
  if (error instanceof Error && error.message.includes('Worker')) {
    console.error('Workers not available, falling back...');
    // Handle fallback
  }
}
```
Example: WASM initialization failure

```ts
import { parseString, loadWASM } from 'web-csv-toolbox';

let wasmReady = true;
try {
  await loadWASM();
} catch (error) {
  console.error('WASM failed to load:', error);
  wasmReady = false; // Fall back to the non-WASM engine
}

for await (const record of parseString(csv, {
  engine: { wasm: wasmReady }
})) {
  console.log(record);
}
```