CSV: The Simplest Data Format
- Origin: CSV predates personal computers. IBM Fortran compilers on mainframes supported comma-separated input as early as 1972
- No standard: RFC 4180 exists but is just a guideline—Excel, Google Sheets, and databases all do it slightly differently
- Still dominant: As of 2024, CSV remains one of the most widely used data exchange formats, especially among non-developers
Delimiter Options Explained
- Comma (,): Standard CSV, but fields that contain commas (addresses, numbers with thousand separators) must be quoted or they split into extra columns
- Semicolon (;): Common in European locales, where the comma is the decimal separator (1.234,56) and spreadsheet exports use semicolons instead
- Tab (\t): TSV format. A safe choice because tabs rarely appear in data; Excel's 'Text (Tab delimited)' save option uses this
- Pipe (|): Database dumps and legacy systems—visible separator that almost never appears in data
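When the delimiter is not known in advance, one rough approach is to count each candidate on the first line and pick the most frequent. The sketch below assumes that heuristic; guessDelimiter is a hypothetical helper name, not part of this tool or of any library, and ties fall back to the comma.

```javascript
// Hypothetical helper: guess the delimiter from the first line of a file.
// Counts each candidate outside double quotes and picks the most frequent.
function guessDelimiter(text, candidates = [",", ";", "\t", "|"]) {
  const firstLine = text.split(/\r\n|\r|\n/, 1)[0];
  const counts = new Map(candidates.map((c) => [c, 0]));
  let inQuotes = false;
  for (const ch of firstLine) {
    if (ch === '"') {
      inQuotes = !inQuotes; // an escaped "" toggles twice, so the state is unchanged
    } else if (!inQuotes && counts.has(ch)) {
      counts.set(ch, counts.get(ch) + 1);
    }
  }
  // Ties (including "no candidate found") keep the earliest candidate, the comma.
  let best = candidates[0];
  for (const c of candidates) {
    if (counts.get(c) > counts.get(best)) best = c;
  }
  return best;
}

console.log(JSON.stringify(guessDelimiter("name;price\nWidget;1.234,56"))); // ";"
```

A single line is a small sample, so a real detector would probably check several rows and confirm that the column count stays consistent.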
CSV Gotchas This Tool Handles
- Quoted fields: "New York, NY" stays together despite the comma inside
- Escaped quotes: "She said ""Hello""" → She said "Hello"
- Empty fields: a,,c → three fields; the middle one is an empty string
- Trailing commas: Some exports add an extra comma at the end of each line; we handle it gracefully
- Mixed line endings: Windows (CRLF), classic Mac OS (CR), Unix and modern macOS (LF) are all supported
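To show how the quoting rules above fit together, here is a minimal character-by-character parser. It is a sketch written for illustration and not this tool's actual code; parseCsv is a hypothetical name, and the function returns plain arrays of strings without treating the first row as a header.

```javascript
// Minimal sketch of a quote-aware CSV parser (hypothetical, for illustration).
// Returns an array of rows, each an array of strings.
function parseCsv(text, delimiter = ",") {
  const rows = [];
  let row = [];
  let field = "";
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') {
        field += '"'; // "" inside a quoted field is a literal quote
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        field += ch; // delimiters and line breaks are kept verbatim
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === delimiter) {
      row.push(field);
      field = "";
    } else if (ch === "\r" || ch === "\n") {
      if (ch === "\r" && text[i + 1] === "\n") i++; // treat CRLF as one break
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else {
      field += ch;
    }
  }
  // Flush the last record when the input does not end with a line break.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

console.log(parseCsv('city,quote\r\n"New York, NY","She said ""Hello"""\na,,c'));
```

In this sketch a trailing comma simply produces one extra empty field at the end of the row, which the caller can drop if the header has no matching column.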
Why Convert CSV to JSON?
- APIs: Most REST APIs expect JSON; few accept CSV directly
- JavaScript: JSON.parse() is native; CSV needs a library or custom parser
- Types: JSON distinguishes numbers, booleans, and null; CSV treats every value as a string
- Nesting: JSON supports hierarchy; CSV is flat tables only
- Databases: MongoDB, Elasticsearch, Firebase all store JSON or JSON-like documents
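Because every CSV cell arrives as a string, a converter has to decide which cells become numbers, booleans, or null. The sketch below shows one possible set of rules; inferValue and rowToObject are hypothetical helpers, and the rules themselves (empty cell to null, leading zeros kept as strings) are assumptions rather than this tool's documented behavior.

```javascript
// Hypothetical helper: turn a CSV cell (always a string) into a JSON-friendly value.
// The rules below are assumptions; real data often needs per-column rules instead.
function inferValue(cell) {
  const s = cell.trim();
  if (s === "") return null;
  if (s === "true") return true;
  if (s === "false") return false;
  // Plain decimal numbers only; leading zeros (ZIP codes, IDs) stay strings.
  if (/^-?(0|[1-9]\d*)(\.\d+)?$/.test(s)) return Number(s);
  return cell;
}

// Combine a header row and a data row into one object.
function rowToObject(headers, row) {
  return Object.fromEntries(headers.map((h, i) => [h, inferValue(row[i] ?? "")]));
}

const record = rowToObject(
  ["name", "age", "active", "zip", "note"],
  ["Alice", "30", "true", "02134", ""]
);
console.log(JSON.stringify(record));
// {"name":"Alice","age":30,"active":true,"zip":"02134","note":null}
```

Keeping values with leading zeros as strings is a deliberate choice here, since converting "02134" to a number would silently change the data.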
Common CSV Sources
- Excel/Google Sheets: In Google Sheets, File → Download → Comma Separated Values (.csv); in Excel, File → Save As → CSV
- Database exports: MySQL, PostgreSQL, SQLite all support CSV export
- Bank statements: Most banks offer CSV transaction downloads
- CRM exports: Salesforce, HubSpot, Mailchimp contact exports
- Analytics: Google Analytics, Mixpanel, Amplitude data exports
- Government data: Census, FDA, SEC—public data often in CSV
JSON Output Formats
- Array of objects: [{"name": "Alice", "age": 30}, ...] is the most common shape; each row becomes an object
- Pretty printed: Indented with newlines—human readable, larger file size
- Minified: No whitespace—smaller file, harder to debug
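In JavaScript, the pretty-printed and minified variants differ only in the third argument to JSON.stringify. A short illustration with made-up sample rows:

```javascript
// Sample rows (illustrative data), shaped as an array of objects.
const people = [
  { name: "Alice", age: 30 },
  { name: "Bob", age: 25 },
];

const minified = JSON.stringify(people);        // no whitespace
const pretty = JSON.stringify(people, null, 2); // 2-space indentation

console.log(minified);
// [{"name":"Alice","age":30},{"name":"Bob","age":25}]
console.log(pretty);
console.log(minified.length < pretty.length); // true
```

Both strings parse back to the same data, so the choice only affects file size and readability.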
Pro Tips
- Clean headers first: Spaces become awkward keys—'First Name' → 'firstName' is better
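- Check for BOM: Excel's 'CSV UTF-8' export adds an invisible \uFEFF at the start of the file; left in place, it becomes part of the first header name and can break key lookups
- Large files: For files of 100MB or more, use a streaming parser (Papa Parse, csv-parse) rather than loading the whole file at once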
- Date formats: CSV dates are strings—you'll need to parse them after conversion
- Number precision: JSON numbers become floating-point values in JavaScript, where 0.1 + 0.2 evaluates to 0.30000000000000004. For financial data, keep amounts as strings or integer cents
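A few of the tips above can be sketched as small helpers. All three function names are hypothetical and not part of any library; toCamelCase assumes ASCII headers, and toCents assumes plain decimal amounts with at most two decimal places and no thousand separators.

```javascript
// Hypothetical helpers illustrating the tips above.

// Remove a leading byte order mark (\uFEFF), if present.
function stripBom(text) {
  return text.charCodeAt(0) === 0xfeff ? text.slice(1) : text;
}

// 'First Name' -> 'firstName'. Splits on anything that is not an ASCII letter or digit.
function toCamelCase(header) {
  const words = header.trim().split(/[^A-Za-z0-9]+/).filter(Boolean);
  return words
    .map((w, i) =>
      i === 0 ? w.toLowerCase() : w[0].toUpperCase() + w.slice(1).toLowerCase())
    .join("");
}

// Parse a plain decimal amount such as "19.99" into integer cents,
// avoiding binary floating-point rounding. Assumes at most two decimals.
function toCents(amount) {
  const m = /^(-?)(\d+)(?:\.(\d{1,2}))?$/.exec(amount.trim());
  if (!m) throw new Error(`Unsupported amount: ${amount}`);
  const cents = Number(m[2]) * 100 + Number((m[3] ?? "").padEnd(2, "0"));
  return m[1] === "-" ? -cents : cents;
}

const headers = stripBom("\uFEFFFirst Name,Order Total").split(",").map(toCamelCase);
console.log(JSON.stringify(headers)); // ["firstName","orderTotal"]
console.log(toCents("0.10") + toCents("0.20")); // 30
```

Working in integer cents keeps sums exact; the amounts can be formatted back to decimals only when they are displayed.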