CSV: The Simplest Data Format

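  • Origin: CSV predates personal computers: IBM mainframe Fortran compilers supported comma-separated values as early as 1972
  • No binding standard: RFC 4180 is an informational document, not a formal standard, so Excel, Google Sheets, and databases each handle quoting, encodings, and line endings slightly differently
  • Still dominant: In 2024, CSV remains one of the most widely used data exchange formats, especially among non-developers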

Delimiter Options Explained

  • Comma (,): Standard CSV—but breaks when data contains commas (addresses, numbers with thousand separators)
  • Semicolon (;): Common in Europe: locales that use a comma as the decimal separator (1.234,56) typically use a semicolon as the CSV delimiter
  • Tab (\t): TSV format: a safe choice because tabs rarely appear in data. Excel's 'Text (Tab delimited)' save option produces it
  • Pipe (|): Database dumps and legacy systems—visible separator that almost never appears in data
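
The delimiter choices above can be tried out with a short sketch. It uses Python's standard csv module; the helper names guess_delimiter and parse_delimited are hypothetical, and the guessing heuristic (count each candidate in the header line) is a deliberately naive assumption, not how this tool necessarily works.

```python
import csv
import io

# Candidate delimiters, in the order listed above.
CANDIDATES = [",", ";", "\t", "|"]

def guess_delimiter(first_line):
    # Naive heuristic: the candidate that occurs most often in the header line wins.
    # It ignores quoting, and ties go to the earlier candidate, so a header with
    # no delimiter at all falls back to ",".
    return max(CANDIDATES, key=first_line.count)

def parse_delimited(text, delimiter=None):
    # Parse text into a list of rows, guessing the delimiter when none is given.
    if delimiter is None:
        delimiter = guess_delimiter(text.splitlines()[0] if text else "")
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))

# A semicolon-delimited export keeps the decimal comma intact.
european = parse_delimited("name;price\nWidget;1.234,56\n")
```

Passing the delimiter explicitly, as in parse_delimited(text, delimiter="|"), is the more reliable option whenever the source of the file is known.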

CSV Gotchas This Tool Handles

  • Quoted fields: "New York, NY" stays together despite the comma inside
  • Escaped quotes: "She said ""Hello""" → She said "Hello"
  • Empty fields: a,,c → three fields, middle one is empty string
  • Trailing commas: Some exports end each row with an extra comma; we handle it gracefully
  • Mixed line endings: Windows (CRLF), classic Mac OS (CR), and Unix or modern macOS (LF) are all supported
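
As an illustration of the cases above, here is a minimal sketch using Python's standard csv module rather than this tool's own parser. The function name parse_csv and the sample data are made up for the example.

```python
import csv
import io

def parse_csv(text):
    # newline="" passes line endings through untranslated, so the csv module
    # sees CRLF, CR and LF row terminators as they are and handles each one.
    return list(csv.reader(io.StringIO(text, newline="")))

sample = (
    'city,quote,note\r\n'                      # row ended by CRLF
    '"New York, NY","She said ""Hello""",\r'   # row ended by CR, last field empty
    'a,,c\n'                                   # row ended by LF, middle field empty
)

rows = parse_csv(sample)
# rows[1] == ['New York, NY', 'She said "Hello"', '']
# rows[2] == ['a', '', 'c']
```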

Why Convert CSV to JSON?

  • APIs: Most REST APIs expect JSON request bodies; relatively few accept CSV directly
  • JavaScript: JSON.parse() is native; CSV needs a library or custom parser
  • Types: JSON distinguishes numbers, booleans, and null; in CSV every value is a string
  • Nesting: JSON supports hierarchy; CSV is flat tables only
  • Databases: MongoDB, Elasticsearch, and Firebase all store JSON-style documents
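
Because every CSV value arrives as a string, a converter has to decide which strings become JSON numbers, booleans, or null. The sketch below shows one possible set of rules. The function coerce is hypothetical, and mapping an empty field to null is an assumption (keeping it as an empty string is equally defensible).

```python
import json
import re

# Patterns follow the JSON number grammar, so "007" or "1_000" stay strings.
_INT = re.compile(r"-?(0|[1-9][0-9]*)")
_FLOAT = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?")

def coerce(value):
    # Turn one CSV string into the closest JSON-compatible Python value.
    if value == "":
        return None                 # assumption: empty field becomes null
    if value in ("true", "false"):
        return value == "true"
    if _INT.fullmatch(value):
        return int(value)
    if _FLOAT.fullmatch(value):
        return float(value)
    return value                    # anything else stays a string

row = {"name": "Alice", "age": "30", "active": "true", "zip": "007", "note": ""}
typed = {key: coerce(value) for key, value in row.items()}
as_json = json.dumps(typed)
```

Leading zeros are left alone on purpose: identifiers such as ZIP codes or account numbers look numeric but lose information when converted.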

Common CSV Sources

  • Excel/Google Sheets: In Google Sheets, File → Download → Comma-separated values (.csv); in Excel, File → Save As → CSV
  • Database exports: MySQL, PostgreSQL, SQLite all support CSV export
  • Bank statements: Most banks offer CSV transaction downloads
  • CRM exports: Salesforce, HubSpot, Mailchimp contact exports
  • Analytics: Google Analytics, Mixpanel, Amplitude data exports
  • Government data: Census, FDA, SEC—public data often in CSV

JSON Output Formats

  • Array of objects: [{"name": "Alice", "age": 30}, ...] is the most common; each row becomes an object keyed by the column headers
  • Pretty printed: Indented with newlines—human readable, larger file size
  • Minified: No whitespace—smaller file, harder to debug
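
The output formats above can be reproduced with Python's standard library. This is a sketch, not the tool's implementation; the helper name csv_to_objects and the sample rows are invented for the example, and values are left as strings.

```python
import csv
import io
import json

def csv_to_objects(text):
    # DictReader uses the first row as keys and yields one dict per data row.
    return list(csv.DictReader(io.StringIO(text, newline="")))

rows = csv_to_objects("name,age\nAlice,30\nBob,25\n")

# Pretty printed: two-space indentation, one key per line.
pretty = json.dumps(rows, indent=2)

# Minified: compact separators remove the spaces after "," and ":".
minified = json.dumps(rows, separators=(",", ":"))
```

Both strings decode to the same data; only the whitespace differs, so the minified form is always the shorter of the two.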

Pro Tips

  • Clean headers first: Spaces become awkward keys—'First Name' → 'firstName' is better
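  • Check for BOM: Excel's 'CSV UTF-8' export adds an invisible \uFEFF at the start of the file, which can attach itself to the first header name and break parsing
  • Large files: For files of 100 MB or more, use a streaming parser (Papa Parse, csv-parse) instead of loading the whole file into memory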
  • Date formats: CSV dates are strings—you'll need to parse them after conversion
  • Number precision: JSON numbers are usually parsed as binary floating point, so 0.1 + 0.2 evaluates to 0.30000000000000004. For financial data, keep amounts as strings or integer cents
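
Several of these tips fit in a few lines of Python. The sketch below is illustrative only: strip_bom, to_camel_case, and parse_iso_date are hypothetical helpers, and the date helper assumes the YYYY-MM-DD layout, which real exports often do not use.

```python
import re
from datetime import datetime
from decimal import Decimal

def strip_bom(text):
    # Drop a leading U+FEFF if present; leave other text untouched.
    return text[1:] if text.startswith("\ufeff") else text

def to_camel_case(header):
    # 'First Name' -> 'firstName'; punctuation and spaces act as word breaks.
    words = re.findall(r"[A-Za-z0-9]+", header)
    if not words:
        return ""
    rest = "".join(w[0].upper() + w[1:].lower() for w in words[1:])
    return words[0].lower() + rest

def parse_iso_date(value):
    # Assumes YYYY-MM-DD; other layouts need a different format string.
    return datetime.strptime(value, "%Y-%m-%d").date()

# Header line as it might look in a BOM-prefixed export.
headers = strip_bom("\ufeffFirst Name,Order Total").split(",")
keys = [to_camel_case(h) for h in headers]

# Binary floats drift; Decimal built from the original strings does not.
float_sum = 0.1 + 0.2
decimal_sum = Decimal("0.1") + Decimal("0.2")
```

When reading from a file in Python, opening it with encoding="utf-8-sig" removes the BOM automatically, which makes the manual strip unnecessary.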