X Bookmarks — 2024 KW34: AI doing the CSV-wrangling grunt work

August 22, 2024

by Florian Narr

@illyism — AI CSV importer in 5 minutes

  1. Copy/paste CSV into textarea
  2. Parse locally with PapaParse
  3. Auto-match columns with AI
  4. Bulk insert data into database

This used to take me two months & a team of engineers.

Now, just 5 minutes to build yourself 🤯

That's a fair claim for once. The two-month version was never just about writing code: it was requirements gathering, edge-case handling, QA, and a dedicated importer screen with a hand-built mapping UI. Collapsing that into PapaParse, one AI call for column matching, and a bulk insert is genuinely tight. The AI step takes over the hardest part, fuzzy column reconciliation, so you don't have to enumerate every possible header variation.
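To make the column-matching step a bit more tangible, here is a minimal sketch of the glue around it. It is not @illyism's code. The target fields, the alias table, and every function name are my own assumptions, and the AI call is swapped for a dumb alias lookup so the snippet runs without a model. The rows are assumed to be objects keyed by CSV header, which is what PapaParse returns when you parse with `header: true`.

```typescript
// Target columns of a hypothetical `transactions` table (assumption, not from the tweet).
const TARGET_FIELDS = ["date", "description", "amount"] as const;
type TargetField = (typeof TARGET_FIELDS)[number];
type Mapping = Partial<Record<TargetField, string>>;

// Shape of the "AI" step: given the CSV headers, suggest which header feeds which field.
type SuggestMapping = (headers: string[]) => Mapping;

// Stand-in for the model call: a fixed alias table (hypothetical values).
const ALIASES: Record<TargetField, string[]> = {
  date: ["date", "bookingdate", "transactiondate"],
  description: ["description", "memo", "details"],
  amount: ["amount", "value", "total"],
};

// Lowercase and strip everything except letters and digits.
const normalize = (s: string): string => s.toLowerCase().replace(/[^a-z0-9]/g, "");

const aliasMatcher: SuggestMapping = (headers) => {
  const mapping: Mapping = {};
  for (const field of TARGET_FIELDS) {
    const hit = headers.find((h) => ALIASES[field].includes(normalize(h)));
    if (hit !== undefined) mapping[field] = hit;
  }
  return mapping;
};

// Whatever produced the suggestion, drop any column the CSV does not actually contain.
function validateMapping(mapping: Mapping, headers: string[]): Mapping {
  const valid: Mapping = {};
  for (const field of TARGET_FIELDS) {
    const col = mapping[field];
    if (col !== undefined && headers.includes(col)) valid[field] = col;
  }
  return valid;
}

// Reshape parsed rows (objects keyed by CSV header) into insert-ready records.
function toRecords(
  rows: Record<string, string | undefined>[],
  mapping: Mapping,
): Partial<Record<TargetField, string>>[] {
  return rows.map((row) => {
    const record: Partial<Record<TargetField, string>> = {};
    for (const field of TARGET_FIELDS) {
      const col = mapping[field];
      if (col !== undefined) record[field] = row[col] ?? "";
    }
    return record;
  });
}

const headers = ["Booking Date", "Memo", "Total"];
const mapping = validateMapping(aliasMatcher(headers), headers);
const records = toRecords(
  [{ "Booking Date": "2024-08-22", Memo: "Coffee", Total: "3.50" }],
  mapping,
);
console.log(mapping, records);
```

Swapping `aliasMatcher` for a real model call shouldn't change anything downstream, as long as the suggestion still goes through `validateMapping` before the bulk insert.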


@pontusab — 40 lines to categorize transactions

It's literally just 40 lines of code to categorize transaction names using the Vercel AI SDK.

Smart, because transaction categorization is one of those tasks that used to demand a trained classifier or a hand-curated regex list, and neither aged well. Handing it to an LLM via the AI SDK sidesteps most of that maintenance burden, and at 40 lines there's barely any abstraction to learn. Worth saving for the next time someone proposes a full ML pipeline for a labeling problem.
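I haven't seen the 40 lines themselves, so the following is only a rough sketch of the shape such a categorizer tends to take: a closed list of categories, a prompt, and a parser that forces the model's free-text answer back into that list. The category names and all function names are hypothetical, and the model call is an injected function rather than a real AI SDK call, so the snippet runs offline.

```typescript
// Closed set of labels (hypothetical; pick whatever your app needs).
const CATEGORIES = ["groceries", "travel", "software", "rent", "uncategorized"] as const;
type Category = (typeof CATEGORIES)[number];

// Stand-in for the model call: prompt in, raw text out.
type Complete = (prompt: string) => Promise<string>;

function buildPrompt(name: string): string {
  return [
    "Classify the bank transaction into exactly one category.",
    `Allowed categories: ${CATEGORIES.join(", ")}`,
    `Transaction: ${name}`,
    "Answer with the category name only.",
  ].join("\n");
}

// Models return free text, so coerce the answer back into the closed set.
function parseCategory(raw: string): Category {
  const cleaned = raw.trim().toLowerCase().replace(/[^a-z]/g, "");
  const hit = CATEGORIES.find((c) => c === cleaned);
  return hit ?? "uncategorized";
}

async function categorize(name: string, complete: Complete): Promise<Category> {
  try {
    return parseCategory(await complete(buildPrompt(name)));
  } catch {
    // A failed model call should not block the import; fall back to the catch-all label.
    return "uncategorized";
  }
}

// Usage with a fake model that always answers "Groceries."
const fakeModel: Complete = async () => "Groceries.";
categorize("REWE Markt Berlin", fakeModel).then((c) => console.log(c));
```

The part worth keeping regardless of SDK is `parseCategory`: anything outside the allowed list collapses to `uncategorized` instead of leaking a made-up label into the database.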


@pontusab — AI-based CSV importer architecture

How our AI based CSV Importer works:

  • Select a few rows with Papaparse
  • Generate mappings via Vercel AI SDK
  • Stream to React Hook Form
  • Save to database

The detail that makes this concrete is streaming the AI-generated mappings directly into React Hook Form. That means the user sees the column suggestions populate in real time and can correct them before submitting, instead of waiting for the full response and then re-rendering. Good UX call, and it's a pattern that transfers to any form that benefits from AI-assisted prefill.
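Here is a small sketch of how that streaming prefill could be wired up, under a few assumptions of mine: the stream yields progressively more complete mapping objects, the form exposes a `setValue(name, value)` method (React Hook Form's `setValue` has that shape), and fields the user has already edited by hand are left alone. The field names, `FormLike`, and the fake stream are all hypothetical stand-ins, not @pontusab's implementation.

```typescript
type MappingFields = { date?: string; description?: string; amount?: string };
type FieldName = keyof MappingFields;

// The slice of a form API this sketch needs (hypothetical interface).
interface FormLike {
  setValue(name: FieldName, value: string): void;
}

const FIELD_NAMES: FieldName[] = ["date", "description", "amount"];

// Apply one partial snapshot; skip fields the user already edited by hand.
// Returns the names that were written, which is handy for debugging.
function applyPartial(
  form: FormLike,
  partial: MappingFields,
  userEdited: Set<FieldName>,
): FieldName[] {
  const applied: FieldName[] = [];
  for (const name of FIELD_NAMES) {
    const value = partial[name];
    if (value !== undefined && !userEdited.has(name)) {
      form.setValue(name, value);
      applied.push(name);
    }
  }
  return applied;
}

// Drain a stream of growing partial objects into the form as they arrive.
async function streamIntoForm(
  stream: AsyncIterable<MappingFields>,
  form: FormLike,
  userEdited: Set<FieldName>,
): Promise<void> {
  for await (const partial of stream) {
    applyPartial(form, partial, userEdited);
  }
}

// Fake stream standing in for the model output.
async function* fakeStream(): AsyncGenerator<MappingFields> {
  yield { date: "Booking Date" };
  yield { date: "Booking Date", description: "Memo" };
  yield { date: "Booking Date", description: "Memo", amount: "Total" };
}

// Usage: a form stub that just logs each write.
const loggingForm: FormLike = {
  setValue: (name, value) => console.log(`set ${name} = ${value}`),
};
void streamIntoForm(fakeStream(), loggingForm, new Set<FieldName>());
```

The `userEdited` guard is the design choice I'd keep: without it, a late chunk from the model can silently overwrite a correction the user made while the stream was still running.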


@pontusab — Local embeddings with Supabase edge functions

How we generate local embeddings with @supabase:

  • Upload a new file to Supabase storage ⛅
  • Trigger an edge function with gte-small model ⚡
  • Save the result to the documents table 📂

Honestly, the most underrated part here is gte-small. Running a small embedding model inside the edge function avoids the round-trip to an external embeddings API, keeps the data inside your own Supabase project, and cuts the marginal cost to near zero for moderate volumes. Supabase edge functions support this natively now, which makes the whole pipeline feel like a first-class feature rather than a workaround.
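As a rough sketch of the upload, embed, save flow: the snippet below keeps the embedding model and the database write as injected functions so it runs anywhere, which means it is not actual edge-function code. The row shape, `handleUpload`, and the other names are hypothetical. The one hard fact baked in is that gte-small produces 384-dimensional vectors, so the sketch checks for that before saving.

```typescript
// gte-small produces 384-dimensional embeddings.
const GTE_SMALL_DIMS = 384;

// Injected dependencies (hypothetical signatures): the model call and the DB write.
type Embed = (text: string) => Promise<number[]>;
type DocumentRow = { name: string; content: string; embedding: number[] };
type SaveRow = (row: DocumentRow) => Promise<void>;

// Guard against silently storing a vector that came from the wrong model.
function toDocumentRow(name: string, content: string, embedding: number[]): DocumentRow {
  if (embedding.length !== GTE_SMALL_DIMS) {
    throw new Error(`expected ${GTE_SMALL_DIMS} dimensions, got ${embedding.length}`);
  }
  return { name, content, embedding };
}

// What the function would do once an upload triggers it:
// embed the file content, validate the vector, then persist the row.
async function handleUpload(
  name: string,
  content: string,
  embed: Embed,
  save: SaveRow,
): Promise<DocumentRow> {
  const row = toDocumentRow(name, content, await embed(content));
  await save(row);
  return row;
}

// Usage with stubs: a fake model returning a zero vector and an in-memory "table".
const table: DocumentRow[] = [];
const fakeEmbed: Embed = async () => new Array<number>(GTE_SMALL_DIMS).fill(0);
const fakeSave: SaveRow = async (row) => {
  table.push(row);
};
handleUpload("notes.txt", "hello world", fakeEmbed, fakeSave).then((row) =>
  console.log(row.name, row.embedding.length),
);
```

In a real edge function, `embed` would wrap the runtime's built-in inference session and `save` would be an insert into the documents table; the dimension check also has to match the vector column size if you store embeddings with pgvector.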