💰 Your ML Pipeline Is Burning Cash You Already Spent

That Raspberry Pi collecting dust could replace your $500/month cloud ML bill

Welcome to GigaElixir Gazette, your 5-minute digest of Elixir ecosystem news that actually matters 👋.

. WEEKLY PICKS .

🎯 ReqLLM Ships 1.0 with 125+ Models and Provider-Agnostic API: Mike Hostetler's library went production-ready after 21 contributors stress-tested it across 332 commits. Swap Claude for GPT-4 without rewriting code—fixture-based tests catch when providers change APIs. Key architecture: Finch handles streaming's persistent connections, Req handles everything else. Model registry auto-syncs from models.dev. Production companies already running it.
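
The whole pitch fits in two lines. A minimal sketch, assuming ReqLLM exposes a generate_text entry point and provider:model strings (check the project docs for the exact names); swapping providers is just a different string:

```elixir
# Hedged sketch: exact function names and model identifiers may differ; see ReqLLM's docs.
{:ok, response} = ReqLLM.generate_text("anthropic:claude-sonnet-4", "Summarize this ticket")

# Same call, different provider, no rewrite:
{:ok, response} = ReqLLM.generate_text("openai:gpt-4o", "Summarize this ticket")
```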

💻 Observer Traces Memory Leaks Without Vendor Tools: AppSignal tutorial shows tracing GenServer memory issues using Erlang's Observer GUI. Creates an intentional leak (binaries accumulating every 2 seconds), traces it with Observer's trace patterns, and watches bin_vheap_size grow from 7MB to 372MB. Two-node setup keeps measurement overhead off production. No SaaS required for production debugging.
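
If you've never driven Observer across nodes, here is a minimal sketch of the two-node idea, with a placeholder cookie and node name: run the GUI from a local node so its overhead stays off production.

```elixir
# From an iex session started with a node name, e.g. `iex --sname debug`.
# Cookie and node name below are placeholders; substitute your release's values.
Node.set_cookie(:my_release_cookie)
Node.connect(:"app@prod-host")

# Start the Observer GUI locally, then pick the remote node from its Nodes menu
# and watch bin_vheap_size in the process view.
:observer.start()
```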

📖 National Bridge Player Explains Falling for Erlang After Rejecting Programming: Bora Gönül's path from "X = X + 1 is a lie" at age eight to discovering distributed message passing. Turning point: watching two Erlang nodes on different machines exchange messages without HTTP or serialization. Just actors talking. Traded national-level bridge for this. Philosophy that stuck: let it crash, processes are cheap, share nothing.

🎮 MMORPG Running on Elixir Backend and Godot Frontend: Real-time multiplayer architecture with Elixir handling server state and player coordination while Godot renders client-side. Operational game proving BEAM's fit for stateful concurrent servers. Comments discuss WebSocket sync patterns and how GenServers map to game entities. Discord and Bleacher Report already learned this: BEAM coordinates thousands of concurrent connections without JVM complexity.

🔧 PingCRM Demo Shows Inertia.js SSR with Phoenix: Developer zekedou ports PingCRM to Combo (Phoenix fork) with full-stack patterns—auth, CRUD, forms, file uploads, SSR through Inertia.js. React/Vue/Svelte frontend with Phoenix backend, no separate API layer. Inertia passes props directly to components server-side, eliminating JSON:API ceremony. Fills documentation gap in official resources.

Your ML Pipeline Is Burning Cash You Already Spent

I keep seeing the same pattern at meetups. Developer shows their cloud ML bill—$3,650 yearly for object detection. Then I ask: "What's that Pi on your desk?" Suddenly they remember they already paid for the compute. Victoria from Driver Technologies fought this for months—hardware sitting there, bills piling up, procrastination winning. Her breakthrough came at a meetup: "You got that working?" She hadn't. Embarrassment beat inertia.

ML needs GPUs, training needs cloud scale, just call the API—the instinct seems obvious. Anthropic, OpenAI, whoever. Pay per inference. Let them handle complexity. Then the bills arrive and you remember that $75 Raspberry Pi has a PCIe lane. Plug in a $120 Hailo AI accelerator hat. Boot Nerves. Load YOLOv8. Suddenly you're processing camera feeds at 50 frames per second with single-digit millisecond latency—zero cloud costs, zero network lag, zero video leaving your network. Under $200 total.

Here's what they miss: cloud ML APIs optimize for vendor revenue, not your operational costs. Pay-per-inference works for unpredictable spiky load. It's catastrophically expensive for continuous inference—security cameras running 24/7, retail analytics processing constant foot traffic, industrial quality control checking every manufactured part. Edge computing inverts the equation: high upfront hardware cost, near-zero marginal inference cost.
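
The back-of-envelope math, using the figures above (the $3,650/year bill and roughly $200 of hardware), makes the inversion concrete:

```elixir
# Break-even for the edge rig vs. the cloud bill quoted above.
cloud_per_month = 3650 / 12                                   # ~$304/month
hardware_cost = 75 + 120                                      # Pi 5 + Hailo accelerator, ~$195
weeks_to_break_even = hardware_cost / (cloud_per_month / 4.33)
# ~2.8 weeks; after that, the marginal cost per inference is effectively zero.
```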

Building the pipeline showed why Elixir fits edge ML naturally. Nx handles tensor operations—pre-processing camera frames with Evision (OpenCV bindings), padding images to YOLOv8's expected dimensions. The HailoRT API gets called through NIFs: Hailo.load_model/1 loads the network compiled to Hailo Exchange Format, Hailo.infer/2 runs inference on the accelerator. Post-processing happens in Nx—parsing detected objects, drawing bounding boxes, calculating confidence scores. Supervision trees restart crashed inference processes automatically. OTP manages device connections and model lifecycle. Standard patterns, specialized use case.
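
Here's a minimal sketch of that worker, assuming the Hailo wrapper described above plus Evision/Nx calls for pre-processing; the module layout and helper names are illustrative, not the project's actual code.

```elixir
defmodule EdgeML.Detector do
  @moduledoc "GenServer owning the Hailo model; its supervisor restarts it if inference crashes."
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  def detect(frame), do: GenServer.call(__MODULE__, {:detect, frame})

  @impl true
  def init(opts) do
    # Hailo.load_model/1 is the NIF-backed wrapper mentioned above.
    {:ok, model} = Hailo.load_model(opts[:hef_path])
    {:ok, %{model: model}}
  end

  @impl true
  def handle_call({:detect, frame}, _from, %{model: model} = state) do
    # Pre-process with Evision + Nx: resize to YOLOv8's expected input and normalize.
    input =
      frame
      |> Evision.resize({640, 640})
      |> Evision.Mat.to_nx()
      |> Nx.divide(255.0)

    # Run on the accelerator, then post-process detections back in Nx.
    {:ok, output} = Hailo.infer(model, input)
    {:reply, parse_detections(output), state}
  end

  # Post-processing (boxes, classes, confidences) elided for brevity.
  defp parse_detections(output), do: output
end
```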

Every time I help someone set this up, the same blockers emerge. Nerves isn't magic. Victoria and Paulo spent weeks on camera detection. IP cams needed custom network configuration. Raspberry Pi camera modules had compatibility issues with Nerves on the Pi 5. Standard USB webcams finally worked reliably. Compiling Evision for Nerves required custom system images. Loading ffmpeg meant building additional dependencies. The repo now works out of the box for USB webcams; everything else requires documentation and testing.

The community calls it "yak shaving"—you wanted to run object detection, now you're compiling OpenCV. But once the stack stabilized, Victoria had something cloud APIs can't deliver: single-digit millisecond latency with zero per-inference cost.

Once you see it this way, the infrastructure choice is obvious. ML inference isn't special—it's distributed systems. Camera feeds are message streams. Inference requests are async function calls. Results get pattern-matched. Failures get supervised and restarted. The same patterns handling Phoenix LiveView connections and background job processing apply to ML pipelines. Elixir already knows this dance.
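
The supervision layout follows directly; module names here are hypothetical, just to show the shape:

```elixir
defmodule EdgeML.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      EdgeML.Camera,           # pulls frames from the USB webcam
      EdgeML.Detector,         # owns the Hailo model (sketched above)
      EdgeML.ResultsPublisher  # fans detections out to LiveView, MQTT, wherever
    ]

    # If inference crashes, OTP restarts just that process; frames keep flowing.
    Supervisor.start_link(children, strategy: :one_for_one, name: EdgeML.Supervisor)
  end
end
```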

Victoria's pitch to retail teams: calculate your current cloud spending, then realize the hardware ROI is measured in weeks, not years. Warehouse robotics teams doing real-time quality control. Security integrators wanting on-premise processing for privacy compliance. Nobody asks "should we do edge ML?" They ask "why are we still paying per inference?"

Your Raspberry Pi isn't collecting dust anymore. It's a production ML inference node waiting to save you thousands monthly.

Remember, for edge ML deployment:

Start with operational economics – Calculate current cloud API costs for continuous inference vs. one-time hardware investment with near-zero marginal costs

Use existing distributed patterns – Treat inference like any BEAM workload—message passing, supervision trees, cluster orchestration, not specialized infrastructure

Pre-process locally, not remotely – Nx + Evision handle tensor operations on-device faster than serializing frames to a cloud API—eliminating network latency entirely

Hardware clusters beat scaling APIs – Small Raspberry Pi clusters cost less than weeks of cloud ML bills while providing better latency and data privacy

. TIRED OF DEVOPS HEADACHES? .

Deploy your next Elixir app hassle-free with Gigalixir and focus more on coding, less on ops.

We're specifically designed to support all the features that make Elixir special, so you can keep building amazing things without becoming a DevOps expert.

See you next week,

Michael

P.S. Forward this to a friend who loves Elixir as much as you do 💜