Who Keeps the Lights On?
Every so often, someone in the Ruby community will ask,
"So... what does Planet Argon actually do these days?"
Fair question.
We've spent the last decade optimizing for scale. How do we handle more traffic? More users? More engineers? The assumptions were baked in: Growth is coming. Prepare accordingly.
So we split things apart. We mapped services to teams. We built for the org chart we were about to have.
Then 2023 happened. And 2024. And now 2025.
Turns out, the future isn't always bigger.
I've been thinking about what happens when open source organizations hit their breaking point... when funding dries up, relationships fracture, and everyone's scrambling to make sense of what went wrong.
It turns out, the patterns look familiar.
As the opening keynote on Day 2 of Rails World 2025, I had the chance to host a panel with three people who've been shaping the direction of both Ruby and Rails from deep within the internals.
@tenderlove
@hsbt
@byroot
We covered a lot in an hour:
There's even a moment where Aaron and Jean get into a friendly disagreement about performance and priorities. If you enjoy technical nuance and sharp perspectives, you'll appreciate that exchange.
And yes... I asked Aaron about his favorite Regular Expression. His response did not disappoint.
It was a fun, thoughtful, and occasionally surprising conversation, and a reminder that Ruby and Rails continue to evolve in the hands of people who care deeply about their future.
If you weren't in Amsterdam or want to revisit it, the full panel is now available:
Also worth pairing with this interview with Jean on the On Rails podcast, where we dig into IO-bound workloads, misconceptions, and what it's like maintaining Rails at scale.
A solid pairing if you're curious where the ecosystem is headed next.
I've been part of the Ruby on Rails ecosystem for over two decades. I've watched teams adopt Rails with wild enthusiasm... evolve their systems... struggle through growing pains... and eventually find themselves in an uncomfortable position: debating whether to abandon the tools that once brought them so much joy.
I don't think that's necessary... or even wise.
But I do think it's understandable.
After working with and talking to hundreds of teams... many of them using Rails, Laravel, Ember.js, or even React... I've noticed a pattern. A lifecycle of sorts. The way teams internally adopt and evolve their relationship with a technical stack. I've seen it reflected in our consulting clients at Planet Argon, the guests on my podcasts (Maintainable and On Rails), and peers who've been part of the various peak waves of these ecosystems.
And while every team is different, the stages of internal tech stack adoption often follow a similar spiral.
This post is an attempt to describe that spiral.
Not as a fully baked theory, but as a conversation starter. A mirror. And maybe a compass.
Because whether your team is building your core product with Rails, or you're a non-software company maintaining internal tools on Laravel, understanding where you are in this lifecycle might help you understand what comes next.
Before we go deeper, here's a quick overview of the seven stages I've observed. These aren't fixed; your team might skip around or revisit them multiple times. But in general, this is the pattern I've seen:
Adopting: A small group of enthusiastic engineers selects and introduces the stack while building a prototype or MVP.
Expanding: The stack proves useful... so it spreads. More features, more developers, more tooling.
Normalizing: The stack becomes the default. Teams standardize around it. Hiring pipelines and best practices emerge.
Fragmenting: Pain points surface. Teams bolt on new tools or sidestep old ones. Internal consistency erodes.
Drifting: The stack feels sluggish. Upgrades are deferred. The excitement is gone.
Debating: Conversations shift to rewrites or migrations. Confidence is shaken.
Recommitting: Teams pause, reflect, and decide to reinvest in the stack... and their shared future with it.
Again, these stages aren't a ladder; they're a spiral.
And the question your team has to ask is: Are we spiraling upward... or downward?
Because while The Downward Spiral is a great album, it doesn't have to be your trajectory.
It might be tempting to look at this lifecycle and think, "Our goal is to get to the Recommitting stage and stay there forever."
But that's not how this works.
Every team will move through these stages multiple times over the lifespan of their product. Shifting priorities, team turnover, organizational pivots... they all create new dynamics that ripple across your tech stack.
Recommitting isn't a finish line. It's an inflection point. One that clears the fog, sharpens priorities, and invites your team to move forward with intent.
Just don't mistake clarity for comfort... the spiral keeps turning.
Caching in Rails is like duct tape. Sometimes it saves the day. Sometimes it just makes a sticky mess you'll regret later.
Nowhere is this more true than in data-heavy apps... like a custom CRM analytics tool that glues together a few systems. You know the type: dashboards full of metrics, funnel charts, KPIs, and reports that customers swear they need "real-time."
And that's where the caching debates begin.
All valid. All expensive in their own way.
Your SaaS app has grown to 2,000 customers, each with multiple users.
For the overwhelming majority, dashboards load just fine. Nobody complains.
But then your whales log in. The Fortune 500 accounts your sales reps obsess over. Their dashboards pull data from half a dozen APIs, crunch millions of rows, and stitch together a wall of charts. It's not just a page. It's practically a data warehouse in disguise.
These dashboards are slow. Painfully slow. And you hear about it... through support tickets, account managers, and sometimes even a terse email from someone with "Chief" in their title.
So your engineering team digs in. You fire up AppSignal, Datadog, or Sentry and zero in on the slowest dashboard requests. You look at traces, database query timings, and request logs. You chart out the p95 and p99 response times to understand how bad it gets for the biggest customers.
From there, you start experimenting:
You squeeze what you can out of the obvious optimizations. Maybe things improve... but not enough.
So the conversation shifts.
When your product team actually sits down with those whale customers, the conversation shifts.
They start by saying: we need real-time data. But after a little probing, everyone realizes "real-time" doesn't always mean right now this second.
Maybe what they really need is a reliable snapshot of activity as of the end of the previous day. That's good enough for the kinds of decisions their leadership is making in the morning meeting. Nobody is making million-dollar calls based on a lead that just landed five minutes ago.
And your team can remind them: there are other real-time metrics in the system. For example:
So now you've reframed the dashboard story. Instead of one giant "real-time" data warehouse, you split it into two categories:
That's usually enough to recalibrate expectations. Customers feel like they're still getting fresh data, while your app no longer sets itself on fire every time a big account logs in.
Armed with that agreement, your team ships the "reasonable" solution most of us have built at least once:
The next morning, dashboards are instant. Support tickets quiet down. Account reps breathe easier. Everyone celebrates.
But here's the kicker: the whales were the problem. The rest of your customers never needed this optimization in the first place. Their dashboards were already fine.
So now you've turned one customer's problem into everyone's nightly job. And under the hood, you've cranked through hours of CPU, memory, and database load... just to prepare data for customers who won't even log in later today.
Worse, you've stuffed your background job queue with 2,000 little tasks every night. Which means your queue system (whether it's Sidekiq, Solid Queue, or GoodJob) is spending precious time juggling busy work instead of focusing on the jobs that actually matter. And when those queues get stuck, or a worker crashes, you're left wading through a mountain of pending jobs just to catch up.
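To make the anti-pattern concrete, here's a minimal sketch of that kind of nightly precompute, assuming a hypothetical Account model, a DashboardBuilder service, and Active Job workers. None of these names come from a real app; the point is the shape of the work.

```ruby
# Hypothetical sketch: the "cache everything for everyone" nightly job.
# Scheduled for 1 AM via whatever recurring-job mechanism the app already uses.
class NightlyDashboardRefreshJob < ApplicationJob
  queue_as :low_priority

  def perform
    # Fans out one job per account -- all 2,000 of them, every night,
    # regardless of whether anyone will look at that dashboard tomorrow.
    Account.find_each do |account|
      RefreshAccountDashboardJob.perform_later(account.id)
    end
  end
end

class RefreshAccountDashboardJob < ApplicationJob
  queue_as :low_priority

  def perform(account_id)
    account = Account.find(account_id)
    # The heavy part: external API calls, large aggregations, chart data, etc.
    payload = DashboardBuilder.new(account).build
    # Store the result where the dashboard controller can read it cheaply.
    Rails.cache.write("dashboard/#{account.id}", payload, expires_in: 24.hours)
  end
end
```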
This is what I call Cache Pollution: the buildup of unnecessary caching work that bloats your systems, slows down your queues, and leaves your caching strategy with a far bigger carbon footprint than it needs to. Another benefit of tackling Cache Pollution early is future flexibility: you might eventually solve the computation challenges in a different way, and you won't be anchored to big, scary scheduled tasks that churn through all of your customers every night.
Do these reports need to run every single day... or only on weekdays when your customers actually log in?
If your traffic drops on Saturdays and Sundays, consider a lighter schedule. Or even none at all. Because "slow" isn't so slow when almost nobody is around. A BigCorp admin poking the dashboard on Sunday morning might be fine with an on-demand render... especially if the weekday experience is snappy.
And here's another angle: if your scheduled job runs at 1 AM, that means when a BigCorp user logs in later that same day, they're still looking at data that's less than 24 hours old. For most business use cases, that's plenty. You don't need to rerun heavy jobs every few hours just because you can.
This is all about right-sizing frequency:
If your dashboard code doesn't rely on the cache to render, you keep the option to not precompute. That flexibility is where the savings live. As the business grows, the cost of overly eager schedules grows with it... so design for dials, not hard-coded habits.
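Here's one rough way that "dials, not hard-coded habits" idea could look, reusing the hypothetical names from the sketch above. The controller falls back to computing on demand (so skipping a night is safe), and the recurring schedule runs on weekdays only. The sidekiq-cron syntax is just one option; current_account is an assumed helper.

```ruby
# Hypothetical sketch: read the cache if it exists, otherwise compute on demand.
# The dashboard never *requires* a precomputed value.
class DashboardsController < ApplicationController
  def show
    account = current_account # assumed helper returning the signed-in customer's account
    @dashboard = Rails.cache.fetch("dashboard/#{account.id}", expires_in: 24.hours) do
      DashboardBuilder.new(account).build # slower, but still correct, when no cache exists
    end
  end
end

# Hypothetical recurring schedule (sidekiq-cron style): weekdays only, easy to dial.
# "0 1 * * 1-5" runs at 1 AM Monday through Friday instead of every night.
Sidekiq::Cron::Job.create(
  name:  "Refresh whale dashboards - weekdays 1am",
  cron:  "0 1 * * 1-5",
  class: "NightlyDashboardRefreshJob"
)
```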
One more question to ask about recurring scheduled jobs: do you really need to iterate through all users or all organizations?
In many cases, the answer is no. Most customers don't trigger the conditions that require a heavy recompute. Yet teams often design jobs to blast across every top-level object in the database, every night, without discrimination.
Instead, look for signals that help you scope the work down:
By narrowing the set of work each job touches, you cut down on wasted compute, reduce queue congestion, and avoid the kind of Cache Pollution that grows silently as your business scales.
The trick isn't just caching everything for everyone. It's knowing who to cache for and when.
And here's a bonus: if your job fails at 1 AM, re-running it for 50 customers is a whole lot faster than crawling through 2,000.
Extra credit: scope your scheduled tasks so that when a customer crosses a certain threshold (say, user count, dataset size, or request volume), they automatically join the "whale" group. No manual babysitting required.
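As a rough illustration of that scoping, the nightly job from earlier could select only the accounts that are both large enough and recently active. The column names and thresholds below are invented for the example.

```ruby
# Hypothetical sketch: only precompute for accounts that actually need it.
class NightlyDashboardRefreshJob < ApplicationJob
  WHALE_USER_COUNT = 250        # made-up threshold
  WHALE_ROW_COUNT  = 1_000_000  # made-up threshold

  def perform
    whales = Account
      .where("users_count >= ? OR analytics_rows_count >= ?", WHALE_USER_COUNT, WHALE_ROW_COUNT)
      .where("last_active_at >= ?", 7.days.ago) # skip accounts nobody has logged into lately

    # Instead of 2,000 jobs a night, this might enqueue a few dozen.
    whales.find_each do |account|
      RefreshAccountDashboardJob.perform_later(account.id)
    end
  end
end
```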
Not all caching challenges look like dashboards.
Case Study: The Press Release Problem
We once managed a public-facing site for a massive brand. Whenever they dropped a big press release, it spread fast across social media. Traffic would spike within minutes.
Of course, thatâs when the CEO would notice a typo. Or the PR team would need to update a paragraph to reflect a question from the media. Despite their editorial workflows, changes still had to happen after publication.
So we had to get clever. We couldn't cache those fresh pages for hours. Instead, we used a sliding window approach:
- First 5 minutes: cache for 30 seconds at a time.
- After 5 minutes: increase to 1 minute.
- After 10 minutes: increase to 2 minutes.
- After 20 minutes: increase to 5 minutes.
- After 6 hours: safe to cache for an hour.
- After a day: cache for a few hours at a time.
This let us protect our Rails servers from massive traffic spikes when a new article was spreading fast, while still giving editors the ability to push corrections through quickly. Older articles, once stable, could safely sit in Akamai's cache for hours.
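If you wanted to build a similar sliding window today, one minimal sketch is to derive the cache lifetime from the article's age and send it as a Cache-Control header for the CDN to honor. The tiers below mirror the schedule above; the Article model and its published_at column are assumptions for the example.

```ruby
# Hypothetical sketch: the TTL grows as the article ages, mirroring the tiers above.
class ArticlesController < ApplicationController
  # [age threshold, TTL to serve while under that threshold]
  CACHE_WINDOWS = [
    [5.minutes,  30.seconds],
    [10.minutes, 1.minute],
    [20.minutes, 2.minutes],
    [6.hours,    5.minutes],
    [1.day,      1.hour],
  ].freeze

  def show
    @article = Article.find(params[:id])
    expires_in ttl_for(@article), public: true # sets Cache-Control: public, max-age=...
  end

  private

  def ttl_for(article)
    age = Time.current - article.published_at
    _window, ttl = CACHE_WINDOWS.find { |window, _| age < window }
    ttl || 3.hours # older than a day: cache for a few hours at a time
  end
end
```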
At the time, Akamai could take up to seven minutes to guarantee a purge across their global network. Not ideal. We had to plan for that lag. Today, most CDNs can purge instantly, but back then... it was a constraint we had to design around.
A lot of what we've talked about here comes down to avoiding Cache Pollution.
That's the unnecessary churn your system takes on when it generates data nobody asked for. It's the background job queue bloated with thousands of tasks that fight with more important work. It's the 1 AM process chewing through CPU just to prep dashboards for customers who never log in.
Cache Pollution looks like optimization on the surface... but underneath it's just waste.
So before your team spins up the next caching project, stop and ask:
Because the goal isn't just faster dashboards. The goal is to keep your caching strategy lean, resilient, and focused, instead of leaving behind a trail of Cache Pollution that grows with every new customer you add.
(or: How I'm Learning to Step Back So We Can Move Forward)
Here's a pattern I've been contributing to more than I'd like to admit:
Someone on the team proposes a new internal tool. There's a clear need. There's momentum. Conversations start about how we'll build it... what tools we'll use... where it'll live... whether it might become client-facing someday.
It hasn't been built yet, but we're already architecting the scaffolding.
And that's usually when I step in.
Not with a thumbs-up. Not with funding. But with questions. The kind that start with, "What if we didn't build this?"
It's not fun. It's not fair. And it's not how I want to lead.
By that point, people are invested. They've done the thinking. They've shared the idea. They've taken a risk. And now I'm asking them to scale it back, or stop entirely.
This is me taking responsibility for that pattern.
So I did the only thing I know to do in moments like this: I wrote it down.
We now have a model. Internally, we call it The Internal Tooling Maturity Ladder.
Level 0: One-off Manual Script
Something one person runs on their own machine to save time or reduce repetitive work.
Maybe it lives in an unsaved file or as a task in Alfred or Automator.
It's not elegant, but it works.
You run it manually. You copy-paste the result. You feel smug for five minutes.
No repo. No expectations. No long-term promises.
Example: A quick Ruby or Bash script that tallies something from an API and drops it into Slack.
Level 1: Shared Manual Script
You cleaned it up. You wrote a little README. You dropped it in a shared Gist or Google Drive.
It's still manually triggered, but now others can use it too, if they read the instructions.
It's still lightweight. Still safe.
And it's often where great tools should stay.
Example: A command-line tool that a few team members can run locally, maybe to generate a report or fetch usage stats.
Level 2: Scheduled Automation
Now we're automating things.
It runs on a schedule: maybe through Zapier, a GitHub Action, or a scheduled Rake task.
No UI. No buttons. Just automated updates that go where we already spend time.
Slack. Google Sheets. Email.
These tools hum quietly in the background, doing one job well.
Example: A script that posts weekly project stats to a Slack channel every Monday morning.
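As a sketch of how small a Level 2 tool can stay, this is roughly all it takes; the webhook URL, stats, and schedule below are placeholders, not a real integration.

```ruby
#!/usr/bin/env ruby
# Hypothetical sketch: post weekly project stats to Slack via an incoming webhook.
# Trigger it from a GitHub Action cron, Heroku Scheduler, or crontab:
#   0 9 * * 1  ruby post_weekly_stats.rb
require "net/http"
require "json"
require "uri"

WEBHOOK_URL = ENV.fetch("SLACK_WEBHOOK_URL") # placeholder -- set in your environment

# Placeholder stats; in practice this might come from your project tracker's API.
stats = {
  tickets_closed: 14,
  prs_merged:     9,
  deploys:        3,
}

message = "Weekly stats: #{stats.map { |k, v| "#{k.to_s.tr('_', ' ')}: #{v}" }.join(', ')}"

# Slack incoming webhooks accept a simple JSON payload with a "text" field.
Net::HTTP.post(
  URI(WEBHOOK_URL),
  { text: message }.to_json,
  "Content-Type" => "application/json"
)
```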
Level 3: Lightweight Internal Service
Now we're getting fancy.
This has a small UI. A form. A dashboard. Maybe some configuration options.
It needs hosting. Credentials. Some thought about security.
It's still simple enough that one person can manage it, but now it's a thing.
And it needs some care.
Example: A mini app that lets the team search across client project docs or surface stale Jira tickets.
Level 4: Fully Hosted Internal Product
This is a real web app.
It's deployed. It has a frontend and a backend. It has users. Sessions. Maybe even tests (hopefully).
It needs to be maintained. Updated. Monitored.
It might solve a meaningful problem, but it's not free.
This is the top of the ladder for a reason.
Example: A polished internal dashboard that's become a critical part of day-to-day operations.
This isn't a blueprint. It's a conversation starter.
The higher you go, the more you commit: time, infrastructure, expectations.
So we're learning to start lower on the ladder.
To earn our way up.
To see if people care before we care too much.
Every internal tool is a promise.
To support it. To upgrade it. To explain it to the next person who inherits it.
And sometimes... the smallest version of the tool is all we need.
A Slack post.
A spreadsheet.
A script that helps one person do their job 10% faster.
Not everything needs a UI.
Not everything needs a repo.
And not everything needs me to be the one who calls time on the project two weeks in.
This post isn't about our internal model. Not really.
It's about building fewer things that trap us.
And creating more space to experiment without regret.
If you've found yourself playing the role of reluctant gatekeeper... you're not alone.
This ladder is helping me find a better way.
One rung at a time.
If you didn't make it to RailsConf this year... or couldn't make it to my talk... I've got good news: the full video is now live.
Watch it here
Preparing for this talk was one of the most nostalgic (and sometimes absurd) research dives I've done in years. I pitched The Features We Loved, Lost, and Laughed At thinking it would be easy to uncover a long list of removed or weird Rails features to poke fun at.
Turns out? They weren't so easy to find.
Rails hasn't just thrown things away. It's looped. It's learned. It's come back to old ideas and made them better.
In the talk, I trace that evolution... using code examples and stories from the early days of ActiveRecord, form builders, observe_field, semicolon routes, and even a few lesser-known misadventures involving matrix parameters.
I touch on features like Observers (invisible glue, invisible bugs) and ActiveResource... which wasn't confusing so much as it was optimistic. It assumed the APIs you were consuming were designed with Rails-like conventions in mind. That was rarely the case.
I also explore what Rails has taught us about developer happiness, what it means to build with care, and what the community keeps refining (and laughing about).
Here's a quick example: I once wrote an InvoiceObserver that did four different things silently... and when it broke, it took hours to even figure out where the logic lived. Magical until it wasn't.
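For anyone who never met Rails Observers (they shipped with Active Record before Rails 4 and were later extracted into the rails-observers gem), here's a rough reconstruction of what that kind of observer looked like. The four side effects are illustrative stand-ins, not the original code.

```ruby
# Hypothetical reconstruction: an observer quietly doing four unrelated things.
# Nothing in the Invoice model itself hints that any of this happens on save.
class InvoiceObserver < ActiveRecord::Observer
  def after_save(invoice)
    InvoiceMailer.receipt(invoice).deliver          # 1. email the customer
    invoice.account.recalculate_balance!            # 2. touch another model
    AuditTrail.record("invoice.saved", invoice.id)  # 3. write an audit entry
    CrmSync.push(invoice)                           # 4. sync to an external system
  end
end

# Observers also had to be registered in config/application.rb:
#   config.active_record.observers = :invoice_observer
```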
With RailsConf coming to a close, it felt like the right moment to reflect not just on the framework... but on how we evolve alongside it.
Rails doesn't just chase trends. It revisits its own decisions and asks: "What still brings us joy?"
That's a rare trait in software. And it's why Rails still feels like home for so many of us.
"Rails doesn't just move forward... it reflects. It loops. It asks: Where's the friction? What can we make effortless again?"
If you're newer to the framework, or just curious what Rails has quietly taught us over the years... I hope you find something here to smile at.
I'm grateful to my Ruby friends... some old, some new... who shared memories, weird bugs, screenshots, mailing list lore, and just the right amount of healthy skepticism while I was putting this together.
Ruby on Rails is often celebrated for how quickly it lets small teams build and ship web applications. I'd go further: it's the best tool for that job.
Rails gives solo developers a powerful framework to bring an idea to life, whether it's a new business venture or a behind-the-scenes app to help a company modernize internal workflows.
You don't need a massive team. In many cases, you don't even need a team.
That's the magic of Rails.
It's why so many companies have been able to start with just one developer. They might hire a freelancer, a consultancy, or bring on a full-time engineer to get something off the ground. And often, they do.
Ideas get shipped. The app goes live. People start using it. The team adds features, fixes bugs, tweaks things here and there. Maybe they've got a Kanban board full of tasks and ideas. Maybe they don't. Either way, the thing mostly works.
Until something breaks.
Someone has to redo work. A weird bug eats some data. A quick patch is deployed. Then someone in management asks the timeless question: "How do we prevent this from happening again?"
Time marches on. Other engineers come and go, but the original developer is still around. Still knows the system inside and out. Still putting out fires.
Eventually, the company stops backfilling roles. There's not quite enough in the backlog to justify it. And besides, everything important seems to be in one person's head. That person becomes both the system's greatest asset and its biggest risk.
This is usually about the time our team at Planet Argon gets a call.
Sometimes, it's the developer who reaches out. They're burned out. They miss collaborating with others. They're tired of carrying the whole thing. Other times, it's leadership. Things are moving too slowly. Tickets aren't getting closed. The bugs they reported last quarter still haven't been addressed. They're worried about what happens if that one dev goes on vacation. Or leaves.
They've tried bringing in outside help... but nothing sticks. The long-term engineer keeps saying new people "don't get it."
By the time we step in, we've seen some version of this story many, many times.
Documentation? Sparse or outdated.
Tests? There are some, but good luck trusting them.
Git commit messages? A series of "fixes" and "WIP".
Hardcoded credentials? Of course.
Onboarding materials? There's nobody to onboard.
Rails upgrades? "We'll get to it eventually... maybe."
Today marks the launch of On Rails, a new podcast produced by the Rails Foundation and hosted by yours truly.
We've recorded the first batch of episodes, and Episode 1 is out now: Rosa Gutiérrez on Solid Queue.
The show dives into technical decision-making in the Ruby on Rails world. Not the shiny trend of the week... but the real conversations teams are having about how to scale, what trade-offs to make, and what long-term maintainability actually looks like.
You'll hear from developers running real apps. Some are building internal tools. Others work on products you've probably used. A few are out there blogging and tweeting... but many are too deep in the day-to-day to stop and write about it. They're just doing the work: shipping, fixing, refactoring, and figuring it out as they go.
The idea for On Rails started with those hallway conversations at conferences. The ones that don't make it into keynotes or blog posts. It grew out of the calls I have with clients at Planet Argon. And, of course, out of years of hosting Maintainable.fm.
You'd think that after recording over 200 episodes of Maintainable, I wouldn't be so nervous to hit record on something new... but here we are. New show jitters are real.
We're approaching this podcast with depth and focus. Fewer episodes. Longer interviews. Conversations that aim to surface lessons learned... and the thinking behind the decisions that shape real systems.
If you're a Rails fan, I hope you'll give it a listen. If subscribing is your thing, you know what to do. And if you've got a story worth sharing, I'd love to hear from you.
Listen to Episode 1: Rosa Gutiérrez: Solid Queue
Browse all episodes: onrails.buzzsprout.com
Official announcement: Ruby on Rails blog
Earlier this year, I dusted off this blog (which I started back in 2005) and found myself reflecting on Maintainable, the podcast I've hosted for the past few years about long-term software health.
At the same time, I was toying with the idea of spinning off something more Rails-focused. A show that could spotlight the kinds of conversations I was already having... with clients, with other devs, and in those casual, between-session moments at conferences.
Right around then, Amanda from the Rails Foundation reached out with a prompt:
"A podcast of Rails devs talking about the nitty-gritty technical decisions they've made along the way."
...which aligned nicely with what I had been ruminating on. The timing was perfect, and we decided to make it happen.
One of the early shifts for me was adapting to a more collaborative production process. I've been running Planet Argon for more than two decades, and I'm used to moving quickly, often without needing to pitch or workshop ideas with others. But with On Rails, I've had the opportunity to work closely with Amanda, the Foundation, and DHH. They've all taken an active interest in shaping the vision, the guests, and the format.
Another early challenge? The first round of guests was pitched to me, which meant jumping into the deep end with folks I hadn't already spoken with. That raised the bar for prep. On Maintainable, I've occasionally relied on some degree of improvisation. Here, I knew I'd need to come in more prepared... and that's been a good thing.
So On Rails was born.
I'll still be hosting Maintainable (though likely on a slower cadence). And I'm excited to run both of these shows side by side, each with its own tone and focus.
Hope you get a chance to give it a listen.
Drowning in technical debt?
It doesn't have to be this way.
Back in September at Rails World 2024, I shared what I've learned from helping teams tack their way out of trouble: less theory, more battle-tested strategies. Lessons from Planet Argon's clients, Maintainable.fm guests, and real-world Rails teams.
Have 25 minutes? Watch it here:
Your future self (and your app) will thank you.
As a consultant, I've looked over a shitload of Rails codebases (how many? probably ~150-200) over the last 12 1/2 years in the Ruby on Rails community. I haven't worked on most of them, but I do get invited to look over, review, audit, and provide feedback on a lot of them.
Over on the Planet Argon blog, I've shared my quick hit list of a few initial things I look for when reviewing an existing codebase.