Commercial Opportunities in a Renaissance of Self-Hosting
I’m still pondering whether there’s a venture-backable opportunity in the data center sitting in my basement, but the last time I felt this curious about a technical shift (besides LLMs) was when we blew the circuits in Paul Benigeri’s Stanford dorm mining altcoins in 2013.
The accessibility, performance, and cost-effectiveness of self-hosting today are fascinating. You can buy commodity hardware at one-tenth the annual cost of renting it on Amazon Web Services, connect it smoothly to the public internet behind a residential IP with Cloudflare Tunnels or Tailscale, and seamlessly deploy with a host of solutions from a mature Docker ecosystem to DHH’s Kamal. For a total cost of several hundred dollars, I now have about $15,000 of annual AWS compute running in my basement:
Macro Trends
- Major cloud providers have healthy margins benefiting from a historical trend towards the cloud and increasingly performant and cost-effective hardware.
- Running your own servers is cheaper and more convenient than ever before with better hardware, the latest software infrastructure, and ChatGPT to troubleshoot.
- Compute is becoming more valuable than ever with flexible use cases and high demand for LLMs. Rumor has it that Google is now spending more on compute than on people.
Hosting Spectrum
I broke down the hosting spectrum a bit further:
- Cloud - Hyperscalers like AWS, GCP, Azure; “Smaller” cloud providers like OVH.
- Colo - Colocation data centers that give you racks (and maybe installation), but you bring your own hardware.
- “Managed” Colo - See Hetzner’s dedicated offering, where you can buy dedicated servers that are plugged into the rack and accessible remotely, but with no other support.
- On-Prem - Data centers owned and managed by the end customer (i.e. “on-premise”)
- Edge Compute - Deployments where a company owns and manages compute in the field (ranging from sensors to servers to the oil barrels I mentioned)
Potential Commercial Opportunities
Cloud continues to benefit
Paradoxically, I think that the hyperscalers will continue to benefit from growing demand for compute and will not significantly suffer in the short term from alternatives for self-hosting.
AWS, GCP, and Azure maintain substantial momentum:
- Flexibility - Customers can spin up massive infrastructure in minutes. Especially important for uncertain demand (e.g. a high-growth consumer startup).
- Convenience - Cloud services still take care of a level of devops that colo or on-premise hosting doesn’t. Small teams without a devops expert can still save money by offloading that to the provider.
- Mindshare/Dev Familiarity - AWS’s APIs, in particular, have become an industry standard. There’s a moat in mindshare and developer training.
- Proprietary Edge - Hyperscaler customers are often buying more than just raw compute. E.g. AWS “services” are just that: APIs either with AWS proprietary technology for a managed service like SageMaker or know-how to manage larger deployments at scale like RDS. They also have a literal edge in infrastructure like edge compute (physically distributed CDNs), bandwidth (physical cabling to allow for faster transit times), or redundancy support (S3 has eleven nines of durability).
- Cost Basis - Due to their scale, hyperscalers can also theoretically deliver products at a lower cost than anyone else (even if they keep their higher margins today).
For these reasons, while hyperscalers might lose market share in some segments, I believe they will still dominate as the default option for new startups and many existing large companies with complex operations on their platforms. As compute appetite (especially GPUs) continues to grow, they will likely continue to grow overall market share, but perhaps lose it in some specific segments (below).
Mature Businesses Shift More Off The Cloud
More mature businesses might start to shift more workloads off the cloud.
If hyperscale cloud customers are paying a premium for speed, flexibility, sophistication, and convenience, then alternative colo and smaller cloud providers can differentiate with a more cost-effective product in exchange for a slower, more constrained, simpler product.
This has always been the colo value proposition, but now those tradeoffs are less severe because of the trends discussed earlier: easier self-hosting devalues convenience. Colo and smaller cloud providers can lean in even further on the low-cost value proposition and flip Jeff Bezos’s famous quote, “your margin is my opportunity,” on him because of the current margins charged by hyperscalers.
Coincidentally, with rough, back-of-the-napkin calculations, it looks like Hetzner is providing this significantly cheaper colo value proposition while still keeping AWS-like margins:
Note: Using 2022 as a common year since Hetzner has 2022 data accessible by German regulations even though it is private.
Hetzner is interesting because it provides colo and smaller “cloud” offerings, but what it’s known for is a category I call “managed” colo (like servers.com’s headline value proposition), where the hardware is taken care of for you but nothing else.
I bet we’ll see more colo providers thrive and the “managed” colo offering grow faster for customer segments that have more predictable and mature businesses which don't need the flexibility or cutting-edge capabilities offered by hyperscalers. DHH references this with respect to 37Signals’ businesses, whose longer-term, steadier growth and business models better match the colo product. 37Signals’ legacy businesses can be expected to remain relatively flat, and their hey.com service is expected to grow and be sticky at a predictable pace. 37Signals is using Deft’s colo services and lauding the cost savings as a result.
Again, these tradeoffs are nothing new. The improved software tooling and improved performance for cost just make the value proposition outside of the cloud more compelling.
One computing customer segment that I would be curious to investigate further is mid-market and PE-backed companies. These companies can have significant compute spend. Pre-covid, I spoke with firms like TSG that have dedicated operations teams for digital transformation - often to bring their portfolio companies from on-premise to the cloud.
The purported calculus of PE-backed firms was that they would save significant value and add increased flexibility by sunsetting on-prem resources. However, as early as 2018, Michael Dell (biased in his self-interest) was advocating for the cost savings of on-prem. I think an even bigger motivation for PE-backed companies to move to the cloud has been as a catalyst to clean up architecture and mitigate dependencies on the arcane tribal knowledge of specific employees. Running your own services also has higher human capital costs for some companies than others - it’s hard to attract a world-class data center infrastructure engineer to go work for a lumber mill company. You don’t want to be beholden to the guy in the back office who has been there 20 years and is the only person with the on-prem knowledge to unbrick the service when he retires.
However, the calculus for PE and mid-market companies might be shifting, if not to on-prem, at least to colo. Now that the latest deployment technology for self-hosted software is more accessible, the human capital cost of running your own - especially colo where you don’t need to run your own hardware - is dropping. The cost of portability is dropping as well, which benefits PE-backed companies shopping for new buyers that may want to integrate or split up acquisitions. The traditional PE model of squeezing costs out of businesses that are past their stage of exponential growth also matches well with non-cloud offerings, which trade lower prices for greater predictability of compute needs.
New Opportunities for On-Premise and Edge Compute
Where I suspect the most interesting venture opportunities lie in these changing computing trends is with respect to on-prem deployments and edge compute. These two categories lie on the extreme other end of the cloud where the increased accessibility and decreased cost don’t just improve economics, but lower the activation energy to potentially unlock applications that might not have made sense in the past.
Like I said, I’m still synthesizing more concrete commercial opportunities, but here are a few areas to start:
Vertical Specific Corporate Compute
The best place to uncover this would be going and working at or hanging out in an IT job at Chevron, JP Morgan, Walmart, BHP, Home Depot, or another large, non-tech company, and better understanding their compute, internal politics, and opportunities from the inside.
Nothing new as a strategy for need-finding - though ironically, the skills that probably put you in such a place to see the opportunities are the opposite of those to exploit them. However, going into a more tech-sleepy industry that uses on-prem either because of specific requirements (finance) or legacy (Home Depot) with the lens of the trends we have discussed could be enlightening.
Some areas I might delve deeper into: alpha in financial institutions through on-prem or edge compute, which could be revisited to find greater efficiencies; or compute needs for businesses still operating off the grid, like BHP’s mining operations, where we looked at doing computer vision on-site at SVI circa 2018.
I’m also curious if there are businesses that were impossible or uneconomic in the past because they required cost-prohibitive on-prem deployments with these types of customers, but with today’s tools could make sense with a lower implementation cost. This is similar to how SMB SaaS customers were cost-prohibitive to acquire in the past, but have become increasingly more accessible with the lower CACs enabled with improved marketing tools and now mass personalization with LLMs.
Open Source Enterprise Application Alternatives
Open source enterprise applications may be able to catch some of these tailwinds as well. Easier self-hosted deployments on colo or on-premise make open-source alternatives to mainstream cloud-based enterprise applications more compelling by removing their downsides without impacting their advantages in privacy, compliance, configuration, and cost.
GitLab’s self-managed offering has recently seen increased interest after a leading player, Atlassian, decided to sunset its self-hosted product:
Atlassian’s decision to stop support for its server offering is making customers reconsider what product they use for Enterprise Agile Planning. We are focused on making it easier for these customers to move to GitLab SaaS and self-managed. We recently launched a new Enterprise Agile Planning SKU. Now, GitLab Ultimate customers can easily bring non-technical users into the platform.
Because of that server offering being deprecated, a lot of customers are taking stock of where they're at. It is a natural point for enterprises to evaluate.
(Source)
With the costs and the barriers to deploying open source and on-premise dropping, it’s not just application characteristics like configuration or customization that may compel enterprises to switch from the cloud. With hyperscalers and major cloud providers like Salesforce reaching into all aspects of software, there are strategic and competitive concerns about avoiding your competitors’ apps and infrastructure. More accessible and deployable alternatives increasingly look more compelling.
In some cases, the revolution making it easier to deploy anywhere may also help open source providers find synergies with the cloud. GitLab is taking advantage of its independence from Microsoft (vs. Microsoft’s GitHub subsidiary) to go after enterprise customers through partner deals with AWS and GCP. Easier, more portable deployments translate into a go-to-market where open source can play more flexible channels to get their products to market that previously might have required much deeper investments and integrations for deployment.
Defense Applications
Defense technology appears to be a growing need in today’s unfortunately increasingly polarized world. The rapid iterations in drone and electronic warfare in Ukraine are frightening. They are also showing the imperative of having local compute to support increasingly digitized warfare on the front.
It’s hard to overemphasize how inadequate cloud services can be on the battlefield compared to the connected experience of living in a major developed city. You don’t need to go to a warzone. Just live in the Philippines for a month and experience the country’s variable internet to get an idea of how frustrating access can be. Now add people’s lives at stake. Even with Starlink, potentially life-determining network access is at the mercy of weather conditions, electronic warfare, and the whims of the world’s richest man.
My MARSOC friends have seen this need for edge compute for a while and pushed for more adoption in the Marines, but the US moves slowly. I expect (and hope) that the fast-paced iterations by both sides in Ukraine and attention in the Far East will spur faster US demand for this and open up opportunities to leverage more accessible and cheaper edge compute infrastructure. In fact, the drone developments in Ukraine are exactly that. Perhaps we’ll see more Andurils or more niche, unique, and nimble hardware/software applications in defense that didn’t make sense when edge compute had a higher barrier to entry.
The Home and Privacy
Apple and Google clearly think the connected home is a big market, but haven’t cracked it yet. Could a new startup leverage these trends to succeed with the mass consumer or compete in a sub-segment like high-end A/V installations?
The last area comes full circle back to my basement. What is possible in the home now that was impractical before this software infrastructure shift? Upon further thought, I think there may actually be an overlooked opportunity in the near future to finally deliver on the failed promises of the pre-pandemic connected home and IoT hype.
A lot of at-home compute devices have stumbled because of the complexity of setup, the incompatibility between competing companies, and privacy concerns. Some, like Ring, still found success, but the landscape has turned into a lot of point-solution devices all linked to the cloud, but not truly interconnected. I’ve yet to hear of Google Home or Apple Home living up to their promise of true automation and connection as they carve out their fiefdoms of partnerships with specific providers.
However, those barriers can now be overcome. The barriers to deploying a consumer compute or sensor device in the home keep dropping - whether it’s a company leveraging open source software to make a smoother, closed source product, or the consumer going through the necessary incantations to set up a device. There are new possibilities for companies to build on the plethora of cheap, higher performing electronic sensors from Asia. The barriers to integrating with on-prem at-home devices keep dropping with open source communities like Home Assistant.
Anecdotally, wrestling with home automation has been one of my cousin’s hobbies for years, and I’ve watched it be a constant source of headaches in spite of his evangelism. I remember earlier this year watching one cousin at the cutting edge wire up his house in Vegas with a voice-automated steam room, blinds corresponding to tea pouring, and electricity usage corresponding to power pricing. However, this year he made the striking comment that “it’s almost there…easy enough for me to recommend it to my non-technical family members.” That was part of my impetus to wire up my own Home Assistant installation and see for myself how accessible it is today. It’s turned out way better so far than the high-end A/V installations I’ve watched billionaires try to troubleshoot when they want to watch a movie on their estate past onsite tech support’s working hours. Furthermore, the facial recognition running on my connected cameras is 100% private, running on a local model without sending any sensitive data to the cloud. My cousin’s installation and mine are still just toys, but sometimes toys become more than just that.
One other advantage that on-prem home applications have is privacy. As LLMs gobble up more and more data and make previously obscure, but public records more and more accessible, I have a feeling that privacy might grow back as a consideration for many home-owning consumers. What if I want the convenience of facial recognition and a modern voice assistant in my home without the creepiness of Amazon, Google, or Facebook knowing my every move?
When easier at-home deployment is coupled with maturing local AI tech and increasing cloud privacy concerns, I’m curious about what new applications entrepreneurs invent, and what commercial opportunities open up.
Conclusion
While AI dominates today's tech headlines, a quieter but significant transformation is unfolding in the world of computing infrastructure. We're witnessing a silent renaissance in self-hosting, driven by a convergence of factors: improved hardware economics, a mature deployment ecosystem, and shifting market dynamics in cloud computing.
I don’t believe this shift is a simple pendulum swing away from cloud computing - rather, it's creating a more nuanced and stratified hosting landscape. Hyperscalers will continue their dominance for high-growth companies that prioritize flexibility and scale. However, the emergence of more accessible self-hosting options is opening new possibilities across the spectrum, from more accessible colo solutions for PE-backed firms to new edge computing applications for industries previously underserved by traditional cloud infrastructure.
The most compelling venture opportunities may not be in direct competition with hyperscalers, but in reimagining what's possible when the barriers to self-hosted and edge computing fall away. Whether it's enabling new financial applications that demand specific compliance, supporting critical defense capabilities in contested environments, or reshaping privacy-first home automation - I think we're only beginning to see what's possible when sophisticated computing capabilities can be deployed anywhere, efficiently and economically.
I’m still not sure what I’ll do with all the compute in my basement yet, but writing this has given me some more ideas to explore.
Appendix
Basement Data Center Numbers
If you're curious about the breakdown of the data center running in my basement, here's the model, with less rigorous commentary below:
Cost: "AWS margins:"
- Roughly 83-92% annual cost savings by running the blog on the box in the bottom of the picture at the top of this post.
- On-Prem: $100-$200/year - annual cost before amortization of the 4 core, 16GB memory, 1TB storage Lenovo M715q in the picture ($80 on eBay + $1-10/mo of electricity depending on load).
- AWS EC2: $1,177.34/year - annual cost of a t4g.xlarge equivalent EC2 instance ($0.1344/hr * 8760hrs/year)
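The arithmetic behind this comparison can be sketched in a few lines of Python. This is only a rough model using the figures quoted in the bullets above (the $0.1344/hr on-demand rate, the $80 box, and $1-10/mo of electricity); the function names are illustrative, and the hardware is fully expensed in the first year rather than amortized.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def annual_ec2_cost(hourly_rate: float) -> float:
    """On-demand cost of running one instance around the clock for a year."""
    return hourly_rate * HOURS_PER_YEAR


def annual_on_prem_cost(hardware: float, electricity_per_month: float) -> float:
    """First-year cost with the hardware fully expensed (no amortization)."""
    return hardware + 12 * electricity_per_month


def savings_pct(on_prem: float, cloud: float) -> float:
    """Percentage saved by running on-prem instead of in the cloud."""
    return 100 * (1 - on_prem / cloud)


ec2 = annual_ec2_cost(0.1344)            # t4g.xlarge on-demand rate from the post
low = annual_on_prem_cost(80, 1)         # light load: $1/mo of electricity
high = annual_on_prem_cost(80, 10)       # heavy load: $10/mo of electricity

print(f"EC2:      ${ec2:,.2f}/year")
print(f"On-prem:  ${low:,.0f}-${high:,.0f}/year")
print(f"Savings:  {savings_pct(high, ec2):.0f}%-{savings_pct(low, ec2):.0f}%")
```

Amortizing the $80 box over several years would push the savings higher still, since only electricity recurs.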
Accessibility: 1 weekend with LLMs to:
- Set up a dozen or so heterogeneous, free or old computers as Linux servers.
- Connect them to the public internet behind a dynamic residential IP with Cloudflare Tunnels.
- Add infrastructure like electrical backups, monitoring, and a CCTV circuit with local object detection on old Android phones and Home Assistant.
- The equivalent setup would have probably taken a month or more to troubleshoot in the past.
Other Costs: There are other costs across 15 boxes that I take for granted as "free" in my basement. These could actually get you slightly above AWS in the first year if you are conservative ($1,293 = $1,093 + $200), but realistically, labor is cheaper than what I built in and internet and rent are likely further amortized depending on circumstance:
- Labor: ~$40-$533/box/year - Assume a high rate of $500/hr (compared to a standard rate of ~$35/hr for a data center technician) and ~16 hours for setup for 15 servers with a buffer for ad hoc maintenance.
- Internet - ~$160/box/year - Conservatively assume we're paying $200/mo for one of the highest-speed residential connections from Verizon and nothing else is on the system.
- Rent - ~$0-$400/box/year for a half rack in a data center that can hold ~21 servers. Of course, this isn't apples to apples since the heterogeneous hardware I have isn't rack-ready, but it's a comp vs. my basement which is "free."
- Ease of Use: Once the physical setup is done, I've personally found it easier to deploy on my home lab than on AWS.
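The per-box overhead figures above can be reproduced with a short sketch. It assumes each shared cost is simply split evenly across the 15 boxes, takes the top of the rent range ($400) for the conservative case, and uses the $200 upper bound for hardware plus electricity; the variable and function names are illustrative.

```python
BOXES = 15          # servers sharing the overhead
SETUP_HOURS = 16    # total setup time, including a maintenance buffer


def per_box(total_per_year: float, boxes: int = BOXES) -> float:
    """Split a shared annual cost evenly across all boxes."""
    return total_per_year / boxes


labor_high = per_box(500 * SETUP_HOURS)  # $500/hr "high" rate
labor_low = per_box(35 * SETUP_HOURS)    # ~$35/hr technician rate
internet = per_box(200 * 12)             # $200/mo residential connection
rent_high = 400.0                        # top of the half-rack comp, per box
hardware_and_power = 200.0               # upper bound for the box itself

conservative_overhead = labor_high + internet + rent_high
conservative_total = conservative_overhead + hardware_and_power

print(f"Labor:    ${labor_low:,.0f}-${labor_high:,.0f}/box/year")
print(f"Internet: ${internet:,.0f}/box/year")
print(f"Overhead: ${conservative_overhead:,.0f}/box/year")
print(f"Total:    ${conservative_total:,.0f}/box/year")
```

The low end of the labor range works out to about $37 per box, which the list above rounds to ~$40.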
Cloud Discounts: A friend from Google noted that big companies benefit from large discounts with hyperscalers. I don't have good benchmarks for this, but assume that on-premise also becomes more price competitive at such a scale through bulk hardware costs and further amortization of labor and better use of space at scale.
Other Real World Examples
- A good friend ran the bulk of his youth sports AI video streaming company (with tens of thousands of users spiking on game days) on his own servers - first at home and then at a colo. They are still running GPUs in the colo after being acquired by a sizable firm.
- There is a whole subculture of at-home data centers on Serve The Home and r/homelab.