Quick tour of my local inference setup.

May 22, 2026
9:58 pm

So someone was asking if I ran LM Studio locally and I said no, and they said Ollama and I said no, llama.cpp directly, and they were like isn’t that pretty bare bones and hard to use and keep up? So I promised a screen grab and a bit of exposition. Rather than send it by email though, I’m just gonna post it for you. So, here we go.

First thing, I run a multi-system setup. I have two headless ThinkCentre m93p tiny machines for the actual agents. Each one is a Core i5 with 16GB RAM and 230GB SSD. They run headless and each one is about the size of three old CD jewel cases stacked on top of each other, so two m93p systems is still… almost no space at all. Right now, one is OpenClaw and one is running multiple coding harnesses. Having each of them in their own metal gives more safety re: access to the local environment without me having to futz around with virtualization.

I interact with the claw mostly on Discord and with the others over Tailscale SSH.

Inference actually runs on the machine I interactively use, which is a 3-GPU setup with 76GB VRAM total along with a Core i9-9900k and 128GB DDR4. Two of the GPUs are Radeon Pro data center cards; it requires a bit of hacking and home engineering to use these safely, as they don’t have onboard active cooling. 3D printer is your friend here. I actually run a fairly stable model selection, almost always Qwen 3.5 122B on the GPUs plus Qwen 3 Embedding on CPU just for memory and vector search stuff.

I then have a service set up with systemctl that launches a bunch of scripts that are bunged into a tmux session. Each script is lightly watchdogged so that if it goes down, the crash is logged and it comes back up and of course if the whole mess goes down systemd brings the service back up from scratch. So the tmux session is available on boot if I choose to attach to it, and of course if I’m remote (say, with Discord on a laptop) I can still open a shell, ssh in, and attach to the tmux session to monitor if I want. It looks like this:
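For anyone wondering what "lightly watchdogged" can mean in practice, here is a minimal sketch in Python. It is not my actual script, just the general shape: run a command, log every exit with a timestamp, and start it again. The function name, the log format, and the `max_restarts` and `backoff_seconds` parameters are all made up for illustration; the model path in the usage comment is a placeholder.

```python
import subprocess
import time
from datetime import datetime


def watchdog(cmd, log_path, max_restarts=None, backoff_seconds=2.0):
    """Run cmd, log each exit, and restart it.

    With max_restarts=None the loop runs forever (the normal service case).
    With max_restarts=N the command is run at most N + 1 times in total.
    Returns the list of exit codes observed, oldest first.
    """
    exit_codes = []
    while True:
        # Blocks until the child process exits or crashes.
        result = subprocess.run(cmd)
        exit_codes.append(result.returncode)

        # Append one line per exit so crashes can be counted later.
        with open(log_path, "a", encoding="utf-8") as log:
            stamp = datetime.now().isoformat(timespec="seconds")
            log.write(f"{stamp} exit={result.returncode} cmd={cmd!r}\n")

        restarts_so_far = len(exit_codes) - 1
        if max_restarts is not None and restarts_so_far >= max_restarts:
            return exit_codes

        # Short pause so a command that dies instantly does not spin the CPU.
        time.sleep(backoff_seconds)


# Hypothetical usage, one call per tmux pane:
# watchdog(["llama-server", "-m", "/path/to/model.gguf", "--port", "8080"],
#          "/var/log/llama-watchdog.log")
```

The outer layer (systemd restarting the whole service) stays as described above; this only covers the per-script loop inside the tmux session.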

Point being, I treat both inference and embedding just as an OS service that’s always running in the background on different ports; I don’t really treat it like an app. The claw agent is also just a service, it’s always running on its headless machine, it hasn’t been offline since Feb. except to run updates (always an adventure with OpenClaw). The other machine with the coding harnesses is closer to “app” use as those really aren’t doing anything unless I log in to start/use them. But for the most part, when I said “mine is more like infrastructure” that’s what I meant. I don’t really play with models much. I did early on, but it quickly became clear that Qwen 122B is the hot target for balancing performance and quality for any VRAM quantity from 64GB to probably four times that. Nothing else really comes close.

Right now I’m running ROCm from the lemonade repository. At first I was compiling from AMD myself but that was a PITA to track releases and with stuff happening so fast, I’m happy to let someone else build for me so now I just run lemonade’s llama as they track the nightlies and get all of the latest merges and the build is stable and performant. The v620s aren’t the fastest things on earth for this role as they were clearly intended for partitioned/multi-tenant graphics workloads and lack P2P BUT they’re cheap as dirt and have 32GB VRAM each, so if you’re okay with 35 tokens/sec chat and 50-60 tokens/sec code, they’re a way to get to a 122B model locally without spending what it costs to buy a car.

Oh, I forgot, I also route the whole thing through LiteLLM, which runs on the same machine as the inference, in order to get a working version of /v1/responses/ rather than /v1/completions/ as responses caching behavior is more sane, and that’s important at 122B so that you don’t have to re-run the entire prefill each turn, as that would be painful, especially with claw-based agents as they run with like 50k context at the first turn (which is also why nobody wants to run them on paid tokens). Completions is supposed to be okay at this, but practice says it’s not, responses is better.

All of this said, there are also more changes coming. There is an X399 board and a Threadripper waiting to replace the i9-9900k momentarily so that even without P2P we at least get x8/x16 to CPU independently for each card, rather than x4 shared amongst the entire PCIe pool. I also have another v620 to add (for 108GB VRAM total, which will let me move from q4 to q6 on the 122B model and open up to the full 256k context). There will probably be two rounds of changes here. The 9900k -> Threadripper will be relatively easy on some upcoming Saturday. The third v620 is a bit harder, as I have to figure out how to power it. 😀 I have a 1200W PSU now and a fat case, but I may need a 1500W when all is said and done. I definitely will need more connectors than I currently have. Each GPU is 2x 12v connectors and of course the mainboard wants more, too, post switch. So… the problem isn’t the high tech stuff, as always these days the complexity is ironically in good old fashioned wires and amps.

But anyway, that’s all to-do.

Hope that’s helpful!

More on ROCm with dual v620s on Linux.

May 18, 2026
11:51 am

So the state of things in local inference on my non-sleek non-Strix Halo non-Mac Studio build is that I’ve settled on:

The dual Radeon v620s running the chat and orchestrator model, which right now is Qwen 3.6 35B A3B
The empty space on the RX 6700xt being used for local embedding model, which right now is qwen3-embedding

The reason for this is that basically the 6700xt is slower than the v620s, and we’re using layer splits (more on this in a moment) which means that when part of the pipeline runs through the 6700xt, it slows down the output. In fact, including the 6700xt loses me about 10t/s in inference speed while in practice only adding about 6-8GB to the inference VRAM pool (since it’s a 12GB card that’s also driving three displays).

With that said, I’ve continued to play with ROCm and try to learn WTF is going on, because:

ROCm works with a wider variety of components
ROCm holds out the promise of tensor parallelism (more on this in a moment as well)

And coming back to ROCm a few weeks later, with more experience under my belt, I’m better able to intuitively grok what’s happening and then confirm my suspicions as I go. So here’s what I’ve learned.

— § —

First, ROCm is about 30% faster than Vulkan on my hardware (dual v620s on PCI x16 that’s topologically just run through the Z390 chipset). That’s not nothing. So if possible, ROCm is to be preferred.

Second, ROCm is actually verrry ragged when it comes to stability. Things I started to suspect that I then was able to confirm with web searches (that LLMs didn’t proactively provide to me when I was trying to solve the ROCm problem before):

ROCm doesn’t want any card in the pool to run beyond about 85% VRAM usage as shown by rocm-smi. If you pass 85% you’re into crashy territory and if you hit 90% a crash is almost certainly imminent.
ROCm really isn’t built for llama.cpp or layer splits; it’s designed for tensor parallelism with a high degree of symmetry. Almost any tensor split value for layer splits other than 1:1 (i.e. split evenly across all cards) will cause big stability problems. As in, crashes every 2-3 prompts.
In general, ROCm also doesn’t really love MoE models, you’ll still get some crashiness even if everything else is perfect in most cases, but if you can solve everything else, it’ll be reduced, i.e. every few dozen prompts. This can be helped by running llama-server under a watchdog so that upon crash it comes back up and we continue seamlessly, just with a slower response for that turn.
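The 85% / 90% rule of thumb above is easy to turn into a quick sanity check before picking a quant or context size. This is only a sketch: the thresholds are my own observations on my own cards, not anything documented by AMD, the function names are made up, and the used/total numbers are meant to be typed in by hand from whatever rocm-smi shows (the sketch does not parse rocm-smi output).

```python
SOFT_LIMIT = 0.85  # past this, crashes start showing up (my observation)
HARD_LIMIT = 0.90  # at this point a crash is close to certain (my observation)

SEVERITY = ["ok", "risky", "crash-likely"]


def card_status(used_mib, total_mib):
    """Classify one card by the fraction of its VRAM in use."""
    fraction = used_mib / total_mib
    if fraction >= HARD_LIMIT:
        return "crash-likely"
    if fraction > SOFT_LIMIT:
        return "risky"
    return "ok"


def pool_status(cards):
    """Return the worst status across the pool.

    cards maps a label to a (used_mib, total_mib) tuple. The pool is only
    as stable as its fullest card, so the worst card decides.
    """
    statuses = [card_status(used, total) for used, total in cards.values()]
    return max(statuses, key=SEVERITY.index)


def usable_mib(total_mib, limit=SOFT_LIMIT):
    """VRAM budget that stays under the soft limit."""
    return int(total_mib * limit)
```

For a 32GB card (32768 MiB), `usable_mib(32768)` comes out a bit under 28GB, which is the number to plan model plus context around, not the full 32.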

What I once thought mattered that probably actually didn’t matter:

Being extraordinarily picky and suspicious about PCI-e address space under linux; letting Linux map it with 4G+ and BAR enabled is likely enough to address the cards.
ROCm versions actually don’t seem to matter that much, gfx1030 is pretty well supported.
All kinds of tweaks and environment variables that cause LLMs to repeatedly say “Aha, I found it! You need to…” but then don’t solve the problem.

Basically the two sins I was committing were:

Trying to squeeze the nicest quant and biggest context I could into the VRAM pool, i.e. if I had 85% full I was like “Oh I can get a bigger quant, I still have 10GB free!” (Nope, doesn’t work that way.)
Trying to run 3 cards in splits and trying to tune those splits so that all cards would fill up at the same time. (Miracle it ever worked at all while I was trying that.)

So while before I was trying to run Qwen 3.5 35B A3B at Q8_XL, and trying to tune --tensor-split so that all those % used counters matched exactly, now I’m strictly at 1:1 and I’ve had enough runs to see that if I set to anything other than 1:1 we’re essentially guaranteed to crash within the first three turns.

— § —

What still doesn’t work?

Sadly, vllm. It should be possible to run vllm with tensor splits, which would theoretically give me better multi-context and a better /responses/ API, but there was a regression in the most recent versions that causes it to punt unless you can stand up P2P between the cards.

And, just as importantly, P2P between the cards. Two things on that point:

I now understand that I have, fundamentally, the wrong CPU/mainboard architecture for this, because the Intel platforms at this price point only have 16 lanes to the CPU from PCI-e and only one slot runs direct; the rest of the x16 slots run through the chipset and are essentially x4 under the hood. So for a while, I was considering swapping out the Z390 and Core i9-9900k for an AMD Threadripper setup, though I think I’ve backed off of that. Threadripper gives each slot dedicated lanes to CPU. It also enables P2P between cards.
Happily (and unhappily), before I pulled the trigger, I learned that the v620 / Radeon Pro “Navi” cards were really for data center fractional gaming provisioning, and not machine learning workloads, and thus they actually lack the hardware for P2P anyway. Not the end of the world, especially when you consider the value of the price/performance here: I was able to put together 64GB of VRAM with compute and memory bandwidth that’s like double the speed of a 6700XT, and all for like $500 in cash. That’s a tremendously good deal, even if it won’t reach the same performance level as true machine learning / inference hardware.

Note that there may still be some benefit to Threadripper, even without P2P, as the dedicated x16 lanes to CPU for the two cards have far more bandwidth than the x4 lanes shared amongst the entire chipset-attached PCI-e bus (i.e. almost everything in the system that isn’t the 6700XT). However:

I’m not sure exactly how much benefit there will be to making that round trip happen on a true, unshared x16 pipe vs. an x4 pipe, so it’s hard to measure value or ROI.
The cheapo Threadripper on the market (i.e. X399/TR4, last generation) is only PCI-e 3.0 which has half the bandwidth of PCI-e 4.0. So I’m not that inclined to shell out for PCI-e 3.0 for undetermined benefits, but I’m also not inclined to shell out for 4.0 at a much higher price for, still, undetermined benefits. So we wait.
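To put rough numbers on the x4 vs. x16 and 3.0 vs. 4.0 question, here is the back-of-envelope arithmetic as a small Python sketch. It uses the published per-lane signaling rates (8 GT/s for PCI-e 3.0, 16 GT/s for 4.0) and the 128b/130b line encoding both generations share. These are theoretical one-direction ceilings, not measurements from my machine, and the function name is just for illustration. It says nothing about how much of that bandwidth layer-split inference would actually use, which is the part I can’t determine.

```python
# Per-lane raw signaling rate in gigatransfers per second.
GT_PER_SEC = {3: 8.0, 4: 16.0}

# PCI-e 3.0 and 4.0 both send 130 bits on the wire per 128 bits of payload.
ENCODING_EFFICIENCY = 128 / 130


def pcie_gbytes_per_sec(generation, lanes):
    """Theoretical one-direction bandwidth in GB/s for a PCI-e link."""
    gbits = GT_PER_SEC[generation] * ENCODING_EFFICIENCY * lanes
    return gbits / 8  # bits to bytes


for gen, lanes in [(3, 4), (3, 16), (4, 16)]:
    rate = pcie_gbytes_per_sec(gen, lanes)
    print(f"PCI-e {gen}.0 x{lanes}: {rate:.2f} GB/s")
```

So a dedicated 3.0 x16 slot has about four times the ceiling of an x4 chipset link, before even counting that the x4 link is shared with everything else hanging off the chipset.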

— § —

So that’s the state of things. If I had it all to do again, what would I do differently?

Get on Threadripper at the last rebuild (when I moved from an i7-3770k to the i9-9900k and to the Z390 chipset). I was tempted, but I stuck with Intel for the faster single-thread interactive (web, photos, etc.). Who knew that a few months later LLMs would hit the mainstream? But in any case, the AMD platform is obviously better for local inference; Intel consumer is hobbled.
Consider a different family of retired server hardware (Instinct or similar) on eBay. The AMD data center hardware is still the right move; it’s dirt cheap and readily available if you’re willing to do a bit of hacking. However, for inference, having more modern hardware with faster compute and higher bandwidth is offset by the ability to run P2P with tensor parallelism on slower, cheaper cards. So there’s no reason, if you’re doing multi-card, not to go for the slower, cheaper, older hardware, which, since you’re able to run P2P with parallelism, will end up at the same speed as a couple of v620s that can’t.
Not bother replacing the old RX480 with a 6700XT, since the RX480 could also have run an embedding model and it proved not to be practical or worthwhile to bother with adding the 6700XT to the pool. From the outside before this all started I was thinking, in part with help from LLMs, that it would be good to have three cards that were the same compute architecture (Navi / gfx103x) and the 6700XT with 12GB would add yet a few more GB to the pool. In practice, the LLMs were exactly wrong; there is basically no advantage to the 6700XT and adding it to the pool makes things either slower or less stable or both.
Not listen to LLMs so much or use them for search so much. My real unlock came when I started to Google search and skip past AI results. AI has a lot of opinions about AI, but they’re all wrong. Even when you ask it to do web search. Better just to hang out in the repos and on GitHub and read the interactions.

And finally, for anyone looking to run v620s on Linux for inference, my kernel command line is:

pci=realloc,earlydump amdgpu.gpu_recovery=1 amdgpu.noretry=1 amdgpu.ras_enable=0 amdgpu.mcbp=0 iommu=pt intel_iommu=on pci=big_root_window pcie_aspm=off amdgpu.runpm=0 pcie_port_pm=off amdttm.pages_limit=16777216 ttm.pages_limit=16777216 amdttm.page_pool_size=1048576 ttm.page_pool_size=1048576 amdgpu.gartsize=4096

Pair this with BIOS settings that enable addressing beyond 4GB and that enable BAR and VT-d/IOMMU and they’ll get seen. Crazy to remember that I spent the first day just trying to get the cards to (first off) post, and then after that, (next) be seen by the Linux kernel.
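Since a typo in a bootloader config can silently drop a parameter, it can be worth checking what the running kernel actually received. Below is a minimal sketch that compares a command line string against a list of expected parameters. The function name is made up, and the list of "required" parameters in the usage comment is just a handful picked from my line above for illustration, not a claim about which ones are strictly necessary.

```python
def missing_params(cmdline, required):
    """Return the entries of required that are absent from cmdline.

    cmdline is the full kernel command line as one string. Parameters are
    matched as exact whitespace-separated tokens, so "iommu=pt" does not
    match "intel_iommu=on" or "iommu=off".
    """
    present = set(cmdline.split())
    return [param for param in required if param not in present]


# Hypothetical usage on a running Linux system:
# with open("/proc/cmdline", encoding="utf-8") as fh:
#     running = fh.read()
# print(missing_params(running, ["iommu=pt", "intel_iommu=on",
#                                "amdgpu.runpm=0", "pcie_aspm=off"]))
```

An empty list back means every expected parameter made it through to the kernel.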

I’ve learned a lot. Not sure how transferrable it is, but it’s nice to be in a space where the smoke has cleared.

— § —

Bonus note:

I actually can run Qwen 3.5 122B A10B well on the two v620s at (say) Unsloth UD_IQ3 and I like its output a lot, better than Qwen 3.5 A3B at Q6_XL. So if you’re wanting to run a “big” model like that (at least, big for home office purposes), it’s totally doable. I get about 27 tokens/sec on inference, which is quite respectable. I have to do it with Vulkan, though, where I can push the memory use right to “full”; on ROCm we just don’t have enough space given that ideally we need to stay below 80-85% use for stability purposes, and I don’t want to go more compressed than Q3.

Thing is, Qwen 3.6 35B A3B at Q6_XL with ROCm delivers ~55 tokens/sec, no MTP. Twice as fast. It’s really, really hard to sit and be patient for 122B when 35B is twice as fast and still… acceptable. So that’s where I am now. But if you’re wanting to run 122B or similar biggish MoE, UD_IQ3 and 27 tokens/sec is pretty damned good.

College econ not really getting me there.

May 17, 2026
11:16 pm

There’s this discourse, which picked up a bunch during the COVID years, about how the most essential workers in our society earn the least, and people then debate the value of the CEO or the elite white collar tech worker.

I’ve never been an EMT or a grocery store clerk, but there are a decent number of other things that I’ve been, in some splits that are maybe not common. For example, I’ve been both an author and a professor, each for a number of years. I had to stop doing both of these things because, with very few exceptions, it’s simply not possible to earn a living doing them. They don’t pay a living wage. In some cases, they don’t pay even half a living wage.

The advances for my books were each on the order of $1,500 to $3,000 for trade nonfiction, and with the total sales life of a trade nonfiction book being a year or two and the total sales if you’re lucky being numbered in the tens of thousands if you really do well, all you had to do is write 50 books a year to make a living.

Similar story in academic life. You’re expected, as a matter of course, to publish. A lot. One paper can be as difficult as, if not more difficult than, writing an entire nonfiction trade book. And yet in academics, the pay for publishing is… zero. Zilch. Nada. You get the dollars for the classroom side of things. Which, for the only tenure track offer I ever got, was $30,000 per year for a 5/5 (basically, you spend your entire waking life either in the classroom or awake at home at 3:00 am grading homework), with the chance to earn as much as $48,000-$50,000 if I made tenure in a decade by somehow publishing a ton.

I stopped doing these things because there is literally no way to make a living doing them for most people. They’re pastimes for the already wealthy.

And yet, they are objectively and subjectively the most consequential things I’ve ever done, or will ever do. I still hear from students who say I changed the trajectory of their lives, and that my classes were the most informative that they ever took. And I still hear from people who have read my books. Some of them are the only book on a given subject, and are in the Claude Anthropic settlement (i.e. what the AI knows about that topic… it learned from me).

The things that I’ve done since then have no lasting importance. My first book was in 1997 and is still of import today. Meanwhile, the stuff I do now is generally obsolete and discarded within 3-6 months at most, and only a handful of people will ever see it, and it holds no particular importance for humanity. Yet it pays multiples of anything I could ever have earned writing or teaching.

There’s a sort of econ 101 logic or boilerplate analysis about this that says that people like fulfilling work, ergo there’s a surplus of labor for it, and thus it pays nothing. No, it’s not about demand alone because in fact there has been captive demand, say, in higher education, with infinite government subsidy, for a good long time. But all of those dollars went to administrators and executives of various kinds, and basically none of it went to the instructors.

Similarly, the Claude Anthropic settlement lists 500,000 works (seven of mine among them) that were used to train the AI. The estimated market value of Anthropic at the moment is around $380 billion. If we get really conservative and say that these 500,000 books are only 5% of its value (which I think is a ridiculously small number, given the fact that people expect AIs to give authoritative answers), and that the training and knowledge are only 25% of the value of the AI (also ridiculously low, but just for effect), then that’s just short of $10,000 of value per work, or $70,000 in value for my works. And of course Gemini and OpenAI were both trained on similar corpuses and are worth significantly more, so if we just lowball it and say they have the same value, that’s like $200k in value.
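For anyone who wants to check the napkin math, here it is spelled out as a few lines of Python. Every input is one of the deliberately lowball guesses from the paragraph above, not a real valuation, and the function name is just for illustration. Integer arithmetic is used so the dollar figures come out exact.

```python
def value_per_work(market_value, training_share_pct, corpus_share_pct, works):
    """Dollars of company value attributable to each work, rounded down."""
    training_value = market_value * training_share_pct // 100
    corpus_value = training_value * corpus_share_pct // 100
    return corpus_value // works


per_work = value_per_work(380_000_000_000, 25, 5, 500_000)
mine_one_company = 7 * per_work            # seven works in the settlement list
mine_three_companies = 3 * mine_one_company  # same lowball value at each of three

print(per_work)              # 9500
print(mine_one_company)      # 66500
print(mine_three_companies)  # 199500
```

Which is where the "just short of $10,000 per work" and "like $200k" figures come from.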

So it’s not that academic work or writing work isn’t in demand. Just like it’s not as though EMTs or grocery store clerks aren’t in demand. I know, this is kindergarten stuff. Back to supply again. People are just willing to do it for less.

The thing that dime store econ can’t really tackle is the philosophical problem here, something that seems a defect in our society. We leave these things that really matter, and that are very in demand, to just be compensated on a supply/demand curve, so that people really can’t make much of a living doing them (or, in the case of academic work, or authorship, or teaching grade school for that matter, can’t make a living doing them at all). So what you get is high turnover and uneven quality.

I guess the thing I’m getting at is that there is a gap in the demand world, and it matches the enshittification of everything else. See, the demand isn’t for books, it’s for accurate, useful books. It’s not for academics, it’s for inspiring, mentoring academics who are legitimate experts. The demand isn’t for kindergarten teachers, it’s for good kindergarten teachers. This is the part that the econ books tend to gloss over, because it’s inconvenient.

The public doesn’t hire all of these functions. Book buyers don’t get to hire book writers, and parents don’t get to hire teachers. And this is where the moral problem comes. Because over and over again, the public is frustrated. Why are the experts wrong? Why does my grade school suck shit?

It’s because you didn’t get what you bargained for, what the demand was for, what you paid taxes for or bought the book for. Instead, you got the bad teacher, or the bad expert. Why? Because we won’t pay more. Why? Because someone, somewhere in the chain, and usually really the entire top half of it, is getting nice budget numbers for their PowerPoint decks by saying “we only need to pay X” and eliding the fact that what they’re doing when they pay X is, basically, scamming the public by taking their money and delivering a fake.

Of course I can hear the public school people freaking out now saying it’s a funding problem, but relax, the “up the chain” people here are the district level admins and union folk and of course the senators and congresspeople who once again are in it for themselves and won’t do the hard work of telling the public the truth.

At some level, the reason our economy is broken, and the reason our teachers suck, and the reason the experts are so wrong, is that we’ve had a moral collapse in our civilization. There’s no more Wilford Brimley voice coming out of people saying “I’m sorry, I’m not going to do that, it wouldn’t be right.” Everyone is willing to compromise to pad their own stats. Everyone is out for themselves. Nobody will pay more to do it the right way.

I can hear all the capitalism free market people here wading in trying to figure out if I’m just a Keynesian or if I’m a full on commie but the thing is, I’m old. I was alive in the ’70s and ’80s. And literally, literally you would hear people who could cut corners on a deal, or advertise and sell an inferior product, say things like “well, I know I could make a lot more money that way, but it just wouldn’t be right.” Or “I could claim to be the best in my yellow pages listing, but that would be dishonest, there are better than me in this city, but I don’t charge quite as much.”

That energy is gone. And that’s the point at which capitalism and free enterprise lose the public.

I don’t know, this is a nonsense rant from a non-economist that will no doubt cause a bunch of people to call me an idiot. But it’s not really about economics. It’s really me saying that once upon a time, people didn’t seize their full advantage because it “wouldn’t be right” and people cared a lot about “doing the right thing” and just as importantly, they knew that the “right thing” was not always the “most profitable” thing. This was lost, I think starting with Reagan, to market ideology that says that whatever the market does is inherently right, because Adam Smith is god.

That puts capitalism really on the same footing as communism; the world of men ceases to be a space of moral agency and responsibility and is instead just a place where you throw up your hands and say “I don’t have any choice in the matter, it’s all laws of history!”

I’m here to call bullshit on that. And really what this post is all about is just me reflecting on how stupid it is that the most important things I’ve ever done, that contributed the most to society, were the least well compensated, and as a result society lost my labor (and the labor of many others) doing them. Which is dumb. And no, don’t do the thing I just criticized and say “well if you were all that the market would have rewarded you.” Because YouTube is fucking full of worthless streamers who would improve society by dying, yet who are making absolute bank. The market has no morality.

Humans have that.

Well… had that.

2026 is strange and probably 2027 will be stranger.

May 16, 2026
6:53 pm

So it’s been a long time since I sat in silence and made a blog post in the middle of the day. Maybe even years. Hard to say.

Thing is, there are so many forces militating against posting on your own blog these days. Or at least my own blog. As in:

I’m a parent with two teens == busy, busy life
Work wants 60+ hours a week from me and mostly gets it
Almost any platform you use for anything has some sort of chat, commenting, or reviews that eats what you have to say about many things in life
Now there’s also AI, with which you end up getting conversational despite yourself

So you don’t really have time to breathe, and then the things you think about your stuff go on Amazon reviews and the things you think about politics go on X or YouTube, and the questions you have and reflections you have are accidentally pounded out into GPT, or Claude, or Gemini, or OpenClaw on local inference (oh yes, I “have” all of the above) because you’re co-working with these things and then it’s like chatting with a co-worker.

And so, at the end of the day, when you finally get a second, your brain is glazed over and inaccessible on the one hand but that’s sort of inconsequential because on the other hand, it’s basically empty.

I don’t exactly know what’s different about today. I think I’m just getting older and grumpier and some of this stuff is starting to break down because I begin to feel like I (and many others) have been fully “virtualized” and I don’t like that. I think the closer you get to your own mortality, the less you like the idea of “virtual you.”

I mean, dead is dead, but data is certainly more dead than, say, a corpse. A corpse at least lies there for a few years. Your immortal soul, if such exists, is eternal. Your physical possessions are good for decades, or even generations in some cases, as long as your offspring see fit to continue to pass them down.

But virtual stuff? Made of bits? Anyone who has worked in software, and then looked back at the last five years of work and realized that unlike many others they’re not building a “body of work” and in fact the things that they have built usually only live for 3-6 months before zapping back out of existence again, understands that a “virtual you” exists in the same way that ClarisWorks or Netscape or the Metaverse exist, which is to say, not at all.

— § —

Meanwhile, on the point of local inference, the weird thing is that now that I have it set up and fully deployed and working well and robustly configured as a systemctl service pointing to LiteLLM as a fellow systemctl service with watchdogging and dashboarding and blah, blah, blah and calling tools and doing research and writing code, I don’t feel like I want to use it for anything.

I have this weird impulse to maybe just put all of it to bed and go outside and make three-legged stools.

I’d love to say that at least the experience was worth something, as in maybe it’s a business model or a useful skill to go and build people out local inference machines with a bunch of stacked GPU cards on PCI-E X16 in Linux with some sort of repeatable deployment package, only despite claims of “shortages” and “supply chain trouble” over the last couple of years the channel is already absolutely full of purpose-built local inference computers that are effectively the next generation of PCs and that already make my local setup with a big fat case and three GPU cards just to get to 76GB VRAM look pretty ridiculous. Hello, Strix Halo and Mac Studio.

Once upon a time I’d have been excited about all of this but now as a person with student loans that I will not pay off within my lifetime who is in the process of prepping to leave SAVE, it all just seems dumb.

The social contract was never really that great, but now it’s pretty much a scam. And, as time has gone on, we’ve gone to this kind of post-linguistic-turn version of The Matrix in which we’re all farmed, yes, but we’re actually being farmed without compensation or freedom for our words, because it is words that power the economy for the billionaires.

But I digress.

— § —

The other thing worth mentioning is that we’re all lonely out here. Funny thing. I have all these friends in my age group, fellow Generation Xers, who I sometimes talk with.

With a single exception, we are all single, we all don’t/won’t date each other, we all regret just how disconnected and isolated everyone has become, how hard it is to make friends, how hard it is to find significant others, and how easy it is in the age of endless consumer life+self customization to arrive at the point at which you basically can’t really get along with other people as anything other than utilities anyway, in the same way that we are utilities for the billionaires.

If we had any guts, we’d all do what the hippies did and carry out the equivalent of a “turn on, tune in, drop out” move, only we apparently don’t have the guts so we all call our friends and talk about how we don’t have any friends with them and bemoan the fact that there’s no one to date and the fact that you can’t really date anyway because it just makes you hate people and realize how much you’re destined to be alone.

This is not the natural state of things. I don’t know whether it’s unique to Generation X, but I think not. From everything I hear, other generations have their versions of the same thing, even if it’s mostly not identical.

People say that social media and technology and wealth inequality have broken the social contract, but in fact what they have broken is us; the social contract’s failure is collateral damage farther down the line, as what a bunch of broken us voted for.

To fix us we would have to kill off half of tech and most of modern convenience and now of course AI, and lose the activist ethos that has basically destroyed everyone and everything, and just quit and be normal and talk to each other like people.

But fucking what?

Be normal and talk to each other like people?

No fucking way, we’d rather die.

— § —

Such is life in 2026. So instead, we’ll still die, but we’ll just do it alone. We’ll only talk to our friends when it doesn’t matter. We’ll only date people so long as we don’t care about them. We’ll only socialize so long as it’s with strangers, around banal, meme-land topics. And we’ll only vote for candidates we don’t respect.

Because this is America, and because we’re all modern, well-educated people.

America has become a terrible place.

May 12, 2026
8:24 pm

I am coming to hate my country. This is not a Democrat thing or a Republican thing; they’re roughly the same. It’s just that we hate each other. We all know that we hate each other. Nobody hates Americans so much as Americans do.

It has become intolerable to live here. Everybody’s busy complaining about Trump right now, but the fact is that we long ago crossed the Rubicon; it’s got nothing to do with immigrants or race, really. You can be white and your neighbors can be white and you can be roughly the same class but you hate each other. You hate each other for existing, and you ultimately harass each other, whether directly or through voting or through rumors.

It’s not pleasant to live here. There’s no community. There’s no comity. There’s just hate. And it’s getting worse.

I’m as American as they come. And I’ve traveled internationally. I have no illusions that I will ever be anything other than an American.

But boy do I hate Americans. And they hate me.

We all hate each other.

Rest of the world, don’t be offended. However much you think we hate you, I can assure you that we hate each other more.