The User-Agent, a technical name for the web browser, was supposed to be our loyal butler on the internet – doing our bidding and keeping us safe. But over decades, browser makers diluted its role, sometimes as a result of actions done with the best of intentions. Both the internet and browsers have become increasingly complex. And with that complexity, the custodianship of the users’ data has become murkier.
Despite being the one peddling our data, user-agents can no longer tell us exactly who knows what about us. We’ve lost our agency over our browsing data. And now our dude on the internet is a “user-agent” in name only.
We will soon have to confront an objectively more chaotic creature – an AI-powered personalized user-agent. Personalized meaning it knows a lot about us and will continue to learn about us. User-agent meaning it will take actions on our behalf. And, AI-powered at this point means that nobody really knows how it works. Let’s talk about what is about to happen and how you can prepare.
The User-Agent is supposed to be your buddy.
Information is intangible – you can’t stick your finger in it. So whenever we want to interact with a system that processes information, especially of the digital kind, we need some stand-in for our finger so that we can poke things.
So, people came up with the concept of a user-agent – to refer to the device[1] that people interact with. People would push buttons, make holes in paper, turn knobs, or do whatever they needed to do to communicate with the user-agent. The user-agent grasps what the person wants, and does what it needs to do to make the person’s wish a reality. On the flip side, the user-agent also communicates results back in a form that the person can understand.
Early user-agents were simple creatures with no intent of their own. It wasn’t that hard for people to look inside them to understand what they were doing. There was no mystery. The user-agent did what was asked, and earnestly reported back on what happened.
For example, in an email system, the user-agent is called a mail user agent (MUA). A user-agent that interacts with the web on our behalf should be called a web user agent (WUA)[2]. But because web user agents, which you might refer to as web browsers, are so dominant, people dropped the superfluous “web” part.
MUTT(1) User Manuals MUTT(1)
NAME
mutt - The Mutt Mail User Agent
SYNOPSIS
mutt [-hNpRxZ] [-s subject] [-c cc-addr] [-a file] ...
[-F rcfile] [-H draft] [-i include] [-f mailbox]
[address] ...
DESCRIPTION
Mutt is a small but very powerful text-based MIME mail client.
Mutt is highly configurable, and is well suited to the mail
power user with advanced features like key bindings, keyboard
macros, mail threading, regular expression searches and a
powerful pattern matching language for selecting groups of
messages.
"All mail clients suck. This one just sucks less."
- me, circa 1995
The Mutt Mail User Agent.
This was fine and dandy. People would fire up their user-agents, type commands, and the user-agent would do stuff and show the user results. But over time, these user agents evolved from simple command-line tools into much fancier interactive, graphical browsers and services. Modern (web) user agents are not just programs; they are entire platforms[3].
Modern User-Agents are technical marvels.
Since the same internet needs to be accessible from different user-agents in perpetuity, the way web pages are written and made available on the internet had to be standardized – i.e. everyone needed to agree on the how. But these technical specifications, or standards, grew so large and so demanding that only a few well-resourced tech companies can write and maintain a full-featured UA[4].
Though voluminous, these standards are neither perfect nor complete. But they have been crafted through person-millennia of effort to ensure security and correctness. Organizations like the following work tirelessly to get the details right:
- World Wide Web Consortium (W3C) is the primary group that governs how web content is written.
- Web Hypertext Application Technology Working Group (WHATWG) is the group that pushes the boundaries of what is possible to do inside the confines of a web browser.
- Internet Engineering Task Force (IETF) is the group governing how data moves about on the internet, and also how to keep your content secure while it is being moved about.
- The Unicode Consortium is responsible for which letters and symbols you can use on the internet. They are the people who bring you new emoji every year.
… and many others.
But they are bound by the same implicit social contract …
Beyond technical requirements, there is an increasingly important social contract: That the user-agent always act on behalf of the user.
That means that the UA should do what you ask, and generally not do what you don’t ask.
The latter (don’t do what the user didn’t ask) is a little complicated because of how much goes on behind the scenes on the modern web.
Say you ask to see bank.com, which is a bank. So the UA talks to the bank.com server and asks for the contents of the page. The contents say that some of the pictures on the web page need to be fetched from advertiser.com. But doing so involves sending a bunch of information to advertiser.com, like the fact that you are viewing bank.com, your preferred language, and whether you’ve been to these sites before.
Pretty much all UAs will opt to fetch the resources from advertiser.com, because otherwise the bank.com site will not work correctly. Since the user requested to see bank.com, clearly they intended for all the required stuff to get downloaded too, right? Except, it’s likely that the user never intended for advertiser.com to learn all that information about them.
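To make that concrete, here is a rough Python sketch of the headers a UA might attach when it fetches the picture from advertiser.com. The function name, the shape of the cookie store, and the example values are all made up for illustration. It assumes the UA still sends third-party cookies and applies the common `strict-origin-when-cross-origin` referrer default, which trims the `Referer` down to the page’s origin. Real browsers differ on both points.

```python
from urllib.parse import urlsplit


def third_party_request_headers(page_url, resource_url, accept_language, cookies_by_host):
    """Sketch of the headers sent when a page embeds a resource from another site.

    cookies_by_host is a hypothetical stand-in for the UA's cookie store:
    a dict mapping a hostname to the cookie string previously set by that host.
    """
    page = urlsplit(page_url)
    resource_host = urlsplit(resource_url).hostname

    headers = {
        "Host": resource_host,
        # Assumed policy: cross-origin requests only reveal the page's origin,
        # not its full path or query string.
        "Referer": f"{page.scheme}://{page.netloc}/",
        # The user's language preference goes along with every request.
        "Accept-Language": accept_language,
    }

    # If the third party has set a cookie before, it can recognize a repeat visitor.
    cookie = cookies_by_host.get(resource_host)
    if cookie:
        headers["Cookie"] = cookie
    return headers


headers = third_party_request_headers(
    page_url="https://bank.com/accounts?id=7",
    resource_url="https://advertiser.com/banner.png",
    accept_language="en-US,en;q=0.9",
    cookies_by_host={"advertiser.com": "visitor_id=abc123"},
)
for name, value in headers.items():
    print(f"{name}: {value}")
```

Even with the path trimmed off, advertiser.com still learns that someone on bank.com, with this language preference and this returning-visitor cookie, just loaded a page.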
That the User-Agent must act ONLY on behalf of the user.
Herein lies the problem.
The modern web is a marvel in that it exists despite brutally conflicting interests, and is run by people who do not trust each other one bit[5].
Do you really think the nice lady who made you read her life story before she told you how to make simple pancakes also made an ad for weight loss drugs? Of course not.
Behind the scenes, a “modern” web page is a warzone where multiple companies are playing tug of war with your data and limited attention span. Not only are these companies clamoring to show you video ads while you are trying to make pancakes, they are also trying to figure out where you’ve been and what you are into.
You, the user, don’t want any of that; not even the nice lady’s life story. You just want to make pancakes. A UA that acted only on your behalf would have just pulled out the recipe part and shown it to you.
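A toy version of such a UA is easy to imagine. The Python sketch below keeps only the text inside an element marked up as a schema.org Recipe and ignores everything else on the page. It assumes the publisher used that microdata markup and that the HTML is well-formed (every non-void tag is closed). The class name `RecipeOnly` and the sample page are mine; real reader modes rely on far more involved heuristics.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class RecipeOnly(HTMLParser):
    """Collects text found inside an element whose itemtype is schema.org/Recipe."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # nesting depth inside the recipe element; 0 means outside
        self.chunks = []  # text fragments collected from the recipe

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif (dict(attrs).get("itemtype") or "").endswith("schema.org/Recipe"):
            self.depth = 1  # entering the recipe element

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1  # reaches 0 when the recipe element closes

    def handle_data(self, data):
        text = data.strip()
        if self.depth and text:
            self.chunks.append(text)


page = (
    "<article><p>My grandmother once told me...</p>"
    '<div class="ad"><img src="https://advertiser.com/a.png"></div>'
    '<section itemscope itemtype="https://schema.org/Recipe">'
    "<h2>Pancakes</h2><ul><li>1 cup flour</li><li>1 egg<br></li></ul>"
    "</section><p>Subscribe!</p></article>"
)
parser = RecipeOnly()
parser.feed(page)
print(parser.chunks)
```

The life story, the ad image, and the subscribe prompt never make it into `chunks`; only the recipe text does.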
In the short term, rewriting the web like this sounds great. But in the long term, those companies – who now have no one to show ads to – will stop paying the nice lady. And the nice lady will stop publishing her recipes. No ads, no recipes[6].
So to keep the web open, the UA does a tricky balancing act between conflicting interests; one that is not weighted 100% in your favor. The UAs do their level best to present you with the most authentic web, while trying to keep you safe from potential attackers, without divulging too much about you to strangers. But this situation does mean that the “user agent” isn’t truly just the user’s agent.
And it’s not just users vs. publishers and advertisers. There are many other influential groups on the web making the balance of interests difficult. Some are:
- You. Included here for completeness. You want to visit websites and be presented with accurate reproductions of what the publishers intended … kind of. You don’t want to be bothered by stuff that you don’t want, like advertisements. You also want to be able to consume content freely and without boundaries. You want a free and open web.
- Publishers, who make content on the web and give you a reason to visit it. Plenty of not-for-profit content has no commercial interests behind it, but for publishers that do, it is important that they are able to make money publishing stuff. They also share many interests with users, such as being able to publish what they want and to reach their intended audiences. In addition, sometimes they want to be able to publish things without identifying themselves.
- Advertisers, who compete for your attention and make money by influencing your behavior. One of the ways in which they compete for your attention is by “renting” space alongside publishers’ content.
- Search engines could be considered a kind of publisher, but they are special enough to get their own category. Search engines (at this point, mostly Google) act as guides on the internet, helping users find the content that they want. For this to work, search engines rely on the internet being open and free. Otherwise it would not be possible to take users to where they want to go. Since search engines are one of the primary tools users need on the internet, if not the primary one, they are always integrated deep into the UA.
- Copyright owners and content rights holders, who are worried that people on the internet might steal their content. So web browsers have to incorporate Digital Rights Management components into the UA to limit what users can do with protected content.
- Businesses, who want to make sure that you continue to trust the internet as a safe place to buy and sell stuff. So web browsers sometimes have to incorporate security measures (which actually help users a lot too) and easy-to-use payment methods.
- Politicians. Yes, those people. As much as we would like to treat the internet as some sort of ungovernable utopia, it isn’t. There are many laws and regulations that differ across political boundaries. So sometimes UAs need to report where you are physically so that your activity can be kept within legal bounds. Politicians and businesses often also want to control what you are and aren’t able to access on the internet. However, they take those issues up with publishers and internet service providers. Additional shady things politicians want include finding out the real identities of people who have said things they don’t like.
  Shady stuff aside, politicians are also very useful. In fact, as we will see below, legislation is pretty much the only way in which users can influence the big UA makers.
- Internet Service Providers (or ISPs), who make money by making it possible for you to access the internet. You give them some money, and they give you internet access. Seems simple, but there are lots of shady things that ISPs want, like charging you based on what you consume. See network neutrality.
And that is hardly a complete list, but I believe we covered some of the biggest influences.
But how do we know how the UA balances conflicting interests?
But how do we know that the UA isn’t completely complicit and isn’t selling our furniture without our knowledge? Why would we believe the UA is doing what it says it is doing? What does “too much about me” even mean?
Well, back in my day one would crack open the UA and take a look inside. Usually, things were simple enough that you could convince yourself, or someone you trusted could convince you, that it was in fact doing what it said it was doing.
But modern UAs are so complex that such human inspection is impractical at best. Not only would you have to inspect millions of lines of code, you would have to inspect every new change as it comes in. Even then, there are parts of most commercial browsers that are closed source (i.e. they won’t let you look inside).
So most of us have to trust the judgment and moral compass of the thousands of specialists poring over the underlying logic.
It matters who owns the User-Agent.
We’ve established that modern UAs…:
- … are super complicated.
- … have to balance conflicting interests of the user, publishers, advertisers, search engines, other businesses, copyright owners, politicians, internet service providers, and others.
- … are not designed 100% in your favor.
Despite vigilant oversight, users still have to trust, without verification, how a UA works.
If you don’t trust one, you can always pick another. But you can’t avoid taking someone’s word that the UA is safe. Because of how little control we have as users over browsers, the only real way in which users can influence UAs is via legislation that holds UA owners accountable for their design decisions.
[1] I’m using the term “device” loosely here because the user agent often consists of an entire stack of hardware and software. In the literature, you’ll find the terms “device,” “machine,” and “system” used interchangeably, depending on the context.
[2] The term web-user-agent doesn’t appear much in technical literature, with rare exceptions, e.g. (Yasskin & Capadisli, 2026).
[3] What’s the difference between a program and a platform? The latter is something upon which other programs can run. While the original world-wide-web was document-oriented (i.e. the “internet” consisted of documents that linked to other documents), as technology progressed, the web browser became more and more capable of presenting complex behaviors. At the start they were just aesthetic – like blinking text or changing how a link looks when you hover the mouse cursor over it. But over time, browsers have gained capabilities that rival native applications. Look at all the APIs that are available inside a browser.
[4] I worked on Google Chrome for over a decade on the networking stack, and briefly on the renderer. The amount of documentation that it takes to describe how a web browser works is on the order of tens of thousands of pages.
    Even that is an understatement. The web works despite countless quirks and misbehaviors. That’s because modern browsers handle those quirks “correctly” to give the impression that everything is fine. These quirks and how to handle them aren’t necessarily written down anywhere. Browsers can’t just refuse to talk to large portions of the internet or refuse to render millions of older web pages just because the code wouldn’t look good.
[5] Pun intended.
[6] This ad-supported model of the open web sounds terrible; and in some ways it is. There are some alternatives that have been proposed over the years, but none have been as successful as ads. Many approaches involve internet users making payments of some kind to publishers. Unfortunately, all such models have the downside that those who are impoverished will get kicked out of our global information village. So for now, this is the best we can do.
    AI will probably do in a few years what privacy advocates have tried to do for decades, which is to kill the ad-supported web. As yet it’s unclear what will replace it, and whom this replacement leaves behind.