Why are we letting domain owners dictate the internet experience?
A proposal for an internet more friendly to us, the people
Before we delve in here, I want to quickly note for readers of my last post that I am still working on Part II of the AI series, but it turns out to be harder to define AI well than I expected. Rest assured I have not forgotten about it! In the meantime - let’s talk about how the internet could be made better.
The internet is an abstract and artificial creation, which makes it very difficult to comprehend without the use of analogies: "buttons", "chats", "storefronts", etc. Like all analogies, these are used because they are helpful for human understanding, but they are also imperfect, and in certain ways misleading. They give us a place to start discussing these virtual creations of ours, a way rooted in things with which we are already familiar, which is very helpful. But occasionally, it's worthwhile, perhaps even critical, to step back and examine the inaccuracies of these analogies in order to identify ways in which they may have misled us.
Of all the analogies we use to describe the internet, one of the most ubiquitous is the foundational "webpage" or just "page". And it is this object (or should I say "object") at which I'd like to take a closer look.
So, a little review - what exactly is a webpage? Well, painting with broad strokes, a webpage is a collection of stuff, historically text and links, then later images and videos and a whole bunch of other stuff, put together to be viewed as one coherent whole and traditionally viewed within the context of a larger web “site” or domain, which is composed of pages in much the same manner as a book or a newspaper is composed of pages.
How is this accomplished? Well, people have a sort of tool (really it's just a program they run on their own computers) which they use to interact with the network of computers we call the internet; this user tool is called a browser. Using their browser they "request" a given webpage, and this request is routed through some internet wizardry to another computer which has the contents of the requested page stored on it and which is running a program called a "server" which sends said contents back to the user's browser to be displayed visually to the user.
To get a little more into the details, typically servers send back three things: “HTML”, “CSS”, and “JavaScript”. (All my less techie readers, stay with me, I’ll get back to reality shortly!) The HTML is basically the content of the page, the CSS is some rules about how to display it (colors, sizes, etc.), and the JavaScript is sort of like the machinery of the page, allowing it to move and respond to user interaction. The browser actually does a tremendous amount of work to "interpret" all of this stuff, such that a bunch of text code can become the amazing, engaging, powerful website experiences we are used to having.
But here's where the page analogy really breaks down: a physical page in a book or newspaper has a definite kind of integrity, a wholeness or unity. In fact although there are ways to carefully divide pages which leave something of value, like newspaper clipping, more often the division of a page (i.e. ripping, like by my two year old son) actually destroys the page. The page of a book is one single thing, and generally speaking it is in the interest of everyone to keep it one thing. A webpage, though, is a very different matter!
Is a webpage really one thing at all? As we've already seen, it has at least three components. But even these are composed of many different components. Some of them may not even come from the same server! A typical modern webpage has a bunch of links at the top of its source code which (without the user ever being aware of it) pull parts of the page from other websites, i.e. other servers. That’s right: your computer actually has to send off for information to a bunch of different computers scattered across the country (or world) and then actively weave their responses together into a single page. The resulting page has parts which load right away and other parts which don't. It may have parts which are “embedded” from other pages. And of course it is composed of "elements" like paragraphs and images and videos, some of which may also show up on different pages in different contexts, at different sizes, hidden altogether on small screens, etc. And all of this is (usually) smoothly handled by the browser, so that what is really on some level a massive collection of disparate things appears as one seamless whole to the user.
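For the more technical readers, here is a minimal sketch of that point: it reads a page's source and lists the other servers the browser would have to contact to assemble it. The sample HTML and every host name in it are made up for illustration, and a real browser follows many more kinds of references than the handful of tags checked here.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical page source; the hosts below are invented for this example.
SAMPLE_HTML = """
<html>
  <head>
    <link rel="stylesheet" href="https://cdn.example.net/site.css">
    <script src="https://ads.example.org/tracker.js"></script>
    <script src="/local/app.js"></script>
  </head>
  <body>
    <img src="https://images.example.com/banner.png">
    <iframe src="https://video.example.org/embed/123"></iframe>
    <p>Some text that comes from the page's own server.</p>
  </body>
</html>
"""


class ResourceCollector(HTMLParser):
    """Collect the URLs a browser would fetch to assemble the page."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # 'src' and 'href' on these tags point at things the browser loads.
            is_src = name == "src" and tag in {"script", "img", "iframe", "video"}
            is_href = name == "href" and tag == "link"
            if value and (is_src or is_href):
                self.urls.append(value)


def third_party_hosts(html, page_host):
    """Return the sorted hosts, other than page_host, that the page pulls from."""
    collector = ResourceCollector()
    collector.feed(html)
    hosts = set()
    for url in collector.urls:
        host = urlparse(url).netloc  # empty for relative URLs like /local/app.js
        if host and host != page_host:
            hosts.add(host)
    return sorted(hosts)


print(third_party_hosts(SAMPLE_HTML, "www.example.com"))
# ['ads.example.org', 'cdn.example.net', 'images.example.com', 'video.example.org']
```

One "page", four other servers, and the reader never sees any of those names unless they go looking.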
How this is done also varies from browser to browser. There are some standards and patterns that tend to be followed so that users see something coherent, but every browser is a little different, and all of them are continually being changed and upgraded, and all of them are configurable to various extents. There are different screen sizes. Some people are colorblind. Some want to hear sound, others don't. Some are on slower connections and don't want to see video because it slows down their page too much. In other words, there is a lot of interplay between the browser on the user's end, and the server(s) on the end of the person I'm calling the domain owner, to determine what actually gets presented to the user for a given webpage request.
Much of the time the user and the website owner have the same goals and are working in harmony. If I request YouTube.com, I want to see a bunch of videos. YouTube wants to show me a bunch of videos. I want them to fit cleanly and evenly on my screen. So does YouTube. I want videos to be displayed which I'm interested in, and so does YouTube, etc.
In some cases, however, our goals are at odds. For example I don't really want to see ads, but YouTube wants me to see them in order to get paid by advertisers. I don't necessarily want to see politically motivated content labels, but perhaps YouTube wants me to see them. I want to spend really worthwhile time on the computer, but perhaps YouTube is optimized to just keep me engaged, etc. What typically happens, then, is that the user comes to a given webpage in order to see some element or set of elements on it which they are interested in, for example a video on YouTube or some items they searched for on Amazon, but then the user is additionally subjected to other elements of the same page which they are not interested in. So one problem is that webpages collect elements which are undesirable to users together with elements which are desirable, treating them as if they were inextricably bound together when in reality they are absolutely not.
Of course, there are some elements which really are bound together, for example a response to a comment on Twitter often doesn’t make much sense without reference to the original comment. There are many cases like this, large and small, where communal dynamics develop. This, tangentially, is why the libertarian approach to the internet, where everyone gets their own experience of it perfectly tailored to them, finally won’t work. The internet, like any community, is and will always be necessarily to some extent governed. Regardless, the point stands that much of what is currently collected together on webpages has no actual unity.
But the second problem, which is perhaps even larger, is that a user often doesn't actually have any desire to see a webpage, as such, at all! Perhaps I want to watch a video. I don't care about YouTube.com, I just want to see videos! Or perhaps I want to buy an item. I don't care about Amazon, I just think of the Amazon website as a whole as the best "place" to find the thing I'm looking for. In other words we generally pursue activities or types of information when we go on the internet, but we have habituated ourselves to conflating these various pursuits with the faux places where these things tend to be. The result is that when a website establishes itself as the best “place” for a given activity, it has a certain de facto monopoly, because nobody really wants to go to ten different places looking for videos or shopping for products. That’s a lot of work, going from place to place! So YouTube is the place for videos, and consequently has a ton of power over videos. They can determine which videos will be seen and which won’t, how many ads people will be forced to watch in order to see them, which subjects will be emphasized and which banned, which creators will be allowed, etc., because they control the place where everyone goes for videos.
But all of this is based on a mistake, because YouTube.com is not in fact a place at all, any more than YouTube’s home page is a true page! If a user’s browser can pull content and stitch it together and put everything on the page to YouTube’s liking, why not instead do so to the user’s liking? It is, after all, the user’s browser!
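As a toy illustration of the idea (a sketch under heavy assumptions, not a real browser feature), suppose the pieces of a page arrived already tagged with what kind of thing they are. The `Element` type, the category names, and the hosts below are all hypothetical; in practice, reliably telling an ad from a video is the hard part. Once the pieces are labeled, though, assembling the page to the user's liking is a one-line decision made on the user's own machine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    kind: str     # made-up categories: "video", "ad", "label", "comments"
    source: str   # the server that supplied this piece
    content: str


def assemble(elements, wanted_kinds):
    """Return only the elements whose kind the user asked for, in page order."""
    return [e for e in elements if e.kind in wanted_kinds]


# A hypothetical page as delivered: everything bundled together.
PAGE = [
    Element("ad", "ads.example.org", "Buy now!"),
    Element("video", "video.example.com", "Intro to physics"),
    Element("label", "video.example.com", "Context from experts"),
    Element("comments", "video.example.com", "42 comments"),
]

# The user's own preference decides what is shown, not the domain owner's.
for element in assemble(PAGE, {"video", "comments"}):
    print(element.kind, "-", element.content)
```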
To put it another way, what if, instead of thinking of the internet as a series of places containing pages which must appear as the owners of the places desire, we instead thought of the internet as thousands and thousands of sources of media, all of which could be fetched and stitched together in a way tailored to the aims of the person viewing them? And this is, in fact, exactly what ChatGPT and the like have largely succeeded in doing, albeit thus far in a way that is largely limited to text. The issue with ChatGPT, as of now at least, is that it requires a lot of upfront “training”, i.e. processing of the internet, to be able to work. This enables it to operate at a level below the superficial content, mechanically extracting (albeit not perfectly) the underlying meaning from multiple sources and weaving it back together into “new” text. But this requires significant resources to do. So we’re out of the frying pan and into the fire, because now everyone is just going to the website of a company that has these resources, namely chat.openai.com, to access the entire internet in their cleverly pre-digested form. Which is even worse from a monopoly-busting perspective than having a site for shopping, a site for email, a site for videos, etc.
What if we were to break away from this monopolistic paradigm altogether though, and users had their own configurable browsers which responded to their desires, going out and fetching the content they wanted without needing to be beholden to any huge “.com” company at all? Needless to say, such a browser would require a certain amount of technical work to create, so the average user would still have to rely on someone else’s work for such a tool. But these could very easily be developed as open source tools, with complete transparency to the user. They could also be purchased by users in many different varieties. The upsides for pretty much everyone except for the mega-corps would be tremendous.
I’m not particularly inclined, generally, to narratives of oppressor and oppressed. Nonetheless, there really are some David and Goliath struggles in the world, and when one sees a true such struggle it is hard to want anything more than for David to win. And of all such struggles which are absolutely staring us in the face in 2024, one would be hard pressed to name one more obvious than that of the little guy against big tech.
Big tech, almost by its very nature, oppresses everyone. That is why these companies are the most profitable companies in the history of the world. Once a given company has established itself as the “place” for a given activity, it can force almost all of not only the consumers but also the producers of this activity to operate on its terms. YouTube remains an obvious example. Do some people make money on YouTube? Of course. Does it present an amazing opportunity to creators? Absolutely. Does YouTube provide some real service in hosting and curating content? Certainly. Does it exercise despotic control and keep far far more of the money which it “earns” from the value of the content people post on it than it doles out, ultimately dampening rather than amplifying the incredible opportunity posed by the utterly brilliant developments in communication which have occurred in the last 60 years or so? Sadly, the answer is also a resounding yes. The same, of course, is true of Amazon, Facebook, LinkedIn, and the rest.
Imagine a world in which everyone was running a browser that had a button labeled “videos”. Behind the scenes, they could select sources for these videos, tweak the algorithm that displayed them, etc. So say I want to see videos from Vimeo, YouTube, X, Rumble, my own personal cloud drive, and an assortment of little no name websites owned by single individuals who share my interests, all on the same screen together. I want to see them in such and such proportion (say mostly from YouTube, with a few videos from the other sources sprinkled in). Or I just want my computer to do its very best to take all those sources and show me what I’m likely to find interesting. Or maybe I want my computer to be biased for a while towards me learning physics or philosophy! No problem, I control my own algorithm. And perhaps my browser is networked with other people who I trust, and all of us collaborate to rank content for each other behind the scenes, much the same way that “friend” or follower networks do now on social media, but without all the information we create in the process being owned by one big social media company for its own profit.
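To make "I control my own algorithm" a little more concrete, here is a minimal sketch of the simplest version: mixing videos from several sources in proportions the user picks. The source names, titles, and weights are placeholders, and the interleaving rule (a smooth weighted round-robin) is just one reasonable choice I am assuming for illustration; a real tool would also need to fetch and rank the videos, which this leaves out.

```python
def mix_feeds(feeds, weights, limit):
    """Interleave videos from several sources in roughly the given proportions.

    feeds:   dict of source name -> list of video titles, best first
    weights: dict of source name -> positive number (relative share)
    limit:   maximum number of videos to return
    """
    queues = {name: list(items) for name, items in feeds.items() if items}
    credit = {name: 0.0 for name in queues}
    mixed = []
    while queues and len(mixed) < limit:
        total = sum(weights[name] for name in queues)
        for name in queues:
            credit[name] += weights[name]
        # Pick the source with the most accumulated credit (ties: first listed).
        chosen = max(queues, key=lambda name: credit[name])
        credit[chosen] -= total
        mixed.append((chosen, queues[chosen].pop(0)))
        if not queues[chosen]:
            # This source has run out; the others share the remaining slots.
            del queues[chosen]
            del credit[chosen]
    return mixed


# Placeholder data: mostly YouTube, with a few others sprinkled in.
feeds = {
    "youtube": ["yt-1", "yt-2", "yt-3", "yt-4", "yt-5", "yt-6"],
    "rumble": ["rm-1", "rm-2"],
    "vimeo": ["vm-1", "vm-2"],
}
weights = {"youtube": 3, "rumble": 1, "vimeo": 1}

for source, title in mix_feeds(feeds, weights, limit=5):
    print(source, title)
```

Changing the proportions, or adding a new source, is a matter of editing one small dictionary, which is about as close to "clicking a checkbox" as code gets.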
Of course, building these user-centric browsers would likely become big business as well, but because they exist on the consumer rather than supplier side of the information flow, they would be much harder to monopolize. If a better browser comes along, there is nothing to stop people from using it. A better video platform, on the other hand, is very difficult to start because of so called network effects: many platforms are only as good as the number of users networked through them, making it very hard to start one from scratch which is competitive with existing giants.
But what’s really amazing about the situation in which everyone is using a browser to actively fetch the content they are interested in, rather than relying on a giant platform to feed it to them, is that suddenly all an up and coming platform like Rumble has to do to start getting seen is to get people to click a checkbox that makes their browser's videos screen include a few videos from Rumble. Checkboxes are easy to click; habits are hard to change. Most of the reason that sites like Facebook and YouTube can get away with serving so many ads as to dilute the experience almost to the point of no longer being worthwhile is habit. In a context in which people only want to visit a few sites regularly, they have gotten themselves numbered among those privileged few and created the habit in us of returning to them over and over, and they are now cashing in, at our expense. Meanwhile creators don’t really want to move to Rumble because it doesn’t have enough viewers, and viewers don’t want to go because there is not enough content. So we have a sort of societal habit, in addition to personal habits, which are strong enough already. We’re stuck. We need to be able to transition away, smoothly, and without superhuman levels of coordination and effort. And a user oriented browser which left websites behind would make this happen almost inevitably. I can’t see a better stone for David’s sling.
Objections.
So far I have argued that webpages aren’t really pages (one more sure sign of this is that anything which encourages the dreaded infinite “doomscrolling” almost certainly has no integral unity), and that websites aren’t actually places. I have also argued that we are already used to having our own computers actively assemble the content we view on the internet, but have simply acquiesced, without thinking about it, to allowing our computers to do it in the way that maximally profits the big companies which own the big domains, at our own expense and leading to our own distraction. And finally, I have argued that the shift away from doing so would be one of the great victories of the little guy in our times. Perhaps now is the time to answer a few objections due to which some might believe that this can’t actually be done.
Objection 1. Is this even legal? Don’t the big websites have copyright on all the material people upload, and the subsequent right to control how and under what circumstances this material is consumed?
Well, basically, no, they don’t! And if they did, that would be a messed up situation which we should actively work to change, because these websites aren’t legally treated as publishers, and don’t create or even edit any of the content which gets posted on them. Even if they did own copyright though, I don’t see any reason this couldn’t be worked around, for example by attaching some kind of attribution to the original source to each piece of media. And besides, if ChatGPT is allowed to read the entire internet and represent it as its own (which it has been and will be, there's too much money in it for it not to), how could what we propose possibly be disallowed? We’re not even talking about a third party acting as if it owns the internet, we’re just talking about giving users a tool to get the content they want, and not the content they don’t.
Objection 2. Even if it’s legal isn’t this going to be against terms of service?
Possibly. But done right it would also be very hard to stop. So one is left with an ethical question: should the terms of service be honored if one has the ability to circumvent them in this manner? I say the answer here, in most cases at least, is no. I don’t need to see every obnoxious ad or every ridiculous “expert” content label YouTube wants to force me to see, just because they’ve managed to monopolize online video and then made me check a box. And I think it’s entirely righteous for me to want a viewer that combines different media sources relatively seamlessly so that I can conveniently fetch content from the source of my choosing.
Objection 3. Are you so sure it will actually be hard to stop — aren’t the tech companies going to stop this at any cost given we’re talking billions of dollars of potential lost ad revenue here?
I expect that if this were to ever really go anywhere, there would in fact be a massive power struggle. We’re talking about nothing less than the toppling of the most giant companies in history. But yes, I think it can be done. With the advent of AI it is going to be very very hard to distinguish a human reader of the internet from a machine reader. And the key here is that a user-centric browser would act on behalf of the individual user. So even the standard approach currently employed to block bots, namely putting things behind a (potentially paid) user login, would not work here, because these aren't bots. They are tools in the hands of users, who could simply log in to each source of information they were interested in and then browse with the tool instead of manually.
Objection 4. What about the content creators — isn’t this going to upset the entire current internet economy, meaning no ad money for YouTubers and other creators?
Yes, and this is actually a serious concern in my opinion. And for that matter it isn’t just the content creators - the entire pipeline: content creators, content hosts, network infrastructure, end user devices and programs, content curation, etc, all need to somehow be paid for. This is one of the reasons that the big platforms like YouTube have succeeded while other things (like RSS readers for example) have failed; it’s hard to make and promote something really well if you aren’t getting paid for it. So that’s a problem which would have to be solved. But at the end of the day, nobody likes ads, so it’s a problem which really needs to be solved anyway. And solutions will be found.
This is going to have to involve a much broader awareness that things on the internet are not free. For example (h/t my partner at Human Centered Tech, Thomas Doylend, for the following quick analysis), according to Statista the average user spends 28 hours/month on the YouTube app. YouTube tries to show an ad roll about every 5 minutes or so, and those rolls consist of either a 15-second unskippable ad or two ads that are skippable after 5 seconds; so, let's say ~10 seconds per roll. Therefore about 3.33% of the user's time, or roughly 0.93 hours/month, is spent watching ads. Since YT Premium costs $14/month, the time the average person spends watching YouTube ads is priced at about $15/hour. Perhaps this seems less significant because the little ad rolls are so short. But I would argue that this actually makes them more costly rather than less, because distraction has a cost. So most of us are essentially working a relatively low wage and highly invasive job for about an hour a month in order to see YouTube. It’s not free. Maybe we ought to start thinking about better ways to pay for a better internet experience.
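For anyone who wants to check the arithmetic or plug in their own numbers, here is the same back-of-the-envelope estimate as a small script. The inputs are the rough figures quoted above (28 hours a month, about 10 seconds of ads every 5 minutes, $14 a month for Premium), not measurements of my own, and the function name is just a label for this sketch.

```python
def ad_time_estimate(hours_per_month, ad_seconds_per_roll,
                     minutes_between_rolls, premium_price):
    """Return (fraction of time on ads, ad hours per month, dollars per ad hour)."""
    ad_fraction = ad_seconds_per_roll / (minutes_between_rolls * 60)
    ad_hours = hours_per_month * ad_fraction
    dollars_per_ad_hour = premium_price / ad_hours
    return ad_fraction, ad_hours, dollars_per_ad_hour


fraction, ad_hours, rate = ad_time_estimate(
    hours_per_month=28,        # average monthly time in the app, as quoted above
    ad_seconds_per_roll=10,    # rough guess from the paragraph above
    minutes_between_rolls=5,
    premium_price=14,          # dollars per month, as quoted above
)

print(f"Share of time spent on ads: {fraction:.2%}")    # 3.33%
print(f"Ad hours per month:         {ad_hours:.2f}")    # 0.93
print(f"Implied price per ad hour:  ${rate:.2f}")       # $15.00
```

The result is quite sensitive to the guesses: halve the ad time per roll and the implied price doubles to about $30/hour.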
In Conclusion.
My particular goal in this article was to make the case that we ought to have more control over how we consume internet content. An internet of the people, by the people, and for the people, in contrast to the internet of semi-benevolent dictators which we currently inhabit. I argued that one significant and eminently possible step in that direction would be the creation of a browser which displays internet content according to the desires of the user rather than the domain owners. If there’s enough interest, I’d be happy to build such a thing myself! If we did nothing more than create something which displayed people’s YouTube homepage with a few of the best videos from Rumble mixed in, we’d be moving significantly in the right direction. Having done that successfully, one could go on much further.
But more broadly than any particular proposal, my goal is to stimulate thought about how to make technology more human. Much technology, especially the internet, is young, and much less of a set thing than we tend to imagine it being. The analogies we use to describe it are just that, analogies, and this means that there might be better ways to think about it, and subsequently better ways to use and build it.
How do you think that we could reimagine the internet for the promotion of what is really true, good, and beautiful?