I often see this stuff and every result from every person I've seen is that fingerprints are unique. Even with DNT enabled and tools like uBlock, Privacy Badger (an EFF tool!), etc., people are still unique. How does one actually become not unique? I'm on Apple hardware, so I can't imagine my hardware is unique (other than silicon lottery), and I'm using Firefox. Testing in Safari shows "nearly unique" (a 1-bit difference from my FF results), so I gotta imagine that "near" is myself.
How does one actually make themselves non-unique?
And can we make things more transparent? How many bits is enough?
Here is another way to look at it. Not expecting anyone to agree; it's just a thought experiment.
Instead of asking "How do we become 'non-unique'?"
We could ask "How do we send less data to the parties who want to do tracking?"
For example,
User A sends 2-3 HTTP headers, the bare minimum, e.g., Host, Connection, maybe User-Agent or Cookie in some instances if required. User A has disabled Javascript unless she needs it.
User B sends a potentially unlimited number of HTTP headers (User B does not care to control the headers she sends, she just leaves that to the applications). User B leaves Javascript enabled regardless of whether it is needed.
Each user may be "unique", with sufficient effort from the party doing tracking, but there's an argument that User A is less interesting to the tracking parties than User B. Both users have footprints, both can be tracked, but User B is sending much more data to the trackers, leaving more detailed tracks, so to speak, while User A leaves a more generic footprint.
There is also an argument that finding "uniqueness" among a large group of "User B's" would be easier than amongst a large group of "User A's". If we were trying to achieve the impossible goal of "non-uniqueness", it would arguably be easier to have all users appear identical to User A than to have them all match User B, given all the additional potential variables User B presents thanks to uncontrolled HTTP headers and Javascript's access to her computer's resources (and all the potential issues and "options" that raises).
If non-uniqueness means "there are others that look like you", it is clear that giving them less identifying clues makes things harder.
The question is how to do it, though, because some things might actually be useful: if your browser tells the site you prefer dark themes, the site can react and display your preferred color scheme. If the browser tells the site how big your viewport is, it can give you a site that fills that viewport neatly — and if the site can do these useful things, it can also track you using that info.
If you’re trying to lay out a newspaper-style web page, sure.
I think it’s less true if you’re trying to play a game, look over a complex dataset, do high-end image previewing, maybe host a multi-party video conference, or run some other highly interactive application in a web browser (for user convenience and, to some extent, because users semi-reasonably trust browsers more than random app downloads).
User A sends 2-3 HTTP headers, the bare minimum, e.g., Host, Connection, maybe User-Agent or Cookie in some instances if required. User A has disabled Javascript unless she needs it.
There is actually no technical reason that an HTTP request needs to be any more than
GET / HTTP/1.0
It's nice for the server to know your user agent, what fonts you have installed, your OS, and whatever else, but it is not necessary and so the majority of the problem of fingerprinting browsers is one created by the browser developers themselves. The original concept of the web was that the client controlled the rendering, the server shouldn't care about what specific fonts you have or the size of your screen.
There is no reason that the Firefox or Safari developers couldn't decide in the next version to send only bare minimal request headers.
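To make "bare minimum" concrete (this sketch is my illustration, not from the original comments): HTTP/1.0 strictly needs no headers at all, while HTTP/1.1 additionally mandates `Host`, which is what virtual hosting and CDNs rely on.

```python
def minimal_request(host: str, path: str = "/", http11: bool = True) -> bytes:
    """Build the smallest well-formed GET request for a given host."""
    if http11:
        # HTTP/1.1 (RFC 7230) makes Host mandatory; Connection: close
        # just tells the server not to keep the socket open.
        return (f"GET {path} HTTP/1.1\r\n"
                f"Host: {host}\r\n"
                f"Connection: close\r\n\r\n").encode()
    # HTTP/1.0 requires no headers at all.
    return f"GET {path} HTTP/1.0\r\n\r\n".encode()

print(minimal_request("example.com").decode())
```

Everything beyond this (User-Agent, Accept-Language, fonts probed via JS, etc.) is extra data the server can function without.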
Is it actually possible to request most sites that way, without a Host header? Maybe if it's TLS and you already used SNI, but I think most sites use a CDN that uses some form of virtual hosting to share IP addresses.
Consider the basic httpd feature of providing an automatically generated index to a directory. It is possible to locate content, including names of virtual hosts, with only an IP address.
That is perhaps the least convincing test case you could have chosen to support your argument. (You basically proved you could fetch a content-less page, partly because that was the easiest proof case you could write; but because it didn't demonstrate anything an HN reader would come here for, it doesn't resonate convincingly.)
All that additional crud has probably contributed to the "relative ease" of conducting tracking as well as the "richness" of the data one can gather from tracking. Why should we ignore this simple fact?
The people behind these browsers, especially Mozilla, keep assuring the public they are working to protect user privacy. This may be true to some extent, but what they are not telling the public is how they are working to ensure the online advertising industry continues to thrive, i.e., how they are working to ensure they do not upset the status quo. The words do not match the actions.
I would not rely on the browser developers to address this problem. To experiment with how well minimalism works on today's web, one can use alternative HTTP clients that allow control over headers, and/or a forward proxy through which one can remove/modify the headers generated by the browser.
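The core of such a proxy is just an allowlist filter; here is a minimal sketch of that idea (the header names and allowlist are illustrative, not a recommendation of a specific tool):

```python
# Strip every request header that is not on a small allowlist before
# passing the request upstream. A real forward proxy (mitmproxy, Privoxy,
# etc.) would apply something like this per request.
ALLOWED = {"host", "connection"}

def strip_headers(headers, extra_allowed=()):
    """Return only the headers a site strictly needs, case-insensitively."""
    keep = ALLOWED | {h.lower() for h in extra_allowed}
    return {k: v for k, v in headers.items() if k.lower() in keep}

sent = strip_headers({
    "Host": "example.com",
    "Connection": "close",
    "User-Agent": "Mozilla/5.0 ...",
    "Accept-Language": "en-US,en;q=0.5",
    "DNT": "1",
})
print(sent)  # only Host and Connection survive
```

Per-site exceptions (Cookie, User-Agent) can be passed via `extra_allowed` when a site genuinely breaks without them.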
You can never be non-unique because of your actions.
To stay in the analogy of tracks: it really doesn't matter how your tires look; the direction of the tracks makes you unique.
That's the same reason why Tor is not secure.
But you can make it harder by cutting your tracks in smaller pieces.
There is another idea I did not mention and that is "minimalising" the behavioural fingerprint. The most simple examples I can think of are DNS and Wikipedia.
User A makes a single request to download bulk DNS data or Wikipedia content, then makes it available on her loopback or local network.
User B makes a series of DNS queries for specific RRs or HTTP requests for specific Wikipedia pages.
How do the behavioural fingerprints of User A and User B compare?
Trackers generally cannot see the individual DNS queries or HTTP requests for Wikipedia pages that User A makes over her loopback or to addresses on her local network. However all of User B's individual DNS queries and HTTP requests, including timings and context, are easily visible to a variety of trackers operating on the internet.
> Instead of asking "How do we become 'non-unique'?" We could ask "How do we send less data"
The basic principle of information theory is that these two things are exactly the same. There is no "correspondence" between them, they are one and the same thing. The amount of information that you send is, by definition, the measure of how unique it is.
I don’t think that’s the point being made by the parent comment. If the information being sent is some identifying information plus the extra information in the HTTP requests you send, then both users may be identifiable; however, one user may still send more information than the other. The parent comment suggests that reducing this extra information should be a goal.
But the information has nothing to do with the size of the message. It is just a measure of how unique the message is. If everybody on the internet submits the same 1TB of data but you transmit a single byte, then you are transmitting a lot of information, while everybody else is transmitting almost nothing.
I don't see how your reasoning about sending more or fewer headers fits into this; but anyhow, the conclusion won't depend at all on the number or length of the headers, but on how many people are sending the same headers as you. It is impossible to know how much information you are sending by only looking at the bytes you send. You need to know the worldwide distribution of headers to measure that.
It seems that the term "information" is being used in two different ways in this thread. The usual meaning of a bit of information is with regard to a probability distribution over messages which the user wants to send to a server. I don't think most people are used to thinking about bits in other contexts, so that's where the miscommunication is happening.
Your interpretation, which I think is correct in this context, seems to be with regard to the entropy of a probability distribution over internet users, and the mutual information between that and the distribution over messages. The actual length of the message is irrelevant to the math once you fix the joint probability distribution.
The argument others seem to be making is that the joint probability distribution is in fact not fixed, and that you can smear out the conditional probability over users given a message by shrinking the space of possible messages. In theory that seems possible, but I don't know enough to have any idea how well that would work in practice. If you shrank the message space to be small enough to be useful for this purpose, wouldn't that get in the way of usability?
This is not "my" interpretation, it's the standard definition of information content in computer science, as given [1] by Shannon in 1948 and used by everybody since.
Obviously I'm familiar with the definition. If you didn't get that, you should probably read my comment again. It seems like you've somehow decided that people in this thread are arguing with you, but they're not. And anyways, it's a bit silly to get mad at people because they haven't studied information theory.
Define the random variables M for message content and U for the identity of the user. The interpretation of "bits of information" that most people will have is H(M). The correct interpretation in this context is H(U). You seem to be confused about why people are talking about H(M) instead of H(U). But I think people correctly intuit that those aren't independent, so the mutual information I(U;M) = H(U) - H(U|M) is positive. And obviously if you change P(M), you will also change the amount of mutual information. That's why talking about sending fewer headers makes sense.
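To make the H(U), H(U|M), I(U;M) argument concrete, here is a toy computation (the user/message numbers are invented purely for illustration). With deterministic messages, users who share a message stay uniformly confusable, so shrinking the message space directly reduces the leak:

```python
from collections import Counter
from math import log2

def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)

# Four equally likely users -> H(U) = 2 bits of identity.
H_U = entropy([1, 1, 1, 1])

def info_leak(msgs):
    """I(U;M) when user i deterministically sends message msgs[i]."""
    n = len(msgs)
    groups = Counter(msgs)
    # H(U|M): users sharing a message are uniformly confusable,
    # so H(U|M) = sum over messages of P(m) * log2(#users sending m).
    H_U_given_M = sum((c / n) * log2(c) for c in groups.values())
    return H_U - H_U_given_M

print(info_leak(["A", "B", "C", "D"]))  # distinct header sets: 2.0 bits leaked
print(info_leak(["X", "X", "Y", "Y"]))  # shrunken message space: 1.0 bit leaked
```

This is exactly the "herd" argument: the leak depends on how many other people send the same message, not on how long the message is.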
User A is also "unique", indeed... until more people start doing the same. Then they can no longer be told apart.
We should agree on a common set of absolutely minimal necessary data (user agent, supported fonts, screen resolution and DPI if Javascript can probe those, etc.) and then ALL run with the same, except on the very few sites one truly needs that don't work with this bare minimum for some (hopefully valid!) technical reason.
We could actually package all of this into an addon or a simple recipe of settings. I'm not knowledgeable enough to pull such an effort with absolute certainty that nothing gets forgotten, but if someone does, I'd suggest "V for Vendetta" as a name, as it reminds me of the 5th of November scene.
In my test (umatrix, but no other preventions), the main sources of uniqueness were the fonts installed, the HTTP-ACCEPT header, and WebGL fingerprinting.
Especially the fonts installed was very interesting; I hadn't thought of that before. The JS just iterates over thousands of known fonts in a div/span and checks whether the browser can render each one to build a list. It's enough that you've installed a single uncommon font, and together with everything else you suddenly become unique just by that.
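The probe itself is JS (render a test string in the candidate font, compare the measured width against a fallback font), but the arithmetic of why one rare font hurts so much can be sketched with made-up prevalence figures (the numbers below are illustrative, not measured):

```python
from math import log2

# Hypothetical fraction of users who have each font installed.
font_prevalence = {
    "Arial": 0.95,          # nearly everyone has it -> ~0 bits
    "Helvetica Neue": 0.30,
    "Comic Neue": 0.001,    # one uncommon font
}

def surprisal(p):
    """Bits of identifying information from observing one attribute."""
    return -log2(p)

for font, p in font_prevalence.items():
    print(f"{font}: {surprisal(p):.2f} bits")
```

Under these toy numbers the rare font alone contributes roughly 10 bits, which combined with other attributes can easily push a profile into "unique" territory, matching what the test shows.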
You're not going to block all of these "dynamic" prints just by changing browser or installing a single plugin. Even if you run in a VM, unless you actually flush and reinstall that VM every session, you're eventually going to amass customizations inside the VM that can be fingerprinted.. :/
I thought that most font rendering was still happening on CPU. Usually, characters are rather small, so the overhead of shipping the data to the GPU does not seem worth it.
Even with CPU rendering, there will be differences depending on the specific software libraries and system configuration, e.g. aliasing settings. Patches to one of the pieces of software doing rendering could create pixel level differences in rendered fonts that could be used to fingerprint.
It's actually an interesting case study how "identical installs" often have minor config variations which produce a sort of chaos in the end result. Also, staggered downstream distribution of software updates doesn't help.
I haven't tested it much myself, but I suspect there's a lot to unpack here.
How about a browser that has cookies etc. as different modules detachable/attachable from/to the actual _browsing session/s_, and an option to attach/detach them on the fly. I think I would get used to the overhead work involved when browsing, just like I got used to NoScript.
Windows 10 Pro has "Windows Sandbox", which is a VM running Windows 10 with only ephemeral storage. Closing the VM and restarting it is 100% fresh again.
There's ways to do it with virtualbox and qemu as well by setting the disks to the same ephemeral style.
Maybe we need to run browsers in a virtual machines that poison data gathering by simulating rolling profiles populated with fake history and bookmarks,
where all your real bookmarks, history, cookies and passwords are held outside of the browser by a separate program that interacts with the browser using an automation framework like puppeteer.
Now you are in sockpuppet land, and there are a few established companies that keep a low profile as is the modus operandi for the miccimac (curious about the term?, look up Ray McGovern)
Firejail ( https://firejail.wordpress.com ) offers an easy way to start a completely fresh and rather isolated instance of a browser, and probably after a little bit of work to use various different "complete profiles" (home directories...).
War is a Racket is a must read, but there is another Smedley Butler story a lot of people don't know about that is extremely interesting: the business plot. [1]
maybe proxying my browsing via a sockpuppet, but rather than actively polluting the internet with automated traffic, I want my browser to be a library of garbage that punishes any attempt to analyse me.
It's highly unlikely I'm going to do anything with this thought, but if I do I think I'll call it Lignin, after the complex structural compound in plants that no organism was able to digest for the first 60 million years after its appearance.
You’d need to do this carefully, to avoid it standing out just as clearly as a unique tracking id.
It’s like if I were an analyst at a national security TLA, I’d treat Tor traffic as a big red “look at me!!!” flashing light. Sure, you _might_ be using Tor to anonymously report a pot hole to your local authorities, but you’re _way_ more likely than average to be doing something the government has “a war on”. (So as an analyst I’d assume you’re involved in drugs, terrorism, child abuse, or insisting on government accountability.)
You probably don’t want “a library of garbage”, you probably want something a little smarter that breaks all your traffic into totally plausible “normal looking” traffic but with each stream (browser tab/website pair I guess in this context) looking like a different but totally “normal” session. So your HN session looks like a totally stock Win10 Edge browser session, but when you click over to (or open a tab to) eff.org, it changes to maybe a SamsungS20 session, and when you flip somewhere like NYT - all the page loads and the hundreds of tracking pixels all see what looks like a macOS Safari session.
Do-able, I think, but more complex than simple “garbage” traffic. It needs stateful session inspection so it could do things like stripping Referer headers when you change top-level sites/URLs, while sending them normally to image/xhr/tracking URL calls from within a site.
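The per-site "plausible persona" mapping could be as simple as hashing the top-level site into a table of stock profiles (a hypothetical sketch; the persona list and salt rotation policy are assumptions, not an existing tool):

```python
import hashlib

# A few stock "normal looking" profiles.
PERSONAS = [
    {"ua": "Win10 Edge", "viewport": (1920, 1080)},
    {"ua": "Samsung S20 Chrome", "viewport": (412, 915)},
    {"ua": "macOS Safari", "viewport": (1440, 900)},
]

def persona_for(site: str, session_salt: str = "rotate-me-per-epoch"):
    """Pick a persona that is stable per site (coherent session) but
    unlinkable across sites, and re-rolled when the salt rotates."""
    digest = hashlib.sha256((session_salt + site).encode()).digest()
    return PERSONAS[digest[0] % len(PERSONAS)]

print(persona_for("news.ycombinator.com"))
print(persona_for("eff.org"))  # independent of the HN persona
```

Each site sees one consistent, common-looking identity, so nothing stands out, while cross-site correlation of the fingerprint breaks.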
Correct, using Tor has a chilling effect, just like using cryptography used to have. Hidden in plain sight and blending in crowd is not foolproof either. If you don't do anything wrong on Tor, you're noise. However, some signal on Tor is information I find unethical (e.g. pedophile and terrorist content). If I want such distributors to be busted, lowering Tor noise seems to be in my interest.
It says my iPad running iOS 14.2 with no particular attempt to increase or decrease its privacy level is unique. Most of that seems to be the small sample size they have so far, coupled with the fact Safari is telling them my screen size, en-au language setting, and my Sydney/Australia timezone. I suppose it’s entirely plausible I’m the only iPad Air 2 user on 14.2 in Sydney to have checked using their site so far...
One of the easier ways to not look unique is use a popular iOS device because Apple pushes updates hard, so OS and browser versions tend to move in sync, they don't maintain that many SKUs (4 right now, and I'm not even sure if the 12 and 12 Pro are different enough for fingerprinting to detect), and their devices are pretty popular.
Various privacy trackers also block fingerprinting code.
If you need a new hobby, uMatrix is the hard way to block most fingerprinting.
"All you need to do is pick up this abandoned github project, fork it, fix all the outstanding show-stopper issues, bring it up to date with advances in browsers since it was last regularly maintained, and add in my own personal must-have feature! Bob's your uncle! Then you just need to avoid whatever inevitability it was that caused the previous maintainer to abandon it, deal with the usual crowd of self-entitled and demanding-to-the-point-of-abuse users who refuse to contribute pull requests or money, and worry about tremendously overreaching to-the-point-of-fraudulent DMCA or patent lawsuits and having GitHub roll over to the RIAA or whoever doesn't like what other people use the code to do."
Chameleon does not seem to be enough. Or do you have some pointers how to configure it?
The EFF site showed two fingerprint ids that were completely unique for my browser: window size (because of my dock) and the http-accept headers (because of my language selection). That's with FF's fingerprint-resist option enabled. Chameleon can spoof those, which is great, and it gives access to the fingerprinting option which I think FF does not expose properly outside of about:config. So it should greatly reduce my identifiability, but according to the site it does not help much, even if the specific categories are now almost unique. Like they explain, the combination seems to be the problem, or maybe they are not exposing the category that gets me.
Edit: Okay. The solution seems to be: Chameleon with most of the options enabled, so as much spoofing as possible, but without activating FF's fingerprint-resist option. Probably sending no data is worse than sending spoofed data!
We just need to make sure the browser suppliers are in total opposition to the surveillance capitalism and personalised advertising industry.
So while Chrome is such a dominant browser? No. It’s not gonna happen. Google will keep writing “inadvertent errors” that exempt their own tracking cookies from user instructions to delete them.
Google owning the dominant browser and mobile OS is a travesty for privacy. What we really need is a consortium of companies to rally behind Firefox for mutual benefit, like how Linux is done.
Using the Tor Browser from torproject.org with the so called "safest" security settings (that I usually use) I am not unique (currently 1 in 15k browsers have the same fingerprint as mine - which is not exactly great but still better than unique). With "standard" security settings (I switch to those mainly for watching videos) I am unique among the tested 288,728 others (screen qualities and not using the recommended default language en stick out).
For anyone considering taking the dive: NoScript takes a bit of time to get used to, but stick with it. You'll build up some things you are happy to whitelist, plus always have the backup option of disabling temporarily within a tab. In a week or two, your new workflow will feel completely natural.
Another user here, and things weren't so smooth. Every time you reach a new website you'll need to change rules and reload it. It's ok if you're always going to the same sites, but if you often discover new places (and I believe HN contributes to that) it's a constant weight to lift. Always useful, but still not natural IME.
It gets a lot easier if you only allow self-hosted JS. Block any JS loaded from a domain other than the one you are actually visiting. If the site I am visiting cannot function properly with self-hosted JS, then I move right along. It has to be a website I'm really interested in for me to even consider allowing 3rd-party JS.
I've been whitelisting JavaScript since NoScript was new. I've used uMatrix recently, but with its end-of-development[0] I am considering switching back. I grew tired of having some shitty virus dropped to %appdata% for the N-th time just from loading a web page, exempli gratia [1]
It's gotten a lot more annoying the last several years with 5mb webpages. Backing up the whitelist saves a lot of time.
For perspective, I use 4 separate profiles (--user-data-dir) listed in descending order of how annoying they are for me to use.
1). School browser: Chrome + uBlock origin
2). Shopping/low security (when the payment processor is an iFrame and I don't want to refresh...): Firefox + uBlock origin + CanvasBlocker
3). General browsing like browsing Google or YouTube: Chrome + uBlock Origin + uMatrix
I've gone from 2 browsers (Firefox + IE6) to 4 browsers (Firefox, Chrome1, Chrome2, Chrome3). By 2030 I'll be running 16 browsers in a virtual machine on a remote server that I connect to with my browser browser.
I wish these extensions could integrate with Firefox containers. Work and schools can easily ask people to use new sites which one cannot avoid, and in that case it's best to defer to a later day or just not fight with extensions.
I'd recommend uMatrix instead. It largely (completely?) supersedes NoScript, and I find the UI to be much easier to work with. It also has better granularity of exactly what you want blocked/allowed.
It hadn't had a stable release for a year at the point when the repo was archived, because it's more-or-less feature-complete and has no major bugs. 90% of the open github issues [1] were enhancement or documentation requests.
No updates is not necessarily a bad thing. Sometimes things work well enough to leave alone.
I use uMatrix all the time and there are multiple sites where it doesn't work. I think part of the problem is that not all requests are actually shown in the drop-down for you to enable them. Maybe a popup or frame causes uMatrix to miss it, etc. For a blocker whose default mode is "block everything", it makes sense that this failure mode would be the dominant one.
Have you used the built-in logger to confirm that it's actually uMatrix and not another extension? I typically have this issue and then find that the fault was with a uBlock filter.
It does! I had to manually disable Javascript in UO settings, which solved the problem for me. FWIW, it doesn't solve the problem at scale, per opt-in / opt-out dynamics. It's a feature worth building into the browser, and setting it to disable JS by default. Make site owners ask for permission to use JS, and they better have a good reason.
Thanks. What is the most common user agent out there?
Sadly, this is of limited use. Defense against fingerprinting is like herd immunity. If everybody else already has a unique fingerprint, there is not much an individual can do to avoid being uniquely identified as well. At most one can spoof one other unique individual. Plus the EFF recommendation is 'latest Chrome on Windows' which is a moving target.
Would be nice if the EFF site in OP would recommend an agent id to spoof to, at least that would help building a small, but non trivial herd of indistinguishable users. And then a popular extension like uBlock Origin would track this agent id and set it by default for all its users.
I use a Chrome extension called "Quick Javascript Switcher". If you click the icon in the bar, it switches JS off/on for the domain - using Chrome's built-in allow/block list for JS.
I honestly could not live without it. At this point I have pretty much every news and recipe site on the internet blocked. Visiting a new site, as soon as I hear my fan spinning up I reach for the "no-JS" button and the page suddenly becomes responsive again.
I'm not quite to the "no JS as default" level but I'm close.
Google themselves admitted they are an ad company rather than a search engine company. Why use a browser from a company whose main revenue is ads?
> You make yourself non-unique by looking like others. ... limited by the precautions that others take
Yes. This is why it would ideally be done by the browser, not by individuals. If Safari reported only its top-level version number (and only exposed the installed-by-default fonts, and so on) then millions of others would suddenly look like me.
That doesn't really solve the fingerprinting issue. Hardly anyone uses whonix for everyday browsing, so you'll be that guy[1] with the obfuscated fingerprint
> How does one actually make themselves non-unique?
Though this is highly desirable, for a guy like me who is rather paranoid I'm not going to worry too much. I disable js and block lots of sites (based on MVPS but also my own personal list - anything that gets through gets put into my personal list). After that, I don't care too much if they track me cos they aren't getting bugger all useful - if everyone did that, what market would be left for advertisers?
I'll still read more of the article and try to close more holes, but perhaps a 95% solution is sufficient? What do you think?
Yep, I don’t trust this very much. Two of the same iPhones that are fully updated and in the same region should be essentially indistinguishable by the things they mentioned.
It isn't. I've worked with a company in the past that has the most advanced tech available. It can uniquely identify iPhones of the same model/make/year/os/safari version in the same region under the same IP address.
They promised that the tech would always give a stable and unique id from the browser. And it worked too, but it wasn't public and not for ad tech purposes.
Oh, fascinating. I've suspected this is possible for a long time, but have never seen any ideas on implementing it.
I have a POWER9 desktop, and a fair share of other users do so for privacy/security concerns, pretty hardcore Tor browser guys and the like. I've mentioned before that fingerprinting firefox on ppc64le would be very easy because of timing the non-JIT'd JS engine. I guess there's potential for much more specific fingerprinting.
crucial bit is to blend into the crowd instead of standing out (the paradox is the harder we try to mitigate on a technical level the more we'll stick out). instead use hardware compartmentalization, pseudonymous identities (not anonymous ones) and focus on operational techniques (modify your behavior). technology can be a means but often it is just part of a strategy. (e.g. instead of "let's encrypt all the comms" why not eliminating some comms altogether - no need to worry about data that doesn't get stored in the first place etc)
I had the same problem, but after setting tracking protection to strict in ff I am now only nearly-unique according to this test. Also this page [1] shows some more details around fingerprinting, although I don't know if it works the same as the page announced here.
We could successfully become non-unique (~10 bits) with 2 approaches:
1. TOR browser on Safest setting
2. Firefox using the strictest privacy settings except allowing 1st party cookies, and using Private Browsing; toggling privacy.resistFingerprinting to True in about:config; and using the uBlock and NoScript add-ons
These were successful on both desktop and mobile (except iOS)
Actually, let's look at a different solution in the same vein: how does someone become NOT themselves? I think that right now it's less about how to blend in, and more about how to swap identifying features.
What about a pool of users, like a vpn, except the pool exercises identifiable information swapping!
You might want to try https://en.wikipedia.org/wiki/Privoxy, a "privacy enhancing proxy" which works by reducing the stuff a browser sends before passing it on.
Having software that helps mitigate timing attacks and pre-fetches uninteresting content is a start. Disabling JS will be a painful but helpful tool, although disabling JS itself makes you a uniquely identifiable user. Not sure, but I think this should spark discussion.
Are you kidding? Apple has all but stripped away the ability to make a profit from Apple users as they all identify as the same advertising ID now if you use the safari browser. Facebook stock took a huge hit right after that update.
stylometry can identify you, or even just the times you are online, what content you visit. We need a browser plugin for this. Some button in the corner of your browser that you click to sanitize your post of any identifying features.
Not really. Even something as small as maximizing your tor browser window can indicate to whoever's watching that you're one of N people using that same screen resolution.
And sure, N is pretty big for popular websites. But the less popular the site, the less it holds up.
Tor is about the most effective thing we have, but I'm not sure "really effective" is fair. It gets the job done sometimes, but it also leaves a lot to be desired.
I'm pretty sure that because Tor Browser ships w/ `privacy.resistFingerprinting.letterboxing` set to `true` by default since 9.0 [1] (Oct. 22 2019 [2]), users can resize the windows now without affecting their fingerprint.
> DNT is essentially acknowledging that a 3rd-party is the one in control of your privacy in the 1st place, and you have to resort to asking this 3rd-party -- which has financial (or whatever) interests in tracking you -- to not track/data mine you, and trusting that they respect your wish, with no way for you to find out whether your request is respected. In short, it's BS, and supporting tracking/data mining is agreeing that tracking/data mining is the natural, expected behavior and thus an opt-out "feature".
> Nobody should ever agree to this.
> I see it differently: tracking is opt-in, and the ideal is that users are in full control of their privacy by default. Those who want to track/data mine you should ask you to opt-in, along with all the detailed information of how the data collated from you will be used and monetized (lists all entities to which your data is sold), in the spirit of informed consent.
> Currently the way for users to enforce their privacy choice is to use all the tools at their disposal to prevent their data from ending up in the hands of the trackers/data miners, and DNT is not one of these tools.
Surprised to still see that included on here. Apple removed support for it over a year ago since it never gained much traction and ironically can be used as an additional variable for fingerprinting.
That's basically what DNT was supposed to be—browsers said "do not track me" and "legitimate" trackers respected it. I agree that it's dumb. Ironically, DNT is now just one more bit that makes fingerprinting your browser easier.
DNT was supposed to be that until Microsoft decided to make it on by default, presenting it as a big privacy move on their part. Because it was no longer a user choice, advertisers decided it was no longer reasonable to honor it. So they just ignore it most of the time. And thus DNT became completely obsolete.
The EFF seems to essentially be an extension of the Silicon Valley business model - it's like the notices at casinos giving a number to call if you have a gambling problem. They're not going to address the underlying issues, but they'll offer some ineffective band aids for them.
I've been thinking for a while—what if the solution to fingerprinting wasn't making your browser less unique, but less constant?
Take Canvas fingerprints. If you added a bit of randomness to how the browser renders pixels, the fingerprint would change each time you measured it. It would still be unique, but it would be a useless tracking tool.
Presumably I'm not the first person to come up with this, is there a reason it isn't done?
This is very similar to how OSes and browsers mitigated the Spectre vulnerabilities last year.
They basically fuzz (randomize) the timer precision available to scripts, so timing can't be used to leak information across two different processes.
>"As of Firefox 57.0.4, Mozilla was reducing the resolution of JavaScript timers to help prevent timing attacks, with additional work on time-fuzzing techniques planned for future releases."
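That timer fuzzing is easy to sketch: only ever report timestamps snapped to a coarse bucket. A minimal Python illustration (the 100 ms bucket is an invented constant for clarity, far coarser than what any browser actually ships):

```python
import time

RESOLUTION_MS = 100  # invented granularity, not Firefox's real value

def fuzzed_now_ms() -> int:
    """Return the current time in ms, rounded down to the nearest
    RESOLUTION_MS bucket, so a script can no longer measure
    durations finer than one bucket."""
    raw_ms = time.time() * 1000
    return int(raw_ms // RESOLUTION_MS) * RESOLUTION_MS
```

Any two events inside the same bucket become indistinguishable to the measuring script, which is exactly what breaks cache-timing side channels.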
Sorry, I don't actually see in there why randomizing something like a canvas or webaudio fingerprint is a bad idea. I believe they know what they're talking about, but I'd still like to know why.
Notably, to me this isn't necessarily about beating the resistFingerprints setting, because I can't use that setting. For good or for ill, it breaks too much of the web.
But that's referring to privacy.resistFingerprinting (RFP), not adding randomness. RFP isn't really an option, at least for me, because it breaks too many sites.
>But that's referring to privacy.resistFingerprinting (RFP), not adding randomness
Having a randomized fingerprint is a fingerprint value in and of itself. If you have a fairly common hardware configuration, randomizing may leak more entropy bits than it hides.
Also, regarding webaudio fingerprinting, afaik it's not a real threat.
This reasoning is valid when randomization is not pre-configured and is done at the individual level. It doesn't hold when talking about changes that come with browsers etc and deployed at scale.
This deserves to be heard. I will also add to this, since it is known we can use addons to disable tracking and fingerprinting but addons can themselves be used for fingerprinting, why don't we get a feature on firefox (or even one more addon) to prevent javascript from getting a true result when querying for known addons?
>Take Canvas fingerprints. If you added a bit of randomness to how the browser renders pixels, the fingerprint would change each time you measured it. It would still be unique, but it would be a useless tracking tool.
That doesn't work, because sites can just take multiple samples and average out the noise.
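A toy Python sketch of that attack (the pixel value and noise range are invented): if the noise is drawn fresh on every read, repeated sampling simply averages it away. This only defeats naive randomizers; noise seeded once per session stays identical across reads and cannot be averaged out.

```python
import random
import statistics

TRUE_PIXEL = 200  # hypothetical stable canvas pixel value on this machine

def noisy_read() -> int:
    """One canvas read with +/-2 of fresh random noise, as a naive
    randomizing extension might add."""
    return TRUE_PIXEL + random.randint(-2, 2)

# A tracker that samples repeatedly just averages the noise away:
samples = [noisy_read() for _ in range(1000)]
estimate = statistics.mean(samples)  # converges on TRUE_PIXEL
```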
It's great: it told me I'm being browser-fingerprinted and showed me how it's done, but not really how to prevent it... which is kind of the reason I did the test.
Browser fingerprints are turning up in court cases now. Probably not wise of EFF to frame it as "covering your tracks". It's more like "closing your blinds".
I'd be happy to make that argument. It's circumstantial evidence at best, and would be a waste of time in a criminal trial.
"Move to strike that so-called "evidence," your honor. It is irrelevant and thus inadmissible, because "Cover Your Tracks" is the name of a website that tells users whether their computers appear unique to advertising companies. Not to mention that the EFF regularly files friend-of-the-court briefs with the Supreme Court and is a trustworthy nonprofit dedicated to keeping the internet safe and fair."
Juries are made of regular people. The overwhelming majority of households in the USA have broadband internet. Regular people understand that advertisers track internet users in order to sell more ads.
Also, scare-mongering and preying on someone's lack of familiarity with a website is only useful when there isn't someone else present to quickly explain it to them.
Take a look at the linked case. The prosecutor made a big deal of the fact that the defendant used a subversion server for version control. It was doubly evil because it was a subversion server in gasp Germany.
"Although it is physically impossible for the single guard to observe all the inmates' cells at once, the fact that the inmates cannot know when they are being watched means that they are motivated to act as though they are being watched at all times."
First read about the Panopticon in Foucault's "Discipline and Punish". It was designed by Jeremy Bentham, the mind behind utilitarianism - in fact, I believe it was only put into practice in one prison, which Fidel Castro and Raul Castro were imprisoned in before the Cuban revolution. All this is somewhat tangential to the discussion at hand, but still quite interesting.
The parallels between the Internet/modern security state and the Panopticon design are obvious.
The question is how to protect against fingerprinting?
Protecting yourself against some scripts is easy and convenient with uBlock Origin. Tracking by URLs can be handled by ClearURLs. Cookies are easy to address with containers, ideally temporary ones with per domain isolation. IP tracking can be addressed partially using a VPN or Tor.
But fingerprinting is hard. Some Tor browser configurations go as far as fixing the window size. Things like timezone, user agent or available fonts leak a lot of unique information. Any simple setup that doesn't have too many caveats?
I love the tor-browser project; what I don't like is the hostile web you uncover with it. Browsing with tor-browser has become a CAPTCHA minefield.
Everywhere you go, there's a "human" captcha waiting for you. Sometimes I have to do the challenges 3 or more times, and occasionally even a 4th attempt just doesn't work.
This is funny and relatable, but from my anecdotal experience it doesn't really matter whether you click boxes that contain a little corner of the object you're supposed to identify. I think those are weighted less, or the scoring mechanism isn't that simple/binary, or something like that.
It often doesn't even matter to click wrong boxes to a certain degree. Green or red points are usually accepted as traffic lights (often no need to click the whole traffic light) and anything white painted on a street is usually accepted as a crosswalk and so on. Skipping/verifying the second task without even bothering to look at it (i.e. immediately after clicking "next" on the first one) usually works well, too. The AI based on the ML at work here seems not to be very sophisticated.
These captchas are used to crowdsource training data for semantic segmentation ML models. By shifting the image around, users statistically fill out the boundaries of objects by selecting which squares include the object. As a result, in many captcha instances, you see objects right at rectangle boundaries.
https://github.com/arkenfox/user.js
is militantly maintained.
It documents, and makes it easy to use, many of the features upstreamed into Firefox by the Tor Project.
Currently trying out resistFingerprinting and I really don't like that it resets my zoom level on every new tab. I understand that there might be scenarios where not doing so would pose a risk (e.g. when a site opens another site in a new tab and they exchange their observed zoom level behind the scenes) but I'm okay with that. Most of the time I'd zoom back in anyways, so is there a way to disable that specific feature of RFP?
> “Add-ons and browsers that randomize the results of fingerprinting metrics have the potential to confuse trackers and mitigate the effects of fingerprinting as a method of tracking.”
I would like to dispute this. It is easy to detect when a user is spoofing certain settings. As such, randomizing fingerprint data increases entropy, making your browser more unique, not less. There is a reason why the Tor browser works so hard to make sure every user has the same fingerprint, rather than randomizing it.
If you browse website A and website B, with different randomised qualities at different times, and your non-randomised data isn't enough to sufficiently identify you, then although A and B might be able to identify that you're spoofing certain settings, they can't identify that you're the same entity browsing both websites.
If you just set things to a weird or unlikely value, then you're as identifiable as a man who walks down the street in a mascot uniform that he never takes off. That is to say, although they don't know the 'person' behind the mask, all you need to look for is the dingus wearing the capital city goofball costume. Indeed, it makes you stand out more...
If not a lot of people do it the bit of information "this browser is spoofing certain settings" is also a very telling thing and combined with the non-randomized data may tip the balance towards uniqueness.
But uniqueness is valuable only if you have identity. If you can't tell that dude A is also dude B then you just have a million dudes and you can't violate anyone's privacy.
”Random data” can be part of that “same fingerprint”, and it actually already is (Firefox with privacy.resistFingerprinting, and I think Tor too, randomize canvas and WebGL results).
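In the spirit of that seeded randomization (the seeding details below are invented; Brave calls its version "farbling"), the noise can be derived deterministically from a per-session secret plus the reading site's origin, so repeated reads by one site agree with each other, while nothing links results across sites or sessions:

```python
import hashlib

SESSION_SEED = b"hypothetical-per-session-secret"

def perturbed_pixel(site: str, x: int, y: int, value: int) -> int:
    """Flip the low bit of a canvas pixel deterministically from the
    session seed and the site's origin. Same-site reads agree (so
    averaging learns nothing), but the perturbation differs per site
    and per session, breaking cross-site linkage."""
    msg = SESSION_SEED + site.encode() + x.to_bytes(4, "big") + y.to_bytes(4, "big")
    digest = hashlib.sha256(msg).digest()
    return value ^ (digest[0] & 1)
```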
Could fingerprinting be easily avoided by randomizing all of these parameters? Sure, you'd still have a unique fingerprint, but if every time you load a page you have a completely different unique fingerprint, wouldn't that solve the problem?
It 'solves' the problem in the theoretical sense, but in a practical sense websites may still require information from the browser for actual justified purposes. Think identifying the browser, OS, resolution and window size (non-exhaustive list and I'm a data-linking person but not a web person, so I'm just making them up).
Indeed as long as browsers need to send actual information for successful implementation of whatever it is the website is doing, the potential for fingerprinting still exists. And as long as there are sufficiently non-random properties of the computer for the theoretical attacker to poll (via whatever mechanism they can), then fingerprinting will be a viable mechanism.
Although as EFF points out, if MOST of the identifier obtainers require Javascript or connections to certain URLs to work, it might be that blocking javascript and URL blockers might be sufficient to stop most of them in practice?
Edit: of course, every time I say theory and practice here we run into problems of how much people are willing to do or put up with in practice...
Put a browser inside a read only virtual machine and have everyone agree to use it. All temp folders can be mapped to RAM. Have the VM use a random VPN each time and never login to anything.
Nearly everyone would have the same fingerprints even down to the browser resolution, root CA lists, installed fonts, etc. The only real difference being the sites you visit.
I see a few people asking how you blend in among the crowd. My strategy is to use a common device like an iPhone 5 using the Safari browser, and for the connection I use a 4G connection that I seem to be sharing with loads of other subscribers. Apple's Advertising Identifier[0] AFAIK is not passed into Safari, as that only applies to other apps and not browsers like Safari.
The only thing that could make me stand out is if I deviate from this basic setup and use different browsers on weird connections like a VPN (Yes - a VPN can make you stand out). Also: upgrading my phone to the latest model is probably an uncommon practice and used as another heuristic to track you, & I've heard many people like to stick to a certain phone model until it starts acting up.
Since virtually no one honors DNT, and it's usually not set by default, choosing to set it gives up a free bit. Instead, try to stick with whatever your browser's default behavior is.
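That "free bit" can be quantified: an attribute's identifying power is its self-information, -log2 of the fraction of users who share your value, which is how Panopticlick-style tools score fingerprints. A sketch with made-up population fractions:

```python
import math

def surprisal_bits(share: float) -> float:
    """Bits of identifying information carried by an attribute value
    that a given fraction of the population shares."""
    return -math.log2(share)

# Invented fractions, purely for illustration:
dnt_bits = surprisal_bits(0.25)        # if 1 in 4 browsers set DNT: 2.0 bits
rare_bits = surprisal_bits(1 / 10000)  # a 1-in-10000 quirk: ~13.3 bits
```

About 33 bits are enough to single out one person among the world's population, which is why every extra bit matters.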
> We have also worked with browser vendors such as Brave to provide more accurate results for browsers that are employing novel anti-fingerprinting techniques.
> remember: very, very, very few users use anti-fingerprinting measures
At this point, Brave has 20-million-something monthly users, all of whom have fingerprint randomization enabled by default. Randomization definitely makes an individual stand out if nobody else is doing it, but it makes it almost impossible to distinguish from others using the same technique.
My screen resolution isn't shown accurately on this tool. It returns 1920x1080x32, but my actual window resolution is something like 1880x1020x32 (I use swaywm and leave gaps around windows for a border). If I resize the window to 300x700, the tool returns 1440x900x32. If I disable all my extensions and Firefox's enhanced tracking protection, it always returns 1920x1080x32.
I'm using swaywm, so maybe this is an (unintentional?) feature of Wayland.
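Anti-fingerprinting tools generally do one of two things with window size: report a fixed common value outright (which would explain the 1920x1080 reading), or snap the real size to a coarse grid so many users collide on the same report. The grid approach can be sketched as follows (the step sizes are assumptions echoing Tor Browser's letterboxing idea, not a documented Firefox constant):

```python
STEP_W, STEP_H = 200, 100  # assumed granularity, not a documented constant

def reported_size(width: int, height: int) -> tuple[int, int]:
    """Round the real window size down to a coarse grid before
    exposing it to scripts, so many users report identical values."""
    return (width - width % STEP_W, height - height % STEP_H)

reported_size(1880, 1020)  # -> (1800, 1000)
```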
Same. But on HN, many people conflate Brave's built-in privacy with Brave's ridiculous crypto thing. And so Brave is pooped on, laughed at, or at best ignored by many here.
To me they are completely separate and although Brave has had privacy missteps, all browser vendors have. I believe in the leadership (Brendan Eich, creator of JavaScript) more than the leadership of any other browser vendor. I believe Brendan eats his own dog food.
I tried out Brave a few years ago and soon noticed it did not completely clear history on close when configured to do so. While the history was not shown in the UI, you could see it was retained because URLs visited in previous sessions were shown in :visited style (links were purple instead of blue with default CSS; note that this can also be queried via JS and thus allows sites to extract your browsing history).
I reported the issue at the time, but they said they won't fix it since they were moving away from brave-laptop. They still had issues[1] clearing forms of local storage a year later. They are not taking privacy seriously.
They also took crypto from users under the false pretense of taking payment on behalf of content creators[2].
Did you notice the unique alphanumeric ID on every piece of paper money?
The banks register it, and it turns out the vast majority of money does only "one cycle", i.e. from the bank to a (known) consumer/customer, to a shop/cashier, and then back to the bank. So "they" already know where you spent it.
That is not entirely true. They can only track end-points with that system, meaning those places where the note ID is inspected. Moreover it's usually only done by one bank; the central bank. In any case that is certainly not a great argument for giving away more of your freedom.
Granted, more and more shops now have machines that count all bills and change automatically, per transaction, under the guise of convenience and increased security against theft or robbery. In the meantime, the notes may have travelled from hand to hand in a long chain that is completely shrouded in a dark, if not entirely black, economy that neither the bank nor the state can ever hope to fully track.
That all changes with the advent of a fully digital system, especially if all transactions are sanctioned to be within that system by force of law. Which is exactly what the banks are pushing for, with various scares such as claims of the danger of money laundering, corruption, and so on. What they systematically fail to add, however, is how they themselves gain the ability to track full markets, and make sure-fire investments based upon surplus data gleaned from aggregating private purchases and trades, thus ensuring their full control over said market.
Meanwhile they can use that same power to effectively shut down all opposition, and anyone who protests their actions, simply by finding a reason to freeze their accounts, just like they did to Julian Assange.[1] Thereby they get the power to effectively keep individuals they dislike from accessing the entire economy, unless perhaps they own some diamonds or a valuable goat... This has in turn set a precedent for other financial institutions to shut down accounts or memberships based upon political affiliations, as in the case of Patreon, or because they are somehow construed to be in a competing position.
As an example, a Norwegian crypto trader has effectively been barred by all the banks in Norway to get a business account.[2] The reasons given were vague claims that his completely legal business somehow supported nefarious and evil individuals who all wish to undertake in money laundering, without offering any solid evidence for it. However the banks all failed to mention that his business can in many ways be seen as being in direct competition with their own business model, namely that of manual and middle-man-laden banking. In any case, for all intents and purposes, Bitcoin transactions are far easier to track than bank notes.
I read comments about solving this problem by randomising the fingerprint.
Is it possible to randomise the hardware by creating a sort of "random VM container" that runs the actual browser binary? It would randomise the VM so that sometimes it appears to be Linux and sometimes Windows.
I am of the belief, a rather insidious belief, that not only do we need to block AdTech and other awful face-recog technologies, etc., but we should actively spread and feed them fake data in order to completely confuse the algos, rendering them useless.
The practices should be banned. It's a federal felony to wiretap or open someone else's mail. In almost all states, it's a crime to record conversations without consent from at least one party. Earlier generations passed those laws because they viewed mass surveillance as a fundamentally Soviet activity with no place in a free society.
According to some footnotes in Shoshana Zuboff's book, the FTC was on the path to ban these practices in 2001 but then 9/11 happened.
Calling mass surveillance a "fundamentally Soviet activity" weakens your argument here. Is mass surveillance something that happened in the Soviet Union? Of course. But it also happens in the US and many other nations, liberal democratic or not, worldwide. So why shouldn't we call mass surveillance fundamentally American, or fundamentally capitalist?
Mass surveillance is bad. It is toxic to freedom and democracy. Let's focus on that rather than invoking Red Scare tendencies of old.
> Let's focus on that rather than invoking Red Scare tendencies of old.
He's not "invoking Red Scare tendencies of old" to describe modern mass surveillance. He's giving historical context about what earlier generations thought:
>> Earlier generations passed those laws because they viewed mass surveillance as a fundamentally Soviet activity with no place in a free society.
It's particularly confusing since the "earlier generations" were the people who wrote the unreasonable search and seizure clause in the Bill of Rights, almost 150 years before the Soviet Union even existed. The Supreme Court ruled in Katz v US that warrantless wiretaps were unconstitutional so Congress passed laws creating wiretap warrants [1] for law enforcement. Congress didn't ban wiretaps without a warrant, they passed a law to create such warrants in response to a Supreme Court decision that banned what police had been doing for decades, Soviet style.
You're not being "wiretapped" if the site you visit explicitly included tracking. As an analogy, the sites that you visit are already providing one-party consent (all that is needed with current federal wiretapping laws) by including said intentional tracking.
In the case where non-essential sites are required to disclose what they track to the user, the user is arguably also providing consent on their end even if it's mandatory. Nobody is forcing you to visit non-essential sites.
If you told an American in 1995 that AT&T was selling their call records, or that Visa was selling their transaction records, they would have gone ballistic.
The fact that all this tracking is still hush-hush means that most people would not react kindly if they were aware of the true scope of the tracking, and how widely distributed the data is.
> Isn't it safer just to use an adblocker? Why engage with ad-networks at all?
> While AdNauseam is far safer than using no blocker at all, it is indeed marginally safer for one to simply use a strong adblocker and protect themselves. But it is also safer to stay at home rather than to attend a protest. Using an adblocker does little to change the status quo (especially for those users without the resources to install/configure one, and so remain at risk). AdNauseam, and the obfuscation strategy in general, instead presents a possible avenue for collective resistance; a means of questioning and perhaps eventually, changing the system. But this is not for everyone. If your goal is primarily self-protection, this tool may not be for you...
Will not happen. The U.S. military industrial complex views these sorts of algorithms as a vital technology in fighting future wars, and also fears that it's at a disadvantage to China because 4x population means a strategic advantage purely from the size of data sets Chinese ML researchers have access to.
That's odd: the test claims that my user agent is Go-http-client/1.1, and that there's gzip in HTTP_ACCEPT, after I've followed the redirect trail manually (with s_client(1)), only providing a GET request line and a host.
I've tested my Firefox. They said I don't have an Ad Blocker. Since I use uBlock Origin it means either my ad blocker is quite smart or their technology is poor which casts a shadow over this tool entirely. My 2 cents
I wonder why they claim that I'm uniquely identifiable. I'm using default settings for my Macbook, no additional extensions, and the latest update of Chrome. I would think that is relatively common, no?
Well, I don't know what your result says of course but I suspect the default settings in browsers aren't strict enough to avoid "conventional" tracking and that's how you were identified.
I never understand why browsers expose APIs to read the state of a canvas and WebGL results, or color depth of the screen. How often are they used for anything other than fingerprinting?
Note: at least for my profile, those are the only things that truly seem to give significant data. Everything else is in the 1 in 200-300 range, just by using FF on Android with uBlock.
Ok, so that's one use case. How many webapps are doing image manipulation (not to mention plain sites)?
There should really be a prompt whenever a site tries to access this information, especially in privacy-conscious browsers (I wouldn't expect Chrome to want such an anti-feature for their customers, the ad companies).
I've used them more times than I can recall. The browser is a creative tool, and being able to use all the creative features is great, actually.
Editing your profile picture is a good thing to do on the client side, for example; why waste CPU on your server? I do a lot of photo editing in the browser.
This is a feature that benefits everybody and the fact that it can be abused is unfortunate, but not as tragic as you paint it.
Photo editing could be designed in such a way that the JS code does not actually get read access to the canvas - it just specifies transformations.
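A hypothetical write-only API along those lines (everything below is invented, just to show the shape): the script queues transformations, the browser applies them natively, and pixel data never flows back to the script, which removes the read channel fingerprinters rely on.

```python
class WriteOnlyCanvas:
    """Hypothetical canvas: scripts describe edits but can never read
    rendered pixels back, so there is nothing to fingerprint."""

    def __init__(self) -> None:
        self._ops: list[tuple] = []

    def crop(self, x: int, y: int, w: int, h: int) -> None:
        self._ops.append(("crop", x, y, w, h))

    def brightness(self, factor: float) -> None:
        self._ops.append(("brightness", factor))

    def queued_ops(self) -> list[tuple]:
        # The script only ever sees the operations it queued itself;
        # the rendered result stays on the browser side.
        return list(self._ops)

canvas = WriteOnlyCanvas()
canvas.crop(0, 0, 128, 128)
canvas.brightness(1.2)
```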
It's becoming very clear I think that having this level of control for web apps is more detrimental than it is a positive. Leave rich apps to the OS, and keep the web as untrackable as possible.
I agree. And it didn't even seem to be aware I was already running Privacy Badger. And inside Privacy Badger I saw no obvious options to decrease my browser footprint. Sort of lame.