
Using HTTP meta-headers is actually something we seem to have forgotten how to do.

The one that annoys me most is the accept-language header which is almost entirely ignored in favour of GeoIP lookups to figure out regionality... which I find super odd; as if people are walking around using a browser in a language they don't speak. (or, an operating system configured for a language they don't speak).

ETags, though, are a bit fraught: if you're a company, a security scan will fire if an ETag is detected because you might be able to figure out the inode on the filesystem based on it... which, idk why that's a security problem either way[0], but it's common for there to be false positives[1]... which makes people not respect the header.

Last-Modified should work though, I love the idea of checking headers and not content.
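
For anyone who hasn't wired this up by hand, here is a rough sketch of the server-side check, assuming the request headers arrive as a plain dict and `last_modified` is a timezone-aware datetime. The function name `is_not_modified` is made up for illustration, and the comma split is naive (it would break on an ETag that itself contains a comma).

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def _strip_weak(tag):
    """Drop a leading W/ so weak and strong ETags compare equal."""
    return tag[2:] if tag.startswith("W/") else tag


def is_not_modified(request_headers, etag, last_modified):
    """Return True if the server may answer 304 instead of sending the body."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # When If-None-Match is present it is evaluated and
        # If-Modified-Since is ignored.
        if if_none_match.strip() == "*":
            return True
        candidates = {_strip_weak(t.strip()) for t in if_none_match.split(",")}
        return _strip_weak(etag) in candidates

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return False  # unparseable date: behave as if the header was absent
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates only have one-second resolution.
        return last_modified.replace(microsecond=0) <= since

    return False
```

If it returns True, the response is a bare 304 with no body, which is the "checking headers and not content" part.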

I think people don't care to imagine the computer doing as little as possible to get the job done, and instead use the near unlimited computing power to just avoid thinking about consequences.

[0]: https://www.pentestpartners.com/security-blog/vulnerabilitie...

[1]: https://github.com/sullo/nikto/issues/469



To add to the language problem, when I travelled to Europe, some websites (like YouTube) changed to whatever regional language based on where I was, despite me being logged in and Google knowing full well which languages I speak. Even the ads changed language, as if advertising in a language I don't speak will help anyone.


almost all of my spam is in french, which is an assumption on the part of the spammers based on the email username. almost all my gmail is spam, because i have directed most real email elsewhere. therefore, almost all the mail i receive at gmail is in french. this has led to google blocking things (like voter registration confirmation!) that are in english because they're "not in your normal language."


I think that Ads on Google are different in regards to licensing and targeting. A company might target "Users in Europe" for example.


I'm Canadian and I switched from an ISP headquartered in Ontario to one headquartered in Quebec (Teksavvy to Bell because of a bulk agreement my building got) and now half my youtube ads are in french, despite me living in Ontario.

Don't care either way, but it does make you think...


That's a targeting limitation.

I don't know OTOH whether "target audience's spoken language" is one of the signals an advertiser can key into in targeting an ad (at a glance, it looks like it might be). But (a) advertisers don't always have that signal and (b) advertisers themselves aren't always savvy enough to set it (how many American advertisers targeting Iowa actually tag their ads as "in English?"), so you'll end up with region targeting as a proxy for language targeting.

In your case, it's probably that the ad engine doesn't have enough info on you so it's falling back to geotargeting and hoping for the best (are you running with JavaScript disabled? Clearing cookies frequently? Avoiding logins? If so, these are all things known to decrease ad signal quality).


That's the point. Youtube/Google knows who I am and I'm logged in - my preferences are all set to english and my searches, videos watched, etc are all 99% english, with the rest being with english subtitles. I'm not talking about banner ads on random websites.

They literally have first class data.


Good point. Another hypothesis is that individual advertisers aren't using that data.

Back in the day, I had a front-row-seat to this process and I observed how often advertisers simply misconfigure a campaign and under-target it. If you don't set a targeting preference for a given indicator, the default can be to target everyone regardless of what that indicator says about them.

It might be the case that advertisers are saying they want to target you anyway (or failing to say one way or the other) even though they should have enough signal to know it's a wasted impression.


Yeah, that makes sense as well. The french ads come to me in waves and I can see lazy advertisers just "targeting quebec" or "all french regions" for the french runs.


IIUC, accept-language is mostly ignored because the tooling to configure it on the user agent is really poor for most user agents. So users log into a site, they get the site in the wrong language, and because only the site is visible they blame the site, not their UA.

It's the "Your site's broken if IE won't load it" problem.


Can someone attest that this is actually the issue?

FWIW Outlook does accept the "Accept-Language" header and I don't think anyone is saying that outlook is wrong for doing that or claiming it to be broken?

Are you totally sure that this isn't a backwards myth?

I think the most likely situation is that locale information for English-speaking countries would be incorrect if the default (en_US) was used to install the operating system, which happens on occasion.


I couldn't speak to Outlook, but Outlook is both popular enough and has enough people locked-in to its ecosystem (as business users) that it doesn't have to worry about losing users if Accept-Language doesn't do the right thing. Companies have IT departments to fix that stuff.

I'm talking more like https://www.buerklin.com/. If that site comes up in the wrong language and the only way to fix it is to change the user agent's Accept-Language header, the user isn't going to just figure it out; they're going to navigate elsewhere. So the site has a bug in the top-left to toggle English or German.

Something you mentioned up-thread that I should have commented on but overlooked:

> which I find super odd; as if people are walking around using a browser in a language they don't speak

... yes, all the time. In libraries and Internet cafes, schools, and other shared spaces.


But then, that's not worse, because geoip would already be forcing those users into a language they potentially don't speak by not listening to their device.

Or your library has somehow misconfigured their PCs when setting them up?

A cookie based override already exists, forcing geoip is strictly worse as a default, except for localising currency? I guess.


As a default, geoip is remarkably effective. When you don't know anything about the user, geography is a pretty decent predictor of language.


But, the user's browser is telling you the language they speak!!!


That's the problem: too often, it is not.

https://www.reddit.com/r/webdev/comments/7a2cfe/comment/dp77... for details: there are a lot of reasons speakers more comfortable with another language will have their OS locale (and therefore the accept-language header) set to English.


Growing up in Belgium I feel your pain about GeoIPs and accept-language.

I lived in Flanders, with my accept-language set to en-US, en.

Ads would pop up in Dutch, Flemish, French and sometimes German. When you think about it, from a brick-and-mortar point of view, it makes sense. I'm more likely to buy <physical product advertised> at the <local chain grocery store> vs buying it anywhere in the USA, based on my IP.

Next to that, imagine you browsing Reuters.com with a Berlin IP and accept-language set to en-US, en.

What SHOULD they show you? Local news in German, auto translated? Local news in German? Or redirect you to the US page?


Locality is different from language. In your example, it would have to show you the local German news, as that's local to you, and it would have to show it to you in the first supported language in your accept-language header.
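
To make the split concrete, a minimal sketch under some assumptions: the country code comes from a GeoIP lookup done elsewhere, the override stands in for a cookie the user set with a language toggle, and q-values are ignored so header order is taken as preference order. The function name and parameters are invented for this example.

```python
def choose_edition_and_language(geo_country, accept_language, supported_languages,
                                language_override=None, default_language="en"):
    """Pick the regional edition from location and the language from the user."""
    # Locality: which edition's content to show (e.g. German local news).
    edition = geo_country.lower()

    # An explicit choice by the user (e.g. stored in a cookie) always wins.
    if language_override in supported_languages:
        return edition, language_override

    # Language: first tag in the header whose primary subtag we support.
    for item in accept_language.split(","):
        tag = item.split(";")[0].strip().lower()
        primary = tag.split("-")[0]
        if primary in supported_languages:
            return edition, primary

    return edition, default_language
```

With a Berlin IP and `en-US, en` this gives the German edition rendered in English, assuming the site has English available.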

Personally I would prefer, for example, Reuters.com to be a "hub", and all the regional variants on de.reuters.com. Then just let the user choose what they want.


Even when ETags have nothing to do with the filesystem they can still be a security vector. Some APIs use ETags to identify what has changed since the last time you called a particular API. This means the ETag values are probably stored in a database, which means the API server needs to protect against SQL injection in the request headers.
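
A small sketch of what that protection usually looks like, using Python's built-in sqlite3 and a hypothetical `resources` table (the schema and the `is_unchanged` name are assumptions for illustration, not any particular API's design):

```python
import sqlite3


def is_unchanged(conn, resource_id, client_etag):
    """Return True if client_etag matches the stored ETag for resource_id."""
    # The ? placeholders make the driver treat the header value purely as
    # data, so quotes or SQL keywords inside it cannot alter the statement.
    row = conn.execute(
        "SELECT 1 FROM resources WHERE id = ? AND etag = ?",
        (resource_id, client_etag),
    ).fetchone()
    return row is not None


# Hypothetical in-memory table standing in for the API's real storage.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resources (id INTEGER PRIMARY KEY, etag TEXT NOT NULL)")
conn.execute("INSERT INTO resources (id, etag) VALUES (?, ?)", (1, '"v7"'))
```

The thing to avoid is building the query with string formatting from the If-None-Match value; with placeholders a hostile header is just a string that fails to match.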


I mean that's something you need to do every time a DB is involved. Not really an argument against ETags.


>as if people are walking around using a browser in a language they don't speak. (or, an operating system configured for a language they don't speak)

Well, yes, they are! Computers translated in my native language sound dumb. That's how a whole generation of my world learned better English than native speakers, ffs!

Half of the time it's just translated wrong. You think anyone has any incentive to translate any technology to a language with a couple million speakers, all of whom are obligate pirates?

And it seems like you might be surprised to hear that people speak more than one language. Then where's my global setting to tell the browser what languages I speak, so it'd know what header to send? Same place that lets me configure what ads I'm actually interested in. Nowhere.

>I think people don't care to imagine the computer doing as little as possible to get the job done, and instead use the near unlimited computing power to just avoid thinking about consequences.

This, friend, is what computers are for in the XXI century. "Bicycle for the mind", ha...


Accept-Language is an array, not a string.

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Ac...


[flagged]


Just so you're aware, the way in which you've chosen to conduct this conversation is not in keeping with the desired environment.

I'm not entirely sure why you are coming across as deeply emotional about this topic, but it's really not worth getting angry over.

However, you made a specific point:

> And it seems like you might be surprised to hear that people speak more than one language.

And I was mentioning that, actually, the header has support for multiple languages and the point is to fall back to the one you actually support; so if a site is translated into Spanish, but not Catalan, then a person living in Barcelona might have ["ca-ES", "es-ES"], and actually only be served "es-ES" Spanish despite requesting the former as a preference.

Also, there are mechanisms for changing this locale away from the operating system's choice, but I would wager a sane default is to use the localisation of the operating system, as that is largely going to be localised for the person already, more so than browser fingerprinting(?) or GeoIP lookups, as computers move just like humans do throughout our world.
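
Roughly, the fallback works like this. This is a simplified sketch, not a full implementation of the matching rules: the function names are made up, it sorts by q-value, and it only truncates the requested tag (ca-ES, then ca), so a request for plain "es" would not match a site that only lists "es-ES".

```python
def parse_accept_language(header):
    """Return the language tags in the header, most preferred first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a tag without q= counts as q=1
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            weighted.append((-quality, position, tag))
    # Highest quality first; header order breaks ties.
    return [tag for _, _, tag in sorted(weighted)]


def pick_language(header, supported, default):
    """Return the first supported language the user asked for, else default."""
    by_lower = {s.lower(): s for s in supported}
    for tag in parse_accept_language(header):
        candidate = tag
        while candidate:
            if candidate in by_lower:
                return by_lower[candidate]
            # Drop the last subtag: "ca-es" becomes "ca", then "".
            candidate = candidate.rpartition("-")[0]
    return default
```

So `pick_language("ca-ES, es-ES;q=0.8", ["es-ES", "en"], "en")` skips Catalan, which the site doesn't have, and lands on "es-ES".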


> I would wager a sane default is to use the localisation of the operating system, as that is largely going to be localised for the person already

I already explained why in many parts of the world this would not be the good wager you think it is.

>moreso than browser fingerprinting(?)

I'm not saying browser fingerprinting is a good way to determine what language to serve to the user.

I'm saying setting the headers to non-default values (and especially ones that represent actual facts about the user, such as what languages they can be expected to understand) can be used for fingerprinting, and that's probably the most sensible reason to avoid making use of such features that otherwise would have been, as you say, benign and quite convenient.

>I'm not entirely sure why you are coming across as deeply emotional about this topic, but it's really not worth getting angry over.

I'm not getting angry! I am having fun fun FUN! Your culture requires me to be having fun fun FUN in order to not be gradually destroyed from the inside! Or from the outside!


I live in Sweden, our culture does not require anyone to be having fun.

Lagom; https://en.wikipedia.org/wiki/Lagom


> can be used for fingerprinting

That's an interesting point. However, given that this is a language preference, users are quite likely to manually select the correct language if the incorrect one loads up. At which point you have shared that information anyway.

I guess there's an argument to be made against blasting that information out to every last third party though. Perhaps it should only ever be sent to the target that appears in the address bar.


You really don't need to have Accept-Language overlap with language of browser. I'm sorry but reading-comprehension department seems out today. I suggest you try reading without assistance.

Also would you please tone it down and stop having such a confrontational and aggressive tone in your comments?


Would you please show me where that fact was pointed out prior to invoking the reading comprehension department?


> Then where's my global setting to tell the browser what languages I speak, so it'd know what header to send?

In Chrome: chrome://settings/languages

In Firefox: https://support.mozilla.org/en-US/kb/choose-display-language...


Cool! I didn't know that.

I tried it out then reverted to the default.

Because I keep forgetting it's not the 90s and "we" have also invented such brilliant things as browser fingerprinting.


> Then where's my global setting to tell the browser what languages I speak, so it'd know what header to send?

Firefox: https://support.mozilla.org/en-US/kb/choose-display-language...

Chrome insists that the first language be the UI language, and Safari insists that the first language be the _system_ language.


You can set a per-app language for Safari in macOS/iOS System Settings instead of using the system language


Oh yeah, you’re right. But that just gets you Chrome’s behaviour (the first preferred language has to equal the browser UI language).

So I suppose GGP is mostly right, in the sense that most browsers get this wrong (except Firefox).


Yeah it's still not ideal.

As someone who lives between 3 languages what I'd really like is a browser setting for language per-site. E.g. I want my Swedish bank's site in Swedish, not the English translation, I want Google Maps in Japanese so I can see the Kanji for the station names, but I want the AWS console in English. Each of these sites has its own toggle, but they are very inconsistent and keep resetting; the browser would do a much better job.

I wonder if this can be done with a browser extension.


Yes, and this is the actual issue: user agent configuration for preferred language is poor, users blame sites when they can't read the site, so it's in the site's best interest to ignore the broken thing and use a heuristic.


Yeah, I suppose you’re right about that.


> Then where's my global setting to tell the browser what languages I speak, so it'd know what header to send? ...Nowhere

Look again. Or switch browser. It is a basic feature and the issue is indeed websites ignoring it.



