Almost every internet user would like to believe that their search history belongs to them. Searches can reveal private medical information, sexual peccadilloes and even family tragedies, all of which one would like to keep private. Sadly, that’s not the case.
As with maintaining browser privacy, defending oneself against search engine surveillance is a difficult task. It’s very hard to find a search engine which doesn’t exploit one’s privacy. While there are major browsers which can be made more or less private, like Firefox or Brave, the situation with search engines is much worse.
Before examining individual search engines, let’s start by reviewing worldwide market share.
Diversification of any kind? This tendency obviously doesn’t apply to search engines. Here’s what UK market share looks like:
It’s shocking to see the dominance Google enjoys in the market. Even more astonishing – none of the search engines which do protect user privacy even make the table. To understand how little privacy search engines allow us, one can start by simply reading their privacy policies.
Here’s a summary of my findings in table form:
| major search engines | personalised ads | open source | location | own index | 3rd party sharing | IP storage |

Privacy-focused alternatives:

| alternative search engines | personalised ads | open source | location | own index | 3rd party sharing | IP storage |
|---|---|---|---|---|---|---|
| Ecosia | optional (opt in) | no | Germany | no | yes | yes |
| Swisscows | no | no | Switzerland | only German results | yes | no |
I will go through the results search engine by search engine below.
A Tragic Path From Google to Yandex
When you use our services, you’re trusting us with your information. We understand this is a big responsibility and work hard to protect your information and put you in control.
That’s exactly how Google’s privacy & terms section is introduced. After a few lines explaining how Google cares about its users and products, they openly state that they collect information to understand which language you speak, which ads you’ll find most useful and which people matter most to you online. Only two paragraphs in, rather scary information has already surfaced (note that the whole document runs to around 4000 words). The line about the people who matter to us is especially chilling, because it suggests profiling and individualisation of users.
Google collects all the possible information to share with affiliates and other trusted partners or persons.
On the other hand, at this point few privacy-conscious people have any expectations of fair play from Alphabet/Google. What concerns me more is that most of the other search engines on the list are as invasive as Google, wrangling our personal data in a very similar manner.
The second most used search engine, provided by long-time CIA and NSA partner Microsoft, is no less of a privacy nightmare. Like Google, Microsoft places one or more cookies with unique identifiers, known as Search ID, on your devices, along with third party web beacons, which they claim are now allowed to collect information about you.
When you’re conducting a Bing search, Microsoft explicitly states that it collects the following data:
- your IP address
- the unique identifiers contained in our cookies or similar technologies
- the time and date of your search
- your browser configuration
We also use “web beacons” to help deliver cookies and gather usage and performance data. Our websites may include web beacons, cookies, or similar technologies from third-party service providers.
Once again, nothing surprises me when it comes to Microsoft. When I skimmed through the privacy policies of Yahoo!, Baidu or Yandex, I wasn’t surprised either, for that matter. All three collect all possible information for themselves while sharing most of it with third parties, whom they designate as “trusted partners”. The list runs to dozens of ad and marketing agencies and services.
Is there a suitable alternative to these giants?
(Mostly false) claims about the independence of search results
Using a search proxy when searching Google or Bing isn’t satisfactory either. According to the study described in the respective TechCrunch article, even heavily anonymised data are re-identifiable. Researchers were able to correctly identify 99.98% of individuals in anonymised data sets with just 15 demographic attributes. The conclusion even claims that it’s impossible to protect personal information against re-identification with current methods of data “anonymisation”. According to the study’s abstract:
Our results suggest that even heavily sampled anonymized datasets are unlikely to satisfy the modern standards for anonymization set forth by GDPR and seriously challenge the technical and legal adequacy of the de-identification release-and-forget model.
With that in mind, the only safely stored data is no data.
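To see why a handful of attributes is enough, here is a minimal sketch in Python. It is not the model used in the study; it only counts how many records in a made-up data set become unique as more attributes are combined. The records, the field names and the `unique_fraction` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical "anonymised" records: no names, only demographic attributes.
RECORDS = [
    {"zip": "10115", "birth_year": 1984, "gender": "f", "car": "none"},
    {"zip": "10115", "birth_year": 1984, "gender": "m", "car": "none"},
    {"zip": "10115", "birth_year": 1990, "gender": "f", "car": "vw"},
    {"zip": "10117", "birth_year": 1984, "gender": "f", "car": "none"},
    {"zip": "10117", "birth_year": 1990, "gender": "m", "car": "vw"},
    {"zip": "10117", "birth_year": 1990, "gender": "m", "car": "bmw"},
]


def unique_fraction(records, attributes):
    """Return the share of records whose attribute combination is unique."""
    keys = [tuple(r[a] for a in attributes) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)


if __name__ == "__main__":
    fields = ["zip", "birth_year", "gender", "car"]
    # Add one attribute at a time and watch the unique share grow.
    for n in range(1, len(fields) + 1):
        share = unique_fraction(RECORDS, fields[:n])
        print(f"{n} attribute(s): {share:.0%} of records are unique")
```

In this toy data set nobody is unique by postcode alone, yet every record is unique once all four attributes are combined. A record that is unique can be linked back to a person by anyone who knows those same attributes from another source.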
DuckDuckGo – privacy smoke and mirrors?
DuckDuckGo might be the most famous “privacy first” search engine. If you search for the most private search engines, this is what Google is most likely to spit out. For some reason, most of those “Top 10 Search Engines For 2021” articles tend to show a great deal of love for the US-based DuckDuckGo, founded by a guy with a rather shady background.
Yes, you’re reading that correctly. Before he founded DuckDuckGo, Gabriel Weinberg ran a now-defunct social network called Names Database, which he sold, along with its meticulously collected user data, to another company called Classmates.com for about $10 million. From the beginning, Names Database abused the privacy of its users. To gain access you could either pay money or submit a lot of other people’s email addresses, which Names Database collected.
Afterwards, Mr. Weinberg came up with the idea to compete with Google, as a response to privacy awareness triggered mainly by Snowden’s revelations.
What troubles me is the timing that suggests Weinberg’s intentions. He never really cared about privacy. For instance, why would he generate income from showing Bing Ads and collect affiliate revenue from privacy violators like Amazon and eBay?
His privacy-based marketing, which helped him successfully promote DuckDuckGo, only came to life after the public became more privacy aware, and that awareness helped him sell his product. Disclosure of the NSA’s PRISM programme apparently improved privacy awareness, which is a good thing (or at least something). But honestly, is DuckDuckGo the right answer?
Outside of Weinberg’s dubious history, DuckDuckGo’s technology raises concerns in itself. For example, DuckDuckGo is advertised as open source software, but its core is proprietary. Thus, no one can see the whole source code and we have to take the word of a guy who has a history of privacy abuse. The open source nature of DuckDuckGo is, just like the well-crafted privacy illusion, yet another of Mr. Weinberg’s sales bullets. For example, in 2018 DuckDuckGo took part in the FOSDEM conference (Free and Open source Software Developers’ European Meeting), and this is how misleadingly DuckDuckGo was introduced:
Privacy was THE hot topic of 2018, but at DuckDuckGo we’ve been raising the standard of trust online since 2008. Find out why privacy is important, how we’re helping raise the standard of trust online, and the part that Open Source plays in our mission.
…What search engines often do is store a unique identifier in your browser and then associate that identifier with your searches. At DuckDuckGo, no cookies are used by default.
DuckDuckGo is using the Canvas DOMRect API on their search engine. Canvas is used to make unique geometry measurements on target browsers, and DOMRect API uses rectangles. This can be verified with the CanvasBlocker Firefox add-on by Korbinian Kapsner.
DuckDuckGo, of course, denies such claims.
Qwant – Microsoft and Axel Springer? No, thanks.
Qwant.com represents yet another comparatively popular alternative among the privacy-conscious community. The France-based search engine, which claims to be privacy oriented, has its servers on the other side of the Atlantic, far away from the NSA’s prying eyes. Furthermore, like any other EU-based company, Qwant is subject to the GDPR and the ePrivacy Directive, together the world’s strictest privacy regulation.
- The entered keywords
- Information about the browser you use (the User Agent)
- Session preferences information (if you use Qwant set up for the results in France and with the user interface in the French language, for example)
- A salted hash of the user’s IP address with the salt that changes every three months at the latest (i.e. the result of a mathematical formula based on this IP address, not the IP address itself)
- The approximate geographic area of origin of the search at the scale of a region or a city (as deduced from the IP address)
Keywords associated with a pseudonymous identifier, which is calculated from the User Agent of the browser and the salted hash of the IP address, are retained for 7 days. Afterwards the keywords are no longer associated with any identifier and are retained for aggregate statistical purposes for 12 months. This is just the start of Qwant’s tracking, though.
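The identifier described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Qwant does not say which hash function or salt length it uses, so SHA-256, the 16-byte salt and the function names `salted_ip_hash` and `pseudonym` are hypothetical choices.

```python
import hashlib
import secrets


def salted_ip_hash(ip: str, salt: bytes) -> str:
    """Hash the IP together with a secret salt; the raw IP is not stored."""
    return hashlib.sha256(salt + ip.encode("utf-8")).hexdigest()


def pseudonym(ip: str, user_agent: str, salt: bytes) -> str:
    """Derive a pseudonymous identifier from the User Agent and the salted IP hash."""
    material = user_agent + "|" + salted_ip_hash(ip, salt)
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:16]


if __name__ == "__main__":
    salt_now = secrets.token_bytes(16)    # hypothetical salt for the current period
    salt_next = secrets.token_bytes(16)   # replacement salt after the rotation
    ua = "Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0"
    ip = "203.0.113.42"                   # documentation address (RFC 5737)

    first = pseudonym(ip, ua, salt_now)
    again = pseudonym(ip, ua, salt_now)
    later = pseudonym(ip, ua, salt_next)
    print("same salt, same identifier:", first == again)
    print("new salt, same identifier:", first == later)
```

While the salt stays the same, the same browser on the same connection always produces the same identifier, which is what lets keywords be linked to one another during the retention window. The protection also depends entirely on the salt staying secret: there are only about four billion IPv4 addresses, so anyone who obtains the salt can try them all and reverse the hash.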
Qwant is open about sending data to its partner Microsoft for “more relevant results”, including pseudonymous data like:
- The keywords of the search
- Information about the browser you are using (the User Agent)
- The first three bytes of your IP address
- The approximate geographical area from which the search originated, at the level of a region or city
- The salt hash generated from your IP address, your User Agent and a salt that changes at the latest every 3 months
- A random token generated by Qwant (aimed at limiting data overlap)
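How much does sharing only the first three bytes of an IP address hide? The sketch below uses Python’s standard `ipaddress` module to truncate an IPv4 address and to count how many full addresses share the truncated value. The helper names `truncate_ipv4` and `candidates` are hypothetical, and this is not Qwant’s actual code.

```python
import ipaddress


def truncate_ipv4(ip: str, keep_octets: int = 3) -> str:
    """Zero out everything after the first `keep_octets` bytes of an IPv4 address."""
    if not 0 <= keep_octets <= 4:
        raise ValueError("keep_octets must be between 0 and 4")
    prefix_len = keep_octets * 8
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.IPv4Network(f"{ip}/{prefix_len}", strict=False)
    return str(network.network_address)


def candidates(keep_octets: int) -> int:
    """Number of full addresses that share one truncated value."""
    return 2 ** (32 - keep_octets * 8)


if __name__ == "__main__":
    ip = "203.0.113.42"  # documentation address (RFC 5737)
    for octets in (3, 2):
        print(truncate_ipv4(ip, octets), "hides among", candidates(octets), "addresses")
```

With three bytes kept, the sender is one of at most 256 addresses. Combined with the User Agent and the approximate location from the same list, that is a small crowd to hide in.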
Enough Microsoft? Not really. The partnership between Qwant and Microsoft appears to be rather deep and devoted.
So, is Qwant the right answer to a privacy-conscious user’s needs? No way, not with these red flags.
Quite apart from privacy issues, our creative director Alec Kinnear ran a six-week intensive trial of Qwant as his main search engine in 2020 and reports that its search results were disappointing enough to make Qwant unusable as a primary search engine. This was before we did the privacy research on Qwant and the other search engines, so his impressions were not coloured by politics.
After succeeding in delivering a privacy-oriented browser, Brave Software Inc. also developed its own search engine, a logical next step once you have a thriving browser. Brave is well known for being vocal about privacy and transparency. Now that they have released their own search engine, one would hope for a suitable alternative to major spyware like Google, Bing, Yahoo or Yandex… But can we trust them?
Brave Search is designed to be private by default. We don’t collect personal information about you, your device, or your searches. We also don’t transmit information to the web that could be used to profile you or track you or learn anything about you. Your searches are private to YOU.
With an introduction like this, one would hope for more privacy than the competition grants. But is it really so ideal?
While researching Brave Search I failed to find anything nearly as compromising as with many other search engines. The only issue might be the opt-out nature of the usage metrics. If you don’t opt out, Brave collects:
- Number of daily/weekly/monthly visits
- Number of returning visits
- Number of search queries per day
- Average query length
- What percentage of queries led to a user clicking a search result
- How many users have chosen to leave feedback about Brave Search
- The operating systems people use when they visit (e.g. macOS, Windows, etc)
- The browser you’ve visited from (e.g. Brave, Chrome, Safari, etc)
Even more importantly, it is possible to opt out of Brave’s tracking.
Lastly, unlike metasearch engines, no matter how privacy focused, Brave Search has its own index and is able to deliver solid results, which Mojeek, for instance, fails at.
Ecosia, aka Microsoft in green disguise?
Ecosia is truly an ambitious project. Based in Germany, their business model includes support for tree planting, which receives 80% of their profit (47.1% of revenue). Privacy and ecology, that is their goal in a nutshell. But are there any downsides to it?
OK, Ecosia doesn’t tie my searches to a unique ID by default anymore, what a relief! On the other hand, like any other search engine that relies on the results of a partner company (mostly Google, Microsoft or Yahoo!), Ecosia sends some data to the main search engine it takes results (and ads) from. And that is a problem. No matter how privacy friendly they are, their dependence on Bing results undermines one of their pivotal ideas: privacy.
We don’t create personal profiles of you based on your search history. We actually anonymize all searches within one week.
Does it mean that before the end of the week searches are NOT anonymised?
We do work with third parties to answer your search requests. To some of our partners we don’t send your search query or any other information, like the IP address. For example, for our weather results we only send the location we are requesting weather information for. To other partners we have to forward more details in order to answer your search request.
For example, when you do a search on Ecosia we forward the following information to our partner, Bing: IP address (obfuscated), user agent string, search term, and some settings like your country and language setting. We never communicate IP addresses along with search queries. We only send IP addresses to Microsoft in obfuscated form, meaning we remove parts of the IP address when we sent it.
This frees the user from most of Bing’s tracking, so it’s a huge step forward in privacy. There are, however, some privacy issues which arise from Ecosia’s use of third-party tools.
Ecosia does use third party tools on non-search pages where absolutely necessary to help us understand and tune our marketing campaigns. For example, we may run an ad campaign on a social media site that requires us to share some data about user activity in order to help us understand how effective that campaign is. This helps us to prevent spending money on unsuccessful advertising campaigns that could instead be used to plant trees.
Right, tree planting is important, but it shouldn’t serve as an excuse for sharing data with ad companies. Again, there is a silver lining here. Ecosia states that they use neither external analytics services like Google Analytics nor any other third party trackers. Still, Ecosia does use its own internal tools for analytics:
We collect data and do statistical analysis to understand user behavior and trends, how people use our services, and to monitor, troubleshoot, and improve Ecosia.
As mentioned earlier, many online sources claim that Ecosia creates your Bing Client_ID by default and sends it straight to Microsoft’s servers, but this is only partially true. Ecosia does create your Bing-specific Client_ID parameter, but only if personalised search is enabled and “Do Not Track” is disabled in your browser settings. I contacted Ecosia’s support team for clarification. When Ecosia finally replied, they were conscientious enough to send a precise answer:
Thanks for reaching out and for your patience! Ecosia does care about privacy and we are always willing to improve. About your questions:
Yes, the policy was recently updated, the date of the update can be seen at the end of the policy (8th November 2021).
The bing ID was enabled before, but is now disabled by default. So it was opt out before and is now opt in. We are serving results from Yahoo to a small percentage of users in some markets in South America and Asia. We are attributing, where the search results come from and are testing with Yahoo in these markets. We do have to send the whole IP Address to Yahoo.
We hope this answers your questions, feel free to get back to us if needed.
(name redacted) from Ecosia team
It’s great news that Ecosia has cut back on the personal data they send to Bing, but it’s a disappointment to learn that Ecosia sends the full IP address to Yahoo in “certain regions”.
Originally, I was keen to use Ecosia to support their tree planting mission (approximately every 45th search generates revenue for a tree to be planted). But there’s a twist. As described in Jasmine Owens’ article, the money for tree planting comes from Bing ad revenue, which is based on a pay-per-click advertising model. Thus, if you’re like me and unconditionally block all ads, your searches won’t help Ecosia with tree planting at all. So the only things that remain are Bing results and some data sent over to Microsoft, Cloudflare, and potentially Yahoo. In this case, there are no trees planted and not much privacy.
MetaGer – IP addresses logged but just for 96 hours
MetaGer is free software under the AGPL-3 (Affero General Public License) belonging to the non-profit German organisation SUMA-EV. It is a metasearch engine which takes results from as many as about 50 search engines, to protect against censorship, as they claim. It certainly sounds like a viable option in a world of privacy-compromising search engines, so it calls for a closer look.
To protect our service from congestion, we need to limit the number of search queries per Internet connection. For this purpose alone, we store the full IP address and a timestamp for a maximum of 96 hours. If a noticeable number of searches are performed by an IP, this IP is temporarily stored in a revocation list (maximum 96 hours after the last search). Then the IP is deleted.
Furthermore, when using MetaGer web search engine via the web form or through the OpenSearch interface, the following data is generated (not shared):
- IP address
- entered search query
- user agent name
- location data
In the next paragraph they state that a truncated IP address (first two blocks only), together with the user agent, is also sent to their advertising partners to finance their operations. When you visit their website, yet another load of data is transmitted to SUMA-EV and stored for a week: your IP address, the name and URL of the retrieved file, the date and time of access, the referrer you sent, and the user agent you sent.
Once again, I wasn’t happy to learn that they collect the full IP address, so I approached their support team. Here’s MetaGer’s answer:
Hello Pavel S,
We did not store any IP-Adresses a while ago until heavy attacks targeted our services. We could not just ignore those attacks for various reasons and thus needed to be able to stop the attackers.
To block requests from attackers there is no other way than blocking their IP-Adresses and thus storing them temporarily. To distinguish attackers from users all Adresses have to be stored temporarily. The adresses from users however get deleted again within seconds or minutes automatically and by design cannot be connected to specific searches or other potentially identifying metadata.
If there was another way around that problem we would go it but we cannot think of one. If you have suggestions we are happy to hear them.
Sincerely Yours, (name redacted)
This is a reasonable position.
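The approach MetaGer describes, storing IP addresses with a timestamp purely for rate limiting and deleting them after a fixed period, can be sketched roughly as follows. This is an assumption-laden illustration, not MetaGer’s implementation: the `RateLimiter` class and the threshold of 100 requests are hypothetical, and only the 96-hour ceiling comes from their policy.

```python
import time

RETENTION_SECONDS = 96 * 60 * 60  # the 96-hour maximum from the policy
MAX_REQUESTS = 100                # hypothetical threshold, not MetaGer's


class RateLimiter:
    """Keep per-IP timestamps only as long as needed to spot abuse."""

    def __init__(self, max_requests=MAX_REQUESTS, retention=RETENTION_SECONDS):
        self.max_requests = max_requests
        self.retention = retention
        self.seen = {}  # ip -> list of request timestamps, nothing else

    def purge(self, now):
        """Drop timestamps older than the retention window, then empty entries."""
        cutoff = now - self.retention
        for ip in list(self.seen):
            recent = [t for t in self.seen[ip] if t > cutoff]
            if recent:
                self.seen[ip] = recent
            else:
                del self.seen[ip]

    def allow(self, ip, now=None):
        """Record a request and report whether the IP is under the limit."""
        now = time.time() if now is None else now
        self.purge(now)
        self.seen.setdefault(ip, []).append(now)
        return len(self.seen[ip]) <= self.max_requests


if __name__ == "__main__":
    limiter = RateLimiter(max_requests=2, retention=10)
    print([limiter.allow("198.51.100.7", now=t) for t in (0, 1, 2)])
    print(limiter.allow("198.51.100.7", now=100))  # old entries have expired
```

The point of the design is what is left out: the store holds an address and timestamps, never the search query, so the records cannot be tied to specific searches.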
Startpage – bought out by an internet advertising technology company
…in order to enable the prevention of click fraud, some non-identifying system information is shared, but because we never share personal information or information that could uniquely identify you, the ads we display are not connected to any individual user.
On the other hand, the ownership represents a potential problem. When Startpage was solely owned by the Dutch company Surfboard Holding B.V., their self-proclaimed privacy mission was credible. Startpage’s current owner is the infamous advertising company System1. System1 is known for buying up privacy-oriented browsers and search engines as well as for data collection and targeted advertising. System1 are self-proclaimed “pioneers in behavioral marketing science”. According to the official statement, System1 now has majority ownership of Startpage:
Startpage BV is owned by Surfboard Holding B.V.— both are Dutch companies and continue to be managed by Startpage’s founders. Besides the stake that System1 acquired through the Privacy One Group (its wholly owned subsidiary), the original founders remain shareholders in Surfboard Holding B.V. System1 has majority ownership of Startpage, although as noted above, the Startpage founders have control over the privacy components of Startpage.
Of course, it’s no surprise that Michael Blend, System1’s chairman and co-founder, is now also on the Surfboard Holding Board of Directors. How will System1 and Blend personally affect Startpage’s privacy standards? All parties involved are rather tight-lipped, offering information of barely any value, just very formal, neutral statements. Is this a real issue? Make up your own mind, but it’s unlikely that a well-known advertising company would buy a search engine and not take advantage of the data harvesting potential of their new acquisition.
Startpage has made a public statement to computing.co.uk:
“System1 is interested in Startpage’s ad revenue, not its data,” the company said. “The reason a company like System1 openly owns other search engines and consumer tech products like Info.com and Mapquest is that they want to capture that ad revenue that is slowly shifting to private search engines. There has been a steady increase in people using private search engines and therefore a steady increase in their revenue. It is a growing market that they feel will continue to thrive and grow.”
Is this explanation credible? Given System1’s history of targeted advertising, probably not.
The original independent Startpage was a very good alternative search engine for the privacy-focused individual. We’ll see what the future brings for their “privacy mission”. After all, Startpage claims first place among System1’s brands.
Mojeek, the one with its very own (trash) results
Mojeek is an interesting search engine, since it’s one of the few privacy-focused search engines that relies entirely on its own crawlers and doesn’t take search results from third parties. The benefit of having their own index is obvious: no matter how anti-Google or anti-Microsoft other search engines claim to be, they most likely take their results (or ads) from Google or Bing, and it is precisely these two companies to which they send your data to obtain the results.
- the time of visit
- page requested
- referral data
- the IP address is replaced with a two-letter code indicating your country
- browser information in separate log
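Replacing the IP address with a country code before anything is written to the log could look roughly like the Python sketch below. It is a guess at the general technique, not Mojeek’s code: the lookup table, the `country_code` and `log_entry` helpers and the country assignments are hypothetical, and a real service would use a GeoIP database instead.

```python
import ipaddress

# Hypothetical lookup table; a real service would use a GeoIP database.
# The networks are reserved documentation ranges (RFC 5737).
COUNTRY_BY_NETWORK = {
    ipaddress.ip_network("192.0.2.0/24"): "GB",
    ipaddress.ip_network("198.51.100.0/24"): "DE",
    ipaddress.ip_network("203.0.113.0/24"): "FR",
}


def country_code(ip: str) -> str:
    """Map an IP address to a two-letter code, or 'ZZ' when unknown."""
    address = ipaddress.ip_address(ip)
    for network, code in COUNTRY_BY_NETWORK.items():
        if address in network:
            return code
    return "ZZ"


def log_entry(ip: str, timestamp: str, page: str, referrer: str) -> dict:
    """Build a log record in which the IP is replaced by the country code."""
    return {
        "time": timestamp,
        "page": page,
        "referrer": referrer,
        "country": country_code(ip),  # the IP itself is never written
    }


if __name__ == "__main__":
    print(log_entry("203.0.113.7", "2021-11-08T12:00:00Z", "/search", ""))
```

Because the address is discarded before the record is built, a later breach or subpoena of the log cannot recover it; only the country remains.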
What might be concerning is that no retention period is specified, which may suggest that Mojeek retains this information for an indefinite period of time. But speculation aside, Mojeek really takes a minimum of personal data and doesn’t share it with third parties. Furthermore, it’s one of the few alternatives to the “big players” which has its own results. Sadly, search results are not good enough to make Mojeek a viable alternative just now.
Hailing from the Swiss Alps, Swisscows is a relatively young, privacy-first alternative. They claim not to collect any data; moreover, they are supposed to be against surveillance on principle, or at least it’s one of their pivotal selling points. Swisscows even employs its own attorneys, who consider the legitimacy of potential government surveillance requests.
Nice to know, but how about the “limited information” Swisscows collect?
Swisscows looks like a great option, but only at first glance. The problem is with Swisscows’ advertising model.
Swisscows takes ads from Bing based on your search queries, which are sent to Bing – thus, Microsoft has access to whatever you enter in the search bar.
Another alarming issue with Swisscows, as with Ecosia, is that they anonymise whatever data they collect only after 7 days. What happens during the week before anonymisation? No one knows. On a positive note, Swisscows claims that by the time your data (obfuscated IP address, search terms, user agent with no unique ID and the setting for the search area) is sent to their partners, it has been carefully anonymised.
On the other hand, Swisscows’ servers (which also contain your non-anonymous data for a week) are based in Switzerland. While they pride themselves on being located outside the Five Eyes countries and even the EU, Switzerland has surveillance issues of its own, despite nominal neutrality and no international intelligence alliance agreements.
After Snowden’s revelations of mass surveillance in 2013, many tech companies moved to Switzerland to reduce the risk of being exploited by governments. However, in 2015 a new law, backed by a referendum, was passed, which strengthens the position of the Swiss SRC spy agency:
The law expands the surveillance capabilities of the Swiss SRC spy agency to give them the power to lawfully hack into computers and install malware, tap phones and internet comms and install hidden cameras and bugs in private locations to gather data.
After the passing of this new legislation, and with the great pressure the US has been able to bring to bear on Swiss private banking laws, how safe is data in Switzerland? While the government claims to enforce this law only in extreme cases, even the enormously wealthy Swiss banks were unable to maintain their independence and privacy under US pressure. Are Swiss security organs any more independent and reliable?
Combined with the partnership with Bing and the attendant mediocre results, Swisscows offers neither much innovation nor a real guarantee of privacy.
As we can see, even most of the search engines which base their marketing on privacy are exploiting our privacy. Search engines can be divided into three general groups:
- Google, Bing and Yandex who offer first tier results
- other, mostly metasearch, engines which take their results from the above: Yahoo (Bing), Ecosia (Bing), Startpage (Google), Swisscows (Bing), MetaGer (multiple sources)
- niche engines like DuckDuckGo, Qwant, Brave Search, and Mojeek
Before I started the in-depth research for this article I favoured DuckDuckGo for their privacy stance, Qwant for their European-ness, and especially Ecosia for their commitment to ecology. After what I discovered during my research, I’m a lot more sceptical. On a positive note, Brave Search seems the most solid choice. Brave’s claims about search privacy are backed up by Brave’s browser privacy claims which I was able to test for an earlier article. Brave’s independent index, own crawlers and reasonable results have persuaded me to set my default search engine to Brave.
As a Brave browser user, I do worry that whatever little data Brave Software’s browser and search engine collect goes to the servers of the very same company, so eventual re-identification would be easier than if I split my data between two separate companies. Ideally Brave would not be headquartered in a Five Eyes country, let alone the United States. Europe-based Ecosia and Swisscows would be good alternatives, but both share too much data with Bing and Microsoft to be considered private. Startpage could pass as a nice alternative with clean Google results and no heavy tracking, but System1 has given the world little reason to trust them with privacy.