How uBlacklist Works
V2EX recently discussed the wave of content-farm results polluting Google (e.g., “Little X Knowledge”). Since we moved from Baidu to Google for better quality, seeing garbage in Google results is frustrating.
The community solution: install uBlacklist, subscribe to a shared blacklist repo, and block those domains client-side. It works surprisingly well.
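For reference, a subscription is just a hosted text file of rules, one per line, written as browser match patterns (the domains below are placeholders, not entries from any real shared list):

```
*://*.content-farm.example/*
*://spam-mirror.example/*
```

The extension refreshes subscribed lists periodically, so newly reported spam domains propagate without manual updates.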
Curious about the implementation, I read the source code. The core idea is simple: intercept the search results page, match entries against the blacklist, and hide them via CSS. In other words, the extension modifies the HTML after the page loads.
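As a rough illustration of that idea (not uBlacklist's actual code; the selector and blacklist entries are made up for the example), a content script could do something like this:

```typescript
// Hypothetical blacklist; real lists come from subscriptions and user rules.
const blacklist = new Set(["content-farm.example", "spam-mirror.example"]);

// "#search a[href]" is an assumed selector for Google's results container;
// a real extension maintains selectors per search engine.
for (const link of document.querySelectorAll<HTMLAnchorElement>("#search a[href]")) {
  const hostname = new URL(link.href).hostname;
  const blocked = [...blacklist].some(
    (domain) => hostname === domain || hostname.endsWith("." + domain),
  );
  if (blocked) {
    // Hide the enclosing result block via CSS, matching the description above.
    link.closest("div")?.style.setProperty("display", "none");
  }
}
```

Search engines change their markup regularly, so maintaining those selectors is the hard part; the CSS hiding itself is simple.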
Limitation: because blocked entries are removed after the fact, each results page may show fewer entries than usual, but that is an acceptable trade-off compared with seeing spam.
What About Baidu?
This approach requires the real result URLs to be present in the DOM. Baidu proxies all search results, so every link points to a Baidu redirect first, and uBlacklist cannot reliably match domains.
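To make the problem concrete, here is what domain matching sees on a proxied result (the URL shape is illustrative, not Baidu's exact format):

```typescript
// The anchor's href is a redirect on the proxy's own domain, so naive
// hostname matching only ever sees the proxy, never the spam site.
const href = "https://www.baidu.com/link?url=AbC123xyz";
console.log(new URL(href).hostname); // "www.baidu.com"
```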
Out of the box, uBlacklist currently supports Google, Bing, DuckDuckGo, Ecosia (partially), and Startpage.com; Baidu is not among them. In theory, a blocker could fetch the proxied URL and inspect the redirect to recover the real destination.
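Here is a minimal sketch of that idea, assuming an extension context with host permissions for the proxy's domain (a plain web page would be blocked by CORS); resolveFinalHostname is a hypothetical helper, not part of uBlacklist:

```typescript
// Hypothetical helper: follow the proxy's redirect to discover the real
// destination, then extract its hostname for blacklist matching.
async function resolveFinalHostname(proxiedUrl: string): Promise<string> {
  const response = await fetch(proxiedUrl, {
    method: "HEAD",     // headers are enough; skip downloading the body
    redirect: "follow", // let fetch chase the 302 to the target site
  });
  // After redirects are followed, response.url holds the final URL.
  return new URL(response.url).hostname;
}
```

The obvious cost is an extra network round trip per search result, which is presumably part of why no blocker does this out of the box.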
Final Thoughts
- Thanks to https://github.com/iorate/uBlacklist for cleaning up search results.
- It’d be nice if Google proactively filtered these sites.
- Baidu’s proxying of search results keeps traffic under its control. Yet another reason to avoid it.