Hacker News

This blog post is written that way because the author works in the bot detection business, so that's what he cares about most.

But there are still plenty of legitimate use cases for wanting a headless browser that perfectly replicates a normal browser environment. The obvious ones are automated frontend testing tools like https://playwright.dev/



Exactly. And as the blog post mentioned, people who have a strong need to block bots have tools other than browser fingerprinting at their disposal. Quoth the post:

> It’s important to leverage other signals such as:
>
> * Behavior (client-side and server-side)
> * Different kinds of reputations (IP, sessions, user)
> * Proxy detection, in particular, residential proxy detection
> * Contextual information: time of the day, country, etc
> * TLS fingerprinting.

Having a headless browser that behaves exactly like a normal one is tremendously useful for making things. And people who really *need* to block bots also have to contend with "Mechanical Turk"-style attackers anyway. The signals listed above are also very useful against that approach, which may still be cheaper than building an undetectable bot, even with a near-perfect headless Chrome fingerprint available.
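As a rough sketch of how signals like those might be combined so that no single one (fingerprint included) decides the outcome: the field names, integer weights, and threshold below are all made up for illustration. They are not taken from the post or from any real product.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Signals:
    """Hypothetical per-request signals, loosely named after the quoted list."""
    fingerprint_suspicious: bool
    behavior_suspicious: bool
    ip_reputation_bad: bool
    residential_proxy: bool
    tls_mismatch: bool


# Illustrative integer weights (assumptions, not measured values).
WEIGHTS = {
    "fingerprint_suspicious": 2,
    "behavior_suspicious": 3,
    "ip_reputation_bad": 2,
    "residential_proxy": 2,
    "tls_mismatch": 1,
}
THRESHOLD = 5  # arbitrary cut-off for this sketch


def bot_score(signals: Signals) -> int:
    """Sum the weights of every signal that fired."""
    return sum(WEIGHTS[f.name] for f in fields(signals) if getattr(signals, f.name))


def looks_like_bot(signals: Signals) -> bool:
    """Flag the request once the combined score reaches the threshold."""
    return bot_score(signals) >= THRESHOLD


# A headless browser with a flawless fingerprint can still be flagged
# when behavior and IP reputation both look off.
perfect_fingerprint = Signals(
    fingerprint_suspicious=False,
    behavior_suspicious=True,
    ip_reputation_bad=True,
    residential_proxy=False,
    tls_mismatch=False,
)
print(bot_score(perfect_fingerprint), looks_like_bot(perfect_fingerprint))
```

The point of summing rather than gating on one check is that a perfect fingerprint only zeroes out one term.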


> * Behavior (client-side and server-side)

Imagine the number of false positives.

> * Different kinds of reputations (IP, sessions, user)

Almost everyone uses a mobile network now. One IP can be shared by thousands of users. Imagine the number of false positives.
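To put a number on that, here is a minimal sketch of what an IP-level block costs when the address is shared, for example behind carrier-grade NAT. The function name and the figures in the example call are hypothetical, picked only to match "thousands of users"; they are not measurements.

```python
from typing import Tuple


def ip_block_cost(users_behind_ip: int, bots_behind_ip: int) -> Tuple[int, float]:
    """Return (false positives, false-positive share) for blocking one shared IP.

    Everyone behind the address who is not a bot counts as a false positive.
    """
    if users_behind_ip <= 0 or not 0 <= bots_behind_ip <= users_behind_ip:
        raise ValueError("need users_behind_ip > 0 and 0 <= bots <= users")
    false_positives = users_behind_ip - bots_behind_ip
    return false_positives, false_positives / users_behind_ip


# Illustrative numbers only: 2000 users share the IP, 5 of them are bots.
blocked_humans, share = ip_block_cost(users_behind_ip=2000, bots_behind_ip=5)
print(f"{blocked_humans} legitimate users blocked ({share:.2%} of the IP's traffic)")
```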

> * Proxy detection, in particular, residential proxy detection

Most residential proxies are just ordinary ISP IPs, either bought through a front or supplied by a botnet. Imagine the false positives for ordinary home users who get range-banned, like on 4chan.

> * Contextual information: time of the day, country, etc

script.execute(() => navigator.dateOffset = Math.random()...)
script.execute(() => navigator.country = Math.random()...)
script.execute(() => navigator.etc = Math.random()...)

> * TLS fingerprinting.

Imagine the number of false positives, especially since there are 4 common TLS fingerprints across browsers.

Just cope and seethe: your antispam filters will never work, and antibot measures fail. Cloudflare Turnstile fails. Bots won, as usual.


We use a headless browser to load an internal webpage (with content that may be updated several times per day) and generate a PDF on demand.
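For anyone curious what that can look like, here is a minimal sketch using Playwright's Python sync API. The URL and output path are placeholders, and it assumes the `playwright` package and its Chromium build are installed. It is one way to do it, not necessarily the setup described above.

```python
def render_pdf(url: str, out_path: str) -> None:
    """Load `url` in headless Chromium and save the rendered page as a PDF."""
    # Imported inside the function so this module loads even where
    # Playwright is not installed; the import only happens on use.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # PDF generation in Playwright is supported in headless Chromium.
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # Wait for network activity to settle so late-loading content is included.
            page.goto(url, wait_until="networkidle")
            page.pdf(path=out_path, format="A4", print_background=True)
        finally:
            browser.close()


# Example call (placeholder URL and filename, not a real endpoint):
# render_pdf("https://intranet.example/report", "report.pdf")
```

Because the page is rendered fresh on each call, the PDF reflects whatever the page shows at request time, which suits content that changes several times a day.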



