
CAPTCHAs are overused because of groupthink and fashions/fads. Before you use a CAPTCHA of any kind, consider very carefully if you really need one.

I've seen this a number of times in design meetings: someone will say "oh, an account registration form, we will of course need a CAPTCHA there", everyone will nod their heads and move on. In reality, in most of those cases, no one will ever conceivably even try to automate/script the thing being designed.



Thought the same, had a pleasant signup form for a small SaaS platform nobody really knows about, with no captcha. Then someone or some group found it and there's been a barrage of attacks varying in intensity, vectors etc. Cost us so much money in vendor costs the small company is now in danger of going bankrupt.

I appreciate the sentiment, as I had it, but rest assured any future publicly accessible form I build will get at least a CAPTCHA in front of it.


I have a bunch of publicly accessible forms and none of them have captchas.

I did once run into an issue where a signup form was abused by a spammer, but that was a simple fix (tip: in verification emails, do not include any information that the user typed in the form).

If you are careful with your forms, you don't need captchas. Captchas add a lot of friction for some users, so if they can be avoided, they should be.


Many captchas add friction for some users, but some types don't; there are relatively fast "proof of work" captchas that aren't surfaced to the user at all.


CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart

Proof of work isn't a CAPTCHA.


Can you explain what you mean by that tip? Was this spammer using your verification emails to send spam or something?

Or was it more complicated, like not needing to store which fake account had which details?


The registration form had a name and an email, and I sent a message similar to the following:

Hi <name>, thank you for signing up...

The spammers put their spam message in the name field, so my server started sending messages like this:

Hi Get free cialis now http://example.com, thank you for signing up...
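
A rough sketch of the fix, assuming Python's standard `email.message` module; the function name, subject line, and `verify_url` parameter are hypothetical. The point is only that the name field never reaches the message, so there is nothing for a spammer to inject.

```python
from email.message import EmailMessage


def build_verification_email(to_address: str, verify_url: str) -> EmailMessage:
    """Build a verification email that echoes nothing typed into the form.

    The form's name field is deliberately not a parameter here.
    """
    msg = EmailMessage()
    msg["To"] = to_address
    msg["Subject"] = "Please confirm your signup"
    # Fixed greeting: no "Hi <name>" built from user input.
    msg.set_content(
        "Hi, thank you for signing up.\n"
        f"Confirm your address here: {verify_url}\n"
    )
    return msg


msg = build_verification_email(
    "someone@example.com", "https://example.com/verify?token=abc"
)
```

The verification URL is generated by the server, so the only user-supplied value left is the recipient address itself.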


A long time ago, I was still in college (UK college, i.e., pre-university), and still learning.

I discovered a classmate was involved in some event, and found the event's website. They didn't have a captcha. By your logic, this was the right choice.

In reality, my dumb ass decided it would be fun to script something that would register millions of users (another classmate ran the script with me). After a few hundred thousand registrations, the website was brought to its knees. I was a bit shook, but didn't think much of it.

Next morning I came into class, and was reprimanded by my teacher. Turns out, the owner of said event had threatened to sue the school and me, among other things. What had happened was their servers were down, their email server was brought to its knees, their web servers had died, and generally I had caused a lot of damage without even thinking about it. It caused them to potentially lose some money. None of this was my intention, of course, but I didn't know much better.

Point is, kids will kid, and spammers will spam. There are plenty of bots that just scrape the internet and fill out forms indiscriminately.

Captcha may or may not be the best option here (I'm always of the opinion it's not, especially not reCAPTCHA), but something has to be put in place, even if to stop the majority of bad actors.


You can also just limit the number of sign-ups from one IP each day. There are simpler heuristics to prevent unsophisticated abuse like that.


You can, but then you discover that places like Bangladesh and Cambodia, which do a fair bit of freelance work on the 'net, use a surprisingly tiny number of IPv4 addresses to do it.

For lots of these countries their total allocation of IPv4 addresses is < 20 per 1000 people, and the nature of their access (through glorified internet cafes) means that you will have some IP addresses that really are totally legit, yet have LOTS of users.

One size fits all is very dangerous on the Internet.


How is the IPv6 roll-out over there?

On the one hand, I assume bad due to cheap equipment. On the other, it's not like v6 addresses are expensive and you need some way of addressing every subscriber anyway. As more people sign up (as the country gets more people with internet access), you need more equipment which could support v6 out of the box, and the excuse for CGNAT I've always heard is old equipment that is harder to upgrade than to put a NAT router in front of. Could go either way from my POV.

If the roll-out is good, then all those people are already taken care of and the minority left on v4 CGNAT aren't bothered by the collective rate limit.

(To preempt the eventual remark that users can generate a billion addresses in v6: rate limiting on v6 works by limiting whatever prefix the ISP gives out to subscribers, like /56, not individual addresses the way it's often done with v4.)
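
To make that concrete, a small sketch using Python's standard `ipaddress` module. The function name is made up, and the /56 default is just the example prefix from above; the right length depends on what the ISPs in question actually hand out.

```python
import ipaddress


def rate_limit_key(addr: str, v6_prefix_len: int = 56) -> str:
    """Return the bucket a client address is counted under.

    IPv4: the individual address.
    IPv6: the enclosing prefix (assumed /56 by default).
    """
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        return str(ip)
    # strict=False masks off the host bits instead of raising.
    network = ipaddress.ip_network(f"{ip}/{v6_prefix_len}", strict=False)
    return str(network)


print(rate_limit_key("203.0.113.7"))
print(rate_limit_key("2001:db8:1234:5678::1"))
print(rate_limit_key("2001:db8:1234:56ff::abcd"))
```

The two v6 addresses fall into the same bucket, so generating more addresses inside one subscriber prefix buys nothing.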

As an aside, it should also be kept in mind that not every use case involves signing entire countries up for their service, even in an ideal case.


To give another example, in Spain most mobile carriers will place everyone behind a CGNAT with no IPv6.

For fiber some do the same, although thankfully most place v4 behind a CGNAT while offering IPv6.

The whole "1 IP, 1 user" assumption, even with dynamic addresses, is quite false and is a mess.


That has been my experience on any mobile network, also in 2007 or so when v4 addresses were still available (because my 15-year-old self wanted to seed torrents with my unlimited data bundle ...on GPRS). It's a fair point that one has to consider this part of the market, though I was primarily thinking of wired connections.


It isn't good, and purely being on IPv6 is still a terrible web experience in any event. A huge percentage of major websites don't properly support IPv6 yet. It's ridiculous.


I know, but this is about hosting a service, not about trying to use existing services that got v4 addresses before it was cool


IPv6 doesn’t address disambiguating people using public computers at places like Internet cafes.


Is your site even relevant to Bangladesh and Cambodia?

If you're collecting sign-up data for something local, then most likely not.


Yes.

FWIW, I learnt about this the hard way.


Yes, if it is relevant then by all means make it work for them.


No way. In the B2B world at least, I expect hundreds of users coming from behind the same corporate proxies.


> you can also just limit the amount of sign ups from one IP each day

This is a classic example of how "just do this" kind of thinking can lead to terrible results.

Do you now see how "just limiting the sign ups from one IP each day" can go very very wrong?


What you could do is use both. One sign-up from each IP per day before you get a CAPTCHA. Then you're not subjecting 99% of your users to training Google's AI for free, but the people at a cafe in Bangladesh can still sign up.
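
Something like this, as a rough in-memory sketch. The class and method names are made up, and a real service would keep the counters in a shared store with expiry rather than a dict.

```python
from datetime import date


class SignupGate:
    """Allow a few CAPTCHA-free sign-ups per IP per day, then require one."""

    def __init__(self, free_signups_per_day: int = 1):
        self.free_signups_per_day = free_signups_per_day
        self._counts = {}  # (ip, day) -> completed sign-ups

    def needs_captcha(self, ip: str, today: date) -> bool:
        return self._counts.get((ip, today), 0) >= self.free_signups_per_day

    def record_signup(self, ip: str, today: date) -> None:
        key = (ip, today)
        self._counts[key] = self._counts.get(key, 0) + 1


gate = SignupGate()
day = date(2024, 1, 1)
first_visitor = gate.needs_captcha("198.51.100.4", day)   # no CAPTCHA yet
gate.record_signup("198.51.100.4", day)
second_visitor = gate.needs_captcha("198.51.100.4", day)  # CAPTCHA shown
```

Nobody is blocked outright: the second person behind the same address just gets the CAPTCHA.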


This sounds like extra work to solve the problem you said didn't exist.


It's extra work because it's better. You're not subjecting 99% of your users to training Google's AI for free.


> limit the amount of sign ups from one IP each day

one per library per day...


one per coworking space, one per office location for each company


Life as a developer has taught me to take the other side of your argument. I'd disagree on this.

Once you release something to the wild you need to have robust controls in place to prevent one person or group of people from using all your resources.

I wouldn't release a product that doesn't have rate limiting of some kind, of which a captcha is one way to rate limit.

Always trust people to push the boundaries of your app as far as they possibly can. I have yet to build a system where someone doesn't. And that includes tools I've built for inhouse users:(

Whether intentionally or not, they always find a way to push the boundaries:)


> no one will ever conceivably even try to automate/script the thing being designed.

Spammers will spam everywhere they can. My minuscule personal site suffers from it very rarely, but I can imagine anyone getting a lot of page views making it worth it.


On my custom-built site I have none of those. But on my WordPress site, I had to install a captcha on the second day. Spammers are just using scripts, which cost next to nothing...


If a site gets even a little popular it attracts spam and security scanners and other nonsense no matter how it's built.


I don't know. I run a SaaS that allows free user signup and significantly more than 50% of my daily signups are just signup "spam", without any visible motivation for doing so. The user name or information doesn't show up anywhere publicly and there is no inherent value in having a free user account. I've implemented some basic countermeasures (dummy form fields which reject the submission), which weren't enough. I've added reCaptcha, and I'm still getting 50% spam signups from working (!) gmail addresses, meaning someone is able to receive emails on these. The majority of these are from places like India, Bangladesh, Vietnam, etc.

I don't even want to know what my site would look like without my own countermeasures + reCaptcha + if it was a service where a user account has any kind of "value"...


Is there a particular problem if someone signs up for an account on your system and doesn't use it?

Is such an account using a lot of resources?


Blocking others from signing up or using the username they'd prefer. For example, the bad actor could spam the registration with valid email addresses and depending on how your system handles registration it could either send validation emails to those addresses or block the person that owns the address from signing up.


It doesn't use up too many resources, but I still don't like having more than 50% of the user database be essentially spam. If you ever want to sell your company you want to have somewhat accurate numbers of actual user registrations. Ever since I realised the extent of the issue I have become very doubtful about reported user counts from startups. In our case the only reason we've realised this is spam is because we annotate signups with GeoIP and also allow users to fill in a name and title which will then be something like "find escorts in Chennai" and the signup will be from India etc. If you only look at the email addresses all you'll see are many gmail addresses with western names, so you might be fooled into thinking that all of these are legitimate users.


Is it possible that the emails are real, non-spammers and the spammers are abusing your registration form to send a short message entered in the name field to those emails like this person described? https://news.ycombinator.com/item?id=36446532


Well, you can't use raw signup numbers as a tracker of how well you are doing?


A CAPTCHA on the registration page removed quite a lot of automated registrations. What are other options to prevent/reduce automated registrations? (One off the top of my head: email/phone verification.)


Hidden fields will remove most of the non-targeted attacks.

And if they really are targeted, I don't think CAPTCHA will help much.


Google reCAPTCHA v3 has worked pretty well for us. We saw many instances in our logs where v2 was solved by scammers on our site using some automated tools that had a 25% success rate (plenty for automated scammer scripts), but upgrading to v3 stopped all the automated attacks. So far we haven’t seen successful solves of v3 in the wild, and we’re a payments company, so we see a lot of attempts.


We use Auth0, which determines when to show a captcha. I think "Smarter Captcha" should be the industry standard. If you don't suspect the end-user of being a bad actor, why show them a captcha every time? In fact, Google's Captcha is awful for literally almost always showing it, which tells you they don't care about stopping bots, only the data they get from user inputs.

Edit: And come to think of it, A TON of websites do "smarter captcha" or whatever you want to call it, because on one of my computers I enabled the resist fingerprinting setting in Firefox, and I get a captcha every visit on some sites that NEVER show a captcha (I think it might be Cloudflare driven, but unsure). Walmart comes to mind: it shows me a pill-looking thing where I have to hold the mouse click until it fills.


It took me one incident to turn from "no one will ever conceivably even try to..." to "everyone will nod their heads and move on"


Years ago I had a blood test taken at a local pathology place. The form they were submitting had a CAPTCHA, and the pictures they were given weren't easy by any means. I'm talking the kind of stuff you get trying to go to google.com on Tor browser.

As far as I could tell this was an internal form that wasn't publicly accessible.


> In reality, in most of those cases, no one will ever conceivably even try to automate/script the thing being designed.

There are more than enough people running automated crawlers, probably fed from Google "inurl: contact-form" searches or whatever, who just blanket-spam you.


We ignored them until we needed them. Then we needed them.


This is in line with my experience as well. For most sites, CAPTCHAs are overkill and an accessibility problem. Hidden honeypot, maybe a simple “How much is 5 + 2” keeps 99% of spam out. I had a few more difficult cases, which were solved by blocking some geographic IP regions and adding blacklists for certain words, like “crypto” for example.
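
For illustration, the simple-question and word-blacklist checks might look roughly like this on the server. The function names are made up, and the word list contains only the one example given above.

```python
import re

# Only "crypto" comes from the comment above; extend per site.
BLOCKED_WORDS = {"crypto"}


def check_arithmetic(answer: str, a: int = 5, b: int = 2) -> bool:
    """Accept the submission only if the 'How much is a + b' answer is right."""
    try:
        return int(answer.strip()) == a + b
    except ValueError:
        return False


def contains_blocked_word(message: str, blocked=BLOCKED_WORDS) -> bool:
    """Match whole words only, so 'encrypted' does not trip on 'crypto'."""
    words = re.findall(r"[a-z0-9]+", message.lower())
    return any(word in blocked for word in words)
```

Whole-word matching is a deliberate choice here: substring matching would catch more spam variants but also more legitimate messages.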


I'm not an expert on honeypot inputs but wouldn't it be super easy to check for type=hidden or opacity=0 if you'd like to spam?


Yes, but most bots don't. There are also some more elaborate methods. Giving the input a tabindex="-1", aria-hidden="true" and then moving it left: -100vh works pretty well.
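
A sketch of what that can look like, with the markup generated server-side. The field name "website", the `position: absolute` rule (needed for `left` to have any effect), and `autocomplete="off"` are my assumptions; only tabindex, aria-hidden, and the -100vh offset come from the description above.

```python
from html import escape


def honeypot_field(name: str = "website") -> str:
    """Render a text input that real users never see or tab into."""
    return (
        '<input type="text" '
        f'name="{escape(name, quote=True)}" '
        'tabindex="-1" aria-hidden="true" autocomplete="off" '
        'style="position: absolute; left: -100vh;">'
    )


def is_probably_bot(form: dict, name: str = "website") -> bool:
    """Humans leave the honeypot empty; naive form-fillers do not."""
    return bool(form.get(name, "").strip())
```

Because the input is neither `type="hidden"` nor `opacity: 0`, a bot checking only for those will still fill it in.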


Remember that Google reCAPTCHA v3 is invisible to the user. No accessibility issues.


It's invisible until it isn't.


How's that?


If you're writing your own account registration form instead of using something off-the-shelf that provides a captcha service for you, or even better, just using OAuth or a similar technology so users don't have to manage yet-another-password? I already hate you.


Spam is ever present, and Captchas protect from the massive torrent of trash.


I had to put a CAPTCHA system on a public registration form for digital libraries, because they were getting spammed by bots.



