How to Protect Against Spam and Not Bother Your Users


How do you protect a website from spam and bots? That’s a question most web developers have had to answer at least once. The most common solution is to use CAPTCHA, but is this approach really the right one for your problem? Can you use something better to improve the user experience and security on your websites? In this article, you will see how we dealt with this challenge on our own Kentico site.

What is CAPTCHA?

CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a tool that distinguishes human users from computers (e.g., spambots) by asking them to complete tasks that are easy for people but difficult for computers.

These tasks can have various forms:

  • Rewriting text from a distorted image: this is the original form of CAPTCHA, but it’s not very user-friendly, especially for visually impaired users.
  • Questions and answers: you have to answer easy questions (e.g., http://textcaptcha.com/).
  • Identification of objects in a picture: you need to find occurrences of certain objects in pictures (e.g., https://www.google.com/recaptcha).
  • Mathematical questions: you need to solve equations (e.g., http://www.freemathcaptcha.com/).
  • Games: you finish small tasks that shouldn’t be easy for computers, such as rotating an image until it’s straight (e.g., https://www.funcaptcha.com/).

The list goes on. CAPTCHA is basically a standard tool to deal with spam and bots. But does it have any disadvantages?

Disadvantages of CAPTCHA

The advantages of CAPTCHA are very clear: it’s easy to set up, almost no maintenance is required, and people have gotten used to it (although they don’t love it). But CAPTCHA also has many obvious (and some less obvious) disadvantages:

  • Some people have problems deciphering them. According to research from Stanford University, it took people an average of 9.8 seconds (with a 71% success rate) to view and solve text-based CAPTCHAs and an average of 28.4 seconds (with only a 31.2% success rate) to solve audio CAPTCHAs.
     
  • CAPTCHAs have already been bypassed multiple times in the past, and the tools for beating them are likely to keep improving. Mathematical CAPTCHAs can be cracked with the help of WolframAlpha, text-based CAPTCHAs can be broken with advanced OCR, and so on.
     
  • There’s a chance CAPTCHA can create security issues on your website. MITRE’s Common Vulnerabilities and Exposures database was tracking 43 CAPTCHA-related vulnerabilities (as of 3 July 2017) discovered between 2005 and 2017. If these flaws go unfixed, attackers can exploit the system with, for example, cross-site scripting or SQL injection.
     
  • CAPTCHA slows users down and can discourage them from converting. According to one article, CAPTCHA could lower your conversion rate by 3.2%. A study by Distil Networks found that it can affect your site even more: 12% of the people participating in the research were discouraged from converting and left the page. When users were on a mobile device, that number rose to 27%.

As you can see, CAPTCHA is becoming a bigger problem for users than for computers. You must be wondering—can we do it better?

Alternatives to CAPTCHA

There are numerous alternatives to CAPTCHA; I’ll cover just a few of them:

  • Logging in via an existing account: for instance, Facebook, Twitter, and LinkedIn offer recognizable identities as a service
    Advantages: easy to use and to implement
    Disadvantages: possible privacy concerns of users
     
  • Blocking registrations from suspicious IP addresses on the server side using services such as the BlackList DataBase: these services maintain blacklists of IP addresses, built automatically from spam activity or manually from user reports
    Advantages: a robust solution
    Disadvantages: slower than client-side validation and prone to errors, as blacklists may contain incomplete or incorrect information
     
  • Honeypot method: the form contains a fake field that is hidden from users by CSS but still visible to bots. If the field is empty upon form submission, the sender is likely a real user; otherwise, it’s likely a bot
    Advantages: an easy-to-implement solution
    Disadvantages: a password manager or browser autofill may fill in the hidden field for a real user, and the field becomes visible if a user has disabled CSS
     
  • Tracking time from page load until form submission: real users need some time to scroll down a page and fill in a form, while bots fill in forms almost instantly
    Advantages: easy to implement and can be a really good complementary solution
    Disadvantages: users whose password manager or autofill completes the form very quickly may be mistaken for bots
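
The time-tracking idea from the list above can be sketched in a few lines of C#. This is only an illustration, not code from Kentico.com: the class name `SubmissionTiming`, the method name, and the three-second threshold are assumptions made for the example, and the render time is assumed to come from server-side state rather than from a hidden field that the client could edit.

```csharp
using System;

// Hypothetical helper: all names and the threshold below are illustrative assumptions.
public static class SubmissionTiming
{
    // Minimum time a human plausibly needs to fill in the form (assumed value; tune per form).
    public static readonly TimeSpan MinimumFillTime = TimeSpan.FromSeconds(3);

    // Returns true when the form came back faster than a human could plausibly fill it in.
    // renderedAtUtc should be taken from server-side state (for example, the session),
    // because a timestamp stored in a hidden field can be changed by the client.
    public static bool IsSuspiciouslyFast(DateTime renderedAtUtc, DateTime submittedAtUtc)
    {
        TimeSpan elapsed = submittedAtUtc - renderedAtUtc;

        // A negative elapsed time means the timestamps are inconsistent,
        // so it is also below the threshold and treated as suspicious.
        return elapsed < MinimumFillTime;
    }
}
```

With these assumed values, a form rendered at 12:00:00 and submitted at 12:00:01 would be flagged, while one submitted at 12:00:10 would not. Because fast autofill can trigger the same flag, a flagged submission is better treated as a reason for an extra check than as proof of a bot.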

Solution on Kentico.com

CAPTCHA on Kentico.com was implemented for the aforementioned reason: to reduce and prevent spam attacks. It worked just fine, but we started to notice that some users had problems displaying CAPTCHAs. These problems occurred mostly on mobile devices or with low-speed internet connections, which made us question whether we still wanted to bother users with CAPTCHA.

On the other hand, it’s really hard to secure your website with these alternative approaches alone, since they mostly rely on client-side validation, which can be bypassed (for example, JavaScript can be disabled and cookies can be changed).

For the implementation, we chose a combination of the Honeypot method, CAPTCHA (we use the new reCAPTCHA form control), and server-side validation. The CAPTCHA is hidden and not required by default, and a fake address field serves as the honeypot. If the fake address field is empty upon form submission, we assume the sender is a real user and allow the form to be submitted without solving a CAPTCHA. If the fake address field is not empty, a CAPTCHA is displayed and the user has to solve it.
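
The decision described in the previous paragraph can be written as a small function. The following is a minimal sketch under stated assumptions, not the actual Kentico.com implementation: `FormDecision`, `HoneypotCheck`, and the parameter names are hypothetical, and `captchaSolved` stands for whatever result your CAPTCHA verification returns.

```csharp
using System;

// Hypothetical sketch: the type and parameter names are illustrative, not Kentico APIs.
public enum FormDecision
{
    Accept,          // continue with saving the submission
    RequireCaptcha   // display the CAPTCHA and ask the user to solve it
}

public static class HoneypotCheck
{
    // fakeAddressValue: submitted value of the hidden honeypot field.
    // captchaSolved: true if a CAPTCHA was shown and verified successfully (assumed input).
    public static FormDecision Evaluate(string fakeAddressValue, bool captchaSolved)
    {
        // An empty honeypot suggests a real user, so no CAPTCHA is needed.
        if (string.IsNullOrWhiteSpace(fakeAddressValue))
        {
            return FormDecision.Accept;
        }

        // The honeypot was filled in: accept only once the CAPTCHA has been solved.
        return captchaSolved ? FormDecision.Accept : FormDecision.RequireCaptcha;
    }
}
```

Running this check on the server keeps the decision out of reach of a client that has disabled JavaScript or altered the page.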

How to set up a fake address field

[Image: fake-address-field.png]

How to set up the CAPTCHA field

[Image: captcha-field.png]

How to implement the server-side validation

A smart bot can bypass the honeypot simply by leaving the fake address field empty, which is why we also check on the server for repeated form submissions (we work on the assumption that a bot tries to submit forms repeatedly). If the previous submission came from the same user (identified by email address), we don’t save the new submission as a separate record. In the handler below, the earlier record is instead updated with any new values and the user is redirected to the “thank you” page.

    // Namespaces assumed for Kentico 8 and later; adjust them to your version.
    using System;
    using System.Linq;
    using CMS.DataEngine;
    using CMS.Helpers;
    using CMS.OnlineForms;

    /// <summary>
    /// OnBeforeSave event handler
    /// </summary>
    void viewBiz_OnBeforeSave(object sender, EventArgs e)
    {
        string formName = viewBiz.FormName;
        string registrationEmail = viewBiz.GetFieldValue("Email") as string;

        BizFormInfo formInfo = BizFormInfoProvider.GetBizFormInfo(formName, KenticoHelper.KENTICOCOM_SITENAME);
        var lastItem = GetLastSubmittedFormItem(formInfo);

        // Check if the last inserted item was submitted by the same user (by email)
        if ((lastItem != null) && (lastItem.GetStringValue("Email", String.Empty).Equals(registrationEmail, StringComparison.OrdinalIgnoreCase)))
        {
            // If any new value was specified, update the original item and send a notification email with data from the form
            if (UpdateOriginalItemWithFormValues(lastItem, viewBiz) && viewBiz.EnableNotificationEmail)
            {
                viewBiz.SendNotificationEmail(registrationEmail, formInfo.FormSendToEmail, lastItem, formInfo);
            }

            // Don't save the current item, just redirect to the 'thank you' page
            viewBiz.StopProcessing = true;

            string redirectToUrl = ContextResolver.ResolveMacros(formInfo.FormRedirectToUrl);
            Response.Redirect(URLHelper.ResolveUrl(redirectToUrl));
        }
    }

    /// <summary>
    /// Returns last submitted form item
    /// </summary>
    private BizFormItem GetLastSubmittedFormItem(BizFormInfo formInfo)
    {
        var formClass = DataClassInfoProvider.GetDataClassInfo(formInfo.FormClassID);

        // Take only the most recently inserted record of this form
        var lastItemQuery = BizFormItemProvider.GetItems(formClass.ClassName)
            .TopN(1)
            .OrderByDescending("FormInserted");

        return lastItemQuery.FirstOrDefault();
    }

Conclusion

CAPTCHA is a good tool for protecting yourself from spambots, but you have to use it wisely so it doesn’t discourage your users from converting. As you can see from our real example, you can combine a few easy-to-implement solutions, and they can serve you just as well. What is your opinion on CAPTCHA? Do you have any interesting alternatives you want to share with us? Don’t hesitate to leave us a comment.

 