Interaction Design for Account Security

Good engineering organizations generally do a good job of managing the technical aspects of account security: the use of SSL, password hashing, encryption, and so on. By contrast, the human interaction element of account security is often a relative afterthought, and one for which the technical training of security specialists may not have prepared them. For instance, I would argue that approaches to security driven only by technical staff are often too quick to put accounts into total “lockdown” mode following unsuccessful login attempts. Besides being irritating to a legitimate user who is having trouble with her password, this makes everyone’s account access vulnerable to extremely unsophisticated attacks: a malicious party who knows someone’s username can readily deny that person access to the system.

With those thoughts in mind, I researched and developed best-practice guidelines for the interaction side of these difficult problems, focused especially on password reset protocols. Shown here are two excerpts from the 15-page document: one about how to deal with unsuccessful login attempts, and the other proposing a novel approach to the problem of defining “good” security questions.

Locking accounts due to suspicious login attempts

15 successive unsuccessful login attempts with the same username (whether or not that username actually exists) should cause the system to require that each subsequent login attempt with that username be accompanied by some verification that it is dealing with a human being and not a script. “Successive” here (and below) means “not interrupted by a successful login”; in other words, the counter is reset whenever there is a successful login. Any attempt to log in with a username in this situation from a standard login page (where the form contains only username and password fields) will be redirected, before the password is checked, to a login page that also includes a “human” test. Only if the “human” test is passed will the system evaluate the password; if the “human” test fails, the login credentials will not be submitted for validation. The most conventional way to verify that a system is interacting with a human being is CAPTCHA, but alternatives such as the one provided by areyouahuman.com may be both more usable and more reliable. This gating helps to satisfy the requirement that “Application login functionality must limit brute force login attacks.”
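A minimal sketch of this counting and gating logic might look like the following (Python; the names are hypothetical, and an in-memory dictionary stands in for whatever persistent store the real system would use):

    HUMAN_TEST_THRESHOLD = 15

    # username -> count of successive unsuccessful login attempts; kept for
    # every submitted username, whether or not an account actually exists
    failed_attempts = {}

    def requires_human_test(username):
        """True once 15 successive failures have accrued for this username,
        at which point the login page must include the "human" test."""
        return failed_attempts.get(username, 0) >= HUMAN_TEST_THRESHOLD

    def record_login_result(username, success):
        if success:
            failed_attempts.pop(username, None)  # counter resets on success
        else:
            failed_attempts[username] = failed_attempts.get(username, 0) + 1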

Five successive “human” failures should block further login attempts from the current browser session (without regard to any particular username). If the “human” test passes, then the login credentials will be submitted; if they are invalid, a new “human” count starts. 15 successive “human”-verified unsuccessful login attempts will cause the system to lock out the username altogether for 15 minutes, meaning that any login attempt will be met with a message stating that the account has been temporarily locked due to unsuccessful login attempts (whether or not an account for that username really exists). The 15-minute lockout should block only login attempts, not username retrieval or password reset attempts; in fact, if a temporary password is sent during a lockout period, an attempt to log in with that temporary password should be permitted, and should have the effect of unlocking the account.

When the temporary lockout period for a username expires, then until the next successful login, the system should continue to require passage of a “human” test each time a password is submitted for validation against that username, beginning a new count towards another 15-minute account lockout. This should pre-empt dictionary and other brute force attacks: as long as password length and complexity requirements are being enforced, such attacks are very unlikely to succeed when login attempts are limited to no more than 15 every 15 minutes per account, especially with “human” tests inserted into the loop. But the more important role of the “human” tests is to block automated denial-of-service attacks that might otherwise attempt to prevent authorized users from accessing their accounts by repeatedly submitting usernames with invalid passwords. The “human” validation and the short duration of the lockout should together make denial-of-service attacks impractical for most purposes.
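The lockout lifecycle described in the last two paragraphs could be sketched as follows (again hypothetical: an in-memory object standing in for persistent storage, with time.monotonic() standing in for a real clock):

    import time

    LOCKOUT_THRESHOLD = 15     # human-verified failures before lockout
    LOCKOUT_SECONDS = 15 * 60  # length of the temporary lockout

    class UsernameThrottle:
        """Per-username lockout state. The same state should be kept
        whether or not the username matches a real account."""

        def __init__(self):
            self.failures = 0        # human-verified unsuccessful attempts
            self.locked_until = 0.0  # monotonic timestamp; 0.0 = not locked

        def is_locked(self):
            return time.monotonic() < self.locked_until

        def record_failure(self):
            """Call after a human-verified attempt with an invalid password."""
            self.failures += 1
            if self.failures >= LOCKOUT_THRESHOLD:
                self.locked_until = time.monotonic() + LOCKOUT_SECONDS
                self.failures = 0  # a fresh count begins once the lock expires

        def record_success(self):
            """Any successful login -- including one made with a temporary
            password during a lockout -- resets the count and clears the lock."""
            self.failures = 0
            self.locked_until = 0.0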

Security questions

As the website goodsecurityquestions.com puts it, “there really are NO GOOD security questions; only fair or bad questions. ‘Good’ gives the impression that these questions are acceptable and protect the user. The reality is, security questions present an opportunity for breach and even the best security questions are not good enough to screen out all attacks. There is a trade-off; self-service vs. security risks.” The authors of that site suggest four traits that a (relatively) “good” security question should satisfy:  safe (cannot be easily guessed or researched), stable (doesn’t change over time), memorable, and definitive.  Examples are given here: http://goodsecurityquestions.com/examples.htm

A “good” security question need not be applicable to everyone, since the system can provide many questions, of which the user need choose only a few to answer. It should be something that the user will readily know (memorable), and should minimize ambiguity (stable, definitive). For example, “what is the middle name of your oldest child?” is a better question than “what is the middle name of your youngest child?”, since a still younger child could always be born. Questions like “what is your favorite fruit?” are best avoided, both because many people will answer them differently on different days (not stable), and because guessing is likely to be quite effective (not safe).

It is probably best to avoid security questions involving the user’s date of birth, social security number, or mother’s maiden name, both because these are so widely used for identification in the financial services industry (such that asking for them is likely to make users suspicious about your motives), and because (notwithstanding their widespread use) they are in fact relatively easy to crack.

It is generally considered inadvisable to allow users to write their own security questions, since that cedes all control over the quality and security of the question.

Proposal: Two-part user-assisted security questions

Between the rise of social media, the web posting and search engine indexing of old records, and the ubiquity of local websites, many widely used security questions are becoming less safe all the time. Take, for instance, the high school mascot, still a widely used security question. Once, it would have been difficult either to track down a random stranger’s high school or to identify the mascot associated with a high school in a different part of the country. Not any more. And with the indexing of obituaries and of wedding and birth announcements (to say nothing of genealogical databases!), questions about the middle or maiden names of a person’s grandparents, parents, and siblings are often easily answerable.

In light of these concerns, I would like to propose a new paradigm for security questions, to be used initially on an experimental basis, and potentially deployed more widely if the results are successful. The concept is a two-part security question, where the system provides a basic question framework, but the user is asked to provide a clue that will be an integral part of the question.

The framework is as follows:

What is the ______ name of a ______?

    Column 1           Column 2
    first              relative
    middle             friend
    last               teacher
    first and last     colleague
    maiden             student
                       mentor
                       boss
                       nemesis

The user chooses one from each column, e.g. “What is the first and last name of a colleague?”

Then the system prompts the user: “Write your own clue that will tell you which [relative | friend | teacher | etc.] you are thinking of. Your clue should be meaningful to you, but not to others. It cannot include the [relative]’s name!” This rule should also be enforced by the system: if the answer is contained within the user-provided clue, the pair should be rejected (e.g., the clue “Uncle Peter” with the answer “Peter”).
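A minimal enforcement check might look like this (Python; names are illustrative). Note that a literal containment test catches only the simplest violations; the “tricks” discussed under the evaluation proposal below would still pass it and would need to be caught by human review:

    def clue_is_acceptable(clue, answer):
        """Reject the pair if the answer appears verbatim inside the clue,
        ignoring case (e.g. clue "Uncle Peter" with answer "Peter").
        Transformations like "Don-ald" or "Rob backwards" still pass."""
        return answer.strip().lower() not in clue.lower()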

It is usually considered advisable to make answers to security questions case-insensitive, since we don’t want to depend on users remembering the capitalization they used when they answered the question. For the answer, the system should allow only standard English letters (A-Z), spaces, and two special characters. The hyphen/minus character (ASCII 45 / Unicode U+002D) is used in hyphenated first and last names (e.g. “Jean-Luc”, “Day-Lewis”), and is the character almost invariably produced by pressing the standard “hyphen” key on a keyboard. The apostrophe (ASCII 39 / Unicode U+0027) is used in names like O’Donnell, and is produced by the “apostrophe” key. (Strict typography would dictate the hyphen character (U+2010) for the former and the typographic apostrophe (U+2019) for the latter, but neither can readily be generated from a standard keyboard without software intervention. Better still would be to implement something akin to case insensitivity for these characters, converting any hyphen or dash to U+002D, and any apostrophe or single quotation mark to U+0027.) For the user-provided “clue”, any “safe” characters are permissible.

Security question answers are typically displayed onscreen as they are entered, although masking the field after it loses focus is a good practice. Three characters is a good minimum length for an answer.
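A sketch of this normalization and validation, assuming the character rules above (the particular set of dash and apostrophe variants mapped here is illustrative, not exhaustive):

    import re

    # Variant characters mapped onto the two canonical ASCII forms -- the
    # "case insensitivity for these characters" idea suggested above.
    CANONICAL = {
        "\u2010": "-", "\u2011": "-", "\u2013": "-", "\u2014": "-",  # hyphens, dashes
        "\u2018": "'", "\u2019": "'", "\u02bc": "'",                 # apostrophes, single quotes
    }

    # Letters, spaces, hyphen, apostrophe; at least three characters.
    VALID_ANSWER = re.compile(r"[a-z' -]{3,}")

    def normalize_answer(raw):
        out = raw.strip().lower()
        for variant, canonical in CANONICAL.items():
            out = out.replace(variant, canonical)
        return out

    def answer_is_valid(raw):
        return VALID_ANSWER.fullmatch(normalize_answer(raw)) is not None

Stored answers and answers submitted during a password reset would both pass through normalize_answer, so that “O’Donnell” typed with a typographic apostrophe matches “o'donnell” typed with a straight one.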

I propose evaluating this two-part user-assisted security question approach using two metrics: quantitatively, by comparing the percentage of the time that questions are answered correctly during the password reset process, for traditional questions versus two-part user-assisted questions; and qualitatively, by rating the clues provided by users. All user-defined clues could be logged to a file accessible to specified Amplify employees, who could rate them for safety. One possibility is that users would ignore the directions and instead opt for trivial “clues”, e.g. “The first name of a friend” / “write out 16 in words”, or “The first name of a friend” / “Don-ald”, or “Rob backwards”, or “Homer Simpson’s son” (i.e., “tricking” the system into allowing the user to embed the answer in the clue). These would be rated poor. Another possibility is that they would effectively duplicate widely used fair-to-good questions, of the sort that are becoming easier to crack as more and more information migrates online, e.g., “The middle name of a relative” / “my younger brother”. A third possibility is that they would come up with a personal clue that would be very difficult to crack, even for someone with access to, say, genealogical records, the person’s Facebook account, and a high school yearbook, e.g. “The middle name of a relative” / “took me to see Cats”, or “The last name of a teacher” / “subjected me to Thomas Hardy”. These would be rated strong. I think this sort of rating could be done without the raters needing access to the corresponding answers.

In order for the experiment to be deemed a success, all of the following would need to be true:

  • The correct response rate for two-part user-assisted security questions is comparable to or better than the correct response rate for traditional system-furnished security questions
  • A very small percentage of users (1% or less) select poor questions
  • At least 20% of users select strong questions