Rolling your own password management solution

neil1 · January 3, 2023, 12:14am

Can someone explain to me how a brute force attack works, if it’s not actually trying the password on the site in question? I’ve never understood that. I’m seeing lots of mentions of ‘hashes’ but how do they work as stand-ins for the actual site’s login system?

Not a direct attack against the site. The file containing the hashed passwords gets exfiltrated somehow first via some other security flaw…then gets attacked offline. As you know a hash is a one way algorithm that can’t be reversed to get the input, so the password guesses are hashed and the output compared to the exfiltrated file. A brute force attack at its simplest would start with the 95 possible characters and guess the password was the letter a…then hash the letter a and compare the hashed value to the hashed password…if it matches the password is a and if it doesn’t match then it isn’t a. Next guess is b, then through the lower case, upper case, digits, and symbols until all have been tried…if all fail the password is more than 1 letter. Next guess is aa, then ab and so on until the entire 95 characters have cycled through the second character in the 2 character password. If all fail, guess ba and so on until each of the 95 characters has been tried in all possible combinations of both characters. Then on to 3 character passwords.
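
The loop described above can be sketched in a few lines of Python. This is only an illustration, assuming the stolen file holds plain unsalted SHA-256 hashes (a real site should be using a salted, deliberately slow hash, which makes every guess more expensive). The names sha256_hex and brute_force are made up for the example.

```python
import hashlib
import string
from itertools import product

# The 95 printable ASCII characters: lower case, upper case, digits,
# symbols, and the space character.
ALPHABET = string.ascii_letters + string.digits + string.punctuation + " "

def sha256_hex(text):
    """Hash a guess the same way the (assumed) site hashed the password."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def brute_force(target_hash, max_length):
    """Try every string of 1..max_length characters; return the match or None."""
    for length in range(1, max_length + 1):
        # product() yields a, b, c ... then aa, ab, ac ... ba, bb, and so on.
        for candidate in product(ALPHABET, repeat=length):
            guess = "".join(candidate)
            if sha256_hex(guess) == target_hash:
                return guess
    return None

# Pretend this hash came out of an exfiltrated file.
stolen_hash = sha256_hex("b7")
print(brute_force(stolen_hash, max_length=2))
```

Two characters means at most 95 + 9,025 hashes, which finishes almost instantly. Every extra character multiplies the work by 95.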

The bad guy probably starts at some minimum number of characters rather than 1…but not being a bad guy I wouldn’t really know.

Just to be complete…a dictionary attack is usually a text file or table of pre-guessed guesses to try…it contains previously leaked passwords, famous phrases or song lyrics and maybe others…and the cracking software hashes each value to compare. A rainbow table is much the same except that the guesses are pre-hashed and the hashes stored to eliminate that step for every cracking attempt. As password length gets longer…eventually the storage space and access time for the dictionary or rainbow table gets so big that they are forced to try brute force which needs cpu cycles but not drive space…which is why long is your friend.
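
A rough sketch of the difference, again assuming unsalted SHA-256 and a tiny made-up guess list. Note that the second half is a plain precomputed lookup table, not a true rainbow table: real rainbow tables use chains of hashes to trade some lookup time for a lot less storage, but the idea of doing the hashing ahead of time is the same.

```python
import hashlib

def sha256_hex(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Hypothetical guess list standing in for a real cracking dictionary.
GUESSES = ["123456", "password", "letmein", "iloveyou"]

def dictionary_attack(target_hash, guesses):
    """Hash each guess at attack time and compare it to the stolen hash."""
    for guess in guesses:
        if sha256_hex(guess) == target_hash:
            return guess
    return None

# Precomputed variant: hash every guess once, store hash -> password,
# and every later attack is just a lookup.
PRECOMPUTED = {sha256_hex(guess): guess for guess in GUESSES}

def lookup_attack(target_hash, table):
    return table.get(target_hash)

stolen_hash = sha256_hex("letmein")
print(dictionary_attack(stolen_hash, GUESSES))
print(lookup_attack(stolen_hash, PRECOMPUTED))
```

The table only helps for passwords that are already in it, and it grows with every guess stored, which is the storage problem mentioned above.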

The reason brute force is so hard is because of the really big number of possible guesses…and because there’s a fixed cost to the bad guy to try each guess (small per each but it can add up quickly) the vast, vast majority of us just aren’t important enough to spend that much time, effort, and money on.
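
To put rough numbers on "really big", here is a back-of-the-envelope calculation. The guess rate is purely an assumption picked for illustration (ten billion guesses per second); real rates depend heavily on the hash algorithm and the attacker's hardware.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
GUESSES_PER_SECOND = 10**10  # assumed rate, not a measured figure

def keyspace(alphabet_size, length):
    """Number of possible passwords of exactly this length."""
    return alphabet_size ** length

def years_to_exhaust(alphabet_size, length, rate=GUESSES_PER_SECOND):
    """Years needed to try every password of this length at the given rate."""
    return keyspace(alphabet_size, length) / rate / SECONDS_PER_YEAR

for length in (8, 12, 16, 20):
    print(f"{length} characters: {years_to_exhaust(95, length):.3g} years")
```

Changing the assumed rate by a factor of a thousand only shifts the answer by a factor of a thousand, while each added character shifts it by a factor of 95.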

neil1 · January 3, 2023, 12:47am

Somebody trying to brute force by guessing every combination - it’s just a factor of length. Somebody using a dictionary based attack before brute forcing - then it’s a matter of how many words that you use.

When they get to brute force…whether there are actual dictionary words or random gibberish is pretty much irrelevant…if your password is eaglepotatowhiskey for example it doesn’t matter that each of those 3 words is a valid dictionary word or not…because the individual guesses in the cracking dictionary (same word but different thing entirely) each either succeed or fail…if they guess eagle…it fails, if they guess eaglepotato it also fails…and they don’t get an “it’s almost right” answer…either works or fails. With hundreds of thousands of words in the Oxford dictionary…trying all 3 or 4 or 5 word combinations is simply too hard…not to mention that upper case, symbols, and digits still make it fail…if your password was EaglePotatoWhiskey…then even the eaglepotatowhiskey guess still fails.
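
A small demonstration of the "works or fails" point, assuming SHA-256 as the hash. A guess that is nearly right produces a completely unrelated hash, so the cracker learns nothing from a near miss. The helper names are invented for the example.

```python
import hashlib

def sha256_hex(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def check(guess, stored_hash):
    """A guess either reproduces the stored hash exactly or it fails."""
    return sha256_hex(guess) == stored_hash

stored = sha256_hex("EaglePotatoWhiskey")

for guess in ("eagle", "eaglepotato", "eaglepotatowhiskey", "EaglePotatoWhiskey"):
    digest = sha256_hex(guess)
    # Count hex digits that happen to line up with the stored hash.
    same = sum(a == b for a, b in zip(digest, stored))
    print(f"{guess:<20} match={check(guess, stored)} matching_hex_digits={same}/64")
```

For the wrong guesses, the handful of matching digits is just chance and gives no hint about which part of the guess was correct.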

Technically…a completely random set of gibberish still has a slightly higher entropy than a same length lower case word only set…but padding it with upper, symbols, and digits narrows the gap from small to even smaller…and once you get to 20 or 25 or whatever long whether the password is guessed in 50 million trillion centuries or a mere 45 of those…just doesn’t matter. Once you get to big numbers of guesses…better is the enemy of good enough.

Bad guys always try easier methods first…hacker dictionary and rainbow tables…but once length gets big enough…and that length increases as graphics cards get more powerful but 25 or so is enough for the foreseeable future…brute force is the only viable method and it’s a biz decision for the hacker…and most of us just aren’t worth it.

ace · January 3, 2023, 8:31pm

Can someone explain why the Diceware approach has you rolling dice and picking random words from their list? If the goal with a set of dictionary words is just length, why wouldn’t you pick five or six words—even proper names—that have lots of meaning to you (and are thus easy to remember and type) but wouldn’t be known to anyone else (first names of people you had secret crushes on in high school, for instance) and would add up to a sufficiently large number of characters?

ron · January 4, 2023, 4:12am

People are really, really bad at being random. If you’re a defender, you want to assume that attackers might know exactly how you’re generating passwords; you don’t want to rely on “length is everything” if your attacker might know a shortcut.

I don’t have a reference right now, but once upon a time Amazon rolled out a system that let you place orders using a three-word passphrase picked by the user. Even though they made it clear that you needed to pick random words to protect your account, the lack of randomness was shocking. In my memory (never to be trusted), something like 80% of the words people picked came from a vocabulary of 300 words. I’ll try to find the source data if I can, but the take home message is that words that come out of your brain are distinctly NOT random.

jiclark · January 4, 2023, 4:57am

Complain to the operator?? As mentioned more than once before (and I can’t believe you haven’t experienced it many times before yourself) these sites are almost always government, bank or some other ‘operator’ that is simply not going to pay attention to anything we say. It’s totally mind-boggling that they are that clueless at this point in the evolution of all this, but there we are…

chirano · January 4, 2023, 5:30am

As @ron noted, you should assume that an attacker has thought of your password-generating method. In your example of using names, you’ve drastically decreased the size of the dictionary you’re drawing from, so even a fairly long password has relatively low entropy. A cracker, moreover, would likely start with the most common names, which would likely reduce the time needed to get a hit.

You might think you’re safe because a cracker won’t know that you used a bunch of names, which is a security-by-obscurity argument. But consider the fact that a lot of people use the names of their kids, pets, etc., as passwords, and some password-cracking software likely takes this fact into account.

People are fairly predictable. Diceware works because it takes people out of the equation when generating passwords.
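
For anyone curious what Diceware does mechanically, here is a minimal sketch. Physical dice are replaced by Python's secrets module, and the word list is a placeholder made up for the example (a real one would be loaded from a published 7,776-word Diceware list). Each word is picked by five dice rolls, so each word adds about 12.9 bits of entropy no matter what the word happens to be.

```python
import math
import secrets

WORDS_IN_LIST = 6 ** 5  # 7,776 entries, one per outcome of five dice

def roll_index(num_dice=5):
    """Simulate rolling the dice and turn the result into a list index."""
    index = 0
    for _ in range(num_dice):
        index = index * 6 + secrets.randbelow(6)  # 0-5 stands in for faces 1-6
    return index

def diceware_passphrase(wordlist, num_words=6):
    """Pick num_words entries from the list, each chosen by five dice rolls."""
    if len(wordlist) != WORDS_IN_LIST:
        raise ValueError("expected a 7,776-word list")
    return " ".join(wordlist[roll_index()] for _ in range(num_words))

# Placeholder entries only; substitute a real word list here.
PLACEHOLDER_WORDS = [f"word{i:04d}" for i in range(WORDS_IN_LIST)]

print(diceware_passphrase(PLACEHOLDER_WORDS))
print(f"{6 * math.log2(WORDS_IN_LIST):.1f} bits of entropy for six words")
```

The entropy figure holds even if the attacker knows the exact list and method, which is the whole point: nothing depends on what the person choosing finds memorable.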

Shamino · January 4, 2023, 2:59pm

Did you actually let them know? Or are you just assuming that they are aware of the problem and have deliberately chosen to ignore it for some reason?

Even banks and governments have contact pages for reporting problems with the site. If you can’t find one, try sending an e-mail to a webmaster@... address to see if that goes anywhere. One request may not get action (just like one bug report won’t make Apple change macOS design), but a lot of requests from many different people might.

And I can guarantee 100% that they will wait until the embarrassing public hack to change their policy if nobody complains about it.

Shamino · January 4, 2023, 3:07pm

Which is why you should use several names. Just like a dictionary attack can’t practically contain every possible 3-word permutation of the dictionary, it’s also not going to contain every possible 3- (or 4- or 5-) word permutation of a large list of names. Especially if they are unusual names or are unusual misspellings of them.

Unless the attacker already knows in advance that you are using names for your password. But he won’t know that unless you are specifically being targeted.

If you’re not a specially high-value target, an attacker is going to run a generic algorithm (one that he probably purchased and may not even understand) against a giant set of exfiltrated password hashes. He’s not going to start conducting background checks to determine who your family members are in order to try their names first. He’s also not going to know what books you read, what movies you watch or what your favorite pizza toppings are. He’s just going to use a dictionary populated with passwords that have worked in the past, attack every account for which they work, and not bother with the rest.

ace · January 4, 2023, 3:41pm

Thanks, @Shamino, that’s how I was thinking about this.

As I see it, there are two types of attacks to defend against:

Brute force attacks where high entropy is key
Highly targeted attacks where secrecy is key

In my example of the names of five people you had secret crushes on in high school, the entropy would be very high based purely on the length. The secrecy would also be high because no one but you would know who they were. That’s of course just one trivial example.

My larger point was that it’s really easy to come up with five words that are highly memorable for you, largely unknown to an attacker targeting you specifically, and long enough to provide protection against a brute force attack. Maybe it’s a high school crush, a word from a favorite movie title, the town where you ran your first 5K, your childhood gerbil’s name, and a funny word you like saying, such as guanabana. And one or more could have unusual spellings or character substitutions if you want.

So why go to the effort of picking random Diceware words that aren’t memorable?

ron · January 4, 2023, 6:15pm

I once found a serious vulnerability in a web site. I started by reporting it to webmaster@ and also using a link they had on the site. When I got no reply, I found and used some email addresses for their c-suites. It turns out, all the web development was done in a separate department (actually outsourced, IIRC). So complaints to the web site went nowhere, since it effectively meant they had to report to their corporate contact that they had made a defective product and request a budget to fix it. I suspect that this is commonly the case.

Limitations on password lengths are quite concerning. If passwords are properly stored only as hashes, longer passwords don’t use any additional system resources. Password limitations signal to me that the site is likely storing passwords in plain text (I’m shocked at the number of sites that seem to do this).
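
A quick way to see the storage point, using PBKDF2 from Python's standard library as a stand-in for whatever a well-run site actually uses. The iteration count and salt size here are assumptions for the example, not recommendations.

```python
import hashlib
import os

ITERATIONS = 100_000  # assumed value for illustration

def hash_password(password, salt):
    """Derive a fixed-size hash from a password of any length."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

salt = os.urandom(16)
for password in ("short", "x" * 200):
    digest = hash_password(password, salt)
    print(f"{len(password)} characters in -> {len(digest)} bytes stored")
```

Both passwords end up as the same number of stored bytes, so a 10-character maximum saves the site nothing if it is hashing properly.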

fischej · January 4, 2023, 6:31pm

Good for you for going the extra mile. I’ve done the same, and found if I express the problem in terms of risk to the corporation (money, reputation, legal), and in the briefest possible terms, it’s amazing how well it works (sometimes). I’ve been on the other side of that too. Not that I was in a C-suite, but I was occasionally handed a task from there that came in though an out-of-band contact direct from a customer. The instruction was usually something like, “Look at this quickly, and let me know if it’s a problem. If so, fix it.”

Shamino · January 4, 2023, 7:35pm

It just occurred to me that I’ve been throwing about this phrase quite a bit in this discussion and it may need a bit of clarification.

Clearly, if you are a person in the public eye (politician, celebrity, etc.), then you qualify. Enough people will want to target you specifically that you need to be extra careful with your data security.

But this also applies to any employee of a corporation. You may not be in any way famous, but if (for example) you work in the IT department for a business that has government contracts, your account is one that could yield big rewards on the black market, should it be compromised. So you may well be targeted specifically. Similarly if you have access to high-value non-government documents (e.g. schematics for Apple’s forthcoming products, bills-of-materials for Boeing’s newest cargo plane, etc.) that’s valuable enough to competitors that you may be targeted specifically.

And this may result in attempts to hack your personal systems as well as your corporate access, just in case (as news articles show to occasionally be true) you are storing sensitive documents on non-corporate equipment.

In other words, some of the advice I’ve been presenting as advice for “normal” individuals (to not worry about an attacker using information that would require real research to get) should definitely not be applicable to any corporate accounts, and it might also not be appropriate for personal accounts of employees of major corporations.

Which is also why all corporate IT departments need to rely on more than just strong passwords for security. 2FA, strong VPNs, certificates, mandatory encryption of storage and related technologies are more than just good ideas - they are necessary for businesses, in order to ensure that a leaked/phished/hacked password is insufficient for accessing documents and databases.

Technogeezer · January 4, 2023, 8:09pm

Bringing this back full circle, we’re all debating issues that are a result of using passwords, which we know are vulnerable in so many ways.

The “High value target” arguments are interesting. But I’d be willing to bet that more passwords are obtained a lot more easily by the bad guys from most users via phishing/social engineering attacks rather than through cracking attempts on password files.

The more quickly we can get away from passwords (even with 2FA) to things like passkeys, the quicker this compromised-password attack vector can be closed. Even then, other mechanisms have to be in place as @Shamino says (defense in depth should be the norm) to protect data.

San · January 4, 2023, 8:31pm

Just yesterday, as it turns out, I complained to a major financial company (one you’ve probably heard of) about the absurd limits their website places on user passwords. Ten character maximum, no special characters… my complaint to their customer service guy was like talking to a brick wall. Of course this was in a context – I didn’t call them about the passwords issue, I just threw that in – their technical systems had turned out to be utterly incompetent in general (not just the security stuff), but I didn’t know this for years until the buzzsaw of their incompetence hit me recently – big time. They’re clueless about everything to do with digital technology, not just security.

As for “contact pages” that you mentioned… how many labyrinthine bureaucracies have you had to deal with? Both government and private companies. Their so-called “contact pages” (if you can even find them) are often so heavily structured… limited to popdown-menu choices for almost everything, no way to customize your entries, and if your issue doesn’t fit neatly into one of the standard categories they’ve predefined, you’re stuck. And I’ve never seen “security concerns” or anything remotely like that as one of the choices.

San · January 4, 2023, 8:41pm

Yes. That’s the essence of the roll-your-own method I was referring to in the first place. (Much of this ensuing discussion is mathematically/technically way over my head, as it turns out!)

Also, concerning someone else who mentioned dictionaries of common names the crackers could use… I guess it’s lucky that the girls I remember having crushes on in junior high school were named Migzatona, Fulcrasq and Xrzzippi!

San · January 4, 2023, 8:56pm

As I mentioned before, I’ll research passkeys a little when they become an actual option, not just a proposal.

Like your username (which is great, BTW) I’m a techno-geezer too, and I remember so, so many interesting proposals and “standards” over the decades that were talked about a lot and then went nowhere. For example, all the other web developers on Usenet arguing that the latest hot item, XML or the interim XHTML, were sure to replace HTML soon. I denied this and was ridiculed mercilessly, but of course I was right.

If this here discussion were happening about 20 years ago, we might have been talking about biometric verification rather than passkeys. Remember those organic-security proposals? Well, I do still use my fingertip to unlock my old iPhone, except when I’m wearing gloves or there’s a bandage on my finger (like there is right now) or some criminal… um, I don’t even want to think about that. And then there’s another hot proposal, voice based verification, which (believe it or not) a major company that I have an account with keeps pestering me to sign up for! (Or speak up for, I guess.) Let me know when it’s in widespread, actual use, and I’ll research it further.

jiclark · January 4, 2023, 9:08pm

Come on, you’re being more than a tad unrealistic, don’t you think? As @San points out in a subsequent post, it rarely, if ever, matters when we try to notify someone at a giant entity about their incredibly lax and out-dated security features (and yes, of course I’ve tried myself). There’s no way enough regular users are going to complain to a giant corporation or gov’t entity to get them to change the way they do things. That’s simply a pipe dream.

jzw · January 4, 2023, 9:17pm

I think this is easy with one or two, or maybe a handful of passwords. But trying to come up with your own memorable passwords and remember which site they are for, for dozens to hundreds of passwords isn’t feasible. So this would be good for the few passwords that one needs to memorise (e.g. computer login), but not as a general practice.

I actually have 1Password generate random words and regenerate until it’s a set I’m happy with. I then customise slightly (e.g. separators, adding a digit or special character, etc) and type it over and over until I remember it. It is then extremely memorable.

Shamino · January 4, 2023, 9:18pm

There’s nothing new under the sun. Mac OS 9 added voice-print authentication. It didn’t work all that well and was not migrated to OS X.

I suspect it would work better today, given the much faster processors and neural accelerators modern Macs have. It would be interesting to see if a new attempt would work any better, but I suspect most people would prefer to use fingerprints or facial recognition, as they do on iOS - who wants to have to speak up when using the computer when other people are around?

chirano · January 4, 2023, 9:37pm

This is where I’d say you could easily be overestimating the entropy. For example, in my high school, most of the girls had common names. I wouldn’t be surprised if a list of the hundred most common girls’ names would cover almost all of the girls. So if a cracker were to use that list with five names, he’d have only 10 billion possible combinations to check. That’s nowhere near the entropy you might think you have because the password is around 30 characters long.
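
The arithmetic, worked through in a few lines. The 100-name list and the 30-character length are the figures from the paragraph above; the comparison against 30 random lower case letters is added here as a reference point.

```python
import math

def combos(list_size, picks):
    """Ordered selections with repetition: list_size choices for each pick."""
    return list_size ** picks

def bits(count):
    """Entropy in bits if every combination is equally likely."""
    return math.log2(count)

five_names = combos(100, 5)      # five picks from 100 common names
random_lower = combos(26, 30)    # 30 truly random lower case letters

print(f"five common names: {five_names:,} combinations, {bits(five_names):.1f} bits")
print(f"30 random letters: {bits(random_lower):.1f} bits")
```

The attacker's work is set by how the password was chosen, not by how many characters it ends up with when typed out.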

It’s not impossible to manually come up with a good memorable password, but it’s also easy to fool yourself into thinking a password is better than it actually is. Using Diceware helps you avoid this pitfall.