Rolling your own password management solution

james.cutler · December 31, 2022, 7:57pm

RE: Spaces in password/passphrase, a note
I have yet to find (in three decades) a wireless access point that prohibits spaces in the SSID password. Seeing older users grok using a long personal pass phrase is satisfying. Youngsters seldom ask for help for any reason – I’m too old to be ‘hip’ or whatever the current word is.

The master password for 1Password7 may contain spaces, making it easy to input a sentence that is memorable.

The 1P7 password generator does not emit spaces.

Personal Note:
I currently have memorized about a dozen PINs, passwords, and passphrases. I allow 1Password7 to store and recall hundreds of other PINs, passwords, and passphrases for me. I especially love how 24 character random passwords do NOT have to be entered by me using an iPhone keyboard – and face unlock of 1P7 is wonderfully helpful for fat fingered users.

Bottom Lines:
Local user solutions: I make certain that clients who use a Word file on an encrypted virtual volume follow good backup processes, and I do not insist that they learn new habits.

FINALLY BACK ON TOPIC!
Build or Buy: [hint: Buy] I do not have the time or patience to maintain a personal tool to securely store and access more than 1000 objects for myself and my clients. Using 1P7 to automatically share secure data seamlessly on numerous OS and application instances which change at intervals that I cannot control is the primary impetus for using a vendor’s password management solution.
This is the same reason most of my locally built scripts on macOS have been replaced by Howard Oakley’s tools. Updates are created for me.

ddmiller · December 31, 2022, 8:36pm

1P8 can do that. Hyphens, spaces, commas, periods, underscores, numbers, or numbers and symbols - you can choose any one of those in 1P8 when creating a memorable password.

Of course you can always have it generate a password and manually edit it to use the pattern you like.

etbtbfs · December 31, 2022, 9:03pm

Been worried about this myself. Rolled my own password keeper, based on openssl and shell scripts, that maintains an encrypted store on a USB drive kept offline except when in use. Not as nice as GUI solutions but I don’t worry about exploits.

trilo · December 31, 2022, 10:01pm

Ha, good to know. My real name is actually quite unusual but I still wouldn’t use it. If a big data leak occurs you’re quite likely to lose your full name, address, DOB etc. Creating an alias or pseudonym is an interesting idea though. I wonder if JamesBond007 is taken…

dfy · December 31, 2022, 11:34pm

Some items I’ve discovered in studying this:

A brute force cracker is going to be working against modern cryptographic hashes like bcrypt, SHA512crypt or PBKDF2. A GPU array with dozens of graphics cards that could do 150 trillion NTLM (the simplest, obsolete method) hashes per second would be slowed down to the tens or hundreds of thousands of hashes per second.
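To illustrate why deliberately slow hashes cut the guess rate, here is a minimal Python sketch that times one plain SHA-256 hash against one PBKDF2-HMAC-SHA256 derivation. The iteration count, the sample password and salt, and the function names are assumptions chosen for illustration; they are not the settings of any particular site or password manager, and the timings will vary by machine.

```python
import hashlib
import time

ITERATIONS = 100_000  # assumed work factor, for illustration only


def fast_hash(password: bytes) -> bytes:
    """One unsalted SHA-256 pass: cheap for an attacker to compute."""
    return hashlib.sha256(password).digest()


def slow_hash(password: bytes, salt: bytes) -> bytes:
    """PBKDF2 repeats the underlying HMAC ITERATIONS times per guess."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, ITERATIONS)


def seconds_per_call(func, *args) -> float:
    """Wall-clock time for a single call to func(*args)."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


if __name__ == "__main__":
    pw, salt = b"example passphrase", b"0123456789abcdef"
    print("fast:", seconds_per_call(fast_hash, pw))
    print("slow:", seconds_per_call(slow_hash, pw, salt))
```

Every guess costs the attacker the slow path, so the same hardware gets through far fewer candidates per second.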

The diceware folks recommend 7 words to future proof. That’s an entropy of 90 with their 7776 word dictionary, all lower case letters.

BTW, You can use a space in a password or passphrase.

For random words, the dictionary size is less of a factor than the length. Diceware uses 7776 words, 1Password said they use about 18,000 words of 3-8 letters each. An English dictionary filtered to use 3-6 letter words is about 30,000 words.

Entropy examples

12 random characters 95 character set: 79
15 random characters (lower + upper case only) 86
20 random characters all lower case: 94

Random words 20,000 word dictionary. Passwords all lower case:
3 words 43
4 words 57
5 words 71
6 words 86
7 words 100

These values assume that the cracker knows you are using random words, knows your dictionary and knows you’re using all lower case. If you add a random stray character anywhere (or make one letter upper case, or add a special character) I’m informed that adds about 10 bits of entropy.
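The entropy figures above follow from one formula: number of picks times log2 of the number of equally likely choices per pick. A minimal Python sketch, assuming every character or word is chosen independently and uniformly at random (the function name is made up for illustration):

```python
import math


def entropy_bits(choices: int, length: int) -> float:
    """Bits of entropy for `length` independent, uniformly random picks
    from a set of `choices` symbols (characters or whole words)."""
    return length * math.log2(choices)


if __name__ == "__main__":
    print(round(entropy_bits(95, 12)))      # 12 random printable ASCII characters
    print(round(entropy_bits(52, 15)))      # 15 random upper/lower-case letters
    print(round(entropy_bits(26, 20)))      # 20 random lower-case letters
    print(round(entropy_bits(7776, 7)))     # 7 Diceware words
    print(round(entropy_bits(20_000, 4)))   # 4 words from a 20,000-word list
```

The formula only holds for truly random selection; a phrase a person makes up has far less entropy than its length suggests.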

I think you can expect that crackers are going to go with the databases of discovered passwords first, dictionary attack variations next, and maybe a few other things. They won’t even likely get to brute force since it won’t be necessary. They’ll get most of the results they need without it. If they do brute force at all, it won’t be with massive computing power or for very long unless you’re a key person of some sort in business or government.

Calculations of brute force times are usually done with how long it would take to guess all combinations, or on average, how long to guess about half the combinations. One of the things you have to realize about brute force attacks is that they could get your password somewhere in the first few hundreds or thousands of guesses. Maybe even on the 1st guess. But probabilities are really low.

trilo · January 1, 2023, 2:45am

I’m thinking adding a word from a foreign dictionary would greatly add to the complexity of a brute force attack, especially when broken by a special character.

Something like: OceanT_avolo$vOlcano

jiclark · January 1, 2023, 6:07pm

Again, fascinating stuff! Can someone explain to me how a brute force attack works, if it’s not actually trying the password on the site in question? I’ve never understood that. I’m seeing lots of mentions of ‘hashes’ but how do they work as stand-ins for the actual site’s login system?

Also, like @trilo, I’ve been wondering how the use of words from languages other than English factors in. I assume there are dictionary attacks for every language, but mixing words from multiple languages seems like a valuable technique. How do Asian language characters factor into all this?

Above all, the takeaway here is that length is far and away the most important factor, right? So as long as we’re creating 20+ character passphrases (or longer), the origin of the words themselves isn’t important?

Huge thanks to everyone contributing here. It’s invaluable information for these crazy times we live in!

ddmiller · January 1, 2023, 6:31pm

Yes and no. Actual words have fewer bits of entropy than a random string of alphabetic characters of the same length - for a given 7 letter string, there are 26 to the 7th power different combinations, but there are a lot fewer 7 letter dictionary words. That’s why a solution like diceware, which uses dice rolls to select words randomly from a dictionary, suggests using a larger number of words, resulting in a longer passphrase than a random string of characters would need.

Somebody trying to brute force by guessing every combination - it’s just a factor of length. Somebody using a dictionary based attack before brute forcing - then it’s a matter of how many words that you use.
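For anyone who wants to see the random-selection idea in code, here is a minimal Python sketch. It is not Diceware itself: the eight-word list is a hypothetical stand-in (a real Diceware list has 7776 entries), and `secrets.choice` takes the place of physical dice as the source of randomness.

```python
import secrets

# Hypothetical stand-in list; a real Diceware list has 7776 entries.
WORDS = ["apple", "river", "stone", "cloud", "tiger", "maple", "ocean", "flint"]


def random_passphrase(words: list, count: int, sep: str = " ") -> str:
    """Pick `count` words independently and uniformly at random."""
    return sep.join(secrets.choice(words) for _ in range(count))


if __name__ == "__main__":
    print(random_passphrase(WORDS, 7))
```

The important part is that the software picks the words, not the person; hand-picked words are not uniformly random, so the entropy math no longer applies.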

dfy · January 1, 2023, 6:39pm

Adding a foreign word to your random passphrase may seem intuitively correct, but you’re thinking like a human and not like a computer calculator. I actually did construct a 70,000 word English, French, Italian dictionary.

Here’s some more entropy numbers (assuming all lower case only):

20,000 word English dictionary 4 words 57
70,000 word Eng/Fr/Ital, 4 words 64
370,000 word English dictionary from GitHub, 4 words 74
20,000 word English 5 words 71

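Notice how adding one word does about as much as a dictionary more than ten times larger: 5 words from the 20,000-word list (71) beat 4 words from the 70,000-word list (64), and it takes the 370,000-word list (74) to edge past them. (Again, we’re only talking against brute force attack here.) Adding other languages is really no different than just making a bigger dictionary.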
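One way to put a number on the trade-off is to ask how large a dictionary would have to be for 4 words to match the search space of 5 words from the 20,000-word list. A minimal Python sketch, assuming uniformly random word selection (the function name is made up for illustration):

```python
import math


def equivalent_dictionary_size(base_size: int, words: int, extra_words: int = 1) -> float:
    """Dictionary size needed for `words` picks to match the search space
    of `words + extra_words` picks from a `base_size`-word dictionary."""
    return base_size ** ((words + extra_words) / words)


if __name__ == "__main__":
    size = equivalent_dictionary_size(20_000, 4)
    print(f"{size:,.0f} words")
    # Both sides of the comparison, in bits:
    print(4 * math.log2(size), 5 * math.log2(20_000))
```

The answer comes out at roughly 238,000 words, about twelve times the original list, which is why adding a word is usually the cheaper way to gain strength.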

The secret to strong passwords is making them long but also easy to remember. That’s why strings of random words are so powerful. If they attack it as a string of characters, you’ll have 20 or more characters or an entropy of 94. If they attack it as a string of random words, then word count matters. Using shorter words makes it easier to keep to a reasonable character length but still have a higher number of individual words. All lower case, even if long, is much easier to touch type or enter on an iPhone or iPad keyboard.

Or you can hit caps lock and do all upper case. But that only doubles their difficulty. You want exponential increases, which come from length.

Always expect that anything you can think of to trip up crackers, they’ve already thought of. Larger dictionaries, yep. Multi-language dictionaries, yep. Quotations and phrases, yep. They’re all out on the Internet and can be put in a dictionary of phrases, book titles, song titles, song lyrics etc etc. Only random strings or words force them into a brute force attack.

ron · January 1, 2023, 8:37pm

Warning: dangerous simplifications follow…

version 1

A simple system might have a file with a bunch of username/password pairs. When someone tries to log in, the system finds their username in the file and checks if their password matches. Great, but if a miscreant compromises the system and gets a copy of the password file (much easier in many cases than you might think), then they now have everyone’s username and password. The miscreant can then log in as any other user, including one that might have more privilege than the one used to steal the file. If any of those people are re-using passwords elsewhere, those sites are now compromised. Thus, it is now (since the 1980’s, actually) considered poor form to store passwords.

version 2

A cryptographic hash is a mathematical function that takes arbitrary data as input and returns a random-appearing fixed-size number as output. The output is consistent, in that the exact same input will always return the same hash value. It is preimage resistant, in that it is computationally intractable to generate an input that will hash to a pre-defined output.

Now, armed with a hash function, we can change our system to store the usernames and hashes of the password in the password file. If someone obtains a copy of the file, they cannot sign in using the hash and it is intractable to find the password (or a collision) from the data in the file, so they cannot use the data from the password file to log in to this or other sites.

Of course, they could brute-force the password file by running candidate passwords through the hash function, then comparing to see if any of the stored hashes match one of the candidate passwords. If so, they’re in.

A clever attacker could even run a few thousand commonly used passwords through the hash function in advance, then have a list of hashes (a “rainbow table”) to look for whenever a stolen password file becomes available.
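A minimal Python sketch of that precomputation idea, assuming the site stores plain unsalted SHA-256 hashes (an assumption for illustration; the four-entry password list and the function names are also made up). Strictly speaking, a real rainbow table uses hash chains to trade lookup time for storage; the plain lookup dictionary below is the simpler version described here.

```python
import hashlib

COMMON_PASSWORDS = ["123456", "password", "qwerty", "letmein"]  # illustrative list


def unsalted_hash(password: str) -> str:
    """Hash with no salt: identical passwords always give identical hashes."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Built once, in advance, and reusable against any stolen unsalted file.
PRECOMPUTED = {unsalted_hash(p): p for p in COMMON_PASSWORDS}


def lookup(stolen_hash: str):
    """Return the password if its hash was precomputed, else None."""
    return PRECOMPUTED.get(stolen_hash)


if __name__ == "__main__":
    print(lookup(unsalted_hash("qwerty")))
    print(lookup(unsalted_hash("a long random passphrase nobody precomputed")))
```

A password that is not in the precomputed list simply is not found, which is the whole argument for long random passphrases.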

version 3

If the hash stored in the password file includes the user name as part of the hash, that goes a long way (but not all the way, because username+password won’t be unique) toward defeating rainbow tables. Even better, each entry in the password file could contain the username, a large random number unique to the user (“salt”), and a hash made from the password+salt. The system can still validate a password by looking up the salt, appending it to the entered password, hashing the combo, and comparing it to the stored hash. Rainbow tables will be worthless, though, as you’d have to generate a rainbow table for every possible salt value (which can be made arbitrarily large).
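A minimal Python sketch of the salted scheme, under some stated assumptions: it uses PBKDF2-HMAC-SHA256 from the standard library, which takes the salt as a separate parameter rather than literally appending it to the password, and the 16-byte salt, the iteration count, and the function names are illustrative choices, not a recommendation for any particular system.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor, for illustration only


def make_record(password: str) -> tuple:
    """Return (salt, hash) to store alongside the username."""
    salt = secrets.token_bytes(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify(password: str, salt: bytes, stored: bytes) -> bool:
    """Re-derive the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored)


if __name__ == "__main__":
    salt, stored = make_record("example passphrase")
    print(verify("example passphrase", salt, stored))
    print(verify("wrong guess", salt, stored))
```

Because each user gets a different salt, two users with the same password end up with different stored hashes, and a table computed in advance matches neither.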

So, to answer your question, most breaches (that don’t involve social engineering or exploits of vulnerabilities) occur by offline attempts to discover a password by brute-force analysis of a username+salt (we hope)+hash values from stolen password files. The host system is not involved, so attempts to limit password attempts in software won’t work.

I’ve left out a hopeless amount of nuance, but I hope this clarifies things.

trilo · January 1, 2023, 8:46pm

I hear your point but I’d counter with using something from an ‘uncommon’ language. For instance, here in Australia we have around 250 indigenous languages (800 dialects) which are largely undocumented and so small as to be ‘secure by obscurity’. I suspect adding words from one (or more) of these, and perhaps adding some from PNG, Nigeria, Indonesia, India etc would definitely add complexity. A location name, a fish species or a scientific plant name - all of which could potentially be easy to remember - would also increase that complexity.

Yes, at some point in time there may be the power to brute force every common use and scientific word, phrase, language, dialect, number, slang, colloquialism, abbreviation and symbol known to man, but to get them all in a specific order and combination will certainly outlive any valuable information held on my machine :)

Edit to say, none of the above considers obfuscation. Dictionaries by their nature are collections of known entities; start breaking up words with random replacements and they’re suddenly not part of any existing dictionary.

jzw · January 1, 2023, 9:12pm

This is an interesting point which I’d not considered, and not seen explicitly addressed in this thread so far. Even if we set aside non-latin characters (which can be cumbersome to type on English keyboards), there are a whole set of accented characters easily available (ü, ö, é, ñ, etc) on both computer and iPhone keyboards. Wouldn’t using one or more of these expand the character set needed in a brute force attack, and therefore lead to an exponential increase in cracking difficulty?

jiclark · January 1, 2023, 9:28pm

Thanks Ron! Very helpful (to me anyway)…

jiclark · January 1, 2023, 9:30pm

So does this mean it’s better to use 5 five-letter words than 3 eight-letter words? So interesting…

San · January 1, 2023, 10:44pm

Many people in this thread have emphasized that password length is by far the most important thing… such as Neil’s comment “Length is your friend…your only friend…”

I’m sure this is good advice technically, but there’s something about this I don’t understand: Many websites or services I open an account with place an arbitrarily short limit on password length, sometimes as short as 12 characters, in addition to other limits they have (only certain special characters allowed, for example). There seems no standard for this — every site or provider stipulates whatever requirements and limits they please.

So how does the emphasis here on length help in those cases?

Also, related question… although I don’t use a cloud-based password manager, I am a little curious… since those password managers are presumably purely algorithmic, not a vast room full of people entering random-string passwords for you… how do they deal with (or even know) what the limits and requirements are for the huge number of sites or services they have to generate passwords for?

terryk · January 1, 2023, 11:00pm

I know this won’t work for everyone, for various reasons, but when I encounter a website such as you describe, I immediately delete whatever information I’ve already entered and abandon the account creation process.

My opinion is, any website that has such obvious disregard for account safety isn’t one I want to join. Almost every website has dozens of other duplicates where I can achieve the same goal, whatever that goal is. I search until I can find a reputable one, meaning one that takes my security more seriously.

But that’s just me…

San · January 2, 2023, 2:39am

Almost every website has dozens of other duplicates where I can achieve the same goal…

Really? It’s not unusual for me to have a financial services, or government, or other critical/core account that I can’t “duplicate” or easily walk away from… these are traditional services that I typically had long before I ever accessed them online… and there’s only one way to create an online account with them: their way.

jiclark · January 2, 2023, 4:46am

I agree, San! And furthermore, you’re right, it’s almost always a bank or gov’t website that has the most draconian password restrictions. It’s truly maddening, and in most cases, can’t just be ignored or abandoned! Grrrrrr!!!

Shamino · January 2, 2023, 11:12pm

It assumes that someone has already broken into the site and has downloaded their authentication database containing hashes generated from each user’s password. It also assumes that the attacker knows which hash algorithm is being used by the site.

The brute-force attack takes every one of the (multi-trillions) of possible passwords and runs them through the hash algorithm, comparing the output against the downloaded authentication database. If any hashes match, then the password that generated the hash will be usable for logging in as that user.

Since it can take an extremely long time to run through every possible combination if the passwords are long and complex, attackers use this as a last resort.

They prefer to use “dictionary attacks”, where they have a database of words from the dictionary and common phrases (which may have been collected from prior password compromises), and generate hashes for that entire database. This can extract passwords much much faster than if they had to use brute force. But it can only work against a password that is found in that database.

Which is why there is so much repeated advice about not using single words or common phrases or common alternations for the above (e.g. swapping l for 1) - because those passwords will be in the dictionaries. You want to pick something that is very unlikely to be in any dictionary, because if it’s not in the attacker’s dictionary, brute force will be necessary, and very few attackers will find it worth taking the time unless you are known to be a high-value target.

Dictionary attacks, by definition, only work against passwords that appear in the dictionary. An English-language web site is going to primarily have English-speaking users. A French site is going to primarily have French-speaking users. A Japanese site is going to primarily have Japanese-speaking users.

In each case, it is most likely that the users will pick words from the languages they know. And attackers know this. So they will likely use a Japanese dictionary against a Japanese site and a French dictionary against a French site.

Mixed-language dictionaries are certainly possible, but keep in mind that an attacker wants to get the most bang for his buck. He’s not likely to be using a dictionary containing words from every language, because that will make his search (hashes of every dictionary) much much larger, partly undermining the point of using a dictionary.

Mixing words from languages is probably a valid approach, since dictionaries are unlikely to have mixed-language passphrases unless the phrase is somewhat common. This is for the same reason they won’t contain every possible combination of three (or four or five) English words - it makes the dictionary exponentially larger and for little actual gain.

Still, don’t forget that the goal isn’t to try and come up with words the attacker didn’t consider. It’s to come up with something not in the dictionary. You can do just as well by slipping a random or otherwise unexpected character in somewhere, or use a creative (hopefully uncommon) misspelling of one or more of the words.

Using 8-bit international characters can expand the search space, should brute-force become necessary for an attacker. Instead of needing to search the 95-character ASCII space, it may need to search a 190-character space (e.g. ISO-8859-1) or a 216-character space (e.g. Windows-1252). These much larger search spaces will make brute-force attacks that much more difficult.

Interestingly, however, switching to 16-bit and 32-bit Unicode characters or their UTF-8 representations doesn’t change anything. Although there are tens of thousands of Unicode characters, the inputs to the hash algorithms are simply blocks of bytes. The algorithms don’t care how these bytes may or may not translate into printable characters.

A password containing 8 16-bit characters is functionally no different from one containing 16 8-bit characters.
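A small Python sketch of the bytes-not-characters point. The sample strings and the helper name are made up for illustration, and which encoding a given site applies before hashing is an assumption here; UTF-8 is common but not guaranteed.

```python
import hashlib


def as_hash_input(text: str, encoding: str = "utf-8") -> bytes:
    """Hash functions consume bytes; the encoding decides which bytes."""
    return text.encode(encoding)


if __name__ == "__main__":
    samples = [
        "password",
        "p\u00e4ssw\u00f6rd",              # a-umlaut and o-umlaut
        "\u30d1\u30b9\u30ef\u30fc\u30c9",  # five katakana characters
    ]
    for text in samples:
        data = as_hash_input(text)
        print(len(text), "characters ->", len(data), "bytes,",
              hashlib.sha256(data).hexdigest()[:16])
```

The same character count turns into different byte counts depending on the characters and the encoding, and the hash only ever sees the bytes.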

Pretty much. The important thing is to pick something that won’t be in a dictionary - so brute-force will be necessary to crack it. And then to choose complexity (characters from all four categories) and length in order to make a brute-force attack take as long as possible.

20 characters is often discussed because it will make brute force attacks take a very long time, even for simple character sets (like all-digits).

Absolutely correct. And if the site you’re connecting to imposes a small maximum-length for passwords (I’ve seen some that limit you to 8 or 10 or 12 characters), then you definitely don’t want to use real words - because they’ll all be in an attacker’s dictionary.

The idea about picking multi-word phrases is that you can get much much more length without it being harder to memorize. You can probably remember a 5-word phrase just as easily (maybe more easily) than 8 random characters. The greater length (e.g. 20+ characters if each word is at least 4 characters long) will more than make up for the fact that words have less entropy than random characters.

At first glance it would seem so, since there are more permutations of five words (20,000⁵ = 3.2 x 10²¹ permutations, for a 20,000-word dictionary) than three (20,000³ = 8 x 10¹²).

But brute-forcing every possible permutation of a dictionary is just as infeasible as brute-forcing every possible permutation of characters. No dictionary is going to have every possible permutation of even three English words, because an 8-trillion-entry hash-dictionary is going to be far too big to be usable.
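The counts are easy to check with Python's arbitrary-precision integers. A minimal sketch, assuming words are picked independently with repeats allowed (the function name is made up for illustration):

```python
def ordered_word_sequences(dictionary_size: int, words: int) -> int:
    """Count of ordered sequences of `words` picks (repeats allowed)."""
    return dictionary_size ** words


if __name__ == "__main__":
    for n in (3, 5):
        print(n, "words:", f"{ordered_word_sequences(20_000, n):.1e}")
```

Three words from a 20,000-word list give 8 trillion sequences, and each extra word multiplies the total by another 20,000.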

So it comes back to picking a sequence of words that isn’t going to be a well-known phrase. And for this, the number of words doesn’t matter, as long as the phrase isn’t known. So it comes back down to length, in order to make sure that any subsequent brute-force attack (after discovering that their dictionary attack failed) takes a prohibitively long time to complete.

These are bad web sites. Complain to the operator. There is no legitimate reason for imposing limits like this. If their system has bugs where a too-long password or certain characters may cause problems, then it is poorly designed (and IMO, is likely to be hacked via these bugs if they’re not fixed).

But yes, if you’ve got an arbitrary length limit, then you need to work within those limits. Choose something that’s not a dictionary word/phrase that uses as many character categories as the site supports, and use the maximum length, even if this means padding it with nonsense like “123!@#”.

If a site only allows 12 characters, and only allows five different symbols (e.g. “!@#%&”), then use what you’ve got (digits, upper-case, lower-case and symbols from these). This will give you a 67 character search space (not as good as 95, but still pretty good), which for 12 characters will have 67¹² = 8.2 x 10²¹ combinations. At 1 billion attempts per second, a brute-force attack will still require over 250,000 years to try every possible combination.

Not as good as 20 characters from a 95-character set, but probably good enough as long as you use all 12 characters.
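The arithmetic can be reproduced with a few lines of Python. This is a sketch under the same assumptions as the estimate above: passwords of exactly the given length, every combination tried, and a fixed rate of 1 billion guesses per second (the function name is made up for illustration).

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def years_to_exhaust(charset_size: int, length: int, guesses_per_second: float) -> float:
    """Worst-case years to try every password of exactly `length` characters."""
    return charset_size ** length / guesses_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"{years_to_exhaust(67, 12, 1e9):,.0f} years")  # the restricted site
    print(f"{years_to_exhaust(95, 20, 1e9):.2e} years")   # 20 printable ASCII characters
```

On average an attacker would hit the password after about half that time, and a faster rig shortens it proportionally, so treat the result as an order-of-magnitude guide.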

neil1 · January 2, 2023, 11:57pm

that they could get your password somewhere in the first few hundreds or thousands of guesses. Maybe even on the 1st guess. But probabilities are really low.

In theory you’re technically correct…but as a practical matter once they try brute force then iterating through the possible passwords in the password universe in some logical incremental manner takes a lot less cpu power and drive space than randomizing the guesses, comparing the random guess to the already failed guesses, trying the guess, and adding it to the failed guess table…thus with a long password the chances of yours being the first or an early guess is as close to zero as you can get without actually getting to zero. That’s the beauty of forcing the hacker into brute force attack…it doesn’t matter how many dictionary attacks or previously leaked passwords he tries…because your long password that is either random gibberish or a sequence of Oxford dictionary words separated by some digits and symbols or with those at the front or back…but mostly is really long…just isn’t in those tables.

The bad guys will get enough passwords with their more primitive but faster techniques that they won’t bother with yours…unless your name is Musk or Zuckerberg or Biden or Trump you’re simply not a financial risk they’re going to take…because it’s a business based profit vs loss situation for all but a very limited set of people.

Your job as the user is not to generate perfect security…as I said in several other posts perfect is the enemy of good enough, and your job (unless you’re in the aforementioned small set) is just to be good enough that you won’t be bothered with.