
t3nary · on March 7, 2017

Yeah, it's pretty confusing, I thought Xoogler meant someone working at Google X.

t3nary · on Feb 2, 2017

Is it the same office as Amazon's Berlin office?

t3nary · on Jan 24, 2017

How difficult is it to get into a position at a top company where you can do ML research, without having a PhD?

And I'm also wondering about going back to university: Is it difficult to get into a good ML PhD program if you've been in the industry for some years?

achompas · on Jan 25, 2017

Very difficult, at least for research. Top companies with ML research labs (FB, Google, Uber) are basically externalized academic labs. The heads of these labs are simply bringing talent over from their former departments (NYU, Cambridge/UBC, CMU).

It is not impossible to work at these labs as an engineer without a PhD, but I don't have a deep understanding of these roles.

t3nary · on Nov 25, 2016

I feel very similar. Do you have any recommendations for resources to learn about why matrix multiplication is defined that way?

bandrami · on Nov 25, 2016

"All the mathematics you missed but need to know for graduate school"[1] helped me a lot (and, in fact, I had and did).

Once I had finished that, Cullen's "Matrices and linear transformations"[2] was really helpful too. But I wouldn't do Cullen if you're still, as I was, floundering with the concepts of why you're doing this in the first place. It's great once you have those concepts down.

[1]: https://www.amazon.com/All-Mathematics-You-Missed-Graduate/d...

[2]: http://store.doverpublications.com/0486663280.html

fsloth · on Nov 25, 2016

Garrity's book looks like exactly what I've been looking for to brush up on basic math skills after a decade of fairly low-brow work following my MSc. Thanks for the reference!

bandrami · on Nov 25, 2016

Hope it helps! It definitely got me through the first year of grad school...

cousin_it · on Nov 25, 2016

Let's say you have a system with N possible states, evolving in discrete steps. At each step, the system has some probability of switching from any state to any other state. That gives you an NxN matrix of switching probabilities. For example, if the system always stays in the same state as it started, the switching probabilities are an identity matrix (1 on the diagonal and 0 everywhere else).

Now let's see what happens after two steps. If the system started out in state i, what's the probability that after two steps it will end up in state k? Well, it's the sum over all possible paths. In other words, the sum of probabilities of i->j->k for all possible j. In other words, the sum of p_{ij} times p_{jk} for j from 1 to N. But that's exactly the definition of multiplying a matrix by itself.

Now it should be easy to understand that whenever you have matrices that represent transformations of some object, composing transformations will correspond to multiplying matrices.
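A minimal sketch of the argument above, in plain Python. The 3-state matrix `P` and the helper names `two_step_by_paths` and `matmul` are made up for illustration. It computes the two-step probability once by summing over all paths i->j->k and once as an entry of P times P.

```python
# Hypothetical 3-state chain; P[i][j] is the probability of moving from state i to state j.
P = [
    [0.5, 0.5, 0.0],
    [0.1, 0.6, 0.3],
    [0.0, 0.2, 0.8],
]

def two_step_by_paths(P, i, k):
    """Probability of i -> (any j) -> k, summed over every intermediate state j."""
    return sum(P[i][j] * P[j][k] for j in range(len(P)))

def matmul(A, B):
    """Textbook matrix product: entry (i, k) is the sum over j of A[i][j] * B[j][k]."""
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][j] * B[j][k] for j in range(m)) for k in range(p)]
            for i in range(n)]

# Two steps of the chain are described by P multiplied by itself.
P2 = matmul(P, P)
```

Every entry `P2[i][k]` agrees with `two_step_by_paths(P, i, k)`, and each row of `P2` still sums to 1, as a table of probabilities should.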

darrickw · on Nov 25, 2016

Check out this series of videos "The Essence of Linear Algebra"[1] for a really powerful visual and intuitive explanation. It starts with vectors and builds to matrix multiplication and further to several other topics.

[1]: https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2x...

ww2 · on Nov 25, 2016

If you compose two linear transformations into one step and apply the distributive property, you get the matrix multiplication rule.

http://math.stackexchange.com/questions/31725/intuition-behi...
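A small sketch of that claim, assuming two made-up 2x2 maps `A` and `B` with integer entries so the comparison is exact. The helpers `apply` and `matmul` are illustrative, not from any library.

```python
def apply(M, v):
    """Apply the linear map with matrix M to the vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def matmul(A, B):
    """Matrix product: entry (i, k) is the sum over j of A[i][j] * B[j][k]."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B))) for k in range(len(B[0]))]
            for i in range(len(A))]

A = [[0, -1], [1, 0]]   # rotate 90 degrees counter-clockwise
B = [[2, 0], [0, 3]]    # scale x by 2 and y by 3
v = [1, 1]

step_by_step = apply(A, apply(B, v))   # first B, then A
one_step = apply(matmul(A, B), v)      # the single composed map
```

Both routes give the same vector, which is the sense in which the product AB "is" the composition of the two maps.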

jperras · on Nov 25, 2016

Axler's book, "Linear Algebra Done Right", is widely considered to be one of the best texts on linear algebra that focuses less on the mechanics and more on the structure of linear operators on vector spaces.

http://linear.axler.net/

CalChris · on Nov 25, 2016

Yes, and the abridged version (no proofs, examples, or exercises) is downloadable for free. He avoids determinants altogether (well, until the 10th chapter).

http://linear.axler.net/LinearAbridged.html

t3nary · on July 13, 2016

Who is "we"?

t3nary · on June 30, 2016

This belongs to homebrew cask (https://caskroom.github.io/) and you can indeed install it using cask.

yeasayer · on June 30, 2016

Isn't cask supposed to be for heavy apps with GUI?

Ambroos · on June 30, 2016

I think you probably need some pretty big libraries installed to compile it yourself. Using cask you don't have to.

t3nary · on June 6, 2016

Hashing with an insecure algorithm and without good salting isn't much better than not hashing at all though.

dchest · on June 6, 2016

Nope, it is much better for those who use passwords with good entropy.

dzhiurgis · on June 6, 2016

How do you store salt?

CiPHPerCoder · on June 6, 2016

That question should never even need to be asked. The library you're using should take care of that for you.

In PHP:

    $storeMe = password_hash($plaintext, PASSWORD_DEFAULT);
    if (password_verify($plaintext, $storeMe)) {
        // Logged in
    }

The detail is totally abstracted away. All sane password hashing libraries offer this API.

See: https://paragonie.com/blog/2016/02/how-safely-store-password...

EDIT - CANNOT REPLY:

  > One major failure of this article:
  >
  > You should generate a random salt for each user and store it alongside the user
  > record in the DB.

No, you shouldn't. Your library should do that for you, and store it as a single string that's opaque to the developer.

  > I completely disagree. This implies that my DB ORM handles password stuff,
  > which doesn't make sense.

See the passlib section: https://paragonie.com/blog/2016/02/how-safely-store-password...

jjawssd · on June 6, 2016

One major failure of this article:

You should generate a random salt for each user and store it alongside the user record in the DB.

dchest · on June 6, 2016

How is this a failure? That's correct. (If your password hashing library doesn't handle this automatically, of course. But it does the same internally.)

jjawssd · on June 6, 2016

Is it normal for password hashing libraries to save salts to a database?

dchest · on June 6, 2016

Yes, they usually produce a string that looks something like "salt||hash". (Salt is a non-secret value.)

This is the result of bcrypt:

   $2a$10$N9qo8uLOickgx2ZMRZoMyeIjZAgcfl7p92ldGxad68LJZdL17lhWy
    |/ \| \____________________/\_____________________________/
    |   |        salt                      hash
    |  cost 
    |
 algorithm,
  version

You store this string in the database.
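To make the layout in the diagram concrete, here is a sketch that splits the example string into its parts. `split_bcrypt` is a hypothetical helper written for this illustration; in practice the hashing library parses the string itself and you never need to.

```python
def split_bcrypt(stored):
    """Split a bcrypt-style string into (version, cost, salt, digest)."""
    # The string starts with "$", so the first field of the split is empty.
    _, version, cost, rest = stored.split("$")
    # In this layout the first 22 characters after the cost are the encoded salt,
    # and the remaining 31 characters are the encoded hash.
    return version, int(cost), rest[:22], rest[22:]

stored = "$2a$10$N9qo8uLOickgx2ZMRZoMyeIjZAgcfl7p92ldGxad68LJZdL17lhWy"
version, cost, salt, digest = split_bcrypt(stored)
```

Here `version` is `"2a"`, `cost` is `10`, `salt` is `"N9qo8uLOickgx2ZMRZoMye"` and `digest` is the remaining 31 characters.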

zo1 · on June 6, 2016

The big thing about this is that it is perfectly "OK" to store the algorithm, cost, and salt alongside the hash.

Most people seem to think, and myself included when I was new-to-it, that storing all those things together would compromise the security. The point of the hash is that it is impossible (almost) to get to the hash without the user's password, and there is no way to get to the password with the entire string you posted.

DigitalJack · on June 6, 2016

I'm naive about these things, but I was under the impression that salt just thwarted pre-computed hash tables? I guess "just" should be in quotes.

So somebody with resources and motive could still brute-force that string. It seems that storing the salt somewhere else would add a comparable amount of security as the salt itself. It seems prudent along the lines of "don't put all your eggs in one basket."

raverbashing · on June 7, 2016

> but I was under the impression that salt just thwarted pre-computed hash tables?

Yes. Because if you had two users with the password 'dadada' they would hash to the same value

Now 1234:dadada hashes differently than 1326:dadada, hence preventing the use of a prehashed table (you could go through all salts for common passwords, but it's usually a bit long as well)
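A quick sketch of that effect using Python's standard `hashlib.pbkdf2_hmac`. The helper name `salted_hash` and the low iteration count are assumptions made to keep the demo short; they are not a recommendation.

```python
import hashlib

def salted_hash(password, salt):
    # PBKDF2-HMAC-SHA256 from the standard library; iteration count kept low for the demo.
    return hashlib.pbkdf2_hmac("sha256", password, salt, 1000).hex()

h1 = salted_hash(b"dadada", b"1234")   # first user
h2 = salted_hash(b"dadada", b"1326")   # second user, same password, different salt
h3 = salted_hash(b"dadada", b"1234")   # first user again, same salt
```

`h1` and `h2` differ even though the password is identical, while `h3` equals `h1`, so login checks still work. A table precomputed without the salt matches neither value.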

CiPHPerCoder · on June 6, 2016

What you're thinking of is called a "pepper" and is discouraged.

CWuestefeld · on June 6, 2016

Rather than expecting the password hash library to store something into your application DB, you should be managing the access to that DB yourself.

In our case, we use an immutable attribute of each user as their hash. This might be an internal identifier, or the timestamp on which their account was created, or something like that.

dchest · on June 6, 2016

> Rather than expecting the password hash library to store something into your application DB, you should be managing the access to that DB yourself.

You do manage it yourself. The password hashing library doesn't access your database; it produces a string that you store, which includes the salt and the password hash.

> In our case, we use an immutable attribute of each user as their hash

What? You really need to talk to security-competent people.

peeters · on June 6, 2016

> In our case, we use an immutable attribute of each user as their hash.

I assume you mean "as their salt". And even then, why the half-measure? Just laziness? Sure, a guessable/computable salt is better than no salt, but it's not nearly as good as a random salt.

CWuestefeld · on June 6, 2016

> I assume you mean "as their salt"

Yes, thanks for clarifying what I meant to type.

> why the half-measure? Just laziness? Sure, a guessable/computable salt is better than no salt, but it's not nearly as good as a random salt.

But isn't the salt essentially safe to make public anyway? That being the case, how does it matter what value you use, so long as it differs between users?

dchest · on June 6, 2016

Ideal salt is a large (e.g. 16 bytes or more) random byte string generated for each password.

If there's a reason for it (in most cases, there is none), some trade offs are possible, e.g.:

Salt is a large random string unique per user, not per password.

Given two hashes of passwords for the same user it reveals whether passwords are the same.

Salt is a small random string or some predictable value.

Attackers can precompute guesses and then look them up.

If you use some immutable identifier per user as salt, both of these attacks are possible. Is there a reason for this? I'm 100% certain there is not: since you already store the password hash in your database, you can generate a large random salt for each password hash and store it with the hash.

As for "safe to make public": there are many things in crypto called "public" where "public" doesn't mean that the whole world is free to get it, but instead means an opposite of "private", or, as I like to call them, "non-secret". Yes, salt can be made public, but shouldn't (unless there's a reason for it — like in a kind of client-side crypto where server stores salt and sends it to clients) to avoid precomputation.

CWuestefeld · on June 6, 2016

> Salt is a large random string unique per user, not per password.

Of course it's per user.

But "large" makes some sense. My current implementation has maybe 20-22 bits of uniqueness in the salt, certainly less than 16 bytes.

I don't think 16 bytes is necessary even as insurance against the future. Rainbow tables are still expensive to build.

On the other hand, maybe to build just a small table addressing the stupidest passwords ("password","12345678",etc.) it's worth making it more difficult.

dchest · on June 6, 2016

> Of course it's per user.

What I meant is that it shouldn't be per user, it should be per password. If a user changes his password, he should get a new salt.

CiPHPerCoder · on June 6, 2016

> I don't think 16 bytes is necessary even as insurance against the future.

The birthday problem comes into play here.

If you have 22 bits of entropy in your salt, after roughly 2048 users (2^11) you can expect to find two with the same salt, with probability on the order of 50%. If they also use the same password, this makes attacking your users much easier.

Don't make it easy for attackers. Use 16 bytes from a CSPRNG. Better yet: Use a password hashing library that takes care of this for you.

If you use a 128-bit (16-byte) salt, you have roughly a 50% chance of a collision only after about 2^64 passwords.
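The square-root rule above can be checked with the usual birthday approximation, p ≈ 1 - exp(-n(n-1)/(2N)) for n salts drawn uniformly from N possible values. This is a sketch under that assumption; the function names are made up for illustration.

```python
import math

def collision_probability(n, bits):
    """Approximate chance that at least two of n random salts of the given bit length coincide."""
    space = 2 ** bits
    return 1.0 - math.exp(-n * (n - 1) / (2 * space))

def users_for_half(bits):
    """Approximate number of salts needed for a 50% chance of at least one collision."""
    return math.sqrt(2 * math.log(2) * 2 ** bits)

p_22 = collision_probability(2 ** 11, 22)   # about 0.39
n_22 = users_for_half(22)                   # roughly 2,400
n_128 = users_for_half(128)                 # about 1.18 * 2**64
```

So 2^11 users with a 22-bit salt gives a collision chance of about 39%, and the 50% mark comes a little later. Either way the order of magnitude is the square root of the salt space, which is why 128 bits leaves such a wide margin.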

peeters · on June 6, 2016

It being unique goes most of the way, you're right (though hopefully it actually is unique!). I was being dramatic when I said "not nearly as good". But making the salt easily guessable does allow an attacker to precompute rainbow tables, etc. So if there was a breach and an attacker got a dump of your password hashes, it might mean the difference between you having time to invalidate those passwords or not.

Good look at the issue here: http://security.stackexchange.com/questions/41617/do-salts-h...

tamana · on June 6, 2016

But then you are using php so you already lost.

CiPHPerCoder · on June 6, 2016

If there was any truth at all in that claim, you should be able to compromise paragonie.com right now simply for it running PHP.

Otherwise, that claim is false.

jjawssd · on June 6, 2016

> No, you shouldn't. Your library should do that for you, and store it as a single string that's opaque to the developer.

I completely disagree. This implies that my DB ORM handles password stuff, which doesn't make sense.

Python 3 example:

Syntax: hashlib.pbkdf2_hmac(hash_name, password, salt, iterations, dklen=None)

Example:

    >>> import binascii, hashlib
    >>> dk = hashlib.pbkdf2_hmac('sha256', b'password', b'salt', 100000)
    >>> binascii.hexlify(dk)
    b'0394a2ede332c9a13eb82e9b24631604c31df978b4e2f0fbd2c549944f9d79a5'

Extremely simple and safe to use. Trivial to store and save/retrieve the salt from DB.

peeters · on June 6, 2016

I'm not too familiar with this library, but on inspection this approach seems to have a couple of drawbacks that libraries like bcrypt solve for you:

1) You need to store the salt alongside the password.

2) If you want to futureproof the stretching factor (e.g. change from 100000 to 1000000), you need to store that alongside the password hash as well.

3) If you want to futureproof the hashing algorithm, you need to store that alongside the password hash.

The value of the *crypt solutions is that they store the input parameters as part of the stored secret. So you can make adjustments later on without invalidating existing stored passwords, or having to resort to annoying "double-hashes" to migrate to a new approach.

I don't understand your comment about the ORM needing to handle passwords. It's a simple fetch of a field from the DB, which you then pass as an input to your password validator. How is that any harder than fetching a salt and a hash and passing those to your validator?
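For comparison, a sketch of how the same `hashlib.pbkdf2_hmac` call could be wrapped so that algorithm, iteration count and salt travel with the hash in one field. The `algorithm$iterations$salt$hash` layout and the function names are invented for this illustration and are not the format of any particular library.

```python
import base64
import hashlib
import hmac
import os

def hash_password(password, iterations=100_000):
    """Return one self-describing string: algorithm$iterations$salt$hash."""
    salt = os.urandom(16)  # fresh random salt for every password
    dk = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    b64 = lambda raw: base64.b64encode(raw).decode("ascii")
    return "$".join(["pbkdf2_sha256", str(iterations), b64(salt), b64(dk)])

def verify_password(password, stored):
    """Recompute the hash using the parameters embedded in the stored string."""
    algorithm, iterations, salt_b64, hash_b64 = stored.split("$")
    if algorithm != "pbkdf2_sha256":
        raise ValueError("unsupported algorithm: " + algorithm)
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(hash_b64)
    dk = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, int(iterations))
    return hmac.compare_digest(dk, expected)  # constant-time comparison

# The application stores and fetches this one string, nothing else.
stored = hash_password("correct horse battery staple")
```

Because `verify_password` reads the iteration count from the stored string, raising the default later does not invalidate hashes created with the old value.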

jjawssd · on June 6, 2016

I did not realize that the hash and salt are concatenated

t3nary · on June 6, 2016

Yeah, I'm currently taking a computer security class for master's students and we spend quite some time on passwords. Made me think that this should be in an obligatory course for bachelor's students (or some stripped-down version that teaches the basics).

It's mind-boggling how many well-known companies stored passwords badly.

joshstrange · on June 6, 2016

As another comment mentioned I think the issue is far less that computer scientists don't know good PW practice and far greater that management doesn't care to give them time to work on it.

t3nary · on May 21, 2016

I think OP is rather talking about window.{alert,prompt,confirm}, at least "ie, you can't interact with the page without answering the dialog" hints towards that

pcwalton · on May 21, 2016

That's tricky too. You can't just remove window.prompt, or users won't be able to use pages that rely on it for critical input. So what do you do? window.prompt is a synchronous API; you have to return something to the code that called it. You might say "well, just suspend that function and let the user interact with the rest of the page". But by doing that you've introduced coroutines to JavaScript (since "interacting with the rest of the page" means "running JS"), which is a huge change that comes with a mountain of tricky interactions. Even if it could be made to work (which most browser vendors think is impossible), it would still break pages that didn't expect random state to change across the window.prompt call.

t3nary · on May 18, 2016

Whoa, I'm using this all the time, like every second day. Thanks a lot for creating it, it's really nice!