Thank you for sharing. One thing I found confusing was the introduction of l_{i}'(x). I kept thinking this was short for \frac{d}{dx} l_{i}(x), but it's not the derivative!
Have a look at the daily trading volume for Bitcoin; these financial instruments are a very small part of it... we are talking $50-100 billion daily. It does contribute, but in terms of scale far less than what you seem to suggest... arguably, I would guess that these days the announcements of big purchases move the market more than the actual purchases themselves.
Ah, fascinating, I've never used order statistics. It doesn't look exactly like what I was thinking, but there is also a continuum from min to median to max, similar to min to mean/sum to max. I'm not sure, but I might guess that for the special case of a set of independent uniform variables, the median and the mean distributions are the same? Does this mean there's a strong conceptual connection between the Bates distribution and the Beta distribution? (Neither Wikipedia page mentions the other.) Maybe order statistics are more applicable and useful than I imagined…
Median and mean do not have the same distribution. Consider three uniform values: for the median to be small, two of them need to be small, but for the mean to be small, all three do.
I think order statistics are more useful than what you described, because “min” and “max” are themselves quantiles and more conceptually similar to “median” than to “mean”.
Trying to imagine how to bridge from min/max to mean, I guess you could take weighted averages with weights determined by rank (such order-weighted averages are known as L-statistics), but I can't think of a single canonical choice of weights.
The reason that they do not look the same is that the order statistics there are presented for an exponential distribution, which has an unbounded upper range. When you do it for n uniform variables, you get an n-th power and an n-th root at the extremes, with varyingly lopsided, normal-looking distributions in between.
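To make this concrete (and to answer the Beta/Bates question above): for n iid Uniform(0, 1) variables, the k-th order statistic is Beta(k, n+1-k) distributed, while the mean follows the Bates distribution, and these genuinely differ. A small simulation sketch with NumPy:

```python
import numpy as np

# Simulate n = 3 iid Uniform(0, 1) variables many times and compare the
# distribution of the median (an order statistic) with that of the mean.
rng = np.random.default_rng(0)
samples = rng.uniform(size=(200_000, 3))
medians = np.median(samples, axis=1)
means = np.mean(samples, axis=1)

# The k-th order statistic of n uniforms is Beta(k, n + 1 - k), so the
# median of 3 is Beta(2, 2), with CDF F(t) = 3t^2 - 2t^3.
t = 0.25
print(np.mean(medians <= t))  # close to 3*t**2 - 2*t**3 = 0.15625

# The mean of 3 uniforms is Bates-distributed; P(mean <= 0.25) equals
# P(sum <= 0.75) = 0.75**3 / 6, roughly 0.070.
print(np.mean(means <= t))
```

With n = 3 the two CDFs at 0.25 are 0.15625 versus about 0.070, so the three-uniforms argument above checks out numerically.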
I've always thought the use of "Tensor" in the "TensorFlow" library is a misnomer. I'm not too familiar with ML/theory, is there a deeper geometric meaning to the multi-dimensional array of numbers we are multiplying or is "MatrixFlow" a more appropriate name?
Since the beginning of computer technology, "array" has been the term used for this kind of multi-dimensional data, with "vectors" and "matrices" being special kinds of arrays. An exception was COBOL, which had a completely different terminology from the other programming languages of its time. Among the long list of differences between COBOL and the rest were, e.g., "class" instead of "type" and "table" instead of "array". Some of the COBOL terminology was inherited by languages like SQL or Simula 67 (hence the use of "class" in OOP languages).
A "tensor", as used in mathematics in physics is not any array, but it is a special kind of array, which is associated with a certain coordinate system and which is transformed by special rules whenever the coordinate system is changed.
The "tensor" in TensorFlow is a fancy name for what should be called just "array". When an array is bidimensional, "matrix" is an appropriate name for it.
I agree. Just like NumPy's Einsum. "Multi-Array Flow" doesn't sound sexy and associating your project with a renowned physicist's name gives your project that "we solve big science problems" vibe by association. Very pretentious, very predictable, and very cringe.
The joke I learned in a Physics course is "a vector is something that transforms like a vector," and "a tensor is something that transforms like a tensor." It's true, though.
The physicist's tensor is an array of functions of the coordinates that transforms in a prescribed way when the coordinates are transformed. The transformation rule is a particular application of the chain rule from calculus.
I don't know why the word "tensor" is used in other contexts. Google says that the etymology of the word is:
> early 18th century: modern Latin, from Latin tendere ‘to stretch’.
So maybe the different senses of the word share the analogy of scaling matrices.
The mathematical definition is 99% equivalent to the physical one. I find that the physical one helps to motivate the mathematical one by illustrating the numerical difference between the basis-change transformation for (1,0)- and (0,1)-tensors. The mathematical one is then simpler and more conceptual once you've understood that motivation. The concept of a tensor really belongs to linear algebra, but occurs mostly in differential geometry.
There is still a "1% difference" in meaning though. This difference allows a physicist to say "the Christoffel symbols are not a tensor", while a mathematician would say this is a conflation of terms.
TensorFlow's terminology is based on the rule of thumb that a "vector" is really a 1D array (think column vector), a "matrix" is really a 2D array, and a "tensor" is then an nD array. That's it. This is offensive to physicists especially, but ¯\_(ツ)_/¯
The problem with the physicist's definition is that the larger the rank, the less the geometrical interpretation makes sense. For rank-1, rank-2, and even rank-3 tensors there is some connection to geometry, but eventually it loses all meaning. The physicist has to give up and "admit" that a rank-N tensor really is just a collection of rank-(N-1) tensors.
The tensors in tensorflow are often higher dimensional. Is a 3d block of numbers (say 1920x1080x3) still a matrix? I would argue it's not. Are there transformation rules for matrices?
You're totally correct that the tensors in tensorflow do drop the geometric meaning, but there's precedent there from how CS vs. math folk use vectors.
Matrices are strictly two-dimensional arrays (together with some other properties, but for a computer scientist that's it). Tensors are the generalization to higher dimensional arrays.
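One concrete sense in which higher-dimensional arrays generalize matrices: matrix multiplication is a special case of tensor contraction, which einsum notation expresses for any rank (sketched here with np.einsum; tf.einsum is analogous):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 4))
B = rng.normal(size=(4, 5))

# Matrix multiplication is a contraction over one shared index...
assert np.allclose(np.einsum('ij,jk->ik', A, B), A @ B)

# ...and the same notation contracts higher-rank arrays, where no single
# "matrix product" is defined. Here T is a rank-3 array and we contract
# its last index against a vector, leaving a rank-2 result.
T = rng.normal(size=(2, 3, 4))
v = rng.normal(size=4)
M = np.einsum('ijk,k->ij', T, v)
assert M.shape == (2, 3)
```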
I could stop right here, since that 3D block is a counterexample to its being a matrix (with a matrix product defined on it; P.S. try tf.matmul(x, x) with x of that shape--it will fail; there's no .transpose method either). But that's only technically correct :)
So let's look at tensorflow some more:
The tensorflow tensors should transform like vectors would under change of coordinate system.
In order to see that, let's do a change of coordinate system. To summarize the code below: if L1 and W12 are indeed tensors, it should be true that (W12 A^-1)(A L1) = W12 L1.
Try it (in tensorflow) and see whether the new tensor obeys the tensor laws after the transformation. Interpret the changes to the nodes as covariant and the changes to the weights as contravariant:
import tensorflow as tf
# Initial outputs of one layer of nodes in your neural network
L1 = tf.constant([2.5, 4, 1.2], dtype=tf.float32)
# Our evil transformation matrix (coordinate system change)
A = tf.constant([[2, 0, 0], [0, 1, 0], [0, 0, 0.2]], dtype=tf.float32)
# Weights (no particular values; "random")
W12 = tf.constant(
[[-1, 0.4, 1.5],
[0.8, 0.5, 0.75],
[0.2, -0.3, 1]], dtype=tf.float32
)
# Covariant tensor nature; varying with the nodes
L1_covariant = tf.matmul(A, tf.reshape(L1, [3, 1]))
A_inverse = tf.linalg.inv(A)
# Contravariant tensor nature; varying against the nodes
W12_contravariant = tf.matmul(W12, A_inverse)
# Now derive the inputs for the next layer using the transformed node outputs and weights
L2 = tf.matmul(W12_contravariant, L1_covariant)
# Compare to the direct way
L2s = tf.matmul(W12, tf.reshape(L1, [3, 1]))
# The two results should agree up to floating-point error
tf.debugging.assert_near(L2, L2s)
A tensor (like a vector) is actually a very low-level object from the standpoint of linear algebra. It's not hard at all to make something a tensor. Think of it like geometric "assembly language".
In comparison, a matrix is rank 2 (and not all matrices represent tensors). That's it: no rank 3, rank 4, or rank 1 (!!). So how much does a matrix really help you?
If you mean that the operations in tensorflow (and numpy before it) aren't beautiful or natural, I agree. It still works, though. If you want to stick to ASCII and have no indices on names, you can't do much better (otherwise, use Cadabra[1]--which is great). For example, it was really difficult to write the example above without using indices, and it's really not beautiful this way :(
See also http://singhal.info/ieee2001.pdf, including its references, for a primer on information science and the inner-product vector spaces usually used in ML. The latter are definitely geometry.
[1] https://cadabra.science/ (also in mogan or texmacs) - Einstein field equations also work there and are beautiful
In TensorFlow, the tf.matmul function or the @ operator performs matrix multiplication. Element-wise multiplication ends up being useful for a lot of parallelizable computation, but it should not be confused with matrix multiplication.
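A minimal illustration of the difference, sketched in NumPy (TensorFlow's tf.matmul and * behave the same way on tf.constant matrices):

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])

matmul = A @ B       # rows-times-columns matrix product
elementwise = A * B  # Hadamard (element-by-element) product

# The two products differ: the top-left entry is 1*5 + 2*7 = 19
# for the matrix product but just 1*5 = 5 element-wise.
assert matmul[0, 0] == 19.0
assert elementwise[0, 0] == 5.0
```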
I was thinking it would be cool to add another slider for day of the year. That way you can see how the differences change as local timezones roll through their own daylight-saving-time or other adjustments. For example, right now Asunción, Paraguay is three hours ahead of Chicago, U.S., but in April it will be only one hour ahead.
I believe they want to signal to the rest of the world that they are still a stable alternative for western nuclear fuel buyers. Uranium is the spice, and the spice must flow. Of course they also export coal, gas and oil too.
Thank you so much for creating this! Under the Curve61 point addition example, I was trying to follow the formula for adding two points: P(x1, y1) + Q(x2, y2) = R(x3, y3), where x3 = l^2 - x1 - x2, y3 = l(x1 - x3) - y1, and l = (y2 - y1)/(x2 - x1). I tried it on the example P(5, 7) + 23P(2, 24), got (226/9, 2888/7) != 24P(59, 55), and was wondering where I've gone wrong? Appreciate your response!
After a few missteps where I transcribed the vars wrong (laugh) I wrote out the calcs and was able to reach the correct result. Here's my step-by-step process, hope this helps!
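For anyone else who hits the same wall: the division in l has to be done modulo the curve's prime, not over the rationals, which is where fractions like 226/9 come from. A minimal sketch, assuming "Curve61" means the curve is defined over GF(61):

```python
# Affine point addition for two distinct points on a curve over GF(p),
# assuming "Curve61" means arithmetic modulo p = 61.
def add_points(P, Q, p=61):
    x1, y1 = P
    x2, y2 = Q
    # "Division" modulo p is multiplication by the modular inverse.
    l = (y2 - y1) * pow(x2 - x1, -1, p) % p
    x3 = (l * l - x1 - x2) % p
    y3 = (l * (x1 - x3) - y1) % p
    return (x3, y3)

print(add_points((5, 7), (2, 24)))  # (59, 55), matching 24P in the example
```

Here pow(x, -1, p) (Python 3.8+) computes the modular inverse, so l = 17 * (-3)^-1 mod 61 = 35, which gives x3 = 59 and y3 = 55.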
Much respect to the ScummVM team as they allowed me to play the original "Secret of Monkey Island" after it left my local library! Ironically, I no longer needed the "Dial a Pirate" code-wheel.