I have a function that computes the value of a scalar field in 3D space. I feed it 3D tensors for the x, y and z coordinates (obtained from numpy.meshgrid) and use elementwise operations everywhere. This works as expected.
Now I need to calculate the gradient of the scalar field. I've been playing around with theano.tensor.grad and theano.tensor.jacobian, and I don't understand how the derivative of an elementwise operation is supposed to work.
Here is an MWE whose output I don't understand:
import theano.tensor as T
x, y = T.matrices("xy")
expr = x**2 + y
grad = T.grad(expr[0, 0], x)
print(grad.eval({x: [[1, 2], [1, 2]], y: [[1, 1], [2, 2]]}))
It prints
[[ 2. 0.]
 [ 0. 0.]]
while I would expect
[[ 2. 4.]
 [ 2. 4.]]
I also tried with jacobian:
import theano.tensor as T
x, y = T.matrices("xy")
expr = x**2 + y
grad = T.jacobian(expr.flatten(), x)
print(grad.eval({x: [[1, 2], [1, 2]], y: [[1, 1], [2, 2]]}))
which returns
[[[ 2. 0.]
  [ 0. 0.]]

 [[ 0. 4.]
  [ 0. 0.]]

 [[ 0. 0.]
  [ 2. 0.]]

 [[ 0. 0.]
  [ 0. 4.]]]
(Taken together, the nonzero elements of these four slices give exactly the matrix I expected in the previous example.)
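To make that observation concrete, here is a NumPy-only sketch (deliberately not Theano). The helper `numeric_jacobian` and its finite-difference step `eps` are my own illustrative assumptions, not part of any library. It builds the same (4, 2, 2) Jacobian numerically, then shows that both the "diagonal" entries and the sum over the output axis reproduce the matrix I expected:

```python
import numpy as np

def field(x, y):
    # Same elementwise expression as in the MWE above.
    return x**2 + y

def numeric_jacobian(f, x, y, eps=1e-6):
    """Central-difference Jacobian of f(x, y).flatten() with respect to x.

    Hypothetical helper for illustration; result has shape (x.size,) + x.shape,
    i.e. jac[k] is the gradient of the k-th flattened output w.r.t. x.
    """
    jac = np.zeros((x.size,) + x.shape)
    for idx in np.ndindex(*x.shape):
        step = np.zeros_like(x)
        step[idx] = eps
        # Change of every output element when only x[idx] is perturbed.
        diff = (f(x + step, y) - f(x - step, y)) / (2 * eps)
        jac[(slice(None),) + idx] = diff.flatten()
    return jac

x = np.array([[1.0, 2.0], [1.0, 2.0]])
y = np.array([[1.0, 1.0], [2.0, 2.0]])

jac = numeric_jacobian(field, x, y)

# "Diagonal": derivative of output element k with respect to input element k.
elementwise = np.diag(jac.reshape(x.size, x.size)).reshape(x.shape)

# Gradient of the sum of all outputs: sum the Jacobian over the output axis.
grad_of_sum = jac.sum(axis=0)

print(elementwise)
print(grad_of_sum)
```

Because each output element depends only on the input element at the same position, every off-diagonal entry of the Jacobian is zero, so summing over the output axis gives the same matrix as picking out the diagonal.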
Is there some way to get the elementwise gradients I need?
Can I, for example, define the function as a scalar function (three scalars in, one scalar out) and then apply it elementwise over the coordinate tensors? That way the derivative would also be a simple scalar, and everything would work smoothly.