NumPy arrays of vectors#
First, install and import Vector and NumPy. (Vector requires NumPy, so if you can use Vector, you’ve already installed NumPy.)
[1]:
import vector
import numpy as np
Making an array of vectors#
If you want to do calculations with large numbers of vectors or performance is important, it’s better to make arrays of vectors than lists of vector objects.
The vector.array function is a general-purpose constructor. It works like np.array and expects the dtype to correspond to a structured array in which the field names are recognized coordinate names:
[2]:
vector.array(
[(1.1, 2.2), (3.3, 4.4), (5.5, 6.6)], dtype=[("x", np.float32), ("y", np.float64)]
)
[2]:
VectorNumpy2D([(1.1, 2.2), (3.3, 4.4), (5.5, 6.6)],
dtype=[('x', '<f4'), ('y', '<f8')])
Although this interface provides a lot of flexibility, it is inconvenient to type, compared to a Pandas-style constructor:
[3]:
vector.array(
{"x": np.array([1.1, 3.3, 5.5], np.float32), "y": np.array([2.2, 4.4, 6.6])}
)
[3]:
VectorNumpy2D([(1.1, 2.2), (3.3, 4.4), (5.5, 6.6)],
dtype=[('x', '<f4'), ('y', '<f8')])
The VectorNumpy2D, MomentumNumpy2D, VectorNumpy3D, MomentumNumpy3D, VectorNumpy4D, and MomentumNumpy4D classes that this constructor creates can also be used to np.ndarray.view an existing array as an array of vectors, which avoids copying data.
[4]:
arr = np.array(
[(1.1, 2.2), (3.3, 4.4), (5.5, 6.6)], dtype=[("x", np.float32), ("y", np.float64)]
)
[5]:
arr.view(vector.VectorNumpy2D)
[5]:
VectorNumpy2D([(1.1, 2.2), (3.3, 4.4), (5.5, 6.6)],
dtype=[('x', '<f4'), ('y', '<f8')])
If you’re starting from arrays in an unstructured form, you can first np.ndarray.view them as structured arrays and then view those as Vector’s array subclasses.
[6]:
arr = np.random.normal(0, 1, (10000, 3))
arr
[6]:
array([[-0.11148816, -0.83807123, 1.89938172],
[ 0.07184076, 1.36148282, -1.57618995],
[ 0.46223533, 0.22770039, 0.62830773],
...,
[-0.84363959, -0.89915358, -0.18437813],
[ 0.7430596 , -0.3939518 , 0.15209156],
[ 0.40687548, -0.18103823, 0.78935375]])
[7]:
arr.view([(name, arr.dtype) for name in ("x", "y", "z")])
[7]:
array([[(-0.11148816, -0.83807123, 1.89938172)],
[( 0.07184076, 1.36148282, -1.57618995)],
[( 0.46223533, 0.22770039, 0.62830773)],
...,
[(-0.84363959, -0.89915358, -0.18437813)],
[( 0.7430596 , -0.3939518 , 0.15209156)],
[( 0.40687548, -0.18103823, 0.78935375)]],
dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8')])
[8]:
arr.view([(name, arr.dtype) for name in ("x", "y", "z")]).view(vector.VectorNumpy3D)
[8]:
VectorNumpy3D([[(-0.11148816, -0.83807123, 1.89938172)],
[( 0.07184076, 1.36148282, -1.57618995)],
[( 0.46223533, 0.22770039, 0.62830773)],
...,
[(-0.84363959, -0.89915358, -0.18437813)],
[( 0.7430596 , -0.3939518 , 0.15209156)],
[( 0.40687548, -0.18103823, 0.78935375)]],
dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8')])
You can also have multidimensional arrays of vectors (just be sure not to confuse the two different concepts of “dimension”):
[9]:
arr = np.random.normal(0, 1, (100, 100, 3))
arr.view([(_, float) for _ in "xyz"]).view(vector.VectorNumpy3D)
[9]:
VectorNumpy3D([[[( 1.04123373, -0.99133844, 1.00077791e+00)],
[( 0.52169376, 0.11683251, -1.14363945e-03)],
[( 1.85318012, -0.31640048, -2.34568709e-01)],
...,
[( 1.67155023, -0.33544467, 8.77562271e-01)],
[(-1.1018786 , -0.35744257, -1.05612336e+00)],
[( 0.79966766, 2.38293098, 5.96236617e-01)]],
[[( 1.94366521, -0.9855164 , 4.60895006e-01)],
[( 0.3889074 , -0.85616641, -7.82467759e-01)],
[(-0.32565 , 0.9830741 , -1.44864836e+00)],
...,
[( 0.61327191, 1.43713057, -1.13442288e+00)],
[(-0.07791965, 0.65163304, 1.09988728e+00)],
[( 1.65952573, -2.78818342, 6.72730414e-01)]],
[[(-0.8903325 , -0.6297625 , 6.67996204e-03)],
[(-0.6223236 , -1.37730236, 2.51833933e-01)],
[(-0.1018968 , -0.41731018, 9.47917924e-01)],
...,
[( 0.34484408, -0.5465291 , -4.27092664e-01)],
[( 1.12237423, -0.25509153, 9.04461212e-01)],
[( 0.17489999, 0.92012174, -7.09007969e-01)]],
...,
[[( 0.26270918, -1.05908501, 1.18350858e+00)],
[( 2.71511972, -1.34473636, 1.05541923e+00)],
[( 0.56873211, 0.08490973, -9.58420819e-01)],
...,
[( 0.75977987, 1.01487493, -3.09104648e-01)],
[( 1.49991558, 0.31828259, 7.32527127e-01)],
[( 0.37631738, -0.13702323, -4.49895321e-01)]],
[[(-0.29811247, 0.22712161, -2.30229599e+00)],
[(-0.08367526, -0.6770184 , 2.14906991e+00)],
[( 1.30232311, -1.19833055, 6.25619458e-01)],
...,
[(-0.66017245, 0.71545281, -1.08721809e-01)],
[(-0.05371075, 0.14571677, -2.32778280e-01)],
[( 1.84950325, 0.20260285, 7.63400450e-01)]],
[[( 1.11391053, -0.92655026, 6.71088190e-01)],
[( 0.80958443, -0.87365369, 1.90746702e-01)],
[( 0.9196947 , -0.65898106, 5.13530528e-01)],
...,
[(-0.06864652, 1.35126332, -3.85608104e-01)],
[(-0.59319991, 0.29826444, -9.87661503e-01)],
[( 0.85737793, 0.65006251, 1.14143652e-01)]]],
dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8')])
As with vector objects, the type of vectors (2D/3D/4D, coordinate system, geometric or momentum) depend on the field names.
Below is 10000 polar 2D vectors with a random distribution.
[10]:
vector.array(
{
"rho": np.random.exponential(5, 10000),
"phi": np.random.uniform(-np.pi, np.pi, 10000),
}
)
[10]:
VectorNumpy2D([(5.76986734, -0.36991388), (6.24908325, -1.59263296),
(7.30828905, 2.44556635), ..., (4.22339573, 1.2802615 ),
(1.26336621, 1.3455241 ), (7.57120305, 0.49939363)],
dtype=[('rho', '<f8'), ('phi', '<f8')])
Below is 10000 simulated electron four-momenta, radiating in a spherically symmetric pattern.
[11]:
vector.array(
{
"pt": np.random.exponential(5, 10000),
"phi": np.random.uniform(-np.pi, np.pi, 10000),
"theta": np.arccos(np.random.uniform(-1, 1, 10000)),
"mass": np.full(10000, 0.000511),
}
)
[11]:
MomentumNumpy4D([(3.88889022, 2.09785462, 2.66867642, 0.000511),
(0.26253295, 2.0104313 , 1.00648054, 0.000511),
(4.50268027, 2.61219852, 1.26494007, 0.000511), ...,
(1.69167777, 2.92313438, 0.64490048, 0.000511),
(0.02264467, 2.72715258, 1.64896841, 0.000511),
(0.47545426, -2.3042779 , 1.09635369, 0.000511)],
dtype=[('rho', '<f8'), ('phi', '<f8'), ('theta', '<f8'), ('tau', '<f8')])
Using arrays of vectors#
Calculations on arrays of vectors are more efficient than vector objects because loops over the data are performed in compiled code, not Python for loops. Therefore, while it’s possible to iterate over the arrays like this:
[12]:
a = vector.array({"x": np.arange(10), "y": np.zeros(10), "z": np.arange(-5, 5)})
b = vector.array({"x": np.ones(10), "y": np.arange(10, 0, -1), "z": np.arange(-10, 0)})
[13]:
for i in range(len(a)):
print(a[i].cross(b[i]))
VectorObject3D(x=50.0, y=-5.0, z=0.0)
VectorObject3D(x=36.0, y=5.0, z=9.0)
VectorObject3D(x=24.0, y=13.0, z=16.0)
VectorObject3D(x=14.0, y=19.0, z=21.0)
VectorObject3D(x=6.0, y=23.0, z=24.0)
VectorObject3D(x=-0.0, y=25.0, z=25.0)
VectorObject3D(x=-4.0, y=25.0, z=24.0)
VectorObject3D(x=-6.0, y=23.0, z=21.0)
VectorObject3D(x=-6.0, y=19.0, z=16.0)
VectorObject3D(x=-4.0, y=13.0, z=9.0)
It’s much more efficient to express the operation in one (Python) function call:
[14]:
a.cross(b)
[14]:
VectorNumpy3D([(50., -5., 0.), (36., 5., 9.), (24., 13., 16.), (14., 19., 21.),
( 6., 23., 24.), (-0., 25., 25.), (-4., 25., 24.), (-6., 23., 21.),
(-6., 19., 16.), (-4., 13., 9.)],
dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8')])
Here’s a demonstration of that fact:
[15]:
a = (
np.random.normal(0, 1, (100000, 3))
.view([(_, float) for _ in "xyz"])
.view(vector.VectorNumpy3D)
)
b = (
np.random.normal(0, 1, (100000, 3))
.view([(_, float) for _ in "xyz"])
.view(vector.VectorNumpy3D)
)
[16]:
%%timeit -r1 -n1
out = np.empty(100000)
for i in range(len(out)):
(out[i],) = a[i].dot(b[i])
2.34 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
[17]:
%%timeit
out = a.dot(b)
224 μs ± 23.9 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
(Note the units.)
All of the properties and methods for Vector’s NumPy backend operate an array at a time. This is often called “vectorized” processing, but using that terminology with Vector is confusing. This also applies to the properties that convert coordinates:
[18]:
a.phi
[18]:
array([[ 2.23666277],
[-2.81852833],
[ 1.21493154],
...,
[ 2.15235242],
[-0.95523549],
[ 2.76774914]])
[19]:
a.theta
[19]:
array([[0.49675579],
[3.01694143],
[1.3997076 ],
...,
[2.4227274 ],
[1.56865499],
[1.63848396]])
and the binary operators:
[20]:
a + b
[20]:
VectorNumpy3D([[(-0.54670582, 2.89324033, 0.94011735)],
[(-0.72623161, -0.24806477, -0.82315251)],
[( 0.65875323, -0.03273066, -1.00717335)],
...,
[(-0.4388786 , 1.67843842, -2.02837515)],
[( 2.03465828, -0.28830504, -1.30802195)],
[(-1.14806043, 0.2955152 , 0.31599728)]],
dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8')])
All of the above operations apply element by element, returning a new array of the same shape, just like NumPy’s ufuncs. However, other functions, such as np.where and np.sum, work equally well:
[21]:
np.where(abs(a) > abs(b), a, b)
[21]:
array([[(-0.37840277, 2.67899112, 0.4375337 )],
[(-0.09376478, -0.03139186, -0.78914169)],
[( 0.48146806, -0.50970237, -1.09509198)],
...,
[(-0.64940632, 0.98784834, -1.35097802)],
[( 1.37797373, 0.64022737, -1.31045726)],
[(-1.23186483, 0.48325011, -0.0897055 )]],
dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8')])
[22]:
np.sum(a)
[22]:
VectorNumpy3D((496.97860925, -22.8454872, -148.7924227),
dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8')])
For arrays with multiple “dimensions” (length of the shape tuple), you can reduce along different axes, adding all vectors appropriately for their coordinate systems.
[23]:
arr = np.random.normal(0, 1, (2000, 2000, 3))
arr = arr.view([(_, float) for _ in "xyz"]).view(vector.VectorNumpy3D)
arr = arr.to_rhophitheta()
[24]:
np.sum(arr, axis=0)
[24]:
VectorNumpy3D([[( 75.98446415, 40.15921933, 60.12075909)],
[( 43.96376125, -41.61744832, 2.74547848)],
[( -8.85532964, 36.95219866, 6.00630486)],
...,
[( 14.02706268, -60.15901098, -26.43507193)],
[(-51.59700053, 42.23856749, 70.63818226)],
[(-35.38474417, 103.36337473, -101.28248057)]],
dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8')])
[25]:
np.sum(arr, axis=1)
[25]:
VectorNumpy3D([[( 3.62436183, 1.83432384, 64.49208518)],
[(-36.24506116, 47.24392643, -16.92574687)],
[(-53.86240677, -20.09648243, 6.7455743 )],
...,
[( 27.92974969, 33.86961883, -23.77178002)],
[(-45.26981061, -3.55433054, 50.74996803)],
[(-40.54746674, 38.83594044, -18.71646616)]],
dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8')])