Missing type conversion in model.py

A type conversion seems to be missing in the forward pass of `VisionTransformer` in `clip/model.py`. Calling forward directly, without converting the input beforehand (Line 342), raises a type mismatch error when the input dtype differs from the dtype of the model weights.

For reference, `ModifiedResNet` performs an explicit type conversion at Line 146.

Would it be possible to add the conversion `x = x.type(self.conv1.weight.dtype)` between Lines 223 and 224?
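
To illustrate, here is a minimal sketch of the mismatch and the proposed fix. It does not use the actual CLIP code: `conv1` is a hypothetical stand-in for the patch-embedding convolution, and the weights are cast to float64 instead of float16 only so that the snippet runs on CPU.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for VisionTransformer.conv1 (not the real CLIP layer).
# Weights are cast to float64 here; in CLIP the weights would typically be fp16.
conv1 = nn.Conv2d(3, 8, kernel_size=4, stride=4, bias=False).double()

# Input left in the default float32 dtype, i.e. no pre-conversion.
x = torch.randn(1, 3, 8, 8)

# Without a conversion, the input and weight dtypes disagree.
try:
    conv1(x)
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True

# Proposed fix: cast the input to the dtype of the conv weights first.
x = x.type(conv1.weight.dtype)
out = conv1(x)
```

With the cast in place, the forward pass no longer depends on the caller converting the input in advance.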