Currently the input tokenizer is in Python, taken from OpenAI's original implementation: https://github.com/certik/fastGPT/blob/01eb84b015d89a567245da0445c0abb7d53a8500/encode_input.py. We should implement it in Fortran. That will eliminate the need to call the Python script before running fastGPT.
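The core of a GPT-2-style tokenizer is the byte-pair-encoding (BPE) merge loop, which any Fortran port will have to reproduce exactly. As a minimal sketch of that loop (the merge table below is a toy example, not the real GPT-2 merges, and the function names are illustrative, not taken from encode_input.py):

```python
def get_pairs(word):
    """Return the set of adjacent symbol pairs in a word."""
    return set(zip(word, word[1:]))

def bpe(token, ranks):
    """Repeatedly merge the lowest-ranked adjacent pair until none remain."""
    word = tuple(token)
    while len(word) > 1:
        pairs = get_pairs(word)
        # Pick the pair with the best (lowest) merge rank.
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no mergeable pair left
        first, second = best
        new_word = []
        i = 0
        while i < len(word):
            if i < len(word) - 1 and word[i] == first and word[i + 1] == second:
                new_word.append(first + second)
                i += 2
            else:
                new_word.append(word[i])
                i += 1
        word = tuple(new_word)
    return word

# Toy merge table: lower rank = higher merge priority.
ranks = {("l", "o"): 0, ("lo", "w"): 1, ("e", "r"): 2}
print(bpe("lower", ranks))  # → ('low', 'er')
```

A Fortran version of this loop is straightforward to write with integer token arrays; the subtle parts of the full tokenizer are elsewhere (the regex-based pre-tokenization and the byte-to-unicode mapping), which is exactly why careful tests against the Python reference are needed.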
We have to write tests that exercise each code path in the Python implementation to ensure our Fortran implementation is correct.
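Since the goal is bit-for-bit agreement between the two implementations, a natural starting point is a fixed table of inputs chosen to hit the tokenizer's distinct code paths: contractions, non-ASCII bytes, digit runs, punctuation clusters, and whitespace handling. A hedged sketch of such a harness (`python_encode` and `fortran_encode` are hypothetical names for the two implementations, not existing functions in the repo):

```python
# Inputs chosen to exercise distinct tokenizer code paths.
TEST_INPUTS = [
    "Hello world",               # plain ASCII words
    "don't can't it's",          # contraction splitting
    "naïve café",                # non-ASCII / UTF-8 byte handling
    "1234 5,678.90",             # digit runs and numeric punctuation
    "!!!???...",                 # punctuation clusters
    "  leading and trailing  ",  # whitespace runs
]

# The actual harness would compare the Fortran port against the
# Python reference on every input (function names hypothetical):
#
# for text in TEST_INPUTS:
#     assert fortran_encode(text) == python_encode(text), text

# Sanity check that the table itself is well-formed.
for text in TEST_INPUTS:
    assert isinstance(text, str) and len(text) > 0
```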