- N/A
- N/A
- Depends on `apache-beam[gcp]>=2.50,<2.51` for Python 3.9 and 3.10.
- N/A
- N/A
- N/A
- Depends on `protobuf>=4.25.2,<5` for Python 3.11 and on `protobuf>=4.21.6,<6.0.0` for 3.9 and 3.10.
- Bumped the minimum bazel version required to build `tfx_bsl` to 6.5.0.
- Updated ZetaSQL to v2024.11.1 and related dependencies.
- Update GCC version to gcc-10 and the build configuration (.bazelrc, .bazelversion)
- macOS wheel publishing is temporarily paused due to missing ARM64 support.
- N/A
- N/A
- N/A
- Relax dependency on Protobuf to include version 5.x
- N/A
- N/A
- N/A
- Depends on `tensorflow 2.16`
- N/A
- N/A
- N/A
- Bumped the minimum bazel version required to build `tfx_bsl` to 6.1.0.
- Bumped the macOS version on which TFX-BSL is tested to Ventura (previously Monterey).
- Bumps the pybind11 version to 2.11.1
- Depends on `tensorflow 2.15`
- Depends on `apache-beam[gcp]>=2.53.0,<3` for Python 3.11 and on `apache-beam[gcp]>=2.47.0,<3` for 3.9 and 3.10.
- Depends on `protobuf>=4.25.2,<5` for Python 3.11 and on `protobuf>3.20.3,<5` for 3.9 and 3.10.
- Deprecated Windows support.
- N/A
- N/A
- Bumped the minimum bazel version required to build `tfx_bsl` to 6.1.0.
- Bumped the macOS version on which TFX-BSL is tested to Ventura (previously Monterey).
- Bumps the pybind11 version to 2.11.1
- Depends on `tensorflow~=2.15`
- Depends on `apache-beam[gcp]>=2.53.0,<3` for Python 3.11 and on `apache-beam[gcp]>=2.47.0,<3` for 3.9 and 3.10.
- Depends on `protobuf>=4.25.2,<5` for Python 3.11 and on `protobuf>3.20.3,<5` for 3.9 and 3.10.
- Deprecated Windows support.
- N/A
- Deprecated python 3.8 support.
- N/A
- Bumped the Ubuntu version on which TFX-BSL is tested to 20.04 (previously 16.04).
- Adds `order_on_tie` parameter to `MisraGriesSketch` to specify the order of items in case their counts are tied.
- Use @platforms instead of @bazel_tools//platforms to specify constraints in OSS build.
- Depends on `pyarrow>=10,<11`.
- Depends on `apache-beam>=2.47,<3`.
- Depends on `numpy>=1.22.0`.
- Depends on `tensorflow>=2.13,<3`
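
The `order_on_tie` entry above is about how items with equal counts are ordered in the output. The following is a minimal pure-Python sketch of a Misra-Gries summary with a configurable tie order. It is not the TFX-BSL implementation: the function names `misra_gries` and `ranked`, and the boolean flag `reverse_lexicographic_on_tie`, are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def misra_gries(items: Iterable[str], num_buckets: int) -> Dict[str, int]:
    """Classic Misra-Gries summary that keeps at most `num_buckets` counters."""
    counters: Dict[str, int] = {}
    for item in items:
        if item in counters:
            counters[item] += 1
        elif len(counters) < num_buckets:
            counters[item] = 1
        else:
            # No free bucket: decrement every counter and drop those at zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


def ranked(
    counters: Dict[str, int], reverse_lexicographic_on_tie: bool = False
) -> List[Tuple[str, int]]:
    """Sorts by descending count; ties are broken by item name (hypothetical flag)."""
    by_name = sorted(
        counters.items(),
        key=lambda kv: kv[0],
        reverse=reverse_lexicographic_on_tie,
    )
    # sorted() is stable, so the name order survives among equal counts.
    return sorted(by_name, key=lambda kv: kv[1], reverse=True)
```

Without an explicit tie order, the ranking of equally frequent items would depend on insertion order, which is why a parameter of this kind can make outputs reproducible.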
- N/A
- N/A
- `RaggedTensor`s can now be automatically inferred for variable length features by setting `represent_variable_length_as_ragged=true` in TFMD schema.
- Bumped the minimum bazel version required to build `tfx_bsl` to 5.3.0.
- `RecordBatchToExamplesEncoder` now encodes arrays representing `RaggedTensor`s in a way that is consistent with `tf.io.parse_example`. Note that this change is backwards compatible with `ExamplesToRecordBatchDecoder` and the decoding workflow as well.
- Added utility functions for interacting with Arrow arrays and record batches.
- Depends on `numpy~=1.22.0`.
- N/A
- Deprecated python 3.7 support.
- `InferTensorRepresentationsFromSchema`, `TensorAdapter` and `TensorsToRecordBatchConverter` now support `SparseTensor`s with unknown `dense_shape`.
- Depends on `tensorflow>=2.11,<3`
- N/A
- N/A
- `TensorAdapter` now processes `tf.RaggedTensor`s in TF 2 ~10x faster.
- `InferTensorRepresentationsFromSchema` now infers `RaggedTensor`s for `STRUCT` features.
- `TFSequenceExampleRecord` now supports schemas with features not covered or partially covered by `TensorRepresentations`.
- This is the last version that supports TensorFlow 1.15.x. TF 1.15.x support will be removed in the next version. Please check the TF2 migration guide to migrate to TF2.
- Depends on `tensorflow>=1.15.5,<2` or `tensorflow>=2.10,<3`
- Depends on `protobuf>=3.13,<4`
- Various `TFXIO` implementations now infer `TensorRepresentations` for provided schema `Features` even if some `TensorRepresentations` are provided as well.
- N/A
- N/A
- `ExamplesToRecordBatchDecoder` is now picklable.
- `ParquetTFXIO` can now be used as `RecordBasedTFXIO`.
- Introduces `CreateTfSequenceExampleParserConfig` that takes TFMD schema as input and produces configs for `tf.SequenceExample` parsing.
- `TFSequenceExampleRecord` can now produce an equivalent tf.data.Dataset.
- Introduces an API: `CreateModelHandler` that produces a model handler suitable for apache_beam.ml.inference.
- Quantiles sketch supports GetQuantilesAndCumulativeWeights, which returns the sum of weights in each quantiles bin along with boundaries.
- Depends on `apache-beam[gcp]>=2.40,<3`.
- Depends on `pyarrow>=6,<7`.
- Depends on `tensorflow-metadata>=1.10,<1.11`.
- Depends on `tensorflow>=1.15.5,<2` or `tensorflow>=2.9,<3`.
- GenerateQuantiles removed from weighted_quantiles_summary.h and replaced with GenerateQuantilesAndCumulativeWeights.
- N/A
- N/A
- Depends on `tensorflow-metadata>=1.9,<1.10`.
- Depends on `tensorflow>=1.15.5,<2` or `tensorflow>=2.9,<3`.
- Depends on `protobuf>=3.13,<3.21`.
- N/A
- N/A
- Introduced `RunInferencePerModel` PTransform, which is a vectorized variant of `RunInference` (useful for ensembles).
- Introduced `ParquetTFXIO` that allows reading data from Parquet files in `pyarrow.RecordBatch` format.
- From this version we will be releasing python 3.9 wheels.
- Depends on `apache-beam[gcp]>=2.38,<3`.
- Depends on `tensorflow-metadata>=1.8,<1.9`.
- N/A
- N/A
- N/A
- Depends on `apache-beam[gcp]>=2.36,<3`.
- Depends on `tensorflow-metadata>=1.7,<1.8`.
- Depends on `tensorflow>=1.15.5,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,<3`.
- Depends on `tensorflow-serving-api>=1.15,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,<3`.
- Added a TFXIO where the user defines the beam source.
- N/A
- N/A
- N/A
- Fixes a bug where `TensorsToRecordBatchConverter` could not handle `tf.RaggedTensor`s with uniform inner dimensions in TF 1.15.
- Depends on `apache-beam[gcp]>=2.35,<3`.
- Depends on `tensorflow-metadata>=1.6,<1.7`.
- Depends on `numpy>=1.16,<2`.
- Depends on `absl-py>=0.9,<2.0.0`.
- Depends on `tensorflow>=1.15.5,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,<3`.
- N/A
- N/A
- `TensorsToRecordBatchConverter` can now handle `tf.RaggedTensor`s with uniform inner dimensions.
- Depends on `apache-beam[gcp]>=2.34,<3`.
- Depends on `tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,<3`.
- Depends on `tensorflow-serving-api>=1.15,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,<3`.
- Depends on `tensorflow-metadata>=1.5,<1.6`.
- N/A
- N/A
- Introduces `RecordBatchToExamplesEncoder` that supports encoding nested `pyarrow.large_list()`s representing `tf.RaggedTensor`s.
- Register s2t ops before loading decoder in record_to_tensor_tfxio if struct2tensor is installed.
- Depends on `pyarrow>=1,<6`.
- Depends on `tensorflow-metadata>=1.4,<1.5`.
- N/A
- Deprecated python 3.6 support.
- N/A
- `QuantilesSketch` now ignores NaNs in input values and weights. Previously, NaNs would lead to incorrect quantiles calculation.
- Fixes a bug where `MisraGriesSketch` would discard an excessive number of elements during `AddValues` and `Compress` and output fewer elements than requested.
- Depends on `tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,<3`.
- Depends on `tensorflow-serving-api>=1.15,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,<3`.
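
To illustrate why skipping NaNs matters for quantiles, here is a small pure-Python sketch of exact weighted quantile boundaries that drops any pair whose value or weight is NaN before sorting. It only demonstrates the idea; `QuantilesSketch` is an approximate, mergeable sketch implemented differently, and the name `weighted_quantile_boundaries` is hypothetical.

```python
import math
from typing import List, Sequence


def weighted_quantile_boundaries(
    values: Sequence[float], weights: Sequence[float], num_quantiles: int
) -> List[float]:
    """Returns num_quantiles + 1 boundaries, ignoring NaN values or weights."""
    # NaN compares false with everything, so sorting it gives an arbitrary order;
    # filtering first keeps the sort (and the cumulative weights) well defined.
    pairs = sorted(
        (v, w)
        for v, w in zip(values, weights)
        if not (math.isnan(v) or math.isnan(w))
    )
    if not pairs:
        return []
    total = sum(w for _, w in pairs)
    boundaries = [pairs[0][0]]
    cumulative = 0.0
    next_bin = 1
    for value, weight in pairs:
        cumulative += weight
        # Emit a boundary each time the cumulative weight reaches k/num_quantiles.
        while (next_bin <= num_quantiles
               and cumulative >= total * next_bin / num_quantiles):
            boundaries.append(value)
            next_bin += 1
    # Guard against floating-point rounding leaving the last bins unfilled.
    while len(boundaries) < num_quantiles + 1:
        boundaries.append(pairs[-1][0])
    return boundaries
```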
- N/A
- N/A
- Added support for converting `tf.compat.v1.ragged.RaggedTensorValue`s to `TensorsToRecordBatchConverter`.
- Depends on `apache-beam[gcp]>=2.31,<3`.
- Depends on `tensorflow-metadata>=1.2,<1.3`.
- N/A
- N/A
- N/A
- N/A
- Depends on `google-cloud-bigquery>=1.28.0,<2.21`.
- N/A
- N/A
- Provided the SQL query ability for Apache Arrow RecordBatch. It's not available under Windows.
- Depends on `protobuf>=3.13,<4`.
- Upgraded the protobuf (com_google_protobuf) to `3.13.0`.
- Upgraded the bazel_skylib to `1.0.2` due to the upgrading of protobuf.
- Depends on `tensorflow-metadata>=1.1,<1.2`.
- More documentation is added for the SequenceExample decoder. It's available at `tfx_bsl/coders/README.md`.
- The minimum required OS version for the macOS is 10.14 now.
- N/A
- N/A
- Depends on `apache-beam[gcp]>=2.29,<3`.
- Depends on `tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,<3`.
- Depends on `tensorflow-serving-api>=1.15,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,<3`.
- Depends on `tensorflow-metadata>=1.0,<1.1`.
- N/A
- N/A
- Misra-Gries sketch: added support for replacing large string blobs with a configurable placeholder, and replacing invalid utf-8 sequences with a configurable placeholder.
- Depends on `tensorflow-metadata>=0.30,<0.31`.
- Removed `tfx_bsl.beam.shared`. It is now available in Apache Beam. Use `apache_beam.utils.shared` instead.
- N/A
- Add RawRecordTensorFlowDataset interface to record based tfxios.
- TensorToArrowConverter now can handle generic SparseTensors (>=3-d).
- Added `RecordToTensorTFXIO.DecodeFunction()` to get the decoder as a TF function.
- Depends on `absl-py>=0.9,<0.13`.
- Depends on `tensorflow-metadata>=0.29,<0.30`.
- Bumped the minimum bazel version required to build `tfx_bsl` to 3.7.2.
- N/A
- N/A
- N/A
- Depends on `apache-beam[gcp]>=2.28,<3`.
- N/A
- N/A
- RunInference can now be applied on serialized tf.train.{Example, SequenceExample} for all methods as well as any other kind of serialized structure for the Predict method.
- RunInference can now operate on PCollection[K, V] in a key-forwarding mode (whereby the key is left unchanged while inference is performed on the value).
- RunInference is now more performant.
- Depends on `numpy>=1.16,<1.20`.
- Depends on `tensorflow-metadata>=0.28,<0.29`.
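
The key-forwarding mode described above can be pictured with a small pure-Python sketch: values are batched and passed to a prediction function, while each key is carried through unchanged and re-attached to its prediction. This is only an illustration of the idea, not how RunInference is implemented; `run_keyed_inference`, `predict_batch` and `batch_size` are hypothetical names.

```python
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

K = TypeVar("K")
V = TypeVar("V")
R = TypeVar("R")


def _flush(
    batch: List[Tuple[K, V]], predict_batch: Callable[[List[V]], List[R]]
) -> Iterator[Tuple[K, R]]:
    """Runs the model on the values only and re-attaches the original keys."""
    keys = [key for key, _ in batch]
    predictions = predict_batch([value for _, value in batch])
    return zip(keys, predictions)


def run_keyed_inference(
    keyed_examples: Iterable[Tuple[K, V]],
    predict_batch: Callable[[List[V]], List[R]],
    batch_size: int = 2,
) -> Iterator[Tuple[K, R]]:
    """Yields (key, prediction) pairs; keys pass through untouched."""
    batch: List[Tuple[K, V]] = []
    for pair in keyed_examples:
        batch.append(pair)
        if len(batch) == batch_size:
            yield from _flush(batch, predict_batch)
            batch = []
    if batch:  # Flush the final, possibly smaller, batch.
        yield from _flush(batch, predict_batch)
```

Keeping the key outside the model call lets callers join predictions back to their source records without the model having to know about identifiers.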
- N/A
- N/A
- This is a bug fix only version, which modified the dependencies.
- N/A
- Fix in the `tensorflow-serving-api` version constraint.
- N/A
- N/A
- `tfx_bsl.public.tfxio.TFGraphRecordDecoder` is now a public API.
- Depends on `apache-beam[gcp]>=2.27,<3`.
- Depends on `pyarrow>=1,<3`.
- Depends on `tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,<3`.
- Depends on `tensorflow-metadata>=0.27,<0.28`.
- Depends on `tensorflow-serving>=1.15,!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,<3`.
- N/A
- N/A
- This is a bug fix only version, which modified the dependencies.
- N/A
- Depends on `apache-beam[gcp]>=2.25,!=2.26.*,<3`.
- Depends on `tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,!=2.4.*,<3`.
- Depends on `tensorflow-serving>=1.15,!=2.0.*,!=2.1.*,!=2.2.*,!=2.4.*,<3`.
- N/A
- N/A
- `.TensorFlowDataset` interface is available in RawTfRecord TFXIO.
- Fix TFExampleRecord TFXIO's TensorFlowDataset output keys to match the tensor representation's tensor name (previously this assumed the user provided a tensor name that is the same as the feature name).
- Add utility in tensor_representation_util.py to get source columns from a tensor representation.
- Depends on `tensorflow-metadata>=0.26,<0.27`.
- N/A
- N/A
- Add `RecordBatches` interface to TFXIO. This interface returns an iterable of record batches, which can be used outside of Apache Beam or TensorFlow to access data.
- From this release TFX-BSL will also be hosting nightly packages on https://pypi-nightly.tensorflow.org. To install the nightly package use the following command: `pip install --extra-index-url https://pypi-nightly.tensorflow.org/simple tfx-bsl`. Note: These nightly packages are unstable and breakages are likely to happen. The fix could often take a week or more depending on the complexity involved for the wheels to be available on the PyPI cloud service. You can always use the stable version of TFX-BSL available on PyPI by running the command `pip install tfx-bsl`.
- TensorToArrow returns LargeListArray/LargeBinaryArray in place of ListArray/BinaryArray.
- array_util.IndexIn now supports LargeBinaryArray inputs.
- Depends on `apache-beam[gcp]>=2.25,<3`.
- Depends on `tensorflow-metadata>=0.25,<0.26`.
- Coders (Example, CSV) do not support outputting ListArray/BinaryArray any more. They can only output LargeListArray/LargeBinaryArray.
- N/A
- N/A
- Depends on `apache-beam[gcp]>=2.24,<3`.
- N/A
- N/A
- You can now build `tfx_bsl` wheel with `python setup.py bdist_wheel`. Note:
  - If you want to build a manylinux2010 wheel you'll still need to use Docker.
  - Bazel is still required.
- You can now build manylinux2010 `tfx_bsl` wheel for Python 3.8.
- From this version we will be releasing python 3.8 wheels.
- Stopped depending on `six`.
- Depends on `absl-py>=0.9,<0.11`.
- Depends on `pandas>=1.0,<2`.
- Depends on `protobuf>=3.9.2,<4`.
- Depends on `tensorflow-metadata>=0.24,<0.25`.
- N/A
- Deprecated py3.5 support.
- Several TFXIO symbols are made public, which means:
- TFX users (both pipeline and component authors), and TFX libraries (TFDV, TFMA, TFT) users may start using these symbols.
- We will be subject to semantic versioning once tfx_bsl goes beyond 1.0.
- TFRecord based TFXIO implementations now support reading from multiple file patterns.
- Implemented the TensorFlowDataset() interface for TFExampleRecord TFXIO.
- Starting from this version, `tfx_bsl` has no binary dependency on `pyarrow` (libarrow.so). As a result:
  - Package `tfx_bsl` will be able to work with a wider range of pyarrow versions. We will relax the version requirements in setup.py in the next release.
  - Custom built `tfx_bsl` does not have to maintain ABI compatibility with a specific `pyarrow` installation. Custom builds don't need to be manylinux-conformant.
- Starting from this version, the windows wheel will be built with VS 2015.
- `run_all_tests` will fail with exit code -2 if no tests are discovered.
- Stopped requiring `avro-python3`.
- Example coders will ignore duplicate feature names in the TFMD schema (only the first one counts). It is a temporary measure until TFDV can check and prevent duplications. DO NOT rely on this behavior.
- CsvTFXIO now allows skipping CSV headers (set `skip_header_lines`).
- CsvTFXIO now requires `telemetry_descriptors` to construct.
- Depends on `apache-beam[gcp]>=2.23,<3`.
- Depends on `pyarrow>=0.17,<0.18`.
- Depends on `tensorflow>=1.15.2,!=2.0.*,!=2.1.*,!=2.2.*,<3`.
- Depends on `tensorflow-metadata>=0.23,<0.24`.
- Depends on `tensorflow-serving-api>=1.15,!=2.0.*,!=2.1.*,!=2.2.*,<3`.
- N/A
- Dropped Python 2.x support.
- Note: We plan to remove Python 3.5 support after this release.
- Added SequenceExamplesToRecordBatchDecoder.
- Added a TFXIO implementation for SequenceExamples on TFRecord.
- Added support for TensorAdapter to output tf.RaggedTensors.
- Improved performance of tf.Example and tf.SequenceExample coders.
- Depends on `pandas>=0.24,<2`.
- Depends on `tensorflow>=1.15,!=2.0.*,<3`.
- Depends on `tensorflow-metadata>=0.22.2,<0.23`.
- Removed tensor_to_arrow_test for TF 1.x as it does not support TF 1.x.
- Removed `arrow.table_util.SliceTableByRowIndices` (in favor of `RecordBatchTake`)
- Removed `arrow.table_util.MergeTables` (in favor of `MergeRecordBatches`)
- Moved RunInference API and related protos to tfx_bsl/public directory.
- CSV coder support for multivalent columns.
- tf.Example coder support for producing large types (LargeList, LargeBinary).
- Added TFXIO for CSV
- Depends on `apache-beam[gcp]>=2.20,<3`.
- Depends on `pyarrow>=0.16,<0.17`
- Depends on `tensorflow-metadata>=0.22,<0.23`
- Renamed ModelEndpointSpec to AIPlatformPredictionModelSpec to specify remote model endpoint on Google Cloud Platform.
- Renamed InferenceEndpoint to InferenceSpecType.
- Added a tfxio.telemetry.ProfileRecordBatches, a PTransform to collect telemetry from Arrow RecordBatches.
- Added remote model inference on Google Cloud Platform.
- Added `arrow.table_util.MergeRecordBatches`: similar to `MergeTables` but operates against `pa.RecordBatch`es.
- Added `arrow.table_util.RecordBatchTake`: similar to `SliceTableByRowIndices` but operates against a `pa.RecordBatch`.
- Requires `apache-beam>=2.17,<3`
- Only requires `avro-python3>=1.8.1,!=1.9.2.*,<2.0.0` on Python 3.5 + MacOS
- Requires `google-api-python-client>=1.7.11,<2`
- Requires `apache-beam>=2.17,<2.18`
- Fixed a bug in tfx_bsl.arrow.array_util.GetFlattenedArrayParentIndices that could cause memory corruption.
- Defined an abstract subclass of `TFXIO`, `RecordBasedTFXIO` to model record based file formats.
- Utilities in `tfx_bsl.arrow.array_util` that:
  - previously took `ListArray` can now also accept `LargeListArray`.
  - previously took StringArray/BinaryArray can now also accept LargeStringArray and LargeBinaryArray.

  As a result:
  - `GetElementLengths` now returns an `Int64Array`.
  - `GetFlattenedArrayParentIndices` may return an `Int64Array` or an `Int32Array` depending on the input type.
- Introduced TFXIO, the interface for Standardized TFX Inputs
- Added the first implementation of TFXIO, for tf.Example on TFRecords.
- Added a test_util sub-package that contains a tool to discover and run all the absltests in a dir (like python's unittest discovery).
- Requires `apache-beam>=2.17,<3`
- Requires `pyarrow>=0.15,<0.16`
- Requires `tensorflow>=1.15,<3`
- Requires `tensorflow-metadata>=0.21,<0.22`.
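
For readers unfamiliar with the two array utilities named above, the sketch below shows what "element lengths" and "flattened parent indices" mean for a list array, using plain Python lists in place of Arrow arrays. The function names are hypothetical, and treating a null row as length 0 is an assumption of this sketch rather than documented behavior of `tfx_bsl.arrow.array_util`.

```python
from typing import List, Optional, Sequence


def element_lengths(list_array: Sequence[Optional[Sequence[object]]]) -> List[int]:
    """Length of each row; a null (None) row is counted as 0 in this sketch."""
    return [0 if row is None else len(row) for row in list_array]


def flattened_parent_indices(
    list_array: Sequence[Optional[Sequence[object]]],
) -> List[int]:
    """For every value in the flattened array, the index of the row it came from."""
    parent_indices: List[int] = []
    for row_index, row in enumerate(list_array):
        if row:  # Skips both null rows and empty rows.
            parent_indices.extend([row_index] * len(row))
    return parent_indices
```

A large list array can hold more values than a 32-bit offset can address, which is consistent with the note above that results may be 64-bit integers.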
- Requires `apache-beam>=2.16,<2.17` as 2.17 requires a pyarrow version that we don't support yet.
- Behavior of csv_decoder.ColumnTypeInferrer was changed. A new column type, `ColumnType.UNKNOWN` was added to denote that the inferrer could not determine the type of that column (instead of making a guess of FLOAT). Summary of behavior change (values in the examples are from the same column):
  - `<int>, <empty>`: before: `FLOAT`; after: `INT`
  - `<empty>, ... , <empty>`: before: `FLOAT`; after: `UNKNOWN`
- Added a (beam) utility to infer column types from a `PCollection[CSVLine]`.
- Added a utility to parse a CSVLine into cells (conforming to RFC4180).
- Added dependency on `tensorflow>=1.15,<2.2`. Starting from 1.15, package `tensorflow` comes with GPU support. Users won't need to choose between `tensorflow` and `tensorflow-gpu`.
  - Caveat: `tensorflow` 2.0.0 is an exception and does not have GPU support. If `tensorflow-gpu` 2.0.0 is installed before installing `tfx-bsl`, it will be replaced with `tensorflow` 2.0.0. Re-install `tensorflow-gpu` 2.0.0 if needed.
- Added dependency on `tensorflow-serving-api>=1.15,<3`.
- Added a python PTransform, `tfx_bsl.beam.RunInference` that enables batch inference.
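
The column type inference change described above can be sketched in a few lines of plain Python: each line is split into cells with the standard `csv` module, empty cells contribute `UNKNOWN`, and a column's type is the widest type seen in it. This is an illustration under assumptions, not the `csv_decoder` code: the enum values, `parse_csv_line`, `infer_cell_type` and `infer_column_types` are hypothetical, and Python's `int()`/`float()` accept some spellings (such as `1_000` or `nan`) that a real inferrer may treat differently.

```python
import csv
import enum
from typing import Iterable, List


class ColumnType(enum.IntEnum):
    """Ordered so that max() picks the widest type (values are illustrative)."""
    UNKNOWN = -1
    INT = 0
    FLOAT = 1
    STRING = 2


def parse_csv_line(line: str) -> List[str]:
    """Splits one CSV line into cells, honoring quoted fields."""
    return next(csv.reader([line]))


def infer_cell_type(cell: str) -> ColumnType:
    if cell == "":
        return ColumnType.UNKNOWN  # An empty cell carries no type information.
    try:
        int(cell)
        return ColumnType.INT
    except ValueError:
        pass
    try:
        float(cell)
        return ColumnType.FLOAT
    except ValueError:
        return ColumnType.STRING


def infer_column_types(lines: Iterable[str]) -> List[ColumnType]:
    column_types: List[ColumnType] = []
    for line in lines:
        for index, cell in enumerate(parse_csv_line(line)):
            if index == len(column_types):
                column_types.append(ColumnType.UNKNOWN)
            column_types[index] = max(column_types[index], infer_cell_type(cell))
    return column_types
```

With this ordering, a column holding an integer and an empty cell stays `INT`, and an all-empty column stays `UNKNOWN`, matching the "after" behavior in the summary above.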
- Added a tf.Example <-> Arrow coder.
- Added a tf.Example -> `Dict[str, np.ndarray]` coder (this is a legacy format used by some TFX components).
- Added some common Arrow utilities (`tfx_bsl.arrow.array_util`).
- Added a python class, `tfx_bsl.beam.Shared` that helps sharing a single instance of object across multiple threads.
- Added dependency on `apache-beam[gcp]>=2.16,<3`.
- Added dependency on `tensorflow-metadata>=0.15,<0.16`.
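
As a rough picture of what "sharing a single instance of object across multiple threads" involves, the sketch below guards construction with a lock and holds the object through a weak reference so it can be released once no caller uses it. It is an assumption-laden illustration, not the `tfx_bsl.beam.Shared` source; the class name `SharedHandle` and its `acquire` method are hypothetical.

```python
import threading
import weakref
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class SharedHandle:
    """Hands out one live instance per handle to all threads that ask for it."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._ref: Optional[weakref.ref] = None

    def acquire(self, constructor: Callable[[], T]) -> T:
        """Returns the shared object, building it once if it is not alive."""
        with self._lock:
            obj = self._ref() if self._ref is not None else None
            if obj is None:
                obj = constructor()
                # Only weak-referenceable objects work here (for example class
                # instances; plain list, dict or int objects do not).
                self._ref = weakref.ref(obj)
            return obj
```

A typical use would be loading an expensive resource such as a model once per process and letting every worker thread call `acquire` with the same loader function.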