@hangfei hangfei commented May 21, 2022

Add an API to materialize features to offline storage.
Syntax:

client = FeathrClient()
offline_sink = OfflineSink(output_path="abfss://[email protected]/materialize_offline_test_data/")
# Materialize two features into an offline store.
settings = MaterializationSettings("nycTaxiMaterializationJob",
                                   sinks=[offline_sink],
                                   feature_names=["f_location_avg_fare", "f_location_max_fare"])
client.materialize_features(settings)

This generates the features for the latest date (assuming it is 2022/05/21) and writes the data to the following path: abfss://[email protected]/materialize_offline_test_data/df0/daily/2022/05/21
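As a rough illustration of the path layout described above, the sketch below builds the daily output directory for a given date. The helper name daily_output_path is hypothetical and not part of Feathr, the df0/daily/yyyy/MM/dd layout is inferred only from the example path in this description, and the base path used here is a local placeholder rather than a real storage account.

```python
from datetime import date


def daily_output_path(base_path: str, day: date) -> str:
    """Hypothetical helper (not a Feathr API).

    Mirrors the <output_path>/df0/daily/yyyy/MM/dd layout seen in the
    example path; the "df0/daily" segment is an assumption taken from it.
    """
    return f"{base_path.rstrip('/')}/df0/daily/{day:%Y/%m/%d}"


# Placeholder base path, used only for illustration.
print(daily_output_path("/tmp/materialize_offline_test_data/", date(2022, 5, 21)))
```

Stripping the trailing slash first means the result is the same whether or not output_path ends with "/".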

You can also specify a BackfillTime so that the features are generated for each date in the given range. For example:

from datetime import datetime, timedelta

# Backfill a single day (start and end are the same date), stepping one day at a time.
backfill_time = BackfillTime(start=datetime(2020, 5, 20),
                             end=datetime(2020, 5, 20),
                             step=timedelta(days=1))
offline_sink = OfflineSink(output_path="abfss://[email protected]/materialize_offline_test_data/")
settings = MaterializationSettings("nycTaxiTable",
                                   sinks=[offline_sink],
                                   feature_names=["f_location_avg_fare", "f_location_max_fare"],
                                   backfill_time=backfill_time)
client.materialize_features(settings)
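To make the backfill range more concrete, here is a minimal sketch of which dates a start/end/step triple would cover. It is not Feathr's implementation: the function name backfill_dates is hypothetical, and treating end as inclusive is an assumption based on the example above, where start and end are the same day.

```python
from datetime import datetime, timedelta
from typing import List


def backfill_dates(start: datetime, end: datetime, step: timedelta) -> List[datetime]:
    """Hypothetical helper (not a Feathr API).

    Lists every point from start to end, assuming end is inclusive,
    advancing by step each time.
    """
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    dates = []
    current = start
    while current <= end:
        dates.append(current)
        current += step
    return dates


# Same values as the BackfillTime example: a single day.
print(backfill_dates(datetime(2020, 5, 20), datetime(2020, 5, 20), timedelta(days=1)))
```

Under that assumption, a range whose start and end are equal yields exactly one date, which would match a single daily output directory.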

Output sample:

image

@hangfei hangfei requested review from jaymo001 and xiaoyongzhu May 21, 2022 18:10
@hangfei hangfei merged commit 0a44f5e into main May 25, 2022
@hangfei hangfei deleted the mat_hangfei branch May 25, 2022 21:46
bozhonghu pushed a commit that referenced this pull request Jun 1, 2022
* main: (30 commits)
  Yihui/moderate registration conflict (#304)
  Update homepage (#310)
  Add extensible extractor APIs (#302)
  Remove Java and JS from Code Scanning
  Create codeql-analysis.yml
  [feathr] Add API to materialize features to offline store (#294)
  Improve error message when path is not supported (#257)
  Add tech talk slides for Feathr (#296)
  Update README.md
  Add milestone link (#286)
  Fix millisecond timestamp handling (#288)
  Consolidating CI pipelines (#280)
  Fixed dependecy problem of pretty print utils (#273)
  Fixing a broken link in README.md (#277)
  Fix test failure (#276)
  Added feature validation (#258)
  Feathr UI: Display feature key and transform expression in feature detail pages (#262)
  Feathr UI: enable multiple tenant auth (#266)
  Reduce feathr web api docker image build time (#261)
  Pretty-print the features produced by buildFeatures  (#214)
  ...