Open
Labels
bug, good first issue
Description
Willingness to contribute
No. I cannot contribute a bug fix at this time.
Feathr version
0.9.0
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 20.0): Ubuntu 20.0
- Python version: 3.10
- Spark version, if reporting runtime issue: 3.2.x and 3.3.1
Describe the problem
The materialization job fails for some combinations of features, throwing the following error:
Caused by: java.lang.NullPointerException
at com.linkedin.feathr.common.types.protobuf.FeatureValueOuterClass$FeatureValue$Builder.setStringValue(FeatureValueOuterClass.java:1728)
at com.linkedin.feathr.offline.generation.outputProcessor.RedisOutputUtils$.$anonfun$getConversionFunction$4(RedisOutputUtils.scala:110)
at com.linkedin.feathr.offline.generation.outputProcessor.RedisOutputUtils$.$anonfun$encodeDataFrame$2(RedisOutputUtils.scala:51)
at com.linkedin.feathr.offline.generation.outputProcessor.RedisOutputUtils$.$anonfun$encodeDataFrame$2$adapted(RedisOutputUtils.scala:48)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
at scala.collection.immutable.Range.foreach(Range.scala:158)
at scala.collection.TraversableLike.map(TraversableLike.scala:286)
at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
at scala.collection.AbstractTraversable.map(Traversable.scala:108)
at com.linkedin.feathr.offline.generation.outputProcessor.RedisOutputUtils$.$anonfun$encodeDataFrame$1(RedisOutputUtils.scala:48)
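The top frame is a protobuf builder's setStringValue, and generated Java protobuf setters throw a NullPointerException when passed null, so a null STRING feature value seems to be reaching the Redis encoder. One possible (unconfirmed) way that could happen is that combining two features over different key sets leaves the string feature empty for some keys. The sketch below only models that hypothesis in plain Python; the data, `full_outer_join`, `set_string_value`, and `encode_row` are all hypothetical stand-ins, not Feathr code, and the sketch does not explain why other combinations succeed.

```python
from typing import Dict, Optional, Tuple

# Hypothetical per-feature results keyed by account_id (illustrative data only).
account_country: Dict[str, str] = {"a1": "US", "a2": "DE"}
avg_transaction_amount: Dict[str, float] = {"a2": 12.5, "a3": 40.0}


def full_outer_join(
    left: Dict[str, str], right: Dict[str, float]
) -> Dict[str, Tuple[Optional[str], Optional[float]]]:
    """Combine two keyed feature maps; a key missing on one side yields None."""
    keys = sorted(set(left) | set(right))
    return {k: (left.get(k), right.get(k)) for k in keys}


def set_string_value(value: Optional[str]) -> dict:
    """Stand-in for a strict string setter that rejects null values."""
    if value is None:
        raise ValueError("string value must not be None")
    return {"string_value": value}


def encode_row(country: Optional[str], amount: Optional[float]) -> dict:
    """Null-safe variant (an assumption, not the actual fix): skip missing fields."""
    encoded = {}
    if country is not None:
        encoded["account_country"] = set_string_value(country)
    if amount is not None:
        encoded["avg_transaction_amount"] = {"float_value": float(amount)}
    return encoded


joined = full_outer_join(account_country, avg_transaction_amount)

# Keys whose string feature is missing would trip the strict setter.
failing_keys = []
for key, (country, _amount) in joined.items():
    try:
        set_string_value(country)
    except ValueError:
        failing_keys.append(key)

# The null-safe encoder handles every joined row without raising.
encoded_rows = {k: encode_row(c, a) for k, (c, a) in joined.items()}
```

If this hypothesis holds, each feature materialized on its own would never see a missing value, which would be consistent with the single-feature runs succeeding.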
Tracking information
No response
Code to reproduce bug
# anchored feature
Feature(
    name="account_country",
    key=account_id,
    feature_type=STRING,
    transform="accountCountry",
),
...
# average amount of transaction in that week
avg_transaction_amount = Feature(
    name="avg_transaction_amount",
    key=account_id,
    feature_type=FLOAT,
    transform=WindowAggTransformation(
        agg_expr="cast_float(transactionAmount)", agg_func="AVG", window="7d"
    ),
)
...
client.materialize_features(
    MaterializationSettings(
        ACCOUNT_FEATURE_TABLE_NAME,
        backfill_time=backfill_time,
        sinks=[RedisSink(table_name=ACCOUNT_FEATURE_TABLE_NAME)],
        feature_names=["account_country", "avg_transaction_amount"],
    ),
    allow_materialize_non_agg_feature=True,
)
Materializing with feature_names=["account_country"] alone, with feature_names=["avg_transaction_amount"] alone, or with other combinations such as ["account_country", "num_transaction_count_in_day"] works without errors.
Only the combination ["account_country", "avg_transaction_amount"] fails.
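Since each feature materializes successfully on its own, one possible stopgap is to submit one job per feature. The helper below is a minimal sketch of that idea; `materialize_one_by_one` and the `submit` callable are hypothetical names, and whether separate jobs writing to the same Redis table merge cleanly has not been verified here.

```python
from typing import Callable, Iterable, List


def materialize_one_by_one(
    submit: Callable[[List[str]], None], feature_names: Iterable[str]
) -> List[List[str]]:
    """Call `submit` once per feature and return the batches that were submitted.

    `submit` is a hypothetical wrapper around the materialize call from the
    report, e.g. a function that builds MaterializationSettings with
    feature_names=batch and passes it to client.materialize_features.
    """
    submitted: List[List[str]] = []
    for name in feature_names:
        batch = [name]  # one feature per job, the case reported to work
        submit(batch)
        submitted.append(batch)
    return submitted


# Dry run with a recording stub in place of a real client call.
recorded: List[List[str]] = []
batches = materialize_one_by_one(
    recorded.append, ["account_country", "avg_transaction_amount"]
)
```

This only sidesteps the failure; it does not address the NullPointerException in the encoder.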
What component(s) does this bug affect?
- Python Client: This is the client users use to interact with most of our API. Mostly written in Python.
- Computation Engine: The computation engine that executes the actual feature join and generation work. Mostly in Scala and Spark.
- Feature Registry API: The frontend API layer supports SQL and Purview (Atlas) as storage. The API layer is in Python (FastAPI).
- Feature Registry Web UI: The Web UI for feature registry. Written in React.