Is it possible to have staggered datapoints when training on streaming data? #868
-
Hello, I'm pretty new to ML and streaming ML techniques. I was wondering whether this library supports staggered datapoints. For example, my model takes in 3 different features: price, quality, and product. However, this data does not always arrive together; it comes in at different times in the live stream. Would it be possible for one iteration of learning to only have values for 2 of the 3 usual features, with the third arriving alone in the next iteration? Would I need to set the missing features of each iteration to nulls? Would that still result in an accurate and functional model? Or should every feature be filled with a value for every iteration? Thanks in advance for the advice.
Replies: 1 comment 3 replies
-
Hello there. That's an interesting set of questions!

That's not entirely clear to me. If I understand correctly, you're asking whether a model can `learn_one` with a partial `x`. Models can definitely do that, but they will always assume each partial `x` is a separate observation. In other words, as an example, you can't do "partial gradient descent".

You can leave the features missing, thus creating sparse vectors. Or you can use a method from the `impute` module.

I would say that if you have staggered data, it makes sense to want to merge these partial observations together before updating the model. Does that make sense? We could add some utilities to help you merge staggered data streams, but to do that I would need some more details on what your streams look like, ideally with an example. Indeed, it's not clear to me whether this is something you should handle on your side, or whether River should own this merging mechanism.
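To illustrate the merging idea, here is a minimal, River-independent sketch of one way to merge staggered partial observations into complete ones before calling `learn_one`. It assumes each event carries a key identifying which observation it belongs to, and that the set of expected features is known up front; the names (`merge_staggered`, `EXPECTED`, the feature names) are illustrative, not part of River's API.

```python
# Hypothetical sketch: buffer partial observations per key and emit a
# merged feature dict once every expected feature has arrived.
EXPECTED = {"price", "quality", "product"}

def merge_staggered(events, expected=EXPECTED):
    """events: iterable of (key, partial_features) pairs.

    Yields (key, full_features) once all expected features for a key
    have been seen; partial updates for the same key are merged.
    """
    buffer = {}
    for key, partial in events:
        features = buffer.setdefault(key, {})
        features.update(partial)
        if expected <= features.keys():  # all expected features present
            yield key, buffer.pop(key)

events = [
    (1, {"price": 9.99}),
    (2, {"price": 4.50, "quality": 3}),
    (1, {"quality": 5, "product": "A"}),  # completes observation 1
    (2, {"product": "B"}),                # completes observation 2
]

for key, x in merge_staggered(events):
    print(key, x)
    # at this point you could call model.learn_one(x, y)
```

Each merged `x` is a plain dict, which is the form River models consume, so the completed observations can be fed straight to `learn_one`. Whether this buffering belongs in user code or in River itself is exactly the open question above.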