Patrick Coffey Mon 12 Jun 2017
Hi Everyone,
I have a specific question about real-time:
The curr tag is defined as: "Marker tag which indicates the point has capability for subscription to its real-time, current value."
The read Op documentation (http://project-haystack.org/doc/Ops#read) shows currVal with data in the response of its 3rd example. I see this as a handy feature, as clients can get access to real-time data even though they have not used watchSub.
There is a scenario we've come across where you may not want to make this behaviour available to clients, e.g. huge systems, or slow/constrained communication networks. It would be impractical to constantly stream data for every point with a curr tag to the Haystack server on the off-chance that a client does a read Op to get real-time data. In these scenarios you instead make clients subscribe to the points using watchSub. A client performing a read Op will then get down or unknown as the currStatus in the result.
In this case, how are we to inform clients that real-time data is only available via watchSub?
I can only think of 3 options to address this:
1) We update the documentation to inform users that when performing a read Op, currVal will only ever be the last known value from when the point was last subscribed to by a client via watchSub.
2) We remove currStatus, currVal and currErr from the read Op response. Real-time data is only ever to be accessed by clients via watchSub/watchPoll.
3) We are more flexible and precise about real-time behaviour and add a new marker tag, e.g. currStream, which denotes that real-time values are available for query via read Op without first being solicited via a watchSub. In most server implementations the data would already be in server memory. Real-time points without currStream should then not have currVal in the read Op response.
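As a rough illustration of option 3, here is a minimal sketch in Python. It is not taken from any Haystack implementation: `currStream` is only the tag proposed above, points are modelled as plain dicts, and the `read_response` helper and `MARKER` stand-in are hypothetical names made up for this example. It also assumes currStatus and currErr would be hidden along with currVal, which goes slightly beyond what option 3 states.

```python
MARKER = "M"  # stand-in for a Haystack marker value in this sketch

# Assumption: all three real-time tags are hidden together.
CURR_VALUE_TAGS = ("currVal", "currStatus", "currErr")


def read_response(point: dict) -> dict:
    """Return the tags a read op would expose for one point record."""
    if "currStream" in point:
        # The server keeps this point's value fresh, so expose it as-is.
        return dict(point)
    # Otherwise hide the real-time tags; clients must use watchSub instead.
    return {k: v for k, v in point.items() if k not in CURR_VALUE_TAGS}


streamed = {"id": "@a", "point": MARKER, "curr": MARKER, "currStream": MARKER,
            "currVal": 21.5, "currStatus": "ok"}
watch_only = {"id": "@b", "point": MARKER, "curr": MARKER,
              "currVal": 19.0, "currStatus": "ok"}

print(read_response(streamed))    # keeps currVal and currStatus
print(read_response(watch_only))  # curr marker stays, real-time tags dropped
```

With this shape a client can tell from the tags alone whether a read is enough or whether it has to subscribe: `curr` without `currStream` would mean "watchSub only".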
Is my initial interpretation of the read Op behaviour incorrect? We have been through the reference server implementation, and we could not arrive at a consensus.
Some clarity would be appreciated. :)
Brian Frank Mon 12 Jun 2017
The current design intention is option 1: you are not forcing a lower-level poll against source data, just reading the last value.
For SkySpark we only read source data when the point is watched, although we have other mechanisms to force a poll.
In Niagara, there was the concept of a "lease" which was essentially a 1min subscription/watch which forced reads of the source data.
But there are so many complexities associated with source data and its protocol (does it poll, does it support a subscription to COV mechanism, etc.).
So I'd say we update the documentation to be clearer and say it's outside the scope of Haystack how you force a read. (I think it would be very messy to standardize a clean, efficient mechanism, especially since we already have watchSub, which does that.)
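The option 1 behaviour described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not how SkySpark or Niagara actually work: the `PointServer` class, its method names, the injected `source` and `clock` callables, and the 60-second default lease (loosely echoing the 1min lease idea) are all hypothetical. The point it tries to show is that only watched points trigger source reads, while a read just returns whatever was cached last.

```python
import time
from typing import Callable, Dict


class PointServer:
    """Hypothetical server that polls source data only for watched points."""

    def __init__(self, source: Callable[[str], float],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._source = source  # stand-in for a protocol driver read
        self._clock = clock
        self._lease_expiry: Dict[str, float] = {}
        self._last: Dict[str, float] = {}

    def watch_sub(self, point_id: str, lease_secs: float = 60.0) -> None:
        # Subscribing starts (or renews) a lease on the point.
        self._lease_expiry[point_id] = self._clock() + lease_secs

    def poll_cycle(self) -> None:
        # Only points with an unexpired lease cause a source read.
        now = self._clock()
        for point_id, expiry in list(self._lease_expiry.items()):
            if expiry <= now:
                del self._lease_expiry[point_id]
            else:
                self._last[point_id] = self._source(point_id)

    def read(self, point_id: str) -> dict:
        # A read never touches the source; it reports the last cached value.
        # Simplification: a real server might flag stale values differently.
        if point_id in self._last:
            return {"id": point_id, "currVal": self._last[point_id],
                    "currStatus": "ok"}
        return {"id": point_id, "currStatus": "unknown"}


now = [0.0]   # fake clock so the example is deterministic
reads = []    # records every source read that actually happens


def fake_source(point_id: str) -> float:
    reads.append(point_id)
    return 20.0 + len(reads)


server = PointServer(fake_source, clock=lambda: now[0])
print(server.read("@p1"))  # never watched: no cached value
server.watch_sub("@p1")
server.poll_cycle()
print(server.read("@p1"))  # value cached while the point was watched
now[0] = 120.0             # the lease has expired by now
server.poll_cycle()
print(server.read("@p1"))  # still the last known value, no new source read
```

How a deployment forces a fresh read (lease, explicit poll, COV subscription) stays outside this sketch, in line with the suggestion that it is outside the scope of Haystack.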
Samuel Toh Mon 12 Jun 2017
Voting for option #2 because I think the read API should be limited to its intended scope. To me the read operation is mainly for:
1) looking up entities
2) getting a detailed description of those entities, e.g. it is a site at geo coordinates X,Y.
For option #1, if currVal is defined as the last known value of a point, then the read op has sort of become a mini hisRead, because it is returning what a hisRead with range=today would provide.
Option #2 is a good way forward because it means that each operation has its own unique purpose:
1) read op - just provides detailed information about entities
2) watchSub/watchPoll - for streaming real-time data
3) hisRead - for reading historical values.
Patrick Coffey Wed 14 Jun 2017
Hi Brian,
Thanks for the clarification. I understand Option 1 is the easiest path forward. Cleaning up the documentation would be a good step towards more uniform implementations. For what it's worth, WideSky reads source data only after a watchSub. But we're also considering the case where data is constantly being streamed via a message queue, e.g. MQTT/AMQP.
Brian Frank Thu 24 Aug 2017
Here is the changeset I've pushed to clarify the documentation around this issue
Patrick Coffey Sun 27 Aug 2017
Looks good to me. Thanks!