#428 HisRead {date} range clarification

Alex Afflick Thu 28 Jul 2016

Hi all,

We've come across an interesting implementation detail for the hisRead operation when defining "{date}" as the range.

e.g. If you have the following points:

2016-01-02T00:00:00 UTC,2.0 W
2016-01-02T12:00:00 UTC,3.0 W
2016-01-02T23:59:59 UTC,4.0 W
2016-01-03T00:00:00 UTC,5.0 W

As per the documentation:

"Ranges are exclusive of start timestamp and inclusive of end timestamp."

and

"Date based ranges are always inferred to be from midnight of starting date to midnight of the day after ending date using the timezone of the his entity being queried."

// request
ver:"2.0"
id,range
@somePt,"2016-01-02"

// response
ver:"2.0" id:@somePt hisStart:2016-01-02T00:00:00-00:00 UTC hisEnd:2016-01-03T00:00:00-00:00 UTC
ts,val
2016-01-02T12:00:00-00:00 UTC,3.0 W
2016-01-02T23:59:59-00:00 UTC,4.0 W
2016-01-03T00:00:00-00:00 UTC,5.0 W

As you can see, requesting the date 2016-01-02 includes the record at midnight of the next day (2016-01-03) and excludes the record at midnight of the requested day.
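To make the difference concrete, here is a minimal Python sketch that filters the four sample records under both conventions. It assumes the third sample falls at 23:59:59, one second before midnight; the function names are made up for illustration and are not part of any Haystack library.

```python
from datetime import datetime, timezone

UTC = timezone.utc

# The four sample records from the post (values in watts).
points = [
    (datetime(2016, 1, 2, 0, 0, 0, tzinfo=UTC), 2.0),
    (datetime(2016, 1, 2, 12, 0, 0, tzinfo=UTC), 3.0),
    (datetime(2016, 1, 2, 23, 59, 59, tzinfo=UTC), 4.0),  # assumed 23:59:59
    (datetime(2016, 1, 3, 0, 0, 0, tzinfo=UTC), 5.0),
]

# The date "2016-01-02" expands to midnight .. midnight of the next day.
start = datetime(2016, 1, 2, tzinfo=UTC)
end = datetime(2016, 1, 3, tzinfo=UTC)


def exclusive_start_inclusive_end(rows, start, end):
    """Convention quoted from the documentation: start < ts <= end."""
    return [(ts, val) for ts, val in rows if start < ts <= end]


def inclusive_start_exclusive_end(rows, start, end):
    """Convention argued for in this post: start <= ts < end."""
    return [(ts, val) for ts, val in rows if start <= ts < end]


documented = [v for _, v in exclusive_start_inclusive_end(points, start, end)]
intuitive = [v for _, v in inclusive_start_exclusive_end(points, start, end)]
print(documented)  # [3.0, 4.0, 5.0]
print(intuitive)   # [2.0, 3.0, 4.0]
```

The first list matches the response shown above; the second is what a reader asking for "2016-01-02" would probably expect.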

The debate we've been having is that this approach isn't very intuitive. Requesting records for a date should include the midnight of that date and not the one of the day after (inclusive start and exclusive end).

Can we please get some clarification on the intended functionality of the operation?

Brian Frank Fri 29 Jul 2016

That design was copied from oBIX which in turn was copied from how consumption data typically works. We spent a lot of time on this issue for SkySpark and decided that is a terrible design which might be ok for consumption data, but doesn't fit well with sampled interval data or change-of-value history data.

What we have switched over to for SkySpark is that ranges are always inclusive of the starting timestamp and exclusive of the ending timestamp. For charting and interpolation we typically also return the data values immediately before and after the range.

We have also gone even farther and specified that consumption history data should be represented using the starting timestamp of the interval to match how COV data would work. Not sure how relevant that is to what we standardize for Project Haystack because we haven't really gotten into the low-level details or rollups yet.

This is what I would propose:

We change the semantics for hisRead to be inclusive start and exclusive end
Consider how we might want to include timestamp before/after range if you need it for interpolation (not sure if this should be default, option, etc)
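As a rough illustration of the second point, the sketch below slices a sorted history with inclusive start and exclusive end, and optionally pads the result with the sample just before the range and the first sample at or after its end. The function name `his_slice` and the `pad` flag are hypothetical; nothing here reflects how SkySpark or any Haystack server actually implements this.

```python
from bisect import bisect_left
from datetime import datetime, timezone

UTC = timezone.utc


def his_slice(rows, start, end, pad=False):
    """Return rows with start <= ts < end from a list sorted by timestamp.

    With pad=True, also include the sample immediately before start and
    the first sample at or after end, which helps interpolation.
    """
    stamps = [ts for ts, _ in rows]
    lo = bisect_left(stamps, start)  # first index with ts >= start
    hi = bisect_left(stamps, end)    # first index with ts >= end
    if pad:
        lo = max(lo - 1, 0)
        hi = min(hi + 1, len(rows))
    return rows[lo:hi]


rows = [
    (datetime(2016, 1, 2, 0, 0, tzinfo=UTC), 2.0),
    (datetime(2016, 1, 2, 12, 0, tzinfo=UTC), 3.0),
    (datetime(2016, 1, 2, 23, 59, 59, tzinfo=UTC), 4.0),
    (datetime(2016, 1, 3, 0, 0, tzinfo=UTC), 5.0),
]
start = datetime(2016, 1, 2, 6, 0, tzinfo=UTC)
end = datetime(2016, 1, 2, 18, 0, tzinfo=UTC)

plain = [v for _, v in his_slice(rows, start, end)]
padded = [v for _, v in his_slice(rows, start, end, pad=True)]
print(plain)   # [3.0]
print(padded)  # [2.0, 3.0, 4.0]
```

Making the padding opt-in keeps the default result strictly inside the requested range, which is one possible answer to the "default or option" question.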

Stuart Longland Sat 30 Jul 2016

So, silly question… suppose I was trying to make a library that talked to Project Haystack servers.

How does the library detect what the convention is for a particular server so that it can translate what was asked for in the program to an instruction that will retrieve what was asked for?

Brian Frank Sun 31 Jul 2016

How does the library detect what the convention is for a particular server

What do you mean by convention? Do you mean whether the range is exclusive start and inclusive end, or the reverse? That would best be done via the about op version (we would say the new convention is "3.0" and the old convention is "2.0").
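A client library could act on that idea roughly as follows. This is only a sketch: the version-to-convention mapping is the one proposed in this reply, not a confirmed part of the standard, and `range_convention` and `trim_to_half_open` are hypothetical helper names. The version string is assumed to have already been read from the server's about response.

```python
from datetime import datetime, timezone

UTC = timezone.utc

# Assumed mapping, taken from the proposal in this reply:
# (start endpoint, end endpoint) for each reported version.
CONVENTIONS = {
    "2.0": ("exclusive", "inclusive"),
    "3.0": ("inclusive", "exclusive"),
}


def range_convention(haystack_version):
    """Look up the range convention for a server's reported version."""
    try:
        return CONVENTIONS[haystack_version]
    except KeyError:
        raise ValueError(f"unknown Haystack version: {haystack_version!r}")


def trim_to_half_open(rows, start, end):
    """Keep only rows with start <= ts < end, whatever the server returned."""
    return [(ts, val) for ts, val in rows if start <= ts < end]


# Rows as an old-convention server would return them for 2016-01-02.
rows = [
    (datetime(2016, 1, 2, 12, 0, tzinfo=UTC), 3.0),
    (datetime(2016, 1, 3, 0, 0, tzinfo=UTC), 5.0),
]
start = datetime(2016, 1, 2, tzinfo=UTC)
end = datetime(2016, 1, 3, tzinfo=UTC)

print(range_convention("2.0"))  # ('exclusive', 'inclusive')
trimmed = trim_to_half_open(rows, start, end)
print([v for _, v in trimmed])  # [3.0]
```

Trimming can drop the extra sample at the end of the range, but it cannot restore a sample at the start that an exclusive-start server left out; for that, a client would have to request a slightly earlier start and then trim.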

Stuart Longland Sun 31 Jul 2016

Yes, the range endpoint conventions; whether the start or end is inclusive or not.

Is it up to the client to see that a particular server is a "3.0" server and to adjust its parameters accordingly or will a server respond according to the old conventions when it receives a request in the "2.0" style?

More to the point; how does one signal to the server, "I only understand Haystack 2.0"? Which end is responsible for backward compatibility?

At the moment, pyhaystack is a Haystack 2.0 client, support for 3.0 has not yet been started. I'm trying to avoid applications built on pyhaystack getting broken by changes in the Haystack protocols.

We also have a couple of JavaScript Haystack clients, one which I wrote as part of a Tableau Web Data Connector that pulls data out of Project Haystack into Tableau, another is part of a web UI for displaying energy consumption.

The HTTP interface is for machines, not humans, and machines are dumb. Tagging model changes are relatively easy to accommodate. Fundamental things like subtleties in how arguments are interpreted or how data is formatted over-the-wire not so much.

Edit to clarify: I'll admit the newer convention of "start inclusive, end exclusive" is good in that it matches the conventions seen in languages like Python, JavaScript and PHP. But there is existing code already in production that assumes the old convention.

We want to avoid breaking that code where possible. Perhaps adding a couple of arguments that tell the server whether the start and end are inclusive or exclusive would address this (e.g. start="inclusive").

Alex Afflick Sun 31 Jul 2016

Thanks for the input Brian and Stuart!

Brian - It's great to hear that you guys have switched over. Your suggested approach is the one we prefer as well. I'm picturing Stuart's suggestion looking like this:

// ZINC request
ver:"3.0" start:"inclusive" end:"exclusive"
id,range
@someTemp,"2012-10-01"

These would be the default (optional) flags.

// REST request URI
/haystack/hisRead?id=@hisId&range=yesterday&start=inclusive&end=exclusive

Does anyone have any other tag suggestions for start, end, inclusive and exclusive?

Zach Mierzejewski Mon 2 Mar 2020

I know this post is 3 years old, but I don't see any of these options in the current Haystack 3.0 documentation. In fact, the Ops page specifically says start is exclusive and end is inclusive.

I agree with Brian; it is more logical for trend data to be: start-inclusive and end-exclusive, but ...

Is that now the standard? (Or were you guys just brainstorming?)
Is it still the oBIX start-exclusive, end-inclusive?

I understand that for backwards compatibility, we need to check the version and do some fiddling for 2.0. I'm asking what the current standard is in 3.0.

Will it be different from implementation to implementation, even when both say ver:"3.0"?
Will we need to figure it out for each vendor, e.g. SkySpark?

Brian Frank Tue 3 Mar 2020

At this point, I'd say it should just always be inclusive start and exclusive end. It's been that way in SkySpark since 2016 and I don't think it's come up again.

But I see now that the docs on this website never got updated. The docs also say that client range timezone must match the point's configured timezone, but I think most implementations now will do a conversion which makes things a lot more robust.

I fixed the docs to the following for next time we update website:

Ranges are inclusive of start timestamp and exclusive of end timestamp. The {date} and {dateTime} options must be correctly Zinc encoded. Date based ranges are always inferred to be from midnight of starting date to midnight of the day after ending date using the timezone of the his point being queried.

Clients should query the range using the configured timezone of the point. If a different timezone is specified in the range, servers must convert to the point's configured timezone before executing the query.
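The two rules in the updated text can be sketched in Python as follows. This is an illustration under assumptions, not server code: `date_range` and `to_point_tz` are hypothetical names, and a fixed UTC-5 offset stands in for the point's configured timezone so the example has no timezone-database dependency (a `zoneinfo.ZoneInfo` object could be passed instead).

```python
from datetime import date, datetime, time, timedelta, timezone


def date_range(start_date, end_date, point_tz):
    """Expand a date range to midnight of the start date through midnight
    of the day after the end date, in the point's timezone."""
    start = datetime.combine(start_date, time.min, tzinfo=point_tz)
    end = datetime.combine(end_date + timedelta(days=1), time.min, tzinfo=point_tz)
    return start, end


def to_point_tz(start, end, point_tz):
    """Convert a dateTime range given in another timezone to the point's."""
    return start.astimezone(point_tz), end.astimezone(point_tz)


# Assumption: the point is configured with a UTC-5 timezone.
point_tz = timezone(timedelta(hours=-5))

by_date = date_range(date(2016, 1, 2), date(2016, 1, 2), point_tz)

# The same span expressed by a client in UTC.
utc_start = datetime(2016, 1, 2, 5, 0, tzinfo=timezone.utc)
utc_end = datetime(2016, 1, 3, 5, 0, tzinfo=timezone.utc)
converted = to_point_tz(utc_start, utc_end, point_tz)

print(by_date[0].isoformat())    # 2016-01-02T00:00:00-05:00
print(converted[0].isoformat())  # 2016-01-02T00:00:00-05:00
print(converted == by_date)      # True
```

After conversion, the range would be applied with inclusive start and exclusive end, as the updated text specifies.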

Zach Mierzejewski Tue 3 Mar 2020

Thank you for the clarification and updated docs Brian!