#938 Sub-proposal: Filter Inference Operator

Brian Frank Mon 19 Jul 2021

This is a sub-proposal of the broader 940 Haystack 4 proposal.

The current docHaystack::Relationships chapter includes a strawman proposal for how to query relationships. In this sub-proposal we tweak the syntax and tie up a few loose ends.

First let's examine the common use cases:

The overwhelming use case of relationship queries is looking up specific points for a piece of equipment. For example, a visualization or analytic on an AHU might look up the fan enable cmd like this:

discharge and fan and enable and cmd and equipRef==@ahu

I would venture to guess that this pattern is used for 90% of all Haystack queries. The problem is that it doesn't handle varying levels of detail in the model. Haystack 4 allows an instance model to be any of the following:

ahu / enable-point
ahu / discharge-duct / enable-point
ahu / discharge-fan-equip / enable-point
ahu / discharge-duct / discharge-fan-equip / enable-point

In the first case we have a simple instance model where the AHU models the fan enable command point as a direct child. In the last case we have a very detailed model where the duct and fan are modeled as first class entities. And we have valid models between the two. We require a filter syntax so that applications can correctly resolve the point on any of these four instance models.
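
To make the requirement concrete, here is a minimal Python sketch (not part of the proposal) that resolves the point on both the simplest and the most detailed model by walking containment upward. The record layout, the ids, the use of equipRef as the only containment link, and the assumption that the point itself carries the discharge and fan markers are all assumptions made for illustration.

```python
M = "marker"  # stand-in for a Haystack marker value

def contained_by(rec, target_id, by_id):
    """Return True if rec reaches target_id by following equipRef links."""
    seen = set()
    ref = rec.get("equipRef")
    while ref is not None and ref not in seen:
        if ref == target_id:
            return True
        seen.add(ref)  # guard against accidental reference cycles
        ref = by_id.get(ref, {}).get("equipRef")
    return False

def find_fan_enable(recs, ahu_id):
    """Ids of recs tagged discharge/fan/enable/cmd at any depth under ahu_id."""
    by_id = {r["id"]: r for r in recs}
    tags = ("discharge", "fan", "enable", "cmd")
    return [r["id"] for r in recs
            if all(t in r for t in tags) and contained_by(r, ahu_id, by_id)]

POINT = {"discharge": M, "fan": M, "enable": M, "cmd": M, "point": M}

# ahu / enable-point
simple = [
    {"id": "ahu", "ahu": M, "equip": M},
    {"id": "pt", **POINT, "equipRef": "ahu"},
]

# ahu / discharge-duct / discharge-fan-equip / enable-point
detailed = [
    {"id": "ahu", "ahu": M, "equip": M},
    {"id": "duct", "discharge": M, "duct": M, "equip": M, "equipRef": "ahu"},
    {"id": "fan", "discharge": M, "fan": M, "equip": M, "equipRef": "duct"},
    {"id": "pt", **POINT, "equipRef": "fan"},
]

print(find_fan_enable(simple, "ahu"))
print(find_fan_enable(detailed, "ahu"))
```

The same function finds the point in both models, which is the behavior a direct equipRef comparison cannot give.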

I propose a new operator to the filter syntax with the following patterns:

name? ^symbol
name? @ref
name? ^symbol @ref

For the most part the behavior is as described in the strawman proposal. However, the syntax is slightly modified so that the name and symbol aren't combined like a conjunct. With that syntax we can query for child points as follows:

// query from above using inference operator
discharge and fan and enable and cmd and containedBy? @ahu

// same thing but a little more explicit that we expect ahu
// to be an equip and not some other entity type like a space
discharge and fan and enable and cmd and containedBy? ^equip @ahu

// query all the points contained recursively by the equip
point and containedBy? ^equip @ahu

// query all entities which contain the given point
contains? @point

Note that per the existing proposal a filter engine would take into account if a relationship has transitive and/or reciprocalOf defined.
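
As a rough Python sketch of what "taking transitive and reciprocalOf into account" could look like, the example below stores only direct containedBy edges and answers contains? by evaluating the reciprocal in reverse. The RELS and DIRECT dict layouts, the ids, and the function names are hypothetical; a real filter engine would read this metadata from the defs.

```python
# Assumed relationship metadata, loosely mirroring the transitive and
# reciprocalOf tags; the dict layout is hypothetical.
RELS = {
    "containedBy": {"transitive": True, "reciprocalOf": "contains"},
    "contains": {"transitive": True, "reciprocalOf": "containedBy"},
}

# Direct containedBy edges (child -> parent), e.g. taken from equipRef tags.
DIRECT = {"containedBy": {"point": "fan", "fan": "duct", "duct": "ahu"}}

def related(rel, src, dst):
    """True if src has relationship rel to dst."""
    edges = DIRECT.get(rel)
    if edges is None:
        # No direct edges stored: evaluate the reciprocal in reverse.
        return related(RELS[rel]["reciprocalOf"], dst, src)
    cur, seen = edges.get(src), set()
    while cur is not None and cur not in seen:
        if cur == dst:
            return True
        if not RELS[rel]["transitive"]:
            return False  # non-transitive: only the direct edge counts
        seen.add(cur)
        cur = edges.get(cur)
    return False

def query(rel, ref, ids):
    """Ids whose relationship rel resolves to ref."""
    return [i for i in ids if related(rel, i, ref)]

ids = ["ahu", "duct", "fan", "point"]
containers = query("contains", "point", ids)   # contains? @point
contained = query("containedBy", "ahu", ids)   # containedBy? @ahu
print(containers, contained)
```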

Containment is by far the most important and most queried relationship. Next most important is flows including the flow of electricity, air, and fluids with thermal energy. We are capturing these relationships using a suite of tags such as elecRef, airRef, hotWaterRef, chilledWaterRef, etc.

Per the separate sub-proposal 937, we propose to rename the flow relationships as inputsFrom and outputsTo. That allows queries such as:

// query all equip which inputs hot water from the given plant 
// (all downstream equip)
equip and inputsFrom? ^hot-water @hwp

// query the AHU which outputs air to the given VAV
ahu and outputsTo? ^air @vav

Like containment, these relationships are transitive, so a query can traverse any number of levels of elecRef, airRef, etc.

The same behavior of the inference operator will also allow use of is? to query supertypes:

// query anything that subtypes from airTerminalUnit
is? ^airTerminalUnit
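
One way to picture the is? check is as a walk up the supertype chain of each tag on a record. The Python sketch below assumes a simplified, hypothetical slice of the taxonomy (only airTerminalUnit comes from this post) and made-up record ids.

```python
M = "marker"  # stand-in for a Haystack marker value

# Hypothetical, simplified taxonomy: def name -> direct supertypes ("is" tag).
DEFS = {
    "equip": [],
    "airTerminalUnit": ["equip"],
    "vav": ["airTerminalUnit"],
    "cav": ["airTerminalUnit"],
    "ahu": ["equip"],
}

def fits(name, base):
    """True if def `name` is `base` or inherits from it via the is chain."""
    stack, seen = [name], set()
    while stack:
        cur = stack.pop()
        if cur == base:
            return True
        if cur not in seen:
            seen.add(cur)
            stack.extend(DEFS.get(cur, []))
    return False

def is_query(recs, base):
    """Sketch of `is? ^base`: keep recs with any tag that fits base."""
    return [r["id"] for r in recs
            if any(fits(tag, base) for tag in r if tag != "id")]

recs = [
    {"id": "vav-1", "vav": M, "equip": M},
    {"id": "cav-1", "cav": M, "equip": M},
    {"id": "ahu-1", "ahu": M, "equip": M},
]
print(is_query(recs, "airTerminalUnit"))
```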

Gareth David Johnson Thu 12 Aug 2021

The strawman conjunct style syntax was confusing. I like the changes proposed here.

This enhancement to filters is badly needed so people can make effective use of Haystack's ontology.

annie dehghani Wed 18 Aug 2021

I really like the proposed improvements to the containment and querying syntax. This will be super helpful for us and our analytics.

Also, the ability to query subtypes using the is? filter is great.

A question though about the inputsFrom/outputsTo...

How does that work if you've modelled references for each component? If you've modelled each component in the system with an upstream reference, would all equipment be returned by the query?

For a really simple example, say you have:

@boiler, hotWaterRef: @ahuHWvalve
@pump, hotWaterRef: @boiler
@ahuHWvalve, hotWaterRef: @pump

What would equip and inputsFrom? ^hot-water @boiler return in this case? Just @pump or @pump and @ahuHWvalve?

Brian Frank Wed 18 Aug 2021

What would equip and inputsFrom? ^hot-water @boiler return in this case? Just @pump or @pump and @ahuHWvalve?

It would return all upstream matches, so it would return @pump and @ahuHWvalve.

Although the model is that hotWaterRef points to your source. So an actual model would be flipped (the AHU valve would point to the pump, which in turn points to the boiler).
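
For illustration only, here is a small Python sketch of that flipped model, treating inputsFrom? ^hot-water as a transitive walk along hotWaterRef. The dict layout and function name are assumptions, and the seen set is just a guard in case a model accidentally contains a reference cycle.

```python
# Flipped model: each hotWaterRef points at the rec's hot water source.
RECS = {
    "boiler": {},
    "pump": {"hotWaterRef": "boiler"},
    "ahuHWvalve": {"hotWaterRef": "pump"},
}

def inputs_hot_water_from(rec_id, source_id):
    """Follow hotWaterRef from rec_id toward its sources, looking for source_id."""
    seen = set()
    cur = RECS[rec_id].get("hotWaterRef")
    while cur is not None and cur not in seen:
        if cur == source_id:
            return True
        seen.add(cur)  # guard against accidental reference cycles
        cur = RECS.get(cur, {}).get("hotWaterRef")
    return False

# equip and inputsFrom? ^hot-water @boiler
matches = [i for i in RECS if inputs_hot_water_from(i, "boiler")]
print(matches)
```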

Steve Eynon Sat 18 Sep 2021

Hi Brian, I can certainly see all the hard work and thought process gone into these proposals, and in general they do help to simplify the common case.

However, looking at this sub-proposal from a training perspective, I fear that introducing a new style of filter syntax will have an adverse impact on the ease of adoption of Project Haystack.

It is far easier for us humans to adopt technology if it re-uses existing syntax, and I have concerns that this sub-proposal could be a little too obtuse for general human consumption.

My concerns are twofold:

1. The ? symbol is misleading

In programming languages the ? is often used to denote something optional or something to do with nulls, be it a regular expression a? b, a null safe call a?.b(), or an elvis a ?: b.

So when I first look at name? ^symbol I'm very confused as to what it means.

2. Sequentially listing terms shuns existing operator syntax

name? ^symbol @ref such as containedBy? ^equip @ahu is not easy to understand. I would like to see the use of operators to give context to the terms in the expression.

As such I would like to propose a slightly different syntax that I feel is more intuitive to us humans.

A revised sub-proposal

Use ~ to denote inferred tags.

~ in English means approx or about, so it works here by meaning not directly in the Dict.

inputs   // does the dict have a tag named "inputs"
~inputs  // does the dict implement a def with a tag name "inputs"

Use == for equals.

The equals operator is ubiquitous, so let's use it.

// all points contained by '@ahu'
point and ~containedBy == @ahu

Use is operator to denote hierarchy.

Definitions already use the is tag, so let's re-use the same terminology.

// anything that subtypes from airTerminalUnit (instance use)
is ^airTerminalUnit

// anything contained by an airTerminalUnit (definition use)
~containedBy is ^airTerminalUnit

// all ahu equips
equip and is ^ahu

In general I propose the syntax be enhanced to the following.

old                  enhanced
---                  --------
name?                ~name
name? ^symbol        ~name is ^symbol
name? @ref           ~name == @ref
name? ^symbol @ref   ~name is ^symbol and ~name == @ref

The last case is a little more verbose, but also very understandable.

// query the AHU which outputs air to the given VAV
ahu and ~outputsTo is ^air and ~outputsTo == @vav
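
Since the mapping in the table is mechanical, it can be sketched as a small rewrite function. The Python below is only an illustration: the regular expression is a simplification that assumes symbols and refs consist of word characters and hyphens, and the function name is made up.

```python
import re

# Matches one inference term in the original syntax: name? [^symbol] [@ref]
# Simplifying assumption: symbols and refs use only word chars and hyphens.
TERM = re.compile(r"\b([A-Za-z]\w*)\?(?:\s+\^([\w-]+))?(?:\s+@([\w-]+))?")

def to_tilde_syntax(filter_str):
    """Rewrite `name? ^symbol @ref` terms into the `~name` form."""
    def repl(m):
        name, symbol, ref = m.groups()
        parts = []
        if symbol:
            parts.append(f"~{name} is ^{symbol}")
        if ref:
            parts.append(f"~{name} == @{ref}")
        # A bare `name?` maps to a bare `~name`.
        return " and ".join(parts) or f"~{name}"
    return TERM.sub(repl, filter_str)

print(to_tilde_syntax("ahu and outputsTo? ^air @vav"))
print(to_tilde_syntax("point and containedBy? @ahu"))
```

The first call reproduces the query above from its original form.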

I look forward to hearing your thoughts.

Brian Frank Wed 10 Nov 2021

In programming languages the ? is often used to denote something optional or something

I think it depends on the programming language :-) In languages like Ruby or Clojure that aren't constrained in their identifiers, it's pretty universal to use a trailing question mark to indicate a predicate function that returns a boolean. Which I personally really like and is the inspiration for that syntax.

Although I am not sure I have a strong opinion on the syntax:

// my proposal
ahu and outputsTo? ^air @vav

// your proposal
ahu and ~outputsTo is ^air and ~outputsTo == @vav

So it would be good to get other people's opinions...

Steve Eynon Fri 26 Nov 2021

Yes, hearing other ideas and opinions would be really useful.

For my own part, I would of course be able to adopt the syntax as you propose (albeit begrudgingly!).

My main concern here is adding too much bespoke complexity for the sake of brevity to a platform that many people already find difficult to comprehend.

This is why the ideas of:

  • re-using the and operator to compound expressions,
  • re-using == to compare references, and
  • introducing an is operator to specify what a tag value is

are all natural progressions of an already established syntax, making it far easier for a general audience to use and understand.

I'm only trying to keep filter expressions intuitive and accessible. :)

In languages like Ruby [they] use a trailing question mark to indicate a boolean.

Ah yes, for many years I've been trying to forget my time in Ruby! Although in many ways this just shows how overloaded the ? character is, potentially making the situation even worse?

Anyway, to sum up with an example, here are the two syntaxes side by side.

old                    proposed
---                    --------
outputsTo?             ~outputsTo
outputsTo? ^air        ~outputsTo is ^air
outputsTo? @ref        ~outputsTo == @ref
outputsTo? ^air @ref   ~outputsTo is ^air and ~outputsTo == @ref

Jason Briggs Mon 6 Dec 2021

I prefer Brian's way better.

Gareth David Johnson Mon 13 Dec 2021

I prefer Brian's shorter syntax. I also like the fact the ? operator is after the tag name rather than beforehand. With Zinc, a special first character (e.g. @) normally denotes a type of granular Haystack value.

I also like the fact this is one statement that can be easily parsed...

outputsTo? ^air @ref

I don't have a preference between ? or ~.

Richard McElhinney Mon 13 Dec 2021

I'm not sure how this ends up but seeing as we're picking sides :)!!

I'm neither here nor there on any of the syntax proposals, to me they all have degrees of un-readability if that's a real word!

What I will say is this....PH has embarked on some endeavours to help grow adoption and encourage more users. To do this everything we do must be easy to understand and easy to adopt whilst delivering powerful results.

In my view all of the proposals deliver powerful results from the query syntax.

But it is also my view that all of the syntax proposals fail on the simplicity and easy to understand principle.

I realise that the area we are getting into is complex and requires some learning, but I wouldn't want the efforts on the technical development front to be counter-productive to efforts in other areas where we are trying to encourage adoption, new members, new manufacturers, new building owners etc.

Unfortunately (or fortunately for everyone else!) I'm not a person who has the skills to propose new syntax. However I would say that in the quest for brevity, which the proposals seem to be looking for, lies complexity not simplicity.

I'll put my head in the lion's mouth here and suggest that perhaps a little more verbosity, fewer symbols, and re-lighting the search for simplicity should be considered a lot more than perhaps it already has.

There you go, I didn't pick a side I went right off in a different direction! :)

Gareth David Johnson Tue 4 Jan 2022

Compared to other query languages, I think the proposed syntax changes are quite simple.

As with any changes, developers also have to think about things like Lexers and grammar that add some bias.

To make it really simple though, I think there should be search instead of a filter query language. Query and search sound similar but they're fundamentally different in their approach. For querying you're finding an exact match. With search you're finding a likely match.

Do I think search should be part of the Haystack standard? No. I think the standard is complex enough. Certainly I would use haystack filter queries as part of a search implementation. For PH, a reference implementation for how search could be implemented would be a nice addition (i.e. using OpenSearch).

Stephen Frank Wed 5 Jan 2022

Weighing in on the functionality: I think the proposed query types are very solid and badly needed. No objections from me.

Weighing in here on syntax: I prefer Steve Eynon's syntax, even though it is more verbose. As a self-taught programmer it just makes more sense to me. I don't have much of a preference on the exact symbols used (? ~ ^), but using "is" as a keyword in particular makes a ton of sense to me because it ties back well to the underlying def syntax.

Other thoughts: In general, I think there needs to be some way for a programmer who isn't familiar with defs to construct a non-def-aware query that more-or-less accomplishes what they want as long as they know the structure of their own data. We don't want to force people to rely on def-aware queries (especially when they are just learning). However, the current structure does support this. For example, the following query from Brian's example is very nicely expressed using def-aware syntax:

discharge and fan and enable and cmd and containedBy? ^equip @ahu // Brian's version
discharge and fan and enable and cmd and ~containedBy is ^equip and ~containedBy == @ahu // Steve's version

...but you could also hack it with:

discharge and fan and enable and cmd and (equipRef == @ahu or equipRef->equipRef == @ahu or equipRef->equipRef->equipRef == @ahu)

The non-def-aware query is ugly but covers the bases of possible equipment structures that Brian gave. Hence someone could use it, but the ugliness of it becomes a teaching/learning opportunity for converting to def-aware syntax.
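
If someone did want to generate that fallback rather than type it, a tiny helper along these lines would do. This is a Python sketch with a made-up function name, and the default depth of three is simply the deepest model from Brian's list.

```python
def equip_ref_chain(ref, max_depth=3, tag="equipRef"):
    """Build the non-def-aware fallback: one `->` path per containment depth."""
    paths = ["->".join([tag] * depth) for depth in range(1, max_depth + 1)]
    return "(" + " or ".join(f"{p} == @{ref}" for p in paths) + ")"

query = "discharge and fan and enable and cmd and " + equip_ref_chain("ahu")
print(query)
```

The need to pick max_depth up front is exactly the limitation the def-aware syntax removes.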
