#732 Haystack 4 - Normalization

Madhushan Tennakoon Wed 11 Sep 2019

Hi all,

I'm looking for some clarification on normalization in Haystack 4.0, since the linked Normalization documentation seems to be a work in progress (an ETA on that would be greatly appreciated).

As I understand it, a normalized dict is the effective representation of a Haystack entity that contains the declared tags and the inherited tags. Pointing to the El Camino example:

// declarations
def: ^car
numDoors: 4
color: "red"
engine: "V8"
----
def: ^pickup
numDoors: 2
color: "blue"
bedLength: 80in
----
def: ^elCamino
is: [^pickup, ^car]
color: "purple"

...the normalized representation is given as:

def: ^elCamino        // declared
is: [^pickup, ^car]   // declared
color: "purple"       // declared
numDoors: 2           // inherited from pickup first
engine: "V8"          // inherited from car
bedLength: 80in       // inherited from pickup
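For illustration, the resolution order above could be sketched in Python roughly as follows. This is only a sketch under assumptions: `DEFS` and `normalize` are hypothetical names (not part of any Haystack library), the defs are held in a plain dict, and `def` and `is` are assumed not to be inherited from supertypes.

```python
# Hypothetical in-memory copy of the example defs; not a real Haystack API.
DEFS = {
    "car": {"numDoors": 4, "color": "red", "engine": "V8"},
    "pickup": {"numDoors": 2, "color": "blue", "bedLength": "80in"},
    "elCamino": {"is": ["pickup", "car"], "color": "purple"},
}


def normalize(name, defs):
    """Return the declared tags of `name` plus tags inherited from its supertypes.

    Declared tags always win. Supertypes are visited in the order listed
    in `is`, so the first supertype that supplies a tag wins.
    """
    declared = defs[name]
    result = {"def": name, **declared}
    for parent in declared.get("is", []):
        for tag, val in normalize(parent, defs).items():
            # Assumption: `def` and `is` describe the def itself and are not inherited.
            if tag in ("def", "is"):
                continue
            # setdefault keeps any value that is already present.
            result.setdefault(tag, val)
    return result


el_camino = normalize("elCamino", DEFS)
```

With the defs above, `el_camino` holds the declared `color`, takes `numDoors` and `bedLength` from pickup, and takes `engine` from car, matching the listing.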

Given the above def, is there a requirement for what an actual, saved instance of an El Camino should look like or is that left up to the implementation? Would the entity that is created in the data store need to have all normalized tags explicitly defined (option 1) or would it only contain the declared tags and all future read queries rely on subtype inference on the fly (option 2)?

Option 1

id: @my-new-el-camino
elCamino
pickup
truck
color: "purple"
numDoors: 2
engine: "V8"
bedLength: 80in

Option 2

id: @my-new-el-camino
elCamino

Let me know your thoughts.

Thanks, Madhu

Jason Briggs Wed 11 Sep 2019

Option 2: the database would not require the additional tags. Hopefully I understood your question.

Madhushan Tennakoon Thu 12 Sep 2019

Hi Jason, thanks for clarifying. Option-2 makes a lot of sense as the user could then add a new elCamino instance with a single tag (which would make HS4 awesome for the client and also for the server when a def needs to be updated with new markers, etc.). But just so we're clear, wouldn't this mean that each time a read op is executed, the backend must traverse the subtype tree to evaluate the tags that it implements? E.g. the filter query: "elCamino" would cause the server to create the elCamino def's normalized dict and fetch matching entities as opposed to what would happen in HS3.0, where all the relevant tags are explicitly declared in the instance itself. I might be thinking in terms of a schemaless database like MongoDB which is why this kind of traversal seems more daunting than it should be :)

Jason Briggs Thu 12 Sep 2019

In Haystack 4.0 we added features to the Haystack filter.

For example:

In option-2 if your filter was:

'readAll(pickup or truck)' then nothing would be returned, because those tags don't exist on that record.

So if you wanted to find all pickups you would use:

'readAll(is ? pickup)' this would then return -> id: @my-new-el-camino

So when the server sees this ?, it looks at the defs to work out which tags it needs to query.
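As a rough sketch of that lookup (not how any particular server implements it), the subtype query could work like this in Python. The names `DEFS`, `RECORDS`, `subtypes_of`, `read_all_marker` and `read_all_subtypes` are hypothetical, and the second record is made up for contrast.

```python
# Hypothetical defs and records; a real server would load these from its def namespace and database.
DEFS = {
    "car": {},
    "pickup": {},
    "elCamino": {"is": ["pickup", "car"]},
}

RECORDS = [
    {"id": "@my-new-el-camino", "elCamino": True},  # Option 2: only the elCamino marker is stored
    {"id": "@some-sedan", "car": True},             # invented record for contrast
]


def subtypes_of(name, defs):
    """Return `name` plus every def that reaches it through `is`, directly or transitively."""
    found = {name}
    changed = True
    while changed:
        changed = False
        for child, tags in defs.items():
            if child not in found and any(p in found for p in tags.get("is", [])):
                found.add(child)
                changed = True
    return found


def read_all_marker(marker, records):
    """Plain marker filter: match only records that carry the tag itself."""
    return [r for r in records if marker in r]


def read_all_subtypes(name, defs, records):
    """Subtype filter: expand `name` through the defs, then match any resulting marker."""
    markers = subtypes_of(name, defs)
    return [r for r in records if any(m in r for m in markers)]


plain = read_all_marker("pickup", RECORDS)
by_subtype = read_all_subtypes("pickup", DEFS, RECORDS)
```

Here `plain` is empty because no record stores a `pickup` tag, while `by_subtype` contains the `@my-new-el-camino` record. The set of markers depends only on the defs, so a server could compute it once per def change rather than on every read.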

The good news here is that nothing broke in filters going from 3.0 to 4.0, so all old filters still work.

There are more changes to the filter that let you do more things, but the above is just one example.

Hope this helps

Madhushan Tennakoon Mon 16 Sep 2019

Hi Jason, thanks for clearing it up. This was very helpful.