#706 Filter queries with lists/dicts/grids

Stuart Longland Mon 27 May 2019

Hi all,

Just had a silly thought… as I understand it, there's no "standard" syntax for querying lists, dicts or grids in a filter string.

One thing that makes this more complex is that these are "collection" types: there's more than one value, and you might want to retrieve a specific value within that collection.

The light-bulb moment for me was thinking about Perl. Perl is a dynamically-typed scripting language which is probably best known for its regular expression syntax. It has a handful of types: scalars, arrays, hashes, functions, references and undef.

Scalars are cast to whatever form is needed for the computation, so they can be strings, booleans, integers or floats. Arrays function like Haystack "lists", and hashes like Haystack "dicts".

%myDict = (
    key1 => "a value",
    key2 => 123.45
);
@myList = (1, 2, 'three');

I could have some syntax wrong, my Perl is rusty. To access the elements, you can do something like this:

$key1Value = $myDict{key1};
$listFirstElement = $myList[0];

These values are passed around by reference, so you might have a couple of references to the above array and hash. Going via the reference looks like this:

$key1Value = $refToMyDict->{key1};
$listFirstElement = $refToMyList->[0];

This got me thinking about the way we reference tags in Haystack filters. If myTag were a List, we could do something Perl-like and use myTag[0]. A Dict dereference could look like myDict{myKey}. We also have the arrow syntax, and if we think of the "tag" as conceptually a reference to the value, we could use a syntax like myTag->[0] and myDict->{myKey}.

The syntax falls down when we consider the Perl way to get array length: scalar(@myTag) (or $#myTag for the last index) would, I think, get confusing, as we already use @ to indicate a Ref literal. Maybe if we want to filter on an array length, we use something like myTag[#] or myTag->[#]?

Not sure which would work better in terms of the HFilter parser in the reference implementation or for readability.

For Lists, referencing the last element is also a handy thing. Perl and Python do the same thing here: a negative index counts from the end.
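As a quick illustration of the negative-index convention, here is how it behaves in Python. The list contents mirror the Perl example earlier in this post, and the variable names are only illustrative.

```python
# Hypothetical list mirroring the Perl @myList example.
my_list = [1, 2, "three"]

# Index -1 is the last element, -2 the one before it, and so on.
last = my_list[-1]
second_last = my_list[-2]

# A negative index i refers to the same element as len(my_list) + i.
same_as_last = my_list[len(my_list) - 1]
```

A filter syntax such as myTag[-1] could borrow exactly this rule, though nothing in the filter specification defines it today.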

The elephant in the room of course is Grids. This is a more complex type, but I think it is easier to break it down into simpler data types, like we do with its JSON representation:

{
    "meta": {
        "ver": "3.0",
        "metaKey1": "s:a value"
    },
    "cols": [
        {"name": "col1"},
        {"name": "col2"}
    ],
    "rows": [
        {"col1": "n:123.4", "col2": "s:first row"},
        {"col1": "n:2345", "col2": "s:second row"}
    ]
}

To my thinking, I look at that and I see:

  • a Dict with meta, cols and rows keys
  • the meta is itself a Dict
  • rows and cols are both Lists
  • Each element of rows and cols is a Dict.

Based on this, a filter that tests the value of metaKey1 might be gridTag{meta}{metaKey1} (or maybe gridTag->{meta}->{metaKey1}), the name of the second column might be gridTag{cols}[1]{name}, and the value of col1 in the second row might be gridTag{rows}[1]{col1} (or gridTag{rows}[-1]{col1}, since it is also the last row).
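To make the idea concrete, here is a rough Python sketch of how such a path could be resolved against the Dict/List view of a grid. None of this is part of any Haystack implementation: the resolve function, the accessor grammar (including the [#] length form and the optional arrow) and the tags mapping are all assumptions made purely for illustration.

```python
import re

# One accessor step: {key} for a Dict or [index] for a List, optionally
# preceded by "->". "[#]" is the hypothetical list-length accessor.
_STEP = re.compile(r"(?:->)?(?:\{(\w+)\}|\[(-?\d+|#)\])")

def resolve(path, tags):
    """Resolve a hypothetical path such as 'gridTag{rows}[-1]{col1}'.

    `tags` maps tag names to plain Python values (dicts, lists, scalars).
    Returns None when a step is missing or applied to the wrong type.
    """
    m = re.match(r"\w+", path)
    if m is None or m.group(0) not in tags:
        return None
    value = tags[m.group(0)]
    pos = m.end()
    while pos < len(path):
        step = _STEP.match(path, pos)
        if step is None:
            raise ValueError(f"bad accessor at offset {pos}: {path!r}")
        key, index = step.group(1), step.group(2)
        if key is not None:
            if not isinstance(value, dict) or key not in value:
                return None
            value = value[key]
        elif index == "#":
            if not isinstance(value, list):
                return None
            value = len(value)
        else:
            i = int(index)
            if not isinstance(value, list) or not -len(value) <= i < len(value):
                return None
            value = value[i]
        pos = step.end()
    return value

# The grid from the JSON above, viewed as nested dicts and lists.
tags = {"gridTag": {
    "meta": {"ver": "3.0", "metaKey1": "s:a value"},
    "cols": [{"name": "col1"}, {"name": "col2"}],
    "rows": [
        {"col1": "n:123.4", "col2": "s:first row"},
        {"col1": "n:2345", "col2": "s:second row"},
    ],
}}

meta_value = resolve("gridTag{meta}{metaKey1}", tags)   # "s:a value"
second_col = resolve("gridTag{cols}[1]{name}", tags)    # "col2"
last_row = resolve("gridTag{rows}[-1]{col1}", tags)     # "n:2345"
row_count = resolve("gridTag->{rows}->[#]", tags)       # 2
```

Returning None for a missing step is just one possible choice; a real filter evaluator would more likely treat it as "no match".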

Anyway, it was a silly idea I had, maybe worth something, maybe not. :-)

Brian Frank Tue 28 May 2019

Hi Stuart,

First off I want to clarify that "pathing thru" both Refs and Dicts is already part of the filter specification using the -> operator. Pathing thru a ref implicitly looks up the Dict for what the ref references.
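A minimal Python sketch of that pathing behaviour follows. It is only an illustration of the idea, not the reference implementation: the Ref class, the follow function, the records mapping and the sample tag values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ref:
    """Stand-in for a Haystack Ref value (hypothetical, for illustration)."""
    id: str

def follow(path, rec, records):
    """Evaluate a path like 'siteRef->geoCity' starting from the Dict `rec`.

    `records` maps record ids to Dicts. Returns None if any step is missing.
    """
    value = rec
    for name in path.split("->"):
        if isinstance(value, Ref):
            # Pathing through a Ref: implicitly look up the referenced Dict.
            value = records.get(value.id)
        if not isinstance(value, dict) or name not in value:
            return None
        value = value[name]
    return value

# Made-up sample data: an equip record pointing at a site record.
records = {
    "site-1": {"dis": "Main Campus", "geoCity": "Richmond"},
    "equip-1": {"dis": "AHU-1", "siteRef": Ref("site-1")},
}
city = follow("siteRef->geoCity", records["equip-1"], records)
```

The same loop handles a nested Dict value without any lookup, since only Ref values trigger the jump to another record.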

Additionally as part of Haystack 4.0, the current proposal is to enhance the filter language to treat lists as if they are multiple key/value pairs similar to how RDF triples might work. For example:

{foo:["a", "b", "c"]}

// any of these filters would match the dict above
foo=="a"
foo=="b"
foo=="c"

There haven't been any proposals to fully path thru lists by integer index. Semantically, I am not sure that makes sense for most cases because we want to treat lists more like RDF where they are just a mechanism to define multiple tag/value pairs for the same tag.
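The proposed list semantics can be sketched in a few lines of Python. This is an illustration of the proposal as described above, not confirmed Haystack 4.0 behaviour; the tag_equals function name and its handling of missing tags are assumptions.

```python
def tag_equals(rec, name, expected):
    """Evaluate `name == expected` under the list-as-multiple-values idea."""
    if name not in rec:
        return False
    value = rec[name]
    if isinstance(value, list):
        # Each list item counts as a separate value of the same tag.
        return any(item == expected for item in value)
    return value == expected

# The dict from the example above.
rec = {"foo": ["a", "b", "c"]}
matches_a = tag_equals(rec, "foo", "a")
matches_d = tag_equals(rec, "foo", "d")
```

Under this reading the position of an item in the list never matters, which is why an integer index has no obvious role.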
