#666 ZINC to CSV converter

Brian Simmons Wed 23 Jan 2019

I see how HSZinc can read a ZINC file into memory in python, but has anyone developed a parser that can read in ZINC objects and write a CSV? Or vice versa, read in a CSV grid and write a ZINC? Thanks!

Stuart Longland Thu 24 Jan 2019

The biggest problem is, "define CSV"… CSV is really a non-standard. All it really says is, values shall be separated by a comma.

How lines are separated (CR LF, CR or LF), how commas within values are handled, what other characters are "special", and how those are handled are all entirely up to the implementation of CSV you're using. ZINC itself is actually a CSV-based format, but has enough quirks that I think it would trip up a typical CSV parser (e.g. C(123.45, 678.9) would likely be seen as two separate cells, not one).
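To make the coordinate example concrete, here is a minimal sketch using only Python's standard csv module. The sample row is made up for illustration; it is not taken from a real ZINC file.

```python
import csv

# A made-up ZINC-style row: a coordinate, a quoted string and a number.
zinc_line = 'C(123.45, 678.9),"Site 1",42'

# A generic CSV reader only knows about commas and quotes, so it splits
# the coordinate at its internal comma.
cells = next(csv.reader([zinc_line]))
print(cells)  # ['C(123.45', ' 678.9)', 'Site 1', '42']
```

The row has three logical values, but the generic reader reports four cells.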

This is one reason why hszinc doesn't try to interpret or parse plain CSV files. The encoding is application-specific.

This is one possible way you could do it:

import hszinc
import csv

# Load in the ZINC file
with open('/path/to/file.zinc', 'r') as zincfile:
    grid = hszinc.parse(zincfile.read(), mode=hszinc.MODE_ZINC)
# Grab a list of columns
cols = list(grid.column.keys())

# What to emit at a given cell for the given column:
def _emit_cell(row, col):
    try:
        return hszinc.dump_scalar(row[col], mode=hszinc.MODE_JSON)
    except KeyError:
        # This row has no value for this column
        return ''

# How to emit a row:
def _emit_row(row):
    return [_emit_cell(row, col) for col in cols]

# Create the CSV writer; newline='' (Python 3) lets the csv module
# control the line endings
with open('/path/to/file.csv', 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile)

    # Write out the header row
    csvwriter.writerow(cols)

    # Emit all rows
    csvwriter.writerows(map(_emit_row, grid))

No guarantees on the above of course, but in theory that should emit each cell of the input grid to CSV, using Haystack JSON scalars to encode the values. To go the other way, you'd read the CSV file back in and use parse_scalar to decode each cell.
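The reverse direction might be sketched roughly as follows. This uses only the standard library: read_csv_rows is a hypothetical helper, and decode is a caller-supplied function that turns one cell string back into a value. With hszinc you would presumably pass something that wraps parse_scalar; the exact call is left to the caller here rather than assumed.

```python
import csv
import io

def read_csv_rows(csv_text, decode):
    """Return (cols, rows); each row is a dict of decoded cells.

    decode is supplied by the caller (an assumption of this sketch),
    e.g. a wrapper around the scalar parser of your Haystack library.
    """
    reader = csv.reader(io.StringIO(csv_text, newline=''))
    # The first record is the header row; an empty input yields no columns
    cols = next(reader, [])
    rows = []
    for record in reader:
        # Empty cells stand for "no value in this column", so skip them
        rows.append({col: decode(cell)
                     for col, cell in zip(cols, record)
                     if cell != ''})
    return cols, rows

# Usage with a stand-in decoder (int) instead of a real scalar parser
cols, rows = read_csv_rows('a,b\r\n1,\r\n2,3\r\n', int)
print(cols, rows)
```

One caveat of this approach: if an encoded value can itself be an empty string, it cannot be told apart from a missing cell.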

Brian Frank Thu 24 Jan 2019

The Java implementation has a CsvWriter you can use as a template for Python.

The Zinc, Trio, and JSON formats are full fidelity and can round trip all the correct value types. CSV is not, so the writer mostly just dumps each value as its string form, which makes the output most useful for importing into Excel.
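A lossy, string-only dump of that kind could look something like the sketch below. It is not a port of the Java CsvWriter: rows_to_plain_csv is a hypothetical helper, rows are assumed to be dict-like, and str() is assumed to give a usable display value for each cell.

```python
import csv
import io

def rows_to_plain_csv(cols, rows):
    """Render dict-like rows as CSV text using plain string values."""
    out = io.StringIO(newline='')
    writer = csv.writer(out)
    writer.writerow(cols)
    for row in rows:
        # Missing or None values become empty cells; the rest go through str()
        writer.writerow(['' if row.get(col) is None else str(row[col])
                         for col in cols])
    return out.getvalue()

text = rows_to_plain_csv(
    ['dis', 'area'],
    [{'dis': 'Site 1', 'area': 42.5}, {'dis': 'Site, 2'}],
)
print(text)
```

The type information is gone after this step, so the result is fine for a spreadsheet but cannot be turned back into the original grid.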
