Showing posts with label JSON. Show all posts

Tuesday, May 3, 2022

observations on the conversion of the backend from JSON to property graph (Neo4j)

The JSON backend for the Physics Derivation Graph 

  • is concise -- only the fields necessary are present 
  • is easily readable -- plain text and not much nesting
  • requires significant investment to construct queries
  • is static in terms of dependencies; unlikely to degrade or require maintenance

The property graph (in Neo4j) backend

  • supports user-provided queries
  • adds maintenance risk of keeping up with changes to Cypher and Neo4j

Wednesday, April 8, 2020

a terrible hack to get JSON into a database

I've been using JSON to store Physics Derivation Graph content. The motivation is that JSON stores data in a way that closely mirrors how I think of the data structure in Python (nested dictionaries and lists).

A JSON file doesn't work for multiple concurrent users: concurrent writes would require locks to ensure changes are not lost.
Migrating from JSON to a table-based data structure (e.g., MySQL, PostgreSQL, SQLite) would incur a significant rewrite. Another option would be to use Redis, specifically the ReJSON plugin, which adds a nested, JSON-like data type alongside the flat hashes in Redis.
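
For reference, the locking that concurrent writes to a JSON file would need can be sketched with the standard library. This is only an illustration, not what the PDG does: the function name update_json_locked is hypothetical, fcntl is Unix-only, and flock is an advisory lock, so it only protects against other processes that also take the lock.

```python
import fcntl
import json


def update_json_locked(path, key, value):
    """Read-modify-write a JSON file while holding an exclusive lock.

    Hypothetical helper: assumes the file holds a JSON object (dict).
    """
    # Open for read+write without truncating, so the old content can be read.
    with open(path, "r+") as json_file:
        fcntl.flock(json_file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            dat = json.load(json_file)
            dat[key] = value
            json_file.seek(0)
            json.dump(dat, json_file)
            json_file.truncate()  # drop leftover bytes if the content shrank
            json_file.flush()
        finally:
            fcntl.flock(json_file, fcntl.LOCK_UN)
    return dat
```

Every writer has to go through a function like this for the lock to mean anything, which is part of why the approach scales poorly.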

I'm wary of using a plugin for data storage, and I'm reluctant to rewrite the PDG as tables.
There is a terrible hack that lets me stick with JSON and address the concurrency issue without a significant rewrite: I could serialize the JSON and store it in Redis as one very long string.

Redis has a maximum string length of 512 MB (!) according to
https://redis.io/topics/data-types
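
A quick way to see how close a given database is to that limit is to measure the serialized size. This is a minimal sketch: the function names and the constant are made up for illustration, and it assumes the content is stored as UTF-8 text.

```python
import json

# 512 MB limit on a Redis string value, as cited above
REDIS_MAX_STRING_BYTES = 512 * 1024 * 1024


def serialized_size_bytes(data):
    """Return the size in bytes of `data` once serialized to UTF-8 JSON."""
    return len(json.dumps(data).encode("utf-8"))


def fits_in_redis_string(data):
    """True if the serialized JSON is within the Redis string limit."""
    return serialized_size_bytes(data) <= REDIS_MAX_STRING_BYTES
```

For example, fits_in_redis_string(dat) could be checked before each write.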

What I'm currently doing:
>>> import json
>>> path_to_db = 'data.json'
>>> with open(path_to_db) as json_file:
...     dat = json.load(json_file)

Terrible hack:

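Read the content as text, then save it to Redis (rd is a redis-py client; the host and port below are assumptions for a local server)
>>> import redis
>>> rd = redis.Redis(host='localhost', port=6379)  # assumed local Redis server
>>> with open(path_to_db) as jfil:
...     jcontent = jfil.read()
>>> rd.set(name='data.json', value=jcontent)
True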

which can be simplified to

>>> with open(path_to_db) as jfil:
...     rd.set(name='data.json', value=jfil.read())

Then, to read the file back in, use

>>> file_content = rd.get('data.json')
>>> dat = json.loads(file_content)
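
The save and load steps above could be wrapped in two small helpers. This is a sketch under assumptions: the names save_json and load_json are hypothetical, and client is any object with redis-py style set(name, value) and get(name) methods. One detail it handles is that get returns None for a missing key, which json.loads would reject.

```python
import json


def save_json(client, key, data):
    """Serialize `data` and store it under `key` as one string."""
    return client.set(name=key, value=json.dumps(data))


def load_json(client, key, default=None):
    """Fetch `key` and decode it; return `default` if the key is absent."""
    raw = client.get(key)  # redis-py returns bytes, or None for a missing key
    if raw is None:
        return default
    return json.loads(raw)  # json.loads accepts str or UTF-8 encoded bytes
```

With a redis-py client named rd this would be used as save_json(rd, 'data.json', dat) and dat = load_json(rd, 'data.json'). Each individual set and get is atomic in Redis, but a read-modify-write cycle spread across two clients is not, so two users editing at once could still overwrite each other without something like a WATCH-based transaction.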