Marginal Structure – Quaternio

About Marginal Structure

Marginal Structure works at NASA’s Ames Research Center developing and integrating collaborative web technologies. Currently, I’m a developer on the NASA Nebula team. i’m also a computer science grad student in automated digital forensics, natural language processing, and data mining, with an interest in applying these techniques to the study of patterns in human behaviour, organization, and governance.

besides programming, i love cooking, a good crossfit workout, and building community around people’s pursuit of their passions. when i’m not hacking on some kind of web development or digital forensics code, or futzing with linux, i am usually working on community infrastructure, in particular the kind where we think about how organizations and collaborations are structured, how that intersects with technology, and how we can cultivate cultures of creativity in society. add wine and giant family feasts; repeat.

i leave archaeologically uninteresting digital remnants at a few places online:

  • jessykate on Twitter.
  • RocketQueen on flickr
  • google reader, which i use in spurts.
  • delicious

i live in the “secret nasa think tank in the hills” AKA rainbow mansion, where the aforementioned family feasts are practiced as often as possible, and various forms of world domination are plotted.

here’s some stuff i am or have been involved in:

  • La Choza del Mundo: our property in costa rica for changing the world!
  • Yuri’s Night Bay Area, the largest yuri’s night party on the planet and the first ever at NASA, which i started and organized in 2007 and supported in 2008.
  • i do simple web hacks under the auspices of the tinyapps team.
  • Nick Skytland, Robbie Schingler, Karen Lau and I run a website called opennasa.com, an unofficial public dialog about transparency, participation, and the organizational evolution of NASA.
  • I was a cofounder of NASA CoLab, which was never known to cause any trouble……….ever.

JSON Encoding and Decoding with Custom Objects in Python

The JSON module supports encoding (aka serializing) for all the basic built-in Python types: strings, lists, dictionaries, tuples, etc. But if you have your own user-defined class that you want to store, I found the documentation to be pretty ambiguous. And since I also didn’t see any complete examples out there of custom object encoding and decoding, I thought I would post mine here.

Encoding

For encoding, the documentation’s not all that bad. It will tell you to implement the default() method in a subclass of json.JSONEncoder; default() takes your object as an argument and returns a serializable object. By serializable, they just mean something in the form of one of the basic serializable types. So, say you have a class with a few attributes as follows:

class MyClass:
    def __init__(self, my_int, my_list, my_dict):
        self.my_int = my_int
        self.my_list = my_list
        self.my_dict = my_dict

You could write a custom encoder by mapping all the class attributes you want to save to entries in a dictionary. If there are helpful additional things you want to store as well, that’s fine too. In this example, I use a string representation of a previously defined datetime object (the_date) to make note of when the object was saved. The only thing to remember is that when you later decode the object, you’re going to be recreating a MyClass object from this data, and it will have to match: specifically, you’ll either be discarding the date information, storing it elsewhere, or annotating your object with it on the fly.

import datetime
import json

the_date = datetime.datetime.now()  # the "previously defined" datetime object

class MyEncoder(json.JSONEncoder):
    ''' a custom JSON encoder for MyClass objects '''
    def default(self, my_class):
        if not isinstance(my_class, MyClass):
            # not ours: defer to the base class, which raises a TypeError
            return json.JSONEncoder.default(self, my_class)
        return {'my_int': my_class.my_int, 'my_list': my_class.my_list,
                'my_dict': my_class.my_dict,
                'save date': the_date.ctime()}
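
To actually use the encoder, you hand the class to json.dumps() (or json.dump() for a file) through the cls argument. Here is a minimal sketch of that call. The sample values and the fixed the_date are made up for illustration, and MyClass and MyEncoder are repeated so the snippet runs on its own.

```python
import datetime
import json

class MyClass:
    def __init__(self, my_int, my_list, my_dict):
        self.my_int = my_int
        self.my_list = my_list
        self.my_dict = my_dict

# stand-in for the "previously defined" datetime object (value is an assumption)
the_date = datetime.datetime(2009, 1, 1, 12, 0, 0)

class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, MyClass):
            return {'my_int': obj.my_int, 'my_list': obj.my_list,
                    'my_dict': obj.my_dict, 'save date': the_date.ctime()}
        # anything else: defer to the base class, which raises TypeError
        return json.JSONEncoder.default(self, obj)

obj = MyClass(42, [1, 2, 3], {'a': 'b'})

# cls= tells dumps() which encoder class to instantiate
json_string = json.dumps(obj, cls=MyEncoder)

# json.dump(obj, fp, cls=MyEncoder) would write the same thing to an open file
```

Note that default() only gets called for objects the stock encoder doesn't already know how to handle, so the ints, lists and dicts inside MyClass never pass through it.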

Decoding

Decoding is less clear than encoding. There are two ways you can customize the results returned by the json load() or loads() functions. One is by writing an object hook, and the other is by subclassing JSONDecoder and overriding the decode() function.

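When called, loads() calls the decode() function on the JSON string you pass to it (load() first reads the file pointer you pass and then does the same). If object_hook is also specified, then the function passed to object_hook is called during decoding with each JSON object (dictionary) that gets parsed, and whatever it returns is used in place of that dictionary.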

The default behaviour of decode() is to return one Python object for the whole string, but the object hook is called FOR EVERY JSON OBJECT (that is, every dictionary) in that string. This means that if you have a hierarchy of such objects, for example a dictionary which contains other dictionaries, then although you only call load() once, the hook gets called once for each dictionary, innermost first. Lists, strings and numbers don’t trigger it. Here’s an example to convince yourself of this, using the previously encoded object:

import json
def custom_decode(json_dict):
    # called once per JSON object (dict) in the file, innermost first
    print(json_dict)
    return json_dict
with open('myclass.json') as fp:
    json.load(fp, object_hook=custom_decode)
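
If you don't have a myclass.json lying around, here is a small sketch that records each hook call instead of printing it, so you can inspect the order afterwards. The JSON text and the record_hook name are made up for illustration.

```python
import json

calls = []

def record_hook(json_dict):
    # remember every dictionary the parser hands us, in the order it arrives
    calls.append(json_dict)
    # return it unchanged so the final result is the ordinary parsed structure
    return json_dict

text = '{"my_int": 1, "my_list": [1, 2], "my_dict": {"inner": {"x": 0}}}'
result = json.loads(text, object_hook=record_hook)
```

After this runs, calls holds three dictionaries: the innermost {"x": 0} first, then the dictionary that wraps it, and the top-level one last. The list never reaches the hook.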

If what you want is to recover a custom object (such as the original MyClass object), this isn’t terribly useful. At this point, it becomes clear we probably have to override the default loads() behaviour. As mentioned above, we do this by subclassing JSONDecoder and overriding the decode() function. It’s not clear why there’s a lack of symmetry here with JSONEncoder: we override default() in one, and decode() in the other. But, ok.

Now, your object hook (custom_decode above) took a Python object as its argument, but the decode() function of course will receive the raw serialized string being decoded. The basic approach is to use the generic decode capability of the JSON module to parse the string that was stored on disk into a Python dictionary object. But the decoder still doesn’t know about your custom MyClass object, so what you do is actually create a new object, initializing it with the values in my_class_dict.

class MyDecoder(json.JSONDecoder):
    ''' a custom JSON decoder for MyClass objects '''
    def decode(self, json_string):
        # use json's generic decode capability to parse the serialized string
        # into a python dictionary.
        my_class_dict = json.JSONDecoder.decode(self, json_string)
        # 'save date' is ignored here; only the MyClass attributes are restored
        return MyClass(my_class_dict['my_int'], my_class_dict['my_list'],
                       my_class_dict['my_dict'])
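
Just like with the encoder, you plug a decoder subclass in through the cls argument of json.loads() or json.load(). Here is a minimal sketch of the round trip back to a MyClass object. The class is called MyDecoder here, the JSON string is a made-up sample of what the encoder would have written, and MyClass is repeated so the snippet runs on its own.

```python
import json

class MyClass:
    def __init__(self, my_int, my_list, my_dict):
        self.my_int = my_int
        self.my_list = my_list
        self.my_dict = my_dict

class MyDecoder(json.JSONDecoder):
    def decode(self, json_string):
        # let the stock decoder turn the string into a plain dictionary
        my_class_dict = json.JSONDecoder.decode(self, json_string)
        # rebuild the custom object; the extra 'save date' key is dropped
        return MyClass(my_class_dict['my_int'], my_class_dict['my_list'],
                       my_class_dict['my_dict'])

# sample input (an assumption about what was saved earlier)
json_string = ('{"my_int": 42, "my_list": [1, 2, 3], '
               '"my_dict": {"a": "b"}, "save date": "Thu Jan  1 12:00:00 2009"}')

# cls= tells loads() which decoder class to instantiate
restored = json.loads(json_string, cls=MyDecoder)

# json.load(fp, cls=MyDecoder) would do the same for an open file
```

One thing to keep in mind: a decoder written this way assumes the whole string is a single MyClass object, so it will raise a KeyError (or TypeError) if you point it at some other kind of JSON.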

And there you have it. This is a simple example, but objects and types can be nested arbitrarily; you just have to be willing to unravel them as appropriate such that you are encoding and decoding basic Python types.
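
To give a feel for what that unravelling might look like, here is a rough sketch with one custom object nested inside another. MyCollection, NestedEncoder and NestedDecoder are hypothetical names invented for this example, and the layout of the saved dictionary is just one reasonable choice, not the only one.

```python
import json

class MyClass:
    def __init__(self, my_int, my_list, my_dict):
        self.my_int = my_int
        self.my_list = my_list
        self.my_dict = my_dict

class MyCollection:
    ''' hypothetical container holding a list of MyClass objects '''
    def __init__(self, name, members):
        self.name = name
        self.members = members

class NestedEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, MyCollection):
            # members are still MyClass objects here; the encoder calls
            # default() again for each one it runs into
            return {'name': obj.name, 'members': obj.members}
        if isinstance(obj, MyClass):
            return {'my_int': obj.my_int, 'my_list': obj.my_list,
                    'my_dict': obj.my_dict}
        return json.JSONEncoder.default(self, obj)

class NestedDecoder(json.JSONDecoder):
    def decode(self, json_string):
        data = json.JSONDecoder.decode(self, json_string)
        # unravel from the inside out: rebuild each member, then the container
        members = [MyClass(m['my_int'], m['my_list'], m['my_dict'])
                   for m in data['members']]
        return MyCollection(data['name'], members)

original = MyCollection('demo', [MyClass(1, [1], {'k': 'v'}),
                                 MyClass(2, [], {})])
text = json.dumps(original, cls=NestedEncoder)
restored = json.loads(text, cls=NestedDecoder)
```

The encoder side is the easy half, since default() is applied again to whatever it returns. On the decoder side you have to do the walking yourself, which is why the members get rebuilt before the container.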

Happy serializing!
