The ODIn Lab - Filesystems, Application Semantics, and Walled Gardens (Part 5)

In the past weeks, I've been talking about how a persistence layer for web applications could be developed, informed by both the successes and failures of the iOS walled garden document model. This week, I'll (hopefully) wrap up with a discussion of how three benefits of iOS can be incorporated into a web application filesystem.

Interface Formats

A significant virtue of the iOS document model is that applications are forced to get their data into standardized formats before exporting it, stripping off the application-specific metadata and passing along only the stuff that another application is guaranteed to understand. A similar phenomenon, rooted in the old-school idea of web mashups, can be found on the web. Each application builds on a standardized data representation (e.g., Google Maps, or Facebook Comments). The data representation has a standardized interface, and allows users to put their own data on top of it.
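To make the "strip before export" idea concrete, here is a minimal sketch. The field names, the INTERFACE_FIELDS set, and export_contact are all made up for illustration; they are not part of iOS or of any real web API.

```python
import json

# Hypothetical: the fields every consumer of a "contact" record is assumed
# to understand. Anything else is treated as application-specific.
INTERFACE_FIELDS = {"name", "email", "phone"}


def export_contact(internal_record: dict) -> str:
    """Keep only the standardized fields and serialize them as JSON."""
    shared = {k: v for k, v in internal_record.items() if k in INTERFACE_FIELDS}
    return json.dumps(shared, sort_keys=True)


internal = {
    "name": "Ada",
    "email": "ada@example.com",
    "_ui_color": "teal",        # application-specific, never exported
    "_last_sync_token": "abc",  # application-specific, never exported
}

exported = export_contact(internal)
print(exported)
```

The receiving application only ever sees the shared fields, so it never has to guess what the exporter's private metadata means.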

There was once a part of Mac OS (back in the System 7 days) called OpenDoc. Although it didn't survive, for political reasons, it actually featured some extremely nifty ideas. The core concept was that of nested document types. An application would register itself, not as a standalone component of the operating system capable of editing entire files, but rather as having the ability to edit data of a certain type. For example, you would have a word processing application component. These application components could be nested -- the word processing document could have graphical data embedded within it, and would allow the user to edit that graphical data through a fully-featured graphical editor.
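The registration idea can be sketched in a few lines. This is not OpenDoc's actual API; Part, register_editor, and open_document are hypothetical names, and the "editors" just return a label saying which component would handle each piece.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Part:
    """A typed chunk of a document that may contain other typed chunks."""
    kind: str
    content: str
    children: List["Part"] = field(default_factory=list)


# Hypothetical registry: data type -> the component that can edit it.
EDITORS: Dict[str, Callable[[Part], str]] = {}


def register_editor(kind: str):
    """Decorator that registers a function as the editor for one data type."""
    def wrap(fn: Callable[[Part], str]) -> Callable[[Part], str]:
        EDITORS[kind] = fn
        return fn
    return wrap


@register_editor("text")
def edit_text(part: Part) -> str:
    return f"text editor <- {part.content!r}"


@register_editor("drawing")
def edit_drawing(part: Part) -> str:
    return f"drawing editor <- {part.content!r}"


def open_document(part: Part, depth: int = 0) -> List[str]:
    """Walk the nested parts, handing each one to the editor for its type."""
    editor = EDITORS.get(part.kind)
    label = editor(part) if editor else f"no editor for {part.kind!r}"
    lines = ["  " * depth + label]
    for child in part.children:
        lines.extend(open_document(child, depth + 1))
    return lines


doc = Part("text", "Quarterly report", [Part("drawing", "sales chart")])
opened = open_document(doc)
print("\n".join(opened))
```

The word processor never needs to know how drawings work; it only needs to know that some registered component does.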

Similar ideas have been brought up in the web world. XML (and, to a lesser extent, HTML) is a good example: its nested structure echoes this idea perfectly, and many a web application has been embedded into another through iframes.

The web is ideal for this sort of application design. The persistence layer should encourage developers to structure their applications to work with this general layout.

Presentation

Filesystems are tricky. People have different ways of identifying document data. Although the filesystem has always forced us to use filenames, this archaic concept is rapidly being replaced in many contexts with internal identifiers that the user never needs to see or interact with. Different types of data can be presented in different ways. Short summaries of text (something commercial operating systems have automated the extraction of since the mid-90s) can be useful for paging through textual data. Thumbnails are excellent for graphics. Short previews work for video/audio data. Header comments are useful overviews of code. Even things like date/time last modified can be useful in the identification of what you're looking for. In short, the filesystem needs to be able to work with the application in order to better visualize the data contained within. The OS X Quick Look feature is a great example of this, as each application can provide a plugin that quickly renders a preview snapshot of individual files.
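A rough sketch of what "working with the application" might look like: each application contributes a preview function for its data type, and the filesystem falls back to the modification date when nobody has. The type strings, PREVIEWERS, and preview are hypothetical names for illustration, not the real Quick Look plugin interface.

```python
from datetime import datetime
from typing import Callable, Dict


def text_summary(data: bytes) -> str:
    """Use the first non-empty line of the text, truncated, as a summary."""
    text = data.decode("utf-8", errors="replace").strip()
    first_line = text.splitlines()[0] if text else ""
    return first_line[:60]


def code_header(data: bytes) -> str:
    """Use the leading block of '#' comments as an overview of a source file."""
    header = []
    for line in data.decode("utf-8", errors="replace").splitlines():
        if not line.startswith("#"):
            break
        header.append(line.lstrip("# ").rstrip())
    return " ".join(header)


# Hypothetical registry: data type -> preview function supplied by an application.
PREVIEWERS: Dict[str, Callable[[bytes], str]] = {
    "text/plain": text_summary,
    "text/x-python": code_header,
}


def preview(kind: str, data: bytes, modified: datetime) -> str:
    """Ask the registered provider for a preview; fall back to the date."""
    provider = PREVIEWERS.get(kind)
    if provider is None:
        return f"modified {modified:%Y-%m-%d}"
    return provider(data)


stamp = datetime(2012, 12, 21)
print(preview("text/plain", b"Hello world\nmore text", stamp))
print(preview("text/x-python", b"# Parses config files\n# for the demo\nimport os", stamp))
print(preview("video/x-unknown", b"", stamp))
```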

Non-Document Data

Non-document data is, quite frankly, the hardest to work with. Consider an address book, or a BibTeX bibliography manager. You might have multiple individual address books, or multiple bibliography documents, but ultimately you want the data in these address books or bibliographies to be linked. If your friend and coworker changes their phone, you want the phone number updated in both your personal and work contact lists. If you find a typo in a bibliography entry, you want to fix the typo in all of your manuscripts. Centralization is critical.
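One way to get the linking described above is to store each entry exactly once and have every address book hold references to entries rather than copies. The sketch below assumes a simple in-memory store; the class and the entry identifier are invented for illustration.

```python
from typing import Dict, List


class AddressBook:
    """A named list of references into a shared entry store."""

    def __init__(self, name: str, store: Dict[str, dict]) -> None:
        self.name = name
        self.store = store            # shared by every address book
        self.entry_ids: List[str] = []

    def add(self, entry_id: str) -> None:
        self.entry_ids.append(entry_id)

    def phone_of(self, entry_id: str) -> str:
        # Always read through to the single shared copy.
        return self.store[entry_id]["phone"]


# The single copy of each entry, keyed by a hypothetical internal identifier.
entries: Dict[str, dict] = {
    "person-1": {"name": "Sam", "phone": "555-0100"},
}

personal = AddressBook("personal", entries)
work = AddressBook("work", entries)
personal.add("person-1")
work.add("person-1")

# Sam changes phone numbers: one update, visible from both books.
entries["person-1"]["phone"] = "555-0199"

print(personal.phone_of("person-1"), work.phone_of("person-1"))
```

The same shape works for bibliographies: manuscripts reference entries by key, and fixing a typo in the entry fixes it everywhere.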

This in turn makes it hard to share data without running into the security concerns I discussed last week. Logical groupings of entries are one way to manage access control, and specialized interface widgets are another. It also means that, to be efficient, the filesystem has to operate at a sub-document granularity. For example, the HFS filesystem featured a quaint little idea called a resource fork. In addition to the normal notion of a file as a sequence of bytes (the data fork), each file also had a structured component. The structured component contained lots of bits of data (resources), each individually addressable. Moreover, there were standardized ways of accessing these bits of data. For example, you could have a collection of icons in one file. The operating system provided primitives that allowed any application to get into that file and access each of those icons as needed. More recently, OS X's notion of Packages or Bundles achieves a similar end. Applications are conceptually a single file, but have structured contents in standardized formats such as graphical data or XML/plist.
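As a toy model of the two-fork idea: a file carries an opaque byte stream plus a table of individually addressable resources, here keyed by a type code and a numeric ID. This is a simplified illustration, not the real Resource Manager interface; ForkedFile and its methods are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ForkedFile:
    """A file with an opaque data fork and a structured resource fork."""
    data_fork: bytes = b""
    resources: Dict[Tuple[str, int], bytes] = field(default_factory=dict)

    def add_resource(self, rtype: str, rid: int, payload: bytes) -> None:
        self.resources[(rtype, rid)] = payload

    def get_resource(self, rtype: str, rid: int) -> bytes:
        """Fetch one resource without reading anything else in the file."""
        return self.resources[(rtype, rid)]

    def list_ids(self, rtype: str) -> List[int]:
        """All resource IDs of one type, e.g. every icon in the file."""
        return sorted(rid for (t, rid) in self.resources if t == rtype)


icons = ForkedFile(data_fork=b"")
icons.add_resource("ICON", 128, b"<folder icon bits>")
icons.add_resource("ICON", 129, b"<document icon bits>")
icons.add_resource("STR ", 128, b"Untitled")

# Any application can pull out just the piece it needs.
print(icons.list_ids("ICON"))
print(icons.get_resource("ICON", 129))
```

The point is the addressing scheme: another application can ask for one icon by type and ID without understanding, or even reading, the rest of the file.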

In short, it is extremely helpful if the filesystem supports (securely) being able to drill down into data from other applications, no matter how it's structured.

Well, that's it for this thread. On to new and wonderful things next week. Happy Holidays, and qoSraj QI'lop jaj ghubDaQ.