Rhaptos Software Development

Distributed Repository

Outline of possibilities for dealing with a repository made from multiple Rhaptos installations. Transcribed from notes.
A federated/syndicated model...

There are many individual sites: physical installations of Rhaptos at various locations. Perhaps they are called nodes. Each is identified by a domain (like cnx.rice.edu, cnx.iub.edu, cnx.novell.com).

Individual sites contain modules, as ours does now.
    Module names must be unique within the domain, as they are now. The naming scheme is up to the installation.

Sites are collected in pools/federations. This may happen with a central master registry or simply by registering interest with a site already in that federation, in which case the list of participants will spread throughout the pool as the individual machines communicate with each other.
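
The registry-free option can be sketched as a simple gossip exchange. This is only an illustration, assuming each site keeps a set of known domains and that two sites that talk merge their sets; the function names are hypothetical and nothing here is an existing Rhaptos API.

```python
def gossip_round(views: dict[str, set[str]]) -> dict[str, set[str]]:
    """One round: every site exchanges participant lists with each site it knows."""
    new_views = {site: set(peers) for site, peers in views.items()}
    for site, peers in views.items():
        for peer in peers:
            if peer == site or peer not in views:
                continue  # skip ourselves and sites that are not reachable
            merged = views[site] | views[peer]
            new_views[site] |= merged
            new_views[peer] |= merged
    return new_views


def spread(views: dict[str, set[str]]) -> dict[str, set[str]]:
    """Repeat rounds until no site learns anything new."""
    while True:
        updated = gossip_round(views)
        if updated == views:
            return views
        views = updated


# cnx.novell.com joins by registering interest with cnx.iub.edu only.
initial = {
    "cnx.rice.edu": {"cnx.rice.edu", "cnx.iub.edu"},
    "cnx.iub.edu": {"cnx.iub.edu"},
    "cnx.novell.com": {"cnx.novell.com", "cnx.iub.edu"},
}
final = spread(initial)
```

After a few rounds every site in this toy pool knows all three domains, even though cnx.novell.com never contacted cnx.rice.edu directly.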

Any site can search, browse, or link to any content in the repository of another site in its federation. Probably this is done with a common REST API, and with local indexing/caching of search results.
    Viewing "foreign" content (content from other nodes in the pool) will be in the context of the local site. This may look like
       http://cnx.org/content/other.cnx.org/m10012
    Alternately, we may discover that a unique suffix is easier.
    Must determine the distinction in UI between local and foreign content. How transparent is it?
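
A minimal sketch of how the path form above might be told apart from a local one. It assumes local content lives at /content/<module> and foreign content at /content/<domain>/<module>; the local form, ContentRef, and both helper functions are hypothetical names for illustration.

```python
from typing import NamedTuple, Optional


class ContentRef(NamedTuple):
    domain: Optional[str]  # None means the content is local to this site
    module_id: str


def parse_content_path(path: str) -> ContentRef:
    """Split a /content/... path into (domain, module id)."""
    parts = [p for p in path.split("/") if p]
    if len(parts) == 2 and parts[0] == "content":
        return ContentRef(None, parts[1])      # assumed local form
    if len(parts) == 3 and parts[0] == "content":
        return ContentRef(parts[1], parts[2])  # foreign form from the notes
    raise ValueError(f"not a content path: {path!r}")


def content_path(ref: ContentRef) -> str:
    """Inverse of parse_content_path."""
    if ref.domain is None:
        return f"/content/{ref.module_id}"
    return f"/content/{ref.domain}/{ref.module_id}"


foreign = parse_content_path("/content/other.cnx.org/m10012")
local = parse_content_path("/content/m10012")
```

Here the number of path segments is what distinguishes local from foreign content; a unique suffix, as mentioned above, would need a different parser.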

However, sites may come and go for various reasons. We don't want important resources to disappear (the Connexions contract is that nothing disappears, at least for local content, which is all we have now).

So...

Any site may, based on some motivation, host (copy/download/grab/etc) foreign content, making it locally stored. So if the local site is available, that foreign content will be available, regardless of the status of the remote site. Also, the remote site will probably not be hit for viewing content hosted locally.
    All versions are stored, probably.
    It will not go away, and is still accessed with the same URL as above.
    Hosting also requires the local site to register for change notices from the original site, to poll occasionally, or both, in order to keep up with new versions.
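
The hosting behaviour described above could look roughly like this. It is a sketch under assumptions: HostedStore is a hypothetical class, versions are plain strings, and fetch_versions stands in for whatever remote call (REST or otherwise) would return all versions of a module.

```python
from typing import Callable, Dict, List, Tuple

Key = Tuple[str, str]                   # (domain, module id)
Fetch = Callable[[Key], Dict[str, bytes]]  # returns {version: body} from the origin


class HostedStore:
    """Local copies of foreign modules, keeping every version seen so far."""

    def __init__(self) -> None:
        self._versions: Dict[Key, Dict[str, bytes]] = {}

    def host(self, key: Key, fetch_versions: Fetch) -> List[str]:
        """Start hosting a foreign module and pull its current versions."""
        self._versions.setdefault(key, {})
        return self.refresh(key, fetch_versions)

    def refresh(self, key: Key, fetch_versions: Fetch) -> List[str]:
        """Poll the origin (or react to a change notice); return new versions."""
        stored = self._versions[key]
        added = []
        for version, body in fetch_versions(key).items():
            if version not in stored:
                stored[version] = body
                added.append(version)
        return added

    def get(self, key: Key, version: str) -> bytes:
        """Serve from the local copy; the origin is never contacted here."""
        return self._versions[key][version]


# A fake origin standing in for the remote site.
origin = {("other.cnx.org", "m10012"): {"1.1": b"first version"}}
fetch = lambda key: dict(origin[key])

store = HostedStore()
key = ("other.cnx.org", "m10012")
first_pull = store.host(key, fetch)
origin[key]["1.2"] = b"second version"  # a new version is published remotely
second_pull = store.refresh(key, fetch)
del origin[key]                         # the remote site disappears
still_there = store.get(key, "1.1")
```

Because get reads only local state, hosted content stays available when the remote site goes away, which is the point of hosting.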

Hosting may happen for several reasons:
  - used in a local course, or referred to by a local module
  - chosen by admin
  - chosen by user
  - based on traffic to that content
  - entire site mirror

Problems
  - keeping content fresh; requires polling and/or notification
  - usage statistics get spread out; having hosting sites report local usage to the canonical source may fix that, though the accuracy of such reports cannot be enforced, so statistics will have to distinguish locally measured counts from remotely reported ones.
  - editing of content hosted elsewhere is problematic; making foreign content ineligible for workspace and directing users to the original is the most reasonable approach, but distributed publish (with a web services API) may be possible
  - user accounts may be different per system, unless we enforce an SSO or federated identity system as well
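
For the usage-statistics problem, one way to keep the distinction is to store the canonical site's own counts separately from counts reported by hosting sites. A rough sketch, with UsageStats and its methods as hypothetical names:

```python
from collections import Counter
from typing import Dict


class UsageStats:
    """View counts at the canonical site, split by where they were measured."""

    def __init__(self) -> None:
        self.local: Counter = Counter()              # measured here, trusted
        self.reported: Dict[str, Counter] = {}       # per reporting domain, unverified

    def record_local_view(self, module_id: str) -> None:
        self.local[module_id] += 1

    def accept_report(self, domain: str, counts: Dict[str, int]) -> None:
        """Add counts that a hosting site says it served."""
        self.reported.setdefault(domain, Counter()).update(counts)

    def summary(self, module_id: str) -> dict:
        by_site = {d: c[module_id] for d, c in self.reported.items() if c[module_id]}
        return {
            "local": self.local[module_id],
            "reported": by_site,
            "total": self.local[module_id] + sum(by_site.values()),
        }


stats = UsageStats()
stats.record_local_view("m10012")
stats.record_local_view("m10012")
stats.accept_report("cnx.iub.edu", {"m10012": 5})
summary = stats.summary("m10012")
```

A report built this way can show the total while still making clear how much of it rests on numbers the canonical site cannot verify.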

Other possible methods...

Massive distributed caching: content is almost entirely read, so load distribution by geographically-distributed cache machines is feasible (and cheap!). We will need to make sure we have good cache invalidation. Also, interactive use (searches, browse, editing) will still be served from a single point.
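
To make the invalidation concern concrete, here is a toy cache that serves reads locally and drops an entry when told the module changed. EdgeCache and the fetch callable are hypothetical; a real deployment would more likely use an HTTP cache with purge requests.

```python
from typing import Callable, Dict


class EdgeCache:
    """Read-through cache for module content, invalidated on publish."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch            # call to the single origin site
        self._store: Dict[str, str] = {}
        self.origin_hits = 0

    def get(self, module_id: str) -> str:
        if module_id not in self._store:
            self._store[module_id] = self._fetch(module_id)
            self.origin_hits += 1
        return self._store[module_id]

    def invalidate(self, module_id: str) -> None:
        """Called when the origin announces a new version of the module."""
        self._store.pop(module_id, None)


origin = {"m10012": "version 1.1"}
cache = EdgeCache(lambda module_id: origin[module_id])

first = cache.get("m10012")
second = cache.get("m10012")          # served from cache, no origin hit
origin["m10012"] = "version 1.2"      # a new version is published
stale = cache.get("m10012")           # still the old copy until invalidated
cache.invalidate("m10012")
fresh = cache.get("m10012")
```

The stale read in the middle is exactly what good invalidation has to prevent: the cache only catches up once the origin's notice reaches it.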

Single application, multiple locations: with a VPN, place ZEO clients in other locations. Backend is still the same, though there exist solutions for multi-source PostgreSQL and ZEO servers.

Fully replicated repository: each installation is independent, but modules are fully replicated among a set of related installations. Each might reserve a block of ids, for instance. Editing becomes a problem due to the tight coupling.
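
Reserving blocks of ids might be as simple as the following sketch. The block size, the numeric id form (an "m" prefix plus a number, as in m10012), and both class names are assumptions made for illustration.

```python
from typing import Dict


class IdBlockRegistry:
    """Hands each installation a disjoint block of numeric module ids."""

    def __init__(self, block_size: int = 10000) -> None:
        self.block_size = block_size
        self._blocks: Dict[str, range] = {}

    def reserve(self, domain: str) -> range:
        if domain not in self._blocks:
            start = len(self._blocks) * self.block_size
            self._blocks[domain] = range(start, start + self.block_size)
        return self._blocks[domain]


class SiteIdAllocator:
    """Used by one installation to mint ids from its reserved block."""

    def __init__(self, block: range) -> None:
        self._ids = iter(block)

    def next_module_id(self) -> str:
        # Raises StopIteration once the block is exhausted.
        return f"m{next(self._ids)}"


registry = IdBlockRegistry()
rice_block = registry.reserve("cnx.rice.edu")
iub_block = registry.reserve("cnx.iub.edu")

iub = SiteIdAllocator(iub_block)
new_ids = [iub.next_module_id(), iub.next_module_id()]
```

Since the blocks never overlap, installations can create modules independently and replicate them without id collisions; this does nothing for the editing problem noted above.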

Shared discovery: each site can find content on other sites, and probably even include them in courses and as CNXNs, but always refers to the external resource.

Note that many of these are not exclusive, and some might be considered as precursors to others.
Created by jccooper
Last modified 2006-12-15 17:56