  Versioning of resources
Added by Jim Eng, last edited by Jim Eng on Nov 12, 2006

JSR-170 supports versioning of resources, and a number of people have requested that Sakai's ContentHostingService and Resources tool offer support for versioning. This page invites comments and suggestions about what form of versioning would be appropriate in Sakai. Please add comments to the "Use Cases for Version Control" page or to this page outlining the features that should be available. Among the questions we will need to answer are:

  • Should we offer access to all versions of the "content" of a resource (the file body)?
  • If metadata about a resource (or folder) changes, should we save the old "version"?
  • Who should be able to access earlier versions of resources? Are there times when access to earlier versions should be offered to all users and other times when only certain users should have access? Should this be a system-wide setting, a site-by-site setting, a resource-by-resource setting?
  • Is there some standard or some implementation of versioning that would provide a "model" for versioning that would be appropriate for Sakai? (Please summarize features and provide links to useful documentation).

The need for version control is discussed in a few JIRA tickets, especially "Resources tool should have the ability to store multiple versions of files in Sakai", which incorporates two others: "Document Versioning" and "Add Versioning Capabilities to ContentHosting, Resources Tool, and WebDav".

Discussion of version control in general and references to specific examples can be found on Wikipedia.

Eventually, we will begin working on an implementation of version control. That will require decisions about user-interface changes to support versioning and API changes to support version control.

API changes to support Version Control (Project: Resources)
Use Cases for Version Control (Project: Resources)
Versioning User Interface (Project: Resources)

I think that since the API is designed to be adaptable to different underlying storage (Digital Repository, DR) techniques (file system, Fedora, etc.; see Content Hosting Handler), it would be nearly duplicative to provide versioning implementation code that was independent of that. Some versioning would likely be supplied by the DR technology in some cases; in others, none. So I think it best to assume only that versions may be available, and to align this, both in the UI and in the design of the API, with the similar-in-effect 'permissioned access to versions' idea.

Jon makes a very good point. Versioning should be delegated to the content hosting implementation as much as possible, and the functionality may then vary depending on the underlying implementation. For example, someone could possibly want to use Subversion for storing resources, and the versioning capabilities of Subversion should be used as much as possible. (Using Subversion is more plausible if you think about trying to make resources that already exist in Subversion available directly within Sakai.)

This seems backwards to me. User needs should drive UI and API design, rather than having the API implementation determining what can appear in the UI.

If we need the full functionality of subversion in the Resources tool, we should design a user interface with that functionality, and we should rewrite the CHS API to support all the operations needed for that. If we need a different set of capabilities, the UI and API should be defined in terms of the user needs we are trying to satisfy.

I may not have done a very good job of posing the question. I'm really asking this: What version control capabilities are needed in the Resources tool (as well as WebDAV, AccessServlet and the filepicker)?

"rather than having the API implementation determining what can appear in the UI."

I don't think I've seen anyone suggest that. I'm speaking out of my depth, but is there really a conflict here? Is there not some way to have the Resources tool ride on top of some underlying storage and still have the operation and presentation of the tool driven by user tasks and needs? Cannot the API assume some generic versioning features, and perhaps call for local integrators to provide any glue code that's needed, like a kind of resources "provider"?

I apologize if I'm misunderstanding. The current APIs for CHS do not support versioning at all, and the proposed ContentHandler APIs do not provide inherent support for versioning. Even with ContentHandlers and JSR-170, we will need some implementation of the CHS interface for the normal case of a user uploading a file and other users accessing it. If we are going to support any version-control features in that process, we need to specify the features.

So what features are we talking about?

I was only trying to avoid retrofitting an awkward 'no versions available' technique later. It would be easy to assume that versions (and meaningful responses to versioning questions put to the API) will always be available. I think it is wiser and more helpful to assume they will not. Whether an API user just gets nulls returned or specific boolean results for specific method calls, I have no opinion.
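One way to picture the "assume versions may not be available" idea is a capability check that callers consult before asking for history. This is only a sketch: `VersionAwareStore`, `PlainStore`, and `shouldShowHistoryLink` are hypothetical names invented for illustration and are not part of the existing ContentHostingService API.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical interface; NOT part of the real ContentHostingService API. */
interface VersionAwareStore {
    /** True only if the underlying storage keeps earlier versions. */
    boolean supportsVersioning();

    /** Version numbers for a resource, oldest first; empty when none are kept. */
    List<Integer> listVersions(String resourceId);
}

/** A store backed by plain storage that keeps no history at all. */
class PlainStore implements VersionAwareStore {
    public boolean supportsVersioning() {
        return false;
    }

    public List<Integer> listVersions(String resourceId) {
        // An empty list rather than null, so callers need no null checks.
        return Collections.emptyList();
    }
}

class VersionCapabilityDemo {
    /** UI-side decision: offer a history view only when there is history to show. */
    static boolean shouldShowHistoryLink(VersionAwareStore store, String resourceId) {
        return store.supportsVersioning() && !store.listVersions(resourceId).isEmpty();
    }

    public static void main(String[] args) {
        // Prints "false": a plain store never offers a history link.
        System.out.println(shouldShowHistoryLink(new PlainStore(), "/group/site/notes.txt"));
    }
}
```

Returning an empty list instead of null is just one of the two options mentioned above; a storage implementation that does keep history would return true and a non-empty list.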

I agree with Mark that metadata changes should be considered an edit and that attachment changes should version the resource they are attached to. However, I think changes to constituents of a Collection should only produce a version for the constituent and not the whole Collection. I know, I know... but this has come up in the past with respect to unpacked IMS CP and SCORM (IMS CP) PIFs.

Ditto Mark's comments.

Plus, I think there should be a history feature, of course, and an svn-style comment option for saves, with auto-numbering and incrementing of version numbers in a human-grokkable format. RetrieveByVersion. I also think it worth considering the unimaginable: Merge. And because we also have to be realistic about that: LockByVersion (we already have id locks).

You might want to consider a wider look at what versioning should support. When we wrote the wiki, we needed versioning, and would have used CHS if it had had versioning, but it didn't, so we implemented our own storage, which wasn't the best thing to do.

So there are lots of tools out there that, as a bare minimum, want or need linear versioning where the default action is on the current version, but historical versions are available and operations can be performed on those historical versions.

I would be careful about non-linear versioning, since some of the big commercial JSR-170 implementations don't support branching, and there may be institutions that have made, or are in the process of making, investments in that type of technology. For Sakai to fit into the CIO's vision, it might want to integrate with those stores.

Some of my thoughts ...

By default, the identifier of a resource refers to the "latest active version".
Each time a resource is updated, it is tagged with a version number (auto incrementing integer).
Older versions of a resource may be accessed with method variants that specify a version number.
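As a rough illustration of these three points, here is a minimal in-memory sketch. `LinearVersionStore` and its method names are hypothetical, content is simplified to a String, and nothing here reflects the actual ContentHostingService API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical linear version store; an illustration only. */
class LinearVersionStore {
    // For each resource id, every version of its content, oldest first.
    private final Map<String, List<String>> versions = new HashMap<>();

    /** Stores new content and returns its version number (1, 2, 3, ...). */
    int update(String id, String content) {
        List<String> history = versions.computeIfAbsent(id, k -> new ArrayList<>());
        history.add(content);
        return history.size();
    }

    /** Default access: the plain identifier refers to the latest version. */
    String get(String id) {
        return get(id, latestVersion(id));
    }

    /** Method variant that names an explicit version number. */
    String get(String id, int version) {
        List<String> history = versions.get(id);
        if (history == null || version < 1 || version > history.size()) {
            throw new IllegalArgumentException("No version " + version + " of " + id);
        }
        return history.get(version - 1);
    }

    /** Highest version number stored for the id, or 0 if the id is unknown. */
    int latestVersion(String id) {
        List<String> history = versions.get(id);
        return history == null ? 0 : history.size();
    }

    public static void main(String[] args) {
        LinearVersionStore store = new LinearVersionStore();
        store.update("/group/site/syllabus.txt", "draft");
        store.update("/group/site/syllabus.txt", "final");
        System.out.println(store.get("/group/site/syllabus.txt"));    // prints "final"
        System.out.println(store.get("/group/site/syllabus.txt", 1)); // prints "draft"
    }
}
```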

The separation of metadata from content is an artifact of how it is currently implemented. In my view, resource metadata is an integral part of the resource itself. Therefore, if metadata is modified, a new version should be created. This allows close tracking of all information changes to a resource.

Incidentally, attachments should be considered part of the resources, as well. Changes to an attachment, including add / remove, should cause the version number to change.

In terms of access, while it complicates the model, I do think that access to older versions may be different than access to the latest. I don't think we need to be able to define permissions on a specific version, just the set of old versions and the current one.
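A sketch of that two-bucket idea, with one set of readers for the current version and one for all older versions taken together. The `Role` values and the `VersionAccessPolicy` class are hypothetical and do not correspond to Sakai's real roles or permission functions.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical roles, for illustration only. */
enum Role { MAINTAINER, MEMBER }

/** Two access sets: one for the current version, one for all older versions. */
class VersionAccessPolicy {
    private final Set<Role> currentReaders;
    private final Set<Role> oldVersionReaders;

    VersionAccessPolicy(Set<Role> currentReaders, Set<Role> oldVersionReaders) {
        this.currentReaders = EnumSet.copyOf(currentReaders);
        this.oldVersionReaders = EnumSet.copyOf(oldVersionReaders);
    }

    /** No per-version permissions: a request is either for the latest version or for an old one. */
    boolean canRead(Role role, int requestedVersion, int latestVersion) {
        Set<Role> allowed = (requestedVersion == latestVersion) ? currentReaders : oldVersionReaders;
        return allowed.contains(role);
    }

    public static void main(String[] args) {
        VersionAccessPolicy policy = new VersionAccessPolicy(
                EnumSet.of(Role.MAINTAINER, Role.MEMBER), // everyone reads the current version
                EnumSet.of(Role.MAINTAINER));             // only maintainers read old versions
        System.out.println(policy.canRead(Role.MEMBER, 3, 3)); // prints "true"
        System.out.println(policy.canRead(Role.MEMBER, 1, 3)); // prints "false"
    }
}
```

Whether such a policy lives at the system, site, or resource level is left open here, as it is in the questions at the top of the page.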

While we are talking about versioning, it would be very useful to have a method that determines the difference between two specified versions (or the latest). We should be able to archive AND RESTORE all versions prior to a specified point. I think a linear versioning system is sufficient. We don't need branching.
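For text resources, a "difference between two versions" method could be as simple as a line-based diff. The sketch below uses a standard longest-common-subsequence table; `LineDiff` is a hypothetical helper, not an existing Sakai class, and binary content would need a different approach.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: line-based diff of two text versions using an LCS table. */
class LineDiff {
    /** Returns lines prefixed with "  " (unchanged), "- " (removed) or "+ " (added). */
    static List<String> diff(String oldText, String newText) {
        String[] a = oldText.split("\n", -1);
        String[] b = newText.split("\n", -1);

        // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
        int[][] lcs = new int[a.length + 1][b.length + 1];
        for (int i = a.length - 1; i >= 0; i--) {
            for (int j = b.length - 1; j >= 0; j--) {
                lcs[i][j] = a[i].equals(b[j])
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        // Walk both versions, emitting removals before additions on ties.
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i].equals(b[j])) {
                out.add("  " + a[i]);
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                out.add("- " + a[i++]);
            } else {
                out.add("+ " + b[j++]);
            }
        }
        while (i < a.length) out.add("- " + a[i++]);
        while (j < b.length) out.add("+ " + b[j++]);
        return out;
    }

    public static void main(String[] args) {
        // Prints four lines: "  a", "- b", "+ x", "  c"
        for (String line : diff("a\nb\nc", "a\nx\nc")) {
            System.out.println(line);
        }
    }
}
```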

  • Mark Norton

I especially like those first three as a concise feature summary. Having some vague impression that healthy collaboration calls for people to own (up to) their actions, I'm left with slightly altered versions:

  • By default, the identifier of a resource refers to the "latest active version".
  • Each time a resource is updated, it is tagged with a version number (auto incrementing integer) and the time and identity of the updater.
  • Older versions of a resource may be accessed with method variants that specify a version number and the time and identity of the updater.
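To make the altered list concrete, each version can carry its number together with who made the update and when, which also gives a "version history" view something to display. `VersionInfo` and `VersionHistory` are hypothetical names used only for this sketch.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical record of a single update: number, time, and updater. */
final class VersionInfo {
    final int number;
    final Instant updatedAt;
    final String updatedBy;

    VersionInfo(int number, Instant updatedAt, String updatedBy) {
        this.number = number;
        this.updatedAt = updatedAt;
        this.updatedBy = updatedBy;
    }
}

/** Hypothetical per-resource history with auto-incrementing version numbers. */
class VersionHistory {
    private final List<VersionInfo> entries = new ArrayList<>();

    /** Records an update and tags it with the next version number. */
    VersionInfo record(String updatedBy, Instant updatedAt) {
        VersionInfo info = new VersionInfo(entries.size() + 1, updatedAt, updatedBy);
        entries.add(info);
        return info;
    }

    /** Entries for a history view, most recent first. */
    List<VersionInfo> newestFirst() {
        List<VersionInfo> copy = new ArrayList<>(entries);
        Collections.reverse(copy);
        return copy;
    }

    public static void main(String[] args) {
        VersionHistory history = new VersionHistory();
        history.record("user-a", Instant.parse("2006-11-12T10:00:00Z"));
        history.record("user-b", Instant.parse("2006-11-13T09:30:00Z"));
        for (VersionInfo v : history.newestFirst()) {
            System.out.println("v" + v.number + " by " + v.updatedBy + " at " + v.updatedAt);
        }
    }
}
```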

As a matter of convention, I would expect a "Version history" view of some sort as well. It would also be highly desirable to present textual diffs of some sort or other, e.g., the color-coded diff view which has become common in wikis and source control tools.

"In my view, resource metadata is an integral part of the resource itself."

I can appreciate the principle, and so think that this should be allowed for, but when I pause to think about a use case that people here might care about, I struggle to come up with one. I think our users would prefer that we configure things only to version resources themselves, and not their metadata. So I suppose I'm lobbying for such (limiting) configuration to be possible.

"In terms of access, while it complicates the model, I do think that access to older versions may be different than access to the latest. I don't think we need to be able to define permissions on a specific version, just the set of old versions and the current one."

I think any extra flexibility that might come from access control applied to specific versions would be much weighed down by the attendant complexity and confusion. This is just not something people want to have to think about. Again, if it were allowed for, I'd lobby for a way to obscure it.

I agree with the policy that changes in the metadata of a resource do not lead to a new version of the resource.

To bridge from theory to practical use, I will briefly describe the use of versioning in our content management systems (e.g. Stellent*), where we store anything from learning materials and news articles to pictures and Macromedia Flash code fragments.

  • When the content changes, we want to create a new version of the resource. This allows people to refer to a stable version of the content, and use an older version when they need to.
  • When we want to change the usage of the resource, we change the appropriate metadata on the same version. Each version of the resource holds its own set of metadata values. We know how a specific version of the resource was/is used and feel no need for a history of the metadata for one version of the content of the resource.
  • To describe the changes that were made (in the content or the use), we use a comments field.
  • Branching is not available, nor is it missed; linear versioning works just fine for us.

* Stellent has a long history as a document and content management system, and it has recently been acquired by Oracle. They have implemented a model that works quite well (for us).
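The model described in the list above might be sketched like this: content changes create a new version, metadata is edited in place on a given version, and a comment travels with each version. All names are hypothetical; this is not Stellent's or Sakai's API, and copying the previous version's metadata into a new version is an assumption made only to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical version: its own content, comment, and metadata values. */
class ContentVersion {
    final int number;
    final String content;
    final String comment;
    final Map<String, String> metadata = new HashMap<>();

    ContentVersion(int number, String content, String comment) {
        this.number = number;
        this.content = content;
        this.comment = comment;
    }
}

/** Hypothetical resource with linear versions and per-version metadata. */
class VersionedResource {
    private final List<ContentVersion> versions = new ArrayList<>();

    /** A content change always creates a new version. */
    ContentVersion updateContent(String content, String comment) {
        ContentVersion next = new ContentVersion(versions.size() + 1, content, comment);
        if (!versions.isEmpty()) {
            // Assumption: a new version starts from the previous version's metadata.
            next.metadata.putAll(versions.get(versions.size() - 1).metadata);
        }
        versions.add(next);
        return next;
    }

    /** A metadata change edits the given version in place; no new version, no metadata history. */
    void setMetadata(int version, String key, String value) {
        versions.get(version - 1).metadata.put(key, value);
    }

    ContentVersion version(int number) {
        return versions.get(number - 1);
    }

    int versionCount() {
        return versions.size();
    }

    public static void main(String[] args) {
        VersionedResource resource = new VersionedResource();
        resource.updateContent("first text", "initial upload");
        resource.setMetadata(1, "usage", "course reader");
        resource.updateContent("second text", "fixed typos");
        resource.setMetadata(2, "usage", "news article");
        System.out.println(resource.versionCount());                   // prints "2"
        System.out.println(resource.version(1).metadata.get("usage")); // prints "course reader"
    }
}
```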

I would advocate not including identifying information in the identifier itself. There will most likely be a need for versioning in an anonymous context. It would be better to have that information in the metadata, where it could be used to control workflow as needed, but where permissions could be used to prevent it from being surfaced when not appropriate.

IMHO, versioning should cover both file content and metadata. If we support the most granular level (i.e. like svn), then we can build some really cool applications on top of this versioned data. For what it is worth, the Navigo assessment engine was built entirely on top of a home-grown versioning content system. If we limit the number of revisions or what is versioned, it also limits what can be built on top of such a system. Thanks, Lance
