About Developer Information
This is the main page for developer information regarding ProteomeCommons.org. Here you'll find information about how you can work with ProteomeCommons.org and what projects we currently have bounties on.

Development @ ProteomeCommons.org

Several projects are actively developed here at ProteomeCommons.org. Each of these projects is hosted on this server, and users may freely access the absolute latest version of the code. We also offer some formal guidelines for development of proteomics software and data sets. These guidelines need not be strictly followed, but they are strongly encouraged because they simplify both collaboration and user adoption of new projects.

This document is further broken into several subsections: software development, peak list and spectra data set development, and access information.

Software Projects

ProteomeCommons.org software projects are intended for general development by anyone willing to contribute code. All projects are placed in a subversion repository and maintenance branches are created for each release of the code. The development branch may always be found in the "dev" directory of the project's repository and releases may always be found in a directory named after the release (e.g. "Foo-1.0"). This practice is fairly standard with versioned software projects, and it makes it relatively easy to have stable releases of the code without preventing active development. In addition to this standard branching, each branch of the project should organize files in the following manner.

./dist
Distribution files for the project. Files that are present in a distribution of your project should be organized in the ./dist directory. Use the dist directory as an expanded version of your project's ProteomeCommons.org archive, i.e. after building the project branch a developer should be able to simply ZIP compress the ./dist directory to create a ProteomeCommons.org archive.
./dist/proteomecommons.org
The meta-file for the project branch.
./dist/index.html
The main introductory page for the project should be in HTML format and included as index.html in the dist of the project. ProteomeCommons.org archives are publicly displayed on the WWW, and this is the file that users are directed to in order to find project-specific documentation. Whether you intend your project for viewing on the WWW or locally on a computer, use ./dist/index.html as the standard entry point for documentation.
./dist/docs
All project documentation should be placed within this directory, with the exception of ./dist/index.html, which should reference any documentation included in the ./dist/docs folder.
./src
The source code for the project. Source-code files may be ordered in any manner within the src directory. For example, a Java project using the package org.proteomecommons would have its source-code in the ./src/org/proteomecommons directory. This practice ensures that developers know exactly where to look for any source-code included in the project.
./lib
Supporting libraries should be included in the ./lib directory of your project.
./README
Any special information required for the project should be included in a plain text file named README. This should include documentation about anything a developer should know about before trying to build or edit the project.
Build Files, Licenses, etc.
Build files, licensing information, and anything of the like should be included in the root directory of the project. In general, the goal is to keep the root directory clean and with only a few directories and the README file.
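
The layout above can be sketched as a few shell commands. This is only an illustration under stated assumptions: the project name Foo, the temporary directory, and the placeholder file contents are hypothetical, and the real contents of the proteomecommons.org meta-file are not covered here.

```shell
# Hypothetical project root; a temporary directory keeps the sketch side-effect free.
project_root=$(mktemp -d)
branch="$project_root/dev"

# Directories named in the guidelines above.
mkdir -p "$branch/dist/docs" "$branch/src" "$branch/lib"

# Placeholder files; real contents are up to the project.
: > "$branch/dist/proteomecommons.org"   # meta-file (left empty in this sketch)
printf '<html><body><h1>Foo</h1><p>See the docs folder.</p></body></html>\n' > "$branch/dist/index.html"
printf 'Build and editing notes for the Foo project.\n' > "$branch/README"
: > "$branch/build.sh"                   # example build file in the branch root

# Show the resulting tree.
(cd "$branch" && find . | sort)
```

The root of the branch ends up holding only a few directories, the README, and the build file, which is the goal stated above.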

Some good examples of all of the above may be found in the ArchiveChecker and Falk Model projects. Note that the above layout is only intended for the subversioned development branches of a project, i.e. code that developers need. The actual structure of the ProteomeCommons.org archive is largely up to you. We only require that you include the proteomecommons.org meta-file, and we suggest you use index.html and docs for documentation.

Before concluding, let us look at a quick example. Imagine a fictitious project named Foo that is started as a project here at ProteomeCommons.org. Initially the repository would have the following structure.

./dev/README
./dev/src/SomeCode.java
./dev/build.sh
./dev/dist/index.html
./dev/dist/proteomecommons.org

The repository has nothing more than a README file, some code, a build script, and some documentation. Note that everything is under the "dev" directory. Remember "dev" means the active development branch of the code. If someone wanted to check out the latest snapshot of the code they would always use svn+ssh://www.proteomecommons.org/svn/Foo/dev. In subversion the concept of branches is nothing more than a sub-directory of the project.

Now imagine that the Foo project matures to the point of an initial release, named "Foo-1.0". When the release is made, the developers of the Foo project will want two things: the ability to actively develop the Foo code, and the ability to easily fix minor bugs in Foo-1.0 without slowing down new development. This is achieved by formally branching the project to have both a "dev" and a "Foo-1.0" branch. The new repository structure would be the following.

./dev/README
./dev/src/SomeCode.java
./dev/build.sh
./dev/dist/index.html
./dev/dist/proteomecommons.org

./Foo-1.0/README
./Foo-1.0/src/SomeCode.java
./Foo-1.0/build.sh
./Foo-1.0/dist/index.html
./Foo-1.0/dist/proteomecommons.org

Any minor fixes to the official Foo-1.0 release (assuming Foo-1.0 was released to users via ProteomeCommons.org) would be committed under the Foo-1.0 directory and optionally rolled into the dev branch. Any new development of the Foo code would occur in the dev branch, and it would be included in a future release or, if desired, back-ported to a Foo-1.x release. In general, versioning of projects continues as described above, with each version getting another maintenance branch in the repository. When branching, remember to use the svn cp command for a cheap copy; otherwise you will physically copy the entire project and waste repository space.
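
As a rough sketch of the branching step, the helper below only prints the svn cp command for a release branch; it does not contact any repository. The function name branch_command and the commit message are hypothetical, and the repository URL is taken from the Foo example above.

```shell
# Prints the svn command that would create a release branch from "dev".
# Nothing is executed; copy the printed line into a shell once it looks right.
branch_command() {
    repo_url=$1   # e.g. svn+ssh://www.proteomecommons.org/svn/Foo
    release=$2    # e.g. Foo-1.0
    printf 'svn cp %s/dev %s/%s -m "Create %s maintenance branch"\n' \
        "$repo_url" "$repo_url" "$release" "$release"
}

branch_command svn+ssh://www.proteomecommons.org/svn/Foo Foo-1.0
```

Copying from one repository URL to another is done on the server side, so no working copy is needed and the new branch shares storage with "dev" until the two diverge.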

Data Sets

Data sets are structured in a similar fashion to software projects, and versioning, if required, should be carried out in an identical manner. Data sets shouldn't use the ./dist directory at all. Instead, put all files in the base directory of the branch, keeping the same sub-folders that the dist directory normally has. Peak lists and spectra should be placed in the ./peaklists and ./spectra folders respectively. Use the Michrom Tryptic Digest Standards Peak Lists and Michrom Tryptic Digest Standards Spectra projects for model examples of data sets.
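
One possible reading of the data set layout is sketched below. It is an assumption-laden illustration, not the layout of any existing project: the temporary directory and placeholder contents are hypothetical, and both ./peaklists and ./spectra are shown in a single branch purely for brevity.

```shell
# Hypothetical data set branch; no ./dist directory is used.
dataset_root=$(mktemp -d)
ds="$dataset_root/dev"

# The sub-folders dist would normally have, plus the two data folders.
mkdir -p "$ds/docs" "$ds/peaklists" "$ds/spectra"

# Meta-file and entry page sit directly in the base directory of the branch.
: > "$ds/proteomecommons.org"            # meta-file (left empty in this sketch)
printf '<html><body><h1>Example data set</h1></body></html>\n' > "$ds/index.html"
printf 'Notes on how the data was acquired.\n' > "$ds/README"

(cd "$ds" && find . | sort)
```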

Spectra are normally much larger than peak lists; however, peak lists are much easier to work with using most MS and MS/MS tools. Spectra are required for any data set posted on ProteomeCommons.org, but it is suggested you split the spectra and peak lists into two separate projects so that users may download what they please.

Code Releases (i.e. putting your code on-line)

This section is devoted to putting your project on-line at www.proteomecommons.org, and special attention is placed on adding HTML pages to your project. Having a web page on www.proteomecommons.org where others can download your code is different from placing code in our subversion repositories. Anyone may archive a public release of their code or data set on this website, and if you happen to have HTML web pages in your project, those pages will be made available on www.proteomecommons.org.

Putting your project on-line

Putting your project on-line is as simple as sending in a release of your code or data set. Normally we expect to get a single compressed file (ZIP, GZIP, or TAR+GZIP) that contains your entire project, exactly as you would like it on-line. We then uncompress that file and put all the contents on-line under a single project. The site then manages the project, just like any other project in the archive. Send an e-mail to the administrators, and we'll gladly help get everything arranged.
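
A minimal sketch of preparing such a compressed file is shown below, using the TAR+GZIP option. The staging directory, file names, and archive name Foo-1.0.tar.gz are hypothetical; the point is only that the archive contains the project exactly as it should appear on-line.

```shell
# Hypothetical staging area standing in for a built project branch.
stage=$(mktemp -d)
mkdir -p "$stage/dist/docs"
printf '<html><body>Foo</body></html>\n' > "$stage/dist/index.html"
: > "$stage/dist/proteomecommons.org"    # meta-file (left empty in this sketch)
printf 'User guide\n' > "$stage/dist/docs/guide.txt"

# -C switches into dist first, so paths inside the archive are relative to it
# and the archive unpacks to the project contents rather than to a "dist" folder.
tar -czf "$stage/Foo-1.0.tar.gz" -C "$stage/dist" .

# List the archive to check what would go on-line.
tar -tzf "$stage/Foo-1.0.tar.gz"
```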

Putting a website in your project

Visitors to ProteomeCommons.org can choose to download a project as one ZIP archive or as individual parts. This system gives each file in your project its own unique URL, which anyone may use to download the file. It also means that you can put HTML files in your project and we'll host them as if they were your own website. We encourage you to include at least one file named index.html that acts as user documentation and an entry point to the on-line presence of your project.

This index.html file is not the only thing you can do, but it is definitely the best thing to get started with. You can place any valid HTML in this file as you please. Pick out a good tutorial on HTML and design your custom site. If you'd like to see some examples, look at any of the projects that are already in the archive. HTML is sent to your web browser as is, and you can see the full source-code by using the "view source" option of your web browser. Pick a project that you like and design your HTML page to be similar.

Above and beyond the index.html file you can also include other things to complement your website, such as more HTML files, images, JavaScript, cascading style sheets, and Java Web Start applications.

Subversion Access

Anyone may anonymously access and download the latest code from the subversion repositories. This provides the best method of getting the absolute latest code. Free subversion clients exist for almost all of the popular platforms; visit subversion.tigris.org to get a free version of subversion.

Getting the latest code

Stable release builds of a project should be downloaded directly from the ProteomeCommons.org archive. If you want to get the absolute latest version of the code, i.e. what the developers are working on, you'll need to get the appropriate branch from the subversion repository. In general, the "dev" branch is the development branch of a software project, while data sets have only one branch. For example, if you wanted to get the latest (possibly unstable) version of the ProteomeCommons.org-IO project, use the following command.

svn export svn://www.proteomecommons.org/svn/IO/dev

After executing the above command, a directory named "dev" will have appeared in your working directory. In the dev directory you will find all the project files and the instructions/build scripts required to make the project.

In general, you can get any branch of the code using the svn export command. Simply change the path to be the branch you require, e.g. for the 1.0 release of the ProteomeCommons.org-IO project use svn://www.proteomecommons.org/svn/IO/IO-1.0. Note that svn export produces a plain copy of the files without version-control metadata; use svn checkout instead if you plan to modify the code and create diffs. If you are not registered as a developer with write access to the repository, you will always be accessing code as an anonymous user with only read access. Understand that while development code is freely provided, it is not formally supported. Do not expect to get help from a project's developers when you are working with the development code.

Submitting Patches

If you have made modifications to a project's code and you would like to formally submit those changes to the project, send a diff of your modifications. The diff should be sent to the developers of the project along with an explanation of your changes. For example, assume you'd found some typos in a build script, say build.sh. Check out a development copy of the project (svn checkout), modify build.sh, and create a diff using the following command.

svn diff build.sh > my-changes.patch

The file my-changes.patch will contain annotations of what you've changed, and the developers of the project can use the patch tool to incorporate those changes into the project. Send the patch file to the developers of the project and include a brief message describing what you changed and why. After review by a developer with write access to the repository, the change will be rolled in.

Do not change a file and then send the entire file, nor should you change some files and send an entire project archive, especially if you are submitting changes via e-mail. This only wastes space and bogs down the developers with information they don't need to see. The diff tool provides a succinct description of what changes have been made to a file, and it is the preferred method of submitting patches.
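
The patch round trip can be tried locally with the sketch below. To stay runnable without a repository it uses plain diff -u as a stand-in for svn diff, which is an assumption of this example rather than the workflow described above; the directory names and file contents are hypothetical.

```shell
# Stand-in demo: plain diff/patch instead of svn diff, so no repository is needed.
work=$(mktemp -d)
mkdir "$work/orig" "$work/mine"
printf 'echo "Buildng Foo"\n'  > "$work/orig/build.sh"   # file with a typo
printf 'echo "Building Foo"\n' > "$work/mine/build.sh"   # contributor's fix

# Contributor side: record only the change. diff exits with status 1 when the
# files differ, which is expected here, so that status is accepted.
(cd "$work" && diff -u orig/build.sh mine/build.sh > my-changes.patch) || [ $? -eq 1 ]

# Developer side: apply the patch to the original file.
patch "$work/orig/build.sh" < "$work/my-changes.patch"

cat "$work/orig/build.sh"
```

With a patch produced by svn diff, the file paths in the patch are relative to the directory where svn diff was run, so developers typically apply it from the same place in their working copy with patch -p0 < my-changes.patch.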



Comments or Questions? Please contact the site's administrators.