Google Summer of Code

Introduction

From 2009 to 2012 the BioRuby project successfully participated in the Google Summer of Code: the first year under the wings of NESCent, and from 2010 to 2012 as part of the Open Bioinformatics Foundation (OBF). Sadly, in 2013 the OBF was not accepted as a mentoring organisation. The SciRuby organisation, however, will merge in our project ideas! Please go to the SciRuby page to become a GSoC student. BioRuby mentors will be there.

We are still looking for students who want to work on one of the projects below over the summer. Feel free to inquire on the mailing list, or directly with the mentors named below!

In earlier years a number of BioRuby projects were accepted under the OBF umbrella (see below). Two others were accepted under the NESCent mentoring organisation.

Please read the GSoC page at the Open Bioinformatics Foundation and the main Google Summer of Code page for more details about the program.

Mentor List

BioRuby panel members who currently volunteer to act as a mentor to a GSoC student.


Project Ideas

2014

An ultra-fast scalable RESTful API to query large numbers of genomic variations

Rationale
VCF files are the typical output of genome resequencing projects (http://www.1000genomes.org/node/101). They store the information on all the mutations and variations (SNPs and InDels) that are found by comparing the output of an NGS platform with a reference genome. These files are not incredibly large (a typical uncompressed VCF file is a few gigabytes), but they are full of information on millions of positions in the genome where mutations are found. Large resequencing projects can produce hundreds or thousands of these files, one for each sample sequenced.
Existing tools (such as VCFTools or BCFTools) offer a convenient way to access these files and extract or convert the information present, but are limited in functionality and speed when more complex queries need to be performed on these data. With existing tools it is very complicated, if not impossible, to retrieve information when working on many VCF files and samples together, for instance to compare the variations found in 100 samples and extract all the mutations that are present in 50 samples but not in the other 50.
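The kind of query described in the rationale (variants present in every sample of one group and absent from every sample of another) can be sketched with plain Ruby sets. This is only an illustration under stated assumptions: the sample names, the in-memory `CALLS` hash and the helper `variants_specific_to` are hypothetical, and a real implementation would load variants from VCF files or a database rather than from literals.

```ruby
require 'set'

# Hypothetical in-memory data: sample name => set of variant keys.
# A variant key is "chromosome:position:ref>alt".
CALLS = {
  's1' => Set['chr1:100:A>G', 'chr1:200:C>T', 'chr2:50:G>A'],
  's2' => Set['chr1:100:A>G', 'chr2:50:G>A'],
  's3' => Set['chr1:200:C>T', 'chr2:50:G>A'],
  's4' => Set['chr2:50:G>A']
}.freeze

# Variants found in every sample of `present_in` and in no sample of `absent_from`.
def variants_specific_to(calls, present_in, absent_from)
  shared = present_in.map { |s| calls.fetch(s) }.reduce(:&) || Set.new
  excluded = absent_from.map { |s| calls.fetch(s) }.reduce(Set.new, :|)
  shared - excluded
end

puts variants_specific_to(CALLS, %w[s1 s2], %w[s3 s4]).to_a.inspect
```

Holding one set per sample in memory will not scale to thousands of multi-gigabyte files, which is why the project proposes a database engine behind the API.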
Approach
The project should develop a RESTful API to address the issues described in the rationale and to allow users to manipulate and compare genomic variation information for hundreds of samples. A database engine will be required to store the information and to support the data mining. Unstructured database engines such as NoSQL databases or key-value stores can all be valid alternatives that combine high speed with data flexibility. The decision on the best database engine to be used will be discussed between the student and the mentors and within the OpenBio community. Given the large amount of information that will need to be processed by such an application, fast and scalable JVM-based languages such as Scala or JRuby will be a good choice. The project should also take care of the deployment of such an API, by creating a Ruby gem or a JAR that users can install and use right away with their datasets.
Difficulty and needed skills
The project is of average difficulty and is aimed at talented students who want to develop a fast API to address these problems.
The project requires
Knowledge of advanced programming languages. Some experience and knowledge of databases and data mining will help in managing the information in VCF files.
Mentors
Francesco Strozzi, Raoul J.P. Bonnal

Project ideas

2013

The BioRuby proposals for 2013 are listed here and linked from the Open Bioinformatics Foundation.


D3 based graphics package for Bioinformatics

Rationale
D3 is an incredible interactive data visualisation library written in JavaScript that runs in a browser. We want to port special visualisations for bioinformatics related to genome displays, phylogeny, QTL mapping, etc., as well as figures for statistics for the SciRuby library. Based on existing work in the Ruby bio-graphics and RubyD3 gems, R/qtl, and work done for genometools and JBrowse, we would like to create a graphics generator that allows for embedding interactive JavaScript hooks. The immediate task is to create zoomable interactive figures for a genetic map, pairwise recombination fractions, an image of genotype data, LOD curves, 2D scans and QTL effects.
Approach
Create a project that takes all the good ideas from other projects (such as Matplotlib, JBrowse, SequenceServer and Bio::Graphics), and build up a sustainable source base for current and future work. The generator should be written in Ruby and generate suitable D3 code in either JavaScript or CoffeeScript. Even though the generator is written in Ruby, the functionality should be easily accessible from other programming languages, perhaps by settling on an intermediate representation of data and code.
Difficulty and needed skills
Affinity for design and graphics, accessibility of information, and web programming in Ruby and JavaScript/CoffeeScript.
The project requires
Some Ruby experience, interest in web design and scientific graphics. Creating a useful package will be a real challenge.
Mentors
Raoul J.P. Bonnal, Rob Syme, Karl Broman, Rob Buels (confirm)

Semantic web/RDF support for Bioinformatics

Rationale
The bioinformatics community is doing a lot of work integrating different data repositories through RDF, for example Bio2RDF and SADI. A list of activities can be found here. BioRuby and biogems contain a wide range of parsers and formatters which could be extended to support reading and writing RDF. Having such functionality would make it easy for bioinformaticians to incorporate and expose RDF for flexible data queries.
Approach
We will review all existing parsers and formatters and decide which ones are most useful for RDF import/export. The student will tackle one transformer at a time, writing tests and adding a SPARQL endpoint for others to use. The student will also add SADI service discovery.
Difficulty and needed skills
Average difficulty
The project requires
The student will need to have an affinity with the semantic web and reach a decent level of Ruby programming, probably including meta-programming.
Mentors
Toshiaki Katayama, Mark Wilkinson (confirm), Jerven Bolleman

Machine Learning & Data Mining Algorithms for Ruby

Rationale
Machine learning and data mining algorithms are widely employed for the statistical analyses performed on biological datasets. Many Java libraries currently exist that implement the most commonly used algorithms in bioinformatics (such as clustering methods and simple classifiers), but the usability of these tools is restricted by the limited supply of APIs and user-friendly implementations for languages other than Java.
Approach
The goal of this project would be to implement a system to easily access this set of tools using JRuby and to develop a basic framework that integrates the different sources. The Java libraries that would primarily be used are Weka (http://www.cs.waikato.ac.nz/ml/weka/) and RapidMiner (http://rapid-i.com/content/view/181/190/). This approach could subsequently be extended to develop a visualization scheme based on D3.
Difficulty and needed skills
Medium/Hard depending on the topic selected and the scope of the project
Basic statistical knowledge is required as well as programming in Ruby, JRuby and Java
The project requires
Basic statistical knowledge, Ruby, JRuby, Java
Mentors
Raoul J.P. Bonnal, Francesco Strozzi

Create a dynamic and social web portal for Bioinformatics packages

Rationale
The http://biogems.info/ website is an aggregator of Ruby gems and Debian/Biolinux package information. We are looking at building up similar aggregators for bioinformatics packages from other resources, including BioPerl, Biopython, R and BioJava, which may get their own base domain names. Also we wish to create dynamic news feeds based on github commit updates, software releases, testing information etc., so as to create a one-stop resource for bioinformatics software. Also we want to push information to Twitter and Facebook.
Approach
We want to build on the current biogems.info functionality, with Ruby as a site generator and HAML/SASS template handler. In the browser we want to use CoffeeScript to add interactive features to the site, as well as to fetch live commit information from GitHub, for example.
Difficulty and needed skills
Affinity for web design, accessibility of information, web programming in Ruby. Javascript/Coffeescript.
The project requires
Some Ruby experience, interest in web design and social networking
Mentors
Members of the BioRuby panel


A parallelized framework for processing large numbers of VCF files using Scala Actors and JRuby

Rationale
VCF files are the typical output of whole genome resequencing projects (http://www.1000genomes.org/node/101). They hold the information on all the mutations and variations (SNPs and InDels) that are found by comparing the output of an NGS platform with a reference genome. These files are not incredibly large (a typical uncompressed VCF file is a few gigabytes), but they are full of information on millions of positions in the genome where mutations are found. Large resequencing projects can produce hundreds or thousands of these files, one for each sample sequenced.
Existing tools (such as VCFTools or BCFTools) let you manipulate, convert and access the information stored in VCF files, but are limited in functionality and speed when there is a need to work with many files together, for example to compare the variations found in 100 samples and identify common mutation sites among sub-groups of samples, or to extract all the mutations that are present in 50 samples but not in the other 50.
Approach
The project will develop a framework (a single utility or a set of utilities) to address the issues described in the rationale and to allow users to manipulate and compare hundreds of VCF files. Given the large amount of information that will need to be processed, the JVM and the Scala language will be the preferred choice, using the Akka actors library to develop a high-performance, highly parallelized framework to process VCF files. A database would be required to support the information processing and mining; traditional RDBMS, NoSQL or semantic databases are all valid choices. The decision on the best database engine to be used will be discussed between the student and the mentors and within the Bio projects community.
The JRuby language could then be used to create a nice interface around the framework to run the different tasks and to easily distribute it as a BioRuby gem (http://www.biogems.info).
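As a rough illustration of the parallel processing idea, the sketch below uses a pool of plain Ruby threads and a shared queue instead of Akka actors, so it is a stand-in for the proposed design rather than an implementation of it. The miniature VCF bodies and the helper names `variant_keys` and `count_carriers` are hypothetical. On JRuby the worker threads can run on several cores, whereas on MRI the global interpreter lock limits CPU parallelism.

```ruby
# Hypothetical miniature VCF bodies, one per sample (real files would be read from disk).
VCF_TEXT = {
  'sampleA' => "##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tG\nchr1\t200\t.\tC\tT\n",
  'sampleB' => "##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tG\n",
  'sampleC' => "##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tG\nchr2\t50\t.\tG\tA\n"
}.freeze

# Extract "chrom:pos:ref>alt" keys from VCF text, skipping header lines.
def variant_keys(text)
  text.each_line.reject { |l| l.start_with?('#') }.map do |line|
    chrom, pos, _id, ref, alt = line.chomp.split("\t")
    "#{chrom}:#{pos}:#{ref}>#{alt}"
  end.uniq
end

# Count, for each variant, how many samples carry it, using a pool of worker threads.
def count_carriers(vcfs, workers: 2)
  jobs = Queue.new
  vcfs.each_value { |text| jobs << text }
  jobs.close                      # workers stop once the queue is drained

  counts = Hash.new(0)
  lock = Mutex.new
  Array.new(workers) do
    Thread.new do
      while (text = jobs.pop)     # pop returns nil when the closed queue is empty
        keys = variant_keys(text)
        lock.synchronize { keys.each { |k| counts[k] += 1 } }
      end
    end
  end.each(&:join)
  counts
end

counts = count_carriers(VCF_TEXT)
# Variants carried by every sample
puts counts.select { |_, n| n == VCF_TEXT.size }.keys.inspect
```

Parsing happens outside the lock and only the merge into the shared hash is serialized, which mirrors the actor idea of independent workers sending results to a single aggregator.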
Difficulty and needed skills
The project is mid / high difficulty, aimed at talented students. Previous knowledge of Scala or Ruby is not necessary but a background in advanced programming languages (like C++, Java) is essential to develop the project.
The project requires
Knowledge of advanced programming languages. Some experience and knowledge of databases and data mining will help in managing the information in VCF files.
Mentors
Francesco Strozzi (author of bioruby-grid, bioruby-ngs etc.), Raoul J.P. Bonnal (bioruby-samtools, bioruby-ngs, biogems etc.)


Add more project ideas here

Use the template of the other project ideas. Make sure this is finalised before student submissions start.



Accepted Projects

2013

See above: pending approval by Google Summer of Code (in April 2013)

2012

GSoC:HPC-GFF3 Write the world's fastest parallelized GFF3/GTF parser in D, for Ruby FFI

Rationale
GFF3/GTF parsers are used by genome browsers and next-gen sequencing tools. Current parsers are slow and use a lot of memory. A fast low-memory parser would be beneficial to many bio-medical projects.
Approach
Based on the existing implementation we can design a fast parser using the D programming language. D provides capabilities for hand-crafting high-performance parsers. If required, parallel processing of records can be introduced by using Actors. D can compile libraries which can be bound to Ruby using a C-style interface. This means the GFF3/GTF parser can be used from Ruby. The design will focus on iterating records and feeding them back to the Ruby environment. The library will also be useful for Python, Perl and the JVM.
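To make the record-iteration interface concrete, here is a minimal pure-Ruby sketch of what the Ruby side could look like. It is not the D parser and makes no claim about the project's actual API: `Gff3Record` and `each_gff3_record` are hypothetical names, and the example only relies on the nine tab-separated columns defined by the GFF3 format. Attribute values are not URL-decoded in this sketch.

```ruby
require 'stringio'

# One GFF3 feature line; GFF3 defines nine tab-separated columns.
Gff3Record = Struct.new(:seqid, :source, :type, :start, :stop,
                        :score, :strand, :phase, :attributes)

# Yield one record at a time from any IO-like object, never holding the whole file.
def each_gff3_record(io)
  return enum_for(:each_gff3_record, io) unless block_given?

  io.each_line do |line|
    line = line.chomp
    break if line == '##FASTA'                 # sequence section follows; stop here
    next if line.empty? || line.start_with?('#')

    cols = line.split("\t", -1)
    next unless cols.size == 9                 # skip malformed lines in this sketch

    attrs = cols[8].split(';').map { |kv| kv.split('=', 2) }
                   .select { |kv| kv.size == 2 }.to_h
    yield Gff3Record.new(cols[0], cols[1], cols[2], Integer(cols[3]), Integer(cols[4]),
                         cols[5], cols[6], cols[7], attrs)
  end
end

gff = StringIO.new(<<~GFF)
  ##gff-version 3
  chr1\t.\tgene\t1000\t9000\t.\t+\t.\tID=gene00001;Name=EDEN
  chr1\t.\tmRNA\t1050\t9000\t.\t+\t.\tID=mRNA00001;Parent=gene00001
GFF

each_gff3_record(gff) { |r| puts "#{r.type} #{r.attributes['ID']} #{r.start}..#{r.stop}" }
```

A native library bound through FFI could expose the same block-based iteration, so calling code would not need to change when the slow Ruby loop is replaced.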
Difficulty and needed skills
This is a challenging project. Advanced programming concepts, concurrency, foreign language bindings.
The project requires
An interest in high performance computing. Some affinity with coding in C and one or more interpreted languages
Mentors
Pjotr Prins (author of bio-gff3), Raoul Bonnal
Other interested parties
Naohisa Goto (author of BioRuby's GFF3 parser), Brad Chapman (author of Biopython's GFF3 parser), Peter Cock (Biopython) and Chris Fields (BioPerl).

GSoC:MAF Extend bio-alignment plug-in with Multiple Alignment Format -MAF- parser

Rationale
The multiple alignment format stores a series of multiple alignments in a format that is easy to parse and relatively easy to read. This format stores multiple alignments at the DNA level between entire genomes. Previously used formats are suitable for multiple alignments of single proteins or regions of DNA without rearrangements, but would require considerable extension to cope with genomic issues such as forward and reverse strand directions, multiple pieces to the alignment, and so forth.
Approach
Create a native Ruby parser similar to the Biopython API (http://biopython.org/wiki/Multiple_Alignment_Format), which has an interesting indexing system. Another approach consists of using FFI to bind native C libraries like http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/ and http://compgen.bscb.cornell.edu/phast/index.php. A lazy parser is preferred over an eager one.
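The preference for a lazy parser can be illustrated with Ruby's `Enumerator::Lazy`: blocks are parsed only when the caller asks for them, so taking the first block of a genome-scale file does not read the rest. The sketch below is an assumption-laden example, not the bio-alignment plug-in's API: `MafSeq`, `MafBlock` and `maf_blocks` are hypothetical names, the alignment data is made up, and only the `a` (block start) and `s` (sequence) line types are handled.

```ruby
require 'stringio'

MafSeq = Struct.new(:src, :start, :size, :strand, :src_size, :text)
MafBlock = Struct.new(:score, :sequences)

# Return a lazy enumerator of alignment blocks; a block is parsed only when requested.
def maf_blocks(io)
  Enumerator.new do |out|
    block = nil
    io.each_line do |line|
      fields = line.split
      case fields.first
      when 'a'                                  # start of a new alignment block
        out << block if block
        pairs = fields.drop(1).map { |kv| kv.split('=', 2) }.select { |kv| kv.size == 2 }
        score = pairs.to_h['score']
        block = MafBlock.new(score && score.to_f, [])
      when 's'                                  # sequence line within the current block
        next unless block

        _, src, start, size, strand, src_size, text = fields
        block.sequences << MafSeq.new(src, start.to_i, size.to_i, strand, src_size.to_i, text)
      end
    end
    out << block if block
  end.lazy
end

# Made-up example data in MAF layout.
maf = StringIO.new(<<~MAF)
  ##maf version=1
  a score=10.0
  s speciesA.chr1 100 6 + 1000 ACGTAC
  s speciesB.chr2 250 6 - 2000 ACG-AC

  a score=2.5
  s speciesA.chr1 300 4 + 1000 GGCC
MAF

first = maf_blocks(maf).first
puts "#{first.score} #{first.sequences.map(&:src).join(',')}"
```

Because the enumerator is lazy, chained calls such as `select` or `map` are also evaluated block by block until `first`, `to_a` or similar forces them.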
Difficulty and needed skills
Medium, foreign language bindings
The project requires
Knowledge of Ruby and, if the FFI approach is chosen, of C bindings
Mentors
Pjotr Prins, Raoul Bonnal

GSoC:Bam Robust and fast parallel BAM parser in D for binding against dynamic languages

Difficulty and needed skills
This is a challenging project. Advanced programming concepts, concurrency, foreign language bindings.
The project requires
An interest in high performance computing. Some affinity with coding in C and one or more interpreted languages
Mentors
Pjotr Prins, Francesco Strozzi, Raoul Bonnal

2011

GSoC:Bio-images Represent bio-objects and related information with images

2010

Ruby 1.9.2 support of BioRuby (see below)

Implementation of algorithm to infer gene duplications in BioRuby (see below)

2009

Implementing phyloXML support in BioRuby (see below)


Past Project Proposals

2012

Update bio-images, a plugin to represent bio-objects with cool images

Rationale
After GSoC 2011 the main library used to plot the charts was discontinued and a new one was introduced. We want to update this Biogem (a BioRuby plugin) accordingly. Most of the time, after a bioinformatics analysis, the resulting data needs to be re-processed into a graphical form since we, as human beings, are more comfortable accessing results and data visually than browsing a huge table with interconnected information. Very often it is also difficult to extrapolate the real biological meaning from raw datasets. The main idea of this proposal is to define and attach graphical functions to BioRuby objects and consequently to the results computed from a generic process or pipeline. With this solution, it would be possible to explore them more naturally but also to export and integrate the information into a web environment, for sharing the knowledge and the results. For example, different objects storing alignment results could share the same interface and display their data in a common way. The same is also true for other kinds of objects or computational procedures.
Approach
Study the new library http://mbostock.github.com/d3/ and update the code already developed, then improve the number of charts and objects supported.
Difficulty and needed skills
Medium/Hard. The student will need to define a graphical API and integrate the new code with the existing BioRuby modules. High-level coding skills will be required to create a clean API with clear documentation.
The project requires
  • Very good knowledge of Ruby (1.9)
  • Basic concepts of graphics/visualization
  • Ruby on Rails basic knowledge
Mentors
Raoul J.P. Bonnal, ...

Proposal 2012

Testing framework for biogem plugins

Rationale
Biogems are Ruby gems that are created by independent authors, see the publication. Designing an automated testing framework for different versions of Ruby is critical for the successful use of Ruby in bioinformatics. Gems should be tested on release, but also when tagged on GitHub.
Approach
We integrate facilities of http://biogems.info/, http://rubygems.org/ and http://github.com/ to automatically test gems that get released in the public domain. We can make use of open-bio's testing framework to test individual gems, or crowd-sourcing setups, such as http://test.rubygems.org/, to test gems in different setups. Next we program the http://biogems.info/ website to show the test results in an easy-to-read way.
Difficulty and needed skills
Ruby scripting, and affinity for web integration, some web programming including Ruby, HTML (HAML), CSS and Javascript (Coffee)
Mentors
Pjotr Prins, Raoul Bonnal, Peter Cock (confirm)

Adding social networking functionality to BioRuby.org

Rationale
BioRuby.org is a portal for getting appropriate information on Ruby-related software development to bioinformatics software developers. The current portal discourages both experienced and inexperienced software developers from digging deeper, and finding solutions to typical bioinformatics problems. We are looking at ways to motivate new developers, students and teachers to dive into (Bio)Ruby. This implies building out the community with functionality from twitter feeds all the way to biogem github updates.
Approach
We want to restyle the portal so that it becomes an interactive environment, encourages people to participate and contribute information, and makes information easier to find. The restyling is about web design, and programming an interactive website in Ruby, using Ruby on Rails and other tools, such as markdown, haml, sass, staticmatic, etc. The idea is also to use existing web services, such as GitHub gists and rubydoc.info, and embed them into the site rather than recreating all these services from scratch. We would like to create a collection of code snippets and documentation that is easy to navigate and add to. It should be even less effort than maintaining a Wiki. Code snippets should also be runnable online, proving they are correct. The total design should also be useful for other Bio* projects, such as BioPerl. We are currently defining the features we want from such a web presence, see [features]. It is even possible to get a scientific paper out of this work.
Difficulty and needed skills
Affinity for web design, accessibility of information, web programming in Ruby. Javascript/Coffeescript.
The project requires
Some Ruby experience, interest in web design and social networking
Mentors
The BioRuby panel: Raoul Bonnal, Pjotr Prins, Francesco Strozzi, Naohisa Goto, Toshiaki Katayama (confirm)


Update to the Ruby Ensembl API

Rationale
The Ruby Ensembl API was published in 2011 (http://bioinformatics.oxfordjournals.org/content/early/2011/01/28/bioinformatics.btr050) and allows users to programmatically access the Ensembl Database with Ruby. The API was modeled on the Ensembl Perl API to give users the same methods they already know and are familiar with, but it also provides a slightly different approach with easier and more powerful methods to access the different Ensembl databases and retrieve biological data. From the developer's side, the project is based mainly on ActiveRecord classes that map tables in the MySQL databases, plus other libraries that define higher-order methods to combine data from different tables and provide connections to the Ensembl databases. So far the Ruby API covers the Ensembl Core, Ensembl Variation and Ensembl Genomes databases, which are updated every 2-3 months by the EBI teams. The API uses Ruby metaprogramming to adapt to a database schema, and the code needs to be updated only if significant changes occur in the databases. We want to push the project further and define a library that can adapt itself to every new Ensembl release, whether minor or major changes occur in the schemas.
Approach
As part of the Ruby Ensembl API, a utility could be created that is run at every change in the Ensembl database schemas, to generate the necessary ActiveRecord classes and define relations among classes using the fields and foreign keys present in the MySQL tables. A testing suite could also be generated according to the new classes and methods defined.
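The generation step could be sketched as follows. This is a hedged illustration, not the bioruby-ensembl code: the `SCHEMA` hash is a simplified, hypothetical excerpt standing in for what would be read from the MySQL information schema, `class_name_for` and `generate_models` are made-up helpers, and the foreign-key rule (a column named `<table>_id` is the primary key in its own table and a foreign key elsewhere) is an assumed convention. The sketch only emits source text, so it runs without ActiveRecord installed.

```ruby
# Hypothetical, simplified schema excerpt: table name => column names.
SCHEMA = {
  'seq_region' => %w[seq_region_id name length],
  'gene'       => %w[gene_id seq_region_id biotype],
  'transcript' => %w[transcript_id gene_id seq_region_id]
}.freeze

# "seq_region" => "SeqRegion"
def class_name_for(table)
  table.split('_').map(&:capitalize).join
end

# Build ActiveRecord class source for every table in the schema.
def generate_models(schema)
  schema.map do |table, columns|
    # Tables this one points to through "<other_table>_id" columns.
    parents = columns.select { |c| c.end_with?('_id') }
                     .map { |c| c.sub(/_id\z/, '') }
                     .select { |t| t != table && schema.key?(t) }
    # Tables that carry a "<table>_id" column pointing back here.
    children = schema.select { |t, cols| t != table && cols.include?("#{table}_id") }.keys

    lines = ["class #{class_name_for(table)} < ActiveRecord::Base"]
    lines << "  self.table_name = '#{table}'"
    lines << "  self.primary_key = '#{table}_id'"
    parents.each  { |t| lines << "  belongs_to :#{t}" }
    children.each { |t| lines << "  has_many :#{t}s" }   # naive pluralisation
    lines << 'end'
    lines.join("\n")
  end.join("\n\n")
end

puts generate_models(SCHEMA)
```

The same schema walk could emit a skeleton test per generated class, which is one way to read the suggestion that the testing suite be generated alongside the models.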
Difficulty and needed skills
Medium/Hard. The student needs to know Ruby quite well and will make heavy use of advanced Ruby metaprogramming. He/she will also need to understand how the Ensembl databases work and how the biological data are organized.
The project requires
  • Very good knowledge of Ruby (1.9)
  • Good knowledge of database schemas, MySQL in particular.
  • Good knowledge of Ruby metaprogramming.
  • Basic knowledge of Ensembl website and/or Ensembl databases, even if not mandatory, will help.
Mentors
Francesco Strozzi
Current BioGem available at
https://github.com/fstrozzi/bioruby-ensembl

2011

Represent bio-objects and related information with images (ACCEPTED)

Rationale
Most of the time, after a bioinformatics analysis, the resulting data needs to be re-processed into a graphical form since we, as human beings, are more comfortable accessing results and data visually than browsing a huge table with interconnected information. Very often it is also difficult to extrapolate the real biological meaning from raw datasets. The main idea of this proposal is to define and attach graphical functions to BioRuby objects and consequently to the results computed from a generic process or pipeline. With this solution, it would be possible to explore them more naturally but also to export and integrate the information into a web environment, for sharing the knowledge and the results. For example, different objects storing alignment results could share the same interface and display their data in a common way. The same is also true for other kinds of objects or computational procedures.
Approach
The student and the mentor will define together a minimum set of features that need to be shared by the BioRuby objects and that could be visualized. Then the student will create a library/module to implement these graphical features within the BioRuby project. He/she will gain experience with Rubyvis as the graphical API and with Ruby on Rails for web visualization.
Difficulty and needed skills
Medium/Hard. The student will need to define a graphical API and integrate the new code with the existing BioRuby modules. High-level coding skills will be required to create a clean API with clear documentation.


The project requires
  • Very good knowledge of Ruby (1.9) and design patterns
  • Basic concepts of graphics/visualization
  • Ruby on Rails basic knowledge
Mentors
Raoul J.P. Bonnal, Christian Zmasek, Claudio Bustos (confirm)

Support Next Generation Sequencing (NGS) in BioRuby (proposed)

Rationale
The processing and analysis of NGS data is challenging for a variety of reasons, in particular because the data sets are usually very large and contain a vast amount of information and a large amount of unknown data. Furthermore, there are many different approaches to performing NGS analyses, and several software tools need to be integrated to produce reliable results. Since this topic is so important for the BioRuby community we started a sub-project, bioruby-ngs, for analyzing NGS data. The project is in an early stage of development but notable results have been quickly gained. Many topics still need to be addressed, in particular:
  • data and results reporting
  • workflow management
  • DSL for describing experimental designs
  • YALIMS (Yet Another LIMS), a simple web-based LIMS for raw dataset processing, with reporting and monitoring
Approach
Due to the open nature of the project the student will choose which feature he/she wants to develop and focus on. The student will learn the basic concepts of NGS data analysis and will work closely with a mentor to produce a working library that will be integrated into the BioRuby NGS project.
Difficulty and needed skills
Medium to Hard depending on the topic selected.


The project requires
  • Ruby
  • Bash programming and knowledge of the Linux environment
  • Ruby on Rails 3.x
Mentors
Raoul J.P. Bonnal, Francesco Strozzi
Project overview and updates
[1]
Source code
https://github.com/helios/bioruby-ngs

BioRuby Wrapper for Command line application (proposed)

Rationale
The main reason for this project is the need to support different stand-alone applications critical for Next Generation Sequencing analyses. Direct binding to existing C/C++ source code or rewriting all the applications is impractical and a waste of resources. A quick solution is to use stand-alone applications directly, integrating them into the BioRuby API. Some work has already been done in the BioRuby NGS project [with this wrapper], but better support for I/O-intensive processes is required. Following this design pattern, it will also be possible to improve the support for other bioinformatics suites, such as EMBOSS, whose support in BioRuby is outdated at the time of this proposal.
Approach
The student will become familiar with advanced meta-programming concepts in Ruby and will contribute to the definition of a DSL for this wrapping library. He/she will also build a parser to automatically define additional wrappers for the EMBOSS suite, starting from the ACD configuration files.
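A wrapping DSL of this kind might look like the sketch below. It is not the existing bioruby-ngs wrapper: `CommandWrapper`, `program`, `option` and the example `Aligner` class with its flags are all hypothetical names chosen for illustration. The class-level `option` macro uses metaprogramming to generate accessors and to record how each option maps to a command-line flag.

```ruby
# Minimal sketch of a command-line wrapper DSL (hypothetical API).
class CommandWrapper
  class << self
    # Set or read the executable name for this wrapper class.
    def program(name = nil)
      @program = name if name
      @program
    end

    def options
      @options ||= {}
    end

    # Declare a command-line option and generate an accessor for it.
    def option(name, flag:, type: :value)
      options[name] = { flag: flag, type: type }
      attr_accessor name
    end
  end

  def initialize(settings = {})
    settings.each { |name, value| public_send("#{name}=", value) }
  end

  # Build the argument vector; an array can be passed to Kernel#system
  # without going through the shell, which avoids quoting problems.
  def to_argv(*inputs)
    args = self.class.options.flat_map do |name, spec|
      value = public_send(name)
      next [] if value.nil? || value == false

      spec[:type] == :switch ? [spec[:flag]] : [spec[:flag], value.to_s]
    end
    [self.class.program, *args, *inputs]
  end
end

# Example wrapper; the program name and flags are illustrative only.
class Aligner < CommandWrapper
  program 'my_aligner'
  option :threads, flag: '-t'
  option :verbose, flag: '-v', type: :switch
end

p Aligner.new(threads: 4, verbose: true).to_argv('reads.fastq')
```

A parser for EMBOSS ACD files could call `option` once per declared parameter, so wrappers for a whole suite would be generated rather than written by hand.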
Difficulty and needed skills
Medium. Good Ruby knowledge and experience with meta-programming are required to achieve the goals.
The project requires
  • Ruby 1.9
  • Ruby Metaprogramming
Mentors
Raoul J.P. Bonnal, Francesco Strozzi
Source code
https://github.com/helios/bioruby-ngs, wrapper branch

Modular annotation knowledge base for BioRuby (proposed)

Rationale
Handling data sets coming from platforms for gene expression analysis or real-time PCR requires accessing the corresponding gene annotations several times during the measurements. This kind of information is normally stored in remote databases that provide the required knowledge and data. Problems arise when the available databases do not support a specific version of the data of interest or when huge queries need to be submitted. A BioRuby knowledge base, designed to be modular and expandable through time, could solve these problems. A good compromise between performance and portability could be achieved using embedded databases and accessing the data through a clean API.
Approach
The student and the mentor will decide which platforms should be supported, based on their popularity. Then the student will retrieve the essential annotation and will design a simple database schema to support all the relevant non-redundant information. The schema will be flexible enough to allow interconnecting the dataset with external databases or resources for subsequent analyses. After this phase of discovery and design, the student will build the database using SQLite and will write a Ruby library to access the data using the ActiveRecord ORM.
Difficulty and needed skills
Medium. The student will need to define the core data to be included in the database and how this information will be organized and accessed by the end-user. The Ruby library will be created using the powerful ActiveRecord paradigms, but good coding skills will be required to design an efficient API with clear documentation.
The project requires
  • Minimal SQL dialect
  • Good knowledge of Ruby
  • Experience in querying biological databases
  • Experience with annotation data
Mentors
Raoul J.P. Bonnal, Francesco Strozzi

2010

Ruby 1.9.2 support of BioRuby (accepted)

Rationale
A new stable Ruby version, 1.9.2, is now under development and will soon be released. It has many improvements and some incompatible changes. The goal of the project is to run almost all functions of BioRuby correctly in both Ruby 1.8.x and Ruby 1.9.2.
Approach
First, implement unit tests to guarantee that behavior does not change. Next, modify the existing code.
Difficulty and needed skills
Medium.
The project requires
  • Ruby programming skill
  • bioinformatics skill, or motivation to learn bioinformatics

In addition, the student should also have interest in the differences between Ruby 1.8 and 1.9.

Mentors
Naohisa Goto
Student
Kazuhiro Hayashi
Project overview and updates
project blog for the Ruby 1.9.2 support of BioRuby
Source code
http://github.com/GSoC2010KH/bioruby

Implementation of algorithm to infer gene duplications in BioRuby (accepted)

Rationale
Gene duplications are an important concept in biomedical research. They are of particular importance in the study of molecular evolution, since they are believed to be a major driver in the evolution of new protein functions. Furthermore, the inference of gene duplications is almost always necessary for accurate sequence function prediction, as sequences related by gene duplications (paralogs) are more likely to exhibit differences on a functional level than sequences related by speciations (orthologs). Gene duplications can be inferred by calculating an evolutionary tree of the molecular sequences being analyzed, and then comparing this gene tree with a species tree (the 'tree of life'). For this purpose, we developed a simple and fast algorithm, named SDI, for speciation duplication inference (reference: Zmasek and Eddy, 2001, "A simple algorithm to infer gene duplication and speciation events on a gene tree", Bioinformatics, 17, 821-828). Implementing this algorithm in the increasingly popular Ruby programming language, as part of the BioRuby open source bioinformatics project, would give a large number of biologists and software developers immediate access to a useful tool, particularly in light of the ever-increasing number of sequenced genomes and the associated increase in comparative functional genomics studies.
Approach
Development of unit tests followed by the implementation of the algorithm and necessary data structures. Since BioRuby supports phyloXML, the basic infrastructures needed to implement the SDI algorithm are already present (such as data structures to store species and gene information, input and output of phylogenetic trees), making the implementation relatively straightforward. Dependent on student interest and aptitude for computer science and algorithm development, this project might also entail extending the algorithm itself. Currently, it is only defined for binary trees. A very useful extension would be to allow non-binary species trees, and, possibly, non-binary gene trees. Relevant references on the theories behind this proposal can be found here.
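The comparison of a gene tree with a species tree can be sketched in a few lines of Ruby. This is a simplified illustration of the general mapping idea (each gene-tree node is mapped to the lowest common ancestor of its children's species, and a node mapping to the same species node as one of its children is called a duplication), not the published SDI implementation, which uses a more efficient traversal. The species tree, the nested-array gene tree and the helper names `ancestors`, `lca` and `infer_events` are assumptions made for this example, and only binary trees are handled.

```ruby
# Species tree given as child => parent (the root maps to nil). Hypothetical example data.
SPECIES_PARENT = {
  'human' => 'primates', 'chimp' => 'primates',
  'primates' => 'mammals', 'mouse' => 'mammals',
  'mammals' => nil
}.freeze

# Path from a species-tree node up to the root, inclusive.
def ancestors(node, parent)
  path = []
  while node
    path << node
    node = parent[node]
  end
  path
end

# Lowest common ancestor of two species-tree nodes.
def lca(a, b, parent)
  seen = ancestors(a, parent)
  ancestors(b, parent).find { |n| seen.include?(n) }
end

# A gene tree is a nested array: a leaf is a species name (String) and an
# internal node is [left, right]. Returns [mapped species node, events], where
# events lists :duplication or :speciation for internal nodes in postorder.
def infer_events(gene_node, parent, events = [])
  return [gene_node, events] if gene_node.is_a?(String)

  left, right = gene_node
  m_left, = infer_events(left, parent, events)
  m_right, = infer_events(right, parent, events)
  mapped = lca(m_left, m_right, parent)
  events << (mapped == m_left || mapped == m_right ? :duplication : :speciation)
  [mapped, events]
end

gene_tree = [[%w[human chimp], 'mouse'], %w[human mouse]]
_, events = infer_events(gene_tree, SPECIES_PARENT)
p events
```

In this example both subtrees of the root map to the `mammals` node, so the root is reported as a duplication while the three inner nodes are speciations.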
Difficulty and needed skills
Medium. The project requires Ruby programming skills and some experience with algorithms. Knowledge about evolutionary biology is advantageous but not mandatory. The (optional) extension of the algorithm for non-binary trees requires a solid background in computer science.
Mentors
Christian Zmasek, Diana Jaunzeikare
Student
Sara Rayburn
Project overview, timeline, and updates
Implementing SDI Project Updates
Source code
http://github.com/srayburn/bioruby/

Related Projects

As part of NESCent's Phyloinformatics GSoC

This is an application for the Evolution and the semantic web in Ruby: NeXML I/O for BioRuby project idea.

Abstract: Add NeXML parsing and serializing support, and an RDF triples API to BioRuby.

Student: Anurag Priyam

Mentor(s): Rutger Vos (primary), Jan Aerts

Project Homepage: Develop an API for NeXML and RDF triples for BioRuby

Project blog: My Weblog( phylosoc label )

Source Code: Github

Past Mentor List

BioRuby developers who volunteered to act as a mentor to a GSoC student.

2009

Implementing phyloXML support in BioRuby (accepted)

As part of NESCent's Phyloinformatics GSoC