The goal of this project was to develop software that predicts unobserved genotype data using correlations found in the HapMap data. The software takes as input a set of reference data (or, if none is supplied, the HapMap project data) and an incomplete set of SNPs. It outputs a complete predicted genotype based on the SNP correlations in the reference data. The program is written in C++ and released under the GNU General Public License.
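To make the input and output concrete, here is a minimal sketch of correlation-based imputation. It is not the project's actual code or algorithm. It assumes genotypes are coded as allele counts 0, 1, 2, that the reference panel has no missing values, and that each missing SNP is filled in from the single observed SNP most correlated with it in the reference data. The names `Genotype`, `kMissing`, `squaredCorrelation`, and `imputeGenotype` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: genotypes are allele counts 0, 1, or 2; kMissing marks an unobserved SNP.
constexpr int kMissing = -1;
using Genotype = std::vector<int>;

// Squared Pearson correlation between SNP columns a and b across the reference panel.
double squaredCorrelation(const std::vector<Genotype>& ref, std::size_t a, std::size_t b) {
    const double n = static_cast<double>(ref.size());
    double sa = 0, sb = 0, saa = 0, sbb = 0, sab = 0;
    for (const Genotype& g : ref) {
        sa += g[a];
        sb += g[b];
        saa += g[a] * g[a];
        sbb += g[b] * g[b];
        sab += g[a] * g[b];
    }
    const double cov = sab - sa * sb / n;
    const double va = saa - sa * sa / n;
    const double vb = sbb - sb * sb / n;
    if (va <= 0 || vb <= 0) return 0.0;  // a constant column carries no information
    return cov * cov / (va * vb);
}

// Fill each missing SNP in `target` with the most common reference genotype at that
// SNP among reference individuals who match the target at the best-correlated
// observed SNP. Falls back to the overall most common genotype if nobody matches.
Genotype imputeGenotype(const std::vector<Genotype>& ref, const Genotype& target) {
    assert(!ref.empty());
    for (const Genotype& g : ref) assert(g.size() == target.size());

    Genotype result = target;
    for (std::size_t j = 0; j < target.size(); ++j) {
        if (target[j] != kMissing) continue;

        // Pick the observed SNP with the highest squared correlation to SNP j.
        bool found = false;
        std::size_t best = 0;
        double bestR2 = -1.0;
        for (std::size_t k = 0; k < target.size(); ++k) {
            if (target[k] == kMissing) continue;
            const double r2 = squaredCorrelation(ref, j, k);
            if (r2 > bestR2) {
                bestR2 = r2;
                best = k;
                found = true;
            }
        }

        // Count reference genotypes at SNP j, overall and among matching individuals.
        std::array<int, 3> matched{};
        std::array<int, 3> overall{};
        for (const Genotype& g : ref) {
            ++overall[g[j]];
            if (found && g[best] == target[best]) ++matched[g[j]];
        }
        const bool haveMatch = matched[0] + matched[1] + matched[2] > 0;
        const std::array<int, 3>& counts = haveMatch ? matched : overall;

        int mode = 0;
        for (int v = 1; v < 3; ++v) {
            if (counts[v] > counts[mode]) mode = v;
        }
        result[j] = mode;
    }
    return result;
}
```

Calling `imputeGenotype(reference, target)` returns a copy of `target` with every `kMissing` entry replaced by a predicted value. A real imputation tool would typically combine evidence from several correlated SNPs rather than a single one.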

For the state of the project as of June 2008, read this paper: Project Overview and Results (as of June 2008).

February 22, 2012

The code for this project is now hosted on Bitbucket here:

June 5, 2008

Finished overview paper.

June 4, 2008

The final draft of the presentation on this project is available here.

Current version: 0.62. Get it here.

  • Reduced the error rate to 6%.

May 24 – June 1, 2008

The first useful version of the program has been released!

Current version: 0.601. Get it here.

April 20 – May 23, 2008

See the old wikidot project page for earlier updates: Imputation Project Page