
Preparing Data

Because CPG Map uses a three-tier component architecture, it can get its data from a variety of sources, as long as the data is assembled into the correct format. A class has been written that acts as a default data packer for programmers who do not want to write their own data access classes and just want to parse some CPG Maps out of a few flat files. This class is in the CPGMap package and is called FileParser. A file to be parsed by this class should have the following format:

//DOUBLE    TRUE

//UNITS    units1    units2

 

//***MAP    map_title

//LOCUS    locus_name    position_1     error_1    position_2    error_2

//HOMOGROUP    homology_group_name

//OWNS    who_it_owns

//LOCUS    ...

//HOMOGROUP    ...

//OWNS    ...

//LOCUS    ...

//HOMOGROUP    ...

//OWNS    ...

//***END

 

//***MAP    ...

//LOCUS    ...

//HOMOGROUP    ...

//OWNS    ...

//***END

 

Explanation:

If you are creating a set of double maps, the first line of the file should be //DOUBLE tab TRUE. If the value is anything other than TRUE, the program assumes it is making single maps, so for single maps put something like FALSE here.
The next line tells the program what units to use on the left- and right-hand sides of the CPG Maps. The units are separated by a tab, so the line might look like //UNITS tab MB tab cM. The units can be any string. Note that because the units are defined here for all the maps, all the physical (or linkage) maps must be on the same side of every double map. If you want to make single maps, give only the first units and do not add a tab after them.
Next comes the data for the individual maps. Every map, single or double, starts with //***MAP (followed by a tab and the map title) and ends with //***END. Between those tags should be all the loci that are to be placed in that map. There are three components to each locus:
//LOCUS

This label is followed by a tab, the name of the locus, a tab, the position on the left-hand side of the double map (position_1), a tab, the error of that position on either side of position_1, a tab, the position on the right-hand side of the double map (position_2), a tab, and the error of that position on either side of position_2. If you want to create single maps, leave the last two fields blank.

A double map locus might look like:

//LOCUS    xp109    1354.350     9.893    89.7456    1.207

and a single map locus might look like:

//LOCUS    xp109    1354.350     9.893
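To make the field layout concrete, here is a minimal Python sketch that splits a //LOCUS line into its parts. It is not the FileParser class from the CPGMap package; the names Locus and parse_locus_line are hypothetical, and the sketch assumes that fields are separated by real tab characters and that blank trailing fields on a single-map line can simply be ignored.

```python
from typing import NamedTuple, Optional


class Locus(NamedTuple):
    """One //LOCUS entry (hypothetical container, not part of CPG Map)."""
    name: str
    position_1: float
    error_1: float
    position_2: Optional[float] = None  # None for a single map
    error_2: Optional[float] = None     # None for a single map


def parse_locus_line(line: str) -> Locus:
    """Parse one tab-delimited //LOCUS line into a Locus."""
    fields = [f.strip() for f in line.rstrip("\r\n").split("\t")]
    # Assumption: empty fields carry no information on a //LOCUS line,
    # so blank trailing fields on a single-map line are dropped.
    fields = [f for f in fields if f]
    if not fields or fields[0] != "//LOCUS":
        raise ValueError(f"not a //LOCUS line: {line!r}")
    values = fields[1:]
    if len(values) == 3:  # single map: name, position, error
        name, pos1, err1 = values
        return Locus(name, float(pos1), float(err1))
    if len(values) == 5:  # double map: name plus two position/error pairs
        name, pos1, err1, pos2, err2 = values
        return Locus(name, float(pos1), float(err1), float(pos2), float(err2))
    raise ValueError(f"expected 3 or 5 values after //LOCUS, got {len(values)}")
```

For the double map example above, parse_locus_line would return a Locus whose position_2 and error_2 are set; for the single map example those two attributes are None.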

//HOMOGROUP

This label is followed by a tab and then the string naming the homology group to which the locus belongs. For example:

//HOMOGROUP    HomologyGroup1

If the locus doesn't belong to a homology group, leave the field after the tab blank.

//OWNS

This label is followed by a tab and then a tab-delimited list of integers identifying the loci in the same map that this locus owns. For example:

//OWNS    0    4    21    2     122

The numbers do not have to be in numerical order. Remember that the first locus in a map is number 0, not 1.
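Because the //OWNS numbers are zero-based positions within the same map, an off-by-one mistake is easy to make. The Python sketch below shows one way to read an //OWNS line and check each number against the number of loci in the map. The function name parse_owns_line is hypothetical and this is not how FileParser itself works; the sketch also assumes that a locus owning nothing is written with blank fields, which the format description does not state. Since a locus may own loci that appear later in the map, a check like this can only run once the whole map has been read and locus_count is known.

```python
from typing import List


def parse_owns_line(line: str, locus_count: int) -> List[int]:
    """Return the zero-based locus numbers listed on an //OWNS line.

    locus_count is the total number of loci in the enclosing map.
    """
    fields = line.rstrip("\r\n").split("\t")
    if fields[0].strip() != "//OWNS":
        raise ValueError(f"not an //OWNS line: {line!r}")
    owned = []
    for field in fields[1:]:
        field = field.strip()
        if not field:
            continue  # assumption: blank fields mean "owns nothing"
        index = int(field)
        # The first locus in a map is number 0, so valid numbers
        # run from 0 to locus_count - 1 inclusive.
        if not 0 <= index < locus_count:
            raise ValueError(
                f"locus number {index} is outside 0..{locus_count - 1}"
            )
        owned.append(index)  # order is preserved; it need not be sorted
    return owned
```

With the example line above, a map of 123 loci accepts the number 122, while a map of 122 loci rejects it, because its last locus is number 121.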

 

Some example files can be found in this winzip file.
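A small input file can also be generated programmatically. The following Python sketch assembles the text of a file from in-memory data, following the rules described above. Everything in it is an assumption made for illustration: the function name format_cpg_file, the dictionary keys (title, loci, name, positions, homogroup, owns), and the choice to write one directive per line with a tab after //HOMOGROUP and //OWNS even when the field is blank.

```python
from typing import Dict, List, Sequence


def format_cpg_file(units: Sequence[str], maps: List[Dict]) -> str:
    """Build file text; pass one unit for single maps, two for double maps."""
    if len(units) not in (1, 2):
        raise ValueError("units must contain one or two entries")
    is_double = len(units) == 2
    lines = [
        "//DOUBLE\t" + ("TRUE" if is_double else "FALSE"),
        "//UNITS\t" + "\t".join(units),  # no trailing tab for single maps
    ]
    for cpg_map in maps:
        lines.append("//***MAP\t" + cpg_map["title"])
        for locus in cpg_map["loci"]:
            # One (position, error) pair per side of the map.
            positions = locus["positions"]
            if len(positions) != len(units):
                raise ValueError(
                    f"locus {locus['name']!r} needs {len(units)} position(s)"
                )
            numbers = [str(value) for pair in positions for value in pair]
            lines.append("\t".join(["//LOCUS", locus["name"]] + numbers))
            # Blank field when the locus has no homology group.
            lines.append("//HOMOGROUP\t" + locus.get("homogroup", ""))
            # Zero-based numbers of the loci this locus owns.
            owned = [str(i) for i in locus.get("owns", [])]
            lines.append("//OWNS\t" + "\t".join(owned))
        lines.append("//***END")
    return "\n".join(lines) + "\n"
```

Calling format_cpg_file(["MB", "cM"], maps) produces a double-map file, while format_cpg_file(["MB"], maps) produces a single-map file whose first line is //DOUBLE tab FALSE.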

 

The next section tells you how to use the data now that it's prepared and gives some programming tips.

©1998 Jeremy Dickson