Let's create a new directory to hold the files we will need.
~$ mkdir Structure_job
~$ cp /home/usuaris/miguel.omullony/Structure_files/* Structure_job/
In Structure_job we will see three files: structure.R, gen_joblist.py and structure.sh
With FileZilla we will copy the data table that we want to analyze into the same directory.
structure.R
The script works by calling Structure from the path where it is installed, and we set the parameters in the function call instead of using a mainparams file. It needs a 'joblist.txt' file (indicating K, the burn-in length and the number of repetitions; see the example further below) and a directory where the output results are saved. We will generate the joblist(s) easily with gen_joblist.py.
Here is the R script, the same one you copied before. Just modify the path lines and the 'for' loop so that it matches the number of joblists you want to run.
### Using Structure from R with the parallel package
library(ParallelStructure)

setwd("/home/usuaris/user/Structure_job/") ### All files have to be here (data.txt, joblists...)
system('mkdir structure_results')          ### Directory to save the results

# Path of Structure. Don't modify this.
my_path = "/home/soft/Structure/console/"

# Function to call Structure, in this case ten times because we have ten joblists.
# Modify the number to equal the number of joblist files.
# Here you specify the parameters like in the mainparams file and how many processors
# you'll use to parallelize.
for (i in 1:10) {
  parallel_structure(structure_path=my_path, joblist=paste('joblist', i, '.txt', sep=''),
                     n_cpu=20, infile='tabla.txt', outpath='structure_results/',
                     numinds=680, numloci=17, printqhat=1, plot_output=1, noadmix=0,
                     linkage=0, label=1, markernames=1, popdata=1, locdata=1,
                     missing=-9, ploidy=2, inferalpha=1, onerowperind=1)
}
To generate any number of joblists quickly I have created a Python script. Just type the numbers as they are requested when you execute the program.
Here is an example:
~$ python gen_joblist.py
###WARNING! Insert only numbers WARNING!###
how many joblists? > 5
number of populations? > 20
number of K's > 11
how many burnins? > 1000
how many reps? > 2000
You'll see five new files. Type 'ls' to list them and open one with 'nano joblist1.txt':
miguel.omullony@cluster-ceab:~/Structure_job$ ls
gen_joblist.py  joblist1.txt  joblist2.txt  joblist3.txt  joblist4.txt  joblist5.txt  structure.R

miguel.omullony@cluster-ceab:~/Structure_job$ nano joblist1.txt
T1-1 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 1 1000 2000
T2-1 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 2 1000 2000
T3-1 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 3 1000 2000
T4-1 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 4 1000 2000
T5-1 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 5 1000 2000
T6-1 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 6 1000 2000
T7-1 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 7 1000 2000
T8-1 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 8 1000 2000
T9-1 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 9 1000 2000
T10-1 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 10 1000 2000
T11-1 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 11 1000 2000
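For reference, here is a minimal sketch of what a generator like gen_joblist.py could look like. This is a hypothetical reconstruction based only on the prompts and output shown above; the script you copied may differ in its details.

# Hypothetical sketch of a joblist generator; the real gen_joblist.py may differ.
print('###WARNING! Insert only numbers WARNING!###')
n_joblists = int(input('how many joblists? > '))
n_pops = int(input('number of populations? > '))
n_ks = int(input("number of K's > "))
burnins = int(input('how many burnins? > '))
reps = int(input('how many reps? > '))

# Comma-separated list of population IDs, e.g. "1,2,...,20"
pops = ','.join(str(p) for p in range(1, n_pops + 1))

for j in range(1, n_joblists + 1):
    with open('joblist%d.txt' % j, 'w') as f:
        # One job line per K: label, populations, K, burn-ins, repetitions
        for k in range(1, n_ks + 1):
            f.write('T%d-%d %s %d %d %d\n' % (k, j, pops, k, burnins, reps))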
Now we have everything we need to run Structure by sending the job with qsub.
As always, we send a job with a 'script.sh' file. We copied one into the Structure_job directory before (structure.sh). Just modify the path. If it's your first job you can check this guide: TODO (link to the page still to be added)
nano structure.sh

#!/bin/bash
cd /home/usuaris/<USER>/Structure_job   # write your user
/home/soft/R-3.3.1/bin/R --vanilla < structure.R > output.txt 2> error.txt
Press Ctrl+X to exit, and don't forget to save the changes.
Now send it to a queue with qsub, requesting the same number of processors as indicated in the R script (n_cpu=20). It does not need much RAM; the default allocation is enough.
qsub -pe make 20 structure.sh
Structure generates a lot of output files (ending in _f, _q and .pdf). Once it has finished, it is useful to compress all of them, or just the ones we need, into a zip file. If we want just the _f files:
cd structure_results/
zip files_f.zip *_f
With this we will obtain a 'files_f.zip' containing all the files ending in _f. (If we write *f instead, it will compress the .pdf files as well, since their names also end in f.)
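If zip is not available, the same can be done from Python. A minimal sketch using only the standard library, run from the Structure_job directory (this is a suggested alternative, not part of the original workflow):

# Hypothetical alternative to the zip command above: collect every file
# ending in _f from structure_results/ into files_f.zip.
import glob
import zipfile

with zipfile.ZipFile('files_f.zip', 'w', zipfile.ZIP_DEFLATED) as zf:
    for path in glob.glob('structure_results/*_f'):
        zf.write(path)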