Examples
This example illustrates the usages of different options by analyzing a small data set obtained from GAW18. This demo data set contains 894 SNPs and 849 individuals. The required/optional input files are
- PLINK file:
chr3-geno-MAP4-849.bed
,chr3-geno-MAP4-849.bim
andchr3-geno-MAP4-849.fam
- Covariates file:
covariates-julia.txt
which contains 5 covariates and intercept - Trait file:
y-julia.txt
- Annotation file:
annotation.txt
- Kinship file:
gaw18_849_kinship.txt
These data files come with our package, and they are available at here.
Basic usage
Three types of exact tests can be performed. Open up a Julia session and type
- exact likelihood ratio test (eLRT)
julia> using VarianceComponentTest
julia> gwasvctest(plinkFile = "chr3-geno-MAP4-849", covFile = "covariates-julia.txt", traitFile = "y-julia.txt", kinship = "none", test = "eLRT")
Then the output will be written to chr3-geno-MAP4-849-julia.out
at the same directory as PLINK files.
- exact restricted likelihood ratio test (eRLRT)
julia> using VarianceComponentTest
julia> gwasvctest(plinkFile = "chr3-geno-MAP4-849", covFile = "covariates-julia.txt", traitFile = "y-julia.txt", kinship = "none", test = "eRLRT")
- exact score test (eScore)
julia> using VarianceComponentTest
julia> gwasvctest(plinkFile = "chr3-geno-MAP4-849", covFile = "covariates-julia.txt", traitFile = "y-julia.txt", kinship = "none", test = "eScore")
Note: if the input files are not at the current directory, you should specify the paths correctly.
You can also call gwasvctest from command line. For example, to perform eRLRT
$ julia -E 'using VarianceComponentTest; gwasvctest(plinkFile = "chr3-geno-MAP4-849", covFile = "covariates-julia.txt", traitFile = "y-julia.txt", kinship = "none", test = "eRLRT")'
Option pvalueComputing
Chi squared approximation is recommended (though you don't have to write it out specifically)
julia> using VarianceComponentTest
julia> gwasvctest(plinkFile = "chr3-geno-MAP4-849", covFile = "covariates-julia.txt", traitFile = "y-julia.txt", kinship = "none", test = "eRLRT", pvalueComputing = "chi2")
If you want to use Monte Carlo method
julia> using VarianceComponentTest
julia> gwasvctest(plinkFile = "chr3-geno-MAP4-849", covFile = "covariates-julia.txt", traitFile = "y-julia.txt", kinship = "none", test = "eRLRT", pvalueComputing = "MonteCarlo")
Note: Option pvalueComputing
is only for eLRT and eRLRT. For eScore, it employs the method of inverting characteristics function to compute the p-value.
Option nNullSimPts
If Monte Carlo method is chosen, generate 10,000 replicates should be sufficient (you can also generate even more replicates to increase the precision of p-value)
julia> using VarianceComponentTest
julia> gwasvctest(plinkFile = "chr3-geno-MAP4-849", covFile = "covariates-julia.txt", traitFile = "y-julia.txt", kinship = "none", test = "eRLRT", pvalueComputing = "MonteCarlo", nNullSimPts = 10000)
If Chi squared approximation method is chosen, you can generate more replicates (default value is 1,000,000) to have a better estimator of point mass at 0
julia> using VarianceComponentTest
julia> gwasvctest(plinkFile = "chr3-geno-MAP4-849", covFile = "covariates-julia.txt", traitFile = "y-julia.txt", kinship = "none", test = "eRLRT", pvalueComputing = "chi2", nNullSimPts = 10000000)
Option annotationFile
If the annotation information of the markers is available, you can share it by inputting an annotation file with option annotationFile
julia> using VarianceComponentTest
julia> gwasvctest(plinkFile = "chr3-geno-MAP4-849", covFile = "covariates-julia.txt", traitFile = "y-julia.txt", kinship = "none", test = "eRLRT", annotationFile = "annotation.txt")
Option kinship
For the analysis of unrelated data, the kinship matrix should not be included in the model. So you should specify option kinship
as none
julia> using VarianceComponentTest
julia> gwasvctest(plinkFile = "chr3-geno-MAP4-849", covFile = "covariates-julia.txt", traitFile = "y-julia.txt", test = "eRLRT", kinship = "none")
If you have family data and have a file (gaw18_849_kinship.txt in this example) which contains the kinship matrix
julia> using VarianceComponentTest
julia> gwasvctest(plinkFile = "chr3-geno-MAP4-849", covFile = "covariates-julia.txt", traitFile = "y-julia.txt", test = "eRLRT", kinship = "gaw18_849_kinship.txt")
It's okay if you don't have the kinship file
julia> using VarianceComponentTest
julia> gwasvctest(plinkFile = "chr3-geno-MAP4-849", covFile = "covariates-julia.txt", traitFile = "y-julia.txt", test = "eRLRT", kinship = "GRM")
Option infLambda
If you are concerning about too high approximating rank for the kinship matrix, use option infLambda
to set a lower rank
julia> using VarianceComponentTest
julia> gwasvctest(plinkFile = "chr3-geno-MAP4-849", covFile = "covariates-julia.txt", traitFile = "y-julia.txt", kinship = "gaw18_849_kinship.txt", test = "eRLRT", infLambda = 1.0)
Note: If you are analyzing unrelated data, you can ignore option infLambda
since the kinship matrix will not be included in the model in unrelated case.
Parallel computing
If your machine has multiple cores, you can active the parallel computing mode to speed up the calculations. For example, you want to distribute the computings to 4 processes
julia> addprocs(4)
julia> @everywhere using VarianceComponentTest
julia> gwasvctest(plinkFile = "chr3-geno-MAP4-849", covFile = "covariates-julia.txt", traitFile = "y-julia.txt", kinship = "none", test = "eRLRT", annotationFile = "annotation.txt")
You can also call the function from command line
$ julia -p 4 -E '@everywhere using VarianceComponentTest; gwasvctest(plinkFile = "chr3-geno-MAP4-849", covFile = "covariates-julia.txt", traitFile = "y-julia.txt", kinship = "none", test = "eRLRT", annotationFile = "annotation.txt")'