| filter_GWAS {QCGWAS} | R Documentation | 
Automated filtering and reformatting of GWAS results files
Description
This function was created as a convenient way to automate the
removal of low-quality and non-autosomal SNPs. It
includes the same formatting options as QC_GWAS.
Usage
filter_GWAS(ini_file,
            GWAS_files, output_names,
            gzip_output = TRUE,
            dir_GWAS = getwd(), dir_output = dir_GWAS,
            FRQ_HQ = NULL, HWE_HQ = NULL,
            cal_HQ = NULL, imp_HQ = NULL,
            FRQ_NA = TRUE, HWE_NA = TRUE,
            cal_NA = TRUE, imp_NA = TRUE,
            ignore_impstatus = FALSE,
            remove_X = FALSE, remove_Y = FALSE,
            remove_XY = FALSE, remove_M = FALSE,
            header_translations,
            check_impstatus = FALSE,
            imputed_T = c("1", "TRUE", "yes", "YES", "y", "Y"),
            imputed_F = c("0", "FALSE", "no", "NO", "n", "N"),
            imputed_NA = NULL,
            column_separators = c("\t", " ", "", ",", ";"),
            header = TRUE, nrows = -1, nrows_test = 1000,
            comment.char = "", na.strings = c("NA", "."),
            out_header = "original", out_quote = FALSE,
            out_sep = "\t", out_eol = "\n", out_na = "NA",
            out_dec = ".", out_qmethod = "escape",
            out_rownames = FALSE, out_colnames = TRUE, ...)
Arguments
| ini_file | (the filename of) a table listing the files to be processed and the filters to be applied. See 'Details'. | 
| GWAS_files | character vector: when no  | 
| output_names | character vector: the filenames for the
output files. The default option is to use the input
filenames. Note that, unlike with other  | 
| gzip_output | logical; should the output files be compressed? | 
| dir_GWAS,dir_output | character strings specifying the paths of the folders that contain the input files and receive the output files, respectively. Note that R uses forward slash (/) where Windows uses backslash (\). | 
| FRQ_HQ,HWE_HQ,cal_HQ,imp_HQ | Numeric vectors. When no  | 
| FRQ_NA,HWE_NA,cal_NA,imp_NA | Logical vectors. When no  | 
| ignore_impstatus | Logical vector. When no  | 
| remove_X,remove_Y,remove_XY,remove_M | logical; respectively whether X-chromosome, Y-chromosome,
pseudo-autosomal and mitochondrial SNPs are removed. Note:
these arguments accept only a single  | 
| header_translations | translation table for column names.
See  | 
| check_impstatus | logical; should
 | 
| imputed_T,imputed_F,imputed_NA | arguments passed to
 | 
| column_separators | character string or vector; specifies
the values used as column delimiter in the GWAS file(s). The
argument is passed to  | 
| nrows_test | integer; the number of rows used for
"trial-loading". Before loading the entire dataset, the
function  | 
| header,nrows,comment.char,na.strings,... | arguments passed to  | 
| out_header | Translation table for the column names of
the output file. This argument is the opposite of
 
 | 
| out_quote,out_sep,out_eol,out_na,out_dec,out_qmethod,out_rownames,out_colnames | arguments passed to
 | 
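The note on path separators for dir_GWAS and dir_output can be illustrated with base R. This is a minimal sketch, not part of QCGWAS; the folder names are hypothetical.

```r
# Hypothetical Windows-style path, written with escaped backslashes.
win_path <- "C:\\data\\GWAS"

# Replace each backslash with a forward slash so that the path can be
# passed to an argument such as dir_GWAS.
r_path <- gsub("\\", "/", win_path, fixed = TRUE)

# file.path() joins path components with forward slashes on every platform.
out_path <- file.path(r_path, "filtered")
```

Here r_path could be passed as dir_GWAS and out_path as dir_output.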
Details
The easiest way to use filter_GWAS is by passing an ini
file to the ini_file argument.
The ini file can be generated by running QC_series
with the save_filtersettings argument set to TRUE.
The output will include a file 'Check_filtersettings.txt',
describing the (high-quality) filter settings used for each
file (taking into account whether there was enough data, i.e.
whether the use_threshold was met, to apply the filters).
The ini_file argument accepts either a table
or the name of a file in dir_GWAS or the
current R working directory.
If no ini_file is specified, the function will use the
GWAS_files, x_HQ and x_NA (where x is FRQ, HWE, cal or imp)
and ignore_impstatus arguments to construct such a table.
GWAS_files can either be a character vector or a single
value. If a single string, all filenames containing the string
will be processed. The other arguments can also be a vector or
a single value; if the latter, they will be recycled to create
a vector of the correct length.  
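The selection and recycling behaviour described above can be sketched in base R. This is an illustration of the idea only, not the package's own code: the file names, the column names of the resulting table, and the use of literal (non-regex) matching are all assumptions.

```r
# Create a temporary folder holding hypothetical GWAS file names.
tmp <- tempfile("gwas_demo")
dir.create(tmp)
invisible(file.create(file.path(
  tmp, c("cohortA_height.txt", "cohortB_height.txt", "readme.txt")
)))

# A single string selects every filename that contains it.
# (Literal matching is an assumption; filter_GWAS may match differently.)
all_files <- list.files(tmp)
GWAS_files <- all_files[grepl("height", all_files, fixed = TRUE)]

# A single filter value is recycled to one value per selected file.
FRQ_HQ <- rep_len(0.01, length(GWAS_files))
FRQ_NA <- rep_len(TRUE, length(GWAS_files))

# One row per file; the column names here are illustrative only.
settings <- data.frame(file = GWAS_files, FRQ_HQ = FRQ_HQ, FRQ_NA = FRQ_NA)
```

The settings table has one row per matched file, each carrying the same recycled filter values.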
If neither ini_file nor GWAS_files is specified,
the function will look for a file
Check_filtersettings.txt
in dir_GWAS and the current R working directory.
Note that ini_file overrules the other filter settings,
i.e. one cannot adjust ini_file through the other
arguments.
Value
An invisible logical vector, indicating which files were successfully filtered.
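Because the result is returned invisibly, it is not printed and must be assigned before it can be inspected. The sketch below uses a stand-in function, fake_filter (hypothetical, not part of QCGWAS), that only mimics the documented return type.

```r
# Stand-in for filter_GWAS(): NOT the real function. It only mimics the
# documented return value (an invisible logical vector, one element per file).
fake_filter <- function(files) {
  ok <- file.exists(files)
  invisible(ok)
}

# One file that exists and one that does not (both names are temporary).
existing <- tempfile(fileext = ".txt")
invisible(file.create(existing))
files <- c(existing, tempfile(fileext = ".txt"))

fake_filter(files)         # returns invisibly, so nothing is printed
ok <- fake_filter(files)   # assign the result to inspect it
failed <- files[!ok]       # files that were not processed successfully
```

The same pattern, assigning the result and indexing the file names with it, should apply to the vector returned by filter_GWAS.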
Note
R is not the optimal platform for filtering GWAS files. This function was added at the request of a user, but a UNIX script is likely to be faster.