study.character {JATSdecoder}    R Documentation

study.character

Description

Extracts study characteristics from a NISO-JATS coded XML file. Use CERMINE to convert PDF files to CERMXML files.

Usage

study.character(
  x,
  stats.mode = "all",
  recalculate.p = TRUE,
  alternative = "auto",
  estimateZ = FALSE,
  T2t = FALSE,
  R2r = FALSE,
  selectStandardStats = NULL,
  checkP = TRUE,
  criticalDif = 0.02,
  alpha = 0.05,
  p2alpha = TRUE,
  alpha_output = "list",
  captions = TRUE,
  text.mode = 1,
  update.package.list = FALSE,
  add.software = NULL,
  quantileDF = 0.9,
  N.max.only = FALSE,
  output = "all",
  rm.na.col = TRUE
)

Arguments

x

NISO-JATS coded XML file.

stats.mode

Character. Select subset of standard stats. One of: c("all", "checkable", "computable").

recalculate.p

Logical. If TRUE, recalculates p-values (for two-sided tests) if possible.

alternative

Character. Select sidedness of recomputed p-values for t-, r- and Z-values. One of c("auto", "undirected", "directed"). If set to "auto", 'alternative' will be set to 'directed' if get.test.direction() detects one-directional hypotheses/tests in the text. If no directional hypotheses/tests are detected, only "undirected" recomputed p-values will be returned.

estimateZ

Logical. If TRUE, a detected beta-/d-value is divided by the reported standard error "SE" to estimate a Z-value ("Zest") for the observed beta/d and to recompute the p-value. Note: This is only valid if the Gauss-Markov assumptions are met and a sufficiently large sample size is used. If a Z- or t-value is detected in a report of a beta-/d-coefficient with SE, no estimation will be performed, even if set to TRUE.

T2t

Logical. If TRUE capital letter T is treated as t-statistic when extracting statistics with get.stats().

R2r

Logical. If TRUE capital letter R is treated as correlation when extracting statistics with get.stats().

selectStandardStats

Select specific standard statistics only (e.g.: c("t", "F", "Chi2")).

checkP

Logical. If TRUE observed and recalculated p-values are checked for consistency.

criticalDif

Numeric. Sets the maximum absolute difference between reported and recalculated p-values for error detection.

alpha

Numeric. Defines the alpha level to be used for error assignment of detected inconsistencies.

p2alpha

Logical. If TRUE, detects and extracts alpha errors denoted with a critical p-value (which may lead to some false positive detections).

alpha_output

One of c("list", "vector"). If alpha_output = "list", a list with the elements alpha_error, corrected_alpha, alpha_from_CI, alpha_max, alpha_min is returned. If alpha_output = "vector", the unique alpha errors are returned without a distinction of types.

captions

Logical. If TRUE captions text will be scanned for statistical results.

text.mode

Numeric. Defines text parts to extract statistical results from (text.mode=1: abstract and full text, text.mode=2: method and result section, text.mode=3: result section only).

update.package.list

Logical. If TRUE updates available R packages with utils::available.packages() function.

add.software

Additional software names to detect, supplied as a vector.

quantileDF

Numeric. Quantile of (df1+1)+(df2+1) to extract for estimating sample size.

N.max.only

Logical. If TRUE, returns only the maximum of the estimated sample sizes.

output

Selection of specific results to output: "all" or a selection from c("doi", "title", "year", "Nstudies",
"methods", "alpha_error", "power", "multi_comparison_correction",
"assumptions", "OutlierRemovalInSD", "InteractionModeratorMediatorEffect",
"test_direction", "sig_adjectives", "software", "Rpackage", "stats",
"standardStats", "estimated_sample_size").

rm.na.col

Logical. If TRUE removes all columns with only NA in extracted standard statistics.
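
The arguments recalculate.p, alternative, checkP, criticalDif, alpha and estimateZ all concern recomputing p-values and comparing them with reported ones. The following base R sketch illustrates the idea only; it is not JATSdecoder's internal implementation. The numeric values and the names p_undirected, p_directed, inconsistent and decision_change are hypothetical and chosen for illustration.

```r
# Illustration in base R; not the code used inside JATSdecoder.
# Hypothetical reported result: t(28) = 2.10, p = .03
t_value    <- 2.10
df         <- 28
p_reported <- 0.03

# Recomputed p-values for an undirected (two-sided) and a directed (one-sided) test
p_undirected <- 2 * pt(-abs(t_value), df)
p_directed   <- pt(-abs(t_value), df)

# Consistency check in the spirit of checkP, criticalDif and alpha (assumed logic)
criticalDif <- 0.02
alpha       <- 0.05
inconsistent    <- abs(p_reported - p_undirected) > criticalDif
decision_change <- (p_reported < alpha) != (p_undirected < alpha)

# Z estimate in the spirit of estimateZ: coefficient divided by its standard error
beta  <- 0.50   # hypothetical coefficient
SE    <- 0.20   # hypothetical standard error
Z_est <- beta / SE
p_Z   <- 2 * pnorm(-abs(Z_est))

print(round(c(p_undirected = p_undirected, p_directed = p_directed, p_Z = p_Z), 4))
print(c(inconsistent = inconsistent, decision_change = decision_change))
```

Under these assumptions, a reported p-value is flagged only when it differs from the recomputed one by more than criticalDif; whether the difference also changes the significance decision at alpha is tracked separately.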

Value

List with extracted study characteristics.

Note

A short tutorial on how to work with JATSdecoder and the generated outputs can be found at: https://github.com/ingmarboeschen/JATSdecoder

Source

An interactive web application for selecting and analyzing extracted article metadata and study characteristics for articles linked to PubMed Central is hosted at: https://www.scianalyzer.com/

The XML version of PubMed Central database articles can be downloaded in bulk from:
https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_bulk/

References

Böschen (2023). "Evaluation of the extraction of methodological study characteristics with JATSdecoder." Scientific Reports. doi: 10.1038/s41598-022-27085-y.

Böschen (2021). "Evaluation of JATSdecoder as an automated text extraction tool for statistical results in scientific reports." Scientific Reports. doi: 10.1038/s41598-021-98782-3.

See Also

JATSdecoder for simultaneous extraction of meta-tags, abstract, sectioned text and reference list.

get.stats for extracting statistical results from textual input and different file formats.

Examples

# download example XML file via URL
x <- "https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0114876&type=manuscript"
# file name
file <- paste0(tempdir(), "/file.xml")
# download URL as "file.xml" in tempdir() if a connection is possible
tryCatch({
  readLines(x, n = 1)
  download.file(x, file)
},
warning = function(w) message(
  "Something went wrong. Check your internet connection and the link address."),
error = function(e) message(
  "Something went wrong. Check your internet connection and the link address."))
# convert full article to list with study characteristics
if (file.exists(file)) study.character(file)
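
As a further sketch, the extraction can be narrowed with arguments listed under Usage. The example assumes that JATSdecoder is installed and that the article was downloaded to tempdir() as "file.xml" by the code above; the variable names xml_file and res are ours. Passing several names to output at once is an assumption based on the argument description.

```r
# Sketch: runs only if JATSdecoder is installed and the example file exists.
xml_file <- file.path(tempdir(), "file.xml")

if (requireNamespace("JATSdecoder", quietly = TRUE) && file.exists(xml_file)) {
  res <- JATSdecoder::study.character(
    xml_file,
    text.mode  = 3,                      # result section only
    stats.mode = "checkable",            # subset of standard stats
    output     = c("doi", "title", "standardStats")
  )
  # inspect the top-level structure of the returned list
  str(res, max.level = 1)
}
```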

[Package JATSdecoder version 1.2.0 Index]