- The research questions and/or hypotheses should organize the report, with the methods, results, and discussion aligned to them. These questions/hypotheses should be formulated before the research is undertaken and situated in the relevant literature. Researchers should state overtly when they are explaining results in a post hoc/exploratory fashion.
- The design and testing procedures of the study should be planned and reported in a manner that displays methodological rigor. Experimental designs, for instance, require random assignment and a clear explanation of the treatment and comparison conditions (Rogers & Révész, 2020). Basic descriptive and correlational tables are expected, as is covariate control (a minimal covariate-control sketch appears after this list). Bi-/univariate tests, to cite another example, are problematic because much if not all of SLA/AL theory features multidimensional phenomena. Authors who are unsure of what criteria might be used to review specific approaches are encouraged to consult Hancock et al. (2019) and Norris et al. (2015) as resources.
- The sampling process is at the heart of quantitative research and will be checked for suitability. The population from which a representative sample has been drawn should be clearly stated in the research questions/hypotheses. In general, single-site samples are flawed in relation to the generalizations that quantitative research endeavors to make (see Moranski & Ziegler, 2021; Vitta et al., 2021). Sample sizes also need to be defended relative to anticipated effect sizes (Brysbaert, 2019). Ideally, a priori power analyses, or at least sensitivity analyses, are used (see the power-analysis sketch after this list). Alternative procedures such as precision-based planning are also encouraged (Norouzian, 2020). In lieu of mathematical procedures (e.g., power analysis), heuristics and rules of thumb can be used. That is not to say that single-site, underpowered samples are unwelcome, but authors should overtly state the limitations of such practices.
- Authors generalize from their results whether they acknowledge it or not; connecting findings to past studies is itself a form of generalization. Thus, inferential testing is required for most quantitative work unless the authors explicitly state that the results pertain to their context only (e.g., in a mixed-methods study), in which case descriptive statistics are suitable. Whether authors present Bayesian or frequentist inferential approaches is up to them (a frequentist sketch appears after this list).
- Effect sizes must always be reported, even for non-significant results and post hoc comparisons. Authors are also encouraged to interpret the magnitudes of the effect sizes that they report (Plonsky & Oswald, 2014). The assumptions of the inferential tests producing the effect sizes must be checked and reported (Isbell et al., 2022). While there are competing frameworks for these assumption checks, authors should cite a framework and then execute the checks accordingly; Field's (2018) handbook is at the practical level of most CALL researchers (an assumption-check and effect-size sketch appears after this list). Authors are encouraged to consider the robustness of parametric testing before using non-parametric alternatives, which have limitations of their own (e.g., the loss of information when data are reduced to ranks).
- While CALL is not a psychometrics-intensive subfield of AL/SLA, reliability and validity evidence should be presented for the inferences enabled by a study's instruments (a reliability sketch appears after this list).
- Authors are strongly encouraged to share their data and materials using databases such as IRIS.
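The sketches below illustrate several of the practices above in Python; all data, file names, and column names are hypothetical, and each sketch is a minimal illustration rather than a definitive analysis. First, covariate control: an ANCOVA-style model in statsmodels comparing a treatment and a comparison group on a posttest while adjusting for pretest scores, alongside the basic descriptive table reviewers expect.

```python
# Minimal ANCOVA-style sketch for covariate control; the file and the
# group/pretest/posttest columns are hypothetical.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.read_csv("scores.csv")  # hypothetical data set

# Basic descriptive table: n, mean, and SD per condition.
print(df.groupby("group")["posttest"].agg(["count", "mean", "std"]))

# OLS model with pretest as a covariate; C() treats group as categorical.
model = smf.ols("posttest ~ pretest + C(group)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # Type II sums of squares
```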
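Next, sample-size planning using statsmodels' power routines. The anticipated effect of d = 0.4 is illustrative only; in a real study it should be defended from prior literature (Brysbaert, 2019).

```python
# A priori power analysis and a sensitivity analysis for a two-group design.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Participants needed per group to detect d = 0.4 at alpha = .05, power = .80.
n_per_group = analysis.solve_power(effect_size=0.4, alpha=0.05, power=0.80)
print(round(n_per_group))  # roughly 100 per group

# Sensitivity analysis: the smallest effect detectable with 30 per group.
detectable_d = analysis.solve_power(nobs1=30, alpha=0.05, power=0.80)
print(round(detectable_d, 2))  # roughly d = 0.74
```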
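For inferential testing, a frequentist sketch with scipy is shown below on invented scores; Bayesian analogues (e.g., Bayes-factor t-tests) are available in packages such as pingouin or JASP.

```python
# Frequentist independent-samples t-test on hypothetical posttest scores.
from scipy import stats

treatment = [72, 85, 78, 90, 66, 81, 77, 88]
comparison = [65, 70, 74, 68, 72, 60, 71, 69]

# equal_var=False would give Welch's test if variances are unequal.
t_stat, p_value = stats.ttest_ind(treatment, comparison)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```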
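The same comparison should be accompanied by assumption checks and an effect size. The sketch below runs two common checks of the kind covered in Field (2018), Shapiro-Wilk for normality and Levene for homogeneity of variance, and computes Cohen's d from the pooled standard deviation.

```python
# Assumption checks and Cohen's d for the hypothetical two-group comparison.
import numpy as np
from scipy import stats

treatment = np.array([72, 85, 78, 90, 66, 81, 77, 88])
comparison = np.array([65, 70, 74, 68, 72, 60, 71, 69])

print(stats.shapiro(treatment), stats.shapiro(comparison))  # normality per group
print(stats.levene(treatment, comparison))                  # homogeneity of variance

# Cohen's d with a pooled standard deviation.
n1, n2 = len(treatment), len(comparison)
pooled_sd = np.sqrt(((n1 - 1) * treatment.var(ddof=1) +
                     (n2 - 1) * comparison.var(ddof=1)) / (n1 + n2 - 2))
d = (treatment.mean() - comparison.mean()) / pooled_sd
print(f"d = {d:.2f}")
```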
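Finally, a reliability sketch: Cronbach's alpha computed directly from its standard formula for a hypothetical item-response matrix (rows are participants, columns are questionnaire items). Note that this is internal-consistency evidence only and does not substitute for a validity argument.

```python
# Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances / variance of totals).
import numpy as np

# Hypothetical responses: 6 participants x 4 Likert items.
items = np.array([
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 5, 4],
    [2, 3, 3, 2],
    [4, 4, 5, 5],
    [3, 2, 3, 3],
])

k = items.shape[1]
item_vars = items.var(axis=0, ddof=1).sum()  # sum of per-item variances
total_var = items.sum(axis=1).var(ddof=1)    # variance of participants' totals
alpha = (k / (k - 1)) * (1 - item_vars / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```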
References
Brysbaert, M. (2019). How many participants do we have to include in properly powered experiments? A tutorial of power analysis with reference tables. Journal of Cognition, 2, 1–38. https://doi.org/10.5334/joc.72
Field, A. (2018). Discovering statistics using IBM SPSS statistics (5th ed.). Sage Publications.
Hancock, G. R., Mueller, R. O., & Stapleton, L. M. (2019). The reviewer's guide to quantitative methods in the social sciences (2nd ed.). Routledge.
Isbell, D., Brown, D., Chan, M., Derrick, D., Ghanem, R., Gutiérrez Arvizu, M. N., Schnur, E., Zhang, M., & Plonsky, L. (2022). Misconduct and questionable research practices: The ethics of quantitative data handling and reporting in applied linguistics. The Modern Language Journal, 106(1), 172–195. https://doi.org/10.1111/modl.12760
Moranski, K., & Ziegler, N. (2021). A case for multisite second language acquisition research: Challenges, risks, and rewards. Language Learning, 71, 204–242. https://doi.org/10.1111/lang.12434
Norouzian, R. (2020). Sample size planning in quantitative L2 research. Studies in Second Language Acquisition, 42(4), 849–870. https://doi.org/10.1017/s0272263120000017
Norris, J. M., Plonsky, L., Ross, S. J., & Schoonen, R. (2015). Guidelines for reporting quantitative methods and results in primary research. Language Learning, 65(2), 470–476.
Plonsky, L., & Oswald, F. L. (2014). How big is “big”? Interpreting effect sizes in L2 research. Language Learning, 64, 878–912. https://doi.org/10.1111/lang.12079
Rogers, J., & Révész, A. (2020). Experimental and quasi-experimental designs. In J. McKinley & H. Rose (Eds.), The Routledge handbook of research methods in applied linguistics (pp. 133–143). Routledge.
Vitta, J. P., Nicklin, C., & McLean, S. (2021). Effect size-driven sample-size planning, randomization, and multisite use in L2 instructed vocabulary acquisition experimental samples. Studies in Second Language Acquisition. Advance online publication. https://doi.org/10.1017/s0272263121000541