Sample Size - Science topic

The number of units (persons, animals, patients, specified circumstances, etc.) in a population to be studied. The sample size should be big enough to have a high likelihood of detecting a true difference between two groups. (From Wassertheil-Smoller, Biostatistics and Epidemiology, 1990, p95)
Questions related to Sample Size
  • asked a question related to Sample Size
Question
3 answers
The study involves age-specific range (15-16).
Relevant answer
Answer
The minimum sample size for pilot testing of a 12-item test depends on various factors, including the level of confidence desired, the margin of error acceptable, and the variability within your target population.
However, usually for piloting, it should be 10–20% of your actual study sample size.
  • asked a question related to Sample Size
Question
1 answer
I'm using a quasi-experimental design with a sample of 18 students with locomotor disability. There is an experimental group and a control group, with a pre-test/post-test design; the experimental group receives a teaching module as the treatment so that its effectiveness can be studied. Should I formulate an alternative hypothesis or a null hypothesis for this?
Relevant answer
Answer
Both null and alternative hypotheses can be formulated in a quasi-experimental design, but the type of hypothesis used should align with the research question and the structure of the design. Typically, a non-equivalent groups design is employed, where the researcher hypothesizes that the intervention or treatment will lead to a specific outcome compared to the control or comparison group. This hypothesis is directional and predicts the expected difference or change due to the intervention. The reason for using this type of hypothesis is that, unlike true experiments, quasi-experiments do not use random assignment, making it challenging to control for all confounding variables. Therefore, the hypothesis must be carefully constructed to account for these potential confounds and to assert a cause-and-effect relationship as confidently as possible within the constraints of the design.
  • asked a question related to Sample Size
Question
9 answers
I want to conduct a study where the population size is infinitely large. In that case, how much data should I collect? Can anyone suggest a book on this?
Relevant answer
Answer
  • asked a question related to Sample Size
Question
3 answers
Hi everyone,
I want to apply the tests for special causes to a p-chart, but the sample size differs between subgroups, so I don't have a fixed UCL and LCL. Can I still apply the tests for special causes? Can I calculate an index when I don't have a fixed LCL and UCL?
Thanks,
Relevant answer
Answer
This is the raw data; please analyse it using the p-chart method.
This is a p-chart for nonconforming products over one month, with one sample taken each day.
Samples were taken and the number of nonconforming products was counted, with a total of 26 days of data collected.
The data are as follows:
Inspection quantity (n) Unqualified quantity (np)
158 11
140 11
140 8
155 6
160 4
144 7
139 10
151 11
163 9
148 5
150 2
153 7
149 7
145 8
160 6
165 15
136 18
153 10
150 9
148 5
135 0
165 12
143 10
138 8
144 14
161 20
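With unequal subgroup sizes, a p-chart is usually drawn with a separate pair of limits for each subgroup, p̄ ± 3·sqrt(p̄(1 − p̄)/n_i), and the special-cause tests can be run on the standardized values z_i = (p_i − p̄)/σ_i. A minimal Python sketch using the 26 days of data above; the variable names are illustrative, and only the "one point beyond 3 sigma" test is shown.

```python
import math

# (inspected, nonconforming) per day, copied from the table above
data = [
    (158, 11), (140, 11), (140, 8), (155, 6), (160, 4), (144, 7), (139, 10),
    (151, 11), (163, 9), (148, 5), (150, 2), (153, 7), (149, 7), (145, 8),
    (160, 6), (165, 15), (136, 18), (153, 10), (150, 9), (148, 5), (135, 0),
    (165, 12), (143, 10), (138, 8), (144, 14), (161, 20),
]

total_n = sum(n for n, _ in data)
total_d = sum(d for _, d in data)
p_bar = total_d / total_n  # centre line: overall nonconforming fraction

rows = []
for day, (n, d) in enumerate(data, start=1):
    sigma = math.sqrt(p_bar * (1 - p_bar) / n)  # depends on that day's n
    ucl = p_bar + 3 * sigma
    lcl = max(0.0, p_bar - 3 * sigma)           # a proportion cannot be negative
    p = d / n
    z = (p - p_bar) / sigma                     # standardized value for run tests
    rows.append((day, p, lcl, ucl, z))

# Test 1: a point outside its own 3-sigma limits
out_of_control = [day for day, p, lcl, ucl, _ in rows if p > ucl or p < lcl]

print(f"p-bar = {p_bar:.4f}")
print("days beyond the 3-sigma limits:", out_of_control)
```

On these data the sketch flags days 17 and 26. The other run tests (for example, several consecutive points on one side of the centre line) can be applied to the `z` column in the same way, since the standardized limits are fixed at ±3.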
  • asked a question related to Sample Size
Question
1 answer
Our study involves conducting multiple WBC countings of venous blood from healthy participants using a control and test diluting fluid. We plan to compare and evaluate the results by checking how close/far the WBC count of a sample with the test is from the control. We want to know of studies or articles that state what is the minimum sample size that will still yield statistically significant results.
Relevant answer
Answer
MARBIL
Folks may find it difficult to answer your query, as there are many possible differences between two assessment methods (blood dilution methods in this case), and each requires a different statistical power analysis (SPA) to determine sample sizes. So I suggest you be more specific about which is of interest (or about your purpose).
In this regard, two obvious basic concerns (IMO), and arguably both matter, would be 1) the respective reliabilities and their difference, and 2) the respective means (and potential bias of the new dilution method vs. the old control).
“Sample size” also has several possible meanings, depending on the possibility of dividing a single person's sample into more than two subsamples, e.g. say 10, where alternating between dilution methods (for relative independence) would ultimately produce 5 for each. “Sample size” could also mean the number of persons who provide a sample (which might be divided as just described).
Nonetheless, given how you posed the question (sans specifics), one might guess you are interested in a quick evaluation where the real focus is on use of a new, maybe much faster or cheaper, dilution method, and where WBC variation is the “real interest.” In such a case, one might suggest selecting about 15 stratified-sampled individuals (animal or human) with WBCs over the range of interest and then dividing each sample into an even number of subsamples (at least 6, but 10 is better).
Justification for the above could be provided if it is on target. Most likely it is off in some significant way(s), and better advice would require a better statement of your underlying interest.
ALVAH
Alvah C Bittner, PhD, CPE
  • asked a question related to Sample Size
Question
3 answers
Please provide the formula for determining sample size given a study population. Sampling tables from different scholars would also be of value to me.
Relevant answer
Answer
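The formula below, sometimes quoted under the Kish Leslie name, gives the sampling interval used in systematic sampling rather than the sample size itself:
\[ f = \frac{N}{n} \]
Where \( f \) is the sampling interval, \( N \) is the population size, and \( n \) is the desired sample size. The sample-size formula usually attributed to Kish (1965) is \( n = \frac{Z^2 p (1 - p)}{d^2} \), where \( Z \) is the standard normal value for the chosen confidence level, \( p \) is the expected proportion, and \( d \) is the desired margin of error.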
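As a worked illustration, here is a minimal Python sketch of a sample-size calculation for estimating a proportion, n = z²p(1 − p)/d², with an optional finite population correction. The function name and the defaults (95% confidence, p = 0.5, d = 0.05) are illustrative assumptions, not something stated in the answer.

```python
import math
from statistics import NormalDist


def proportion_sample_size(p=0.5, d=0.05, confidence=0.95, population=None):
    """Sample size for estimating a proportion p to within +/- d."""
    # Two-sided standard normal critical value, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * p * (1 - p) / d ** 2
    if population is not None:
        # Finite population correction for a known population size
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


print(proportion_sample_size())                 # large population
print(proportion_sample_size(population=1000))  # population of 1,000
```

Using p = 0.5 is the conservative choice because it maximises p(1 − p); a better prior estimate of p gives a smaller n.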
  • asked a question related to Sample Size
Question
1 answer
Sample size determination seems to be a difficult job for me, especially due to its statistical complexity. For researchers who don't have prior statistical knowledge, it often becomes even harder. I want to learn the concept in detail, with comprehensive examples. Please help me find such a source. Best regards
Relevant answer
Answer
A common rule of thumb is a sample size of 30 or more, because the sampling distribution of the mean is then approximately normal (bell-shaped) under the central limit theorem; this does not by itself make the sample free from bias. Nonetheless, the larger the sample size, the better the sampling result.
  • asked a question related to Sample Size
Question
2 answers
Hi Research Gate people!
I am trying to determine the sample size, power and alpha boundary needed for my interim analysis for a registered report in a psych journal. I have read the paper by Lakens (2014) on sequential analysis, but am still pretty confused and would appreciate it if anyone could link me up with any published social psych papers that have used sequential analyses in their design.
Cheers!
Relevant answer
You could just simply use n = at least 30 to fulfil the CLT :)
  • asked a question related to Sample Size
Question
2 answers
Hello. I am examining the psychometric properties of the UGDS-GS, a test for measuring gender anxiety in the trans community. The problem I am facing is that my EFA results are good but my CFA results are not. Is this due to the small sample size? My sample size is 140.
Relevant answer
Answer
EFA is less restrictive than CFA, as EFA allows all of the crossloadings. In contrast, most if not all crossloadings are typically fixed to zero in CFA. It is therefore rather typical for CFA to fit worse than EFA.
In terms of sample size, N = 140 is rather on the low side for both EFA and CFA. If anything, such a low sample size would typically result in a lack of statistical power to detect model misfit. Therefore, if you see misfit, I would definitely pay attention to it, especially in a small sample.
  • asked a question related to Sample Size
Question
6 answers
Greetings,
I am Mamun. I just want to know: if I use a 7% margin of error, will it cause any problems in future analysis? I am using a stratified sampling technique, and if I take a 7% margin of error I get a small sample size. It would be better for me to take a small sample size.
Thank You.
Relevant answer
Answer
Just to be clear, Mamun Ur Rashid, I think you are not talking about a p-value, but the margin of error you are considering. Is that correct?
  • asked a question related to Sample Size
Question
6 answers
I am quite confused about what formula to use to compute my sample size. I will be conducting a Sequential Explanatory design wherein my QUANT phase will make use of mediation analysis and my qual phase will be interpretative phenomenology. How can I determine the sample size? What is the best formula to use?
Relevant answer
Answer
Bruce Weaver A useful app but it appears that its focus (currently) is exclusively on power. I believe that a simulation that is run "from scratch" provides greater flexibility and yields a lot more information than just on power. For example, you can simulate the effects of non-normal and missing data and evaluate the performance of fit statistics, as well as parameter and standard error bias in addition to estimating power. Power estimations can be misleading when your SEM parameters or standard errors are biased due to, for example, an insufficient sample size.
  • asked a question related to Sample Size
Question
2 answers
The pilot study is to detect the prevalence of pathogen in rodents.
Relevant answer
Answer
If your study compares groups, 10-20 animals per group is acceptable. This number can balance logistical convenience and data collection to guide the full-scale study design.
  • asked a question related to Sample Size
Question
4 answers
Hello everyone,
I have research including two objectives,
one of them is to assess the relationship using logistic regression.
another one is comparing two groups using Mann-Whitney U Test.
If I want to apply a sample size formula, do I need to calculate it separately for each objective?
Also, what is the minimum sample size for logistic regression?
Thanks.
Relevant answer
Answer
Thank you for clarifying, Bahar Ysr. PASS has routines for both logistic regression and the Wilcoxon-Mann-Whitney test. So I would estimate the needed sample sizes for both and then choose the larger of the two.
  • asked a question related to Sample Size
Question
5 answers
Any articles on this topic would be appreciated. Thank you.
Relevant answer
Answer
It depends on the area of the field to be sampled and on its homogeneity in terms of topographic characteristics and the physico-chemical composition of the soil.
  • asked a question related to Sample Size
Question
3 answers
If the sample size is 300 and 5 questionnaires are discarded due to errors, are the 5 discarded questionnaires not meant to be replaced?
Just curious
Relevant answer
Answer
It is highly unlikely that the "statistical validity" of your study will be affected by discarding 5 participants from your sample. Yes, you will see a small drop in power, but it is also highly unlikely that it will be enough to make a meaningful difference.
Alternatively, there is a high likelihood that this reply was generated by ChatGPT, rather than by someone who actually knows this field.
  • asked a question related to Sample Size
Question
1 answer
Please, I need assistance in calculating the sample size for a three-arm interventional study using G*Power.
Arm 1: Control group
Arm 2: Standard treatment
Arm 3: Standard plus advanced treatment.
I would be very glad if I could get a standard plan format for estimating that.
Thank you in advance for your assistance.
Relevant answer
Answer
This article link above might help you. Otherwise, the links below may help you perform the calculation. I wish you all the best. Vicki
  • asked a question related to Sample Size
Question
2 answers
For my research, I will retrieve data for each firm (100 firms) over 5 years, leading to 500 data points.
Should this dataset size be sufficient for using a fixed effects model?
Relevant answer
Answer
Chuck A Arize Thank you for the quick response. I want to examine a positive linear relation between two continuous variables. Additionally, I use 6 control variables (mostly continuous).
  • asked a question related to Sample Size
Question
3 answers
In Survey sampling we need to calculate the desired sample size for different kind of scenarios. Hence it will be great help if there is a good resource for the same.
Relevant answer
Answer
One of the commonly used formulas for calculating the ideal sample size from a given universe is the Krejcie and Morgan formula (1970). In fact, one does not even have to calculate it, as there is a pre-calculated table that allows the researcher to identify the ideal sample size simply by looking at the size of the universe.
Reference: Krejcie, R.V., & Morgan, D.W., (1970). Determining Sample Size for Research Activities. Educational and Psychological Measurement.
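For readers who want the number behind the table, here is a minimal Python sketch of the Krejcie and Morgan formula, s = χ²NP(1 − P) / (d²(N − 1) + χ²P(1 − P)), with the values normally used for it (χ² = 3.841 for 1 degree of freedom at 95% confidence, P = 0.5, d = 0.05). The function name is illustrative, and rounding up is my assumption; the published table may round individual entries differently.

```python
import math


def krejcie_morgan(population, chi2=3.841, p=0.5, d=0.05):
    """Required sample size for a finite population of the given size."""
    numerator = chi2 * population * p * (1 - p)
    denominator = d ** 2 * (population - 1) + chi2 * p * (1 - p)
    return math.ceil(numerator / denominator)


for size in (100, 1000, 10000):
    print(size, krejcie_morgan(size))
```

The required sample grows very slowly once the population passes a few thousand, which is why the table flattens out below 400.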
  • asked a question related to Sample Size
Question
7 answers
Hello I am trying to run a moderation analysis but will need to use G*Power to determine my sample size. Just wondering if anyone could assist me with the following:
1) the parameters to set
2) Also what effect size/power should I use?
3) If my outcome measure is pre vs post test change on a questionnaire, would Hayes process macro be a good program to use or should I use SPSS instead?
IV: Intervention
M: Language (3 lvls)
DV: Pre test vs post test of an outcome measure
Thank you.
Relevant answer
Answer
thank you for sharing your perspectives. I think I have only been introduced to G*Power. But it was interesting to hear this perspective. Thank you Bruce for the reading resource as well which I will definitely read.
  • asked a question related to Sample Size
Question
3 answers
What is your opinion on this?
CASE: If you need to include 100 people according to the power calculation and you expect 20% dropout, you need to include a total of 125 people.
QUESTION: Would you stop before 125 (100+20%) if you reach 100 participants that can be included in the analysis due to lower dropout than expected? Or would you continue to 125, i.e. include more than the power calculation?
Relevant answer
Answer
Considering a sample size of 125 or greater is good, provided you have enough time, resources, logistics, and a good flow of participants.
1) It helps if you are having subgroups such as control group, treatment group, and placebo group where equal proportion of participants are recommended else there is a risk of bias.
2) If research is based on repetitions /intervention with the participants (clinical trial), you have to be 100% sure that the same participant will come and meet you after the prescribed time which is always a risk.
3) It also helps if research outcome is based on factors such as age , medical conditions, and other demographics.
4) You may also want to consider the effect of the study on the sample size i.e., outcome may vary / greater /lesser than expected.
Having a higher sample size always keeps you on a safer side.
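The 125 in the question comes from dividing by the expected retention rate, 100 / (1 − 0.20), not from adding 20% (which would give 120 and leave only 96 completers after 20% dropout). A minimal Python sketch of that adjustment; the function name is illustrative, and the dropout rate is passed as a whole-number percentage so the arithmetic stays exact.

```python
import math
from fractions import Fraction


def inflate_for_dropout(n_required, dropout_percent):
    """Number to recruit so that n_required remain after the expected dropout.

    dropout_percent is an integer percentage, e.g. 20 for 20%.
    """
    retained = Fraction(100 - dropout_percent, 100)
    return math.ceil(Fraction(n_required) / retained)


print(inflate_for_dropout(100, 20))
print(inflate_for_dropout(100, 15))
```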
  • asked a question related to Sample Size
Question
6 answers
Hi all
What could be the process of estimating sample size for a cognitive test being developed? The test is not an adaptation of any existing test and has to be used for patients with schizophrenia. In the first stage, I understand that a healthy participant group will be required to have the normative data followed by administration on the clinical group. Hence, what could be the process of estimating the sample size of the healthy control and the clinical groups?
Relevant answer
Answer
Thank you Dr. Morgan.
  • asked a question related to Sample Size
Question
3 answers
Searching for articles or insight on determining the number of vignettes needed for a study using the Factorial Survey Method.
Relevant answer
Answer
You probably know this publication, but just in case you do not (Auspurg & Hinz 2015. Factorial Survey Experiments).
Do you plan on randomly drawing two vignettes from the vignette population for each participant, so that you end up with ratings for all vignettes after all? If that is the case, is it feasible (in terms of the time to complete your survey) that each respondent rates more than two vignettes?
  • asked a question related to Sample Size
Question
9 answers
Dear colleagues, I am looking for advice on the validation of a standard questionnaire that I intend to translate. The original version contains 40 items. Could you please tell me what sample size is required for validation? Thank you in advance for your help.
Relevant answer
Answer
Soper’s sample size calculator may help you decide on the right sample size for your research. This calculation method, which was developed for studies applying structural equation modeling, has recently been accepted by top-tier journals (e.g., https://doi.org/10.1080/13683500.2023.2301458).
Please follow the reference and the link below to benefit from the calculator: Soper, D. (2018). A-priori sample size calculator for structural equation models. https://www.danielsoper.com/statcalc/calculator.aspx?id=89
  • asked a question related to Sample Size
Question
3 answers
I am planning to pass surface water samples through an HLB cartridge. The number of samples is large, so I want to know whether I can reuse the HLB cartridge and, if so, what the procedure for cleaning it would be. Another query: do we need to concentrate the sample on a rotary evaporator after passing it through the cartridge?
Relevant answer
Answer
Would be interested to know
  • asked a question related to Sample Size
Question
3 answers
I have two groups (A and B) each going through 5 repeated measures. I want to know how I can determine eta squared so that I can use it to calculate an a-priori sample size.
Thanks in advance
Relevant answer
Answer
Hi,
Before your study, estimate the effect size for your mixed ANOVA by reviewing similar research or using Cohen's conventions if no prior studies exist. Then, determine your sample size using G*Power, considering your desired power and significance level.
Hope this helps.
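G*Power's ANOVA procedures take Cohen's f as the effect size, so an eta squared value from earlier research has to be converted first using f = sqrt(η² / (1 − η²)). A minimal Python sketch; the function name and the example values are illustrative.

```python
import math


def eta_squared_to_f(eta_sq):
    """Convert (partial) eta squared to Cohen's f."""
    if not 0 <= eta_sq < 1:
        raise ValueError("eta squared must be in the interval [0, 1)")
    return math.sqrt(eta_sq / (1 - eta_sq))


# Eta squared values often paired with Cohen's small, medium and large f
for eta_sq in (0.01, 0.06, 0.14):
    print(eta_sq, round(eta_squared_to_f(eta_sq), 3))
```

These three values land close to the conventional f benchmarks of 0.10, 0.25 and 0.40.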
  • asked a question related to Sample Size
Question
5 answers
Hi everyone,
I have prospective study; the aim is to apply multiple logistic regression.
The sample size for the current study was determined based on a previous study. That study was designed to recruit 105 patients and analyzed 207 samples. But I'm inclined to say that this justification for the sample size is not correct.
The reason is that this new study includes secondary outcomes not covered by the previous research. Am I correct to say that the justification based on previous research is not appropriate? If so, could you guide me in outlining the justification for the sample size?
Thanks,
Relevant answer
Answer
If you know what analysis you are going to perform on the data, then you should perform a power analysis to tell you what sample size is needed.
There are multiple tools available to conduct a power analysis, but you need to find one that fits your planned regression. You can start by looking at the G*Power program and its capabilities (https://www.psychologie.hhu.de/arbeitsgruppen/allgemeine-psychologie-und-arbeitspsychologie/gpower ), but there are others available.
  • asked a question related to Sample Size
Question
5 answers
The primary survey is meant to analyse the social vulnerability status of the population, where sample size is defined at taluk level of the district (study area). Considering factors like physiography and population density of the study area, kindly provide suggestions on how to select the geographical location of samples (households) preferably using GIS tools (other than fishnet and random sampling tools in ArcGIS) or through other scientific or systematic methods. TIA.
Relevant answer
Answer
Hello Arya M. A.
Here are some strategies for selecting geographical locations of samples (households) using GIS tools and other systematic methods:
  1. Stratified Sampling: Divide the study area (taluk level) into smaller strata based on relevant criteria (e.g., physiography, population density, socioeconomic status). Randomly select samples from each stratum to ensure representation across different characteristics.
  2. Systematic Sampling: Create a grid or network of points covering the entire study area. Select a starting point randomly and then choose subsequent points at regular intervals (e.g., every nth point). Adjust the interval based on population density or other factors.
  3. Purposive or Judgmental Sampling: Use expert knowledge or specific criteria to select sample locations. For instance, identify areas with high social vulnerability (based on physiography, poverty levels, etc.) and deliberately sample from those regions.
  4. Spatial Clustering: Identify clusters of households (e.g., villages, neighborhoods) based on proximity or shared characteristics. Randomly select a subset of clusters and then sample households within each cluster.
  5. Geospatial Data Layers: Utilize existing geospatial data layers (e.g., land use, land cover, elevation, road networks) to inform sample selection. Overlay population density maps, physiographic features, and other relevant layers to identify suitable locations.
  6. Buffer Analysis: Create buffers around key features (e.g., rivers, roads, urban centers). Sample households within these buffers to capture variations in vulnerability.
  7. Accessibility and Proximity: Consider accessibility to services (healthcare, education, etc.) when selecting sample locations. Prioritize areas that are easily accessible to survey teams.
  8. Community Engagement: Involve local communities in the sample selection process. Seek input from community leaders, stakeholders, and residents. Their knowledge can guide the identification of relevant sample locations.
In fact, a combination of methods may be most effective. GIS tools, such as ArcGIS or QGIS, can assist in visualizing data, creating spatial layers, and analyzing patterns. Additionally, consider collaborating with experts in geography, sociology, and demography to refine your sampling strategy!
Best Regards,
Ali YOUNES
  • asked a question related to Sample Size
Question
8 answers
For example, If I have to understand the relation between the age and dependent variable (Likert scale data) and my sample size is uneven across the age categories (like in the age category 20-30 years old, there are 10 individuals while in the age category 30-40 years old, there are 20 individuals), what can be done in this case?
Relevant answer
Answer
okay, yes I'll do it, thank youu!!
  • asked a question related to Sample Size
Question
3 answers
Hello
Is there any advanced estimation technique for conducting multiple regression with a small sample size?
Relevant answer
Answer
Hello,
Yes, techniques like bootstrapping or Bayesian regression can be used for multiple regression with small sample sizes, providing robust estimates even with limited data.
Hope this helps.
  • asked a question related to Sample Size
Question
1 answer
My research plan is as follows:
5 organisations are taking part in the project. Their employees will get a questionnaire in the beginning, middle and end (t1, t2, t3) of the project.
However, we will not be recording participant data, and so it is not fully longitudinal and more of a cohort study I believe, because we cannot tell whether the same people take part at each time point.
My plan was to do some type of multilevel model with participants nested within organisations, and to measure the effect of time on 3 outcome variables measured using the questionnaire.
Now a reviewer is asking for a sample size calculation to see how many people I would need to recruit for adequate power.
There are so many different programs (free or paid) as well as R packages that can do these types of analyses, and I am not quite sure what to pick. Any advice would be helpful!
Relevant answer
Answer
Hello,
Your study is suitable for a simulation-based power analysis approach, leveraging packages like simr or powerlmm in R. These tools allow for the flexibility needed to model your specific study design and outcomes accurately.
Hope this helps 
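The answer points to simr and powerlmm in R for the real analysis. The Python sketch below only illustrates the general idea of simulation-based power for a far simpler case: two independent groups, a normally distributed outcome, and a large-sample z criterion. It does not model participants nested within organisations, and every name and parameter value in it is hypothetical.

```python
import random
from statistics import NormalDist


def _mean_and_variance(values):
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, v


def simulated_power(n_per_group, effect_size, alpha=0.05, n_sims=2000, seed=1):
    """Share of simulated studies in which the group difference is detected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(n_sims):
        # Simulate one study under the assumed effect (in SD units)
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(effect_size, 1.0) for _ in range(n_per_group)]
        mean_a, var_a = _mean_and_variance(a)
        mean_b, var_b = _mean_and_variance(b)
        se = ((var_a + var_b) / n_per_group) ** 0.5
        if abs(mean_b - mean_a) / se > z_crit:
            hits += 1
    return hits / n_sims


power = simulated_power(n_per_group=64, effect_size=0.5)
print(f"estimated power: {power:.2f}")
```

The same loop structure carries over to a multilevel design: replace the data-generating step with one that draws organisation-level and participant-level effects, fit the planned model to each simulated data set, and count how often the effect of time is detected.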
  • asked a question related to Sample Size
Question
3 answers
Because the population is small, the researcher will use the entire population as the sample. According to…. … state that if the population size is small the researcher can use the entire population. Researchers choose to study the entire population because the size of the population that has the particular set of characteristics of interest is typically very small.
Usually, it is only straightforward to collect data from a whole population when it is small, accessible and cooperative.
Relevant answer
Answer
In terms of providing reliable statistical results, conducting a census of the entire population is ideal. By contrast, using a statistical sample is subject to sampling error. We sample for practical reasons: e.g., to limit the cost, to limit the burden on respondents, and because a complete list of the entire population often is not available. Also, only a relatively small sample is needed to produce reliable results, generally providing little reason to go to the extra time and expense of conducting a census.
  • asked a question related to Sample Size
Question
2 answers
In my PhD I need
Relevant answer
Answer
One way would be to:
(1) Define your question
(2) Develop the most appropriate method to answer your research question
(3) Specify the minimum size of the effect/ difference between groups you wish to be able to detect
(4) Determine the level of statistical significance you wish for your test (p < 0.05 is common)
(5) Determine the level of statistical power you wish your test to have (80% or 90% power is normal, less than 60% would be of very little use)
(6) Use all of the above to carry out a power calculation to determine the required minimum sample size to meet all of the above. (There are online statistical power calculators and resources, for example: https://stats.oarc.ucla.edu/other/gpower/)
If you are not too sure what these steps involve I would take each one in turn, and speak to your PhD supervisor and/ or read up on each step.
Or you could plug lots of assumptions into a sample size calculator (many such sample size calculators are online), but you still need to think about all of the above.
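Steps (3) to (6) can be sketched in a few lines for the simplest case, a two-sided comparison of two group means, using the normal approximation n per group = 2(z₁₋α/₂ + z_power)² / d², where d is the smallest standardized difference worth detecting. This is a minimal Python sketch under those assumptions; the function name and the example values are illustrative.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided two-sample comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # step (4): significance level
    z_power = NormalDist().inv_cdf(power)          # step (5): desired power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size_d ** 2)


print(n_per_group(0.5))              # 80% power
print(n_per_group(0.5, power=0.90))  # 90% power
```

Dedicated tools such as G*Power use the t distribution rather than the normal approximation, so their answers are typically one or two participants per group larger.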
  • asked a question related to Sample Size
Question
3 answers
In my experimental design I have 4 treatments, 3 replicates per treatment and 3 blocks. In each plot I measured whether a plant is infested or not ("Infestate" variable). This measure has been performed to 30 to 40 plants placed at the centre of the plot. Sampling has been performed weekly (variable "Data_rilievo) on the same plants, even though the sample size might vary if some plants die. Treatment does not influence plant death. Thus, I removed from the dataset the observations resulted in plant death.
I obtained the following dataset:
'data.frame': 2937 obs. of 15 variables:
 $ ID_pianta   : chr "_Pianta_1" "_Pianta_2" "_Pianta_3" "_Pianta_4" ...
 $ Data_rilievo: POSIXct, format: "2023-11-14" "2023-11-14" "2023-11-14" ...
 $ Blocco      : num 2 2 2 2 2 2 2 2 2 2 ...
 $ Trattamento : chr "Controllo" "Controllo" "Controllo" "Controllo" ...
 $ Infestate   : num 1 0 0 1 0 1 0 0 1 0 ...
I opted for a mixed-effect model with treatment as fixed effect, plant ID ("ID_pianta") as random effect to account for repeated measures, and block ("Blocco") as random effect.
And this is the result
> summary(model)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
 Family: binomial ( logit )
Formula: Infestate ~ Trattamento + (1 | ID_pianta) + (1 | Blocco)
   Data: data

     AIC      BIC   logLik deviance df.resid
  3835.8   3871.7  -1911.9   3823.8     2931

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.1969 -1.0611  0.6139  0.8091  1.5079

Random effects:
 Groups    Name        Variance Std.Dev.
 ID_pianta (Intercept) 0.16880  0.4108
 Blocco    (Intercept) 0.09686  0.3112
Number of obs: 2937, groups: ID_pianta, 40; Blocco, 3

Fixed effects:
                     Estimate Std. Error z value Pr(>|z|)
(Intercept)           0.59808    0.20650   2.896 0.003776 **
TrattamentoLavanda   -0.16521    0.11116  -1.486 0.137218
TrattamentoRosmarino -0.02389    0.11000  -0.217 0.828075
TrattamentoTimo      -0.37733    0.11017  -3.425 0.000615 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) TrttmL TrttmR
TrttmntLvnd -0.266
TrttmntRsmr -0.269  0.502
TrattamntTm -0.269  0.499  0.504
I wanted also to check the predictive abilities. I used this code
library(caret)
data$Infestate <- factor(data$Infestate, levels = c(0, 1))
# Convert predicted probabilities to binary predictions using a threshold
binary_predictions <- ifelse(predicted_probabilities > 0.5, 1, 0)
# Convert binary_predictions to a factor with levels 0 and 1
binary_predictions <- factor(binary_predictions, levels = c(0, 1))
# Create a confusion matrix
conf_matrix <- confusionMatrix(data$Infestate, binary_predictions)
print(conf_matrix)
And these are the results:
Confusion Matrix and Statistics

          Reference
Prediction    0    1
         0 1811   28
         1  751   55

               Accuracy : 0.7055
                 95% CI : (0.6877, 0.7228)
    No Information Rate : 0.9686
    P-Value [Acc > NIR] : 1

                  Kappa : 0.0709

 Mcnemar's Test P-Value : <2e-16

            Sensitivity : 0.70687
            Specificity : 0.66265
         Pos Pred Value : 0.98477
         Neg Pred Value : 0.06824
             Prevalence : 0.96862
         Detection Rate : 0.68469
   Detection Prevalence : 0.69527
      Balanced Accuracy : 0.68476

       'Positive' Class : 0
It seems the model is good at predicting negatives, but it predicts 751 false positives. How should I deal with this? Can the model be considered a good predictor? How can I increase its predictive ability?
Relevant answer
Answer
Dursa Hussein Thank you ChatGPT
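A note on reading the table in the question: caret's `confusionMatrix()` takes the predicted classes as its first argument (`data`) and the observed classes as `reference`, whereas the posted call passes them the other way round. If that is what happened, the rows labelled "Prediction" are really the observed infestation status and the columns labelled "Reference" are the model's predictions. A minimal Python sketch that recomputes the usual metrics under that assumption, treating "infested" (1) as the positive class; the variable names are illustrative.

```python
# Cell counts copied from the posted table. Assumption: rows are the observed
# status and columns are the model's predictions (arguments were swapped).
obs0_pred0, obs0_pred1 = 1811, 28  # plants observed as not infested
obs1_pred0, obs1_pred1 = 751, 55   # plants observed as infested

total = obs0_pred0 + obs0_pred1 + obs1_pred0 + obs1_pred1

# "Infested" (1) is the positive class
sensitivity = obs1_pred1 / (obs1_pred0 + obs1_pred1)  # infested plants detected
specificity = obs0_pred0 / (obs0_pred0 + obs0_pred1)  # healthy plants cleared
accuracy = (obs0_pred0 + obs1_pred1) / total

# Accuracy of always predicting the more common observed class
majority_rate = max(obs0_pred0 + obs0_pred1, obs1_pred0 + obs1_pred1) / total

print(f"sensitivity   = {sensitivity:.3f}")
print(f"specificity   = {specificity:.3f}")
print(f"accuracy      = {accuracy:.3f}")
print(f"majority rate = {majority_rate:.3f}")
```

Read this way, the 751 cases are infested plants that the model classified as not infested, and the overall accuracy is only about one percentage point above always predicting "not infested".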
  • asked a question related to Sample Size
Question
4 answers
In an experimental study, should the experimental and control groups be divided equally? For example, 25 in the control group and 25 in the experimental group. Can there be a difference of + or - 1?
Relevant answer
Answer
The experimental and control groups should be roughly equal in size, but it's not essential that they are exactly equal.
The more important question is whether the residuals are roughly normally distributed in each group - as most traditional NHST tests make this assumption.
  • asked a question related to Sample Size
Question
1 answer
Dear Friends!
The gold-standard for identifying an allele for human leukocyte antigen is SBT or sequence-based typing. If I devise a new PCR test or LAMP test, how many known positive and negative samples (known by SBT) do I take. What formula do I use?
Relevant answer
Answer
A ROC curve is typically a good start. For sample size determination, you need some preliminary data regarding assay specificity/sensitivity, and you can then use that data in, e.g., G*Power to determine the sample size.
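One precision-based way to size the two panels separately is n = z²·Se(1 − Se)/d² known positives and n = z²·Sp(1 − Sp)/d² known negatives, where Se and Sp are the anticipated sensitivity and specificity and d is the acceptable half-width of the confidence interval. A minimal Python sketch; the function name and the example values (Se = 0.95, Sp = 0.98) are hypothetical and would come from preliminary data in practice.

```python
import math
from statistics import NormalDist


def panel_size(expected_value, half_width, confidence=0.95):
    """Samples needed to estimate a sensitivity or specificity to +/- half_width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * expected_value * (1 - expected_value) / half_width ** 2)


# Hypothetical planning values
n_known_positive = panel_size(expected_value=0.95, half_width=0.05)
n_known_negative = panel_size(expected_value=0.98, half_width=0.03)
print(n_known_positive, n_known_negative)
```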
  • asked a question related to Sample Size
Question
5 answers
I am using a multi-stage sampling technique for my study in Kathmandu district, Nepal, which consists of 1 metropolitan city and 10 Municipality. I have randomly selected Kathmandu metropolitan city (KMC) (due to budget and logistic constraints). KMC further consists of 32 wards of which I have selected one Ward (No.16) randomly (resource constraints).
My sample size is around 437, calculated using Taro Yamane formula. However, I do not have the list of households in Ward #16. In such a situation, which sampling technique will be appropriate.
If cluster sampling is to be used, how should the clusters be made as the clusters would not be homogeneous?
Relevant answer
Answer
Sure, you can conduct using Area Sampling or Cluster Sampling:
Steps
1. Define the study Area - this is geographical boundaries of your study area.
2. Divide the Area into Clusters e.g city blocks, streets, neighborhoods, based on the size and nature of your study area.
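For reference, the Taro Yamane formula mentioned in the question is n = N / (1 + N·e²), where N is the population size and e the margin of error. A minimal Python sketch; the function name and the example population are illustrative, since the population of Ward 16 is not given in the thread.

```python
import math


def yamane_sample_size(population, margin_of_error=0.05):
    """Taro Yamane sample size: n = N / (1 + N * e^2)."""
    return math.ceil(population / (1 + population * margin_of_error ** 2))


print(yamane_sample_size(10_000))         # hypothetical population of 10,000
print(yamane_sample_size(1_000_000_000))  # approaches the ceiling of 1 / e^2
```

With e = 0.05 the formula can never return more than 1 / e² = 400, so a figure of 437 implies either a margin of error slightly below 5% or an allowance for non-response added afterwards.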
  • asked a question related to Sample Size
Question
2 answers
My students and I recently completed a pilot intervention study for anxiety in Division 1 volleyball players. We studied the players from one volleyball team over the course of a season, measuring self-reported anxiety before the season, 8 weeks into the season, and one month after the end of the season. We had three groups, two intervention groups and one control group. The team has 21 players, so we started with 7 in each condition. Due to drop out across the intervention, the 8-week time point had 6, 5, and 4 players in each group, and the post-season time point had 6, 4, and 3 players in each group. My question is whether we could still use Hedges' g as an initial measure of "effect" difference in the various measures or whether our very small groups preclude us from using this statistic. Any references either way would be appreciated!
Relevant answer
Answer
As far as I know, nothing about the small sample size makes the effect size estimate invalid, but you might calculate confidence intervals for it. The interval might be quite wide.
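To make the point about wide intervals concrete, here is a minimal sketch of Hedges' g with an approximate confidence interval. It assumes the usual small-sample correction factor and a large-sample normal approximation for the interval; the scores are invented for illustration (group sizes 6 and 4, as at the 8-week time point) and are not data from the study.

```python
import math
from statistics import NormalDist, mean, stdev


def hedges_g_with_ci(group_a, group_b, confidence=0.95):
    """Return (g, lower, upper) for two independent groups."""
    n1, n2 = len(group_a), len(group_b)
    s1, s2 = stdev(group_a), stdev(group_b)
    # Pooled standard deviation across the two groups
    pooled_sd = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    d = (mean(group_a) - mean(group_b)) / pooled_sd  # Cohen's d
    # Small-sample correction factor (common approximation)
    j = 1 - 3 / (4 * (n1 + n2 - 2) - 1)
    g = j * d
    # Approximate variance of d, then standard error of g
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    se_g = j * math.sqrt(var_d)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return g, g - z * se_g, g + z * se_g


# Hypothetical anxiety scores, invented for illustration only
intervention = [32, 28, 35, 30, 27, 31]
control = [36, 38, 33, 40]
g, lower, upper = hedges_g_with_ci(intervention, control)
print(f"g = {g:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

With groups this small the interval spans well over one standard deviation, which is why the estimate is best reported together with its interval rather than on its own.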
  • asked a question related to Sample Size
Question
2 answers
Hi, I'm looking to calculate the sample size for my study (n = 655).
This is an awareness study using the HAIS-Q questionnaire. I want to know what settings in G*Power I can use for this calculation. Thanks in advance.
Relevant answer
Answer
If you already have a sample collected and know the sample size, what would you want to compute in G*Power? Also, what type of statistical analysis are you looking at?
  • asked a question related to Sample Size
Question
6 answers
Can someone please assist me with a sample size calculation for an RCT with 2 groups, control and intervention? Is there a method utilizing ANCOVA? Which software is best? Assume that all assumptions for running the ANCOVA are met.
Thank you kindly
Hashim
Relevant answer
Answer
I have found ANCOVA a bit of a pain to do power calculations for. In the past I have used simulations, which is flexible but time consuming. More recently I've used Superpower in R:
  • asked a question related to Sample Size
Question
1 answer
Good afternoon, everyone!
I plan to investigate the effect of Trendelenburg position on the quantitative measure "X". We have the following results:
1) Study 1: 30 patients in supine position have the value of the parameter "X" = 69 ± 10, same patients in Trendelenburg position: "X" = 75 ± 12. The time after which values of the parameter "X" were recorded after the start of the Trendelenburg position was 1 minute.
2) Study 2: 40 patients in the supine position: "X" = 86 ± 11, same patients in the Trendelenburg position: "X" = 105 ± 16. The time after which values of the parameter "X" were recorded after the start of the Trendelenburg position was 1 minute.
AND same 40 patients in the Trendelenburg position have the value of the parameter "X" = 95 ± 14, BUT the time after which values of the parameter "X" were recorded after the start of the Trendelenburg position was 10 minutes, instead of 1 min.
In fact, we have 3 observations for one outcome, but 2 of them are from the same cohort of patients in the Study 2.
How can we take this into account to adjust the "weight" of each of the two results in the second study?
Would it be correct to reduce the sample size for the second and third observations to 20 (40/2) when entering primary data for meta-analysis into programs such as Stata or RevMan?
Relevant answer
Answer
Good afternoon, I have only used RevMan once, a long time ago, to add a study done in Guinea-Bissau to the Cochrane review on antibiotics for the prevention of the complications associated with measles. I can't recall a lot of this now; it was a long time ago, and that review has since been redone, so my name was removed as co-author [I didn't assist with the last review update]. However, from my limited knowledge of statistics, I wouldn't reduce the sample to 20. Even though it is the same study group measured at a different time point, it is more of a repeated measure with one part of the measurement changed. I don't really know the answer to the question, sorry.
  • asked a question related to Sample Size
Question
7 answers
Can anyone here help me with a biostatistics question? It is about finding the sample size from a power analysis. I have the variables and just need assistance with the calculations.
  • asked a question related to Sample Size
Question
4 answers
How might I calculate the sample size for an experimental design in medical research, peripheral sodium channel block, and stabilization exercises?
Relevant answer
Answer
I suggest you look into the concept of statistical power (i.e., the ability to detect an effect), and more specifically into the G*Power program, which translates your desired level of power into a sample size.
  • asked a question related to Sample Size
Question
1 answer
Various formulas have been proposed in recent years for sample size determination. The Krejcie and Morgan sample size determination table has also been in use for some years now. I would like to learn whether it is still appropriate, and in line with current trends and standards, to use the table as a reference for sample size determination. Thank you.
Relevant answer
Answer
Yes, the Krejcie and Morgan formula is still widely used, although there are a few software packages/sites that give the sample size based on the size of the universe.
Studies that have been published as recently as January, 2024 have used this formula-
  • asked a question related to Sample Size
Question
4 answers
If the reviewer wants linear regression, and I have one non-parametric dependent parameter (sample size 10) and ten parametric parameters (sample size 5 each),
can I perform Pearson correlation and linear regression?
Relevant answer
Answer
@ Vaisakh Venu
Thanks a lot for your help. If normalization is done for the non-parametric dependent parameter but two of the ten parametric parameters are not homogeneous, can the multiple linear regression result be reliable?
Thanks once again Vaisakh Venu.
  • asked a question related to Sample Size
Question
3 answers
I want to do a cross-sectional study in osteogenesis imperfecta patients. The prevalence is about 6.5 per 100,000 live births. Kindly help me calculate the sample size.
Relevant answer
Answer
I am not sure a cross-sectional study makes much sense with a specific rare disease. The quantification will always require an inordinate number of people. If you go after general diseases in aggregate, it would make more sense as we know that about 5% of the population is stricken.
Regarding the specific disease you mention, here is a pointer to an association who deals with it in Pakistan:
  • asked a question related to Sample Size
Question
3 answers
Beyond cross-sectional studies, I wish to learn how to calculate the sample size for interventional studies, as well as how to calculate sample size using the standard deviation, etc.
Relevant answer
Answer
I have used the sample size charts (and desired power) from charts in this book (Chapter 13 and appendices):
Applied Statistics for the Behavioral Sciences (5th Edition)
by Dennis E. Hinkle, William Wiersma and Stephen G. Jurs.
Hope this is helpful.
  • asked a question related to Sample Size
Question
3 answers
I want to calculate the sample size for my study on the education of nomadic children in Lahore, Pakistan, but no data regarding population size are available. How can I calculate it? Please also give me a reference. Thanks
Relevant answer
Answer
Here is a general approach:
  • Identify the confidence level.
  • Determine the margin of error.
  • Estimate the population standard deviation.
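The three ingredients listed above are the inputs to the standard formula for estimating a mean when the population size is unknown or very large, n = (z·σ/E)². The following is a minimal sketch under that assumption; the function name and the example values (standard deviation 15, margin of error 2) are hypothetical and only illustrate the arithmetic.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(confidence, margin_of_error, sd):
    """Sample size to estimate a mean, assuming a very large population."""
    # Two-sided critical value for the chosen confidence level
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Round up, since a fraction of a participant cannot be sampled
    return math.ceil((z * sd / margin_of_error) ** 2)


# Hypothetical inputs: 95% confidence, margin of error 2, estimated SD 15
print(sample_size_for_mean(0.95, 2, 15))  # 217
```

No population size appears in this formula, which is why it can be used when that figure is not available.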
  • asked a question related to Sample Size
Question
3 answers
The sample size for a BE study increases with the desired power. However, the proposed/accepted power in general is 80-90%. Considering a power of more than 90% may raise the issue of 'Forced Bioequivalence' with regulators.
Can anyone please explain clearly the term 'Forced Bioequivalence', associated with a larger sample size due to a high power consideration?
  • asked a question related to Sample Size
Question
5 answers
Dear researchers,
Greetings. I am trying to collect data from an organization. Suppose the number of employees in the organization is 1000. What would be the optimum sample size from these 1000 people (that is, I won't get a response from everyone, so what is the optimum number of responses to aim for)? I am trying to gather information on employees. 1. There are different formulas for the optimum sample size; can you please list them and recommend the best one (with recent publication references)?
2. Also, can you please let me know how I can eliminate the possibility of bias in the dataset?
Relevant answer
Answer
Controlling bias is also possible and is directly related to the researcher.
Conduct a random lottery to ensure that there is no bias in principle.
When conducting data collection, adhere to impartiality and scientific honesty.
  • asked a question related to Sample Size
Question
3 answers
sample size required to collect responses on a questionnaire.
Relevant answer
Answer
There will be many references on the Internet.
That said, it depends on what you want the sample for and, as posted above, on other factors concerned with your research project.
  • asked a question related to Sample Size
Question
8 answers
One of the many assumptions in ANOVA is that our data follow a normal distribution. Usually in biology, experiments are carried out in triplicate and the available data set is very small, not > 30, which I think is the standard sample size for assessing normality. Anyone who is familiar with the use of statistics in phytochemistry, microbiology, pathology or any relevant field, kindly answer the following questions.
1. Is it necessary to carry out tests of normality such as the Shapiro-Wilk normality test to confirm that our data follow a normal distribution?
2. With such a small sample size, is it possible to carry out these tests?
3. Is it true that testing for normality is unnecessary in biological experiments?
4. Can I safely assume that my data will follow a normal distribution without any of these tests?
5. Which statistical software is best for a beginner?
Relevant answer
Answer
Checking the normality of your data is important when using parametric statistical tests like t-tests and ANOVA. These tests make certain assumptions about the distribution of the data, and if those assumptions are violated, it can affect the validity of your results.
For t-tests and ANOVA, normality is particularly important when dealing with smaller sample sizes. However, these tests are somewhat robust against violations of normality, especially with larger sample sizes (typically, n > 30 is considered reasonably robust).
If normality is a concern, you might consider non-parametric tests like the Mann-Whitney U test (for two groups) or the Kruskal-Wallis test (for multiple groups) as alternatives.
In summary, while it's a good practice to check for normality, the impact of deviations from normality depends on your sample size and the specific assumptions of the statistical test you're using. If in doubt, consulting with a statistician or using non-parametric tests may be appropriate.
  • asked a question related to Sample Size
Question
3 answers
Is it necessary to integrate an estimated non-response rate when finalizing the sample size after conducting the power calculation? How to interpret?
For instance,
Two-sample t test power calculation
n = 213.1237
d = 0.35
sig.level = 0.05
power = 0.95
alternative = two.sided
NOTE: n is number in *each* group
So, we need 214 for each group
If we integrate an estimated non-response rate (e.g., 80%):
So, we need 214/0.2 = 1070 for each group
*By the end of the study, let's say we get 600 in each group. Are the results of the statistical analyses reliable and meaningful?
Relevant answer
Answer
Yes, if you have to expect so much missing data, you would have to oversample massively to retain sufficient power given the 80% non-response rate.
That being said, I would be very concerned if I had to expect an 80% non-response rate. Why would so many responses be missing? What would the mechanism of (so much) missingness be? It is difficult to imagine that your remaining sample would be representative of the target population or that the missing cases would be "missing at random" or "missing completely at random."
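The arithmetic in the question can be sketched as follows. This is an illustration, not the routine that produced the output quoted above: it uses a normal approximation to the two-sample t test, so it lands slightly below the t-based figure of 213.12, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_two_sample(d, alpha=0.05, power=0.95):
    """Normal-approximation n per group for a two-sided two-sample test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * ((z_alpha + z_beta) / d) ** 2


def inflate_for_nonresponse(n_required, nonresponse_percent):
    """Number to invite so that n_required responses are expected.

    The rate is passed as an integer percentage so that the division
    is not affected by floating-point rounding.
    """
    return math.ceil(n_required * 100 / (100 - nonresponse_percent))


print(round(n_per_group_two_sample(0.35), 2))  # 212.16 (approximation)
print(inflate_for_nonresponse(214, 80))        # 1070 per group
```

Obtaining 600 respondents per group still exceeds the 214 needed for the stated power, so the remaining concern is the one raised above: whether those who respond are representative.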
  • asked a question related to Sample Size
Question
1 answer
I am conducting a systematic review and meta-analysis of many studies with overlapping populations. The Review Manager 5.4 software does not require a sample size to perform a meta-analysis of hazard ratios, so I thought it may be possible to pool data from overlapping populations.
Relevant answer
Answer
I am not sure what you mean by the term "overlapping". From my perspective, it seems that the same patients could be counted twice as separate study findings? This type of problem is described in section 6.2.4 of the Cochrane training book (Chapter 6: Choosing effect measures and computing estimates of effect | Cochrane Training).
This situation can be accounted for in the meta-analysis and would affect the standard error of the estimate (a hierarchical model accounting for repeated measurement would increase the standard error of your pooled estimate). Practical information can be found in Chapter 10 “Multilevel” Meta-Analysis | Doing Meta-Analysis in R (bookdown.org), using the metafor package in the R environment.
  • asked a question related to Sample Size
Question
3 answers
When undertaking clinical trials with limited sample size and repeated measures, the question arises: which is more advantageous—employing "Generalized Estimating Equations (GEE)" or opting for "non-parametric analysis"? We seek insights from experts to elucidate the respective strengths and limitations of each approach.
Relevant answer
Answer
Hi,
These papers note the strengths and limitations of GEE and nonparametric methods for analysing repeated measures in clinical trials. Paper 2 emphasises GEE's data optimisation and power but notes its reliance on covariance assumptions, which is problematic in small samples. Paper 1 presents nonparametric analysis as less assumption-dependent and effective for detecting specific effects, though generally less efficient than GEE. While GEE excels in efficiency and power, nonparametric methods are preferable for small samples where assumptions may fail. Using simulations to mirror actual data can guide the choice between efficiency and power and assumption dependency.
Hope this helps.
  • asked a question related to Sample Size
Question
2 answers
This is because the sample size is small and some specific characteristics are required.
The inclusion and exclusion criteria were indicated. I acknowledge that this might limit the generalisability of the findings.
Relevant answer
Answer
If you want to have separate experimental and control groups, you can use "random assignment" from the original sample. But be sure your overall sample size is large enough to ensure that you have the power to detect significance. If you are not familiar with assessing the power of a test, the most widely used tool is G*Power.
  • asked a question related to Sample Size
Question
6 answers
Hello,
I plan to conduct a cross-sectional survey among mothers (caregivers) in one of the rural regions of my country. The survey will assess mothers'/caregivers' knowledge and practices regarding breastfeeding, child nutrition, and health. The study does not have a control group. Is this information enough to understand the formula required for the sample size calculation?
I really appreciate any help you can provide.
Relevant answer
Answer
Calculating the sample size for a cross-sectional survey study involves several factors, including the desired level of precision, the confidence level, the estimated variability in the population, and the margin of error you find acceptable. The formula commonly used for estimating the sample size in a cross-sectional survey is based on the formula for calculating the sample size in a population proportion, and it's given by:
What is the formula for the sample size of a population proportion?
$n = \dfrac{N X}{X + N - 1}$, where $X = \dfrac{Z_{\alpha/2}^{2}\, p\,(1 - p)}{\mathrm{MOE}^{2}}$, and $Z_{\alpha/2}$ is the critical value of the Normal distribution at $\alpha/2$ (e.g. for a confidence level of 95%, $\alpha$ is 0.05 and the critical value is 1.96), MOE is the margin of error, p is the sample proportion, and N is the population size.
Here's a step-by-step guide on how to use this formula:
  1. Determine the Confidence Level (CL): Common confidence levels are 95% or 99%. The corresponding Z-scores are approximately 1.96 for 95% CL and 2.576 for 99% CL.
  2. Estimate the Proportion (p): If you have no prior estimate, you can use 0.5, which maximizes p(1-p) and therefore the required sample size, making it the most conservative choice.
  3. Choose the Margin of Error (MOE): The margin of error is the maximum acceptable difference between the sample estimate and the true population parameter. Common choices are 0.05 or 0.01.
  4. Plug the values into the formula: Substitute the values into the formula mentioned above and calculate the required sample size.
Keep in mind that this formula assumes a simple random sample. The N terms in the formula apply the finite population correction, which matters when the population is small; when the population is much larger than the sample (say, at least 10 times larger), the correction has little effect and n is close to X.
Also, if your study involves multiple subgroups or strata, you may need to calculate sample sizes for each stratum and then combine them.
Remember, calculating the sample size is just one step in the planning process. Other considerations, such as budget constraints, feasibility, and the potential for non-response, should also be taken into account in the design of your survey.
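The steps above can be sketched as a short function. This is a minimal illustration of the formula quoted in this answer; the function name, the default values and the two population sizes are hypothetical examples.

```python
import math
from statistics import NormalDist


def sample_size_proportion(population_size, p=0.5, moe=0.05, confidence=0.95):
    """n = N*X / (X + N - 1), with X = z^2 * p * (1 - p) / MOE^2."""
    # Critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    x = z**2 * p * (1 - p) / moe**2
    n = population_size * x / (x + population_size - 1)
    return math.ceil(n)


print(sample_size_proportion(1_000))      # 278
print(sample_size_proportion(1_000_000))  # 384
```

The two calls show the effect of the population size: for a population of 1,000 the requirement drops to 278, while for a very large population it approaches the familiar figure of about 384.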
  • asked a question related to Sample Size
Question
3 answers
Leslie Kish formula or Leslie Fischer's formula? What is the actual nomenclature? Are these names used interchangeably? The literature isn't helping to clarify at all.
Relevant answer
Answer
Andrew Paul McKenzie Pegman great advice, why bother with sample size calculations, power considerations, or taking into account model complexity or sampling strategies? One size fits all. Shame on all researchers who dared to have larger sample sizes, what a waste of resources and especially money. *irony off*
  • asked a question related to Sample Size
Question
6 answers
The type of mixed method is sequential explanatory. I'm planning to conduct an experiment, and then explore how they experienced the phenomenon through IPA. How many participants would be enough for the quantitative part of the research? And do I have to include every participant during the qualitative part? Could I just select a few? 3-6, maybe?
Relevant answer
Answer
I have no idea where people are getting the now common recommendation of 3-6 participants for the qualitative strand of an explanatory sequential design. Yes, 3-6 is standard for IPA, but explanatory sequential designs are explicitly based on using the quantitative findings as a pre-existing foundation for the qualitative study, so I think that approach would seldom make IPA a good choice.
The classic standard for the size of a qualitative study is to achieve saturation, i.e., the point at which new cases no longer add any information. This is difficult to predict in advance, so planning for too few cases can be a serious problem.
  • asked a question related to Sample Size
Question
5 answers
I'm doing a meta-analysis for continuous outcomes, and most of the data from the studies are reported as median and IQR (as the difference Q3-Q1). I already know the Wan (2014) and McGrath (2021) methods, but those are only applicable when the exact values of the quartiles are available. I was thinking about using IQR/1.35 to get the SD and assuming a normal distribution so that median = mean, and then running a sensitivity analysis to evaluate the effect of using those studies as presented.
Relevant answer
Answer
Hi Raid Amin. As I understood it, Mariano Gallo Ruelas has no raw data for any of the studies. For many of them, he has only the median and Q3-Q1, but not the values of Q1 and Q3. And as I understood it, the two possible approaches he mentioned (Wan 2014; McGrath 2021) require the exact values of Q1 and Q3, not just the difference between them.
Mariano Gallo Ruelas, I have no firsthand experience with the problem you described, but a Google search took me to the documentation for this R-package:
Notice that some more articles are mentioned in the Description section. Some of them may be helpful. See also the Reference list.
HTH.
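For reference, the IQR/1.35 shortcut mentioned in the question can be sketched as below. It rests entirely on the assumption that the outcome is normally distributed, in which case the IQR width is about 1.349 standard deviations and the median equals the mean; the function name and the example values are hypothetical.

```python
from statistics import NormalDist

# Under normality, Q3 - Q1 = 2 * z(0.75) * SD, which is about 1.349 * SD
IQR_TO_SD = 2 * NormalDist().inv_cdf(0.75)


def approx_mean_sd(median, iqr_width):
    """Approximate (mean, SD) from a median and an IQR width (Q3 - Q1)."""
    # Assumes a symmetric, normal outcome: mean is taken equal to the median
    return median, iqr_width / IQR_TO_SD


# Hypothetical study reporting median 12 and IQR width 5.4
mean_est, sd_est = approx_mean_sd(12, 5.4)
print(round(IQR_TO_SD, 3), round(sd_est, 2))
```

Because the approximation degrades for skewed outcomes, the sensitivity analysis proposed in the question (with and without the converted studies) remains a sensible safeguard.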
  • asked a question related to Sample Size
Question
3 answers
I am conducting a cross-sectional study using the NHANES database. I used the "Full.sample.2.year.interview.weight" to weight the data. However, I do not know where I should put the decimal point to finish my data analysis.
The total sample size before weighting is 5639 participants.
If I put the decimal after 5 numbers, the sample size will be decreased by 50% and will be around 2700 participants.
If I put the decimal after 4 numbers, the sample size will be increased by 200% and will be more than 10,000 participants.
I am attaching a screen shot after weighting the data.
I used SPSS for analysis, tab (data), weight by "Full.sample.2.year.interview.weight".
Relevant answer
Answer
DISCLAIMER: I do not consider myself an expert on analysis of complex survey data, and I had no direct experience with NHANES. However, my Google search turned up this page, which you might find helpful.
What I read there makes me wonder if your Full.sample.etc. variable in the image you uploaded is the one that is called WTINT2YR on the CDC web page. I hope this helps move things along. And I hope that someone with relevant direct knowledge of NHANES jumps in!
  • asked a question related to Sample Size
Question
3 answers
The population of this research study is students from a specific school, and 6 strata were identified. Three strata met the calculated sample size required; however, the other 3 unfortunately did not, despite sending out the surveys again and again. Additionally, a majority of the students were minors, and there were some whose parents did not give consent for the survey.
With this, 3 strata were not able to meet the calculated sample size. Should we proceed with statistical analyses (with 3 reaching the calculated size, and the other 3 not) and write this up as a limitation? Or what other actions should we take instead?
Relevant answer
Answer
By "calculated sample size required," I assume you refer to a power analysis done before data collection to calculate the minimum sample size required for obtaining statistical significance with a given probability of Type 1 error and a given minimum effect size. If so, then yes, you can proceed with the analysis, while acknowledging that the shortfall in desired sample size will necessarily increase the probability of Type 2 error.
You should also up-weight the respondents in low-response strata to compensate for under-representing the subpopulations in those strata. Unless your sampling involved strictly proportionate stratification, you would have to do some weighting even if all strata had the same response rate.
Of greater concern is whether the low response rate in some strata reflects a non-random bias in non-response within those strata. If you can obtain subpopulation figures for each stratum from other sources on key demographics (like age, gender, etc.), you can adjust (to some degree) for such bias by doing post-stratification weighting within each stratum, to make your sample demographic marginal percentages match the known subpopulation values. If you only have demographics for the whole population, you could do post-stratification weighting for the sample as a whole rather than within each stratum, although the within-stratum approach would be better.
By the way, with the population being students from a single school, unless that school has a huge enrollment, both your power analysis and your statistical inference should include the Finite Population Correction (as you may already know).
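The up-weighting and the Finite Population Correction described above can be sketched as follows. This is a simplified illustration that assumes simple random sampling within each stratum; the stratum names and counts are invented, and the helper names are hypothetical.

```python
import math


def stratum_weights(population_by_stratum, respondents_by_stratum):
    """Weight per respondent = stratum population / stratum respondents."""
    return {
        stratum: population_by_stratum[stratum] / respondents_by_stratum[stratum]
        for stratum in population_by_stratum
    }


def finite_population_correction(population_size, sample_size):
    """Multiplier applied to a standard error when sampling a finite population."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))


# Hypothetical enrollment and respondent counts for three strata
population = {"grade_7": 200, "grade_8": 180, "grade_9": 220}
respondents = {"grade_7": 50, "grade_8": 30, "grade_9": 55}

weights = stratum_weights(population, respondents)
print(weights)  # respondents in the low-response stratum get the largest weight
print(round(finite_population_correction(600, 135), 3))
```

Each respondent stands in for as many students as the weight indicates, so the weighted counts add back up to the stratum populations. Correcting for non-random non-response within a stratum would need the post-stratification step described above, which this sketch does not attempt.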
  • asked a question related to Sample Size
Question
1 answer
I need to make my sample size calculation without a pilot study, my n=10, but how can I prove that I have this number?
Relevant answer
Answer
You say your N = 10. If that is actually the population size (i.e., the total number of possible observations), that is a problem because it is virtually impossible to obtain statistical significance with such a small number.
  • asked a question related to Sample Size
Question
3 answers
I am conducting a study to better understand the effects of flooding on the mental health of farming communities in my country. However, despite my efforts, I have been unable to find any suitable questionnaire to use in my survey. I'm hoping someone out there may have a relevant questionnaire I can adapt for my study. Alternatively, if anyone can provide me with relevant questions, that would also be very helpful. Additionally, if I were to conduct a group discussion, what would be the ideal sample size for each group?
Relevant answer
Answer
This article could help you select a tool. Please take a look. The sample size has to be estimated based on the prevalence rate. As a rough guidance, a sample of size around 300 will be adequate. Please calculate to determine it.
  • asked a question related to Sample Size
Question
2 answers
Is there any formula to find the sample size needed to create machine learning or deep learning models for the detection, localization, segmentation and classification of colon polyps?
Relevant answer
Answer
Thank you
  • asked a question related to Sample Size
Question
1 answer
Dear Friends,
I want you to read section 4.2 of the following paper and comment.
Happy New Year
Relevant answer
Answer
For ready reference, here it is:
4.2. Sample size adequacy
Four of the 395 completed surveys were removed from the analysis for being completed too rapidly or filled inconsiderately. Moreover, six of the 59 items were deleted (see 5.2 for details). Therefore, the sample-to-item ratio was 391/53 = 7.4, i.e., there were 7.4 subjects per item. Regarding sample size requirements, “no simple rule of thumb about sample size works across all studies” (Kline, 2023, p. 16). In the absence of a sampling frame (non-probability sampling), the sample size issue remains “ambiguous,” and “there are no rules” (Saunders et al., 2019, p. 315).
In structural equation modeling (SEM), a factor analysis-based technique, there are at least two perspectives, “entrenched camps” arguing to look at total sample size (minimum sample sizes) or the ratios (number of cases required per item, N:p ratio) (Kline, 2023, Osborne and Costello, 2004). There is widespread consensus in the first camp that a sample of 100 or less is “untenable” or “poor,” and for a sample of less than 200, journals “routinely reject for publication” (Comrey and Lee, 1992, Kline, 2023). Traditionally, “more is always better” (Osborne & Costello, 2004, p. 8). In contrast, researchers believe that “more is not always better.” (Wolf et al., 2013, p. 14). Sekaran and Bougie (2016, p. 264) said that “too large a sample size (say, over 500) could become a problem” due to the possibility of Type II errors.” They went on to say that “neither too large nor too small sample sizes help research projects.” (Sekaran & Bougie, 2016, p. 264). The minimum sample size of 250 is acceptable (Hoyle, 1995, p. 186). For many, a sample size of 300 or above is acceptable/appropriate/good (Comrey and Lee, 1992; Floyd and Widaman, 1995; Tabachnick and Fidell, 1996).
In the second camp, N:p of 10:1 has been advocated for ages (Everitt, 1975, Nunnally, 1978). Osborne and Costello (2004, p. 2) said this “recommendation was not supported by published research.” Streiner (1994, p. 140) suggests the ratio should be at least 5:1, provided “there are at least 100 subjects. If there are fewer than 100, the ratio should be closer to 10:1.” Several authors consider a 5:1 ratio acceptable (e.g., Bentler and Chou, 1987; Comrey and Lee, 1992; Gorsuch, 1983; Hatcher, 1994; Tabachnick and Fidell, 1996). Rather than a threshold ratio, Cattell (1978) suggested a 3 to 6. Not one ratio is likely to work in all situations. According to Bentler and Chou (1987, p. 91), “when there are many indicators of latent variables and the associated factor loadings are large,” the ratio may go as low as 5:1. In other words, more indicators and loading are critical to deciding optimal sample size. MacCallum et al. (1999,p. 96) concluded N:p ratio depends upon some aspects of variables and design, “most importantly, level of communality plays a critical role.” Where communality (squared factor loadings) represents the “squared multiple correlations among variables” (Tabachnick & Fidell, 2019, p. 481). MacCallum et al., (2001,p. 636) summarized that “samples somewhat smaller than traditionally recommended are likely sufficient when communalities are high.” In a nutshell, for this study, a sample size of 391 is not only sufficiently large but all communalities (ranges from 0.5 to 0.9) are also high. Therefore, N:p ratio of 7.4:1 is adequate for this study.
  • asked a question related to Sample Size
Question
7 answers
Is ex ante power analysis the same as a priori power analysis or is it something different in the domain of SEM and multiple regression analysis? If it is different, then what are the recommended methods or procedures? Any citations for it?
Thank you for precious time and help!
Relevant answer
Answer
Zubaida Abdul Sattar Thanks a lot for sharing detailed information.
  • asked a question related to Sample Size
Question
3 answers
I want to understand the relationship between a change in the margin of error and the research sample size.
  • asked a question related to Sample Size
Question
3 answers
Is it possible to identify statistical differences between two groups when they have different sample sizes, as in the case of comparing CD (with 12 samples) to an experimental group (with 2 samples)?
Relevant answer
Answer
Christopher Muhambuwa and Laith Faisal Lazem Thank you very much
  • asked a question related to Sample Size
Question
3 answers
I have a cross-sectional study with two groups. The calculated sample size was 73 in each group, but 80 participants were included in each group because the statistician told me this would make the results more significant. Now, during peer review, the reviewer has commented that this is unethical and that I need to justify why 14 more participants were added. How can I respond to him?
Relevant answer
Answer
I have a hard time imagining how increasing your sample size could be "unethical." The simple answer for increasing the sample size is to "increase the power of your test" (i.e., the ability of the analysis to detect an effect).
  • asked a question related to Sample Size
Question
2 answers
Dear researchers, I have one question for today.
It is about the analysis of skip questions. Most of us are familiar with questions that require skipping. For example:
If the question is "Do you like fruit?", the answer will be yes or no.
Then the next question is "If yes, which one? A. Banana, B. Apple, C. Orange".
When we analyze this kind of question, it is difficult because only the "yes" respondents enter the model, so the sample size will be smaller than what we originally calculated. How can we analyze this kind of variable?
Relevant answer
Answer
What do you consider to be "difficult" about this type of survey? You seem to be concerned about sample sizes.
If I may make an assumption here: Surely you do not, or even can not, expect every subject who enters the questionnaire to be able to answer all questions, thus maintaining the same number of responses to all questions. Even if it was a 50:50 split between "yes" and "no" answers to Q1, you are forcing a subdivision in Q2 amongst different sub-groups. It's built into your questionnaire design.
Calculate some percentages: 50% of respondents "like fruit" (or whatever it is). Of these, 25% equally indicated a preference for the 3 fruits suggested and 25% suggested "other fruit". Respondents who did not like fruit preferred "bananas" 90% of the time.
Perhaps there are some qualitative statistical methods you can employ. Go talk to your local statistician, I'm sure you can get some useful advice from them.
  • asked a question related to Sample Size
Question
1 answer
I am trying to identify the impacts of last year's three-month flood on the mental health of the farming communities of Pakistan, particularly in rural Sindh province. I have two options: first, I could go ahead with focus-group discussions (how large should the sample be?); second, I could conduct a questionnaire survey (how many completed questionnaires would be sufficient?).
Also, how should I develop this kind of questionnaire, given that I cannot find any relevant study that would be sufficiently helpful to follow?
Relevant answer
Answer
If you cannot find a directly relevant study, you need to widen your conceptual framework. Many communities have been affected by flooding, and many farming communities have been affected by weather and climate events that have caused hardship, loss and death. It might help to think of your research as a study in this wider context.
One difficulty is the absence of a baseline measurement, which is why I think your plan to do qualitative research is a good one. Key informant interviews and focus groups can produce rich data, not just about the mental health impacts but also on the way the community mobilised its resources in the face of adversity. It avoids painting the community as purely a victim, and paves the way for understanding factors that promote community resilience – the ability to learn from adversity so as to be better able to deal with future adversity.
  • asked a question related to Sample Size
Question
1 answer
Hello everyone,
I am trying to make a Network Analysis with EEG data, behavioural data and other variables for predicting the learning of a task, but the sample size is really small (around n=40).
I want to know if it is feasible to do it that way with non-parametric bootstrapping, or if there is another way to do it.
Cheers,
Laura Maldonado
Relevant answer
Answer
Hi Laura,
I think a sample size of 40 is big enough for the analyses you want to do. A sample is generally not called small once it is over 30.
Many studies have run the analyses you mentioned with sample sizes below 30.
  • asked a question related to Sample Size
Question
3 answers
In the original protocol, there should be a sample of 75 divided into three groups of 25 subjects each.
In the actual research I did, I included 78 subjects, divided into groups of 25, 26, and 27 subjects.
Does this violate the protocol?
Relevant answer
Answer
Thank you sir
  • asked a question related to Sample Size
Question
3 answers
Hi
I am trying to work out what my effect size was for my study
I have the sample size, power etc...
Is there a program that can work out what it is?
Relevant answer
Answer
And bear in mind that standardized effect sizes are not always the best choice.
  • asked a question related to Sample Size
Question
5 answers
What is the most appropriate equation to determine the sample size?
Relevant answer
Answer
As Mark said, it depends on various specifics. Basically there are methods based on probability of selection, the simplest being simple random sampling, and there are methods based on models, generally regression models, also called "prediction." There are also model-assisted randomized methods (or model-assisted design-based approaches). There are many variants. But generally they assume that bias is under control, and you just want to calculate what sample size, under your conditions, will hold variance below some desirable level. This also depends on inherent variability. (Please research the difference between standard deviation and standard error.)
I found it interesting that two theoretically very different approaches lead to formulas of the same form. One is a simple model-based ("prediction-based", but not forecasting) method that is helpful in Official Statistics, where sample surveys and census surveys are repeated to monitor markets; the other is simple random sampling. The formula I obtained for the former is in the same format as the one W.G. Cochran developed for the latter in Chapter 4 of his book, Sampling Techniques, 3rd ed, 1977, Wiley. See Knaub, J.R., Jr. (2013), “Projected Variance for the Model-Based Classical Ratio Estimator: Estimating Sample Size Requirements,” JSM Proceedings, Survey Research Methods Section, American Statistical Association, pp. 2885-2889, https://www.researchgate.net/publication/261947825_Projected_Variance_for_the_Model-based_Classical_Ratio_Estimator_Estimating_Sample_Size_Requirements
Note that stratification is often a good way to increase efficiency. If you are only interested in overall results (not those from each stratum), it can be used to reduce sample size.
If you search the internet on specific terms of interest, and include "Pennsylvania State University" in your search, you may find that they have helpful material available online for you on statistics.
  • asked a question related to Sample Size
Question
9 answers
I am conducting a study testing the effectiveness of a kind of group psychotherapy. There are 10 participants in my experimental group and 14 participants in my control group. At first, I planned random assignment to the groups, but because of the timing of the group therapy, 14 of the participants wanted to be in the waitlist control group. After I created the control group, I ran t tests to compare the two groups on some study variables and found no significant differences between them. In summary, the groups have similar characteristics (e.g. age, educational level, romantic relationship status, mean scores of the participants). However, the group sizes are different. Can I do my analysis with 10 people in the experimental group and 14 people in the control group? If not, how do I remove the 4 people from the control group?
Relevant answer
Answer
Ceren Bektaş-Aydın, please report n, mean, and SD for each of your groups. Thanks.
  • asked a question related to Sample Size
Question
4 answers
I am a research scholar from India, doing my research in Human Resource Management.
I am currently researching how managers' emotional intelligence influences service employees' performance, where:
Manager's Emotional Intelligence (MEI) - 1 Dependent Variable - Qualitative Interview and Quantitative Survey
The MEI is measured from:
Qualitative Interviews from Managers and Perception Based Survey from the service employees about MEI. (Data Triangulation)
Service Employee's Performance (SEP) - 7 Independent Variables - Quantitative Survey
The SEP is measured from:
7 Variables of SEP from Customers as a Quantitative Survey.
Could someone help me out?
Relevant answer
Answer
Since this is not a pilot study, the small numbers are not relevant.
You need to base your sample size on the minimum size of the relationship between your predictors and predicted variables that you have adequate power to detect (90% or better).
If you have a look at the RCSI sample size handbook, it gives sample sizes for studies using correlations or regression on page 51. Worst case, where the variables only share 10% of their variation (correlation 0·32), is a sample size of 99, while with a really high correlation (0·71) you could need as few as 20.
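As a rough cross-check of figures like these, here is a minimal sketch based on the Fisher z approximation for testing a correlation against zero. It assumes a two-sided alpha of 0.05 and 90% power; the handbook may use a different method, so its numbers need not match exactly.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_correlation(r, power=0.90, alpha=0.05):
    """Approximate n needed to detect a population correlation r.

    Uses the Fisher z transformation: atanh(r) has standard error
    of roughly 1 / sqrt(n - 3).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(((z_alpha + z_beta) / atanh(r)) ** 2 + 3)


for r in (0.32, 0.50, 0.71):
    print(r, n_for_correlation(r))
```

The required n falls steeply as the expected correlation grows, which is why the choice of the smallest correlation worth detecting drives the whole calculation.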
  • asked a question related to Sample Size
Question
3 answers
Hello, my name is Hue Man, I am a senior nursing student at Tra Vinh University. Currently, I am doing my graduation thesis on: "Research on the rate of depression in the elderly using the GDS-5/15 scale and some related factors at the medical examination department". But the problem I am currently facing is the sample size calculation formula and how to process the data. Please can anyone give me advice or a solution. I sincerely thank you
Relevant answer
Answer
You require n = 30 to fulfil the CLT and add a few more for drop-outs :)
  • asked a question related to Sample Size
Question
2 answers
Dear all,
As part of my bachelor thesis, I am using a simple mediation with bootstrapping procedure. I would now like to calculate the power of my analysis. For guidance, I have read the 2007 paper "Required Sample Size to Detect the Mediated Effect" by Matthew S. Fritz and David P. MacKinnon (DOI: 10.1111/j.1467-9280.2007.01882.x). I would like to use the "empirical estimates of sample sizes needed for .8 power" (Table 3) to assess the power of my mediation analysis (using bootstrapping). As far as I know, the power of a test always depends on the sample size, the effect size and the Type I error rate. Unfortunately, however, I could find no information on the probability of Type I error on which Fritz and MacKinnon's percentile bootstrap and bias-corrected bootstrap power analyses are based. I would now like to ask if anybody knows which Type I error rate is used in the paper. It would be very helpful if somebody could help me with this question.
Kind regards
Leonie Aderhold
Relevant answer
Answer
You are right. It is odd that Fritz and MacKinnon (2007) do not seem to provide the specified alpha (Type I) error rate explicitly in their Methods or Results sections. In the Discussion section of that paper, they mention "a Type I error of .05", so I would assume that they used .05 in their simulations (it is the most commonly used Type-I error rate), but it is not entirely clear. Perhaps send an email to Matthew Fritz (https://cehs.unl.edu/edpsych/faculty/matthew-fritz/) to confirm.
  • asked a question related to Sample Size
Question
5 answers
In our study, we reached 200 participants and have identified various criteria for conducting exploratory factor analysis, such as the sample size being 5-10 times the number of items, the KMO value, etc. However, I couldn't find a satisfactory reference in the literature regarding the sample size for Confirmatory Factor Analysis (CFA). In this context, the main question is: Is a sample size of 200 sufficient for the 5-factor, 40-item scale used in the study? Another consideration is that, while sample size is important, if the fit indices take the sample size into account when assessing construct validity, and if the fit values are good, it might be reasonable to perform CFA with 200 participants. I am looking forward to responses from experts.
Relevant answer
Answer
Probably the best way to figure out the sample size requirements for a CFA is by using a Monte Carlo simulation. A simulation allows you to consider your exact model (i.e., the precise factor structure and expected parameter estimates) and to simulate realistic conditions (e.g., missing data, non-normal data, different estimators) instead of relying on rules of thumb that may not apply to your specific scenario. Simulations allow you to examine potential parameter estimate bias, standard error bias, statistical power, accuracy of fit statistics, and more.
See:
Muthén, L. K., & Muthén, B. O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling, 9(4), 599-620.
  • asked a question related to Sample Size
Question
2 answers
Hello ResearchGate community.
I have perused several recent sources, only to find data or power tables missing, and I cannot seem to find the best source for an appropriate minimum sample size for a conditional process (moderated mediation) analysis.
With 4 variables (3 predictors, 1 outcome) and assuming power .80 with alpha .05 and small to medium effect sizes between all (i.e. 0.30) could anyone point me in the right direction please?
Relevant answer
Answer
For complex mediation (path analytic) models, Monte Carlo simulation techniques are probably your best bet. See:
Thoemmes, F., MacKinnon, D. P., & Reiser, M. R. (2010). Power analysis for complex mediational designs using Monte Carlo methods. Structural Equation Modeling, 17(3), 510-534.
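To give a feel for what such a simulation involves, here is a minimal sketch for the simplest case, a single mediator (X -> M -> Y), not the moderated mediation model in the question. It assumes standardized paths a = b = 0.30, normal errors, and a joint-significance test with a normal critical value of 1.96 standing in for the bootstrap tests used in the literature. All function names are hypothetical, and only the Python standard library is used.

```python
import random
from math import sqrt


def _center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def t_path_a(x, m):
    """t statistic for the slope of M regressed on X."""
    x, m = _center(x), _center(m)
    sxx, sxm, smm = _dot(x, x), _dot(x, m), _dot(m, m)
    a_hat = sxm / sxx
    resid_var = (smm - a_hat * sxm) / (len(x) - 2)
    return a_hat / sqrt(resid_var / sxx)


def t_path_b(x, m, y):
    """t statistic for the M slope when Y is regressed on M and X."""
    x, m, y = _center(x), _center(m), _center(y)
    sxx, smm, smx = _dot(x, x), _dot(m, m), _dot(m, x)
    smy, sxy, syy = _dot(m, y), _dot(x, y), _dot(y, y)
    det = smm * sxx - smx ** 2
    b_hat = (smy * sxx - sxy * smx) / det
    c_hat = (sxy * smm - smy * smx) / det
    resid_var = (syy - b_hat * smy - c_hat * sxy) / (len(x) - 3)
    return b_hat / sqrt(resid_var * sxx / det)


def mediation_power(n, a=0.3, b=0.3, c_prime=0.0, reps=500, crit=1.96, seed=1):
    """Share of simulated samples in which both paths a and b are significant."""
    rng = random.Random(seed)
    sd_m = sqrt(1 - a * a)  # keeps Var(M) = 1
    resid_y = 1 - (b * b + c_prime ** 2 + 2 * a * b * c_prime)
    if resid_y <= 0:
        raise ValueError("paths imply more than 100% explained variance")
    sd_y = sqrt(resid_y)  # keeps Var(Y) = 1
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        m = [a * xi + rng.gauss(0, sd_m) for xi in x]
        y = [b * mi + c_prime * xi + rng.gauss(0, sd_y) for xi, mi in zip(x, m)]
        if abs(t_path_a(x, m)) > crit and abs(t_path_b(x, m, y)) > crit:
            hits += 1
    return hits / reps


for n in (50, 100, 150):
    print(n, mediation_power(n))
```

One would raise n until the estimated power crosses the target (for example .80). For moderated or more complex models the same loop applies, but the data-generating step and the tested effect change, which is where dedicated SEM software becomes more convenient.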
I offer a free mini-course on simulation of path models in Mplus that you can find here:
  • asked a question related to Sample Size
Question
3 answers
Our research is quasi-experimental. There are two groups to be tested under different teaching approaches; however, we don't know how many participants should be in each group.
Relevant answer
Answer
Without knowing more details about your research project, it is difficult to make meaningful statements here. Which research approaches in which domain do you want to test empirically? It cannot just be about the number of participants in the group. As a rule, certain preconditions and contextual conditions must be taken into account: What is the research question? What are the scientific objectives? Which specific teaching/ learning settings are to be evaluated: combined vs. shared, synchronous vs. asynchronous, individual or cooperative, etc.? Should it be a comparative study? Is a control group planned? These and other requirements must be clarified and defined. Then the research design can be precisely conceptualized.
  • asked a question related to Sample Size
Question
3 answers
We measured three aspects (i.e. variables) of self-regulation. We have 2 groups and our sample size is ~30 in each group. We anticipate that three variables will each contribute unique variance to a self-regulation composite. How do we compare if there are group differences in the structure/weighting of the composite? What analysis should be conducted?
Relevant answer
Answer
Are you thinking of self-regulation as a latent variable with the 3 "aspects" as manifest indicators? If so, you could use a two-group SEM, although your sample size is a bit small.
You've not said what software you use, but this part of the Stata documentation might help you get the general idea anyway.
  • asked a question related to Sample Size
Question
3 answers
I need this to justify a survey sample Size.
Relevant answer
Answer
Philip -
If you look at Cochran's book, he briefly derives what is in the sample size chapter. That is for simple random sampling and for continuous data, and separately for proportions. If you search on related terms and include "Pennsylvania State University" in the search, you will find material on this, often, I think they say, from Steven Thompson's book. Other chapters in Cochran give hints on other designs, and then there are model-assisted and model-based methods discussed largely in other books and papers/articles.
It seems that often on ResearchGate you will hear of someone using a "formula" by some name that turns out just to be a special case of what Cochran has for simple random sampling, likely for proportions (yes/no data). If you don't have a copy of Cochran's 3rd edition (Wiley, 1977), then see what Penn State has online, as noted above. (Cochran's first edition is interesting too, but the chapter you want is in the 3rd edition.)
Cheers - Jim
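For readers who want to see the simple random sampling case for proportions worked through, here is a minimal sketch of Cochran's formula, n0 = z^2 p(1-p) / e^2, with the optional finite population correction n = n0 / (1 + (n0 - 1)/N). The default p = 0.5 is the usual conservative assumption when no prior estimate exists; the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def cochran_n(margin, p=0.5, confidence=0.95, population=None):
    """Sample size for estimating a proportion under simple random sampling.

    margin: desired margin of error (e.g. 0.05 for +/- 5 points)
    population: optional finite population size N
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z * z * p * (1 - p) / margin ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)  # finite population correction
    return ceil(n0)


print(cochran_n(0.05))                   # large population
print(cochran_n(0.05, population=200))   # small finite population
print(cochran_n(0.025))                  # halving the margin of error
```

Because the margin of error enters squared, halving it roughly quadruples the required sample size, while a small finite population reduces it considerably.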
  • asked a question related to Sample Size
Question
5 answers
Hi all.
I would like to know how to calculate the required sample size for the sequential mediation model (X->M1->M2->Y) if there is no previous research showing the correlation between them. Are there any ways to calculate the sample size by utilizing the information on the number of variables?
Thanks!
Relevant answer
Answer
General approach to estimating the sample size for your sequential mediation model:
1. Determine the desired statistical power: Statistical power refers to the probability of correctly detecting an effect if it truly exists. Commonly used levels of statistical power are 0.80 or 0.90, implying an 80% or 90% chance of detecting a significant effect if it is present.
2. Specify the desired level of significance: The level of significance (typically denoted by α) is the threshold below which a p-value is considered statistically significant. The most common level is α = 0.05, implying a 5% chance of making a Type I error (false positive).
3. Estimate effect sizes: Since you mentioned that there is no previous research showing the correlations between the variables, it can be challenging to estimate effect sizes. However, you can make educated guesses or assumptions based on similar studies or related research.
4. Select an appropriate statistical method: Sequential mediation models can be analyzed using techniques such as structural equation modeling (SEM) or bootstrapping. The choice of method may influence the sample size calculation, so it's important to determine the specific analysis approach.
5. Use sample size estimation software: There are various software packages and online calculators available that can assist in sample size estimation for mediation models. These tools often require inputs such as effect sizes, desired power, significance level, and the specific analysis method.
Hope it helps. Credit: AI
  • asked a question related to Sample Size
Question
8 answers
Have you ever tried to demonstrate that a drug has no effect or that a new teaching method is not superior to an old one? Let's delve into the intricacies of constructing and testing the null hypothesis in such scenarios. In this context, is your hypothesis the null hypothesis? Share your experiences, insights, and methodologies for crafting and testing hypotheses aimed at proving 'no effect.' How do you calculate sample sizes when seeking an effect size of zero?
Relevant answer
Answer
There are frequentist and Bayesian approaches to finding "evidence" for the null hypothesis. But all of them require that you define not only a point null hypothesis (as is usual in typical frequentist approaches, e.g. correlation = 0 or mean difference = 0), but a region of values that you would consider a null effect, i.e. values so small that they are of no practical interest. For this you might consult some papers on the Smallest Effect Size Of Interest (SESOI).
Once you have defined such a region, you might use one of these approaches:
1) The frequentist Two One-Sided Tests (TOST), see publications by Lakens for example.
2) The Bayesian Region Of Practical Equivalence (ROPE), see publications by John Kruschke for example
3) The Bayesian Bayes Factor (BF), see publications by Jan Wagenmakers for example.
All approaches have their pros and cons in my opinion and it depends what you are specifically looking for. I personally favour the ROPE approach, but this is only my opinion, since it follows the model workflow I would use. For a comparison of approaches see:
Linde, M., Tendeiro, J. N., Selker, R., Wagenmakers, E. J., & van Ravenzwaaij, D. (2023). Decisions about equivalence: A comparison of TOST, HDI-ROPE, and the Bayes factor. Psychological Methods, 28(3), 740.
and for a critique of their general finding and conclusion in favour of the BF:
Campbell, H., & Gustafson, P. (2021). re: Linde et al.(2021): The Bayes factor, HDI-ROPE and frequentist equivalence tests can all be reverse engineered--almost exactly--from one another. arXiv preprint arXiv:2104.07834.
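On the sample size part of the question, here is a minimal sketch for the TOST approach with two independent groups. It uses the standard normal-approximation formula for equivalence when the true difference is assumed to be exactly zero, n per group = 2 (z_{1-alpha} + z_{1-beta/2})^2 sigma^2 / delta^2, where delta is the equivalence margin (the SESOI). The example margin of half a standard deviation is an illustrative assumption.

```python
from math import ceil
from statistics import NormalDist


def tost_n_per_group(delta, sd=1.0, alpha=0.05, power=0.80):
    """Approximate n per group for a two-group TOST equivalence test.

    delta: equivalence margin (same units as sd), applied as +/- delta
    Assumes the true mean difference is zero and equal group sizes.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)            # each one-sided test at alpha
    z_beta = z.inv_cdf(1 - (1 - power) / 2)   # beta is split over both bounds
    return ceil(2 * (z_alpha + z_beta) ** 2 * (sd / delta) ** 2)


for margin in (0.5, 0.3, 0.2):
    print(margin, tost_n_per_group(margin))
```

The smaller the region you are willing to call "no effect", the larger the sample needed, so the sample size question cannot be answered before the SESOI is fixed.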
  • asked a question related to Sample Size
Question
4 answers
Hi, everyone:
I'm planning to conduct a mixed-method study that will begin with a questionnaire followed by qualitative interviews. I have a population of around 200 students, and I would like to determine the optimal sample size for both the questionnaire and the interviews. How many students should I select for the questionnaire, and how many should I choose for the interviews?
Relevant answer
Answer
There is nothing about using mixed methods that affects your sample size in either the qualitative or the quantitative portion of your study. For the quantitative portion, you can use any of the sample size calculators that you can find with a Google search. For the qualitative portion, the classic recommendation would be to rely on saturation as your criterion.
  • asked a question related to Sample Size
Question
3 answers
What is the minimum sample size for sieve analysis and how do you calculate particle size in sieve analysis?
Relevant answer
Answer
The minimum sample size for sieve analysis depends on the nominal maximum size of the aggregate. The following table shows the minimum sample sizes for different nominal maximum sizes:
Nominal Maximum Size    Minimum Sample Size
2 inches                20 kg
1 1/2 inches            15 kg
1 inch                  10 kg
3/4 inch                5 kg
1/2 inch                2 kg
No. 4                   500 g
No. 8                   250 g
No. 16                  125 g
No. 30                  60 g
No. 50                  30 g
No. 100                 15 g
No. 200                 7.5 g
To calculate particle size in sieve analysis, the following steps are followed:
  1. Weigh the sample of aggregate.
  2. Nest a series of sieves with decreasing sieve openings on top of a pan. The smallest sieve should have an opening of 75 micrometers (No. 200 sieve).
  3. Pour the sample of aggregate onto the top sieve.
  4. Shake the sieves for a specified amount of time.
  5. Weigh the material retained on each sieve and the material that passes through the No. 200 sieve.
  6. Calculate the percentage of the sample that passes through each sieve.
  7. Plot the percentage passing versus sieve size to obtain the gradation curve.
The particle size distribution of the aggregate is typically characterized by two values:
  • D10: The size at which 10% of the aggregate passes.
  • D60: The size at which 60% of the aggregate passes.
For example, if D10 is 0.5 mm and D60 is 2.5 mm, the ratio D60/D10 (the uniformity coefficient) is 5, and the aggregate is said to be well-graded. This means that there is a good distribution of particle sizes, which is important for good compaction and strength.
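The calculation steps above can be sketched as follows. The sieve openings and retained masses are made-up illustration data, and D10 and D60 are read off the gradation curve by linear interpolation on a logarithmic size axis, which is one common convention rather than the only one. The ratio D60/D10 is the uniformity coefficient.

```python
from math import log10


def percent_passing(retained_g, pan_g):
    """Cumulative percent passing each sieve (sieves ordered coarse to fine)."""
    total = sum(retained_g) + pan_g
    passing, cumulative = [], 0.0
    for mass in retained_g:
        cumulative += mass
        passing.append(100.0 * (total - cumulative) / total)
    return passing


def d_value(sizes_mm, passing_pct, target):
    """Particle size at which `target` percent passes (log-linear interpolation)."""
    pairs = sorted(zip(sizes_mm, passing_pct))  # ascending sieve size
    for (s_lo, p_lo), (s_hi, p_hi) in zip(pairs, pairs[1:]):
        if p_lo <= target <= p_hi and p_hi > p_lo:
            frac = (target - p_lo) / (p_hi - p_lo)
            return 10 ** (log10(s_lo) + frac * (log10(s_hi) - log10(s_lo)))
    raise ValueError("target percent is outside the measured range")


# Hypothetical sample: sieve openings in mm and mass retained in grams
sizes = [4.75, 2.36, 1.18, 0.6, 0.3, 0.15, 0.075]
retained = [0, 100, 200, 300, 200, 100, 50]
pan = 50  # material passing the No. 200 (0.075 mm) sieve

passing = percent_passing(retained, pan)
d10 = d_value(sizes, passing, 10)
d60 = d_value(sizes, passing, 60)
print("percent passing:", [round(p, 1) for p in passing])
print("D10 =", round(d10, 3), "mm, D60 =", round(d60, 3), "mm, Cu =", round(d60 / d10, 2))
```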
  • asked a question related to Sample Size
Question
1 answer
Dear RG community
I've coded N = 500 professional development courses for teachers according to topics (0 = was not part of the course; 1 = was part of the course). I'd like to have the reliability of my coding checked by a second rater. What is the appropriate measure under these circumstances and how many of the 500 courses would a second rater have to rate?
So far, I've come to the conclusion that Cohen's Kappa may not be the preferred choice, but rather Matthews Correlation Coefficient (MCC). Perhaps even simple percent agreement would be suitable in my case since it's only two raters in total and binary coding? I've been unable to find anything on the minimum sample size.
Any help is greatly appreciated.
Best
Marcel
Relevant answer
Answer
There are many ways to calculate inter-rater reliability: Cohen's Kappa, Weighted Cohen's Kappa, Fleiss' Kappa, Conger's Kappa, Light's Kappa, Krippendorff's Alpha, Iota, Scott's Pi, Stuart-Maxwell, Bhapkar Test, Gwet's AC1/AC2, Brennan-Prediger.
Here's a resource on sample size for inter-rater reliability (Cohen's Kappa): https://rdrr.io/cran/irr/man/N.cohen.kappa.html
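For the two-rater, binary-coding case described in the question, three of the candidate measures are easy to compute side by side. The sketch below uses made-up ratings for ten courses; the function name is hypothetical.

```python
from math import sqrt


def agreement_stats(rater1, rater2):
    """Percent agreement, Cohen's kappa and MCC for two binary (0/1) codings."""
    n = len(rater1)
    pairs = list(zip(rater1, rater2))
    both1 = sum(1 for a, b in pairs if a == 1 and b == 1)
    both0 = sum(1 for a, b in pairs if a == 0 and b == 0)
    only1 = sum(1 for a, b in pairs if a == 1 and b == 0)  # rater 1 only
    only2 = sum(1 for a, b in pairs if a == 0 and b == 1)  # rater 2 only

    p_obs = (both1 + both0) / n
    p1, p2 = (both1 + only1) / n, (both1 + only2) / n
    p_exp = p1 * p2 + (1 - p1) * (1 - p2)  # agreement expected by chance
    kappa = (p_obs - p_exp) / (1 - p_exp)  # undefined if p_exp == 1

    denom = sqrt((both1 + only1) * (both1 + only2) * (both0 + only1) * (both0 + only2))
    mcc = (both1 * both0 - only1 * only2) / denom  # undefined if denom == 0
    return p_obs, kappa, mcc


# Hypothetical codings of one topic across ten courses
r1 = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
r2 = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]
print(agreement_stats(r1, r2))
```

Percent agreement ignores chance agreement, which matters when a topic is rare (most codes are 0), so it is usually reported alongside a chance-corrected measure rather than instead of one.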
  • asked a question related to Sample Size
Question
2 answers
Hi guys,
I am interested in conducting a longitudinal study investigating the development of within-connectivity of the DMN in autistic and non-autistic children. Scans will take place at ages 8, 13 and 18.
My dependent variable will be DMN within-connectivity, calculated from ROI-to-ROI connectivity: pairwise correlations between the time series of regions within each hemisphere and between the hemispheres will be averaged. These averages will then be averaged to give a single within-connectivity value.
My fixed variables will be time and autism as well as their interaction. My random effects will be subject-specific intercepts and slopes and my covariates will be gender and education (I am expecting to add more covariates).
I am curious as to how I calculate a sample size a priori for this. I know I need to define an effect size, which I have a rough idea of, and of course power and alpha. Can this be done in G*Power? (To my knowledge I don't think so, but I may be missing something.)
Any help would be useful.
Thanks in advance
Relevant answer
Answer
GLIMMPSE (General Linear Mixed Model Power and Sample Size) likely would be able to accommodate your multilevel longitudinal design. There is a free online version:
  • asked a question related to Sample Size
Question
6 answers
I want to ask about the use of parametric and non-parametric tests when we have an enormous sample size.
Let me describe a case for discussion:
- I have two groups of samples of a continuous variable (let's say: Pulse Pressure, so the difference between systolic and diastolic pressure at a given time), let's say from a) healthy individuals (50 subjects) and b) patients with hypertension (also 50 subjects).
- there are approx. 1000 samples of the measured variable from each subject; thus, we have 50*1000 = 50000 samples for group a) and the same for group b).
My null hypothesis is: that there is no difference in distributions of the measured variable between analysed groups.
I calculated two different approaches, providing me with a p-value:
Option A:
- I took all samples from group a) and b) (so, 50000 samples vs 50000 samples),
- I checked the normality in both groups using the Shapiro-Wilk test; both distributions were not normal
- I used the Mann-Whitney test and found significant differences between distributions (p<0.001), although the median value in group a) was 43.0 (Q1-Q3: 33.0-53.0) and in group b) 41.0 (Q1-Q3: 34.0-53.0).
Option B:
- I averaged the variable's values over all participants (so, 50 samples in group a) and 50 samples in group b))
- I checked the normality in both groups using the Shapiro-Wilk test; both distributions were normal,
- I used Student's t-test and obtained p-value: 0.914 and median values 43.1 (Q1-Q3: 33.3-54.1) in group a) and 41.8 (Q1-Q3: 35.3-53.1) in group b).
My intuition is that I should use option B and average the signal before the testing. Otherwise, I reject the null hypothesis, having a very small difference in median values (and large Q1-Q3), which is quite impractical (I mean, visually, the box plots look very similar, and they overlap each other).
What is your opinion about these two options? Are both correct but should be used depending on the hypothesis?
Relevant answer
Answer
You have 1000 replicate measurements from each subject. These 1000 values are correlated and they should not be analyzed as if they were independent. So your model is wrong and you should identify a more sensible model. In the end, the test of the difference between your groups should not have more than 98 degrees of freedom (it should have fewer, since a sensible model will surely include some other parameters than just the two means). Having 1000 replicate measurements seems overkill to me if there is no other aspect that should be considered in the analysis (like a change over time, with age, something like that). If there is nothing else that should be considered, the simplest analysis is to average the 1000 values per patient and do a t-test on 2x50 (averaged) values.
If you had thousands of independent samples per group, estimation would be more interesting than testing. You would then do better to interpret the 95% confidence interval of the estimate (biological relevance) rather than the (in this respect silly) fact of whether it lies just in the positive or in the negative range.
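The "average per subject, then test" route can be sketched as below. The data are simulated under made-up assumptions (between-subject SD of 12, within-subject SD of 5, 200 replicates per subject to keep the run short), and only the t statistic and its degrees of freedom are computed; a p-value would need a t table or a statistics package.

```python
import random
from math import sqrt
from statistics import mean, variance


def subject_means(measurements_by_subject):
    """Collapse the replicate measurements of each subject to one value."""
    return [mean(values) for values in measurements_by_subject]


def pooled_t(group_a, group_b):
    """Student's t statistic (equal variances) and its degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


def simulate_group(rng, n_subjects, n_reps, grand_mean):
    """Hypothetical data: each subject has their own level plus noise."""
    data = []
    for _ in range(n_subjects):
        subject_level = rng.gauss(grand_mean, 12)
        data.append([rng.gauss(subject_level, 5) for _ in range(n_reps)])
    return data


rng = random.Random(0)
healthy = simulate_group(rng, 50, 200, 43.0)
hypertensive = simulate_group(rng, 50, 200, 42.0)

t_stat, df = pooled_t(subject_means(healthy), subject_means(hypertensive))
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

The degrees of freedom come out at 98 regardless of how many replicates each subject contributes, which is the point made above: the subject, not the single measurement, is the unit of analysis.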
  • asked a question related to Sample Size
Question
4 answers
We are currently conducting qualitative research on the effects of influencer marketing on purchase behavior. However, before that, our panel has suggested conducting a pre-survey to identify (1) the products endorsed through influencer marketing that students mostly purchase, and (2) the social media platforms on which students purchase influencer-endorsed products. The pre-survey will help us narrow our scope by focusing on a specific social media platform and product.
Our question is how can we determine the sample size?
Thank you in advance!
Relevant answer
Answer
R.V. Krejcie and D.W. Morgan, Educational and Psychological Measurement, 1970, 30:607-610, gave a table for choosing sample size. Also, M.A. Hertzog, Research in Nursing and Health, 2008, 31:180-191, suggested that sample sizes should be higher than 40. Regards.
  • asked a question related to Sample Size
Question
8 answers
We are trying to conduct a meta-analysis. One of the studies is providing Nagelkerke's R2 and p-value (and sample sizes for two groups) but not the actual effect size. Is there a way to convert this data to an effect size that we can use in a meta-analysis? Thanks!
Relevant answer
Answer
What's your definition of "effect size" ?
  • asked a question related to Sample Size
Question
1 answer
Q 1. Which model should be preferred (FE/RE) when conducting a meta-analysis for pooling prevalence?
Q2. In the RE model, why do all studies receive equal weight irrespective of sample size or confidence intervals?
Q3. When we use the FE model, all the studies receive weight according to their sample size or CI. But my question is, is it correct to use the Fixed Effect Model?
Relevant answer
Answer
  • RE model is generally preferred over FE model for pooling prevalence in meta-analysis.
  • In RE model, studies with lower variance are given greater weight.
  • FE model should not be used unless you are confident that the true prevalence of the outcome is the same in all studies.
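To illustrate the point about weights, here is a minimal sketch that pools raw proportions with inverse-variance weights under both models, using the DerSimonian-Laird estimate of the between-study variance tau^2. The study counts are made up, and pooling untransformed proportions is a simplification; in practice a logit or Freeman-Tukey transformation is often applied first.

```python
def pooled_prevalence(events, totals):
    """Fixed- and random-effects (DerSimonian-Laird) pooling of proportions.

    Note: a study with 0 or 100% events has zero variance here and would
    need a continuity correction or a transformation.
    """
    props = [e / n for e, n in zip(events, totals)]
    variances = [p * (1 - p) / n for p, n in zip(props, totals)]

    w_fixed = [1 / v for v in variances]
    fixed = sum(w * p for w, p in zip(w_fixed, props)) / sum(w_fixed)

    # Cochran's Q and the DerSimonian-Laird between-study variance
    q = sum(w * (p - fixed) ** 2 for w, p in zip(w_fixed, props))
    k = len(props)
    c = sum(w_fixed) - sum(w * w for w in w_fixed) / sum(w_fixed)
    tau2 = max(0.0, (q - (k - 1)) / c)

    w_random = [1 / (v + tau2) for v in variances]
    pooled_random = sum(w * p for w, p in zip(w_random, props)) / sum(w_random)

    def share(weights):
        return [w / sum(weights) for w in weights]

    return {
        "fixed": fixed,
        "random": pooled_random,
        "tau2": tau2,
        "fixed_weights": share(w_fixed),
        "random_weights": share(w_random),
    }


# Hypothetical studies: number of cases and sample size
result = pooled_prevalence(events=[30, 120, 45, 400], totals=[100, 500, 300, 2000])
print([round(w, 3) for w in result["fixed_weights"]])
print([round(w, 3) for w in result["random_weights"]])
```

In the random-effects model the same tau^2 is added to every study's variance, so the weights move closer together as heterogeneity grows, but more precise studies still receive more weight; they are not equal unless tau^2 dominates all the within-study variances.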
  • asked a question related to Sample Size
Question
5 answers
Hi! Can I ask a favor of everyone? I really need to know the required sample size, and any citations will be much appreciated.
Relevant answer
Answer
The sample size for an interventional research using multiple baseline small N design depends on several factors, such as the number of participants, the number of phases, the number of behaviors or outcomes measured, the expected effect size, and the desired statistical power. There is no definitive formula for calculating the sample size for this type of design, but some general guidelines are:
  • The number of participants should be at least three, and preferably more, to allow for a reasonable degree of generalization and replication.
  • The number of phases should be at least three (baseline, intervention, and follow-up), and preferably more, to allow for a clear demonstration of functional relation between the intervention and the outcome.
  • The number of behaviors or outcomes measured should be at least two, and preferably more, to allow for a comparison of the effects of the intervention across different dimensions or domains.
  • The expected effect size should be large enough to be clinically or practically meaningful, and to be detectable with the available sample size.
  • The desired statistical power should be at least 0.8, which means that there is an 80% chance of detecting a true effect if it exists.
One way to estimate the sample size for a multiple baseline small N design is to use a simulation approach, which involves generating hypothetical data based on the assumed parameters of the design, and then applying appropriate statistical tests to see how often the null hypothesis is rejected. This can be done using specialized software or online tools. Alternatively, one can use existing data from similar studies or pilot studies to estimate the sample size needed to achieve a certain level of power.
  • asked a question related to Sample Size
Question
12 answers
It is a descriptive research design
Relevant answer
Answer
Vishal Pradhan
Small sample sizes may well be inadequate for ANOVA because they lack the "power" to detect effects that would be significant with a larger sample. That is why I recommend the program G*Power to estimate one's sample size prior to collecting the data. This program is based on using an estimate of the size of the effect that you are likely to observe, which determines the sample size that would be necessary to detect such an effect.
  • asked a question related to Sample Size
Question
1 answer
I am struggling to define the sample size of specimen for Tension Compression Fatigue Testing of Polymer Matrix Composites. Is there any ASTM standard for the same? If not, what sample size should I take to avoid buckling of the sample. I want to test at R=-1. For tension-tension fatigue testing I will be following ASTM D3479 standard.
Relevant answer
Answer
You can base your sample size on similar, high-quality research published in highly ranked journals. Using the same sample sizes those studies used gives you a scientific justification for your own.
Alternatively, a sample of n >= 30 is conventionally called a large sample in many statistics textbooks and references. If you can reach this size, it can be justified on that basis.
You can also use Cochran's basic formula to calculate the sample size, but the resulting sample size will be large, ranging from about 100 to 400, and this may be expensive for you.
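For reference, Cochran's formula for estimating a proportion is n0 = z^2 * p * (1 - p) / e^2. The sketch below is a minimal illustration; the function name and the default values (p = 0.5, 95% confidence) are assumptions chosen for the example, and the formula targets survey-style estimation of a proportion rather than mechanical test specimens.

```python
import math
from statistics import NormalDist


def cochran_sample_size(margin_of_error, proportion=0.5, confidence=0.95):
    """Cochran's sample size for a proportion (large-population form)."""
    # Two-sided critical value for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2
    return math.ceil(n0)


print(cochran_sample_size(0.05))  # 385
print(cochran_sample_size(0.10))  # 97
```

With margins of error between 5% and 10%, the result falls roughly in the 100 to 400 range mentioned above.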
In the end, I advise you to use the G*Power software, as it is simple to use, and its options are clear and comprehensive.
Best wishes,
  • asked a question related to Sample Size
Question
2 answers
I have a sample composed of two subsamples (say public and private companies), and a certain relationship is insignificant in both subsamples. If this relationship is significant in the collective sample, how can this be interpreted? Could this be because of the large sample size of the collective sample?
Relevant answer
Answer
Yes, the larger sample size in the combined sample could be the source of such a result. In particular, in a regression where the effect in each of the smaller subsamples is "borderline" (i.e., approaching significance), the same effect size could be significant when evaluated with a larger N, due to the smaller standard error associated with that larger N.
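A small numeric illustration of this point, using a correlation as a simple stand-in for the regression effect. The values (r = 0.15, subsamples of 100) are hypothetical, and the p-value uses the Fisher z approximation rather than an exact test.

```python
import math
from statistics import NormalDist


def correlation_p_value(r, n):
    """Approximate two-sided p-value for a correlation r with sample size n.

    Uses the Fisher z transformation: atanh(r) * sqrt(n - 3) is
    approximately standard normal under the null of zero correlation.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same effect size, different N: the standard error shrinks as N grows.
print(round(correlation_p_value(0.15, 100), 3))  # each subsample: above 0.05
print(round(correlation_p_value(0.15, 200), 3))  # combined sample: below 0.05
```

Here the effect size is identical in every case; only the sample size, and therefore the standard error, differs.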
  • asked a question related to Sample Size
Question
4 answers
Sample size requirements for Structural Equation Models (SEM)?
Relevant answer
Answer
Dr. Christian Geiser,
Thanks.
Will go through the suggested readings.
  • asked a question related to Sample Size
Question
4 answers
I am currently conducting a cross-sectional study investigating the impact of different orthodontic appliances on oral health-related quality of life. The study is divided into four groups: two different orthodontic appliance groups, a non-treatment group, and a completely healthy patient group. The outcome is based on questionnaire scores, which are continuous variables.
The study has been submitted, and we have received a major revision request. One of the reviewers commented: "Sample size calculation formation is not suitable for a comparative study among >2 groups."
In the initial design, we planned to use One-way ANOVA for result testing, considering the outcome as continuous. However, since the questionnaire scores did not conform to a normal distribution, we eventually used the Kruskal-Wallis (K-W) test. The formula currently used for sample size calculation is: n = 2(z_α + z_β)^2 σ^2 / δ^2.
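For readers unfamiliar with it, the formula quoted above is the usual normal-approximation sample size per group for comparing two means. A minimal sketch follows, assuming a two-sided test (so z_α is taken as the 1 - α/2 quantile) and purely hypothetical values for σ and δ; none of these numbers come from the study.

```python
import math
from statistics import NormalDist


def n_per_group_two_means(sigma, delta, alpha=0.05, power=0.80):
    """Per-group n for a two-group comparison of means (normal approximation).

    sigma: assumed common standard deviation
    delta: smallest difference in means worth detecting
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided alpha (assumption)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)


print(n_per_group_two_means(sigma=10, delta=5))  # 63
```

Note that this gives the size of each of two groups; it does not by itself account for comparing four groups or for a rank-based test such as Kruskal-Wallis.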
I am seeking advice on how to address the reviewer’s comment concerning the sample size calculation in a comparative study among multiple groups with a non-normally distributed continuous outcome. Any suggestions or references to guide the appropriate sample size calculation method or statistical approach for this study would be greatly appreciated.
Thank you in advance for your valuable input!
Relevant answer
Answer
Yu Chen, you said that the study has been submitted, and that a reviewer was unhappy with the sample size calculation you did (if I follow). You also wrote:
In the initial design, we planned to use One-way ANOVA for result testing, considering the outcome as continuous. However, since the questionnaire scores did not conform to a normal distribution, we eventually used the Kruskal-Wallis (K-W) test.
There are multiple issues here.
1) There is no point in estimating the needed sample size after data collection has ended. See Section 3 (retrospective power) of this short note by Russell Lenth: https://homepage.stat.uiowa.edu/~rlenth/Power/2badHabits.pdf
2) Your comment about switching from the ANOVA you intended to use to the Kruskal-Wallis test makes me suspicious that you relied on a statistical test of normality. Generally speaking, that is a very bad practice--especially if you carry out the test on the DV rather than on the residuals. But even if you carry it out on the residuals, tests of normality have far too little power when n is low, and have too much power as n increases. IMO, you would be much better off asking yourself if it is honest, fair, sensible, and defensible to use means and SDs for description of the DV (by groups). If the answer is yes, then ANOVA will likely be just fine. This issue has been discussed many times on RG. And I summarized my thoughts on it in a short conference presentation several years ago. You can view the slides here:
I doubt you can share your raw data, but can you at least share the n, mean and SD for each of your groups? Some kind of distribution plot (by group) would be helpful too--e.g., box-plots. Thanks for considering.
I hope this helps.