Science topic
Sample Size - Science topic
The number of units (persons, animals, patients, specified circumstances, etc.) in a population to be studied. The sample size should be big enough to have a high likelihood of detecting a true difference between two groups. (From Wassertheil-Smoller, Biostatistics and Epidemiology, 1990, p95)
Questions related to Sample Size
The study involves a specific age range (15-16).
I'm using a quasi-experimental design with a sample size of 18 students with locomotor disability. There is an experimental group and a control group in a pre-test/post-test design, and the experimental group receives a treatment (a teaching module) so that I can study its effectiveness. Should I formulate an alternative hypothesis or a null hypothesis for this?
I want to conduct a study where the population size is effectively infinite. In that case, how much data should I collect? Can anyone suggest a book on this?
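A minimal sketch of one standard answer to questions like this, Cochran's formula for an effectively infinite population, assuming a simple random sample of a proportion and the conservative choice p = 0.5 (the margins of error below are illustrative, not from the post):

```python
import math
from statistics import NormalDist

def cochran_n(margin_of_error, confidence=0.95, p=0.5):
    """Cochran's sample size for a very large (effectively infinite)
    population: n0 = z^2 * p * (1 - p) / e^2. Using p = 0.5 maximizes
    p * (1 - p), so it is the most conservative assumption."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # ~1.96 for 95%
    return math.ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)

print(cochran_n(0.05))  # 385 respondents at a +/-5% margin of error
print(cochran_n(0.03))  # 1068 at +/-3%: tighter precision is costly
```

Many textbooks quote 384 and 1067 for these cases because they round n0 to the nearest integer; taking the ceiling, as here, is the safer convention.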
Hi everyone,
I want to apply tests for special causes in a p-chart, but the sample size for each group is different, so I do not have fixed UCL and LCL. Can I still apply tests for special causes? Can I calculate the index when I do not have a fixed LCL and UCL?
Thanks,
Our study involves conducting multiple WBC counts of venous blood from healthy participants using a control and a test diluting fluid. We plan to compare and evaluate the results by checking how close the WBC count of a sample with the test fluid is to the control. We would like to know of studies or articles that state the minimum sample size that will still yield statistically significant results.
Please provide the formula for determining sample size, given a study population. Sampling tables by different scholars would also be of value to me.
Sample size determination seems to be a difficult job for me, especially because of its statistical complexity; it is often even harder for researchers who have no prior statistical knowledge. I want to learn the concepts in detail, with comprehensive examples. Please help me find such a source. Best regards.
Hi Research Gate people!
I am trying to determine the sample size, power and alpha boundary needed for my interim analysis for a registered report in a psych journal. I have read the paper by Lakens (2014) on sequential analysis, but am still pretty confused and would appreciate if anyone could link me up with any published social psych papers that has used sequential analyses in their design.
Cheers!
Hello.
I am examining the psychometric properties of the UGDS-GS, a test for measuring gender dysphoria in the trans community. The problem I am facing is that my EFA results are good but my CFA results are not. Is this due to the small sample size? My sample size is 140.
Greetings,
I am Mamun. I just want to know whether using a 7% margin of error will cause any problem in future analyses. I am using a stratified sampling technique, and with a 7% margin of error I get a smaller sample size, which would be more convenient for me.
Thank You.
I am quite confused about what formula to use to compute my sample size. I will be conducting a Sequential Explanatory design wherein my QUANT phase will make use of mediation analysis and my qual phase will be interpretative phenomenology. How can I determine the sample size? What is the best formula to use?
The pilot study is to detect the prevalence of pathogen in rodents.
Hello everyone,
I have a study with two objectives:
one is to assess a relationship using logistic regression;
the other is to compare two groups using the Mann-Whitney U test.
If I want to apply a sample size formula, do I need to calculate it separately for each objective?
Also, what is the minimum sample size for logistic regression?
Thanks.
Any articles on this topic would be appreciated. Thank you.
If the sample size is 300 and 5 questionnaires are discarded due to errors, should the 5 discarded questionnaires not be replaced?
Just curious
Please, I need assistance in calculating the sample size for an interventional study with three arms using G*Power.
Arm 1: Control group
Arm 2: Standard treatment
Arm 3: Standard plus advanced treatment.
I would be very glad if I could get a standard plan format for estimating that.
Thanks in advance for your assistance.
For my research, I will retrieve data for each firm (100 firms) over 5 years, leading to 500 data points.
Should this dataset size be sufficient for using a fixed effects model?
In survey sampling we need to calculate the desired sample size for different kinds of scenarios, so a good resource on this would be a great help.
Hello, I am trying to run a moderation analysis but will need to use G*Power to determine my sample size. Could anyone assist me with the following:
1) the parameters to set
2) Also what effect size/power should I use?
3) If my outcome measure is pre vs post test change on a questionnaire, would Hayes process macro be a good program to use or should I use SPSS instead?
IV: Intervention
M: Language (3 levels)
DV: Pre test vs post test of an outcome measure
Thank you.
What is your opinion on this?
CASE: If you need to include 100 people according to the power calculation and you expect 20% dropout, you need to include a total of 125 people.
QUESTION: Would you stop before 125 (100+20%) if you reach 100 participants that can be included in the analysis due to lower dropout than expected? Or would you continue to 125, i.e. include more than the power calculation?
Hi all
What could be the process of estimating sample size for a cognitive test being developed? The test is not an adaptation of any existing test and has to be used for patients with schizophrenia. In the first stage, I understand that a healthy participant group will be required to have the normative data followed by administration on the clinical group. Hence, what could be the process of estimating the sample size of the healthy control and the clinical groups?
Searching for articles or insight on determining the number of vignettes needed for a study using the Factorial Survey Method.
Dear colleagues,
I am looking for advice on the validation of a standard questionnaire that I intend to translate. The original version contains 40 items. Could you please tell me what sample size is required for validation?
Thank you in advance for your help.
I am planning to pass surface water samples through an HLB cartridge. The sample volume is large, so I want to know whether I can use the HLB cartridge, and if so, what the procedure is to clean it. Another query: do we need to concentrate the sample on a rotary evaporator after passing it through the cartridge?
I have two groups (A and B) each going through 5 repeated measures. I want to know how I can determine eta squared so that I can use it to calculate an a-priori sample size.
Thanks in advance
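For questions like the one above, note that G*Power's F-test routines take Cohen's f rather than eta squared, and the conversion is a one-liner. A small sketch; the benchmark values below are Cohen's conventional small/medium/large values, not figures from the post:

```python
import math

def eta_sq_to_f(eta_sq):
    """Convert (partial) eta squared to Cohen's f, the effect-size
    metric G*Power uses for ANOVA-family power analyses:
    f = sqrt(eta^2 / (1 - eta^2))."""
    return math.sqrt(eta_sq / (1 - eta_sq))

# Cohen's conventional benchmarks, expressed as eta squared
for label, e2 in [("small", 0.01), ("medium", 0.06), ("large", 0.14)]:
    print(f"{label}: eta^2 = {e2} -> f = {eta_sq_to_f(e2):.3f}")
```

Eta squared itself is usually estimated from a pilot study or a comparable published study; for a repeated-measures design, G*Power additionally asks for the number of measurements and the correlation among them.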
Hi everyone,
I have a prospective study; the aim is to apply multiple logistic regression.
The sample size for the current study was determined based on a previous study, which was designed to recruit 105 patients and analyzed 207 samples. However, I think this justification for the sample size is not correct.
The reason is that the new study includes secondary outcomes not covered by the previous research. Am I correct to say that a justification based on the previous research is not appropriate? If so, could you guide me in outlining the justification for the sample size?
Thanks,
The primary survey is meant to analyse the social vulnerability status of the population, where sample size is defined at taluk level of the district (study area). Considering factors like physiography and population density of the study area, kindly provide suggestions on how to select the geographical location of samples (households) preferably using GIS tools (other than fishnet and random sampling tools in ArcGIS) or through other scientific or systematic methods. TIA.
For example, If I have to understand the relation between the age and dependent variable (Likert scale data) and my sample size is uneven across the age categories (like in the age category 20-30 years old, there are 10 individuals while in the age category 30-40 years old, there are 20 individuals), what can be done in this case?
Hello
Is there any advanced estimation technique for conducting multiple regression with a small sample size?
My research plan is as follows:
5 organisations are taking part in the project. Their employees will get a questionnaire in the beginning, middle and end (t1, t2, t3) of the project.
However, we will not be recording participant data, and so it is not fully longitudinal and more of a cohort study I believe, because we cannot tell whether the same people take part at each time point.
My plan was to do some type of multilevel model with participants nested within organisations, and to measure the effect of time on 3 outcome variables measured using the questionnaire.
Now a reviewer is asking for a sample size calculation to see how many people I would need to recruit for adequate power.
There are so many different programs (free or paid) as well as R packages that can do these types of analyses, and I am not quite sure what to pick. Any advice would be helpful!
Because of the small size of the population, the researcher will use the entire population as the sample. According to…. … state that if the population size is small, the researcher can use the entire population. Researchers choose to study an entire population when the number of units that have the particular set of characteristics of interest is typically very small.
Usually, it is only straightforward to collect data from a whole population when it is small, accessible and cooperative.
In my experimental design I have 4 treatments, 3 replicates per treatment and 3 blocks. In each plot I measured whether a plant was infested or not (the "Infestate" variable). This measure was taken on 30 to 40 plants placed at the centre of the plot. Sampling was performed weekly (the "Data_rilievo" variable) on the same plants, although the sample size might vary if some plants die. Treatment does not influence plant death, so I removed from the dataset the observations that resulted in plant death.
I obtained the following dataset:
'data.frame': 2937 obs. of 15 variables:
$ ID_pianta : chr "_Pianta_1" "_Pianta_2" "_Pianta_3" "_Pianta_4" ...
$ Data_rilievo : POSIXct, format: "2023-11-14" "2023-11-14" "2023-11-14" ...
$ Blocco : num 2 2 2 2 2 2 2 2 2 2 ...
$ Trattamento : chr "Controllo" "Controllo" "Controllo" "Controllo" ...
$ Infestate : num 1 0 0 1 0 1 0 0 1 0 ...
I opted for a mixed-effect model with treatment as fixed effect, plant ID ("ID_pianta") as random effect to account for repeated measures, and block ("Blocco") as random effect.
And this is the result
> summary(model)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: Infestate ~ Trattamento + (1 | ID_pianta) + (1 | Blocco)
Data: data
AIC BIC logLik deviance df.resid
3835.8 3871.7 -1911.9 3823.8 2931
Scaled residuals:
Min 1Q Median 3Q Max
-2.1969 -1.0611 0.6139 0.8091 1.5079
Random effects:
Groups Name Variance Std.Dev.
ID_pianta (Intercept) 0.16880 0.4108
Blocco (Intercept) 0.09686 0.3112
Number of obs: 2937, groups: ID_pianta, 40; Blocco, 3
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.59808 0.20650 2.896 0.003776 **
TrattamentoLavanda -0.16521 0.11116 -1.486 0.137218
TrattamentoRosmarino -0.02389 0.11000 -0.217 0.828075
TrattamentoTimo -0.37733 0.11017 -3.425 0.000615 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) TrttmL TrttmR
TrttmntLvnd -0.266
TrttmntRsmr -0.269 0.502
TrattamntTm -0.269 0.499 0.504
I also wanted to check the model's predictive ability, using this code:
library(caret)
# Predicted probabilities from the fitted glmer model
predicted_probabilities <- predict(model, type = "response")
data$Infestate <- factor(data$Infestate, levels = c(0, 1))
# Convert predicted probabilities to binary predictions using a threshold
binary_predictions <- ifelse(predicted_probabilities > 0.5, 1, 0)
# Convert binary_predictions to a factor with levels 0 and 1
binary_predictions <- factor(binary_predictions, levels = c(0, 1))
# Create a confusion matrix
conf_matrix <- confusionMatrix(data$Infestate, binary_predictions)
print(conf_matrix)
And these are the results:
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 1811 28
1 751 55
Accuracy : 0.7055
95% CI : (0.6877, 0.7228)
No Information Rate : 0.9686
P-Value [Acc > NIR] : 1
Kappa : 0.0709
Mcnemar's Test P-Value : <2e-16
Sensitivity : 0.70687
Specificity : 0.66265
Pos Pred Value : 0.98477
Neg Pred Value : 0.06824
Prevalence : 0.96862
Detection Rate : 0.68469
Detection Prevalence : 0.69527
Balanced Accuracy : 0.68476
'Positive' Class : 0
It seems the model is good at predicting negatives, but it produces 751 false positives. How should I deal with this? Can the model be considered a good predictor? How can I increase its predictive ability?
In an experimental study, should the experimental and control groups be divided equally, for example 25 in the control group and 25 in the experimental group? Or can there be a difference of plus or minus 1?
Dear Friends!
The gold standard for identifying a human leukocyte antigen allele is SBT, or sequence-based typing. If I devise a new PCR test or LAMP test, how many known positive and negative samples (confirmed by SBT) should I take, and what formula should I use?
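One commonly cited approach for sizing diagnostic accuracy studies is Buderer's formula: pick an expected sensitivity (or specificity), a desired precision, and the prevalence of SBT-confirmed positives among the samples you can screen. A hedged sketch with purely illustrative numbers:

```python
import math
from statistics import NormalDist

def n_for_sensitivity(expected_se, precision, prevalence, confidence=0.95):
    """Buderer-style calculation: n_pos is the number of SBT-positive
    samples needed to estimate sensitivity within +/- precision;
    n_total scales that up by the prevalence of positives among
    screened samples. Applying the same formula to the expected
    specificity and (1 - prevalence) sizes the negatives."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_pos = math.ceil(z ** 2 * expected_se * (1 - expected_se) / precision ** 2)
    n_total = math.ceil(n_pos / prevalence)
    return n_pos, n_total

# Illustrative assumptions: 95% expected sensitivity, +/-5% precision,
# and 20% of screened samples carrying the target allele
print(n_for_sensitivity(0.95, 0.05, 0.20))  # (73, 365)
```

The analogous calculation for specificity, run on the SBT-negative samples, usually gives a different number; the study then needs to satisfy both.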
I am using a multi-stage sampling technique for my study in Kathmandu district, Nepal, which consists of 1 metropolitan city and 10 Municipality. I have randomly selected Kathmandu metropolitan city (KMC) (due to budget and logistic constraints). KMC further consists of 32 wards of which I have selected one Ward (No.16) randomly (resource constraints).
My sample size is around 437, calculated using the Taro Yamane formula. However, I do not have a list of households in Ward #16. In such a situation, which sampling technique would be appropriate?
If cluster sampling is to be used, how should the clusters be made as the clusters would not be homogeneous?
My students and I recently completed a pilot intervention study for anxiety in Division 1 volleyball players. We studied the players from one volleyball team over the course of a season, measuring self-reported anxiety before the season, 8 weeks into the season, and one-month after the end of the season. We had three groups, two intervention groups and one control group. The team has 21 players, so we started with 7 in each condition. Due to drop out across the intervention, the 8-week time point had 6, 5, and 4 players in each group, and the post-season time point had 6, 4, and 3 players in each group. My question is whether we could still use Hedge's g as an initial measure of "effect" difference in the various measures or whether our very small groups preclude us from using this statistic. Any references either way would be appreciated!
Hi, I'm looking to calculate the sample size for my study (n = 655).
This is an awareness study using the HAIS-Q questionnaire. I would like to know which settings in G*Power I can use for this calculation. Thanks in advance.
Can someone please assist me with the sample size calculation for an RCT with two groups, control and intervention? Is there a method utilizing ANCOVA, and which software is best? Assume I meet all the assumptions needed to run the ANCOVA.
Thank you kindly
Hashim
Good afternoon, everyone!
I plan to investigate the effect of Trendelenburg position on the quantitative measure "X". We have the following results:
1) Study 1: 30 patients in the supine position had a value of parameter "X" of 69 ± 10; the same patients in the Trendelenburg position had "X" = 75 ± 12. Values of "X" were recorded 1 minute after the start of the Trendelenburg position.
2) Study 2: 40 patients in the supine position: "X" = 86 ± 11; the same patients in the Trendelenburg position: "X" = 105 ± 16, again recorded 1 minute after the start of the Trendelenburg position.
In addition, the same 40 patients in the Trendelenburg position had "X" = 95 ± 14, but recorded 10 minutes after the start of the Trendelenburg position instead of 1 minute.
In fact, we have 3 observations for one outcome, but 2 of them are from the same cohort of patients in the Study 2.
How can we take this into account to adjust the "weight" of each of the two results in the second study?
Would it be correct to reduce the sample size for the second and third observations to 20 (40/2) when entering primary data for meta-analysis into programs such as Stata or RevMan?
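On the last point: because both Trendelenburg observations in Study 2 share the same supine baseline arm, counting that arm in full for both comparisons double-counts patients. Splitting the shared arm across the comparisons (the approach proposed in the question, and the one the Cochrane Handbook recommends for multi-arm trials) widens each standard error roughly as sketched below. Note the caveat: this sketch treats the arms as independent, whereas the data here are paired (same patients), so it is only a rough illustration:

```python
import math

def se_mean_diff(sd1, n1, sd2, n2):
    """Standard error of a difference in means between two
    independent arms: sqrt(sd1^2/n1 + sd2^2/n2)."""
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

# Study 2 at 1 minute: supine 86 +/- 11 (n = 40) vs Trendelenburg 105 +/- 16
full  = se_mean_diff(11, 40, 16, 40)  # shared supine arm counted in full
split = se_mean_diff(11, 20, 16, 40)  # shared supine arm split 40 -> 20
print(round(full, 2), round(split, 2))  # 3.07 vs 3.53
```

For paired pre/post data, a better route, when it can be obtained or back-calculated, is the within-patient mean change and its standard deviation.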
Can anyone help me with a biostatistics question? It is about finding the sample size from a power analysis. I have the variables; I just need assistance with the calculations.
How might I calculate the sample size for an experimental design in medical research involving peripheral sodium channel block and stabilization exercises?
Various formulas have been proposed in recent years for sample size determination. The Krejcie and Morgan sample size determination table has also been in use for some years now. I would like to know whether it is still ideal, and meets current trends and standards, to use the table as a reference for sample size determination. Thank you.
The reviewer wants linear regression, and I have one non-parametric dependent variable (sample size 10) and ten parametric parameters (each with a sample size of 5).
Can I perform Pearson correlation and linear regression?
I want to do a cross-sectional study in osteogenesis imperfecta patients. The prevalence is about 6.5 per 100,000 live births. Kindly help me calculate the sample size.
Beyond cross-sectional studies, I wish to learn how to calculate the sample size for interventional studies, as well as how to calculate sample size from the standard deviation, etc.
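For a rare-condition prevalence study like the osteogenesis imperfecta question above, the usual cross-sectional formula is n = z²p(1-p)/d². With a prevalence of 6.5 per 100,000, the absolute precision d has to be a fraction of the prevalence itself, which makes the required n very large; the fraction used below (half the prevalence) is an illustrative assumption:

```python
import math
from statistics import NormalDist

def n_prevalence(p, d, confidence=0.95):
    """Cross-sectional sample size to estimate a prevalence p to
    within +/- d (absolute precision): n = z^2 * p * (1 - p) / d^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)

p = 6.5 / 100_000  # prevalence quoted in the question above
d = 0.5 * p        # illustrative: estimate p to within +/-50% of itself
print(n_prevalence(p, d))  # roughly 236,000 participants
```

Numbers like this are why rare-disease studies typically recruit from registries or specialist clinics rather than sampling the general population.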
I want to calculate the sample size for my study on the education of nomadic children in Lahore, Pakistan, but no data on the population size are available. How can I calculate it, and can you give me a reference? Thanks.
The sample size for a bioequivalence (BE) study is directly proportional to power. However, the proposed/accepted power is generally 80-90%; considering a power above 90% may raise the issue of "forced bioequivalence" with regulators.
Can anyone please explain clearly the term "forced bioequivalence" associated with a larger sample size due to a high power consideration?
Dear researchers ,
Greetings. I am trying to collect data from an organization. Suppose the number of employees in the organization is 1000. What would be the optimum sample size from these 1000 people (I will not get a response from everyone, so what is the optimum number of responses to aim for)? I am gathering information on employees.
1. There are different formulas for the optimum sample size; can you please list them and recommend the best one (with recent publication references)?
2. Also, can you please let me know how I can reduce the possibility of bias in the dataset?
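For a known, finite population such as the 1000 employees here, the two formulas most often quoted are Cochran's n0 with a finite population correction and Taro Yamane's simplified version. A sketch of both; the 95% confidence level and 5% margin of error are illustrative defaults:

```python
import math
from statistics import NormalDist

def cochran_with_fpc(N, margin_of_error, confidence=0.95, p=0.5):
    """Cochran's infinite-population n0 corrected for a finite
    population of size N: n = n0 / (1 + n0 / N)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * p * (1 - p) / margin_of_error ** 2
    return math.ceil(n0 / (1 + n0 / N))

def yamane(N, margin_of_error):
    """Taro Yamane's simplified formula: n = N / (1 + N * e^2)."""
    return math.ceil(N / (1 + N * margin_of_error ** 2))

print(cochran_with_fpc(1000, 0.05))  # 278
print(yamane(1000, 0.05))            # 286
```

Since not everyone responds, the target is usually inflated by the expected response rate (e.g. 278 / 0.7 ≈ 397 invitations at a 70% response rate); probability sampling, plus comparing respondents to known staff demographics, is the usual guard against selection bias.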
What sample size is required to collect responses to a questionnaire?
One of the many assumptions of ANOVA is that the data follow a normal distribution. In biology, experiments are usually carried out in triplicate, so the available data set is very small, fewer than the 30 observations I understand to be the standard sample size for assessing normality. Anyone familiar with the use of statistics in phytochemistry, microbiology, pathology, or any relevant field, kindly answer the following questions.
1. Is it necessary to carry out tests of normality such as Shapiro-Wilk normality test to confirm if our data follows normal distribution?
2. With such a small sample size, is it possible to carry out these tests?
3. Is it true that tests for normality are unnecessary in biological experiments?
4. Can I safely assume that my data follow a normal distribution without any of these tests?
5. Which statistical software is best for a beginner?
Is it necessary to incorporate an estimated non-response rate when finalizing the sample size after conducting the power calculation? How should this be interpreted?
For instance,
Two-sample t test power calculation
n = 213.1237
d = 0.35
sig.level = 0.05
power = 0.95
alternative = two.sided
NOTE: n is number in *each* group
So, we need 214 for each group.
If we incorporate an estimated non-response rate (e.g., 80%),
then we need 214 / 0.2 = 1070 for each group.
By the end of the study, suppose we obtain 600 per group. Will the statistical analyses still give reliable and meaningful results?
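The quoted n = 213.12 can be reproduced to within a participant with the usual normal-approximation formula, which also makes the non-response inflation explicit. A sketch; the 80% non-response figure is taken from the example above:

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.95):
    """Normal-approximation per-group n for a two-sided, two-sample
    comparison of means with standardized effect size d:
    n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2.
    The exact t-based answer is slightly larger (213.12 above)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return 2 * ((z_alpha + z_power) / d) ** 2

n = n_per_group(0.35)                       # ~212.2 per group
n_recruit = math.ceil(math.ceil(n) / 0.20)  # inflate for 80% non-response
print(math.ceil(n), n_recruit)
```

On the final question: 600 analysable participants per group is still far above the ~214 the power calculation requires, so the analysis remains adequately powered; the inflation to 1070 only insures against non-response, it is not the number needed in the analysis itself.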
I am conducting a systematic review and meta-analysis of many studies with overlapping populations. The Review Manager 5.4 software does not require a sample size to perform a meta-analysis of hazard ratios, so I thought it may be possible to pool data from overlapping populations.
When undertaking clinical trials with limited sample size and repeated measures, the question arises: which is more advantageous—employing "Generalized Estimating Equations (GEE)" or opting for "non-parametric analysis"? We seek insights from experts to elucidate the respective strengths and limitations of each approach.
This is because the sample size is small and some specific characteristics are required.
The inclusion and exclusion criteria were indicated. I acknowledged that this might lead to limited generalisability of the findings.
Hello,
I plan to conduct a cross-sectional survey among mothers (caregivers) in one of the rural regions of my country. The survey will assess mothers'/caregivers' knowledge and practices regarding breastfeeding, child nutrition, and health. The study does not have a control group. Is this information enough to understand the formula required for the sample size calculation?
I really appreciate any help you can provide.
The Leslie Kish formula or Leslie Fischer's formula? What is the actual nomenclature? Are these names used interchangeably? The literature isn't helping to clarify this at all.
The type of mixed method is sequential explanatory. I'm planning to conduct an experiment, and then explore how they experienced the phenomenon through IPA. How many participants would be enough for the quantitative part of the research? And do I have to include every participant during the qualitative part? Could I just select a few? 3-6, maybe?
I'm doing a meta-analysis of continuous outcomes, and most of the data from the studies are reported as median and IQR (as the difference Q3-Q1). I already know the Wan (2014) and McGrath (2021) methods, but those are only applicable when the exact values of the quartiles are available. I was thinking of using IQR/1.35 to get the SD, assuming a normal distribution so that median = mean, and then running a sensitivity analysis to evaluate the effect of using those studies as presented.
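The 1.35 divisor mentioned above comes from the quantiles of the standard normal distribution, so it can be derived rather than remembered. A small sketch under the same normality assumption:

```python
from statistics import NormalDist

def sd_from_iqr(iqr):
    """Under normality the IQR spans the central 50% of the
    distribution, so SD = IQR / (z_0.75 - z_0.25) ~= IQR / 1.349."""
    width = NormalDist().inv_cdf(0.75) - NormalDist().inv_cdf(0.25)  # ~1.349
    return iqr / width

print(round(sd_from_iqr(1.0), 4))  # 0.7413, i.e. dividing by ~1.35
```

Flagging these converted studies in a sensitivity analysis, as proposed, is the standard safeguard; skewed outcomes, where median and mean genuinely differ, are where the approximation is weakest.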
I am conducting a cross-sectional study from NHANES database. I used the "Full.sample.2.year.interview.weight" to weight the data. However, I do not know where should I put the decimal to finish my data analysis.
The total sample size before weighting is 5639 participants.
If I put the decimal after 5 digits, the sample size decreases by about 50%, to around 2,700 participants.
If I put the decimal after 4 digits, the sample size increases by about 200%, to more than 10,000 participants.
I am attaching a screen shot after weighting the data.
I used SPSS for analysis, tab (data), weight by "Full.sample.2.year.interview.weight".
The population of this research study is students from a specific school, and 6 strata were identified. Three of the strata met the calculated required sample size; unfortunately, the other 3 did not, despite the surveys being sent out repeatedly. Additionally, the majority of the students were minors, and some parents did not give consent for the survey.
As a result, 3 strata were not able to meet the calculated sample size. Should we proceed with the statistical analyses (with 3 strata reaching the calculated size and the other 3 not) and report this as a limitation, or what other actions should we take instead?
I need to justify my sample size calculation without a pilot study. My n = 10, but how can I justify this number?
I am conducting a study to better understand the effects of flooding on the mental health of farming communities in my country. However, despite my efforts, I have been unable to find any suitable questionnaire to use in my survey. I'm hoping someone out there may have a relevant questionnaire I can adapt for my study. Alternatively, if anyone can provide me with relevant questions, that would also be very helpful. Additionally, if I were to conduct a group discussion, what would be the ideal sample size for each group?
Is there any formula to find the sample size needed to build machine learning or deep learning models for the detection, localization, segmentation, and classification of colon polyps?
Dear Friends,
I want you to read section 4.2 of the following paper and comment.
Happy New Year
Is ex ante power analysis the same as a priori power analysis or is it something different in the domain of SEM and multiple regression analysis? If it is different, then what are the recommended methods or procedures? Any citations for it?
Thank you for your precious time and help!
To understand the relationship between a change in the margin of error and the research sample size.
Is it possible to identify statistical differences between two groups when they have different sample sizes, as in the case of comparing CD (with 12 samples) to an experimental group (with 2 samples)?
I have a cross-sectional study including two groups. The sample size was calculated to be 73 in each group, but 80 participants were enrolled in each group because the statistician advised this would strengthen the results. During publication, a reviewer has commented that this is unethical and needs justification for the 14 additional participants. How can I respond?
Dear researchers I have one question for today
It is about the analysis of skip questions. Most of us are familiar with questions that require skipping. For example:
If the question is "Do you like fruit?", the answer will be yes or no.
The next question is then "If yes, which one? A. Banana B. Apple C. Orange".
When we analyze this kind of question it becomes difficult, because only the "yes" respondents are entered into the model, so the sample size will be smaller than what we calculated initially. How can we analyze these kinds of variables?
I am working to identify the impacts of last year's three-month prolonged flood on the mental health of the farming communities of Pakistan, particularly in rural Sindh province. I have two options: first, focus-group discussions (how large should the sample be?); second, a survey questionnaire (how many questionnaires would be sufficient?).
Also, how should I develop this kind of questionnaire? I have not found any relevant study that could be sufficiently helpful to follow.
Hello everyone,
I am trying to build a network analysis with EEG data, behavioural data, and other variables to predict learning of a task, but the sample size is really small (around n = 40).
I want to know whether it is feasible to do this with non-parametric bootstrapping, or whether there is another way to do it.
Cheers,
Laura Maldonado
In the original protocol, there should be a sample of 75 divided into three groups of 25 subjects each.
In the actual research, I included 78 subjects, divided into groups of 25, 26, and 27.
Does this violate the protocol?
Hi
I am trying to work out what the effect size was for my study.
I have the sample size, power, etc.
Is there a program that can work this out?
What is the most appropriate equation to determine the sample size?
I am conducting a study testing the effectiveness of a kind of group psychotherapy. There are 10 participants in my experimental group and 14 participants in my control group. At first, I planned random assignment to the groups, but because of the timing of the group therapy, 14 of the participants wanted to be in the waitlist control group. After creating the control group, I ran a t-test to compare the two groups on some of the study variables and found no significant difference between them. In summary, the groups have similar characteristics (e.g., age, educational level, romantic relationship status, mean scores), but the group sizes are different. Can I do my analysis with 10 people in the experimental group and 14 people in the control group? If not, how do I remove the 4 extra people from the control group?
I am a research scholar from India, doing my research in human resource management.
I am currently working on research into how a manager's emotional intelligence influences service employees' performance, where:
Manager's Emotional Intelligence (MEI) - 1 Dependent Variable - Qualitative Interview and Quantitative Survey
The MEI is measured from:
Qualitative Interviews from Managers and Perception Based Survey from the service employees about MEI. (Data Triangulation)
Service Employee's Performance (SEP) - 7 Independent Variables - Quantitative Survey
The SEP is measured from:
7 Variables of SEP from Customers as a Quantitative Survey.
Could someone help me out?
Hello, my name is Hue Man; I am a senior nursing student at Tra Vinh University. I am doing my graduation thesis on "Research on the rate of depression in the elderly using the GDS-5/15 scale and some related factors at the medical examination department". The problem I am currently facing is the sample size calculation formula and how to process the data. Can anyone give me advice or a solution? I sincerely thank you.
Dear all,
As part of my bachelor thesis, I am using a simple mediation with a bootstrapping procedure. I would now like to calculate the power of my analysis. For guidance, I have read the 2007 paper "Required Sample Size to Detect the Mediated Effect" by Matthew S. Fritz and David P. MacKinnon (DOI: 10.1111/j.1467-9280.2007.01882.x). I would like to use the "empirical estimates of sample sizes needed for .8 power" (Table 3) to assess the power of my mediation analysis (using bootstrapping). As far as I know, the power of a test always depends on the sample size, the effect size, and the type I error. Unfortunately, however, I could find no information on the probability of type I error on which Fritz and MacKinnon's percentile bootstrap and bias-corrected bootstrap power analyses are based. I would now like to ask if anybody knows which type I error is used in the paper. It would be very helpful if somebody could help me with this question.
Kind regards
Leonie Aderhold
In our study, we reached 200 participants and have identified various criteria for conducting exploratory factor analysis, such as the sample size being 5-10 times the number of items, the KMO value, etc. However, I couldn't find a satisfactory reference in the literature regarding the sample size for confirmatory factor analysis (CFA). In this context, the main question is: is a sample size of 200 sufficient for a 5-factor, 40-item scale? Another consideration is that, while sample size is important, if the fit indices take sample size into account when assessing construct validity and the fit values are good, it might be reasonable to perform CFA with 200 participants. I look forward to responses from experts.
Hello Reseachgate community.
I have perused several recent sources to either find data or power tables missing and there I cannot seem to find the best source for an appropriate minimum sample size for a conditional process (moderated mediation) analysis.
With 4 variables (3 predictors, 1 outcome) and assuming power .80 with alpha .05 and small to medium effect sizes between all (i.e. 0.30) could anyone point me in the right direction please?
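One common simplification is to treat the conditional-process model as a multiple regression and power the omnibus F test via the noncentral F distribution, the way G*Power's "linear multiple regression" routine does. A sketch under assumed inputs: 3 predictors and a medium effect of f² = 0.15 (both are illustrative choices, not values prescribed here; interaction terms would add predictors and change the df):

```python
from scipy.stats import f as f_dist, ncf

def regression_sample_size(f2, k, alpha=0.05, target_power=0.80):
    """Smallest n at which the omnibus F test of a k-predictor regression
    reaches the target power, using noncentrality lambda = f2 * n."""
    n = k + 2
    while True:
        df1, df2 = k, n - k - 1
        crit = f_dist.ppf(1 - alpha, df1, df2)          # critical F value
        power = 1 - ncf.cdf(crit, df1, df2, f2 * n)     # noncentral F power
        if power >= target_power:
            return n, power
        n += 1

print(regression_sample_size(f2=0.15, k=3))
```

For the indirect-effect (mediation) part specifically, simulation-based power analysis is generally preferred over this regression shortcut.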
Our research is quasi-experimental. There are two groups to be tested under different teaching approaches; however, we don't know how many participants should be in each group.
We measured three aspects (i.e. variables) of self-regulation. We have 2 groups and our sample size is ~30 in each group. We anticipate that three variables will each contribute unique variance to a self-regulation composite. How do we compare if there are group differences in the structure/weighting of the composite? What analysis should be conducted?
I need this to justify a survey sample Size.
Hi all.
I would like to know how to calculate the required sample size for the sequential mediation model (X->M1->M2->Y) if there is no previous research showing the correlation between them. Are there any ways to calculate the sample size by utilizing the information on the number of variables?
Thanks!
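With no prior correlations to plug in, one defensible route is a Monte Carlo power analysis: assume plausible standardized path values, simulate data from the serial model, and count how often the mediation is detected. The sketch below is a deliberate simplification: it uses the joint-significance test (all three paths significant) as a cheap stand-in for bootstrap CIs, and bivariate tests instead of the full partial regressions; the path values of 0.3 are assumptions to be replaced:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def serial_mediation_power(n, a=0.3, d=0.3, b=0.3, alpha=0.05, reps=1000):
    """Monte Carlo power for X -> M1 -> M2 -> Y via the joint-significance
    test. a, d, b are assumed standardized paths X->M1, M1->M2, M2->Y."""
    hits = 0
    for _ in range(reps):
        x = rng.standard_normal(n)
        m1 = a * x + rng.standard_normal(n)
        m2 = d * m1 + rng.standard_normal(n)
        y = b * m2 + rng.standard_normal(n)
        # bivariate tests kept simple here; a full version would fit the
        # partial regressions and bootstrap the indirect effect a*d*b
        ps = [stats.pearsonr(x, m1)[1],
              stats.pearsonr(m1, m2)[1],
              stats.pearsonr(m2, y)[1]]
        hits += max(ps) < alpha
    return hits / reps

print(serial_mediation_power(100))
```

Increasing n until the returned power reaches .80 gives a rough a priori sample size under the assumed paths.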
Have you ever tried to demonstrate that a drug has no effect or that a new teaching method is not superior to an old one? Let's delve into the intricacies of constructing and testing the null hypothesis in such scenarios. In this context, is your hypothesis the null hypothesis? Share your experiences, insights, and methodologies for crafting and testing hypotheses aimed at proving 'no effect.' How do you calculate sample sizes when seeking an effect size of zero?
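A study cannot be powered for an effect size of exactly zero; the standard move is to pre-specify equivalence bounds and run two one-sided tests (TOST), then choose the sample size so the TOST has adequate power when the true difference is zero. A minimal sketch of the TOST itself; the ±0.5 bounds and the simulated data are illustrative assumptions:

```python
import numpy as np
from scipy import stats

def tost_ind(x1, x2, low, high):
    """Two one-sided t-tests (TOST) for independent means.
    H0: the true mean difference lies outside [low, high]."""
    n1, n2 = len(x1), len(x2)
    diff = np.mean(x1) - np.mean(x2)
    sp = np.sqrt(((n1 - 1) * np.var(x1, ddof=1)
                  + (n2 - 1) * np.var(x2, ddof=1)) / (n1 + n2 - 2))
    se = sp * np.sqrt(1 / n1 + 1 / n2)
    df = n1 + n2 - 2
    p_lower = stats.t.sf((diff - low) / se, df)    # H0: diff <= low
    p_upper = stats.t.cdf((diff - high) / se, df)  # H0: diff >= high
    return max(p_lower, p_upper)  # equivalence claimed if this < alpha

rng = np.random.default_rng(1)
x1 = rng.normal(0.0, 1.0, 500)   # e.g. control scores (simulated)
x2 = rng.normal(0.0, 1.0, 500)   # e.g. new-method scores (simulated)
print(tost_ind(x1, x2, -0.5, 0.5))
```

Equivalence is claimed when the larger of the two one-sided p-values falls below alpha; the tighter the bounds, the larger the required sample.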
Hi, everyone:
I'm planning to conduct a mixed-method study that will begin with a questionnaire followed by qualitative interviews. I have a population of around 200 students, and I would like to determine the optimal sample size for both the questionnaire and the interviews. How many students should I select for the questionnaire, and how many should I choose for the interviews?
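For a known, finite population of about 200 students, two widely used formulas give similar answers for the questionnaire; a quick sketch (95% confidence and ±5% precision are the conventional assumptions used here):

```python
import math

def yamane(N, e=0.05):
    """Yamane's simplified formula for a finite population N, precision e."""
    return math.ceil(N / (1 + N * e**2))

def cochran_fpc(N, p=0.5, d=0.05, z=1.96):
    """Cochran's formula with finite population correction
    (p=0.5 is the most conservative assumption)."""
    n0 = z**2 * p * (1 - p) / d**2
    return math.ceil(n0 / (1 + (n0 - 1) / N))

print(yamane(200), cochran_fpc(200))
```

For the interview arm, formulas do not really apply; qualitative guidelines based on data saturation (often on the order of 10-20 interviews, depending on homogeneity) are the usual justification.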
What is the minimum sample size for sieve analysis and how do you calculate particle size in sieve analysis?
Dear RG community
I've coded N = 500 professional development courses for teachers according to topics (0 = was not part of the course; 1 = was part of the course). I'd like to have the reliability of my coding checked by a second rater. What is the appropriate measure under these circumstances and how many of the 500 courses would a second rater have to rate?
So far, I've come to the conclusion that Cohen's kappa may not be the preferred choice, but rather the Matthews correlation coefficient (MCC). Perhaps even simple percent agreement would be suitable in my case, since there are only two raters in total and binary coding? I've been unable to find anything on the minimum sample size.
Any help is greatly appreciated.
Best
Marcel
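For two raters and binary codes, percent agreement, Cohen's kappa, and the MCC can all be computed directly, which makes it easy to report all three per topic; kappa and MCC are usually close for binary data. A small self-contained sketch (the ten example codes are made up):

```python
import numpy as np

def agreement_stats(r1, r2):
    """Percent agreement, Cohen's kappa, and MCC for two binary raters."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    po = np.mean(r1 == r2)                      # observed agreement
    p1, p2 = r1.mean(), r2.mean()
    pe = p1 * p2 + (1 - p1) * (1 - p2)          # chance agreement
    kappa = (po - pe) / (1 - pe)
    tp = np.sum((r1 == 1) & (r2 == 1))
    tn = np.sum((r1 == 0) & (r2 == 0))
    fp = np.sum((r1 == 0) & (r2 == 1))
    fn = np.sum((r1 == 1) & (r2 == 0))
    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return po, kappa, mcc

r1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]   # rater 1 (example data)
r2 = [1, 1, 0, 0, 1, 1, 1, 0, 0, 0]   # rater 2 (example data)
po, kappa, mcc = agreement_stats(r1, r2)
print(round(po, 2), round(kappa, 2), round(mcc, 2))  # 0.9 0.8 0.82
```

On the subsample question: there is no firm minimum, but a common practice (not a hard rule) is to have the second rater double-code a random 10-20% of the material, i.e. roughly 50-100 of the 500 courses.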
Hi guys,
I am interested in conducting a longitudinal study investigating the development of within-connectivity of the DMN in autistic and non-autistic children. Scans will take place at ages 8, 13 and 18.
My dependent variable will be DMN within-connectivity: pairwise correlations between the time series of regions within each hemisphere, and between the hemispheres, will be averaged (ROI-to-ROI connectivity), and these averages will then be combined into a single within-connectivity score.
My fixed variables will be time and autism as well as their interaction. My random effects will be subject-specific intercepts and slopes and my covariates will be gender and education (I am expecting to add more covariates).
I am curious as to how I calculate a sample size a priori for this. I know I need to define an effect size, which I have a rough idea of, and of course power and alpha. Can this be done in G*Power? (To my knowledge it can't, but I may be missing something.)
Any help would be useful.
Thanks in advance
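G*Power indeed has no module for linear mixed models; the usual recommendation is simulation-based power analysis (e.g. the simr package in R, or a hand-rolled simulation). Below is a deliberately simplified two-stage stand-in: fit each child's slope across the three scan ages, then t-test slopes between groups, which approximates the group × time interaction test. All numeric values (slope difference, slope SD, noise SD) are assumptions to be replaced with your expected effect and variance components:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def interaction_power(n_per_group, slope_diff=0.3, reps=1000,
                      times=(0.0, 1.0, 2.0), sd_slope=0.5, sd_noise=0.5):
    """Simulated power for a group x time interaction via a two-stage
    approach: per-subject OLS slopes, then an independent-samples t-test."""
    t = np.array(times)
    hits = 0
    for _ in range(reps):
        # subject-specific true slopes, shifted in group A only
        slopes_a = slope_diff + sd_slope * rng.standard_normal(n_per_group)
        slopes_b = sd_slope * rng.standard_normal(n_per_group)
        est = []
        for slopes in (slopes_a, slopes_b):
            y = slopes[:, None] * t + sd_noise * rng.standard_normal(
                (len(slopes), len(t)))
            est.append(np.polyfit(t, y.T, 1)[0])  # per-subject OLS slope
        hits += stats.ttest_ind(est[0], est[1])[1] < 0.05
    return hits / reps

print(interaction_power(30))
```

A full version would fit the mixed model (random intercepts and slopes, plus covariates) in every replicate and test the interaction term directly, at the cost of much longer run times.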
I want to ask about the use of parametric and non-parametric tests when we have an enormous sample size.
Let me describe a case for discussion:
- I have two groups of samples of a continuous variable (let's say: Pulse Pressure, so the difference between systolic and diastolic pressure at a given time), let's say from a) healthy individuals (50 subjects) and b) patients with hypertension (also 50 subjects).
- there are approx. 1000 samples of the measured variable from each subject; thus, we have 50*1000 = 50000 samples for group a) and the same for group b).
My null hypothesis is: that there is no difference in distributions of the measured variable between analysed groups.
I calculated two different approaches, providing me with a p-value:
Option A:
- I took all samples from group a) and b) (so, 50000 samples vs 50000 samples),
- I checked the normality in both groups using the Shapiro-Wilk test; both distributions were not normal
- I used the Mann-Whitney test and found significant differences between distributions (p<0.001), although the median value in group a) was 43.0 (Q1-Q3: 33.0-53.0) and in group b) 41.0 (Q1-Q3: 34.0-53.0).
Option B:
- I averaged the variable's values within each participant (so, 50 values in group a) and 50 values in group b))
- I checked the normality in both groups using the Shapiro-Wilk test; both distributions were normal,
- I used Student's t-test and obtained a p-value of 0.914, with median values 43.1 (Q1-Q3: 33.3-54.1) in group a) and 41.8 (Q1-Q3: 35.3-53.1) in group b).
My intuition is that I should use option B and average the signal before the testing. Otherwise, I reject the null hypothesis, having a very small difference in median values (and large Q1-Q3), which is quite impractical (I mean, visually, the box plots look very similar, and they overlap each other).
What is your opinion about these two options? Are both correct but should be used depending on the hypothesis?
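The core problem with Option A is not the sample size per se but pseudoreplication: the ~1000 samples within a subject are not independent, so pooling them drastically understates the standard error and inflates the type I error. A simulation with a true null at the subject level makes this concrete (all means and SDs are invented for illustration):

```python
import numpy as np
from scipy import stats

def reject_rates(n_subj=50, n_rep=1000, n_sims=100, alpha=0.05):
    """Both groups are drawn from the SAME subject-level distribution,
    so any rejection is a false positive. Compare the two approaches."""
    pooled = means = 0
    for seed in range(n_sims):
        rng = np.random.default_rng(seed)
        mu_a = rng.normal(43, 8, (n_subj, 1))          # true subject means
        mu_b = rng.normal(43, 8, (n_subj, 1))
        a = mu_a + rng.normal(0, 5, (n_subj, n_rep))   # within-subject samples
        b = mu_b + rng.normal(0, 5, (n_subj, n_rep))
        # Option A: pool all n_subj * n_rep pseudo-replicated samples
        pooled += stats.mannwhitneyu(a.ravel(), b.ravel(),
                                     alternative="two-sided")[1] < alpha
        # Option B: one mean per subject
        means += stats.ttest_ind(a.mean(axis=1), b.mean(axis=1))[1] < alpha
    return pooled / n_sims, means / n_sims

print(reject_rates())
```

With these settings the pooled test rejects in the large majority of simulations even though the null is true, while the per-subject-means test stays near the nominal 5%. So Option B (or, more generally, a mixed model with subject as a random effect) is the defensible choice.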
We are currently conducting qualitative research on the effects of influencer marketing on purchase behavior. However, before that, our panel has suggested conducting a pre-survey to identify (1) the products endorsed by influencer marketing that students mostly purchase, and (2) the social media platforms on which students purchase influencer-endorsed products. The pre-survey will help us narrow down our scope by focusing on a specific social media platform and product.
Our question is how can we determine the sample size?
Thank you in advance!
We are trying to conduct a meta-analysis. One of the studies is providing Nagelkerke's R2 and p-value (and sample sizes for two groups) but not the actual effect size. Is there a way to convert this data to an effect size that we can use in a meta-analysis? Thanks!
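There is no exact conversion, because Nagelkerke's R² is a pseudo-R², but two approximate routes are sometimes used and should be flagged as approximations in the meta-analysis: treat the R² like a variance-explained measure and compute Cohen's f², or recover a standardized mean difference from the reported p-value and group sizes. A sketch (the 0.20 and p = .01 inputs are placeholders):

```python
import math
from scipy import stats

def f2_from_r2(r2):
    """Cohen's f2 from a variance-explained R2; only approximate when
    applied to a pseudo-R2 such as Nagelkerke's."""
    return r2 / (1 - r2)

def d_from_p(p, n1, n2):
    """Recover Cohen's d from a two-sided p-value and group sizes,
    assuming it came from an independent-samples comparison."""
    t = stats.t.isf(p / 2, n1 + n2 - 2)
    return t * math.sqrt(1 / n1 + 1 / n2)

print(f2_from_r2(0.20))                  # 0.25
print(round(d_from_p(0.01, 50, 60), 2))
```

Contacting the authors for the raw odds ratio or regression coefficient is usually the cleaner option, with these conversions as a fallback plus a sensitivity analysis.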
Q 1. Which model should be preferred (FE/RE) when conducting a meta-analysis for pooling prevalence?
Q2. In the RE model, why do all studies receive equal weight irrespective of sample size or confidence intervals?
Q3. When we use the FE model, all the studies receive weight according to their sample size or CI. But my question is, is it correct to use the Fixed Effect Model?
Hi! Can I ask everyone a favor? I really need to know the required sample size, and any citations will be much appreciated.
I am struggling to define the specimen sample size for tension-compression fatigue testing of polymer matrix composites. Is there an ASTM standard for this? If not, what specimen size should I use to avoid buckling of the sample? I want to test at R = -1. For tension-tension fatigue testing I will be following the ASTM D3479 standard.
I have a sample composed of two subsamples (say, public and private companies), and a certain relationship is insignificant in both subsamples. If this relationship is significant in the pooled sample, how can this be interpreted? Could this be because of the large sample size of the pooled sample?
Sample size requirements for Structural Equation Models (SEM)?
I am currently conducting a cross-sectional study investigating the impact of different orthodontic appliances on oral health-related quality of life. The study is divided into four groups: two different orthodontic appliance groups, a non-treatment group, and a completely healthy patient group. The outcome is based on questionnaire scores, which are continuous variables.
The study has been submitted, and we have received a major revision request. One of the reviewers commented: "Sample size calculation formation is not suitable for a comparative study among >2 groups."
In the initial design, we planned to use one-way ANOVA, treating the outcome as continuous. However, since the questionnaire scores did not conform to a normal distribution, we eventually used the Kruskal-Wallis (K-W) test. The formula currently used for sample size calculation is: n = 2(z_α + z_β)² σ² / δ².
I am seeking advice on how to address the reviewer’s comment concerning the sample size calculation in a comparative study among multiple groups with a non-normally distributed continuous outcome. Any suggestions or references to guide the appropriate sample size calculation method or statistical approach for this study would be greatly appreciated.
Thank you in advance for your valuable input!
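For more than two groups, the usual replacement for the two-sample formula is to power the one-way ANOVA via the noncentral F distribution with Cohen's f, then inflate the result for the Kruskal-Wallis test (a common rule of thumb divides N by the worst-case asymptotic relative efficiency of 0.864, i.e. about 15% more participants). A sketch with an assumed medium effect f = 0.25; replace it with an f derived from your pilot means and SDs:

```python
from math import ceil
from scipy.stats import f as f_dist, ncf

def anova_total_n(f, k, alpha=0.05, target_power=0.80):
    """Smallest total N for a one-way ANOVA with k groups and Cohen's
    effect size f, using noncentrality lambda = f^2 * N."""
    N = k + 2
    while True:
        df1, df2 = k - 1, N - k
        crit = f_dist.ppf(1 - alpha, df1, df2)
        power = 1 - ncf.cdf(crit, df1, df2, f**2 * N)
        if power >= target_power:
            return N, power
        N += 1

N, power = anova_total_n(f=0.25, k=4)
print(N, ceil(N / 0.864))  # ANOVA total N, then inflated for Kruskal-Wallis
```

Citing this ANOVA-based calculation plus the non-parametric inflation factor directly addresses the reviewer's objection that a two-group formula was used for a four-group comparison.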