Stat1103 Research Design Overview

This document provides an overview of key concepts in research methods and statistics, including: 1) It defines research hypotheses, statistical hypotheses, and the steps of conducting research such as developing hypotheses, designing a study, collecting and analyzing data, and drawing conclusions. 2) It describes different types of research designs like longitudinal vs. cross-sectional and experimental vs. non-experimental designs. 3) It outlines concepts like probability and non-probability sampling, reliability and validity, and types of data. 4) It explains variables, different variable types, and how to present data using charts, tables and summaries. 5) It introduces common statistical tests and their assumptions like t-tests, chi-

Uploaded by

kk

OTHER

  Research hypothesis = testable statement that predicts how two or more variables are related
(if they are related at all)
o Those with severe traumatic brain injuries will have higher depressive symptoms than those
with mild or moderate injuries
  Statistical hypothesis = mathematically precise claim about a population, usually stated as an
absolute (there is or isn't an effect); it is non-directional and doesn't tease out which
group is affected, so on its own it might not answer the research hypothesis
o There is (or is not) a difference in depression scores between individuals with mild,
moderate and severe traumatic brain injuries
o H0 = No trend or pattern
o H1 = Distinct trend or pattern (difference between groups or relationship between variables)
o Directional hypotheses use one-tailed tests, non-directional use two-tailed (only looking
at two-tailed in this course: take the probability of one tail and double it)
 Step by step process
o Background reading to identify RQ and hypotheses > design study/identify methods and
measures > carry out study > generate descriptive statistics > identify variables involved >
identify the appropriate test > check assumptions for test > run test and conclude
 Types of Research Design
o Longitudinal vs Cross-sectional
 Longitudinal considerations; test-retest effects, participant attrition (drop-out),
time investment, retrospective data (recall bias)
 Time-points (waves): either predicting over time or measuring change; analysed with a
related-groups t-test (numeric) or McNemar's test (categorical)
o Experimental vs Non-experimental
 Criteria for a cause and effect: covariance rule (relationship), temporal
precedence (cause must come first), internal validity (excluding other causes)
 Non-experimental methods; single-variable, correlational research, quasi-
experimental
 Experimental methods: between-subjects (researcher-controlled, non-naturally
occurring, distinct groups), within-subjects (same group of people measured under
different 'conditions')
 Between-subjects needs more participants and is statistically less powerful,
but has a shorter experimental time (good for participants) and no carry-over effects
 Within-subjects needs half the participants and is statistically more powerful,
but doubles the experimental time and has potential carry-over effects
o Survey vs Observational
 Probability sampling
o Simple random, systematic (unbiased rule for selection, e.g. every kth person), stratified
random (population split into groups by a characteristic, then randomly sampled within each
group), cluster sampling (randomly select whole clusters and sample only those)
 Non-probability sampling
o Convenience sampling, snowball sampling
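
The probability sampling methods above can be sketched in Python. This is only an illustration using the standard `random` module: the function names, the numbered population of 100 and the two strata are assumptions made up for the example, not part of the course material.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Every unit has the same chance of being chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=0):
    """Random start, then every k-th unit, with k = len(population) // n."""
    rng = random.Random(seed)
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, seed=0):
    """Random sample drawn separately inside each stratum (group)."""
    rng = random.Random(seed)
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly pick whole clusters and keep everyone in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

# Hypothetical population: participants numbered 1 to 100
population = list(range(1, 101))
srs = simple_random_sample(population, 10)
sys_sample = systematic_sample(population, 10)

# Hypothetical strata and clusters built from the same numbering
strata = {"first_year": list(range(1, 61)), "second_year": list(range(61, 101))}
strat = stratified_sample(strata, 5)
clusters = {"class_a": [1, 2, 3], "class_b": [4, 5, 6], "class_c": [7, 8, 9]}
clus = cluster_sample(clusters, 2)
```

Convenience and snowball sampling have no random selection step, which is why they are not sketched here.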
  Reliability = consistency/dependability of measurement; reliability of a scale = does it measure
the construct consistently
  Validity (does it measure what it is meant to measure) = face validity, content validity,
criterion validity (concurrent vs predictive), convergent, discriminant
 Types of data = Qualitative vs Quantitative (or mixed) / Discrete vs Continuous
o Data analysis for qualitative is often described/explained, coded for common themes
(inter-rater reliability important) and then turned into variables for analysis
o Independent (predicts/causes an outcome) vs dependent (outcome), e.g. PAL attendance
(IV) on final grade (DV)
o Extraneous variable (anything other than IV/DV) and confounding variable (potentially
explain relationship)
o Nominal (unordered, categorical, arbitrary with no hierarchy, one example is
binary/dichotomous variable with only two groups/levels). Gender is nominal,
woman/man/other, totally distinct
o Ordinal (ordered categorical, hierarchy). Highest level of education.
o Interval (distance is meaningful, numeric scale with consistent differences between
points). Pain on a scale from 0-10.
o Ratio (numeric scale with consistent differences between points AND an absolute zero, where
0 means complete absence of the thing you're measuring; by contrast, 0 Celsius doesn't mean
there's no temperature, so Celsius is interval). Distance between two friends sitting next to
each other (cms).

 One categorical variable = Bar chart (frequency of categories, greater detail), pie chart (better
for comparing two or more categorical values)
o Two categorical variables = Contingency table (numeric summary), clustered bar chart
(graph)
  One numerical variable = Boxplot (comparative boxplot when comparing groups), histogram (no
comparison, shape of distribution shown)
o Two numerical variables = Correlation (numeric summary), scatterplot (graph)

  Numeric summaries: frequency tables with count, percentage or proportion. Mean or median,
SD or IQR, variance and range.
o With two categorical variables we create a cross-tabulation of one categorical variable
by another, allows comparison
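
As a rough illustration of the numeric summaries above, the Python standard library can produce each of them. The scores and answers below are invented for the example, and `statistics.quantiles` uses one of several quartile conventions, so the IQR may differ slightly from what Stata reports.

```python
import statistics
from collections import Counter

# Hypothetical numeric variable (e.g. a short test score)
scores = [4, 8, 6, 5, 3, 7, 9, 5]

mean = statistics.mean(scores)
median = statistics.median(scores)
sd = statistics.stdev(scores)            # sample standard deviation
variance = statistics.variance(scores)   # sample variance (SD squared)
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1                            # interquartile range
data_range = max(scores) - min(scores)

# Hypothetical categorical variable: frequency table with count and proportion
answers = ["Yes", "No", "Yes", "Yes", "No"]
counts = Counter(answers)
freq_table = {cat: (n, n / len(answers)) for cat, n in counts.items()}
```

Mean and SD go together for roughly symmetric data, while median and IQR are the usual pair when the distribution is skewed.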

  Interval estimate gives us a range of believable values for the parameter; this interval/range is
called a confidence interval. The conventional confidence level is 95%
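
A minimal sketch of a confidence interval for a mean is shown below. It assumes the normal approximation (critical value of about 1.96 for 95%) rather than the t distribution, so it will be slightly too narrow for small samples; the sample values and the function name `mean_ci` are made up for illustration.

```python
import math
import statistics
from statistics import NormalDist

def mean_ci(sample, level=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)      # about 1.96 when level = 0.95
    return mean - z * se, mean + z * se

# Hypothetical sample of scores
sample = [12, 15, 11, 14, 13, 16, 12, 15]
low95, high95 = mean_ci(sample, 0.95)
low99, high99 = mean_ci(sample, 0.99)
```

Raising the confidence level from 95% to 99% widens the interval, because a wider range is needed to be more confident of capturing the parameter.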

 Why 5% significance level? Minimize type 1 error (we reject the null hypothesis when we
shouldn't, false positive result, sample effect is due to chance)
o Type 2 error is not rejecting the null hypothesis when we should, our sample isn't
detecting a population effect, false negative result

  Power is the probability that we correctly reject a false null hypothesis. It is influenced by the
significance level (a larger significance level, e.g. 0.10 instead of 0.05, gives higher power), by
sample size (power increases with larger samples) and by the variability in the DV (the more
variable the DV, the harder it is to reject)
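
The three influences on power can be checked numerically. The sketch below assumes a two-tailed one sample z-test with a known standard deviation, which is the simplest case to compute by hand; the function name `z_test_power` and the numbers are illustrative assumptions only.

```python
import math
from statistics import NormalDist

def z_test_power(effect, sd, n, alpha=0.05):
    """Power of a two-tailed one sample z-test.

    effect = true difference between the population mean and the H0 value.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)     # critical value, about 1.96 for alpha = 0.05
    shift = effect * math.sqrt(n) / sd     # how far the true effect moves the z statistic
    # Probability that z lands in either rejection region
    return nd.cdf(-z_crit + shift) + nd.cdf(-z_crit - shift)

base = z_test_power(effect=5, sd=15, n=30)
bigger_sample = z_test_power(effect=5, sd=15, n=100)
bigger_alpha = z_test_power(effect=5, sd=15, n=30, alpha=0.10)
noisier_dv = z_test_power(effect=5, sd=30, n=30)
```

Comparing the four values shows the pattern in the notes: a larger sample or a larger significance level raises power, and a more variable DV lowers it. When the true effect is 0, the function returns the significance level itself, which is the Type 1 error rate.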
TESTS
 One sample t-test = single numeric variable
  One sample z-test = single numeric variable (population standard deviation known)

o A z value close to 0 has a large probability (p-value); a z value far from 0 has a small probability
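
The link between the z value and its probability can be sketched as follows, using the "one tail, then double it" rule for a two-tailed test. The function names and the example numbers (sample mean 105, population mean 100, population SD 15, n = 36) are assumptions for illustration.

```python
import math
from statistics import NormalDist

def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """How many standard errors the sample mean is from the H0 mean."""
    return (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))

def two_tailed_p(z):
    """Probability in one tail beyond |z|, doubled for a two-tailed test."""
    one_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * one_tail

z = z_statistic(sample_mean=105, pop_mean=100, pop_sd=15, n=36)
p = two_tailed_p(z)
```

A z of 0 gives a p-value of 1, and a z of about 1.96 gives a p-value of about 0.05, which is where the 5% cut-off sits.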


 Chi-square goodness of fit test = single categorical variable
 Correlation = two numeric variables
 Independent samples/two-sample t-test = one numeric variable, one categorical variable
 Paired samples t-test = one numeric variable, one categorical variable (related groups)

 Chi-square test of independence = two categorical variables
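
For a sense of what the chi-square test of independence computes, here is a small sketch that derives the expected counts, the chi-square statistic and Cohen's w from a contingency table. The 2x2 table is invented, and the p-value shortcut shown only applies to a 2x2 table (1 degree of freedom).

```python
import math
from statistics import NormalDist

def expected_counts(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def chi_square(table):
    """Sum over cells of (observed - expected)^2 / expected."""
    exp = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, exp)
               for o, e in zip(obs_row, exp_row))

def cohens_w(table):
    """Effect size for chi-square: sqrt(chi2 / N)."""
    n = sum(sum(row) for row in table)
    return math.sqrt(chi_square(table) / n)

def p_value_df1(chi2):
    """p-value for 1 degree of freedom only (chi2 is then a squared z value)."""
    return 2 * (1 - NormalDist().cdf(math.sqrt(chi2)))

# Hypothetical 2x2 table: rows = group, columns = outcome yes/no
observed = [[30, 10], [20, 40]]
expected = expected_counts(observed)
chi2 = chi_square(observed)
w = cohens_w(observed)
p = p_value_df1(chi2)
assumption_ok = min(min(row) for row in expected) >= 5
```

The `assumption_ok` check mirrors the rule that every expected frequency should be at least 5 before the test is trusted.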


  McNemar's test = two related categorical variables (same categories measured at different time points)
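
McNemar's test only uses the participants who changed category between the two time points. The sketch below assumes the basic form of the statistic without a continuity correction (software may apply one or use an exact version), and the counts are made up.

```python
import math
from statistics import NormalDist

def mcnemar_statistic(b, c):
    """b = changed from yes to no, c = changed from no to yes (discordant pairs)."""
    return (b - c) ** 2 / (b + c)

def p_value_df1(chi2):
    """p-value for a chi-square statistic with 1 degree of freedom."""
    return 2 * (1 - NormalDist().cdf(math.sqrt(chi2)))

# Hypothetical paired data: 15 people switched one way, 5 switched the other way
stat = mcnemar_statistic(b=15, c=5)
p = p_value_df1(stat)
```

Participants who gave the same answer at both time points do not enter the calculation, because they carry no information about change.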


 Effect sizes
o Cohen's d = 0.2 (small), 0.5 (medium), 0.8 (large)
o Cohen's w = 0.1 (small), 0.3 (medium), 0.5 (large)
o Pearson's r = 0.1 (small), 0.3 (medium), 0.5 (large)
o Pearson's correlation coefficient (absolute value, ignoring sign) = 0-.10 (weak-none), .10-.30
(weak), .30-.50 (moderate), .50-1.00 (strong)
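
The effect size cut-offs above can be applied with a small helper. This sketch computes Cohen's d with a pooled standard deviation, which is one common definition, and labels a value against the thresholds from the notes. The "negligible" label for values below the small cut-off and the sample data are assumptions added for the example.

```python
import math
import statistics

def cohens_d(group1, group2):
    """Difference in means divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd

def label_effect(value, small, medium, large):
    """Compare the absolute effect size with the small/medium/large cut-offs."""
    size = abs(value)
    if size >= large:
        return "large"
    if size >= medium:
        return "medium"
    if size >= small:
        return "small"
    return "negligible"  # assumed label, not from the notes

# Hypothetical scores for two independent groups
d = cohens_d([2, 4, 6], [1, 3, 5])
d_label = label_effect(d, 0.2, 0.5, 0.8)      # Cohen's d cut-offs
r_label = label_effect(-0.35, 0.1, 0.3, 0.5)  # Pearson's r cut-offs (sign ignored)
```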

  Descriptive statistics: understanding the sample (and comparison to population), the phenomena
and the statistical analyses needed
o Demographics = tab1 Gender Ethnicity (frequency table), summarize Age (mean, range,
std)
o Key Variables = tab1 control group, tab1 experimental group
o Histogram frequency charts
STATA
 label define yesno 1 "Yes" 2 "No"
o Defines a value label so that values previously shown as numbers (1, 2) display as Yes/No
 label values variable variable variable yesno
o Attaches the yes/no label to specific variables
 tab1 variable
o Frequency table
  tabulate variable1 variable2, expected row chi2
o Chi-square test of independence for two categorical variables; the expected option shows the
expected frequencies for the assumption check that each is at least 5
 recode variable (min/9 = 0) (10/max = 1), generate(new_variable_name)
o Separates data into 0 and 1 (under/over the limit) rather than a whole range of numbers
to summarize
 summarize variable, detail
o Descriptive statistics (mean, standard deviation, median, interquartile range, min and
max)

  swilk variable
o Shapiro-Wilk test of normality (assumption check for the one sample t-test)
 reshape wide tol, i(id) j(age)
o This will restructure the data so that we have separate columns of tolerance score for
each age group and person.
 robvar Mean_general_ds, by(Gender)
o Levene's test of equality of variances; the null hypothesis is that the variances are equal, so
the question is whether we reject it
