What are the 4 conditions required for a binomial distribution?
- Fixed number of trials, n.
- Each trial has exactly two possible outcomes: success or failure.
- Trials are independent
- Probability of success, p, at each trial is constant.
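When all four conditions hold, the number of successes X follows B(n, p). The sketch below computes the probabilities from the standard binomial formula; the values n = 10 and p = 0.5 (ten tosses of a fair coin) are made up for illustration.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p): k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical example: 10 tosses of a fair coin, "success" = heads.
n, p = 10, 0.5
probabilities = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(binomial_pmf(3, n, p))  # P(X = 3) = 120/1024 = 0.1171875
print(sum(probabilities))     # the probabilities over k = 0..n sum to 1
```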
What are the 3/4 conditions for a Poisson distribution?
- Events must occur singly in space or time
- (Within a fixed length in space or time)
- Events must occur independently of each other
- Events must occur at a constant rate, in the sense that the mean number of occurrences in an interval is proportional to the length of the interval.
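A minimal sketch of the Poisson probabilities, including how the mean scales with the length of the interval. The rate of 2 events per hour is a hypothetical figure, not something from the notes.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Po(lam), where lam is the mean number of events."""
    return exp(-lam) * lam**k / factorial(k)

# Hypothetical constant rate: 2 events per hour.
RATE_PER_HOUR = 2.0

def mean_count(hours: float) -> float:
    """Mean number of events is proportional to the interval length."""
    return RATE_PER_HOUR * hours

# Probability of no events in 1 hour versus in 3 hours.
print(poisson_pmf(0, mean_count(1)))  # uses mean 2
print(poisson_pmf(0, mean_count(3)))  # uses mean 6
```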
What is meant by population?
- A collection of individual people or items.
- A population may be of finite or infinite size (as in impossible to count).
What is a finite population and an infinite population?
- Finite: A population in which each individual member can be given a number.
- Infinite: A population in which it is impossible to number each individual member
What is the difference between a census and a sample?
- Census: information obtained from all members of the population.
- Sample: selection of individual members or items of the population.
What is simple random sampling?
A simple random sample, of size n, is one taken so that every possible sample of size n has an equal chance of being selected.
What is meant by 'sampling unit'?
- The individual units of a population.
- e.g. John, Sandra and Tony could be 3 of the sampling units making up the population of students in a school.
What is meant by a 'sampling frame'?
- A list of the items or people (sampling units) used in practice to represent a population from which a sample is taken.
- Could take a variety of forms - a list, a map, file, database, index.
- How well a sampling frame covers the population, and how accurate it is, are important because the sampling frame is the basis of any sample drawn.
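As a rough illustration, a simple random sample can be drawn from a sampling frame held as a list. The names and the helper `simple_random_sample` are hypothetical; `random.Random.sample` picks without replacement, so every subset of size n is equally likely.

```python
import random

# Hypothetical sampling frame: a list of students standing in for the population.
sampling_frame = ["John", "Sandra", "Tony", "Priya", "Ahmed", "Lucy", "Wei", "Grace"]

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct sampling units; each possible sample of size n is equally likely."""
    rng = random.Random(seed)  # a seed is only used to make the draw repeatable
    return rng.sample(frame, n)

sample = simple_random_sample(sampling_frame, 3, seed=1)
print(sample)
```

If the frame misses part of the population, the units left out can never be chosen, which is one way bias gets in.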
What are the advantages of taking a census?
- Every single member of the population is used/studied.
- It is unbiased
- Gives an accurate answer
What are the disadvantages of taking a census?
- It takes a long time to do.
- It is costly.
- It is often difficult to ensure that the whole population is surveyed.
What are the advantages of taking a sample?
- If a population is large and well mixed a sample will be representative of the whole population.
- Cheaper than census
- Advantageous when testing of items results in their destruction.
- Data is generally more readily available.
What are the disadvantages of taking a sample?
- There is an uncertainty in that there will be a natural variation between the individual sampling units.
- Bias: anything that occurs when taking a sample which prevents it from being truly representative of the population from which it is taken.
- Bias can occur when incomplete sampling frame is used. (Or if you get responses only from people that have a particular interest in the study).
- Bias can also happen easily because of a fault in the way the person takes a sample. (If they let their personal feelings influence the choice of who to sample).
What is a simple random sample and the conditions for it? (not in spec)
- A simple random sample, of size n, is one taken so that every possible sample of size n has an equal chance of being selected.
- It consists of observations X1, X2, ..., Xn, where the Xi:
- are independent random variables
- have the same distribution as the population
What is a statistic?
A quantity calculated solely from the observations in a sample. It does not involve any unknown parameters, i.e. a statistic is a numerical property of a sample.
What is a parameter? (not in spec but should know).
- A numerical property of a population.
- E.g. a sample of a school's population is taken to see who wants a new school uniform. One possible parameter of interest is the proportion of the whole school population who want a new uniform.
Define population parameter.
Any characteristic of a population which is measurable.
What is meant by a sampling distribution of a statistic?
- The sampling distribution of a statistic gives all the possible values of the statistic and the probability that each would happen by chance alone.
- If we repeatedly take samples from a population and calculate the same statistic each time, there is a range of values that the statistic can take. The statistic itself has its own distribution, which we call the sampling distribution.
You will need to be able to calculate the sampling distribution of a statistic. It's hard to put any rules down or anything, so just make sure you practice this a lot.
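One way to practise is to list every possible sample, work out its probability, and add up the probabilities for each value of the statistic. The sketch below does this by brute force. The population (value 1 with probability 3/4, value 2 with probability 1/4), the sample size of 2 and the function name are all made up for illustration, and the observations are assumed independent.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def sampling_distribution(population, n, statistic):
    """Enumerate every ordered sample of size n and total the probability
    of each value of the statistic. `population` maps value -> probability."""
    dist = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for sample in product(population.items(), repeat=n):
        values = [value for value, _ in sample]
        prob = Fraction(1)
        for _, p in sample:
            prob *= p  # independent observations, so probabilities multiply
        dist[statistic(values)] += prob
    return dict(dist)

def mean(values):
    return Fraction(sum(values), len(values))

# Hypothetical population: value 1 with probability 3/4, value 2 with probability 1/4.
population = {1: Fraction(3, 4), 2: Fraction(1, 4)}
dist = sampling_distribution(population, 2, mean)

for value, prob in sorted(dist.items()):
    print(value, prob)
# 1 9/16
# 3/2 3/8
# 2 1/16
```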
What is meant by a 'hypothesis'?
- A statement made about the value of a population parameter that we wish to test by collecting evidence in the form of a sample.
- The evidence comes from a sample which is summarised in the form of a statistic called the test statistic.
What is a hypothesis test?
- "A mathematical procedure to examine a value of a population parameter proposed by the null hypothesis H0, compared to the alternative hypothesis H1".
- A statistical test used to decide whether the null hypothesis is rejected or not rejected.
- This decision is based upon the idea that some values of X are unlikely under the null hypothesis and would be better explained by the alternative hypothesis.
- In hypothesis testing, two hypotheses are used; the null hypothesis and the alternative hypothesis.
What is a test statistic?
In any hypothesis test the evidence comes from a sample which is summarised in the form of a test statistic.
What is the critical region?
- The range of values of the test statistic that would lead to rejection of H0 (the null hypothesis).
- The region that contains the values that collectively have a small chance of happening under the null hypothesis.
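A minimal sketch of finding a critical region for a one-tailed binomial test (H0: p = p0 against H1: p > p0). The numbers n = 10, p0 = 0.5 and a 5% level are hypothetical, and `upper_critical_value` is just an illustrative helper name.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_critical_value(n: int, p0: float, alpha: float) -> int:
    """Smallest c such that P(X >= c) <= alpha when X ~ B(n, p0).
    The critical region is then X >= c (empty if c == n + 1)."""
    tail = 0.0
    # Build up the upper tail probability from k = n downwards.
    for k in range(n, -1, -1):
        tail += binomial_pmf(k, n, p0)
        if tail > alpha:
            return k + 1  # including k would push the tail above alpha
    return 0

# Hypothetical test: n = 10, H0: p = 0.5, H1: p > 0.5, 5% significance level.
c = upper_critical_value(10, 0.5, 0.05)
print(c)  # 9, since P(X >= 9) = 11/1024 but P(X >= 8) = 56/1024 > 0.05
```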
The values on the boundary of the critical region are called what?
- Critical values.
What is meant by 'significance level'?
- The probability of rejecting the null hypothesis when it is in fact true, i.e. of deciding a result is significant when it was actually due to chance.
- The significance level is a threshold probability established before the statistical analysis is undertaken.
- If the probability of obtaining a result at least as extreme as the one observed (assuming H0 is true) is higher than the set significance level, the result is "not significant". Significance levels are usually set at 0.05, which means that when H0 is true, a significant result will still occur by chance about 5 times out of 100.
When do you use a one-tailed and a two-tailed test?
- One-tailed: when looking for either an increase or a decrease in the parameter (but not both); it has a single critical value.
- Two-tailed: when looking for any change in the parameter, whether an increase or a decrease; it has 2 critical values. [If we want a 5% significance level, we allow 2.5% in each tail.]
What is the actual significance level?
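The probability of incorrectly rejecting the null hypothesis, i.e. the probability that the test statistic falls in the critical region when H0 is true. For a discrete distribution this is usually not exactly equal to the target significance level.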
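The sketch below works through a two-tailed binomial test and its actual significance level, allowing at most half of the target level in each tail. The values n = 20, p0 = 0.5 and a 5% level are hypothetical, as is the helper name `two_tailed_critical_region`.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_tailed_critical_region(n: int, p0: float, alpha: float):
    """Return (lower, upper, actual_level) for X ~ B(n, p0).
    H0 is rejected if X <= lower or X >= upper, with each tail
    probability kept at or below alpha / 2."""
    half = alpha / 2

    # Lower tail: extend upwards from 0 while the tail stays within alpha / 2.
    lower, lower_prob = -1, 0.0
    for k in range(n + 1):
        if lower_prob + binomial_pmf(k, n, p0) > half:
            break
        lower_prob += binomial_pmf(k, n, p0)
        lower = k

    # Upper tail: extend downwards from n in the same way.
    upper, upper_prob = n + 1, 0.0
    for k in range(n, -1, -1):
        if upper_prob + binomial_pmf(k, n, p0) > half:
            break
        upper_prob += binomial_pmf(k, n, p0)
        upper = k

    # Actual significance level = P(X in critical region | H0 true).
    return lower, upper, lower_prob + upper_prob

lower, upper, actual_level = two_tailed_critical_region(20, 0.5, 0.05)
print(lower, upper)   # 5 15
print(actual_level)   # about 0.0414, below the 5% target
```

Here the critical region is X <= 5 or X >= 15, and the actual significance level is the sum of the two tail probabilities rather than exactly 5%.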