# Intro to Probability & Stats/Lecture 1

To understand and work with data, one needs to have at least a basic knowledge of
Answer: statistics

Statistics can be a powerful tool for making
Answer: business and financial decisions

This is a collection of methods allowing us to:
- collect data,
- organize, analyze and interpret the data,
- use the data to draw conclusions, make forecasts and make decisions.

Answer: Statistics

Statistics usually deals with large groups, or collections, of objects. It is not concerned with the material objects themselves but with some of their characteristics that are of interest. These characteristics may be studied through measurements, responses, counts, etc. What are these groups?
Answer: populations

A collection of all measurements, responses, or counts of interest is called a
Answer: population

The most accurate information about a population can be obtained by
Answer: a census

A relatively small group (subcollection) of elements drawn from a population.
Answer: a sample

A sample may make up 1/100, or 1/1000, or an even smaller part of the population, but if it is drawn correctly, so that it is representative, it may yield sufficiently accurate information about the whole population.

A complete statistical study consists of three major steps (branches):

(1) Drawing a sample from the population. This is a very sensitive procedure, because if the sample is drawn incorrectly, the whole study will be worthless.

(2) Organizing and processing the information provided by the sample. This is the task of descriptive statistics.

(3) Based on the results obtained for the sample, conclusions must be made about the whole population. In other words, the statistician who already knows the sample measurements must figure out the appropriate population measurements. This part is called inferential statistics.

What is inferential statistics?
Answer: The part of statistics in which the statistician, who already knows the sample measurements, figures out the appropriate population measurements.

Any data that can be measured and expressed in numbers are referred to as
Answer: quantitative (numerical) data

Examples of this data are weight, length, width, temperature, age, price, etc.
Answer: quantitative (numerical) data

If data cannot be measured numerically, they are
Answer: qualitative (categorical) data

Examples of this data are colors, brands, opinions (yes or no, good or bad), etc.
Answer: qualitative (categorical) data

The data cannot be arranged in any logically justified order.
Answer: Nominal level

The data can be arranged in some meaningful order, but it is impossible to define the intervals between them.
Answer: Ordinal level

The data can be arranged in a meaningful order and the intervals between them can be defined, but comparing two data values by dividing one by the other makes no sense. This applies, among others, to scales without an inherent zero, such as a temperature scale (the zero depends on the measurement system being used).
Answer: Interval level

The data can be arranged in a meaningful order, the intervals between them can be defined, and comparing two data values by dividing one by the other makes sense.
Answer: Ratio level

What is a study called when a researcher performs measurements without modifying the subject?
Answer: an observational study

A scientist who is studying the environment tries to make his/her presence as inconspicuous as possible so as not to disturb the wildlife. What type of study is this?
Answer: an observational study

What type of study is it when a researcher applies some treatment to the object and then observes its effect?
Answer: an experimental study

What type of study is demonstrated by a scientist who, while studying the behavior of mice, creates various situations for them and then watches and records their reactions?
Answer: an experimental study

An experiment can be performed not only on a real object but also on its physical or mathematical model. Such an experiment is called a
Answer: simulation (simulations are most efficient when run on computers)

A sample drawn from a population must be
Answer: representative

This is when each element of the population has an equal chance to be selected.
Answer: Random sample

This means that in a set of numbers (usually in a certain interval) each number has an equal chance to be selected.
Answer: random number

One should use this sampling method for large populations.
Answer: Random number selection

In this procedure not just each member of a population but each sample of a given size has an equal chance to be selected.
Answer: Simple random sampling

A very large population may be divided into two or more large groups that share some similar characteristics. The research will be performed on each group separately; after that the results will be combined.
Answer: Stratified sampling

The members of the population are ordered in some way; a starting point is chosen randomly; then every kth member is selected.
Answer: Systematic sampling

The population is first divided into a large number of sections (clusters); then several clusters are selected at random. After that, all members of the selected clusters are surveyed. This method is convenient when a population is divided into clusters in a natural way.
Answer: Cluster sampling

The sample consists of members that are easily available.
Answer: Convenience sampling

There are two kinds of errors that may occur when a sample is being drawn.
Answer: Nonsampling error and sampling error

This is an error caused by flaws of the sampling process. (The sample is not representative.)
Answer: Nonsampling error

This error is caused by the random nature of the sample. In a carefully selected sample, a nonsampling error can be prevented, but the possibility of a sampling error always exists.
Answer: Sampling error

A collection of methods allowing us to collect data, organize, analyze and interpret the data, and use the data to draw conclusions, make forecasts and take decisions.
Answer: Statistics

A collection of all measurements, responses, or counts of interest.
Answer: Population

Taking information of every element of the population.
Answer: A census

A relatively small group (subcollection) of elements drawn from a population.
Answer: A sample

A numerical characteristic of a population.
Answer: Population parameter

A numerical characteristic of a sample.
Answer: Sample statistic

Organizing and processing the information provided by the sample.
Answer: Descriptive statistics

Making conclusions about the population parameter based on the information obtained for a sample.
Answer: Inferential statistics

Any data that can be measured and expressed in numbers.
Answer: Quantitative (numerical) data

Any data that cannot be measured numerically.
Answer: Qualitative (categorical) data

The data cannot be arranged in any logically justified order.
Answer: Nominal level of measurement

The data can be arranged in some meaningful order, but it is impossible to define the intervals between them.
Answer: Ordinal level of measurement

The data can be arranged in a meaningful order and the intervals between them can be defined, but comparing two data values by dividing one by another makes no sense.
Answer: Interval level of measurement

The data can be arranged in a meaningful order, the intervals between them can be defined, and comparing two data values by dividing one by another makes sense.
Answer: Ratio level of measurement

A study where a researcher performs measurement without modifying the subject.
Answer: Observational study

A study where a researcher applies some treatment to the object and then observes the effect of it.
Answer: Experimental study

A study that is performed not on a real object but on its physical or mathematical model.
Answer: Simulation

A number selected from a set of numbers in a procedure where each number has an equal chance to be selected.
Answer: Random number

Each element of the population has an equal chance to be selected.
Answer: Random sampling

Each sample of a given size has an equal chance to be selected.
Answer: Simple random sampling

A very large population is divided into two or more large groups (strata) that share some similar characteristics. The research is performed on each group separately; after that the results are combined.
Answer: Stratified sampling

The members of the population are ordered in some way; a starting point is randomly chosen; then every kth member is selected.
Answer: Systematic sampling

The population is first divided into a large number of sections (clusters); then several clusters are selected at random. After that, all members of the selected clusters are surveyed.
Answer: Cluster sampling

The sample consists of data that are easily available.
Answer: Convenience sampling

An error caused by flaws of the sampling process. (The sample is not representative.)
Answer: Nonsampling error

An error caused by the random nature of the sample.
Answer: Sampling error

Author: Anonymous
ID: 3769
Card Set: Intro to Probability & Stats/Lecture 1
Description: Basic definitions, Types of Data, Methods of sampling
Updated: 2010-01-05T08:36:01Z
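
The sampling methods defined in this card set (simple random, systematic, stratified, and cluster sampling) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not part of the lecture: the population of 100 numbered members, the function names, the split into two strata, the equal number drawn per stratum, and the clusters of ten. Only the standard `random` module is used.

```python
import random

def simple_random_sample(population, n, rng):
    # Every sample of size n has an equal chance to be selected.
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    # Random starting point among the first k members, then every kth member.
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    # Sample each stratum separately, then combine the results.
    # Drawing the same number from each stratum is an assumption of this sketch.
    combined = []
    for members in strata.values():
        combined.extend(rng.sample(members, n_per_stratum))
    return combined

def cluster_sample(clusters, n_clusters, rng):
    # Select whole clusters at random, then survey every member of each one.
    chosen = rng.sample(clusters, n_clusters)
    return [member for cluster in chosen for member in cluster]

rng = random.Random(0)            # fixed seed only so the sketch is repeatable
population = list(range(1, 101))  # hypothetical population: members numbered 1..100

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)

# Hypothetical strata: members 1..50 and members 51..100.
strata = {"low": population[:50], "high": population[50:]}
stratified = stratified_sample(strata, 5, rng)

# Hypothetical clusters: ten consecutive blocks of ten members.
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
clustered = cluster_sample(clusters, 2, rng)
```

Note the difference the sketch makes visible: stratified sampling takes some members from every group, while cluster sampling takes all members from only a few groups.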