A within-subjects design is a type of experimental design in which every participant is exposed to every treatment or condition. It is also known as a repeated-measures design. The term "treatment" refers to the different levels of the independent variable, the variable controlled by the experimenter. In other words, every subject in the study experiences each level of the variable in question. This article discusses what a within-subjects design is, how this type of experimental design works, and how it compares to a between-subjects design.

Imagine you are running an experiment on exercise and memory. For your independent variable, you decide to compare two types of exercise: yoga and jogging. Instead of splitting participants into two groups, you have all of them try yoga and then take a memory test, and later have all of them try jogging and take another memory test. You then compare the test scores to determine which type of exercise had the greater effect on memory performance.

This within-subjects design can be contrasted with a between-subjects design. In a between-subjects design, each person is assigned to only a single treatment: one group of participants receives one treatment while another group receives a different treatment, and the differences between the groups are then compared. In the exercise-and-memory example, one group of participants would do yoga and then take a memory test, while a different group would jog and then take the memory test. Afterward, the results of the memory tests would be compared to see how the type of exercise influenced memory.
In short: in a within-subjects design, all participants receive every treatment; in a between-subjects design, each participant receives only one treatment.

Why would researchers want to use a within-subjects design? One of its most significant benefits is that it does not require a large pool of participants. A comparable between-subjects design, in which two or more separate groups of participants are tested under different conditions, would require twice as many participants. A within-subjects design can also help reduce errors associated with individual differences. In a between-subjects design, even when individuals are randomly assigned to treatments, there is still a possibility of fundamental differences between the groups that could affect the experiment's results.
In a within-subjects design, individuals are exposed to all levels of a treatment, so individual differences cannot distort the results: each participant serves as their own baseline. This type of design can be advantageous in some cases, but there are potential drawbacks to consider. A major one is that simply taking part in one condition can affect performance or behavior in all later conditions, a problem known as a carryover effect. In our earlier example, having participants do yoga first might affect their later performance while jogging, and may even affect their scores on the later memory tests.
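The statistical payoff of "each participant serves as their own baseline" can be made concrete with a small calculation. The sketch below uses made-up memory scores (not data from the article) and compares the t statistic you would get from a paired, within-subjects analysis against the one from treating the same numbers as two independent groups:

```python
# Sketch: why a within-subjects design needs fewer participants to detect
# an effect. All scores are hypothetical, invented for illustration.
from math import sqrt
from statistics import mean, stdev

yoga    = [14, 16, 13, 17, 15, 18, 14, 16]   # memory scores after yoga
jogging = [15, 18, 14, 19, 16, 20, 15, 18]   # same participants after jogging

# Within-subjects: each person is their own baseline, so we analyze the
# per-participant DIFFERENCES (a paired comparison). Stable individual
# differences cancel out of each difference score.
diffs = [j - y for y, j in zip(yoga, jogging)]
t_paired = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Between-subjects: if these were two DIFFERENT groups of people,
# individual differences would stay in the error term, shrinking t.
n = len(yoga)
pooled_var = (stdev(yoga) ** 2 + stdev(jogging) ** 2) / 2
t_independent = (mean(jogging) - mean(yoga)) / sqrt(2 * pooled_var / n)

print(f"paired t = {t_paired:.2f}, independent t = {t_independent:.2f}")
```

With these illustrative numbers the same mean difference yields a much larger paired t statistic, which is exactly the error-reduction advantage described above.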
Fatigue is another potential drawback. Participants may become exhausted, bored, or less motivated after taking part in multiple treatments or tests. Finally, performance on subsequent tests can be affected by practice effects: taking part in several levels of the treatment, or taking the measurement tests several times, can make participants more skilled at the task itself. This can skew the results and make it difficult to determine whether an effect is due to the different levels of the treatment or simply the result of practice.
The term experiment is defined as a systematic procedure carried out under controlled conditions in order to discover an unknown effect, to test or establish a hypothesis, or to illustrate a known effect. When analyzing a process, experiments are often used to evaluate which process inputs have a significant impact on the process output, and what the target level of those inputs should be to achieve a desired result (output). Experiments can be designed in many different ways to collect this information. Design of Experiments (DOE) is also referred to as Designed Experiments or Experimental Design; all of these terms have the same meaning.

Experimental design can be used at the point of greatest leverage to reduce design costs by speeding up the design process, reducing late engineering design changes, and reducing product material and labor complexity. Designed experiments are also powerful tools for achieving manufacturing cost savings by minimizing process variation and reducing rework, scrap, and the need for inspection. This Toolbox module includes a general overview of experimental design, along with links and other resources to assist you in conducting designed experiments. A glossary of terms is available at any time through the Help function, and we recommend reading through it to familiarize yourself with any unfamiliar terms.

2. Preparation

If you do not have a general knowledge of statistics, review the Histogram, Statistical Process Control, and Regression and Correlation Analysis modules of the Toolbox before working with this module. You can use MoreSteam's data analysis software, EngineRoom® for Excel, to create and analyze many commonly used but powerful experimental designs. Free trials of several other statistical packages can also be downloaded through the MoreSteam.com Statistical Software module of the Toolbox.
In addition, the book DOE Simplified, by Anderson and Whitcomb, comes with a sample of excellent DOE software that will work for 180 days after installation.

3. Components of Experimental Design

Consider the following diagram of a cake-baking process (Figure 1). There are three aspects of the process that are analyzed by a designed experiment:
Figure 1

4. Purpose of Experimentation

Designed experiments have many potential uses in improving processes and products, including:
5. Experiment Design Guidelines

The design of an experiment addresses the questions outlined above by stipulating the following:
A well-designed experiment is as simple as possible, obtaining the required information in a cost-effective and reproducible manner.

MoreSteam.com Reminder: As with Statistical Process Control, reliable experiment results are predicated on two conditions: a capable measurement system and a stable process. If the measurement system contributes excessive error, the experiment results will be muddied. You can use the Measurement Systems Analysis module of the Toolbox to evaluate the measurement system before you conduct your experiment. Likewise, you can use the Statistical Process Control module to help evaluate the statistical stability of the process being studied. Variation affecting the response must be limited to common-cause random error, not special-cause variation from specific events.

When designing an experiment, pay particular heed to four potential traps that can create experimental difficulties:
6. Experiment Design Process

The flow chart below (Figure 3) illustrates the experiment design process:

Figure 3

7. Test of Means - One-Factor Experiment

One of the most common types of experiments is the comparison of two process methods, or two methods of treatment. There are several ways to analyze such an experiment, depending on the information available from the population as well as the sample. One of the most straightforward ways to evaluate a new process method is to plot the results on an SPC chart that also includes historical data from the baseline process, with established control limits, and then apply the standard rules for out-of-control conditions to see whether the process has shifted. You may need to collect several subgroups' worth of data to make a determination, although a single subgroup could fall outside the existing control limits. You can link to the Statistical Process Control charts module of the Toolbox for help.

An alternative to the control chart approach is to use the F-test (F-ratio) to compare the means of alternate treatments. This is done automatically by the ANOVA (Analysis of Variance) function of statistical software, but we will illustrate the calculation with the following example: a commuter wanted to find a quicker route home from work. There were two alternatives to bypass traffic bottlenecks. The commuter timed the trip home over a month and a half, recording ten data points for each alternative.

MoreSteam Reminder: Take care to make sure your experimental runs are randomized, i.e., run in random order. Randomization is necessary to avoid the impact of lurking variables. Consider the example of measuring the time to drive home: if a major highway project begun near the end of the sample period increases commute time, it could bias the results if a given treatment (route) is sampled mostly during that period.
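The randomized run order that the reminder calls for can also be generated in software. A minimal sketch, assuming three routes and ten runs per route as in the commuting example (the route labels here are illustrative):

```python
# Software alternative to drawing pennies from a container: shuffle the
# run order so a lurking variable (like a highway project that starts
# mid-study) cannot line up with any one treatment.
import random

routes = ["A", "B", "C"]
runs = [route for route in routes for _ in range(10)]  # 10 runs per route

random.seed(42)        # fix a seed only if you need a reproducible schedule
random.shuffle(runs)   # each trip's route is now drawn at random

print(runs[:7])        # e.g., the first week's driving schedule
```

Every route still appears exactly ten times; only the order in which the treatments are run is randomized.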
Randomizing the schedule of experimental runs helps ensure independence of observations. You can randomize your runs using pennies: write the reference number for each run on a penny with a pencil, then draw the pennies from a container and record the order.

The data are shown below along with the mean and variance for each route (treatment). As the table shows, both new routes home (B and C) appear to be quicker than the existing route A. To determine whether the difference in treatment means is due to random chance or reflects a statistically significant difference, an ANOVA F-test is performed. The F-test analysis is the basis for model evaluation of both single-factor and multi-factor experiments. This analysis is commonly output as an ANOVA table by statistical analysis software, as illustrated by the table below.

The most important output of the table is the F-ratio (3.61). The F-ratio is the mean square (variation) between the groups (treatments, or routes home in our example), 19.9, divided by the mean square error within the groups (variation within the samples for a given route), 5.51. The model F-ratio of 3.61 implies the model is significant. The p-value (the probability of exceeding the observed F-ratio assuming no significant differences among the means) of 0.0408 indicates that there is only a 4.08% probability that an F-ratio this large could occur due to noise (random chance). In other words, the three routes differ significantly in the time taken to reach home from work.

The following graph (Figure 4) shows simultaneous pairwise-difference confidence intervals for each pair of differences among the treatment means. If an interval includes the value of zero (meaning "zero difference"), the corresponding pair of means do NOT differ significantly. You can use these intervals to identify which of the three routes is different and by how much.
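The F-ratio arithmetic described above (mean square between groups divided by mean square within groups) can be worked through by hand. The commute times below are invented for illustration, since the article's raw data are not reproduced here, so the resulting F differs from the 3.61 in the article's table; the mechanics are the same:

```python
# Sketch of the one-way ANOVA F-ratio computation, using made-up
# commute times in minutes (10 observations per route, 3 routes).
from statistics import mean

route_a = [25, 28, 24, 27, 26, 29, 25, 28, 27, 26]
route_b = [23, 25, 22, 26, 24, 23, 25, 24, 22, 26]
route_c = [21, 24, 22, 23, 25, 22, 21, 24, 23, 22]
groups = [route_a, route_b, route_c]

grand = mean(x for g in groups for x in g)   # grand mean over all 30 trips
k = len(groups)                              # number of treatments (routes)
n_total = sum(len(g) for g in groups)

# Between-group mean square: variation of the route means around the grand mean.
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
ms_between = ss_between / (k - 1)

# Within-group mean square: variation of each trip around its own route mean.
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
ms_within = ss_within / (n_total - k)

f_ratio = ms_between / ms_within   # compare against an F(k-1, n_total-k) distribution
print(f"MS between = {ms_between:.2f}, MS within = {ms_within:.2f}, F = {f_ratio:.2f}")
```

A statistical package would then convert this F-ratio and its degrees of freedom (k-1 and n_total-k) into the p-value shown in the ANOVA table.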
The intervals contain the likely values of the differences of treatment means (1-2), (1-3), and (2-3); each interval is constructed so as to contain the true (population) mean difference in 95 out of 100 samples. Notice that the second interval (1-3) does not include the value of zero: the means of routes 1 (A) and 3 (C) differ significantly. In fact, all values in the (1-3) interval are positive, so we can say that route 1 (A) has a longer commute time than route 3 (C).

Figure 4

Other statistical approaches to the comparison of two or more treatments are available through the online statistics handbook, Chapter 7: Statistics Handbook.

8. Multi-Factor Experiments

Multi-factor experiments are designed to evaluate multiple factors set at multiple levels. One approach is the Full Factorial experiment, in which each factor is tested at each level in every possible combination with the other factors and their levels. Full factorial experiments can be economical and practical if there are few factors and only two or three levels per factor, and they have the advantage that all paired interactions can be studied. However, the number of runs grows exponentially as additional factors are added, so experiments with many factors can quickly become unwieldy and costly to execute, as shown by the chart below.

To study larger numbers of factors and interactions, Fractional Factorial designs can be used to reduce the number of runs by evaluating only a subset of all possible combinations of factor levels. These designs are very cost-effective, but the study of interactions between factors is limited, so the experimental layout must be decided during the experiment design phase, before the experiment is run.

MoreSteam Reminder: When selecting the factor levels for an experiment, it is critical to capture the natural variation of the process.
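The exponential growth in full-factorial runs described in the multi-factor section is easy to see by enumerating every combination. The factor names below are illustrative, borrowed from the earlier cake-baking example rather than stated in the article:

```python
# A full factorial design is every possible combination of factor levels,
# so the run count is levels ** factors (2**3 = 8 here, 2**10 = 1024 for
# ten two-level factors). Factor names are hypothetical.
from itertools import product

factors = {
    "oven_temp": ["low", "high"],
    "bake_time": ["low", "high"],
    "flour_amt": ["low", "high"],
}
runs = list(product(*factors.values()))   # every combination, one run each

print(len(runs))   # number of runs in the full factorial
for run in runs:
    print(dict(zip(factors, run)))
```

A fractional factorial keeps only a carefully chosen subset of these rows, trading away some interaction information for fewer runs.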
Levels that are close to the process mean may hide the significance of a factor over its likely range of values. For factors measured on a variable scale, try to select levels at plus/minus three standard deviations from the mean value. You can also use EngineRoom, MoreSteam's online statistical tool, to design and analyze several popular designed experiments. The application includes tutorials on planning and executing full, fractional, and general factorial designs.

9. Advanced Topic - Taguchi Methods

Dr. Genichi Taguchi was a Japanese statistician and Deming Prize winner who pioneered techniques to improve quality through robust design of products and production processes. Dr. Taguchi developed fractional factorial experimental designs that use a very limited number of experimental runs. The specifics of Taguchi experimental design are beyond the scope of this tutorial; however, it is useful to understand Taguchi's Loss Function, which is the foundation of his quality improvement philosophy.

Traditional thinking holds that any part or product within specification is equally fit for use, so that loss (cost) from poor quality occurs only outside the specification (Figure 5). Taguchi, however, makes the point that a part marginally within the specification is really little better than a part marginally outside it. He therefore describes a continuous Loss Function that increases as a part deviates from the target, or nominal, value (Figure 6). The Loss Function stipulates that society's loss due to poorly performing products is proportional to the square of the deviation of the performance characteristic from its target value. Taguchi adds this cost to society (consumers) of poor quality to the production cost of the product to arrive at the total loss (cost). Taguchi uses designed experiments to produce product and process designs that are more robust, that is, less sensitive to part and process variation.
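The contrast between the traditional "goalpost" view and Taguchi's quadratic loss, L(y) = k(y - m)^2, can be sketched directly. The target, specification limits, and cost constant below are hypothetical numbers chosen only to illustrate the shapes of the two loss models:

```python
# Taguchi quadratic loss vs. the traditional goalpost view of quality.
# All numeric values (target, spec limits, k, scrap cost) are hypothetical.

def taguchi_loss(y, target, k):
    """Loss grows with the SQUARE of the deviation from the target value."""
    return k * (y - target) ** 2

def goalpost_loss(y, lsl, usl, cost_out_of_spec):
    """Traditional view: any in-spec part is equally good (zero loss)."""
    return 0.0 if lsl <= y <= usl else cost_out_of_spec

target, lsl, usl = 10.0, 9.0, 11.0   # nominal value and spec limits
for y in (10.0, 10.9, 11.1):
    print(f"y={y}: taguchi={taguchi_loss(y, target, k=4.0):.2f}, "
          f"goalpost={goalpost_loss(y, lsl, usl, cost_out_of_spec=5.0):.2f}")
```

Note how a part at 10.9 (marginally in spec) already carries nearly as much quadratic loss as one at 11.1 (marginally out of spec), which is exactly Taguchi's point; the goalpost model treats them as completely different.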
Designed experiments are an advanced and powerful analysis tool for projects. An effective experimenter can filter out noise and discover significant process factors, which can then be used to control the response properties of a process; teams can then engineer the process to the exact specification their product or service requires. A well-built experiment can not only save project time but also solve critical problems that have remained hidden in a process. In particular, interactions between factors can be observed and evaluated. Ultimately, teams learn which factors matter and which do not.

Additional Resources

Recorded Webcast: "Experimental Design in the Transactional Arena"