Quantitative proteomics is one of the most widespread applications of mass spectrometry. We have planned a series of short tutorials, focused mainly on label-free experiments, to help orient PhD students and young post-docs. This is not designed to be a comprehensive textbook, but an open-ended series discussing diverse issues. In this first tutorial we summarize the major aspects of study design. In the second, we will discuss MaxQuant, a widely used software package for evaluating LC-MS data in terms of protein abundances. The following two issues will deal with selecting proteins that can be reasonably quantified in an experiment, and with handling the often-missing values in protein abundance tables.

The first step before designing an experiment is to determine the concept of the project. We must decide clearly what the objective is, and subsequently what questions need to be asked and what kind of answers we expect. Only after this can we design the experiments comprehensively. This might seem trivial, but incorrectly formulated experimental questions may lead to erroneous inferences, or to much lower impact and significance than hoped for.

After conceptualizing the project, several things must be thoroughly examined regarding the experimental questions before further planning can commence. First, our questions should be testable, and the experiments to be performed should be selected accordingly; for example, it must be decided whether relative or absolute quantitation is required. Second, the questions should be defined before doing any experimental work. This is crucial to avoid false positives, for example, to employ false discovery rate (FDR) control correctly when performing multiple statistical tests. Third, the response variables (typically protein abundances) should be reliably measurable; for example, their abundances should be above the limit of quantitation. Finally, it must be assessed whether we are focusing on the correct explanatory variables.
This is especially important to avoid mistaking correlation for causality; for example, the age of patients should be considered when risk factors are evaluated.

Having established the concept of the study, the next phase is experimental design. This involves extensive planning of every aspect: how to select samples, what kind of experiments should be performed, and what type of data analysis should be done. This is a major undertaking requiring a significant amount of time and effort, and it usually requires preliminary experiments as well. Experimental design may be broadly divided into three categories:

1. Medical aspects

Determining adequate target and control groups, depending on the objective of the project, needs to be carefully considered.1, 2 Determining which individuals or samples should be included (or excluded) is also critical for the success of medical studies. First, only those individuals whose condition does not interfere with the objective of the study should be included in a trial (e.g., a patient may have multiple diseases at the same time, and their effects cannot be separated). Second, the groups studied should be homogeneous with respect to a number of parameters. The most important among these are age, sex, and disease stage. Other parameters may also be used to select homogeneous groups of subjects, for example, individuals with only a single illness, non-smokers, or people with no prior medication. Besides scientific considerations, there are often practical limitations as well (e.g., when only a few individuals are available with a given medical condition). Once a study is finished, it may be necessary to exclude a few individuals at the data evaluation stage (e.g., if there was a failure in the analytical system). Such exclusion criteria should always be defined before starting the study. All of these aspects require careful planning to minimize potential biases originating from sample selection.
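The FDR control mentioned above, which is needed when many proteins are tested simultaneously, can be illustrated with the Benjamini–Hochberg procedure. The following is a minimal sketch; the function name and the example p-values are our own illustration, not taken from any particular study.

```python
# Benjamini-Hochberg FDR control: a minimal sketch of the multiple-testing
# correction discussed above. The example p-values are hypothetical.

def benjamini_hochberg(p_values, alpha=0.05):
    """Return indices of hypotheses rejected at FDR level alpha."""
    m = len(p_values)
    # Sort p-values in ascending order, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject the hypotheses with the k_max smallest p-values.
    return sorted(order[:k_max])

# Example: p-values from tests on five hypothetical proteins.
pvals = [0.001, 0.008, 0.039, 0.041, 0.52]
print(benjamini_hochberg(pvals, alpha=0.05))  # -> [0, 1]
```

At an FDR of 5%, only the two smallest p-values survive in this example, whereas a naive per-test threshold of 0.05 would have accepted four of the five.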
Since the subjects of medical research are in most cases animals or humans, ethical concerns must be considered alongside scientific ones. This requires a detailed study protocol summarizing scientific and ethical issues. Studies involving humans always (and animal studies sometimes) require approval from the relevant authorities, typically the ethical committee of a hospital or the National Health Service (of the country where the research is taking place). The study protocol is a written record prepared before the initiation of the study, and it must describe in detail the topic to be studied; ethical concerns; trial procedures; inclusion and exclusion criteria for the subjects; and the decision-making process. The study protocol should also include the potential benefits of a successful study, the number of patients to be included, the planned interventions and their safety (if any, e.g., drawing blood), the study design, the time interval of the study (including follow-up time, e.g., until death, progression of the disease, or relapse), and the projected time until the final report (in most cases, publication). Furthermore, before the initiation of any medical trial, participants should receive detailed information about the study, including its risks and procedures. They should also give written consent confirming that they are willing to take part in the study voluntarily (informed consent).

2. Study size

The number of samples or studied individuals in a research project may vary widely, from only a handful to hundreds or even thousands. For example, verifying the efficiency of a protease may be achieved by analyzing only a small number of samples, while characterizing differences in plasma protein levels between two cancer types requires many more.
Based on the number of samples, studies are usually separated into three groups.

In a preliminary study, the aim is to develop and test experimental methods or research protocols and to obtain preliminary information on samples to be studied at a later stage. Typically, only a few samples are analyzed. These studies are also useful for supporting project proposals for more extensive studies. Preliminary studies are occasionally publishable if novel samples or methodologies are used.

A pilot study is exploratory in nature. It is primarily used to characterize effect sizes (the standardized difference between group means). Although analyzing more samples improves statistical power (the probability of obtaining true positive results), it also requires more effort (time and cost). In practice there is usually a compromise between these; ideally, a pilot study should include 10, 15, or 25 samples per group if effect sizes are large, medium, or small, respectively.3 Because effect sizes are usually not known beforehand, they need to be approximated based on prior knowledge of the sample or species under investigation. Pilot studies are often sufficient to point research in novel directions and to push back the frontiers of our knowledge. Most publications using quantitative proteomics are pilot studies, which are within the financial reach of universities and academic institutions. Their major limitation lies in medical research, primarily because they cannot account for the heterogeneity of the human population.

Full-scale studies are always preceded by a pilot study and are usually more focused (e.g., on a few proteins or peptides of interest).
In this case, effect sizes for the compounds of interest can be estimated based on the pilot study, and hence the necessary sample sizes can be calculated in advance.2 Full-scale studies that include hundreds or even thousands of samples or individuals have the advantage of representing the studied population better, and they are also suitable for identifying small differences between groups. On the other hand, they require enormous amounts of resources and extensive planning. Full-scale studies are necessary for medical applications and are often commercially supported.

3. Planning analytical experiments and data evaluation strategies

The experimental and technical design of a study involves careful planning of the sample preparation, analysis, data handling, and statistical analysis. First, it has to be decided whether mass spectrometry based proteomics is the best experimental approach. For this, the possibilities and limitations of MS-based quantitative proteomics need to be understood. Quantitation is most frequently based on the measurement of LC-MS peak intensities or areas; however, these can be affected by several factors apart from analyte concentrations, for example, matrix effects, instrument parameters, and differences in ionization efficiencies. Because of this, there are two fundamental approaches to quantitation: relative and absolute. In relative quantitation, two (or more) samples are compared to each other to determine the ratios of different analytes between them, which can be done with the use of stable isotope labeling (chemical, proteolytic, or metabolic labeling) or without it (label-free quantitation). In absolute quantitation, on the other hand, protein concentrations (e.g., the concentration of an antibody species in human plasma) can be determined for individual samples with the help of stable isotope labeled internal standards.
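The sample-size calculation mentioned above can be sketched with the common normal-approximation formula n ≈ 2((z(1−α/2) + z(1−β))/d)², where d is the standardized effect size (Cohen's d) estimated from the pilot study. This is a rough sketch under standard two-group, two-sided assumptions; the function name is our own, and a real study should use dedicated power-analysis software.

```python
# Rough per-group sample size for a two-sample, two-sided comparison,
# using the standard normal approximation
#     n ~ 2 * ((z_{1-alpha/2} + z_{1-beta}) / d)^2
# where d is Cohen's effect size estimated from a pilot study.
# This is an illustrative sketch, not a full power analysis.
from math import ceil
from statistics import NormalDist

def samples_per_group(d, alpha=0.05, power=0.80):
    """Approximate sample size per group for effect size d."""
    z = NormalDist().inv_cdf  # standard normal quantile function
    n = 2 * ((z(1 - alpha / 2) + z(power)) / d) ** 2
    return ceil(n)

# Example: a medium effect size (d = 0.5) estimated from a pilot study.
print(samples_per_group(0.5))  # -> 63
```

At α = 0.05 and 80% power this gives roughly 63 samples per group for a medium effect (d = 0.5) but nearly 400 for a small one (d = 0.2), in line with the observation above that detecting small differences demands much larger studies.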
Although labeling-based strategies (for both relative and absolute quantitation) are more accurate and reproducible than label-free quantitation, they are more expensive and labor-intensive, and they are not applicable in every circumstance (e.g., for the analysis of human tissue samples). Since label-free quantitation is sufficient for the majority of scientific projects that require quantitative proteomics, it is the method of choice in most cases.

There is a multitude of experimental methods used in proteomics (sample preparation, HPLC-MS analysis, and data handling);4, 5 selecting and optimizing these is the main task of the analytical chemist, and discussing, or even listing, them is beyond the scope of this tutorial. Although the method of choice influences the planning process, there are universal concepts that need to be taken into account. The most important are testing the suitability of the system (usually through preliminary studies);6 randomizing the sample preparation and analysis steps, which helps to reduce batch effects;7 and implementing experimental controls,8 which can then be used to monitor the state of the system during the experiments. The latter is especially important for tracking and minimizing unwanted variation introduced into the data during the different steps of the analysis. For example, one may add standards (exogenous proteins or peptides) to the samples to monitor experimental errors, and use quality control samples to demonstrate system integrity. The data analysis and statistical methods used can also be tested for robustness via sensitivity analyses, which provide information on how much the results are affected by the use of different methods. This is becoming more and more important in light of the large number of emerging data analysis solutions available.9

Thorough design and planning are an essential part of any scientific project.
Incorrectly formulated experimental questions or inadequately planned experiments can invalidate the conclusions of a study, regardless of how well its other aspects are executed. Therefore, we recommend investing considerable time and effort into thoroughly planning every part of a project, and revising the plan several times before its initiation. This short tutorial is by no means comprehensive, but it can serve as a starting point and a sort of guidebook summarizing the most important aspects to be considered before starting any experimental work.
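As a closing practical illustration, the run-order randomization recommended in Section 3 can be sketched as follows. The sample names, group labels, and function name are our own illustration.

```python
# A sketch of randomizing sample run order to reduce batch effects,
# as recommended in Section 3: samples from all groups are shuffled
# together so that group membership is not confounded with injection
# order. Sample names and group labels are hypothetical.
import random

def randomized_run_order(samples, seed=None):
    """Return the samples in a random measurement order."""
    rng = random.Random(seed)  # fixed seed -> reproducible order
    order = list(samples)
    rng.shuffle(order)
    return order

groups = {"control": ["C1", "C2", "C3"], "treated": ["T1", "T2", "T3"]}
all_samples = [s for members in groups.values() for s in members]
print(randomized_run_order(all_samples, seed=1))
```

Fixing the seed makes the chosen order reproducible, so it can be recorded in the study protocol alongside the other pre-specified design decisions.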