An SNJ Associates Series: Research Methodology
Issue 3: Operationalizing Quantitative Variables
The focus of this post is operationalizing quantitative variables, a specific and very important part of the research process that helps ensure valid and reliable data in any study. In our last post, Issue 2, we defined what a variable is and discussed both independent and dependent variables in quantitative research methodology. This post covers the values attributed to variables and how we operationalize variables, including a detailed discussion of the levels of measurement of quantitative variables. It is critical that variables are clearly defined and accurately measure the specific quality or characteristic that the researchers intended to capture. For any research study, asking the question ‘Is the data valid?’ is always necessary. [If you would like to start at the beginning of our Research Methodology Series please visit here].
In quantitative methodology, we always work with a research question and quite often a hypothesis. To test any hypothesis, we must identify the population we are considering: for example, patients having knee replacement surgery, school children in the classroom, or individuals with diabetes in a clinical drug trial. The population is the group we are investigating and must be defined clearly so that we know exactly who the study results apply to, or can be generalized to, after the study is complete.
Similarly, we also stipulate the unit of analysis for a study. Using our school children as an example, we would ask whether the unit of analysis is individual students or individual classrooms. The unit of analysis may appear quite obvious at first; however, it is paramount to always state the unit of analysis for any research study. Once we have defined our population of interest and clarified the unit of analysis, we can collect empirical data about each case in our study. A case is one unit of our analysis: again, an individual student, a patient, or a classroom.
A question you will hear often about research studies is ‘How are the variables operationalized?’ Operationalizing a variable refers to how specifically researchers will observe or measure each variable. It is crucial in any study that the way the value of each variable is collected is well defined and that the value is accurately measured or recorded. Let’s use ‘Level of Education’ as an example. Researchers could operationalize this variable in a variety of ways. The values of ‘Level of Education’ could be recorded as ‘High’ or ‘Low’ (where ‘Low’ = grades 1 through 12 and ‘High’ = any post-secondary) OR ‘Actual number of years of education’ OR ‘Highest Level of Education Attained’ (where 1 = ‘Grade School’, 2 = ‘Middle’, 3 = ‘Some High School’, 4 = ‘Graduated High School’, 5 = ‘Some college’, 6 = ‘Graduated College’ and 7 = ‘Graduated Professional/Graduate School’).
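To make the three options concrete, here is a minimal sketch in Python. The category labels come from the list above; the function name, the dictionary name, and the sample respondent are made up for illustration, and the cut point between ‘Low’ and ‘High’ assumes that codes 5 through 7 are the post-secondary categories.

```python
# Labels for the 1-7 'Highest Level of Education Attained' codes listed above.
HIGHEST_LEVEL_LABELS = {
    1: "Grade School",
    2: "Middle",
    3: "Some High School",
    4: "Graduated High School",
    5: "Some college",
    6: "Graduated College",
    7: "Graduated Professional/Graduate School",
}


def to_high_low(code: int) -> str:
    """Collapse a 1-7 code into 'Low' (grades 1 through 12) or 'High' (any post-secondary).

    Assumption: codes 5, 6 and 7 are the post-secondary categories.
    """
    if code not in HIGHEST_LEVEL_LABELS:
        raise ValueError(f"unknown education code: {code}")
    return "High" if code >= 5 else "Low"


# Hypothetical respondent, invented purely for illustration.
respondent = {"years_of_education": 14, "highest_level_code": 5}

# The same person described under three different operationalizations.
print(respondent["years_of_education"])                         # actual number of years
print(HIGHEST_LEVEL_LABELS[respondent["highest_level_code"]])   # highest level attained
print(to_high_low(respondent["highest_level_code"]))            # High / Low
```

Notice that the years of education cannot be recovered from the ‘High’/‘Low’ value, so the choice of operationalization limits what can be said later.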
Interestingly, what appeared simple at first, capturing the data for the variable ‘Level of Education’, all of a sudden has a lot of details to sort out. How a variable is operationalized depends on a number of factors, including why the variable is being used in a study and what type of data collection instrument is used. In survey research, the respondent is asked to answer the question. Thus, a respondent could count up their years of education and fill in the number on a survey. Conversely, the data may come from a secondary data source, such as an existing record, in which case researchers must work with the data in whatever form the record provides.
Recall from Issue 2 that variables describe or represent attributes or characteristics of our population. We may be interested in age (recorded in number of years), the temperature in a classroom (measured in degrees Celsius) or gender (recorded as male or female). Each variable in a study has a value associated with it that tells us something about the attribute or characteristic. The values assigned to each variable, taken together, make up the data for any given study, and the kind of comparison those values support is referred to as the variable’s level of measurement.
There are two distinct processes associated with measuring variables. The first is the variable coding; the second, identifying the level of measurement, is covered in the next section. A well-organized study always has a codebook. In the codebook, each variable has a name, a description and the range of values associated with the variable. All of this information is referred to as the coding for the variable. Let’s use our students in the classroom example. One of the characteristics we are interested in is the temperature in the classroom. The variable name is ‘Classroom Temperature’ (CT). The CT is measured in degrees Celsius and has a range of 15 to 30 degrees (based on the school’s policy for heating and cooling each classroom). Another variable we are interested in gathering information about is gender. The name of the variable is ‘Gender’ and the variable is recorded as either Male or Female. Both of these examples describe how the variables ‘Classroom Temperature’ and ‘Gender’ are coded.
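A codebook can live in a spreadsheet or a document, but it may help to see one written down as a small data structure. The sketch below is one possible layout, not a standard: the class name, field names and the validity check are assumptions made for illustration, while the two entries mirror the ‘Classroom Temperature’ and ‘Gender’ examples above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CodebookEntry:
    """One variable in a study codebook (hypothetical layout)."""
    name: str
    description: str
    numeric_range: Optional[Tuple[float, float]] = None  # (low, high), inclusive
    categories: Optional[Tuple[str, ...]] = None          # allowed category labels

    def is_valid(self, value) -> bool:
        """Return True if a recorded value is allowed for this variable."""
        if self.numeric_range is not None:
            low, high = self.numeric_range
            return isinstance(value, (int, float)) and low <= value <= high
        if self.categories is not None:
            return value in self.categories
        return False


CODEBOOK = {
    "CT": CodebookEntry(
        name="Classroom Temperature",
        description="Temperature of the classroom in degrees Celsius",
        numeric_range=(15, 30),
    ),
    "Gender": CodebookEntry(
        name="Gender",
        description="Gender of the student as recorded",
        categories=("Male", "Female"),
    ),
}

print(CODEBOOK["CT"].is_valid(22.5))        # inside the 15 to 30 range
print(CODEBOOK["CT"].is_valid(35))          # outside the range
print(CODEBOOK["Gender"].is_valid("Male"))  # a listed category
```

Checking each recorded value against the codebook as data come in is a simple way to catch entry errors early.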
Levels of Measurement: Ratio, Interval, Ordinal, and Nominal
The most important concept to understand regarding how variables are measured is the level of measurement. There are four distinct levels of measurement: ratio, interval, ordinal and nominal. This may appear a bit confusing at first, but stick with this concept. As we have stated many times here at SNJ Associates, once you take these concepts and apply them to real research studies, they will get easier to understand.
The first level of measurement is a ‘true’ ratio. When a variable is measured at the ratio level, the values for any two cases can be compared and a valid quantifiable conclusion drawn, because a value of zero means that none of the quantity is present. For example, if we had a variable ‘Size of Classroom’ and we measured the actual size of each classroom in square meters, we could say that one classroom was twice as large as another classroom. The two values are directly comparable and, importantly, one value divided by the other gives us a valid ratio.
The second level of measurement is the interval level. The determining factor of an interval level measurement is that the interval between values is precisely defined. Let’s take the variable ‘Classroom Temperature’ as an example. The temperature in a room reflects its heat content. We could not say, for example, that 0 degrees Celsius represents a room with no heat content. Temperature in Celsius is therefore not a true ratio; however, since we precisely defined our units as degrees on the Celsius scale, the interval is a precise, valid form of measurement. If one classroom’s temperature was 30 degrees and a second classroom’s temperature was 15 degrees Celsius, we cannot say that the first room has twice as much heat content as the second room. However, because the intervals (degrees Celsius) are precisely defined and, importantly, also precisely measured, we can make a relative comparison and say that the first classroom at 30 degrees Celsius is warmer than the second room at 15 degrees, and that it is 15 degrees warmer. One value is more than the other, the size of the difference is known, and the conclusion makes sense.
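One way to see the difference between ratio and interval is to change the unit and check whether the ‘twice as much’ statement survives. The sketch below is illustrative only: the classroom sizes are invented, and converting to Kelvin and to square feet is simply a convenient way to restate the same measurements in different units.

```python
import math


def celsius_to_kelvin(celsius: float) -> float:
    """Convert a Celsius reading to Kelvin (same interval size, different zero point)."""
    return celsius + 273.15


def square_meters_to_square_feet(area_m2: float) -> float:
    """Convert an area to square feet (same zero point, different unit size)."""
    return area_m2 * 10.7639


# Interval variable: the two classroom temperatures from the text.
room_a_c, room_b_c = 30.0, 15.0
ratio_celsius = room_a_c / room_b_c
ratio_kelvin = celsius_to_kelvin(room_a_c) / celsius_to_kelvin(room_b_c)
print(ratio_celsius, ratio_kelvin)  # 2.0 in Celsius, but only about 1.05 in Kelvin

# The difference between the rooms is the same on both scales.
diff_celsius = room_a_c - room_b_c
diff_kelvin = celsius_to_kelvin(room_a_c) - celsius_to_kelvin(room_b_c)
print(math.isclose(diff_celsius, diff_kelvin))

# Ratio variable: hypothetical classroom sizes in square meters.
size_a_m2, size_b_m2 = 80.0, 40.0
ratio_m2 = size_a_m2 / size_b_m2
ratio_ft2 = square_meters_to_square_feet(size_a_m2) / square_meters_to_square_feet(size_b_m2)
print(math.isclose(ratio_m2, ratio_ft2))  # 'twice as large' holds in either unit
```

The ‘twice as warm’ claim depends entirely on where the scale puts its zero, which is why it is not a valid conclusion for Celsius readings, while ‘twice as large’ holds whatever unit of area is used.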
The third level of measurement is known as the ordinal level. The name of our variable for this example is ‘Satisfaction with Room Temperature’, measured on a scale from 1 to 6, with 1 being the least satisfied and 6 representing the best possible satisfaction with room temperature. (It is convention for many researchers to have a scale’s numbers represent values that will make sense upon interpretation. In this case, a higher number represents increased satisfaction; however, there is no reason the scale could not be constructed the other way around, with 6 representing the least satisfaction.) If one student scored a 3 on satisfaction and a different student scored a 6, we could not say the second student is twice as satisfied with the room temperature as the first student, because we do not have a true ratio or precisely defined intervals. How much more satisfaction does 1 unit on the scale of 1 to 6 represent? The answer is we really do not know. It is possible to ‘rank’ the values of an ordinal variable but not to compare them in a quantifiable term such as ‘twice as much’. The most familiar example of an ordinal level of measurement is the standard academic letter grades of A through F. We can certainly ‘rank’ the grades, as we know an A is better than a B or C; however, we cannot make any statement about the quantitative difference, as we do not know precisely how much better the A grade is than the C grade.
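The letter grade example can be sketched in a few lines of Python. This is only an illustration: it assumes the grades in use are A, B, C, D and F, the sample grades are invented, and the function names are our own. The code uses nothing but the order of the grades, which is all an ordinal variable gives us.

```python
# Assumed grade scale, listed from lowest to highest.
GRADE_ORDER = ["F", "D", "C", "B", "A"]


def rank(grade: str) -> int:
    """Position of a grade in the ordering (a higher position means a better grade)."""
    return GRADE_ORDER.index(grade)


def is_better(first: str, second: str) -> bool:
    """True if the first grade ranks above the second."""
    return rank(first) > rank(second)


# Hypothetical grades for five students.
grades = ["B", "A", "C", "A", "F"]

sorted_grades = sorted(grades, key=rank)
middle_grade = sorted_grades[len(sorted_grades) // 2]

print(is_better("A", "C"))  # ranking is a valid comparison
print(sorted_grades)        # lowest to highest
print(middle_grade)         # the middle grade depends only on order
```

The positions returned by rank() are only labels for the order. Subtracting or dividing them would imply precisely defined intervals that letter grades do not have.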
Finally, we have our nominal level of measurement. This one is very easy to remember if you think of the ‘n’ in nominal as representing the name. Gender is a nominal level measurement because the values have a name (or letter) assigned to each part of the variable. It is also helpful to refer to the different values as categories. There is no actual measurement taking place; rather, we put different parts of the variable (gender) into categories. Note that a nominal variable cannot be ranked. An important aspect of any nominal variable is that when a category is recorded for each case in our population, there must be only one category the answer can apply to. Categories with this property are known as mutually exclusive. In other words, as researchers you do not want a case to fit into more than one category unless you specifically want a ‘check all categories that apply’ type of response. Thus, an individual cannot be categorized as both male and female simultaneously. If a researcher wanted to capture more categories of gender, perhaps transgender male, transgender female, or gender x, then the researcher would simply add more categories beyond male and female. It sounds simple to say researchers must make sure each individual fits into only one category of a nominal variable, but it can be very tricky to ensure all the listed categories are mutually exclusive.
A quick note about coding nominal variables. If you are working in the area of research and looking at raw data, you may see a nominal variable such as gender with values represented by numbers, ‘0’ and ‘1’ for male and female respectively (or the other way around). Do not be confused by this type of coding. This assignment is common, and we will cover the reason for it in our discussions concerning statistical analysis. For now, just know that even when a nominal variable is given a numeric assignment, it is crucial that we treat the values as categories and do not apply the interpretations we would apply to ratio or interval variables, where we have true ratios and precisely defined intervals.
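As a small illustration of treating numeric codes as categories, the sketch below maps the codes back to their names and counts the cases in each category. The raw values are invented, and the assignment of 0 to male and 1 to female is just one of the two possibilities mentioned above.

```python
from collections import Counter

# Assumed coding for this example: 0 = Male, 1 = Female.
GENDER_LABELS = {0: "Male", 1: "Female"}

# Hypothetical raw data as it might appear in a data file.
raw_gender_codes = [0, 1, 1, 0, 1]

# Translate each code to its category name, then count cases per category.
gender_counts = Counter(GENDER_LABELS[code] for code in raw_gender_codes)

print(gender_counts["Male"])
print(gender_counts["Female"])
```

Counting how many cases fall into each category is a valid summary of a nominal variable. Statements such as ‘female is one more than male’ are not, because the numbers are only stand-ins for names.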
Categorical and Numerical Terms for Variables
It is important to note that variables with either a nominal or an ordinal level of measurement are referred to as categorical or qualitative variables. Additionally, variables with either a ratio or an interval level of measurement are called numerical or quantitative variables. There is a great deal of information here to learn about the levels of measurement for variables in quantitative research. Remember that categorical variables, measured as either nominal or ordinal, represent categories, and that only with an ordinal level of measurement does it make sense to ‘rank’ the categories. The numerical levels of ratio and interval represent variables where precise intervals are defined and measurements are made accurately in specific, well-defined units of some type (years, degrees Celsius, square meters).
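The four levels and the statements each one supports can be summarized in a small lookup table. The sketch below only restates the points made in this post; the class, field and function names are made up for illustration.

```python
from typing import NamedTuple


class LevelProperties(NamedTuple):
    """What a level of measurement allows us to say (summary of this post)."""
    family: str                    # 'categorical' or 'numerical'
    can_rank: bool                 # is 'more than / better than' meaningful?
    intervals_defined: bool        # is 'how much more' meaningful?
    ratios_meaningful: bool        # is 'twice as much' meaningful?


LEVELS = {
    "nominal": LevelProperties("categorical", False, False, False),
    "ordinal": LevelProperties("categorical", True, False, False),
    "interval": LevelProperties("numerical", True, True, False),
    "ratio": LevelProperties("numerical", True, True, True),
}

# The example variables used in this post and their levels of measurement.
EXAMPLE_VARIABLES = {
    "Size of Classroom": "ratio",
    "Classroom Temperature": "interval",
    "Satisfaction with Room Temperature": "ordinal",
    "Gender": "nominal",
}


def can_say_twice_as_much(variable_name: str) -> bool:
    """True only when the variable is measured at the ratio level."""
    return LEVELS[EXAMPLE_VARIABLES[variable_name]].ratios_meaningful


for name, level in EXAMPLE_VARIABLES.items():
    print(name, level, LEVELS[level].family, can_say_twice_as_much(name))
```

A table like this can be a handy checklist when reading a study: find the level of each variable, then check whether the conclusions go beyond what that level supports.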
For our SNJ Associates’ Research Rangers, who are becoming better critical consumers of research information, the tricky piece is understanding when the values of a variable are truly ratio, so that interpretations of the data can be stated, for example, as ½ as much or twice the amount. The homework from this issue is to look through the literature (you can use our open access listings) and read a few articles, identifying the categorical and numerical variables. Once you have that done, review the specific variables and see if you can put them into their level of measurement: ratio, interval, ordinal or nominal. Keep an eye out for research statements that erroneously interpret the results because the conclusions do not adhere to the level of measurement for the variable. Start your review by going back in this post and looking at the ‘Level of Education’ variable we used when discussing the operationalization of variables. You will now notice that the example used a ratio, an ordinal and a nominal level of measurement (years of education, categories 1 through 7, and ‘high’/‘low’), which is a great reminder that variables may be operationalized in a number of ways!
There is often no single best way to measure a variable in a study; rather, it is important to know why a particular variable is being used in the first place. Make sure that the variable is named, described well and operationalized in such a way that there is no ambiguity as to how it is being observed, measured or recorded.
Issue 4 discusses hypothesis testing, and we put together some of the pieces we’ve learned so far in the SNJ Research Methodology Series.