Sheila here, writing with my partner in survey design Kim, of Leonard Research & Evaluation with a longform interview with esteemed evaluation colleagues Dylan Felt and Gregory Phillips II. At the time of this writing Dylan is a PhD student in Sociomedical Sciences—Sociology at Columbia University Mailman School of Public Health who studies the social conditions which produce health within transgender populations. Gregory is a tenured Associate Professor at Northwestern University who studies factors associated with health disparities among LGBTQ+ populations. 

Dylan and Gregory have been amazing in their generosity and willingness to share their vast knowledge in this space. We were honored to present a session on asking survey questions about sex and gender at Evaluation 2021, the annual conference of the American Evaluation Association. We’ve been fortunate to stay in touch with them since! 

When the four of us last met to discuss the topic of asking survey questions about sex and gender in detail, the conversation was rich and expansive! A very abbreviated version of our conversation (edited for brevity) will appear in the second edition of Designing Quality Survey Questions, which is available for pre-order now. We also felt it was really important to share more of our conversation in this format to provide more space for nuance. 

Kim: You’ve both been studying how to ask questions about sex and gender for a long time. What has surprised you along the way? 

Gregory: Just how badly people do it and how many different ways people can do it badly. Every time we come across something we think is the worst possible permutation, we find a new one.

Dylan: So true. There is so much work to do to actually explore what it means to ask these questions, what it means when we answer in particular ways, how we are defining different groups of people and what assumptions go into that.

I think bad questions are often just due to lack of curiosity. People are really hungry for a very simplified and standardized solution, which is understandable, but takes away a lot of opportunity for real learning and real exploration of a subject that maybe some people don’t know as much about going into it. 

Sheila: I can’t help but wonder also, is it just fear of the unknown? Is it that people don’t want to explore this because it’s too unfamiliar? 

Dylan: Yeah, I think there’s one iteration of that fear, which is that people really want to do the right thing. They don’t want to offend somebody, or hurt somebody. And then I also think there’s a version of that fear that is rooted in our being used to the idea, or assuming, that some of these social categories that define so much of our world are immutable parts of reality. Instead of feeling afraid of what that means for us and the work it might cause for us, personal or professional, to make sense of our data, I think we should see it as really worthwhile work in terms of understanding ourselves better, understanding other people better, and being able to be better at the work that we’re doing.

Sheila: How do we help people accept that there is no easy answer? 

Dylan: I could share examples of the measures I’ve used that I’ve liked and why, but ultimately, if you want to do this work effectively you have to be willing to put in the work to understand people, and to understand experiences of life and embodiment that may feel profoundly different to your own. You need to have an open mind, and recognize that it is something you have to continue to work at. 

Gregory: Right? Because there’s this assumption that if you’re cis and heterosexual, at whatever point that happens, you’re that way for the rest of your life and you don’t think about it, but it’s everyone else that thinks about it, and that’s the hard thing to get over.

Dylan: Yeah, exactly. People for whom our systems of sex and gender and sexual orientation have been comfortable often have not had to do that type of thinking. And so it maybe comes a little bit more naturally for us, but that feeling of it coming more naturally is just practice that we’ve had to do in order to survive and make sense of the world, rather than some weird inherent quality.

Sheila: How can we be connected to queer and trans people, and seek to understand them, in a way that’s meaningful and yet respectful?

Dylan: If you don’t have any gay or trans people in your life, maybe that’s something for you to think about a little bit. Consider how you might get involved in spaces where you can contribute to the movements for liberation and through getting involved in a meaningful way, maybe there’s an opportunity to learn more and make connections. I also want to be clear that I’m not encouraging you to go down to your local gay rights organization and start asking them a bunch of invasive questions. A ton of resources do exist online, and I think it’s lovely when folks come with questions in good faith and they’re met with answers and resources. 

Kim: Can we talk a little bit more about why there isn’t one way to ask these questions and what we should be thinking about?

Gregory: Yes.

It depends on the purpose of asking the questions. For example, if you are interested in how people experience stigma related to gender, there are some questions where understanding someone’s identity is most important, others where getting a sense of their gender expression is most important, and still others where you’ll need to know both. We always advocate for specificity—you really should know what you want to ask, and why, and not fall back on a single “best practice” because that just isn’t going to work in every scenario.  

It gets trickier when you’re in the position of needing or wanting to make sure the majority understands the questions. If you’re doing a study with LGBTQ+ populations, you may be able to include a lot more response options without defining them. If you’re working with a more general population, then you might be more limited in what will be understandable, and you’ll want to avoid confusion, but you also don’t want to sacrifice specificity for the sake of the majority! So it really does come down to the purpose and the community you’re asking.

Dylan: Even then, we rarely have easy answers about what we should be asking and what we shouldn’t, because it is also likely important for a clinician to learn how a person identifies so that they can provide responsive care.

While we might simplify when we are developing surveys that are going out to general populations to avoid confusion, the problem is that we are communicating that the experiences of anyone more complex than the labels we can comfortably include are less important. That default towards the norm, towards what is more socially legible, is not uncomplicated on its own.

Often, the ethical dimensions of that choice are ignored in the conversation because we’re focused on making the survey understandable. But what we’re really doing is telling someone that they have to make themselves understandable to someone else, and that can have a significant impact on our participants.

In every situation, there are different considerations and priorities, and none of it is ever as simple as we might want it to be. A one-size-fits-all measure should not be the ultimate goal. Instead, the ultimate goal should be more reflexive, precise, and critical measurement. Can we get to the point where researchers are able to effectively explain and defend their choices? 

It is really frustrating to me that we expect such rigor from other forms of measurement, especially risk or outcome measures. Epidemiologically speaking, we expect real rigor about how we measure things and what it means that we did it in a particular way, but we seem to fall back on “just-so” assumptions far too often when it comes to sexuality and sex/gender measurement. My experience of this is public health specific, but I’m sure it applies to varying degrees in other fields. This incuriosity means that the assumptions of often white, cisgender, heterosexual researchers get baked into measures and never challenged. 

Gregory: It also depends on the resources you have to actually work with the data because though it might be ideal to have write-in options, somebody has to code those, and sometimes that’s not realistic.

We sometimes have to balance the amount of time it takes to recode and collapse versus wanting to make sure you are being fully respectful of every single identity that somebody could possibly mention. You can have the best intentions at the beginning in using an open-ended question, but then if you don’t know what these different terms mean, you can really do a disservice with the data at the end, and you may just end up misclassifying someone anyway, but with extra steps. Think carefully about what you have the capacity for, but do try to push yourself to improve on the basics.

Sheila: As you were talking about terminology, I was thinking about a gender question we saw that said simply, “What is your gender?” And the options were woman, man, non-binary, genderqueer, something not listed here, and decline to state. This sent me down a little bit of a rabbit hole around the differences between the terms non-binary and genderqueer, and I found that definitions varied across resources. I wanted a clear answer or a solid resource but didn’t find one. 

Dylan: If I was giving advice to the person who had created this question and this dataset, I might say, first of all, okay, if you’re including genderqueer as an option in your responses, did you take some time to speak to genderqueer people about what that means before launching your survey? Did you take some time to understand what that category means? 

Both queer and genderqueer are very complicated things to make categorical at all, because queer as a term is sort of specifically meant to resist categorization. It’s meant to resist being put in a box. In a lot of ways, it is a deliberately destabilizing term, particularly in the ways it’s been reclaimed. And that is a very hard thing to form a meaningful group of people out of, because identifying as queer can mean very, very different things for every single person.

Also, what are you actually trying to understand here? Are you actually trying to understand someone’s experience? Disentangling that might be really hard.

Gregory: You’re also never going to get a consensus on which umbrella term to use over another, because there just isn’t agreement. What matters is that you accept that there are people who aren’t in the binary.

Kim: Can you share more about the dominant approach to sex and gender questions and its criticisms?

Dylan: The dominant approach to gender and sex measurement is a two-step measure that asks about sex assigned at birth and current gender. There are a lot of things that I can point to about this approach that are positive, one of which is that it provides a straightforward, simple way for people who are new to this work to get started, and a lot of time has gone into making sure that versions of this question are accessible to non-transgender people. But like we touched on above, that’s also a limitation: this approach is a way to make trans and gender expansive experience broadly legible. It allows for just enough variance that something can be different, that your sex assigned at birth and your gender aren’t inherently the same. But even though we’ve decoupled these concepts, we’re stuck in a pretty limited framework where we can only organize trans people around these two points: you were A, and now you’re B. That’s a very cisnormative way of trying to understand transness because you can only think of how someone is deviating from what is expected; it doesn’t help us to question the expectation.

I think there are a lot of trans people who understand themselves this way, so I’m not trying to hate on this necessarily, but to acknowledge the complicated, messy elements of embodiment, the non-linearity of so many people’s journeys, the ways in which we may or may not remain attached to certain elements of our pre-transition bodies. This is very, very complicated for people to understand, and so the only way that it’s been understood is in relation to that dominant normative sex and gender system. Trying to really, really understand the wide diversity of experience that actually does exist outside of that system entirely… it’s like people’s brains break, or like Gregory was saying, they don’t want to even try because it’s too “messy.”

Sheila: I think many people are willing to say that there are more than two buckets for gender. Maybe they think, “I can be okay with a third, maybe even a fourth or fifth, but you must fit into this small set somehow. We can’t have 15 or 20 buckets.”

Dylan: And there must be buckets.

Sheila: And you have to fit somewhere, and you have to be labeled.

Gregory: And that label has to be sort of somewhere between male and female, to be discrete. You can’t be outside of that, and you can’t be flexible in that. 

Dylan: Yeah, and there’s a lot of trans and feminist writing specifically that talks about the ways that we, socially, try so hard to keep our current system of gender stratification by making it more flexible, by allowing it to have a little more space to move around. It’s our way of trying to hold on to what is still ultimately a very rigid system of social stratification.

I don’t endorse any one thing as the only solution to all of our demographic measurement problems—these will persist forever, because a “population” is never a static thing—but if there’s one thing I think should stick around, it’s an approach characterized by transparency, reflexivity, criticality, and collaboration.

Sheila: Why do we think it is so difficult?

Gregory: There’s this belief or expectation that people figured out race a long time ago or that it’s easy to measure, so this should be too. So, obviously that’s not actually true because race is also complex, but there’s this sense that we don’t really want to deal with any other hard questions, so we default to male, female, everyone’s straight, and that’s it, because it’s too hard to figure out otherwise.

Demographics are used as a proxy for so many things. In the clinical space, the organs matter, but in general, what we really want to understand is experience. If you’re a cis woman, you may have different experiences in the world than one does as a cis or trans man or as a trans woman, and there may also be areas of overlap. And those differences and similarities aren’t just about genitalia. But that’s what we use to try to make it simple. We call people Black, and even though racism and racialization are deeply socially embedded, it’s easier to measure whether someone is Black or not, and treat that as a proxy measure, than to directly measure their experience with poverty or other disadvantages. We often categorize people to get at stigma.

Dylan: Yes, again, there’s still a really strong assumption that racial categories represent distinct biological entities, that there is some genetic difference between white people and Black people, et cetera. And people do not want to let go of that very, very regressive way of thinking about differences. The same logic argues there is a real, I’m using heavy air quotes here, “natural” biological distinction between men and women, because the idea that these things are natural or biological is used to uphold the racial and gendered hierarchies of our world.

Sheila: There’s a lack of understanding of biology, or an oversimplification, there. People are saying, “Wait, I was there in seventh grade. You either have an XY or you have an XX, and that’s it.” You start to tell people that it’s more complicated, and they’re not ready to hear that.

Dylan: And it’s hard for people to accept that science is a social process, and that the way that we classify people is a social choice. Sometimes a violent choice. We still do this when intersex babies are born. There are human rights violations that are visited on intersex children in order to normalize their bodies and bring them into conformity with our binary sex system. And it is such a deeply socially embedded process that people really want to believe that we have just passively observed a “real” or “natural” difference in the world and named those differences. But we drew those borders, and we enforce the borders violently in order to maintain them. That’s a hard thing for people to accept, or to think about—how their demographic measurement choices contribute to the enforcement of those borders. It’s not the same thing as performing a surgery on an infant that can’t consent, but it is another way in which we as scientists, as researchers, as evaluators, participate in the maintenance of these demographic borders, which perpetuate the social stratification that is so endemic to our world.

Kim: How can we do better, even if there’s no one right way?

Dylan: There’s an interesting approach I see becoming more common where researchers include multiple approaches to questions about sex and gender within a single survey. Again, I never will recommend a standardized approach, but I think this is interesting, and I want to see people exploring it. If people have the option to pick one version to answer, what do they pick? This can provide the researcher with an opportunity to understand the limitations of their own categories and then present those categories as limited snapshots and with a little bit more nuance. And I think that’s really interesting. It also potentially allows a participant the opportunity to feel understood a little bit better and potentially feel a little bit more trust with this survey designer, even if they don’t like having to pick which question to answer. 

Gregory: Setting up a question as a “check all that apply” can help too. That way, someone can indicate that they’re genderqueer and female, and that might make you think about whether this person’s identity, for the purposes of your study, is meaningfully similar or different to someone who indicates genderqueer and non-binary. Or, it might not tell you exactly what you want because now it’s messier to classify, but it will give you more of a sense of the messiness that exists below the surface of what we think of as neat categories. Life is messy, and sometimes we don’t want to work through the mess, and sometimes we aren’t prepared to, so we may need to pair a question like this with one that forces a single choice. But pushing ourselves to try can always teach us something.
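For readers who handle the data themselves, here is a minimal sketch of what Gregory’s “check all that apply” point can look like at analysis time. The responses and function names are hypothetical, invented only for illustration, and this is one possible approach rather than a method Gregory or Dylan prescribes. It keeps each person’s full set of selections, so the overlap between options stays visible instead of being collapsed into one label.

```python
from collections import Counter

# Hypothetical responses: each respondent may select any number of options.
responses = [
    {"genderqueer", "female"},
    {"genderqueer", "non-binary"},
    {"woman"},
    {"non-binary"},
    {"genderqueer", "female"},
]


def tally_options(responses):
    """Count how many respondents selected each option.

    Totals can exceed the number of respondents, because one person
    may select several options.
    """
    return Counter(option for selected in responses for option in selected)


def tally_combinations(responses):
    """Count each exact combination of selections, preserving the overlap."""
    return Counter(frozenset(selected) for selected in responses)


option_counts = tally_options(responses)
combo_counts = tally_combinations(responses)
```

Looking at `combo_counts` next to `option_counts` is one way to see the messiness Gregory describes: the same option can sit inside quite different combinations, and deciding whether to group those combinations is a judgment call that deserves an explanation in your reporting.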

Dylan: Similarly, surveys could include a write-in space where they ask people, “Tell me a little bit more about what the option you chose means for you.” That starts to get closer to co-creating meaning with participants rather than having assumptions about what things mean. 

We bake all these assumptions into a survey and they’re reflected in how we’re reporting things out. Let’s communicate that. Let’s say that we made some choices about how we’re going to measure things. We gave participants the opportunity to provide us with feedback on those choices, and now we’re using both our choices and participants’ feedback to try to co-construct something a little bit more meaningful. That’s exciting because it is a way of thinking a little bit more creatively and accepting that messiness. So again, I don’t endorse anything as the only solution to all of our demographic measurement problems, which will persist forever, but I do think I want to see more of this happen.

Another maxim for me is: How can we maximize co-creating meaning from the sex, gender, sexual orientation data that we have? 

Sheila: Ooh, can you say more about that? 

Dylan: The idea of co-creation in survey research is exactly what it sounds like: you’re designing your survey, and the questions you want to ask and have that survey answer, in collaboration with the people who are at the heart of your project. Practically, this also means that before you launch a survey, you put in the work to understand what you’re communicating with your questions, how people might understand and respond to them, and how you’ll handle the data. There are also opportunities for co-creation once the data have been generated. You might seek out opportunities to speak to either the people who provided you with that data, or maybe other people who share some of the experiences represented in the data, who can help you parse and contextualize what you’re finding.

We bake so many choices and assumptions into a survey, and whether we’re cognizant of it or not, they’re reflected in how we’re reporting results. Co-creation says: let’s think about sharing the power to make those choices and inviting people to question our assumptions and make meaning with us. Making that a conscious, critical, collaborative process also means we can better articulate the choices and assumptions we made, why we made them, and how those assumptions were (or were not) communicated to people who were responding to the survey. Frankly, this should be a best practice in reporting, and I think co-creation helps to facilitate it.

Kim: Ok, so when should we not ask these questions at all? When is it better to ask about discrimination experiences directly, or about health behavior directly, or about whatever it is that we actually want to learn about? 

Gregory: I think if that’s the primary purpose of what you’re doing, if you’re interested in stigma, I think that should be the focus. But I also think that if you’re asking people more vulnerable questions along with demographics, it has to be in a situation where they are comfortable with you. Asking those questions separately—about experiences of discrimination and demographics—that’s probably going to be less uncomfortable for them than if they had to answer something like “how many times are you beaten up because you’re trans?” or “how many times are you in the hospital because somebody called you a slur and beat you?”

Ideally, you would ask about the stigma without the identity because that’s what you care more about. But only in a situation where you’ve already either built a rapport or people know that this is what’s coming. Where you’re trusted. You wouldn’t want just some random cis straight person coming in and asking a question like, “Have you ever been called a slur and how do you feel about it?”

Dylan: I’m not sure I think we ever shouldn’t, but I do feel that we want to think critically about our obligations to people’s privacy and safety. If you’re doing a survey of ten people in an office and you’ve got one person who said they’re trans, you’ll want to be careful that sharing that detail, such as in your gender breakdown, wouldn’t out that person. It might help you to know that information, but your foremost obligation is to that person’s safety.
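A brief aside for anyone who prepares the reports: one common way to act on this obligation is to avoid publishing very small groups on their own. The sketch below is only an illustration under stated assumptions. The threshold, the labels, the function name, and the office data are all hypothetical, and the right rule depends on your context and on what else is reported alongside the breakdown.

```python
from collections import Counter

# Hypothetical threshold for illustration; it is an assumption, not a
# standard taken from the interview.
MIN_CELL_SIZE = 5


def suppressed_breakdown(answers, min_cell_size=MIN_CELL_SIZE):
    """Return counts per category, pooling categories below the threshold.

    If the pooled group is itself below the threshold, only the overall
    total is returned, because even a pooled row could identify someone.
    """
    counts = Counter(answers)
    report = {}
    pooled = 0
    for category, n in counts.items():
        if n < min_cell_size:
            pooled += n
        else:
            report[category] = n
    if 0 < pooled < min_cell_size:
        return {"all respondents": sum(counts.values())}
    if pooled:
        report["not reported separately"] = pooled
    return report


# Hypothetical ten-person office, echoing the scenario described above.
office = ["woman"] * 6 + ["man"] * 3 + ["trans woman"]
office_report = suppressed_breakdown(office, min_cell_size=3)
```

With this made-up office, the function falls back to reporting only the total of ten, since any breakdown would single out one person. The data can still inform your own understanding; the point is to be deliberate about what gets shared.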

I also think a lot of the time it’s not an either/or. A lot of the time people ask a demographic question, take that data, assume a bunch of things about the types of stigma that respondents have experienced, and use that framework to interpret results.

Kim: It’s like using race as a proxy for other things. It’s the same problem fundamentally, right?

Dylan: Yeah. But I don’t think it’s ever quite as simple as “I’m going to ask those other things instead,” either.

I identify as a trans woman. I have a lot of dear friends who are non-binary who have gone through the exact same transition steps as I have. And I have friends who also identify as trans women who have gone through totally different transitions, who have done way more to change their bodies than I have. And you’re not going to get that just from a categorical question, but you also aren’t going to understand other meaningful differences by leaving identity out of the picture. The reality is that gender is always active in the institutions, relationships, and structures that define our world, and therefore these things are always shaping the ways we move through the world. If you’re thinking gender isn’t relevant in a particular situation, I think you’re probably wrong. It’s almost certainly not the only important thing to think about, and like I said, our ethical obligations to the people who fill out our surveys are always important to have in mind, but it’s always there.

I’m open to the possibility that there really might be a situation where the best thing to do for our participants is not ask. But I think we’re experiencing an ethical failure if we just don’t ask about sex or gender because “that’s too complicated” or “that’s not relevant”—I think that’s an act of epistemic silencing. One thing that might help is just telling your participants why you’re asking. Why do you think it matters that someone is intersex, queer, or trans in your context? You should always be able to answer that question, and sharing your answer with people as they take your survey may help them feel more comfortable responding. 

Sheila: We’re really talking about very personal, very sensitive, and potentially threatening questions, and we’ve been talking about health related examples a lot. I’m also thinking about program evaluators who don’t work in health, or even market researchers. The types of surveys they are creating often fall far outside the realm of health. 

Can we talk a bit about when sex and gender might be relevant in scenarios outside of health?

Dylan: I think, like I said above, gender is always relevant, and certainly not just to health. For example: let’s say you’re evaluating a new history curriculum for a school. If there’s nothing in that curriculum about aspects of queer history, maybe the gay, bi, asexual and other queer people in your class are going to feel a little bit frustrated and maybe left out from that curriculum. Or, there could be queer historical figures who get talked about, but who have their experiences minimized or erased.

Or, say there’s an afterschool sports program and we want to understand more about programs…

Kim: Trying to build a sense of belonging, or some other aspect of social and emotional well-being.

Dylan: Yes, exactly. And these marginalized personal and social characteristics can shape whether or not we feel like we belong on a team, whether or not we feel safe in a locker room, whether or not we feel like we’re seen for who we are by the people around us and the teachers or other adults facilitating that. Or, maybe you’re looking to understand something about a particular workplace. Well, you probably want to think about how gender is active in that work environment. 

Basically, I think it’s always important, and I think a lot of the time, the assumption that we don’t need to ask about it prevents us from really understanding something that could be going on there. Sometimes something doesn’t feel relevant to sexual orientation or gender or sex or anything like that from a cisgender heterosexual person’s point of view. But it’s something that shapes the way we move through the world, and in a pretty significant way. A program, any program, is a social microcosm and it can’t be fully separated from our broader social world. And our broader social world is one of significant gender and sexuality stratification and discrimination. And so I think there’s always the possibility that that experience will shape how somebody experiences that program regardless of what it is. I think it’s hard for me actually to think of an example where it wouldn’t.

Sheila: That’s really helpful, thank you.

Dylan: It’s really understandable that it’s hard to think about because—and this has been researched—straight and cisgender people are more likely to think that it’s offensive to ask about sexual orientation or gender than queer and trans people are. In cases when we might not be comfortable sharing, often that’s related to concerns over what could happen to our data and the ways that data could be used to harm us. 

So I think if you’re interested in evaluating a leadership program that you’ve put on or an afterschool program or whatever, and you want to ask these questions, it’s probably going to be helpful to contextualize the question with why you’re asking it so that somebody understands that they can safely share that information with you if they choose to. It’s a very real possibility that people might not instinctively trust that question depending on the context, so explaining why you’re asking is never a bad thing to do. It gives your participants more autonomy to choose how they want to respond and how honest they want to be in their response. And I think it’s best, as much as we can, to leave that choice to our participants, to the folks who are participating in the program.

Sheila: You’re reminding me of a scenario where we were doing a school climate survey with parents for a school district. We got a lot of pushback for asking parents the yes-or-no question, “Does your child identify as LGBTQ?” People pushed back, asking, “How can you ask a parent of a kindergartner that question?” But the question stayed in the survey. When we looked at the responses, there were clear differences between parents of LGBTQ children and parents of children who did not identify as LGBTQ in their answers to a question about their priorities for the school district. Our committee learned so much about the value of asking those questions and looking at the results in those ways. If we didn’t have that information, we might have just focused on the priorities of the majority of respondents and would have missed the opportunity to understand and respond appropriately.

Kim: That’s a great example of the power of disaggregating data.
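To make the idea of disaggregating concrete, here is a small sketch with entirely made-up records; it does not use or represent the actual school district data Sheila describes, and the function names are hypothetical. It compares the most common priority across all respondents with the most common priority within each group, which is the kind of comparison that surfaced the difference in her example.

```python
from collections import Counter, defaultdict

# Hypothetical records: (parent's answer to the LGBTQ question, top priority).
records = [
    ("no", "academics"), ("no", "academics"), ("no", "facilities"),
    ("no", "academics"), ("no", "safety"), ("no", "academics"),
    ("yes", "safety"), ("yes", "safety"), ("yes", "academics"),
]


def top_priority_overall(records):
    """Most common priority when every response is pooled together."""
    return Counter(priority for _, priority in records).most_common(1)[0][0]


def top_priority_by_group(records):
    """Most common priority within each group, reported separately."""
    by_group = defaultdict(Counter)
    for group, priority in records:
        by_group[group][priority] += 1
    return {
        group: counts.most_common(1)[0][0]
        for group, counts in by_group.items()
    }


overall = top_priority_overall(records)
by_group = top_priority_by_group(records)
```

In this invented data the pooled answer is "academics", while the smaller group’s most common priority is "safety". Looking only at the pooled result would hide that. In a real report, small groups also call for the care about privacy and safety that Dylan describes earlier.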

Sheila: Yes. And it illustrates why we ask these questions, right? Why we might need to know.

Kim: Well, wow. There’s so much more we could still talk about but as always, when we talk, I learn a ton. Our deep appreciation to both of you for your time and expertise and for doing the work you do to push practice around how to measure sex and gender identity.

Want to learn more? Here are just a few of the many excellent resources about sex and gender identity: 

  • Gender Spectrum and Reimagine Gender
  • Felt, D., Perez-Bill, E., Ruprecht, M. M., Petillo, M., Beach, L. B., Glenn, E. E., & Phillips, G. (2022). Becoming an LGBTQ+ storyteller: Collecting and using data on gender, sex, and sexual orientation. New Directions for Evaluation, 2022, 31–52. (this entire journal issue is excellent!)
  • Ashley, F. (2022). ‘Trans’ is my gender modality: a modest terminological proposal. In L. Erickson-Schroth (Ed.), Trans bodies, trans selves: A resource by and for transgender communities (2nd ed.). Oxford University Press.

And don’t forget to check out the second edition of Designing Quality Survey Questions too! 
