Explore the current data quality landscape and AI's role in market research. Learn how Dynata ensures high data integrity and employs advanced AI tools.
In this week's episode of the Greenbook Podcast, host Lenny Murphy sits down with Steven Millman, Global Head of Research and Data Science at Dynata, to delve into the current landscape of data quality in online surveys and the transformative role of AI in market research. Steven shares insights on how Dynata ensures high data integrity, combats survey fraud, and employs advanced AI tools to improve research outcomes. They also explore the implications and potential of synthetic data, large language models, and AI-driven chatbots for qualitative research. Tune in to discover how AI is shaping the future of market research and the crucial importance of maintaining data quality in decision-making processes.
You can reach out to Steven on LinkedIn.
Many thanks to Steven for being our guest. Thanks also to our producer, Natalie Pusch; and our editor, Big Bad Audio.
Lenny: Hello, everybody. It’s Lenny Murphy with another edition of the Greenbook Podcast. Thank you for taking time out of your day to spend it with myself and my guest. And today, I am joined by Steven Millman, Global Head of Research and Data Science at Dynata. Steven, welcome.
Steven: Thank you so much for having me, Lenny.
Lenny: Always a pleasure, my friend, especially in a conversation like this because we get to talk about data. You know, we get to the heart of the matter, which sometimes we go all over the place, but I’m looking forward to this one. So why don’t you tell our audience about you, for those who are not familiar.
Steven: Sure. As you mentioned, I’m global head of research and data science at Dynata. I’ve been here for almost five years, and I run pretty much everything at the company that relates to research, to data science, to artificial intelligence. I’m also a member of the board of directors of the ARF, and I am the chair of the ARF’s workstream on artificial intelligence.
Lenny: All right. So we’ve broken it right out at the beginning. We know we’re going to talk about AI as well because we’re going to talk about data. So a little bit of context setting for the audience, as well as for maybe you, my history with Dynata goes all the way back to the e-Rewards days—so I’m dating myself—because it was the highest quality data, the best respondents, and that was the criteria of the decision making. And that criteria on why we should choose sample partners has not changed, although it maybe has been back-burnered in other ways, and we’ve made some tradeoffs in terms of quality on that. So let’s just kind of start there. Give the lay of the land, from your perspective, on the argument about data quality and where we are. Because, you know, you guys control such a significant piece of the global business that flows through your pipes. So, you know, what’s the state of the union on data quality right now?
Steven: So, first off, thank you very much. I’m always happy to hear that people use us and like us. So what is the state? The state of play in online survey research these days really falls into a couple of categories. There are the people like us who own our own panels, where we have a direct relationship with the people who take surveys for us, and where we can ask them specific questions. We can train them on how to be good survey respondents—not how to answer, but how to respond well—and we can identify when folks are maybe not as attentive as we’d like them to be. Because we have long-term relationships, it also reduces the likelihood that you’re going to see the kinds of fraud people are used to seeing in the industry, such as survey farms and bots. There are others out there in the market who are aggregators, who develop relationships with first-party providers like us and then bring them together. It turns out that every survey panel will use partner panels when they need them for particularly niche or hard-to-reach audiences. And then there are the exchanges, where you are really sort of pushing for the lowest dollar per complete. To your question about why people care, or why they should care, about data quality, the answer is really simple. People don’t perform surveys because they’re curious. They perform surveys, and they pay for that, because they need to make a decision, and you can’t make good decisions on bad data. Simple as that.
Lenny: Yeah. I have heard, anecdotally, and experienced myself, throwing away—kind of the average is 30 percent. I’ve heard as high as 50 and 60 percent in some studies, depending on the source of the data. And I can see—speaking of clients, the thinking of, “All right. If I’m paying 25 cents per complete and I’m oversampling and it’s not a critical business decision, that level of risk is acceptable.” I don’t buy it. I don't agree [laugh] with that logic, but I do understand it. I do understand that thinking. My sense is that that is shifting, and I think that it is shifting—and since you brought up AI—because of the recognition that the stakes are higher now. If you’re building, organizationally, a data synthesis model internally driven by LLMs, you’re basically pissing in the well, you know. That contamination carries through, and it’s not just that one project. Those data assets have a long life now, potentially, and can be used in more ways than just the project. And, you know, it’s tainted. So what do you—are you seeing that as well? Is that something that you’re hearing from folks?
Steven: Yeah. So, I mean, 60 percent is—would be, I think, higher than the industry average really is. I do think probably it’s in the 40 percent range, across the board, looking at the various folks who are doing the research. Just to be clear, that’s not the amount that goes in. That’s the amount that you’re talking about being delivered to a client that they look at and “well, this is crap. I can’t use this.” In our world, we call that wastage, or we’ll talk about the rejection rate. And the rejection rate at Dynata is 3 percent, and it has to do with these advanced technologies we’re taking on to combat these things. What’s interesting about large language models—and everybody’s worried about the risk of large language models being trained to answer surveys at scale—is that those are actually, believe it or not, relatively easy to detect if you are using an advanced system. So if you—
Lenny: The open answer, too good [laugh].
Steven: Yeah. [laugh] This is the number one thing, right? No grammatical errors. No punctuation errors. And you get lots of verbiage.
Lenny: Yep.
Steven: It’s like, oh. Yeah. You know that’s not a real person. But it goes beyond that. So the sort of lower-tier solutions, they are looking for the very simple things that everybody has been looking at forever, things like straightliners, people who take a survey too fast, right? We use and employ considerably more effective and rigorous tools. We have, I think, 175 behavioral data points that get evaluated in every survey, and we don't talk about all of them publicly because we don't want this to be a training manual for fraudsters. But among the things we can do is we know whether or not data was cut and pasted into an open-end, and that really catches a lot of the large language model use. We’re not necessarily seeing that large language models are being employed as a way to answer a whole survey. There’s a lot of other technology that the bad guys are using to do that today; you don’t need that level of sophistication. Where we are seeing it being used is the really high-value surveys where you are looking for very specific people with specific knowledge sets. So if you do a B2B survey, and you want somebody who can speak intelligently on choices around data architecture platforms, bots aren’t going to get anywhere near that question, and survey farms, you know, they’re not hiring those people to [laugh] answer questions. So what they’ll do is they’ll open up a language model. They will pose the question to the model. The model will spit out an answer, and they’ll cut and paste it. We’ll see the cut-and-paste behavior. So we have lots of ways to catch it. The other thing that we do which is, again, much more sophisticated, is rather than looking at whether the survey was completed too quickly—bots have gotten much more sophisticated as well. We’re always sort of fighting each other to see who is in front. But what they learned is they could calculate the likely amount of time it would take a human to finish the survey, and then not submit the survey until that time had elapsed. So we actually look on a page-by-page basis at whether or not questions are being answered too quickly. And we’re looking for other non-human behavior: the way the mouse moves, things like that.
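To make the page-level timing and cut-and-paste checks concrete, here is a minimal sketch of what that kind of screening might look like. This is purely illustrative: the field names, thresholds, and flags are assumptions, not Dynata's actual Quality Score logic.

```python
# Illustrative page-level screening sketch; thresholds and fields are assumptions.
from dataclasses import dataclass, field

@dataclass
class PageEvent:
    page_id: str
    seconds_on_page: float      # time between page load and submit
    chars_answered: int         # characters entered on the page
    pasted_into_open_end: bool  # whether a paste event fired in a text box

@dataclass
class RespondentSession:
    respondent_id: str
    pages: list = field(default_factory=list)

def flag_session(session: RespondentSession,
                 min_seconds_per_page: float = 2.0,
                 max_chars_per_second: float = 15.0) -> list:
    """Return human-readable flags for suspicious page-level behavior."""
    flags = []
    for page in session.pages:
        # Answering a page implausibly fast is suspicious even if the total
        # survey time looks normal (bots can pad the overall duration).
        if page.seconds_on_page < min_seconds_per_page:
            flags.append(f"{page.page_id}: answered in {page.seconds_on_page:.1f}s")
        # Typing far beyond human speed suggests automation.
        if page.seconds_on_page > 0 and page.chars_answered / page.seconds_on_page > max_chars_per_second:
            flags.append(f"{page.page_id}: implausible typing speed")
        # Pasted open-end text often signals an answer generated elsewhere (e.g. an LLM).
        if page.pasted_into_open_end:
            flags.append(f"{page.page_id}: open-end was pasted, not typed")
    return flags
```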
Lenny: Yeah. We actually experienced that last year with GRIT on, you know, the GRIT 50, which people do like to try and game, and over 700 were fully AI-completed. And we saw those same—yeah, somebody really wanted to get their ranking up. So, but... [laugh] so we’re seeing the same thing and seeing it at scale in those specific instances. But I agree. I don’t think we’re anywhere near that happening routinely for all consumer surveys.
Steven: Yeah. And I would add that, you know, you talk about, like, 60 percent, 70 percent. You do see that on individual surveys, so you will occasionally see a survey come in and, you know, maybe it’s got a really high value. Maybe they are—you know, someone will have come up with a sophisticated new way to commit fraud, and virtually every respondent in there needs to be thrown out. That’s why you need these sophisticated tools.
Lenny: It brings up an interesting point when—even with GRIT, when we looked at that. And it occurred to me, like, “Well, this is actually a synthetic respondent,” because the answers were contextually accurate. I absolutely could see that this is a point of view of a real respondent. Now, even though they were biased in a specific direction—it’s like the AI respondent loved AI. Any question we had in GRIT, it was like, “Yeah. [laugh] AI is the best thing ever.” So there certainly was a very defined perspective that was a good flag. But, you know, there’s this massive debate happening. You and I are recording this at the very end of June. Mark Ritson just released kind of the shot across the bow on the glories of synthetic sample. And although I didn’t agree with everything, I couldn’t disagree with a lot either in terms of, at a very high level, as data resources are being synthesized and built up, the capability to duplicate and mimic and synthesize what an individual or a segment may look like is absolutely doable, and in some use cases, I would argue, valid. But it starts with good, quality data, for one, for that to even be a consideration. And then, secondly, it is limited to the use case—is kind of my default position. What’s yours? I mean, obviously, you guys sit on so much individual-level data, I have to imagine that you’re thinking the next step for Dynata is actually monetizing that as a source for synthetic respondents for specific things. I mean, why wouldn’t you? Every company should be doing that. So what’s—just kind of what’s your thinking on this world?
Steven: There’s a number of different kinds of use cases for synthetic data, and I think the first thing I would note is that there’s nothing new about synthetic data other than the name, right? It has been something that data scientists, like myself, have been doing for 50 years, 60 years. And what it is, is just saying, “I’m going to estimate a data value that I can’t observe or haven’t observed, and I’m going to use it as though I had observed it.” And when you do that, you have to add a little extra variance in so that you’re not overestimating how right you feel you are, right? And that takes a lot of forms. It can be something as simple as mean imputation, just giving everyone that’s missing the average so that you can use them in other tools, to using regression, to multiple imputation modeling. So those use cases are well known. Things like ascription and fusion—you buy a data set, and then you combine it with your data set, even though they’re different sets of people, by using lookalikes. Again, that’s synthetic data, and we use it. So, talking about this, there’s nothing on a baseline that’s wrong with using this kind of technology. What you have to focus on is, is it appropriate for the thing you’re doing, and will it give you reliable answers. Like I said before, the biggest thing is you’re doing this because you have decisions to make, and are you going to trust bad data to drive your business? I would think most of us would say no. So Mark Ritson—I did read the piece, and he was honest enough to mention in there that he is a co-owner [laugh] of what he was talking about. So take that with a grain of salt, as much as you want to take a grain of salt with me as a guy who works for a survey panel company. So—but let’s talk about the research and the facts behind it. When you are talking about using synthetic data for—synthetic panels, I should say. So a synthetic panel is: I’m going to create a series of personas that don’t exist but which combine in aggregate to look like a population I care about. And then I’m going to ask the language model to pretend to be these people, these personas, and answer questions as though they were that persona. And this kind of mimicry is something that language models do pretty well. They do sound like those people, but that doesn't mean that they answer the way people would answer on a factual basis. So why is that the case? Well, partially, we don’t know, because we don’t know why language models work in the first place.
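As a rough illustration of the "add a little extra variance back in" point about imputation, here is a minimal sketch assuming NumPy. The income values are invented, and this stands in for no particular production method; it simply shows how naive mean imputation shrinks variance while stochastic imputation preserves it.

```python
# Toy comparison of mean imputation vs. stochastic imputation.
import numpy as np

rng = np.random.default_rng(42)

# Observed incomes with some values missing (np.nan).
income = np.array([52_000, 61_000, np.nan, 48_000, np.nan, 75_000, 58_000], dtype=float)
observed = income[~np.isnan(income)]

# Naive mean imputation: fills gaps but shrinks the spread of the data set,
# making downstream estimates look more certain than they really are.
mean_filled = np.where(np.isnan(income), observed.mean(), income)

# Stochastic imputation: draw fills from the observed distribution so the
# imputed values carry realistic spread instead of collapsing to the average.
stochastic_filled = np.where(
    np.isnan(income),
    rng.normal(observed.mean(), observed.std(ddof=1), size=income.shape),
    income,
)

print(f"variance after mean imputation:       {mean_filled.var(ddof=1):,.0f}")
print(f"variance after stochastic imputation: {stochastic_filled.var(ddof=1):,.0f}")
```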
Lenny: [laugh].
Steven: I mean, right? I think that’s really important to understand. So, if you’ll indulge me, let me just quickly, really briefly, define what the large language model is doing.
Lenny: No, please do. This is a whole other conversation we could have. So go ahead.
Steven: Well, large language models are a form of transformer neural network, which is not going to be interesting to most people. But what it’s doing is it is looking at this massive amount of data that it has collected, and then it has created relationships between words. Not technically words—they’re really called tokens, which can be pieces of words. And so it knows that when this word happens, this other word often happens. And it’s just magnificently huge, billions of data points. And then what it does is it uses all that information to do the same thing you do on Facebook or Instagram when someone says, “Hey... type ‘I am the wizard of...’ and then just keep hitting that middle button on your keyboard so that it just keeps prompting the next word,” and it creates a funny sentence. Literally, that is what this is doing. It is assigning the next most probable word, and that is all it’s doing. So it doesn’t think about things. It doesn’t understand facts. It’s not a search engine. It is just coming up with the most probable next word, and no one is entirely clear why this works so well. So you’ll hear this referred to as an emergent property. So we know how the model is created. We have no idea how it’s producing the results it produces. This is also why we don’t know why it hallucinates. So, number one, what it is attempting to do is to predict the next right word for whatever scenario you’ve given it, based on the context of the prompt. So if I ask it to speak in the style and manner of Donald Trump or Joe Biden, it will write something that looks just like them, with the exception of Donald Trump, where—because it won’t use all-caps.
Lenny: [laugh].
Steven: [laugh]. And in addition, as I mentioned earlier, the grammar is always correct.
Lenny: Right [laugh]. All right.
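To make the "keep picking the next most probable word" loop concrete, here is a toy sketch using a hand-made probability table. Real models learn these relationships over billions of tokens; this only shows the shape of the idea.

```python
# Toy next-word predictor: pick the most probable continuation, one word at a time.
import random

# Hand-made "probabilities" of which word follows which (the real thing is learned).
next_word_probs = {
    "i":      {"am": 0.7, "like": 0.3},
    "am":     {"the": 0.8, "very": 0.2},
    "the":    {"wizard": 0.6, "best": 0.4},
    "wizard": {"of": 1.0},
    "of":     {"oz": 0.9, "odds": 0.1},
}

def generate(start: str, length: int = 5, greedy: bool = True) -> str:
    words = [start]
    for _ in range(length):
        choices = next_word_probs.get(words[-1])
        if not choices:
            break
        if greedy:
            # Always take the single most probable continuation.
            words.append(max(choices, key=choices.get))
        else:
            # Sample proportionally to probability (adds some variety).
            words.append(random.choices(list(choices), weights=choices.values())[0])
    return " ".join(words)

print(generate("i"))  # e.g. "i am the wizard of oz"
```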
Steven: Yeah. So, with all that behind us, when we think about using it for the purpose of pretending to be someone and giving us survey responses, how is it giving you a reliable answer, and what is it doing—you know, based on that model that it’s using—that may not be like people? So there's a lot of research on this. One of the things researchers find is that these tend to be overly positive, so they like things too much; they use things too much. Another thing these things do is that they tend to over-produce the most common response. So if I were to take 100 people who are Hispanic men, who are 30 years old, who live in Detroit, and I were to ask them the same question, I would get a diversity of answers, because that group is not monolithic. No population is. If I were to ask a language model to do the same thing, pretend to be that person, and give me 100 answers, I will get 100 very closely associated answers. I’m not going to see a ton of variation from person to—quote, unquote—“person.” So this regression to the mean is really problematic in research, partly because we care about everyone in a population, not just the average.
Lenny: Right.
Steven: And the distribution matters to the research.
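One rough way to see that collapse in practice is to measure how concentrated a set of answers is. Here is a minimal sketch with invented example answers, using Shannon entropy as the diversity measure:

```python
# Compare the diversity of real vs. persona-generated answers (illustrative data).
from collections import Counter
from math import log2

def shannon_entropy(answers: list) -> float:
    """Higher entropy = more diverse answers; 0 = everyone said the same thing."""
    counts = Counter(answers)
    total = len(answers)
    return -sum((c / total) * log2(c / total) for c in counts.values())

human_answers = ["price", "taste", "habit", "price", "ads", "friends", "taste", "price"]
persona_answers = ["price", "price", "price", "value", "price", "price", "price", "price"]

print(f"human diversity:   {shannon_entropy(human_answers):.2f} bits")
print(f"persona diversity: {shannon_entropy(persona_answers):.2f} bits")
```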
Lenny: Well, the outliers, the richest insights come from the outliers.
Steven: Outliers are huge, yeah. There are no outliers [laugh] in language models. The third thing to keep in mind is that the data are not current. So language models—I think the most current language model, I want to say it’s four months old right now, but most—
Lenny: Isn’t Grok real-time?
Steven: Grok’s not a commonly used resource yet, but I don’t think any of the language models are real-time. The reason is that it takes an enormous amount of computational power and time to build the foundation model. And the foundation model is what’s being pulled from to run it. So ChatGPT, Gemini, the ones people are commonly using, you know, they tend to be six months or a year old. So it’s very difficult to go into one of these tools and ask it a question where the intent is to get a moment-in-time answer. You think about a tracker, right? You care about what’s happening over time. These things are—they’re not there. If I were to ask ChatGPT questions about the winner of yesterday’s primary, it has no idea. You know, and in fact, most of them wouldn’t know the disposition of the various court cases that the Supreme Court has decided in the last few months. These things just don’t exist for it. So, if you’ve got a new brand, if something’s happening in the marketplace that affects your brand, you’re not close to that. So there are a lot of problems with using it literally in place of a survey. It doesn’t answer the same kind of question. Now, what it does do pretty well is simulation. So, if I want to create a simulated population and then use that simulated population to say, “Okay. If I change this thing, how is that population going to react to it,” that can be interesting and then lead you to testable hypotheses. But you would still need to talk to people to test those hypotheses.
Lenny: Yeah. I agree. I think of it as agent-based modeling on steroids.
Steven: Yeah.
Lenny: Right? Yeah.
Steven: And I think probably the worst thing that I’ve seen lately—and I’ve actually been pitched by three different companies trying to sell this to Dynata—is this notion of creating sample to fill in the gaps. So I have 1,000 respondents, but only 100 are 18 to 24, and I need more. I can’t tell you anything about 18- to 24s. I can’t break it down by men and women with 100 people. I can’t break it down by income or census region. I just need more sample. So what these vendors do is they use artificial intelligence to look at all of the responses that you did collect, replicate the kind of variance and the correlation that you see in that, and then produce, let’s say, another 100 18- to 24-year-old males. And then, “Oh, look. Magic. You’ve got 200. Now you can run your stat test.” And there’s so much wrong with that. I know this is probably not the place to talk about the Law of Large Numbers and the Central Limit Theorem, except to say that those are the two fundamental principles upon which all stats are based, and this violates both principles. But a very simple way of thinking about it is: let’s say I roll two six-sided dice, and I do it 20 times, and I get an average of 9. Well, with two six-sided dice, the average is 7, right? But I got 9 because I had a small sample size. So now what I’m going to do is use a tool that is going to look at all the rolls that I’ve already had, replicate that variance, and give me 1,000 more rolls. What’s the average value going to be at 1,000 rolls? It’s going to be 9, right? So all these things are doing is creating a sense of false confidence around the wrong answer. We want more variance. We don't want less variance, right? We don’t want to replicate the results. If that’s all we wanted to do, we could just cut and paste those 100 people, now we have 200, and run our stat tests. So there’s a lot of chicanery happening. A lot of P.T. Barnum stuff happening in the ecosystem.
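Steven's dice example is easy to simulate. The sketch below, assuming NumPy, contrasts resampling from an unlucky small sample (which stays stuck near the wrong mean of 9) with genuinely collecting more rolls (which drifts back toward the true mean of 7); the 20 "observed" rolls are invented to average about 9.

```python
# Simulating the dice example: replicated sample vs. genuinely new data.
import numpy as np

rng = np.random.default_rng(7)

def roll_two_dice(n: int) -> np.ndarray:
    return rng.integers(1, 7, size=n) + rng.integers(1, 7, size=n)

# Pretend the 20 real rolls happened to average about 9.
small_sample = np.array([9, 10, 8, 11, 9, 7, 10, 9, 12, 8,
                         9, 10, 7, 9, 11, 8, 10, 9, 9, 8])

# "Synthetic augmentation": resample from the observed rolls 1,000 more times.
synthetic = rng.choice(small_sample, size=1_000, replace=True)

# Real augmentation: actually roll the dice 1,000 more times.
real = roll_two_dice(1_000)

print(f"small sample mean:        {small_sample.mean():.2f}")  # ~9.1
print(f"synthetic-augmented mean: {synthetic.mean():.2f}")     # still ~9
print(f"genuinely augmented mean: {real.mean():.2f}")          # ~7, the true value
```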
Lenny: I agree. And like you said, there are certainly opportunities where it can be fit for purpose.
Steven: Mm-hm.
Lenny: You know, early-stage hypothesis testing. All right. When there’s not a hell of a lot of risk but you want to streamline the process. I get it. I totally agree, but it’s not going to answer the business question that really comes. The way I’ve been thinking about it—so let’s see if you agree with this hypothesis, and I’ve talked about it publicly quite a few times. I do think that we’re moving towards a place, fairly rapidly, where we will be able to leverage existing data, particularly at the individual level, more efficiently to answer more questions. And increasingly, the role of research will be what I think of as last-mile data, right? Filling in gaps of information that don’t exist within—let’s call it a data graph for the individual. I know that you guys, you know, explored that concept a few years ago as well. So theoretically, if there is a way to build out, you know, a significant amount of first-party data at the individual level, basically you can create a digital avatar of an individual. And, in some things, predict with some accuracy. If given a choice of buying this or that, they’re going to buy this because they always buy this, right, type of thing. So fine, we don’t need to ask that question. We don’t need to ask, you know, if you buy toilet paper. We know it. There’s no need to ask that. We got that. Why do you choose that toilet paper? Right? Those things—that becomes the driver of primary research. It’s filling in the gaps of information. And I think that, increasingly, it won’t necessarily—from a form factor standpoint—look like a survey from an instrument perspective. It’s probably going to look something more like a chat. And that’s driven just by the form factor of the devices, right? I mean, they’re going to push this stuff down [laugh] through every device. We’re back to the—we’re back to telephone. We’re back to voice, you know, because [laugh] that’s what they’re going to push through. So that’s my thinking, and I do think that’s happening faster than—if you asked me a year ago, I would have said that’s five years out. Ask me today, it’s like, oh, that’s maybe, like, a year or 18 months before we start seeing that really starting to happen significantly. So let me—yeah. What do you think?
Steven: Yeah. Well, I would tell you that’s already happening. Um...
Lenny: [laugh] So, well there goes—there you—I always suck at timing. I’m good at predicting what’s going to happen. I am terrible at predicting when it’s going to happen.
Steven: Yeah. But I—you know what? Three years ago, I was talking about things that were five to ten years out that are on my phone now, so take it with a grain of salt from me as well. But I think that what we’re really looking at is that the big, new innovation that I think is actually ready for primetime is the use of these chat agents as interviewers, because they’ve gotten very, very good at it. There are people who argue whether they would pass the Turing Test or not. It depends on the use case. It depends on the tool. But I think we can all agree that it’s close, right? We’re on the cusp of a Turing Test win. The Turing Test, if you’re not familiar, is whether or not, if you are interacting with a computer via text—so, you know, you can’t see or hear it—you would know that the person on the other end is not a human. And if the average person would not be able to tell the difference, then it passes the Turing Test. So chatbots as interviewers is really, really an interesting thing. So we’re already using that. I think by the time this releases, we’ll have already done the press on it, but we are releasing a new tool to expand the amount of information we get from open-ends. So this is just step one. So if I really want to know more about why someone likes my product, it’s difficult to do that in a survey because open-ends tend not to garner a whole lot of value. I think the data I saw suggested that the average open-end is six to eight words in length. So, you know, my open-end might say, “When you think about cookies, what’s your favorite cookie?” And the person will say, “Oreos.” Well, that’s all I know now. I could have another open-end that says, “Tell me why you answered what you answered.” So what the tool we’ve built—or we’ve partnered to build—does, is you ask the question, you know, “Do you like cookies?” “Oh, I like Oreos.” And then it uses the context of the question, the context of the answer, and then the context of the hidden prompt to ask you additional questions. “What is it you like about Oreos?” “Oh, I really like the cream filling.” “Interesting. And what is it about the cream filling that really excites you?” And it can take you down a few iterations. You can decide how many iterations you want. But then you go from 6 to 8 words to maybe 50 or 100 words. Now you’ve got some really rich data. The other thing that language models do really well is they synthesize and find patterns. So the downside, historically, of getting a lot of verbatims, a lot of open-end answers, is that it’s a pain in the butt, and it’s really manually intensive to code it into something real that you could use for insights.
Lenny: Yeah.
Steven: Well, guess what? The language models can do that too. So you combine those two tools, and now you’re able to do qualitative work at quantitative scale, and you can interpret at speeds that are—you know, would have been unheard of five years ago. And then, on top of that, one other thing the language models do really, really well is they translate. They do a really good job translating, and so you can also translate all of this stuff into a common language if you’re doing a 12-country study. Like, you know, we do multi-country studies all the time. It turns it into the common language spoken by the person who is going to be doing the analytics, and then draws out and synthesizes what were the key elements in there. So it’s really fabulous new tech. And I think that’s primetime. That’s today. I think tomorrow is going to be full in-depth interviews being conducted by chatbots in an intelligent way. That’s not today, but I don’t think that’s very far.
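As a rough sketch of both ideas—probing an open-end with an AI follow-up, then coding and translating the resulting verbatims—here is a minimal illustration. The call_llm function is a stand-in for whichever model API you use (not a real library call), and the prompt wording, probe limit, and themes are assumptions rather than the actual tool described above.

```python
def call_llm(prompt: str) -> str:
    """Stand-in for your model provider's chat/completion call; replace before use."""
    raise NotImplementedError

def probe_open_end(question: str, answer: str, probes_so_far: int, max_probes: int = 2):
    """Ask one conversational follow-up about the 'why', or stop at the probe limit."""
    if probes_so_far >= max_probes:
        return None
    hidden_prompt = (
        "You are a survey interviewer. Given the question and the respondent's short "
        "answer, ask ONE friendly follow-up question that digs into why they answered "
        "that way. Do not suggest answers or change the topic."
    )
    return call_llm(f"{hidden_prompt}\n\nQuestion: {question}\nAnswer: {answer}")

def code_and_translate(verbatims: list, themes: list, target_language: str = "English") -> str:
    """Translate each verbatim, then assign it to exactly one theme, one line per response."""
    prompt = (
        f"Translate each response into {target_language}, then assign it to exactly one "
        f"of these themes: {', '.join(themes)}. Return one line per response in the "
        "form 'translation | theme'.\n\n" + "\n".join(f"- {v}" for v in verbatims)
    )
    return call_llm(prompt)

# Example usage (with a real call_llm wired up):
#   probe_open_end("What's your favorite cookie?", "Oreos", probes_so_far=0)
#     -> e.g. "What is it about Oreos that you like so much?"
#   code_and_translate(["Me encanta el relleno de crema", "Trop sucre pour moi"],
#                      ["taste", "texture", "price", "other"])
```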
Lenny: Yeah, I agree. And you know, it’s interesting—I was just on a call earlier today with a company that was kind of early into AI qualitative at scale. Let’s kind of leave it at that. It might have only been, like, a few years ago. But we were updating on what I think has changed in the market, and there were two recent data points. One was—as we’re talking, IIEX Europe was this week, and I heard from multiple folks that they were already over the term “qualitative at scale.” The market had moved to, “well, yeah, and...?” Right? That’s become table stakes already. That is not a differentiator at this point because, for the last year, we’ve seen all the innovation and the discussion around that. So the market, from a buyer’s standpoint, is already moving towards, “I don’t want to hear about the latest AI. We expect that you have been incorporating these solutions to create efficiencies in workflow and output. And if you’re not, then, you know, [laugh] we’re probably not going to work with you, because those things are undeniable. Let’s talk about business, right? Let’s talk about how it gets us to a better answer faster and more cost-effectively.” So that was one. And the second was, even three months ago, I would have said that we hadn’t crossed the adoption chasm yet. You know, there was still a lot of just kind of toe in the water, and I think that’s changed. It’s changed because of the workflow efficiencies people are seeing in other areas of their organizational lives, and now they’re looking for that within research as well. So, sorry, my long-winded way of saying I think that we’ve gotten to two places really quickly already. We’re going to see an accelerating pace of adoption of AI-driven tools, and the fact that they are AI-driven tools is not the differentiator. It’s just, okay, now it’s time to do these things, and tell me why this really matters—not because “oh, it’s got AI baked in.” So are you seeing that? What do you think?
Steven: Yeah. It’s interesting. There’s a chasm between vendors, like Dynata, and brands. So vendors are really pushing the AI envelope. We’re really out there trying to figure out how is this going to make this faster, better, cheaper, differentiate, bring new tools to the market that wouldn’t have been possible before. Brands are being much more cautious. A lot of this has to do with some of the things you’ve seen in the news. So people have unintentionally loaded entire code bases for their products into ChatGPT, and it became part of the training set. So there are still a surprisingly large number of companies that have policies forbidding the use of these tools.
Lenny: Yes. Agreed. Agreed.
Steven: Yeah.
Lenny: And so they’re building their own.
Steven: Mm-hm. Yeah. But some of them aren’t doing anything. They’re just saying, we don’t want to be a part of this yet. The walled gardens are the best options, I think, for most people—a walled garden being a language model that lives within your own infrastructure so that nothing leaves. But it’s not just the language models. I think the place where I really agree with you the most, and I agree with a lot of what you’re talking about, is that it’s really become performance driven now. So you do need to have AI. You need to be able to say you have AI, or they think you’re a dinosaur. But what they really want to know is, what’s the impact for me if I use your tool? So, for example, our data quality process is called Quality Score. And Quality Score uses those 175 behavioral data elements that are observed while people are taking surveys, and it applies them to a learning model, to an artificial intelligence model, that not only gets rid of bots and survey farms, as you’d expect, but is also able to understand, looking at the breadth of it, what the probability is that this person was sufficiently engaged that we should use their data, right? Were they petting the dog? Were they watching their soaps? Were they paying attention and answering the question? So AI systems like that are hugely important, and that’s what we were talking about earlier—the industry might have as much as a 40 percent rejection rate. No one really knows, but, you know, it’s somewhere in that 20 to 50 percent range you were talking about. Our rejection rate is 3 percent, and this is why. We take advantage of these advanced tools. And so, when we talk to a client about this, we do—obviously, we tell them it’s AI, but the most important thing that they care about is 3 percent rejection, right? I’m not going to have to spend all of this time going line by line in my data and figuring out whether or not somebody was real, whether someone was attentive or not. And in fact, if you’re familiar with I-COM, we won the Data Creativity Award this year for our Quality Score. Not only did we win our group, we actually won the entire event. So there were 27 finalists, 4 category winners, and 1 overall winner. We were the overall winner. Or, as I like to say, we were the best poodle and best in show.
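As an illustration of the general pattern Steven describes—many behavioral signals feeding a learned model that scores engagement—here is a minimal sketch using scikit-learn's logistic regression. The feature names, example values, and model choice are assumptions; this is not Dynata's Quality Score.

```python
# Illustrative engagement scorer: behavioral signals in, probability of engagement out.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row is one respondent; columns are behavioral signals observed during the survey
# (seconds per page, straight-lining score, paste events, mouse-movement irregularity).
X_train = np.array([
    [14.2, 0.1, 0, 0.05],
    [ 1.3, 0.9, 2, 0.90],
    [11.8, 0.2, 0, 0.10],
    [ 2.1, 0.8, 1, 0.85],
])
y_train = np.array([1, 0, 1, 0])  # 1 = engaged human, 0 = fraud/inattentive (labeled historically)

model = LogisticRegression().fit(X_train, y_train)

# Score a new respondent and decide whether their data should be delivered.
new_respondent = np.array([[3.0, 0.7, 1, 0.80]])
p_engaged = model.predict_proba(new_respondent)[0, 1]
print(f"probability of genuine engagement: {p_engaged:.2f}")
```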
Lenny: [laugh]. No, that’s great. I want to be conscious of your time, as well as our listeners’, and I’m glad that you brought the conversation back to quality. Because I think, one, we’ve gone all over the place in this conversation, but within certain limits, and I do think that fundamentally, for our audience, the old saw of garbage in, garbage out applies even more today than ever before. And I share the concern about the LLMs from a training data standpoint. Not only is there a lot of—with the big, you know, general LLMs, there’s a lot of garbage out there, and there are also biases built into the algorithms, as we, you know, saw with some of the maybe early experiments—some of the launches that came out. It was like, “Oh, I don’t think you got that quite right, did you?” [laugh] So...
Steven: So are you saying I shouldn’t put glue in pizza dough?
Lenny: Yeah. Eating rocks may not be best. The—you know, the founding fathers probably were not—anyway...
Steven: Yeah [laugh].
Lenny: [laugh]. So...
Steven: There’s probably not a lot of African Nazis out there.
Lenny: Not a lot. But to the core, that doesn't mean we throw the baby out with the bath water. This technology is here. It's evolving. Those things are being fixed. And my belief is that the role of insights as a whole—insights and analytics, the way I think of the industry—will increasingly be not just the guardians of the why, the filling in of the gaps of information, but also the guardians of the quality, to make sure the data that’s coming in—and to your point about walled gardens, right, that is a fantastic use case for insights organizations, for panel companies, for suppliers: to help be the feeds of quality data that build custom LLMs, you know, within an organization based on good, quality data. And we damn well better, as an industry, be leaning into that. You know, folks like Mark Ritson who are kind of “oh, research is going to go away”—I’m going to call BS on that. I don’t believe that in any way, shape, or form. I think the dependence on research will only increase. Now, the business model may change, right? The unit of measurement may not be the project anymore. It may not be the interview. It may be the data point. That may be the unit of value creation that we’re moving towards. But our ability to collect, manage, and make sense of that data is only going to grow in demand, is my sense. So what do you think? Are we looking at the apocalypse through our robot overlords? Or are we looking at a great future, where, you know, we’re just getting better and better together?
Steven: One thing I like to say is that the difference between garbage in, garbage out before and now is that before, you could look at the garbage going in and say that’s garbage. And today, it’s really hard to know unless you really understand how these models are built. I thought it was a great point that you brought up about the internal biases. These models are as good as the data they’re trained on, and they are not transparent about where the data is coming from. They talk about adding a bunch of data from user boards, but have you read a user board?
Lenny: Yeah. [laugh] I know. Yes. Reddit can be a scary place [laugh].
Steven: Yeah. There’s a lot of scary places on the web. So are we going down the path of our robot overlords? No. I mean, looking at where the technology is today, worrying about AI taking over the world and killing us all is a little bit like worrying about overpopulation on Mars. We kind of sort of get how we could get to Mars, but overpopulation is maybe not a today problem. That being said, is it a five-year problem? Is it a ten-year problem? I think that is fundamentally going to be based on when quantum computing becomes affordable. Once quantum computing becomes affordable, then I would say we can really have these conversations about terrifying prospects.
Lenny: Well, and speaking of—we don’t know how that really works either [laugh]. That’s a whole other aspect of—qubits? Anyway...
Steven: Yeah. Well, at least we can mathematically describe that, right? But yeah, to your point—every couple of years, they say that it’s the end of research, and we don’t need people to understand things anymore. And we have never once built a thing that understands and considers facts that is not a human. So, of course, researchers are going to be necessary. I think there will be parts of the industry that are going to really struggle. I think people whose job it is to code open-ends, companies that do that. Translation companies are already really feeling the pinch here. So it’s not like nobody is going to be impacted. Absolutely, people are going to be impacted. It will create other jobs. But the specific value of having a human who understands the data and understands the business and how those things relate is not going anywhere.
Lenny: Yeah. Well said, Steven. Well said. That’s probably a good place to kind of end the discussion. But I will—is there anything that you wanted to make sure that we touch on that we did not?
Steven: I think we did a pretty good job. [laugh] I don’t know that I have something to add there.
Lenny: Yeah. Good conversation. And I hope that we have more of them because this topic is not going away [laugh].
Steven: Next time, you can have the conversation with my simulated AI computer persona.
Lenny: I saw—I watch Product Hunt, right, every day, and get their emails. And they released one last week called Butterfly, which trains your own personal avatar off of all of your data. And it’s supposed to be almost like a therapy thing, to kind of talk to yourself. And it probes and asks. I wasn’t quite sure. Like, when I talk to myself, that doesn't mean that I’m getting psychiatric help. It’s usually bad, right [laugh]? So...
Steven: There’s startups out there who are doing that to create replicas of people who have died.
Lenny: Yes. Oh, I know.
Steven: Yeah.
Lenny: It’s a bad Black Mirror episode.
Steven: It’s—it does feel really bad.
Lenny: Yes. That truly was a bad Black Mirror episode. That was several [laugh]. And so... and I don’t mean bad as in, like, it wasn’t high quality. Every Black Mirror episode is high quality. It’s bad as in, like, “Ugh. No. [laugh] I don’t want that.”
Steven: Yeah. I mean, interesting, on that subject, I wanted to see whether or not it could speak in my style, so I loaded up a whole bunch of stuff that I wrote. And then I talked to it, and it didn’t sound anything like me. And I was—as I was sort of sorting out what the problem was, it occurred to me that I don't write the way I talk. Nobody does.
Lenny: Mm-hm. Right.
Steven: Nobody writes the way they talk. And so it’s fundamentally flawed to assume that you can take everything that somebody’s written, and then interact with them like someone you would talk to. Now, it did a really good job, once I thought this through, writing an essay on a topic in my style and in my voice. But the notion that I could converse with it, it’s nonsensical.
Lenny: Yeah. It’s going to be interesting times.
Steven: It is interesting times.
Lenny: Yeah. Well, it already is. Well, yeah, well, I mean—yeah, the Chinese curse. We’ve been in interesting times for quite a while [laugh]. So...
Steven: Yes. I was telling a friend this. It’s been a long week for about twelve years now.
Lenny: [laugh]. Agreed. All right. Steven, we could go on and on about this. Thank you. Thank you so much for taking the time.
Steven: Thank you, Lenny. I really appreciate your time and inviting me on.
Lenny: And coincidentally, thank you. Dynata is a sponsor of the Greenbook Podcast, so thank you on behalf of us to Dynata. We appreciate that. And I guess other thanks should go out to our producer, Natalie; our editor, Big Bad Audio; and, of course, our listeners. Thank you so much for taking time to spend it with us. That’s it for this edition of the Greenbook Podcast. Everybody, be well. We’ll talk to you again real soon.