The Prompt

October 31, 2023

Two Questions Researchers Should Ask of Their AI Vendor

As AI tools proliferate, it is crucial for researchers to assess the data practices of their vendors. Understanding how data is used and protected helps researchers make better choices.


by Lisa Horwich

Founder & Research Principal at Pallas Research Associates

With all the buzz about the latest and greatest new AI tool, it would behoove us researchers to take a step back and ask prospective (or current!) vendors these two questions:

  1. What are you doing with my data?
  2. How are you protecting my data?

The first question has received a lot of scrutiny, especially after the Zoom “Terms of Service” debacle last August, but many researchers don’t fully understand how their data is being used in the context of generative AI/LLM model training. And most researchers (except those who focus on IT and the technology industry) may not be well versed in cybersecurity risks and mitigation techniques.

So, let’s address both of these questions and equip you, the research community, with the information you need to make an informed decision about which AI solution best supports your research studies without violating the privacy of your participants/respondents.

Question 1: What are you doing with my data?

This question is fundamental to maintaining data privacy and confidentiality as well as ensuring adherence to the corresponding rules and regulations.

At a minimum, any AI solution should be GDPR- and CCPA-compliant (yes, there are many other data privacy rules and regulations out there, but these two tend to be the most stringent). Here is an easy rule of thumb for figuring out which of the myriad data privacy laws applies in your specific situation:

  • Does the company do business in the country/state where the law applies?
  • Will the data be stored in or sent to the country/state where the law is enforced?
  • Does the participant/respondent live in (or is a resident of) the country/state where the law is in effect?

Whichever law applies in the situation, or is the most stringent if multiple ones apply, is the one the AI vendor should follow.
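The rule of thumb above can be sketched as a simple checklist function. This is an illustrative sketch only, not legal advice: the jurisdiction codes and the law-to-jurisdiction mapping are example assumptions, not an exhaustive list.

```python
def applicable_laws(vendor_jurisdictions, data_jurisdictions, participant_jurisdictions):
    """Return the privacy laws that may apply, given where the vendor does
    business, where the data is stored/sent, and where participants reside.
    Each argument is a set of jurisdiction codes (illustrative examples only)."""
    # Example mapping of a few jurisdictions to the laws they enforce.
    laws_by_jurisdiction = {
        "EU": "GDPR",
        "UK": "UK GDPR",
        "CA": "CCPA",   # California
        "BR": "LGPD",   # Brazil
    }
    # Any jurisdiction touched by vendor, data, or participants is in scope.
    touched = vendor_jurisdictions | data_jurisdictions | participant_jurisdictions
    return {laws_by_jurisdiction[j] for j in touched if j in laws_by_jurisdiction}

# Example: US-based vendor, data stored in California, EU-resident participants.
print(sorted(applicable_laws({"US"}, {"CA"}, {"EU"})))  # ['CCPA', 'GDPR']
```

As the article notes, when multiple laws apply, the vendor should follow the most stringent one.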

The other area is understanding whether the data you upload will be used, in turn, to train the AI model. Training a bespoke model (one that is either internally developed or is ‘walled off’ from other users) with your data is very useful in making the model smarter for your business. But you must ensure that the data isn’t being used to train a general-purpose AI model or LLM (like OpenAI’s GPT models, Google’s Bard, or Anthropic’s Claude). Explicit assurance from the vendor is key – OpenAI does a good job of this within their terms of service:

[Screenshot: excerpt from OpenAI’s terms of service describing how customer data is used for model training]

Note they not only explain which data is used for model training, but they also offer the ability to opt-out.

Once you are assured of how your data is being used (or not used!) the other question now needs to be addressed.

Question 2: How are you protecting my data?

I spend much of my time interviewing security professionals and this very question keeps most of them up at night.

If you think of a Venn diagram with “Cybersecurity Risks” in one circle and “Data Privacy Risks” in the other circle, the overlapping area is the unauthorized use of PII (personally identifiable information). This is an area we want to make sure our vendors are taking extraordinary measures to secure and protect.

Cybersecurity threats center on vulnerability management – preventing ‘bad actors’ from succeeding with hacking, ransomware, and DDoS (distributed denial of service) attacks. Security teams have options to mitigate these risks: deploying firewalls and restricting access through advanced verification methods. In addition to securing the data, they should be encrypting it both ‘in transit’ (between computers/servers) and ‘at rest’ (where the data is stored).

Privacy threats can arise from data processing, where disparate sets of data are linked to identify an individual; that data can then be shared without consent. Privacy is also put at risk when companies deviate from security and data best practices, standards, and regulations. To mitigate these threats, any AI solution should ideally de-identify, anonymize, or pseudonymize any PII. In addition, we should upload only the bare minimum of data necessary to complete our analysis.
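The two mitigations above – pseudonymizing PII and uploading only the bare minimum – can be sketched in a few lines. This is a minimal illustration, not a substitute for a vendor’s protections; the field names and secret key are hypothetical, and a keyed hash (HMAC) is one common pseudonymization approach among several.

```python
import hmac
import hashlib

# Hypothetical secret key: kept locally, never uploaded, so tokens
# cannot be reversed by anyone who only sees the uploaded data.
SECRET_KEY = b"replace-with-a-locally-stored-secret"

def pseudonymize(value: str) -> str:
    """Replace a direct identifier with a stable, non-reversible token,
    so the same participant always maps to the same token."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()[:12]

def minimize(record: dict, needed_fields: set) -> dict:
    """Data minimization: drop every field not required for the analysis."""
    return {k: v for k, v in record.items() if k in needed_fields}

# Hypothetical participant record from an interview study.
record = {
    "name": "Ada Example",
    "email": "ada@example.com",
    "transcript": "I found the onboarding process confusing...",
}
safe = minimize(record, {"email", "transcript"})
safe["email"] = pseudonymize(safe["email"])  # token stands in for the address
```

After this step, only the pseudonymized token and the transcript would be uploaded; the name never leaves your machine, and the key needed to link tokens back to people stays with you.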

Ideally, the AI solution should be built with privacy-enhancing technologies and, in a perfect world, designed from the beginning with privacy in mind – rather than bolting privacy onto a solution that already exists.

When our data is well protected, we can be assured of confidentiality, integrity, and availability as well as privacy – making us and our stakeholders/clients able to rest assured.

So, as you evaluate your next AI solution (or any Software-as-a-Service offering), ask your prospective vendor the two simple questions outlined in this article. And if you are already using an AI tool, be sure to ask your solution provider how they are currently using and protecting your data.

Tags: data privacy, artificial intelligence


Disclaimer

The views, opinions, data, and methodologies expressed above are those of the contributor(s) and do not necessarily reflect or represent the official policies, positions, or beliefs of Greenbook.

