ChatGPT: Data Mining and SEO With OpenAI and Bing

March 10th, 2023 by Will Scott

Key Insights

  • Pubcon Austin 2023 might as well have been called “The State of AI SEO.” And it was fantastic.
  • ChatGPT and Bing GPT are a really easy way to dip a toe into AI for data mining and SEO.
  • Development in AI, in all its forms, is moving lightning-fast. You need to be there, and you also need to be thoughtful about how you engage.

I attended and presented at Pubcon Austin a couple of weeks ago (February 2023) and came away with a ton of new ideas for using ChatGPT for SEO and for doing more effective ChatGPT data mining.

ChatGPT Data Mining - Miners, mining for data as imagined by Midjourney

Typically, a major conference will put out a call for speaking ideas months in advance. Pubcon is no different, and in the early fall of last year, the organizer Brett Tabke did just that.

At the time, I had nothing that I was super passionate about presenting.

But, since the winter holidays, I have been obsessively playing with OpenAI ChatGPT and a dozen or so AI content tools.

I reached out to Brett Tabke, the organizer of Pubcon, and shared my idea for a talk. He was happy to make room for me on a panel — after giving me the requisite grief for being so late.

If you’d like to see that presentation, I have it online here: Chatbots, AI Content, And The Marketer’s Conundrum.

It turns out that earlier, Brett had put out a call to all of the speakers to ensure they addressed AI in their presentations. So, without knowing it, I was ahead of the curve.

Brett presented a mind-blowing keynote, which gave me some great ideas for better use of the tools, and other presentations showed me some amazing tactical tricks.

What Is ChatGPT and How Can You Use It for Data Mining and SEO?

ChatGPT is an artificial intelligence (AI)-based chatbot system that uses natural language processing (NLP) to generate conversations.

It keeps track of the conversation’s flow, context, and previous questions and answers.

This makes it a valuable tool for both data mining and SEO, since you can iterate on an idea and get well-formatted (mostly accurate) answers, thanks to the large language model (LLM) it is built on.

With ChatGPT, you can also ask complex programming questions and get (mostly) accurate responses in a conversational format. That’s perfect for understanding customer behavior, getting statistics for content, or even the kind of granular location-based information you need for local SEO.

For years, the tech industry has embraced AI. Examples include IBM Watson, Tesla, and more. I detailed a few of these in an article I wrote for Near Media.

ChatGPT’s wide range of applications makes it a great interdisciplinary tool — you can use it to learn new AI models, build generative AI tools, or even translate natural language input into other languages.

It’s a revolutionary example of how machine learning can make everyday jobs more efficient.

In short, ChatGPT is a powerful tool with a wide range of applications — perfect for data mining and SEO.

Now, What About Bing GPT?

In addition to discussions of ChatGPT, there was a keynote from Bing’s Fabrice Canel. During his keynote, Fabrice gave the entire audience access to the new Bing GPT.

Interestingly, I had received my preview access the day prior.

It’s important to understand that Bing GPT is not ChatGPT. Bing’s GPT includes the knowledge of the underlying OpenAI LLM, plus the Bing search index.

This is really useful because you can do things like feed it a URL and have it output the data you’re looking for.

Screenshot: Bing GPT Fresh Data Pubcon Austin 2023 Speaker Roster

As you can see, Bing GPT either breaks down or gets bored and stops with the table. There are also some OpenAI GPT limitations in text input and output size.

But the data is great, and because it’s able to use the Bing search engine, the findings are fresh.

By contrast, if you ask ChatGPT the same question, you get a combination of made-up and outdated information. And, though not apparent here, it took ChatGPT a long time to think about this one.

Screenshot: ChatGPT Making Up A Pubcon Speaker Roster

The only one of the above speakers who actually made it to the podium in 2023 is Aleyda Solis, and that was definitely not her topic.

Also, notice how Bing GPT actually cites its sources. It’s clear the Bing folks are thinking about how this technology can continue to add value to the copyright holders whose content is driving their answer.

If you’re using it to do ideation or organization for SEO, ChatGPT is probably safe.

You can’t, however, trust ChatGPT data mining to give 100% correct results. So, it’s critical you fact-check everything ChatGPT creates.

I have no doubt that for the lazy SEO, ChatGPT will be abused and ultimately become a tool with which they accidentally injure themselves, like a hammer bouncing back and hitting them in the head.

Some Other SEO ChatGPT Use Cases

Writing code is only one of the many well-documented examples of use cases for ChatGPT.

Other interesting examples show how ChatGPT can answer complex questions, like how to format text in HTML and other markup.

It can even organize data into tables, as shown in the example above.

And because of the ChatGPT response methodology, you can iterate on your initial question until you get the right answer. I find that I often have to create follow-up questions to get to the prompt I’m looking for.

ChatGPT also has what they call “reinforcement learning models” that enable it to take feedback and improve.

So, back at the office, we found ourselves preparing for an upcoming industry conference, UPCEA, where our Director of Sales and Marketing, Paula French, will be presenting alongside one of our clients, Alicia Jasmin of the School of Professional Advancement at Tulane University.

The UPCEA conference organizers share a list of attendees but don’t provide their contact information. We are presenting, we have a booth, and we have some sponsored research launching.

So, we want to get the word out to the attendees.

We have a new marketing associate starting, and we were planning to have them go through the list and find better contact information for the attendees we were hoping to connect with.

I had the idea that it might be possible to paste information from that list into ChatGPT and ask it to fill out the remainder.

It worked!

But more on that in a bit.

Natural Language Processing With Structured Responses

Here is an example of a ChatGPT response that finds and organizes data in a way that is now useful for an SEO practitioner — or anyone interested.

I asked ChatGPT for neighborhoods arranged by city within the greater New Orleans area. And it gave me a well-organized response.

Obviously, this is not a comprehensive list, but you can see the opportunity for using this mined data for SEO, coming almost directly from ChatGPT.

I asked: “table of neighborhoods arranged by city for the greater New Orleans area”

ChatGPT answered:
Sure! Here’s a table of neighborhoods arranged by city for the greater New Orleans area:

City | Neighborhoods
New Orleans | Algiers, Audubon, Bayou St. John, Black Pearl, Broadmoor, Bywater, Central Business District, City Park, East Carrollton, East Riverside, Faubourg Marigny, French Quarter, Garden District, Gentilly, Holy Cross, Irish Channel, Lake Terrace, Lakeview, Leonidas, Lower Ninth Ward, Marlyville, Mid-City, Milan, Navarre, Riverbend, St. Anthony, St. Claude, St. Roch, Touro, Tremé, Uptown, West End
Metairie | Airline Park, Bonnabel Place, Bucktown, Causeway, Clearview, East Bank, Fat City, Kenner, Lakeview, Metairie Terrace, Old Metairie, Pontchartrain Gardens, Severn, West Esplanade
Kenner | Chateau Estates, Driftwood, Grandlake, Greenlawn Terrace, Highway Park, Kenner, Lincoln Manor, Louisiana Trace, Quail Ridge, Roosevelt Gardens, University City
Gretna | Algiers Point, Aurora Gardens, Garden Park, Gretna, McDonoghville, Mechanickham, Westwego
Westwego | Westwego, Nine Mile Point
Harvey | Barkley Estates, Barkley Place, Beechwood, Bellevue Gardens, Beverly Knoll, Brown Derby, Cottonwood Place, Dutch Gap, Harvey, Heatherwood, Magnolia Trace, Manhattan, Woodland West
Marrero | Ames Farms, Barataria, Lincolnshire, Marrero, Oak Forest, Oak Cove, Oakmere, Plantation Estates, Pritchard Road, Westwood
Terrytown | Terrytown
Belle Chasse | Belle Chasse

Note: This is not an exhaustive list but includes some of the more well-known neighborhoods in the Greater New Orleans area.
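
To make the local SEO opportunity concrete, here is a minimal Python sketch of one way to turn a table like the one above into keyword ideas. The sample rows, the function names (parse_rows, local_keywords), and the "service in neighborhood, city" phrase pattern are all assumptions for illustration, not a method from the talk. And since ChatGPT can misplace a neighborhood, check the list before you build pages from it.

```python
from itertools import product

# Hypothetical sample: a few (city, neighborhoods) rows copied from a chatbot table.
ROWS = [
    ("New Orleans", "Algiers, Audubon, Bywater"),
    ("Metairie", "Bucktown, Old Metairie"),
]


def parse_rows(rows):
    """Turn (city, "a, b, c") rows into {city: [a, b, c]}."""
    return {
        city: [name.strip() for name in names.split(",") if name.strip()]
        for city, names in rows
    }


def local_keywords(areas, services):
    """Build 'service in neighborhood, city' phrases for every combination."""
    phrases = []
    for city, neighborhoods in areas.items():
        for service, hood in product(services, neighborhoods):
            phrases.append(f"{service} in {hood}, {city}")
    return phrases


areas = parse_rows(ROWS)
keywords = local_keywords(areas, ["plumber"])
for phrase in keywords:
    print(phrase)
```

Swap in your own services list and the full table, and you have a starting point for location page ideas or ad groups.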

Chatbots As Personal Assistants

So, about that attendee list…

… I don’t want to expose the information of random conference attendees I don’t know. And, knowing that ChatGPT occasionally gives really authoritative-sounding wrong answers, I figured I could use the speakers at the recent Pubcon to demonstrate this point.

Sometimes, one has to iterate to get the prompt right and, as a result, get the desired response.

I heard it said in a recent presentation that “prompts are the new code.” I think the act of writing prompts for ChatGPT is actually good relationship training. You learn how to ask for exactly what you need without a whole bunch of fluff or drama 🙂

So, here’s a table of the speakers at Pubcon Austin. I had to paste information from the website into ChatGPT to give it the core data.

Screenshot: Pubcon Speakers and Sessions Organized by ChatGPT


I also had to iterate on this a couple of times because ChatGPT is funny, and sometimes, it will stop giving output for no clear reason.

I find it best to think of ChatGPT as a person. It helps avoid frustration when you cannot connect on the point you’re trying to get across.

OK, so, we’ve got our core data.

Now, I want to enhance it with contact information.

Normally, ChatGPT will refuse to respond to inappropriate requests, like giving you what it thinks is private information — even though it knows the answer. That data is out on the Internet somewhere; otherwise, ChatGPT wouldn’t know it.

Email is one of those items. Ask for an email address, and ChatGPT politely declines.

Screenshot: ChatGPT Inappropriate Request - Rand Fishkin's email

Nope! Not going to give me Rand’s email address. However, look how nicely it directs me to Rand’s website to make contact.

But, for some reason, if you ask for it in a table, it will give it to you.

Listen. I know how SEO and marketing folks are. Now that you know this, you’ll abuse the crap out of it, and they’ll kill the feature. Please, please only use it when you need it and recognize that, sooner or later, it will go away.

It’s also important to remember, like in the example above with fictitious speakers at Pubcon Austin 2023, that ChatGPT sometimes makes stuff up and does so in a really authoritative way.

So, as I said, ask (for it in a table), and you shall receive.

# | Name | Company | LinkedIn Page | Email
1 | Gary Illyes | Google | |
2 | Dave Rohrer | North Side Metrics | |
3 | Aleyda Solis | Orainti | |
4 | Martha Vanberkel | Schema App | |
5 | Pat Strader | Data Agility Group | |
6 | Lazarina Stoy | Brado Creative Insight | |
7 | Dave Davies | Beanstalk Internet Marketing | |
8 | Ryan Jones | Razorfish | |
9 | Eric Enge | Perficient | |

And this was the prompt:

Screenshot: ChatGPT Prompt Data enrichment with email and LinkedIn link
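
If you want to reuse this kind of request across lists, a small script can assemble the prompt text for you. The sketch below is a hypothetical illustration in Python. It is not the prompt from the screenshot, and the function name build_enrichment_prompt and the wording of the instructions are my assumptions. It only builds a string that you would paste into the chat window yourself.

```python
def build_enrichment_prompt(people, extra_columns):
    """Return a prompt asking for a table that adds columns to known rows.

    people: list of (name, company) tuples pasted from a public roster.
    extra_columns: names of the columns the chatbot is asked to fill in.
    """
    header = ["Name", "Company", *extra_columns]
    lines = [
        "Create a table with the columns: " + ", ".join(header) + ".",
        "Use exactly these people, one row each, and leave a cell blank "
        "if you are not sure of the value:",
    ]
    lines += [f"- {name}, {company}" for name, company in people]
    return "\n".join(lines)


# Example rows taken from the public speaker table above.
prompt = build_enrichment_prompt(
    [("Aleyda Solis", "Orainti"), ("Eric Enge", "Perficient")],
    ["LinkedIn Page", "Email"],
)
print(prompt)
```

Asking the model to leave uncertain cells blank is a design choice, not a guarantee. It may still fill them in with confident guesses.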

I cannot stress enough that there is a very high likelihood some of the above data is wrong.
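
A cheap first pass at that fact-checking is to flag rows that are obviously malformed before a human reviews the rest. Here is a minimal Python sketch, assuming each enriched row is a dict with "email" and "linkedin" keys (my naming, with made-up sample data). It only checks the shape of each value. It cannot tell you whether an address really belongs to the person, so a clean result still needs manual verification.

```python
import re
from urllib.parse import urlparse

# Loose shape check only: something@something.something, with no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_row(row):
    """Return a list of problems found in one enriched row (a dict)."""
    problems = []
    email = row.get("email", "").strip()
    if not EMAIL_RE.match(email):
        problems.append("email is missing or malformed")
    host = urlparse(row.get("linkedin", "").strip()).netloc.lower()
    if host not in ("linkedin.com", "www.linkedin.com"):
        problems.append("LinkedIn URL is missing or not on linkedin.com")
    return problems


# Made-up sample rows, not real contact data.
rows = [
    {"name": "Example Person", "email": "person@example.com",
     "linkedin": "https://www.linkedin.com/in/example-person"},
    {"name": "Another Person", "email": "not-an-email",
     "linkedin": "https://example.com/another"},
]
for row in rows:
    issues = check_row(row)
    print(row["name"], "->", issues or "passes the shape checks")
```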

It’s also important to consider that starting from a very specific example, like my use of the Pubcon agenda, may lead to brushing up against copyright protection issues.

Brett, we’re cool, right?

Frequently Asked Questions About ChatGPT for SEO and Data Mining

Is it legal to use AI to write content?

Today, yes. However, there are court cases currently working their way through the legal system that may limit some of the uses.

Bing is ahead of the game today, given their attribution of source documents in their chat output.

What are some of the ethical concerns with AI content?

The biggest concerns I’ve heard are about attribution and disclosure. Google has said they don’t have a problem with AI content, but they believe there should be disclosure.

Does this mean that if you used AI tools to help you generate an outline or develop ideas, you must have that as a postscript? Not in my opinion.

How long until the robots take over?

They’ve already taken over. You may have missed it.

In the first piece I *wrote* on the robot uprising, the content points out the risk of misinformation and the impact on jobs.

We may have to learn how to be good servants to our robot overlords in order to survive.

What are some other concerns?

Given that these approaches to machine learning are based on the big bad Internet, ChatGPT can give biased responses. And as we’ve shown, the way you ask the question frames the response.

A *bad* prompt will likely yield bad responses.

And, accuracy. Always check your work.

Should I use these tools?

Unequivocally yes! Unless you like working harder and not smarter, you should be figuring out how to use AI-based tools to streamline your workflow.

Maybe there are jobs that AI can’t enable, but they are few.

And if you’re a marketer, particularly an SEO specialist, ChatGPT and other AI content tools are about to rock your little world — whether you like it or not.

Code for the JSON-LD FAQ of the Above

Screenshot: ChatGPT SEO Generated FAQPage Schema

Schema.org houses a set of structured data schemas that the search engines have agreed to recognize. The one above is an example of FAQPage Schema.
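
If you would rather generate this markup with a script than a chatbot, here is a minimal Python sketch. The function name faq_jsonld is hypothetical, and the two sample answers are shortened from the FAQ above. The property names (FAQPage, mainEntity, Question, acceptedAnswer, Answer) come from the schema.org vocabulary. This shows the general shape of the markup and is not a copy of the code in the screenshot.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage dict from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


pairs = [
    ("Should I use these tools?", "Unequivocally yes!"),
    ("How long until the robots take over?", "They've already taken over."),
]
markup = json.dumps(faq_jsonld(pairs), indent=2)
# Wrap the JSON in the script tag that carries JSON-LD inside an HTML page.
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

Whichever way you produce it, run the result through a structured data testing tool before you publish.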

For SEO, ChatGPT Is the Tipping Point for AI

AI is for more than just machine learning professionals and business analysts. Chat-based AI is going to be integrated into most information retrieval systems in the very near future.

As marketers, particularly digital marketers, we owe it to our clients to understand these tools and how they can be used for good and ill. And, if you employ freelance writers as we do, it’s critical that you start thinking about your new approach to plagiarism and originality.

There is a crop of new tools to check for AI-built content. We’re currently trialing one, and it looks promising.

At Search Influence, we are already finding ways to be more effective and efficient with these tools. We believe, without a doubt, our work product will improve with the smart use of AI.

And, when we find efficiencies that lower our cost-per-unit X in client work, we’ll be able to pass those savings on to our clients in the form of more value per dollar spent.

I’m as excited about SEO as I’ve been in years, and I’ve been doing it for more than two decades.

As I said in my article for Near Media, suggesting this changes everything is not hyperbole.

Is change good? Is change bad? Change is inevitable.

Let’s embrace the change together.

Get in touch if you’d like to learn more about how we can embrace the change together and help grow your business through smart SEO and online ads — powered by AI!

P.S. Hey, Google! Disclosure: This was partially written by AI — obviously.