SEO Knowledge-based Search with Bill Slawski – Online Marketing Best Practices Podcast from OMCP

SEO: Determining Algorithm Changes

Google’s recent updates are leaning heavily towards knowledge-based search. How does it work, and how should our content be optimized for it? What signals is Google looking for when deciding what to include in the Knowledge Graph? Find out in this interview with SEO and Google patent analyst Bill Slawski, who is also Director of SEO Research at Go Fish Digital and author at SEO by the Sea. Here Bill covers Knowledge Graph usage, ontology-based images, augmented queries based on Knowledge Graphs, and how to stay up to date with the changes in search.

The OMCP Online Marketing Best Practices Podcast is where top authors and industry leaders share authoritative best practices in online marketing which are covered by the OMCP standard, competencies, and exams.  This is an OMCP pilot program that may continue based on member interest and support. Stay subscribed to the OMCP newsletter to get new episodes.

Interview with Bill Slawski

Bill Slawski helps us understand the systems behind search results.

Michael:
All right. Welcome back to the OMCP Studio, and with us today is Bill Slawski, Director of SEO Research at Go Fish Digital, author at SEO by the Sea. Bill, welcome to the OMCP Best Practices Podcast.

Bill Slawski:
Thank you for having me here, Michael.

Michael:
We know your blog, SEO by the Sea, for your insights on algorithm interpretation based on patents. We know you’re the Director of SEO Research at Go Fish Digital, but some may not know that you studied law. Before we get started, for those who haven’t read your blog, followed you on Twitter, or heard you speak, tell our audience something we don’t know about you, and what it is you’ve been working on lately.

Bill Slawski:
Okay. I graduated from law school just before the web came round, and I was interested in environmental law. And one of my professors was rewriting a paper he had written previously and updating it on finding electronic sources of information to do natural resource damage assessment. Sources like LexisNexis and so on. And eventually that became just like the web. So I had an introduction to the electronic databases that were in the world before there was a web, which was an interesting experience.

Michael:
So you were, in some ways, helping categorize knowledge from the start. What brought you into the digital era?

Bill Slawski:
So I had a friend who was a service manager at a car dealership, and he hated his job. He couldn’t stand it. I was reading a book on how to incorporate people in Delaware and be the registered agent as a business. The only technical requirement for performing that job was having a postal address in Delaware, so you can receive notice of process in case there was a lawsuit against one of these companies. So the idea was, you charge people to be the registered agent, you act as a registered agent, and register with the Department of Corporations for the state. I suggested it to him. He said, “Well, that sounds like a good idea, except I don’t have a website,” and I said, “Well, let me see what I can do about that,” and I learned HTML in two weeks and spent two weeks building a website.

Michael:
And since then, you’ve helped countless businesses set up strategy for SEO. You’re considered an authority on how search engines work. So it just makes sense that we’re going to cover how marketers can understand how engines handle knowledge base data, queries, and results. So Bill, I think you might agree that a primary competency, and we use this in OMCP, is that SEOs need to track the changes in how engines present information from knowledge bases. What kind of things should we be looking at?

How do engines present information from a knowledge base?

Bill Slawski:
A knowledge base is a source of knowledge like Wikipedia or the Internet Movie Database. It’s focused topical information on a particular subject. A lot of the concepts within those knowledge bases are connected in one way or another. That connection, the way they’re connected is how you come up with the concept of a Knowledge Graph, where facts are connected to each other, entities within the Knowledge Graph have relationships with other entities and with facts.
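
To make the idea of connected facts concrete, here is a minimal Python sketch of a knowledge graph stored as subject, relationship, object triples. The relationship names (`directed_by`, `stars`) and the `connections` helper are hypothetical, chosen for illustration; this is not how any search engine stores its graph.

```python
# Toy triples in (subject, relationship, object) form.
# The relationship names are made up for this example.
TRIPLES = [
    ("Star Wars", "directed_by", "George Lucas"),
    ("Star Wars", "stars", "Harrison Ford"),
    ("Raiders of the Lost Ark", "stars", "Harrison Ford"),
    ("Raiders of the Lost Ark", "directed_by", "Steven Spielberg"),
]


def connections(entity):
    """Return every (relationship, other entity) pair that touches the entity."""
    found = []
    for subject, relationship, obj in TRIPLES:
        if subject == entity:
            found.append((relationship, obj))
        elif obj == entity:
            # The entity is on the receiving end, so mark the edge as inverse.
            found.append((f"inverse_{relationship}", subject))
    return found


for relationship, other in connections("Harrison Ford"):
    print(relationship, "->", other)
```

Looking up one entity surfaces the other entities it is connected to, which is the step that separates a graph from a flat list of facts.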

What is a knowledge graph?

A Knowledge Graph is a step beyond a knowledge base. And Knowledge Graphs are really popular these days. There’s lots of activity from not just Google, but Amazon and Microsoft. Somebody from Google came out with a paper on personal Knowledge Graphs recently, where he talked about individuals who have their own Knowledge Graphs that cater to their particular interests. So if you have an electronic bicycle, it might have information on electronic bicycles and how to repair them. So you have a personal search, which is based upon your historic search data. So a personal Knowledge Graph might be based upon your personal search history.

Michael:
And I could even present it or maintain it myself?

Bill Slawski:
The paper didn’t really go into that much detail about how it might be managed by an individual, but that potentially could be something that could be done.

Michael:
So a way to think about how a search engine might be viewing the searcher?

Bill Slawski:
Right. And I also brought it up because when we talk about the Google Knowledge Graph, we think of it as one, but it’s possible Google uses lots of Knowledge Graphs.

Michael:
And would one way that they build that be based on how we ask questions?

Bill Slawski:
It might be based upon the approach that they’re using to answer questions.

So there are some approaches that they use where they look at the Knowledge Graph, and they try to match up the words in your query to known entities and facts about those entities, and they’ll answer, “The capital of Poland is Warsaw,” because that’s something that the Knowledge Graph got from a colon-delimited table in Wikipedia, and it knows the answer. But they might try an approach where question answering is done by creating a Knowledge Graph specifically for that question. So they’ll take your query, and they’ll do a search using Google, and they’ll find maybe the top 10% of results that are appropriate for the meaning of your query. And they’ll build a Knowledge Graph out of those, and then they’ll find the answer. So they could be building lots of Knowledge Graphs on a regular basis, which they could then incorporate into the big Knowledge Graph.
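
The first approach, matching query words to a known entity and a known fact about it, can be sketched in a few lines of Python. The `FACTS` table and the `answer` function are hypothetical, and the word matching is deliberately naive; it only illustrates the lookup idea.

```python
# Toy fact table keyed by (entity, attribute). Hypothetical data structure.
FACTS = {
    ("poland", "capital"): "Warsaw",
    ("france", "capital"): "Paris",
    ("poland", "currency"): "zloty",
}


def answer(query):
    """Answer a query if it names both a known entity and a known attribute."""
    words = query.lower().replace("?", "").split()
    for (entity, attribute), value in FACTS.items():
        if entity in words and attribute in words:
            return f"The {attribute} of {entity.title()} is {value}."
    return None  # No matching fact, so fall back to ordinary results.


print(answer("What is the capital of Poland?"))
```

A query that mentions no known entity and attribute pair returns `None`, which stands in for the case where the engine has to find the answer some other way, such as building a graph from the top results.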

Michael:
And how can we track those changes and what’s new with Google and what they’re doing?

How do we track the changes?

Bill Slawski:
We can’t necessarily track the Knowledge Graph itself. We can see some things. At one point in time, Google was using a knowledge base that they acquired from Metaweb, called Freebase, to act as its Knowledge Graph. Before Freebase, Google had something that they were referring to as the annotation framework, which had a browsable fact repository, which was their Knowledge Graph. There was no way to track the browsable fact repository. Sometimes we’d hear statistics about Freebase and how many facts it contained, how many entities it contained. Often it was in the billions. They gave up on retaining Freebase because it was a human-edited project, and there were people in Germany building something called Wikidata, which was very much like Freebase, except there was a lot of enthusiasm for it, and it was working really well. It was growing quickly.

It’s possible they may be working towards trying to go past that by maybe automating the process of building Knowledge Graphs and knowledge bases. The process I talked about, where they’re doing question answering using individual knowledge bases or individual Knowledge Graphs, which they could then incorporate into a larger Knowledge Graph, would be one way.

It’s possible they could find a news source that has formatting that’s easy for them to extract entities from, along with relationships between entities and properties and attributes. DeepMind was doing that with the Daily Mail and CNN because the format made it really easy for them to take that information and put it into a Knowledge Graph. So at some point, it’s a matter of Googlebot reading the web, grabbing information, deriving facts from it, and putting those into a Knowledge Graph.

Michael:
How do images play into this?

How do images play into SEO via the knowledge graph?

Bill Slawski:
There have been a couple of approaches historically. About seven or eight years ago, Google started using what are referred to as machine IDs, identifiers from the Freebase project, for entities that there are images of. So, if you did an image search for a known entity, like a band, like the Beatles, you would say, “The Beatles.” Google would say, that’s a machine ID, a string of characters and letters; let’s find all the results that match that. So it wouldn’t necessarily have to look through all images; it would look for images that are tagged, that are labeled with that machine ID.
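
As a rough illustration of searching by identifier rather than by text, the Python sketch below resolves a name to an entity ID and then returns the images labeled with that ID. The IDs, file names, and the `image_search` function are all hypothetical placeholders, not real Freebase identifiers or a real API.

```python
# Hypothetical entity IDs; real machine IDs look different.
NAME_TO_ID = {
    "the beatles": "/m/example_beatles",
    "the rolling stones": "/m/example_stones",
}

# Each image carries the set of entity IDs it has been labeled with.
IMAGE_LABELS = {
    "abbey_road.jpg": {"/m/example_beatles"},
    "stones_live.jpg": {"/m/example_stones"},
    "festival_poster.jpg": {"/m/example_beatles", "/m/example_stones"},
}


def image_search(query):
    """Resolve the query to an entity ID, then return images labeled with it."""
    entity_id = NAME_TO_ID.get(query.strip().lower())
    if entity_id is None:
        return []  # Not a known entity in this toy index.
    return sorted(
        name for name, ids in IMAGE_LABELS.items() if entity_id in ids
    )


print(image_search("The Beatles"))
```

Because the match is on the identifier, an image can be found even when the text around it never spells out the band’s name.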

And those machine IDs are now used in places like Google Lens. So if you take a picture of a band and you do a search in Google Lens using that picture of that band, Google can recognize who that is by doing object recognition, and say the entity ID for that is such and such, let’s see if there are any webpages for it. It finds event schema that says that the band is touring in certain places, and it’ll tell you what the tour dates are for that band and what the costs of tickets are. It’s smart about entities.

Michael:
You and I were just touching on ontology based image categories. Tell us a little bit about that.

About Ontology-Based Image Categories

Bill Slawski:
So if you do a search, if you’re in the United States, if you’re a fan of history, this was kind of fascinating. You search for a president, like John F. Kennedy, and you can see events that happened during Kennedy’s life, and image categories because the categories are all ontology based. They’re all about other entities or places or times that are associated with the query term that you used. So if you search for Harry S. Truman, you’ll see stuff about World War II. If you search for John F. Kennedy, you’ll see things from Dallas and the Grassy Knoll. If you search for Donald Trump, you’ll see caricatures from Time Magazine.

Michael:
And this is built on top of what was kind of the Freebase ID?

Bill Slawski:
The ontologies aren’t necessarily based on the Freebase ID. They are based upon, possibly, a knowledge of related entities, but they tend to be based upon things like query logs, so it might be associated with certain individuals or places. So if you search for Carlsbad, California, where I’m at, you’ll see a category that’s for Legoland, which is an amusement center a couple of miles from here. You’ll see images of the beach, because I’m not too far from the beach, thank God.

Michael:
Bill, let’s talk a little bit about augmented queries based on Knowledge Graphs.

Augmented Queries Based on Knowledge Graphs

Bill Slawski:
So the concept of augmentation is that Google may find ways to merge the organic web search results that you’ve seen with something else. So there are augmented paid search results, where Google may show you geographic location extensions with your organic search results. So if your organic search results have entities in them, brand names, for instance, Google might show advertisements from those same brands that are augmented in some ways. They offer the advertiser the chance to expand upon those by including things like images, things that are for sale, and locations of the nearest stores.

In addition to the paid search results, Google may augment search results with knowledge base type information. If there’s an entity in your query, Google might say, let’s show the Knowledge Graph results, or let’s show a knowledge panel result for that entity. Let’s show “People also ask” questions for that entity, structured snippets. So you’re going to get more than just the 10 blue links like you used to have before there was universal search. And the universal search patent was updated to go beyond just showing news and images and videos and web results to all kinds of results, which include the Knowledge Graph results. So if you do a query for Amazon, you’ll see a bunch of related entities showing, you’ll see “People also ask” questions, a Knowledge Graph for Amazon, and a bunch of knowledge-related items within those results.

Michael:
Which even more underscores a need for us to present structured data. Now, Bill, you and I, we were chatting at Pubcon a few weeks ago. We were discussing relevance of content. I think it’s known, you and I both agree that SEOs have to pursue genuine relevance, that genuinely relevant content wins in the long run.

Bill Slawski:
Right.

Michael:
It’s also important for us to know how to signal that relevance to the engines, and specifically, because we’ve been talking about knowledge based data, what are the engines looking for in the context of knowledge based data?

What are the search engines looking for?

Bill Slawski:
So when you include things like schema on a page, or you have a fact-based table instead of a layout-based table, one whose headers are labeled with things like names of cities or parts for electronic equipment (TVs are really big in eCommerce, and they’re often laid out in tables on eCommerce websites), that’s structured data. The JSON-LD or the metadata is considered structured data too, as well as the table data. Of those augmentation queries I was talking about, there is a type that looks for structured data to possibly include with the results for regular queries. So if you search for 8K TVs, you might see a bunch of pictures of 8K TVs. You might see structured data from a website, from tables, whatever, and structured snippets that maybe show a snippet for a page.

And then facts from the tables on that page underneath it. There’s a schema for plumbers, and 1D [embedded] additions to schema markup; one of the attributes that they added recently was “known for”. So, a “known for” is something that expresses some type of expertise. So if you’re a plumber who knows about drain repair, you could have, in your schema, that you’re a plumber and that you know about drain repair.

And say you’re in Los Angeles: someone searches for Los Angeles plumber, and Google says, we’re going to augment these results, we’re going to include some of the stuff from the schema for the page, like this fact that this plumber knows about drain repair. So they’ll show a result where they let you know that the plumber knows about drain repair, which, if that’s what a person searching for a plumber wants to see, may bring somebody to you.
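
The plumber example can be sketched as JSON-LD generated from Python. The schema.org vocabulary has a `Plumber` type and a `knowsAbout` property that fits the description of expressing expertise; treating `knowsAbout` as the attribute Bill has in mind is an assumption. The business name and address are made up, and the `to_json_ld_script` helper is hypothetical.

```python
import json

# Made-up business details, for illustration only.
plumber = {
    "@context": "https://schema.org",
    "@type": "Plumber",
    "name": "Example Drain and Pipe Co.",
    "address": {
        "@type": "PostalAddress",
        "addressLocality": "Los Angeles",
        "addressRegion": "CA",
    },
    # Assumption: knowsAbout is the property that expresses expertise here.
    "knowsAbout": ["Drain repair"],
}


def to_json_ld_script(data):
    """Wrap a dict in a JSON-LD script tag that can be pasted into a page."""
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(to_json_ld_script(plumber))
```

Whether a search engine actually surfaces this property in a result is up to the engine; the markup only makes the fact available in a machine-readable form.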

Michael:
And they might’ve done a prior search, right? And the context of the Knowledge Graph saying that they were looking for drain repair at first, and then a subsequent search looking for plumber in a particular region. Do you believe that that could be combined to give an enhanced result?

Bill Slawski:
I think that’s very possible. Yeah.

Michael:
Now, beyond Google, we know that Amazon, Microsoft are making some headway in these areas as well. Is there anything we should be looking at there?

What are other engines doing with knowledge-based search?

Bill Slawski:
So, Microsoft has a concept-based graph, and I like looking at these things. When I do a site audit, I’ll look at the Knowledge Graph on Google, and I’ll look at the one on Bing; it’s not known as a Knowledge Graph on Bing, but they have them. And I’ll see what it contains and how it might be different from the one on Google. Because sometimes what you get in a Knowledge Graph isn’t necessarily what you want. I had a client who was a car dealership, and their Knowledge Graph showed their repair shop, which they didn’t want. They wanted to sell more cars, right? So they didn’t want the Knowledge Graph showing the repair shop. So make sure the Knowledge Graph shows what you want to show. And now it’s just a matter of contacting Google through the feedback form and letting them know that they chose the wrong thing to show.

Michael:
They give us a chance to say how we want that to be. Also, I know that Google provides tools to test our structured data. Where else do you suggest that we go to make sure that we’re presenting properly?

How To Test our Structured Data

Bill Slawski:
So in addition to the structured data testing tool and the rich results tool that Google offers, they’re including things in Google Search Console, which lets you look at all the pages on your site at a glance. So if you have maybe a mistake with one page, and you want to catch it, that may be a good way to catch it, because it’s all accumulated there.

Michael:
Any other tools to check our site or to spot errors?

Bill Slawski:
The tools from a place like Google give you a sense of what Google might be looking for. Google might interpret the schema on your webpage in a certain way, and a different source might interpret it differently. One of the guys who worked on JSON-LD for the W3C created a linter, which is a way of validating JSON. I’ve used it in the past, but I don’t use it when I’m looking at schema for Google, because I’m not quite sure that the way it validates schema markup is the same as the way Google validates schema markup. And I want to check what they validate, with the structured data testing tool or the rich results tool.
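
For a sense of what a basic syntax check covers, and what it leaves out, here is a minimal Python sketch. The `check_json_ld` function is hypothetical; it only confirms that the text parses as JSON and that a top-level object carries `@context` and `@type`. That is a simplification, since valid JSON-LD can also be an array or use `@graph`, and it says nothing about how Google or any other consumer interprets the markup.

```python
import json


def check_json_ld(text):
    """Return a list of problems found in a JSON-LD string (empty if none)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err.msg} at line {err.lineno}"]
    if not isinstance(data, dict):
        return ["top level is not a JSON object"]
    problems = []
    for key in ("@context", "@type"):
        if key not in data:
            problems.append(f"missing {key}")
    return problems


good = '{"@context": "https://schema.org", "@type": "Plumber"}'
bad = '{"@type": "Plumber",}'  # Trailing comma is not valid JSON.

print(check_json_ld(good))
print(check_json_ld(bad))
```

Passing a check like this is a starting point only, which is in line with Bill’s preference for testing against the consumer’s own tools.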

Michael:
Bill, we know that these things are changing over time. If you were to advise an SEO on staying up to date with these developments and how the engines are handling these, where would you suggest they look?

How to Stay Up To Date

Bill Slawski:
The first place might involve reading an additional 20 or 30 emails a month, which is the W3C schema mailing list. It discusses new updates to schema and new releases. Updated schema is now coming out about once a month, so it’s a quickly developing part of SEO. In addition to new schema, new attributes, and schema covering certain things, there is the ability for people to provide extensions for schema. So GS1 is an international organization that works with lots and lots of manufacturers and distributors, and they’re the ones who developed commercial barcodes. They’ve developed some schema markup for eCommerce, and they’ve got a GS1 wizard to help you write that schema.

And that’s one of the extensions for schema. There was another schema that came out last year that was from the financial industry. It’s FIBO, F-I-B-O: Financial Industry Business Ontology. And its schema is for banks and for businesses that service loans.

Michael:
All right, Bill, these have been great practices. I really appreciate it. Many of the concepts that we’ve covered will be on the exam, specifically in the practices of staying up to date with these [changes] and providing relevant signals. Any final guidelines that apply to how engines handle knowledge-based data?

Bill Slawski:
So, this isn’t new material. This is something that has been part of Google since the earliest days. When Larry Page came out with the PageRank algorithm in 1998, Sergey Brin filed a patent, which was possibly the second patent from Google on an algorithm, which is referred to as DIPRE, D-I-P-R-E, which has to do with patterns and relationships, where he identified five books, their publishers, their authors, the lengths of the books. And the algorithm was supposed to find those books on websites, and if it could find all five of them on a website, it was supposed to collect factual information about all the other books on the same website. And it was a way of crawling the web for facts and for relationships. So I mention relationships because we were talking about relevance, and relationships between entities and properties are sort of the new relevance when it comes to Knowledge Graphs and knowledge bases.
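
The seed-and-expand idea behind DIPRE can be sketched in Python. This is a heavily simplified, single-round illustration that assumes each fact sits on one line and uses only title and author pairs; the function names `learn_patterns` and `extract` and the sample markup are hypothetical, and the published algorithm repeats the cycle and filters patterns more carefully.

```python
def learn_patterns(seeds, lines):
    """Record the text before, between, and after each seed pair on a line."""
    patterns = set()
    for title, author in seeds:
        for line in lines:
            t = line.find(title)
            a = line.find(author)
            # Require the title first, then some text, then the author.
            if t != -1 and a > t + len(title):
                prefix = line[:t]
                middle = line[t + len(title):a]
                suffix = line[a + len(author):]
                patterns.add((prefix, middle, suffix))
    return patterns


def extract(patterns, lines):
    """Apply learned patterns to every line to pull out new pairs."""
    found = set()
    for prefix, middle, suffix in patterns:
        for line in lines:
            if not (line.startswith(prefix) and line.endswith(suffix)):
                continue
            core = line[len(prefix):len(line) - len(suffix)]
            title, sep, author = core.partition(middle)
            if sep and title and author:
                found.add((title, author))
    return found


lines = [
    "<li><b>Foundation</b> by Isaac Asimov</li>",
    "<li><b>Dune</b> by Frank Herbert</li>",
    "<li><b>Emma</b> by Jane Austen</li>",
    "<p>Contact us for store hours.</p>",
]
seeds = {("Foundation", "Isaac Asimov")}

patterns = learn_patterns(seeds, lines)
print(sorted(extract(patterns, lines)))
```

One known book is enough here to learn how the page formats its list, and that formatting then yields the other books. Feeding the new pairs back in as seeds would give the iterative expansion.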

Google will look for information to try to gauge a confidence level between an entity and a fact about an entity. Mickey Mantle was a baseball player: 87% chance of that being true, according to everything it’s crawled on the web. It looks at freshness of sites, it looks at reliability, it looks at popularity, and it learns from sites which things are more likely true about entities than not. So when you’re talking about things like medical sites that talk about science and whether or not there’s a scientific consensus for a concept, Google tries to get an idea of what information is relevant, is up to date, is more likely than not to be true, based upon creating what they refer to as association scores for that information. It might be saying, “Okay, we have lots of medical information from the National Institutes of Health and from PubMed about certain topics, like treatments for different diseases, and they’re likely more true than not. So if we have other sites that are referring us to different things, we’re going to have less confidence in those.”
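
One way to picture a confidence level for a fact is a reliability-weighted vote across sources. The Python sketch below is an assumption made for illustration: the weights, the sample observations, and the `confidence` function are hypothetical, and nothing here reflects how Google actually computes association scores.

```python
def confidence(claim, observations):
    """Share of total source weight that supports the claim.

    observations is a list of (source_weight, claimed_value) pairs, where
    the weight is a made-up stand-in for reliability, freshness, and so on.
    """
    total = sum(weight for weight, _ in observations)
    if total == 0:
        return 0.0
    support = sum(weight for weight, value in observations if value == claim)
    return support / total


# Hypothetical observations about one entity's occupation.
observations = [
    (9, "baseball player"),   # highly trusted source
    (8, "baseball player"),   # another trusted source
    (3, "football player"),   # low-trust source that disagrees
]

score = confidence("baseball player", observations)
print(f"{score:.2f}")
```

A disagreeing source with low weight pulls the score down only slightly, which mirrors the point that sites contradicting well-trusted sources earn less confidence.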

Michael:
That is the time that we have today. A big thank you to Bill Slawski. Check out Bill’s posts at seobythesea.com. Also, I know that Bill posts on gofishdigital.com; you can search for him there and see his posts. They’re quite illuminating. Be sure to see Bill speaking at Pubcon and other conferences. Bill, I think you mentioned that you’re going to be speaking in Italy next. Where else can people engage with you?

Bill Slawski:
The only real place I have left to speak this year is in Italy, it’s my last conference.

Michael:
Which conference is that?

Bill Slawski:
It’s SMXL. Milan.

Michael:
All right. In Milan. All right, a great place to go. I hope you enjoy your trip out there.

Well thanks very much Bill. Listeners, what Bill shared today is definitely aligned with the OMCP standards of digital marketing competencies, and some of this will show up on the exam. I’m your host Michael Stebbins, and you’ve been listening to the OMCP Online Marketing Best Practices Podcast. OMCP maintains the certification standards for the online marketing industry in cooperation with industry leaders, just like Bill. Join us inside of OMCP to maintain your certification, get some good offers, and engage with other certified professionals, universities and training programs that teach to OMCP standards.
