Structured Data: SEO Mythbusting – What Google Wants

Structured data formats are rules that standardize the structure and content of a webpage.

Google is asking all of us to surface structured data to its crawlers by marking up our HTML. It supports three formats: JSON-LD, Microdata and RDFa.

Google’s John Mueller has made it clear that Google prefers JSON-LD structured data.

Wow, unless you are super technical this is all mumbo jumbo. “Schema markup” and “structured data”? WTH…

It sounds and looks complicated, but it is something anyone can learn to do.

So what does this mean and why should you care?

Basically, Google wants this, and if you want your site to rank somewhere inside the first 10 pages, then you had better do what Google wants.

After all, adding this markup can make your pages eligible for rich results in search, which can give your SEO a real boost.

Most people simply put human-readable data on their site. This looks great, but it makes it harder for Google to interpret.

This markup makes it easier for Google to know what the page is about without guessing.

Check out this example, direct from Google.

Say, for example, you have a recipe page.

The markup you could use is:

———————————

<html>
  <head>
    <title>Party Coffee Cake</title>
    <script type="application/ld+json">
    {
      "@context": "https://schema.org/",
      "@type": "Recipe",
      "name": "Party Coffee Cake",
      "author": {
        "@type": "Person",
        "name": "Mary Stone"
      },
      "datePublished": "2018-03-10",
      "description": "This coffee cake is awesome and perfect for parties.",
      "prepTime": "PT20M"
    }
    </script>
  </head>
  <body>
  <h2>Party coffee cake recipe</h2>
  <p>
    This coffee cake is awesome and perfect for parties.
  </p>
  </body>
</html>

—————————-
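If your pages come out of a script or a template, you may find it easier to build this block in code than to type it by hand. Here is a minimal Python sketch of that idea. The function name `recipe_json_ld` and its parameters are made up for this example (they are not part of any Google tool), and it only fills in the same handful of fields shown in the markup above.

```python
import json


def recipe_json_ld(name, author, date_published, description, prep_time):
    """Return a <script> tag holding Recipe markup as JSON-LD.

    The function name and parameters are illustrative only.
    """
    data = {
        "@context": "https://schema.org/",
        "@type": "Recipe",
        "name": name,
        "author": {"@type": "Person", "name": author},
        "datePublished": date_published,
        "description": description,
        "prepTime": prep_time,  # ISO 8601 duration, e.g. PT20M = 20 minutes
    }
    body = json.dumps(data, indent=2)
    # Escape "</" so text inside the JSON cannot close the script tag early.
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


tag = recipe_json_ld(
    "Party Coffee Cake",
    "Mary Stone",
    "2018-03-10",
    "This coffee cake is awesome and perfect for parties.",
    "PT20M",
)
print(tag)
```

Letting `json.dumps` do the quoting means a stray quote mark in a recipe description is less likely to break the markup than it would be if you pasted the text in by hand.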

 

This looks complicated, so to help you out Google has created the Structured Data Markup Helper, which lets webmasters add schema markup to their sites more easily.

 

 

SEO changes rapidly. What ranked a site quickly one day may not work the next, especially if blackhat methods were involved.

Google recognizes this, so it has put together a channel to help webmasters find out what Google wants.

One of the series Google has created for that channel is called SEO Mythbusting.

Below is a video from this series.

Under the video is the transcript, so you can follow along if needed.
 
 
In this bonus material from the filming of last week’s episode (Googlebot: SEO Mythbusting), Martin Splitt (Webmaster Trends Analyst, Google) and his guest Suz Hinton (Cloud Developer Advocate, Microsoft) dive into the topic of “new microformats”: structured data!
 
Documentation mentioned in this episode:
Intro to structured data → https://goo.gle/structured-data-intro
Overview of supported structured data in Google Search → https://goo.gle/search-gallery
Structured data testing tool → https://goo.gle/2K9rTo5
Rich results test → https://goo.gle/30SEWA3
Rich result status reports → https://goo.gle/rich-results-report
 
 
 

[MUSIC PLAYING]

 

SUZ HINTON: There
is one term that I’m

going to mention
to you just based

on this is the reason why
I had to submit the URL

to be re-indexed.

And that’s microformats.

MARTIN SPLITT: Oh!

All right.

SUZ HINTON: So can
we talk about–

are they still a thing?

I haven’t really had to do
a lot of SEO optimization

for a while.

And I knew microformats
was such a huge thing

because let’s say you’ve
got a product page

and it has reviews
on it and you want

to show the little stars and all
of that kind of rich content.

And every time I made a
tweak and we deployed,

I would have to then submit
to get re-crawled and see

if the results got more written.

And that was definitely a
very slow feedback cycle.

MARTIN SPLITT: Yes.

SUZ HINTON: So
what is the state?

Is microformat still a thing?

And are there better resources
out there right now for us

to be able to pull
that rich content out?

MARTIN SPLITT: You’re
going to be very happy.

And we have much better things.

SUZ HINTON: Yay!

MARTIN SPLITT: They
are still a thing.

But they are now
called structured data.

SUZ HINTON: Structured data.

MARTIN SPLITT: And we
are using JSON-LD, so

JSON for Linked Data.

SUZ HINTON: Yeah, this
is all new terms to me.

MARTIN SPLITT: Right.

And you probably used
literally the microdata

attributes in HTML.

SUZ HINTON: Yes.

Yep, exactly.

Yeah, we were using them.

And they were very hit and miss.

MARTIN SPLITT: Yes.

SUZ HINTON: It was very easy
to just mess up one tiny thing.

And the validator
didn’t catch it.

And then the stars
would disappear.

And we’d be like [GASP].

MARTIN SPLITT: And we
have moved on from there.

SUZ HINTON: OK, that’s good.

MARTIN SPLITT: So there is now–
schema.org is an open source

organization where people can
submit or discuss or change

or do stuff with
the semantic data

that they want to
put on the web.

SUZ HINTON: Got it.

MARTIN SPLITT: And people
that’s participating–

there is much more
semantic data out there

than we are supporting
in search results.

But a bunch of it is supported
in the search results.

So for instance, if you
have an event that we want

to have showing up
with the location

and if you can get
tickets and who

is the performer and
all that kind of stuff–

if you have a recipe
where you might

have an image or the
instructions on how to make it

or the time it takes
to make it and reviews,

how nice this recipe might
be, articles, books, and TV

series, all sorts of things,
we have documentation on that

specifically as well.

If you go to
developers.google.com/search,

you find all the
supported types.

And they show up nicely
in the search results.

So you get a little
preview picture.

And then you get the stars
and all that kind of stuff.

SUZ HINTON: Oh, this
would have been amazing.

MARTIN SPLITT: It’s fantastic.

And it’s JSON.

SUZ HINTON: Which
is so much easier.

MARTIN SPLITT: It’s the
script tags with JSON in it.

It’s so much easier.

SUZ HINTON: It’s just not
little meta attribute things?

MARTIN SPLITT: Correct, yes.

So you have your JSON block.

And we have what’s called the
Structured Data Testing Tool.

That is a little dated by now.

But it supports– generally,
basically everything

that we know of shows up
as either valid or invalid.

And then we have the
Rich Results Test,

because the Structured Data
Test, while being very generic,

is also not very specific
to what you want to achieve.

You want to probably achieve
the nice little stars showing up

in the search results.

This is what we
call rich results.

And there’s the Rich
Results Test for it.

And that even
gives you a preview

of how that might look
like in the search results.

There’s no guarantee
that it does

look like that in
the search results

because people have been
using it to spam stuff, like–

SUZ HINTON: Yeah, true.

MARTIN SPLITT: I have
a bazillion reviews.

And then we’re
like, yeah, you just

have some JavaScript
generating fake reviews.

That’s not really–

SUZ HINTON: Well, how do
you actually use the tool?

Because I remember you used to
have to dump your entire HTML

file in there.

MARTIN SPLITT: You
[? don’t. ?] [INAUDIBLE]

SUZ HINTON: And if you
did it too many times,

you got timed out.

MARTIN SPLITT: Right.

SUZ HINTON: Yeah.

MARTIN SPLITT: But that
doesn’t happen anymore.

SUZ HINTON: Oh, OK.

That’s pretty exciting.

MARTIN SPLITT: So
you have two options.

You can dump a URL
in it, which is nice.

And you can even use ngrok or
something if you have a local–

SUZ HINTON: Oh, you
could do localhost?

MARTIN SPLITT: Yes.

SUZ HINTON: Oh,
this is very fancy.

MARTIN SPLITT: Or
you can even also

still do like you dump
your HTML in there.

We execute the JavaScript.

So if you’re using JavaScript
within that code dump,

that’s fine.

SUZ HINTON: Oh, wonderful.

MARTIN SPLITT: If
you’re running it– yes.

And you can basically
live debug as you type.

You press a button and
it goes like, nope.

And you’re like, oh, damn it.

And you get the feedback here.

And it’s like, missing
performer for your event.

And I’m like, OK, sorry, sorry.

And you write it in.

And then it reruns it.

And you’re like, OK, cool.

This is what I want.

And I can take it
back to [INAUDIBLE]

SUZ HINTON: That is awesome.

MARTIN SPLITT: And
yeah, we have that tool.

We have Search
Console that gives you

a live view of what
happens on your page,

also for structured data.

Yeah, microdata is not
that much of a thing.

But the structured data
is still going strong.

SUZ HINTON: Well, it sounds
like it’s come a long way.

That’s very exciting.

MARTIN SPLITT: It does.

SUZ HINTON: If I’m ever
working for a large retailer

ever again, then I
feel like I got this.

MARTIN SPLITT: If you have a
blog, add the article markup.

You might get [INAUDIBLE]

SUZ HINTON: Oh, so OK.

I’m going to look at
the schema for that.

That would be like
author and stuff.

MARTIN SPLITT: And other sources
might pull the data as well,

right?

It’s an open source format.

So theoretically,
voice assistants

could use it as well.

So just imagine if
you have a recipe blog

and then you stand
in the kitchen,

go like, hey, assistant
thing– whatever

it is, whatever company
you’re choosing.

There’s a variety of
options these days, right?

And then the thing goes like
yeah, Martin’s apple pie.

First step– take some
apples and peel them.

And you’re like oh,
OK, fair enough.

That can come from the
structured data as well.

So that’s pretty cool.

SUZ HINTON: That is really cool.

I didn’t even think
of those use cases.

I just always thought
about search results.

[MUSIC PLAYING]
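
In the video, Martin describes the Rich Results Test coming back with messages like “missing performer for your event”. To get a feel for what a check like that involves, here is a rough Python sketch that pulls the JSON-LD out of a page and looks for missing fields. It is not how Google's tool works. The `EXPECTED` checklist is a made-up assumption for this example, so check Google's documentation for the properties each rich result type really needs.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_json_ld = False


# Hypothetical checklist for this example only.
EXPECTED = {"Event": ["name", "startDate", "location", "performer"]}


def missing_fields(html):
    """Return a list of messages, one per expected field that is absent."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    problems = []
    for block in parser.blocks:
        # This sketch only handles a single object with a plain string type.
        if not isinstance(block, dict) or not isinstance(block.get("@type"), str):
            continue
        kind = block["@type"]
        for field in EXPECTED.get(kind, []):
            if field not in block:
                problems.append(f"missing {field} for your {kind}")
    return problems


page = """
<html><head><script type="application/ld+json">
{"@context": "https://schema.org/", "@type": "Event",
 "name": "Coffee Cake Party", "startDate": "2019-07-01",
 "location": {"@type": "Place", "name": "Town Hall"}}
</script></head><body></body></html>
"""
print(missing_fields(page))
```

A quick local check like this does not replace the Rich Results Test, but it can catch an obviously missing field before you deploy.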

 

 

Googlebot: SEO Mythbusting – What Google Wants

Google’s main crawler is called Googlebot.

Googlebot retrieves the content of webpages (the words, code and resources that make up the webpage).

It then sends the information to Google.

Google uses this information in its Google search engine to determine what sites to display and to whom.

  • There are more than 3.5 billion Google searches every day
  • 76% of all global searches take place on Google
  • Google Search Index contains more than 100,000,000 GB
  • More than 60% of Google searches come from mobile devices
  • 16-20% of all annual Google search results are new
Google has put together a channel called SEO Mythbusting. This helps webmasters find out What Google Wants.
 

Martin Splitt (Webmaster Trends Analyst, Google) and his guest Suz Hinton (Cloud Developer Advocate, Microsoft) discuss the many intricacies of Googlebot such as:

What is – and what is not – Googlebot (crawling, indexing, ranking) (1:02)
Does Googlebot behave like a web browser? (3:33)
How often does Googlebot crawl, how much does it crawl, and how much can a server bear? (4:03)
Crawlers & JavaScript-based websites (9:04)
How do you tell that it’s Googlebot visiting your site? (11:12)
The difference between mobile-first indexing and mobile friendliness (12:28)
Quality indicators for ranking (13:35)

Below are the subtitles for the video.

 

 

 

 

SUZ HINTON: A lot of
confusion revolves around SEO

because no one understands how
the Googlebot actually works.

[MUSIC PLAYING]

 

MARTIN SPLITT: Hello and
welcome to another episode

of “SEO Mythbusting.”

With me today is Suz
Hinton from Microsoft.

Suz, what do you do at work,
and what is your experience

with front end SEO?

SUZ HINTON: Yeah,
so right now, I’m

doing less front end these days.

I focus more on IoT.

MARTIN SPLITT: So in the
time you were a front end

developer–

SUZ HINTON: Yeah, I was a front
end developer for, I think,

12 or 13 years.

And so I got to work on lots of
different contexts of front end

development, different web
sites, things like that.

MARTIN SPLITT: Cool.

SUZ HINTON: Today,
I wanted to just

address a bunch of stuff
about Googlebot specifically,

and nerd out about
Googlebot, because that

was the side of things that
I was the most confused about

at the time.

MARTIN SPLITT: So Googlebot
is basically a program

that we run that
does three things.

The first thing is it
crawls, then it indexes,

and then last, but
not least, there’s

another thing that is not
really Googlebot anymore.

That is the ranking bit.

So we have to basically grab
the content from the internet,

and then we have to figure out
what is this content about?

What is the stuff that
we can put out to users

looking for these things?

And then last, but
not least, is which

of the many things that
we picked for the index

is the best thing for
this particular query

in this particular time?

SUZ HINTON: Got it, yeah.

MARTIN SPLITT: But the
ranking bit, the last bit,

where we move things around–
that is informed by Googlebot,

but it’s not part of Googlebot.

SUZ HINTON: Is that
because there’s

this bit in the
middle, the indexing?

The Googlebot is
responsible for the indexing

and making sure that content is
useful for the ranking engine

to–

MARTIN SPLITT:
Absolutely, absolutely.

You can imagine, someone
has to– in the library,

someone has to figure out
what the books are about

and get the index of the bits
in a catalog, the catalog

being our index, really.

And then someone else
is using that index

to make informed
decisions and going, here,

this book is what
you’re looking for.

SUZ HINTON: I’m
really glad you used

that analogy because I worked
in a library for four years.

MARTIN SPLITT: So you know much
better than I how that works.

SUZ HINTON: And I
was that person.

People would be like, I
want Italian cookbooks,

and I’m like, well,
it’s 641.5495.

And you would just
give it to them.

MARTIN SPLITT: If I would
come to you, as a librarian,

and ask a very
specific question,

like so what is the best book on
making apple pies really quick,

would you be able to figure
out, from the index–

you probably have
lots of cookbooks.

SUZ HINTON: We did, yeah.

We had a lot.

But given that I also put lots
of books back on the shelf,

I knew which ones were popular.

I’ve no idea if we can link
this back to Googlebot.

MARTIN SPLITT: That does.

Yeah, it’s pretty much– so you
have the index that probably

doesn’t really change that much,
unless you add new books to it.

SUZ HINTON: New editions.

MARTIN SPLITT: Exactly, yeah.

So you have this index, which
Googlebot provides you with.

But then we have the second–

the librarian second
part that basically is,

based on how the interactions
with the index work,

figure out which
books to recommend

to someone asking for it.

So that’s pretty much
the exact same thing.

Someone figures out what
goes into the catalog,

and then someone uses it.

SUZ HINTON: I love this.

This makes total sense to me.

MARTIN SPLITT: But I guess
that’s still not necessarily

all the answers you need.

SUZ HINTON: Yeah, I just want to
know, what does it actually do?

How often does it crawl sites?

What does it do
when it gets there?

What does it– how is it
generally behaving like?

Does it behave
like a web browser?

MARTIN SPLITT: That’s
a really good question.

Generally speaking, it behaves
a little bit like a browser–

at least, part of it does.

So the very first
step, the crawling bit,

is pretty much a browser
coming to your page,

either because we
found a link somewhere,

or you submitted a
sitemap, or there’s

something else that basically
fed that into our systems.

You can use Search Console
to give us a hint and ask

for re-indexing, and that
triggers a crawl before–

SUZ HINTON: I’ve
done that before.

MARTIN SPLITT: Oh, very good.

SUZ HINTON: We asked
for it to be done.

MARTIN SPLITT: And
that is perfectly fine,

but the problem then,
obviously, is how often do you

crawl things, and how
much do you have to crawl,

and how much can
the server bear.

If you’re on the
backend side, you

know that you have
a bunch of load,

and that might not be
always the same thing.

If it’s like a Black
Friday, then the load

is probably higher
than on any other day.

So what Googlebot does is
it tries to figure out,

from what we have in
the index already,

is that something
that looks like we

need to check it more often?

Does that probably change?

Is it like a newspaper
or something?

SUZ HINTON: Got it, yeah.

MARTIN SPLITT: Or
is that something

like a retail site that
does have offerings that

change every couple of weeks?

Or even do not change at
all because this is actually

the site of a museum
that changes very rarely?

For the exhibitions maybe,
but a few bits and pieces

don’t change that much.

So we try to like segregate
our index data into something

that we call daily or
fresh, and that gets

crawled relatively frequently.

And then it becomes less and
less frequent as we discover,

and if it’s something that is
super spammy or super broken,

we might not crawl it as often.

Or if you specifically
tell us, do not index this,

do not put this
in the index, this

is something that I
don’t want to show up

in the search results,
and we don’t come back

every day and check.

So you might want to
use the re-index feature

if that changes.

You might have a page that you
go, no, this shouldn’t be here,

and then once it
has to be there,

you want to make sure that we
are coming back and indexing

again.

So that’s the browser bit.

That’s the crawler part, but
then a whole slew of stuff

happens in between
that happening,

us fetching the content
from your server,

and the index having
the data that is then

being served and ranked.

So the first thing is
we have to make sure

that we discover if you have any
other resources on your page.

The crawling cycle
is very important.

So what we do is, the moment
we have some HTML from you,

we check if we have
any links in there,

or images for that
matter, or video

something that we
want to crawl as well,

and that feeds right back
into the crawling mechanism.

Now, if you have a
gigantic retail site,

let’s say, just
hypothetically speaking,

we can’t just crawl
all the pages at once,

both for our
resource constraints,

but also we don’t want to
overwhelm your service.

So we basically
try to figure out

how much strain we can
put on your service

and how much resources
we’ve got available as well,

and that’s called the
crawl budget, oftentimes.

But it’s pretty tricky to
determine, so one thing

that we do is we
crawl a little bit,

and then basically ramp it up.

And when we start
seeing errors, we

ramp it down a little bit more.

So oops, sorry, for that,
we are not– oh, ugh.

So whenever your service
serves us 500 errors,

there are certain tools
in Search Console that

allow you to say, hey, can you
maybe chill out a little bit.

But generally, we don’t try
to get all of it at once

and then ramp down.

We are trying to carefully ramp
up, ramp down again, ramp up

again, ramp down again, so
it fluctuates a little bit.
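
Martin does not say how the ramping up and down is calculated. Purely as an illustration, here is a Python sketch of one common way to do this kind of throttling: add a little to the rate while things go well, and cut it back sharply when errors show up. This is a swapped-in textbook technique, not Google's actual algorithm, and every number in it is made up.

```python
def next_crawl_rate(current, saw_errors, step=1.0, backoff=0.5,
                    floor=1.0, ceiling=20.0):
    """Return the request rate to use for the next round of crawling.

    All numbers here are made up for the example.
    """
    if saw_errors:
        # The server is struggling (e.g. it returned 500s): slow down.
        return max(floor, current * backoff)
    # The server coped: speed up a little.
    return min(ceiling, current + step)


rate = 1.0
history = []
# Pretend rounds 4 and 7 came back with server errors.
for round_number in range(1, 9):
    saw_errors = round_number in (4, 7)
    rate = next_crawl_rate(rate, saw_errors)
    history.append(rate)
print(history)
```

The printed rates climb, drop after each bad round, and climb again, which is the "it fluctuates a little bit" pattern described above.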

SUZ HINTON: There’s a
lot more detail in there

than I was even expecting.

I didn’t even know that–

I guess I never considered
that a Googlebot crawling

event could put strain
on somebody’s website.

That sounds like it’s a
lot more common than I even

thought it would be.

MARTIN SPLITT: It does
happen, especially

if we discover, say,
a page that has lots

of links to subpages.

Then all of these go
into the crawling queue,

and then you might–

let’s say you have 30
different categories of stuff,

and each of these have a few
thousand products and then

a few thousand
pages of products.

So we might go, oh, cool, crawl,
crawl, crawl, crawl, crawl,

crawl, crawl, and then we
might crawl a few hundred

thousand pages.

And if we don’t spread
that out a little bit–

so it’s a weird balance.

On one hand, if you
add a new product,

you want that to be surfaced
and searched as quickly

as possible.

On the other hand,
you don’t want

us to take all the bandwidth
that your server offers.

I mean, cloud computing makes
that a little less scary,

I guess, but I
remember the days–

I’m not sure if you
remember the days where

you had to call someone,
and they ask you

to send a form or fax a form.

And then two weeks later, you
get the confirmation letter

that your server
has been started.

SUZ HINTON: Yes, I
remember the days

when we would have to call,
and then we would basically

pay $200 to have a
human go down the aisles

and push the physical reset
button on the server, so yeah.

MARTIN SPLITT: Those times
were a lot trickier, yeah.

And then imagine you basically
renting five servers somewhere

in a data center, and
that taking a week,

and then we come and scoop
up all your bandwidth.

And you’re like, great,
we’re offline today

because Google
has its crawl day.

That’s not what we want to have.

SUZ HINTON: Yeah,
these days, it’s

more like a happy news kind
of moment, when you get hit.

MARTIN SPLITT: Exactly.

SUZ HINTON: So I
feel like you’re

much more considerate than–

MARTIN SPLITT: Yeah, we try
to not overwhelm anyone,

and we respect the robots.txt.

So that works within
the crawl step as well.

And once we have the
content, we can’t

put strain on your
infrastructure

anymore, so that’s fantastic.

But modern web apps being
mostly JavaScript driven,

we then put that in
a queue, and then

once we have the
resources to render it,

we actually use another
headless browser kind of thing.

We call that the Web
Rendering Service.

Then there’s other
crawlers as well

that might not have the capacity
or the need to run JavaScript.

This is like social
media bots, for instance.

They come and look for metadata.

If that meta tag is
coming in with JavaScript,

you usually have a bad time,
and they’re just like, sorry.
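
Martin mentions that Googlebot respects robots.txt. If you want to see how a crawler would read your rules, Python's standard library includes a robots.txt parser. The sketch below uses a made-up robots.txt for an imaginary shop at example.com, so swap in your own rules and URLs. It shows how Python's parser reads the file, which may differ in edge cases from how Google reads it.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt for an imaginary shop.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Disallow: /internal-search/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for path in ("/recipes/coffee-cake", "/internal-search/cake", "/checkout/"):
    allowed = parser.can_fetch("Googlebot", "https://example.com" + path)
    print(path, "allowed" if allowed else "blocked")
```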

SUZ HINTON: Yeah, so that’s
always been a big mess,

and I remember when single
page applications, or SPAs,

really came into vogue.

A lot of people were
really concerned.

There’s a lot of FUD around.

Well, if crawlers in general
don’t execute JavaScript,

then they’re going
to see a blank page,

and how do you get around that?

So contextually,
within Googlebot,

it sounds like Googlebot
executes JavaScript–

MARTIN SPLITT: They do.

SUZ HINTON: Even if it does
do it at a later point.

MARTIN SPLITT: Yes, correct.

SUZ HINTON: So that’s good?

MARTIN SPLITT: That’s good.

SUZ HINTON: But
is there anything

that people need to be
aware of beyond just,

oh, well, it’ll just
run it, and then

it’ll see exactly the same
thing as a human with a phone

or a desktop would see?

MARTIN SPLITT: There’s
a bunch of things

that you need to be aware of.

So the most important thing
is, again, as you said,

it’s deferred.

It happens at a later point.

So if you want us to crawl your
stuff as quickly as possible,

that also means we have to
wait to find these links

that JavaScript injects.

Basically, we crawl, we have
to wait until JavaScript

is executed, then we
get the rendered HTML,

and then we find the links.

So the nice little
short loop that

finds these links relatively
quickly right after crawling

will not work.

So we will only see the
links after we render it,

and this rendering can take
a while because the web is

surprisingly big.

SUZ HINTON: Yeah,
just a little bit.

MARTIN SPLITT: There’s 130
trillion docs in 2016, so–

SUZ HINTON: So
there’s way more now.

MARTIN SPLITT:
There’s way more now.

There’s way more than that.

SUZ HINTON: So
robots.txt is very

effective at being able to tell
bots how to do a certain thing.

But in this scenario,
how do you tell

that it’s Googlebot visiting
your site as opposed

to other things?

MARTIN SPLITT: So
as we are basically

using a browser in two
steps– one is the crawling,

and one is the
actual rendering–

both of these moments, we do
give you the user agent header.

But basically,
there’s the string–

literally the string
Googlebot in it.

SUZ HINTON: That’s
so straightforward.

MARTIN SPLITT: Yes,
and you can actually

use that to help with your
SPA performance as well.

So as you can detect
on the server side,

oh, this is Googlebot
user agent requesting,

you might consider sending
us a prerendered static HTML

version, and you can do the
same thing for the others.

All the other search engines
and social media bots

have a specific string
saying that they are a robot.

So you can then basically
go, oh, in that case,

I’m not giving you the real
deal, the single page app.

I’m giving you this HTML
that we prerendered for you.

It’s called dynamic rendering.

We have docs on that as well.
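
The dynamic rendering idea Martin describes can be sketched in a few lines of Python. This is only an outline under some assumptions: the file names `prerendered.html` and `app.html` are hypothetical, and apart from "Googlebot" the bot markers in the list are examples I picked, not something named in the video. In a real site this decision would sit inside your web server or framework.

```python
# Substrings that mark a request as coming from a crawler. "Googlebot" is the
# one mentioned in the video; the others are assumptions for this example.
BOT_MARKERS = ("googlebot", "bingbot", "twitterbot")


def is_known_bot(user_agent):
    """Return True if the user agent string contains a known bot marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def choose_response(user_agent):
    """Pick which version of the page to send (hypothetical file names)."""
    if is_known_bot(user_agent):
        return "prerendered.html"  # static HTML built ahead of time
    return "app.html"  # the normal single page app shell


googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
browser_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/115.0"
print(choose_response(googlebot_ua))
print(choose_response(browser_ua))
```

Keep in mind that anyone can send a user agent string containing "Googlebot", so a check like this is fine for choosing which HTML to serve but should not be treated as proof of who is visiting.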

SUZ HINTON: The one thing
that still doesn’t quite

make sense to me is
does the Googlebot

have different contexts?

Does it sometimes
pretend that it’s–

I think of it as this
little mythical creature

that’s pretending to
do certain things.

So does it pretend to be on
a mobile, and then desktop?

Are there different, I
guess, user agents,

even though it still
says Googlebot?

And can you differentiate
between them?

MARTIN SPLITT: You’re asking
great questions, because yes,

we have different user agents.

So I’m not sure if you heard
about mobile first indexing

being rolled out and happening.

SUZ HINTON: I’ve heard
that it’s going to affect

how you’re ranked potentially.

MARTIN SPLITT: That as well.

SUZ HINTON: I don’t know if
that’s a rumor or not, yeah.

MARTIN SPLITT: Ah, that’s
two different things

that get conflated so often.

So mobile first indexing
is about us discovering

your content using a mobile user
agent and a mobile viewport.

So we are using
mobile user agents,

and the user agent
string says so.

It says something about
Android in the name,

and then you’re like, aha, so
this is the mobile Googlebot.

We have documentation on that.

There’s literally a
Help Center article

that lists all these things.

So we try to index
mobile content

to make sure that
we have something

nice to serve for
people who are on mobile,

but we’re not pretending
random user agents or anything.

We stick to the
user agent strings

that we have documented
as well, and that’s

mobile first
indexing, where we try

to get your mobile content
into the index rather

than the desktop content.

Then there’s mobile readiness,
or mobile friendliness.

If your page is
mobile friendly, it

makes sure that everything
is within viewport,

and you have large enough
tap targets and all

these lovely things, and that
just is a quality indicator.

We call these signals.

We have over 200 of them.
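
Going back to the mobile Googlebot for a moment: Martin says its user agent string has “something about Android” in it. Here is a small Python sketch of telling the two apart on that basis. The function name is made up, and the two sample strings are only illustrations of the general shape, so check the Help Center article he mentions for the current list.

```python
def googlebot_kind(user_agent):
    """Return 'mobile', 'desktop', or None if this is not Googlebot."""
    if "Googlebot" not in user_agent:
        return None
    return "mobile" if "Android" in user_agent else "desktop"


# Sample strings for illustration; real ones may differ in the details.
MOBILE_UA = (
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 "
    "(compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)
DESKTOP_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

print(googlebot_kind(MOBILE_UA))
print(googlebot_kind(DESKTOP_UA))
```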

SUZ HINTON: That’s a lot.

MARTIN SPLITT: Right?

So Googlebot collects
all these signals

and then stuff them, as
metadata, into the index.

And then when we rank, we’re
like, so this user’s on mobile,

so maybe this thing that has a
really good mobile friendliness

signal attached to it might
be a better one than the thing

where they have to pinch
zoom all the way out

to be able to read anything,
and then can’t actually

deal with the different
links because they’re

too close to each other.

So that’s one of the many–

it’s not the signal.

It’s one of the many signals.

It’s one of the over 200
signals to deal with.

SUZ HINTON: I had no
idea there were 200.

That’s making me–

I know that you’re not
allowed to share what they all

are because there has to be
a certain mystique around it,

because of, I guess, a lot
of SEO abuse in the past.

MARTIN SPLITT: Yeah,
yeah, unfortunately, that

is a game that is
still being played,

and people are doing weird
stuff to try to game us.

And the interesting thing with
this is, with the 200 signals,

it’s really hard
to say which one

gets you moving in the ranks.

SUZ HINTON: The weights
of each signal because–

MARTIN SPLITT: And they keep
moving, and they keep changing.

I love when people are like, no,
let’s do this, and then, look,

my rank changes.

Yeah, for this
one query, but you

lost on all the other queries
because you did really

weird and funky stuff for that.

So just build good
content for the users,

and then you’ll be fine.

SUZ HINTON: I feel like that–

it feels like less
effort as well,

than constantly trying to–

MARTIN SPLITT: Yeah, but
it’s not an easy answer.

You pay me to make you more
successful on search engines,

and I come to you and say,
so who are your users,

and what do they need,
and how could you

express that so that they
know that it’s what they need?

That’s a hard one because
that means I basically

bring the ball back
to you, and now, you

have to think about stuff and
figure it out, strategically.

Whereas if I’m like,
I’m just going to get

you links or do some
funky tricks here,

and then you’ll be
ranking number one.

That’s an easier answer.

It’s the wrong answer, but
it’s the easier answer.

So people are like, links are
the most important metric ever,

and I’m like, no.

We have over 200,
and it’s important,

but it’s not that important.

And chill out, everybody.

But this still happens.

SUZ HINTON: I’m so
glad it’s better now.

I feel, actually, more at peace
in general with SEO, as well,

after speaking to you today.

MARTIN SPLITT: Ah, so good.

Suz, thank you so
much for being with me

here, and has been
a great pleasure.

SUZ HINTON: Yeah,
thanks for answering

all of my weird and wonderful
questions about the Googlebot.

MARTIN SPLITT:
Perfect questions.

Perfect opportunity.

Did we bust some myths?

SUZ HINTON: I feel like we did.

MARTIN SPLITT: Fantastic.

I think that’s
worth a high five.

SUZ HINTON: Awesome.

Thanks.

MARTIN SPLITT: Thanks.

Join us again for the next
episode of “SEO Mythbusting,”

where Jamie Alberico
and I will discuss

if JavaScript and SEO can be
friends and how to get there.

 

Page Speed: SEO Mythbusting – What Google Wants

It’s all about speed – no one wants to wait for a page to load – and Google has been saying for ages that they want a super fast internet.

Basically Page Speed can be simply stated as “the amount of time that it takes for a webpage to load.”

Sounds simple, but getting a page loading fast takes a lot of work. 

This includes having a fast host/server, optimizing page sizes and images, and utilizing a great CDN.

Google has put together a channel called SEO Mythbusting. This helps webmasters find out what Google Wants.

Below is a video direct from Google about page speed. Below that is a transcript so you can dig in even further.

 

In the third episode of SEO Mythbusting season 2, Martin Splitt (Developer Advocate, Google) and Eric Enge (General Manager of Digital, Perficient) discuss the most common SEO questions and myths around page speed.

 

MARTIN SPLITT: What
do you think are

misconceptions about page
speeds and especially page

speed and ranking?

ERIC ENGE: Well, a
lot of people think

that it’s a big ranking factor.

In fact, I was literally
looking at a document

that a company had produced.

This document actually
talked about SEO,

and it had a section
on SEO which is good.

At least they’re
thinking about it.

But the first thing they
listed was page speed.

And they were actually quite
insistent in the write up

that it was the most
important ranking factor.

MARTIN SPLITT: Oh, no.

ERIC ENGE: And I was like, OK.

I’ve got to find the
right way to tell them

that I want them to deal
with this because it’s

really important.

And it clearly impacts user
engagement and conversion.

No, it doesn’t mean
you’re going to move up

three spots in the results.

MARTIN SPLITT: Right.

Yeah.

[MUSIC PLAYING]

 

MARTIN SPLITT: Hello and
welcome to another episode

of SEO Mythbusting.

With me today is Eric Enge.

And would you like to
introduce yourself?

Because you’re
doing so much stuff.

What is it that you’re doing?

ERIC ENGE: Well, you
know, I’m General Manager

of part of the digital marketing
team at Perficient Digital.

And altogether, we do SEO,
content creation, content

marketing, pay per click,
analytics, conversion rate

optimization–

MARTIN SPLITT: Trainings,
Twitter, conference speaking.

ERIC ENGE: Yeah.

That’s a fair amount of
stuff to keep us busy.

MARTIN SPLITT: A fair amount
of stuff to keep us busy.

But today we’re going to get
busy talking about page speed.

ERIC ENGE: It’s a great topic.

Because so many
people get it wrong.

MARTIN SPLITT: Oh, yeah.

It’s quite a deep topic as well.

ERIC ENGE: Yes.

MARTIN SPLITT: So
what kind of questions

do you have around page speed
as a ranking factor,

and page speed in general?

ERIC ENGE: So let’s
actually start in general

and just talk about why
page speed is important.

How’s that sound?

MARTIN SPLITT: Sounds fantastic.

I think if you
look at what you’re

trying to accomplish
as you’re trying

to accomplish that,
you’re building

a good website for your users.

Right?

ERIC ENGE: Right.

MARTIN SPLITT: So
now, how many times

have you been on the metro
or in the car or somewhere

in the countryside where
you didn’t have fantastic

reception on your mobile phone?

And you were basically just
like really quickly trying

to find something out and it
just took ages for the content

to actually show up.

That’s painful, isn’t it?

ERIC ENGE: It is painful.

ERIC ENGE: And in
fact, on some sites that

can happen when you are
in a place where you’ve

a perfectly strong signal.

MARTIN SPLITT: That’s
actually true. Yeah.

Yeah.

ERIC ENGE: And that’s not–
that’s so frustrating.

MARTIN SPLITT: Right.

And you don’t want to
frustrate your users.

ERIC ENGE: Right.

MARTIN SPLITT: And
we as a search engine

do not want to have
users frustrated

when they see content.

So for us it makes sense
to consider fast web

sites a little more helpful to
the users than very slow web

sites.

Right?

ERIC ENGE: It does make sense.

And I guess my thought
process in this

has always been
that well, yes, it’s

likely that you’re using it at
some level as a ranking factor.

But you can’t make it such
a strong ranking factor

that you won’t show the
most relevant content.

MARTIN SPLITT: Oh, yeah.

Absolutely.

If you have bad content,
if you are the fastest

website out there but
the content is not great,

then that’s not helping you.

ERIC ENGE: Right.

Right.

I mean, to get the content
you don’t want quickly

is probably not what
the user’s looking for.

MARTIN SPLITT: Exactly.

Like, I have a blank website.

It’s the fastest website ever.

What’s the point?

ERIC ENGE: Yeah.

Well, yes.

Exactly.

But it does make
sense to consider it

at least at some level.

And there’s actually a
fun pair of statistics I

think they’re both from Google.

One is that something
like 53% of sessions

are abandoned if it takes longer
than three seconds for the page

to load.

And then the companion
statistic is,

and I think it’s
a little bit old

but still, the average page
takes 15.3 seconds to load.

What a frightening combination.

MARTIN SPLITT: It
is frightening.

It’s frightening.

And it’s so many
different factors.

Right?

Sometimes it’s slow servers.

But sometimes it’s just like the
server responds really quickly

but then there’s a
ton of JavaScript

that has to be processed first.

And JavaScript is a
very expensive resource

because it has to be fully
downloaded and then parsed

and then executed.

But, yeah.

So we keep seeing this.

And everyone knows this.

Anecdotal evidence
is there as well.

You have studies.

You have the anecdotal evidence
of you sitting in front

of a website going like, ugh.

And just imagine being
on a metered connection

where you actually pay
by megabyte when you fly

or something.

It’s like you can
buy 20 megabytes

for 10 euros or something.

And you’re like, oh, OK.

Open one website.

You said, what
was it 15 megabyte

is the average or something?

ERIC ENGE: Well 15.3 seconds
is what I was talking about.

MARTIN SPLITT: Oh,
sorry, 15.3 seconds.

So you can just
imagine how much data

you were pulling in
these 15 seconds.

ERIC ENGE: Yeah.

In fact, I did see–

I really was looking
at this just yesterday.

There is this data
from Think with Google

where by market sector it shows
the average web page size.

And they’re all in the
megabytes in every market.

And I think your recommendation
is 500 k-bytes or less,

if I’m not mistaken.

MARTIN SPLITT: Yeah,
the fewer, the better.

The fewer, the better, really.

And just think about it.

Like I grew up with entire
video games on like two or three

floppy disks which each fit
like a megabyte and a half

or something.

So why are we doing
this on the web now?

ERIC ENGE: Hm.

What a great idea.

Well, maybe we should help
people speed their sites up.

What do you think?

MARTIN SPLITT: That’s the thing.

And that’s why
ranking these by speed

is also an important factor.

But as you say, like
content still is king.

Like there’s no
question about that.

ERIC ENGE: Right.

Absolutely.

MARTIN SPLITT: How do you think
people are thinking about page

speed as a ranking factor?

Like what are they
trying to do when they

are trying to optimize for it?

ERIC ENGE: Well, in terms
of what they try to do,

I think there’s a few
things that people

are really good at thinking
about related to page speed.

So I think almost everybody
recognizes that images

are a potential issue.

And certainly,
pre-sizing the image

rather than making the
browser do it, for example,

and things like that.

And so they get to that
first level of optimization.

But I think there’s
other things that they

find a lot more
difficult. So for example,

the idea of not loading
the content below the fold

until the content above
the fold is present,

of course, that’s a little
harder to implement.

MARTIN SPLITT: We have native
lazy loading for images now.

So that’s something, at least.

ERIC ENGE: Yes.

It is something.

And then I think
another thing that they

have trouble with is–

and you actually mentioned
it a moment ago–

the idea that the way you’re
hosted and the way your CDN

is set up can be big factors
if those aren’t actually

set up properly.

First of all, they
might not have the CDN.

But they may have
it, and it may not

be properly configured as well.

MARTIN SPLITT: Configured
with caching and stuff.

We’ve seen all of this.

ERIC ENGE: Yeah, exactly.

And then it could be as
simple as, I need more memory

in my web server.

Or a dedicated server, when I’m
on a shared server connection.

MARTIN SPLITT: All of
that sounds pretty solid.

But is there any
misconceptions or myths

that are going
around where like,

what’s happening here, where is
this coming from, is that true?

ERIC ENGE: So I do
think, and maybe I

could state the myth almost
as an inverse, is they

are too focused on just a
few surface level factors.

And they don’t realize there are
other layers to this problem.

MARTIN SPLITT: There’s
layers to this, yes.

ERIC ENGE: Although
there’s another thing

I can suggest actually
as a myth, if you will.

Which is if I go into and
get my Lighthouse tools

report on a page, and I see
it says, oh, this will cut six

seconds out of the load time.

And then they do that
thing and the page

didn’t speed up by six seconds.

And I don’t think people realize
that some of these things

are threaded.

MARTIN SPLITT: Oh, yeah.

ERIC ENGE: So yes, I
did something good.

But I have four other problems
that also need to be fixed.

MARTIN SPLITT: Yes.

ERIC ENGE: So I do see a lot
of people getting tripped up

on that.

MARTIN SPLITT: That’s
an interesting one.

Yeah, and Lighthouse is a
tricky one to begin with.

Because people are getting
confused by the idea

that what they are
seeing in Lighthouse

is what users are seeing.

And that’s not the case.

Because you are literally
testing from your machine,

from your browser, from
your internet connection,

and not necessarily what real
people are experiencing when

they’re on their mobile phones,
on their spotty connection

out there.

So I think it’s
important to remember

Lighthouse is lab data.

And it makes predictions
on what you can improve.

But that doesn’t necessarily
mean that, oh, now you’re

all doing fine.

Do you also think that people
are paying too much attention

to the scores itself?

Because I hear that quite a lot.

So like the myth
is like, oh, we’re

using the Lighthouse
score for ranking.

That’s not happening.

That’s not what we’re doing.

ERIC ENGE: Right.

No.

Exactly.

In fact, they get too
attached to that score.

And sometimes it
misleads them to thinking

that they are doing just
fine when they actually still

have problems.

MARTIN SPLITT: Yeah.

ERIC ENGE: And another area that
I see people running into is it

works fine from my
phone, but the user

doesn’t have such a nice phone.

So you have to remember that
there’s different devices.

MARTIN SPLITT: And you could
see that in Google Analytics.

You can actually figure
out what kind of devices

you were seeing on your site.

And then you can
specifically try

to understand–
best way would be

to buy one of the phones that
is most prevalent on your site.

ERIC ENGE: Yes.

MARTIN SPLITT: And
[? I can ?] have a look.

ERIC ENGE: A very
interesting idea.

And I actually shared a slide
in one of my presentations

recently which
showed data actually

for CNN.com processing.

And it was around three seconds
for the high speed phone.

But by the time you get
to a user with a less

than $100 phone, it
was 15 seconds to load.

And you just really
need to remember

that the users have all
different [INAUDIBLE] devices.

And you probably
want to do a good job

by the great majority of them.

MARTIN SPLITT: Absolutely.

And you want to be aware that a
slow phone on a slow connection

is like the worst situation
you can probably run

into in this kind of situation.

And you can use things
like WebPageTest

to get a better feeling
for how that would feel.

Like you can test from different
locations and different network

connections.

I would definitely
recommend doing that.

There’s so much more
that you can do.

And also, if you
have a website that

is listed in Chrome
User Experience Report

or CrUX, then
definitely use that as well.

And I think not many
people are trying that out.

ERIC ENGE: Right.

Well, it’s good to
get real world data.

MARTIN SPLITT: Real world data,
real user metrics, absolutely.

ERIC ENGE: Yeah.

Absolutely.

In fact, you could broaden
that piece of advice

well beyond the page speed
conversation, by the way.

Like it relates to all
manner of aspects and things

around mobile, for
example, because we

have everybody who
designs for a desktop

and then has to slam that down
into a mobile phone format.

Maybe designed for the
mobile and then it’s

kind of easy to figure out how
to run [INAUDIBLE] desktop.

MARTIN SPLITT: Exactly.

You have more
[INAUDIBLE] so yeah.

ERIC ENGE: Yeah.

Exactly.

But for the page speed
conversation, absolutely.

You just have to do that.

MARTIN SPLITT: Definitely.

Right?

And yeah, I mean, it’s
such an important thing.

And people– do you remember
the entire controversy on people

like AMP is a ranking factor?

ERIC ENGE: Oh my.

Yes.

MARTIN SPLITT: It’s not.

And then people
are like, but, it–

and page speed plays
into that as well.

Right?

AMP gives you a
certain expectation

that you can have for your
sites in search results.

And I’ve seen good fast
web sites rank higher

than the AMP equivalent.

So like, maybe it is not the
most important ranking factor.

But it’s definitely
an important one

as in like page speed is an
important ranking factor.

AMP, not so much.

AMP is just like
this little badge

that gives the
user an expectation

that they can have about it.

But page speed does
matter for your users.

And it does matter for your
conversions, as you said.

Sometimes it’s configuring
your CDN– getting a CDN,

configuring your
CDN, making sure

that caching is done
right, and making sure

that you architect your
websites and web apps in the way

that they are fast by default.

If you can do that without
AMP, then that’s fantastic.

AMP is a fantastic
tool kit to help you

do that if you don’t know how.

ERIC ENGE: Yeah.

And you could go with
Progressive Web Apps

as well, by the way, which
are very nice because

of their ability
to preload content

into the cache on your phone.

So by the time the
user requests the page,

it’s [? continuous ?]
[INAUDIBLE].

MARTIN SPLITT: Yeah.

That’s true.

ERIC ENGE: And it’s
another way to skin a cat.

No.

I’m not supposed to say that.

Because that’s really
uncomfortable for cats.

MARTIN SPLITT: It’s really
uncomfortable for cats.

ERIC ENGE: So take it
the way I meant it.

MARTIN SPLITT: I get it.

I get it.

So anything else around page
speed where you’re like,

what’s happening there?

Any questions you
have on page speed?

ERIC ENGE: I mean,
really, I guess

it’s reasonable to
presume that there’s not

any prospect of Google
dialing up that ranking notch.

It’s basically, you’re kind
of set with what you’ve done.

I mean I know that algorithms
change all the time.

MARTIN SPLITT: Algorithms
change all the time.

ERIC ENGE: But just from
the logic perspective,

the issue that we talked about
already between the relevance

of the content being–

well, content being king.

It’s still going to be king.

MARTIN SPLITT: Absolutely.

Absolutely.

ERIC ENGE: Have to
deliver the right result.

MARTIN SPLITT: You want
the relevant content first.

ERIC ENGE: If you had
five right results

and maybe it nudges
something up.

MARTIN SPLITT: Like if you have
two results that are basically

doing fine content
wise, we would probably

give the one that is faster
more prominence in the search

results.

And also, I think it’s important
to understand that we’re not

doing it by score or Lighthouse
or something like that.

It is more we’re
bucketing pages into

like this is a
problematically slow one.

This is an OK one.

And this is a fast one.

You see that in the speed report
as well, in the Search Console.

So I think people
need to just like

figure out if they have
really slow pages,

how to make them faster.

And probably if they’re
in the middle bit,

you also want to
go to the fast bit.

But it doesn’t
matter if you have

a Lighthouse score of 90 or 95.

That doesn’t really
make a difference.

All right, Eric.

Thank you so much for being
here and talking all things page

speed with me.

That was amazing.

And I hope that everyone liked
it and leave comments and likes

with us.

And thank you very much.

ERIC ENGE: Hope
you all enjoyed it.

MARTIN SPLITT: Bye.

Hey, everyone.

So next episode is going to be
with my fantastic guest Rachel

Costello.

And Rachel, what have
you brought for us?

RACHEL COSTELLO:
We’re going to be

talking about canonicalization
and URL de-duplication.

MARTIN SPLITT:
Sounds really cool.

Don’t miss it.

RACHEL COSTELLO: See you then.
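
Two of the image tips from the conversation, serving images at the size they are displayed and using native lazy loading, could look something like this in HTML. This is only a minimal sketch: the file names, dimensions and alt text are made up for illustration, so swap in your own.

```html
<!-- Hypothetical file names and dimensions, for illustration only. -->

<!-- Above the fold: the file is already saved at its display size, and
     width/height let the browser reserve the space before it downloads. -->
<img src="coffee-cake-hero-800.jpg" width="800" height="450"
     alt="Party coffee cake on a serving plate">

<!-- Below the fold: loading="lazy" asks the browser to hold off on the
     download until the user scrolls close to the image. -->
<img src="coffee-cake-step-1-800.jpg" width="800" height="450"
     loading="lazy" alt="Mixing the coffee cake batter">
```

Leave the lazy loading attribute off anything that shows at the top of the page, since that is the content you want to appear first.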