HUGE Google Search document leak reveals inner workings of ranking algorithm

A trove of leaked Google documents has given us an unprecedented look inside Google Search and revealed some of the most important elements Google uses to rank content.

What happened. Thousands of leaked internal documents, which appear to come from Google’s internal Content API Warehouse, were shared with Rand Fishkin, SparkToro co-founder, earlier this month.

  • Read on to discover what we’ve learned from Fishkin and from Michael King, iPullRank CEO, who also reviewed the documents (and plans to provide further analysis for Search Engine Land soon).

Why we care. This leak gives us a glimpse inside how Google’s ranking algorithm works, which is invaluable for SEOs who can understand what it all means. In 2023, we got an unprecedented look at Yandex Search ranking factors via a leak, which was one of the biggest stories of that year.

This Google document leak? It will likely be one of the biggest stories in the history of SEO and Google Search.

What’s inside. Here’s what we know about the leaked documents from Fishkin and King:

  • Current: The documentation indicates this information is accurate as of March 2024.
  • Ranking features: 2,596 modules are represented in the API documentation with 14,014 attributes.
  • Weighting: The documents did not specify how any of the ranking features are weighted – just that they exist.
  • Twiddlers: These are re-ranking functions that “can adjust the information retrieval score of a document or change the ranking of a document,” according to King.
  • Demotions: Content can be demoted for a variety of reasons, such as:
    • A link doesn’t match the target site.
    • SERP signals indicate user dissatisfaction.
    • Product reviews.
    • Location.
    • Exact match domains.
    • Porn.
  • Change history: Google apparently keeps a copy of every version of every page it has ever indexed, meaning Google can “remember” every change ever made to a page. However, Google only uses the last 20 changes of a URL when analyzing links.
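
To make the twiddler idea above concrete, here is a minimal Python sketch of a re-ranking function that adjusts a document's score and, as a result, its position. Everything in it is hypothetical: the `Doc` type, the `demotion_twiddler` name, the `flagged` field and the 0.5 penalty are illustrative assumptions, not names or values from the leaked documentation, which does not say how any feature is weighted.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Doc:
    url: str
    score: float           # initial retrieval score (hypothetical scale)
    flagged: bool = False  # hypothetical demotion signal


def demotion_twiddler(docs, penalty=0.5):
    """Scale the score of flagged docs, then re-sort by adjusted score."""
    adjusted = [
        replace(d, score=d.score * penalty) if d.flagged else d
        for d in docs
    ]
    return sorted(adjusted, key=lambda d: d.score, reverse=True)


results = [
    Doc("a.example", 0.9, flagged=True),
    Doc("b.example", 0.8),
    Doc("c.example", 0.4),
]
reranked = demotion_twiddler(results)
print([d.url for d in reranked])
```

In this toy example, the flagged document starts with the highest score but drops below an unflagged one after the adjustment, which is the kind of effect King describes.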

Links matter. Shocking, I know. This leak confirms that link diversity and relevance remain key. And PageRank is still very much alive within Google’s ranking features.

Successful clicks matter. This should not be a shocker, but if you want to rank well, elements of the leak clearly indicate that you need to keep creating great content and user experiences. Google uses a variety of measurements, including badClicks, goodClicks, lastLongestClicks and unsquashedClicks.

Brand matters. Fishkin’s big takeaway from the leak is that brand matters more than anything else:

  • “If there was one universal piece of advice I had for marketers seeking to broadly improve their organic search rankings and traffic, it would be: ‘Build a notable, popular, well-recognized brand in your space, outside of Google search.’”

Entities matter. Google stores author information associated with content and tries to determine whether an entity is the author of the document.

SiteAuthority. Google uses something called “siteAuthority”.

Chrome data. A module called ChromeInTotal indicates that Google uses data from its Chrome browser for search ranking.

About the author

Danny Goodwin

Danny Goodwin has been Managing Editor of Search Engine Land & Search Marketing Expo – SMX since 2022. He joined Search Engine Land in 2022 as Senior Editor. In addition to reporting on the latest search marketing news, he manages Search Engine Land’s SME (Subject Matter Expert) program. He also helps program U.S. SMX events.

Goodwin has been editing and writing about the latest developments and trends in search and digital marketing since 2007. He previously was Executive Editor of Search Engine Journal (from 2017 to 2022), managing editor of Momentology (from 2014 to 2016) and editor of Search Engine Watch (from 2007 to 2014). He has spoken at many major search conferences and virtual events, and has been sourced for his expertise by a wide range of publications and podcasts.