There’s a war going on, my fellow SEOs and search marketers. Has been for a couple of years now. The war on organic data.
It was a war that started off very covertly, almost without incident, as noted by Jon Henshaw over 2 years ago on the Raven Tools blog. Google, by way of deprecated APIs, quietly pulled SERP [Search Engine Results Page] ranking data. That led more and more companies to scrape the results from Google, placing an extra burden on Google’s servers. And perhaps Google was banking on the fact, though somewhat quietly kept, that Google Webmaster Tools has had “ranking data” for over 3 years now. Maybe this was Google’s evolutionary step? Nonetheless, it was one of the first assaults on organic data; Google’s conscious and deliberate action to close off a major pipeline to SEOs and webmasters. It registered as nothing more than a blip to most of the community, myself included, but the SEO tool companies probably had a good idea where it was headed. Maybe they decided to just “wait and see”?
2011: Google Kicks in the Door
A year after Google shut off the ranking data APIs, they got brazen. They gathered up the troops and kicked down the door to the SEO house, fingers hugging the triggers. It was akin to enacting Google’s own personal Patriot Act. They black-boxed the organic search data. “NOT PROVIDED”. Searches by users who were signed in or using Google SSL search would appear as “not provided” in your organic search keyword data to help “maintain the privacy”. All under the guise of privacy. At launch, this change was said to affect less than 10% of data in analytics. “Single digits”, was the quote from Matt Cutts.
2012: One Year of “Not Provided”
Danny Sullivan’s excellent write-up “Dark Google” tells the story. Single digits? It’s hard to imagine that Matt could say it and keep a straight face. I don’t know about you, but my sites are consistently between 10% and 15% “not provided”. And, in some extreme cases, they range near 25%. 25% shielded keyword data for a small business is pretty big, and pretty crappy. That’s an awfully large gaping hole to be missing out on. How can they [small businesses] help it that people are signed in or starting off in “https” search? That’s a ton of valuable data that small businesses could be using to help them analyze customer behavior, to help them write better, more targeted content for their customers, and to help them expand and refine their consumer funnels to convert more and stay afloat in an economy that is still moving sideways. I think of it this way: if Google suddenly lost 15% of its collected search data overnight, don’t you think they’d be pretty pissed off? Just, “poof”, and it’s gone.
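If you want to put a number on your own black box, the arithmetic is simple. Here is a minimal sketch in Python, assuming you have exported organic keyword visits into a dictionary. The `(not provided)` label, the function name, and the sample numbers are all my own assumptions for illustration, not anything from an official API.

```python
def not_provided_share(keyword_visits):
    """Return the fraction of organic visits whose keyword is hidden.

    keyword_visits maps a keyword string to its visit count. The
    "(not provided)" key is an assumed label; use whatever label
    your analytics export actually shows.
    """
    total = sum(keyword_visits.values())
    if total == 0:
        return 0.0
    hidden = keyword_visits.get("(not provided)", 0)
    return hidden / total


# Made-up numbers, for illustration only.
sample = {"(not provided)": 150, "blue widgets": 500, "widget repair": 350}
share = not_provided_share(sample)
print(f"{share:.0%} of organic visits are hidden")
```

Run it monthly against your own export and you can watch the hole grow (or not) for yourself.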
Losing That Data Might Be the Best Thing For SEOs
I know that it sounds backwards, but hear me out. Also, let’s put aside the obvious here: of course this move was intended to get every business involved in Paid Search (not just those who spend millions upon millions every year). Because like SEOs, Google knows it’s the thousands of small accounts [mid-tail/long-tail] that add up. If you ensure the data that was once free has to route through a paid resource, you’re going to get a large swath of folks to jump on board and buy in.
Perhaps it was all part of their design (can’t discount that theory), but losing organic data has forced SEOs and search marketers to expand their tool-kit to get that data. When you can’t rely on a single source, you’ve got to employ multiple channels (e.g. social, content, CRO, etc.) to piece together the story again. I won’t lie, that’s a stretch (even to the writer). However, there’s a small nugget of truth in the idea that this assault on organic data has forced us to become better marketers.
2013 and Beyond
The more Google pushes its own products (e.g. Gmail, Places, Google Drive, Insights & Trends) to its social platform (Google+) to create a fluid SUPER-DATA-HIVE, the more users must be signed in to interact. Luckily, Google+ hasn’t quite pulled off the interaction and engagement with users it hoped to (so far). Then add in Firefox moving to Google SSL search by default and iOS 6 doing the same. What you get is a “black box” on organic data that is the size of Utah, and is only going to get larger and more vacuous. It is their data to do with as they please, after all. The hypocritical precedent set a year ago by Google will continue onward: “user privacy”. They clearly don’t want to be viewed under the same lens as Facebook.
I think that by the end of 2013 organic site data will be reduced to drips from a leaky faucet as SSL search becomes the rule and not the exception. I can’t say what the end-game is here; whether it’s Google constructing a service to “buy back” organic data that they anonymize or making SEOs piece together the puzzle from several different platform strands. Or, just so it can be said aloud, pushing every business into the AdWords platform to get “all the consumer data”. It certainly seems like that’s the objective with all these maneuvers: squeeze SEOs’ organic data into a corner so small that it becomes non-representative of overall searcher/consumer behavior.
The news about Google’s Panda has been hot and heavy since late February of this year. And the information on what Panda is, what it isn’t, and what it affects is just as plentiful. In Lord of the Rings terms, this is the one algorithm shift to rule them all. So, I’ve decided to put together an all-encompassing Panda reference guide, from Panda 1.0 to Panda 2.5 (and beyond, when the next iteration comes out). In it I’ll detail what we know from official sources and what we think we know based on conversations and data. It’s my hope that this is the one-stop Panda shop for online marketers re-familiarizing themselves with the finer points, and for business owners who simply want to find out what Panda is all about. In the second half of the guide, we’ll be exploring under-the-radar Panda issues.
The Google Panda Roll Out Timeline
The timeline above depicts the official release date of each Panda iteration as confirmed by Google. I’ve included it in Analytics format so you know where to look for surges/dips in traffic. Give that +/- 1 week depending on the type of website, how Panda was working through the food chain, and data centers.
The Panda Release Dates:
- Panda 1.0: February 24, 2011
- Panda 2.0: April 11, 2011
- Panda 2.1: May 9, 2011
- Panda 2.2: June 18, 2011
- Panda 2.3: July 22, 2011
- Panda 2.4: August 12, 2011
- Panda 2.5: September 28, 2011
- Panda 2.5.1: October 5, 2011
- Panda 2.5.2: October 13, 2011
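If you’d rather not eyeball the graph, the before/after check can be roughed out in a few lines of Python. This is only a sketch under stated assumptions: the dates are the ones listed above, but `panda_impact`, the 7-day window, and the sample traffic numbers are hypothetical, and you would need to feed it your own daily visit export.

```python
from datetime import date, timedelta

# Release dates as listed above.
PANDA_DATES = {
    "1.0": date(2011, 2, 24),
    "2.0": date(2011, 4, 11),
    "2.1": date(2011, 5, 9),
    "2.2": date(2011, 6, 18),
    "2.3": date(2011, 7, 22),
    "2.4": date(2011, 8, 12),
    "2.5": date(2011, 9, 28),
    "2.5.1": date(2011, 10, 5),
    "2.5.2": date(2011, 10, 13),
}


def panda_impact(daily_visits, release, window=7):
    """Percent change in mean daily visits after a release date.

    Compares the `window` days before the release with the `window`
    days after it (the release day itself is skipped). Returns None
    when there is no data on one side or the "before" mean is zero.
    """
    before = [daily_visits[release - timedelta(days=d)]
              for d in range(1, window + 1)
              if release - timedelta(days=d) in daily_visits]
    after = [daily_visits[release + timedelta(days=d)]
             for d in range(1, window + 1)
             if release + timedelta(days=d) in daily_visits]
    if not before or not after:
        return None
    mean_before = sum(before) / len(before)
    mean_after = sum(after) / len(after)
    if mean_before == 0:
        return None
    return (mean_after - mean_before) / mean_before * 100


# Made-up traffic: 1000 visits/day up to the release, 700/day after.
release = PANDA_DATES["2.2"]
visits = {release + timedelta(days=d): (1000 if d <= 0 else 700)
          for d in range(-7, 8)}
change = panda_impact(visits, release)
print(f"Panda 2.2: {change:+.1f}% change in average daily visits")
```

A simple before/after average won’t separate Panda from seasonality or a bad week, so treat a big negative number as a reason to look closer, not as proof.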
Breaking Down the Panda
Panda 1.0 : The Official Story
[...]as new content — both good and bad—comes online all the time. [...] But in the last day or so we launched a pretty big algorithmic improvement to our ranking — a change that noticeably impacts 11.8% of our queries — and we wanted to let people know what’s going on. This update is designed to reduce rankings for low-quality sites — sites which are low-value add for users, copy content from other websites or sites that are just not very useful. At the same time, it will provide better rankings for high-quality sites — sites with original content and information such as research, in-depth reports, thoughtful analysis and so on.
Panda 1.0 in a Nutshell
CONTENT. Google’s aim is to reward websites that cultivate and create “high-quality” content that provides value and usefulness to users. At the opposite end, it is to demote/punish websites that contain low-value content or scrape/copy content from other websites (passing it off as their own). The update was originally given the name “Farmer” by Danny Sullivan, as it [unofficially of course] took aim at content farms and scraper websites. Moreover, this initial update was aimed only at the US market and queries in English.
To give you an idea of how this affected larger content-factory type sites, check out this before and after Panda 1.0 snapshot from Sistrix
What also has to be realized, which is why I emphasized it above in the official story section, is that Panda is looking at WHOLE websites, entire domain contents. That’s at once what makes Panda so dangerous and brilliant. Prior to Panda, certain pages might not rank well for queries because they were low quality. Now those same pages are damaging entire domains. Think of it as exile by association. Have enough of what Google is qualifying as low-quality, and the entire domain gets blinked out of SERP existence.
Panda 2.0 Official Story
Today we’ve rolled out this improvement globally to all English-language Google users, and we’ve also incorporated new user feedback signals to help people find better search results. In some high-confidence situations, we are beginning to incorporate data about the sites that users block into our algorithms. In addition, this change also goes deeper into the “long tail” of low-quality websites to return higher-quality results where the algorithm might not have been able to make an assessment before. The impact of these new signals is smaller in scope than the original change: about 2%
Panda 2.0 In a Nutshell
Panda goes global. All English-language queries are Panda-ized. Interestingly, Google admits to using SERP block data and factoring that into the algorithm. It also marks the beginning of Panda targeting long-tail queries. Meaning, Panda is diving into deeper content on a site (think e-Commerce and product pages) to find low-quality content.
Panda 2.1 and Panda 2.2
Sorry, no official releases by Google on these, only journalist confirmation. Here are the posts detailing those confirmations: It’s Panda Update 2.1 and Google Panda Update 2.2. Again, while not official, it was the understanding of the search community that these updates were aimed at original source attribution. In simple terms: an attempt to stop content scraping sites from outranking the original source. Because, hey, no one likes being outranked by another site for the great content they wrote. Side effects of 2.1 and 2.2: continue to think product pages.
Panda 2.2 Finds Its Way to B2B Websites
Panda 2.2 Anecdote: I work with quite a few B2B websites, and it appears that Panda’s 2.2 update (outside of 1.0) is where a large swath of B2B sites began to feel the pressure of Panda. For the most part, the B2B sites I worked with remained untouched by Panda (and, by creating better content, they continue to remain so); however, there were a couple that did see large losses in traffic. This is just a friendly reminder to check your analytics on June 18th (plus or minus 3 days). If you see the dip, you’ve been hit.
In Between 2.2 and 2.3: The Subdomain Loophole Theory
Wall Street Journal publishes this article: Site Claims to Loosen Google “Death Grip”. It brings about a type of hysteria in the search marketing community indicating that Google has suddenly begun treating subdomains differently since Panda. In my opinion if it did exist (which I don’t think it ever did), it’s gone now. Once something like this hits the public airwaves (i.e. at the shouting level of WSJ), the loophole evaporates. Add to this, something that happens later in August [Webmaster Tools showing subdomain links as internal domain links] and everyone has hyper-sensitivity to subdomains.
Panda 2.3 and 2.4 The Official Story
2.3 was again quietly rolled out with confirmation from Google going to search journalists. This update is also extremely secretive, as stated by Google, “this update incorporates some new signals that help differentiate between higher- and lower-quality sites. As a result, some sites are ranking higher after this most recent update.” And, again, while not official, this update appeared to give another edge to BRANDS in the SERPs.
2.4, however, does have an official Google mention. “[...] we’re continuing that effort by rolling out our algorithmic search improvements in different languages. [...] For most languages, this change impacts typically 6-9% of queries to a degree that a user might notice [...] all languages except Chinese, Japanese, and Korean[...]”
Panda 2.3 and 2.4 in a Nutshell
Panda 2.3 seemed to be the quiet killer for US/UK dominated queries, with some brand protectionism in mind for more niche verticals and query spaces. However, because there is almost no information on it, it’s very hard to say what, beyond brands taking a more prominent place in the Panda Algorithm, this update targeted and hunted down.
Panda 2.4, however, made a big splash. In the truest sense of the word, Panda really went global, affecting nearly every language on the planet except Chinese, Japanese, and Korean (where, in the case of China and Korea, there has of course been prior conflict).
Panda 2.5, 2.5.1, 2.5.2
2.5 had no official release from Google but was confirmed with SEL (Search Engine Land) among others. This update marked the largest gap between iterations (7 weeks) and, again, Google did not release details on what 2.5 aimed to correct. However, per the reports in the days following 2.5, it seemed to carry a SERP payload with it.
2.5.1 and 2.5.2, or the Panda 2.5 Tweaks, did have a Google spokesperson and a semi-official tweet by Matt Cutts (tweet for 2.5.1 and the WeatherReports) and then another to confirm the 2.5.2 Panda tweak on October 13, 2011. I suppose all of us should be on the lookout for #WeatherReports for the coming Panda Flux.
Panda 2.5, 2.5.1, 2.5.2 in a Nutshell
Once again, because of the lack of official details from Google, we are only left with educated guesses based on what happened after their implementation. And, suffice it to say, video got a big bump (e.g. YouTube and tv.com) in the winner category. What we also saw was that sites that did recover in Panda 2.2 and Panda 2.3 were hit once again with original Panda-like effects. The “minor” updates/tweaks seem to affect each site differently; some reports indicate that sites are gaining back traffic slowly, while others report that their site has lost 80% of traffic (again).
And, we also know that Panda 2.6+ and/or MORE Panda algorithm tweaks are coming thanks to Matt’s “Weather Reports”. The wild Panda ride is nowhere near over. Keep your seat belts fastened and your trays locked in place, it’s still bumpy out there.
What You Can Do to Tame Panda
The first thing I would do is read Google’s advice on what they consider a high-quality website. Much of it is common sense things to think about, but for a business owner trying to tackle this issue on their own, it could provide some valuable ways of thinking about your products, services, and content in general.
But, if I’m to be honest, the real solution is to hire a good search marketing company, or online marketing company. As much as I’d like to believe that a business owner with little-to-no experience could do this on his/her own, I think Google has made that an impossibility. I think there’s a complexity, and an uncertain air of unpredictability here, such that unless you’re a professional online marketer or professional SEO, you don’t stand a chance.
Panda Issues Flying Under the Radar
With Panda’s introduction the only thing you ever hear about is content: write better content. Write high quality content. Indeed, that certainly has its place. But there are other big issues that Panda has created, and some things that Panda is likely looking at that are simply not talked about.
The Shrinking Link Graph
It’s an interesting after-effect from Panda thus far. Most search marketers will say that at any given time individual sites’ link graphs are constantly expanding and contracting; however, some of you may have noticed that in the couple of months after Panda arrived, the link graph drastically shrank. On the high end, as many as 200-300 already-indexed links had vanished from the profile. Poof. Gone.
This has to do with Panda and here’s how. As sites scrambled to figure out what was going on (especially larger, more content-driven sites), how to stop the bleeding, and how to reverse the trend, many were “noindexing” content in droves, many were sending several sub-directories to robots.txt, and in some drastic cases, simply wiping out content. The thinking was that if Google can’t index it, can’t crawl it, or can’t find it anymore, then removing whatever brought Panda upon us should also make Panda go away. Instead, those actions contracted the link graph quickly and violently. So, even if you were building links, you still saw your profile keep shrinking as webmasters and site owners kept killing off content.
The Shallow End of the Anchor Text Pool
A more complex problem arises from this as well: the condensing of the anchor text pool. Did this contraction wipe out more semantic and temporal anchor text, did it wipe out brand-centric anchor text, or did it leave a highly concentrated majority of exact-match anchors? The issue then comes back to this: how did you build links before Panda? If you were gung-ho on exact-match anchors, not giving thought to semantic and temporal closeness and relatedness, then there is likely another trouble spot in your future. Let me explain.
As the Link Graph shrinks, so too does the anchor text pool of that link graph. And, if a site were to have built an over-abundance of exact match anchors to major keyword phrases, while some of those will be removed, so too will other anchor text that helped to normalize the profile. Then, not only are you dealing with “low quality content” issues, but now what looks to be a manipulated link graph and profile.
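To make that concrete, here is a minimal sketch in Python of how the exact-match share of a profile can climb even while exact-match links are being lost. Everything in it is an assumption for illustration: the function name, the anchor lists, and the numbers are made up, and this is in no way how Google measures a link profile.

```python
def exact_match_share(anchors, target_phrases):
    """Return the fraction of anchors that exactly match a target phrase.

    anchors is a list with one anchor text per inbound link;
    target_phrases are the keyword phrases you built links for.
    Matching is case-insensitive.
    """
    if not anchors:
        return 0.0
    targets = {phrase.lower() for phrase in target_phrases}
    hits = sum(1 for anchor in anchors if anchor.lower() in targets)
    return hits / len(anchors)


# Hypothetical profile before the contraction: 10 links.
before = (["blue widgets"] * 4
          + ["Acme"] * 3
          + ["guide to widget colors"] * 3)

# After linking pages were noindexed or deleted: 5 links survive.
# One exact-match link was lost, but so were four "normalizing" ones.
after = ["blue widgets"] * 3 + ["Acme"] + ["guide to widget colors"]

for label, profile in (("before", before), ("after", after)):
    share = exact_match_share(profile, ["blue widgets"])
    print(f"{label}: {share:.0%} exact-match anchors")
```

In this toy profile the exact-match share goes from 40% to 60% without a single new link being built, which is exactly the kind of shift that can make a profile start to look manipulated.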
Some great background reading on link profiles and anchor text: Bill Slawski’s How a Search Engine might Weigh the Relevance of Anchor Text Differently, the BlueGlass TPA session recap The Evolution and Implementation of Link Building, and Justin Briggs’ Phrase Based Indexing and Semantics.
Ad Placement on the Page
While ads are not direct content on the page generated by the writer, they are a part of the contextual content on the page and affect the user. David Harry (one of the head Search Obsessed Geeks over at Search News Central) wrote How Google might find you annoying, which details this patent: DETECTING AND REJECTING ANNOYING DOCUMENTS
It’s certainly worth considering, especially if you have revamped your content, cleaned up your link graph, made the best site you can make, and are still being penalized.
Reference Guide Conclusion:
With Panda there’s a lot going on and a lot of moving parts. Some of the algorithm tweaks we know about, and others we don’t. What this guide should provide to you is a game plan to build the best site you can from a content perspective, from a user experience perspective, and from a link building perspective. Of course, there are finer points to all of these statements that go much deeper technically, but the end goal is to build a site you’d want to read from, you’d want to buy from, and a site that you’d recommend to someone who’s looking for that information.
At the end of the day, Panda is, at its heart, what good SEOs have been saying for years: build a great site with great content [a core-focused site], and the rest takes care of itself. If you do have questions about Panda, feel free to contact me or leave comments below.