
White Paper: Critical Factors of Search Data Aggregation

Author: Dr. Horst Joepen, Geschäftsführer (Managing Director)/CEO

Searchmetrics GmbH, Saarbrücker Str. 38, 10405 Berlin
Management: Dr. Horst Joepen, Marcus Tober
Web: www.searchmetrics.com
E-mail: info@searchmetrics.com
Phone: +49 (0) 30 322 95 35-0

Search Engine Optimization (SEO) is still perceived more as a dark art practised by magicians than as an exact science delivered by reputable vendors or service providers. In most projects, data of low or questionable quality is used to make costly investment decisions about Web site coding, content creation or the building of link structures.

For the last couple of years the search data landscape has been mapped by point tools such as SpyFu, SEMRush or Compete, which have been used to obtain search engine keyword data. Keyword and backlink information provided by the search engine vendors themselves, such as Yahoo Site Explorer or the Google keyword tools, has also helped to illuminate the search universe. With more integrated Search Analytics tools, a category pioneered by Searchmetrics with the introduction of the Searchmetrics Suite in early 2008, it is getting easier for agencies and companies to manage a broader and more consistent data set, enabling them to direct targeted SEO investments with a predictable ROI.

Searchmetrics provides its customers with a maintained database of 15 million keywords and over 40 million domains, and is constantly expanding it. In addition to this generic database covering Germany, the US and the UK, customers can choose from 52 combinations of 20 countries and the 3 major search engines Google, Bing and Yahoo, with Yandex for Russia.

As the data provided voluntarily by search engines is intentionally incomplete (for example, the number of backlinks shown is limited to 1000, and search volumes are specified without the actual query context), Search Analytics vendors need to collect large volumes of search results by tapping into accessible data streams, sending queries to search engines via a network of agents, or using similar methods.
Brute force approaches and high-volume automated queries sent from an agency or customer network IP address violate most search engines' usage policies and should not be considered, as they lead to penalties or to blocking of these IP addresses by the search engines. Generally speaking, therefore, desktop-based products should be avoided. Searchmetrics' SaaS-based solution comes from a reputable vendor that can afford to operate a robust, large, distributed server infrastructure, which provides higher and more immediate data availability.

Some important aspects of search data aggregation that most vendors don't talk about when bandying about large keyword and domain numbers are discussed below. This is where Searchmetrics can provide its customers with a key competitive advantage when monitoring the search performance of a domain against its competitors or making costly decisions about a new keyword as an optimization target.

Data Aggregation Frequency

As search engines don't like to be polled automatically in large volumes, vendors tend to aggregate large keyword databases over long periods of time. We have found keywords in the SEMRush database with ranking and volume information that appeared to be several months old. This indicates a low refresh rate and a high percentage of outdated keyword information in the respective vendor's database. In contrast, Searchmetrics guarantees a monthly refresh rate for its database: every data point is refreshed within a 30-day cycle.

The optimal data aggregation frequency depends on the nature of the content to be analysed. For general keyword research a monthly refresh rate is acceptable, but for competitive analysis of positions in a specific market segment (market index) or for regular monitoring, weekly data is needed. To address these differing requirements Searchmetrics has implemented both monthly and weekly update processes.
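To make the refresh-rate argument concrete, the following minimal Python sketch measures what share of a keyword data set is older than a given refresh cycle. It is an illustration only, not Searchmetrics' implementation: the function name `stale_share`, the tier labels and the sample timestamps are all assumptions invented for this example.

```python
from datetime import datetime, timedelta

# Hypothetical tier labels; the maximum ages mirror the monthly and
# weekly cycles named in the text.
MAX_AGE = {
    "monthly": timedelta(days=30),  # general keyword research
    "weekly": timedelta(days=7),    # competitive analysis and monitoring
}


def stale_share(last_refreshed, tier, now):
    """Return the fraction of data points older than the tier's maximum age."""
    if not last_refreshed:
        return 0.0
    limit = MAX_AGE[tier]
    stale = sum(1 for ts in last_refreshed if now - ts > limit)
    return stale / len(last_refreshed)


# Invented sample data: last-refresh timestamps of four keywords.
now = datetime(2010, 6, 1)
timestamps = [
    datetime(2010, 5, 28),  # 4 days old
    datetime(2010, 5, 10),  # 22 days old
    datetime(2010, 2, 1),   # 120 days old
    datetime(2010, 3, 15),  # 78 days old
]

print(stale_share(timestamps, "monthly", now))  # 0.5
print(stale_share(timestamps, "weekly", now))   # 0.75
```

In this made-up sample, half of the data points would fail a 30-day guarantee and three quarters would be too old for weekly monitoring.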
For news content such as Google News, where currency is king, even weekly data updates are insufficient to reflect the dynamics of changes in content and link structure. Searchmetrics has therefore implemented a 15-minute refresh cycle for news.

Data Aggregation Dispersion

As weekly updates for restricted keyword sets are marketed by several vendors, let's take a closer look at how comparable this data is. It is important for SEOs to understand the accurate ranking of a domain for a given keyword set relative to its competitors. It is also vital to understand the position distance between different keywords. As an example, let us assume that at a specific time Keyword A is ranked at position 3 and Keyword B is ranked at position 15. In this case the position distance would be 12. Since most vendors use a full one-week window to update keyword data, they may report that Keyword A is at position 3 on Monday morning and Keyword B is at position 20 on Thursday afternoon, an erroneous position distance of 17. This is the SEO equivalent of comparing apples and oranges and will lead to poor results for the client. To ensure more accurate results, Searchmetrics has now implemented a weekly update process for individually monitored keywords that captures all position data in a 3-hour time window on Monday, generally accepted as the most representative (or average) time window for keyword searches.

Data History

In order to assess the current ranking of a Web site it is necessary to see historic charts, so that earlier penalties, relaunch effects or other effects can be recognized and taken into account when developing optimization strategies going forward. Searchmetrics is able to show data history for a full 12-month period. This includes not only keyword positions and ranking factors like Google PageRank, but also historic audit results of the domain, so customers are able to report movements up or down in a historical context.
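Returning to the position distance example from the section on data aggregation dispersion, the short Python sketch below shows why capture times matter when comparing two keyword positions. It is a hedged illustration rather than a description of any vendor's process: the function names and the sample timestamps are hypothetical, and only the positions (3, 15, 20) and the 3-hour window come from the text.

```python
from datetime import datetime, timedelta

# Assumption: the 3-hour capture window described in the text.
CAPTURE_WINDOW = timedelta(hours=3)


def position_distance(pos_a, pos_b):
    """Absolute gap between two ranking positions."""
    return abs(pos_a - pos_b)


def is_comparable(captured_a, captured_b, window=CAPTURE_WINDOW):
    """True if both positions were captured within the same time window."""
    return abs(captured_a - captured_b) <= window


# Invented timestamps: a Monday morning and the following Thursday afternoon.
monday_0900 = datetime(2010, 6, 7, 9, 0)
monday_1030 = datetime(2010, 6, 7, 10, 30)
thursday_1500 = datetime(2010, 6, 10, 15, 0)

# Both keywords captured in the same window: distance 12, comparable.
print(position_distance(3, 15), is_comparable(monday_0900, monday_1030))
# Keyword B captured three days later: distance 17, not comparable.
print(position_distance(3, 20), is_comparable(monday_0900, thursday_1500))
```

A distance computed from positions captured days apart mixes real ranking differences with changes over time, which is why the second pair is flagged as not comparable.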
Search Engine Adaptation Speed

As search engines change their algorithms hundreds of times every year, the structure of Search Engine Result Pages (SERPs) can change as well. The scanners of most SEO tools fail in this situation unless the vendor maintains an alert team that takes immediate action to adjust the scanners and avoid delivering erroneous data. Searchmetrics established an alert team early on and can react to changes rapidly. Searchmetrics' process also allows it to roll back and redo queries from the point at which a change to the search engine was identified.

This paper should help customers evaluating SEO tools to assess the data quality provided by different vendors and understand the value provided through the ongoing investment of Searchmetrics in its data aggregation process.