Google Cache Declared Legal
Friday January 27th 2006, 3:10 am
Filed under: Search Engine Marketing, Search Technology

This must be the month for bizarre court decisions. First we had the German judge who shut down that country’s Wikipedia, and now a Nevada judge has declared Google’s practice of caching the content of other sites to be legal and free of copyright infringement. The Electronic Frontier Foundation posted this summary of the ruling:

  • Serving a webpage from the Google Cache does not constitute direct infringement, because it results from automated, non-volitional activity by Google servers (Field did not allege infringement on the basis of the making of the initial copy by the Googlebot);
  • Field’s conduct (failure to set a “no archive” metatag; posting “allow all” robot.txt header) indicated that he impliedly licensed search engines to archive his web page;
  • The Google Cache is a fair use; and
  • The Google Cache qualifies for the DMCA’s 512(b) caching “safe harbor” for online service providers.

Further explanation is provided in a CNET article:

The court also said that Google’s cache amounts to fair use of the works being copied and transmitted, and the company’s database qualifies for a “safe harbor” provision of the Digital Millennium Copyright Act, which protects databases, ISPs, and other online service providers that don’t exert direct control over what content is posted against copyright liability.

Based on these synopses, the decision appears to come from a judge with little understanding of the technology. Comparing Google’s cache, where a user can choose to view either the current copy of a web page or an older copy which Google has stored, to an ISP’s cache, where very recent copies of content are stored and served to the viewer transparently, is simply wrong. The two aren’t equivalent. The ISP is delivering, for all intents and purposes, the current web page with all of the current content, ads, etc. Google, on the other hand, is deliberately offering a choice between the web page that the site owner wants you to see, with current content, advertising, etc., and a copy which may be much older - even months old - and which may not contain current content, current advertising, etc. In addition, since websites may tailor their delivered content to the browser in use, the copy Google holds in its cache may offer a far less rich user experience.

The other key point is that the site owner was judged to be at fault for not using the “no archive” metatag. This, too, reflects a lack of understanding of common practice. I’d estimate 99+% of web site owners (and even a fair number of web developers) have no idea that such a tag exists. In many cases, setting this tag may be outside of the site owner’s control (preinstalled content management systems, blogs, hosted pages, etc.). To imply that the site owner was negligent by, in effect, failing to put a “Do Not Steal” sign on his content is ludicrous.
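
For anyone who has never seen the tag in question, it is a single line in a page’s head, conventionally written as <meta name="robots" content="noarchive">. The short Python sketch below is only an illustration and has nothing to do with the ruling itself: it assumes that conventional form (with "googlebot" accepted as an alternative name), and the helper name has_noarchive is made up for this example.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" or "googlebot" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalize to "".
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noarchive(html: str) -> bool:
    """Hypothetical helper: True if the page asks crawlers not to keep a cached copy."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return "noarchive" in parser.directives


page = '<html><head><meta name="robots" content="noarchive"></head></html>'
print(has_noarchive(page))
```

A page without that one line would, under the reasoning in the ruling as summarized above, be read as permitting the cached copy, which is exactly the burden on the site owner that I’m objecting to.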

