When American Airlines sued Farechase, Inc. in Federal District Court in Texas earlier this year, claiming that Farechase's “screen-scraping” of AA's flight information from AA.com was illegal, it was only the most recent in a series of cases challenging unauthorized data collection from Web sites.
What practices constitute “screen-scraping”? Is “scraping” really illegal? What does this line of cases mean for your business?
What Is ‘Screen-Scraping’?
Despite its pejorative name, screen-scraping software simply gathers and aggregates data from other Internet Web sites for use by the gathering party. Usually, the purpose is to reformat the data and display it for the benefit of the gathering party's customers.
Examples of data aggregation range from sites that collect prices on retail sites to companies that aggregate personal financial data on mutual fund and banking Web sites, permitting registered users to access information about multiple accounts on a single Web site.
The software that performs this function—often referred to as a robot (or “bot”), “spider” or “crawler”—searches Web sites for specific information.
The Farechase case provides one example of how this technology works. Farechase's customers are travel agencies. When a customer uses Farechase to research a particular airline, hotel or auto rental fare, Farechase's software searches different airline sites and collects the “Web fares” offered. A popular site such as AA.com might be searched thousands of times a day in response to queries initiated by Farechase customers.
Farechase's real-time search technology is an advance over more traditional data mining, in which companies search sites on a regular basis and maintain a separate database that may be queried by users. In the case of sites selling books or music, for example, a real-time search may not be essential as long as the database is updated frequently.
Farechase took this concept one step further by permitting its users to search for fares offered at the very moment the search is conducted, thus guaranteeing that the results would be current.
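The difference between the two approaches can be illustrated with a minimal sketch. It is not Farechase's actual software: the class names, the `FakeAirlineSite` stand-in, and the route and fare values are all hypothetical, and no real Web site is contacted.

```python
from typing import Callable, Dict, Iterable, Tuple

# A "source" is any callable that returns the current fare for a route.
FareSource = Callable[[str], float]


class FakeAirlineSite:
    """Hypothetical stand-in for a vendor site; counts how often it is queried."""

    def __init__(self, fares: Dict[str, float]) -> None:
        self.fares = dict(fares)
        self.hits = 0

    def lookup(self, route: str) -> float:
        self.hits += 1
        return self.fares[route]


class CachedAggregator:
    """Traditional data mining: crawl on a schedule, answer from a local database."""

    def __init__(self, sources: Dict[str, FareSource]) -> None:
        self.sources = sources
        self.database: Dict[Tuple[str, str], float] = {}

    def refresh(self, routes: Iterable[str]) -> None:
        # The only step that touches the vendor sites.
        for route in routes:
            for name, fetch in self.sources.items():
                self.database[(name, route)] = fetch(route)

    def search(self, route: str) -> Dict[str, float]:
        # Served from the local copy, so results may be stale.
        return {
            name: self.database[(name, route)]
            for name in self.sources
            if (name, route) in self.database
        }


class RealTimeAggregator:
    """Real-time search: every user query goes to the vendor sites."""

    def __init__(self, sources: Dict[str, FareSource]) -> None:
        self.sources = sources

    def search(self, route: str) -> Dict[str, float]:
        return {name: fetch(route) for name, fetch in self.sources.items()}


if __name__ == "__main__":
    site = FakeAirlineSite({"DFW-BOS": 300.0})
    cached = CachedAggregator({"airline": site.lookup})
    live = RealTimeAggregator({"airline": site.lookup})

    cached.refresh(["DFW-BOS"])
    site.fares["DFW-BOS"] = 250.0  # the vendor changes its fare after the crawl

    print("cached:", cached.search("DFW-BOS"))
    print("live:  ", live.search("DFW-BOS"))
    print("requests seen by the vendor:", site.hits)
```

In this sketch the cached search returns the fare as of the last crawl, while the real-time search returns the current fare at the cost of one request to the vendor for every user query. That trade-off is what makes real-time results current and also what generates the traffic that targets object to.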
Needless to say, one's views on this type of data mining depend largely on whether one is the scraper or scrapee!
The targets of this practice, such as American Airlines, complain that the constant traffic resulting from scraping puts an extra burden on their Internet servers, slowing down their response times for legitimate users.
In the Farechase case, American Airlines claimed that, if left unstopped, Farechase would be performing more than 200,000 searches a day by the end of 2003. Moreover, it argued that because Farechase lets customers reach Web fares by going directly to the American booking pages at AA.com, American cannot establish the relationship with its customers that it otherwise might (that is, if customers were required to navigate through AA.com's preliminary pages), and that this costs American customer goodwill.
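To put the 200,000 figure in per-second terms, the short calculation below converts a daily search count into an average rate. It assumes the searches are spread evenly over 24 hours, which real traffic is not, so peak rates would be higher; the function name is made up for this illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate_per_second(searches_per_day: int) -> float:
    """Average request rate, assuming searches are spread evenly over the day."""
    return searches_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    rate = average_rate_per_second(200_000)
    print(f"{rate:.2f} searches per second on average")
```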
On the other hand, companies like Farechase argue that their service encourages comparison-shopping. Companies that resist it are afraid of the resulting competition (and lower prices), Farechase says.
Before discussing the legalities of screen-scraping, it is worth pointing out that companies that are targeted by this practice and object to it often undertake a measure of “self-help” before authorizing their lawyers to file suit. Such self-help sometimes leads to a technological battle worthy of a William Gibson novel.
The defenders attempt to identify and block the Internet Protocol (IP) addresses of the attackers. The attackers respond by hiding or disguising their scrapers' identities by using fake IP addresses, thereby evading the blocking firewalls.
The attackers, not easily discouraged, seem to have a limitless supply of disguises, perpetuating this high-tech cat-and-mouse game. In several cases, the attackers have prevailed. As a result, several of these disputes have ended up in the courts.
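The blocking and evasion described above can be sketched in a few lines. This is only an illustration using Python's standard `ipaddress` module; the `IpBlocklist` class is hypothetical, and the addresses come from ranges reserved for documentation rather than from any party to these cases.

```python
import ipaddress
from typing import List, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


class IpBlocklist:
    """Hypothetical firewall rule set: refuse requests from listed addresses or ranges."""

    def __init__(self) -> None:
        self.blocked: List[Network] = []

    def block(self, address_or_range: str) -> None:
        # Accepts a single address ("192.0.2.10") or a CIDR range ("198.51.100.0/24").
        self.blocked.append(ipaddress.ip_network(address_or_range))

    def allows(self, address: str) -> bool:
        ip = ipaddress.ip_address(address)
        return not any(ip in network for network in self.blocked)


if __name__ == "__main__":
    firewall = IpBlocklist()

    # The defender spots the scraper's address and blocks it.
    firewall.block("192.0.2.10")
    print(firewall.allows("192.0.2.10"))    # the blocked address is refused

    # The scraper reappears from a different address and gets through.
    print(firewall.allows("198.51.100.7"))

    # The defender widens the block to the whole range, and the cycle repeats.
    firewall.block("198.51.100.0/24")
    print(firewall.allows("198.51.100.7"))
```

A block keyed to source addresses only holds until the scraper changes the address it appears to come from, which is why this kind of self-help tends to escalate rather than settle the matter.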
When technical defenses fail, screen-scraper targets such as American Airlines have two primary legal weapons to deploy in their defense. The first is to claim breach of a click-wrap or browse-wrap online license. The second is to allege a “tort” (or legal wrong), most commonly “trespass to chattels.”
In its case against Farechase, American Airlines attempted to fire both barrels at its opponent. But its opening salvo was weak.
First, American claimed that Farechase violated American's “browse-wrap” agreement. With its use of the term “browse-wrap,” American was referring to an online agreement that appears on the site (usually under the terms and conditions link) but does not require the user to click on or express consent to the agreement before proceeding to use the site.
By contrast, the better known (and far more effective) “click-wrap” agreement requires the first-time user to click on a word or symbol to express acceptance of a site's licensing terms before gaining access to the site. While a site that properly implements a click-wrap agreement can expect courts to enforce it, no court has yet enforced a browse-wrap agreement, and the only two courts that have considered the issue at all have expressed doubts about the enforceability of such an agreement.
However, the ability to protect a Web site with nothing more than an explicit statement on the Web site restricting access received a potential boost in a recent decision by the First Circuit Court of Appeals in Boston. That court suggested that screen-scraping may violate the Computer Fraud and Abuse Act (CFAA) and that a restrictive warning of the sort used in browse-wrap agreements may be enough to invoke the CFAA.
The second barrel of American's gun was loaded with more powerful munitions, in the form of its claim that Farechase had violated the law of “trespass to chattels” (i.e., goods). While the English law of trespass as applied to chattels can be traced back hundreds of years, it has shown a surprising ability to adapt itself to the law of the Internet.
Most courts that have considered the applicability of trespass law to data scrapers have ruled in favor of the complaining party. The best known of these cases, eBay, Inc. v. Bidder's Edge, Inc., resulted in an injunction ordering Bidder's Edge to stop data mining from the eBay Web site.
Moreover, in several of these cases the courts have not required proof that the scrapers caused any measurable harm, or any specific injury, to the sites they were data mining.
Not surprisingly, based on the above record, American Airlines was successful in obtaining an injunction against Farechase. While Farechase is still in business, its searches no longer include American Web fares.
Where to from Here?
Based on this state of the law, can data miners expect to build a business based on unauthorized screen-scraping?
Somewhat surprisingly, the outlook may be better than it appears.
First, many companies utilize this form of data mining without objection from the owners of the sites they are crawling. The reasons are economic, not legal.
In some industries, screen-scraping has become an accepted method of business. Furthermore, the vast majority of companies are willing to provide access to their sites when they are approached cooperatively. That some percentage of their capacity is being used by a scraper is not a deterrent, as long as the scraper's customers ultimately are referred to the vendor's site to make the purchase.
Second, while the law thus far has favored original content providers, the law on electronic trespass to chattels is far from settled. The California Supreme Court recently issued a decision in Intel Corp. v. Hamidi, rejecting Intel's attempt to prevent a former employee from sending mass emails to Intel employees.
In that case, the court held that electronic trespass to chattels is not actionable under California law unless it involves “actual or threatened injury to the personal property or the possessor's legally protected interest in the personal property.”
Since Hamidi's emails (numbering in the hundreds of thousands) to Intel employees caused no such harm, the court refused to order Hamidi to cease communications. Although this was not a screen-scraping case, the issues it raises are essentially the same (Intel relied heavily on the scraper cases), and Hamidi may therefore be an important defensive tool for scrapers in the future.