Dec 21, 2009
The World Wide Web has made text data available like never before. Almost every site on the Internet can be accessed either by a browser or by a program that retrieves the page as plain text, HTML tags included. This availability of raw data has led to a renaissance of programs called screen scrapers. A screen scraper accesses data normally targeted at a screen (or browser window) and scrapes the desired data from it for storage, or for repackaging and display.
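To make the idea concrete, here is a minimal sketch of a scraper in Python, using only the standard library. It is one possible approach, not a reference implementation: the names LinkScraper, scrape, fetch, and SAMPLE are invented for this illustration, and the choice to extract the page title and its links is just an assumption about what the "desired data" might be.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkScraper(HTMLParser):
    """Collect the page title and every (href, link text) pair."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that <a> tag

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def scrape(html):
    """Return (title, links) scraped from a string of HTML."""
    parser = LinkScraper()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.links


def fetch(url):
    """Retrieve a page as text, HTML tags included (needs network access)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Hypothetical page content, used here in place of a live download.
SAMPLE = (
    "<html><head><title>Demo</title></head><body>"
    "<a href='/a'>First</a> <a href='/b'>Second <b>bold</b></a>"
    "</body></html>"
)

if __name__ == "__main__":
    title, links = scrape(SAMPLE)
    print(title)
    print(links)
```

For a live page, the same parser would be fed the result of fetch, as in scrape(fetch(url)) for a URL of your choosing. Retrieval and parsing are kept in separate functions so the scraping logic can be tried out on saved HTML without touching the network.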
Brief History of Screen Scrapers
Between the 1970s era of widespread deployment of text-based mainframe/terminal applications and the twenty-first-century browser era came the age of the graphical user interface (GUI). In this age, ushered in by the success of the Macintosh, computer applications began to feature windows, drop-down menus, checkboxes, and other user-interface elements that made programs much more flexible to use than their text-based predecessors.
The revolution in GUI adoption caused a problem for organizations that had invested tremendous amounts of time and money in mainframe-based text applications. In less than a decade, these text-based applications went from being cutting edge to antiquated. For productivity reasons, organizations had to rewrite these applications to take advantage of the new GUI paradigm. Even more daunting than the mountain of reprogramming was the conversion of data stored in these mostly custom systems. Few standard data formats existed when these systems were initially designed, so retrieving and converting the data posed tremendous difficulties.