Download RedNails

RedNails is a data scraping library that uses templates to determine what data to extract from actual data feeds. RedNails uses the template to create a regular expression that catches the user marker variables. When a string of data is passed to RedNails it will use the regular expression to extract the matches and return them to the user. If the scraped data is regular enough then RedNails is a simple way to extract data as all one needs to do is copy a live data feed and mark the points to extract and make this the template.

License

RedNails is under the BSD license

Usage

  1. Create a template.
  2. Load and initialize an instance of a RedNails object with the template
  3. Pass this instance your data feed from which you wish to extract information.
  4. Use the results.

Template Format

A rednails template is simply a text file that has the points to scrape marked with what looks like a ruby string substitution. You give each substitution a unique variable name which can then be referenced when using the parse_hash method.

An example template is:

"Hello my name is #{name}.  How are you?"
And if the live data is:
"Hello my name is Mr.Bill.  How are you?"
Then the following code fragement will produce "Mr.Bill":
	  require 'rednails'
	  rednails = RedNails.new("template.txt")
	  results = rednails.parse_hash("livedata.txt")
	  puts results["name"] # => Mr.Bill
	

Repetitions

If have data that you would like to extract which repeats itself then there is an additional template marker you can use. For the first example replace the data with #{Rep:} after the colon inside of the Rep marker you will then place the structured data that repeats, except that for each unique piece of data that you would like to extract replace it with a unique variable name that starts and ends with @.

For example if you have an arbitrary list of images that you would like to extract you can make a template like this:

	  <html>
	  <body>
	  A bunch of photos:
	  #{Rep:<img src="@url@" alt="@txt@"/>}
	  </body>
	  </html>
	

For more details please see the test cases.

Installation

Gem

gem install rednails

Manual

As root:
# ruby setup.rb all

Author and Contributions

Zev Blut
With some changes and help by Min Lin Hsieh, Daniel DeLorme and Pierre Baumard.

Elbert Corpuz for CSS and HTML page layout

Finally, thanks to Ubit for allowing this software to go open source.

Other scrapers

If RedNails does not fit your needs then here is a nonexhaustive list of other scrapers for Ruby