Add support for another portal

Ed Tewiah edited this page Dec 25, 2017 · 1 revision

To add support for scraping a portal, a JSON config file has to be added and a few changes made so that the scraper recognizes the newly added portal.

The JSON config file needs to be added here:

https://github.com/RealEstateWebTools/property_web_scraper/tree/master/config/scraper_mappings

It is a list of CSS and XPath selectors, one for each field to be populated. The existing files should make it clear how it works. If they do not, open an issue and I will try to document it better.
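As a rough illustration of the idea, the sketch below parses a made-up mapping and lists the selector that would be used for each field. The key names (`name`, `textFields`, `intFields`, `cssLocator`, `xpath`), the portal name, and the helper `selectors_for` are all assumptions made for this example, not the project's confirmed schema, so check the existing files for the real structure.

```ruby
require "json"

# Hypothetical mapping: every key name and selector here is illustrative only.
MAPPING_JSON = <<~MAPPING
  {
    "name": "example_portal",
    "textFields": {
      "title": { "cssLocator": "h1.listing-title" },
      "reference": { "xpath": "//span[@id='ref']" }
    },
    "intFields": {
      "count_bedrooms": { "cssLocator": ".bedrooms" }
    }
  }
MAPPING

# Hypothetical helper: returns [field_name, selector_kind, selector]
# for every field found in the mapping's field groups.
def selectors_for(mapping)
  mapping.flat_map do |_group, fields|
    # Skip non-group entries such as the "name" string
    next [] unless fields.is_a?(Hash)

    fields.map do |field_name, rule|
      kind = rule.key?("cssLocator") ? :css : :xpath
      [field_name, kind, rule["cssLocator"] || rule["xpath"]]
    end
  end
end

mapping = JSON.parse(MAPPING_JSON)
selectors_for(mapping).each do |field_name, kind, selector|
  puts "#{field_name}: #{kind} #{selector}"
end
```

Listing the selectors like this can be a quick sanity check that a new config parses as valid JSON before running the scraper against it.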

Ensure the name of the file is added to this model:

https://github.com/RealEstateWebTools/property_web_scraper/blob/master/app/models/property_web_scraper/scraper_mapping.rb

The final step is to connect the portal's hostname (taken from the listing URL) to the file you just created by adding an entry to the import_hosts table in the database. You can do this through the console. So that a freshly set-up app also supports the portal you added, please add an entry to this seed file as well:

https://github.com/RealEstateWebTools/property_web_scraper/blob/master/db/seeds/import_hosts.rb
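To make the hostname-to-file connection concrete, here is a minimal plain-Ruby sketch of the lookup that an import_hosts entry enables. The `IMPORT_HOSTS` array stands in for the database table, and the `host` and `scraper_name` keys, the example hostname, and the `scraper_name_for` helper are hypothetical. The real column names are in the seed file above.

```ruby
require "uri"

# Stand-in for rows of the import_hosts table; key names are hypothetical.
IMPORT_HOSTS = [
  { host: "www.example-portal.com", scraper_name: "example_portal" }
].freeze

# Hypothetical helper: returns the mapping name registered for a URL's
# hostname, or nil when no entry exists for that host.
def scraper_name_for(url, import_hosts = IMPORT_HOSTS)
  host = URI.parse(url).host
  row = import_hosts.find { |entry| entry[:host] == host }
  row && row[:scraper_name]
end

puts scraper_name_for("https://www.example-portal.com/listing/123").inspect
puts scraper_name_for("https://unknown.example.org/listing/1").inspect
```

The point of the sketch is that a URL whose hostname has no entry resolves to nothing, which is why a portal without a row in import_hosts is not recognized even when its JSON file exists.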

The best workflow when adding a new portal is to write a failing spec for it first, then work on the config until the spec passes. You can find the specs for the scrapers I've created here:

https://github.com/RealEstateWebTools/property_web_scraper/tree/master/spec/services/scrapers