This repository was archived by the owner on Apr 23, 2025. It is now read-only.

Commit cd8fcbb

feat(fabric): add fabric support to run app or other tasks
- add fabfile.py
- set message when no result found in website
- set LOGS_DIR environment variable
1 parent ce1df6c commit cd8fcbb

7 files changed, +38 -10 lines changed

README.md

Lines changed: 12 additions & 3 deletions
@@ -5,6 +5,7 @@ A web crawler/scraper to find the broken links in the targeted seed url based on
 
 ##Installation
 1. Redis
+3. Fabric
 2. Python 2.7+
 
 ##Instructions
@@ -26,18 +27,26 @@ A web crawler/scraper to find the broken links in the targeted seed url based on
 export SMTP_PASSWORD='smtp-password'
 ```
 
+4. Also, set the one more environmnet variable to save **`Logs`** of the app in defined location.
+```python
+# your shell config file
+export LOGS_DIR='path/to/logs'
+```
+
 ##Commands
+Note:- First install *`Fabric`* to run below commands
+
 To run a gui app :
 ```
-$ python rottoscraper/run.py app
+$ fab app
 ```
 To run a dispatcher :
 ```
-$ python rottoscraper/run.py dispatcher
+$ fab dispatcher
 ```
 To run a worker :
 ```
-$ python rotttoscraper/worker.py
+$ fab worker
 ```
 ##Developer
 1. [Akshay Pratap Singh](https://www.facebook.com/AKSHAYPRATAP007)

fabfile.py

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+#! /usr/bin/env python
+# -*- coding: utf-8 -*-
+
+from fabric.api import local
+
+def app():
+    local('python rottoscraper/run.py app')
+
+def dispatcher():
+    local('python rottoscraper/run.py dispatcher')
+
+def worker():
+    local('python rottoscraper/worker.py')

rottoscraper/config.py

Lines changed: 3 additions & 0 deletions
@@ -28,3 +28,6 @@
 # SMTP Cerendentials
 SMTP_USER = os.getenv('SMTP_USER', None)
 SMTP_PASSWORD = os.getenv('SMTP_PASSWORD', None)
+
+# Logs DIR Path
+LOGS_DIR = os.getenv('LOGS_DIR', 'logs/')

rottoscraper/gui/static/html/result.html

Lines changed: 2 additions & 2 deletions
@@ -36,10 +36,10 @@
 </table>
 </div>
 <div class="result-content fancy-box">
-<div class="msg" ng-show="website.result.length==0">
+<div class="msg" ng-if="!website.result">
 <p>No Rotto Links Page Found</p>
 </div>
-<div class="result-content-row" ng-repeat="rottopage in website.result">
+<div class="result-content-row" ng-if='website.result' ng-repeat="rottopage in website.result">
 
 <table class="table">
 <tr>

rottoscraper/logger.py

Lines changed: 6 additions & 3 deletions
@@ -9,10 +9,13 @@
 from logbook import FileHandler
 from logbook import Logger
 
+from config import LOGS_DIR
+
 log = Logger('scraper')
 
 # Create a logs direcory if not exist
-if not os.path.exists('logs'):
-    os.makedirs('logs')
-file_handler = FileHandler('logs/app.log', level=logbook.DEBUG)
+if not os.path.exists(LOGS_DIR):
+    os.makedirs(LOGS_DIR)
+log_file_name = 'rottoscraper.log'
+file_handler = FileHandler(LOGS_DIR + log_file_name, level=logbook.DEBUG)
 file_handler.push_application()
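
Note on the path handling in this file: `LOGS_DIR + log_file_name` is plain string concatenation, so it only produces the intended path when `LOGS_DIR` ends with a separator. The default `'logs/'` does, but the README example `export LOGS_DIR='path/to/logs'` does not. A minimal sketch of a separator-safe alternative, assuming only the standard library; `build_log_path` is a hypothetical helper and not part of this commit:

```python
import os


def build_log_path(logs_dir, file_name='rottoscraper.log'):
    """Join a directory and a file name (hypothetical helper).

    os.path.join inserts a separator only when logs_dir does not
    already end with one, so both 'logs/' and 'path/to/logs' work.
    """
    return os.path.join(logs_dir, file_name)


# Plain concatenation silently glues the names together:
glued = 'path/to/logs' + 'rottoscraper.log'

# The joined form keeps the directory and file name separate:
joined = build_log_path('path/to/logs')
```

With this approach the value passed to `FileHandler` would be the same whether or not the environment variable carries a trailing slash.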

rottoscraper/scraper/aho.py

Lines changed: 1 addition & 1 deletion
@@ -89,7 +89,7 @@ def search_keywords(self, text=None):
 while trans is None:
     # trans=currentNode.GetTransition(text[index])
     for x in currentNode.transitions:
-        if unicode(x.char) == c:
+        if x.char == c:
             trans = x
     if currentNode == self.root:
         break

rottoscraper/scraper/utils.py

Lines changed: 1 addition & 1 deletion
@@ -87,7 +87,7 @@ def get_plain_text(html):
     Return the plain text in utf-8 encoding from a html
     """
     raw_text = nltk.clean_html(html)
-    text = u' '.join(raw_text.split()).encode('utf-8').lower()
+    text = u' '.join(raw_text.split()).lower()
     return text
 
 def get_all_links(html):
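
The two Unicode-related changes in this commit (dropping `unicode(...)` in aho.py and `.encode('utf-8')` in utils.py) appear to belong together: if the plain text stays decoded, the matcher can compare one character at a time without converting on each step. That reading is an inference from the diff, not something the commit message states. A minimal sketch of the difference between counting characters and counting UTF-8 bytes; `normalize` and `unit_counts` are hypothetical names used only for illustration:

```python
# -*- coding: utf-8 -*-


def normalize(raw_text):
    """Collapse runs of whitespace and lowercase, keeping decoded text.

    Hypothetical stand-in for the changed line in get_plain_text.
    """
    return u' '.join(raw_text.split()).lower()


def unit_counts(text):
    """Return (number of characters, number of UTF-8 bytes)."""
    return len(text), len(text.encode('utf-8'))


sample = normalize(u'  Caf\xe9   LATTE ')
chars, utf8_bytes = unit_counts(sample)
```

Here the decoded string has one unit per character, while its UTF-8 form has an extra byte for the accented letter, so a character-by-character matcher sees different sequences depending on which form it is given.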
