Add Dockerfile and README.md to help development #35
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| FROM ubuntu | ||
|
|
||
| RUN apt-get update \ | ||
| && apt-get install -y \ | ||
| apt-utils \ | ||
| curl \ | ||
| wget \ | ||
| nano \ | ||
| libsm6 \ | ||
| libxrender1 \ | ||
| libxext6 \ | ||
| ghostscript \ | ||
| python3-minimal \ | ||
| python3-setuptools \ | ||
| python3-pip \ | ||
| && ln -s /usr/bin/python3 /usr/bin/python \ | ||
| && ln -s /usr/bin/pip3 /usr/bin/pip | ||
|
|
||
| RUN pip install "excalibur-py[dev]" | ||
|
|
||
| EXPOSE 5000 | ||
|
|
||
| ENV LC_ALL=C.UTF-8 | ||
| ENV LANG=C.UTF-8 | ||
|
|
||
| WORKDIR /excalibur/ |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,54 @@ | ||
| <p align="center"> | ||
| <img src="https://raw.githubusercontent.com/camelot-dev/excalibur/master/docs/_static/excalibur-logo.png" width="200"> | ||
| </p> | ||
|
|
||
| # Excalibur: Docker | ||
| This is the Docker configuration that lets you run Excalibur without installing any dependencies on your machine!<br/> | ||
| OK, any except `docker`. | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| As stated, the only thing you need is `docker`. | ||
|
|
||
| Follow the instructions on [Install Docker](https://docs.docker.com/engine/installation/) for your environment if you haven't got `docker` already. | ||
|
|
||
| ## Usage | ||
|
|
||
| ### Prepare the image | ||
|
|
||
| Switch to the `docker` directory and run `docker build -t excalibur .` (don't forget the final `.`) to build your Docker image. This may take some time but is only required once, or again whenever you change something in the `Dockerfile`. | ||
|
|
||
| After the build finishes you have an `excalibur` image, which will be the base for your experiments. You can confirm this by checking the output of the `docker images` command. | ||
|
|
||
| ### Run the container | ||
|
|
||
| From your project folder, run `docker run -it -p 5000:5000 -v "$(pwd)":/excalibur/ excalibur /bin/bash`. | ||
| This starts the container and opens a bash shell inside it. | ||
|
|
||
| At this point you need to initialize the metadata database using: | ||
|
|
||
| <pre> | ||
| $ excalibur initdb | ||
| </pre> | ||
|
|
||
| Once initialized, you need to enable connectivity from outside the container: | ||
|
|
||
| Use nano to open the config file ... | ||
|
|
||
| <pre> | ||
| $ nano /root/excalibur/excalibur.cfg | ||
| </pre> | ||
|
|
||
| ... and set the host in the `[webserver]` section as follows: | ||
|
|
||
| <pre> | ||
| web_server_host = 0.0.0.0 | ||
|
||
| </pre> | ||
|
|
||
| And then start the webserver using: | ||
|
|
||
| <pre> | ||
| $ excalibur webserver | ||
| </pre> | ||
|
|
||
| That's it! Now you can go to http://localhost:5000 and start extracting tabular data from your PDFs. | ||
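
The manual `nano` step in the README could also be scripted. The following is only a sketch, not part of this PR: `set_webserver_host` is a hypothetical helper name, and it assumes the config file already contains a `web_server_host = ...` line that starts at the beginning of a line. The key name and the path in the usage comment are the ones shown in the README.

```shell
#!/bin/sh
# Hypothetical helper (not part of Excalibur): set web_server_host in a config file.
# Usage inside the container: set_webserver_host /root/excalibur/excalibur.cfg 0.0.0.0
set_webserver_host() {
    cfg="$1"
    host="$2"
    tmp="$(mktemp)" || return 1
    # Rewrite only the line that starts with "web_server_host";
    # every other line is copied through unchanged.
    sed "s|^web_server_host[[:space:]]*=.*|web_server_host = ${host}|" "$cfg" > "$tmp" \
        && cat "$tmp" > "$cfg"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Writing the result back with `cat` instead of `mv` keeps the original file's ownership and permissions. If the file has no `web_server_host` line, this sketch leaves it untouched, so editing by hand is still the fallback.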