Lambda layer that makes the popular NLTK Python package available to AWS Lambda functions. This project is set up to build for the python3.7 runtime and to download the punkt and stopwords data to the NLTK_DATA directory (instructions on how to customize this to your needs are below, in Configuration).
Clone/Download the repository and run the following commands:
$ ./bin/bootstrap
$ ./bin/build
./bin/build will create a python-nltk-layer.zip file inside the share folder.
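Before uploading, it can be useful to check what the build put into the archive. The following is a minimal sketch using only the standard library; the helper name `top_level_entries` is hypothetical and not part of this repository. Because Lambda unpacks layers under /opt and this README points NLTK_DATA at /opt/nltk_data, an nltk_data entry at the top level is to be expected; a python directory is the usual Lambda convention for Python libraries, but the exact layout of this archive is an assumption.

```python
import zipfile


def top_level_entries(zip_file):
    """Return the sorted top-level names inside a zip archive.

    zip_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_file) as archive:
        return sorted({name.split("/", 1)[0] for name in archive.namelist()})


# Example call, using the path produced by the build step:
#   top_level_entries("share/python-nltk-layer.zip")
```

If the expected directories are missing, the build step is the place to look first.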
- Change the Python runtime version your project needs in the Dockerfile. E.g. if you need to build for Python 3.8, search and replace all occurrences of 3.7 with 3.8 in the Dockerfile.
- Add/update the instructions for downloading the NLTK data you need. E.g. if you need the NLTK brown corpus instead of stopwords, you can change the stopwords line to
  RUN python -W ignore -m nltk.downloader brown -d /build/nltk_data
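If several data packages are needed, the NLTK downloader's command line accepts more than one package id in a single invocation. The sketch below composes such a Dockerfile line; the helper `downloader_run_line` is hypothetical and only formats text. It assumes the same flags and target directory as the line shown above, and whether this project's Dockerfile uses one line per package or a single combined line is not something this sketch knows.

```python
def downloader_run_line(packages, target_dir="/build/nltk_data"):
    """Compose a Dockerfile RUN line that downloads the given NLTK packages."""
    if not packages:
        raise ValueError("at least one NLTK package id is required")
    return "RUN python -W ignore -m nltk.downloader {} -d {}".format(
        " ".join(packages), target_dir
    )


print(downloader_run_line(["punkt", "brown"]))
# prints: RUN python -W ignore -m nltk.downloader punkt brown -d /build/nltk_data
```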
You can create a layer in your AWS account in one of two ways:
- You can upload the zip file directly in the AWS Console; a screenshot of how to do that is below.
Or
- Assuming you have the AWS CLI set up (more info here), you can run ./bin/deploy to publish the Lambda layer.
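For readers who prefer to publish from Python, here is a rough sketch using boto3's `publish_layer_version`. It is not a description of what ./bin/deploy does internally. The function names and the layer name passed in are hypothetical, and the sketch assumes boto3 is installed and AWS credentials are configured in the same way as for the AWS CLI.

```python
from pathlib import Path


def publish_layer_kwargs(zip_path, layer_name, runtime="python3.7"):
    """Build the keyword arguments for boto3's publish_layer_version call."""
    return {
        "LayerName": layer_name,
        "Content": {"ZipFile": Path(zip_path).read_bytes()},
        "CompatibleRuntimes": [runtime],
    }


def publish_layer(zip_path, layer_name, runtime="python3.7"):
    """Publish the zip as a new layer version and return its ARN."""
    # Imported here so the helper above can be used without boto3 installed.
    import boto3

    client = boto3.client("lambda")
    response = client.publish_layer_version(
        **publish_layer_kwargs(zip_path, layer_name, runtime)
    )
    return response["LayerVersionArn"]


# Example call (layer name is a placeholder):
#   publish_layer("share/python-nltk-layer.zip", "my-nltk-layer")
```

The returned ARN is what a function's layer configuration refers to.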
- Configure your Lambda function to use the layer you published above.
- Because the NLTK data is set up manually, you need to set the NLTK_DATA=/opt/nltk_data environment variable for your Lambda function.
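With the layer attached and NLTK_DATA set, a function can use NLTK as usual. The handler below is a minimal sketch, not code from this repository: the event shape (a "text" key) and the helper `remove_stopwords` are assumptions made for illustration, and it assumes an NLTK version whose `word_tokenize` reads the punkt data.

```python
import os

# Assumption: NLTK_DATA is set on the function as described above. The default
# here only mirrors that value and must be in place before nltk is imported.
os.environ.setdefault("NLTK_DATA", "/opt/nltk_data")


def remove_stopwords(tokens, stopword_set):
    """Drop tokens whose lowercase form appears in stopword_set."""
    return [t for t in tokens if t.lower() not in stopword_set]


def handler(event, context):
    # nltk comes from the layer; it is imported here so this module can still
    # be loaded in environments where the layer is not attached.
    from nltk.corpus import stopwords
    from nltk.tokenize import word_tokenize

    text = event.get("text", "")
    tokens = word_tokenize(text)  # relies on the punkt data
    english = set(stopwords.words("english"))  # relies on the stopwords corpus
    return {"tokens": remove_stopwords(tokens, english)}
```

If the environment variable is missing or points elsewhere, NLTK typically raises a LookupError listing the directories it searched, which is a quick way to spot a misconfigured function.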
Bug reports and pull requests are welcome on GitHub at https://github.com/customink/lambda-python-nltk-layer. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.
This project is available as open source under the terms of the MIT License.