diff --git a/README.md b/README.md
index 32b20fb..c36063a 100755
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 [![Build Status](https://travis-ci.org/LexPredict/lexpredict-lexnlp.svg?branch=master)](https://travis-ci.org/LexPredict/lexpredict-lexnlp) [![Coverage Status](https://coveralls.io/repos/github/LexPredict/lexpredict-lexnlp/badge.svg?branch=master)](https://coveralls.io/github/LexPredict/lexpredict-lexnlp?branch=0.1.8) [![](https://tokei.rs/b1/github/lexpredict/lexpredict-lexnlp?category=code)](https://github.com/lexpredict/lexpredict-lexnlp) [![Docs](https://readthedocs.org/projects/lexpredict-lexnlp/badge/?version=docs-0.1.6)](http://lexpredict-lexnlp.readthedocs.io/en/docs-0.1.6/)
-# LexNLP by LexPredict
+# LexNLP by LexPredict - ARM-specific readme
 ## Information retrieval and extraction for real, unstructured legal text
 LexNLP is a library for working with real, unstructured legal text, including contracts, plans, policies, procedures,
 and other material.
@@ -27,50 +27,43 @@ and other material.
 * Documentation: http://lexpredict-lexnlp.readthedocs.io/en/latest/ (in progress)
 * Contact: support@contraxsuite.com
-## Structure
-* ContraxSuite web application: https://github.com/LexPredict/lexpredict-contraxsuite
-* LexNLP library for extraction: https://github.com/LexPredict/lexpredict-lexnlp
-* ContraxSuite pre-trained models and "knowledge sets": https://github.com/LexPredict/lexpredict-legal-dictionary
-* ContraxSuite agreement samples: https://github.com/LexPredict/lexpredict-contraxsuite-samples
-* ContraxSuite deployment automation: https://github.com/LexPredict/lexpredict-contraxsuite-deploy
-Please note that ContraxSuite installations generally require trained models or knowledge sets for usage.
-
 ## Licensing
 LexNLP is available under a dual-licensing model.
By default, this library can be used under AGPLv3 terms as detailed in the repository LICENSE file; however, organizations can request a release from the AGPL terms or a non-GPL evaluation license by contacting ContraxSuite Licensing at <>.
-## Requirements
-* Python 3.8
+## Requirements for `arm64`/Mac M1
+* Python 3.12
 * pipenv
+* conda
+
+## Installation on `arm64`/Mac M1
+
+Some of the required packages (especially those related to `scikit-learn`) fail to build wheels when `LexNLP` is installed on an M1 Mac.
+To install the package successfully, either use pre-compiled packages or install `scikit-learn` from a `conda-forge` installation that includes a C/C++ compiler (for example, [miniforge](https://github.com/conda-forge/miniforge#miniforge)). The pre-compiled packages are available in [this repo](https://github.com/coaxsoft/arm_python_packages/tree/master/wheels).
+
+As of October 23, 2024, `scikit-learn` natively supports ARM. However, the build failure persisted until I created a conda environment with the steps below.
+
+Furthermore, LexNLP's dependency pins are strict and cause installation to fail. This repo contains modified versions of `setup.py` and `python-requirements.txt` that (hopefully) address this problem. WARNING: I have extensively tested only the scripts related to definitions. The new installation instructions may break other LexNLP functions.
+
+### Environment setup
+
+Download this directory (don't install it yet) and move it wherever you prefer. Create a new conda environment with `scikit-learn` specified:
+
+    conda create -n myenv -c conda-forge scikit-learn
+
+where `myenv` is your environment name.
+
+Activate the environment and install pip into it:
+
+    conda activate myenv
+    conda install pip
+
+Navigate to the downloaded source directory and run
+
+    pip install .
+
+which runs the modified `setup.py` with the modified requirements.
-## Releases
-* 2.3.0: November 30, 2022 - Twenty sixth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/2.3.0)
-* 2.2.1.0: August 10, 2022 - Twenty fifth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/2.2.1.0)
-* 2.2.0: July 7, 2022 - Twenty fourth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/2.2.0)
-* 2.1.0: September 16, 2021 - Twenty third scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/2.1.0)
-* 2.0.0: May 10, 2021 - Twenty second scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/2.0.0)
-* 1.8.0: December 2, 2020 - Twenty first scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/1.8.0)
-* 1.7.0: August 27, 2020 - Twentieth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/1.7.0)
-* 1.6.0: May 27, 2020 - Nineteenth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/1.6.0)
-* 1.4.0: December 20, 2019 - Eighteenth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/1.4.0)
-* 1.3.0: November 1, 2019 - Seventeenth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/1.3.0)
-* 0.2.7: August 1, 2019 - Sixteenth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.2.7)
-* 0.2.6: June 12, 2019 - Fifteenth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.2.6)
-* 0.2.5: March 1, 2019 - Fourteenth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.2.5)
-* 0.2.4: February 1, 2019 - Thirteenth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.2.4)
-* 0.2.3: Junuary 10, 2019 - Twelfth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.2.3)
-* 0.2.2: September 30, 2018 - Eleventh scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.2.2)
-* 0.2.1: August 24, 2018 - Tenth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.2.1)
-* 0.2.0: August 1, 2018 - Ninth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.2.0)
-* 0.1.9: July 1, 2018 - Ninth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.1.9)
-* 0.1.8: May 1, 2018 - Eighth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.1.8)
-* 0.1.7: April 1, 2018 - Seventh scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.1.7)
-* 0.1.6: March 1, 2018 - Sixth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.1.6)
-* 0.1.5: February 1, 2018 - Fifth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.1.5)
-* 0.1.4: January 1, 2018 - Fourth scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.1.4)
-* 0.1.3: December 1, 2017 - Third scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.1.3)
-* 0.1.2: November 1, 2017 - Second scheduled public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.1.2)
-* 0.1.1: October 2, 2017 - Bug fix release for 0.1.0; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.1.1)
-* 0.1.0: September 30, 2017 - First public release; [code](https://github.com/LexPredict/lexpredict-lexnlp/tree/0.1.0)
+This should successfully install the package. You may have to downgrade `setuptools` to ~=70.0.0.
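Once the install steps from the README changes above have been run, a quick sanity check can show whether the interpreter is running natively on ARM and whether the key packages can be found. The following is a minimal sketch and is not part of this change set: the helper name `check_install` and its default package list are assumptions chosen for illustration.

```python
import importlib.util
import platform


def check_install(packages=("sklearn", "scipy", "lexnlp")):
    """Report the CPU architecture and whether each package can be found.

    Hypothetical helper: the default package list is an assumption,
    adjust it to the modules you actually rely on.
    """
    report = {"machine": platform.machine()}
    for name in packages:
        # find_spec returns None when a top-level package is not installed
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in check_install().items():
        print(f"{key}: {value}")
```

On an M1 Mac with a native ARM interpreter, `machine` is typically reported as `arm64`; a value of `x86_64` suggests the interpreter is running under Rosetta, in which case the conda environment was probably not created with an ARM build.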
diff --git a/python-requirements.txt b/python-requirements.txt
index 04f69dd..fc35564 100755
--- a/python-requirements.txt
+++ b/python-requirements.txt
@@ -17,7 +17,7 @@ elastic-transport==8.4.0
 elasticsearch==8.5.0
 exceptiongroup==1.0.4
 filelock==3.8.0
-gensim==4.1.2
+gensim==4.3.2
 idna==3.4
 imagesize==1.4.1
 importlib-metadata==5.0.0
@@ -27,7 +27,7 @@ jellyfish==0.6.1
 Jinja2==3.1.2
 joblib==1.2.0
 lazy-object-proxy==1.8.0
-lxml==4.9.1
+lxml==5.3.0
 Markdown==3.4.1
 MarkupSafe==2.1.1
 mccabe==0.7.0
@@ -35,7 +35,6 @@ memory-profiler==0.60.0
 nltk==3.7
 nose==1.3.7
 num2words==0.5.12
-numpy==1.23.4
 packaging==21.3
 pandas==1.5.1
 pipenv==2022.11.11
@@ -53,8 +52,7 @@ pytz-deprecation-shim==0.1.0.post0
 regex==2022.3.2
 reporters-db==3.2.32
 requests==2.28.1
-scikit-learn==0.24.0
-scipy==1.9.3
+scipy>=1.9.3
 six==1.16.0
 smart-open==6.2.0
 snowballstemmer==2.2.0
diff --git a/setup.py b/setup.py
index f3a0032..bb6cc41 100755
--- a/setup.py
+++ b/setup.py
@@ -89,31 +89,88 @@
 # requirements files see:
 # https://packaging.python.org/en/latest/requirements.html
 install_requires=[
- 'beautifulsoup4==4.11.1',
- 'cloudpickle==2.2.0',
- 'dateparser==1.1.3',
- 'elasticsearch==8.5.0',
- 'gensim==4.1.2',
- 'importlib-metadata==5.0.0',
- 'joblib==1.2.0',
- 'lxml==4.9.1',
- 'nltk==3.7',
- 'num2words==0.5.12',
- 'numpy==1.23.4',
- 'pandas==1.5.1',
- 'psutil==5.9.4',
- 'pycountry==22.3.5',
- 'python-dateutil==2.8.2',
- 'regex==2022.3.2',
- 'reporters-db==3.2.32',
- 'requests==2.28.1',
- 'scikit-learn==0.24',
- 'scipy==1.9.3',
- 'tqdm==4.64.1',
- 'Unidecode==1.3.6',
- 'us==2.0.2',
- 'zahlwort2num==0.4.2'
- ],
+ "alabaster==0.7.12",
+ "astroid==2.12.12",
+ "attrs==22.1.0",
+ "Babel==2.11.0",
+ "beautifulsoup4==4.11.1",
+ "certifi==2022.9.24",
+ "charset-normalizer==2.1.1",
+ "click==8.1.3",
+ "cloudpickle==2.2.0",
+ "coverage==6.5.0",
+ "dateparser==1.1.3",
+ "dill==0.3.6",
+ "distlib==0.3.6",
+ "docopt==0.6.2",
+ "docutils==0.19",
+ "elastic-transport==8.4.0",
+ "elasticsearch==8.5.0",
+ "exceptiongroup==1.0.4",
+ "filelock==3.8.0",
+ "gensim==4.3.2",
+ "idna==3.4",
+ "imagesize==1.4.1",
+ "importlib-metadata==5.0.0",
+ "iniconfig==1.1.1",
+ "isort==5.10.1",
+ "jellyfish==0.6.1",
+ "Jinja2==3.1.2",
+ "joblib==1.2.0",
+ "lazy-object-proxy==1.8.0",
+ "lxml==5.3.0",
+ "Markdown==3.4.1",
+ "MarkupSafe==2.1.1",
+ "mccabe==0.7.0",
+ "memory-profiler==0.60.0",
+ "nltk==3.7",
+ "nose==1.3.7",
+ "num2words==0.5.12",
+ "packaging==21.3",
+ "pandas==1.5.1",
+ "pipenv==2022.11.11",
+ "platformdirs==2.5.4",
+ "pluggy==1.0.0",
+ "psutil==5.9.4",
+ "pycountry==22.3.5",
+ "Pygments==2.13.0",
+ "pylint==2.15.5",
+ "pyparsing==3.0.9",
+ "pytest==7.2.0",
+ "python-dateutil==2.8.2",
+ "pytz==2022.6",
+ "pytz-deprecation-shim==0.1.0.post0",
+ "regex==2022.3.2",
+ "reporters-db==3.2.32",
+ "requests==2.28.1",
+ "scipy>=1.9.3",
+ "six==1.16.0",
+ "smart-open==6.2.0",
+ "snowballstemmer==2.2.0",
+ "soupsieve==2.3.2.post1",
+ "Sphinx==5.3.0",
+ "sphinxcontrib-applehelp==1.0.2",
+ "sphinxcontrib-devhelp==1.0.2",
+ "sphinxcontrib-htmlhelp==2.0.0",
+ "sphinxcontrib-jsmath==1.0.1",
+ "sphinxcontrib-qthelp==1.0.3",
+ "sphinxcontrib-serializinghtml==1.1.5",
+ "threadpoolctl==3.1.0",
+ "tomli==2.0.1",
+ "tomlkit==0.11.6",
+ "tqdm==4.64.1",
+ "tzdata==2022.6",
+ "tzlocal==4.2",
+ "Unidecode==1.3.6",
+ "urllib3==1.26.12",
+ "us==2.0.2",
+ "virtualenv==20.16.7",
+ "virtualenv-clone==0.5.7",
+ "wrapt==1.14.1",
+ "zahlwort2num==0.4.2",
+ "zipp==3.10.0"
+]
+,
 # Install any data files from packages.
 # The data files must be specified via the distutils’ MANIFEST.in file.