
Commit 4811cf6

docs
1 parent 426580d commit 4811cf6

File tree

3 files changed

+150
-20
lines changed

docs/source/index.rst

Lines changed: 3 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -27,16 +27,15 @@ The library is designed to be easy to use, with all functions available directly
2727
:maxdepth: 2
2828
:caption: Contents:
2929

30-
modules/installation
3130
modules/getting_started
3231
modules/examples
3332

34-
API Reference
35-
============
33+
Module Reference
34+
=================
3635

3736
.. toctree::
3837
:maxdepth: 4
39-
:caption: API Documentation:
38+
:caption: Module Documentation:
4039

4140
api/peptacular
4241

Lines changed: 147 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,152 @@
11
Getting Started
22
===============
33

4+
**peptacular** is an extremely lightweight package with only one dependency: ``regex``.
5+
6+
It contains functions for parsing and working with ProForma 2.0-compliant peptide and protein sequences.
7+
8+
Installation
9+
------------
10+
11+
**From pip**:
12+
13+
.. code-block:: bash
14+
15+
pip install peptacular
16+
17+
**From source**:
18+
19+
.. code-block:: bash
20+
21+
git clone https://github.com/pgarrett-scripps/peptacular.git
22+
cd peptacular
23+
pip install .
24+
25+
426
Basic Usage
5-
----------
27+
-----------
28+
29+
All modules and functions in **peptacular** are available under the ``peptacular`` namespace, but the recommended import is:
30+
31+
.. code-block:: python
32+
33+
import peptacular as pt
34+
35+
**ProForma** sequence parsing in **peptacular** is lazy, meaning only the required notation is
36+
validated during parsing. Invalid modifications are not checked until they are explicitly needed,
37+
such as when calculating mass or composition.
38+
39+
**For example:**
40+
41+
.. code-block:: python
42+
43+
pt.pop_mods('PEP[INVALID]TIDE') # Successfully runs
44+
pt.mass('PEP[INVALID]TIDE') # Raises an error due to the invalid modification
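
The lazy behaviour described above can be illustrated with a small standard-library sketch. This is not peptacular's implementation: ``strip_mods``, ``mod_mass``, ``total_mod_mass`` and the ``KNOWN_MODS`` table are hypothetical names invented for this illustration. The point is only that removing a bracketed modification needs no knowledge of what it means, while computing a mass forces the modification to be resolved and therefore validated.

```python
import re

# Hypothetical lookup table for this sketch only (monoisotopic mass deltas).
KNOWN_MODS = {"Oxidation": 15.994915, "Phospho": 79.966331}

# Matches one bracketed modification and captures its contents.
BRACKET = re.compile(r"\[([^\]]*)\]")


def strip_mods(sequence: str) -> str:
    """Remove bracketed modifications without inspecting their contents."""
    return BRACKET.sub("", sequence)


def mod_mass(mod: str) -> float:
    """Resolve a modification to a mass delta; validation happens here."""
    try:
        return float(mod)  # numeric mods such as '+100' or '-17.03'
    except ValueError:
        pass
    if mod in KNOWN_MODS:
        return KNOWN_MODS[mod]
    raise ValueError(f"Unknown modification: {mod!r}")


def total_mod_mass(sequence: str) -> float:
    """Sum the mass deltas of every bracketed modification in the sequence."""
    return sum(mod_mass(m) for m in BRACKET.findall(sequence))
```

With this sketch, ``strip_mods('PEP[INVALID]TIDE')`` succeeds because the bracket contents are never interpreted, whereas ``total_mod_mass('PEP[INVALID]TIDE')`` raises ``ValueError``.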
45+
46+
47+
Key Features
48+
------------
49+
50+
**peptacular** is fully compliant with **ProForma 2.0** and includes functions for:
51+
52+
- **Digestion:** Perform single, multi, or sequential digestion.
53+
- **Fragmentation:** Generate internal, terminal, immonium, and neutral loss fragments.
54+
- **Mass and Composition:** Calculate mass, m/z, or elemental composition of peptides.
55+
- **Modifications:** Apply or remove static and variable modifications.
56+
- **Parsing and Serializing:** Handle ProForma 2.0-compliant sequence parsing and serialization.
57+
- **Isotopic Distributions:** Simulate isotopic patterns.
58+
- **Scoring:** Compare theoretical fragments against experimental spectra.
59+
60+
See the **Examples** section for more detailed use cases.
61+
62+
63+
Sequence Handling
64+
-----------------
65+
66+
All functions in ``pt.sequence`` accept peptide sequences as strings but internally convert them to
67+
**ProformaAnnotation** objects. After processing, they are converted back to strings.
68+
69+
When applying multiple sequence operations on the same peptide, it is more efficient to first convert the
70+
sequence to a **ProformaAnnotation** and use its methods directly.
71+
72+
**Example: converting between ProformaAnnotation and str:**
73+
74+
.. code-block:: python
75+
76+
import peptacular as pt
77+
78+
annot = pt.parse('PEPTIDE')
79+
seq = pt.serialize(annot) # or annot.serialize()
80+
assert 'PEPTIDE' == seq
81+
82+
83+
- ``pt.parse`` returns either a ``ProformaAnnotation`` or a ``MultiProformaAnnotation`` object.
84+
- ``ProformaAnnotation`` is used for single, linear sequences (the most common use case).
85+
- ``MultiProformaAnnotation`` handles crosslinked or multiple sequences.
86+
87+
**Crosslinked and Multiple Sequences:**
88+
89+
- **Crosslinked**: ``{sequence1}//{sequence2}``
90+
- **Disconnected**: ``{sequence1}+{sequence2}``
91+
92+
``MultiProformaAnnotation`` contains a list of individual ``ProformaAnnotation`` objects along with their
93+
connection type.
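
Splitting a multi-sequence string into its individual sequences takes some care, because ``+`` also appears inside mass modifications such as ``[+100]``. The sketch below shows one way to split only at the top level. It is an illustration rather than peptacular's parser: ``split_top_level`` is a hypothetical helper, it assumes the ``//`` crosslink separator from the ProForma 2.0 specification, and it does not handle charge-state suffixes.

```python
def split_top_level(sequence: str, separator: str) -> list[str]:
    """Split on `separator` only where it is outside [], {}, () and <>."""
    parts = []
    depth = 0   # current bracket nesting depth
    start = 0   # start index of the piece being collected
    i = 0
    while i < len(sequence):
        ch = sequence[i]
        if ch in "[{(<":
            depth += 1
        elif ch in "]})>":
            depth -= 1
        elif depth == 0 and sequence.startswith(separator, i):
            parts.append(sequence[start:i])
            i += len(separator)
            start = i
            continue
        i += 1
    parts.append(sequence[start:])
    return parts


# The '+' inside [+100] is ignored; only the top-level '+' splits.
disconnected = split_top_level("PEP[+100]TIDE+PEPTIDE", "+")
crosslinked = split_top_level("PEPTIDE//PEPTIDE", "//")
```

Each resulting piece could then be parsed on its own as a single linear sequence.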
94+
95+
96+
ProForma Notation
97+
----------------------
98+
99+
**ProForma 2.0** was introduced by the **Proteomics Standards Initiative (PSI)** to standardize the representation of peptide sequences, including modifications.
100+
101+
- 📄 **Reference Paper:** `ProForma 2.0 Specification <https://pubs.acs.org/doi/10.1021/acs.jproteome.1c00771>`_
102+
- 📚 **Latest Specification:** `ProForma 2.0 GitHub <https://github.com/HUPO-PSI/ProForma/tree/master/SpecDocument>`_
103+
104+
105+
**Basic Syntax Overview**
106+
107+
- **N-terminal:** ``[+100]-PEPTIDE``
108+
- **C-terminal:** ``PEPTIDE-[+100]``
109+
- **Internal:** ``PEPT[+100]IDE``
110+
- **Global:** ``<[+100]@C>PEPTIDE`` or ``<[+100]@C,P>PEPTIDE``
111+
- **Isotope:** ``<13C>PEPTIDE`` or ``<15N><13C>PEPTIDE``
112+
- **Labile:** ``{+100}PEPTIDE``
113+
114+
Global, isotope, and labile modifications are written before the N-terminal modification, or before the first residue if no N-terminal modification is present.
115+
116+
**Combined Example:**
117+
118+
.. code-block:: python
119+
120+
pt.parse('<[+20]@C><13C>{+75}[-40]-PEPT[+50]IDE-[+200]')
121+
122+
# Returns
123+
ProFormaAnnotation(
124+
sequence='PEPTIDE',
125+
isotope_mods=[Mod('13C', 1)],
126+
static_mods=[Mod('[+20]@C', 1)],
127+
labile_mods=[Mod(75, 1)],
128+
nterm_mods=[Mod(-40, 1)],
129+
cterm_mods=[Mod(200, 1)],
130+
internal_mods={3: [Mod(50, 1)]}
131+
)
132+
133+
**Specifying ProForma Modifications**
134+
135+
The ``Mod`` object contains:
136+
- The **modification string**
137+
- The **number of times** it is applied
138+
139+
You can apply **multiple modifications** at the same position by adding them sequentially:
140+
- ``[+100][+30]`` → Two separate modifications
141+
- ``[+100]^2`` → The same modification applied twice
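
As a rough illustration of the two forms above, the following standard-library sketch reads a run of bracketed modifications into ``(modification string, count)`` pairs, mirroring the two pieces of information a ``Mod`` holds. The function name ``parse_mods`` and the regular expression are assumptions made for this example, not part of peptacular.

```python
import re

# One bracketed modification, optionally followed by ^N (apply N times).
MOD_PATTERN = re.compile(r"\[([^\]]+)\](?:\^(\d+))?")


def parse_mods(text: str) -> list[tuple[str, int]]:
    """Return (modification string, count) pairs; count defaults to 1."""
    return [
        (body, int(count) if count else 1)
        for body, count in MOD_PATTERN.findall(text)
    ]


two_separate = parse_mods("[+100][+30]")  # two different modifications
same_twice = parse_mods("[+100]^2")       # one modification, applied twice
```

Here ``two_separate`` holds two entries with a count of 1 each, while ``same_twice`` holds a single entry with a count of 2.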
142+
143+
**Modification Types**
144+
145+
- **Mass-based:** ``[+100]``, ``[100]``, ``[-100]``
146+
- **Chemical formula:** ``[Formula:C12H20O2]``
147+
- **UNIMOD:** ``[Oxidation]``, ``[UNIMOD:21]``, ``[U:21]``
148+
- **PSI-MOD:** ``[L-methionine sulfoxide]``, ``[MOD:00046]``, ``[M:00046]``
149+
- **RESID:** ``[R:L-methionine (R)-sulfoxide]``, ``[RESID:AA0037]``
150+
- **GNO:** ``[GNO:G02815KT]``
6151

7-
TODO
152+
While the prefixes for UNIMOD and PSI-MOD names (``U:`` and ``M:``) are not required, it is still recommended to use them.
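
One reason the prefixes help is that a prefixed modification can be classified from its text alone, while a bare name such as ``Oxidation`` has to be looked up before its source is known. The sketch below shows that idea using only the prefixes listed above. ``classify_mod`` and the ``PREFIXES`` table are hypothetical, and the case-insensitive prefix matching is an assumption of this example.

```python
import re

# Prefixes taken from the list above, mapped to a category label.
PREFIXES = {
    "FORMULA": "formula",
    "UNIMOD": "unimod",
    "U": "unimod",
    "MOD": "psi-mod",
    "M": "psi-mod",
    "RESID": "resid",
    "R": "resid",
    "GNO": "gno",
}

MASS_PATTERN = re.compile(r"[+-]?\d+(\.\d+)?")


def classify_mod(mod: str) -> str:
    """Classify the text inside a modification bracket by its prefix."""
    if MASS_PATTERN.fullmatch(mod):
        return "mass"
    prefix, sep, _ = mod.partition(":")
    if not sep:
        # No prefix: the name is ambiguous until it is looked up.
        return "name"
    return PREFIXES.get(prefix.upper(), "unknown")
```

For example, ``classify_mod('U:21')`` gives ``'unimod'`` immediately, whereas ``classify_mod('Oxidation')`` can only report ``'name'``.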

docs/source/modules/installation.rst

Lines changed: 0 additions & 14 deletions
This file was deleted.
