The x-ray_scripting_out pipeline processes X-ray Absorption Spectroscopy (XAS) output data generated by the ORCA quantum chemistry software. Its primary purpose is to leverage the force oscillator strength and transition density matrices to derive the core-virtual coupling molecular orbitals (MOs) represented as matrices.
The updated output format integrates transition intensity from the matrix densities with their force oscillator strengths to build the matrices that describe MOs in the core and virtual spaces. These matrices encapsulate the core-virtual coupling MOs, as exemplified below:
1. Number of transition intensities
2. Transition intensity probability
3. Force oscillator strength
The pipeline is implemented in Shell script and is best suited for a Linux operating system.
To execute the pipeline, you can use either manager.sh or overall.sh.
The input required to run this pipeline is an XAS output file from ORCA, generated using either ROCIS/DFT or PNO-ROCIS/DFT. This input file must include the molecular orbital (MO) Löwdin population and the standard format of the transition intensities and probabilities for each excited state, along with a list of coupling MOs.
You can specify a localized group of atoms involved in the coupling MO transitions, allowing for focused analysis of transitions between two sets of atoms, such as two amino acids in a protein.
Clone the x-ray_scripting_out repository using Git
$ git clone https://github.com/caraortizmah/x-ray_scripting_out.git
The pipeline can be run in two ways: a simpler, more automated approach using helper_man.sh, or a more customizable option with manager.sh.
- manager.sh: This is the primary script. It executes all the pipeline steps in sequential order, as indicated by their step-specific names.
- helper_man.sh: This provides an easier method by reading the required parameters from a separate file named config.info.
Run the following command:
$ ./helper_man.sh
helper_man.sh uses the information in config.info to execute manager.sh.
Read further about the config.info file
The config.info file is self-explanatory, formatted as a two-column table (NAME and FLAG).
The NAME column describes the parameter, option, or condition, while the FLAG column specifies the values that manager.sh will directly apply to the ORCA outputs.
Please do not alter the file format, such as lines, dashes, or naming conventions. Additionally, do not rename any entry in the NAME column or the NAME and FLAG column headers; change only the values in the FLAG column.
The following parameters are MANDATORY:
- Atom_number_range_A
- Atom_number_range_B
- core_MO_range
- exc_state_range
- soc_option
- orca_output
The following parameters are OPTIONAL:
- spectra_option
- external_MO_file
- atm_core
- wave_f_type
- input_path
- output_path
- Atom_number_range_A and Atom_number_range_B: Specify the range of sequential atom numbers, as they appear in the coordinates of the XAS ORCA output file (orca_output). Note that the enumeration starts from 0 for the first atom.
  - Atom_number_range_A: Atoms of the core space.
  - Atom_number_range_B: Atoms of the virtual space.
- core_MO_range: Defines the range of core molecular orbitals (MOs) for the target atom, e.g., C. To study specific core MOs, such as 4 and 15, run the pipeline separately for each, setting core_MO_range = 4-4 for one and core_MO_range = 15-15 for the other. If core_MO_range = 4-15 is specified, the program processes the entire sequential range, following the same logic as the atom number range flags (Atom_number_range_A and Atom_number_range_B).
- exc_state_range: Specifies the range of excited states to analyze, based on those computed in orca_output. It follows the same format as core_MO_range, Atom_number_range_A, and Atom_number_range_B.
- soc_option: Accepts 0 or 1, where 0 excludes spin-orbit coupling effects and 1 includes them (e.g., for sulfur L-edge analysis).
- orca_output: Refers to the XAS ORCA output file, compatible with ORCA versions 4 and 5.0.4. Note that ORCA 6.0 introduces a substantially different output format, which will be supported in a future update.
- spectra_option (optional): Accepts 0 or 1. Default is 0 (recommended). Option 1 allows advanced analysis (beta), particularly for soc_option = 1, though 0 is still advised unless further testing is conducted.
- external_MO_file (optional): An ORCA file containing Löwdin population data. Ensure that the ORCA input includes the flag !Normalprint to output Löwdin populations. This parameter allows the Löwdin population data to be kept separate from the orca_output file. Read more about ORCA input.
- atm_core (optional): Atomic symbol of the target atom, e.g., C, O, N, P, S. Default is C.
- wave_f_type (optional): Specifies the type of core MO, such as s or p. Default is s.
- input_path (optional): Absolute path to the directory containing ORCA output files (inputs for the pipeline).
- output_path (optional): Absolute path to the directory where the pipeline will save results (outputs).
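Because atom enumeration starts from 0, it is easy to be off by one when choosing Atom_number_range_A. The sketch below lists the zero-based numbers of all atoms of one element. It is not part of the pipeline: atom_indices is a hypothetical helper, and it assumes you have the geometry as a standard XYZ file (atom count on line 1, comment on line 2, then one "Symbol x y z" line per atom) with the atoms in the same order as in the ORCA calculation.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the pipeline).
# Prints the zero-based number of every atom of a given element in an XYZ file.
# Assumption: standard XYZ layout, same atom order as in the ORCA input.
atom_indices() {
  # $1 = element symbol (e.g. S), $2 = path to the XYZ file
  # The first atom sits on line 3, so its zero-based number is NR - 3.
  awk -v el="$1" 'NR > 2 && NF >= 4 && $1 == el { print NR - 3 }' "$2"
}

# Example call (molecule.xyz is a placeholder name):
#   atom_indices S molecule.xyz
```

The printed numbers can then be used to pick the limits of Atom_number_range_A by hand.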
- The file config.info must retain the name config.info. :)
- Sequential ranges (Atom_number_range_A, Atom_number_range_B, core_MO_range, and exc_state_range) should be specified with numbers joined by a dash (-) without spaces (e.g., 4-15).
- To analyze the full set of computed excited states, replace the range with the word none (without quotes).
- soc_option defaults to 0. It is recommended to explicitly set all FLAG values, even default ones like 0.
- spectra_option defaults to 0.
- external_MO_file can be left empty, in which case the pipeline assumes that Löwdin populations are included in the orca_output.
- atm_core defaults to C.
- wave_f_type defaults to s.
- It is highly recommended to use absolute paths for input_path and output_path.
- If input_path is not provided, the pipeline will attempt to use its current execution location to find the orca_output and external_MO_file (if applicable).
- If output_path is not specified, the pipeline will place the results in its execution location. The results will be saved in the output_path under a newly created folder named "orca_output_out" (e.g., output_path/orca_output_out/). A reduced version for subsequent analysis will be placed in a new directory: output_path/pop_matrices/orca_output_csv/.
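The range format described in these notes (two numbers joined by a dash, no spaces) can be checked before the pipeline is run. The following is a minimal sketch, not code from the repository: parse_range is a hypothetical helper, and the rule that the first number must not exceed the second is an assumption based on the ranges being sequential. It does not handle the special word none.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the pipeline).
# Validates a "start-end" range such as 4-15 and prints "start end".
parse_range() {
  case "$1" in
    *[!0-9-]*|-*|*-|*-*-*)
      # Any character other than digits and dash, a leading or trailing dash,
      # or more than one dash.
      echo "invalid range: '$1' (expected e.g. 4-15)" >&2
      return 1 ;;
    *-*)
      ;; # digits, one dash, digits
    *)
      # No dash at all (a single number or an empty string).
      echo "invalid range: '$1' (expected e.g. 4-15)" >&2
      return 1 ;;
  esac
  # Assumption: a sequential range runs from a lower to a higher (or equal) number.
  if [ "${1%-*}" -gt "${1#*-}" ]; then
    echo "invalid range: '$1' (start is greater than end)" >&2
    return 1
  fi
  echo "${1%-*} ${1#*-}"
}
```

For example, `parse_range 4-15` prints `4 15`, while `parse_range "4 - 15"` fails with an error message on standard error.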
I recommend reading the information about the config.info file first.
To run the pipeline, use the following command:
$ ./manager.sh $1 $2 $3 $4 $5 $6 $7 $8 $9 $10 $11 $12 $13 $14 $15
Where:
- $1: Initial number of Atom_number_range_A
- $2: Final number of Atom_number_range_A
- $3: Initial number of Atom_number_range_B
- $4: Final number of Atom_number_range_B
- $5: Initial number of core_MO_range
- $6: Final number of core_MO_range
- $7: soc_option
- $8: orca_output
- $9: exc_state_range
- $10: spectra_option
- $11: atm_core
- $12: wave_f_type
- $13: external_MO_file
- $14: input_path
- $15: output_path
Please note that you cannot leave any field empty; otherwise, the subsequent field (parameter) will be interpreted as the missing option for the previous one.
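Since a missing or empty field shifts every later argument by one position, it can help to check the argument list before handing it to manager.sh. The sketch below is one possible guard, not part of the repository: check_manager_args is a hypothetical helper that only verifies there are exactly 15 non-empty arguments and does not validate their values.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the pipeline).
# Succeeds only when it receives exactly 15 non-empty arguments.
check_manager_args() {
  if [ "$#" -ne 15 ]; then
    echo "expected 15 arguments, got $#" >&2
    return 1
  fi
  position=1
  for arg in "$@"; do
    if [ -z "$arg" ]; then
      echo "argument number $position is empty" >&2
      return 1
    fi
    position=$((position + 1))
  done
  return 0
}

# Possible use inside a wrapper script:
#   check_manager_args "$@" && ./manager.sh "$@"
```

Quoting "$@" in the wrapper keeps each argument intact, so paths that contain spaces are still passed as single fields.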
The provided example of the config.info file serves as a template, where only the second column (the flags) should be modified to suit your analysis.
This example demonstrates an analysis setup for:
- XAS for Sulfur (atm_core = S) at the L-edge (wave_f_type = p) including spin-orbit coupling effects (soc_option = 1).
- Three p core MOs to analyze: core_MO_range = 63-65.
- Excited states limited to the first seven (exc_state_range = 1-7).
- Atoms involved (0-116): the entire molecule. Although including all atoms might be unnecessary since not all are sulfur, this approach simplifies the setup by screening everything, even if it seems redundant or overly detailed.
For Atom_number_range_A, include only the enumerated atoms representing Sulfur (core MO space). For Atom_number_range_B, include the enumerated atoms of the virtual MO space (it is recommended to include all atoms). This range (0 to 116) represents the entire molecule's interaction.
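For readers who prefer manager.sh over helper_man.sh, the example settings can be mapped onto the 15 positional arguments. The sketch below only prints the resulting command and does not run anything. It assumes both atom ranges are set to 0-116 and that spectra_option keeps its default of 0. The file names my_xas.out and my_mos.out and both directory paths are hypothetical placeholders, and build_manager_cmd is an illustrative helper, not part of the repository.

```shell
#!/bin/sh
# Settings from the example; placeholders are marked as hypothetical.
atom_range_a="0-116"
atom_range_b="0-116"
core_mo_range="63-65"
exc_state_range="1-7"
soc_option=1
spectra_option=0                 # default value
atm_core="S"
wave_f_type="p"
orca_output="my_xas.out"         # hypothetical file name
external_mo_file="my_mos.out"    # hypothetical file name
input_path="/path/to/inputs"     # hypothetical directory
output_path="/path/to/outputs"   # hypothetical directory

# Hypothetical helper: prints the manager.sh call in the argument order $1..$15.
# ${var%-*} is the part before the dash, ${var#*-} the part after it.
build_manager_cmd() {
  echo ./manager.sh \
    "${atom_range_a%-*}" "${atom_range_a#*-}" \
    "${atom_range_b%-*}" "${atom_range_b#*-}" \
    "${core_mo_range%-*}" "${core_mo_range#*-}" \
    "$soc_option" "$orca_output" "$exc_state_range" "$spectra_option" \
    "$atm_core" "$wave_f_type" "$external_mo_file" \
    "$input_path" "$output_path"
}

build_manager_cmd
# prints: ./manager.sh 0 116 0 116 63 65 1 my_xas.out 1-7 0 S p my_mos.out /path/to/inputs /path/to/outputs
```

Printing the command first makes it easy to confirm that every value sits in the intended position before running it.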
More detailed information about running examples can be found in the example/readme.md file.
This pipeline primarily utilizes Linux text processing tools:
- grep
- cut
- awk
- sed
- vim
Contributions are what make the open-source community such a remarkable space for learning, inspiration, and innovation. Your contributions are highly valued and greatly appreciated!
If you have a suggestion to improve this project, feel free to fork the repository and submit a pull request.
Alternatively, you can open an issue with the tag "enhancement." And do not forget to give the project a star if you find it helpful—thank you for your support!
- Fork the Project
- Create Your Feature Branch: git checkout -b feature/branch
- Commit your Changes: git commit -m 'Add some Feature'
- Push to the Branch: git push origin feature/branch
- Open a Pull Request
caraortizmah
Distributed under the GNU General Public License v3.0
Carlos A. Ortiz-Mahecha - ortizmahecha[at]proton.me