02 05 2025 chemoinformatics life
After repeatedly rewriting input generators for FMO calculations each time I changed jobs, I decided to develop and maintain them as open-source software, and that is how FMOkit was born.
To successfully carry out FMO calculations, we must overcome three major challenges:
- Compiling GAMESS
- Preparing PDB files (adding hydrogen atoms and assigning charges)
- Generating FMO input files
In the following sections, we’ll examine each of these in detail.
Compiling GAMESS
Precompiled binaries are available for Windows and Intel-based Macs. On Linux, GAMESS must be compiled from source, but the process is relatively straightforward. For Macs with Apple Silicon (M1–M3 chips), modifications to the source code are necessary. Please refer to the following Japanese article for detailed instructions.
Preparing PDB files (adding hydrogen atoms and assigning charges)
FMO (Fragment Molecular Orbital) calculations are a type of quantum chemical simulation, and thus require that hydrogen atoms and fragment charges be explicitly included in the molecular system. However, standard structural formats such as PDB files typically lack this information, so it must be added during preprocessing.
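To make the idea of fragment charges concrete, here is a minimal sketch that assigns an integer formal charge to each fragment of a small peptide. It is not how FMOkit or mmcifutil.py computes charges. It assumes one fragment per residue, standard protonation states near neutral pH (neutral histidine), a protonated N-terminus, and a deprotonated C-terminus. The function name `fragment_charges` and the residue list are hypothetical and used only for illustration.

```python
# Formal charges of ionizable residues at roughly neutral pH (assumed
# protonation states; histidine is treated as neutral here).
RESIDUE_CHARGE = {"ASP": -1, "GLU": -1, "LYS": 1, "ARG": 1}


def fragment_charges(residues):
    """Return one integer formal charge per residue-sized fragment.

    Assumes one fragment per residue, a protonated N-terminus (+1)
    and a deprotonated C-terminus (-1).
    """
    charges = [RESIDUE_CHARGE.get(name, 0) for name in residues]
    if charges:
        charges[0] += 1    # N-terminal NH3+
        charges[-1] -= 1   # C-terminal COO-
    return charges


# Illustrative ten-residue peptide (hypothetical input, not parsed from a file).
peptide = ["GLY", "TYR", "ASP", "PRO", "GLU", "THR", "GLY", "THR", "TRP", "GLY"]
charges = fragment_charges(peptide)
print(charges)       # [1, 0, -1, 0, -1, 0, 0, 0, 0, -1]
print(sum(charges))  # -2
```

Every fragment needs such a charge in the input, so a single wrong protonation state changes the electron count of the calculation.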
While commercial molecular modeling tools like Maestro or MOE are often used in pharmaceutical companies and academic labs, I opted for an open-source solution. After evaluating various options, I chose OpenMM, a molecular dynamics toolkit, as a backend for preprocessing.
You can use the script mmcifutil.py located in the utils directory to add hydrogen atoms and assign partial charges to all atoms:
python mmcifutil.py input_file output_file
For details about how the code works, please refer to this article.
Generating FMO input files
Aside from securing computational resources, generating input files is arguably the most complex and unintuitive part of the entire FMO workflow. Writing these inputs by hand is extremely difficult, tedious, and error-prone.
I began developing FMOkit to automate FMO input generation from structural data in the mmCIF format; support for mol2 is planned.
If you prefer a graphical interface for preparing input files, I recommend using Facio, a GUI tool specifically designed for FMO workflows.
The code is shown below, using chignolin, one of the smallest known proteins, as an example.
>>> from FMOkit import System
>>> s = System(nodes=1, cores=8, memory=12000, basissets="6-31G")
>>> s.read_file("tests/5awl-addH.cif")
>>> s.prepare_fragments()
>>> with open("5awl-fmokit.inp", "w") as f:
...     f.write(s.print_fmoinput())
Command-line interface (CLI) support is planned; in this example, the code was run from the Python console for verification.
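Until a CLI exists, the console session can be wrapped in a small script. The following is only a sketch of what such a wrapper might look like, not FMOkit's planned interface. The option names (`--nodes`, `--cores`, `--memory`, `--basis`, `-o`) and the default output name `fmo.inp` are assumptions of mine. The only FMOkit calls used are the ones shown in the session above.

```python
import argparse


def build_parser():
    """Build the argument parser for a hypothetical input-generation script."""
    parser = argparse.ArgumentParser(
        description="Generate a GAMESS FMO input from an mmCIF file (sketch)."
    )
    parser.add_argument("cif", help="mmCIF file with hydrogens already added")
    parser.add_argument("-o", "--output", default="fmo.inp",
                        help="FMO input file to write")
    parser.add_argument("--nodes", type=int, default=1)
    parser.add_argument("--cores", type=int, default=8)
    parser.add_argument("--memory", type=int, default=12000)
    parser.add_argument("--basis", default="6-31G")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Imported here so the parser can be built without FMOkit installed.
    from FMOkit import System

    # Same calls as the console session; only the argument plumbing is new.
    s = System(nodes=args.nodes, cores=args.cores,
               memory=args.memory, basissets=args.basis)
    s.read_file(args.cif)
    s.prepare_fragments()
    with open(args.output, "w") as f:
        f.write(s.print_fmoinput())
```

Calling `main(["tests/5awl-addH.cif", "-o", "5awl-fmokit.inp"])` would then perform the same steps as the session above.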
The FMO calculation (RHF/MP2, 6-31G basis set) for this 10-residue protein completed in approximately 42 minutes on my MacBook Air (Apple M3, 2024, with 16 GB of memory).
% time ~/gamess/rungms 5awl-fmokit.inp >& 5awl-fmokit.out
~/gamess/rungms 5awl-fmokit.inp >&5awl-fmokit.out  2351.21s user 142.53s system 99% cpu 41:44.37 total
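As a quick check of the timing line above, the short sketch below converts the elapsed time to minutes and recomputes the CPU percentage as (user + system) / wall. The helper name `elapsed_seconds` is my own; the numbers are copied from the output.

```python
def elapsed_seconds(clock):
    """Convert an elapsed time such as '41:44.37' or '1:02:03.5' to seconds."""
    seconds = 0.0
    for part in clock.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


user, system = 2351.21, 142.53       # CPU seconds reported above
wall = elapsed_seconds("41:44.37")   # wall-clock seconds

print(f"wall time: {wall / 60:.1f} min")                  # wall time: 41.7 min
print(f"CPU use:   {100 * (user + system) / wall:.1f}%")  # CPU use:   99.6%
```

A CPU figure close to 100% corresponds to roughly one fully busy core for the processes that `time` accounts for.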