[Feature Request] Add support for symbolic atom, bond, angle, dihedral and improper types #2002

**Summary**

Right now, force field types in LAMMPS are numbers starting from 1. Allowing symbolic (text-based) atom types would make many tasks easier, such as merging data from multiple simulations, reading existing topology/parameter file formats from CHARMM or Amber, and writing complex inputs. On the other hand, some of the flexibility of LAMMPS stems from the sizes of its data structures being flexible at first but locked in once the simulation cell is created. This creates a conflict, and whatever solution is implemented has to remain acceptable when symbolic atom types are in use.

**Detailed Description**

Given the pervasive nature of such a feature, it has to be implemented in the core of LAMMPS. Given the large body of existing work and tools, it also has to be implemented in a way that ensures backward compatibility. It would need to be implemented incrementally: first add the facility to manage the symbol-to-type and reverse mapping, then gradually adapt styles and other code to use it.
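To make the "symbol to type and reverse mapping" concrete, here is a minimal sketch in Python rather than in LAMMPS's own C++. The class name `TypeMap` and its methods are hypothetical and only illustrate the bookkeeping. They are not a proposed API.

```python
class TypeMap:
    """Bidirectional map between symbolic names and 1-based numeric types."""

    def __init__(self):
        self._to_id = {}      # symbol -> numeric type
        self._to_symbol = []  # index (type - 1) -> symbol

    def add(self, symbol):
        """Register a symbol and return its numeric type; re-adding is a no-op."""
        if symbol not in self._to_id:
            self._to_symbol.append(symbol)
            self._to_id[symbol] = len(self._to_symbol)
        return self._to_id[symbol]

    def type_of(self, symbol):
        return self._to_id[symbol]

    def symbol_of(self, type_id):
        if not 1 <= type_id <= len(self._to_symbol):
            raise KeyError(f"numeric type {type_id} has no symbol")
        return self._to_symbol[type_id - 1]

    def __len__(self):
        return len(self._to_symbol)


atom_types = TypeMap()
atom_types.add("OW")
atom_types.add("HW")
```

Numeric types stay 1-based, as they are today, so existing code that indexes by number would not have to change.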

A key goal would be the ability to directly read topology/parameter files from either CHARMM (parameter and psf files) or Amber (prmtop files), or to read a well-supported file format for which a well-supported converter tool exists that converts without loss of information (most tools lose some information, though). It should be possible in some way to auto-generate types for bonded interactions from atomic types. There should also be a command to populate the force field type database from a LAMMPS input script.
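One possible way to auto-generate bonded types from atomic types is to key each bond or angle by the symbols of its atoms, treating a term and its reverse as the same type. The sketch below assumes this scheme. The function names and the water example are illustrative only.

```python
def canonical(key):
    """Return a direction-independent key: A-B equals B-A, A-B-C equals C-B-A."""
    key = tuple(key)
    return min(key, key[::-1])


def generate_bonded_types(atom_symbols, topology):
    """Assign 1-based types to bonded terms from the symbols of their atoms.

    atom_symbols: dict mapping atom id -> symbolic atom type
    topology: list of atom-id tuples (pairs for bonds, triples for angles)
    Returns (types, table): types[i] is the numeric type of topology[i] and
    table maps each canonical symbol tuple to its numeric type.
    """
    table = {}
    types = []
    for atoms in topology:
        key = canonical(atom_symbols[a] for a in atoms)
        # First occurrence of a key gets the next free number.
        types.append(table.setdefault(key, len(table) + 1))
    return types, table


# Hypothetical water molecule: atom 1 is OW, atoms 2 and 3 are HW.
symbols = {1: "OW", 2: "HW", 3: "HW"}
bond_types, bond_table = generate_bonded_types(symbols, [(1, 2), (3, 1)])
angle_types, angle_table = generate_bonded_types(symbols, [(2, 1, 3)])
```

Both bonds end up with the same type even though they are listed in opposite directions. Impropers would likely need a different canonical form, since their atom ordering conventions differ between force fields.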

The major constraint to consider is that one cannot change the number of types after the box is defined (short of completely deleting the simulation instance with `clear`). A strategy to deal with that is to reserve more atom types than actually needed *and* have a "default" or "disabled" set of force field parameters (a NULL atom or bond type), so that the check for unset atom types does not fail due to missing parameter input.
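The reservation strategy could look roughly like the sketch below: a fixed number of slots is allocated up front, every slot starts out as NULL with "disabled" parameters, and symbols are bound to free slots later. The class name and the choice of `(0.0, 1.0)` as disabled Lennard-Jones parameters are assumptions made for illustration.

```python
NULL_SYMBOL = "NULL"
NULL_LJ = (0.0, 1.0)  # assumed "disabled" parameters: zero epsilon, dummy sigma


class ReservedAtomTypes:
    """Fixed-size type table whose unused slots carry NULL parameters."""

    def __init__(self, n_reserved):
        # The size is fixed here, mirroring the lock-in at box creation.
        self.symbols = [NULL_SYMBOL] * n_reserved
        self.coeffs = [NULL_LJ] * n_reserved

    def assign(self, symbol, coeff):
        """Bind a symbol to the first free slot and return its 1-based type."""
        for i, current in enumerate(self.symbols):
            if current == NULL_SYMBOL:
                self.symbols[i] = symbol
                self.coeffs[i] = coeff
                return i + 1
        raise RuntimeError("no reserved atom types left")

    def coeff_of(self, type_id):
        """Every slot has parameters, so a missing-coefficient check passes."""
        return self.coeffs[type_id - 1]


reserved = ReservedAtomTypes(3)
ow_type = reserved.assign("OW", (0.15535, 3.166))
```

Because unused slots already hold the NULL parameters, adding a new symbol after box creation only fills a slot and never changes the table size.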

One strategy would be to allow setting symbolic types *BEFORE* the box is defined, e.g. through a command like `species atom add OW`. That would later allow either explicitly associating a numerical atom type with a symbolic type (`species atom set type OW 1`) or doing so automatically. When creating the box, the keyword AUTO could be used instead of the number of atom types, and the number would then be the number of entries in the table of symbolic types.

We could also have other things associated with those tables, e.g. force field parameters (`species atom coeff OW lj 0.15535 3.166`). In most force fields, the mixed terms are derived from mixing rules, but that could be overridden as well (e.g. to support NBFIX in CHARMM) with a specific command (`species atom mixed OW OH lj 0.0 3.0`).
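The interplay of per-type coefficients, mixing rules and explicit overrides can be sketched as follows. The sketch assumes a geometric mean for epsilon and an arithmetic mean for the distance parameter as the default rule, which is only one of several rules in use. The `OH` coefficients are placeholders, and `mixed_lj` is a hypothetical helper.

```python
import math


def mixed_lj(a, b, coeffs, overrides):
    """Return (epsilon, sigma) for the pair a-b.

    An explicit override (NBFIX-style) wins; otherwise a mixing rule is
    applied. The rule used here is an assumption made for this sketch.
    """
    key = frozenset((a, b))  # pair order does not matter
    if key in overrides:
        return overrides[key]
    eps_a, sig_a = coeffs[a]
    eps_b, sig_b = coeffs[b]
    return math.sqrt(eps_a * eps_b), 0.5 * (sig_a + sig_b)


# As set by `species atom coeff OW lj 0.15535 3.166`; OH values are placeholders.
coeffs = {"OW": (0.15535, 3.166), "OH": (0.05, 1.0)}
# As set by `species atom mixed OW OH lj 0.0 3.0`.
overrides = {frozenset(("OW", "OH")): (0.0, 3.0)}

# With the AUTO keyword, the number of atom types would simply be the table size.
n_atom_types = len(coeffs)
```

Storing overrides separately from the per-type table keeps the mixing rule as the default and makes it easy to list which pairs deviate from it.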

The equivalent would be done for bonds, angles, etc., which would make it possible to read a CHARMM parameter file (except for cross terms, but that is a special case anyway).

To support data and molecule files with symbolic instead of numeric types, there could be commands like `species data some.data` and `species molecule some.molecule` that would extract symbolic type and coefficient info from data and molecule files where available. Similarly, more readers for extracting data could be implemented: `species charmm xxx.par`, `species amber xxx.prmtop`, `species psf xxx.psf`, etc.

As soon as the box is created, the database of force field types becomes constrained: the number of numeric types is fixed, which limits which numeric types can still be associated with symbolic types.

To be backward compatible, and to handle numeric types and `create_box` with an explicit number of atom types, the database would always contain a NULL type, and by default numerical atom types would be associated with that type. This would then require different "sanity checks": where you currently get an error that pair coefficients are missing, those checks would have to be extended to check against the mapping to symbols, too.
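An extended check of this kind might look like the sketch below. It assumes one particular policy: a numeric type counts as covered if coefficients were set for it directly by number, or if it is mapped to a non-NULL symbol that has coefficients in the database. The function name and that policy are assumptions. The proposal leaves the exact rule open.

```python
NULL = "NULL"


def missing_pair_coeffs(ntypes, symbol_of_type, numeric_coeffs, symbol_coeffs):
    """Return the numeric types that have no pair coefficients from any source.

    symbol_of_type: dict numeric type -> symbol (unmapped types default to NULL)
    numeric_coeffs: dict numeric type -> coefficients set directly by number
    symbol_coeffs:  dict symbol -> coefficients stored in the type database
    """
    missing = []
    for t in range(1, ntypes + 1):
        if t in numeric_coeffs:
            continue  # set the traditional way, by numeric type
        symbol = symbol_of_type.get(t, NULL)
        if symbol != NULL and symbol in symbol_coeffs:
            continue  # covered through the symbol mapping
        missing.append(t)
    return missing


# Type 1 is set by number, type 2 via its symbol, type 3 is not set at all.
unset = missing_pair_coeffs(
    ntypes=3,
    symbol_of_type={2: "OW"},
    numeric_coeffs={1: (0.1, 3.4)},
    symbol_coeffs={"OW": (0.15535, 3.166)},
)
```

With purely numeric input, every type maps to NULL and the check reduces to the current behavior.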

For the commands that otherwise require numerical (atom) types, we can accept strings (if present in the type database) and simply have the database return the number that the symbol is associated with.
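Accepting either form in a command argument could be handled by a single lookup helper, sketched here. `resolve_type` is a hypothetical name, and the sketch assumes that symbols are never purely numeric, since such a symbol would be indistinguishable from a numeric type.

```python
def resolve_type(token, symbol_to_type, ntypes):
    """Translate a command argument into a numeric type.

    Purely numeric tokens are taken as numeric types, which keeps existing
    input scripts working; anything else is looked up in the symbol table.
    """
    if token.isdigit():
        value = int(token)
        if not 1 <= value <= ntypes:
            raise ValueError(f"numeric type {value} out of range 1..{ntypes}")
        return value
    if token in symbol_to_type:
        return symbol_to_type[token]
    raise ValueError(f"unknown type symbol: {token}")


symbol_to_type = {"OW": 1, "HW": 2}
by_symbol = resolve_type("HW", symbol_to_type, ntypes=2)
by_number = resolve_type("2", symbol_to_type, ntypes=2)
```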

The symbols could then also be enabled on output, e.g. to replace the type-to-element mapping in dump files, or the `write_data` command could be instructed to replace the numbers with strings.
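On the output side, the substitution amounts to a reverse lookup per atom line. The sketch below assumes a simplified `id type x y z` line layout, which is not the full data file format, and a hypothetical helper name. Types without a symbol are left as numbers.

```python
def label_atom_lines(lines, type_to_symbol):
    """Replace the numeric type (second column) with its symbol where known."""
    labeled = []
    for line in lines:
        fields = line.split()
        # Fall back to the original numeric text if no symbol is registered.
        fields[1] = type_to_symbol.get(int(fields[1]), fields[1])
        labeled.append(" ".join(fields))
    return labeled


atom_lines = ["1 1 0.0 0.0 0.0", "2 2 0.8 0.6 0.0"]
labeled_lines = label_atom_lines(atom_lines, {1: "OW"})
```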

