Skip to content

Comments

specify input and output files by command-line arguments#101

Merged
lmiq merged 27 commits intom3g:masterfrom
misael-diaz:command-line-args
Aug 20, 2025
Merged

specify input and output files by command-line arguments#101
lmiq merged 27 commits intom3g:masterfrom
misael-diaz:command-line-args

Conversation

@misael-diaz
Copy link
Contributor

@misael-diaz misael-diaz commented Aug 18, 2025

I am submitting this PR so that you can see more clearly what changes I have made to packmol's source code in order to implement this feature.

The proposed feature allows users to specify the names of the input and output files via command-line arguments, while being backwards compatible.

It is proposed to execute packmol with command-line arguments is the following way:

packmol -in input.txt -out output.txt

Users may still use packmol with input redirection if they wish to do so:

packmol < input.txt

in that case the name of the output file must be present in input.txt by providing the respective output keyword and value pair.

I shall add the tests that you have requested to test the interface automatically via CI.

closes #100

NOTES
the command-line parser sets the filenames `input_file_name` and
`xyzout`, where `input_file_name` is a new variable that we had
to define for storing the name of the input file.

we have defined a new exit code to differentiate command-line
related errors from the rest

for the time being we stop right after parsing the command-line
arguments until we update the rest of the source code accordingly

```sh
packmol -in input.txt -out output.txt
```

the following text is the current Packmol's logging on the console:

```
 PACKMOL - Packing optimization for the automated generation of
 starting configurations for molecular dynamics simulations.

                                                             Version 21.0.4

 input: input.txt
 output: output.txt
```
NOTES
packmol either reads from standard input or from the specified
input file

if the name of the input-file is an empty string packmol assumes
that the user has not provided any command-line arguments and thus it
fallsback to the default workflow of reading data from standard input;
otherwise we open the input-file and read the data from it. After
we are done we close the input file.
NOTES
If the user has provided a filename for the output file via the
command-line we use that; otherwise we fallback to the default
workflow of fecthing the filename from the input file.

Note that we did not apply indentation to make the git-diff easier
to read.

EXAMPLE

```sh
packmol -in input.txt -out output.txt
```

we get this

```

 PACKMOL - Packing optimization for the automated generation of
 starting configurations for molecular dynamics simulations.

                                                             Version 21.0.4

 input: input.txt
 output: output.txt
  Packmol must be run with: packmol < inputfile.inp

  Userguide at: http://m3g.iqm.unicamp.br/packmol

  Reading input file... (Control-C aborts)
  Types of coordinate files specified: pdb
  Seed for random number generator:      1234567
  Output file: output.txt
```

when we invoke packmol in the usual way, that is via:

```sh
packmol < input.txt
```

```

 PACKMOL - Packing optimization for the automated generation of
 starting configurations for molecular dynamics simulations.

                                                             Version 21.0.4

  Packmol must be run with: packmol < inputfile.inp

  Userguide at: http://m3g.iqm.unicamp.br/packmol

  Reading input file... (Control-C aborts)
  Types of coordinate files specified: pdb
  Seed for random number generator:      1234567
  Output file: system.pdb
```

the input file for this example was this one:

```
tolerance 2.0

filetype pdb

output system.pdb

structure data/IRMOF-1.pdb
  number 1
  inside box 0. 0. 0. 40 40 40
end structure

structure data/EtOH.pdb
  number 20
  inside box 0. 0. 0. 40 40 40
end structure
```

note that the name of the output file is set according to the
workflow (default or cli-mode)
NOTES
here we indent the codeblock nested in the if-statement
NOTES

at this point the user may use packmol either by supplying command-line
arguments or in the usual way

as you see we have not modify anything else that would affect the
computations executed by packmol
NOTES

renames the cli arguments according to the requirements so that
now users can execute packmol with command-line arguments in
this way:

```sh
packmol -i input.txt -o output.txt
```
NOTES

now users have the option of just defining the input file while
omitting specifying the output so that they may execute packmol this
way:

```sh
packmol -i input.txt
```
NOTES
This is for consistency with packmol's behavior: error messages are
written on standard output and numeric error codes are written on
standard error.
NOTES
executes all tests
development version of packmol passes all the tests in the suite

Example test execution:

```sh
/bin/bash test_cli.sh /usr/bin/packmol
```

Our test expects the user to provide the full or relative path
to the packmol executable; thus, this should also work

```sh
/bin/bash test_cli.sh ../packmol
```
NOTES
the test expects the user to provide the path to packmol to make
sure that we are testing the development version, which is
expected to be located at the parent directory
@lmiq
Copy link
Member

lmiq commented Aug 19, 2025

Some tests are failing, can you check, please?

Also, the test file you included seems to be AI generated (sorry if not). Not that I'm completely against that, but I find very hard to understand what's being tested there. Can we have a more straightforward test file?

@misael-diaz
Copy link
Contributor Author

Sure I can fix those issues.

The shell script is not AI generated. AI does not have the context to understand that when the seed is set to -1 in the input file that means the packmol uses the current time to initialize the PRNG and that means that every time packmol executes results might vary and hence the contents of the output file. I understood that by reading packmol's source code. I am writing this to demonstrate that it was human effort and experience that was used to write the script. I actually feel proud of being able to write something like that because I have been putting a lot of effort to learn GNU/Linux in over a decade. No offense taken though.

What I can do is to provide detailed descriptions in key steps so that the script will be easier to understand. In a nutshell, the script executes all the tests and checks that packmol yields the same output file regardless of the means of executing packmol.

I am going to work on those issues and improve the documentation of the script. If after that you consider that the script needs revision I can also work on that.

Thank you for your feedback.

@lmiq
Copy link
Member

lmiq commented Aug 19, 2025

Thanks for the clarification. I meant no offense (I use AI eventually), it was just that the script was so complete that it looked like AI generated (for instance testing for the existence of cp, ls commands is quite uncommon for humans :-) ).

By the way, I am quite impressed by the good use of Fortran you did there (clearly much better than what the whole Packmol looks like - a patchwork of decades of development starting when I was an undergrad...).

Saludos.

NOTES
this should fix the compile-time error
@misael-diaz
Copy link
Contributor Author

misael-diaz commented Aug 19, 2025

After patching the cmake file we should get a successful build in GNU/Linux.

Noticed that gawk is not present by default in OSX going to try awk instead to see if that does it. Good think I check for the existence of those commands in the shell script, makes troubleshooting easier.

I agree that checking for ls and cp commands is truly unnecessary; I mean, how would an user even get to run the script if those are broken? I was driven to do that just for the sake of uniformity in the script even if it meant being a little silly perhaps.

I have truly tried to develop something that would be useful to the scientific community a couple of times but have failed; maybe because I did not promote my projects enough.

Thus, I find it exciting when I see projects like this one that started out when the developers were pursuing their BS degree as you have said. To me being able to contribute to an open-source project has been an experience I have been looking for and I am glad that the opportunity has presented itself.

I tried to write my Fortran code here as close as possible to they the rest of the source code while trying to be self-contained and preserve the correctness of everything else. I mean adding cli support should not change the existing code structure, or at least that's how I felt.

I would be happy to continue contributing to this project, thank you for giving me the chance of writing some code. I will try to get this one PR ready to be merged and then see what else I can do to contribute.

Saludos, muchas gracias por la oportunidad ha sido un verdadero placer contribuir en su codigo. Cordialmente.

NOTES
should fix the missing gawk error in OSX since it's not installed
by default
NOTES
OSX's `wc` command does uses other flags than its Linux counterpart;
however, checking for the byte count `wc -c` should work on both systems
@lmiq
Copy link
Member

lmiq commented Aug 19, 2025

I'll convert the PR to "Draft", when you think you're finished you can confirm it and I'll receive a notification to review it.

@lmiq lmiq marked this pull request as draft August 19, 2025 19:28
NOTES
the comments are meant to explain what it is being done from a high
level, though developers and users alike are encouraged to read
the man pages of the tools that we have used to understand the test
in more depth
NOTES
shows how to invoke packmol from the command-line with examples
@misael-diaz
Copy link
Contributor Author

It's done. Packmol should pass the tests. I have also documented the shell script to make it easier to understand and have added helper text for the users if packmol detects a command line error.

@lmiq lmiq marked this pull request as ready for review August 19, 2025 22:20
@lmiq
Copy link
Member

lmiq commented Aug 20, 2025

Thanks, great work and contribution!

@lmiq lmiq merged commit a9424d1 into m3g:master Aug 20, 2025
4 checks passed
@misael-diaz
Copy link
Contributor Author

Thank you for the opportunity and for the feedback. This was an experience that taught me a lot about contributing to open-source and I am really grateful for that.
Bests regards.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

add cli argument for input file

2 participants