Parallel Execution and Hardware Requirements for Local DeepMSA2 Deployment

Post by **fa.rahimzadeh** » Fri Jul 18, 2025 1:32 am

I want to compute MSAs for about 3,500 FASTA sequences using the DeepMSA2 standalone package. I have downloaded all the required databases and will install the software on a local system.

I have a few technical questions regarding the local execution:

1. Is it possible to run multiple jobs in parallel (not necessarily thousands at once, but perhaps in smaller batches of several dozen or a few hundred sequences)? Since I will need to rent a compute system for a limited period, running the analysis in parallel would significantly help me complete the tasks more efficiently.

2. Regarding the hardware requirements, does the GPU have to be strictly data center-grade (e.g., NVIDIA A100 or similar)? Or would high-end workstation GPUs such as the NVIDIA Quadro RTX 8000 or similar models be sufficient for running DeepMSA2 effectively?

I would greatly appreciate any advice or recommendations you could share on these topics, as they will help me configure the system appropriately for efficient processing.

Post by **albert_wei** » Sat Jul 19, 2025 5:11 pm

Thank you for your interest in our work.

Although it is possible to run multiple jobs simultaneously, you will need to write the parallel code yourself, as the downloaded code does not contain it. If you have a Slurm or OpenPBS environment, you can submit many jobs to the scheduler at once, which has the same effect as running the code in parallel.
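As a rough illustration of the scheduler route, here is a minimal Python sketch that writes a Slurm array job script with one array task per input sequence. The `#SBATCH` directives are standard Slurm options, but everything else is an assumption made for the example: `fasta_list.txt` (one FASTA path per line) and `run_deepmsa2_placeholder.sh` are hypothetical names, and the placeholder has to be replaced with the actual DeepMSA2 command from the package documentation.

```python
def make_array_script(n_inputs, max_concurrent, cpus_per_task,
                      list_file="fasta_list.txt",
                      command="run_deepmsa2_placeholder.sh"):
    """Return the text of a Slurm array job script, one array task per input.

    list_file and command are hypothetical names used only for illustration.
    """
    if n_inputs < 1 or max_concurrent < 1:
        raise ValueError("n_inputs and max_concurrent must be positive")
    lines = [
        "#!/bin/bash",
        # The %N suffix limits how many array tasks run at the same time.
        f"#SBATCH --array=0-{n_inputs - 1}%{max_concurrent}",
        f"#SBATCH --cpus-per-task={cpus_per_task}",
        # %A is the array job ID, %a is the array task index.
        "#SBATCH --output=msa_%A_%a.log",
        "",
        "# Pick the line of the list file that matches this array task.",
        f'FASTA=$(sed -n "$((SLURM_ARRAY_TASK_ID + 1))p" {list_file})',
        "",
        "# Placeholder: replace with the real DeepMSA2 invocation.",
        f'{command} "$FASTA"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: 3500 inputs, at most 50 running at once, 8 cores each.
    print(make_array_script(n_inputs=3500, max_concurrent=50, cpus_per_task=8))
```

The printed text can be saved to a file and submitted with `sbatch`. Many sites cap the array size through the `MaxArraySize` setting, so 3500 tasks may have to be split across several submissions.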

If you are only using DeepMSA2 to build MSAs, I don't think you need a GPU. However, if you need to use DMFold, a GPU is preferable as it accelerates protein structure prediction.

I think GPU memory is one of the key factors in running DMFold, as a long protein sequence will require a large amount of GPU memory for prediction. With 48 GB of GPU memory, the RTX 8000 should have enough capacity to predict the structure of many proteins.

Post by **lisa845jyorg** » Thu Aug 21, 2025 2:28 pm

Hello,

Yes, you can run multiple DeepMSA2 jobs in parallel (e.g., 10–100 at a time), as each sequence is processed independently.

Post by **heckgranola** » Wed Dec 24, 2025 11:24 am

Yes, you can run multiple DeepMSA2 jobs in parallel, as long as you manage resources properly (CPU cores, RAM, disk I/O). It’s common to run batches of dozens of sequences at once using a job scheduler or simple parallel scripting. Just avoid overloading the system, since database searches are CPU- and I/O-intensive.
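For the "simple parallel scripting" option on a single machine, a minimal Python sketch might look like the following. It caps the number of simultaneous jobs with `max_workers` so the CPU and disk are not oversubscribed. The function names here are made up for the example, and `placeholder_cmd` is only a stand-in that does no real work: it would need to be replaced with a function that returns the actual DeepMSA2 command line for one FASTA file.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(cmd):
    """Run one command, discard its output, and return its exit code."""
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode


def run_batch(fasta_paths, build_cmd, max_workers=4):
    """Run build_cmd(path) for every path, at most max_workers at a time.

    Threads are enough here because each worker only waits on an
    external process. Returns a dict mapping each path to its exit code.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        codes = list(pool.map(lambda p: run_one(build_cmd(p)), fasta_paths))
    return dict(zip(fasta_paths, codes))


def placeholder_cmd(fasta_path):
    # Hypothetical stand-in: starts a Python process that exits immediately.
    # Replace with the real DeepMSA2 command for fasta_path.
    return [sys.executable, "-c", "import sys; sys.exit(0)", fasta_path]


if __name__ == "__main__":
    results = run_batch(["seq1.fasta", "seq2.fasta"], placeholder_cmd,
                        max_workers=2)
    failed = [path for path, code in results.items() if code != 0]
    print(f"{len(results) - len(failed)} succeeded, {len(failed)} failed")
```

A reasonable starting point for `max_workers` is the number of CPU cores divided by the cores each search uses, lowered further if the disk holding the databases becomes the bottleneck.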
