The Marie-Josée and Henry R. Kravis Drug Discovery Institute

Resources and FAQs

In Structure-Based Drug Design, we have a number of resources, including both hardware and software, and several small-molecule libraries.


We use high-end workstations and servers for molecular modeling and graphics. Data are stored and backed up on disk arrays totaling 20 TB (terabytes). Additionally, we use Minerva, the new high-performance computing system at the Icahn School of Medicine at Mount Sinai. On this system, virtual screening computations on millions of compounds can be completed in a matter of days.


We use an array of state-of-the-art software for Structural Bioinformatics, Cheminformatics, and Molecular Modeling. This includes AutoDock, eHiTS, DOCK, GOLD, and AutoDock Vina for docking and virtual screening; DockRes for analyzing virtual screening results; MODELLER for protein structure modeling and analysis; SiteHound and fpocket for binding site identification and analysis; and many others for various tasks.

Small-molecule Libraries

Several small-molecule libraries are available for virtual screening. 

  • The National Cancer Institute (NCI) Open Database library contains about 265,000 compounds from the Developmental Therapeutics Program at NCI/NIH. Compounds identified as hits through virtual screening can be obtained from the NCI.
  • The ZINC library and libraries from various commercial vendors contain more than 20 million purchasable compounds. While these compounds are not available in-house, these libraries cover a chemical space two orders of magnitude larger than that of the in-house libraries, which increases the chance of identifying high-quality hits. Individual compounds identified as hits through virtual screening can be purchased from various vendors. (We provide vendor IDs for each compound to facilitate purchasing.)

Frequently Asked Questions

Below are some Frequently Asked Questions about our Structure-Based Drug Discovery services.

Projects start with a consultation with our director to determine the feasibility of a computational approach. Once the required services have been determined, a user agreement describing the services and their cost is drafted and sent to the user for approval or modification. Once the agreement is approved, the services start. A project usually includes additional meetings between the user and our staff to clarify approaches and to discuss results.

Virtual screening for small-molecule ligands requires a 3D structure of the target protein (target-based virtual screening) or at least one active compound (ligand-based virtual screening). Additionally, an assay should be available to test the compounds selected using the virtual screen.

No. Usually 20 to 200 compounds are selected from a virtual screen for experimental testing. A low-throughput assay is generally sufficient for this.
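As a rough illustration of how a short list for experimental testing might be drawn from a ranked virtual screen, the sketch below keeps the best-scoring compounds. The compound IDs, the scores, and the function name `select_top_hits` are hypothetical, and the sketch assumes that lower (more negative) docking scores are better, as with AutoDock Vina. It is not a description of our actual selection procedure, which may apply other criteria as well.

```python
from typing import List, Tuple


def select_top_hits(scored: List[Tuple[str, float]], n: int = 100) -> List[Tuple[str, float]]:
    """Return the n best-scoring compounds, best (lowest) score first.

    Assumption: a lower docking score means a better predicted binder.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Sort ascending by score and keep the first n entries.
    return sorted(scored, key=lambda item: item[1])[:n]


# Hypothetical docking results: (compound ID, docking score).
results = [
    ("CMPD-001", -7.2),
    ("CMPD-002", -9.1),
    ("CMPD-003", -6.4),
    ("CMPD-004", -8.3),
]

top = select_top_hits(results, n=2)
print(top)
```

In a real screen, `n` would typically fall in the 20 to 200 range mentioned above.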

In many cases, the structure of the target protein can be modeled based on a homolog of known structure. We provide protein structure modeling services. Alternatively, if at least one active compound is known, ligand-based virtual screening can be used, which does not require a target structure.

Yes. We can fine-tune target-based virtual screening approaches to increase the chances of identifying additional active compounds. The active compounds can be used to initiate ligand-based virtual screening. Also, we can identify purchasable compounds similar to the existing hits to facilitate initial structure-activity relationship studies.
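One common way to find compounds similar to an existing hit is to compare molecular fingerprints using the Tanimoto coefficient. The sketch below is a minimal example under stated assumptions: fingerprints are represented as sets of "on" bit indices, the compound IDs and the 0.7 threshold are hypothetical, and the function names are illustrative only. It is not necessarily the method or software we use.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient: shared bits divided by total distinct bits."""
    if not a and not b:
        return 0.0  # Convention chosen here for two empty fingerprints.
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similar_compounds(
    hit: FrozenSet[int],
    library: Dict[str, FrozenSet[int]],
    threshold: float = 0.7,
) -> List[Tuple[str, float]]:
    """Return library compounds at or above the similarity threshold, most similar first."""
    scored = [(cid, tanimoto(hit, fp)) for cid, fp in library.items()]
    matches = [(cid, score) for cid, score in scored if score >= threshold]
    return sorted(matches, key=lambda m: m[1], reverse=True)


# Hypothetical fingerprints for a known hit and three purchasable compounds.
hit_fp = frozenset({1, 2, 3, 4})
vendor_library = {
    "VENDOR-A": frozenset({1, 2, 3, 4, 5}),
    "VENDOR-B": frozenset({1, 2, 9}),
    "VENDOR-C": frozenset({1, 2, 3, 4}),
}

print(similar_compounds(hit_fp, vendor_library))
```

The compounds returned by such a search can then be purchased and tested to build an initial structure-activity relationship around the hit.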

The commercial libraries contain two orders of magnitude more compounds than the Mount Sinai and National Cancer Institute (NCI) libraries. They provide a much larger chemical search space, which increases the chances of finding high-quality (active) hits. Commercial compounds are not available in-house, so hits obtained from a screen of these libraries must be purchased from vendors. Our library compounds are maintained in-house and are usually available for experimental validation within a day or two for a nominal fee. NCI compounds can be obtained free of charge from the NCI; only shipping costs need to be covered.

Yes. If a structure of the target protein is available (or can be modeled), molecular docking can be used to model the interaction of the known (active) compounds with the protein.