Introduction

cApp is a convenient and easy-to-use Java application that aids the handling and storage of information about small-molecule compounds. With the application, the user can appraise compounds with respect to their physico-chemical properties and present structural information together with calculated or measured properties. Structures can be provided as SMILES, InChI or structure-data files (SDF), or added via the embedded chemical editor.

The tasks performed by cApp include compound appraisal by calculation of common chemical descriptors and analysis with respect to adherence to likeness rules, as well as a substructure search for pan-assay interference components. The user can add data and annotations, and query the PubChem database directly. When the software is used with its graphical user interface, cApp results are presented in a tabular view. Results can also be written in HTML format or exported as PDF.

This software is intended for work with small to medium-sized compound libraries. Especially when working with the GUI, please be mindful when loading extensive compound sets. The upper limit of compounds that can be handled in one cApp session depends on the resources of the computer system used. We have found that on machines with current typical specifications, sets of about 1000 compounds can still be handled reasonably well.
Therefore, the user will be prompted with a warning when attempting to read libraries that are larger than ~1000 compounds. Large library files in SDF format can be split using the split task. This task is not affected by large file sizes, as it does not create any cApp compound sets and only writes subsets to the disk.
Similarity searches that use large libraries can be carried out when starting the task from the terminal without the GUI.
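
The two terminal-only routes mentioned above (splitting a large SDF library, and running a similarity search without the GUI) might be invoked roughly as follows. This is a minimal sketch that only prints the commands. The switch combinations are assumptions assembled from the switch table later in this document, the helper names capp_split_cmd and capp_smsd_cmd are hypothetical, and the file names are placeholders.

```shell
#!/bin/sh
# Jar name as used elsewhere in this document; override via CAPP_JAR if yours differs.
CAPP_JAR="${CAPP_JAR:-capp_v1.5.1_java1.7.jar}"

# Hypothetical helper: print the command that would split an SDF library ($1)
# into subsets of $2 entries each (assumed combination of -i, -sdf and -split).
capp_split_cmd() {
    printf 'java -jar %s -i %s -sdf -split %s\n' "$CAPP_JAR" "$1" "$2"
}

# Hypothetical helper: print the command that would search a SMILES query file ($1)
# against an SDF library ($2) without the GUI (assumed combination of -i, -smi and -smsd).
capp_smsd_cmd() {
    printf 'java -jar %s -i %s -smi -smsd %s\n' "$CAPP_JAR" "$1" "$2"
}

# Dry run: show the commands instead of executing them.
capp_split_cmd library.sdf 500
capp_smsd_cmd query.smi library.sdf
```

Printing the commands first makes it easy to check the arguments before committing to a long run. Paths that contain spaces would need quoting when the printed command is executed.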


Concepts

Set: A set includes a particular list of compounds; multiple sets can be present at once and are displayed on individual tabbed panes in the GUI.
Project: A project comprises all data and compound sets when running an instance of cApp. In the graphical user interface (GUI), each compound set is displayed as a table on an individual tab. Automatically generated HTML, PDF and ASCII presentations of compound sets are identified by their set number.
Task: A cheminformatics task that is to be performed by the program. Different tasks result in different visual representations.


Tasks

Four tasks are available:
Appraise: Physico-chemical properties and structural features are calculated and analysed for compliance with various likeness criteria and for the presence of PAINS components. User-provided data and annotation can be included, and interactive convenience features are available.
Similarity search: Libraries of small-molecule compounds can be queried with individual or multiple compounds for structural similarity, using a maximum common subgraph approach. The PubChem Compound database can also be queried for similarity.
Compound clustering: Using Tanimoto similarity, the compounds within one set are grouped into a user-specified number of clusters.
Splitting of libraries: Multi-compound files can be split into subsets with a user-specified number of entries each. This is currently only possible for libraries in SDF format.
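
For reference, the Tanimoto similarity named in the clustering task is commonly defined on two sets of structural features (for example, the set bits of two fingerprints). The formula below is the general textbook definition; which features or fingerprints cApp actually compares is not stated in this document, so the symbols are illustrative only.

```latex
T(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{c}{a + b - c}
```

Here a and b are the numbers of features present in compounds A and B, and c is the number of features they share. T ranges from 0 (no shared features) to 1 (identical feature sets).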

How to start the program

The program can be started from the terminal or, on Windows and MacOS, by double-clicking the cApp jar-file in a directory browser or on the desktop.

With the GUI

On Windows and MacOS, double-clicking the cApp jar-file executes the program and starts the GUI. It is also possible to start the program from the terminal command line by entering the following command:

java -jar capp_v1.5.1_java1.7.jar

Starting the program from a terminal command line as above is also the simplest way to run the program on Linux/Unix (see also below), and makes it possible to observe any messages printed while the program is running.

Step-by-step instructions for starting the program from the terminal command line:
1. The Java Runtime Environment (JRE) has to be at least Java 1.7. Use the following command to check your JRE version:

java -version
2. Navigate to the directory in which the cApp jar file resides using the following command:

cd {folder}
3. To ensure that the correct file is present in the directory, list its contents with the following command (in the Windows Command Prompt, use dir instead):

ls
4. Enter the following command to start the program with default settings (i.e. with the GUI):

java -jar capp_v1.5.1_java1.7.jar
5. On Linux, a shell script can be used to access the program conveniently:

#!/bin/csh -f
setenv JAVA {path-to-java-bin-directory}/java
setenv CLASSPATH {path-to-capp-executable}/capp_v1.5.1_java1.7.jar
$JAVA capp $*
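
The Java version requirement from step 1 can also be checked by a script. The following is a minimal sketch for a POSIX shell, not part of cApp: the function names java_major, installed_java_version and check_java are hypothetical. It relies on the fact that legacy Java releases report versions as "1.N" (for example 1.7.0_80), whereas newer releases report the major version first (for example 17.0.2).

```shell
#!/bin/sh
# Extract the major Java version from a version string.
# "1.7.0_80" -> 7 (legacy "1.N" scheme), "17.0.2" -> 17 (current scheme).
java_major() {
    case "$1" in
        1.*) v=${1#1.}; echo "${v%%[!0-9]*}" ;;
        *)   echo "${1%%[!0-9]*}" ;;
    esac
}

# Read the quoted version string from `java -version`, which writes to stderr.
installed_java_version() {
    java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Report whether the installed JRE meets the Java 1.7 minimum stated above.
check_java() {
    ver=$(installed_java_version)
    if [ -z "$ver" ]; then
        echo "No Java runtime found on the PATH"
        return 1
    fi
    if [ "$(java_major "$ver")" -ge 7 ]; then
        echo "Java $ver meets the Java 1.7 requirement"
    else
        echo "Java $ver is too old; Java 1.7 or later is required"
        return 1
    fi
}
```

Calling check_java before launching the jar gives an early, readable message instead of a failure at start-up.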

Without the GUI

By default, the GUI is started when the program is launched from the terminal without any switches:

java -jar capp_v1.5.1_java1.7.jar

When any switches are used, the -gui switch must be given explicitly for the graphical interface to appear. For example, the following command executes an appraise task in the terminal, without the GUI, and generates results in PDF format:

java -jar capp_v1.5.1_java1.7.jar -i test.smi -smi -pdf

Program options from the terminal (switches) are gathered in the following table.
Switch                  Function
-appraise               Runs the appraise task (default)
-ascii                  Write results in ASCII format (directory: capp_results_ascii)
-autoselect             Auto-select largest entity when reading SDF or SMILES
-cluster {N}            Group compounds into {N} clusters
-debug                  Debug option
-drug                   Evaluate for drug likeness (Lipinski's Rule of 5)
-duplicates inchi       Check for duplicate entries based on InChI Keys
-duplicates tanimoto    Check for duplicate entries based on Tanimoto similarity
-fragment               Evaluate for fragment likeness (Rule of 3)
-gui                    Start the program with the graphical user interface
-h                      Print help
-html                   Generate results in HTML format (directory: capp_results_html)
-i {input file}         Input file with compounds to process
-inchi                  Input file contains InChI code
-lead                   Evaluate for lead likeness
-load                   Load a previously saved cApp project
-maxsets {N}            Maximum number of compound sets in the project (default: 10)
-pdf                    Write results into PDF files (directory: capp_results_pdf)
-pdf landscape          PDF paper orientation is Landscape (default)
-pdf portrait           PDF paper orientation is Portrait
-pdf bondlength         Structure images are drawn with same bond length (default)
-pdf fixed              Structure images have the same fixed size
-png                    Write PNG images of compounds (directory: capp_images)
-pubchem                Search for entry in PubChem when conducting the appraise task
-sdf                    Input file is an SDF file
-smi                    Input file contains SMILES code
-smsd {library file}    Similarity search of {input file} against {library file} in SDF format
-split {N}              Split an SDF library into subsets of {N} entries each
-svg                    Write SVG images of compounds (directory: capp_images)
-3d                     Attempt to generate 3D coordinates
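
Because every switch in the table can be given on the command line, several input files can be appraised in one unattended run. The sketch below only prints one command per file. The combination -smi -drug -html is an assumption based on the table, the helper name capp_batch_cmds is hypothetical, and the file names are placeholders. This document does not say whether repeated runs overwrite earlier results in the same output directory, so check that before relying on a batch run.

```shell
#!/bin/sh
# Jar name as used elsewhere in this document; override via CAPP_JAR if yours differs.
CAPP_JAR="${CAPP_JAR:-capp_v1.5.1_java1.7.jar}"

# Hypothetical helper: print one appraise command per SMILES file given as an argument,
# evaluating drug likeness and writing HTML results (assumed switch combination).
capp_batch_cmds() {
    for f in "$@"; do
        printf 'java -jar %s -i %s -smi -drug -html\n' "$CAPP_JAR" "$f"
    done
}

# Dry run: show the commands for two placeholder input files.
capp_batch_cmds set1.smi set2.smi
```

Once the printed commands look right, they can be run one after another, for example by piping the output to sh.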


Reviews
cApp has been independently reviewed by:
Macs in Chemistry

How to cite
When using this program, please cite:
Amani, P., Sneyd, T., Preston, S., Young, N.D., Mason, L., Bailey, U.-M., Baell, J., Camp, D., Gasser, R.B., Gorse, A.-D., Taylor, P., Hofmann, A. (2015) A practical tool for small-molecule compound appraisal. J. Cheminformatics 7, 28.