SMILES (Simplified Molecular Input Line Entry System) was developed by researchers in the US government, one of whom went on to found a chemoinformatics company that markets tools based on the language and its derivatives (http://www.daylight.com).
An excellent set of resources, including a full tutorial, is available on the SMILES homepage on the Daylight website.
In SMILES, atoms are generally represented by their chemical symbol, with upper case representing an aliphatic atom (C = aliphatic carbon, N = aliphatic nitrogen, etc.) and lower case representing an aromatic atom (c = aromatic carbon, etc.). Hydrogens are not normally written explicitly. Adjacent atom symbols represent atoms bonded together, by a single bond unless both atoms are aromatic. Therefore, the SMILES for propane would simply be:
- CCC
Similarly, 1-propanol would be:
- CCCO
Double bonds are represented by an “=” sign, e.g. propene would be:
- C=CC
Parentheses are used to represent branching in the molecule, e.g. the SMILES for isopropyl alcohol (2-propanol) is:
- CC(O)C
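The rules so far (implicit hydrogens, single bonds between neighbours, “=” for double bonds, parentheses for branches) can be sketched in a few lines of Python. This is only an illustration, not a real SMILES parser: the function names and the small valence table are assumptions made for this example, and it handles only acyclic strings built from C, N and O.

```python
# Usual valences of a few aliphatic atoms (an assumption of this sketch).
VALENCE = {"C": 4, "N": 3, "O": 2}
BOND_ORDER = {"=": 2}

def hydrogen_counts(smiles):
    """Return [(symbol, implicit_hydrogens), ...] for a simple acyclic SMILES."""
    atoms = []    # element symbol of each atom, in order of appearance
    used = []     # total bond order already used by each atom
    stack = []    # atoms to return to when a branch closes
    prev = None   # index of the atom the next atom will bond to
    order = 1     # order of the pending bond (single unless "=" was seen)
    for ch in smiles:
        if ch == "(":
            stack.append(prev)
        elif ch == ")":
            prev = stack.pop()
        elif ch in BOND_ORDER:
            order = BOND_ORDER[ch]
        elif ch in VALENCE:
            atoms.append(ch)
            used.append(0)
            idx = len(atoms) - 1
            if prev is not None:
                used[prev] += order
                used[idx] += order
            prev, order = idx, 1
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    # Whatever valence is left over is filled with implicit hydrogens.
    return [(sym, VALENCE[sym] - u) for sym, u in zip(atoms, used)]

def formula(smiles):
    """Molecular formula: carbon first, then hydrogen, then the rest alphabetically."""
    counts = {}
    for sym, h in hydrogen_counts(smiles):
        counts[sym] = counts.get(sym, 0) + 1
        counts["H"] = counts.get("H", 0) + h
    keys = ["C", "H"] + sorted(k for k in counts if k not in ("C", "H"))
    return "".join(
        k + (str(counts[k]) if counts[k] > 1 else "")
        for k in keys if counts.get(k, 0) > 0
    )

for s in ["CCC", "CCCO", "C=CC", "CC(O)C"]:
    print(s, formula(s))
# CCC C3H8
# CCCO C3H8O
# C=CC C3H6
# CC(O)C C3H8O
```

Note that 1-propanol and 2-propanol give the same formula, as expected for isomers; only the position of the parentheses in the SMILES tells them apart.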
Atoms other than those of the organic subset (B, C, N, O, P, S, F, Cl, Br, I), as well as ions, must be enclosed in square brackets, e.g. [Na+].
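As a rough illustration of how a program might split a SMILES string into atoms, bonds and brackets, here is a small tokenizer built on Python's standard `re` module. The pattern and the `tokenize` name are assumptions for this sketch; it covers only the features described in this section (bracket atoms, the unbracketed organic atoms including fluorine, lowercase aromatic atoms, “=”, parentheses and single-digit ring labels) and is far from complete.

```python
import re

# Order matters: "Cl" and "Br" must be tried before the one-letter "C" and "B".
TOKEN = re.compile(r"""
    \[[^\]]+\]      # anything in square brackets is one bracket atom
  | Cl | Br         # two-letter atoms of the organic subset
  | [BCNOPSFI]      # one-letter aliphatic atoms of the organic subset
  | [bcnops]        # lowercase aromatic atoms
  | [=()]           # double bond and branch parentheses
  | \d              # single-digit ring-closure label
""", re.VERBOSE)

def tokenize(smiles):
    """Split a SMILES string into a list of tokens, or raise ValueError."""
    tokens, pos = [], 0
    while pos < len(smiles):
        m = TOKEN.match(smiles, pos)
        if m is None:
            raise ValueError(f"cannot tokenize {smiles!r} at position {pos}")
        tokens.append(m.group())
        pos = m.end()
    return tokens

print(tokenize("CCl"))           # ['C', 'Cl']
print(tokenize("C[Si](C)(C)C"))  # ['C', '[Si]', '(', 'C', ')', '(', 'C', ')', 'C']
```

Silicon is not in the organic subset, so it has to appear as `[Si]`; writing `Si` without brackets would not be read as silicon.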
Ring closures are represented by using numbers to signify attachment points, usually starting at 1. The first occurrence of a number marks an attachment point, and the next occurrence of the same number indicates that the structure joins back to that attachment point. For example, the SMILES for benzene is as follows (note the lowercase ‘c’ for aromatic carbon):
c1ccccc1
We can also use branching from the ring system, e.g.
c1cc(Br)ccc1
represents bromobenzene.
Note that in many cases there can be several SMILES to represent the same structure – for example, we could alternatively represent bromobenzene as:
c1cccc(Br)c1
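Both the ring-closure rule and the point about alternative SMILES can be sketched in Python. The code below turns a SMILES string into a list of atoms and a list of bonds (a ring-closure digit simply adds one extra bond back to the atom that opened it), then compares two strings using a simple per-atom "signature". The function names are made up for this example, bond orders and aromaticity are ignored, and equal signatures are only a necessary condition for two structures being the same; real toolkits use a proper canonicalization algorithm instead.

```python
def parse(smiles):
    """Return (atoms, bonds) for a SMILES made of letters, parentheses
    and single-digit ring closures. Bonds are pairs of atom indices."""
    atoms, bonds = [], []
    open_rings = {}   # ring digit -> index of the atom that opened the ring
    stack = []        # branch points to return to at ")"
    prev = None       # index of the atom the next atom will bond to
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == "(":
            stack.append(prev)
        elif ch == ")":
            prev = stack.pop()
        elif ch.isdigit():
            if ch in open_rings:
                # Second occurrence: join back to the attachment point.
                bonds.append((open_rings.pop(ch), prev))
            else:
                # First occurrence: remember the attachment point.
                open_rings[ch] = prev
        elif ch.isalpha():
            two = smiles[i:i + 2]
            sym = two if two in ("Br", "Cl") else ch
            atoms.append(sym)
            idx = len(atoms) - 1
            if prev is not None:
                bonds.append((prev, idx))
            prev = idx
            i += len(sym) - 1
        else:
            raise ValueError(f"unsupported character: {ch!r}")
        i += 1
    if open_rings:
        raise ValueError("unclosed ring")
    return atoms, bonds

def signature(smiles):
    """Sorted list of (atom symbol, sorted neighbour symbols) for every atom."""
    atoms, bonds = parse(smiles)
    neighbours = [[] for _ in atoms]
    for a, b in bonds:
        neighbours[a].append(atoms[b])
        neighbours[b].append(atoms[a])
    return sorted((sym, tuple(sorted(n))) for sym, n in zip(atoms, neighbours))

atoms, bonds = parse("c1ccccc1")
print(bonds)  # [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]
print(signature("c1cc(Br)ccc1") == signature("c1cccc(Br)c1"))  # True
```

The last bond in the benzene output, (0, 5), is the one created by the closing “1”.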
A nice thing about SMILES is that it can easily be stored in text or spreadsheet documents. For example, “SMILES files” containing SMILES, structure names or IDs, and sometimes data are often used to transfer information about structures between computers and scientists. Here is a sample of what a short SMILES file might look like:
c1ccccc1 Benzene
c1cc(Br)ccc1 Bromobenzene
c1c(O)ccc(NC(=O)C)c1 Acetaminophen
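Reading such a file takes very little code. The sketch below assumes the layout of the sample above, one structure per line with the SMILES first and the name after whitespace; the `parse_smiles_file` name is made up for this example, and real files may carry extra data columns that this does not handle.

```python
def parse_smiles_file(text):
    """Return a list of (smiles, name) pairs from SMILES-file text."""
    records = []
    for line in text.splitlines():
        parts = line.split(None, 1)   # split on the first run of whitespace
        if not parts:
            continue                  # skip blank lines
        smiles = parts[0]
        name = parts[1].strip() if len(parts) > 1 else ""
        records.append((smiles, name))
    return records

sample = """c1ccccc1 Benzene
c1cc(Br)ccc1 Bromobenzene
c1c(O)ccc(NC(=O)C)c1 Acetaminophen
"""

for smiles, name in parse_smiles_file(sample):
    print(f"{name}: {smiles}")
# Benzene: c1ccccc1
# Bromobenzene: c1cc(Br)ccc1
# Acetaminophen: c1c(O)ccc(NC(=O)C)c1
```

Because a SMILES string never contains whitespace, splitting on the first whitespace is enough to separate it from a name that itself contains spaces.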
It is good to understand how SMILES works, but it is unusual for humans to have to do the job of working out SMILES. Normally conversion to and from SMILES is done by the computer using drawing and depiction tools which are described below.
Source protected by copyright; this post only reproduces the information. © Copyright 2002-2004 David Wild, email wildd @ umich.edu