returns the fingerprint similarity between molecules mol1 and mol2.
Details and Options
ResourceFunction["MoleculeFingerprintSimilarity"] first encodes each molecule as a string of bits, each either on or off, and then computes the similarity between the resulting bit vectors.
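To make the second step concrete, the following is a minimal Python sketch of comparing two bit vectors with the Tanimoto measure (the number of on-bits shared by both vectors divided by the number of on-bits in either). It is not the resource function's implementation: the helper name tanimoto, the example bit patterns, and the value returned for two all-off vectors are assumptions made for illustration.

```python
def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity of two bit vectors stored as Python integers."""
    on_a = bin(a).count("1")        # on-bits in a
    on_b = bin(b).count("1")        # on-bits in b
    common = bin(a & b).count("1")  # on-bits shared by a and b
    union = on_a + on_b - common    # on-bits present in a or b
    # Returning 1.0 for two all-off vectors is an arbitrary choice in this sketch.
    return common / union if union else 1.0


# Two made-up 8-bit fingerprints (hypothetical, not real molecules).
fp1 = 0b10110100
fp2 = 0b10010110
score = tanimoto(fp1, fp2)
print(score)
```

Identical vectors score 1 and vectors with no on-bits in common score 0, so the measure depends only on which bits are set, not on the length of the fingerprint.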
ResourceFunction["MoleculeFingerprintSimilarity"] takes the following options:
"FingerprintType"      "RDKit"       the algorithm to use when encoding the molecule
"SimilarityMeasure"    "Tanimoto"    the bit vector similarity measure to use
The option "FingerprintType" can be any of the following:
"AtomPairs"              atoms are typed based on atomic number, number of pi electrons and vertex degree; every pair of atom types, together with the distance between the two atoms, is hashed and the corresponding bit in the fingerprint is set
"MACCSKeys"              166-bit structural key descriptors in which each bit is associated with a SMARTS pattern
"MorganConnectivity"     extended-connectivity fingerprints; atoms are typed based on atomic number, heavy-atom degree, mass number and ring membership, and the neighborhood around each atom is used to set the bits
"MorganFeatures"         atoms are typed based on chemical features, such as H-bond acceptor/donor, aromaticity, acidity, etc.
"TopologicalTorsions"    similar to "AtomPairs", but rather than pairs of atoms, all sets of four consecutively bonded atoms are used to generate the bits
"RDKit"                  identifies all subgraphs within a particular range of sizes, hashes each subgraph to generate a raw bit ID, reduces that raw bit ID modulo the assigned fingerprint size and then sets the corresponding bit
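The hash-then-fold idea described for "RDKit" can be sketched in a few lines of Python. This is a toy illustration only, not the RDKit algorithm or the resource function's code: it enumerates only linear paths (not branched subgraphs), ignores bond orders, and uses zlib.crc32 as a stand-in hash. The function name toy_path_fingerprint and the parameters max_bonds and n_bits are hypothetical.

```python
import zlib


def toy_path_fingerprint(atoms, bonds, max_bonds=3, n_bits=64):
    """Set one bit per linear path of up to max_bonds bonds (toy illustration)."""
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    bits = 0

    def walk(path):
        nonlocal bits
        labels = [atoms[i] for i in path]
        # A path read backwards is the same subgraph, so pick one canonical label.
        canonical = min("-".join(labels), "-".join(reversed(labels)))
        raw_id = zlib.crc32(canonical.encode())  # raw bit ID from a deterministic hash
        bits |= 1 << (raw_id % n_bits)           # fold the raw ID into the fingerprint size
        if len(path) - 1 < max_bonds:
            for nxt in neighbors[path[-1]]:
                if nxt not in path:
                    walk(path + [nxt])

    for start in range(len(atoms)):
        walk([start])
    return bits


# Heavy-atom skeletons only (hydrogens and bond orders are ignored in this sketch).
ethanol = toy_path_fingerprint(["C", "C", "O"], [(0, 1), (1, 2)])
propanol = toy_path_fingerprint(["C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)])
print(f"{ethanol:064b}")
print(f"{propanol:064b}")
```

Because every path in the smaller skeleton also occurs in the larger one, every bit set for ethanol is also set for propanol here. Folding with the modulo can map different paths to the same bit, which is why a larger fingerprint size reduces collisions.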
The option "SimilarityMeasure" can be any of the following, where a_o indicates the number of on-bits in the bit vector a: