Toward a future of truly personalized cancer therapy

Targeted cancer therapies have produced substantial clinical results in the last decade, but most tumours eventually develop resistance to these drugs.

Benes, Engelman and colleagues have recently described in Science the development of a pharmacogenomic platform that facilitates rapid discovery of new drug combinations starting from human cancer samples.

This new approach could in the future help direct therapeutic choices for individual patients.

The study focused on non-small-cell lung cancers (NSCLCs) with activating mutations in epidermal growth factor receptor (EGFR) or anaplastic lymphoma kinase (ALK). These tumours are now routinely treated with specific tyrosine kinase inhibitors (TKIs), although they develop resistance within 1–2 years, through two main mechanisms:

‘gatekeeper’ mutations, which prevent target inhibition by the TKI;

‘bypass track’ mutations, which activate compensatory signalling pathways.

The authors generated cell lines directly from tumour biopsy samples using recent advances in cell culture methods. The cells were then subjected to a screen that combined the original TKI (to which the cells had become resistant) with a panel of 76 drugs targeted at key regulators of cell proliferation and survival (Fig 1).


Fig 1. Schematic of the screen workflow

The pharmacological screen identified multiple effective drug combinations and a number of previously undescribed mechanisms of resistance. For example, a cell line derived from an ALK-mutated cancer was resensitised to ALK inhibitors when these were combined with a MET tyrosine kinase inhibitor. Furthermore, the screen identified resistance mechanisms that would have been difficult to discover by genetic analysis alone. For example, ALK-mutated NSCLCs were found to often exhibit upregulated SRC tyrosine kinase signalling, without any evidence of mutations in SRC.

Several combination therapies identified in vitro were subsequently tested in xenograft models using the same cells and shown to be effective, indicating that the screen may indeed be predictive for in vivo activity.

Both the success rate of cell-line generation from biopsy specimens (50% in this study) and the time scale for establishing cell lines (2–6 months) will need to be improved for this approach to become clinically useful. However, once optimized, it may be used not only for NSCLC but also for other types of cancer, allowing truly personalized cancer therapy.

Original paper: Crystal, A. S. et al. Patient-derived models of acquired resistance can identify effective drug combinations for cancer. Science, 2014, 1480–1486.

Not all LogPs are calculated equal: CLogP and other short stories

The partition coefficient (P) of a material is the ratio of its equilibrium concentrations in two immiscible solvents, and logP is the base-10 logarithm of that ratio. We normally use octanol : water, but it could be any pair of immiscible fluids. This property is one of those chemical descriptors that pervades all aspects of ADMET and is used to filter and define the chemical space in which to work. Oddly, for such an important property, most projects and programs are built upon materials whose logP has never been experimentally determined, relying instead on predicted values generated by software.
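To make the definition concrete, here is a minimal sketch of how a logP value follows from two measured concentrations. The function name and the example concentrations are made up for illustration; the only thing taken as given is the standard definition, logP = log10([compound in octanol] / [compound in water]).

```python
import math

def log_p(conc_octanol, conc_water):
    """Return log10 of the octanol/water concentration ratio.

    Both concentrations must be in the same units and measured at equilibrium.
    """
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)

# Hypothetical shake-flask result: 100 times more compound in octanol than in water.
print(log_p(2.0, 0.02))
```

Each logP unit is a factor of ten in the concentration ratio, so a prediction that is off by one log unit is off by an order of magnitude in partitioning.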

Recently, our DMPK scientist presented a series of predicted logP values vs some that he expertly determined in the lab. Whilst the correlation was good in many cases, there were some significant outliers, so he came to ask me, the computational chemist, to see if I might explain why the calculated logP was so different. There were some obvious structural features that can beguile certain methods of calculating logP – yes, there is more than one method of calculating logP – and other methods might predict the outlier values in our case more closely.

Not All LogPs are Calculated Equal

When chemists talk about ClogP they are usually erroneously referring to “calculated” logP. To a CADD scientist, ClogP means something different – ClogP is a proprietary method (owned by BioByte Corp. / Pomona College) used to predict logP. Whilst there is a range of methods for prediction, there are three basic groups, and the vast majority of the current methods are flavours thereof:

Atomic (e.g. “AlogP”) & Enhanced Atomic / Hybrid (“XlogP”, “SlogP”)

Fragment / Compound (“ClogP”, KlogP, ACD/logP)

Property-based methods (“MlogP”, “VlogP”, “MClogP”, “TlogP”)

Atomic logP assumes that each atom contributes to the logP, and that the value for the whole chemical entity is purely additive. Crippen et al. first proposed such a method in a series of papers in the late 1980s, with the refined version dubbed “AlogP” [1]. The method is effectively a table look-up per atom, and there are plenty of free AlogP calculators available. It is suited to smaller molecules, particularly those with simple aromaticity or those which do not contain electronic systems known to make unexpected contributions to logP.
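The “table look-up per atom” idea can be sketched in a few lines. Note that the contribution values below are invented purely for illustration and are not the published AlogP parameters, which use many more atom types with fitted values; the function name is also hypothetical.

```python
# Hypothetical per-atom contributions, invented for illustration only.
# They are NOT the published AlogP parameters.
ATOM_CONTRIBUTION = {"C": 0.30, "N": -0.60, "O": -0.50, "Cl": 0.65}

def atomic_logp(atom_counts):
    """Sum per-atom contributions: a pure table look-up with no corrections."""
    return sum(ATOM_CONTRIBUTION[element] * count
               for element, count in atom_counts.items())

# Ethanol, heavy atoms only: two carbons and one oxygen.
ethanol = {"C": 2, "O": 1}
print(round(atomic_logp(ethanol), 2))
```

Because the sum ignores where each atom sits in the molecule, two isomers with the same atom counts get the same value, which is exactly the weakness the hybrid methods try to address.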

Enhanced atomic or hybrid logP (XlogP, SlogP etc.) is a modification of the AlogP system that tries to address the shortcomings of atomistic approaches for larger systems. It takes the value of each atom type, a contribution from its neighbours, and correction factors that help sidestep known deviations in purely atomistic methods. This is an attempt to allow for larger electronic effects. It is fast, being a table look-up technique, and many free software packages use it too. The smarter hybrid algorithms know the state of each atom and thus how much of a contribution its neighbours add.
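As a rough sketch of what “neighbour-aware” means in practice, the toy example below keys each atom's contribution on its element plus the elements bonded to it, and then applies one molecule-level correction. The atom types, all numbers and the correction rule are hypothetical and do not come from XlogP or SlogP.

```python
# Heavy-atom graph for ethanol (C-C-O): one element per atom plus a bond list.
ELEMENTS = ["C", "C", "O"]
BONDS = [(0, 1), (1, 2)]

# Hypothetical contributions keyed by (element, sorted neighbour elements).
TYPED_CONTRIBUTION = {
    ("C", ("C",)): 0.50,       # terminal carbon
    ("C", ("C", "O")): 0.10,   # carbon bearing an oxygen
    ("O", ("C",)): -0.70,      # hydroxyl-like oxygen
}
# Hypothetical correction applied once if any C-O bond is present.
C_O_CORRECTION = -0.05

def neighbours(index, bonds):
    """Return the indices of atoms bonded to the atom at `index`."""
    return [b if a == index else a for a, b in bonds if index in (a, b)]

def hybrid_logp(elements, bonds):
    """Sum neighbour-dependent atom contributions, then add the correction."""
    total = 0.0
    for i, element in enumerate(elements):
        environment = tuple(sorted(elements[j] for j in neighbours(i, bonds)))
        total += TYPED_CONTRIBUTION[(element, environment)]
    has_c_o = any({elements[a], elements[b]} == {"C", "O"} for a, b in bonds)
    return total + (C_O_CORRECTION if has_c_o else 0.0)

print(round(hybrid_logp(ELEMENTS, BONDS), 2))
```

It is still a table look-up, so it stays fast, but the same element now scores differently depending on its surroundings.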

Fragment / Compound logP uses a dataset of experimentally determined values for full compounds or fragments, which is modelled using QSPR or other regression techniques at the level of small fragments rather than per atom. Fragment contributions are then added up, with correction factors. The rationale here is that sometimes atomistic approaches do not adequately model the nuances of electronic or intramolecular interactions, which may be better modelled by using whole fragments. This method tends to be better for systems with complex aromaticity, and larger molecules – on the condition that the molecule contains features that are similar to those from which the modelling was conducted. If your molecules contain very obscure motifs, the model from which the prediction is made may not correlate well.
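The fragment approach, including its main limitation, might be sketched as follows. The fragment names, values and the correction factor are hypothetical placeholders, not parameters from ClogP, KlogP or ACD/logP.

```python
# Hypothetical fragment contributions and correction factor, for illustration only.
FRAGMENT_CONTRIBUTION = {"phenyl": 1.90, "carboxylic_acid": -1.10, "methyl": 0.55}
CORRECTIONS = {"intramolecular_h_bond": 0.60}

def fragment_logp(fragments, corrections=()):
    """Sum fragment values, then add any named correction factors.

    Raises KeyError for a fragment the model has no value for, mirroring
    the way fragment methods struggle with motifs outside their training data.
    """
    total = sum(FRAGMENT_CONTRIBUTION[name] for name in fragments)
    return total + sum(CORRECTIONS[name] for name in corrections)

# A benzoic-acid-like decomposition into two known fragments.
print(round(fragment_logp(["phenyl", "carboxylic_acid"]), 2))
```

A real implementation would return a flagged estimate rather than fail outright on an unknown fragment, but the failure here makes the point: the prediction is only as good as the fragment library behind it.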


Property-based methods…
There is a whole host of methods for predicting logP using properties, empirical approaches, 3-D structures (e.g. continuum solvation models, MD models, molecular lipophilicity potential etc.), and topological approaches. Most of these methods are fairly computationally intensive, and are buried in the world of informatics and stats, but one is worthy of particular note: Moriguchi’s method (or MlogP), which used the sum of lipophilic atoms and the sum of hydrophilic atoms as the two basic descriptors in a regression model that was able to explain nearly 75% of the variance in experimentally determined logP values for a dataset of 1230 compounds [2]. The group later added 11 correction factors, and the model then explained 91% of the variance. It is very fast, so historically it was employed for large datasets and was included in several property prediction packages, such as Dragon and ADMET Predictor (Simulations Plus, Inc.). Nowadays, as computational speed has increased, MlogP is used less, because more accurate methods have become manageable even at large library sizes.
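For readers less used to regression jargon, “explains 75% of the variance” is the R² statistic. The sketch below shows a generic two-descriptor linear model and how R² is computed from observed and predicted values. The coefficients and data are hypothetical, and the linear form is a simplification: the published MlogP equation has its own fitted coefficients and functional form, which are not reproduced here.

```python
def two_descriptor_model(lipophilic_atoms, hydrophilic_atoms, a, b, c):
    """Generic linear model on two atom-count descriptors (hypothetical form)."""
    return a * lipophilic_atoms + b * hydrophilic_atoms + c

def variance_explained(observed, predicted):
    """R^2: one minus residual sum of squares over total sum of squares."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_residual / ss_total

# Hypothetical measured and predicted logP values for four compounds.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
print(variance_explained(observed, predicted))
```

An R² of 0.75 still leaves a quarter of the variance unexplained, which is why the later correction factors made such a difference.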

So, which method do you use?

Biovia’s Pipeline Pilot and Discovery Studio sport a version of AlogP, and Knime has multiple free XlogP and AlogP calculator plug-ins. CCG’s MOE uses both an unpublished atomic model (Labute) and a hybrid SlogP. DataWarrior uses ClogP; Dotmatics / Vortex natively use XlogP, but you can patch in others. Cresset BMD’s offerings use SlogP and Optibrium’s StarDrop uses a fragment method. ChemAxon uses multiple methods (including hybrid (VG) and fragment, e.g. KlogP), and if you have their InfoCom nodes in Knime you can use multiple methods and weight them according to your understanding. Better yet, if your group has the resource to experimentally determine a few of your own logPs, you can do a quick correlation check across the methods against known data in your series and then weight your model accordingly.
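A quick check of several methods against a handful of measured values could look something like the sketch below. The method names and all numbers are hypothetical, and inverse-RMSE weighting is just one simple scheme chosen for illustration; it is not a description of how any of the packages above combine their methods.

```python
import math

def rmse(predicted, measured):
    """Root-mean-square error between predicted and measured logP values."""
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))

def inverse_error_weights(predictions_by_method, measured):
    """Weight each method by 1/RMSE, normalised so the weights sum to 1."""
    inverse = {name: 1.0 / max(rmse(values, measured), 1e-9)
               for name, values in predictions_by_method.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}

def consensus(predictions_for_compound, weights):
    """Weighted average of the per-method predictions for one compound."""
    return sum(weights[name] * value
               for name, value in predictions_for_compound.items())

# Hypothetical measured logPs for three compounds in a series,
# with predictions from two placeholder methods.
measured = [1.0, 2.0, 3.0]
predictions = {
    "method_a": [1.1, 2.1, 3.1],
    "method_b": [1.4, 2.4, 3.4],
}
weights = inverse_error_weights(predictions, measured)
print(weights)
print(consensus({"method_a": 2.0, "method_b": 3.0}, weights))
```

With only a few measured compounds the weights will be noisy, so it is worth re-running the check as more experimental values come in.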

As a rule (to which there are exceptions):

Simple small molecules (e.g. fragment sized) – AlogP will probably perform just fine, but a hybrid method would be better.

Complex but standard small molecules (the normal development type med chemists love) – fragment / compound logP methods will often be the most accurate. Hybrid methods are your second-best option (but still reasonably good).

Complex, non-standard molecules (with rare motifs) – a hybrid system or fragment-based logP may be equally good (or bad); it depends on the model on which the fragment logP is based. You could also get your team to determine some experimentally and see if you can’t build yourself a model…

For statistical insight into many state-of-the-art and classical methods, and how well they perform across large experimentally determined sets, see Mannhold et al.’s thorough review [3].

So, to conclude, not all logP prediction models are built equal and there will be times when some models exceed others in accuracy, depending on your chemistry. Hopefully now you’ll at least be able to explain in your group meetings why your predicted logPs were way off…


  1. Ghose, A.K.; Crippen, G.M. Atomic physicochemical parameters for three-dimensional-structure-directed quantitative structure-activity relationships. 2. Modeling dispersive and hydrophobic interactions. J. Chem. Inf. Comput. Sci. 1987, 27, 21–35.
  2. Moriguchi, I.; Hirono, S.; Liu, Q.; Nakagome, I.; Matsushita, Y. Simple method of calculating octanol/water partition coefficient. Chem. Pharm. Bull. 1992, 40, 127–130.
  3. Mannhold, R. et al. Calculation of molecular lipophilicity: state-of-the-art and comparison of log P methods on more than 96,000 compounds. J. Pharm. Sci. 2009, 98, 861–893.