This article refers to the paper:
“Sharing Chemical Relationships does not Reveal Structures”
M. Matlock and S. J. Swamidass, J. Chem. Inf. Model. 2014, 54, 37-48.
Sharing data, projects and programmes between organisations is becoming increasingly common as companies look to find more value in expert-centric models. Trust remains a stumbling block, however, and a prime example is the unwillingness of some organisations to share structural information on proprietary screening libraries at the very early stages of a collaboration. Blind screening is gaining pace as a way around this. Either the library owner screens your target and reports back hits under agreement, or it sends you a blinded, diverse representation of its library on plates; if there are any hits, you feed them back so that the owner can pick out similar materials. This gives screeners access to large, well-curated libraries and gives library owners access to new targets.
The problem with sharing data in this way is that project leads and chemists would like to be able to find similar materials within the library themselves, both to help prioritise initial hits and to reduce the delays involved in going back to the library provider to request additional materials. As a result, several papers have attempted to address how to “blind” structures while still leaving in information that helps project leaders pick follow-up materials. One such paper is that of Matthew Matlock and Joshua Swamidass (cited above), which details an interesting way of blinding a library whilst still empowering project decision makers to pick materials similar to their hits using relationship metrics, without relying on the external library provider.
Imagine that, for a given screen, a company gives you a blinded plate of materials as a diversity sub-set of its library. You screen these and some of the wells are hits. If, with the plate, you were given a data sheet listing, for each well, the serial numbers of similar materials within the company's library (along with the similarity metric), you could then pick out those that you wished to follow up. This saves the company time and also allows you to weight your selection (e.g. Hit One appears far more potent than Hit Two, so can we have all the similars to Hit One and a couple of the similars to Hit Two).
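To make the scenario concrete, here is a minimal sketch of how such a data sheet might be used on the screening side. Everything in it is an assumption for illustration: the sheet layout (well ID mapped to a list of serial number and similarity pairs), the serial numbers, and the function name pick_follow_ups are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical data sheet: blinded well ID -> [(library serial number, similarity), ...]
NeighbourSheet = Dict[str, List[Tuple[str, float]]]


def pick_follow_ups(sheet: NeighbourSheet,
                    quotas: Dict[str, Optional[int]]) -> List[str]:
    """Return serial numbers to request from the library owner.

    quotas maps each hit well to the number of similars wanted;
    None means "all similars listed for that well".
    """
    picks: List[str] = []
    seen = set()
    for well, quota in quotas.items():
        # Most similar compounds first.
        ranked = sorted(sheet.get(well, []), key=lambda pair: pair[1], reverse=True)
        limit = len(ranked) if quota is None else quota
        taken = 0
        for serial, _similarity in ranked:
            if taken >= limit:
                break
            if serial in seen:
                continue  # already requested via another hit
            seen.add(serial)
            picks.append(serial)
            taken += 1
    return picks


# Illustrative values only: Hit One is well A1, Hit Two is well B4.
sheet = {
    "A1": [("LIB-0007", 0.91), ("LIB-0042", 0.78), ("LIB-0113", 0.66)],
    "B4": [("LIB-0042", 0.88), ("LIB-0205", 0.81), ("LIB-0310", 0.70)],
}
# All similars to the more potent Hit One, a couple for Hit Two.
print(pick_follow_ups(sheet, {"A1": None, "B4": 2}))
```

The request that goes back to the library owner is just a list of serial numbers, so no structures change hands in either direction.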
Given that most chemoinformatics systems used for picking follow-up hits convert structural data to relationship data (e.g. similarities) in order to pick the next round of materials, this kind of information is very useful, even when structurally blinded.
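As an illustration of what converting structural data to relationship data can look like, the sketch below computes Tanimoto similarities between fingerprints and keeps only serial numbers and scores. It rests on several assumptions: fingerprints are represented as sets of "on" bit indices, the Tanimoto coefficient is used as the similarity measure (a common choice, not necessarily the one used in the paper), and the names tanimoto and neighbour_sheet are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # assumed representation: indices of the bits that are set


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def neighbour_sheet(plate: Dict[str, Fingerprint],
                    library: Dict[str, Fingerprint],
                    top_k: int = 3) -> Dict[str, List[Tuple[str, float]]]:
    """For each plate well, list the top_k most similar library serial numbers.

    Only serial numbers and similarity scores appear in the output;
    the fingerprints themselves stay with the library owner.
    """
    sheet = {}
    for well, fp in plate.items():
        scored = [(serial, round(tanimoto(fp, lib_fp), 3))
                  for serial, lib_fp in library.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        sheet[well] = scored[:top_k]
    return sheet


# Toy fingerprints, invented for illustration.
plate = {"A1": frozenset({1, 2, 3, 4})}
library = {
    "LIB-1": frozenset({1, 2, 3, 4, 5}),
    "LIB-2": frozenset({1, 2}),
    "LIB-3": frozenset({9}),
}
print(neighbour_sheet(plate, library, top_k=2))
```

This shows only the mechanics of producing relationship data. Whether a given set of shared similarities can be reverse engineered into structures is the question the paper itself examines, and this sketch makes no claim about that.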
Swamidass and Matlock detail several approaches to the secure transmission of chemical relationships: similarity neighbours, scaffold trees and networks (which allow for sub-structural similarity distances and not just have/have-not functionality metrics), and R-group networks (see figure below). They assess the information density of each method and how well it resists reverse engineering of the shared data into structural insight.
They conclude that it is possible to communicate useful chemical information without sending structural data, and that similarity data are in fact among the key data used to select follow-up materials. The authors also conclude that materials with very simple structures and high symmetry are more vulnerable to reverse engineering from some relational data systems than more complex molecular motifs, but that sharing relationship data dispels much of the insecurity of sharing structural descriptors.
This paper, and others like it, re-ignite the debate on treating curated chemical libraries as items to share and collaborate with, rather than to shield and hide. Sharing libraries in this way also helps project leads feel more empowered and informed when screening external libraries. Though still in its infancy, the deployment of such secure library systems could open doors to easier and faster collaboration between organisations, with benefits across many chemical domains.