Cellfishing is a web service that predicts the sensitivity of multiple cell lines to a query compound. Predicting cell-line sensitivity to chemical compounds is an important step in optimizing in vitro assays in the drug discovery process.

The method is a direct similarity-based strategy: the query compound is compared against the compounds in our internal database to predict compound-cell line interactions. The current database comprises 129808 ligand-cell line associations classified as sensitive, covering 950 cell lines.

Example query (SMILES): CC(=O)O[C@H]1CC[C@@]2(C)[C@@H](CC[C@]3(C)[C@@H]2CC=C4[C@@H]5CC(C)(C)CC[C@@]5(CC[C@@]34C)C(=O)OCc6ccccc6)[C@]1(C)C=O

The output is a table with the following columns:

  • Col 1: Rank (order number)
  • Col 2: Cell line name as given in the ChEMBL database
  • Col 3: Average Tanimoto similarity between the query compound and the three most similar database compounds to which the listed cell line is sensitive
  • Col 4 and Col 5: Links to the cell line's entries in the ChEMBL and Cellosaurus databases, respectively

Only cell lines with a similarity greater than or equal to 0.35 are considered “sensitive” to the query compound.
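The score in Col 3 and the 0.35 cutoff can be illustrated with a minimal Python sketch. It is not the service's actual implementation: fingerprints are represented here as plain sets of on-bit indices, the names `tanimoto`, `cell_line_score`, and `rank_cell_lines` are hypothetical, and two points are assumptions (that the cutoff applies to the Col 3 average, and that cell lines with fewer than three known compounds are averaged over what is available).

```python
from typing import Dict, List, Set, Tuple

THRESHOLD = 0.35  # cutoff quoted in the text
TOP_K = 3         # number of most similar compounds averaged (Col 3)


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = a | b
    if not union:
        return 0.0  # assumption: two empty fingerprints score 0
    return len(a & b) / len(union)


def cell_line_score(query: Set[int], actives: List[Set[int]], top_k: int = TOP_K) -> float:
    """Mean Tanimoto similarity between the query and its top_k nearest actives."""
    sims = sorted((tanimoto(query, fp) for fp in actives), reverse=True)
    top = sims[:top_k]
    # Assumption: with fewer than top_k compounds, average over those available.
    return sum(top) / len(top) if top else 0.0


def rank_cell_lines(
    query: Set[int],
    database: Dict[str, List[Set[int]]],
    threshold: float = THRESHOLD,
) -> List[Tuple[str, float]]:
    """Return (cell line, score) pairs with score >= threshold, best first."""
    scored = [(name, cell_line_score(query, fps)) for name, fps in database.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy data: invented fingerprints, not real compounds or cell lines.
toy_query = {1, 2, 3, 4}
toy_database = {
    "CELL-A": [{1, 2, 3, 4}, {1, 2, 3}, {1, 2}, {9}],
    "CELL-B": [{7, 8}, {4, 9}],
}
toy_ranking = rank_cell_lines(toy_query, toy_database)
```

With this toy data, CELL-A scores (1.0 + 0.75 + 0.5) / 3 = 0.75 and is kept, while CELL-B scores (0.2 + 0.0) / 2 = 0.1 and falls below the cutoff. In practice the fingerprints would come from a cheminformatics toolkit applied to the query SMILES.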

For technical information about the algorithm, you can refer to:

E. Tejera et al., “Cell fishing: A similarity based approach and machine learning strategy for multiple cell lines-compound sensitivity prediction,” PLoS One, vol. 14, no. 10, pp. 1–11, 2019, DOI: 10.1371/journal.pone.0223276.