Primary job responsibilities:
I provide cheminformatics and data science support to high-throughput screening (HTS) programs. This includes designing computational filters to select compounds for secondary follow-up (cherry picks), suggesting structural modifications to improve potency and reduce undesirable side effects of promising compounds (lead optimization), calculating molecular properties, and modeling structure–activity relationships.
I also develop computational infrastructure (databases, application programming interfaces, user interfaces) to support new screening paradigms, and I develop and implement new algorithms and predictive models to support small-molecule data analysis. This includes modeling activity cliffs (chemically similar compounds that have very different activities), characterizing chemical spaces (high-dimensional abstract spaces that “contain” all the compounds sharing a specific group of properties), and integrating structure data with other molecular and genomic data.
I work from home and travel to my office in Rockville, MD, at least once a month. Because I work from home, my work environment is set up exactly the way I want it. I attend two or three conferences a year, generally because I'm presenting my research.
The NCATS lab environment is very open and collegial. There is very little hierarchy and a minimum of imposed schedules. It's easygoing, and as long as the work gets done, there are no issues. Once in a while we have deadlines and have to work a few extra hours, but in general there is no extraneous pressure.
- Attending meetings: 1–5 hours
- Planning experiments: 11–15 hours
- Running experiments: 11–15 hours
- Analyzing data: >20 hours
- Writing reports: 6–10 hours
- Consulting with colleagues: 6–10 hours
Tools you can’t live without:
I use a Mac running OS X, with Emacs (a text editor for programmers), the R statistical programming environment, and the Python and Java programming languages.
What you like most about your job:
The range of biological systems and conditions (rare disease, infectious disease, etc.) that I get to work with via collaborations is unparalleled. This presents many learning opportunities and exposure to a lot of cutting-edge science. Importantly, I have the freedom to talk about my own work. This includes being able to blog about my work, present it at conferences, and release source code and data. I also have the flexibility to follow up on ideas that may not be immediately applicable to ongoing projects. Once they do become relevant, I can collaborate with experimental colleagues to run validation experiments. There is an open environment where I can chat with chemists and biologists to expand my knowledge of non-computational topics.
I am also active in the open-source software community. I have been involved in the Chemistry Development Kit (CDK) open-source Java library for about ten years now, and I am a co-founder of the Blue Obelisk group — an informal group of chemists who promote open data, open source, and open standards.
Advice you would give to undergrads interested in following in your footsteps:
Learn how to code well — so well that when the time comes, you think about the problem you are solving rather than about writing the code. Also, pay attention to statistics. And it goes without saying, know your chemistry.
Skills or talents that make you a good fit for your job:
Attention to detail and the ability to translate a problem statement into a computational or mathematical form.
How you've benefited from being an ACS member:
Being involved with the ACS Division of Chemical Information has been very useful for building a network of colleagues, finding opportunities to be involved in leadership activities, and giving back to the cheminformatics community.