As a Data Scientist, you will create, curate and help expand the world’s largest database of clinical trials data. There is a lot of data out there, and we are doing our best to make sense of it all. You will need programming skills to further automate cleaning text-heavy data formats and extracting quantitative data from them. Familiarity with machine learning and with specialized techniques for categorizing text-heavy data is a big plus.
- Two or more years of professional experience working as a data scientist
- Experience with command-line scripting, data structures and algorithms, plus the ability to work in a Linux environment and process large amounts of data in the cloud
- Strong knowledge in at least one of the following fields: machine learning, data visualization, statistical modeling, data mining, or information retrieval
- Proficiency with data extraction, processing and analysis packages (e.g. R, SAS, MATLAB), programming languages (e.g. Java, Python, Ruby), and data storage, processing and search technologies such as Hadoop, MongoDB, Storm, SQL and Solr
Support a cross-functional team and provide in-depth data insights for complex business problems that can be approached with advanced analytics.
- Explore, examine and interpret large volumes of data in various forms
- Analyze structured and unstructured data using advanced statistical and mathematical techniques
- Develop data structures and pipelines to collect, organize and standardize data in ways that help generate insights and address reporting needs