DATA SCIENTIST WANTED
We are hiring a data scientist or data-savvy environmental scientist to join MacroSheds, a study of comparative ecosystem biogeochemistry at continental scales.
What is MacroSheds? This NSF-funded project will enable anyone with internet access to compare the flow and chemistry of hundreds of streams throughout the United States and to explore their watersheds. It will combine data sets from many separate research projects into an attractive website that makes the data available. This will make it easy for scientists and students to generate questions about water quality and river flow patterns across the continent. Researchers will use these data to study what types of watersheds are best at retaining nutrients, are recovering most rapidly from decades of acid rain, have the highest erosion rates, and have flow patterns that are least sensitive to floods and droughts. The lessons we learn from studying many watersheds and streams will contribute to more effective management of our nation’s water and forest resources.
Much of the literature of watershed ecosystem science over the last decade has focused on gaining ever finer detail of spatial heterogeneity within watersheds. This fine-scale focus has identified many idiosyncrasies of individual watersheds but has not helped us develop general theories about watershed dynamics. Most watershed ecosystem studies remain rather parochial, involving detailed studies of individual or paired watersheds, or surveys of a small set of attributes across multiple watersheds. Macrosystem watershed science, or the search for general principles that describe functional capacity and behavior across watersheds, has been limited. A major reason for this lack of large-scale focus is the challenge of data access and integration across sites. Our goal in this project is to compile a dataset that merges all US watershed ecosystem studies into a common platform (MacroSheds) and to enable and train a new generation of watershed ecosystem scientists in the art and practice of macroscale watershed ecosystem science.
Candidate description: The central informatics goals of this project are to centralize and harmonize data (sensor time series, geographic data, metadata) from diverse sources, develop a cloud-based open data management and exploration platform, and allow users to access, clean, analyze, and visualize the data sets housed within it. The successful candidate will have interest and experience in one or more of the following disciplines: data engineering, analytics, data visualization, software development, GIS. A graduate degree in data science, computer science, or an environmental science is desired but not required. The position includes support for the candidate to attend professional meetings and training workshops, and the opportunity to work with collaborators at the National Ecological Observatory Network (NEON), Colorado State University, and Duke University.
Key tasks will include some subset of the following, depending on the applicant’s skill set and interests:
- Development of interactive web visualizations (using one or more of: Shiny, D3, Dygraphs, Bokeh, Highcharts).
- Development and scheduled execution of scripts to pull data from web APIs and FTP servers.
- Data munging, cleaning, harmonization, and database I/O.
- Programmatic collection and summarizing of geographic data.
- Python web development (Flask).
- Web scraping.
Ideal candidates will have experience with three or more of the following:
- R
- Python
- any database query language
- Mac/Linux shell commands (Bash), especially executed remotely
- Git
- HTML, CSS, JavaScript
- Google Earth Engine (JS or Python versions)
- Watershed terrain analysis
The successful applicant will work closely with the project’s full-time data scientist, Mike Vlah, who will provide regular support and guidance.
To apply:
Submit a cover letter (including a statement of interest and qualifications), curriculum vitae, and contact information for two references to [email protected]. Review of applications will begin immediately and will continue until the position is filled.