LanzaTech’s unique, multi-award-winning technology captures and reuses waste carbon emissions as a resource for low carbon fuel and chemical production. This approach has no impact on land or the food value chain and enables sustainable economic growth for industry while promoting energy security. The company, named one of America’s most promising companies by Forbes Magazine in 2013, has received numerous sustainability awards, including being listed on the Sustainia 100 in 2013 and the Global Clean Tech 100 for the past 3 years. LanzaTech currently holds the #2 and #4 positions in Biofuels Digest’s Hot 50 rankings for bioenergy and Hot 30 rankings for renewable chemicals production, respectively.
LanzaTech has developed commercial partnerships with global companies in the petrochemical business, including oil majors Petronas and Indian Oil Corporation, steel industry giants Harsco and Siemens VAI, chemical companies including INVISTA, and global transportation companies including Virgin Atlantic. These partnerships cut across the full supply chain – from resources through to end users – positioning LanzaTech to become a global leader in low carbon commodity fuels and chemicals.
About the role:
This role requires the ability to work in an open and fast-paced environment. It is responsible for building and maintaining data processing systems and databases as well as analytics infrastructure. This infrastructure is used for quantitative research, backtesting, parameter calibration, and data modeling.
- Designing data storage solutions
- Assist in data warehouse design and schema development
- Design and develop complex SQL queries to support analysis
- Ability to read, analyze and digest what a business wants to accomplish with its data, and design the best possible data processing tools around those goals
- Data Processing
- Development of data extraction from internal/external data sources
- Develop parsers, loaders and data manipulation tools to support the collection and housing of data across the organization (internal sources, external commercial sites, and 3rd party partners)
- Experience working on systems that handle high volumes of data
- Cleanse, de-duplicate and normalize data
- Development and maintenance of scientific data processing and infrastructure
- Troubleshooting production issues and automating processes to improve data processing efficiency
- Working knowledge of near real-time data processing.
- Data Modeling/Simulation
- Design and build simulation systems for advanced analytics (e.g. Hadoop, Hive, Spark)
- Data transformation from warehouse to simulation for scientific testing
- Analyze and compare new data sources and make recommendations
- Knowledge of statistical computing and modeling technologies, such as R and MATLAB
- Managing queries and directing them to the appropriate data sources
- Data Quality
- Proactive data quality checks & alerts (real-time)
- Build validation to ensure all data adheres to strict quality specifications
- Process data within thresholds
- Data transfer analysis and alerting on failures
- Identify data anomalies / outliers
- Develop software development standards – full life cycle
- Define and develop a software development framework – repository, training, code release cycle, change management, etc.
- Continuous integration and testing
- Consolidate the code base into a single repository with check-in/check-out and version control
- Development support for data processing and simulation / cluster computing
- Application development to support the scientific life cycle from concept to product
- Provide / write to 3rd party APIs for data transfer and integration
- Web services / unified application for scientists to interface with
- Exploration / evaluation and development of visualization tools
- Data schema / data parser development
- Software performance benchmarking and analysis
Qualifications and Experience
- Technical comprehension of several data warehouse architectures, such as EDW, ODS, DM, and relational or multidimensional online analytical processing (ROLAP and MOLAP)
- Exceptional programming experience with Python, Perl, C++, Java, XML, .NET and/or shell scripting
- Exceptional skill in troubleshooting and solving complex technical problems
- Experience with visualization technology and frameworks
- A strong work ethic, excellent communication skills and the ability to collaborate closely with Science Teams
- Strong verbal and written communication and interpersonal skills
Nice to Have
- Experience working with biological data and next-generation sequencing data
- Experience working with database schemas that are based on ontologies and controlled vocabularies
- Industry experience highly desirable
- Exposure to statistical learning highly desirable
- Exposure to Pipeline Pilot or KNIME
- Experience as a user of standard bioinformatics tools such as EME, BLAST, etc.
- Exposure to integrating metabolomic, genomic, proteomic and transcriptomic data sets
This position is open to candidates authorized to work in the United States on a full-time basis for any employer. LanzaTech is an Equal Employment Opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, or national origin.