Home Base seeks to hire a data engineer, starting April 2021, to support a variety of database management, ETL scripting, and data validation tasks, including but not limited to: querying databases, restructuring data, cleaning and validating data, performing manual ETL tasks, automating ETL tasks using tools and custom scripting, managing and monitoring full pipelines, improving systems and processes, and documenting data systems. The qualified candidate will be highly detail-oriented and have a strong interest in and aptitude for data management and engineering. Some specific focus areas will be determined based on the candidate's skills and interests.
The successful candidate must be highly organized, motivated, and able to thrive in a fast-paced team environment and must enjoy the challenge of a dynamic environment with evolving needs. It is extremely important that the candidate possess the ability to carefully keep track of multiple work streams.
Relevant activities include, but are not limited to, the following:
Achieving an extremely detailed understanding of our current data ecosystem, including its structure, data meaning, history, flow/processing, and challenges
Utilizing, improving, and constructing ETL tools
Running current SQL, Python, PHP, and/or Tableau Prep ETL scripts
Using various monitoring and evaluation methods to validate that data flowing through these pipelines is accurate, and troubleshooting and addressing issues when they are discovered
Improving these scripts (ETL and validation) and integrating them further into various data pipelines to achieve greater efficiency, reliability, and functionality
Constructing new ETL tools as necessary/able, including a major rewrite of a family of old PHP pipelines in Python
Writing queries and scripts to identify data quality problems
Investigating the root cause of data quality problems
Working with the relevant team members to determine appropriate data remediation and process improvement plans
Developing queries and scripts as needed to repair data in bulk
Supporting a dashboard that automatically monitors for certain critical data quality problems in production, independent of ETL processes
Supporting the team as needed with data querying, processing, analysis, and reporting for both regular and ad-hoc requests from clinical, executive, and external audiences
Researching potential new data engineering solutions, analyzing feasibility, and assisting technical leadership in road-mapping the evolution of our data infrastructure
Creating and maintaining documentation across our data ecosystem
Degree in Health Informatics, Computer Science, Statistics, Mathematics, Engineering, or a similar field
Familiarity with behavioral health clinical practice and/or research preferred
Procedural programming for data manipulation using Python, NumPy, and Pandas
PHP, Java, or other languages are a plus
Knowledge of relational database platforms and data modeling
Comfortable extracting data from and loading data into sources ranging from an Enterprise Data Warehouse to an Excel or text file, using built-in tools or custom-written ETL scripts
Knowledge of data aggregation and transformation processes (e.g. pivot, merge, union, hierarchical grouping, aggregation functions)
Above-average SQL skills (e.g. familiarity with subqueries, multiple joins, and grouping), specifically in MySQL; SQL Server experience is a plus
Comfortable with complex multi-stage, multi-technology ETL pipelines
Comfortable using APIs to transmit data in both an ad-hoc and automated manner
Familiar with concepts/tools of Data Quality Management as well as Data Governance practices
Ability to interpret and follow through on data requirements with strong attention to detail
Strength in independently validating and debugging code and analyses, including consulting documentation, Stack Exchange, etc.
Demonstrates personal initiative and time management skills, as well as the ability to work effectively and kindly as part of a team
Excellent verbal and written communication skills
Familiar with agile software development methodologies
Interest in identifying process improvement opportunities is a plus
LICENSES, CERTIFICATIONS, and/or REGISTRATIONS:
Undergraduate degree in Health Informatics, Computer Science, Statistics, Mathematics, Engineering, or a related subject.
Graduate degree in one of the above.
Preferred coursework would include most of the following:
Intermediate Databases and SQL
Intermediate Programming (Procedural and/or OO)
Data Structures and Algorithms
Data Quality Management
Data Flow and Automation
Agile Project Management
Equivalent Experience – Equivalent time and aptitude achieved through work experience may substitute for some of the preferred courses listed above.
2+ years of experience in data management in a healthcare/clinical setting; however, recent or anticipated college graduates will be considered.
100% remote through Aug 31, 2021; up to 100% remote afterwards, TBD.
Massachusetts General Hospital is an Equal Opportunity Employer. By embracing diverse skills, perspectives and ideas, we choose to lead. Applications from protected veterans and individuals with disabilities are strongly encouraged.
Primary Location: MA-Charlestown-MGB OCC
Work Locations: MGB OCC, One Constitution Center, Charlestown 02129
Job: IT/Health IT/Informatics-Engineer
Organization: Massachusetts General Hospital (MGH)
Standard Hours: 40
Shift: Day Job
Employee Status: Regular
Recruiting Department: MGH Psychiatry
Job Posting: Jul 6, 2021