Big Data Engineer
8 Grandview Ave Canonsburg, PA 15317
This job designs and engineers solutions associated with enterprise data for the organization and, working closely with the business, clinical, analytics and IT teams, assists with the definition, build, and upkeep of these solutions. This includes coding data ingestion, streaming, transformation, structuring, and caching, and optimizing data delivery so that analysts and applications can access operational, derived, and external data sets. The initial focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them.
- Create and maintain conceptual, logical and physical data models of data assets & resources
- Establish and enforce data standards, patterns and optimizations to enable teams to deliver high performance analytics and applications. Collaborate with data science and analytic teams on future data needs, setting the roadmap for the architectural runway and managing expectations of the business and product teams.
- Ensure data flows and related information are tracked, managed and documented. Document data models and build out the broader data ecosystem using a variety of tools. Ensure standards are met for data engineering and delivery efforts.
- Lead the construction of a system that delivers data to the enterprise for both analytical and operational functions.
- Discover, review, and influence new and evolving design, architecture, and standards for building and delivering unique services and solutions.
- Investigate, design, and implement best-in-industry, innovative technologies that will expand Inovalon’s infrastructure through robust, scalable, adrenaline-fueled solutions.
- Be the expert in service reliability and sustainability.
- Develop, gather, and leverage metrics to manage complex computing systems to drive automation, improvement, and performance.
- Take responsibility for detailed design, analysis, testing, and optimization.
- A Bachelor’s Degree in Computer Science, Information Technology, Information Science, or a related discipline, or equivalent experience, is required.
- Must have 5-10 years of experience in Data Management & Analytics.
- Healthcare/Pharmaceutical related industry experience is a plus.
- 5-7 years in big-data management – Hadoop preferred
- 5-7 years supporting Data Analytics & Visualization – Tableau, Power BI, etc.
- 3-5 years in Data Warehousing
- 3-5 years in Database Administration
- Expert-level understanding of SQL queries, joins, stored procedures, and relational schemas, as well as NoSQL data storage and access
- Experience using the Lambda architecture for balancing timeliness, throughput, and fault tolerance when processing data
- Experience in various messaging systems, such as Kafka or RabbitMQ
- Experience in integration of data from multiple data sources such as Microsoft SQL Server, Oracle, and MongoDB
- Experience engineering real-time data streaming solutions
- Experience in Spark, Snowflake, or similar technologies
- Knowledge of various ETL techniques and frameworks, such as Flume
- Technology experience preferred: Hive, HBase, Sqoop, Ranger, NiFi, Spark, Phoenix, Spring Batch, Accumulo, Falcon, Atlas
- Experience with virtualization technologies is expected
- Good knowledge of big data querying tools, such as Pig, Hive, and Impala