We at the IBM Chief Information Office (CIO) are a dynamic group of Business, Strategy and Technology professionals – a specific source of market-leading Industry Consulting, Application and Business process delivery following Agile values and principles.
As a Data Engineer, you play a vital role in building the right infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources.
Expert in setting up effective pipelines to capture data from multiple sources into enterprise-centric storage.
Comfortable building effective analytical tools that use the data pipeline to provide actionable insights into data synchronization, reporting, operational efficiency and related areas.
Strong knowledge of designing database models to store structured and unstructured data efficiently, and of creating effective data tools for analytics experts.
Knowledge of technologies such as Hadoop, Spark, Kafka, Scala, Python, etc.
Knowledge of relational databases (such as DB2, MySQL, Oracle, …) and NoSQL databases (MongoDB, Elasticsearch, …).
Knowledge of enterprise data lakes, data analytics, reporting, in-memory data handling, enterprise integration tools, etc.
Good understanding of industry best practices for data governance and security.
Good communication skills and fluency in English.
Work with stakeholders including the product owner, data and design teams to assist with data-related technical issues and support their data infrastructure needs.
Create and maintain optimal data pipeline architecture.
Identify, design, and implement process improvements aimed at automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Assemble large, complex data sets, including legacy structured data warehouses, that meet functional and non-functional business requirements.
Collaborate with the DevOps team to develop Continuous Integration/Continuous Delivery pipelines using containerization technologies.
Solve Big Data and Distributed Data Streaming problems using the latest technologies.
Enterprise data lakes, data analytics, reporting, in-memory data handling, enterprise integration tools, etc.
Good communication skills