Citi Big Data Advanced Analytics Data Sourcing and Ingestion Engineer in Irving, Texas

  • Primary Location: United States, Texas, Irving

  • Other Location: United States, Florida, Jacksonville

  • Education: None

  • Job Function: Technology

  • Schedule: Full-time

  • Shift: Day Job

  • Employee Status: Regular

  • Travel Time: Yes, 10% of the Time

  • Job ID: 16043020



Citi’s Global Consumer Bank is embarking on a strategic initiative to take advantage of innovations in data and advanced analytic technologies to enable an insight-driven business strategy. Together with our technology partners, we are evolving our current state architecture to incorporate next generation technologies built on the Hadoop platform. We are changing our way of working by embracing Agile methodologies to accelerate our delivery and productivity. Harnessing data to its maximum potential will enable Citi to advance our business towards a fully digital and mobile banking experience for our customers.

We are looking for a talented engineer to deliver a set of integrated capabilities that broadly cover data preparation, data ingestion, data federation and data processing, leveraging both open source and commercial tools. Our ambition is to transition from a batch-oriented organization to a real-time business. Our data and analytics platform is an integrated environment that provides traditional data warehouse and next generation Big Data capabilities. You will partner with the team that is building solutions for data acquisition, validation and storage, and create the processes that prepare data for analytics.


You will be part of a talented group of engineers driving the preparation of large volumes of complex data for global consumers, working with sophisticated architectures that integrate the EDW and Hadoop.

You will drive the next generation data integration strategy and roadmap / backlog by identifying tools and solutions, prototyping, validating and ultimately implementing them.

You will interface with our analytics teams to mature use cases, identify data and sourcing needs and help implement solutions iteratively using agile methodologies.

You will engage within the technology organization to integrate and embed analytic capabilities that enhance products and the customer experience.

You will partner with engineers, architects and the Business Intelligence organization to assemble capabilities into platforms that can address a variety of analytical needs including streaming analytics, log analytics and search.

You will contribute as an expert, while equally relying on the rest of the team to support you.

You will maintain knowledge of emerging trends related to data preparation, data ingestion, virtualization, real-time streaming and other areas.

You will proactively identify new and emerging technology solutions to advance business and technology strategies.

You will partner with operations teams to ensure systems integrity, performance and operational SLAs are met for data ingestion between the data warehouse and Hadoop environments.

You will communicate, market and raise awareness of data capabilities to increase adoption, streamline and automate data movement, and create efficiencies.



BA/BS degree in Computer Science or a related field is required; an advanced degree is preferred

3+ years of data analytics experience

Experience in assisting business teams to unlock value of data

Experience in quantifying and monetizing data platforms

Ability to influence others to adopt new ways of thinking, change habits and improve processes

Ability to translate complex technical topics into easy-to-understand concepts

Demonstrated track record of success in delivering advanced analytics platform capabilities

Direct experience in implementing data management processes, procedures, data quality management, and decision support

Hands-on experience with data ingestion and data management tools such as Sqoop, Hive and Impala is required

Deep understanding of the Hadoop ecosystem, real-time ingestion tools such as Flume and Kafka, and architectures such as Lambda

3-5 years of hands-on experience using tools such as Ab Initio, DataStage, Talend or other similar tools for pertinent ETL scenarios is required

Experience dealing with structured, semi-structured and unstructured data is required

Knowledge of modern data formats such as JSON, Avro and Parquet, as well as compression techniques and transport protocols, is required

Knowledge of metadata management solutions and/or experience integrating them into data movement pipelines for lineage tracking is a plus

Experience leveraging data preparation, data federation and data virtualization tools across heterogeneous data environments is a plus

Experience building out and managing an organization's data capabilities, including staff, tools, and processes is a plus

Experience working in a banking / financial services / fintech / retail environment is a plus