We are hiring a motivated and detail-oriented Data Software Engineer to join our team in enhancing a secure document management solution hosted on AWS. You will work alongside experienced professionals to support the development of an end-to-end information lifecycle platform built on technologies including AWS Glue, Amazon Athena, and Apache Spark. Your focus will be on improving the scalability, performance, and reliability of a modern system designed to streamline digital document management for a diverse user base.

Responsibilities
- Contribute to the design, development, and optimization of data pipelines and workflows using AWS Glue and related technologies
- Work with team members to implement scalable and efficient data models leveraging Athena and S3 for reporting and analytics needs
- Assist in developing ETL processes with Apache Spark to manage medium-to-large scale data workloads
- Collaborate with BI analysts and team members to implement data-driven workflows for Business Intelligence and analytics
- Apply best practices to improve the cost, performance, and security of cloud-based solutions built on AWS services
- Support the maintenance of CI/CD pipelines to facilitate deployment automation and productivity improvements
- Monitor solution performance metrics with modern observability tools to ensure reliability and cost efficiency
- Contribute to reporting dashboard enhancements by supplying accurate and well-structured data models
- Write clean, maintainable code, following best practices for testing, versioning, and documentation
- Support the troubleshooting and resolution of issues in data workflows to maintain system consistency and uptime

Requirements
- 2+ years of experience in data engineering or software development
- Proficiency in AWS Glue, Amazon Athena, and core AWS services such as S3 and Lambda
- Experience developing scalable data processing systems with Apache Spark
- Knowledge of BI workflows and the ability to collaborate with analytics teams on effective data implementation
- Strong SQL skills, with experience writing and optimizing queries for data manipulation and analytics
- Understanding of basic data lake architecture, ETL pipeline concepts, and data storage principles
- Familiarity with CI/CD pipelines for integrating workflows with deployment frameworks
- Excellent communication skills in English, with a minimum proficiency level of B2

Nice to have
- Experience with Amazon Elastic Kubernetes Service (EKS) for managing containerized applications
- Understanding of Amazon Kinesis for real-time data stream processing and event management
- Familiarity with Apache Hive for building or working with data warehouses
- Experience improving workflows and efficiency within Business Intelligence platforms
- Knowledge of programming languages such as Java, Scala, or Node.js for data processing solutions

We offer/Benefits
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn