An AI-based company offering analytics on the behavior of judges, law firms, lawyers, and litigants is looking for a Data Engineer. The engineer will be responsible for designing and building performant databases, data models, integrations, and complex ETL pipelines in RDBMS, GraphQL, and NoSQL environments. The company empowers attorneys to develop customized litigation and business development strategies by leveraging AI-powered analytics, and it has raised $5.7M in funding to date. This is an exciting opportunity: the selected candidate will join a team of experts developing unique, innovative solutions.
Job Responsibilities:
- Help provide durable, high-quality, and scalable processes, and level up the application
- Build large-scale, fault-tolerant data collection and processing (ETL) pipelines
- Design and build performant databases, data models, integrations, and complex ETL pipelines in RDBMS, GraphQL, and NoSQL environments
- Leverage modern ML technologies for various types of data identification, normalization, and reconstruction of partial and natural-language phrasing
- Contribute to data architecture strategy as it relates to achieving immediate and long-term company business goals
- Leverage your software development and data engineering skills to impact the business
- Take ownership of key projects requiring coding and data pipelines
- Actively participate in and advocate for agile/scrum practices to support team health and process improvement
- Maintain detailed documentation of your work and changes to support data quality and governance
- Build and scale data pipelines to handle constantly increasing data volumes (10x growth)
Job Requirements:
- Bachelor’s/Master’s degree in Engineering or Computer Science (or equivalent experience)
- 3+ years of experience designing and delivering large-scale, 24/7, mission-critical data pipelines and features using modern big data architectures
- 5+ years of experience operating in a large-scale data engineering environment
- Must have solid knowledge of Python, big data technologies, and AWS
- Proven experience in designing, operating, and improving complex ETL (extraction, transformation, and loading) pipelines
- Solid understanding of distributed system concepts used in scaling big data technologies with exponential growth of data
- Prior experience with all aspects of the data integration life cycle (source system analysis, ETL development, and data model/structure design)
- Ability to thrive in an agile, entrepreneurial start-up environment
- Comfortable working in a Unix (Linux or macOS) environment
- Advanced knowledge of data acquisition technologies such as web crawling, PDF data extraction, and computer vision is a nice-to-have
- Hands-on experience with Elasticsearch, Neo4j, and PostgreSQL is a plus